llvm-project

Commit Graph

Author	SHA1	Message	Date
Justin Holewinski	e988409423	[NVPTX] Fix handling of vector arguments llvm-svn: 177847	2013-03-24 21:17:47 +00:00
Jakob Stoklund Olesen	9619fc0bd1	Clean up Sparc patterns. The types of register variables no longer need to be specified in output patterns. llvm-svn: 177845	2013-03-24 19:37:04 +00:00
Jakob Stoklund Olesen	83aa671f09	Give Sparc instruction patterns direct types instead of register classes. Also update the documentation since Sparc is the nicest backend, and used as an example in WritingAnLLVMBackend. llvm-svn: 177835	2013-03-24 00:56:20 +00:00
Hal Finkel	915769edd9	PPC ZERO register needs a register number of 0. In order for the new ZERO register to be used with MC, etc. we need to specify its register number (0). Thanks to Kai for reporting the problem! llvm-svn: 177833	2013-03-23 22:06:07 +00:00
Hal Finkel	cc1eeda16d	Note in PPCFunctionInfo VRSAVE spills In preparation for using the new register scavenger capability for providing more than one register simultaneously, specifically note functions that have spilled VRSAVE (currently, this can happen only in functions that use the setjmp intrinsic). As with CR spilling, such functions will need to provide two emergency spill slots to the scavenger. No functionality change intended. llvm-svn: 177832	2013-03-23 22:06:03 +00:00
Hal Finkel	f07a8e04ab	MCize the bcl instruction in PPCAsmPrinter I recently added a BCL instruction definition as part of implementing SjLj support. This can also be used to MCize bcl emission in the asm printer. No functionality change intended. llvm-svn: 177830	2013-03-23 20:53:15 +00:00
Jakob Stoklund Olesen	b1f7c28765	Use direct types in Sparc def : Pat patterns. The SelectionDAG graph has MVT type labels, not register classes, so this makes it clearer what is happening. This notation is also robust against adding more types to the IntRegs register class. llvm-svn: 177829	2013-03-23 20:35:05 +00:00
Hal Finkel	c6eaa4cead	Cleanup some unused reg. scavenger parameters in PPCRegisterInfo These spilling functions will eventually make use of the register scavenger, however, they'll do so by taking advantage of PEI's virtual-register-based delayed scavenging mechanism. As a result, these function parameters will not be used, and can be removed. No functionality change intended. llvm-svn: 177827	2013-03-23 19:36:47 +00:00
Hal Finkel	794e05b03b	Remove dead PPC LR spilling code The LR register is unconditionally reserved, and its spilling and restoration is handled by the prologue/epilogue code. As a result, it is never explicitly spilled by the register allocator. No functionality change intended. llvm-svn: 177823	2013-03-23 17:14:27 +00:00
Hal Finkel	9e331c2f9c	Allow the register scavenger to spill multiple registers This patch lets the register scavenger make use of multiple spill slots in order to guarantee that it will be able to provide multiple registers simultaneously. To support this, the RS's API has changed slightly: setScavengingFrameIndex / getScavengingFrameIndex have been replaced by addScavengingFrameIndex / isScavengingFrameIndex / getScavengingFrameIndices. In forthcoming commits, the PowerPC backend will use this capability in order to implement the spilling of condition registers, and some special-purpose registers, without relying on r0 being reserved. In some cases, spilling these registers requires two GPRs: one for addressing and one to hold the value being transferred. llvm-svn: 177774	2013-03-22 23:32:27 +00:00
Jyotsna Verma	fdc660bf2e	Hexagon: Add and enable memops setbit, clrbit, &,|,+,- for byte, short, and word. llvm-svn: 177747	2013-03-22 18:41:34 +00:00
Ulrich Weigand	f62e83f415	Remove ABI-duplicated call instruction patterns. We currently have a duplicated set of call instruction patterns depending on the ABI to be followed (Darwin vs. Linux). This is a bit odd; while the different ABIs will result in different instruction sequences, the actual instructions themselves ought to be independent of the ABI. And in fact it turns out that the only nontrivial difference between the two sets of patterns is that in the PPC64 Linux ABI, the instruction used for indirect calls is marked to take X11 as extra input register (which is indeed used only with that ABI to hold an incoming environment pointer for nested functions). However, this does not need to be hard-coded at the .td pattern level; instead, the C++ code expanding calls can simply add that use, just like it adds uses for argument registers anyway. No change in generated code expected. llvm-svn: 177735	2013-03-22 15:24:13 +00:00
Ulrich Weigand	1df06d8b58	Rename memrr ptrreg and offreg components. Currently, the sub-operand of a memrr address that corresponds to what hardware considers the base register is called "offreg", while the sub-operand that corresponds to the offset is called "ptrreg". To avoid confusion, this patch simply swaps the names of those two sub-operands and updates all uses. No functional change is intended. llvm-svn: 177734	2013-03-22 14:59:13 +00:00
Ulrich Weigand	e90b022468	Fix swapped BasePtr and Offset in pre-inc memory addresses. PPCTargetLowering::getPreIndexedAddressParts currently provides the base part of a memory address in the offset result, and the offset part in the base result. That swap is then undone again when an MI instruction is generated (in PPCDAGToDAGISel::Select for loads, and using .td Pat patterns for stores). This patch reverts this double swap, to make common code and back-end be in sync as to which part of the address is base and which is offset. To avoid performance regressions in certain cases, target code now checks whether the choice of base register would be rejected for pre-inc accesses by common code, and attempts to swap base and offset again in such cases. (Overall, this means that now pre-inc accesses are generated more frequently than before.) llvm-svn: 177733	2013-03-22 14:58:48 +00:00
Ulrich Weigand	d1b99d350c	Tighten iaddroff ComplexPattern. The iaddroff ComplexPattern is supposed to recognize displacement expressions that have been processed by a SelectAddressRegImm, which means it needs to accept TargetConstant and TargetGlobalAddress nodes. Currently, it erroneously also accepts some other nodes, in particular Constant and PPCISD::Lo. While this problem is currently latent, it would cause wrong-code bugs with a follow-on patch I'm about to commit, so this patch tightens the ComplexPattern. The equivalent change is made in PPCDAGToDAGISel::Select, where pre-inc load patterns are handled (as opposed to store patterns, the loads are handled in C++ code without making use of the .td ComplexPattern). llvm-svn: 177732	2013-03-22 14:58:17 +00:00
Ulrich Weigand	e448badbb1	Remove the xaddroff ComplexPattern. The xaddroff pattern is currently (mistakenly) used to recognize the base register in pre-inc store patterns. This patch replaces those uses by ptr_rc_nor0 (as is elsewhere done to match the base register of an address), and removes the now unused ComplexPattern. llvm-svn: 177731	2013-03-22 14:57:48 +00:00
Michel Danzer	a2e28156b4	R600: Use legacy (0 * anything = 0) MUL instructions for pow intrinsics Fixes wrong lighting in some corner cases with r600g and radeonsi, e.g. manifested by failure of two piglit/glean tests and intermittent black patches in many apps. Tested on SI and RS880. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62012 [radeonsi] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58150 [r600g] NOTE: This is a candidate for the Mesa stable branch. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 177730	2013-03-22 14:09:10 +00:00
Jack Carter	4f69a0f25d	Fix the invalid opcode for Mips branch instructions in the assembler For a Mips branch, an 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address. Previously, the code generator did not perform the shift of the immediate branch offset, which resulted in a wrong instruction opcode. This patch fixes the issue. Contributor: Vladimir Medic llvm-svn: 177687	2013-03-22 00:29:10 +00:00
Jack Carter	9e65aa35a0	This patch enables the Mips assembler to use symbols for offset for instructions This patch uses the generated instruction info tables to identify memory/load store instructions. After successful matching and based on the operand type and size, it generates additional instructions to the output. Contributor: Vladimir Medic llvm-svn: 177685	2013-03-22 00:05:30 +00:00
Hal Finkel	f70c41ea7c	Remove the G8RC_NOX0_and_GPRC_NOR0 PPC register class As Jakob pointed out in his review of r177423, having a shared ZERO register between the 32- and 64-bit register classes causes this odd G8RC_NOX0_and_GPRC_NOR0 class to be created. As recommended, this adds a ZERO8 register which differentiates the 32- and 64-bit zeros. No functionality change intended. llvm-svn: 177683	2013-03-21 23:45:03 +00:00
Hal Finkel	891671afe5	Fix a register-class comparison bug in PPCCTRLoops Thanks to Jakob for isolating the underlying problem from the test case in r177423. The original commit had introduced asymmetric copy operations, but these turned out to be a work-around to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops). llvm-svn: 177679	2013-03-21 23:23:34 +00:00
Jack Carter	d76b2376f2	This patch enables the Mips .set directive to define aliases The .set directive in the Mips assembler can be used to set the value of a symbol to an expression. This changes the symbol's value and type to conform to the expression's. Syntax: .set symbol, expression This patch implements the parsing of the above syntax and enables the parser to use defined symbols when parsing operands. Contributor: Vladimir Medic llvm-svn: 177667	2013-03-21 21:44:16 +00:00
Hal Finkel	756810fe36	Implement builtin_{setjmp/longjmp} on PPC This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666	2013-03-21 21:37:52 +00:00
Hal Finkel	a1431df540	Add support for spilling VRSAVE on PPC Although there is only one Altivec VRSAVE register, it is a member of a register class, and we need the ability to spill it. Because this register is normally callee-preserved and handled by special code this has never before been necessary. However, this capability will be required by a forthcoming commit adding SjLj support. llvm-svn: 177654	2013-03-21 19:03:21 +00:00
Hal Finkel	aa03c03a2d	Correct PPC FRAMEADDR lowering using a pseudo-register The old code used to lower FRAMEADDR tried to replicate the logic in the real frame-lowering code that determines whether or not the frame pointer (r31) will be used. When it seemed as though the frame pointer would not be used, the stack pointer (r1) was used instead. Unfortunately, because the stack size is not yet known, this does not work. Instead, this change introduces new always-reserved pseudo-registers (FP and FP8) that are replaced during prologue insertion with the real frame-pointer register (either r1 or r31). It is important that this intrinsic always return a valid frame address because it is used by Clang to store the frame address as part of code generation for __builtin_setjmp. llvm-svn: 177653	2013-03-21 19:03:19 +00:00
Renato Golin	b4dd6c5945	Avoid NEON SP-FP unless unsafe-math or Darwin NEON is not IEEE 754 compliant, so we should avoid lowering single-precision floating point operations with NEON unless unsafe-math is turned on. The equivalent VFP instructions are IEEE 754 compliant, but in some cores they're much slower, so some archs/OSs might still request it to be on by default, such as Swift and Darwin. llvm-svn: 177651	2013-03-21 18:47:47 +00:00
Jakob Stoklund Olesen	5891cf9788	Add a WriteMicrocoded for ancient microcoded instructions. llvm-svn: 177611	2013-03-21 00:07:17 +00:00
Jakob Stoklund Olesen	712f674880	Model prefetches and barriers as loads. It's not yet clear if these instructions need a more careful model. llvm-svn: 177599	2013-03-20 23:09:53 +00:00
Jakob Stoklund Olesen	5b535c965e	Add a catch-all WriteSystem SchedWrite type. This is used for all the expensive system instructions. llvm-svn: 177598	2013-03-20 23:09:50 +00:00
Jakob Stoklund Olesen	cd4ebb7639	Annotate the remaining SSE MOV instructions. llvm-svn: 177592	2013-03-20 22:37:16 +00:00
Jakob Stoklund Olesen	c6dc70d865	Annotate SSE horizontal and integer instructions. llvm-svn: 177591	2013-03-20 22:37:13 +00:00
Michael Liao	70dd7f999d	Correct cost model for vector shift on AVX2 - After moving logic recognizing vector shift with scalar amount from DAG combining into DAG lowering, we declare to customize all vector shifts even vector shift on AVX is legal. As a result, the cost model needs special tuning to identify these legal cases. llvm-svn: 177586	2013-03-20 22:01:10 +00:00
Jakob Stoklund Olesen	7a8bb72a3a	Add some missing SSE annotations. llvm-svn: 177540	2013-03-20 16:56:39 +00:00
Jakob Stoklund Olesen	50bd713b5e	Annotate remaining IIC_BIN_* instructions. llvm-svn: 177539	2013-03-20 16:56:36 +00:00
Michael Liao	0f4ea0c4a9	Fix PR15296 - Move SRA/SRL/SHL lowering support from DAG combination to DAG lowering to support extended 256-bit integer in AVX but not AVX2. llvm-svn: 177478	2013-03-20 02:33:21 +00:00
Michael Liao	5a4e81d2e8	Mark all variable shifts needing customizing - Prepare moving logic from DAG combining into DAG lowering. There's no functionality change. llvm-svn: 177477	2013-03-20 02:28:20 +00:00
Michael Liao	48e8a3727c	Move scalar immediate shift lowering into a dedicated func - no functionality change llvm-svn: 177476	2013-03-20 02:20:36 +00:00
Chad Rosier	b162a5ca4d	Fix pr13145 - Naming a function like a register name confuses the asm parser. Patch by Stepan Dyatkovskiy <stpworld@narod.ru> rdar://13457826 llvm-svn: 177463	2013-03-19 23:44:03 +00:00
Jakob Stoklund Olesen	3a546156c7	Annotate various null idioms with SchedRW lists. llvm-svn: 177461	2013-03-19 23:23:31 +00:00
Jakob Stoklund Olesen	24aac1dc92	Annotate SSE float conversions with SchedRW lists. llvm-svn: 177460	2013-03-19 23:23:29 +00:00
Jakob Stoklund Olesen	050fa62fd6	Annotate X86InstrCMovSetCC.td with SchedRW lists. llvm-svn: 177459	2013-03-19 23:23:26 +00:00
Chad Rosier	f3c04f6a9f	[ms-inline asm] Move the immediate asm rewrite into the target specific logic as a QOI cleanup. No functional change. Tests already in place. rdar://13456414 llvm-svn: 177446	2013-03-19 21:58:18 +00:00
Jakob Stoklund Olesen	9bd6b8bd96	Annotate X86InstrCompiler.td with SchedRW lists. Add a new WriteZero SchedWrite type for the common dependency-breaking instructions that clear a register. llvm-svn: 177442	2013-03-19 21:16:56 +00:00
Chad Rosier	7ca135b25f	[ms-inline asm] Create a helper function, CreateMemForInlineAsm, that creates an X86Operand, but also performs a Sema lookup and adds the sizing directive when appropriate. Use this when parsing a bracketed statement. This is necessary to get the instruction matching correct as well. Test case coming on clang side. rdar://13455408 llvm-svn: 177439	2013-03-19 21:11:56 +00:00
Ulrich Weigand	01dd4c1a12	Add missing mayLoad flag to LHAUX8 and LWAUX. All pre-increment load patterns need to set the mayLoad flag (since they don't provide a DAG pattern). This was missing for LHAUX8 and LWAUX, which is added by this patch. llvm-svn: 177431	2013-03-19 19:53:27 +00:00
Ulrich Weigand	f8030096b1	Rewrite LHAU8 pattern to use standard memory operand. As opposed to pre-increment store patterns, the pre-increment load patterns were already using standard memory operands, with the sole exception of LHAU8. As there's no real reason why LHAU8 should be different here, this patch simply rewrites the pattern to also use a memri operand, just like all the other patterns. llvm-svn: 177430	2013-03-19 19:52:30 +00:00
Ulrich Weigand	d850167a19	Rewrite pre-increment store patterns to use standard memory operands. Currently, pre-increment store patterns are written to use two separate operands to represent address base and displacement: stwu $rS, $ptroff($ptrreg) This causes problems when implementing the assembler parser, so this commit changes the patterns to use standard (complex) memory operands like in all other memory access instruction patterns: stwu $rS, $dst To still match those instructions against the appropriate pre_store SelectionDAG nodes, the patch uses the new feature that allows a Pat to match multiple DAG operands against a single (complex) instruction operand. Approved by Hal Finkel. llvm-svn: 177429	2013-03-19 19:52:04 +00:00
Ulrich Weigand	fd24544ff8	Fix sub-operand size mismatch in tocentry operands. The tocentry operand class refers to 64-bit values (it is only used in 64-bit, where iPTR is a 64-bit type), but its sole suboperand is designated as 32-bit type. This causes a mismatch to be detected at compile-time with the TableGen patch I'll check in shortly. To fix this, this commit changes the suboperand to a 64-bit type as well. llvm-svn: 177427	2013-03-19 19:50:30 +00:00
Ulrich Weigand	80d9ad398d	Remove an invalid and unnecessary Pat pattern from the X86 backend: def : Pat<(load (i64 (X86Wrapper tglobaltlsaddr :$dst))), (MOV64rm tglobaltlsaddr :$dst)>; This pattern is invalid because the MOV64rm instruction expects a source operand of type "i64mem", which is a subclass of X86MemOperand and thus actually consists of five MI operands, but the Pat provides only a single MI operand ("tglobaltlsaddr" matches an SDnode of type ISD::TargetGlobalTLSAddress and provides a single output). Thus, if the pattern were ever matched, subsequent uses of the MOV64rm instruction pattern would access uninitialized memory. In addition, with the TableGen patch I'm about to check in, this would actually be reported as a build-time error. Fortunately, the pattern does in fact never match, for at least two independent reasons. First, the code generator actually never generates a pattern of the form (load (X86Wrapper (tglobaltlsaddr))). For most combinations of TLS and code models, (tglobaltlsaddr) represents just an offset that needs to be added to some base register, so it is never directly dereferenced. The only exception is the initial-exec model, where (tglobaltlsaddr) refers to the (pc-relative) address of a GOT slot, which is in fact directly dereferenced: but in that case, the X86WrapperRIP node is used, not X86Wrapper, so the Pat doesn't match. Second, even if some patterns along those lines were ever generated, we should not need an extra Pat pattern to match it. Instead, the original MOV64rm instruction pattern ought to match directly, since it uses an "addr" operand, which is implemented via the SelectAddr C++ routine; this routine is supposed to accept the full range of input DAGs that may be implemented by a single mov instruction, including those cases involving ISD::TargetGlobalTLSAddress (and actually does so e.g. in the initial-exec case as above). To avoid build breaks (due to the above-mentioned error) after the TableGen patch is checked in, I'm removing this Pat here. llvm-svn: 177426	2013-03-19 19:49:52 +00:00
Hal Finkel	638a9fa43e	Prepare to make r0 an allocatable register on PPC Currently the PPC r0 register is unconditionally reserved. There are two reasons for this: 1. r0 is treated specially (as the constant 0) by certain instructions, and so cannot be used with those instructions as a regular register. 2. r0 is used as a temporary register in the CR-register spilling process (where, under some circumstances, we require two GPRs). This change addresses the first reason by introducing a restricted register class (without r0) for use by those instructions that treat r0 specially. These register classes have a new pseudo-register, ZERO, which represents the r0-as-0 use. This has the side benefit of making the existing target code simpler (and easier to understand), and will make it clear to the register allocator that uses of r0 as 0 don't conflict with real uses of the r0 register. Once the CR spilling code is improved, we'll be able to allocate r0. Adding these extra register classes, for some reason unclear to me, causes requests to the target to copy 32-bit registers to 64-bit registers. The resulting code seems correct (and causes no test-suite failures), and the new test case covers this new kind of asymmetric copy. As r0 is still reserved, no functionality change intended. llvm-svn: 177423	2013-03-19 18:51:05 +00:00
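
Note on the Mips branch fix (llvm-svn: 177687): the commit message describes the branch target as the address of the delay-slot instruction plus the 16-bit offset field, sign-extended and shifted left by 2. The following is a minimal sketch of that arithmetic only. It is not the code from the patch; the helper names encodeBranchOffset, decodeBranchTarget, and branchOffsetExample are hypothetical and exist only for illustration, and 32-bit addresses with 4-byte instructions are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM API): compute the 16-bit offset field for a
// branch located at branchAddr that should jump to targetAddr.
// Returns false if the target is misaligned or outside the 18-bit signed range.
bool encodeBranchOffset(uint32_t branchAddr, uint32_t targetAddr,
                        uint16_t &field) {
  // The offset is relative to the instruction after the branch (delay slot).
  int64_t delta = static_cast<int64_t>(targetAddr) -
                  (static_cast<int64_t>(branchAddr) + 4);
  if (delta % 4 != 0)
    return false; // instructions are word aligned
  int64_t words = delta / 4; // undo the "shifted left 2 bits" step
  if (words < -32768 || words > 32767)
    return false; // does not fit in a signed 16-bit field
  field = static_cast<uint16_t>(words & 0xFFFF);
  return true;
}

// Hypothetical helper (not LLVM API): recover the target address from the
// 16-bit field, as the hardware would.
uint32_t decodeBranchTarget(uint32_t branchAddr, uint16_t field) {
  int32_t words = static_cast<int16_t>(field); // sign-extend the field
  return branchAddr + 4 + static_cast<uint32_t>(words * 4);
}

// Usage example: a forward branch that skips three words past the delay slot.
inline void branchOffsetExample() {
  uint16_t field = 0;
  bool ok = encodeBranchOffset(0x1000, 0x1010, field);
  assert(ok && field == 3);
  assert(decodeBranchTarget(0x1000, field) == 0x1010);
}
```

Storing the byte distance directly in the field, without the division by 4, is the kind of mistake the commit message describes: the instruction would then carry the wrong immediate.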


23654 Commits