The load/store value type is currently not available when lowering the memcpy
intrinsic. Add the missing nullptr check to 'computeAddress' to support this
case.
Fixes rdar://problem/19178947.
llvm-svn: 223818
The AAPCS treats small structs and homogeneous floating (or vector) aggregates
specially, and guarantees they either get passed as a contiguous block of
registers, or prevent any future use of those registers and get passed on the
stack.
This concept can fit quite neatly into LLVM's own type system, mapping an HFA
to [N x float] and so on, and small structs to [N x i64]. Doing so allows
front-ends to emit AAPCS compliant code without having to duplicate the
register counting logic.
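For example (illustrative mappings; the struct names are hypothetical):
%struct.Vec3 = type { float, float, float } --> passed as [3 x float] (HFA)
%struct.Pair = type { i64, i32 } --> passed as [2 x i64] (small struct)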
llvm-svn: 222903
The pattern matching failed to recognize all instances of "-1", because when
comparing against "-1" we didn't use an APInt of the same bit width; for
example, an i16 "-1" has the 16-bit value 0xFFFF, which does not compare equal
to a 64-bit all-ones value.
This commit fixes this and also adds inverse versions of the condition to catch
more cases.
llvm-svn: 222722
shift-right for booleans (i1).
Arithmetic shift-right immediate with sign-/zero-extensions also works for
boolean values. Update the assert and the test cases to reflect that fact.
llvm-svn: 222272
shift-right for booleans (i1).
Logical shift-right immediate with sign-/zero-extensions also works for boolean
values. Update the assert and the test cases to reflect that fact.
llvm-svn: 222270
Shifts also perform sign-/zero-extends to larger types, which requires us to emit
an integer extend instead of a simple COPY.
Related to PR21594.
llvm-svn: 222257
This change emits a COPY for a shift-immediate with a "zero" shift value.
This fixes PR21594 where we emitted a shift instruction with an incorrect
immediate operand.
llvm-svn: 222247
The generic FastISel code would bail, because it can't emit a sign-extend for
AArch64. This copies the code over and uses AArch64 specific emit functions.
This is not ideal and 'computeAddress' should handle this, so it can fold the
address computation into the memory operation.
I plan to clean up 'computeAddress' anyway, so I will add that in a future
commit.
Related to rdar://problem/18962471.
llvm-svn: 221923
This folds the compare emission into the select emission when possible, so we
can directly use the flags and don't have to emit a separate compare.
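For example (illustrative; exact register assignments may differ), instead of
materializing the boolean result and comparing it again:
cmp w0, w1
cset w8, eq
cmp w8, #0
csel w0, w2, w3, ne
we can emit the compare once and consume its flags directly:
cmp w0, w1
csel w0, w2, w3, eq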
Related to rdar://problem/18960150.
llvm-svn: 221847
In the case where we optimize an integer extend away and replace it directly
with the source register, we also have to clear all kill flags at all its uses.
This is necessary because the original IR instruction might be trivially dead,
but we replaced it with a nop at the MI level.
llvm-svn: 221628
This is a minor change to use the immediate version when the operand is a null
value. This should get rid of an unnecessary 'mov' instruction in debug builds
and align the code more closely with what SelectionDAG generates.
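For example (illustrative):
mov x8, #0 --> cmp x0, #0
cmp x0, x8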
This fixes rdar://problem/18785125.
llvm-svn: 220713
The pattern matching for a 'ConstantInt' value was too restrictive. Checking
for a 'Constant' with a null value is sufficient for using a 'cbz/cbnz'
instruction.
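For example (illustrative), a null-pointer check now also matches:
%cmp = icmp eq i64* %ptr, null
br i1 %cmp, label %is_null, label %not_null
--> cbz x0, <is_null>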
This fixes rdar://problem/18784732.
llvm-svn: 220709
This fixes a bug where the input register was not defined for the 'tbz/tbnz'
instruction. This happened, because we folded the 'and' instruction from a
different basic block.
This fixes rdar://problem/18784013.
llvm-svn: 220704
At higher optimization levels the LLVM IR may contain more complex patterns for
loads/stores from/to frame indices. The 'computeAddress' function wasn't able to
handle this and triggered an assertion.
This fix extends the possible addressing modes for frame indices.
This fixes rdar://problem/18783298.
llvm-svn: 220700
This fixes a miscompilation in the AArch64 fast-isel which was
triggered when a branch is based on an icmp with condition eq or ne,
and type i1, i8 or i16. The cbz instruction compares the whole 32-bit
register, so values with the bottom 1, 8 or 16 bits clear would cause
the wrong branch to be taken.
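For example (illustrative), for:
%cmp = icmp eq i8 %a, 0
br i1 %cmp, label %t, label %f
a plain 'cbz w0, <t>' would also test the undefined upper 24 bits, so the
value has to be zero-extended first, e.g.:
and w8, w0, #0xff
cbz w8, <t>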
llvm-svn: 220553
When the constant divisor was larger than 32 bits, the optimized code
generated for the AArch64 backend was wrong, because the shift was defined as
a shift of a 32-bit constant '(1<<Lg2(divisor))' and we would lose the upper
32 bits.
This fixes rdar://problem/18678801.
llvm-svn: 219934
This is mostly a copy of the existing FastISel GEP code, but we have to
duplicate it for AArch64, because otherwise we would bail out even for simple
cases. This is because the standard fastEmit functions don't cover MUL at all
and ADD is lowered very inefficiently.
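For example (illustrative), a simple GEP can now be selected as a single add
with a folded shift:
%p = getelementptr i32, i32* %a, i64 %i --> add x0, x0, x1, lsl #2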
The original commit had a bug in the add emit logic, which has been fixed.
llvm-svn: 219831
This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail
out even for simple cases, because the standard fastEmit functions don't cover
MUL and ADD is lowered inefficiently.
llvm-svn: 219726
Sign-/zero-extend folding depended on both the load and the integer extend
being selected by FastISel. This cannot always be guaranteed and SelectionDAG
might interfere. This commit adds additional checks to the load and integer
extend lowering to catch this.
Related to rdar://problem/18495928.
llvm-svn: 219716
The code already folds sign-/zero-extends, but only if they are arguments to
mul and shift instructions. This extends the code to also fold them when they
are direct inputs.
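For example (illustrative):
sxtw x1, w1 --> ldr x0, [x0, w1, sxtw]
ldr x0, [x0, x1]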
llvm-svn: 219187
Tiny enhancement to the address computation code to also fold sub instructions
if the rhs is constant and can be folded into the offset.
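For example (illustrative):
sub x1, x1, #8 --> ldur x0, [x0, #-8]
ldr x0, [x0, x1]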
llvm-svn: 219186
This commit fixes an issue with sign-/zero-extending loads that was discovered
by Richard Barton.
We now use the correct load instructions for sign-extending loads to 64 bit.
Also updated and added more unit tests.
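For example (illustrative), an i8 load sign-extended to i64 now uses the
64-bit variant
ldrsb x0, [x1]
instead of the 32-bit variant 'ldrsb w0, [x1]'.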
llvm-svn: 219185
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64-bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32 bits we have to first
extract the subregister and use the 32-bit version of the TBZ instruction (TBZW).
Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).
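For example (illustrative):
%a = and i64 %x, 8
%c = icmp eq i64 %a, 0
br i1 %c, label %t, label %f
--> tbz w0, #3, <t>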
llvm-svn: 218693
The sign-/zero-extension of the loaded value can be performed by the memory
instruction for free. If the result of the load has only one use and the use is
a sign-/zero-extend, then we emit the proper load instruction. The extend is
only a register copy and will be optimized away later on.
Other instructions that consume the sign-/zero-extended value are also made
aware of this fact, so they don't fold the extend too.
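For example (illustrative):
%1 = load i8, i8* %p
%2 = zext i8 %1 to i32
--> ldrb w0, [x0]
The 'ldrb' already zero-extends the loaded byte to 32 bits, so the extend
becomes a plain register copy.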
This fixes rdar://problem/18495928.
llvm-svn: 218653
Shift-left immediate with sign-/zero-extensions also works for boolean values.
Update the assert and the test cases to reflect that fact.
This should fix a bug found by Chad.
llvm-svn: 218275
When looking through sign/zero-extensions the code would always assume there is
such an extension instruction and use the wrong operand for the address.
There was also a minor issue in the handling of 'AND' instructions: I
accidentally used a 'cast' instead of a 'dyn_cast' (a 'cast' asserts on a type
mismatch, whereas 'dyn_cast' returns null).
llvm-svn: 218161
When folding the intrinsic flag into the branch or select we also have to
consider whether the intrinsic got simplified, because that changes the flag
we have to check for.
llvm-svn: 218034
Small optimization in 'simplifyAddress'. When the offset cannot be encoded in
the load/store instruction, then we need to materialize the address manually.
The add instruction can encode a wider range of immediates than the load/store
instructions. This change tries to fold the offset into the add instruction
first before materializing the offset in a register.
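For example (illustrative), an offset of 0x100000 does not fit the scaled
12-bit 'ldr' immediate field, but it does fit the shifted 'add' immediate:
mov x9, #0x100000
add x9, x0, x9
ldr x0, [x9]
-->
add x9, x0, #0x100000
ldr x0, [x9]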
llvm-svn: 218031
The 'AND' instruction could be used to mask out the lower 32 bits of a register.
If this is done inside an address computation we might be able to fold the
instruction into the memory instruction itself.
and x1, x1, #0xffffffff ---> ldrb x0, [x0, w1, uxtw]
ldrb x0, [x0, x1]
llvm-svn: 218030
This takes advantage of the CBZ and CBNZ instructions to further optimize the
common null-check pattern into a single instruction.
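For example (illustrative):
cmp x0, #0 --> cbz x0, <target>
b.eq <target>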
This is related to rdar://problem/18358882.
llvm-svn: 217972
This adds the last two missing floating-point condition codes (FCMP_UEQ and
FCMP_ONE) also to the branch selection. In these two cases an additional branch
instruction is required.
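For example (illustrative), FCMP_UEQ (unordered or equal) branches when the
operands compare equal or when the comparison is unordered:
fcmp s0, s1
b.eq <target>
b.vs <target>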
This also adds unit tests to check all the different condition codes.
This is related to rdar://problem/18358882.
llvm-svn: 217966
Allow handling of vectors during return lowering, at least for little-endian
machines.
This was restricted in r208200 to fix it for big-endian machines (according to
the comment), but it also disabled it for little-endian ones.
llvm-svn: 217846
This lowers frem to a runtime libcall inside fast-isel.
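For example (illustrative):
%r = frem double %a, %b --> bl fmod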
The test case also checks the CallLoweringInfo bug that was exposed by this
change.
This fixes rdar://problem/18342783.
llvm-svn: 217833
using static relocation model and small code model.
Summary: currently we generate GOT-based relocations for weak symbol
references regardless of the underlying relocation model. This should be
changed so that in the static relocation model we use a constant-pool load
instead.
Patch from: Keith Walker
Reviewers: Renato Golin, Tim Northover
llvm-svn: 217503
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.
Reviewed by Eric
llvm-svn: 217075
Things got a little bit messy over the years and it is time for a little bit
of spring cleaning.
This first commit is focused on the FastISel base class itself. It doxyfies all
comments, C++11fies the code where it makes sense, renames internal methods to
adhere to the coding standard, and clang-formats the files.
Reviewed by Eric
llvm-svn: 217060
There is already target-dependent instruction selection support for Adds/Subs
to handle compares and the overflow intrinsics. This takes advantage of
the existing infrastructure to also support Add/Sub, which allows the folding of
immediates, sign-/zero-extends, and shifts.
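For example (illustrative), a shifted operand can now be folded:
lsl x9, x1, #2 --> add x0, x0, x1, lsl #2
add x0, x0, x9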
This fixes rdar://problem/18207316.
llvm-svn: 217007
This uses the target-dependent selection code for shifts first, which allows us
to create better code for shifts with immediates and sign-/zero-extend folding.
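For example (illustrative), a sign-extend feeding a shift-left immediate can
be selected as a single bitfield instruction:
sxtw x1, w1 --> sbfiz x1, x1, #3, #32
lsl x1, x1, #3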
Vector types are not handled yet and the code falls back to target-independent
instruction selection for these cases.
This fixes rdar://problem/17907920.
llvm-svn: 216985
FastISel for AArch64 supports more value types than are actually legal. Use a
dedicated helper function to reflect this.
It is very similar to the isLoadStoreTypeLegal function, with the exception
that vector types are not supported yet.
llvm-svn: 216984
This change moves FastISel for AArch64 to target-dependent instruction selection
only. This change replicates the existing target-independent behavior, therefore
there are no changes to the unit tests or new tests.
Future changes will take advantage of this change and update functionality
and unit tests.
llvm-svn: 216955
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.
This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.
%2 = trunc i32 %1 to i16
... = ... %2    ->    ... = ... vreg1 <kill>
... = ... %1    ->    ... = ... vreg1
This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.
This fixes rdar://problem/18178188.
llvm-svn: 216750
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.
Not doing so can result in incorrect code, because the value might not be
live out of the basic block where it is defined.
This fixes rdar://problem/18169495.
llvm-svn: 216700
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
%vreg1<def> = ADDXri %vreg0<kill>, 2
%vreg2<def> = LDRBBui %vreg0, 2
... = ... %vreg1 ...
This usually happens when the result is also used by another non-memory
instruction in the same basic block, or by any instruction in another basic
block.
If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.
This fixes rdar://problem/18142857.
llvm-svn: 216629
When the address comes directly from a shift instruction then the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. 'simplifyAddress' needs to emit
the shift instruction and use the result as the base register.
llvm-svn: 216621
Use the zero register directly when possible to avoid an unnecessary register
copy and a wasted register at -O0. This also uses integer stores to store a
positive floating-point zero. This saves us from materializing the positive zero
in a register and then storing it.
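For example (illustrative), both an integer zero and a positive floating-point
zero can be stored straight from the zero register:
str wzr, [x0]
str xzr, [x0]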
llvm-svn: 216617
When a shift with extension or an add with shift and extension cannot be folded
into the memory operation, then the address calculation has to be materialized
separately. While doing so the code forgot to consider a possible sign-/zero-
extension. This fix now also folds the sign-/zero-extension into the add or
shift instruction that is used to materialize the address.
This fixes rdar://problem/18141718.
llvm-svn: 216511
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.
Also clean up the code to use the FastEmitInst_* methods whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.
llvm-svn: 216225
This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero-
extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both
operands have to be sign-/zero-extended with separate instructions.
Related to <rdar://problem/17913111>.
llvm-svn: 216073
Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register
class to be used with the zero register. This makes the MachineInstruction
verifier happy again.
This is related to <rdar://problem/18027157>.
llvm-svn: 216040
Factor out the ADDS/SUBS instruction emission code into helper functions and
make the helper functions more clever to support most of the different ADDS/SUBS
instructions the architecture supports. This includes better immediate support,
shift folding, and sign-/zero-extend folding.
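For example (illustrative), an extend can now be folded into the operand:
uxth w9, w1 --> adds w0, w0, w1, uxth
adds w0, w0, w9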
This fixes <rdar://problem/17913111>.
llvm-svn: 216033
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.
Original commit message:
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.
For Example:
lsl x1, x1, #3 --> ldr x0, [x0, x1, lsl #3]
ldr x0, [x0, x1]
sxtw x1, w1
lsl x1, x1, #3 --> ldr x0, [x0, x1, sxtw #3]
ldr x0, [x0, x1]
llvm-svn: 216013
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.
Original commit message:
This change now materializes the value "0" from the zero register.
The zero register can be folded into several instructions, so no
materialization is needed at all.
Fixes <rdar://problem/17924413>.
llvm-svn: 216009
This fixes a few BuildMI call sites where the result register was added via
addReg, which by default adds the register as a use operand.
Also use the zero register as result register when emitting a compare
instruction (SUBS with unused result register).
llvm-svn: 215997
The floating-point value positive zero (+0.0) is a valid immediate value
according to isFPImmLegal. As a result AArch64 FastISel went ahead and
used the immediate version of fmov to materialize the constant.
The problem is that the immediate version of fmov cannot encode an immediate
for positive zero. Instead an fmov from the zero register was supposed to be
used in this case.
This fix adds handling for this special case and uses fmov from the zero
register to materialize a positive zero (negative zeroes go to the constant
pool).
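The intended materialization (illustrative):
fmov s0, wzr
fmov d0, xzr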
There is no test case for this, because this code is currently dead. It will be
enabled in a future commit and I will add a test case in a separate commit
after that.
This fixes <rdar://problem/18027157>.
llvm-svn: 215753
Note: This reapplies r215582 without any modifications. The refactoring wasn't
responsible for the buildbot failures.
Original commit message:
Cleanup and prepare constant materialization code for future commits.
llvm-svn: 215752
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."
llvm-svn: 215673
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.
<rdar://problem/17991614>
llvm-svn: 215600
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.
For Example:
lsl x1, x1, #3 --> ldr x0, [x0, x1, lsl #3]
ldr x0, [x0, x1]
sxtw x1, w1
lsl x1, x1, #3 --> ldr x0, [x0, x1, sxtw #3]
ldr x0, [x0, x1]
llvm-svn: 215597
This change now materializes the value "0" from the zero register.
The zero register can be folded into several instructions, so no
materialization is needed at all.
Fixes <rdar://problem/17924413>.
llvm-svn: 215591