llvm-project

Commit Graph

Author	SHA1	Message	Date
Reid Kleckner	c20276d0b2	[WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction Summary: Now that there is a one-to-one mapping from MachineFunction to WinEHFuncInfo, we don't need to use a DenseMap to select the right WinEHFuncInfo for the current funclet. The main challenge here is that X86WinEHStatePass is an IR pass that doesn't have access to the MachineFunction. I gave it its own WinEHFuncInfo object that it uses to calculate state numbers, which it then throws away. As long as nobody creates or removes EH pads between this pass and SDAG construction, we will get the same state numbers. The other thing X86WinEHStatePass does is to mark the EH registration node. Instead of communicating which alloca was the registration through WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic. This intrinsic generates no code and simply marks the alloca in use. Reviewers: JCTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14668 llvm-svn: 253378	2015-11-17 21:10:25 +00:00
Pat Gavlin	c8ea157811	Lower statepoints with multi-def targets. Statepoint lowering currently expects that the target method of a statepoint only defines a single value. This precludes using statepoints with ABIs that return values in multiple registers (e.g. the SysV AMD64 ABI). This change adds support for lowering statepoints with mutli-def targets. llvm-svn: 253339	2015-11-17 16:04:21 +00:00
James Molloy	bb1dbf530a	[SDAG] Fix expansion of BITREVERSE Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash. What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size. Use an APInt for the shift and VT.getScalarSizeInBits(). llvm-svn: 253023	2015-11-13 10:02:36 +00:00
Tom Stellard	0967c91e0c	Revert "Remove unnecessary call to getAllocatableRegClass" This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956	2015-11-12 21:43:25 +00:00
James Molloy	90111f79f9	[SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target. This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support. The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently). llvm-svn: 252878	2015-11-12 12:29:09 +00:00
Matthias Braun	b9610a6bc2	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization - Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al Differential Revision: http://reviews.llvm.org/D11172 llvm-svn: 252839	2015-11-12 01:02:47 +00:00
Geoff Berry	2ddfc5e60f	[DAGCombiner] Improve zextload optimization. Summary: Don't fold (zext (and (load x), cst)) -> (and (zextload x), (zext cst)) if (and (load x) cst) will match as a zextload already and has additional users. For example, the following IR: %load = load i32, i32* %ptr, align 8 %load16 = and i32 %load, 65535 %load64 = zext i32 %load16 to i64 store i32 %load16, i32* %dst1, align 4 store i64 %load64, i64* %dst2, align 8 used to produce the following aarch64 code: ldr w8, [x0] and w9, w8, #0xffff and x8, x8, #0xffff str w9, [x1] str x8, [x2] but with this change produces the following aarch64 code: ldrh w8, [x0] str w8, [x1] str x8, [x2] Reviewers: resistor, mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14340 llvm-svn: 252789	2015-11-11 19:42:52 +00:00
Matt Arsenault	d8fed1b793	Add target preference for GatherAllAliases max depth llvm-svn: 252775	2015-11-11 18:44:33 +00:00
Matt Arsenault	aa118e299c	LegalizeDAG: Implement promote for scalar_to_vector This allows avoiding the default Expand behavior which introduces stack usage. Bitcast the scalar and replace the missing elements with undef. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252632	2015-11-10 18:48:11 +00:00
Matt Arsenault	a46aa641f2	LegalizeDAG: Implement promote for insert_vector_elt This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252631	2015-11-10 18:48:08 +00:00
Matt Arsenault	0b7958a59b	LegalizeDAG: Implement promote for extract_vector_elt This is for AMDGPU to implement v2i64 extract as extract of half of a v4i32. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252630	2015-11-10 18:48:04 +00:00
Matt Arsenault	6d87f28afd	Remove unnecessary call to getAllocatableRegClass I'm not sure what the point of this was. I'm not sure why you would ever define an instruction that produces an unallocatable register class. No tests fail with this removed, and it seems like it should be a verifier error to define such an instruction. This was problematic for AMDGPU because it would make bad decisions by arbitrarily changing the register class when unsetting isAllocatable for VS_32/VS_64, which is currently set as a workaround to this problem. AMDGPU uses the VS_32/VS_64 register classes to represent operands which can use either VGPRs or SGPRs. When isAllocatable is unset for these, this would need to pick either the SGPR or VGPR class and insert either a copy we don't want, or an illegal copy we would need to deal with later. A semi-arbitrary register class ordering decision is made in tablegen, which resulted in always picking a VGPR class because it happens to have more registers than the SGPR register class. We really just want to use whatever register class the original register had. llvm-svn: 252565	2015-11-10 00:30:14 +00:00
Sanjay Patel	533c10c651	add a SelectionDAG method to check if no common bits are set in two nodes; NFCI This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539	2015-11-09 23:31:38 +00:00
David Majnemer	2652b75700	[WinEH] Don't emit CATCHRET from visitCatchPad Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528	2015-11-09 23:07:48 +00:00
Oliver Stannard	563585789c	[CodeGen] Always promote f16 if not legal We don't currently have any runtime library functions for operations on f16 values (other than conversions to and from f32 and f64), so we should always promote it to f32, even if that is not a legal type. In that case, the f32 values would be softened to f32 library calls. SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type, as it may ne a no-op or require a different library call. getCopyFromParts and getCopyToParts now need to cope with a floating-point value stored in a larger integer part, as is the case for any target that needs to store an f16 value in a 32-bit integer register. Differential Revision: http://reviews.llvm.org/D12856 llvm-svn: 252459	2015-11-09 11:03:18 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Tom Stellard	05691a678e	DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 llvm-svn: 252349	2015-11-06 21:58:37 +00:00
Reid Kleckner	6ddae31045	[WinEH] Fix funclet prologues with stack realignment We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210	2015-11-05 21:09:49 +00:00
Igor Laevsky	35fe692025	[StatepointLowering] Remove distinction between call and invoke safepoints There is no point in having invoke safepoints handled differently than the call safepoints. All relevant decisions could be made by looking at whether or not gc.result and gc.relocate lay in a same basic block. This change will allow to lower call safepoints with relocates and results in a different basic blocks. See test case for example. Differential Revision: http://reviews.llvm.org/D14158 llvm-svn: 252028	2015-11-04 01:16:10 +00:00
Simon Pilgrim	191ac7c679	[SelectionDAG] Use existing constant nodes instead of recreating them. NFC. llvm-svn: 251990	2015-11-03 22:21:38 +00:00
James Y Knight	646c4032e7	Fix two issues in MergeConsecutiveStores: 1) PR25154. This is basically a repeat of PR18102, which was fixed in r200201, and broken again by r234430. The latter changed which of the store nodes was merged into from the first to the last. Thus, we now also need to prefer merging a later store at a given address into the target node, instead of an earlier one. 2) While investigating that, I also realized I'd introduced a bug in r236850. There, I removed a check for alignment -- not realizing that nothing except the alignment check was ensuring that none of the stores were overlapping! This is a really bogus way to ensure there's no aliased stores. A better solution to both of these issues is likely to always use the code added in the 'if (UseAA)' branches which rearrange the chain based on a more principled analysis. I'll look into whether that can be used always, but in the interest of getting things back to working, I think a minimal change makes sense. llvm-svn: 251816	2015-11-02 18:48:08 +00:00
Sanjoy Das	1d1929aace	[ValueTracking] Use !range metadata more aggressively in KnownBits Summary: Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more aggressively. Reviewers: majnemer, nlewycky, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14100 llvm-svn: 251487	2015-10-28 03:20:15 +00:00
Sanjoy Das	4ff3cf6d92	[SelectionDAG] Don't inspect !range metadata for extended loads Summary: Don't call `computeKnownBitsFromRangeMetadata` for extended loads -- this can cause a mismatch between the width of the !range metadata and the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the future). This isn't a problem now, but will be after a future change. Note: this can be made more aggressive in the future. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14107 llvm-svn: 251486	2015-10-28 03:20:10 +00:00
James Y Knight	14eedd189b	Make the SelectionDAG graph printer use SDNode::PersistentId labels. r248010 changed the -debug output to use short ids, but did not similarly modify the graph printer. Change to be consistent, for ease of cross-reference. llvm-svn: 251465	2015-10-27 23:09:03 +00:00
Sanjay Patel	bbd4c79c8f	Use the 'arcp' fast-math-flag when combining repeated FP divisors This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 llvm-svn: 251450	2015-10-27 20:27:25 +00:00
Cong Hou	07eeb8001e	Create a new interface addSuccessorWithoutWeight(MBB) in MBB to add successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But that the weight list is empty doesn't mean disabled optimization (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimizations is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size of the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429	2015-10-27 17:59:36 +00:00
Mehdi Amini	891c0973df	Do not use "else" when both branches return (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251398	2015-10-27 08:12:08 +00:00
Steve King	fee370be72	Fix llc crash processing S/UREM for -Oz builds caused by rL250825. When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 llvm-svn: 251373	2015-10-27 00:14:06 +00:00
Michael Kuperstein	eaa16005af	[X86] Use correct calling convention for MCU psABI libcalls When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223	2015-10-25 08:14:05 +00:00
Simon Pilgrim	3448cbcc51	[DAGCombiner] Tidy up ConstantFP commutation. NFCI Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code. llvm-svn: 251203	2015-10-24 20:06:18 +00:00
Simon Pilgrim	7430804fe1	[DAGCombiner] Generalize masking of constant rotates. We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197	2015-10-24 18:44:52 +00:00
Simon Pilgrim	d5ef318b5b	[X86][XOP] Add support for lowering vector rotations This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188	2015-10-24 13:17:26 +00:00
Davide Italiano	fbb958c24b	[CodeGen] Remove usage of NDEBUG in header. Moreover, this seems unused. llvm-svn: 251081	2015-10-23 00:17:40 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Zia Ansari	8f509a7044	[X86] - Catch extra combine opportunities for redundant imuls. When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 llvm-svn: 251028	2015-10-22 16:14:45 +00:00
Matt Arsenault	29f9663f97	LegalizeDAG: Implement promote for build_vector This will be used in future commits for AMDGPU to promote operations on i64 vectors into operations on 32-bit vector components. This will be used / tested in future AMDGPU commits. llvm-svn: 250945	2015-10-21 21:10:10 +00:00
Artyom Skrobov	c736863a85	Two switch blocks in VectorLegalizer::LegalizeOp already have a default: llvm_unreachable("This action is not supported yet!"); -- so I'm adding one to the third switch block, too. This is a follow-up fix for http://reviews.llvm.org/D13862 llvm-svn: 250830	2015-10-20 15:06:37 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Artyom Skrobov	b844fa7fc0	Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 llvm-svn: 250825	2015-10-20 13:06:02 +00:00
Owen Anderson	faf5187ee0	Restore the original behavior of SelectionDAG::getTargetIndex(). It looks like an extra negation snuck in as apart of restoring it. llvm-svn: 250726	2015-10-19 19:27:40 +00:00
Benjamin Kramer	2002aadaad	Put back SelectionDAG::getTargetIndex. While technically this is untested dead code, it has out-of-tree users. This reverts a part of r250434. llvm-svn: 250717	2015-10-19 18:26:16 +00:00
Simon Pilgrim	04d52d26f6	Use SDValue bool check. NFCI. llvm-svn: 250653	2015-10-18 12:33:54 +00:00
Simon Pilgrim	c2c154e078	Move one-use variable inside test. NFC. llvm-svn: 250651	2015-10-18 11:47:23 +00:00
Simon Pilgrim	24057b9566	[DAG] Ensure vector constant folding uses correct scalar undef types Minor fix to D13665 found during post-commit review. llvm-svn: 250616	2015-10-17 16:49:43 +00:00
Joseph Tremoulet	55b51e9dcc	[WinEH] Fix eh.exceptionpointer intrinsic lowering Summary: Some shared code for handling eh.exceptionpointer and eh.exceptioncode needs to not share the part that truncates to 32 bits, which is intended just for exception codes. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13747 llvm-svn: 250588	2015-10-17 00:08:08 +00:00
Benjamin Kramer	bacc7ba7aa	[SelectionDAG] Remove dead code. NFC. Carefully selected parts without deleting graph stuff and dumping methods. llvm-svn: 250434	2015-10-15 17:54:06 +00:00
Artyom Skrobov	4bca0bb010	A doccomment for CombineTo, and some NFC refactorings Summary: Caching SDLoc(N), instead of recreating it in every single function call, keeps the code denser, and allows to unwrap long lines. Reviewers: sunfish, atrick, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13726 llvm-svn: 250305	2015-10-14 17:18:35 +00:00
Artyom Skrobov	a5b9ad22b3	Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC) Summary: The two implementations had more code in common than not. Reviewers: sunfish, MatzeB, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13724 llvm-svn: 250302	2015-10-14 16:54:14 +00:00
Duncan P. N. Exon Smith	e400a7d412	SelectionDAG: Remove implicit ilist iterator conversions, NFC llvm-svn: 250214	2015-10-13 19:47:46 +00:00
Matt Arsenault	e5d9515fb7	DAGCombiner: Don't stop finding better chain on 2 aliases The comment says this was stopped because it was unlikely to be profitable. This is not true if you want to combine vector loads with multiple components. For a simple case that looks like t0 = load t0 ... t1 = load t0 ... t2 = load t0 ... t3 = load t0 ... t4 = store t0:1, t0:1 t5 = store t4, t1:0 t6 = store t5, t2:0 t7 = store t6, t3:0 We want to get all of these stores onto a chain that is a TokenFactor of these N loads. This mostly solves the AMDGPU merge-stores.ll regressions with -combiner-alias-analysis for merging vector stores of vector loads. llvm-svn: 250138	2015-10-13 00:49:00 +00:00

1 2 3 4 5 ...

7295 Commits