llvm-project

Commit Graph

Author	SHA1	Message	Date
Amaury Sechet	d90f5f6698	Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one. Summary: As per title. This will add the instructions we are interested in to the worklist. Reviewers: mehdi_amini, majnemer, andreadb Differential Revision: https://reviews.llvm.org/D29081 llvm-svn: 292957	2017-01-24 17:48:25 +00:00
Amaury Sechet	5da456e6a1	Fix formatting in foldSelectCttzCtlz. NFC llvm-svn: 292934	2017-01-24 14:22:27 +00:00
Simon Pilgrim	78f8630ac0	[InstCombine][X86] MULDQ/MULUDQ undef -> zero Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst llvm-svn: 292913	2017-01-24 11:07:41 +00:00
Matt Arsenault	954a624fb9	SimplifyLibCalls: Replace more unary libcalls with intrinsics llvm-svn: 292855	2017-01-23 23:55:08 +00:00
Simon Pilgrim	f6f3a36159	[InstCombine][X86] Add MULDQ/MULUDQ constant folding support llvm-svn: 292793	2017-01-23 15:22:59 +00:00
Simon Pilgrim	bb13fdabec	[InstCombine][X86] MULDQ/MULUDQ undef -> zero Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same llvm-svn: 292784	2017-01-23 12:07:32 +00:00
Sanjay Patel	478a83c905	[InstCombine] use m_APInt to allow ashr folds for vectors with splat constants We may be able to assert that no shl-shl or lshr-lshr pairs ever get here because we should have already handled those in foldShiftedShift(). llvm-svn: 292726	2017-01-21 17:59:59 +00:00
Simon Pilgrim	a50a93fcd0	[InstCombine][X86] Add MULDQ/MULUDQ undef handling llvm-svn: 292627	2017-01-20 18:20:30 +00:00
Simon Pilgrim	51b3b98e3a	[InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions Simplify a packss/packus truncation based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28777 llvm-svn: 292591	2017-01-20 09:28:21 +00:00
Davide Italiano	2ef8c4e708	[InstCombine] Simplify gep (gep p, a), (b-a) Patch by Andrea Canciani. Differential Revision: https://reviews.llvm.org/D27413 llvm-svn: 292506	2017-01-19 18:51:56 +00:00
Sanjay Patel	291c3d8ff2	[InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1 Try harder to fold icmp with shl nsw as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html This is similar to the 'shl nuw' transforms that were added with D25913. This may eventually help solve: https://llvm.org/bugs/show_bug.cgi?id=30773 Differential Revision: https://reviews.llvm.org/D28406 llvm-svn: 292492	2017-01-19 16:12:10 +00:00
Sanjay Patel	ae23d65a7d	[InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI llvm-svn: 292440	2017-01-18 21:16:12 +00:00
Sanjay Patel	589de5ea4e	[InstCombine] remove a redundant check; NFCI I missed deleting this check when I refactored this chunk in: https://reviews.llvm.org/rL292260 llvm-svn: 292433	2017-01-18 20:09:59 +00:00
Simon Pilgrim	fe2c0ed4cf	[InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292371	2017-01-18 14:47:49 +00:00
Simon Pilgrim	a22c3a1c0f	[InstCombine] Remove unnecessary intrinsics demanded elts handling As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need. llvm-svn: 292365	2017-01-18 13:44:04 +00:00
Sanjay Patel	14715b3c2a	[InstCombine] refactor foldICmpShlConstant(); NFCI This reduces the size of and increases the symmetry with the planned functional change in: https://reviews.llvm.org/D28406 llvm-svn: 292260	2017-01-17 21:25:16 +00:00
David Majnemer	de55c606d1	[InstCombine] Fold ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) This further extends r292179 to support additional binary operators beyond subtraction. llvm-svn: 292238	2017-01-17 18:08:06 +00:00
Sanjay Patel	5424bd2625	[InstCombine] reduce indent; NFCI llvm-svn: 292230	2017-01-17 16:59:09 +00:00
Simon Pilgrim	d4eb800b03	[InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292209	2017-01-17 11:35:03 +00:00
Sanjoy Das	679bc32c6a	[InstCombine] Don't DSE across readnone functions that may throw Summary: Depends on D28740 Reviewers: dberlin, chandlerc, hfinkel, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D28742 llvm-svn: 292197	2017-01-17 05:45:09 +00:00
David Majnemer	36d382b773	[InstCombine] Fold ((C1-zext(X)) & C2) -> zext((C1-X) & C2) This is valid if C2 fits within the bitwidth of X thanks to two's complement modulo arithmetic. llvm-svn: 292179	2017-01-17 00:45:57 +00:00
Matt Arsenault	7233344c28	SimplifyLibCalls: Replace fabs libcalls with intrinsics Add missing fabs(fpext) optimization that worked with the call, and also fixes it creating a second fpext when there were multiple uses. llvm-svn: 292172	2017-01-17 00:10:40 +00:00
Sanjay Patel	da5682afdd	[InstCombine] use m_APInt instead of faking it llvm-svn: 292164	2017-01-16 21:24:41 +00:00
Sanjay Patel	65cce20caa	[InstCombine] fix names in canEvaluateShiftedShift(); NFC It's not clear what 'First' and 'Second' mean, so use 'Inner' and 'Outer' to match foldShiftedShift() and add comments with formulas, so it's easier to see what's going on. llvm-svn: 292153	2017-01-16 20:05:26 +00:00
Sanjay Patel	ab8b32de71	[InstCombine] use m_APInt to allow shift-shift folds for vectors with splat constants Some existing 'FIXME' tests are still not folded because of splat holes in value tracking. llvm-svn: 292151	2017-01-16 19:35:45 +00:00
Sanjay Patel	646734a6cd	[InstCombine] refactor shift-of-shift folds; NFCI Reduces code duplication and makes it easier to extend these folds for vectors. llvm-svn: 292145	2017-01-16 17:27:50 +00:00
Simon Pilgrim	73a68c25a0	[InstCombine][SSE] Add DemandedElts support for PSHUFB instructions Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28745 llvm-svn: 292101	2017-01-16 11:30:41 +00:00
Sanjay Patel	20aaf58543	[InstCombine] fix formatting; NFC llvm-svn: 292073	2017-01-15 17:55:35 +00:00
Sanjay Patel	5f8451afad	[InstCombine] use m_APInt to allow ashr folds for vectors with splat constants llvm-svn: 292064	2017-01-15 16:38:19 +00:00
Chandler Carruth	ca68a3ec47	[PM] Introduce an analysis set used to preserve all analyses over a function's CFG when that CFG is unchanged. This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could. Differential Revision: https://reviews.llvm.org/D28627 llvm-svn: 292054	2017-01-15 06:32:49 +00:00
Chandler Carruth	2f19a324cb	[PM] The assumption cache is fundamentally designed to be self-updating, mark it as never invalidated in the new PM. The old PM already required this to work, and after a discussion with Hal this seems to really be the only sensible answer. The cache gracefully degrades as the IR is mutated, and most things which do this should already be incrementally updating the cache. This gets rid of a bunch of logic preserving and testing the invalidation of this analysis. llvm-svn: 292039	2017-01-15 00:26:18 +00:00
Chandler Carruth	5edfd4d99e	[PM] Fix instcombine's analysis preservation in the new pass manager to cover domtree and alias analysis. These are the pretty clear analyses that we would always want to survive this pass. To make these survive, we also need to preserve the assumption cache. Added a test that verifies the important bits of this preservation. llvm-svn: 292037	2017-01-14 23:25:22 +00:00
Sanjay Patel	ca3124f74b	[InstCombine] clean up visitAshr(); NFCI llvm-svn: 292036	2017-01-14 23:13:50 +00:00
Sanjay Patel	40f401776b	[InstCombine] optimize unsigned icmp of increment Allows LLVM to optimize sequences like the following: %add = add nuw i32 %x, 1 %cmp = icmp ugt i32 %add, %y Into: %cmp = icmp uge i32 %x, %y Previously, only signed comparisons were being handled. Decrements could also be handled, but 'sub nuw %x, 1' is currently canonicalized to 'add %x, -1' in InstCombineAddSub, losing the nuw flag. Removing that canonicalization seems like it might have far-reaching ramifications so I kept this simple for now. Patch by Matti Niemenmaa! Differential Revision: https://reviews.llvm.org/D24700 llvm-svn: 291975	2017-01-13 23:25:46 +00:00
Sanjay Patel	2d4b456427	[InstCombine] use m_APInt to allow lshr folds for vectors with splat constants llvm-svn: 291972	2017-01-13 23:04:10 +00:00
Sanjay Patel	acd24c7b6a	[InstCombine] use 'match' and other clean-up; NFCI llvm-svn: 291937	2017-01-13 18:52:10 +00:00
Sanjay Patel	b22f6c5f26	[InstCombine] use m_APInt to allow shl folds for vectors with splat constants llvm-svn: 291934	2017-01-13 18:39:09 +00:00
Sanjay Patel	cf08203105	[InstCombine] use Op0/Op1 local variables more consistently with shifts; NFC llvm-svn: 291923	2017-01-13 18:08:25 +00:00
Sanjay Patel	5178363687	[InstCombine] if the condition of a select may be known via assumes, eliminate the select This is a limited solution for PR31512: https://llvm.org/bugs/show_bug.cgi?id=31512 The motivation is that we will need to increase usage of llvm.assume and/or metadata to solve PR28430: https://llvm.org/bugs/show_bug.cgi?id=28430 ...and this kind of simplification is needed to take advantage of that extra information. The 'not' test case would be handled by: https://reviews.llvm.org/D28485 Differential Revision: https://reviews.llvm.org/D28337 llvm-svn: 291915	2017-01-13 17:02:42 +00:00
Robert Lougher	f5df7a18dd	[DebugInfo] Add const to DILocation variable declaration; NFC. llvm-svn: 291785	2017-01-12 18:29:28 +00:00
Hal Finkel	8a9a783f2c	Make processing @llvm.assume more efficient - Add affected values to the assumption cache Here's my second try at making @llvm.assume processing more efficient. My previous attempt, which leveraged operand bundles, r289755, didn't end up working: it did make assume processing more efficient but eliminating the assumption cache made ephemeral value computation too expensive. This is a more-targeted change. We'll keep the assumption cache, but extend it to keep a map of affected values (i.e. values about which an assumption might provide some information) to the corresponding assumption intrinsics. This allows ValueTracking and LVI to find assumptions relevant to the value being queried without scanning all assumptions in the function. The fact that ValueTracking started doing O(number of assumptions in the function) work, for every known-bits query, has become prohibitively expensive in some cases. As discussed during the review, this is a pragmatic fix that, longer term, will likely be replaced by a more-principled solution (perhaps based on an extended SSA form). Differential Revision: https://reviews.llvm.org/D28459 llvm-svn: 291671	2017-01-11 13:24:24 +00:00
Sanjay Patel	db0938fd9a	[InstCombine] add a wrapper for a common pair of transforms; NFCI Some of the callers are artificially limiting this transform to integer types; this should make it easier to incrementally remove that restriction. llvm-svn: 291620	2017-01-10 23:49:07 +00:00
Matt Arsenault	3f509042b0	InstCombine: Set operands instead of creating new call llvm-svn: 291612	2017-01-10 23:17:52 +00:00
Matt Arsenault	fdb78f8bae	InstCombine: fdiv -x, -y -> fdiv x, y llvm-svn: 291611	2017-01-10 23:08:54 +00:00
Sanjay Patel	940c06188e	fix comment typos; NFC llvm-svn: 291447	2017-01-09 16:27:56 +00:00
Matt Arsenault	3bdd75d01e	InstCombine: Fold cos(-x) -> cos(x) Also cos(fabs(x)) -> cos(x) llvm-svn: 291022	2017-01-04 22:49:03 +00:00
David Majnemer	cb892e9066	[InstCombine] Move casts around shift operations It is possible to perform a left shift before zero extending if the shift would only shift out zeros. llvm-svn: 290928	2017-01-04 02:21:34 +00:00
David Majnemer	022d2a563b	[InstCombine] Combine adds across a zext We can perform the following: (add (zext (add nuw X, C1)), C2) -> (zext (add nuw X, C1+C2)) This is only possible if C2 is negative and C2 is greater than or equal to negative C1. llvm-svn: 290927	2017-01-04 02:21:31 +00:00
Matt Arsenault	56ff4839ae	InstCombine: Fold fabs on select of constants llvm-svn: 290913	2017-01-03 22:40:34 +00:00
Sanjay Patel	f0d1e77373	[InstCombine] use 'match' to reduce code bloat; NFCI I wrote this patch before seeing the comment in: https://reviews.llvm.org/D27114 ...that suggests we should actually be canonicalizing the other way. So just in case we decide this is the right way, we might as well have a cleaner implementation. llvm-svn: 290912	2017-01-03 22:25:31 +00:00
Matt Arsenault	b264c94963	InstCombine: Add fma with constant transforms DAGCombine already does these. llvm-svn: 290860	2017-01-03 04:32:35 +00:00
Matt Arsenault	1cc294c85d	InstCombine: Add fma + fabs/fneg transforms fma (fneg x), (fneg y), z -> fma x, y, z fma (fabs x), (fabs x), z -> fma x, x, z llvm-svn: 290859	2017-01-03 04:32:31 +00:00
Sanjay Patel	b38ad88e9f	[InstCombine] use combineMetadataForCSE instead of copying it; NFCI llvm-svn: 290844	2017-01-02 23:25:28 +00:00
Craig Topper	d00db69227	[InstCombine][AVX-512] Teach InstCombine that llvm.x86.avx512.vcomi.sd and llvm.x86.avx512.vcomi.ss don't use the upper elements of their input. This was already done for the SSE/SSE2 version of the intrinsics. llvm-svn: 290776	2016-12-31 00:45:06 +00:00
Craig Topper	991636312b	[InstCombine][AVX-512] When turning intrinsics with masking into native IR, don't emit a select if the mask is known to be all ones. This saves InstCombine the burden of having to optimize the select later. llvm-svn: 290774	2016-12-30 23:06:28 +00:00
David Majnemer	5ec5f278c9	[InstCombine] Address post-commit feedback llvm-svn: 290741	2016-12-30 03:36:17 +00:00
David Majnemer	a1cfd7c5f8	[InstCombine] More thoroughly canonicalize the position of zexts We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y)) where possible. However, we didn't perform the same canonicalization for zexts or for muls. llvm-svn: 290733	2016-12-30 00:28:58 +00:00
Craig Topper	17b5568bc7	[InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC llvm-svn: 290707	2016-12-29 07:03:18 +00:00
Craig Topper	62f06e241b	[InstCombine] Fix typo in comment. NFC llvm-svn: 290706	2016-12-29 05:38:31 +00:00
Craig Topper	2e18bcfc60	[InstCombine] Use 32 bits instead of 64 bits for storing the number of elements in VectorType for a ShuffleVector. While there, use getVectorNumElements to avoid an explicit cast. NFC llvm-svn: 290705	2016-12-29 04:24:32 +00:00
Craig Topper	1a8a3377cc	[InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner. We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since it's now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded. llvm-svn: 290704	2016-12-29 03:30:17 +00:00
Craig Topper	28ec3460e4	[InstCombine] Remove a piece of a comment that said that InstCombiner contains pass infrastructure. That hasn't been true since r226618. NFC llvm-svn: 290648	2016-12-28 03:12:42 +00:00
Michael Kuperstein	cd7ad7130f	[InstCombine] Canonicalize insert splat sequences into an insert + shuffle This adds a combine that canonicalizes a chain of inserts which broadcasts a value into a single insert + a splat shufflevector. This fixes PR31286. Differential Revision: https://reviews.llvm.org/D27992 llvm-svn: 290641	2016-12-28 00:18:08 +00:00
Craig Topper	72f2d4e8d6	[InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use. This builds on r290554 which added support for 128 and 256-bit. llvm-svn: 290582	2016-12-27 05:30:09 +00:00
Craig Topper	7f8540b5e7	[AVX-512][InstCombine] Teach InstCombine to turn masked scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION. An earlier commit added support for unmasked scalar operations. At that time isel wouldn't generate an optimal sequence for masked operations, but that has now been fixed. llvm-svn: 290566	2016-12-27 01:56:30 +00:00
Craig Topper	020b228155	[AVX-512][InstCombine] Teach InstCombine to turn packed add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION. llvm-svn: 290559	2016-12-27 00:23:16 +00:00
Simon Pilgrim	c9cf7fc7a4	[InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use. Differential Revision: https://reviews.llvm.org/D28119 llvm-svn: 290554	2016-12-26 23:28:17 +00:00
Craig Topper	7b788ada2d	[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION. Summary: I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon. I'll do the same thing for packed add/sub/mul/div in a future patch. Reviewers: delena, RKSimon, zvi, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27879 llvm-svn: 290535	2016-12-26 06:33:19 +00:00
Craig Topper	e328045711	[AVX-512][InstCombine] Teach InstCombine to convert masked vpermv intrinsics into shufflevector instructions Summary: This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants. We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that. Reviewers: zvi, delena, spatel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27825 llvm-svn: 290530	2016-12-25 23:58:57 +00:00
David Majnemer	b0761a0c1b	Revert "[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp" This reverts commit r289813, it caused PR31449. llvm-svn: 290266	2016-12-21 19:21:59 +00:00
George Burgess IV	3f08914e7e	[Analysis] Centralize objectsize lowering logic. We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214	2016-12-20 23:46:36 +00:00
Sanjay Patel	5a443ac000	[InstCombine] use commutative matcher for pattern with commutative operators This is a case that was missed in: https://reviews.llvm.org/rL290067 ...and it would regress if we fix operand complexity (PR28296). llvm-svn: 290127	2016-12-19 18:35:37 +00:00
Sanjay Patel	dd46b52942	[InstCombine] add folds for icmp (umin|umax X, Y), X This is a follow-up to: https://reviews.llvm.org/rL289855 (https://reviews.llvm.org/D27531) https://reviews.llvm.org/rL290111 llvm-svn: 290118	2016-12-19 17:32:37 +00:00
Sanjay Patel	8296c6c96f	[InstCombine] add folds for icmp (smax X, Y), X This is a follow-up to: https://reviews.llvm.org/rL289855 (D27531) llvm-svn: 290111	2016-12-19 16:28:53 +00:00
Daniel Jasper	aec2fa352f	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Sanjay Patel	2b9d4b4daf	[InstCombine] use commutative matchers for patterns with commutative operators Background/motivation - I was circling back around to: https://llvm.org/bugs/show_bug.cgi?id=28296 I made a simple patch for that and noticed some regressions, so added test cases for those with rL281055, and this is hopefully the minimal fix for just those cases. But as you can see from the surrounding untouched folds, we are missing commuted patterns all over the place, and of course there are no regression tests to cover any of those cases. We could sprinkle "m_c_" dust all over this file and catch most of the missing folds, but then we still wouldn't have test coverage, and we'd still miss some fraction of commuted patterns because they require adjustments to the match order. I'm aware of the concern about the potential compile-time performance impact of adding matches like this (currently being discussed on llvm-dev), but I don't think there's any evidence yet to suggest that handling commutative pattern matching more thoroughly is not a worthwhile goal of InstCombine. Differential Revision: https://reviews.llvm.org/D24419 llvm-svn: 290067	2016-12-18 18:49:48 +00:00
Craig Topper	e32b5fd7f9	[InstCombine] Simplify code slightly. NFC llvm-svn: 290046	2016-12-17 18:10:04 +00:00
Sanjay Patel	d640641a61	[InstCombine] add folds for icmp (smin X, Y), X Min/max canonicalization (r287585) exposes the fact that we're missing combines for min/max patterns. This patch won't solve the example that was attached to that thread, so something else still needs fixing. The line between InstCombine and InstSimplify gets blurry here because sometimes the icmp instruction that we want to fold to already exists, but sometimes it's the swapped form of what we want. Corresponding changes for smax/umin/umax to follow. Differential Revision: https://reviews.llvm.org/D27531 llvm-svn: 289855	2016-12-15 19:13:37 +00:00
Ehsan Amiri	795b0671c5	[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp A number of new patterns for simplifying and/xor of icmp: (icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true: 1- (%x = and %a, %mask) and (%y = and %b, %mask) 2- %mask is a power of 2. (icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true: 1- (%x = and %a, %mask1) and (%y = and %b, %mask2) 2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any %s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t. For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8 violates condition (2) above. So this optimization cannot be applied. llvm-svn: 289813	2016-12-15 12:25:13 +00:00
Craig Topper	ab5f355d8c	[AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts. llvm-svn: 289759	2016-12-15 03:49:45 +00:00
Hal Finkel	3ca4a6bcf1	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Hal Finkel	cb9f78e1c3	Make processing @llvm.assume more efficient by using operand bundles There was an efficiency problem with how we processed @llvm.assume in ValueTracking (and other places). The AssumptionCache tracked all of the assumptions in a given function. In order to find assumptions relevant to computing known bits, etc. we searched every assumption in the function. For ValueTracking, that means that we did O(#assumes * #values) work in InstCombine and other passes (with a constant factor that can be quite large because we'd repeat this search at every level of recursion of the analysis). Several of us discussed this situation at the last developers' meeting, and this implements the discussed solution: Make the values that an assume might affect operands of the assume itself. To avoid exposing this detail to frontends and passes that need not worry about it, I've used the new operand-bundle feature to add these extra call "operands" in a way that does not affect the intrinsic's signature. I think this solution is relatively clean. InstCombine adds these extra operands based on what ValueTracking, LVI, etc. will need and then those passes need only search the users of the values under consideration. This should fix the computational-complexity problem. At this point, no passes depend on the AssumptionCache, and so I'll remove that as a follow-up change. Differential Revision: https://reviews.llvm.org/D27259 llvm-svn: 289755	2016-12-15 02:53:42 +00:00
Robert Lougher	cfd7198698	[InstCombine] Folding of a compare with RHS const should merge debug locations If all the operands to a phi node are compares that have a RHS constant, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new op should be the merged debug locations of the phi node arguments. Patch 8 of 8 for D26256. Folding of a compare that has a RHS constant. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289704	2016-12-14 20:27:22 +00:00
Robert Lougher	c9f7354776	[InstCombine] Folding of a binop with RHS const should merge the debug locations If all the operands to a phi node are a binop with a RHS constant, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new op should be the merged debug locations of the phi node arguments. Patch 7 of 8 for D26256. Folding of a binop with RHS constant. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289699	2016-12-14 20:07:49 +00:00
Robert Lougher	f02d9b8325	[InstCombine] When folding casts through a phi node merge the debug locations If all the operands to a phi node are a cast, instcombine will try to pull them through the phi node, combining them into a single cast. When it does this, the debug location of the new cast should be the merged debug locations of the phi node arguments. Patch 6 of 8 for D26256. Folding of a cast operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289693	2016-12-14 19:24:01 +00:00
Robert Lougher	373e36a410	[InstCombine] Folding loads through a phi node should merge the debug locations If all the operands to a phi node are a load, instcombine will try to pull them through the phi node, combining them into a single load. When it does this, the debug location of the new load should be the merged debug locations of the phi node arguments. Patch 5 of 8 for D26256. Folding of a load operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289688	2016-12-14 19:02:14 +00:00
Robert Lougher	8fc1e89bbb	[InstCombine] When folding GEP through a phi node merge the debug locations If all the operands to a phi node are getelementptr, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new getelementptr should be the merged debug locations of the phi node arguments. Patch 4 of 8 for D26256. Folding of a getelementptr operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289684	2016-12-14 18:37:50 +00:00
Robert Lougher	4b0790d488	[InstCombine] Merge debug locations when folding through a phi node If all the operands to a phi node are of the same operation, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the operation should be the merged debug locations of the phi node arguments. Patch 3 of 8 for D26256. Folding of a compare operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289681	2016-12-14 18:14:57 +00:00
Robert Lougher	2428a4050f	[InstCombine] Merge debug locations when folding through a phi node If all the operands to a phi node are of the same operation, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the operation should be the merged debug locations of the phi node arguments. Patch 2 of 8 for D26256. Folding of a binary operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289679	2016-12-14 17:49:19 +00:00
Stephan Bergmann	17c7f70362	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Craig Topper	aeaa52cc11	[X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics. llvm-svn: 289639	2016-12-14 07:46:12 +00:00
Craig Topper	268b3abe6d	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better. Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking. llvm-svn: 289636	2016-12-14 06:06:58 +00:00
Craig Topper	dfd268d76b	[X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts. Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly. llvm-svn: 289632	2016-12-14 05:43:05 +00:00
Craig Topper	eb6a20e79e	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0. Also calculate UndefElts correctly. Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support. llvm-svn: 289629	2016-12-14 03:17:30 +00:00
Craig Topper	a0372dec26	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Also calculate UndefElts correctly. Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support. llvm-svn: 289628	2016-12-14 03:17:27 +00:00
Craig Topper	ac75bca1eb	[X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly. Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed. Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support. llvm-svn: 289523	2016-12-13 07:45:45 +00:00
Sanjay Patel	e730ce87a5	[InstCombine] fix bug when offsetting case values of a switch (PR31260) We could truncate the condition and then try to fold the add into the original condition value causing wrong case constants to be used. Move the offset transform ahead of the truncate transform and return after each transform, so there's no chance of getting confused values. Fix for: https://llvm.org/bugs/show_bug.cgi?id=31260 llvm-svn: 289442	2016-12-12 16:13:52 +00:00
Sanjay Patel	87e2f677d7	[InstCombine] clean up range-for-loops in visitSwitchInst(); NFCI llvm-svn: 289439	2016-12-12 15:52:56 +00:00
Craig Topper	7fc6d34ed1	[InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. So we should return a zero vector instead. llvm-svn: 289411	2016-12-11 22:32:38 +00:00
Craig Topper	23ebd9564f	[X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts. This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them. llvm-svn: 289377	2016-12-11 08:54:52 +00:00
Craig Topper	61b280e7b0	[X86][InstCombine] Teach InstCombineCalls to simplify demanded elements for scalar FMA intrinsics. These intrinsics don't read the upper bits of their second and third inputs so we can try to simplify them. llvm-svn: 289372	2016-12-11 07:42:06 +00:00
Craig Topper	d96395365a	[AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded for scalar cmp intrinsics with masking and rounding. These intrinsics don't read the upper elements of their first and second input. These are slightly different from the SSE version which does use the upper bits of its first element as passthru bits since the result goes to an XMM register. For AVX-512 the result goes to a mask register instead. llvm-svn: 289371	2016-12-11 07:42:04 +00:00
Craig Topper	790d0fa569	[AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded elements for scalar add,div,mul,sub,max,min intrinsics with masking and rounding. These intrinsics don't read the upper bits of their second input. And the third input is the passthru for masking and that only uses the lower element as well. llvm-svn: 289370	2016-12-11 07:42:01 +00:00
Craig Topper	58917f3508	[AVX-512][InstCombine] Add 512-bit vpermilvar intrinsics to InstCombineCalls to match 128 and 256-bit. llvm-svn: 289354	2016-12-11 01:59:36 +00:00
Craig Topper	9a63d7ade5	[X86][InstCombine] Teach InstCombineCalls to turn pshufb intrinsic into a shufflevector if the indices are constant. llvm-svn: 289348	2016-12-11 00:23:50 +00:00
Sanjay Patel	4c48bbe94d	[InstCombine] add helper for shift-by-shift folds; NFCI These are currently limited to integer types, but we should be able to extend to splat vectors and possibly general vectors. llvm-svn: 289343	2016-12-10 22:16:29 +00:00
Sanjay Patel	b7f8cb698c	[InstCombine] change select type to eliminate bitcasts This solves a secondary problem seen in PR6137: https://llvm.org/bugs/show_bug.cgi?id=6137#c6 This is similar to the bitwise logic op fold added with: https://reviews.llvm.org/rL287707 And like that patch, I'm artificially restricting the transform from vector <-> scalar types until we're sure that the backend can handle that. llvm-svn: 288584	2016-12-03 15:25:16 +00:00
Peter Collingbourne	ab85225be4	IR: Change the gep_type_iterator API to avoid always exposing the "current" type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458	2016-12-02 02:24:42 +00:00
Philip Reames	89e92d21b4	[PR29121] Don't fold if it would produce atomic vector loads or stores The instcombine code which folds loads and stores into their use types can trip up if the use is a bitcast to a type which we can't directly load or store in the IR. In principle, such types shouldn't exist, but in practice they do today. This is a workaround to avoid a bug while we work towards the long term goal. Differential Revision: https://reviews.llvm.org/D24365 llvm-svn: 288415	2016-12-01 20:17:06 +00:00
Sanjay Patel	aa8b28e509	[InstCombine] allow more narrowing transforms for logic ops We had a limited version of this for scalar 'and'; this expands the transform to 'or' and 'xor' and allows vectors types too. llvm-svn: 288273	2016-11-30 20:48:54 +00:00
Sanjay Patel	8ca30ab0c5	[InstSimplify] allow integer vector types to use computeKnownBits Note that the non-splat lshr+lshr test folded, but that does not work in general. Something is missing or wrong in computeKnownBits as the non-splat shl+shl test still shows. llvm-svn: 288005	2016-11-27 21:07:28 +00:00
Sanjay Patel	8bd69b7ed9	[InstCombine] don't drop metadata in FoldOpIntoSelect() llvm-svn: 287980	2016-11-26 15:23:20 +00:00
Sanjay Patel	91e73a7bfa	add optional param to copy metadata when creating selects; NFC There are other spots where we can use this; we're currently dropping metadata in some places, and there are proposed changes where we will want to propagate metadata. IRBuilder's CreateSelect() already has a parameter like this, so this change makes the regular 'Create' API line up with that. llvm-svn: 287976	2016-11-26 15:01:59 +00:00
David Majnemer	d5648c7a7d	Replace some callers of setTailCall with setTailCallKind We were a little sloppy with adding tailcall markers. Be more consistent by using setTailCallKind instead of setTailCall. llvm-svn: 287955	2016-11-25 22:35:09 +00:00
Sanjay Patel	1e6ca44a8e	add and use isBitwiseLogicOp() helper function; NFCI llvm-svn: 287712	2016-11-22 22:54:36 +00:00
Sanjay Patel	e359eaaf70	[InstCombine] change bitwise logic type to eliminate bitcasts In PR27925: https://llvm.org/bugs/show_bug.cgi?id=27925 ...we proposed adding this fold to eliminate a bitcast. In D20774, there was some concern about changing the type of a bitwise op as well as creating bitcasts that might not be free for a target. However, if we're strictly eliminating an instruction (by limiting this to one-use ops), then we should be able to do this in InstCombine. But we're cautiously restricting the transform for now to vector types to avoid possible backend problems. A transform to make sure the logic op is legal for the target should be added to reverse this transform and improve codegen. Differential Revision: https://reviews.llvm.org/D26641 llvm-svn: 287707	2016-11-22 22:05:48 +00:00
Sanjay Patel	3b0bafee63	[InstCombine] canonicalize min/max constant to select's false value This is a first step towards canonicalization and improved folding/codegen for integer min/max as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html Here, we're just matching the simplest min/max patterns and adjusting the icmp predicate while swapping the select operands. I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll so it's easier to see how this might be extended (corresponds to the TODO comment in the code). That's also why I'm using matchSelectPattern() rather than a simpler check; once the backend is patched, we can just remove some of the restrictions to allow the obfuscated min/max patterns in the FIXME tests to be matched. Differential Revision: https://reviews.llvm.org/D26525 llvm-svn: 287585	2016-11-21 22:04:14 +00:00
Sanjay Patel	c89911ba02	fix formatting; NFC llvm-svn: 287582	2016-11-21 21:48:36 +00:00
Simon Pilgrim	7d18a70dac	Fix spelling mistakes in Transforms comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287488	2016-11-20 13:19:49 +00:00
Craig Topper	1de753f7f5	[InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements. This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional intrinsics to the switches. llvm-svn: 287316	2016-11-18 06:04:33 +00:00
Chris Bieneman	05c279fc4b	[CMake] NFC. Updating CMake dependency specifications This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system. llvm-svn: 287206	2016-11-17 04:36:50 +00:00
Sanjay Patel	80baf69cb5	[InstCombine] replace unreachable with assert and remove unreachable code; NFCI llvm-svn: 287147	2016-11-16 20:40:02 +00:00
Sanjay Patel	1b9560ffd6	[InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC llvm-svn: 287145	2016-11-16 20:18:34 +00:00
Craig Topper	6910fa0ef4	[X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clang's intrinsic file. Reviewers: zvi, delena, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26660 llvm-svn: 287083	2016-11-16 05:24:10 +00:00
Craig Topper	b4173a5a70	[InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics. llvm-svn: 286755	2016-11-13 07:26:19 +00:00
Craig Topper	8b831cbb2a	[InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value. This does not include support for the AVX-512 variable shifts. That will be coming in a future patch. llvm-svn: 286739	2016-11-13 01:51:55 +00:00
Sanjay Patel	84ae943f0d	[InstCombine] use dyn_cast rather than isa+cast; NFC Follow-up to r286664 cleanup as suggested by Eli. Thanks! llvm-svn: 286671	2016-11-11 23:20:01 +00:00
Sanjay Patel	cb2199b2f3	[InstCombine] clean up foldSelectOpOp(); NFC llvm-svn: 286664	2016-11-11 23:01:20 +00:00
Sanjay Patel	d1bf4340ef	[InstCombine] fix formatting of FoldOpIntoSelect(); NFCI llvm-svn: 286604	2016-11-11 17:42:16 +00:00
Sanjay Patel	4e1b5a53c7	[InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923) Removing the limitation in visitInsertElementInst() causes several regressions because we're not prepared to fold sequences of shuffles or inserts and extracts separated by shuffles. Fixing that appears to be a difficult mission because we are purposely trying to avoid creating shuffles with arbitrary shuffle masks because some targets may choke on those. https://llvm.org/bugs/show_bug.cgi?id=30923 llvm-svn: 286423	2016-11-10 00:15:14 +00:00
Sanjay Patel	4e9d6cd354	[InstCombine] fix profitability equation for max-of-nots transform As the test change shows, we can increase the critical path by adding a 'not' instruction, so make sure that we're actually removing an instruction if we do this transform. This transform could also cause us to miss folds of min/max pairs. llvm-svn: 286315	2016-11-09 00:13:11 +00:00
Sanjay Patel	99dc5feff1	[InstCombine] reduce indentation; NFC llvm-svn: 286314	2016-11-08 23:49:15 +00:00
Sanjay Patel	86408a8048	[InstCombine] allow splat vector folds in adjustMinMax() (retry r285732) This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732. llvm-svn: 286113	2016-11-07 15:52:45 +00:00
Greg Bedwell	5fc6f94591	Revert "[InstCombine] allow splat vector folds in adjustMinMax()" This reverts commit r285732. This change introduced a new assertion failure in the following testcase at -O2: typedef short __v8hi __attribute__((__vector_size__(16))); __v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) { __v8hi Result = V1; if (mask & 0x80) Result[0] = V2[0]; return Result; } llvm-svn: 285866	2016-11-02 23:17:05 +00:00
Sanjay Patel	c3d89842ad	[InstCombine] allow splat vector folds in adjustMinMax() llvm-svn: 285732	2016-11-01 20:08:02 +00:00
Sanjay Patel	c0339c77ef	[InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons. This transforms %a = shl nuw %x, c1 %b = icmp {ugt|ule} %a, c0 into %b = icmp {ugt|ule} %x, (c0 >> c1) z3: (declare-const x (_ BitVec 64)) (declare-const c0 (_ BitVec 64)) (declare-const c1 (_ BitVec 64)) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvugt (bvshl x c1) c0) (bvugt x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvule (bvshl x c1) c0) (bvule x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) Patch by bryant! Differential Revision: https://reviews.llvm.org/D25913 llvm-svn: 285729	2016-11-01 19:19:29 +00:00
Sanjay Patel	644d7c3b8a	[InstCombine] clean up adjustMinMax(); NFCI 1. Change param names for readability 2. Change pointer param to ref 3. Early exit to reduce indent 4. Change switch to if/else llvm-svn: 285718	2016-11-01 18:15:03 +00:00
Sanjay Patel	7ce658388b	[InstCombine] add helper function for adjustMinMax(); NFCI This is just a cut and paste; clean-up and enhancements to follow. llvm-svn: 285715	2016-11-01 17:46:08 +00:00
Simon Pilgrim	6dd8fab443	[InstCombine] Folding of shifts by the sum of positive values This patch introduces the combine: (C1 shift (A add C2)) -> ((C1 shift C2) shift A) iff A and C2 are both positive If both A and C2 are known to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift. Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs are positive). Differential Revision: https://reviews.llvm.org/D26000 llvm-svn: 285696	2016-11-01 15:40:30 +00:00
Sanjay Patel	978f827d12	[InstCombine] re-use bitcasted compare operands in selects (PR28001) These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495	2016-10-29 15:22:04 +00:00
Sanjay Patel	c0de9c9e40	[InstCombine] fix foldSPFofSPF() to handle vector splats llvm-svn: 285345	2016-10-27 21:19:40 +00:00
Sanjay Patel	611f9f92fc	[InstCombine] handle simple vector integer constants in IsFreeToInvert llvm-svn: 285318	2016-10-27 17:30:50 +00:00
Sanjay Patel	8d7196bfde	[InstCombine] clean up commonCastTransforms; NFC 1. Use 'auto' with dyn_cast. 2. Variables start with a capital letter. 3. Use proper punctuation in comments. llvm-svn: 285200	2016-10-26 14:52:35 +00:00
Guozhi Wei	ae541f6a71	[InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996 The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996. The problem is with following code xB = load (type B); xA = load (type A); +yA = (A)xB; B -> A +zAn = PHI[yA, xA]; PHI +zBn = (B)zAn; // A -> B store zAn; store zBn; optimizeBitCastFromPhi generates +zBn = (B)zAn; // A -> B and expects it will be combined with the following store instruction to another store zAn Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely. optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions. Differential Revision: https://reviews.llvm.org/D23896 llvm-svn: 285116	2016-10-25 20:43:42 +00:00
Sanjay Patel	f3dda13bd2	[InstCombine] Ensure that truncated int types are legal. Fixes the FIXMEs in D25952 and rL285075. Patch by bryant! Differential Revision: https://reviews.llvm.org/D25955 llvm-svn: 285108	2016-10-25 20:11:47 +00:00
Sanjay Patel	e3de152530	fix formatting; NFC llvm-svn: 285078	2016-10-25 16:12:31 +00:00
Sanjay Patel	d59f7f9047	[InstCombine] add test and code comment to show potentially misguided icmp trunc transform llvm-svn: 285075	2016-10-25 15:16:39 +00:00
Peter Collingbourne	ecdd58f1d6	Analysis: Move llvm::getConstantRangeFromMetadata to IR library. We're about to start using it there. Differential Revision: https://reviews.llvm.org/D25877 llvm-svn: 284865	2016-10-21 19:59:26 +00:00
Sanjay Patel	6d6eca5cdc	[InstCombine] use m_APInt to allow sub with constant folds for splat vectors llvm-svn: 284247	2016-10-14 16:31:54 +00:00
Sanjay Patel	c6c5965a42	[InstCombine] sub X, sext(bool Y) -> add X, zext(bool Y) Prefer add/zext because they are better supported in terms of value-tracking. Note that the backend should be prepared for this IR canonicalization (including vector types) after: https://reviews.llvm.org/rL284015 Differential Revision: https://reviews.llvm.org/D25135 llvm-svn: 284241	2016-10-14 15:24:31 +00:00
Simon Pilgrim	fd0d7b21e0	[InstCombine] Fix constexpr issue in select combining As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead. Differential Revision: https://reviews.llvm.org/D25466 llvm-svn: 284000	2016-10-12 10:20:15 +00:00
David Majnemer	80dca0c78f	[InstCombine] Transform !range metadata to !nonnull when combining loads When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure to emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair. This fixes PR30597. Patch by Ariel Ben-Yehuda! Differential Revision: https://reviews.llvm.org/D25215 llvm-svn: 283836	2016-10-11 01:00:45 +00:00
Davide Italiano	f6988d2980	[InstCombine] Don't unpack arrays that are too large (part 2). This is similar to r283599, but for store instructions. Thanks to David for pointing out! llvm-svn: 283612	2016-10-07 21:53:09 +00:00
Davide Italiano	da11412243	[InstCombine] Don't unpack arrays that are too large Differential Revision: https://reviews.llvm.org/D25376 llvm-svn: 283599	2016-10-07 20:57:42 +00:00
Sanjay Patel	4326c4ac8f	[InstCombine] fold select X, (ext X), C If we're going to canonicalize IR towards select of constants, try harder to create those. Also, don't lose the metadata. This is actually 4 related transforms in one patch: // select X, (sext X), C --> select X, -1, C // select X, (zext X), C --> select X, 1, C // select X, C, (sext X) --> select X, C, 0 // select X, C, (zext X) --> select X, C, 0 Differential Revision: https://reviews.llvm.org/D25126 llvm-svn: 283575	2016-10-07 17:53:07 +00:00
Sanjoy Das	1f7b813e2b	Remove duplicated code; NFC ICmpInst::makeConstantRange does exactly the same thing as ConstantRange::makeExactICmpRegion. llvm-svn: 283059	2016-10-02 00:09:57 +00:00
Sanjay Patel	f7b851fe84	[InstCombine] allow non-splat folds of select cond (ext X), C llvm-svn: 282906	2016-09-30 19:49:22 +00:00
Sanjay Patel	453ceff261	[InstCombine] fix function names; NFC Also, make foldSelectExtConst() a member of InstCombiner, remove unnecessary parameters from its interface, and group visitSelectInst helpers together in the header file. llvm-svn: 282796	2016-09-29 22:18:30 +00:00
Sanjay Patel	ccc2927b69	fix formatting; NFC llvm-svn: 282737	2016-09-29 17:48:19 +00:00
Alexey Bataev	793c946ecb	[InstCombine] Fixed bug introduced in r282237 The index of the new insertelement instruction was evaluated in the wrong way, it was considered as the index of the inserted value instead of index of the position, where the value should be inserted. llvm-svn: 282401	2016-09-26 13:18:59 +00:00
Andrea Di Biagio	a82d52d11d	[InstCombine] Teach the udiv folding logic how to handle constant expressions. This patch fixes PR30366. Function foldUDivShl() worked under the assumption that one of the values in input to the function was always an instance of llvm::Instruction. However, function visitUDivOperand() (the only user of foldUDivShl) was clearly violating that precondition; internally, visitUDivOperand() uses pattern matches to check the operands of a udiv. Pattern matchers for binary operators know how to handle both Instruction and ConstantExpr values. This patch fixes the problem in foldUDivShl(). Now we use pattern matchers instead of explicit casts to Instruction. The reduced test case from PR30366 has been added to test file InstCombine/udiv-simplify.ll. Differential Revision: https://reviews.llvm.org/D24565 llvm-svn: 282398	2016-09-26 12:07:23 +00:00
Alexey Bataev	fee9078dcd	[InstCombine] Fix for PR29124: reduce insertelements to shufflevector If inserting more than one constant into a vector: define <4 x float> @foo(<4 x float> %x) { %ins1 = insertelement <4 x float> %x, float 1.0, i32 1 %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2 ret <4 x float> %ins2 } InstCombine could reduce that to a shufflevector: define <4 x float> @goo(<4 x float> %x) { %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3> ret <4 x float> %shuf } Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e. shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> -> insertelement <4 x float> %v, float 1.0, 1 Differential Revision: https://reviews.llvm.org/D24182 llvm-svn: 282237	2016-09-23 09:14:08 +00:00
Sanjay Patel	30ef70b090	[InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672) We already have the udiv variant of this transform, so I think this is ok for InstCombine too even though there is an increase in IR instructions. As the tests and TODO comments show, the transform can lead to follow-on combines. This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672 Differential Revision: https://reviews.llvm.org/D24527 llvm-svn: 282209	2016-09-22 22:36:26 +00:00
Sanjay Patel	f26710d97d	[InstCombine] canonicalize vector select with constant vector condition to shuffle As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ): turn a vector select with constant condition operand into a shuffle as a canonicalization step. Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract. Possible known (minor?) regressions from this change are filed as: https://llvm.org/bugs/show_bug.cgi?id=28530 https://llvm.org/bugs/show_bug.cgi?id=28531 https://llvm.org/bugs/show_bug.cgi?id=30371 If something terrible happens to perf after this commit, feel free to revert until a backend fix is in place. Differential Revision: https://reviews.llvm.org/D24279 llvm-svn: 281787	2016-09-16 22:16:18 +00:00
Sanjay Patel	c96f6db246	[InstCombine] allow vector types for constant folding / computeKnownBits (PR24942) computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine. I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements. This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942: https://llvm.org/bugs/show_bug.cgi?id=24942 Differential Revision: https://reviews.llvm.org/D24677 llvm-svn: 281777	2016-09-16 21:20:36 +00:00
Sanjay Patel	10494b2682	[InstCombine] add helper functions for visitICmpInst(); NFCI llvm-svn: 281743	2016-09-16 16:10:22 +00:00
Sanjay Patel	8da42cc5d3	[InstCombine] move folds for icmp (sh C2, Y), C1 in with other icmp+sh folds; NFCI llvm-svn: 281672	2016-09-15 22:26:31 +00:00
Sanjay Patel	af91d1f81e	[InstCombine] allow icmp (shr/shl) folds for vectors These 2 helper functions were already using APInt internally, so just change the API and caller to allow folds for splats. The scalar regression tests look quite thorough, so I just added a couple of tests to prove that vectors are handled too. These folds should be grouped with the other cmp+shift folds though. That can be an NFC follow-up. llvm-svn: 281663	2016-09-15 21:35:30 +00:00
David Majnemer	8b16da8744	[InstCombine] Do not RAUW a constant GEP canRewriteGEPAsOffset expects to process instructions, not constants. This fixes PR30342. llvm-svn: 281650	2016-09-15 20:10:09 +00:00
Sanjay Patel	524fcdf041	[InstCombine] simplify code; NFCI llvm-svn: 281644	2016-09-15 19:04:55 +00:00
Sanjay Patel	d93c4c0137	fix function names; NFC llvm-svn: 281637	2016-09-15 18:22:25 +00:00
Sanjay Patel	886a542e23	[InstCombine] allow icmp (sub nsw) folds for vectors Also, clean up the code and comments for the existing folds in foldICmpSubConstant(). llvm-svn: 281631	2016-09-15 18:05:17 +00:00
Sanjay Patel	362ff5c0a5	[InstCombine] remove duplicated fold ; NFCI This pattern is matched in foldICmpBinOpEqualityWithConstant() and already works with vectors too. I changed some comments over there to point out the current location. The tests for this transform are currently in 'sub.ll'. Note that the remaining folds in this block all require a sub too, so they should get grouped with the other icmp(sub) patterns. llvm-svn: 281627	2016-09-15 17:01:17 +00:00
Sanjay Patel	40c53ea933	[InstCombine] allow (icmp sgt smin(PosA, B), 0) fold for vectors llvm-svn: 281624	2016-09-15 16:23:20 +00:00
Sanjay Patel	9745983a4d	[InstCombine] clean up foldICmpWithConstant(); NFC 1. Early exit to reduce indent 2. Rename variables 3. Add local 'Pred' variable llvm-svn: 281615	2016-09-15 15:11:12 +00:00
Sanjay Patel	06b127a771	[InstCombine] add helper function for foldICmpWithConstant; NFC This is a big glob of transforms that probably should work for vectors, but currently they are disallowed because of ConstantInt guards. llvm-svn: 281614	2016-09-15 14:37:50 +00:00
Sanjay Patel	7577a3d799	[InstCombine] use m_APInt to allow icmp folds using known bits for splat constant vectors llvm-svn: 281613	2016-09-15 14:15:47 +00:00
Sanjay Patel	9efb1bdcc4	[InstCombine] refactor eq/ne cases in foldICmpUsingKnownBits() ; NFCI The pattern matching and transforms are identical; the cmp predicate just changes. llvm-svn: 281561	2016-09-14 23:38:56 +00:00
Matt Arsenault	e2e6cfee61	Reapply "InstCombine: Reduce trunc (shl x, K) width." This reapplies r272987 with a fix for infinitely looping when the truncated value is another shift of a constant. llvm-svn: 281379	2016-09-13 19:43:57 +00:00
Sanjay Patel	f5887f1fbd	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors isSignBitCheck could be changed to take a pointer param to avoid the 'UnusedBit' ugliness. llvm-svn: 281231	2016-09-12 16:25:41 +00:00
Sanjay Patel	0531f0a5bb	fix formatting; NFC llvm-svn: 281220	2016-09-12 15:52:28 +00:00
Sanjay Patel	3151dec7f1	[InstCombine] add helper function for foldICmpUsingKnownBits; NFCI llvm-svn: 281217	2016-09-12 15:24:31 +00:00
Sanjay Patel	5352331716	fix formatting/typos; NFC llvm-svn: 281214	2016-09-12 14:25:46 +00:00
Sanjay Patel	60312bc45f	[InstCombine] add helper function for folding {and,or,xor} (cast X), C ; NFCI llvm-svn: 281187	2016-09-12 00:16:23 +00:00
Arnold Schwaighofer	5d335559b9	InstCombine: Don't combine loads/stores from swifterror to a new type This generates invalid IR: the only users of swifterror can be call arguments, loads, and stores. rdar://28242257 llvm-svn: 281144	2016-09-10 18:14:57 +00:00
Sanjay Patel	0a3d72bb93	[InstCombine] clean up foldICmpBinOpEqualityWithConstant / foldICmpIntrinsicWithConstant ; NFC 1. Rename variables to be consistent with related/preceding code (may want to reorganize). 2. Fix comments/formatting. llvm-svn: 281140	2016-09-10 15:33:39 +00:00
Sanjay Patel	f58f68c891	[InstCombine] rename and reorganize some icmp folding functions; NFC Everything under foldICmpInstWithConstant() should now be working for splat vectors via m_APInt matchers. Ie, I've removed all of the FIXMEs that I added while cleaning that section up. Note that not all of the associated FIXMEs in the regression tests are gone though, because some of the tests require earlier folds that are still scalar-only. llvm-svn: 281139	2016-09-10 15:03:44 +00:00
Sanjay Patel	58109abe91	[InstCombine] use m_APInt to allow icmp ult X, C folds for splat constant vectors llvm-svn: 281107	2016-09-09 21:59:37 +00:00
Sanjay Patel	1c608f4323	[InstCombine] return a vector-safe true/false constant I introduced this potential bug by missing this diff in: https://reviews.llvm.org/rL280873 ...however, I'm not sure how to reach this code path with a regression test. We may be able to remove this code and assume that the transform to a constant is always handled by InstSimplify? llvm-svn: 280964	2016-09-08 16:54:02 +00:00
Sanjay Patel	9b40f98357	[InstCombine] use m_APInt to allow icmp (and (sh X, Y), C2), C1 folds for splat constant vectors llvm-svn: 280873	2016-09-07 22:33:03 +00:00
Sanjay Patel	def931e76a	[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectors This is a revert of r280676 which was a revert of r280637; ie, this is r280637 again. It was speculatively reverted to help debug buildbot failures. llvm-svn: 280861	2016-09-07 20:50:44 +00:00
Andrea Di Biagio	f3fd316223	[InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic. This fixes a similar issue to the one already fixed by r280804 (reviewed in D24256). Revision 280804 fixed the problem with unsafe dyn_casts in the extrq/extrqi combining logic. However, it turns out that even the insertq/insertqi logic was affected by the same problem. llvm-svn: 280807	2016-09-07 12:47:53 +00:00
Andrea Di Biagio	8df5b9cf48	[InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls. This patch fixes an assertion failure caused by unsafe dynamic casts on the constant operands of sse4a intrinsic calls to extrq/extrqi. The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently checks if the input operands are constants. Internally, that logic relies on dyn_casts of values returned by calls to method Constant::getAggregateElement. However, method getAggregateElement may return nullptr if the constant element cannot be retrieved. So, all the dyn_casts can potentially fail. This is what happens for example if a constexpr value is passed in input to an extrq/extrqi intrinsic call. This patch fixes the problem by using a dyn_cast_or_null (instead of a simple dyn_cast) on the result of each call to Constant::getAggregateElement. Added reproducible test cases to x86-sse4a.ll. Differential Revision: https://reviews.llvm.org/D24256 llvm-svn: 280804	2016-09-07 12:03:03 +00:00
Sanjay Patel	4e463b4a2c	fix formatting; NFC llvm-svn: 280727	2016-09-06 18:16:31 +00:00
Sanjay Patel	eea2ef7862	[InstCombine] don't assert that division-by-constant has been folded (PR30281) This is effectively a revert of: https://reviews.llvm.org/rL280115 And this should fix https://llvm.org/bugs/show_bug.cgi?id=30281: llvm-svn: 280677	2016-09-05 23:38:22 +00:00
Sanjay Patel	46f9df5b71	[InstCombine] revert r280637 because it causes test failures on an ARM bot http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/14952/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aicmp.ll llvm-svn: 280676	2016-09-05 22:36:32 +00:00
Sanjay Patel	c641e9d6ff	[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectors The code to calculate 'UsesRemoved' could be simplified. As-is, that code is a victim of PR30273: https://llvm.org/bugs/show_bug.cgi?id=30273 llvm-svn: 280637	2016-09-04 20:58:27 +00:00
Sanjay Patel	6b4909749b	[InstCombine] recode icmp fold in a vector-friendly way; NFC The transform in question: icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1' ...is still not enabled for vectors, thus no functional change intended. It's not clear to me if this is a good transform for vectors or even scalars in general. Changing that behavior may be a follow-on patch. llvm-svn: 280627	2016-09-04 14:32:15 +00:00
Dorit Nuzman	abd15f69b2	[InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617	2016-09-04 07:49:39 +00:00
Dorit Nuzman	7673ba7ac2	Test commit. llvm-svn: 280615	2016-09-04 07:06:00 +00:00
Matt Arsenault	46a0382ab2	AMDGPU: Do basic folding of class intrinsic This allows more of the OCML builtin library to be constant folded. llvm-svn: 280586	2016-09-03 07:06:58 +00:00
Sanjay Patel	521f19f249	[InstCombine] fold insertelement of constant into shuffle with constant operand (PR29126) The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards shrinking that to a single shufflevector. Note that the transform is intentionally limited to shuffles that are equivalent to vector selects to avoid creating arbitrary shuffle masks that may not lower well. This should solve PR29126: https://llvm.org/bugs/show_bug.cgi?id=29126 Differential Revision: https://reviews.llvm.org/D23886 llvm-svn: 280504	2016-09-02 17:05:43 +00:00
Sanjay Patel	dd861964d1	[InstCombine] remove fold of an icmp pattern that should never happen While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger this code path. The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic() or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast(). Differential Revision: https://reviews.llvm.org/D24031 llvm-svn: 280370	2016-09-01 14:20:43 +00:00
Sanjay Patel	0d70831d73	[InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectors The enhancement to foldICmpDivConstant ( http://llvm.org/viewvc/llvm-project?view=revision&revision=280299 ) allows us to remove the ConstantInt check; no other changes needed. llvm-svn: 280300	2016-08-31 22:18:43 +00:00
Sanjay Patel	541aef4661	[InstCombine] allow icmp (div X, Y), C folds for splat constant vectors Converting all of the overflow ops to APInt looked risky, so I've left that as a TODO. llvm-svn: 280299	2016-08-31 21:57:21 +00:00
Sanjay Patel	85d79744df	[InstCombine] change insertRangeTest() to use APInt instead of Constant; NFCI This is prep work before changing the callers to also use APInt which will allow folds for splat vectors. Currently, the callers have ConstantInt guards in place, so no functional change intended with this commit. llvm-svn: 280282	2016-08-31 19:49:56 +00:00
Sanjay Patel	7d9ebaf337	[InstCombine] clean up InsertRangeTest; NFCI It's much less code and easier to read if we don't duplicate everything between the 'Inside' and not 'Inside' cases. As noted with the FIXME, the goal is to make this vector-friendly in a follow-up patch. llvm-svn: 280183	2016-08-31 00:19:35 +00:00
Sanjay Patel	b37145712e	[InstCombine] replace divide-by-constant checks with asserts; NFC These folds already have tests for scalar and vector types, except for the vector div-by-0 case, so I'm adding tests for that. llvm-svn: 280115	2016-08-30 17:31:34 +00:00
Sanjay Patel	a7cb477277	[InstCombine] clean up foldICmpDivConstant; NFCI 1. Fix comments to match variable names 2. Remove redundant CmpRHS variable 3. Add FIXME to replace some checks with asserts llvm-svn: 280112	2016-08-30 17:10:49 +00:00
Sanjay Patel	5c5311f4e5	[InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat constant vectors llvm-svn: 279937	2016-08-28 18:18:00 +00:00
Sanjay Patel	14e0e18d76	[InstCombine] add helper function for icmp (and (sh X, Y), C2), C1 ; NFC Like other recent changes near here, the goal is to allow vector types for all of these folds. Splitting things up makes it easier to incrementally enhance the code and easier to read. llvm-svn: 279851	2016-08-26 18:28:46 +00:00
Sanjay Patel	da9c56299b	[InstCombine] clean up foldICmpAndConstConst(); NFC 1. Early exit to reduce indent 2. Fix comments and variable names to match 3. Reformat comments / clang-format code llvm-svn: 279837	2016-08-26 17:15:22 +00:00
Sanjay Patel	d3c7bb28be	[InstCombine] add helper function for folding of icmp (and X, C2), C; NFC llvm-svn: 279834	2016-08-26 16:42:33 +00:00
Sanjay Patel	311e0fabb1	[InstCombine] rename variables in foldICmpAndConstant(); NFC llvm-svn: 279831	2016-08-26 16:14:06 +00:00
Sanjay Patel	f7ba0891ce	[InstCombine] rename variables in foldICmpDivConstant(); NFC Removing the redundant 'CmpRHSV' local variable exposes a bug in the caller foldICmpShrConstant() - it was sending in the div constant instead of the cmp constant. But I have not been able to expose this in a regression test yet - the affected folds all appear to be handled before we ever reach this code. I'll keep trying to find a case as I make changes to allow vector folds in both functions. llvm-svn: 279828	2016-08-26 15:53:01 +00:00
Xinliang David Li	cad3a995a4	[Profile] Propagate branch metadata properly in instcombine Differential Revision: http://reviews.llvm.org/D23590 llvm-svn: 279693	2016-08-25 00:26:32 +00:00
Sanjay Patel	1655414903	[InstCombine] move foldICmpDivConstConst() contents to foldICmpDivConstant(); NFCI There was no logic in foldICmpDivConstant, so no need for a separate function. The code is directly copy/pasted, so further cleanups to follow. llvm-svn: 279685	2016-08-24 23:03:36 +00:00
Sanjay Patel	d398d4a39e	[InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat constant vectors llvm-svn: 279677	2016-08-24 22:22:06 +00:00
Sanjay Patel	8e297749c1	[InstCombine] add assert and explanatory comment for fold removed in r279568; NFC I deleted a fold from InstCombine at: https://reviews.llvm.org/rL279568 because it (like any InstCombine to a constant?) should always happen in InstSimplify, however, it's not obvious what the assumptions are in the remaining code. Add a comment and assert to make it clearer. Differential Revision: https://reviews.llvm.org/D23819 llvm-svn: 279626	2016-08-24 13:55:55 +00:00
Sanjay Patel	d64e988701	[InstCombine] use local variables for repeated values; NFCI llvm-svn: 279578	2016-08-23 22:05:55 +00:00
Sanjay Patel	dcac0dfca9	[InstCombine] move foldICmpShrConstConst() contents to foldICmpShrConst(); NFCI There will only be 3 lines of code in foldICmpShrConst() when the cleanup is done, so it doesn't make much sense to have a separate function for a single fold. llvm-svn: 279575	2016-08-23 21:25:13 +00:00
Sanjay Patel	6ef22da9ec	[InstCombine] remove icmp shr folds that are already handled by InstSimplify AFAICT, these already worked in all cases for scalar types, and I enhanced the code to work for vector types in: https://reviews.llvm.org/rL279543 llvm-svn: 279568	2016-08-23 21:01:35 +00:00
Sanjay Patel	c9196c4488	[InstCombine] change param type from Instruction to BinaryOperator for icmp helpers; NFCI This saves some casting in the helper functions and eases some further refactoring. llvm-svn: 279478	2016-08-22 21:24:29 +00:00
Sanjay Patel	a392049419	[InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat constant vectors llvm-svn: 279472	2016-08-22 20:45:06 +00:00
Jun Bum Lim	ec8b8cc595	[InstCombine] Allow sinking from unique predecessor with multiple edges Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448	2016-08-22 18:21:56 +00:00
Sanjay Patel	643d21a62c	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401	2016-08-21 17:10:07 +00:00
Sanjay Patel	7ffcde7422	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 3 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279399	2016-08-21 16:35:34 +00:00
Sanjay Patel	7e09f13fed	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 2 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279398	2016-08-21 16:28:22 +00:00
Sanjay Patel	792636603f	[InstCombine] use APInt instead of ConstantInt in isSignBitCheck(); NFCI The callers still have ConstantInt guards, so there is no functional change intended from this change. But relaxing the callers will allow more folds for vector types. llvm-svn: 279396	2016-08-21 15:07:45 +00:00
Sanjay Patel	fa7de606c4	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 1 This is a partial enablement (move the ConstantInt guard down) because there are many different folds here and one of the later ones will require reworking 'isSignBitCheck'. llvm-svn: 279339	2016-08-19 22:33:26 +00:00
Sanjay Patel	7a104615c5	[InstCombine] remove an icmp fold that is already handled by InstSimplify Specifically, this is done near the end of "SimplifyICmpInst" using computeKnownBits() as the broader solution. There are even vector tests (yay!) for this in test/Transforms/InstSimplify/compare.ll. I considered putting an assert here instead of just deleting, but then we could assert every possible fold in InstSimplify in InstCombine, so...less is more? llvm-svn: 279300	2016-08-19 19:03:07 +00:00
Sanjay Patel	e38e79c3e6	[InstCombine] use local variables to reduce code in foldICmpShlConstant; NFC llvm-svn: 279282	2016-08-19 17:34:05 +00:00
Sanjay Patel	38b7506f75	[InstCombine] rename variables in foldICmpShlConstant(); NFC llvm-svn: 279279	2016-08-19 17:20:37 +00:00
Reid Kleckner	a871d3872a	Fix regression in InstCombine introduced by r278944 The intended transform is: // Simplify icmp eq (or (ptrtoint P), (ptrtoint Q)), 0 // -> and (icmp eq P, null), (icmp eq Q, null). P and Q are both pointer types, but may have different types. We need two calls to getNullValue() to make the icmps. llvm-svn: 279271	2016-08-19 16:53:18 +00:00
Sanjay Patel	a867afe094	[InstCombine] use m_APInt to allow icmp (shl 1, Y), C folds for splat constant vectors llvm-svn: 279266	2016-08-19 16:12:16 +00:00
Sanjay Patel	57b12d3876	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors Of course, we really need to refactor and fix all of the cmp predicates, but this one is interesting because without it, we later perform an information-losing transform of icmp (shl 1, Y), C, and we can't recover the better fold. llvm-svn: 279263	2016-08-19 15:40:44 +00:00
Sanjay Patel	98cd99dfc6	[InstCombine] add helper function for folds of icmp (shl 1, Y), C; NFCI Clean up the existing code by: 1. Renaming variables 2. Adding local variables 3. Making it vector-safe This is still guarded by a ConstantInt check, so no functional change is intended. But this should be ready to go: if we move the ConstantInt check down, all of these folds should do the right thing for vector types. llvm-svn: 279150	2016-08-18 21:28:30 +00:00
Amaury Sechet	763c59dc9a	Make ctlz and cttz zero undef when the operand cannot be zero in InstCombine Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23134 llvm-svn: 279141	2016-08-18 20:43:50 +00:00
Sanjay Patel	40e8ca46ad	[InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 https://reviews.llvm.org/rL279101 llvm-svn: 279133	2016-08-18 20:28:54 +00:00
Sanjay Patel	5f4ce4e23d	[InstCombine] clean up foldICmpTruncConstant(); NFCI 1. Fix variable names 2. Add local variables to reduce code llvm-svn: 279132	2016-08-18 20:25:16 +00:00
Sanjay Patel	fa5ca2bf46	[InstCombine] use m_APInt to allow icmp (udiv X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 llvm-svn: 279101	2016-08-18 17:55:59 +00:00
Sanjay Patel	12a4105647	[InstCombine] clean up foldICmpUDivConstant; NFC 1. Better variable names 2. Remove unnecessary check of ConstantInt llvm-svn: 279094	2016-08-18 17:37:26 +00:00
Sanjay Patel	6347807f87	[InstCombine] use m_APInt to allow icmp (mul X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 llvm-svn: 279077	2016-08-18 15:44:44 +00:00
Sanjay Patel	5b112845da	[InstCombine] use APInt in isSignTest instead of ConstantInt; NFC This will enable vector splat folding, but NFC until the callers have their ConstantInt restrictions removed. llvm-svn: 279072	2016-08-18 14:59:14 +00:00
Sanjay Patel	4c5e60d95c	[InstCombine] use m_APInt to allow icmp (xor X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 llvm-svn: 279066	2016-08-18 14:10:48 +00:00
Justin Bogner	cd1d5aaf2e	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970	2016-08-17 20:30:52 +00:00
Sanjay Patel	daffec91ef	[InstCombine] more clean up of foldICmpXorConstant(); NFCI Use m_APInt for the xor constant, but this is all still guarded by the initial ConstantInt check, so no vector types should make it in here. llvm-svn: 278957	2016-08-17 19:45:18 +00:00
Sanjay Patel	6d5f448746	[InstCombine] clean up foldICmpXorConstant(); NFCI 1. Change variable names 2. Use local variables to reduce code 3. Early exit to reduce indent llvm-svn: 278955	2016-08-17 19:23:42 +00:00
Sanjay Patel	63e14a07e8	[InstCombine] use m_APInt to allow icmp (or X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 llvm-svn: 278945	2016-08-17 16:38:57 +00:00
Sanjay Patel	943e92efde	[InstCombine] clean up foldICmpOrConstant(); NFCI 1. Change variable names 2. Use local variables to reduce code 3. Use ? instead of if/else 4. Use the APInt variable instead of 'RHS' so the removal of the FIXME code will be direct llvm-svn: 278944	2016-08-17 16:30:43 +00:00
Sanjay Patel	4f7eb2aa95	[InstCombine] use m_APInt to allow icmp (add X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 llvm-svn: 278935	2016-08-17 15:24:30 +00:00
Justin Bogner	b03fd12cef	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902	2016-08-17 05:10:15 +00:00
Sanjay Patel	60ea1b43d6	[InstCombine] clean up foldICmpAddConstant(); NFCI 1. Fix variable names 2. Add local variables to reduce code 3. Fix code comments 4. Add early exit to reduce indentation 5. Remove 'else' after if -> return 6. Hoist common predicate llvm-svn: 278864	2016-08-16 22:34:42 +00:00
Sanjay Patel	e47df1ac62	[InstCombine] use m_APInt to allow icmp (sub X, Y), C folds for splat constant vectors llvm-svn: 278859	2016-08-16 21:53:19 +00:00
Sanjay Patel	b9aa67bfcf	[InstCombine] fix variable names to match formula comments; NFC llvm-svn: 278855	2016-08-16 21:26:10 +00:00
Sanjay Patel	a3f4f0828b	[InstCombine] add helper functions for foldICmpWithConstant; NFCI Besides breaking up a 700 line function to improve readability, this sinks the 'FIXME: ConstantInt' check into each helper. So now we can independently break that restriction within any of the helper functions. As much as possible, the code was only {cut/paste/clang-format}'ed to minimize risk (no functional changes intended), so several more readability improvements are still possible. llvm-svn: 278828	2016-08-16 17:54:36 +00:00
Sanjay Patel	1e5b2d1611	[InstCombine] use m_APInt in foldICmpWithConstant; NFCI There's some formatting and pointer deref ugliness here that I intend to fix in subsequent patches. The overall goal is to refactor the obnoxiously long switch and incrementally remove the restriction to scalar types (allow folds for vector splats). This patch introduces the use of m_APInt which means the RHSV reference is now a pointer (and may have matched a vector splat), but the check of 'RHS' remains, so vector folds are disallowed and no functional change is intended. llvm-svn: 278816	2016-08-16 16:08:11 +00:00
Pete Cooper	980a935e27	constify InstCombine::foldAllocaCmp. NFC. This is part of an effort to constify ValueTracking.cpp. This change is to methods which need const Value* instead of Value* to go with the upcoming changes to ValueTracking. llvm-svn: 278528	2016-08-12 17:13:28 +00:00
David Majnemer	42531260b3	Use the range variant of find/find_if instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278469	2016-08-12 03:55:06 +00:00
David Majnemer	0a16c22846	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Eugene Zelenko	cdc7161281	Fix some Clang-tidy modernize and Include What You Use warnings. Differential revision: https://reviews.llvm.org/D23291 llvm-svn: 278364	2016-08-11 17:20:18 +00:00
Sanjay Patel	38ae83de38	fix comment; NFC llvm-svn: 278342	2016-08-11 15:23:56 +00:00
Sanjay Patel	e3c335cbed	use auto* with dyn_cast ; NFC llvm-svn: 278340	2016-08-11 15:21:21 +00:00
Sanjay Patel	5a470950b9	getParent()->getParent() == getFunction() ; NFC llvm-svn: 278339	2016-08-11 15:16:06 +00:00
Sean Silva	36e0d01e13	Consistently use FunctionAnalysisManager Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278077	2016-08-09 00:28:15 +00:00
Justin Bogner	6b4422e6fe	InstCombine: Remove a redundant #ifdef NDEBUG. NFC The DEBUG() macro already does this. llvm-svn: 278049	2016-08-08 21:02:11 +00:00
Eli Friedman	02419a9849	[JumpThreading] Fix handling of aliasing metadata. Summary: The correctness fix here is that when we CSE a load with another load, we need to combine the metadata on the two loads. This matches the behavior of other passes, like instcombine and GVN. There's also a minor optimization improvement here: for load PRE, the aliasing metadata on the inserted load should be the same as the metadata on the original load. Not sure why the old code was throwing it away. Issue found by inspection. Differential Revision: http://reviews.llvm.org/D21460 llvm-svn: 277977	2016-08-08 04:10:22 +00:00
David Majnemer	4e4f4437c2	[InstCombine] Infer inbounds on geps of allocas llvm-svn: 277950	2016-08-07 07:58:00 +00:00
Sanjoy Das	ba04d3a620	[InstCombine] Don't coerce non-integral pointers to integers Reviewers: majnemer Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23231 llvm-svn: 277910	2016-08-06 02:58:48 +00:00
Sanjay Patel	8e3ab17c44	[InstCombine] refactor ctlz/cttz folds (NFCI) Note that this fold really belongs in InstSimplify. Refactoring here anyway as an intermediate step because there's a planned addition to this function in D23134. Differential Revision: https://reviews.llvm.org/D23223 llvm-svn: 277883	2016-08-05 22:42:46 +00:00
Nicolai Haehnle	870bf1788c	[InstCombine] try to fold (select C, (sext A), B) into logical ops Summary: Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and B is a compatible constant, also for zext instead of sext. This will then be further folded into logical operations. The transformation would be valid for non-i1 types as well, but other parts of InstCombine prefer to have sext from non-i1 as an operand of select. Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32 for boolean operations. With this change, the boolean logic is fully recovered. Reviewers: majnemer, spatel, tstellarAMD Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22747 llvm-svn: 277801	2016-08-05 08:22:29 +00:00
Justin Bogner	c7e4fbe11c	InstCombine: Clean up some trailing whitespace. NFC llvm-svn: 277793	2016-08-05 01:09:48 +00:00
Justin Bogner	9979840f59	InstCombine: Replace some never-null pointers with references. NFC llvm-svn: 277792	2016-08-05 01:06:44 +00:00
Justin Bogner	19dd0da153	IR: Provide an IRBuilder Inserter that calls a callback after insertion Add a generalized IRBuilderCallbackInserter, which is just given a callback to execute after insertion. This can be used to get rid of the custom inserter in InstCombine, which will in turn allow me to add target specific InstCombineCalls API for intrinsics without horrible layering violations. llvm-svn: 277784	2016-08-04 23:41:01 +00:00
Sanjay Patel	3bade138b5	[InstCombine] use m_APInt to allow icmp eq (mul X, C1), C2 folds for splat constant vectors This concludes the splat vector enhancements for foldICmpEqualityWithConstant(). Other commits in this series: https://reviews.llvm.org/rL277762 https://reviews.llvm.org/rL277752 https://reviews.llvm.org/rL277738 https://reviews.llvm.org/rL277731 https://reviews.llvm.org/rL277659 https://reviews.llvm.org/rL277638 https://reviews.llvm.org/rL277629 llvm-svn: 277779	2016-08-04 22:19:27 +00:00
Sanjay Patel	d938e88e89	[InstCombine] use m_APInt to allow icmp eq (and X, C1), C2 folds for splat constant vectors llvm-svn: 277762	2016-08-04 20:05:02 +00:00
Sanjay Patel	b3de75d3a0	[InstCombine] use m_APInt to allow icmp eq (or X, C1), C2 folds for splat constant vectors llvm-svn: 277752	2016-08-04 19:12:12 +00:00
Sanjay Patel	bcaf6f39dd	[InstCombine] use m_APInt to allow icmp eq (op X, Y), C folds for splat constant vectors I'm removing a misplaced pair of more specific folds from InstCombine in this patch as well, so we know where those folds are happening in InstSimplify. llvm-svn: 277738	2016-08-04 17:48:04 +00:00
Sanjay Patel	9d591d15ec	[InstCombine] use m_APInt to allow icmp eq (sub C1, X), C2 folds for splat constant vectors llvm-svn: 277731	2016-08-04 15:19:25 +00:00
Amaury Sechet	6bea674c43	Add popcount(n) == bitsize(n) -> n == -1 transformation. Summary: As per title. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23139 llvm-svn: 277694	2016-08-04 05:27:20 +00:00
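The `popcount(n) == bitsize(n) -> n == -1` fold named in the entry above can be sanity-checked by brute force at a small bit width. The following is only an illustrative sketch in Python, not LLVM code; the helper names `popcount` and `fold_holds` are hypothetical, and the integers are modeled as unsigned values masked to `bits` bits, so `-1` corresponds to the all-ones pattern.

```python
def popcount(n, bits):
    """Count the set bits of n, treated as an unsigned value of width `bits`."""
    return bin(n & ((1 << bits) - 1)).count("1")


def fold_holds(bits):
    """Check that popcount(n) == bits exactly when n is all ones (i.e. -1)."""
    all_ones = (1 << bits) - 1  # the bit pattern of -1 at this width
    return all(
        (popcount(n, bits) == bits) == (n == all_ones)
        for n in range(1 << bits)
    )
```

Calling `fold_holds(8)` exhaustively covers every 8-bit value; the same argument applies at any width, since only the all-ones pattern has every bit set.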
Sanjay Patel	00a324e893	[InstCombine] use m_APInt to allow icmp eq (add X, C1), C2 folds for splat constant vectors llvm-svn: 277659	2016-08-03 22:08:44 +00:00
Sanjay Patel	2e9675ff52	[InstCombine] use m_APInt to allow icmp eq (srem X, C1), C2 folds for splat constant vectors llvm-svn: 277638	2016-08-03 19:48:40 +00:00
Tobias Grosser	8757e387dd	[InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding Summary: InstCombine unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` such that in a later iteration of InstCombine the exposed `zext(icmp)` instructions can be optimized. We now combine this unfolding and the subsequent `zext(icmp)` optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of `logic(cast(icmp), cast(icmp))` expressions to `cast(logic(icmp, icmp))` which had been disabled due to its interference with the unfolding transformation. Tested via `make check` and `lnt`. Background ========== For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in `visitZExt()` in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` in order to expose `zext(icmp)` operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these `zext(icmp)` would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the `zext`-unfolding and the elimination of the exposed `zext(icmp)` to happen at one go because this enables us to still allow the cast-folding in `logic(cast(icmp), cast(icmp))` without entering an endless loop again. 
Details on the submitted changes ================================ - In `visitZExt()` we combine the unfolding and optimization of `zext` instructions. - In `transformZExtICmp()` we have to use `Builder->CreateIntCast()` instead of `CastInst::CreateIntegerCast()` to make sure that the new `CastInst` is inserted in a `BasicBlock`. The new calls to `transformZExtICmp()` that we introduce in `visitZExt()` would otherwise cause according assertions to be triggered (in our case this happened, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of `replaceInstUsesWith()` is necessary to ensure that the new `CastInst` replaces the `ZExtInst` accordingly. - In InstCombineAndOrXor.cpp we again allow the folding of casts on `icmp` instructions. - The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes. - The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105. Reviewers: grosser, majnemer, spatel Subscribers: eli.friedman, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22864 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 277635	2016-08-03 19:30:35 +00:00
Sanjay Patel	43aeb001c9	[InstCombine] use m_APInt to allow icmp (binop X, Y), C folds with constant splat vectors This removes the restriction for the icmp constant, but as noted by the FIXME comments, we still need to change individual checks for binop operand constants. llvm-svn: 277629	2016-08-03 18:59:03 +00:00
Sanjay Patel	51a767c6b8	use local variables; NFC llvm-svn: 277612	2016-08-03 17:23:08 +00:00
Sanjay Patel	ab50a93888	[InstCombine] replace dyn_casts with matches; NFCI Clean-up before changing this to allow folds for vectors. llvm-svn: 277538	2016-08-02 22:38:33 +00:00
David Majnemer	d536f2328e	[ConstantFolding] Teach the folder how to fold ConstantVector A ConstantVector can have ConstantExpr operands and vice versa. However, the folder had no ability to fold ConstantVectors which, in some cases, was an optimization barrier. Instead, rephrase the folder in terms of Constants instead of ConstantExprs and teach callers how to deal with failure. llvm-svn: 277099	2016-07-29 03:27:26 +00:00
Vitaly Buka	0ab23cf1c8	Do not remove empty lifetime.start/lifetime.end ranges Summary: Asan stack-use-after-scope check should poison alloca even if there is no access between start and end. This is possible for code like this: for (int i = 0; i < 3; i++) { int x; p = &x; } "Loop Invariant Code Motion" will move "p = &x;" out of the loop, making start/end range empty. PR27453 Reviewers: eugenis Differential Revision: https://reviews.llvm.org/D22842 llvm-svn: 277072	2016-07-28 22:59:03 +00:00
Vitaly Buka	2fae6a7702	Should be committed as one CL. This reverts commits r277068 r277067 r277066. llvm-svn: 277071	2016-07-28 22:59:01 +00:00
Vitaly Buka	f0500b6ae5	Do not remove empty lifetime.start/lifetime.end ranges Summary: Asan stack-use-after-scope check should poison alloca even if there is no access between start and end. This is possible for code like this: for (int i = 0; i < 3; i++) { int x; p = &x; } "Loop Invariant Code Motion" will move "p = &x;" out of the loop, making start/end range empty. PR27453 Reviewers: eugenis Differential Revision: https://reviews.llvm.org/D22842 llvm-svn: 277068	2016-07-28 22:50:48 +00:00
Vitaly Buka	3645793872	maned llvm-svn: 277067	2016-07-28 22:50:45 +00:00
Vitaly Buka	caca9da4ff	range llvm-svn: 277066	2016-07-28 22:50:43 +00:00
David Majnemer	0be7155350	[InstCombine] Handle failures from ConstantFoldConstantExpression ConstantFoldConstantExpression returns null when folding fails. This fixes PR28745. llvm-svn: 276952	2016-07-28 02:29:06 +00:00
Sanjay Patel	1271bf9178	[InstCombine] allow icmp (bit-manipulation-intrinsic(), C) folds for vectors llvm-svn: 276523	2016-07-23 13:06:49 +00:00
Sanjay Patel	6ebd5857c8	[InstCombine] move udiv+cmp fold over with other BinOp+cmp folds; NFCI llvm-svn: 276502	2016-07-23 00:28:39 +00:00
David Majnemer	522a91181a	Don't remove side effecting instructions due to ConstantFoldInstruction Just because we can constant fold the result of an instruction does not imply that we can delete the instruction. It may have side effects. This fixes PR28655. llvm-svn: 276389	2016-07-22 04:54:44 +00:00
Sanjay Patel	18fa9d3ca1	[InstCombine] break up foldICmpEqualityWithConstant(); NFCI Almost all of these folds require changes to allow vector types. Splitting up the logic should make that easier to do incrementally. llvm-svn: 276360	2016-07-21 23:27:36 +00:00
Sanjay Patel	43395060a1	make InstCombine compare helper functions private; NFC Also, rename some of them for consistency and to follow current conventions. llvm-svn: 276312	2016-07-21 18:07:40 +00:00
Sanjay Patel	1710e7cfa7	[InstCombine] break up visitICmpInstWithInstAndIntCst(); NFCI Making smaller pieces out of some of these ~1000 line functions should make it easier to incrementally upgrade them to handle vector types. llvm-svn: 276304	2016-07-21 17:15:49 +00:00
Sanjay Patel	0753c06d9c	[InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476) The benefits of this change include: 1. Remove DeMorgan-matching code that was added specifically to work-around the missing transform in http://reviews.llvm.org/rL248634. 2. Makes the DeMorgan transform work for vectors too. 3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476 Extending this transform to other casts and other associative operators may be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for doing that though. Differential Revision: https://reviews.llvm.org/D22271 llvm-svn: 276221	2016-07-21 00:24:18 +00:00
Sanjay Patel	5f3c70307d	[InstSimplify][InstCombine] don't crash when folding vector selects of icmp Differential Revision: https://reviews.llvm.org/D22602 llvm-svn: 276209	2016-07-20 23:40:01 +00:00
Sanjay Patel	683170bf56	move decomposeBitTestICmp() to Transforms/Utils; NFC As noted in https://reviews.llvm.org/D22537 , we can use this functionality in visitSelectInstWithICmp() and InstSimplify, but currently we have duplicated code. llvm-svn: 276140	2016-07-20 17:18:45 +00:00
Benjamin Kramer	b4d64cf27d	Revert "[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))" Makes InstCombine infloop when compiling v8. This reverts commit r275989 and r276105. llvm-svn: 276106	2016-07-20 11:40:16 +00:00
Sanjay Patel	2d477e59e8	[InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type The pattern may look more obviously like a sext if written as: define i32 @g(i16 %x) { %zext = zext i16 %x to i32 %xor = xor i32 %zext, 32768 %add = add i32 %xor, -32768 ret i32 %add } We already have that fold in visitAdd(). Differential Revision: https://reviews.llvm.org/D22477 llvm-svn: 276035	2016-07-19 22:09:34 +00:00
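The IR in the entry above (`zext i16` to `i32`, then `xor` with 32768, then `add` of -32768) is described as equivalent to a `sext`. A minimal sketch, assuming two's-complement wraparound at 32 bits, checks that claim for every 16-bit input. This is plain Python arithmetic rather than LLVM code, and the names `sext16_to_32` and `xor_add_form` are hypothetical helpers introduced only for illustration.

```python
MASK32 = 0xFFFFFFFF


def sext16_to_32(x):
    """Sign-extend a 16-bit value to 32 bits (result as an unsigned 32-bit int)."""
    return (x | 0xFFFF0000) if x & 0x8000 else x


def xor_add_form(x):
    """Mirror the IR: zext i16 -> i32, xor with 32768, add -32768 (mod 2**32)."""
    zext = x & 0xFFFF
    xored = zext ^ 32768
    return (xored + (-32768)) & MASK32


# Exhaustively compare the two forms over all 16-bit inputs.
all_match = all(xor_add_form(x) == sext16_to_32(x) for x in range(1 << 16))
```

The xor flips the source sign bit and the add of INT_MIN (in the source type) subtracts it back out with borrow, which is what fills the upper bits with copies of the sign bit.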
Tobias Grosser	1c38262279	[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp)) Summary: Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one: > Transform (zext (or (icmp), (icmp))) to (or (zext (icmp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp. Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check: `if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...` This check seems to sort out more cases than necessary because: - the reverse transformation is obviously done for `or` instructions only - and also not every `zext icmp` pair is necessarily the result of this reverse transformation Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. 
That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`). As an example, consider the following IR snippet ``` %1 = icmp sgt i64 %a, %b %2 = zext i1 %1 to i8 %3 = icmp slt i64 %a, %c %4 = zext i1 %3 to i8 %5 = and i8 %2, %4 ``` which would now be transformed to ``` %1 = icmp sgt i64 %a, %b %2 = icmp slt i64 %a, %c %3 = and i1 %1, %2 %4 = zext i1 %3 to i8 ``` This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code. Reviewers: grosser, vtjnash, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22511 Contributed-by: Matthias Reisinger llvm-svn: 275989	2016-07-19 16:39:17 +00:00
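The rewrite in 1c38262279 relies on `zext` from `i1` commuting with bitwise logic. A minimal Python sketch of the two IR forms from that message, generalised to `and`/`or`/`xor`, follows. The function names `before`, `after` and `forms_agree` are hypothetical, and comparisons are modelled on Python integers rather than `i64`.

```python
import operator

LOGIC_OPS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}


def zext_i1_to_i8(bit):
    """Zero-extend a 1-bit value to 8 bits."""
    return bit & 1


def before(a, b, c, op):
    """logic(zext(icmp sgt a, b), zext(icmp slt a, c)) computed in i8."""
    lhs = zext_i1_to_i8(int(a > b))
    rhs = zext_i1_to_i8(int(a < c))
    return LOGIC_OPS[op](lhs, rhs) & 0xFF


def after(a, b, c, op):
    """zext(logic(icmp sgt a, b, icmp slt a, c)) with the logic op done in i1."""
    combined = LOGIC_OPS[op](int(a > b), int(a < c)) & 1
    return zext_i1_to_i8(combined)


def forms_agree(values=range(-3, 4)):
    """True if both forms give the same i8 result on every sampled triple."""
    return all(
        before(a, b, c, op) == after(a, b, c, op)
        for op in LOGIC_OPS
        for a in values
        for b in values
        for c in values
    )
```

This only shows that the two forms compute the same value. Whether the transform is profitable, and how it avoids looping with `transformZExtICmp()`, is covered by the message itself.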
Tobias Grosser	8ef834c712	[InstCombine] Minor cleanup of cast simplification code [NFC] Summary: This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and according rationale are summarized in the following: - Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter. - Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there. - Simplify interface of `shouldOptimizeCast()`. - Minor code style adaptions in `shouldOptimizeCast()`. - Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant. - Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format. - Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability. - Simplify the interface of `isEliminableCastPair()`. - Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration. - Renamed parameter `DoXform` of `transformZExtIcmp()` to `DoTransform` to make its intention clearer. - Moved documentation of `transformZExtIcmp()` from its definition to its declaration and made it doxygen-compliant. 
Reviewers: vtjnash, grosser Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22449 Contributed-by: Matthias Reisinger llvm-svn: 275964	2016-07-19 09:06:08 +00:00
Sanjay Patel	79acd2a96b	[InstCombine] allow X + signbit --> X ^ signbit for vector splats llvm-svn: 275691	2016-07-16 18:29:26 +00:00
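The identity behind 79acd2a96b is that adding the sign bit and xor-ing with it agree modulo the type width, because any carry out of the top bit is discarded. A small Python sketch for 8-bit lanes (the width and the function names are chosen only for illustration):

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGNBIT = 1 << (BITS - 1)


def add_signbit(x):
    """X + signbit, wrapped to BITS bits."""
    return (x + SIGNBIT) & MASK


def xor_signbit(x):
    """X ^ signbit."""
    return (x & MASK) ^ SIGNBIT


def identity_holds():
    """True if both forms agree for every BITS-wide value."""
    return all(add_signbit(x) == xor_signbit(x) for x in range(1 << BITS))
```

For a splat vector the same argument applies lane by lane, which is what allows the fold to be extended to vectors.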
Sanjay Patel	f9d2b20daf	[InstCombine] reassociate logic ops with constants separated by a zext This is a partial implementation of a general fold for associative+commutative operators: (op (cast (op X, C2)), C1) --> (cast (op X, op (C1, C2))) (op (cast (op X, C2)), C1) --> (op (cast X), op (C1, C2)) There are 7 associative operators and 13 cast types, so this could potentially go a lot further. Differential Revision: https://reviews.llvm.org/D22421 llvm-svn: 275684	2016-07-16 15:20:19 +00:00
Sanjay Patel	bbbb3ce787	don't repeat function names in comments; NFC llvm-svn: 275470	2016-07-14 20:54:43 +00:00
David Majnemer	666aa945a5	[InstCombine] Masked loads with undef masks can fold to normal loads We were able to fold masked loads with an all-ones mask to a normal load. However, we couldn't turn a masked load with a mask with mixed ones and undefs into a normal load. llvm-svn: 275380	2016-07-14 06:58:42 +00:00
David Majnemer	d77a3b61eb	Move a transform from InstCombine to InstSimplify. This transform doesn't require any new instructions, it can safely live in InstSimplify. llvm-svn: 275344	2016-07-13 23:32:53 +00:00
Sanjay Patel	c00e48a3db	[InstCombine] extend vector select matching for non-splat constants In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could move that helper function to a higher level. There is an open question as to which of these forms should be considered the canonical IR: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3> Differential Revision: http://reviews.llvm.org/D22114 llvm-svn: 275289	2016-07-13 18:07:02 +00:00
Anna Thomas	9ad45adfd7	Revert "InstCombine rule to fold truncs whose value is available" This reverts commit r274853. Caused failure in ppcBE build llvm-svn: 274943	2016-07-08 22:15:08 +00:00
Sanjay Patel	664514f7fe	[InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), but we'll try to err on the conservative side by going with the case that has fewer IR instructions. Note: This question came up in http://reviews.llvm.org/D22114 , but this part is independent of that patch proposal, so I'm making this small change ahead of that one. See also: http://reviews.llvm.org/rL274926 llvm-svn: 274932	2016-07-08 21:17:51 +00:00
Sanjay Patel	f4a08ede03	[InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops llvm-svn: 274926	2016-07-08 20:53:29 +00:00
Sanjay Patel	1b6b824548	[InstCombine] check for one-use before turning simple logic op into a select llvm-svn: 274891	2016-07-08 17:26:47 +00:00
Sanjay Patel	cbfca9e8ef	[InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors llvm-svn: 274883	2016-07-08 17:01:15 +00:00
Anna Thomas	3124f6273a	InstCombine rule to fold truncs whose value is available We can fold truncs whose operand feeds from a load, if the trunc value is available through a prior load/store. This change is from: http://reviews.llvm.org/D21246, which folded the trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW call, when the load type didn't match the prior load/store type. Differential Revision: http://reviews.llvm.org/D21791 llvm-svn: 274853	2016-07-08 15:18:56 +00:00
Sanjay Patel	25600f39eb	save type in local var; NFCI llvm-svn: 274760	2016-07-07 15:28:17 +00:00
Sanjay Patel	65a51c25c1	[InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we allow these transforms for splat vectors. Differential Revision: http://reviews.llvm.org/D21899 llvm-svn: 274696	2016-07-06 22:23:01 +00:00
Sanjay Patel	ea23436638	[InstCombine] use more specific pattern matchers; NFCI Follow-up from r274465: we don't need to capture the value in these cases, so just match the constant that we're looking for. m_One/m_Zero work with vector splats as well as scalars. llvm-svn: 274670	2016-07-06 21:01:26 +00:00
Sanjay Patel	cbaac41856	[InstCombine] enable vector select of bools -> logic folds llvm-svn: 274465	2016-07-03 14:34:39 +00:00
Sanjay Patel	a1a4e100be	fix formatting; NFC llvm-svn: 274463	2016-07-03 14:08:19 +00:00
Sean Silva	45835e731d	Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC. This actually uncovered a surprisingly large chain of ultimately unused TLI args. From what I can gather, this argument is a remnant of when isKnownNonNull would look at the TLI directly. The current approach seems to be that InferFunctionAttrs runs early in the pipeline and uses TLI to annotate the TLI-dependent non-null information as return attributes. This also removes the dependence of functionattrs on TLI altogether. llvm-svn: 274455	2016-07-02 23:47:27 +00:00
Sanjay Patel	7521e1b880	fix formatting, add TODO; NFC llvm-svn: 274238	2016-06-30 15:32:45 +00:00
Sanjay Patel	7c6eab5777	[InstCombine] shrink switch conditions better (PR24766) https://llvm.org/bugs/show_bug.cgi?id=24766#c2 This removes a hack that was added for the benefit of x86 codegen. It prevented shrinking the switch condition even to smaller legal (DataLayout) types. We have a safety mechanism in CGP after: http://reviews.llvm.org/rL251857 ...so we're free to use the optimal (smallest) IR type now. Differential Revision: http://reviews.llvm.org/D12965 llvm-svn: 274233	2016-06-30 14:51:21 +00:00
Sanjay Patel	4520d9a1f5	[InstCombine] use ConstantExpr::getBitCast() instead of creating useless instruction llvm-svn: 274229	2016-06-30 14:27:41 +00:00
Sanjay Patel	7ad98babfa	[InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types If the incoming types are i1, then we don't have to pattern match any sext ops. Differential Revision: http://reviews.llvm.org/D21740 llvm-svn: 274228	2016-06-30 14:18:18 +00:00
Tim Shen	aec68b263d	[InstCombine] Simplify and correct folding fcmps with the same children Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp , x, y) \| (fcmp , x, y) and (fcmp , x, y) & (fcmp , x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants. Currently InstCombine wrongly folds (fcmp ogt, x, y) \| (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that. Reviewers: spatel Subscribers: llvm-commits, iteratee, echristo Differential Revision: http://reviews.llvm.org/D21775 llvm-svn: 274156	2016-06-29 20:10:17 +00:00
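The "bit pattern" that aec68b263d refers to can be illustrated with a small model. The Python sketch below assumes a 4-bit predicate encoding whose bits mean, from high to low, unordered, less, greater, equal (so `ogt` is `0b0010` and `ord` is `0b0111`). That layout is stated here as an assumption about `FCmpInst::Predicate`, and the function names are illustrative only.

```python
import math

OGT = 0b0010  # greater
ORD = 0b0111  # less | greater | equal, i.e. neither operand is NaN


def fcmp(pred, x, y):
    """Evaluate a 4-bit predicate (bits: U L G E) on two floats."""
    if math.isnan(x) or math.isnan(y):
        return bool(pred & 0b1000)
    if x == y:
        return bool(pred & 0b0001)
    if x > y:
        return bool(pred & 0b0010)
    return bool(pred & 0b0100)


SAMPLES = [
    (1.0, 2.0), (2.0, 1.0), (1.0, 1.0),
    (math.nan, 1.0), (1.0, math.nan), (math.nan, math.nan),
]


def bitwise_combination_holds():
    """Check fcmp(P) | fcmp(Q) == fcmp(P | Q), and likewise for &, on the same operands."""
    for p in range(16):
        for q in range(16):
            for x, y in SAMPLES:
                a, b = fcmp(p, x, y), fcmp(q, x, y)
                if (a or b) != fcmp(p | q, x, y):
                    return False
                if (a and b) != fcmp(p & q, x, y):
                    return False
    return True
```

Under this encoding `OGT | ORD` equals `ORD`, which matches the message's point that folding `(fcmp ogt, x, y) | (fcmp ord, x, y)` to `(fcmp ogt, x, y)` was wrong: for `x = 1.0, y = 2.0` the `ord` compare is true while the `ogt` compare is false.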
Tim Shen	860a67eb4c	[InstCombine, NFC] Change the generated variable names by creating new instructions This removes some noise for D21775's test changes. llvm-svn: 274155	2016-06-29 20:10:13 +00:00
Eric Christopher	0c58837b1f	Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions" Revert "[InstCombine] Combine A->B->A BitCast" as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847 This reverts commits r270135 and r263734. llvm-svn: 274094	2016-06-29 03:05:58 +00:00
Michael Kuperstein	835facd863	[PM] Normalize FIXMEs for missing PreserveCFG to have the same wording. llvm-svn: 273974	2016-06-28 00:54:12 +00:00
Sanjay Patel	59ed2ffca3	[InstCombine] shrink type of sdiv if dividend is sexted and constant divisor is small enough (PR28153) This should fix PR28153: https://llvm.org/bugs/show_bug.cgi?id=28153 Differential Revision: http://reviews.llvm.org/D21769 llvm-svn: 273951	2016-06-27 22:27:11 +00:00
Sanjay Patel	bedd1f9d3d	[InstCombine] refactor sdiv by APInt transforms (NFC) There's at least one more fold to do here: https://llvm.org/bugs/show_bug.cgi?id=28153 llvm-svn: 273904	2016-06-27 18:38:40 +00:00
Sanjay Patel	c6ada53be5	[InstCombine] use m_APInt for div --> ashr fold The APInt matcher works with splat vectors, so we get this fold for vectors too. llvm-svn: 273897	2016-06-27 17:25:57 +00:00
Benjamin Kramer	135f735af1	Apply clang-tidy's modernize-loop-convert to most of lib/Transforms. Only minor manual fixes. No functionality change intended. llvm-svn: 273808	2016-06-26 12:28:59 +00:00
Sanjay Patel	2cbe679774	[InstCombine] use m_APInt; NFCI llvm-svn: 273715	2016-06-24 20:36:34 +00:00
Sanjay Patel	4e8ebce196	[InstCombine] refactor optional bitcasting in matchSelectFromAndOr() into one code path (NFCI) Tests to verify that the commuted variants are all exercised were added with: http://reviews.llvm.org/rL273702 llvm-svn: 273706	2016-06-24 18:55:27 +00:00
Reid Kleckner	fbd5eef691	Revert "InstCombine rule to fold trunc when value available" This reverts commit r273608. Broke building code with sanitizers, where apparently these kinds of loads, casts, and truncations are common: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502 http://crbug.com/623099 llvm-svn: 273703	2016-06-24 18:42:58 +00:00
Sanjay Patel	f8b08f7179	[InstCombine] consolidate commutation variants of matchSelectFromAndOr() in one place; NFCI By putting all the possible commutations together, we simplify the code. Note that this is NFCI, but I'm adding tests that actually exercise each commutation pattern because we don't have this anywhere else. llvm-svn: 273702	2016-06-24 18:26:02 +00:00
Anna Thomas	31a0b2088f	InstCombine rule to fold trunc when value available Summary: This instcombine rule folds away trunc operations that have value available from a prior load or store. This kind of code can be generated as a result of GVN widening the load or from source code as well. Reviewers: reames, majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21246 llvm-svn: 273608	2016-06-23 20:22:22 +00:00
Rafael Espindola	2b7fef681f	Delete more dead code. Found by gcc 6. llvm-svn: 273402	2016-06-22 12:44:16 +00:00
David Majnemer	e61e4bfd87	Replace silly uses of 'signed' with 'int' llvm-svn: 273244	2016-06-21 05:10:24 +00:00
Sanjay Patel	9ad8fb68f7	[InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869) By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question raised by PR27869: https://llvm.org/bugs/show_bug.cgi?id=27869 ...where InstCombine turns an icmp+zext into a shift causing us to miss the fold. Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp. Differential Revision: http://reviews.llvm.org/D21512 llvm-svn: 273200	2016-06-20 20:59:59 +00:00
Matt Arsenault	802ebcb4bb	InstCombine: Don't strip convergent from intrinsic callsites Specific instances of intrinsic calls may want to be convergent, such as certain register reads but the intrinsic declaration is not. llvm-svn: 273188	2016-06-20 19:04:44 +00:00
Matt Arsenault	8fd5978811	Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""" This seems to be causing an infinite loop / crash in instcombine on some bots. llvm-svn: 273069	2016-06-17 23:36:38 +00:00
Matt Arsenault	d76efc14b9	Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."" Reapply r272987. Condition should be in terms of the destination type, and the flags should not be copied. llvm-svn: 273045	2016-06-17 20:33:53 +00:00
Sanjay Patel	216d8cf720	[InstCombine] allow more than one use for vector bitcast folding with selects The motivating example for this transform is similar to D20774 where bitcasts interfere with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to produce min and max ops: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %bc1 = bitcast <4 x float> %a to <4 x i32> %bc2 = bitcast <4 x float> %b to <4 x i32> %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2 %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1 %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>* store <4 x i32> %sel1, <4 x i32>* %bc3 %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>* store <4 x i32> %sel2, <4 x i32>* %bc4 ret void } With this patch, we move the selects up to use the input args which allows getting rid of all of the bitcasts: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16 store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16 ret void } The asm for x86 SSE then improves from: movaps %xmm0, %xmm2 cmpltps %xmm1, %xmm2 movaps %xmm2, %xmm3 andnps %xmm1, %xmm3 movaps %xmm2, %xmm4 andnps %xmm0, %xmm4 andps %xmm2, %xmm0 orps %xmm3, %xmm0 andps %xmm1, %xmm2 orps %xmm4, %xmm2 movaps %xmm0, (%rdi) movaps %xmm2, (%rsi) To: movaps %xmm0, %xmm2 minps %xmm1, %xmm2 maxps %xmm0, %xmm1 movaps %xmm2, (%rdi) movaps %xmm1, (%rsi) The TODO comments show that we're limiting this transform only to vectors and only to bitcasts because we need to improve other transforms or risk creating worse codegen. Differential Revision: http://reviews.llvm.org/D21190 llvm-svn: 273011	2016-06-17 16:46:50 +00:00
Matt Arsenault	ce56f7bbaa	Revert "InstCombine: Reduce trunc (shl x, K) width." This reverts commit r272987. This might be causing crashes on some bots. llvm-svn: 272990	2016-06-17 06:28:53 +00:00
Matt Arsenault	028fd50642	InstCombine: Reduce trunc (shl x, K) width. llvm-svn: 272987	2016-06-17 04:43:22 +00:00
Eli Friedman	bd254a6f45	[InstCombine] Don't widen metadata on store-to-load forwarding The original check for load CSE or store-to-load forwarding is wrong when the forwarded stored value happened to be a load. Ref https://github.com/JuliaLang/julia/issues/16894 Differential Revision: http://reviews.llvm.org/D21271 Patch by Yichao Yu! llvm-svn: 272868	2016-06-16 02:33:42 +00:00
Craig Topper	99d1eab327	[IR] Require ArrayRef of 'uint32_t' instead of 'int' for the mask argument for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it for the -1 as undef behavior. llvm-svn: 272491	2016-06-12 00:41:19 +00:00
Sanjay Patel	3929313811	[InstCombine] move fold of select of add/sub to helper function; NFCI llvm-svn: 272199	2016-06-08 21:10:01 +00:00
Sanjay Patel	384d0f219d	[InstCombine] fix outdated comment, simplify logic; NFCI llvm-svn: 272196	2016-06-08 20:31:52 +00:00
Sanjay Patel	10a2c38d83	[InstCombine] reduce indent; NFC llvm-svn: 272193	2016-06-08 20:09:04 +00:00
Sanjay Patel	916f8a0cdb	[InstCombine] use copyIRFlags() ; NFCI llvm-svn: 272191	2016-06-08 19:33:52 +00:00
Benjamin Kramer	c321e53402	Apply most suggestions of clang-tidy's performance-unnecessary-value-param Avoids unnecessary copies. All changes audited & pass tests with asan. No functional change intended. llvm-svn: 272190	2016-06-08 19:09:22 +00:00
Benjamin Kramer	46e38f3678	Avoid copies of std::strings and APInt/APFloats where we only read from it As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126	2016-06-08 10:01:20 +00:00
Simon Pilgrim	db9893fb90	[InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit). If the shift amount is constant we can sometimes convert these instructions to native shifts: 1 - if all shift amounts are in range then the conversion is trivial. 2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion. 3 - logical shifts just return zero if all elements have out of range shift amounts. In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts. Differential Revision: http://reviews.llvm.org/D19675 llvm-svn: 271996	2016-06-07 10:27:15 +00:00
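The three cases listed in db9893fb90 can be modelled per lane. The Python sketch below follows the semantics as that message describes them for 32-bit lanes (out-of-range logical shifts give zero, out-of-range arithmetic shifts splat the sign bit). The function names are hypothetical and no intrinsic is called.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def psrav_lane(value, amount):
    """Per-element arithmetic right shift: out-of-range amounts splat the sign bit."""
    v = to_signed(value)
    if amount >= BITS:
        return -1 if v < 0 else 0
    return v >> amount


def psrlv_lane(value, amount):
    """Per-element logical right shift: out-of-range amounts produce zero."""
    return 0 if amount >= BITS else (value & MASK) >> amount


def native_ashr_clamped(value, amount):
    """Native ashr after clamping the amount to BITS - 1 (case 2 in the message)."""
    return to_signed(value) >> min(amount, BITS - 1)


def clamping_is_equivalent(values=(0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0x12345678)):
    """True if the clamped native shift matches the per-element arithmetic shift."""
    return all(
        psrav_lane(v, amt) == native_ashr_clamped(v, amt)
        for v in values
        for amt in range(0, 2 * BITS)
    )
```

The logical case has no equivalent clamp, since a native shift by `BITS - 1` can still leave one bit set. That is consistent with the message limiting case 3 to vectors where every element is out of range.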
Simon Pilgrim	91e3ac8293	[InstCombine][SSE] Add MOVMSK constant folding (PR27982) This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions. The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs. Differential Revision: http://reviews.llvm.org/D20998 llvm-svn: 271990	2016-06-07 08:18:35 +00:00
Michael Kuperstein	a0c6ae02a5	[InstCombine] scalarizePHI should not assume the code it sees has been CSE'd scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we can not assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 llvm-svn: 271961	2016-06-06 23:38:33 +00:00
Sanjay Patel	6a333c3ed9	[InstCombine] limit icmp transform to ConstantInt (PR28011) In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check above this to work for any Constant rather than ConstantInt. AFAICT, that part makes sense if we can determine that the shrunken/extended constant remained equal. But it doesn't make sense for this later transform where we assume that the constant DID change. This could assert for a ConstantExpr: https://llvm.org/bugs/show_bug.cgi?id=28011 And it could be wrong for a vector as shown in the added regression test. llvm-svn: 271908	2016-06-06 16:56:57 +00:00
Sanjoy Das	b7e861a488	Add safety check to InstCombiner::commonIRemTransforms Since FoldOpIntoPhi speculates the binary operation to potentially each of the predecessors of the PHI node (pulling it out of arbitrary control dependence in the process), we can FoldOpIntoPhi only if we know the operation doesn't have UB. This also brings up an interesting profitability question -- the way it is written today, commonIRemTransforms will hoist out work from dynamically dead code into code that will execute at runtime. Perhaps that isn't the best canonicalization? Fixes PR27968. llvm-svn: 271857	2016-06-05 21:17:04 +00:00
Sanjay Patel	a6fbc82392	[InstCombine] allow vector icmp bool transforms llvm-svn: 271843	2016-06-05 17:49:45 +00:00
Sanjay Patel	5f0217f42e	fix documentation comments and other clean-ups; NFC llvm-svn: 271839	2016-06-05 16:46:18 +00:00
Sanjay Patel	6f8f47b358	[InstCombine] less 'CI' confusion; NFC Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C', so that it's hopefully clearer that 'CI' refers to CastInst in this context. While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'. llvm-svn: 271817	2016-06-05 00:12:32 +00:00
Sanjay Patel	ea8a211169	[InstCombine] allow vector constants for cast+icmp fold This is step 1 of unknown towards fixing PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001 llvm-svn: 271810	2016-06-04 22:04:05 +00:00
Sanjay Patel	c774f8c265	clean-up; NFC llvm-svn: 271807	2016-06-04 21:20:44 +00:00
Sanjay Patel	4c204230fc	fix formatting, punctuation; NFC llvm-svn: 271804	2016-06-04 20:39:22 +00:00
Simon Pilgrim	fda22d66fc	[InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614 Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type. llvm-svn: 271789	2016-06-04 13:42:46 +00:00
Sanjay Patel	6cf18af1c5	[InstCombine] look through bitcasts to find selects There was concern that creating bitcasts for the simpler potential select pattern: define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) { %a2 = add <2 x i64> %a, %a %sext = sext <4 x i1> %cmp to <4 x i32> %bc = bitcast <4 x i32> %sext to <2 x i64> %and = and <2 x i64> %a2, %bc ret <2 x i64> %and } might lead to worse code for some targets, so this patch is matching the larger patterns seen in the test cases. The motivating example for this patch is this IR produced via SSE intrinsics in C: define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) { %t0 = bitcast <2 x i64> %a to <4 x i32> %t1 = bitcast <2 x i64> %b to <4 x i32> %cmp = icmp sgt <4 x i32> %t0, %t1 %sext = sext <4 x i1> %cmp to <4 x i32> %t2 = bitcast <4 x i32> %sext to <2 x i64> %and = and <2 x i64> %t2, %a %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1> %neg2 = bitcast <4 x i32> %neg to <2 x i64> %and2 = and <2 x i64> %neg2, %b %or = or <2 x i64> %and, %and2 ret <2 x i64> %or } For an AVX target, this is currently: vpcmpgtd %xmm1, %xmm0, %xmm2 vpand %xmm0, %xmm2, %xmm0 vpandn %xmm1, %xmm2, %xmm1 vpor %xmm1, %xmm0, %xmm0 retq With this patch, it becomes: vpmaxsd %xmm1, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D20774 llvm-svn: 271676	2016-06-03 14:42:07 +00:00
Sanjay Patel	dba8b4c04d	transform obscured FP sign bit ops into a fabs/fneg using TLI hook This is effectively a revert of: http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) and: http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions. This is intended to resolve the objections raised on the dev list: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html and: https://llvm.org/bugs/show_bug.cgi?id=24886#c4 In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later. Differential Revision: http://reviews.llvm.org/D19391 llvm-svn: 271573	2016-06-02 20:01:37 +00:00
Sanjay Patel	5c0bc02878	[InstCombine] remove guard for generating a vector select This is effectively NFC because we already do this transform after r175380: http://reviews.llvm.org/rL175380 and also via foldBoolSextMaskToSelect(). This change should just make it a bit more efficient to match the pattern. The original guard was added in r95058: http://reviews.llvm.org/rL95058 A sampling of codegen for current in-tree targets shows no problems. This makes sense given that we're already producing the vector selects via the other transforms. llvm-svn: 271554	2016-06-02 18:03:05 +00:00
Saleem Abdulrasool	d2f705ddf9	X86: permit using SjLj EH on x86 targets as an option This adds support to the backend to actually support SjLj EH as an exception model. This is NOT the default model, and requires explicitly opting into it from the frontend. GCC supports this model and for MinGW can still be enabled via the `--using-sjlj-exceptions` option. Addresses PR27749! llvm-svn: 271244	2016-05-31 01:48:07 +00:00
Craig Topper	8287fd8abd	[X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions. llvm-svn: 271236	2016-05-30 23:15:56 +00:00
Simon Pilgrim	9602d678cb	[X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX some time ago so much of that implementation can be reused. Reapplied now that the companion patch (D20684), which removes/auto-upgrades the clang intrinsics, has been committed. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 271131	2016-05-28 18:03:41 +00:00
Sanjay Patel	74d23ad498	[InstCombine] move and/sext fold to helper function; NFCI We need to enhance the pattern matching on these to look through bitcasts. llvm-svn: 271051	2016-05-27 21:41:29 +00:00
Simon Pilgrim	4642a57fbf	Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) llvm-svn: 270976	2016-05-27 09:02:25 +00:00
Simon Pilgrim	c013e5737b	[X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused. A companion patch (D20684) removes/auto-upgrade the clang intrinsics. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 270973	2016-05-27 08:49:15 +00:00
Chad Rosier	e5819e2732	[InstCombine] Catch more bswap cases missed due to zext and truncs. Fixes PR27824. Differential Revision: http://reviews.llvm.org/D20591. llvm-svn: 270853	2016-05-26 14:58:51 +00:00
Craig Topper	a423aa4642	[X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear. I don't really know how to test this. Just seemed like we should be consistent. llvm-svn: 270819	2016-05-26 04:28:45 +00:00
Chad Rosier	a00df49dc5	Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC. Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom, so the ordering of the MatchBSwaps and MatchBitReversals arguments are consistent with the function name. llvm-svn: 270715	2016-05-25 16:22:14 +00:00
Gerolf Hoflehner	00e7092f68	[InstCombine] Fix assertion when bitcast is converted to gep When an aggregate contains an opaque type its size cannot be determined. This triggers an "Invalid GetElementPtrInst indices for type" assert in function checkGEPType. The fix suppresses the conversion in this case. http://reviews.llvm.org/D20319 llvm-svn: 270479	2016-05-23 19:23:17 +00:00
Sanjay Patel	a8ef4a5737	reduce indent; NFC llvm-svn: 270372	2016-05-22 17:08:52 +00:00
Guozhi Wei	b1d37199cc	[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703. If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it. llvm-svn: 270135	2016-05-19 21:07:01 +00:00
Sanjay Patel	22b01febd4	[InstCombine] add another test for wrong icmp constant (PR27792) It doesn't matter if the comparison is unsigned; the inc/dec is always signed. llvm-svn: 269831	2016-05-17 20:20:40 +00:00
Sanjay Patel	86564cad06	[InstCombine] fix constant to be signed for signed comparisons This bug was introduced in r269728 and is the likely cause of many stage 2 ubsan bot failures. I'll add a test in a follow-up commit assuming this fixes things properly. llvm-svn: 269797	2016-05-17 18:38:55 +00:00
Benjamin Kramer	ca9a0fe2b9	[InstCombine] Don't crash when trying to take an element of a ConstantExpr. Fixes PR27786. llvm-svn: 269757	2016-05-17 12:08:55 +00:00
Sanjay Patel	18254935c9	try to avoid unused variable warning in release build; NFCI llvm-svn: 269729	2016-05-17 01:12:31 +00:00
Sanjay Patel	e9b2c32e7f	[InstCombine] check vector elements before trying to transform LE/GE vector icmp (PR27756) Fix a bug introduced with rL269426 : [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue elements. Handle those appropriately. Also, refactor this function to join the scalar and vector paths and eliminate the switches. Differential Revision: http://reviews.llvm.org/D20289 llvm-svn: 269728	2016-05-17 00:57:57 +00:00
Sanjay Patel	abbc2ac231	use 'match' for less indenting; NFCI llvm-svn: 269494	2016-05-13 21:51:17 +00:00
Jun Bum Lim	be11bdc4b0	Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC. Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes fixed in r269433. Reviewers: joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20248 llvm-svn: 269456	2016-05-13 18:38:35 +00:00
Sanjay Patel	0c8f3f9332	[InstCombine] handle zero constant vectors for LE/GE comparisons too Enhancement to: http://reviews.llvm.org/rL269426 With discussion in: http://reviews.llvm.org/D17859 This should complete the fixes for: PR26701, PR26819: https://llvm.org/bugs/show_bug.cgi?id=26701 https://llvm.org/bugs/show_bug.cgi?id=26819 llvm-svn: 269439	2016-05-13 17:28:12 +00:00
Sanjay Patel	b79ab27853	[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) *We don't currently handle the edge case constants (min/max values), so it's not a complete canonicalization. To fully solve the motivating bugs, we need to enhance this to recognize a zero vector too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector or a ConstantDataVector. Differential Revision: http://reviews.llvm.org/D17859 llvm-svn: 269426	2016-05-13 15:10:46 +00:00
Chad Rosier	4e6cda2db5	[InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1. This patch adds support for two optimizations: icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1) icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1 Differential Revision: http://reviews.llvm.org/D20123 llvm-svn: 269109	2016-05-10 20:22:09 +00:00
Arnaud A. de Grandmaison	333ef381b8	[InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges. When a va_start or va_copy is immediately followed by a va_end (ignoring debug information or other start/end in between), then it is safe to remove the pair. As this code shares some commonalities with the lifetime markers, this has been factored into helper functions. This InstCombine pattern kicks in 3 times when running the LLVM test suite. llvm-svn: 269033	2016-05-10 09:24:49 +00:00
Chad Rosier	58919cc6f8	Typo. NFC. llvm-svn: 268975	2016-05-09 21:37:43 +00:00
Chad Rosier	131a42ccdf	[InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A. Differential Revision: http://reviews.llvm.org/D20036 llvm-svn: 268960	2016-05-09 19:30:20 +00:00
Philip Reames	6f4d0088c6	Reapply 267210 with fix for PR27490. Original commit message: Extend load/store type canonicalization to handle unordered operations. Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. Note that the concern about lowering is now much less likely. PR27490 proved that we already were mucking with the types of ordered atomics and volatiles. As a result, this change doesn't introduce as much new behavior as originally thought. llvm-svn: 268809	2016-05-06 22:17:01 +00:00
Balaram Makam	569eaec5f3	Reapply r268521 "[InstCombine] Canonicalize icmp instructions based on dominating conditions." This reapplies commit r268521, which was reverted in r268530 due to a test failure in select-implied.ll. Modified the test case to reflect the new change. llvm-svn: 268557	2016-05-04 21:32:14 +00:00
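
The udiv/icmp folds described in the Chad Rosier commits above (4e6cda2db5 and 131a42ccdf) can be sanity-checked by brute force. The sketch below is not LLVM code: the function name check_udiv_icmp_folds and the small bit width are assumptions made for illustration, and it uses plain Python integers. It skips a zero divisor (udiv by zero is undefined), skips C1 equal to the maximum value for the ugt fold (C1+1 would wrap in fixed-width arithmetic), and skips C1 equal to 0 for the ult fold (C2/C1 would divide by zero). The actual patches may apply further guards that are not modeled here.

```python
def check_udiv_icmp_folds(bits=5):
    """Brute-force the three udiv/icmp rewrites over all `bits`-wide unsigned values.

    Returns True if every rewrite agrees with the original comparison.
    """
    limit = 1 << bits
    max_val = limit - 1
    for x in range(1, limit):  # divisor 0 skipped: udiv by zero is undefined
        for a in range(limit):  # `a` plays the role of A and of C2
            q = a // x
            # icmp eq/ne (udiv A, B), 0 -> icmp ugt/ule B, A
            if (q == 0) != (x > a) or (q != 0) != (x <= a):
                return False
            for c1 in range(limit):
                # icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1)
                if c1 != max_val and (q > c1) != (x <= a // (c1 + 1)):
                    return False
                # icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1
                if c1 != 0 and (q < c1) != (x > a // c1):
                    return False
    return True
```

Calling check_udiv_icmp_folds() walks every 5-bit combination; a larger width checks more cases at cubic cost.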

2457 Commits