llvm-project

Author	SHA1	Message	Date
Sanjay Patel	bc56b2432d	[InstCombine] fix rotate narrowing bug for non-pow-2 types llvm-svn: 346968	2018-11-15 17:19:14 +00:00
Mandeep Singh Grang	0905fc77c1	[InstCombine] Remove a couple of asserts based on incorrect assumptions Summary: These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same. This fixes PR39595. Reviewers: craig.topper, spatel, dmgreen Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54359 llvm-svn: 346874	2018-11-14 17:55:07 +00:00
Sanjay Patel	6072842770	[InstCombine] fix formatting for matchBSwap(); NFC We should have a similar function for matching rotate and/or funnel shift, so tidy up the related existing call. llvm-svn: 346871	2018-11-14 16:03:36 +00:00
Sanjay Patel	a139564896	[InstCombine] fold funnel shift amount based on demanded bits The shift amount of a funnel shift is modulo the scalar bitwidth: http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic ...so we can use demanded bits analysis on that operand to simplify it when we have a power-of-2 bitwidth. This is another step towards canonicalizing {shift/shift/or} to the intrinsics in IR. Differential Revision: https://reviews.llvm.org/D54478 llvm-svn: 346814	2018-11-13 23:27:23 +00:00
Sanjay Patel	f8f12272e8	[InstCombine] canonicalize rotate patterns with cmp/select The cmp+branch variant of this pattern is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 ...and as discussed there, we probably can't transform that without a rotate intrinsic. We do have that now via funnel shift, but we're not quite ready to canonicalize IR to that form yet. The case with 'select' should already be transformed though, so that's this patch. The sequence with negation followed by masking is what we use in the backend and partly in clang (though that part should be updated). https://rise4fun.com/Alive/TplC %cmp = icmp eq i32 %shamt, 0 %sub = sub i32 32, %shamt %shr = lshr i32 %x, %shamt %shl = shl i32 %x, %sub %or = or i32 %shr, %shl %r = select i1 %cmp, i32 %x, i32 %or => %neg = sub i32 0, %shamt %masked = and i32 %shamt, 31 %maskedneg = and i32 %neg, 31 %shl2 = lshr i32 %x, %masked %shr2 = shl i32 %x, %maskedneg %r = or i32 %shl2, %shr2 llvm-svn: 346807	2018-11-13 22:47:24 +00:00
Sanjay Patel	35b1c2d19d	[InstCombine] narrow width of rotate patterns, part 3 This is a longer variant for the pattern handled in rL346713 This one includes zexts. Eventually, we should canonicalize all rotate patterns to the funnel shift intrinsics, but we need a bit more infrastructure to make sure the vectorizers handle those intrinsics as well as the shift+logic ops. https://rise4fun.com/Alive/FMn Name: narrow rotateright %neg = sub i8 0, %shamt %rshamt = and i8 %shamt, 7 %rshamtconv = zext i8 %rshamt to i32 %lshamt = and i8 %neg, 7 %lshamtconv = zext i8 %lshamt to i32 %conv = zext i8 %x to i32 %shr = lshr i32 %conv, %rshamtconv %shl = shl i32 %conv, %lshamtconv %or = or i32 %shl, %shr %r = trunc i32 %or to i8 => %maskedShAmt2 = and i8 %shamt, 7 %negShAmt2 = sub i8 0, %shamt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shl2 = lshr i8 %x, %maskedShAmt2 %shr2 = shl i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346716	2018-11-12 22:52:25 +00:00
Sanjay Patel	98e427ccf2	[InstCombine] narrow width of rotate patterns, part 2 (PR39624) The sub-pattern for the shift amount in a rotate can take on several different forms, and there's apparently no way to canonicalize those without seeing the entire rotate sequence. This is the form noted in: https://bugs.llvm.org/show_bug.cgi?id=39624 https://rise4fun.com/Alive/qnT %zx = zext i8 %x to i32 %maskedShAmt = and i32 %shAmt, 7 %shl = shl i32 %zx, %maskedShAmt %negShAmt = sub i32 0, %shAmt %maskedNegShAmt = and i32 %negShAmt, 7 %shr = lshr i32 %zx, %maskedNegShAmt %rot = or i32 %shl, %shr %r = trunc i32 %rot to i8 => %truncShAmt = trunc i32 %shAmt to i8 %maskedShAmt2 = and i8 %truncShAmt, 7 %shl2 = shl i8 %x, %maskedShAmt2 %negShAmt2 = sub i8 0, %truncShAmt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shr2 = lshr i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346713	2018-11-12 22:11:09 +00:00
Sanjay Patel	ceab2329b6	[InstCombine] refactor code for matching shift amount of a rotate; NFC As shown in existing test cases and with: https://bugs.llvm.org/show_bug.cgi?id=39624 ...we're missing at least 2 more patterns for rotate narrowing. llvm-svn: 346711	2018-11-12 22:00:00 +00:00
Philip Reames	b8d8db30ea	[GC][InstCombine] Fix a potential iteration issue Noticed via inspection. Appears to be largely innocuous in practice, but a slight code change could have resulted in either visit-order-dependent missed optimizations or infinite loops. May be a minor compile time problem today. llvm-svn: 346698	2018-11-12 20:00:53 +00:00
Sanjay Patel	4a12aa9791	[InstCombine] simplify code for merging stores; NFCI llvm-svn: 346596	2018-11-10 20:29:25 +00:00
Tom Stellard	28d662164d	InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfe Summary: When the 3rd argument to these intrinsics is zero, lowering them to shift instructions produces poison values, since we end up with shift amounts equal to the number of bits in the shifted value. This means we can only lower these intrinsics if we can prove that the 3rd argument is not zero. Reviewers: arsenm Reviewed By: arsenm Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D53739 llvm-svn: 346422	2018-11-08 17:57:57 +00:00
Sanjay Patel	57a08b3343	[InstCombine] propagate FMF for fcmp+fabs folds By morphing the instruction rather than deleting and creating a new one, we retain fast-math-flags and potentially other metadata (profile info?). llvm-svn: 346331	2018-11-07 16:15:01 +00:00
Sanjay Patel	bb521e63af	[InstCombine] peek through fabs() when checking isnan() That should be the end of the missing cases for this fold. See earlier patches in this series: rL346321 rL346324 llvm-svn: 346327	2018-11-07 15:44:26 +00:00
Sanjay Patel	fa5f146872	[InstCombine] add folds for fcmp Pred fabs(X), 0.0 Similar to rL346321, we had folds for the ordered versions of these compares already, so add the unordered siblings for completeness. llvm-svn: 346324	2018-11-07 15:33:03 +00:00
Sanjay Patel	76faf5145d	[InstCombine] add fold for fabs(X) u< 0.0 The sibling fold for 'oge' --> 'ord' was already here, but this half was missing. The result of fabs() must be positive or nan, so asking if the result is negative or nan is the same as asking if the result is nan. This is another step towards fixing: https://bugs.llvm.org/show_bug.cgi?id=39475 llvm-svn: 346321	2018-11-07 15:11:32 +00:00
Sanjay Patel	de58e93666	fix typos aggressively; NFC llvm-svn: 346316	2018-11-07 14:35:36 +00:00
Sanjay Patel	7552d0d2e6	[InstCombine] do not shrink switch conditions to illegal types (PR29009) This patch makes shrinking switch conditions less aggressive which was introduced by: rL274233 Note that we have 2 new bugs to track potential follow-ups that might have solved PR29009 in different ways: https://bugs.llvm.org/show_bug.cgi?id=39569 https://bugs.llvm.org/show_bug.cgi?id=39578 Patch by: @dendibakh (Denis Bakhvalov) Differential Revision: https://reviews.llvm.org/D54115 llvm-svn: 346315	2018-11-07 14:12:41 +00:00
Sanjay Patel	d1172a0c20	[IR] add optional parameter for copying IR flags to compare instructions As shown, this is used to eliminate redundant code in InstCombine, and there are more cases where we should be using this pattern, but we're currently unintentionally dropping flags. llvm-svn: 346282	2018-11-07 00:00:42 +00:00
Sanjay Patel	724014adde	[InstCombine] allow vector types for fcmp+fpext fold llvm-svn: 346245	2018-11-06 17:20:20 +00:00
Sanjay Patel	46bf3922c1	[InstCombine] propagate fast-math-flags when folding fcmp+fpext, part 2 llvm-svn: 346242	2018-11-06 16:45:27 +00:00
Sanjay Patel	7c3ee4da42	[InstCombine] rearrange code for fcmp+fpext; NFCI llvm-svn: 346241	2018-11-06 16:37:35 +00:00
Sanjay Patel	1b85f00201	[InstCombine] propagate fast-math-flags when folding fcmp+fpext llvm-svn: 346240	2018-11-06 16:23:03 +00:00
Sanjay Patel	2fd5b0ebfb	[InstCombine] propagate fast-math-flags when folding fcmp+fneg, part 2 llvm-svn: 346238	2018-11-06 15:58:57 +00:00
Sanjay Patel	05e70fb978	[InstCombine] reduce code; NFC llvm-svn: 346235	2018-11-06 15:53:58 +00:00
Sanjay Patel	70282a0501	[InstCombine] propagate fast-math-flags when folding fcmp+fneg This is another part of solving PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 This might be enough to fix that particular issue, but as noted with the FIXME, we're still dropping FMF on other folds around here. llvm-svn: 346234	2018-11-06 15:49:45 +00:00
Simon Pilgrim	c1da5f757e	[InstCombine] Ensure nested shifts are in range (OSS-Fuzz #9880) llvm-svn: 346225	2018-11-06 11:28:22 +00:00
Sanjay Patel	1440107821	[InstSimplify] fold select (fcmp X, Y), X, Y This is NFCI for InstCombine because it calls InstSimplify, so I left the tests for this transform there. As noted in the code comment, we can allow this fold more often by using FMF and/or value tracking. llvm-svn: 346169	2018-11-05 21:51:39 +00:00
Sanjay Patel	c26fd1e772	[InstCombine] canonicalize -0.0 to +0.0 in fcmp As stated in IEEE-754 and discussed in: https://bugs.llvm.org/show_bug.cgi?id=38086 ...the sign of zero does not affect any FP compare predicate. Known regressions were fixed with: rL346097 (D54001) rL346143 The transform will help reduce pattern-matching complexity to solve: https://bugs.llvm.org/show_bug.cgi?id=39475 ...as well as improve CSE and codegen (a zero constant is almost always easier to produce than 0x80..00). llvm-svn: 346147	2018-11-05 17:26:42 +00:00
Sanjay Patel	87aa10062c	[InstCombine] loosen FP 0.0 constraint for fcmp+select substitution It looks like we correctly removed edge cases with 0.0 from D50714, but we were a bit conservative because getBinOpIdentity() doesn't distinguish between +0.0 and -0.0 and 'nsz' is effectively always true for fcmp (see discussion in: https://bugs.llvm.org/show_bug.cgi?id=38086 ). Without this change, we would get regressions by canonicalizing to +0.0 in all fcmp, and that's a step towards solving: https://bugs.llvm.org/show_bug.cgi?id=39475 llvm-svn: 346143	2018-11-05 16:50:44 +00:00
Volkan Keles	3ca146d083	[InstCombine] Combine nested min/max intrinsics with constants Reviewers: arsenm, spatel Reviewed By: spatel Subscribers: lebedev.ri, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D53774 llvm-svn: 345751	2018-10-31 17:50:52 +00:00
Sanjay Patel	1c254c6716	[InstCombine] refactor fabs+fcmp fold; NFC Also, remove/replace/minimize/enhance the tests for this fold. The code drops FMF, so it needs more tests and at least 1 fix. llvm-svn: 345734	2018-10-31 16:34:43 +00:00
Sanjay Patel	b9fe3fbb57	[InstCombine] add assertion that InstSimplify has folded a fabs+fcmp; NFC The 'OLT' case was updated at rL266175, so I assume it was just an oversight that 'UGE' was not included because that patch handled both predicates in InstSimplify. llvm-svn: 345727	2018-10-31 15:31:45 +00:00
Sanjay Patel	85cba3b6fb	[InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negative This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 ...but given the current implementation (no FMF on casts), this is likely the only way to predicate the transform. This is part of solving PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 Differential Revision: https://reviews.llvm.org/D53874 llvm-svn: 345725	2018-10-31 14:57:23 +00:00
Sanjay Patel	4c39dfc91e	[InstCombine] use 'match' to reduce code; NFC llvm-svn: 345647	2018-10-30 20:52:25 +00:00
Quentin Colombet	900678227c	[InstCombine] Teach the move free before null test opti how to deal with noop casts InstCombine features an optimization that essentially replaces: if (a) free(a) with: free(a) Right now, this optimization is gated by the minsize attribute and therefore we only perform it if we can prove that we are going to be able to eliminate the branch and the destination block. However, when casts are involved the optimization would fail to apply, because it was not smart enough to realize that it is possible to also move the casts away from the destination block and that this is harmless to performance since they are just noops. E.g., foo(int *a) if (a) free((char*)a) Wouldn't be optimized by instcombine, because - We would refuse to hoist the `bitcast i32* %a to i8*` in the source block - We would fail to see that `bitcast i32* %a to i8*` and %a are the same value. This patch fixes both these problems: - It teaches the pattern matching of the comparison how to look through casts. - It checks whether the additional instructions in the destination block can be hoisted and are harmless performance-wise. - It hoists all the code of the destination block in the source block. Differential Revision: D53356 llvm-svn: 345644	2018-10-30 20:51:04 +00:00
Sanjay Patel	68a61cb07c	[InstCombine] use getFltSemantics() instead of duplicating it; NFC llvm-svn: 345613	2018-10-30 16:21:56 +00:00
Sanjay Patel	b12e410082	[InstCombine] try to turn shuffle into insertelement shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC' The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles. Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible. I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with. Differential Revision: https://reviews.llvm.org/D53507 llvm-svn: 345607	2018-10-30 15:26:39 +00:00
Cameron McInally	384a74b0e6	[FPEnv] Last BinaryOperator::isFNeg(...) to m_FNeg(...) changes Replacing BinaryOperator::isFNeg(...) to avoid regressions when we separate FNeg from the FSub IR instruction. Differential Revision: https://reviews.llvm.org/D53650 llvm-svn: 345295	2018-10-25 18:09:33 +00:00
Gabor Buella	1f6ca0ba15	Add -instcombine-code-sinking option Reviewers: craig.topper, andrew.w.kaylor, efriedma Reviewed By: craig.topper, andrew.w.kaylor, efriedma Differential Revision: https://reviews.llvm.org/D52709 llvm-svn: 345248	2018-10-25 08:32:29 +00:00
Sanjay Patel	3b206305fd	[InstCombine] try harder to form select from logic ops (2nd try) The original patch was committed here: rL344609 ...and reverted: rL344612 ...because it did not properly check/test data types before calling ComputeNumSignBits(). The tests that caused bot failures for the previous commit are over-reaching front-end tests that run the entire -O optimizer pipeline: Clang :: CodeGen/builtins-systemz-zvector.c Clang :: CodeGen/builtins-systemz-zvector2.c I've added a negative test here to ensure coverage for that case. The new early exit check also tests the type of the 'B' parameter, so we don't waste time on matching if either value is unsuitable. Original commit message: This is part of solving PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The patterns shown here are a special case of something that we already convert to select. Using ComputeNumSignBits() catches that case (but not the more complicated motivating patterns yet). The backend has hooks/logic to convert back to logic ops if that's better for the target. llvm-svn: 345149	2018-10-24 15:17:56 +00:00
Sanjay Patel	95790c546f	[InstCombine] use 'match' to simplify code There's probably some vector-with-undef-element pattern that shows an improvement, so this is probably not quite 'NFC'. This is the last step towards removing the fake binop queries for not/neg. I.e., there are no more uses of those functions in trunk. Fneg should follow. llvm-svn: 345050	2018-10-23 16:54:28 +00:00
Sanjay Patel	747feb28e4	[InstCombine] use 'match' to handle vectors and simplify code This is another step towards completely removing the fake binop queries for not/neg/fneg. llvm-svn: 345036	2018-10-23 15:05:12 +00:00
Sanjay Patel	ad76c682c7	[InstCombine] swap select profile metadata when swapping select ops llvm-svn: 345034	2018-10-23 14:43:31 +00:00
Sanjay Patel	0522b0da31	[InstCombine] use 'match' to simplify code; NFC llvm-svn: 344855	2018-10-20 17:15:57 +00:00
Sanjay Patel	ec572ade20	[InstCombine] make code more flexible with lambda; NFC I couldn't tell from svn history when these checks were added, but it pre-dates the split of instcombine into its own directory at rL92459. The motivation for changing the check is partly shown by the code in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 There are also existing regression tests for SLPVectorizer with sequences of extract+insert that are likely assumed to become shuffles by the vectorizer cost models. llvm-svn: 344854	2018-10-20 16:58:27 +00:00
Sanjay Patel	729c4362cf	[InstCombine] add explanatory comment for strange vector logic; NFC llvm-svn: 344852	2018-10-20 16:25:55 +00:00
Thomas Lively	c339250e12	[InstCombine] InstCombine and InstSimplify for minimum and maximum Summary: Depends on D52765 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52766 llvm-svn: 344799	2018-10-19 19:01:26 +00:00
Sanjay Patel	70daf85bc2	[InstCombine] use m_Neg() in dyn_castNegVal() to match vectors with undef elts llvm-svn: 344793	2018-10-19 17:54:53 +00:00
Fangrui Song	2e83b2e9ee	Use llvm::{all,any,none}_of instead of std::{all,any,none}_of. NFC llvm-svn: 344774	2018-10-19 06:12:02 +00:00
Mikael Holmen	e3605d0f70	Add an emitUnaryFloatFnCall version that fetches the function name from TLI Summary: In several places in the code we use the following pattern: if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) { [...] Value *Res = emitUnaryFloatFnCall(X, TLI.getName(LibFunc_tan), B, Attrs); [...] } In short, we check if there is a lib-function for a certain type, and then we _always_ fetch the name of the "double" version of the lib function and construct a call to the appropriate function, that we just checked exists, using that "double" name as a basis. This is of course a problem in cases where the target doesn't support the "double" version, but e.g. only the "float" version. In that case TLI.getName(LibFunc_tan) returns "", and emitUnaryFloatFnCall happily appends an "f" to "", and we erroneously end up with a call to a function called "f". To solve this, the above pattern is changed to if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) { [...] Value *Res = emitUnaryFloatFnCall(X, &TLI, LibFunc_tan, LibFunc_tanf, LibFunc_tanl, B, Attrs); [...] } I.e., instead of first fetching the name of the "double" version and then letting emitUnaryFloatFnCall() add the final "f" or "l", we let emitUnaryFloatFnCall() fetch the right name from TLI. Reviewers: eli.friedman, efriedma Reviewed By: efriedma Subscribers: efriedma, bjope, llvm-commits Differential Revision: https://reviews.llvm.org/D53370 llvm-svn: 344725	2018-10-18 06:27:53 +00:00
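
Several of the rotate commits above (for example f8f12272e8) rewrite a guarded rotate into a form that negates the shift amount and masks both amounts. The following is a minimal C++ sketch of that equivalence at the source level, not the InstCombine implementation itself. The function names `rotr_select`, `rotr_masked`, and `rotr_forms_agree` are hypothetical and chosen only for illustration, and the sketch assumes a 32-bit value with a shift amount below 32, which is the range where the guarded form is well defined.

```cpp
#include <cstdint>

// Guarded rotate-right, mirroring the cmp/select IR pattern:
// a shift amount of 0 must bypass the shifts, because shifting
// a 32-bit value by 32 is undefined.
// Assumed precondition for this sketch: shamt < 32.
std::uint32_t rotr_select(std::uint32_t x, std::uint32_t shamt) {
  return shamt == 0 ? x : ((x >> shamt) | (x << (32 - shamt)));
}

// Branch-free form: negate the amount and mask both amounts to [0, 31].
// With shamt == 0 both shifts are by 0, so the result is simply x.
std::uint32_t rotr_masked(std::uint32_t x, std::uint32_t shamt) {
  return (x >> (shamt & 31)) | (x << ((0u - shamt) & 31));
}

// Compares the two forms over a handful of sample values and every
// shift amount in [0, 31]; returns true if they always agree.
bool rotr_forms_agree() {
  const std::uint32_t samples[] = {0u, 1u, 0x80000000u, 0x12345678u,
                                   0xFFFFFFFFu};
  for (std::uint32_t x : samples)
    for (std::uint32_t s = 0; s < 32; ++s)
      if (rotr_select(x, s) != rotr_masked(x, s))
        return false;
  return true;
}
```

Calling `rotr_forms_agree()` from a test is one way to spot-check the equivalence; the Alive links in the commit messages remain the authoritative proofs for the IR-level transforms.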

3105 Commits