Commit Graph

2307 Commits

Author SHA1 Message Date
Craig Topper 358cd9ae3a [InstCombine] Remove the zextOrTrunc from ShrinkDemandedConstant.
The demanded mask and the constant should always be the same width for all callers today.

Also stop copying the demanded mask as its passed in. We should avoid allocating memory unless we are going to do something. The final AND to create the new constant will take care of it.

llvm-svn: 300927
2017-04-20 23:58:27 +00:00
Sanjay Patel cc663b82fa [InstCombine] function names start with lower-case letter; NFC
Forgot to make this fix with the signature change in r300911.

llvm-svn: 300912
2017-04-20 22:37:01 +00:00
Sanjay Patel c9485ca895 [InstCombine] allow shl+shr demanded bits folds with splat constants
llvm-svn: 300911
2017-04-20 22:33:54 +00:00
Sanjay Patel 3e1ae72fcf [InstCombine] allow shl demanded bits folds with splat constants
More fixes are needed to enable the helper SimplifyShrShlDemandedBits().

llvm-svn: 300898
2017-04-20 21:33:02 +00:00
Craig Topper ff23889609 [InstCombine] Use APInt::intersects and APInt::isSubsetOf to improve a few more places in SimplifyDemandedBits.
llvm-svn: 300896
2017-04-20 21:24:37 +00:00
Sanjay Patel fb5b3e773a [InstCombine] allow ashr/lshr demanded bits folds with splat constants
llvm-svn: 300888
2017-04-20 20:59:02 +00:00
Craig Topper 17f37ba3b9 [InstCombine] Use APInt::isSubsetOf to simplify some code in SimplifyDemandedBits. NFC
This allows us to use less temporary APInt for And and Invert operations.

llvm-svn: 300885
2017-04-20 20:47:35 +00:00
Craig Topper 0ec3f2f39a [InstCombine] Remove redundant code from SimplifyDemandedBits handling for Or. The code above it is equivalent if you work through the bitwise math.
llvm-svn: 300876
2017-04-20 19:31:22 +00:00
Craig Topper bcfd2d1789 [APInt] Rename getSignBit to getSignMask
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.

Differential Revision: https://reviews.llvm.org/D32108

llvm-svn: 300856
2017-04-20 16:56:25 +00:00
Craig Topper a8129a1122 [APInt] Add isSubsetOf method that can check if one APInt is a subset of another without creating temporary APInts
This question comes up in many places in SimplifyDemandedBits. This makes it easy to ask without allocating additional temporary APInts.

The BitVector class provides a similar functionality through its (IMHO badly named) test(const BitVector&) method. Though its output polarity is reversed.

I've provided one example use case in this patch. I plan to do more as a follow up.

Differential Revision: https://reviews.llvm.org/D32258

llvm-svn: 300851
2017-04-20 16:17:13 +00:00
Craig Topper 83dc1c60aa In SimplifyDemandedUseBits, use computeKnownBits directly to handle Constants
Currently we don't explicitly process ConstantDataSequential, ConstantAggregateZero, or ConstantVector, or Undef before applying the Depth limit. Instead they occur after the depth check in the non-instruction path.

For the constant types that we do handle, the code is replicated from computeKnownBits.

This patch fixes the missing constant handling and the reduces the amount of code by just using computeKnownBits directly for any type of Constant.

Differential Revision: https://reviews.llvm.org/D32123

llvm-svn: 300849
2017-04-20 16:14:58 +00:00
Reid Kleckner aa0cec7d6d Simplify test for sret attribute in instcombine
This change is correct because the verifier requires that at most one
argument be marked 'sret'.

NFC, removes a use of AttributeList slot APIs.

llvm-svn: 300784
2017-04-19 23:17:47 +00:00
Craig Topper 9b71a402c2 [APInt] Cast calls to add/sub/mul overflow methods to void if only their overflow bool out param is used.
This is preparation for a clang change to improve the [[nodiscard]] warning to not be ignored on methods that return a class marked [[nodiscard]] that are defined in the class itself. See D32207.

We should consider adding wrapper methods to APInt that return the overflow flag directly and discard the APInt result. This would eliminate the void casts and the need to create a bool before the call to pass to the out param.

llvm-svn: 300758
2017-04-19 21:09:45 +00:00
Davide Italiano ffcb4df204 [InstCombine] Reduce visitLoadInst() code duplication. NFCI.
llvm-svn: 300717
2017-04-19 17:26:57 +00:00
Sanjoy Das f09c1e346e Add a getPointerOperandType() helper to LoadInst and StoreInst; NFC
I will use this in a later change.

llvm-svn: 300613
2017-04-18 22:00:54 +00:00
Craig Topper fc947bcfba [APInt] Use lshrInPlace to replace lshr where possible
This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.

This adds an lshrInPlace(const APInt &) version as well.

Differential Revision: https://reviews.llvm.org/D32155

llvm-svn: 300566
2017-04-18 17:14:21 +00:00
Davide Italiano cdc937d0fc [InstCombine] Matchers work with both ConstExpr and Instructions.
So, `cast<Instruction>` is not guaranteed to succeed. Change the
code so that we create a new constant and use it in the newly
created instruction, as it's done in other places in InstCombine.

OK'ed by Sanjay/Craig. Fixes PR32686.

llvm-svn: 300495
2017-04-17 20:49:50 +00:00
Craig Topper d23004c37b Introduce APInt::isSignBitSet/isSignBitClear. Use in place isSignBitSet in place of isNegative in known bits tracking.
This makes statements like KnownZero.isNegative() (which means the value we're tracking is positive) less confusing.

llvm-svn: 300457
2017-04-17 16:38:20 +00:00
Matt Arsenault 7205f3c2e4 AMDGPU: SimplifyDemandedElts for image intrinsics
Causes some VGPR usage improvements in shaderdb, but
introduces some SGPR spilling regressions due to random
scheduling changes later.

llvm-svn: 300453
2017-04-17 15:12:44 +00:00
Craig Topper 218a359fbd [InstCombine] Simplify 1/X for vectors.
llvm-svn: 300439
2017-04-17 03:41:47 +00:00
Craig Topper 1a18a7c51e [InstCombine] Add support for vector srem->urem.
llvm-svn: 300437
2017-04-17 01:51:24 +00:00
Craig Topper f248468359 [InstCombine] Add support for turning vector sdiv into udiv.
llvm-svn: 300435
2017-04-17 01:51:19 +00:00
Craig Topper da886c665b [InstCombine][ValueTracking] When computing known bits for Srem make sure we don't compute known bits for the LHS twice.
If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all.

llvm-svn: 300432
2017-04-16 21:46:12 +00:00
Craig Topper 0d304f01b4 [InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of constants with DemandedMask.
Just because we didn't demand them doesn't mean they aren't known.

llvm-svn: 300430
2017-04-16 20:55:58 +00:00
Michael Zuckerman 16b20d2fc5 [X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization
This patch adds new optimization (Folding cmp(sub(a,b),0) into cmp(a,b))
to instCombineCall pass and was written specific for X86 CMP intrinsics.

Differential Revision: https://reviews.llvm.org/D31398

llvm-svn: 300422
2017-04-16 13:26:08 +00:00
Sanjay Patel ef9f586bb2 [InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat vector constants
llvm-svn: 300402
2017-04-15 17:55:06 +00:00
Craig Topper 9a458cd517 [InstCombine] MakeAnd/Or/Xor handling to reuse previous APInt computations
When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code.

This patch moves them to named temporaries so we can reuse them.

Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplifications checks and I didn't want to change that with this patch.

Differential Revision: https://reviews.llvm.org/D32094

llvm-svn: 300376
2017-04-14 22:34:14 +00:00
Reid Kleckner fb502d2f5e [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

llvm-svn: 300367
2017-04-14 20:19:02 +00:00
Sanjay Patel 7cfe41659c [InstCombine] (X != C1 && X != C2) --> (X | (C1 ^ C2)) != C2
...when C1 differs from C2 by one bit and C1 <u C2:
http://rise4fun.com/Alive/Vuo

And move related folds to a helper function. This reduces code duplication and
will make it easier to remove the scalar-only restriction as a follow-up step.

llvm-svn: 300364
2017-04-14 19:23:50 +00:00
Craig Topper fb71b7d3e0 [InstCombine] Support folding a subtract with a constant LHS into a phi node
We currently only support folding a subtract into a select but not a PHI. This fixes that.

I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This is based code is similar to what we do for selects.

Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards.

Differential Revision: https://reviews.llvm.org/D31686

llvm-svn: 300363
2017-04-14 19:20:12 +00:00
Craig Topper c22c7b1459 [InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip code when LHS/RHS aren't BinaryOperators
Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or surprisingly, when neither are BinaryOperators. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls, we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out.

We also rely on these null checks to check the result of getIdentityValue and early out for it.

This patches refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. Should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator.

Differential Revision: https://reviews.llvm.org/D31913

llvm-svn: 300353
2017-04-14 17:55:41 +00:00
Craig Topper c9a4fc0750 [InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC
llvm-svn: 300305
2017-04-14 05:09:04 +00:00
Reid Kleckner f021fab2af [IR] Make getParamAttributes take argument numbers, not ArgNo+1
Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1,
Kind) everywhere.

The fact that the AttributeList index for an argument is ArgNo+1 should
be a hidden implementation detail.

NFC

llvm-svn: 300272
2017-04-13 23:12:13 +00:00
Craig Topper e7563f8dda [InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of getLowBitsSet. NFC
llvm-svn: 300265
2017-04-13 21:49:48 +00:00
Richard Smith 6c2615177b Revert accidentally-committed files in r300252.
llvm-svn: 300253
2017-04-13 20:31:21 +00:00
Richard Smith 55bd375b69 Remove all allocation and divisions from GreatestCommonDivisor
Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This
avoids the (expensive) APInt division operation in favour of bit operations.
Remove all memory allocation from within the GCD loop by tweaking our `lshr`
implementation so it can operate in-place.

Differential Revision: https://reviews.llvm.org/D31968

llvm-svn: 300252
2017-04-13 20:29:59 +00:00
Reid Kleckner 257cb4e099 [InstCombine] Fix !prof metadata preservation for invokes
Summary:
Bug noticed by inspection.

Extend the test to handle invokes as well as calls, and rewrite it to
not depend on the inliner and other passes.

Also simplify the call site replacement code with CallSite, similar to
what I did to dead arg elimination and arg promotion (rL300235 and
rL300229).

Reviewers: danielcdh, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32041

llvm-svn: 300251
2017-04-13 20:26:38 +00:00
Sanjay Patel 445d03bf00 [InstCombine] fold X == 0 || X == -1 to one compare (PR32524)
This is effectively a retry of:
https://reviews.llvm.org/rL299851
but now we have tests and an assert to make sure the bug
that was exposed with that attempt will not happen again.

I'll fix the code duplication and missing sibling fold next,
but I want to make this change as small as possible to reduce
risk since I messed it up last time.

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=32524

llvm-svn: 300236
2017-04-13 18:47:06 +00:00
Reid Kleckner c3fae796fd [InstCombine] Simplify attribute code with new AttributeList::get NFC
llvm-svn: 300230
2017-04-13 18:11:03 +00:00
Sanjay Patel 9745d24a66 [InstCombine] use similar ops for related folds; NFCI
It's less efficient to produce 'ule' than 'ult' since we know we're going to
canonicalize to 'ult', but we shouldn't have duplicated code for these folds.

As a trade-off, this was a pretty terrible way to make a '2'. :)
       if (LHSC == SubOne(RHSC)) 
         AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC);

The next steps are to share the code to fix PR32524 and add the missing 'and'
fold that was left out when PR14708 was fixed:
https://bugs.llvm.org/show_bug.cgi?id=14708

llvm-svn: 300222
2017-04-13 17:36:24 +00:00
Sanjay Patel a8ebb46e0e [InstCombine] fix assert to not always be true
llvm-svn: 300202
2017-04-13 16:05:01 +00:00
Reid Kleckner 7f72033e1c [IR] Take func, ret, and arg attrs separately in AttributeList::get
This seems like a much more natural API, based on Derek Schuff's
comments on r300015. It further hides the implementation detail of
AttributeList that function attributes come last and appear at index
~0U, which is easy for the user to screw up. git diff says it saves code
as well: 97 insertions(+), 137 deletions(-)

This also makes it easier to change the implementation, which I want to
do next.

llvm-svn: 300153
2017-04-13 00:58:09 +00:00
Craig Topper c75f94bfa5 [InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits.
This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch.

llvm-svn: 300094
2017-04-12 19:32:47 +00:00
Craig Topper cf3641fd57 [InstCombine] Remove unreachable code for turning an And where all demanded bits on both sides are known to be zero into a constant 0.
We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones.

llvm-svn: 300093
2017-04-12 19:08:03 +00:00
Sanjay Patel 6e41018942 [InstCombine] fix wrong undef handling when converting select to shuffle
As discussed in:
https://bugs.llvm.org/show_bug.cgi?id=32486
...the canonicalization of vector select to shufflevector does not hold up
when undef elements are present in the condition vector. 

Try to make the undef handling clear in the code and the LangRef.

Differential Revision: https://reviews.llvm.org/D31980

llvm-svn: 300092
2017-04-12 18:39:53 +00:00
Craig Topper f35a7f7b49 [InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of cascaded ifs on opcode. NFC
llvm-svn: 300085
2017-04-12 18:25:25 +00:00
Craig Topper 9a51c7f343 [InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant.
Currently if we reach an instruction with multiples uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user.

This might then reduce the instruction to having a single use and allow additional optimizations on the other path.

This picks up an additional case that r300075 didn't catch.

Differential Revision: https://reviews.llvm.org/D31552

llvm-svn: 300084
2017-04-12 18:17:46 +00:00
Craig Topper b0076fe8b4 [InstCombine] Move portion of SimplifyDemandedUseBits that deals with instructions with multiple uses out to a separate method. NFCI
llvm-svn: 300082
2017-04-12 18:05:21 +00:00
Craig Topper 845033a6c9 Teach SimplifyDemandedUseBits that adding or subtractings 0s from every bit below the highest demanded bit can be simplified
If we are adding/subtractings 0s below the highest demanded bit we can just use the other operand and remove the operation.

My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove add and a subsequent iteration to be ran.

With this change we get bypass the add in the first iteration and prevent the second iteration from changing anything.

Differential Revision: https://reviews.llvm.org/D31120

llvm-svn: 300075
2017-04-12 16:49:59 +00:00
Sanjay Patel 33439f982b [InstCombine] morph an existing instruction instead of creating a new one
One potential way to make InstCombine (very slightly?) faster is to recycle instructions 
when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't
consider this an "InstSimplify". We could, however, make a new layer to house transforms 
like this if that makes InstCombine more manageable (just throwing out an idea; not sure 
how much opportunity is actually here).

Differential Revision: https://reviews.llvm.org/D31863

llvm-svn: 300067
2017-04-12 15:11:33 +00:00