llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	691d1a40e2	[InstCombine] use SelectInst operand names to make code clearer; NFC Cleanup step for D51433. llvm-svn: 341850	2018-09-10 18:37:59 +00:00
Sanjay Patel	1a00ffd656	[InstCombine] fix formatting in SimplifyDemandedVectorElts->Select; NFCI I'm preparing to add the same functionality both here and to the DAG version of this code in D51696 / D51433, so try to make those cases as similar as possible to avoid bugs. llvm-svn: 341545	2018-09-06 13:19:22 +00:00
Craig Topper	034adf2683	[X86] Remove and autoupgrade the scalar fma intrinsics with masking. This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches. llvm-svn: 336871	2018-07-12 00:29:56 +00:00
Craig Topper	350c5f1881	[X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart independent FMA and extractelement/insertelement. llvm-svn: 336315	2018-07-05 06:52:55 +00:00
Simon Pilgrim	79e474bf46	Use APInt[] bit access to avoid "32-bit shift implicitly converted to 64 bits" MSVC warning (again). NFCI. llvm-svn: 335457	2018-06-25 11:46:24 +00:00
Simon Pilgrim	3a0e13f347	Use APInt[] bit access to avoid "32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI. llvm-svn: 335454	2018-06-25 11:38:27 +00:00
Nicolai Haehnle	db6911a6f9	AMDGPU: Remove old-style image intrinsics Summary: This also removes the need for atomic pseudo instructions, since we select the correct encoding directly in SITargetLowering::lowerImage for dimension-aware image intrinsics. Mesa uses dimension-aware image intrinsics since commit a9a7993441. Change-Id: I7473d20009476a4ed6d919cae4e6dca9ff42e77a Reviewers: arsenm, rampitec, mareko, tpr, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48167 llvm-svn: 335231	2018-06-21 13:37:45 +00:00
Nicolai Haehnle	b29ee70122	InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded Summary: Use the expanded features of the TableGen generic tables to avoid manually adding the combinatorially exploded set of intrinsics. The getAMDGPUImageDimIntrinsic lookup function is early-out, i.e. non-AMDGPU intrinsics will never look at the underlying table. Use a generic approach for getting the new intrinsic overload to keep the code simple, and make the image dmask handling more generic: - handle non-sampler image loads - handle the case where the set of demanded elements is not a prefix There is some overlap between this code and an optimization that happens in the backend during code generation. They currently complement each other: - only the codegen optimization can generate vec3 loads - only the InstCombine optimization can handle D16 The InstCombine optimization also likely covers more cases since the codegen optimization is fairly ad-hoc. Ideally, we'll remove the optimization in codegen once the infrastructure for vec3 is in place (which will probably take a long time). Modify the test cases to use dimension-aware intrinsics. This makes it easier to see that the test coverage for the new intrinsics is equivalent, and the old style intrinsics will be removed in a follow-up commit anyway. Change-Id: I4b91ea661413d13004956fe4ef7d13d41b8ce3ad Reviewers: arsenm, rampitec, majnemer Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48165 llvm-svn: 335230	2018-06-21 13:37:31 +00:00
Tomasz Krupa	bcaab53d47	[X86] Lowering sqrt intrinsics to native IR Summary: Complementary patch to lowering sqrt intrinsics in Clang. Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, mike.dvoretsky, llvm-commits Differential Revision: https://reviews.llvm.org/D41599 llvm-svn: 334849	2018-06-15 18:05:24 +00:00
Craig Topper	a17d627abb	[X86] Remove and autoupgrade a bunch of FMA instrinsics that are no longer used by clang. llvm-svn: 332146	2018-05-11 21:59:34 +00:00
Benjamin Kramer	456f473ea8	[InstCombine] Only propagate known leading zeros from udiv input to output. Put in a conservatively correct estimate for now. Avoids miscompiling clang in FDO mode. This is really tricky to trigger in reality as basically all interesting cases will be folded away by computeKnownBits earlier, I was unable to find a reasonably small test case. llvm-svn: 331975	2018-05-10 11:45:18 +00:00
Benjamin Kramer	0d2fc1a501	[InstCombine] Teach SimplifyDemandedBits that udiv doesn't demand low dividend bits that are zero in the divisor This is safe as long as the udiv is not exact. The pattern is not common in C++ code, but comes up all the time in code generated by XLA's GPU backend. Differential Revision: https://reviews.llvm.org/D46647 llvm-svn: 331933	2018-05-09 22:27:34 +00:00
Craig Topper	254ed028a4	[X86] Remove the pmuldq/pmuldq intrinsics and replace with native IR. This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics. We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well. llvm-svn: 329990	2018-04-13 06:07:18 +00:00
Simon Pilgrim	c2ee69035c	Remove useless comment - seems to be a copy+paste typo. NFCI llvm-svn: 325385	2018-02-16 20:41:06 +00:00
Sanjay Patel	aa766efd09	[InstCombine] fix demanded-bits propagation for zext/trunc I was comparing the demanded-bits implementations between InstCombine and TargetLowering as part of investigating questions in D42088 and noticed that this was wrong in IR. We were losing all of the prior known bits when we got back to the 'zext'. llvm-svn: 322662	2018-01-17 14:39:28 +00:00
Simon Pilgrim	a42a54258e	[InstCombine] Fix SimplifyDemandedUseBits SHL handling (PR35515) Don't assume that the pattern matched SRL can be cast to an Instruction (might be ConstExpr etc.) llvm-svn: 320270	2017-12-09 23:42:56 +00:00
Sanjay Patel	e6b48a1b02	[InstCombine] improve demanded vector elements analysis of insertelement Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 llvm-svn: 312248	2017-08-31 15:57:17 +00:00
Craig Topper	3763f0e00d	[InstCombine] Call hasNoSignedWrap instead of hasNoUnsignedWrap to get the NSW flag when handling Add in SimplifyDemandedUseBits. This is a typo from r311789. This should fix PR34349. llvm-svn: 311902	2017-08-28 18:44:28 +00:00
Craig Topper	35171e5d67	[InstCombine] Don't fall back to only calling computeKnownBits if the upper bit of Add/Sub is demanded. Just create an all 1s demanded mask and continue recursing like normal. The recursive calls should be able to handle an all 1s mask and do the right thing. The only time we should care about knowing whether the upper bit was demanded is when we need to know if we should clear the NSW/NUW flags. Now that we have a consistent path through the code for all cases, use KnownBits::computeForAddSub to compute the known bits at the end since we already have the LHS and RHS. My larger goal here is to move the code that turns add into xor if only 1 bit is demanded and no bits below it are non-zero from InstCombiner::OptAndOp to here. This will allow it to be more general instead of just looking for 'add' and 'and' with constant RHS. Differential Revision: https://reviews.llvm.org/D36486 llvm-svn: 311789	2017-08-25 18:39:40 +00:00
Amjad Aboud	22178dd33b	[InstCombine] Consider more cases where SimplifyDemandedUseBits does not convert AShr to LShr. There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit. Differential Revision: https://reviews.llvm.org/D36936 llvm-svn: 311773	2017-08-25 11:07:54 +00:00
Craig Topper	fc9bf50dee	[InstCombine] Remove unnecessary temporary APInt. NFCI llvm-svn: 309887	2017-08-02 21:05:40 +00:00
Craig Topper	4d5050b5ba	[InstCombine] Remove explicit check for impossible condition. Replace with assert Summary: As far as I can tell the earlier call getLimitedValue will guaranteed ShiftAmt is saturated to BitWidth-1 preventing it from ever being equal or greater than BitWidth. At one point in the past the getLimitedValue call was only passed BitWidth not BitWidth - 1. This would have allowed the equality case to get here. And in fact this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include > which should have never been possible. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36123 llvm-svn: 309690	2017-08-01 15:10:25 +00:00
Craig Topper	2072aca51c	[InstCombine] Move (0 - x) & 1 --> x & 1 to SimplifyDemandedUseBits. This removes a dedicated matcher and allows us to support more than just an AND masking the lower bit. llvm-svn: 308124	2017-07-16 05:37:58 +00:00
Craig Topper	bb4069e439	[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. llvm-svn: 307451	2017-07-07 23:16:26 +00:00
Craig Topper	79ab643da8	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292	2017-07-06 18:39:47 +00:00
Craig Topper	73ba1c84be	[InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare. llvm-svn: 304876	2017-06-07 07:40:37 +00:00
Craig Topper	2f9c6dafe3	[InstCombine] Merge together the SimplifyDemandedUseBits implementations for ZExt and Trunc. NFC While there avoid resizing the DemandedMask twice. Make a copy into a separate variable instead. This potentially removes an allocation on large bit widths. With the use of the zextOrTrunc methods on APInt and KnownBits these can be made almost source identical. The only difference is the zero of the upper bits for ZExt. This is similar to how its done in computeKnownBits in ValueTracking. llvm-svn: 303791	2017-05-24 18:40:25 +00:00
Craig Topper	1c660dbea6	[InstCombine] Use less bitwise operations to handle Instruction::SExt in SimplifyDemandedUseBits. Other improvements. The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits. We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better. Differential Revision: https://reviews.llvm.org/D32098 llvm-svn: 303779	2017-05-24 17:33:30 +00:00
Craig Topper	7e0aeeb884	[KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or similar. NFC llvm-svn: 303614	2017-05-23 07:18:37 +00:00
Craig Topper	8df66c602a	[KnownBits] Add bit counting methods to KnownBits struct and use them where possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925	2017-05-12 17:20:30 +00:00
Craig Topper	f0aeee01c3	[KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262	2017-05-05 17:36:09 +00:00
Craig Topper	d938fd1397	[KnownBits] Add zext, sext, and trunc methods to KnownBits This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible. Differential Revision: https://reviews.llvm.org/D32784 llvm-svn: 302088	2017-05-03 22:07:25 +00:00
Craig Topper	ca48af3c87	[KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit. Reviewers: RKSimon, spatel, davide Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32651 llvm-svn: 301747	2017-04-29 16:43:11 +00:00
Craig Topper	24e71017aa	[APInt] Use inplace shift methods where possible. NFCI llvm-svn: 301612	2017-04-28 03:36:24 +00:00
Craig Topper	b45eabcf82	[ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit. Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch. I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases. Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero\|One) so we don't write it out everywhere. Maybe a method for (Zero\|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with. Differential Revision: https://reviews.llvm.org/D32376 llvm-svn: 301432	2017-04-26 16:39:58 +00:00
Craig Topper	f3dbd17d0a	[APInt] Use isSubsetOf, intersects, and bit counting methods to reduce temporary APInts This patch uses various APInt methods to reduce temporary APInt creation. This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking. I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time. Differential Revision: https://reviews.llvm.org/D32495 llvm-svn: 301338	2017-04-25 17:46:30 +00:00
Craig Topper	7603dce6b2	[InstCombine] Remove superfluous curly braces around a single line if body. NFC llvm-svn: 301326	2017-04-25 16:48:19 +00:00
Renato Golin	4abfb3d741	Revert "[APInt] Fix a few places that use APInt::getRawData to operate within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111	2017-04-23 12:15:30 +00:00
Craig Topper	5f68af0806	[APInt] Use operator<<= instead of shl where possible. NFC llvm-svn: 301103	2017-04-23 05:18:31 +00:00
Sanjay Patel	8ce1d4cbe1	[InstCombine] revert r300977 and r301021 This can cause an inf-loop. Investigating... llvm-svn: 301035	2017-04-21 20:29:17 +00:00
Sanjay Patel	0f001a4701	[InstCombine] use isSubsetOf() for efficiency C \| ~D == -1 ~(C \| ~D) == 0 ~C & D == 0 D & ~C == 0 D.isSubsetOf(C) llvm-svn: 301021	2017-04-21 19:16:52 +00:00
Sanjay Patel	347b54b093	[InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706) This matches the demanded bits behavior in the DAG and should fix: https://bugs.llvm.org/show_bug.cgi?id=32706 Differential Revision: https://reviews.llvm.org/D32255 llvm-svn: 300977	2017-04-21 14:03:54 +00:00
Craig Topper	358cd9ae3a	[InstCombine] Remove the zextOrTrunc from ShrinkDemandedConstant. The demanded mask and the constant should always be the same width for all callers today. Also stop copying the demanded mask as its passed in. We should avoid allocating memory unless we are going to do something. The final AND to create the new constant will take care of it. llvm-svn: 300927	2017-04-20 23:58:27 +00:00
Sanjay Patel	cc663b82fa	[InstCombine] function names start with lower-case letter; NFC Forgot to make this fix with the signature change in r300911. llvm-svn: 300912	2017-04-20 22:37:01 +00:00
Sanjay Patel	c9485ca895	[InstCombine] allow shl+shr demanded bits folds with splat constants llvm-svn: 300911	2017-04-20 22:33:54 +00:00
Sanjay Patel	3e1ae72fcf	[InstCombine] allow shl demanded bits folds with splat constants More fixes are needed to enable the helper SimplifyShrShlDemandedBits(). llvm-svn: 300898	2017-04-20 21:33:02 +00:00
Craig Topper	ff23889609	[InstCombine] Use APInt::intersects and APInt::isSubsetOf to improve a few more places in SimplifyDemandedBits. llvm-svn: 300896	2017-04-20 21:24:37 +00:00
Sanjay Patel	fb5b3e773a	[InstCombine] allow ashr/lshr demanded bits folds with splat constants llvm-svn: 300888	2017-04-20 20:59:02 +00:00
Craig Topper	17f37ba3b9	[InstCombine] Use APInt::isSubsetOf to simplify some code in SimplifyDemandedBits. NFC This allows us to use less temporary APInt for And and Invert operations. llvm-svn: 300885	2017-04-20 20:47:35 +00:00
Craig Topper	0ec3f2f39a	[InstCombine] Remove redundant code from SimplifyDemandedBits handling for Or. The code above it is equivalent if you work through the bitwise math. llvm-svn: 300876	2017-04-20 19:31:22 +00:00
Craig Topper	bcfd2d1789	[APInt] Rename getSignBit to getSignMask getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask. Differential Revision: https://reviews.llvm.org/D32108 llvm-svn: 300856	2017-04-20 16:56:25 +00:00
Craig Topper	a8129a1122	[APInt] Add isSubsetOf method that can check if one APInt is a subset of another without creating temporary APInts This question comes up in many places in SimplifyDemandedBits. This makes it easy to ask without allocating additional temporary APInts. The BitVector class provides a similar functionality through its (IMHO badly named) test(const BitVector&) method. Though its output polarity is reversed. I've provided one example use case in this patch. I plan to do more as a follow up. Differential Revision: https://reviews.llvm.org/D32258 llvm-svn: 300851	2017-04-20 16:17:13 +00:00
Craig Topper	83dc1c60aa	In SimplifyDemandedUseBits, use computeKnownBits directly to handle Constants Currently we don't explicitly process ConstantDataSequential, ConstantAggregateZero, or ConstantVector, or Undef before applying the Depth limit. Instead they occur after the depth check in the non-instruction path. For the constant types that we do handle, the code is replicated from computeKnownBits. This patch fixes the missing constant handling and the reduces the amount of code by just using computeKnownBits directly for any type of Constant. Differential Revision: https://reviews.llvm.org/D32123 llvm-svn: 300849	2017-04-20 16:14:58 +00:00
Craig Topper	fc947bcfba	[APInt] Use lshrInPlace to replace lshr where possible This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566	2017-04-18 17:14:21 +00:00
Craig Topper	d23004c37b	Introduce APInt::isSignBitSet/isSignBitClear. Use in place isSignBitSet in place of isNegative in known bits tracking. This makes statements like KnownZero.isNegative() (which means the value we're tracking is positive) less confusing. llvm-svn: 300457	2017-04-17 16:38:20 +00:00
Matt Arsenault	7205f3c2e4	AMDGPU: SimplifyDemandedElts for image intrinsics Causes some VGPR usage improvements in shaderdb, but introduces some SGPR spilling regressions due to random scheduling changes later. llvm-svn: 300453	2017-04-17 15:12:44 +00:00
Craig Topper	da886c665b	[InstCombine][ValueTracking] When computing known bits for Srem make sure we don't compute known bits for the LHS twice. If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all. llvm-svn: 300432	2017-04-16 21:46:12 +00:00
Craig Topper	0d304f01b4	[InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of constants with DemandedMask. Just because we didn't demand them doesn't mean they aren't known. llvm-svn: 300430	2017-04-16 20:55:58 +00:00
Craig Topper	9a458cd517	[InstCombine] MakeAnd/Or/Xor handling to reuse previous APInt computations When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code. This patch moves them to named temporaries so we can reuse them. Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplifications checks and I didn't want to change that with this patch. Differential Revision: https://reviews.llvm.org/D32094 llvm-svn: 300376	2017-04-14 22:34:14 +00:00
Craig Topper	c9a4fc0750	[InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC llvm-svn: 300305	2017-04-14 05:09:04 +00:00
Craig Topper	c75f94bfa5	[InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits. This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch. llvm-svn: 300094	2017-04-12 19:32:47 +00:00
Craig Topper	cf3641fd57	[InstCombine] Remove unreachable code for turning an And where all demanded bits on both sides are known to be zero into a constant 0. We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones. llvm-svn: 300093	2017-04-12 19:08:03 +00:00
Craig Topper	f35a7f7b49	[InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of cascaded ifs on opcode. NFC llvm-svn: 300085	2017-04-12 18:25:25 +00:00
Craig Topper	9a51c7f343	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiples uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084	2017-04-12 18:17:46 +00:00
Craig Topper	b0076fe8b4	[InstCombine] Move portion of SimplifyDemandedUseBits that deals with instructions with multiple uses out to a separate method. NFCI llvm-svn: 300082	2017-04-12 18:05:21 +00:00
Craig Topper	845033a6c9	Teach SimplifyDemandedUseBits that adding or subtractings 0s from every bit below the highest demanded bit can be simplified If we are adding/subtractings 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove add and a subsequent iteration to be ran. With this change we get bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075	2017-04-12 16:49:59 +00:00
Craig Topper	e06b6bcfa1	[InstCombine] Use setAllBits in place of getAllOnesValue since we know the bitwidths are the same. NFCI llvm-svn: 299413	2017-04-04 05:03:02 +00:00
Craig Topper	d33ee1b960	[APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 llvm-svn: 299362	2017-04-03 16:34:59 +00:00
Craig Topper	885fa12e8a	[APInt] Remove shift functions from APIntOps namespace. Replace the few users with the APInt class methods. NFCI llvm-svn: 299248	2017-03-31 20:01:16 +00:00
Craig Topper	47596dd4cc	[InstCombine] Change the interface of SimplifyDemandedBits so that it takes the instruction and operand instead of the Use. The first thing it did was get the User for the Use to get the instruction back. This requires looking through the Uses for the User using the waymarking walk. That's pretty fast, but its probably still better to just pass the Instruction we already had. llvm-svn: 298772	2017-03-25 06:52:52 +00:00
Craig Topper	8fbb74b5b2	Revert r298711 "[InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits" Tsan bot is failing. llvm-svn: 298745	2017-03-24 22:12:10 +00:00
Craig Topper	d4521c2fc2	[InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits SimplifyDemandedUseBits for Add/Sub already recursed down LHS and RHS for simplifying bits. If that didn't provide any simplifications we fall back to calling computeKnownBits which will recurse again. Instead just take the known bits for LHS and RHS we already have and call into a new function in ValueTracking that can calculate the known bits given the LHS/RHS bits. llvm-svn: 298711	2017-03-24 16:56:51 +00:00
Craig Topper	07f2915ad8	[InstCombine] Teach SimplifyDemandedUseBits to shrink Constants on the left side of subtracts Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the right hand side. Reviewers: davide, majnemer, spatel, sanjoy, hfinkel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31119 llvm-svn: 298478	2017-03-22 04:03:53 +00:00
Craig Topper	ff9749f759	[InstCombine] Remove duplicate code in SimplifyDemandedUseBits for URem. NFC llvm-svn: 298231	2017-03-19 21:45:57 +00:00
Craig Topper	3a86a04404	[InstCombine] Use setHighBits/setLowBits/setBitsFrom in place of getLowBitsSet/getHighBitsSet. llvm-svn: 298204	2017-03-19 05:49:16 +00:00
Matt Arsenault	a3bdd8f27b	AMDGPU: Fix insertion point when reducing load intrinsics The insertion point may be later than the next instruction, so it is necessary to set it when replacing the call. llvm-svn: 297439	2017-03-10 05:25:49 +00:00
Matt Arsenault	efe949cc67	AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsics llvm-svn: 297408	2017-03-09 20:34:27 +00:00
Simon Pilgrim	e938a152c5	Use APInt::getLowBitsSet instead of APInt::getBitsSet for lower bit mask creation llvm-svn: 296882	2017-03-03 16:56:33 +00:00
Craig Topper	3731f4d173	[AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit. llvm-svn: 295294	2017-02-16 07:35:23 +00:00
Sanjay Patel	ae3b43e488	[InstCombine] use m_APInt to allow demanded bits analysis on splat constants llvm-svn: 294628	2017-02-09 21:43:06 +00:00
Simon Pilgrim	78f8630ac0	[InstCombine][X86] MULDQ/MULUDQ undef -> zero Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst llvm-svn: 292913	2017-01-24 11:07:41 +00:00
Simon Pilgrim	a50a93fcd0	[InstCombine][X86] Add MULDQ/MULUDQ undef handling llvm-svn: 292627	2017-01-20 18:20:30 +00:00
Simon Pilgrim	51b3b98e3a	[InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions Simplify a packss/packus truncation based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28777 llvm-svn: 292591	2017-01-20 09:28:21 +00:00
Simon Pilgrim	fe2c0ed4cf	[InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292371	2017-01-18 14:47:49 +00:00
Simon Pilgrim	d4eb800b03	[InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292209	2017-01-17 11:35:03 +00:00
Simon Pilgrim	73a68c25a0	[InstCombine][SSE] Add DemandedElts support for PSHUFB instructions Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28745 llvm-svn: 292101	2017-01-16 11:30:41 +00:00
Craig Topper	62f06e241b	[InstCombine] Fix typo in comment. NFC llvm-svn: 290706	2016-12-29 05:38:31 +00:00
Craig Topper	2e18bcfc60	[InstCombine] Use a 32-bits instead of 64-bits for storing the number of elements in VectorType for a ShuffleVector. While there getVectorNumElements to avoid an explicit cast. NFC llvm-svn: 290705	2016-12-29 04:24:32 +00:00
Craig Topper	1a8a3377cc	[InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner. We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since its now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded. llvm-svn: 290704	2016-12-29 03:30:17 +00:00
Craig Topper	72f2d4e8d6	[InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use. This builds on r290554 which added supported for 128 and 256-bit. llvm-svn: 290582	2016-12-27 05:30:09 +00:00
Simon Pilgrim	c9cf7fc7a4	[InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use. Differential Revision: https://reviews.llvm.org/D28119 llvm-svn: 290554	2016-12-26 23:28:17 +00:00
Craig Topper	e32b5fd7f9	[InstCombine] Simplify code slightly. NFC llvm-svn: 290046	2016-12-17 18:10:04 +00:00
Craig Topper	ab5f355d8c	[AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts. llvm-svn: 289759	2016-12-15 03:49:45 +00:00
Craig Topper	268b3abe6d	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better. Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking. llvm-svn: 289636	2016-12-14 06:06:58 +00:00
Craig Topper	dfd268d76b	[X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts. Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly. llvm-svn: 289632	2016-12-14 05:43:05 +00:00
Craig Topper	eb6a20e79e	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0. Also calculate UndefElts correctly. Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support. llvm-svn: 289629	2016-12-14 03:17:30 +00:00
Craig Topper	a0372dec26	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Also calculate UndefElts correctly. Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support. llvm-svn: 289628	2016-12-14 03:17:27 +00:00
Craig Topper	ac75bca1eb	[X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly. Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed. Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support. llvm-svn: 289523	2016-12-13 07:45:45 +00:00
Craig Topper	7fc6d34ed1	[InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. So we should return a zero vector instead. llvm-svn: 289411	2016-12-11 22:32:38 +00:00
Craig Topper	23ebd9564f	[X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts. This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them. llvm-svn: 289377	2016-12-11 08:54:52 +00:00

1 2 3 4 5

236 Commits