llvm-project

Commit Graph

Author	SHA1	Message	Date
Dylan Fleming	4be7fb9762	[SVE] Add folds for truncation of vscale Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D107453	2021-08-13 10:18:00 +01:00
Sander de Smalen	fe6ae81ef3	[InstCombine] Fix vscale zext/sext optimization when vscale_range is unbounded. According to the LangRef, a (vscale_range) value of 0 means unbounded. This patch additionally cleans up the test file vscale_sext_and_zext.ll.	2021-08-04 17:17:37 +01:00
Dylan Fleming	a7a39ec886	[SVE] Add folds for sign and zero extends of vscale Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D105994	2021-07-30 16:02:50 +01:00
Datta Nagraj	ad0085d338	[InstCombine] Eliminate casts to optimize ctlz operation If a ctlz operation is performed on higher datatype and then downcasted, then this can be optimized by doing a ctlz operation on a lower datatype and adding the difference bitsize to the result of ctlz to provide the same output: https://alive2.llvm.org/ce/z/8uup9M The original problem is shown in https://llvm.org/PR50173 Differential Revision: https://reviews.llvm.org/D103788	2021-06-23 11:19:12 -04:00
Nikita Popov	e790d3667e	[OpaquePtr] Handle addrspacecasts in InstCombine This adds support for addrspace casts involving opaque pointers to InstCombine, as well as the isEliminableCastPair() helper (otherwise the assertion failure would just move there). Add PointerType::hasSameElementTypeAs() to hide the element type details. Differential Revision: https://reviews.llvm.org/D104668	2021-06-22 17:45:30 +02:00
Nikita Popov	39796e1ad0	Reapply [InstCombine] Don't try converting opaque pointer bitcast to GEP Reapplied without changes -- this was reverted together with an underlying patch. ----- Bitcasts having opaque pointer source or result type cannot be converted into a zero-index GEP, GEP source and result types always have the same opaque-ness.	2021-06-21 22:15:56 +02:00
Nikita Popov	e2c2124a4b	Reapply [InstCombine] Extract bitcast -> gep transform Relative to the original patch, an InstCombine test has been added to show a previously missed pattern, and the Coroutine test that resulted in the revert has been regenerated. ----- Move this into a separate function, to make sure that early returns do not accidentally skip other transforms. This previously happened for the isSized() check, which skipped folds like distributing a bitcast over a select.	2021-06-21 22:03:15 +02:00
Nikita Popov	6922ab73a5	Revert "[InstCombine] Extract bitcast -> gep transform" This reverts commit `d9f5d7b959`. This reverts commit `5780611d7e`. This causes a failure in Coroutine tests.	2021-06-21 21:34:17 +02:00
Nikita Popov	5780611d7e	[InstCombine] Don't try converting opaque pointer bitcast to GEP Bitcasts having opaque pointer source or result type cannot be converted into a zero-index GEP, GEP source and result types always have the same opaque-ness.	2021-06-21 21:24:50 +02:00
Nikita Popov	d9f5d7b959	[InstCombine] Extract bitcast -> gep transform Move this into a separate function, to make sure that early returns do not accidentally skip other transforms. There is already one isSized() check that could run into this issue, thus this change is not strictly NFC.	2021-06-21 21:24:50 +02:00
Nikita Popov	a969bdc56f	[InstCombine] Remove unnecessary addres space check (NFC) It's not possible to bitcast between different address spaces, and this is ensured by the IR verifier. As such, this bitcast to addrspacecast canonicalization can never be hit.	2021-06-21 20:11:39 +02:00
Juneyoung Lee	ce192ced2b	[InstCombine] Use poison constant to represent the result of unreachable instrs This patch updates InstCombine to use poison constant to represent the resulting value of (either semantically or syntactically) unreachable instrs, or a don't-care value of an unreachable store instruction. This allows more aggressive folding of unused results, as shown in llvm/test/Transforms/InstCombine/getelementptr.ll . Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104602	2021-06-21 09:58:44 +09:00
Guozhi Wei	575ba6f425	[InstCombine] Don't transform code if DoTransform is false In patch https://reviews.llvm.org/D72396, it doesn't check DoTransform before transforming the code, and generates wrong result for the attached test case. Differential Revision: https://reviews.llvm.org/D104567	2021-06-18 18:01:34 -07:00
Sanjay Patel	23a116c8c4	[InstCombine] convert lshr to ashr to eliminate cast op This is similar to `b865eead76` ( D103617 ) and fixes: https://llvm.org/PR50575 `41b71f718b` did this and more (noted with TODO comments in the tests), but it didn't handle the case where the destination is narrower than the source, so it got reverted. This is a simple match-and-replace. If there's evidence that the TODO cases are useful, we can revisit/extend.	2021-06-04 07:04:37 -04:00
Sanjay Patel	b865eead76	[InstCombine] eliminate sext and/or trunc if value has enough signbits If we have enough signbits in a source value, we can skip an intermediate cast for a trunc+sext pair: https://alive2.llvm.org/ce/z/A_mQt- This is the original problem shown in: https://llvm.org/PR49543 There's a test that shows we transformed what used to be a pair of shifts, so that suggests we could add another ComputeNumSignBits fold starting from a shift. There does not appear to be any change in compile-time from the extra analysis: https://llvm-compile-time-tracker.com/compare.php?from=3d2c9069dcafd0cbb641841aa3dd6e851fb7d760&to=b9513cdf2419704c7bb0c3a02a9ca06aae13d902&stat=instructions Differential Revision: https://reviews.llvm.org/D103617	2021-06-03 13:58:19 -04:00
Juneyoung Lee	7161bb87c9	[InsCombine] Fix a few remaining vec transforms to use poison instead of undef This is a patch that replaces shufflevector and insertelement's placeholder value with poison. Underlying motivation is to fix the semantics of shufflevector with undef mask to return poison instead (D93818) The consensus has been made in the late 2020 via mailing list as well as the thread in https://bugs.llvm.org/show_bug.cgi?id=44185 . This patch is a simple syntactic change to the existing code, hence directly pushed as a commit.	2021-05-31 18:47:09 +09:00
Sanjay Patel	c7da0c383a	[InstCombine] fold zext of masked bit set/clear This does not solve PR17101, but it is one of the underlying diffs noted here: https://bugs.llvm.org/show_bug.cgi?id=17101#c8 We could ease the one-use checks for the 'clear' (no 'not' op) half of the transform, but I do not know if that asymmetry would make things better or worse. Proofs: https://rise4fun.com/Alive/uVB Name: masked bit set %sh1 = shl i32 1, %y %and = and i32 %sh1, %x %cmp = icmp ne i32 %and, 0 %r = zext i1 %cmp to i32 => %s = lshr i32 %x, %y %r = and i32 %s, 1 Name: masked bit clear %sh1 = shl i32 1, %y %and = and i32 %sh1, %x %cmp = icmp eq i32 %and, 0 %r = zext i1 %cmp to i32 => %xn = xor i32 %x, -1 %s = lshr i32 %xn, %y %r = and i32 %s, 1 Note: this is a re-post of a patch that I committed at: rGa041c4ec6f7a The commit was reverted because it exposed another bug: rGb212eb7159b40 But that has since been corrected with: rG8a156d1c2795189 ( D101191 ) Differential Revision: https://reviews.llvm.org/D72396	2021-05-29 08:52:26 -04:00
Sanjay Patel	52f2970036	[InstCombine] reduce code duplication; NFC	2021-05-29 08:33:25 -04:00
Sanjay Patel	0bab0f6161	[InstCombine] canonicalize cast before unary shuffle We could go either direction on this transform. VectorCombine already goes this way for bitcasts (and handles more complicated cases using the cost model), so let's try cast-first. Deferring completely to VectorCombine is another possibility. But the backend should be able to invert this easily when the vectors have the same shape, so it doesn't seem like a transform that we need to avoid. The motivating example from https://llvm.org/PR49081 has an int-to-float sandwiched between 2 shuffles, and the backend currently does not reduce that, so on x86, we get something like: pshufd $249, %xmm0, %xmm0] cvtdq2ps %xmm0, %xmm0 shufps $144, %xmm0, %xmm0 ...instead of just a single conversion instruction. Differential Revision: https://reviews.llvm.org/D103038	2021-05-25 08:43:09 -04:00
Sanjay Patel	6d949a9c8f	[InstCombine] restrict funnel shift match to avoid miscompile As noted in the post-commit discussion for: https://reviews.llvm.org/rGabd7529625a73f405e40a63dcc446c41d51a219e ...that change exposed a logic hole that allows a miscompile if the shift amount could exceed the narrow width: https://alive2.llvm.org/ce/z/-i_CiM https://alive2.llvm.org/ce/z/NaYz28 The restriction isn't necessary for a rotate (same operand for both shifts), so we should adjust the matching for the shift value as a follow-up enhancement: https://alive2.llvm.org/ce/z/ahuuQb	2021-05-18 13:32:07 -04:00
Sanjay Patel	abd7529625	[InstCombine] relax masking requirement for truncated funnel/rotate match I was investigating a seemingly unrelated improvement in demanded bits for shift-left, but that caused regressions on these tests because we were able to look through/eliminate the mask. https://alive2.llvm.org/ce/z/Ztdr22 define i8 @src(i32 %x, i32 %y, i32 %shift) { %and = and i32 %shift, 3 %conv = and i32 %x, 255 %shr = lshr i32 %conv, %and %sub = sub i32 8, %and %shl = shl i32 %y, %sub %or = or i32 %shr, %shl %conv2 = trunc i32 %or to i8 ret i8 %conv2 } define i8 @tgt(i32 %x, i32 %y, i32 %shift) { %x8 = trunc i32 %x to i8 %y8 = trunc i32 %y to i8 %shift8 = trunc i32 %shift to i8 %and = and i8 %shift8, 3 %conv2 = call i8 @llvm.fshr.i8(i8 %y8, i8 %x8, i8 %and) ret i8 %conv2 } declare i8 @llvm.fshr.i8(i8,i8,i8)	2021-04-28 16:49:50 -04:00
Roman Lebedev	5a654bfeab	Revert "[InstCombine] `sext(trunc(x)) --> sext(x)` iff trunc is NSW (PR49543)" I forgot about the case where we sign-extend to width smaller than the original. This reverts commit `1e6ca23ab8`.	2021-04-21 01:11:15 +03:00
Roman Lebedev	1e68d338c1	Revert "[InstCombine] "Bypass" NUW trunc of lshr if we are going to sext the result (PR49543)" I forgot about the case where we sign-extend to width smaller than the original. This reverts commit `41b71f718b`.	2021-04-21 01:11:14 +03:00
Roman Lebedev	41b71f718b	[InstCombine] "Bypass" NUW trunc of lshr if we are going to sext the result (PR49543) This is a more convoluted form of the same pattern "sext of NSW trunc", but in this case the operand of trunc was a right-shift, and the truncation chops off just the zero bits that were shifted-in.	2021-04-21 00:31:46 +03:00
Roman Lebedev	1e6ca23ab8	[InstCombine] `sext(trunc(x)) --> sext(x)` iff trunc is NSW (PR49543) If we can tell that trunc only chops off sign bits, and not all of them, then we can simply sign-extend the trunc's source.	2021-04-21 00:31:45 +03:00
Luo, Yuanke	bcdaccfe34	[X86][AMX] Verify illegal types or instructions for x86_amx. This patch is related to https://reviews.llvm.org/D100032 which define some illegal types or operations for x86_amx. There are no arguments, arrays, pointers, vectors or constants of x86_amx. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D100472	2021-04-20 16:14:22 +08:00
Juneyoung Lee	1c10201d96	Update InstCombine to use undef matcher instead This is a patch to use m_Undef() matcher instead of isa<UndefValue>(). As suggested in D100122, this update is separately committed.	2021-04-18 11:05:36 +09:00
Philip Reames	4bf8985f4f	Replace calls to IntrinsicInst::Create with CallInst::Create [nfc] There is no IntrinsicInst::Create. These are binding to the method in the super type. Be explicitly about which method is being called.	2021-04-06 13:23:58 -07:00
Nashe Mncube	5d929794a8	[llvm-opt] Bug fix within combining FP vectors A bug was found within InstCombineCasts where a function call is only implemented to work with FixedVectors. This caused a crash when a ScalableVector was passed to this function. This commit introduces a regression test which recreates the failure and a bug fix. Differential Revision: https://reviews.llvm.org/D98351	2021-03-23 12:13:41 +00:00
Roman Lebedev	d37fe26a2b	[NFC][IR] Type: add getWithNewType() method Sometimes you want to get a type with same vector element count as the current type, but different element type, but there's no QOL wrapper to do that. Add one.	2021-03-23 00:50:58 +03:00
Philip Reames	5698537f81	Update basic deref API to account for possiblity of free [NFC] This patch is plumbing to support work towards the goal outlined in the recent llvm-dev post "[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree". The point of this change is purely to simplify iteration on other pieces on way to making the switch. Rebuilding with a change to Value.h is slow and painful, so I want to get the API change landed. Once that's done, I plan to more closely audit each caller, add the inference rules in their own patch, then post a patch with the langref changes and test diffs. The value of the command line flag is that we can exercise the inference logic in standalone patches without needing the whole switch ready to go just yet. Differential Revision: https://reviews.llvm.org/D98908	2021-03-19 11:17:19 -07:00
Luo, Yuanke	66fbf5fafb	[X86][AMX] Prevent transforming load pointer from <256 x i32>* to x86_amx*. The load/store instruction will be transformed to amx intrinsics in the pass of AMX type lowering. Prohibiting the pointer cast make that pass happy. Differential Revision: https://reviews.llvm.org/D98247	2021-03-14 09:24:56 +08:00
Sanjay Patel	4224a36957	[InstCombine] avoid creating an extra instruction in zext fold and possible inf-loop The structure of this fold is suspect vs. most of instcombine because it creates instructions and tries to delete them immediately after. If we don't have the operand types for the icmps, then we are not behaving as assumed. And as shown in PR49475, we can inf-loop.	2021-03-13 08:30:51 -05:00
Florian Hahn	c701f85c45	[STLExtras] Use return type from operator* of the wrapped iter. Currently make_early_inc_range cannot be used with iterators with operator* implementations that do not return a reference. Most notably in the LLVM codebase, this means the User iterator ranges cannot be used with make_early_inc_range, which slightly simplifies iterating over ranges while elements are removed. Instead of directly using BaseT::reference as return type of operator, this patch uses decltype to get the actual return type of the operator implementation in WrappedIteratorT. This patch also updates a few places to use make use of make_early_inc_range. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D93992	2021-01-10 14:41:13 +00:00
Kazu Hirata	33bf1cad75	[llvm] Use *Set::contains (NFC)	2021-01-07 20:29:34 -08:00
Jun Ma	0138399903	[InstCombine] Remove scalable vector restriction in InstCombineCasts Differential Revision: https://reviews.llvm.org/D93389	2020-12-17 22:02:33 +08:00
Jun Ma	2ac58e21a1	[InstCombine] Remove scalable vector restriction when fold SelectInst Differential Revision: https://reviews.llvm.org/D93083	2020-12-15 20:36:57 +08:00
Roman Lebedev	94ead0190f	[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold, 2 If the shift amount was undef for some lane, the shift amount in opposite shift is irrelevant for that lane, and the new shift amount for that lane can be undef.	2020-12-01 16:54:00 +03:00
Roman Lebedev	52533b52b8	Revert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold" It seems i have missed checklines, temporairly reverting, will reland momentairly.. This reverts commit `aa1aa13509`.	2020-12-01 15:47:04 +03:00
Roman Lebedev	aa1aa13509	[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold If the shift amount was undef for some lane, the shift amount in opposite shift is irrelevant for that lane, and the new shift amount for that lane can be undef.	2020-12-01 15:13:08 +03:00
Roman Lebedev	8e29e20e0d	[InstCombine] Evaluate new shift amount for sext(ashr(shl(trunc()))) fold in wide type (PR48343) It is not correct to compute that new shift amount in it's narrow type and only then extend it into the wide type: ---------------------------------------- Optimization: PR48343 good Precondition: (width(%X) == width(%r)) %o0 = trunc %X %o1 = shl %o0, %Y %o2 = ashr %o1, %Y %r = sext %o2 => %n0 = sext %Y %n1 = sub width(%o0), %n0 %n2 = sub width(%X), %n1 %n3 = shl %X, %n2 %r = ashr %n3, %n2 Done: 2016 Optimization is correct! ---------------------------------------- Optimization: PR48343 bad Precondition: (width(%X) == width(%r)) %o0 = trunc %X %o1 = shl %o0, %Y %o2 = ashr %o1, %Y %r = sext %o2 => %n0 = sub width(%o0), %Y %n1 = sub width(%X), %n0 %n2 = sext %n1 %n3 = shl %X, %n2 %r = ashr %n3, %n2 Done: 1 ERROR: Domain of definedness of Target is smaller than Source's for i9 %r Example: %X i9 = 0x000 (0) %Y i4 = 0x3 (3) %o0 i4 = 0x0 (0) %o1 i4 = 0x0 (0) %o2 i4 = 0x0 (0) %n0 i4 = 0x1 (1) %n1 i4 = 0x8 (8, -8) %n2 i9 = 0x1F8 (504, -8) %n3 i9 = 0x000 (0) Source value: 0x000 (0) Target value: undef I.e. we should be computing it in the wide type from the beginning. Fixes https://bugs.llvm.org/show_bug.cgi?id=48343	2020-12-01 15:13:07 +03:00
Simon Pilgrim	310f62b4ff	[InstCombine] narrowFunnelShift - fold trunc/zext or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) (PR35155) As discussed on PR35155, this extends narrowFunnelShift (recently renamed from narrowRotate) to support basic funnel shift patterns. Unlike matchFunnelShift we don't include the computeKnownBits limitation as extracting the pattern from the zext/trunc layers should be a indicator of reasonable funnel shift codegen, in D89139 we demonstrated how to efficiently promote funnel shifts to wider types. Differential Revision: https://reviews.llvm.org/D89542	2020-10-24 12:42:43 +01:00
Simon Pilgrim	1cf347e48b	[InstCombine] narrowRotate - minor refactoring for funnel shift support. NFC. Prep work for PR35155 - renamed narrowRotate to narrowFunnelShift, rewrote some comments and adjusted code to collect separate shift values, although we bail if they don't match (still only rotations are only actually folded). I'm trying to match matchFunnelShift as much as possible in case we finally get to merge these one day.	2020-10-16 11:27:28 +01:00
Simon Pilgrim	89657b3a3b	[InstCombine] narrowRotate - canonicalize to OR(SHL,LSHR). NFCI. Match the canonicalization code that was added to matchFunnelShift at rG02295e6d1a15	2020-10-14 16:45:00 +01:00
Simon Pilgrim	9c3138bd6d	[InstCombine] visitTrunc - pass through undefs for trunc(shift(trunc/ext(x),c)) patterns Based on the recent patches D88475 and D88429 where we are losing undef values due to extension/comparisons. I've added a Constant::mergeUndefsWith method that merges the undef scalar/elements from another Constant into a specific Constant. Differential Revision: https://reviews.llvm.org/D88687	2020-10-13 14:35:18 +01:00
Simon Pilgrim	d9f064dc0b	[InstCombine] visitTrunc - trunc(shl(X, C)) --> shl(trunc(X),trunc(C)) vector support Annoyingly vectors aren't supported by shouldChangeType(), but we have precedents for always performing this on vector types (e.g. narrowBinOp). Differential Revision: https://reviews.llvm.org/D89067	2020-10-08 22:07:51 +01:00
Simon Pilgrim	0cf48a7065	[InstCombine] visitTrunc - trunc (shr (trunc A), C) --> trunc(shr A, C) Attempt to fold trunc (shr (trunc A), C) --> trunc(shr A, C) iff the shift amount if small enough that all zero/sign bits created by the shift are removed by the last trunc. Helps fix the regressions encountered in D88316. I've tweaked a couple of shift values as suggested by @lebedev.ri to ensure we have coverage of shift values close (above/below) to the max limit. Differential Revision: https://reviews.llvm.org/D88429	2020-09-29 18:27:42 +01:00
Simon Pilgrim	b610d73b3f	[InstCombine] visitTrunc - remove dead trunc(lshr (zext A), C) combine. NFCI. I added additional test coverage at rG7a55989dc4305 - but all are handled independently of this combine and http://lab.llvm.org:8080/coverage/coverage-reports/ indicates the code is never used. Differential revision: https://reviews.llvm.org/D88492	2020-09-29 17:15:16 +01:00
Simon Pilgrim	89a8a0c910	[InstCombine] Inherit exact flags on extended shifts in trunc (lshr (sext A), C) --> (ashr A, C) This was missed in D88475	2020-09-29 15:32:09 +01:00
Simon Pilgrim	14ff38e235	[InstCombine] visitTrunc - trunc (lshr (sext A), C) --> (ashr A, C) non-uniform support This came from @lebedev.ri's suggestion to use m_SpecificInt_ICMP for D88429 - since I was going to change the m_APInt to m_Constant for that patch I thought I would do it for the only other user of the APInt first. I've added a ConstantExpr::getUMin helper - its trivial to add UMAX/SMIN/SMAX but thought I'd wait until we have use cases. Differential Revision: https://reviews.llvm.org/D88475	2020-09-29 15:01:16 +01:00

1 2 3 4 5 ...

433 Commits