llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	1de1070559	[DAGCombine] Fix alias analysis for unaligned accesses The alias analysis in DAG Combine looks at the BaseAlign, the Offset and the Size of two accesses, and determines if they are known to access different parts of memory by the fact that they are different offsets from inside that "alignment window". It does not seem to account for accesses that are not a multiple of the size, and may overflow from one alignment window into another. For example in the test case we have a 19byte memset that is splits into a 16 byte neon store and an unaligned 4 byte store with a 15 byte offset. This 15byte offset (with a base align of 8) wraps around to the next alignment windows. When compared to an access that is a 16byte offset (of the same 4byte size and 8byte basealign), the two accesses are said not to alias. I've fixed this here by just ensuring that the offsets are a multiple of the size, ensuring that they don't overlap by wrapping. Fixes PR45035, which was exposed by the UseAA changes in the arm backend. Differential Revision: https://reviews.llvm.org/D75238	2020-02-28 18:44:36 +00:00
Sanjay Patel	b3d0c79836	[DAGCombiner] avoid narrowing fake fneg vector op This may inhibit vector narrowing in general, but there's already an inconsistency in the way that we deal with this pattern as shown by the test diff. We may want to add a dedicated function for narrowing fneg. It's often folded into some other op, so moving it away from other math ops may cause regressions that we would not see for normal binops. See D73978 for more details.	2020-02-26 11:25:56 -05:00
Simon Pilgrim	bbb0933e3d	[DAG] visitRotate - modulo non-uniform constant rotation amounts	2020-02-26 15:43:12 +00:00
Roman Lebedev	d20907d1de	[Codegen] Revert rL354676/rL354677 and followups - introduced PR43446 miscompile This reverts https://reviews.llvm.org/D58468 (rL354676, `44037d7a63`), and all and any follow-ups to that code block. https://bugs.llvm.org/show_bug.cgi?id=43446	2020-02-25 20:30:12 +03:00
Francesco Petrogalli	31ec721516	[llvm][CodeGen] DAG Combiner folds for vscale. Summary: This patch simplifies the DAGs generated when using the intrinsic `@llvm.vscale.` as follows: Fold (add (vscale * C0), (vscale * C1)) to (vscale * (C0 + C1)). * Canonicalize (sub X, (vscale * C)) to (add X, (vscale * -C)). * Fold (mul (vscale * C0), C1) to (vscale * (C0 * C1)). * Fold (shl (vscale * C0), C1) to (vscale * (C0 << C1)). The test `sve-gep-ll` have been updated to reflect the folding introduced by this patch. Reviewers: efriedma, sdesmalen, andwar, rengolin Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74782	2020-02-21 18:03:12 +00:00
Simon Pilgrim	86c52af05a	[TargetLowering] SimplifyDemandedBits - use getValidShiftAmountConstant helper. Use the SelectionDAG::getValidShiftAmountConstant helper to get const/constsplat shift amounts, which allows us to drop the out of range shift amount early-out. First step towards better non-uniform shift amount support in SimplifyDemandedBits.	2020-02-21 14:23:53 +00:00
Simon Pilgrim	f9c326364e	[DAGCombiner] Use SDValue::getConstantOperandAPInt helper where possible. NFC.	2020-02-20 18:23:05 +00:00
Simon Pilgrim	fc2b4a02b1	[DAGCombine] visitEXTRACT_VECTOR_ELT - add SimplifyDemandedBits multi use support Similar to what we already do with SimplifyDemandedVectorElts, call SimplifyDemandedBits across all the extracted elements of the source vector, treating it as single use. There's a minor regression in store-weird-sizes.ll which will be addressed in an upcoming SimplifyDemandedBits patch.	2020-02-20 15:49:38 +00:00
Sander de Smalen	a7a96c726e	[AArch64] Implement passing SVE vectors by ref for AAPCS. Summary: This patch implements the part of the calling convention where SVE Vectors are passed by reference. This means the caller must allocate stack space for these objects and pass the address to the callee. Reviewers: efriedma, rovka, cameron.mcinally, c-rhodes, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71216	2020-02-17 15:20:28 +00:00
Sjoerd Meijer	dad5f00e3b	[DAGCombine] Combine pattern for REV16 This adds another pattern to the combiner for a case that we were not handling to generate the REV16 instruction for ARM/Thumb2 and a bswap+ror on X86. Differential Revision: https://reviews.llvm.org/D74032	2020-02-17 14:54:17 +00:00
Simon Pilgrim	32176133fa	Move FIXME to start of comment so visual studio actually tags it. NFC.	2020-02-13 14:28:50 +00:00
Simon Pilgrim	9eb426c88c	[TargetLowering] Add NegatibleCost enum for isNegatibleForFree return codes The isNegatibleForFree/getNegatedExpression methods currently rely on a raw char value to indicate whether a negation is beneficial or not. This patch replaces the char return value with an NegatibleCost enum to more clearly demonstrate what is implied. It also renames isNegatibleForFree to getNegatibleCost to more accurately reflect whats going on. Differential Revision: https://reviews.llvm.org/D74221	2020-02-12 11:51:42 +00:00
Sebastian Neubauer	7cddd15e56	[SelectionDAG] Optimize build_vector of truncates and shifts Add a simplification to fuse a manual vector extract with shifts and truncate into a bitcast. Unpacking and packing values into vectors is only optimized with extractelement instructions, not when manually unpacked using shifts and truncates. This patch simplifies shifts and truncates into a bitcast if possible. Simplify (build_vec (trunc $1) (trunc (srl $1 width)) (trunc (srl $1 (2 * width))) ...) to (bitcast $1) Differential Revision: https://reviews.llvm.org/D73892	2020-02-10 15:04:07 +01:00
Simon Pilgrim	4592bb7195	visitINSERT_VECTOR_ELT - pull out repeated dyn_cast. NFCI. This always gets called at least once.	2020-02-05 13:30:54 +00:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Simon Pilgrim	57b0d33224	[DAGCombiner] ISD::AND/OR/XOR - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:53 +00:00
Simon Pilgrim	a967aa2706	[DAGCombiner] ISD::SDIV/UDIV/SREM/UREM - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:52 +00:00
Simon Pilgrim	f7245ef897	[DAGCombiner] ISD::SHL/SRA/SRL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 18:49:42 +00:00
Simon Pilgrim	25b8e96388	[DAGCombiner] ISD::MUL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 17:26:22 +00:00
Simon Pilgrim	4b04e11735	[DAGCombiner] Sub/SUBSAT - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 16:57:13 +00:00
Simon Pilgrim	48bd6a0986	[DAGCombiner] visitIMINMAX - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us instead of the ConstantSDNode variant where we have to do it ourselves.	2020-01-29 16:57:13 +00:00
@justice_adams (Justice Adams)	daee63f974	[SelectionDag] Updated FoldConstantArithmetic method signature in preparation for merge with FoldConstantVectorArithmetic Updated FoldConstantArithmetic method signature to match that of FoldConstantVectorArithmetic in preparation for merging the two functions together https://bugs.llvm.org/show_bug.cgi?id=36544 This is the first step in combining the various FoldConstantVectorArithmetic and FoldConstantVectorArithmetic functions into one FoldConstantArithmetic function. Differential Revision: https://reviews.llvm.org/D72870	2020-01-24 18:00:58 -05:00
Craig Topper	d3bf06bc81	[DAGCombiner] Add combine for (not (strict_fsetcc)) to create a strict_fsetcc with the opposite condition. Unlike the existing code that I modified here, I only handle the case where the strict_fsetcc has a single use. Not sure exactly how to handle multiples uses. Testing this on X86 is hard because we already have a other combines that get rid of lowered version of the integer setcc that this xor will eventually become. So this combine really just saves a bunch of extra nodes being created. Not sure about other targets. Differential Revision: https://reviews.llvm.org/D71816	2020-01-24 14:15:36 -08:00
Stanislav Mekhanoshin	7a94d4f4ee	Allow combining of extract_subvector to extract element Differential Revision: https://reviews.llvm.org/D73132	2020-01-24 10:50:26 -08:00
Simon Pilgrim	0b45c2264a	[SelectionDAG] rot(x, y) --> x iff ComputeNumSignBits(x) == BitWidth(x) Rotating an 0/-1 value by any amount will always result in the same 0/-1 value	2020-01-24 10:35:57 +00:00
Stanislav Mekhanoshin	2d0fcf786c	Precommit NFC part of DAGCombiner change. NFC. This is NFC part of DAGCombiner::visitEXTRACT_SUBVECTOR() change in the D73132.	2020-01-22 09:01:22 -08:00
Sander de Smalen	4cf16efe49	[AArch64][SVE] Add patterns for unpredicated load/store to frame-indices. This patch also fixes up a number of cases in DAGCombine and SelectionDAGBuilder where the size of a scalable vector is used in a fixed-width context (thus triggering an assertion failure). Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D71215	2020-01-22 14:32:27 +00:00
Simon Pilgrim	5f5f478564	[DAG] Fold extract_vector_elt (scalar_to_vector), K to undef (K != 0) This was unconditionally folding this to the source operand, even if the access was out of bounds. Use undef instead of the extract is not the first element. This helps with some cases where 3-vectors are legalized and avoids processing the 4th component. Original Patch by: arsenm (Matt Arsenault) Differential Revision: https://reviews.llvm.org/D51589	2020-01-21 10:58:30 +00:00
Michael Liao	6d0d86a64d	[DAG] Add helper for creating constant vector index with correct type. NFC.	2020-01-18 01:23:36 -05:00
Michael Liao	8d07f8d98c	[DAGCombine] Replace `getIntPtrConstant()` with `getVectorIdxTy()`. - Prefer `getVectorIdxTy()` as the index operand type for `EXTRACT_SUBVECTOR` as targets expect different types by overloading `getVectorIdxTy()`.	2020-01-14 17:03:05 -05:00
Sanjay Patel	cb5612e2df	[DAGCombiner] reduce extract subvector of concat If we are extracting a chunk of a vector that's a fraction of an operand of the concatenated vector operand, we can extract directly from one of those original operands. This is another suggestion from PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024#c2 But I'm not sure yet if it will make any difference on those patterns. It seems to help a few existing AVX512 tests though. Differential Revision: https://reviews.llvm.org/D72361	2020-01-09 09:38:12 -05:00
QingShan Zhang	d48ac7d54d	[DAGCombine] Fold the (fma -x, y, -z) to -(fma x, y, z) This is a positive combination as long as the NEG is NOT free, as we are reducing the number of NEG from two to one. Differential Revision: https://reviews.llvm.org/D72312	2020-01-09 04:33:46 +00:00
Sanjay Patel	780ba1f22b	[DAGCombiner] clean up extract-of-concat fold; NFC This hopes to improve readability and adds an assert. The functional change noted by the TODO comment is proposed in: D72361	2020-01-08 10:15:33 -05:00
Sanjay Patel	58e2e92a57	[DAGCombiner] reduce shuffle of concat of same vector This is possibly a small part towards solving PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 The vectorizer is creating shuffles of concat like this: %63 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %64 = shufflevector <8 x i64> %63, <8 x i64> undef, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> That might be fixable in the vectorizers, but we're not allowed to fold that into a single shuffle in instcombine, so we should have a backend backstop to convert that into the likely simpler form: %64 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3> Differential Revision: https://reviews.llvm.org/D72300	2020-01-07 09:48:59 -05:00
Simon Pilgrim	6fa6000e3e	[DAG] DAGCombiner::XformToShuffleWithZero - use APInt::extractBits helper. NFCI.	2020-01-06 13:17:02 +00:00
QingShan Zhang	b9780f4f80	[DAGCombine] Don't check the legality of type when combine the SIGN_EXTEND_INREG This is the DAG node for SIGN_EXTEND_INREG : t21: v4i32 = sign_extend_inreg t18, ValueType:ch:v4i16 It has two operands. The first one is the value it want to extend, and the second one is the type to specify how to extend the value. For this example, it means that, it is signed extend the t18(v4i32) from v4i16 to v4i32. That is the semantics of c code: vector int foo(vector int m) { return m << 16 >> 16; } And it could be any vector type that hardware support the operation, though the type 'v4i16' is NOT legal for the target. When we are trying to combine the srl + sra, what we did now is calling the TLI.isOperationLegal(), which will also check the legality of the type. That doesn't make sense. Differential Revision: https://reviews.llvm.org/D70230	2020-01-06 03:00:58 +00:00
Sanjay Patel	ca7fdd41bd	[DAGCombiner] fix miscompile in translating (X & undef) to shuffle See PR42982 for more context: https://bugs.llvm.org/show_bug.cgi?id=42982	2020-01-03 14:58:49 -05:00
Roman Lebedev	0727e2b90c	[DAGCombiner][X86][AArch64] Generalize `A-(A&B)`->`A&(~B)` fold (PR44448) The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should/can just be 'A - (A & B)' -> 'A & (~B)' Even if we don't manage to fold `~` into B, we have likely formed `ANDN` node. Also, this way there's less similar-but-duplicate folds. Name: X - (X & Y) -> X & (~Y) %o = and i32 %X, %Y %r = sub i32 %X, %o => %n = xor i32 %Y, -1 %r = and i32 %X, %n https://rise4fun.com/Alive/kOUl See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:47 +03:00
Roman Lebedev	86403c0ff8	[DAGCombiner] `~(add X, -1)` -> `neg X` fold The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should just be 'A - (A & B)' -> 'A & (~B)', but we currently fail to sink that '~' into `(B - 1)`. Name: ~(X - 1) -> (0 - X) %o = add i32 %X, -1 %r = xor i32 %o, -1 => %r = sub i32 0, %X https://rise4fun.com/Alive/rjU	2020-01-03 17:55:46 +03:00
Roman Lebedev	3d492d7503	[DAGCombine][X86][Thumb2/LowOverheadLoops] `A - (A & C)` -> `A & (~C)` fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. Name: PR44448 ptr - (ptr & C) -> ptr & (~C) %bias = and i32 %ptr, C %r = sub i32 %ptr, %bias => %r = and i32 %ptr, ~C See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:45 +03:00
Roman Lebedev	1711be78f7	[NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' fold	2020-01-03 17:55:42 +03:00
Roman Lebedev	8dab0a4a7d	[DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. https://rise4fun.com/Alive/ZVdp Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment) %mask = add i64 %alignment, -1 %bias = and i64 %ptr, %mask %r = sub i64 %ptr, %bias => %highbitmask = sub i64 0, %alignment %r = and i64 %ptr, %highbitmask See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 13:58:36 +03:00
Matt Arsenault	4d7201e7b9	DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nsz This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.	2019-12-31 22:49:51 -05:00
Sanjay Patel	8cefc37be5	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815	2019-12-23 10:11:45 -05:00
Carl Ritson	2791667d2e	[DAGCombiner] Check term use before applying aggressive FSUB optimisations Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656	2019-12-23 09:37:58 +09:00
Amaury Séchet	ff6567cc77	[DAGCombiner] Add node back in the worklist in topological order in CommitTargetLoweringOpt Summary: Right now, DAGCombiner process the nodes in an iplementation defined order. This tends to be fragile as optimisation may or may not kick in depending on the traversal order. This is part of a larger effort to get the DAGCombiner to process its node in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70921	2019-12-17 18:26:16 +01:00
Alex Richardson	11448eeb72	[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD) Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert the files in the individual backends too. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71207	2019-12-13 21:40:03 +00:00
Sanjay Patel	2f0c7fd2db	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc (2nd try) The initial attempt (rG89633320) botched the logic by reversing the source/dest types. Added x86 tests for additional coverage. The vector tests show a potential improvement (fold vector load instead of broadcasting), but that's a known/existing problem. This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-13 14:03:54 -05:00
Alex Richardson	be15dfa88f	[NFC] Use EVT instead of bool for getSetCCInverse() Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void )0x12033091e < (void )0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917	2019-12-13 12:22:03 +00:00
Sanjay Patel	9432937190	Revert "[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc" This reverts commit `8963332c33`. There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.	2019-12-12 16:24:40 -05:00

1 2 3 4 5 ...

2767 Commits