For a long time, the InstCombine pass handled target-specific
intrinsics, and having target-specific code in general passes had
long been noted as an area for improvement.
D81728 moves most target-specific code out of the InstCombine pass.
Applying the target-specific combines in a separate pass would
probably result in inferior optimizations compared to the current
fixed-point iteration; therefore, the InstCombine pass now calls the
newly introduced TargetTransformInfo functions when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).
This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target-specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between ARM and AArch64.
This allows moving about 3000 lines out of InstCombine and into the
targets.
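For illustration, the kind of fold that now runs through the new hooks
rather than through hardcoded logic in InstCombineCalls.cpp (a minimal
sketch, not a test from the patch; InstCombine has long rewritten
constant, in-range x86 shift intrinsics like this into generic IR
shifts):
```
declare <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32>, i32)

define <4 x i32> @example(<4 x i32> %v) {
  ; The shift amount is a known in-range constant, so the target
  ; intrinsic can be rewritten as a generic ashr.
  %r = call <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32> %v, i32 3)
  ret <4 x i32> %r
}
; => %r = ashr <4 x i32> %v, <i32 3, i32 3, i32 3, i32 3>
```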
Differential Revision: https://reviews.llvm.org/D81728
Narrowing an input expression of a truncate to a type larger than the
result of the truncate won't allow removing the truncate, but it may
enable further optimizations, e.g. allowing for larger vectorization
factors.
For now this is intentionally limited to integer types only, to avoid
producing new vector ops that might not be suitable for the target.
If we know that the only user is a trunc, we can also allow more
cases, e.g. shortening expressions with some additional shifts.
I would appreciate feedback on the best place to do such a narrowing.
This fixes PR43580.
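For example (an illustrative sketch, not a test from the patch), an
i64 add that is only used by a trunc can be evaluated in i32, even
though the trunc itself remains:
```
define i8 @narrow(i32 %x, i32 %y) {
  %xe = zext i32 %x to i64
  %ye = zext i32 %y to i64
  %add = add i64 %xe, %ye
  %t = trunc i64 %add to i8
  ret i8 %t
}
; => the add can be performed in i32; the (now narrower) trunc stays:
;    %add = add i32 %x, %y
;    %t = trunc i32 %add to i8
```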
Reviewers: spatel, RKSimon, lebedev.ri, xbolva00
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D82973
Currently canEvaluateTruncated can only attempt to truncate shifts whose amounts are scalar/uniform constants that are in range.
This patch replaces the constant extraction code with KnownBits handling, using KnownBits::getMaxValue to check that the amounts are in range.
This enables support for non-uniform constant cases, and also for variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs, as KnownBits returns nothing in those cases, but it's a definite improvement on what we currently have.
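For example (illustrative sketch): the per-element shift amounts below differ, but KnownBits shows both are less than 16, so the whole expression can be evaluated in the narrower type:
```
define <2 x i16> @narrow_shl(<2 x i32> %x) {
  ; Non-uniform but provably in-range constant amounts.
  %sh = shl <2 x i32> %x, <i32 4, i32 5>
  %t = trunc <2 x i32> %sh to <2 x i16>
  ret <2 x i16> %t
}
; => %x.tr = trunc <2 x i32> %x to <2 x i16>
;    %t = shl <2 x i16> %x.tr, <i16 4, i16 5>
```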
Differential Revision: https://reviews.llvm.org/D83127
As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount.
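An illustrative sketch of a non-uniform vector case (the underlying fold is the trunc+lshr+sext combine proven in the next entry):
```
define <2 x i8> @f(<2 x i8> %x) {
  %s = sext <2 x i8> %x to <2 x i32>
  %sh = lshr <2 x i32> %s, <i32 1, i32 2>   ; non-uniform shift amounts
  %t = trunc <2 x i32> %sh to <2 x i8>
  ret <2 x i8> %t
}
; => %t = ashr <2 x i8> %x, <i8 1, i8 2>
```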
Differential Revision: https://reviews.llvm.org/D83035
This is intended to preserve the logic of the existing transform,
but remove unnecessary restrictions on uses and types.
https://rise4fun.com/Alive/pYfR
Pre: C1 <= width(C1) - 8
%B = sext i8 %A
%C = lshr %B, C1
%r = trunc %C to i8
=>
%r = ashr i8 %A, trunc(umin(C1, 7))
Whilst trying to compile this test to assembly:
CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret.c
I discovered some warnings were firing in InstCombiner::visitBitCast
due to calls to getNumElements() for scalable vector types. These
calls only really made sense for fixed width vectors so I have fixed
up the code appropriately.
Differential Revision: https://reviews.llvm.org/D80559
This intrinsic implements the IEEE-754 operation
roundToIntegralTiesToEven: it rounds to the nearest integer value,
rounding halfway cases to even. The intrinsic covers the previously
missing IEEE-754 rounding case, so LLVM now provides full support for
the rounding operations defined by the standard.
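In IR, the new intrinsic is used like this (illustrative sketch):
```
declare float @llvm.roundeven.f32(float)

define float @round_to_even(float %x) {
  ; Ties go to the even integer: 2.5 rounds to 2.0, 3.5 rounds to 4.0.
  %r = call float @llvm.roundeven.f32(float %x)
  ret float %r
}
```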
Differential Revision: https://reviews.llvm.org/D75670
This was originally in D79116.
Converting from a narrow-enough FP source value to integer and
back to FP guarantees that the conversion to FP is exact because
of UB/poison-on-overflow.
This was suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
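A sketch of the guarantee (illustrative IR, not from the commit):
```
define float @exact_roundtrip(half %h) {
  ; fptosi yields poison for NaN/Inf, so any non-poison result has
  ; magnitude <= 65504 (the largest finite half value); every integer
  ; in that range is exactly representable in float, so the sitofp
  ; below is known to be exact.
  %i = fptosi half %h to i32
  %f = sitofp i32 %i to float
  ret float %f
}
```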
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
We can combine a floating-point extension cast with a conversion
from integer if we know the earlier cast is exact.
This is an optimization suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
However, this patch does not change the example suggested there.
This patch only uses the existing analysis to handle cases where
the integer source value magnitude is narrower than the
intermediate FP mantissa (guarantees that the conversion to FP is
exact). Follow-up patches to the analysis function can enable
more cases.
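For example (a minimal sketch): any i16 value fits in float's 24-bit
significand, so the intermediate cast is exact and the extension can
be folded away:
```
define double @f(i16 %x) {
  %f = sitofp i16 %x to float   ; exact: |i16 values| < 2^15 <= 2^24
  %d = fpext float %f to double
  ret double %d
}
; => %d = sitofp i16 %x to double
```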
Differential Revision: https://reviews.llvm.org/D79116
As suggested in D79116 - there's shared logic between the
existing code and potential new folds. This could go in
ValueTracking if it seems generally useful.
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, rriddle, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77263
Represent the shufflevector mask as out-of-line data in the
instruction instead of as a constant-vector operand. This should be
more efficient in the places that currently use getShuffleVector(),
and paves the way for further changes to add new shuffles for
scalable vectors.
This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.
I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors. Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it. Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.
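For reference, the textual form is unchanged (illustrative example):
```
define <4 x i32> @interleave_lo(<4 x i32> %a, <4 x i32> %b) {
  ; The mask <0,4,1,5> is now stored out-of-line in the instruction
  ; rather than as a constant-vector operand, but prints as before.
  %s = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 4, i32 1, i32 5>
  ret <4 x i32> %s
}
```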
Differential Revision: https://reviews.llvm.org/D72467
Canonicalize the case when a scalar extracted from a vector is
truncated. Transform such cases to bitcast-then-extractelement.
This will enable erasing the truncate operation.
This commit fixes PR45314.
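A sketch of the canonicalization, assuming a little-endian layout
(the exact indices are illustrative):
```
define i16 @f(<4 x i32> %v) {
  %e = extractelement <4 x i32> %v, i32 1
  %t = trunc i32 %e to i16
  ret i16 %t
}
; => bitcast to a vector of the narrow type and extract the matching
;    lane, so the trunc disappears:
;    %b = bitcast <4 x i32> %v to <8 x i16>
;    %t = extractelement <8 x i16> %b, i32 2
```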
Reviewers: spatel
Differential Revision: https://reviews.llvm.org/D76983
The existence of the class is more confusing than helpful, I think; the
commonality is mostly just "GEP is legal", which can be queried using
APIs on GetElementPtrInst.
Differential Revision: https://reviews.llvm.org/D75660
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.
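The idiom and its replacement (sketch):
```
%neg = fsub float -0.0, %x
=>
%neg = fneg float %x
```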
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75467
This makes sure that the constant expression bitcast goes through
target-dependent constant folding, and thus avoids an additional
iteration of InstCombine.
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.
This is confusing and makes consumer code more complicated.
I would expect ConstantFoldConstant() either to return non-null only
when it has actually folded something, or to always return non-null.
I'm going with the latter possibility here, which appears to be more
useful considering existing usage.
Differential Revision: https://reviews.llvm.org/D75543
Use UnaryOperator::CreateFNeg instead.
Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75130
This fixes a bug noted in the recent D72733 and seen
in the similar transform just above the changed source code.
I added tests with illegal types and zexts to show the bug -
we could transform legal phi ops to illegal, etc. I did not add
tests with trunc because we won't see any diffs on those patterns.
That is because InstCombiner::SliceUpIllegalIntegerPHI() appears to
do those transforms independently of datalayout. It can also create
more casts than are present in existing code.
There are some existing regression tests that do not include a
datalayout that would be altered by this fix. I assumed that the
lack of a datalayout in those regression files is an oversight, so
I added the minimal layout (make i32 legal) necessary to preserve
behavior on those tests.
Differential Revision: https://reviews.llvm.org/D73907
Adds a replaceOperand() helper, which is like Instruction.setOperand()
but adds the old operand to the worklist. This reduces the amount of
missing or incorrect worklist management.
This only applies the helper to a relatively small subset of
setOperand() calls in InstCombine, namely those of the pattern
`I.setOperand(); return &I;`, where it is most obviously applicable.
Differential Revision: https://reviews.llvm.org/D73803
This renames Worklist.AddDeferred() to Worklist.add() and
Worklist.Add() to Worklist.push(). The intention here is that
Worklist.add() should be the go-to method for explicit worklist
management, while the raw Worklist.push() is mostly for
InstCombine internals. I will then migrate uses of Worklist.push()
to Worklist.add() in followup changes.
As suggested by spatel on D73411, I'm also changing the remaining
method names to start with a lowercase character, in line with
current coding standards.
Differential Revision: https://reviews.llvm.org/D73745
D47163 created a rule that we should not change the casted
type of a select when we have matching types in its compare condition.
That was intended to help vector codegen, but it also could create
situations where we miss subsequent folds as shown in PR44545:
https://bugs.llvm.org/show_bug.cgi?id=44545
By using shouldChangeType(), we can continue to get the vector folds
(because we always return false for vector types). But we also solve
the motivating bug because it's ok to narrow the scalar select in that
example.
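A sketch of the kind of scalar case that can now be narrowed
(simplified and illustrative, not the exact IR from the bug report):
```
%c = icmp eq i32 %x, %y
%s = select i1 %c, i32 %a, i32 %b
%t = trunc i32 %s to i8
=>
%c = icmp eq i32 %x, %y
%a8 = trunc i32 %a to i8
%b8 = trunc i32 %b to i8
%t = select i1 %c, i8 %a8, i8 %b8
```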
Our canonicalization rules around select are a mess, but AFAICT, this
will not induce any infinite looping from the reverse transform (but
we'll need to watch for that possibility if committed).
Side note: there's a similar use of shouldChangeType() for phi ops
just below this diff, and the source and destination types appear to
be reversed.
Differential Revision: https://reviews.llvm.org/D72733
This fixes the issue encountered in D71164. Instead of using a
range-based for, manually iterate over the users and advance the
iterator beforehand, so we do not skip any users due to iterator
invalidation.
Differential Revision: https://reviews.llvm.org/D72657
This reverts commit a041c4ec6f.
This looks like a non-trivial change, and there have been no code
reviews (at least there were no Phabricator revisions attached to the
commit description). It is also causing a regression in one of our
downstream integration tests; we haven't been able to come up with a
minimal reproducer yet.
This does not solve PR17101, but it is one of the
underlying diffs noted here:
https://bugs.llvm.org/show_bug.cgi?id=17101#c8
We could ease the one-use checks for the 'set'
(no 'not' op) half of the transform, but I do not
know if that asymmetry would make things better
or worse.
Proofs:
https://rise4fun.com/Alive/uVB
Name: masked bit set
%sh1 = shl i32 1, %y
%and = and i32 %sh1, %x
%cmp = icmp ne i32 %and, 0
%r = zext i1 %cmp to i32
=>
%s = lshr i32 %x, %y
%r = and i32 %s, 1
Name: masked bit clear
%sh1 = shl i32 1, %y
%and = and i32 %sh1, %x
%cmp = icmp eq i32 %and, 0
%r = zext i1 %cmp to i32
=>
%xn = xor i32 %x, -1
%s = lshr i32 %xn, %y
%r = and i32 %s, 1
Judging by the existing comments, this was the intention, but the
transform never actually checked whether the existing phis would be
removed.
See https://bugs.llvm.org/show_bug.cgi?id=44242 for an example where
this causes much worse code generation on AMDGPU.
Differential Revision: https://reviews.llvm.org/D71209
Summary:
This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions.
It also adds a check in `Worklist::Add` that ensures that all added instructions have parents.
Simple test case that illustrates the difference when run with `--debug-only=instcombine`:
```
define i32 @test35(i32 %a, i32 %b) {
%1 = or i32 %a, 1135
%2 = or i32 %1, %b
ret i32 %2
}
```
Before this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting: %1 = or i32 %a, 1135
IC: Visiting: %2 = or i32 %1, %b
IC: ADD: %2 = or i32 %a, %b
IC: Old = %3 = or i32 %1, %b
New = <badref> = or i32 %2, 1135
IC: ADD: <badref> = or i32 %2, 1135
...
```
With this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting: %1 = or i32 %a, 1135
IC: Visiting: %2 = or i32 %1, %b
IC: ADD: %2 = or i32 %a, %b
IC: Old = %3 = or i32 %1, %b
New = <badref> = or i32 %2, 1135
IC: ADD: %3 = or i32 %2, 1135
...
```
Reviewers: fhahn, davide, spatel, foad, grosser, nikic
Reviewed By: nikic
Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71093