llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	3fcb00df5d	[InstCombine] restrict shift-trunc-shift fold to opposite direction shifts This is NFCI because the pattern with 2 left-shifts should get folded independently by smaller folds. The motivation is to refine this block to avoid infinite loops seen with D110170.	2021-09-30 15:06:13 -04:00
Sanjay Patel	ea56dcb730	[InstCombine] fix miscompile from dropRedundantMaskingOfLeftShiftInput() The test is from https://llvm.org/PR51351. There are 2 related logic bugs from over-generalizing "lshr" to "any shr", but I'm not sure how to expose the difference for "MaskC" because instsimplify already folds ashr of -1. I'll extend instsimplify to catch the MaskD pattern as a follow-up, but this patch should be enough to avoid the miscompile.	2021-09-29 11:43:18 -04:00
Sanjay Patel	98fde3489a	[InstCombine] reduce redundant code for shl-binop folds This is NFCI (no-functional-change-intended), but there are benign diffs possible with commutable ops as seen in the test diffs. The transforms were repeated for the commutative opcodes, but that should not be necessary if we canonicalize the patterns that we're matching. If both operands of the binop match, that should get folded eventually. The transform that starts with a mask op seems to over-constrain the use checks, so that could be a potential enhancement.	2021-09-28 17:06:45 -04:00
Alex Richardson	ebb3dc0833	[InstCombine] Fold ptrtoint(gep i8 null, x) -> x This commit is the InstCombine follow-up to the previous constant-folding change that enables noticeable optimizations for CHERI-enabled targets. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D110247	2021-09-28 17:57:37 +01:00
Sanjay Patel	1f8bead678	[InstCombine] reduce code for swapped predicate; NFC	2021-09-28 10:00:35 -04:00
Sanjay Patel	fdba1dccbe	[InstCombine] reduce code for shl-of-sub transform; NFC	2021-09-27 14:56:01 -04:00
Sanjay Patel	623f93ed1c	[InstCombine] add use check to shl transform This bug was introduced with the refactoring in: `9075edc89b` ...but there were no tests to detect it.	2021-09-27 14:10:26 -04:00
Kazu Hirata	59540b29f8	[InstCombine] Fix an "unused variable" warning	2021-09-27 09:49:32 -07:00
Sanjay Patel	9075edc89b	[InstCombine] move shl-only folds out from under commonShiftTransforms(); NFCI This is no-functional-change-intended, but it hopefully makes things slightly clearer and more efficient to have transforms that require 'shl' be called only from visitShl(). Further cleanup is possible.	2021-09-27 12:09:47 -04:00
Sanjay Patel	21429cf43a	[InstCombine] generalize fold for (trunc (X u>> C1)) u>> C This is another step towards trying to re-apply D110170 by eliminating conflicting transforms that cause infinite loops. `a47c8e40c7` was a previous patch in this direction. The diffs here are mostly cosmetic, but intentional: 1. The existing code that would handle this pattern in FoldShiftByConstant() is limited to 'shl' only now. The formatting change to IsLeftShift shows that we could move several transforms into visitShl() directly for efficiency because they are not common shift transforms. 2. The tests are regenerated to show new instruction names to prove that we are getting (almost) identical logic results. 3. The one case where we differ ("trunc_sandwich_small_shift1") shows that we now use a narrow 'and' instruction. Previously, we relied on another transform to do that, but it is limited to legal types. That seems to be a legacy constraint from when IR analysis and codegen were less robust. https://alive2.llvm.org/ce/z/JxyGA4 declare void @llvm.assume(i1) define i8 @src(i32 %x, i32 %c0, i8 %c1) { ; The sum of the shifts must not overflow the source width. %z1 = zext i8 %c1 to i32 %sum = add i32 %c0, %z1 %ov = icmp ult i32 %sum, 32 call void @llvm.assume(i1 %ov) %sh1 = lshr i32 %x, %c0 %tr = trunc i32 %sh1 to i8 %sh2 = lshr i8 %tr, %c1 ret i8 %sh2 } define i8 @tgt(i32 %x, i32 %c0, i8 %c1) { %z1 = zext i8 %c1 to i32 %sum = add i32 %c0, %z1 %maskc = lshr i8 -1, %c1 %s = lshr i32 %x, %sum %t = trunc i32 %s to i8 %a = and i8 %t, %maskc ret i8 %a }	2021-09-27 10:57:31 -04:00
Sanjay Patel	025a805d7c	[InstCombine] match variable names and code comments; NFC Similar to: `29c09c7` Planned follow-up is to add a transform here to allow removing a common shift fold that is conflicting with D110170.	2021-09-27 10:57:31 -04:00
Sanjay Patel	6063e6b499	[InstCombine] move add after min/max intrinsic This is another regression noted with the proposal to canonicalize to the min/max intrinsics in D98152. Here are Alive2 attempts to show correctness without specifying exact constants: https://alive2.llvm.org/ce/z/bvfCwh (smax) https://alive2.llvm.org/ce/z/of7eqy (smin) https://alive2.llvm.org/ce/z/2Xtxoh (umax) https://alive2.llvm.org/ce/z/Rm4Ad8 (umin) (if you comment out the assume and/or no-wrap, you should see failures) The different output for the umin test is due to a fold added with `c4fc2cb5b2` : // umin(x, 1) == zext(x != 0) We probably want to adjust that, so it applies more generally (umax --> sext or patterns where we can fold to select-of-constants). Some folds that were ok when starting with cmp+select may increase instruction count for the equivalent intrinsic, so we have to decide if it's worth altering a min/max. Differential Revision: https://reviews.llvm.org/D110038	2021-09-26 09:49:10 -04:00
Simon Pilgrim	5a14edd8ed	[InstCombine] Ensure shifts are in range for (X << C1) / C2 -> X fold. We can get here before out of range shift amounts have been handled - limit to BW-2 for sdiv and BW-1 for udiv Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38078	2021-09-25 12:57:43 +01:00
Sanjay Patel	a47c8e40c7	[InstCombine] fold lshr(trunc(lshr X, C1)) C2 Only the multi-use cases are changing here because there's another fold that catches the simpler patterns. But that other fold is the source of infinite loops when we try to add D110170, so removing that is planned as a follow-up. Attempt to show the general proof in Alive2: https://alive2.llvm.org/ce/z/Ns1uS2 Note that the overshift fold-to-zero tests are not currently handled by instsimplify. If they were, we could assert that the shift amount sum is less than the source bitwidth.	2021-09-24 15:44:07 -04:00
Sanjay Patel	29c09c7653	[InstCombine] match variable names and code comments; NFC	2021-09-24 15:44:07 -04:00
Sanjay Patel	3c5500907b	Revert "[InstCombine] fold cast of right-shift if high bits are not demanded (2nd try)" This reverts commit `bb9333c350`. This exposes another existing bug that causes an infinite loop as shown in D110170 ...so reverting while I look at another fix.	2021-09-24 10:47:35 -04:00
David Sherwood	c2634fc6ab	[Analysis] Fix issues when querying vscale attributes on functions There are several places in the code that are currently broken as they assume an Instruction always has a parent Function when attempting to get the vscale_range attribute. This patch adds checks that an Instruction has a parent. I've added a test for a parentless @llvm.vscale intrinsic call here: unittests/Analysis/ValueTrackingTest.cpp Differential Revision: https://reviews.llvm.org/D110158	2021-09-24 09:58:10 +01:00
Sanjay Patel	bb9333c350	[InstCombine] fold cast of right-shift if high bits are not demanded (2nd try) The 1st try at this was reverted because it caused an infinite loop in instcombine. That should be fixed after: `1cd6b44f26` (masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C Narrowing the shift should be better for analysis and can lead to follow-on transforms as shown. Attempt at a general proof in Alive2: https://alive2.llvm.org/ce/z/tRnnSF Here are a couple of the specific tests: https://alive2.llvm.org/ce/z/bCnTp- https://alive2.llvm.org/ce/z/TfaHnb Differential Revision: https://reviews.llvm.org/D110170	2021-09-23 09:41:37 -04:00
Alex Richardson	05663dc146	[InstSimplify] Don't lose inbounds when simplifying a GEP I noticed this while working on a (ptrtoint (gep null, x)) -> x fold. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D110168	2021-09-23 09:25:06 +01:00
hyeongyu kim	10a5632550	[NFC][InstCombine] Fix inconsistent comments	2021-09-23 09:31:39 +09:00
Sanjay Patel	1cd6b44f26	[InstCombine] add one-use check to shift-shift transform We don't want to create extra instructions, and this could infinite loop with the proposed transform in D110170.	2021-09-22 16:31:12 -04:00
hyeongyu kim	98e96663f6	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (3/3) This patch is for fixing potential shufflevector-related bugs like D93818. As D93818, this patch change shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineVectorOps. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110230	2021-09-23 00:48:24 +09:00
hyeongyu kim	ec8311444a	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (2/3) This patch is for fixing potential shufflevector-related bugs like D93818. As D93818, this patch change shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineCompares and InstructionCombining. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110227	2021-09-23 00:14:50 +09:00
hyeongyu kim	e5aaf03326	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (1/3) This patch is for fixing potential shufflevector-related bugs like D93818. As D93818, this patch change shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineCasts. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110226	2021-09-22 23:18:51 +09:00
Sanjay Patel	c6013f71a4	Revert "[InstCombine] fold cast of right-shift if high bits are not demanded" This reverts commit `2f6b07316f`. This caused several bots to hit an infinite loop at stage 2, so it needs to be reverted while figuring out how to fix that.	2021-09-22 07:45:21 -04:00
Yi Kong	d0746f2e9b	Don't fold (select C, (gep Ptr, Idx), Ptr) if C is vector but Idx is scalar The folding rule (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) creates a malformed SELECT IR if C is a vector while Idx is scalar. SELECT VecC, ScalarIdx, 0 We could splat Idx to a vector but it defeats the purpose of optimisation. Don't apply the folding rule in this case. This fixes a regression from commit `d561b6fbdb`.	2021-09-22 18:11:33 +08:00
Florian Hahn	e08a5dc86f	[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC). InstCombine's worklist can be re-used by other passes like VectorCombine. Move it to llvm/Transform/Utils and rename it to InstructionWorklist. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110181	2021-09-22 08:47:21 +01:00
Sanjay Patel	2f6b07316f	[InstCombine] fold cast of right-shift if high bits are not demanded (masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C Narrowing the shift should be better for analysis and can lead to follow-on transforms as shown. Attempt at a general proof in Alive2: https://alive2.llvm.org/ce/z/tRnnSF Here are a couple of the specific tests: https://alive2.llvm.org/ce/z/bCnTp- https://alive2.llvm.org/ce/z/TfaHnb Differential Revision: https://reviews.llvm.org/D110170	2021-09-21 16:09:08 -04:00
Owen Anderson	b5fbbdd202	Teach InstCombine to eliminate malloc-realloc-free triplets. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D109988	2021-09-21 18:07:49 +00:00
Dávid Bolvanský	c0fdfc9af2	[InstCombine] powi(x, y) * powi(x, z) -> powi(x, y + z) We already have pow(x, y) * pow(x, z) -> pow(x, y + z) transformation, but we are missing same transformation for powi (power is integer). Requires reassoc. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D109954	2021-09-21 18:20:46 +02:00
Anna Thomas	69921f6f45	[InstCombine] Improve TryToSinkInstruction with multiple uses This patch allows sinking an instruction which can have multiple uses in a single user. We were previously over-restrictive by looking for exactly one use, rather than one user. Also added an API for retrieving a unique undroppable user. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D109700	2021-09-21 10:04:04 -04:00
Simon Pilgrim	fc8f1e4419	[InstCombine] foldConstantInsEltIntoShuffle - bail if we fail to find constant element (PR51824) If getAggregateElement() returns null for any element, early out as otherwise we will assert when creating a new constant vector Fixes PR51824 + ; OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38057	2021-09-21 13:01:09 +01:00
Usman Nadeem	f417d9d821	[InstCombine] Eliminate vector reverse if all inputs/outputs to an instruction are reverses Differential Revision: https://reviews.llvm.org/D109808 Change-Id: I1a10d2bc33acbe0ea353c6cb3d077851391fe73e	2021-09-20 18:32:24 -07:00
Nikita Popov	dd0226561e	[IR] Add helper to convert offset to GEP indices We implement logic to convert a byte offset into a sequence of GEP indices for that offset in a number of places. This patch adds a DataLayout::getGEPIndicesForOffset() method, which implements the core logic. I've updated SROA, ConstantFolding and InstCombine to use it, and there's a few more places where it looks relevant. Differential Revision: https://reviews.llvm.org/D110043	2021-09-20 20:18:16 +02:00
Sanjay Patel	41ff7612b3	[InstCombine] allow splat vectors for narrowing masked fold Mostly cosmetic diffs, but the use of m_APInt matches splat constants.	2021-09-17 11:24:16 -04:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Dávid Bolvanský	a4a426c9e0	[InstCombine] Added llvm.powi optimizations If power is even: powi(-x, p) -> powi(x, p) powi(fabs(x), p) -> powi(x, p) powi(copysign(x, y), p) -> powi(x, p)	2021-09-16 19:42:21 +02:00
Kazu Hirata	24c8eaec94	[Transforms] Use make_early_inc_range (NFC)	2021-09-15 19:55:24 -07:00
Anna Thomas	f9e4aebe4a	Revert "[InstCombine] Improve TryToSinkInstruction with multiple uses" This reverts commit `4ac4e52189`. There are couple of test failures, which needs update of the test cases. Doing a clean revert and will recommit the change along with fixed testcases.	2021-09-15 18:03:11 -04:00
Anna Thomas	4ac4e52189	[InstCombine] Improve TryToSinkInstruction with multiple uses This patch allows sinking an instruction which can have multiple uses in a single user. We were previously over-restrictive by looking for exactly one use, rather than one user. Also, the API for retrieving undroppable user has been updated accordingly since in both usecases (Attributor and InstCombine), we seem to care about the user, rather than the use. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D109700	2021-09-15 20:39:38 +00:00
Sanjay Patel	e5a32d720e	[InstCombine] move extend after insertelement if both operands are extended I was wondering how instcombine does on the examples in D109236, and we're missing a basic transform: inselt (ext X), (ext Y), Index --> ext (inselt X, Y, Index) https://alive2.llvm.org/ce/z/z2aBu9 Note that there are several possible extensions of this fold (see TODO comments). Differential Revision: https://reviews.llvm.org/D109537	2021-09-15 14:38:03 -04:00
Filipp Zhinkin	f5d8952356	[InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y) Enabled mul folding optimization that was previously disabled by being incorrect. To preserve correctness, mul's operand that is not compared with zero in select's condition is now frozen. Related bug: https://bugs.llvm.org/show_bug.cgi?id=51286 Correctness: https://alive2.llvm.org/ce/z/bHef7J https://alive2.llvm.org/ce/z/QcR7sf https://alive2.llvm.org/ce/z/vvBLzt https://alive2.llvm.org/ce/z/jGDXgq https://alive2.llvm.org/ce/z/3Pe8Z4 https://alive2.llvm.org/ce/z/LGga8M https://alive2.llvm.org/ce/z/CTG5fs Differential Revision: https://reviews.llvm.org/D108408	2021-09-15 09:04:06 -04:00
Anna Thomas	b4e787d8f4	[InstCombining] Refactor checks for TryToSinkInstruction. NFC Moved out the checks for profitability of TryToSinkInstructions into a lambda function. This will also allow us to easily add checks for bailing out if the transform is not profitable. Tests-Run: instCombine tests.	2021-09-13 09:04:34 -04:00
Sanjay Patel	3a126134d3	[InstCombine] remove casts from splat-a-bit pattern https://alive2.llvm.org/ce/z/_AivbM This case seems clear since we can reduce instruction count and avoid an intermediate type change, but we might want to use mask-and-compare for other sequences. Currently, we can generate more instructions on some related patterns by trying to use bit-hacks instead of mask+cmp, so something is not behaving as expected.	2021-09-12 09:18:14 -04:00
Sanjay Patel	75e8eb2b10	[InstCombine] update code/test comments; NFC Follow-up for post-commit suggestion on: `28afaed691` The comments were partly copied from the original code, but not updated to match the new code.	2021-09-11 10:53:53 -04:00
Sanjay Patel	28afaed691	[InstCombine] fold sub of min/max intrinsics with invertible ops This is a translation of the existing code to handle the intrinsics and another step towards D98152. https://alive2.llvm.org/ce/z/jA7eBC This pattern is already handled by underlying folds if there are less uses, so the minimal tests in this case have extra uses. The larger cmyk tests show the motivation - when combined with other folds, we invert a larger sequence and eliminate 'not' ops.	2021-09-11 09:18:46 -04:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Sanjay Patel	97a4e7b7ff	[InstCombine] remove a buggy set of zext-icmp transforms The motivating case is an infinite loop shown with a reduced test from: https://llvm.org/PR51762 To solve this, I'm proposing we delete the most obviously broken part of this code. The bug example shows a fundamental problem: we ask computeKnownBits if a transform will be profitable, alter the code by creating new instructions, then rely on computeKnownBits to return the same answer to actually eliminate instructions. But there's no guarantee that the results will be the same between the 1st and 2nd calls. In the infinite loop example, we get different answers, so we add instructions that conflict with some other transform, and we're stuck. There's at least one other problem visible in the test diff for `@zext_or_masked_bit_test_uses`: the code doesn't check uses properly, so we can end up with extra instructions created. Last, it's not clear if this set of transforms actually improves analysis or codegen. I spot-checked a few targets and don't see a clear win: https://godbolt.org/z/x87EWovso If we do see a regression from this change, codegen seems like the right place to add a cmp -> bit-hack fold. If this is too big of a step, we could limit the computeKnownBits calls by not passing a context instruction and/or limiting the recursion. I checked that those would stop the infinite loop for PR51762, but that won't guarantee that some other example does not fall into the same loop. Differential Revision: https://reviews.llvm.org/D109440	2021-09-09 08:49:39 -04:00
Sanjay Patel	a3c1669b17	[InstCombine] fold icmp equality with 'or' mask ops This could go either direction since the instruction count is the same either way, but there are a few reasons to prefer this: 1. We already do the related transform with 'and' (see just above the new code). 2. We try (too hard) to compensate for not having this and possibly other folds in transformZExtICmp(), and that leads to bugs like https://llvm.org/PR51762 . 3. Codegen looks better across a variety of targets. https://alive2.llvm.org/ce/z/uEgn4P	2021-09-07 16:34:00 -04:00
Arthur Eubanks	b81fc14f2d	[NFC][InstCombine] Make check for sret in a vararg function clearer We're trying to get the parameter index of sret and see if it's part of a function's varargs. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D109335	2021-09-07 11:19:27 -07:00

1 2 3 4 5 ...

4569 Commits