llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	fc18a88231	[InstCombine] Avoid creating float binop ConstantExprs Replace ConstantExpr:getFAdd etc with call to ConstantFoldBinaryOpOperands(). I'm using the constant folding API rather than IRBuilder here to ensure that this does actually constant fold. These transforms don't use m_ImmConstant(), so this would not otherwise be guaranteed (and apparently, they can't use m_ImmConstant because they want to handle scalable vector splats). There is an opportunity here to further migrate these to the ConstantFoldFPInstOperands() API, which would respect the denormal mode. I've held off on doing so here, because some of this code explicitly checks for denormal results, and I don't want to touch it in a mostly NFC change.	2022-07-08 16:36:04 +02:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Sanjay Patel	3f33d67d8a	[InstCombine] fold mul with masked low bit operand to trunc+select https://alive2.llvm.org/ce/z/o7rQ5q This shows an extra instruction in some cases, but that is caused by an existing canonicalization of trunc -> and+icmp. Codegen should be better for any target where a multiply is more costly than the most simple ALU op. This ends up producing the requested x86 asm from issue #55618, but it's not the same IR. We are missing a canonicalization from the negate+mask pattern to the trunc+select created here.	2022-06-05 20:07:18 -04:00
Sanjay Patel	8689463bfb	[InstCombine] make pattern matching more consistent; NFC We could go either way on this and several similar matches. Just matching as a binop is possibly slightly more efficient; we don't need to re-confirm the opcode of the instruction.	2022-06-02 16:01:23 -04:00
zhongyunde	3e6ba89055	[InstCombine] Fold a mul with bool value into and Fixes https://github.com/llvm/llvm-project/issues/55599 X * Y --> X & Y, iff X, Y can be only {0, 1}. https://alive2.llvm.org/ce/z/_RsTKF Reviewed By: spatel, nikic Differential Revision: https://reviews.llvm.org/D126040	2022-05-30 21:05:00 +08:00
Sanjay Patel	b5b6aa4d53	[InstCombine] fold multiply by signbit-splat to cmp+select (ashr i32 X, 31) * C --> (X < 0) ? -C : 0 https://alive2.llvm.org/ce/z/G8u9SS With a constant operand, this is an improvement in IR and codegen (where it can be converted to a mask op). Without a constant operand, we would have to negate the operand, so that is probably better left to the backend. This is similar but not the same optimization that is requested in #55618.	2022-05-27 11:54:19 -04:00
Sanjay Patel	5a6e085757	[InstCombine] reduce code duplication; NFC	2022-05-27 11:54:19 -04:00
Sanjay Patel	c4c750058f	[InstCombine] fold mul of signbit directly to X < 0 ? Y : 0 This is effectively NFC (intentionally no test diffs) because we already have the related fold that converts the 'and' pattern to select. So this is just an efficiency improvement.	2022-05-26 16:19:15 -04:00
Sanjay Patel	e8c20d995b	[IR] add and use pattern match specialization for sqrt intrinsic; NFC This was included in D126190 originally, but it's independent and a useful change for readability.	2022-05-23 14:16:30 -04:00
Sanjay Patel	be7f09f7b2	[IR] create and use helper functions that test the signbit; NFCI	2022-05-16 11:26:23 -04:00
Sanjay Patel	2fa8fc3d0a	[InstCombine] freeze operand in div+mul fold As discussed in issue #37809, this transform is not safe if the input is an undefined value. This is similar to recent changes for urem and sdiv: `d428f09b2c` `99ef341ce9` There is no difference in codegen on the basic examples, but this could lead to regressions. We may need to improve freeze analysis or lowering if that happens. Presumably, in real cases that are similar to the tests where a subsequent transform removes the rem, we will also be able to remove the freeze by seeing that the parameter has 'noundef'.	2022-05-12 13:49:29 -04:00
Sanjay Patel	99ef341ce9	[InstCombine] freeze operand in sdiv expansion As discussed in issue #37809, this transform is not safe if the input is an undefined value. This is similar to a recent change for urem: `d428f09b2c` There is no difference in codegen on the basic examples, but this could lead to regressions. We may need to improve freeze analysis or lowering if that happens. Presumably, in real cases that are similar to the tests where a subsequent transform removes the select, we will also be able to remove the freeze by seeing that the parameter has 'noundef'.	2022-05-11 14:01:28 -04:00
Sanjay Patel	d428f09b2c	[InstCombine] freeze operand in urem expansion As discussed in issue #37809, this transform is not safe if the input is an undefined value. There is no difference in codegen on the basic examples, but this could lead to regressions. We may need to improve freeze analysis or lowering if that happens.	2022-05-11 12:47:26 -04:00
Jonas Paulsson	304378fd09	Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls." (was `0f8c626`). This reverts commit `14d9390`. The patch previously failed to recognize cases where the user had defined a function alias with the same name as the library function. Module::getFunction() would then return nullptr, which is what the sanitizer discovered. In this updated version a new function isLibFuncEmittable() has also been introduced, which is now used instead of TLI->has() anytime a library function is to be emitted. It also makes sure there is e.g. no function alias with the same name in the module. Reviewed By: Eli Friedman Differential Revision: https://reviews.llvm.org/D123198	2022-05-02 19:37:00 +02:00
Liqin Weng	fa4b4f0fcb	[InstCombine] fold more constant remainder to select-of-constants remainder Reviewed By: xbolva00, spatel, Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D123486	2022-04-12 09:40:56 +08:00
Chenbing Zheng	467cbb6249	[InstCombine] fold more constant divisor to select-of-constants divisor By adding a parameter to function FoldOpIntoSelect, we can fold more Ops to Select. For this example, we tend to fold the division instruction, so we no longer care whether SelectInst is one use. This patch solves the TODO left in InstCombine/div.ll. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122967	2022-04-08 10:19:24 +08:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Nikita Popov	587c7ff15c	[InstCombine] Support min/max intrinsics in udiv->lshr fold This complements the existing fold for selects. This fold is a bit more conservative, requiring one-use. The other folds here should probably also be subjected to a one-use restriction. https://alive2.llvm.org/ce/z/Q9eCDU https://alive2.llvm.org/ce/z/8YK2CJ	2022-02-23 15:51:36 +01:00
Nikita Popov	03e6efb8c2	[InstCombine] Further simplify udiv -> lshr folding Rather than queuing up actions, have one function that does the log2() fold in the obvious way, but with a flag that allows us to check whether the fold will succeed without actually performing it.	2022-02-23 15:29:21 +01:00
Nikita Popov	5ccb0582c2	[InstCombine] Simplify udiv -> lshr folding What we're really doing here is converting Op0 udiv Op1 into Op0 lshr log2(Op1), so phrase it in that way. Actually pushing the lshr into the log2(Op1) expression should be seen as a separate transform.	2022-02-23 14:55:23 +01:00
Nikita Popov	5fb65557e3	[InstCombine] Remove unused visitUDivOperand() argument (NFC) This function only works on the RHS operand.	2022-02-23 13:16:44 +01:00
Sanjay Patel	39e602b6c4	[InstCombine] try to fold binop with phi operands This is an alternate version of D115914 that handles/tests all binary opcodes. I suspect that we don't see these patterns too often because -simplifycfg would convert the minimal cases into selects rather than leave them in phi form (note: instcombine has logic holes for combining the select patterns too though, so that's another potential patch). We only create a new binop in a predecessor that unconditionally branches to the final block. https://alive2.llvm.org/ce/z/C57M2F https://alive2.llvm.org/ce/z/WHwAoU (not safe to speculate an sdiv for example) https://alive2.llvm.org/ce/z/rdVUvW (but it is ok on this path) Differential Revision: https://reviews.llvm.org/D117110	2022-01-22 15:00:06 -05:00
Sanjay Patel	a7a2860d0e	[InstCombine] convert mul with sexted bool and constant to select We already have the related folds for zext-of-bool, so it should make things more consistent to have this transform to select for sext-of-bool too: https://alive2.llvm.org/ce/z/YikdfA Fixes #53319	2022-01-20 15:57:01 -05:00
Sanjay Patel	a7ed21aa1e	[InstCombine] try to fold div with constant dividend and select-of-constants divisor We avoid this fold in the more general cases where we use FoldOpIntoSelect. That's because -- unlike most binary opcodes -- 'div' can't usually be speculated with a variable divisor since it can have immediate UB. But in the case where both arms of the select are constants, we can safely evaluate both sides and eliminate 'div' completely. This is a follow-up to the equivalent fold for 'rem' opcodes: D115173 / `f65be726ab`	2021-12-08 10:27:50 -05:00
Sanjay Patel	f65be726ab	[InstCombine] try to fold rem with constant dividend and select-of-constants divisor We avoid this fold in the more general cases where we use `FoldOpIntoSelect`. That's because -- unlike most binary opcodes -- 'rem' can't usually be speculated with a variable divisor since it can have immediate UB. But in the case where both arms of the select are constants, we can safely evaluate both sides and eliminate 'rem' completely. This should fix: https://llvm.org/PR52102 The same optimization for 'div' is planned as a follow-up patch. Differential Revision: https://reviews.llvm.org/D115173	2021-12-07 15:48:45 -05:00
Sanjay Patel	7a2949647a	[InstCombine] propagate no-wrap flag through select-of-mul fold This may not be obvious, but Alive2 agrees: https://alive2.llvm.org/ce/z/Ld9qNT If the mul has "nsw", then -1 * INT_MIN is poison, so the negate can also have "nsw" because 0 - INT_MIN is poison. If the mul has "nuw", then that means the "OtherOp" can only be 0 or 1 (anything else multiplied by 0xfff... would wrap). So the replacement negate must be "nsw" because it is either "0-0" or "0-1". This is another regression noticed with a planned follow-up to D111410.	2021-10-12 12:57:20 -04:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Simon Pilgrim	5a14edd8ed	[InstCombine] Ensure shifts are in range for (X << C1) / C2 -> X fold. We can get here before out of range shift amounts have been handled - limit to BW-2 for sdiv and BW-1 for udiv Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38078	2021-09-25 12:57:43 +01:00
Florian Hahn	e08a5dc86f	[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC). InstCombine's worklist can be re-used by other passes like VectorCombine. Move it to llvm/Transform/Utils and rename it to InstructionWorklist. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110181	2021-09-22 08:47:21 +01:00
Dávid Bolvanský	c0fdfc9af2	[InstCombine] powi(x, y) * powi(x, z) -> powi(x, y + z) We already have pow(x, y) * pow(x, z) -> pow(x, y + z) transformation, but we are missing same transformation for powi (power is integer). Requires reassoc. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D109954	2021-09-21 18:20:46 +02:00
Daniil Seredkin	6643e51d79	[InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X) InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case. Differential Revision: https://reviews.llvm.org/D104193	2021-06-18 16:28:06 +07:00
Daniil Seredkin	6de741de08	Revert "[InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X)" This reverts commit `31053338c9`.	2021-06-18 14:21:02 +07:00
Daniil Seredkin	31053338c9	[InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X) InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case. Differential Revision: https://reviews.llvm.org/D104193	2021-06-18 14:12:00 +07:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seems to have been missing in that patch was to make sure that the rest of LLVM also handles that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions (typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Daniil Seredkin	7736c1936a	[InstCombine] Missed optimization for pow(x, y) * pow(x, z) with fast-math If FP reassociation (fast-math) is allowed, then LLVM is free to do the following transformation pow(x, y) * pow(x, z) -> pow(x, y + z). This patch adds this transformation and tests for it. See more https://bugs.llvm.org/show_bug.cgi?id=47205 It handles two cases 1. When operands of fmul are different instructions %4 = call reassoc float @llvm.pow.f32(float %0, float %1) %5 = call reassoc float @llvm.pow.f32(float %0, float %2) %6 = fmul reassoc float %5, %4 --> %3 = fadd reassoc float %1, %2 %4 = call reassoc float @llvm.pow.f32(float %0, float %3) 2. When operands of fmul are the same instruction %4 = call reassoc float @llvm.pow.f32(float %0, float %1) %5 = fmul reassoc float %4, %4 --> %3 = fadd reassoc float %1, %1 %4 = call reassoc float @llvm.pow.f32(float %0, float %3) Differential Revision: https://reviews.llvm.org/D102574	2021-06-07 08:08:05 -04:00
Daniil Seredkin	13140120dc	[InstCombine] Relax constraints of uses for exp(X) * exp(Y) -> exp(X + Y) InstCombine didn't perform the transformations when fmul's operands were the same instruction because it required to have one use for each of them which is false in the case. This patch fixes this + adds tests for them and introduces a new function isOnlyUserOfAnyOperand to check these cases in a single place. This patch is a result of discussion in D102574. Differential Revision: https://reviews.llvm.org/D102698	2021-06-01 08:33:23 -04:00
Sanjay Patel	a7cee55762	[InstCombine] fold fdiv with powi divisor (PR49147) This extends `b40fde062c` for the especially non-standard powi pattern. We want to avoid being completely wrong on the negation-of-int-min corner case, so I'm adding an extra FMF check for 'ninf' assuming that gives us the flexibility to handle that possibility. https://llvm.org/PR49147	2021-02-24 16:44:36 -05:00
Sanjay Patel	868d43fbd6	[InstCombine] add helper for x/pow(); NFC We at least want to add powi to this list, so split it off into a switch to reduce code duplication.	2021-02-24 16:44:36 -05:00
Sanjay Patel	e772618f1e	[InstCombine] fold fdiv with exp/exp2 divisor (PR49147) Follow-up to: D96648 / `b40fde062` ...for the special-case base calls. From the earlier commit: This is unusual in the general (non-reciprocal) case because we need an extra instruction, but that should be better for general FP reassociation and codegen. We conservatively check for "arcp" FMF here as we do with existing fdiv folds, but it is not strictly necessary to have that.	2021-02-20 16:02:58 -05:00
Sanjay Patel	b40fde062c	[InstCombine] fold fdiv with pow divisor (PR49147) This is unusual in the general (non-reciprocal) case because we need an extra instruction, but that should be better for general FP reassociation and codegen. We conservatively check for "arcp" FMF here as we do with existing fdiv folds, but it is not strictly necessary to have that. This is part of solving: https://llvm.org/PR49147 (The powi variant potentially has a different constraint.) Differential Revision: https://reviews.llvm.org/D96648	2021-02-14 08:07:36 -05:00
Kazu Hirata	302313a264	[Transforms] Use range-based for loops (NFC)	2021-02-08 22:33:53 -08:00
Dávid Bolvanský	ed396212da	[InstCombine] Transform abs pattern using multiplication to abs intrinsic (PR45691) ``` unsigned r(int v) { return (1 | -(v < 0)) * v; } `r` is equivalent to `abs(v)`. ``` ``` define <4 x i8> @src(<4 x i8> %0) { %1: %2 = ashr <4 x i8> %0, { 31, undef, 31, 31 } %3 = or <4 x i8> %2, { 1, 1, 1, undef } %4 = mul nsw <4 x i8> %3, %0 ret <4 x i8> %4 } => define <4 x i8> @tgt(<4 x i8> %0) { %1: %2 = icmp slt <4 x i8> %0, { 0, 0, 0, 0 } %3 = sub nsw <4 x i8> { 0, 0, 0, 0 }, %0 %4 = select <4 x i1> %2, <4 x i8> %3, <4 x i8> %0 ret <4 x i8> %4 } Transformation seems to be correct! ``` Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D94874	2021-01-17 17:06:14 +01:00
Simon Pilgrim	b752daa26b	[InstCombine] Replace getLogBase2 internal helper with ConstantExpr::getExactLogBase2. NFCI. This exposes the helper for other power-of-2 instcombine folds that I'm intending to add vector support to. The helper only operated on power-of-2 constants so getExactLogBase2 is a more accurate name.	2020-10-11 10:31:17 +01:00
Simon Pilgrim	702ccb40e2	[InstCombine] getLogBase2(undef) -> 0. Move the undef element handling into the getLogBase2 helper instead of pre-empting with replaceUndefsWith.	2020-10-10 20:29:03 +01:00
Simon Pilgrim	3aab3cbd4a	[InstCombine] getLogBase2 - no need to specify Type. NFCI. In all the getLogBase2 uses, the specified Type is always the same as the constant being folded.	2020-10-10 20:09:55 +01:00
Simon Pilgrim	567049f892	[InstCombine] Use m_FAbs matcher helper. NFCI.	2020-10-01 14:42:34 +01:00
Nikita Popov	58b28fa7a2	[InstCombine] Fold mul of abs intrinsic Same as the existing SPF_ABS fold. We don't need to explicitly handle NABS, as the negs will get folded away first.	2020-09-05 12:37:45 +02:00
Venkataramanan Kumar	626c3738cd	[InstCombine] Transform 1.0/sqrt(X) * X to X/sqrt(X) These transforms will now be performed irrespective of the number of uses for the expression "1.0/sqrt(X)": 1.0/sqrt(X) * X => X/sqrt(X) X * 1.0/sqrt(X) => X/sqrt(X) We already handle more general cases, and we are intentionally not creating extra (and likely expensive) fdiv ops in IR. This pattern is the exception to the rule because we always expect the Backend to reduce X/sqrt(X) to sqrt(X), if it has the necessary (reassoc) fast-math-flags. Ref: DagCombiner optimizes the X/sqrt(X) to sqrt(X). Differential Revision: https://reviews.llvm.org/D86726	2020-09-02 08:23:48 -04:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Sanjay Patel	e6b6787d01	[InstCombine] fold abs(X)/X to cmp+select The backend can convert the select-of-constants to bit-hack shift+logic if desirable. https://alive2.llvm.org/ce/z/pgJT6E define i8 @src(i8 %x) { %0: %a = abs i8 %x, 1 %d = sdiv i8 %x, %a ret i8 %d } => define i8 @tgt(i8 %x) { %0: %cond = icmp sgt i8 %x, 255 %r = select i1 %cond, i8 1, i8 255 ret i8 %r } Transformation seems to be correct!	2020-08-17 08:01:28 -04:00
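
Several of the multiply folds listed above (`3e6ba89055`, `b5b6aa4d53`, `ed396212da`) are small enough to check by brute force at a narrow bit width. The following is a minimal sketch, not part of LLVM: the helper names (`to_signed`, `ashr`, `mul`, and the three checkers) are hypothetical, and the model uses plain wrapping 8-bit arithmetic only. It does not model `undef`, poison, or `nsw`/`nuw` flags, so it is not a substitute for the Alive2 proofs linked in the commit messages.

```python
BITS = 8                      # assumed width; small enough to enumerate every value
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def ashr(x, amount):
    """Arithmetic shift right of a BITS-wide value, returned as an unsigned bit pattern."""
    return (to_signed(x) >> amount) & MASK


def mul(a, b):
    """Wrapping multiply of two BITS-wide values (no nsw/nuw modelling)."""
    return (a * b) & MASK


def mul_bool_is_and():
    """X * Y --> X & Y when X and Y can only be 0 or 1."""
    return all(mul(x, y) == (x & y) for x in (0, 1) for y in (0, 1))


def mul_signbit_splat_is_select():
    """(ashr X, BITS-1) * C --> (X < 0) ? -C : 0 for every X and C."""
    for x in range(1 << BITS):
        for c in range(1 << BITS):
            before = mul(ashr(x, BITS - 1), c)
            after = (-c) & MASK if to_signed(x) < 0 else 0
            if before != after:
                return False
    return True


def mul_abs_pattern_is_abs():
    """((ashr V, BITS-1) | 1) * V --> (V < 0) ? 0 - V : V under wrapping arithmetic."""
    for v in range(1 << BITS):
        before = mul(ashr(v, BITS - 1) | 1, v)
        after = (-v) & MASK if to_signed(v) < 0 else v
        if before != after:
            return False
    return True
```

Each checker returns `True` only if the "before" and "after" forms agree on every input, so a `False` would indicate a miscompile in this simplified model. Exhaustive enumeration is feasible only because the width is 8 bits; the real folds apply to arbitrary integer and vector types.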


359 Commits