llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	236c4524a7	[InstSimplify] remove ctpop of 1 (low) bit https://llvm.org/PR48608 As noted in the test comment, we could handle a more general case in instcombine and remove this, but I don't have evidence that we need to do that. https://alive2.llvm.org/ce/z/MRW9gD	2020-12-28 16:06:20 -05:00
Sanjay Patel	1351f719d4	[InstSimplify] add tests for ctpop; NFC (PR48608)	2020-12-28 16:06:19 -05:00
Juneyoung Lee	db7a2f347f	Precommit transform tests that have poison as insertelement's placeholder This commit copies existing tests at llvm/Transforms and replaces 'insertelement undef' in those files with 'insertelement poison'. (see https://reviews.llvm.org/D93586) Tests listed using this script: grep -R -E '^[^;]insertelement <.> undef,' . | cut -d":" -f1 | uniq | wc -l Tests updated: file_org=llvm/test/Transforms/$1 file=${file_org%.ll}-inseltpoison.ll cp $file_org $file sed -i -E 's/^([^;])insertelement <(.)> undef/\1insertelement <\2> poison/g' $file head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q if [ "$?" == 1 ]; then echo "$file : should be manually updated" # I manually updated the script exit 1 fi python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file	2020-12-24 11:46:17 +09:00
Sanjay Patel	38ca7face6	[InstSimplify] reduce logic with inverted add/sub ops https://llvm.org/PR48559 This could be part of a larger ValueTracking API, but I don't see that currently. https://rise4fun.com/Alive/gR0 Name: and Pre: C1 == ~C2 %sub = add i8 %x, C1 %sub1 = sub i8 C2, %x %r = and i8 %sub, %sub1 => %r = 0 Name: or Pre: C1 == ~C2 %sub = add i8 %x, C1 %sub1 = sub i8 C2, %x %r = or i8 %sub, %sub1 => %r = -1 Name: xor Pre: C1 == ~C2 %sub = add i8 %x, C1 %sub1 = sub i8 C2, %x %r = xor i8 %sub, %sub1 => %r = -1	2020-12-21 08:51:43 -05:00
Sanjay Patel	d6118759f3	[InstSimplify] add tests for inverted logic operands; NFC	2020-12-21 08:51:42 -05:00
Roman Lebedev	e9289dc25f	[InstSimplify] Don't miscompile `X == 0 ? abs(X) : -abs(X) --> -abs(X)` xform The transform wasn't checking that the LHS of the comparison is the `X` in question... This is the miscompile that was holding up D87188. Thanks to Dave Green for producing an actionable reproducer!	2020-12-18 21:18:13 +03:00
Roman Lebedev	9b183a1452	[NFC][InstSimplify] Add miscompiled testcase from D87188/D87197 Thanks to Dave Green for producing an actionable reproducer! It is (obviously) a miscompile: ``` ---------------------------------------- define i32 @select_abs_of_abs_eq_wrong(i32 %x, i32 %y) { %0: %abs = abs i32 %x, 0 %neg = sub i32 0, %abs %cmp = icmp eq i32 %y, 0 %sel = select i1 %cmp, i32 %neg, i32 %abs ret i32 %sel } => define i32 @select_abs_of_abs_eq_wrong(i32 %x, i32 %y) { %0: %abs = abs i32 %x, 0 ret i32 %abs } Transformation doesn't verify! ERROR: Value mismatch Example: i32 %x = #xe0000000 (3758096384, -536870912) i32 %y = #x00000000 (0) Source: i32 %abs = #x20000000 (536870912) i32 %neg = #xe0000000 (3758096384, -536870912) i1 %cmp = #x1 (1) i32 %sel = #xe0000000 (3758096384, -536870912) Target: i32 %abs = #x20000000 (536870912) Source value: #xe0000000 (3758096384, -536870912) Target value: #x20000000 (536870912) Alive2: Transform doesn't verify! ```	2020-12-18 21:18:13 +03:00
Juneyoung Lee	864dda5fd5	[InstSimplify] Add tests that fold instructions with poison operands (NFC)	2020-12-02 01:01:59 +09:00
Juneyoung Lee	9c49dcc356	[ConstantFold] Don't fold and/or i1 poison to poison (NFC) .. because it causes miscompilation when combined with select i1 -> and/or. It is the select fold which is incorrect; but it is costly to disable the fold, so hack this one. D92270	2020-11-30 22:58:31 +09:00
Juneyoung Lee	53040a968d	[ConstantFold] Fold more operations to poison This patch folds more operations to poison. Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D92270	2020-11-29 21:19:48 +09:00
Juneyoung Lee	c6b62efb91	[ConstantFold] Fold operations to poison if possible This patch updates ConstantFold, so operations are folded into poison if possible. <alive2 proofs> casts: https://alive2.llvm.org/ce/z/WSj7rw binary operations (arithmetic): https://alive2.llvm.org/ce/z/_7dEyJ binary operations (bitwise): https://alive2.llvm.org/ce/z/cezjVN vector/aggregate operations: https://alive2.llvm.org/ce/z/BQ7hWz unary ops: https://alive2.llvm.org/ce/z/yBRs4q other ops: https://alive2.llvm.org/ce/z/iXbcFD Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D92203	2020-11-29 02:28:40 +09:00
Cullen Rhodes	7b8d50b141	[InstSimplify] Clarify use of FixedVectorType in SimplifySelectInst Folding a select of vector constants that include undef elements only applies to fixed vectors, but there's no earlier check the type is not scalable so it crashes for scalable vectors. This adds a check so this optimization is only attempted for fixed vectors. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92046	2020-11-27 09:55:29 +00:00
Christopher Tetreault	792f8e1114	[SVE] Take constant fold fast path for splatted vscale vectors This should be a perfectly reasonable operation for scalable vectors. Currently, it only works for zeroinitializer values of ScalableVectorType, but the fundamental operation is sound and it should be possible to make it work for other splats Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D77442	2020-11-17 12:45:31 -08:00
Simon Pilgrim	6fa7030a76	[ConstProp] Remove unused check-prefixes Just use default CHECK and remove duplicate RUN	2020-11-09 13:12:40 +00:00
Sanjay Patel	00808e321c	[InstSimplify] allow vector folds for (Pow2C << X) == NonPow2C Existing pre-conditions seem to be correct: https://rise4fun.com/Alive/lCLB Name: non-zero C1 Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C1 != 0 %sub = shl i8 C2, %X %cmp = icmp eq i8 %sub, C1 => %cmp = false Name: one == C2 Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C2 == 1 %sub = shl i8 C2, %X %cmp = icmp eq i8 %sub, C1 => %cmp = false Name: nuw Pre: !isPowerOf2(C1) && isPowerOf2(C2) %sub = shl nuw i8 C2, %X %cmp = icmp eq i8 %sub, C1 => %cmp = false Name: nsw Pre: !isPowerOf2(C1) && isPowerOf2(C2) %sub = shl nsw i8 C2, %X %cmp = icmp eq i8 %sub, C1 => %cmp = false	2020-11-08 09:52:05 -05:00
Sanjay Patel	73a5f0b614	[InstSimplify] add tests for icmp with power-of-2 operand; NFC	2020-11-08 09:52:05 -05:00
Sanjay Patel	c74db55ff5	[InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80	2020-11-04 08:12:48 -05:00
Sanjay Patel	5765edbf9e	[InstSimplify] add vector cmp tests; NFC	2020-11-04 08:12:47 -05:00
Sanjay Patel	e77ba263fe	[InstSimplify] peek through 'not' operand in logic-of-icmps fold This extends D78430 to solve cases like: https://llvm.org/PR47858 There are still missed opportunities shown in the tests, and as noted in the earlier patches, we have related functionality in InstCombine, so we may want to extend other folds in a similar way. A semi-random sampling of test diff proofs in this patch: https://rise4fun.com/Alive/sS4C	2020-10-25 11:13:30 -04:00
Sanjay Patel	7de2add829	[InstSimplify] add tests for logic-of-cmps with not op; NFC One variant of this is shown in: https://llvm.org/PR47858	2020-10-25 11:13:30 -04:00
Sanjay Patel	c72198079d	[ValueTracking] add range limits for cttz As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete. InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of cttz to process any "icmp pred cttz(X), C" pattern (the min value is initialized to zero automatically). https://alive2.llvm.org/ce/z/Z_SLWZ Follow-up to D89976.	2020-10-23 08:43:45 -04:00
Sanjay Patel	3fb0d6b0d5	[ValueTracking] add range limits for ctlz As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete. InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of ctlz to process any "icmp pred ctlz(X), C" pattern (the min value is initialized to zero automatically). Follow-up to D89976.	2020-10-23 08:43:45 -04:00
Sanjay Patel	0351bd959f	[InstSimplify] add tests for cttz constant range; NFC This is a search-and-replace of `f6cb7f3`.	2020-10-23 08:43:45 -04:00
Sanjay Patel	9bcb437f46	[InstSimplify] add tests for ctlz constant range; NFC This is a search-and-replace of `f6cb7f3`.	2020-10-23 08:43:45 -04:00
Sanjay Patel	748ecc6b32	[ValueTracking] add range limits for ctpop As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete. InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of ctpop to process any "icmp pred ctpop(X), C" pattern (the min value is initialized to zero automatically). Differential Revision: https://reviews.llvm.org/D89976	2020-10-23 08:17:54 -04:00
Sanjay Patel	f6cb7f37ff	[InstSimplify] add tests for ctpop constant range; NFC	2020-10-22 14:16:48 -04:00
Sjoerd Meijer	51d7df3fa1	[InstructionSimplify] icmp (X+Y), (X+Z) simplification This improves simplifications for pattern `icmp (X+Y), (X+Z)` -> `icmp Y,Z` if only one of the operands has NSW set, e.g.: icmp slt (x + 0), (x +nsw 1) We can still safely rewrite this to: icmp slt 0, 1 because we know that the LHS can't overflow if the RHS has NSW set and C1 < C2 && C1 >= 0, or C2 < C1 && C1 <= 0 This simplification is useful because ScalarEvolutionExpander which is used to generate code for SCEVs in different loop optimisers is not always able to put back NSW flags across control-flow, thus inhibiting CFG simplifications. Differential Revision: https://reviews.llvm.org/D89317	2020-10-22 08:55:52 +01:00
Sjoerd Meijer	e86a70ce3d	[InstructionSimplify] And precommit more tests for D89317. NFC.	2020-10-21 11:02:25 +01:00
Sjoerd Meijer	782b8f0d38	[InstructionSimplify] Precommit more tests for D89317. NFC.	2020-10-21 10:14:39 +01:00
Sanjay Patel	7c516504a1	[InstSimplify] allow vector splats for icmp-of-neg folds	2020-10-20 09:24:36 -04:00
Sanjay Patel	b11588b18e	[InstSimplify] add vector icmp tests; NFC	2020-10-20 09:24:35 -04:00
Juneyoung Lee	62a0ec1612	Add support for !noundef metadata on loads This patch adds metadata !noundef and allows load instructions to optionally carry it. A load with !noundef always returns a well-defined value (it has no undef bits and isn't poison). If the loaded value isn't well defined, the behavior is undefined. This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values. It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has a well-defined value, and showing that a load from an arbitrary location is well-defined is usually hard otherwise. The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead. The existing codebase already strips unknown metadata when doing code motion, so using metadata is UB-safe as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89050	2020-10-17 13:50:10 +09:00
Jay Foad	1417abe54c	[AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic Differential Revision: https://reviews.llvm.org/D89558	2020-10-16 17:10:21 +01:00
Sjoerd Meijer	66f22411e1	[InstructionSimplify] Precommit tests for D89317. NFC.	2020-10-13 15:40:33 +01:00
Amara Emerson	322d0afd87	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
Sanjay Patel	149f5b573c	[APFloat] convert SNaN to QNaN in convert() and raise Invalid signal This is an alternate fix (see D87835) for a bug where a NaN constant gets wrongly transformed into Infinity via truncation. In this patch, we uniformly convert any SNaN to QNaN while raising 'invalid op'. But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR, so those are always encoded/decoded by calling convert from/to 64-bit hex. See D88664 for a clang fix needed to allow this change. Differential Revision: https://reviews.llvm.org/D88238	2020-10-01 14:37:38 -04:00
Sanjay Patel	645c53a9d9	[ValueTracking] enhance isKnownNeverInfinity to understand sitofp As discussed in D87877, instcombine already has this fold, but it was missing from the more general ValueTracking logic. https://alive2.llvm.org/ce/z/PumYZP	2020-09-27 08:40:31 -04:00
Sanjay Patel	71f25ac8ca	[InstSimplify] add tests for fcmp with casted op; NFC This shows missing analysis in ValueTracking's isKnownNeverInfinity().	2020-09-27 08:36:57 -04:00
Sanjay Patel	e34bd1e0b0	[APFloat] prevent NaN morphing into Inf on conversion (PR43907) We shift the significand right on a truncation, but that needs to be made NaN-safe: always set at least 1 bit in the significand. https://llvm.org/PR43907 See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed). Differential Revision: https://reviews.llvm.org/D87835	2020-09-24 14:02:19 -04:00
Arthur Eubanks	61ac58e10a	[NewPM] Pin tests with -debug-pass to legacy PM -debug-pass is a legacy PM only option. Some tests check that the pass returned that it made a change, which is not relevant to the NPM, since passes return PreservedAnalyses. Some tests check that passes are freed at the proper time, which is also not relevant to the NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87945	2020-09-22 17:54:25 -07:00
Arthur Eubanks	f4f7df037e	[DIE] Remove DeadInstEliminationPass This pass is like DeadCodeEliminationPass, but only does one pass through a function instead of iterating on users of eliminated instructions. DeadCodeEliminationPass should be used in all cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87933	2020-09-21 12:12:25 -07:00
Sanjay Patel	f74a334fe3	[ConstantFolding] add undef handling for fmin/fmax intrinsics The output here may not be optimal (yet), but it should be consistent for commuted operands (it was not before) and correct. We can do better by checking FMF and NaN if needed. Code in InstSimplify generally assumes that we have already folded code like this, so it was not handling 2 constant inputs by commuting consistently.	2020-09-19 10:31:01 -04:00
Sanjay Patel	d3b0644e22	[InstSimplify] add tests for constant folding fmin/fmax with undef op; NFC	2020-09-18 16:09:44 -04:00
Sanjay Patel	3f100e64b4	[InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567) It would also be correct to return the variable operand in these cases, but eliminating a variable use is probably better for optimization.	2020-09-18 10:05:44 -04:00
Sanjay Patel	6690de098e	[InstSimplify] add another test for NaN propagation; NFC	2020-09-18 09:20:26 -04:00
Sanjay Patel	c6ebe3fd00	[InstSimplify] add tests for FP constant miscompile; NFC (PR43907)	2020-09-17 12:04:39 -04:00
Sanjay Patel	8985755762	[InstSimplify] add limit folds for fmin/fmax If the constant operand is the opposite of the min/max value, then the result must be the other value. This is based on the similar codegen transform proposed in: D87571	2020-09-15 10:58:44 -04:00
Sanjay Patel	55d371abd7	[InstSimplify] add folds for fmin/fmax with 'nnan' maximum(nnan X, +INF) --> +INF minimum(nnan X, -INF) --> -INF This is based on the similar codegen transform proposed in: D87571	2020-09-14 11:46:11 -04:00
Sanjay Patel	7526376164	[InstSimplify] allow folds for fmin/fmax with 'ninf' maxnum(ninf X, +FLT_MAX) --> +FLT_MAX minnum(ninf X, -FLT_MAX) --> -FLT_MAX This is based on the similar codegen transform proposed in: D87571	2020-09-14 11:18:08 -04:00
Sanjay Patel	dae68fdf9e	[InstSimplify] add/move tests for fmin/fmax; NFC The new tests are duplicated from the sibling patch for codegen: D87571	2020-09-14 10:24:19 -04:00
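
Several of the integer folds in the log above (38ca7face6, 236c4524a7, 00808e321c) cite Alive proofs for 8-bit values. As an illustration only, the same identities can be brute-forced in plain Python. This is a minimal sketch, not LLVM code: the function names are hypothetical, the 8-bit width mirrors the `i8` proofs quoted in the messages, and reading 236c4524a7 as `ctpop(x & 1) --> x & 1` is an assumption based on its title.

```python
# Exhaustive 8-bit checks of three folds from the commit log (illustration only).
# All names below are hypothetical helpers, not part of LLVM.
BITS = 8
MASK = (1 << BITS) - 1


def is_power_of_2(v: int) -> bool:
    """True when exactly one bit is set (0 is not a power of two)."""
    return v != 0 and (v & (v - 1)) == 0


def check_inverted_add_sub() -> bool:
    """38ca7face6: with C1 == ~C2, (x + C1) and (C2 - x) are bitwise inverses,
    so 'and' gives 0 while 'or' and 'xor' give all-ones (-1)."""
    for c2 in range(1 << BITS):
        c1 = ~c2 & MASK
        for x in range(1 << BITS):
            add = (x + c1) & MASK
            sub = (c2 - x) & MASK
            if (add & sub) != 0 or (add | sub) != MASK or (add ^ sub) != MASK:
                return False
    return True


def check_ctpop_low_bit() -> bool:
    """236c4524a7 (assumed reading): ctpop(x & 1) is just x & 1."""
    return all(bin(x & 1).count("1") == (x & 1) for x in range(1 << BITS))


def check_shl_pow2_ne_nonpow2() -> bool:
    """00808e321c, 'non-zero C1' case: (Pow2C << x) never equals NonPow2C."""
    for c2 in filter(is_power_of_2, range(1 << BITS)):
        for c1 in range(1, 1 << BITS):
            if is_power_of_2(c1):
                continue
            # Shift amounts >= BITS yield poison in LLVM IR, so they are skipped.
            if any(((c2 << x) & MASK) == c1 for x in range(BITS)):
                return False
    return True


if __name__ == "__main__":
    print(check_inverted_add_sub(), check_ctpop_low_bit(), check_shl_pow2_ne_nonpow2())
```

A brute-force loop like this only covers one bit width; the Alive links in the commit messages remain the authoritative proofs.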

832 Commits