This commit copies existing tests at llvm/Transforms and replaces
'insertelement undef' in those files with 'insertelement poison'.
(see https://reviews.llvm.org/D93586)
Tests listed using this script:
grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq |
wc -l
Tests updated:
file_org=llvm/test/Transforms/$1
file=${file_org%.ll}-inseltpoison.ll
cp $file_org $file
sed -i -E 's/^([^;]*)insertelement <(.*)> undef/\1insertelement <\2> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
echo "$file : should be manually updated"
# I manually updated the script
exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
The transform wasn't checking that the LHS of the comparison
*is* the `X` in question...
This is the miscompile that was holding up D87188.
Thanks to Dave Green for producing an actionable reproducer!
.. because it causes miscompilation when combined with select i1 -> and/or.
It is the select fold which is incorrect; but it is costly to disable the fold, so hack this one.
D92270
Folding a select of vector constants that include undef elements only
applies to fixed vectors, but there's no earlier check the type is not
scalable so it crashes for scalable vectors. This adds a check so this
optimization is only attempted for fixed vectors.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D92046
This should be a perfectly reasonable operation for scalable vectors.
Currently, it only works for zeroinitializer values of
ScalableVectorType, but the fundamental operation is sound and it should
be possible to make it work for other splats
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D77442
This extends D78430 to solve cases like:
https://llvm.org/PR47858
There are still missed opportunities shown in the tests,
and as noted in the earlier patches, we have related
functionality in InstCombine, so we may want to extend
other folds in a similar way.
A semi-random sampling of test diff proofs in this patch:
https://rise4fun.com/Alive/sS4C
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of cttz to process any "icmp pred cttz(X), C" pattern (the
min value is initialized to zero automatically).
https://alive2.llvm.org/ce/z/Z_SLWZ
Follow-up to D89976.
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of ctlz to process any "icmp pred ctlz(X), C" pattern (the
min value is initialized to zero automatically).
Follow-up to D89976.
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of ctpop to process any "icmp pred ctpop(X), C" pattern (the
min value is initialized to zero automatically).
Differential Revision: https://reviews.llvm.org/D89976
This improves simplifications for pattern `icmp (X+Y), (X+Z)` -> `icmp Y,Z`
if only one of the operands has NSW set, e.g.:
icmp slt (x + 0), (x +nsw 1)
We can still safely rewrite this to:
icmp slt 0, 1
because we know that the LHS can't overflow if the RHS has NSW set and
C1 < C2 && C1 >= 0, or C2 < C1 && C1 <= 0
This simplification is useful because ScalarEvolutionExpander which is used to
generate code for SCEVs in different loop optimisers is not always able to put
back NSW flags across control-flow, thus inhibiting CFG simplifications.
Differential Revision: https://reviews.llvm.org/D89317
This patch adds metadata !noundef and makes load instructions can optionally have it.
A load with !noundef always return a well-defined value (has no undef bit or isn't poison).
If the loaded value isn't well defined, the behavior is undefined.
This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values.
It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has well-defined value, and showing that a load from arbitrary location is well-defined is usually hard otherwise.
The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead.
The existing codebase already is stripping unknown metadata when doing code motion, so using metadata is UB-safe as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D89050
This is an alternate fix (see D87835) for a bug where a NaN constant
gets wrongly transformed into Infinity via truncation.
In this patch, we uniformly convert any SNaN to QNaN while raising
'invalid op'.
But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR,
so those are always encoded/decoded by calling convert from/to 64-bit hex.
See D88664 for a clang fix needed to allow this change.
Differential Revision: https://reviews.llvm.org/D88238
As discussed in D87877, instcombine already has this fold,
but it was missing from the more general ValueTracking logic.
https://alive2.llvm.org/ce/z/PumYZP
We shift the significand right on a truncation, but that needs to be made NaN-safe:
always set at least 1 bit in the significand.
https://llvm.org/PR43907
See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed).
Differential Revision: https://reviews.llvm.org/D87835
-debug-pass is a legacy PM only option.
Some tests checks that the pass returned that it made a change,
which is not relevant to the NPM, since passes return PreservedAnalyses.
Some tests check that passes are freed at the proper time, which is also
not relevant to the NPM.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D87945
This pass is like DeadCodeEliminationPass, but only does one pass
through a function instead of iterating on users of eliminated
instructions.
DeadCodeEliminationPass should be used in all cases.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D87933
The output here may not be optimal (yet), but it should be
consistent for commuted operands (it was not before) and
correct. We can do better by checking FMF and NaN if needed.
Code in InstSimplify generally assumes that we have already
folded code like this, so it was not handling 2 constant
inputs by commuting consistently.
If the constant operand is the opposite of the min/max value,
then the result must be the other value.
This is based on the similar codegen transform proposed in:
D87571