Adding following fold opportunity:
((A | B) ^ A) & ((A | B) ^ B) --> 0
Reviewed By: spatel, rampitec
Differential Revision: https://reviews.llvm.org/D115755
This fixes the assertion failure reported at
https://reviews.llvm.org/D114889#3198921 with a straightforward
check, until the cleaner fix in D115924 can be reapplied.
This reverts commit 9fd4f80e33.
This breaks SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c
in test-suite. Either the test is incorrect, or clang is generating
incorrect union initialization code. I've submitted
https://reviews.llvm.org/D115994 to fix the test, assuming my
interpretation is correct. Reverting this in the meantime as it
may take some time to resolve.
There are a number of places that specially handle loads from a
uniform value where all the bits are the same (zero, one, undef,
poison), because we a) don't care about the load offset in that
case and b) it bypasses casts that might not be legal generally
but do work with uniform values.
We had multiple implementations of this, with a different set of
supported values each time, as well as incomplete type checks in
some cases. In particular, this fixes the assertion reported in
https://reviews.llvm.org/D114889#3198921, as well as a similar
assertion that could be triggered via constant folding.
Differential Revision: https://reviews.llvm.org/D115924
Usually the case where the types are the same ends up being handled
fine because it's legal to do a trivial bitcast to the same type.
However, this is not true for aggregate types. Short-circuit the
whole code if the types match exactly to account for this.
The test is switched to use -instsimplify as it is in the
InstSimplify directory. In this particular case InstCombine does
fold the load (in a very roundabout way), but InstSimplify does not.
Refer to https://llvm.org/PR52546.
Simplifies the following cases:
not(X) == 0 -> X != 0 -> X
not(X) <=u 0 -> X >u 0 -> X
not(X) >=s 0 -> X <s 0 -> X
not(X) != 1 -> X == 1 -> X
not(X) <=u 1 -> X >=u 1 -> X
not(X) >s 1 -> X <=s -1 -> X
Differential Revision: https://reviews.llvm.org/D114666
See D114666 for proposed code change to instsimplify.
The difference between the CHECK result of these 2 tests
highlights missed folds in instsimplify
(e.g. (icmp eq (xor X, true), false) -> X) that are
already being handled by instcombine.
The tests are based on:
llvm/test/Transforms/InstSimplify/icmp-bool-constant.ll
Differential Revision: https://reviews.llvm.org/D115209
Reduce code duplication for commutative pattern matching
and fix a miscompile.
We can't safely propagate an undef element in this transform:
https://alive2.llvm.org/ce/z/s5xy55
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.
AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.
Differential Revision: https://reviews.llvm.org/D114455
https://alive2.llvm.org/ce/z/4PaPDy
There's a related fold where the inner 'or' is replaced by 'and',
but that needs to be more careful about matching a 'not'.
(~a & b) ^ (a | b) --> a
This is the swapped and/or (Demorgan?) sibling fold for
the fold added with D114462 ( 892648b18a ).
This case is easier to specify because we are returning
a root value, not a 'not':
https://alive2.llvm.org/ce/z/SRzj4f
(a & b) ^ (~a | b) --> ~a
I was looking for a shortcut to reduce some of the complex logic
folds that are currently up for review (D113216
and others in that stack), and I found this missing from
instcombine/instsimplify.
There is a trade-off in putting it into instsimplify: because
we can't create new values here, we need a strict 'not' op (no
undef elements). Otherwise, the fold is not valid:
https://alive2.llvm.org/ce/z/k_AGGj
If this was in instcombine instead, we could create the proper
'not'. But having the fold here benefits other passes like GVN
that use instsimplify as an analysis.
There is a related fold where 'and' and 'or' are swapped, and
that is planned as a follow-up commit.
Differential Revision: https://reviews.llvm.org/D114462
As described in https://bugs.llvm.org/show_bug.cgi?id=52429 this
fold is incorrect, because inbounds only guarantees that the
pointers don't wrap in the unsigned space: It is possible that
the sign boundary is crossed by an object.
I'm dropping the fold entirely rather than adjusting it, because
computePointerICmp() fully subsumes it (just with correct predicate
handling).
Differential Revision: https://reviews.llvm.org/D113343
The maximal value of a half is 0x7bff, which is 65504 when converted to
an integer. This patch teaches that to computeConstantRange to compute a
constant range with the correct maximum value.
https://alive2.llvm.org/ce/z/BV_Spbhttps://alive2.llvm.org/ce/z/Nwuqvb
The maximum value for a float converted in the same way is 3.4e38, which
requires 129bits of data. I have not added that here as integer types so
larger are rare, compared to integers types larger than 17 bits require
for half floats.
The MVE tests change because instsimplify happens to be run as a part of
the backend, where it doesn't tend to for other backends.
Differential Revision: https://reviews.llvm.org/D112694
Currently the fadd optimizations in InstSimplify don't know how to do this
NoSignedZeros "X + 0.0 ==> X" fold when using the constrained intrinsics.
This adds the support.
This review is derived from D106362 with some improvements from D107285
and is a follow-on to D111085.
Differential Revision: https://reviews.llvm.org/D111450
Currently the fadd optimizations in InstSimplify don't know how to do this
"X + -0.0 ==> X" fold when using the constrained intrinsics. This adds the
support.
This commit is derived from D106362 with some improvements from D107285.
Differential Revision: https://reviews.llvm.org/D111085
https://alive2.llvm.org/ce/z/QagQMn
This fold is handled by instcombine via SimplifyUsingDistributiveLaws(),
but we are missing the sibliing fold for 'logical and' (implemented with
'select'). Retrofitting the code in instcombine looks much harder
than just adding a small adjustment here, and this is potentially more
efficient and beneficial to other passes.
This refactors load folding to happen in two cleanly separated
steps: ConstantFoldLoadFromConstPtr() takes a pointer to load from
and decomposes it into a constant initializer base and an offset.
Then ConstantFoldLoadFromConst() loads from that initializer at
the given offset. This makes the core logic independent of having
actual GEP expressions (and those GEP expressions having certain
structure) and will allow exposing ConstantFoldLoadFromConst() as
an independent API in the future.
This is mostly only a refactoring, but it does make the folding
logic slightly more powerful.
Differential Revision: https://reviews.llvm.org/D111023
In working on D106362 I found that a few more tests were needed. I've
been asked to pre-push the tests for that ticket. This should complete
the tests needed for now.
In ValueTracking.cpp we use a function called
computeKnownBitsFromOperator to determine the known bits of a value.
For the vscale intrinsic if the function contains the vscale_range
attribute we can use the maximum and minimum values of vscale to
determine some known zero and one bits. This should help to improve
code quality by allowing certain optimisations to take place.
Tests added here:
Transforms/InstCombine/icmp-vscale.ll
Differential Revision: https://reviews.llvm.org/D109883