This is outwardly NFC from what I can tell, but it should be more efficient
to simplify first (despite the name, SimplifyAssociativeOrCommutative does
not actually simplify as InstSimplify does - it creates/morphs instructions).
This should make it easier to refactor duplicated code that runs for all binops.
llvm-svn: 335258
Summary:
Related to https://bugs.llvm.org/show_bug.cgi?id=37793, https://reviews.llvm.org/D46760#1127287
We'd like to do this canonicalization https://rise4fun.com/Alive/Gmc
But it is currently restricted by rL155136 / rL155362, which says:
```
// This is a constant shift of a constant shift. Be careful about hiding
// shl instructions behind bit masks. They are used to represent multiplies
// by a constant, and it is important that simple arithmetic expressions
// are still recognizable by scalar evolution.
//
// The transforms applied to shl are very similar to the transforms applied
// to mul by constant. We can be more aggressive about optimizing right
// shifts.
//
// Combinations of right and left shifts will still be optimized in
// DAGCombine where scalar evolution no longer applies.
```
I think these tests show that for *constants*, SCEV has no issues with that canonicalization.
Reviewers: mkazantsev, spatel, efriedma, sanjoy
Reviewed By: mkazantsev
Subscribers: sanjoy, javed.absar, llvm-commits, stoklund, bixia
Differential Revision: https://reviews.llvm.org/D48229
llvm-svn: 335101
Summary:
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.
The original commit was reverted because
it broke tests for amdgpu backend, which i didn't check.
Now, the backed was updated to recognize these new
patterns, so we are good.
https://bugs.llvm.org/show_bug.cgi?id=37603https://rise4fun.com/Alive/cplX
Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm
Reviewed By: spatel, rampitec, nhaehnle
Subscribers: wdng, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D47980
llvm-svn: 334818
As noted in the review thread for rL333782, we could have
made a bug harder to hit if we were simplifying instructions
before trying other folds.
The shuffle transform in question isn't ever a simplification;
it's just a canonicalization. So I've renamed that to make that
clearer.
This is NFCI at this point, but I've regenerated the test file
to show the cosmetic value naming difference of using
instcombine's RAUW vs. the builder.
Possible follow-ups:
1. Move reassociation folds after simplifies too.
2. Refactor common code; we shouldn't have so much repetition.
llvm-svn: 333820
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly
vague. The constant check is redundant in some cases, but it allows
removing duplication for most of the calls.
llvm-svn: 326329
The hexagon test should be fixed now.
Original commit message:
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
This can allow us to get the select closer to other selects to enable removing one.
Differential Revision: https://reviews.llvm.org/D39222
llvm-svn: 317600
This broke the CodeGen/Hexagon/loop-idiom/pmpy-mod.ll test on a bunch of buildbots.
> This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
>
> This can allow us to get the select closer to other selects to enable removing one.
>
> Differential Revision: https://reviews.llvm.org/D39222
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-svn: 317518
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
This can allow us to get the select closer to other selects to enable removing one.
Differential Revision: https://reviews.llvm.org/D39222
llvm-svn: 317510
Narrow ops are better for bit-tracking, and in the case of vectors,
may enable better codegen.
As the trunc test shows, this can allow follow-on simplifications.
There's a block of code in visitTrunc that deals with shifted ops
with FIXME comments. It may be possible to remove some of that now,
but I want to make sure there are no problems with this step first.
http://rise4fun.com/Alive/Y3a
Name: hoist_ashr_ahead_of_sext_1
%s = sext i8 %x to i32
%r = ashr i32 %s, 3 ; shift value is < than source bit width
=>
%a = ashr i8 %x, 3
%r = sext i8 %a to i32
Name: hoist_ashr_ahead_of_sext_2
%s = sext i8 %x to i32
%r = ashr i32 %s, 8 ; shift value is >= than source bit width
=>
%a = ashr i8 %x, 7 ; so clamp this shift value
%r = sext i8 %a to i32
Name: junc_the_trunc
%a = sext i16 %v to i32
%s = ashr i32 %a, 18
%t = trunc i32 %s to i16
=>
%t = ashr i16 %v, 15
llvm-svn: 310942
We already support pulling through an add with constant RHS. We can do the same for subtract.
Differential Revision: https://reviews.llvm.org/D36443
llvm-svn: 310407
Name: narrow_shift
Pre: C1 < 8
%zx = zext i8 %x to i32
%l = lshr i32 %zx, C1
=>
%narrowC = trunc i32 C1 to i8
%ns = lshr i8 %x, %narrowC
%l = zext i8 %ns to i32
http://rise4fun.com/Alive/jIV
This isn't directly applicable to PR34046 as written, but we
need to have more narrowing folds like this to be sure that
rotate patterns are recognized.
llvm-svn: 310060
Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference.
This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do.
llvm-svn: 307451
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.
Reviewers: dberlin, hfinkel, spatel
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33604
llvm-svn: 305049
This was discussed in D33338. We have larger pattern-matching ending in a truncate that
we can reduce or remove by handling these smaller patterns first. Further motivation is
that narrower shift ops are easier for value tracking and zext is better than sext.
http://rise4fun.com/Alive/rhh
Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7
=>
%r = zext i1 %x to i8
Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7
=>
%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8
Differential Revision: https://reviews.llvm.org/D33879
llvm-svn: 304939
Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent.
Differential Revision: https://reviews.llvm.org/D33567
llvm-svn: 304018
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.
Differential Revision: https://reviews.llvm.org/D32108
llvm-svn: 300856
This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.
This adds an lshrInPlace(const APInt &) version as well.
Differential Revision: https://reviews.llvm.org/D32155
llvm-svn: 300566
This fold already existed for vectors but only when 'C1' was a splat
constant (but 'C2' could be any constant).
There were no tests for any vector constants, so I'm adding a test
that shows non-splat constants for both operands.
llvm-svn: 294650
Although this is 'no-functional-change-intended', I'm adding tests
for shl-shl and lshr-lshr pairs because there is no existing test
coverage for those folds.
It seems like we should be able to remove some code from foldShiftedShift()
at this point because we're handling those patterns on the general path.
llvm-svn: 293814
The original shift is bigger, so this may qualify as 'obvious',
but here's an attempt at an Alive-based proof:
Name: exact
Pre: (C1 u< C2)
%a = shl i8 %x, C1
%b = lshr exact i8 %a, C2
=>
%c = lshr exact i8 %x, C2 - C1
%b = and i8 %c, ((1 << width(C1)) - 1) u>> C2
Optimization is correct!
llvm-svn: 293498
We already have this fold when the lshr has one use, but it doesn't need that
restriction. We may be able to remove some code from foldShiftedShift().
Also, move the similar:
(X << C) >>u C --> X & (-1 >>u C)
...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().
That whole function seems questionable since it is called by commonShiftTransforms(),
but there's really not much in common if we're checking the shift opcodes for every
fold.
llvm-svn: 293215
We may be able to assert that no shl-shl or lshr-lshr pairs ever get here
because we should have already handled those in foldShiftedShift().
llvm-svn: 292726
It's not clear what 'First' and 'Second' mean, so use 'Inner' and 'Outer'
to match foldShiftedShift() and add comments with formulas, so it's easier
to see what's going on.
llvm-svn: 292153
Some of the callers are artificially limiting this transform to integer types;
this should make it easier to incrementally remove that restriction.
llvm-svn: 291620
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
llvm-svn: 289756
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.
llvm-svn: 288005
This patch introduces the combine:
(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive
If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.
Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).
Differential Revision: https://reviews.llvm.org/D26000
llvm-svn: 285696
A ConstantVector can have ConstantExpr operands and vice versa.
However, the folder had no ability to fold ConstantVectors which, in
some cases, was an optimization barrier.
Instead, rephrase the folder in terms of Constants instead of
ConstantExprs and teach callers how to deal with failure.
llvm-svn: 277099
We need just a couple of logic tweaks to consolidate the shl and lshr cases.
This is step 5 of refactoring to solve PR26760:
https://llvm.org/bugs/show_bug.cgi?id=26760
llvm-svn: 265965
This is the straightforward fix for PR26760:
https://llvm.org/bugs/show_bug.cgi?id=26760
But we still need to make some changes to generalize this helper function
and then send the lshr case into here.
llvm-svn: 265960
We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it.
Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.
llvm-svn: 237169
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.
I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.
I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
creating a non-internal header file for the InstCombine pass.
I thought about calling this InstCombiner.h or in some way more clearly
associating it with the InstCombiner clas that it is primarily defining,
but there are several other utility interfaces defined within this for
InstCombine. If, in the course of refactoring, those end up moving
elsewhere or going away, it might make more sense to make this the
combiner's header alone.
Naturally, this is a bikeshed to a certain degree, so feel free to lobby
for a different shade of paint if this name just doesn't suit you.
llvm-svn: 226783
a cache of assumptions for a single function, and an immutable pass that
manages those caches.
The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.
Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.
For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.
llvm-svn: 225131
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.
As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.
The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.
Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.
This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).
llvm-svn: 217342
Refactor code, no functionality change, test case moved from instcombine to instsimplify.
Differential Revision: http://reviews.llvm.org/D4102
llvm-svn: 213231
This patch enables transformations:
BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)
They allow to eliminate extra shuffles in some cases.
Differential Revision: http://reviews.llvm.org/D3525
llvm-svn: 208488
Don't replace shifts greater than the type with the maximum shift.
This isn't hit anywhere in the tests, and somewhere else is replacing
these with undef.
llvm-svn: 207000
definition below all of the header #include lines, lib/Transforms/...
edition.
This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.
Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.
llvm-svn: 206844
header files and into the cpp files.
These files will require more touches as the header files actually use
DEBUG(). Eventually, I'll have to introduce a matched #define and #undef
of DEBUG_TYPE for the header files, but that comes as step N of many to
clean all of this up.
llvm-svn: 206777
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.
llvm-svn: 201827
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131