Commit Graph

127 Commits

Author SHA1 Message Date
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Jessica Paquette 02d4b365bf [GlobalISel] Check if branches use the same MBB in matchOptBrCondByInvertingCond
If the G_BR and G_BRCOND in this combine target the same MBB, the combine will
loop forever. Don't allow that to happen.
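
A minimal sketch of the degenerate shape being rejected (illustrative MIR, not
from the patch):

```
bb.0:
  %cond:_(s1) = G_ICMP intpred(eq), %a, %b
  G_BRCOND %cond(s1), %bb.1
  G_BR %bb.1
; Both branches target %bb.1, so inverting the condition and swapping
; the destinations reproduces the same pattern, and the combine would
; fire again indefinitely.
```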

Differential Revision: https://reviews.llvm.org/D95895
2021-02-02 15:38:48 -08:00
Jessica Paquette cbf5246359 Fix buildbot after cfc6073017
Windows buildbots were not happy with using find_if + instructionsWithoutDebug.

In cfc6073017, instructionsWithoutDebug is not strictly necessary, so just
iterate over the block directly.

http://lab.llvm.org:8011/#/builders/127/builds/4732/steps/7/logs/stdio
2021-01-19 10:38:04 -08:00
Jessica Paquette cfc6073017 [GlobalISel] Combine (a[0]) | (a[1] << k1) | ...| (a[m] << kn) into a wide load
This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`.
(See D27861)

This tries to recognize patterns like below (assuming a little-endian target):

```
s8* a = ...
s32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
->
s32 val = *((s32)a)

s8* a = ...
s32 val = a[3] | (a[2] << 8) | (a[1] << 16) | (a[0] << 24)
->
s32 val = BSWAP(*((s32)a))
```

(This patch handles the big-endian target case as well, in which the first
example above gets a BSWAP and the second does not.)

To recognize the pattern, this searches from the last G_OR in the expression
tree.

E.g.

```
    Reg   Reg
     \    /
      OR_1   Reg
       \    /
        OR_2
          \     Reg
           .. /
          Root
```

Each non-OR register in the tree is put in a list. Each register in the list is
then checked to see if it's defined by an appropriate load + shift.

If every register is a load + potentially a shift, the combine checks if those
loads + shifts, when OR'd together, are equivalent to a wide load (possibly
with a BSWAP).

To simplify things, this patch

(1) Only handles G_ZEXTLOADs (which appear to be the common case)
(2) Only works in a single MachineBasicBlock
(3) Only handles G_SHL as the bit twiddling to stick the small load into a
    specific location

An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from
test/CodeGen/AArch64/load-combine.ll)

At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3,
and a 0.4% improvement for CTMark/7zip-benchmark.

Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was
the first instruction in the block.

Differential Revision: https://reviews.llvm.org/D94350
2021-01-19 10:24:27 -08:00
Matt Arsenault 1f9b6ef91f GlobalISel: Add combine for G_UREM by power of 2
Really I want this in the legalizer, but this is a start.
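
A hedged sketch of the fold (illustrative MIR; names and types assumed):

```
%eight:_(s32) = G_CONSTANT i32 8
%r:_(s32) = G_UREM %x, %eight
; 8 is a power of 2, so the remainder is just the low bits:
%seven:_(s32) = G_CONSTANT i32 7
%r:_(s32) = G_AND %x, %seven
```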
2021-01-07 16:36:35 -05:00
Amara Emerson 7df3544e80 [GlobalISel] Fix assertion failures after "GlobalISel: Return APInt from getConstantVRegVal" landed.
APInt binary ops assert on mismatched bit widths rather than promoting them,
behavior that one combine was relying on.
2020-12-26 23:51:44 -08:00
Matt Arsenault 581d13f8ae GlobalISel: Return APInt from getConstantVRegVal
Returning int64_t was arbitrarily limiting for wide integer types, and
the functions should handle the full generality of the IR.

Also changes the full form which returns the originally defined
vreg. Add another wrapper for the common case of just immediately
converting to int64_t (arguably this would be useful for the full
return value case as well).

One possible issue with this change is that some of the existing uses broke
without conversion to getConstantVRegSExtVal, and it's possible that some
without adequate test coverage are now broken.
2020-12-22 22:23:58 -05:00
Jessica Paquette b184a2eccf [GlobalISel] Add matchers for specific constants and a matcher for negations
It's fairly common to need matchers for a specific constant value, or for
common idioms like finding a negated register.

Add

- `m_SpecificICst`, which returns true when matching a specific value.
- `m_ZeroInt`, which returns true when an integer 0 is matched.
- `m_Neg`, which returns true when a register is negated.

Also update a few places which use idioms related to the new matchers.

Differential Revision: https://reviews.llvm.org/D91397
2020-11-13 09:24:54 -08:00
Mirko Brkusanin a75d6178b8 [GlobalISel] Add combine for (x | mask) -> x when (x | mask) == x
If we have a mask, and a value x, where (x | mask) == x, we can drop the OR
and just use x.
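
For example (an illustrative sketch, not taken from the patch):

```
%twelve:_(s32) = G_CONSTANT i32 12
%x:_(s32) = G_OR %y, %twelve        ; bits 2 and 3 of %x are known set
%four:_(s32) = G_CONSTANT i32 4
%z:_(s32) = G_OR %x, %four          ; (x | 4) == x, so %z can just be %x
```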

Differential Revision: https://reviews.llvm.org/D90952
2020-11-10 11:32:13 +01:00
Mirko Brkusanin fb36ab0a42 [GlobalISel] Expand combine for (x & mask) -> x when (x & mask) == x
We can use KnownBitsAnalysis to cover cases when the mask is not trivial. It
can also help with cases when the mask is not constant but can still be folded
into one. Since 'and' is commutative we should treat both operands as possible
replacements.
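
A hedged sketch of a non-trivial mask that KnownBitsAnalysis can see through
(illustrative MIR):

```
%seven:_(s32) = G_CONSTANT i32 7
%x:_(s32) = G_AND %y, %seven        ; bits 3 and up of %x are known zero
%fifteen:_(s32) = G_CONSTANT i32 15
%z:_(s32) = G_AND %x, %fifteen      ; (x & 15) == x, so %z can just be %x
```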

Differential Revision: https://reviews.llvm.org/D90674
2020-11-10 11:32:13 +01:00
Mirko Brkusanin 53ae95c946 [AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands
This sequence of instructions can be simplified if the instructions are single
use and some operands are constants. Additional combines may be applied
afterwards, as in the sketch below.
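
Roughly, the shape being targeted (a sketch of the targeted pattern; the exact
matching constraints are in the patch):

```
%s1:_(s32) = G_SHL %x, %c0          ; %c0, %c1 are G_CONSTANTs
%l:_(s32) = G_AND %s1, %y
%s2:_(s32) = G_SHL %l, %c1
; with %s1 and %l single use, this can become:
%xs:_(s32) = G_SHL %x, %c0_plus_c1  ; hypothetical combined constant
%ys:_(s32) = G_SHL %y, %c1
%s2:_(s32) = G_AND %xs, %ys
```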

Differential Revision: https://reviews.llvm.org/D90223
2020-11-10 11:32:13 +01:00
Mirko Brkusanin de719586a8 [AMDGPU][GlobalISel] Fold a chain of two shift instructions with constant operands
A sequence of identical shift instructions with constant operands can be
combined into a single shift instruction.
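
For example (illustrative MIR; assumes the summed shift amount stays in range):

```
%a:_(s32) = G_SHL %x, %c1
%b:_(s32) = G_SHL %a, %c2
; with %c1 and %c2 constant, folds to:
%b:_(s32) = G_SHL %x, %c1_plus_c2   ; hypothetical combined constant
```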

Differential Revision: https://reviews.llvm.org/D90217
2020-11-10 11:32:12 +01:00
Aditya Nandakumar bed8394047 [GISel]: Few InsertVecElt combines
https://reviews.llvm.org/D88060

This adds the following combines
1) build_vector formation from insert_vec_elts
2) insert_vec_elts (build_vector) -> build_vector
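
A hedged sketch of combine 1) (illustrative MIR; indices are G_CONSTANTs):

```
%undef:_(<2 x s32>) = G_IMPLICIT_DEF
%zero:_(s64) = G_CONSTANT i64 0
%one:_(s64) = G_CONSTANT i64 1
%v1:_(<2 x s32>) = G_INSERT_VECTOR_ELT %undef, %a(s32), %zero
%v2:_(<2 x s32>) = G_INSERT_VECTOR_ELT %v1, %b(s32), %one
; once every lane is written, this can become:
%v2:_(<2 x s32>) = G_BUILD_VECTOR %a(s32), %b(s32)
```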
2020-10-28 12:27:07 -07:00
Aditya Nandakumar ef3d17482f [GISel] Add combine for constant G_PTR_ADD offsets.
https://reviews.llvm.org/D88865

This adds a single combine for GlobalISel to fold:

ptradd (inttoptr C1) C2
Into:
C1 + C2

Additionally, a small test for AArch64 is added.

Patch by pnappa.
2020-10-13 17:26:12 -07:00
Mirko Brkusanin 52ba4fa6aa [GlobalISel] Avoid making G_PTR_ADD with nullptr
When the first operand is a null pointer, we can avoid making a G_PTR_ADD and
instead make a G_INTTOPTR with the offset operand. This helps targets such as
AMDGPU avoid making an add with 0 later on.
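
A sketch of the shape (illustrative MIR):

```
%null:_(p0) = G_CONSTANT i64 0
%ptr:_(p0) = G_PTR_ADD %null, %off(s64)
; becomes:
%ptr:_(p0) = G_INTTOPTR %off(s64)
```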

Differential Revision: https://reviews.llvm.org/D87140
2020-10-13 13:02:55 +02:00
Jessica Paquette a52e78012a [GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:

```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```

Produce this:

```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```

as long as we are guaranteed to eliminate the original G_AND.

Also matches all commuted forms. E.g.

```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```

will be matched as well.

Differential Revision: https://reviews.llvm.org/D88104
2020-09-28 10:08:14 -07:00
Matt Arsenault c463fd136e GlobalISel: Fix truncating shift amount in trunc (shl) combine
The shift amount type does not necessarily match the result type. This
was inserting a trunc from s32 to s32, which asserted. Just preserve
the original shift amount type which can be legalized later.
2020-09-23 09:07:50 -04:00
Michael Kitzan c4e589b795 [GISel] Add new combines for unary FP instrs with constant operand
https://reviews.llvm.org/D86393

Patch adds five new `GICombinerRules`, one for each of the following unary
FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The
combine rules perform the FP operation on the constant operand and replace
the original instr with the result. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
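
For instance, for G_FSQRT (an illustrative sketch):

```
%c:_(s32) = G_FCONSTANT float 4.0
%r:_(s32) = G_FSQRT %c
; constant-folds to:
%r:_(s32) = G_FCONSTANT float 2.0
```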
2020-09-16 10:34:15 -07:00
Volkan Keles 79378b1b75 GlobalISel: Fix a failing combiner test
test/CodeGen/AArch64/GlobalISel/combine-trunc.mir was failing
due to the different order for evaluating function arguments.
This patch updates the related code to fix the issue.
2020-09-15 16:40:38 -07:00
Aditya Nandakumar 97203cfd6b [GISel] Add new GISel combiners for G_MUL
https://reviews.llvm.org/D87668

Patch adds two new GICombinerRules, one for G_MUL(X, 1) and another for G_MUL(X, -1).
G_MUL(X, 1) is an identity combine, and G_MUL(X, -1) gets replaced with G_SUB(0, X).
Patch additionally adds new combiner tests for the AArch64 target to test these
new combiner rules, as well as updates AMDGPU GISel tests.
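
Sketches of the two folds (illustrative MIR):

```
%one:_(s32) = G_CONSTANT i32 1
%a:_(s32) = G_MUL %x, %one          ; -> %a is just %x
%negone:_(s32) = G_CONSTANT i32 -1
%b:_(s32) = G_MUL %x, %negone
; the second becomes:
%zero:_(s32) = G_CONSTANT i32 0
%b:_(s32) = G_SUB %zero, %x
```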

Patch by mkitzan
2020-09-15 16:08:47 -07:00
Volkan Keles a4e35cc2ec GlobalISel: Add combines for G_TRUNC
https://reviews.llvm.org/D87050
2020-09-15 15:50:34 -07:00
Quentin Colombet b3afad0463 [GlobalISel] Add a `X, Y = G_UNMERGE(G_ZEXT Z)` -> X = G_ZEXT Z; Y = 0 combine
Add a combiner helper to transform unmerge of zext into one zext and
a constant 0
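
A hedged sketch, assuming the zext source fits in the first destination and
the first destination carries the low bits (illustrative MIR):

```
%z:_(s64) = G_ZEXT %w(s16)
%x:_(s32), %y:_(s32) = G_UNMERGE_VALUES %z(s64)
; becomes:
%x:_(s32) = G_ZEXT %w(s16)
%y:_(s32) = G_CONSTANT i32 0
```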

Differential Revision: https://reviews.llvm.org/D87427
2020-09-14 17:27:23 -07:00
Quentin Colombet d2321129bd [GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z
Add a combiner helper that replaces a G_UNMERGE, where all the destination
lanes except the first are dead, with a G_TRUNC.
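
A sketch (illustrative MIR; assumes the first destination carries the low
bits):

```
%lo:_(s32), %hi:_(s32) = G_UNMERGE_VALUES %z(s64)
; if %hi has no uses:
%lo:_(s32) = G_TRUNC %z(s64)
```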

Differential Revision: https://reviews.llvm.org/D87174
2020-09-14 17:27:23 -07:00
Quentin Colombet a36278c2f8 [GlobalISel] Add G_UNMERGE(Cst) -> Cst1, Cst2, ... combine
Add a combiner helper that replaces a G_UNMERGE of a big constant with direct
uses of smaller constants.
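
For example (an illustrative sketch; assumes the first destination takes the
low bits):

```
%cst:_(s64) = G_CONSTANT i64 4294967298   ; 0x0000000100000002
%lo:_(s32), %hi:_(s32) = G_UNMERGE_VALUES %cst(s64)
; becomes:
%lo:_(s32) = G_CONSTANT i32 2
%hi:_(s32) = G_CONSTANT i32 1
```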

Differential Revision: https://reviews.llvm.org/D87166
2020-09-14 16:30:18 -07:00
Aditya Nandakumar 46f9137e43 [GISel]: Add combine for G_FABS to G_FABS
https://reviews.llvm.org/D87554

Patch adds one new GICombinerRule for G_FABS. The combine rule folds G_FABS(G_FABS(X)) to G_FABS(X).
Patch additionally adds new combiner tests for the AArch64 target to test this new combiner rule.

Patch by mkitzan.
2020-09-14 15:56:24 -07:00
Quentin Colombet 670c276232 [GlobalISel] Add G_UNMERGE_VALUES(G_MERGE_VALUES) combine
Add the matching and applying function to the combiner helper for
G_UNMERGE_VALUES(G_MERGE_VALUES).

This combine also supports any merge-like input nodes, like G_BUILD_VECTOR,
and is robust against bitcasts between the unmerge and merge nodes.

When the input type of the merge node and the output type of the unmerge
node are not the same, but the sizes are, the combine still applies but
creates bitcasts between the sources and the destinations instead of
reusing the destinations directly.

Long term, the artifact combiner should probably reuse that helper, but
as of today, it doesn't use any outside helper, so I kept it this way.

Differential Revision: https://reviews.llvm.org/D87117
2020-09-14 15:45:06 -07:00
Volkan Keles d4bf90271f GlobalISel: Combine fneg(fneg x) to x
https://reviews.llvm.org/D87473
2020-09-10 12:57:38 -07:00
Amara Emerson cc76da7ada [GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less.
This combine previously tried to take sequences like:
  %cond = G_ICMP pred, a, b
  G_BRCOND %cond, %truebb
  G_BR %falsebb
%truebb:
  ...
%falsebb:
  ...

and by inverting the compare predicate and swapping branch targets, delete the
G_BR and instead have a single conditional branch to the falsebb. Since in an
earlier patch we have a combine to fold not(icmp) into just an inverted icmp,
we don't need this combine to do as much. This patch instead generalizes the
combine by just looking for:
  G_BRCOND %cond, %truebb
  G_BR %falsebb
%truebb:
  ...
%falsebb:
  ...

and then inverting the condition using a not (xor). The xor can be folded away
in a separate combine. This change also lets us avoid some optimization code
in the IRTranslator.

I also think that deleting G_BRs in the combiner is unnecessary. That's
something that targets can decide to do at selection time and could simplify
generic code in future.

Differential Revision: https://reviews.llvm.org/D86664
2020-09-09 13:08:16 -07:00
Volkan Keles 1242dd330d GlobalISel: Combine `op undef, x` to 0
https://reviews.llvm.org/D86611
2020-09-08 09:46:38 -07:00
Jay Foad 713c2ad60c [GlobalISel] Extend not_cmp_fold to work on conditional expressions
Differential Revision: https://reviews.llvm.org/D86709
2020-09-07 09:31:08 +01:00
Amara Emerson d0abc75749 [GlobalISel] Disable the indexed loads combine completely unless forced. NFC.
The post-index matcher, before it queries the target legality, walks the uses
of some instructions, which in pathological cases can be massive. Since
no targets actually support indexed loads yet, disable this to stop wasting
compile time on something which is going to fail anyway.
2020-09-05 21:04:03 -07:00
Amara Emerson 520ab710fb Revert "Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")"
This reverts commit 8693ddc743.

Re-committing with the test requiring asserts.
2020-09-01 14:29:04 -07:00
Jordan Rupprecht 8693ddc743 Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")
This reverts commit 8ad8f484b6. It causes crashes when running `ninja check-llvm-codegen-aarch64-globalisel`, e.g.
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132/steps/test-stage1-compiler/logs/stdio.
Note that the crash does not seem to reproduce in debug builds.

5ded444252 depends on this, so revert that too.
2020-09-01 13:31:57 -07:00
Amara Emerson 8ad8f484b6 [GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)
This is needed for an upcoming change to how we translate conditional branches
which might generate these.
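
A sketch of the fold (illustrative MIR):

```
%c:_(s1) = G_ICMP intpred(eq), %a, %b
%one:_(s1) = G_CONSTANT i1 1
%n:_(s1) = G_XOR %c, %one
; becomes:
%n:_(s1) = G_ICMP intpred(ne), %a, %b
```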

Differential Revision: https://reviews.llvm.org/D86383
2020-09-01 10:57:17 -07:00
Volkan Keles 061182b7ba GlobalISel: Add combines for extend operations
https://reviews.llvm.org/D86516
2020-09-01 08:50:06 -07:00
Matt Arsenault 1b201914b5 GlobalISel: Combine out redundant sext_inreg
The scalar tests don't work yet, since computeNumSignBits apparently
doesn't handle sextload yet, and sext folds into the load first.
2020-08-28 17:57:31 -04:00
Aditya Nandakumar db464a3dbf [GISel] Add new GISel combiners for G_SELECT
https://reviews.llvm.org/D83833

Patch adds two new GICombinerRules for G_SELECT. The rules include: combining
selects with undef comparisons into their first selectee value, and combining
away selects with constant comparisons. Patch additionally adds a new combiner
test for the AArch64 target to test these new G_SELECT combiner rules and the
existing select_same_val combiner rule.

Patch by mkitzan
2020-08-27 09:40:15 -07:00
Matt Arsenault 0b7f6cc71a GlobalISel: Add generic instructions for memory intrinsics
AArch64, X86 and Mips currently consume these directly and custom lower them
to produce a libcall, but really these should follow the normal legalization
process through the libcall/lower action.
2020-08-26 20:08:45 -04:00
Matt Arsenault eb074088c9 GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD
This produces less work for addressing mode matching. I think this is
safe since I don't think machine IR is supposed to give the same
aliasing properties as getelementptr in the IR.
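
A sketch of the shape (illustrative MIR):

```
%i:_(s64) = G_PTRTOINT %p(p0)
%sum:_(s64) = G_ADD %i, %off
; becomes:
%q:_(p0) = G_PTR_ADD %p, %off(s64)
%sum:_(s64) = G_PTRTOINT %q(p0)
```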
2020-08-26 08:57:15 -04:00
Matt Arsenault e1644a3779 GlobalISel: Reduce G_SHL width if source is extension
shl ([sza]ext x, y) => zext (shl x, y).

Turns expensive 64-bit shifts into 32-bit shifts if the shift does not
overflow the source type.
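
A sketch of the fold (illustrative MIR; assumes the shift amount is known not
to overflow the narrow type):

```
%e:_(s64) = G_ZEXT %x(s32)
%r:_(s64) = G_SHL %e, %amt
; becomes:
%t:_(s32) = G_SHL %x, %amt
%r:_(s64) = G_ZEXT %t(s32)
```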

This is a port of an AMDGPU DAG combine added in
5fa289f0d8. InstCombine does this
already, but we need to do it again here to apply it to shifts
introduced for lowered getelementptrs. This will help matching
addressing modes that use 32-bit offsets in a future patch.

TableGen annoyingly assumes only a single match data operand, so
introduce a reusable struct. However, this still requires defining a
separate GIMatchData for every combine, which is still annoying.

Adds a morally equivalent function to the existing getShiftAmountTy. Without
this, we would have to repeatedly query the legalizer info and guess at what
type to use for the shift.
2020-08-24 09:42:40 -04:00
Jessica Paquette d25b12bdc3 [GlobalISel] Add combine for (x & mask) -> x when (x & mask) == x
If we have a mask, and a value x, where (x & mask) == x, we can drop the AND
and just use x.

This is about a 0.4% geomean code size improvement on CTMark at -O3 for AArch64.

In AArch64, this is most useful post-legalization. Patterns like this often
show up when legalizing s1s, which must be extended to larger types.

e.g.

```
%cmp:_(s32) = G_ICMP ...
%and:_(s32) = G_AND %cmp, 1
```

Since G_ICMP only produces a single bit, there's no reason to mask it with the
G_AND.

Differential Revision: https://reviews.llvm.org/D85463
2020-08-19 10:20:57 -07:00
Amara Emerson ed35344524 Use std::make_tuple instead of initializer lists to make a bot happy:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux
2020-08-18 14:55:52 -07:00
Amara Emerson 04a6ea5d77 [GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x
This is restricted to single-use loads; if we fold them to sextloads, we can
find more optimal addressing modes on AArch64.
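
A sketch of the fold (illustrative MIR; memory operand syntax abbreviated):

```
%v:_(s32) = G_LOAD %p(p0) :: (load 1)
%s:_(s32) = G_SEXT_INREG %v, 8
; with %v single use, becomes:
%s:_(s32) = G_SEXTLOAD %p(p0) :: (load 1)
```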

This also fixes an overload of the MachineFunction::getMachineMemOperand()
method, which was incorrectly using the MF alignment instead of the MMO
alignment.

Differential Revision: https://reviews.llvm.org/D85966
2020-08-18 10:42:15 -07:00
Amara Emerson 40e269ea6d [GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c'
By detecting this sign extend pattern early, we can uncover opportunities for
more optimizations.
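
For example, with c = 24 on s32 (an illustrative sketch):

```
%c24:_(s32) = G_CONSTANT i32 24
%sh:_(s32) = G_SHL %x, %c24
%a:_(s32) = G_ASHR %sh, %c24
; becomes:
%a:_(s32) = G_SEXT_INREG %x, 8
```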

Differential Revision: https://reviews.llvm.org/D85965
2020-08-18 10:42:15 -07:00
Jessica Paquette bebe6a6449 [GlobalISel] Combine (logic_op (op x...), (op y...)) -> (op (logic_op x, y))
This implements

```
(logic_op (op x...), (op y...)) -> (op (logic_op x, y))
```

when `op` is an extend, a shift, or an and.
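
For example, with `op` = G_ZEXT and `logic_op` = G_AND (an illustrative
sketch):

```
%ex:_(s32) = G_ZEXT %a(s8)
%ey:_(s32) = G_ZEXT %b(s8)
%r:_(s32) = G_AND %ex, %ey
; becomes:
%n:_(s8) = G_AND %a, %b
%r:_(s32) = G_ZEXT %n(s8)
```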

This is similar to `DAGCombiner::hoistLogicOpWithSameOpcodeHands`
(with a bunch of missing cases, e.g. G_TRUNC, G_BITCAST, etc.)

This is implemented so it works both pre- and post-legalization.

This also adds a general way to add a series of instructions in a combine.
(`applyBuildInstructionSteps`).

Differential Revision: https://reviews.llvm.org/D85050
2020-08-11 10:40:06 -07:00
Aditya Nandakumar 2144a3bdbb [GISel] Add combiners for G_INTTOPTR and G_PTRTOINT
https://reviews.llvm.org/D84909

Patch adds two new GICombinerRules, one for G_INTTOPTR and one for
G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if
the cast is within the same address space. The G_PTRTOINT elides
int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
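
A sketch of one direction, a pointer round-tripped through an integer
(illustrative MIR; assumes the address space matches):

```
%i:_(s64) = G_PTRTOINT %p(p0)
%q:_(p0) = G_INTTOPTR %i(s64)
; same address space, so:
%q:_(p0) = COPY %p(p0)
```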

Patch by mkitzan
2020-07-31 10:13:36 -07:00
Amara Emerson 645e7fc542 [GlobalISel] Use existing MIR builder instead of creating one in combiner. 2020-07-23 14:16:45 -07:00
Amara Emerson 3b10e42ba1 [AArch64][GlobalISel] Add post-legalize combine for sext(trunc(sextload)) -> trunc/copy
On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this.

Differential Revision: https://reviews.llvm.org/D81993
2020-07-23 12:06:35 -07:00
Amara Emerson 791544422a Revert "[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy"
This reverts commit 64eb3a4915.

It caused miscompiles with optimizations enabled. Reverting while I investigate.
2020-07-21 16:01:18 -07:00
Amara Emerson 64eb3a4915 [AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy
On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this.

Differential Revision: https://reviews.llvm.org/D81993
2020-07-13 20:27:45 -07:00