Commit Graph

128 Commits

Author SHA1 Message Date
Amara Emerson 95ac3d15e9 [AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization.
For some reductions like G_VECREDUCE_OR on AArch64, we need to scalarize
completely if the source is <= 64b. This change adds support for that in
the legalizer. If the source has a pow-2 num elements, then we can do
a tree reduction using the scalar operation in the individual elements.
Otherwise, we just create a sequential chain of operations.

For AArch64, we only need to scalarize if the input is <64b. If it's great than
64b then we can first do a fewElements step to 64b, taking advantage of vector
instructions until we reach the point of scalarization.

I also had to relax the verifier checks for reductions because the intrinsics
support <1 x EltTy> types, which we lower to scalars for GlobalISel.

Differential Revision: https://reviews.llvm.org/D108276
2021-08-19 16:38:52 -07:00
Jessica Paquette c22b64ef66 [AArch64][GlobalISel] Don't allow s128 for G_ISNAN
getAPFloatFromSize doesn't support s128, so we can't lower this without
asserting right now.

To fix the buildbots, don't allow any scalars other than s16, s32, and s64.
2021-08-18 13:59:00 -07:00
Jessica Paquette 45e1a6bd25 [AArch64][GlobalISel] Legalize scalar G_FMINNUM + G_FMAXNUM
For subtargets with full FP16, this is legal for s16, s32, and s64. Without
full FP16, it's legal for s32 and s64.

For s128, this is a libcall.

We also support some vector types, but for now, let's just support scalars.

Differential Revision: https://reviews.llvm.org/D108259
2021-08-18 13:30:03 -07:00
Jessica Paquette 791006fb8c [GlobalISel] Implement lowering for G_ISNAN + use it in AArch64
GlobalISel equivalent to `TargetLowering::expandISNAN`.

Use it in AArch64 and add a testcase.

Differential Revision: https://reviews.llvm.org/D108227
2021-08-18 10:54:25 -07:00
Jessica Paquette ccfc079047 [AArch64][GlobalISel] Legalize scalar G_SSUBSAT + G_SADDSAT
These are lowered, matching SDAG behaviour. (See
llvm/test/CodeGen/AArch64/ssub_sat.ll and llvm/test/CodeGen/AArch64/sadd_sat.ll)

These fall back ~159 times on a build of clang with GISel enabled.

Differential Revision: https://reviews.llvm.org/D107777
2021-08-13 09:02:25 -07:00
Amara Emerson 73056f239e [AArch64][GlobalISel] Simplify/nuke the merge/unmerge legalizer rules.
These rules were originally written when the new predicate based legalizer
was introduced in an attempt to preserve existing behaviour. It wasn't
properly kept up to date as things like vector support was split out into
G_CONCAT_VECTORS, and frankly, even if it was, it was too complex.

It's much easier to start from scratch with what we can actually support,
which is just a few type combinations. Anything illegal we should either
legalize, or should be eliminated as a side effect of artifact combination.

Differential Revision: https://reviews.llvm.org/D107937
2021-08-11 16:45:23 -07:00
Tim Northover 5ad0860899 AArch64: support @llvm.va_copy in GISel 2021-08-10 13:11:03 +01:00
Jessica Paquette e6a3944ea9 [AArch64][GlobalISel] Overhaul G_INSERT legalization
Similar cleanup to G_EXTRACT (51bd4e874f).

Also swap the order of clamp/widen to avoid unnecessary complex merges.

Add a bunch of missing testcases to legalize-inserts while we're at it.

Differential Revision: https://reviews.llvm.org/D107601
2021-08-05 18:28:22 -07:00
Jessica Paquette 562c8e14d9 [AArch64][GlobalISel] Widen G_IMPLICIT_DEF and G_FREEZE before clamping
Similar to other cleanup commits which widen instructions before clamping
during legalization. Purpose of this is to avoid weird type breakdowns.

In terms of G_IMPLICIT_DEF, this simplifies legalization for other instructions.
The legalizer has to emit G_IMPLICIT_DEF to legalize certain instructions, so
this can help with emitting merges elsewhere.

Differential Revision: https://reviews.llvm.org/D107604
2021-08-05 18:21:14 -07:00
Jessica Paquette 8a557d8311 [AArch64][GlobalISel] Widen extloads before clamping during legalization
Allows us to avoid awkward type breakdowns on types like s88, like the other
commits.

Differential Revision: https://reviews.llvm.org/D107587
2021-08-05 16:14:06 -07:00
Jessica Paquette 36498374d4 [AArch64][GlobalISel] Widen G_BSWAP before clamping
This allows us to avoid odd type breakdowns + allows us to legalize types like
s88 in the first place.

Add some testcases for known legal types + testcases for s4 and s88.

Differential Revision: https://reviews.llvm.org/D107607
2021-08-05 15:16:00 -07:00
Jessica Paquette 51bd4e874f [AArch64][GlobalISel] Overhaul G_EXTRACT legalization
This simplifies our existing G_EXTRACT rules and adds some test coverage. Mostly
changing this because it should make it easier to improve legalization for
instructions which use G_EXTRACT as part of the legalization process.

This also adds support for legalizing some weird types. Similar to other recent
legalizer changes, this changes the order of widening/clamping.

There was some dead code in our existing rules (e.g. the p0 case would never get
hit), so this knocks those out and makes the types we want to handle explicit.

This also removes some checks which, nowadays, are handled by the
MachineVerifier.

Differential Revision: https://reviews.llvm.org/D107505
2021-08-05 13:55:15 -07:00
Jon Roelofs 98f38c151b [AArch64][GlobalISel] Legalize ctpop s128
This is re-landing the same patch again, but without the changes to
LegalizerHelper that regressed the Mips test:

test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll

Differential revision: https://reviews.llvm.org/D106494
2021-08-05 11:54:53 -07:00
Jessica Paquette f3f3098afe [AArch64][GlobalISel] Mark v16s8 <- v8s8, v8s8 G_CONCAT_VECTOR as legal
G_CONCAT_VECTORS shows up from time to time when legalizing other instructions.

We actually import patterns for the v16s8 <- v8s8, v8s8 case so marking it
as legal gives us selection for free.

Differential Revision: https://reviews.llvm.org/D107512
2021-08-05 09:40:46 -07:00
Jessica Paquette ca2e053652 [AArch64][GlobalISel] Legalize wide vector G_PHIs
Clamp the max number of elements when legalizing G_PHI. This allows us to
legalize some common fallbacks like 4 x s64.

Here's an example: https://godbolt.org/z/6YocsEYTd

Had to add -global-isel-abort=0 to legalize-phi.mir to account for the
G_EXTRACT_VECTOR_ELT from the 32 x s8 G_PHI.

Differential Revision: https://reviews.llvm.org/D107508
2021-08-04 16:48:59 -07:00
Jessica Paquette d9279843b1 [AArch64][GlobalISel] Widen G_PHI before clamping it during legalization
This allows us to handle weird types like s88; we first widen to s128, then
clamp back down to s64.

https://godbolt.org/z/9xqbP46Mz

Also this makes it possible for GISel to legalize the case in pr48188.ll. It
now does the same thing as SDAG, although regalloc chooses different registers.

Differential Revision: https://reviews.llvm.org/D107417
2021-08-04 10:25:14 -07:00
Jessica Paquette 7d97de60b3 [AArch64][GlobalISel] Widen G_FPTO*I before clamping
Going through our legalization rules and doing some cleanup.

Widening and then clamping is usually easier than clamping and then widening.

This allows us to legalize some weird types like s88.

Differential Revision: https://reviews.llvm.org/D107413
2021-08-04 10:19:26 -07:00
Jessica Paquette 5643736378 [AArch64][GlobalISel] Widen G_SELECT before clamping it
This allows us to handle the s88 G_SELECTS:

https://godbolt.org/z/5s18M4erY

Weird types like this can result in weird merges.

Widening to s128 first and then clamping down avoids that situation.

Differential Revision: https://reviews.llvm.org/D107415
2021-08-03 18:31:17 -07:00
Matt Arsenault ebc17a0d68 GlobalISel: Scalarize unaligned vector stores
This has the same problems and limitations as the load path.
2021-07-31 10:37:15 -04:00
Amara Emerson da61ab8475 [AArch64][GlobalISel] More widenToNextPow2 changes, this time for arithmetic/bitwise ops. 2021-07-29 03:02:29 -07:00
Jessica Paquette 5a333dc5da [AArch64][GlobalISel] Improve legalization for odd-type G_LOAD
Swap the order of widening so that we widen to the next power-of-2 first when
legalizing G_LOAD.

Also, provide a minimum type for the power of 2 to disallow s2 + s1. Clamping
ought to disallow s2 and s1, but I think it's better to be explicit about the
expected minimum size.

We probably need a similar change for G_STORE, but it seems to be a bit more
finnicky. So, let's just handle G_LOAD for now.

Differential Revision: https://reviews.llvm.org/D107013
2021-07-28 17:19:14 -07:00
Jessica Paquette c0a41c3d3b [AArch64][GlobalISel] Improve legalization for odd-sized G_ICMP/G_CONSTANT
We were handing types like s88 like

1) clamp to the range
2) widen to the next power of 2

This isn't desirable because it causes an odd breakdown for types like s88.
If we widen to the next power of 2 (s128) first, then we get a clean breakdown
when we clamp back to s64.

Differential Revision: https://reviews.llvm.org/D106998
2021-07-28 15:31:33 -07:00
Jon Roelofs f2e8e46d78 Revert "[AArch64][GlobalISel] Legalize ctpop s128"
This reverts commit 97e95fea53.

It broke test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll. Not sure why I didn't see that.
2021-07-26 17:06:43 -07:00
Jon Roelofs 97e95fea53 [AArch64][GlobalISel] Legalize ctpop s128
Differential revision: https://reviews.llvm.org/D106494
2021-07-26 16:33:50 -07:00
Amara Emerson acbc0c5f0e [AArch64][GlobalISel] Widen non-pow-2 types for shifts before clamping.
For types like s96, we don't want to clamp to s64, we want to first widen to
s128 and then narrow it. Otherwise we end up with impossible to legalize types.
2021-07-24 15:50:43 -07:00
Jessica Paquette d0af732bd0 [AArch64][GlobalISel] Widen s2 and s4 G_IMPLICIT_DEF + G_FREEZE
These had

```
.clampScalar(0, s1, 64)
.widenScalarToNextPow2(0, 8)
```

If you have s2 or s4, then `widenScalarToNextPow2` does nothing.

This changes the `widenScalarToNextPow2` rule to use s8 as the minimum type
instead, allowing us to correctly widen s2 and s4.

This does not impact s1, since it's marked as legal already.

Differential Revision: https://reviews.llvm.org/D106413
2021-07-21 12:59:20 -07:00
Tim Northover 291e0daa6e AArch64: support 8 & 16-bit atomic operations in GlobalISel
We have SelectionDAG patterns for 8 & 16-bit atomic operations, but they
assume the value types will have been legalized to 32-bits. So this adds
the ability to widen them to both AArch64 & generic GISel
infrastructure.
2021-07-21 09:35:14 +01:00
Jon Roelofs 75187aa352 [AArch64][GlobalISel] Legalize ctpop for v2s64, v2s32, v4s32, v4s16, v8s16
https://llvm.godbolt.org/z/nTTK6M5qe

Differential revision: https://reviews.llvm.org/D106388
2021-07-20 15:37:56 -07:00
Eli Friedman 843c614058 [AArch64] Fix i128 cmpxchg using ldxp/stxp.
Basically two parts to this fix:

1. Stop using AtomicExpand to expand cmpxchg i128
2. Fix AArch64ExpandPseudoInsts to use a correct expansion.

From ARM architecture reference:

To atomically load two 64-bit quantities, perform a Load-Exclusive
pair/Store-Exclusive pair sequence of reading and writing the same value
for which the Store-Exclusive pair succeeds, and use the read values
from the Load-Exclusive pair.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51102

Differential Revision: https://reviews.llvm.org/D106039
2021-07-20 12:38:12 -07:00
Matt Arsenault 30fa074c0a AArch64/GlobalISel: Preserve memory types 2021-07-19 20:21:05 -04:00
Jon Roelofs 5cd63e9ec2 [AArch64][GlobalISel] Legalize bswap <2 x i16>
Differential revision: https://reviews.llvm.org/D105935
2021-07-17 15:31:15 -07:00
Jessica Paquette 46c8e7122b [AArch64][GlobalISel] Clamp <n x p0> vecs when legalizing G_EXTRACT_VECTOR_ELT
This case was missing from G_EXTRACT_VECTOR_ELT. It's the same as for s64.

https://godbolt.org/z/Tnq4acY8z

Differential Revision: https://reviews.llvm.org/D105952
2021-07-15 14:05:28 -07:00
Irina Dobrescu 831ee6b0c3 [AArch64][GlobalISel] Optimise lowering for some vector types for min/max
Differential Revision: https://reviews.llvm.org/D105696
2021-07-15 11:34:32 +01:00
Jessica Paquette 5bd7cc4f42 [AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal
Allow

```
%x:_<2 x p0> = G_INTTOPTR %y:_<2 x s64>
```

This shows up when building clang for AArch64 with GlobalISel.

Also show that we can select it.

This should match SDAG's behaviour: https://godbolt.org/z/33oqYoaYv

Differential Revision: https://reviews.llvm.org/D105944
2021-07-13 17:28:14 -07:00
Jon Roelofs eba638dbbb [AArch64][GlobalISel] Legalize load <2 x i16>
Differential revision: https://reviews.llvm.org/D105913
2021-07-13 11:12:05 -07:00
Jon Roelofs 43c7ca8e49 [AArch64][GlobalISel] Legalize store <2 x i16>
Differential revision: https://reviews.llvm.org/D105912
2021-07-13 11:12:05 -07:00
Amara Emerson 97c426394a [AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR.
Differential Revision: https://reviews.llvm.org/D103301
2021-07-10 00:25:26 -07:00
Amara Emerson 58a2cb5143 [GlobalISel] Add a new artifact combiner for unmerge which looks through general artifact expressions.
The original motivation for this was to implement moreElementsVector of shuffles
on AArch64, which resulted in complex sequences of artifacts like unmerge(unmerge(concat...))
which the combiner couldn't handle. It seemed here that the better option,
instead of writing ever-more-complex combines, was to have a way to find
the original "non-artifact" source registers for a given definition, walking
through arbitrary expressions of unmerge/concat/insert. As long as the bits
aren't extended or truncated, this is a pretty simple algorithm that avoids
the need for lots of combines and instead jumps straight to the final result
we want.

I've only used this new technique in 2 places within tryCombineUnmerge, using it
in more general situations resulted in infinite loops in AMDGPU. So for now
it's used when we would otherwise fail to combine and that seems to work.

In order to support looking through G_INSERTs, I also had to add it as an
artifact in isArtifact(), which caused a whole lot of issues in tests. AMDGPU
started infinite looping since full legalization of G_INSERT doensn't seem to
be there. To work around this, I've temporarily added a CLI option to use the
old behaviour so that the MIR tests will still run and terminate.

Other minor changes include no longer making >128b G_MERGE/UNMERGE legal.
We never had isel support for that anyway and it was a remnant of the legacy
legalizer rules. However being legal prevented the combiner from checking if it
was dead and deleting them.

Differential Revision: https://reviews.llvm.org/D104355
2021-07-09 22:35:00 -07:00
Irina Dobrescu 5888a194c1 [AArch64][GlobalISel] Lower vector types for min/max
Differential Revision: https://reviews.llvm.org/D105433
2021-07-07 15:34:03 +01:00
Irina Dobrescu 71d5b0a757 [AArch64][GlobalISel]Legalise some vector types for min/max
Differential Revision: https://reviews.llvm.org/D105200
2021-07-01 16:29:38 +01:00
Matt Arsenault 28f2f66200 GlobalISel: Use LLT in memory legality queries
This enables proper lowering of non-byte sized loads. We still aren't
faithfully preserving memory types everywhere, so the legality checks
still only consider the size.
2021-06-30 17:44:13 -04:00
Matt Arsenault 990278d026 CodeGen: Store LLT instead of uint64_t in MachineMemOperand
GlobalISel is relying on regular MachineMemOperands to track all of
the memory properties of accesses. Just the raw byte size is
insufficent to disambiguate all situations. For example, if we need to
split an unaligned extending load, we need to know the number of bits
in the original source value and can't infer it from the result
type. This is also a problem for extending vector loads.

This does decrease the maximum representable size from the full
uint64_t bytes to a maximum of 16-bits. No in tree testcases hit this,
other than places using UINT64_MAX for unknown sizes. This may be an
issue for G_MEMCPY and co., although they can just use unknown size
for large static sizes. This also has potential for backend abuse by
relying on the type when it really shouldn't be relevant after
selection.

This does not include the necessary MIR printer/parser changes to
represent this.
2021-06-29 17:38:51 -04:00
Sander de Smalen c9acd2f32e [GlobalISel] NFC: Change LLT::changeNumElements to LLT::changeElementCount.
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104453
2021-06-25 15:54:00 +01:00
Sander de Smalen d5e14ba88c [GlobalISel] NFC: Change LLT::vector to take ElementCount.
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector

The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
  same number of elements, then use LLT::vector(OtherTy.getElementCount())
  or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2)
  or operator*. That is because there is no reason to specifically restrict
  the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
  just use fixed_vector. This will need to be fixed up in the future when
  modifying the algorithm to also work for scalable vectors, and will need
  then need additional tests to confirm the behaviour works the same for
  scalable vectors.
* If the test used the '/*Scalable=*/true` flag of LLT::vector, then
  this is replaced by LLT::scalable_vector.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104451
2021-06-24 11:26:12 +01:00
Jessica Paquette 933df6ca79 [AArch64][GlobalISel] Legalize scalar G_CTTZ + G_CTTZ_ZERO_UNDEF
This adds legalization for scalar G_CTTZ and G_CTTZ_ZERO_UNDEF. Vector support
requires handling vector G_BITREVERSE, which I haven't gotten around to yet.

For G_CTTZ_ZERO_UNDEF, we just lower it to G_CTTZ.

For G_CTTZ, we match SelectionDAG's lowering to a G_BITREVERSE + G_CTLZ.

e.g. https://godbolt.org/z/nPEseYh1s

(With this patch, we have slightly worse codegen than SDAG for types smaller
than s32; it seems like we're missing a combine.)

Also, this adds in a function to build G_BITREVERSE to MachineIRBuilder.

Differential Revision: https://reviews.llvm.org/D104065
2021-06-10 15:29:51 -07:00
Jessica Paquette 1b894ccdc9 [AArch64][GlobalISel] Mark some G_BITREVERSE types as legal + select them
We fall back on G_CTTZ_ZERO_UNDEF a lot when building clang for arm64 with
gisel.

Handling this will require that we can handle G_BITREVERSE.

This patch marks G_BITREVERSE instructions with natively supported types as
legal. We get selection on these types for free via the importer.

Differential Revision: https://reviews.llvm.org/D103999
2021-06-10 10:33:52 -07:00
Tim Northover b16ddd0375 AArch64: support atomic zext/sextloads 2021-06-04 09:45:51 +01:00
Daniel Sanders aaac268285 [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one
It's still in use in a few places so we can't delete it yet but there's not
many at this point.

Differential Revision: https://reviews.llvm.org/D103352
2021-06-01 13:23:48 -07:00
Eli Friedman 0b3b0a727a [AArch64][RISCV] Make sure isel correctly honors failure orderings.
If a cmpxchg specifies acquire or seq_cst on failure, make sure we
generate code consistent with that ordering even if the success ordering
is not acquire/seq_cst.

At one point, it was ambiguous whether this sort of construct was valid,
but the C++ standad and LLVM now accept arbitrary combinations of
success/failure orderings.

This doesn't address the corresponding issue in AtomicExpand. (This was
reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .)

Fixes https://bugs.llvm.org/show_bug.cgi?id=50512.

Differential Revision: https://reviews.llvm.org/D103284
2021-05-28 12:47:40 -07:00
Amara Emerson 59a4ee9728 [AArch64][GlobalISel] Legalize oversize G_EXTRACT_VECTOR_ELT sources.
Also changes the fewerElements helper to use the lookthrough constant helper
instead of m_ICst, since m_ICst doesn't look through extends.

Differential Revision: https://reviews.llvm.org/D103227
2021-05-27 23:52:24 -07:00