Commit Graph

270 Commits

Author SHA1 Message Date
Kazu Hirata 998960ee1f [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:08 -08:00
Kazu Hirata 4531b61208 [GlobalISel] Use std::optional in CombinerHelper.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 14:33:45 -08:00
chenglin.bi fe07eeb825 [GlobalISel] Fix crash in applyShiftOfShiftedLogic caused by CSEMIRBuilder reuse instruction
If LogicNonShiftReg is the same to Shift1Base, and shift1 const is the same to MatchInfo.Shift2 const, CSEMIRBuilder will reuse the old shift1 when build shift2.
So, if we erase MatchInfo.Shift2 at the end, actually we remove old shift1. And it will cause crash later.

Solution for this issue is just erase it earlier to avoid the crash.

Fix #58423

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138187
2022-11-19 09:13:44 +08:00
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

Subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access going
to be not just fast, but not slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
chenglin.bi 8482247900 [GlobalISel] Correct constant type in matchReassocConstantInnerLHS
When we match a pattern from m_GCst, the register type could be different from original op. So we can't replace the original op to vreg direct.
This code create a new constant with original op type then replace the original op.

Fix #58906

Reviewed By: arsenm, aemerson

Differential Revision: https://reviews.llvm.org/D137778
2022-11-13 19:20:07 +08:00
Petar Avramovic 838d5d371a AMDGPU/GlobalISel: Fix combine crash because LI is not set in prelegalizer
Caused by legacy min/max combines (select + cmp) asking for legalizer info
in prelegalizer (D135047 added combine to all_combines).
Combine still does not work for AMDGPU since destination opcode is custom,
not legal. Similar combine works on DAG since it asks for legal or custom.

Differential Revision: https://reviews.llvm.org/D137274
2022-11-08 12:46:16 +01:00
Pierre van Houtryve 020a9d7b20 [GISel] Add (fsub +-0.0, X) -> fneg combine
Allows for better matching of VOP3 mods.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D136442
2022-11-03 08:21:50 +00:00
Jessica Paquette 0f1a51e173 [GlobalISel] Allow vectors in redundant or + add combines
We support KnownBits for vectors, so we can enable these.

https://godbolt.org/z/r9a9W4Gj1

Differential Revision: https://reviews.llvm.org/D135719
2022-10-11 15:31:09 -07:00
Jessica Paquette 036a13065b [GlobalISel] Combine (X op Y) == X --> Y == 0
This matches patterns of the form

```
(X op Y) == X
```

And transforms them to

```
Y == 0
```

where appropriate.

Example: https://godbolt.org/z/hfW811c7W

Differential Revision: https://reviews.llvm.org/D135380
2022-10-11 09:52:48 -07:00
Pierre van Houtryve 36c3833783 [GISel] Add Trunc/Lshr/BuildVector Folding
Similar to the current "Trunc/BuildVector" folding - which folds low element extracts of BuildVectors, folds hi element extracts done using bitshifts.

For D134354

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135148
2022-10-07 08:44:03 +00:00
Pierre van Houtryve a34977c4d0 [GISel] Handle G_TRUNC in `matchExtractVecEltBuildVec`
Spotted some cases in D134354 where this was an issue.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135147
2022-10-07 08:37:18 +00:00
Amara Emerson 8055aa8e8a [AArch64][GlobalISel] Make vector G_SEXT_INREG legal and allow combining.
As a result of making these legal, and tweaking the combine to allow vectors,
we generate vector G_SEXT_INREG during legalization.

The reason we want to make these legal in the first place is to allow for
more combine opportunities. Once those have been done, we can just lower them
back to shifts in the post-legalizer lowering.

This needs to be one commit otherwise we start causing tests to fail due to
incomplete support for selection etc.
2022-10-05 00:28:08 +01:00
Amara Emerson 07ccf651b9 x[AArch64][GlobalISel] Enable vector support for G_SELECT->G_FMAXIMUM/MINIMUM.
Vector support seems to work immediately, as long as we run the combine before
legalization (so the vector SELECTs don't get lowered) and the legalizer rules
are there to enable generation.

Differential Revision: https://reviews.llvm.org/D135047
2022-10-03 21:39:52 +01:00
Jessica Paquette 970cb99e0a [GlobalISel] Combine `(x + y) - y -> x` and friends
This adds a combine that handles

```
(x + y) - y -> x
(x + y) - x -> y
x - (y + x) -> 0 - y
x - (x + z) -> 0 - z
```

On AArch64, we get added benefit for `0 - y` because it can be selected to a
`neg` instruction.

Differential Revision: https://reviews.llvm.org/D135010
2022-10-03 10:06:48 -07:00
Amara Emerson 3daf7ddaef [GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo.
Before, the isPreLegalize() query in CombinerHelper only checked for the
presence of a LegalizerInfo object. This is problematic when we want to have
a combine actually check for legality in a pre-legalizer combine pass, since
if we pass a LegalizerInfo object to the constructor it causes the combines to
think that we're running *post* legalizer, which isn't true.

This change fixes it to instead check an explicit bool that passes to signal
whether the pass will be run before or after legalization.

Doing so exposed a bug in the extending loads combine, which tried to check for
legality of candidate extending loads if LegalizerInfo was present. Since we
only ran it pre-legalizer and therefore with a null LegalizerInfo, it never
actually ran. Also fixes the legality checks to keep the tests passing.

Differential Revision: https://reviews.llvm.org/D135044
2022-10-03 07:36:18 +01:00
Pierre van Houtryve 653beae5a1 [AMDGPU][GISel] Add Identity BUILD_VECTOR Combines
Folds-away BUILD_VECTOR-related noops in the post-legalizer combiner.

Depends on D134433

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134953
2022-09-30 14:07:13 +00:00
serge-sans-paille 16544cbe64 [iwyu] Move <cmath> out of llvm/Support/MathExtras.h
Interestingly, MathExtras.h doesn't use <cmath> declaration, so move it out of
that header and include it when needed.

No functional change intended, but there's no longer a transitive include
fromMathExtras.h to cmath.
2022-09-28 20:49:01 +02:00
Kai Nacke ae35188f97 [GISel] Fix match tree emitter.
The following changes are necessasy to get the generated tree
matcher to compile:

- In CodeExpansions::declare(), the assert() prevents connecting
  two instructions. E.g. the match code
    (match (MUL $t, $s1, $s2),
           (SUB $d, $t, $s3)),
  results in two declarations of $t, one for the def and one for
  the use. Removing the assertion allows this construct.
  If $t is later used, it is one of the operands, which should be
  perfectly fine.
- The code emitted in GIMatchTreeVRegDefPartitioner::generatePartitionSelectorCode()
  is not compilable:
  - The value of NewInstrID should be emitted, not the name
  - Both calls involving getOperand() end with one parenthesis too many
- Swaps generated condition for the partition code in the latter function

It also changes the rules i2p_to_p2i, fabs_fabs_fold, and fneg_fneg_fold
to use the tree matcher for a linear match. These rules are tested by:

CodeGen/AArch64/GlobalISel/combine-fabs.mir
CodeGen/AArch64/GlobalISel/combine-fneg.mir
CodeGen/AArch64/GlobalISel/combine-ptrtoint.mir
CodeGen/AMDGPU/GlobalISel/combine-add-nullptr.mir

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D133257
2022-09-18 00:00:15 +00:00
Jessica Paquette 1076b31da8 [GlobalISel] Combine select + fcmp to fminnum/fmaxnum/fminimum/fmaximum
This is a partial port of the code used by the SelectionDAGBuilder to
translate selects.

In particular, see matchSelectPattern in ValueTracking.cpp. This is a
GISel-equivalent of the portion which handles fminnum/fmaxnum/fminimum/fmaximum.

I tried to set it up so it'd be easy to add the non-FP cases. Those are simpler.
On the AArch64-end, it seems like the FP cases are more important for perf
right now, so I bit the bullet and went at the more complicated problem. :)

I elected to do this as a post-legalize combine rather than in the
IRTranslator because

Deciding which fmax/fmin to use can depend on legalization rules
Philosophically-speaking (TM), putting it in a combine just feels cleaner

Being able to enable/disable the combine is handy
Another option would be to use the ValueTracking code in the IRTranslator and
match what SelectionDAGBuilder::visitSelect does. I think that may be somewhat
annoying since we'd need to write lowerings back into the selects in the
legalizer. I'm not strongly opposed to the approach.

We'd also want to be careful with vector selects once that's implemented,
which explicitly check if a vector select is legal on the target. That'd
probably need a hook.

From what I can tell, doing this as a combine is probably a cleaner option
long-term.

Differential Revision: https://reviews.llvm.org/D116702
2022-09-16 13:35:46 -07:00
Jay Foad 210e6a993d [GlobalISel] Simplify extended add/sub to add/sub with carry
Simplify extended add/sub (with carry-in and carry-out) to add/sub with
carry (with carry-out only) if carry-in is known to be zero.

Differential Revision: https://reviews.llvm.org/D133702
2022-09-12 17:05:44 +01:00
Amara Emerson fe7c3b87ce Add parantheses to silence warning. 2022-09-06 15:36:19 +01:00
Amara Emerson 3dd861818a [GlobalISel] Combine G_INSERT/EXTRACT_VECTOR_ELT with out of bounds indices to undef.
Differential Revision: https://reviews.llvm.org/D133309
2022-09-06 13:45:04 +01:00
Amara Emerson fb60e50c78 [GlobalISel] Fix a combine crash due to a negative G_INSERT_VECTOR_ELT idx.
These should really be folded away to undef but we shouldn't crash in any case.
2022-09-05 12:10:17 +01:00
Nikita Popov c635ea5c50 [CombinerHelper] Avoid deprecated method (NFC) 2022-09-01 16:09:05 +02:00
Amara Emerson 4cf3db41da [GlobalISel] Add sdiv exact (X, constant) -> mul combine.
This port of the SDAG optimization is only for exact sdiv case.

Differential Revision: https://reviews.llvm.org/D130517
2022-09-01 13:34:00 +01:00
Amara Emerson 5ae0472694 [GlobalISel] Fix miscompile of G_UREM + G_UDIV due to not checking for equality
of the first operands of each.

Fixes issue #55287

Differential Revision: https://reviews.llvm.org/D130525
2022-07-25 16:03:05 -07:00
Matt Arsenault 8d0383eb69 CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
2022-07-18 17:23:41 -04:00
Craig Topper 7fa1c32634 [CodeGen] Remove unnecessary APInt copy. NFC 2022-07-17 23:41:53 -07:00
Craig Topper a55ff6aadd [Support][CodeGen] Fix spelling Divison->Division. NFC 2022-07-17 23:16:29 -07:00
Craig Topper 795602af0c [CodeGen] Don't compare bool with integer 0. NFC
The IsAdd field is a bool.
2022-07-17 23:16:14 -07:00
Amara Emerson 2824bdd92f [GlobalISel] Fix and(load)->zextload combine crash.
We shouldn't use getOpcodeDef() if we need to guarantee the def has only one
user since under the hood it may look through copies and optimization hints,
which themselves may have multiple users.
2022-07-13 14:58:45 -07:00
Matt Arsenault e9a45d45d0 GlobalISel: Allow forming atomic/volatile G_SEXTLOAD
Mirror the change to G_ZEXTLOAD.
2022-07-08 11:55:08 -04:00
Matt Arsenault 1ee6ce9bad GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.

The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.

I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
2022-07-08 11:55:08 -04:00
Kazu Hirata a81b64a1fb [llvm] Use Optional::has_value instead of Optional::hasValue (NFC)
This patch replaces x.hasValue() with x.has_value() where x is not
contextually convertible to bool.
2022-06-26 16:10:42 -07:00
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata 7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Kazu Hirata 064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Kazu Hirata e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Michael Kitzan b7fcf6632f [GISel] Add new combines for G_ADD
Patch adds new GICombineRules for G_ADD:

G_ADD(x, G_SUB(y, x)) -> y
G_ADD(G_SUB(y, x), x) -> y

Patch additionally adds new combine tests for AArch64 target for
these new rules.

Reviewed by: paquette

Differential Revision: https://reviews.llvm.org/D87936
2022-06-06 11:19:45 -07:00
Jon Roelofs d699e54ca2 Fix an or+and miscompile w/ GlobalISel
Fixes #55284
2022-05-18 19:09:47 -07:00
Michael Kitzan 29bebb0237 [GISel] Add new combines for G_FMINNUM/MAXNUM and G_FMINIMUM/MAXIMUM
I noticed https://reviews.llvm.org/D87415 added SDAG combines to fold
FMIN/MAX instrs with NaNs.

The patch implements the same NaN combines for GISel GMIR FMIN/MAX opcodes:
G_FMINNUM(X, NaN) -> X
G_FMAXNUM(X, NaN) -> X
G_FMINIMUM(X, NaN) -> NaN
G_FMAXIMUM(X, NaN) -> NaN

The patch adds AArch64 tests for these combines as well.

Reviewed by: arsenm

Differential revision: https://reviews.llvm.org/D125819
2022-05-18 12:08:53 -07:00
Abinav Puthan Purayil 485dd0b752 [GlobalISel] Handle constant splat in funnel shift combine
This change adds the constant splat versions of m_ICst() (by using
getBuildVectorConstantSplat()) and uses it in
matchOrShiftToFunnelShift(). The getBuildVectorConstantSplat() name is
shortened to getIConstantSplatVal() so that the *SExtVal() version would
have a more compact name.

Differential Revision: https://reviews.llvm.org/D125516
2022-05-16 16:03:30 +05:30
Jon Roelofs e1c808b36e Fix zero-width bitfield extracts to emit 0
Fixes #55129
2022-05-03 14:46:42 -07:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Amara Emerson b09e63bad1 [AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops.
Differential Revision: https://reviews.llvm.org/D117160
2022-02-20 00:53:09 -08:00
Matt Arsenault 0877fbcc16 GlobalISel: Add FoldBinOpIntoSelect combine
This will do the combine in cases that should fold, but don't
now. e.g. we're relying on the CSEMIRBuilder's incomplete constant
folding. For instance it doesn't handle FP operations or vectors (and
we don't have separate constant folding combines either to catch
them).
2022-02-08 18:17:21 -05:00