Commit Graph

3425 Commits

Author SHA1 Message Date
zhoujing 899ca9fd8e Add support for 12 bits immediate 2023-01-09 11:59:45 +08:00
Kazu Hirata 9f252e5567 [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:31:17 -08:00
Kazu Hirata 3c09ed006a [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:12:44 -08:00
Kazu Hirata 998960ee1f [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:08 -08:00
Sanjay Patel 0037e21f28 [SDAG] bail out of mergeTruncStores() if there's any other use in the chain
This fixes the miscompile in issue #58883.

The test demonstrates that we gave up on store merging in that example.

This change should be strictly safe (just adds another clause
to avoid the transform), and it does not prohibit any existing
valid optimizations based on regression tests. I want to believe
that it's also a sufficient fix (possibly overkill), but I'm not
sure how to prove that.

Differential Revision: https://reviews.llvm.org/D137791
2022-12-02 10:08:19 -05:00
Simon Pilgrim 30eff7f29f [DAG] Attempt to replace a mul node with an existing umul_lohi/smul_lohi node (PR59217)
As discussed on Issue #59217, under certain circumstances the DAG can generate duplicate MUL and MUL_LOHI nodes, often during MULO legalization.

This patch attempts to replace MUL nodes with additional uses of the LO result from the MUL_LOHI node

Differential Revision: https://reviews.llvm.org/D138790
2022-11-29 12:51:30 +00:00
Kazu Hirata dd698b7777 [SelectionDAG] Use std::optional in DAGCombiner.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 15:00:23 -08:00
chenglin.bi cdb7b804f6 [DAGCombiner] fold or (xor x, y),? patterns
or (xor x, y), x --> or x, y
or (xor x, y), y --> or x, y
or (xor x, y), (and x, y) --> or x, y
or (xor x, y), (or x, y) --> or x, y

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D138401
2022-11-23 09:28:10 +08:00
Han-Kuan Chen caa9f63022 [CodeGen] Refactor visitSCALAR_TO_VECTOR. NFC.
Differential Revision: https://reviews.llvm.org/D137688
2022-11-22 01:29:04 -08:00
chenglin.bi ac1b999e85 [DAGCombiner] fold or (and x, y), x --> x
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D138398
2022-11-21 22:11:12 +08:00
Benjamin Maxwell 34d88cf6cf [DAG] Allow folding AND of anyext masked_load with >1 user to zext version
This now allows folding an AND of a anyext masked_load to a
zext_masked_load even if the masked load has multiple users.  Doing is
eliminates some redundant ANDs/MOVs for certain AArch64 SVE code.

I'm not sure if there's any cases where doing this could negatively the
other users of the masked_load.  Looking at other optimizations of
masked loads, most don't apply if the load is used more than once, so it
doesn't look like this would interfere.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D137844
2022-11-18 10:38:09 +00:00
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

Subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access going
to be not just fast, but not slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
zhongyunde 8fbb6f8678 [NFC] Fix typo in comment
Address comment in https://reviews.llvm.org/D137936

Differential Revision: https://reviews.llvm.org/D138124
2022-11-16 23:35:53 +08:00
Sanjay Patel fe05a0a3dd [SDAG] avoid udiv/urem transform for vector/scalar type mismatches
This solves the crashing from issue #58994.
I don't know anything about VE, so I don't know if the output
is as expected or even correct.
2022-11-15 11:01:18 -05:00
Amaury Séchet 82209fd96e [NFC] Refactor DAGCombiner::foldSelectOfConstants to reduce nesting 2.0 2022-11-05 17:10:06 +00:00
Amaury Séchet 7c05f092c9 [NFC] Refactor DAGCombiner::foldSelectOfConstants to reduce nesting 2022-11-05 16:17:58 +00:00
Simon Pilgrim 78739fdb4d [DAG] Enable combineShiftOfShiftedLogic folds after type legalization
This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides.

Fixes #57872

Differential Revision: https://reviews.llvm.org/D136042
2022-10-29 12:30:04 +01:00
Sanjay Patel 1e7c1dd67c [SDAG] avoid crash from mismatched types in scalar-to-vector fold
This bug was introduced with D136713 / 54eeadcf44 .

As an enhancement, we could cast operands to the expected type,
but we need to make sure that is done correctly (zext vs. sext).
It's also possible (but seems unlikely) that an operand can have
a type larger than the result type.

Fixes #58661
2022-10-28 09:14:08 -04:00
Simon Pilgrim d47f056cd2 [DAG] visitXOR - fold XOR(A,B) -> OR(A,B) iff A and B have no common bits
Alive2: https://alive2.llvm.org/ce/z/7wvfns

Part of Issue #58624
2022-10-28 12:11:12 +01:00
Simon Pilgrim 28bfd853ab [DAG] visitFSUBForFMACombine - pass callbacks by reference in isContractableAndReassociableFMUL lambda capture. NFC.
Fixes a coverity remark about large copies by value
2022-10-28 11:48:45 +01:00
Pierre van Houtryve 088a816824 [DAGCombiner] Use `getAnyExtOrTrunc` instead of TRUNCATE in ExtractVectorElt combine
ScalarVT isn't guaranteed to be smaller than the BCSrc.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D136849
2022-10-28 06:33:29 +00:00
Sanjay Patel 54eeadcf44 [SDAG] avoid vector extract/insert around binop
scalar-to-vector (scalar binop (extractelt V, Idx), C) --> shuffle (vector binop V, C'), {Idx, -1, -1...}

We generally try to avoid ad-hoc vectorization in SDAG,
but the motivating case from issue #39482 escapes our
normal vectorization folds in IR. It seems like it should
always be a win to transform this pattern in cases where
we have the same vector type for input and output and the
target supports the vector operation. That avoids
transfers from vector to scalar and back.

In the x86 shift examples, we create the scalar-to-vector
node during legalization. I'm not sure if there's a more
general way to create the pattern for testing. (If so, I
could add tests for other targets.)

Differential Revision: https://reviews.llvm.org/D136713
2022-10-26 14:04:46 -04:00
Sanjay Patel 3aec021118 [SDAG] add helper for opcodes that are not speculatable
This is not quite NFC because one of the users should
now avoid the DIVREM opcodes too, but I'm not sure
how to test that.

I used the same name as an analysis function in IR
in case we want to expand this to include other
operations.

Another potential use is proposed in D136713.
2022-10-26 11:20:14 -04:00
Sanjay Patel b179351ad4 [SDAG] refactor folds for scalar-to-vector; NFCI
Fix typos, add comments, improve variable names,
rearrange code, add early exits.
2022-10-25 12:53:46 -04:00
Simon Pilgrim fd5f3abb07 [DAG] Fold (abs (sign_extend_inreg x)) -> (zero_extend (abs (truncate x))) (PR43370)
If the upper half of an abs() is all sign bits, then we can perform the abs() using just the lower half and then zero extend.

I've limited the DAG combine to only sign_extend_inreg (and free truncate/zero_extend) to minimise any later promotion issues, but for legalization a similar fold can use ComputeNumSignBits to be more aggressive.

Alive2: https://alive2.llvm.org/ce/z/y32fS4

Fixes #43370

Differential Revision: https://reviews.llvm.org/D136559
2022-10-24 10:27:08 +01:00
Craig Topper db25f51e37 Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))"
This reverts commit e8b3ffa532.

The AMDGPU/mad_64_32.ll seems to fail on some of the build bots but
passes locally. I'm really confused.
2022-10-22 22:50:43 -07:00
Craig Topper e8b3ffa532 [DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))
(sra X, BW-1) is either 0 or -1. So the multiply is a conditional
negate of Y.

This pattern shows up when type legalizing wide multiplies involving
a sign extended value.

Fixes PR57549.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133399
2022-10-22 21:51:45 -07:00
Craig Topper 00816714f9 [DAGCombiner][RISCV] Make foldBinOpIntoSelect work correctly with opaque constants.
The CanFoldNonConst doesn't work correctly with opaque constants
because getNode won't constant fold constants if one is opaque. Even
if the operation is AND/OR. This can lead to infinite loops.

This patch does the folding manually in the DAGCombine. Alternatively,
we could improve getNode but that seemed likely to have bigger impact
and possibly increase compile time for the additional checks. We wouldn't
want to directly constant fold because we need to preserve the opaque flag.

Fixes PR58511.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D136472
2022-10-22 19:10:33 -07:00
Simon Pilgrim b24a9f0cef [DAG] visitFREEZE - pull out Operands array. NFCI.
Initial tidyup and it will make it easier to adjust additional Operands in a future patch.
2022-10-22 20:14:56 +01:00
Simon Pilgrim 9708d88017 Revert rG42230efccf8fe1185be5fa6c23dce0a8183d6ec9 "[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1"
@foad was right - this isn't actually going to help with D136042 as much as hoped, we need a better AMDGPU-specific solution as other targets are likely to make use of it
2022-10-19 12:07:41 +01:00
Simon Pilgrim 42230efccf [DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1
Helps with some of the AMDGPU regressions identified in D136042 where we were losing signed BFE patterns after sinking shifts behind logic ops.

Differential Revision: https://reviews.llvm.org/D136081
2022-10-19 11:18:49 +01:00
Simon Pilgrim 8e77458578 [DAG] visitShiftByConstant - replace constant detection with FoldConstantArithmetic
Instead of checking that an operand is constant/opaque before calling getNode() and then checking that the result is a constant, just use FoldConstantArithmetic which will just early-out if the operands are not constant foldable.
2022-10-17 16:19:10 +01:00
Simon Pilgrim af5942cc09 Remove trailing whitespace. NFC. 2022-10-17 15:20:26 +01:00
chenglin.bi c1909d7337 [DAGCombiner] Fix crash for the merge stores with different value type
The crash case comes from #58350. It have two stores, one store is type f32 and the other is v1f32.
When we try to merge these two stores on v1f32, the memVT is vector type so the old code will use ISD::EXTRACT_SUBVECTOR for type f32 also then compiler crash.
So this patch insert a build_vector for f32 store to generate v1f32 also when memVT is v1f32.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135954
2022-10-15 01:16:35 +08:00
Nicola Lancellotti ce1a2ccf94 [NFC] Fix typo in DAGCombiner 2022-10-14 17:47:25 +01:00
Craig Topper ac9209751a Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))"
This reverts commit 0148df8157.

Getting a lit test failures on AMDGPU but I can't reproduce it so far.
Reverting to investigate.
2022-10-11 16:30:40 -07:00
Craig Topper 0148df8157 [DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))
(sra X, BW-1) is either 0 or -1. So the multiply is a conditional
negate of Y.

This pattern shows up when type legalizing wide multiplies involving
a sign extended value.

Fixes PR57549.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133399
2022-10-11 16:20:55 -07:00
Amaury Séchet 62ea6c5be7
[DAGCombine] Deduplicate addcarry node using commutativity.
The first two parameters of addcarry are commutative. We may face a situation where both variant are present in the DAG, in which case we benefit from using just one.

Depends on D57302 and D33587

Reviewed By: RKSimon, chfast

Differential Revision: https://reviews.llvm.org/D57317
2022-10-08 00:55:14 +02:00
Philip Reames 04bb32e58a [DAG] Extract helper for (neg x) [nfc]
This is a frequently reoccurring pattern, let's factor it out.

Differential Revision: https://reviews.llvm.org/D135301
2022-10-06 13:23:52 -07:00
jeff cebec42089 [DAGCombiner] [AMDGPU] Allow vector loads in MatchLoadCombine
Since SROA chooses promotion based on reaching load / stores of allocas, we may run into scenarios in which we alloca a vector, but promote it to an integer. The result of which is the familiar LoadCombine pattern (i.e. ZEXT, SHL, OR). However, instead of coming directly from distinct loads, the elements to be combined are coming from ExtractVectorElements which stem from a shared load.

This patch identifies such a pattern and combines it into a load.

Change-Id: I0bc06588f11e88a0a975cde1fd71e9143e6c42dd
2022-10-04 12:16:00 -07:00
Sanjay Patel 17dcbd8165 [SDAG] don't hoist div/rem through a select with neutral constant
This bug was introduced with D134966.
2022-10-04 13:15:01 -04:00
Jay Foad af947d9fcb [ISel] Fix crash in new FMA DAG combine
Fix a crash in the FMA combine added by D132837 and amended by D134810.
In cases where the newly created node could be folded, the combiner
would fail this assertion:

llc: DAGCombiner.cpp:268: void (anonymous namespace)::DAGCombiner::AddToWorklist(llvm::SDNode *): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed.

Differential Revision: https://reviews.llvm.org/D135150
2022-10-04 15:13:18 +01:00
Philip Reames a200b0fc25 [DAG] Introduce getSplat utility for common dispatch pattern [nfc]
We have a very common pattern of dispatching between BUILD_VECTOR and SPLAT_VECTOR creation repeated in many cases in code.  Common the pattern into a utility function.
2022-10-03 12:49:39 -07:00
Philip Reames 21f97fdc97 [DAG] Use getSplatBuildVector in a couple more places [nfc] 2022-10-03 09:48:49 -07:00
Simon Pilgrim 61dc5014ac [DAG] Update foldSelectWithIdentityConstant to use llvm::isNeutralConstant
D133866 added the llvm::isNeutralConstant helper to track neutral/passthrough constants

This patch updates foldSelectWithIdentityConstant to use the helper instead of maintaining its own opcode handling

Differential Revision: https://reviews.llvm.org/D134966
2022-09-30 17:46:52 +01:00
Amaury Séchet 031a7ad575 [NFC] Fix erroneous indentation. 2022-09-30 12:30:27 +00:00
Amaury Séchet 923909afbe [DAG] Simplify the select of constant combine code. NFC 2022-09-30 01:03:14 +00:00
Amaury Séchet d7600c7ccb [DAG] select Cond, C, -1 --> or (sext (not Cond)), C when C is MVT::i1
In the spirit of D130765 . Get rid of cbranches and/or cmov. Usually shorter, but sometime not, becaus eit's hard to prededict when dependency breaking xor will be introduced.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D134736
2022-09-30 00:36:58 +00:00
Thomas Symalla a41dde2c62 [AMDGPU] Add use check in v_fma combine.
In D132837, an existing v_fma combine was extended to regard nested
fma instructions. Originally, the inner FMA was checked for being used
only once. In its current state, this check is missing, which causes
some regressions.

In this patch, this check was added.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D134856
2022-09-29 12:25:03 +02:00
Jay Foad 2c12a04bba [ISel] Fix DAG divergence after new FMA combine
D132837 introduced a new DAG combine that used MorphNodeTo to morph an
FMUL into an FMA. It turns out that MorphNodeTo does not properly update
the divergence bit for users of the morphed node, causing an assertion
failure on the new test case:

llc: SelectionDAG.cpp:10486: void llvm::SelectionDAG::VerifyDAGDivergence(): Assertion `calculateDivergence(N) == N->isDivergent() && "Divergence bit inconsistency detected"' failed.

Fixing MorphNodeTo to propagate the divergence bit is tricky because of
the way it is used to select machine instructions, so use getNode and
ReplaceAllUsesOfValueWith instead.

Differential Revision: https://reviews.llvm.org/D134810
2022-09-28 19:41:51 +01:00