Commit Graph

1915 Commits

Author SHA1 Message Date
Abinav Puthan Purayil 485dd0b752 [GlobalISel] Handle constant splat in funnel shift combine
This change adds the constant splat versions of m_ICst() (by using
getBuildVectorConstantSplat()) and uses it in
matchOrShiftToFunnelShift(). The getBuildVectorConstantSplat() name is
shortened to getIConstantSplatVal() so that the *SExtVal() version would
have a more compact name.

Differential Revision: https://reviews.llvm.org/D125516
2022-05-16 16:03:30 +05:30
Eli Friedman 96c2a0c9ff [GlobalIsel] Fix fallback if stack protector isn't supported.
When GlobalISel fails, we need to report the error, and we need to set
the FailedISel property.  We skipped those steps if stack protector
insertion failed, which led to a very strange miscompile.

Differential Revision: https://reviews.llvm.org/D125584
2022-05-13 14:17:27 -07:00
Jay Foad 26e1ebd3ea [GlobalISel] Change ConstantFoldVectorBinop to return vector of APInt
Previously it built MIR for the results and returned a Register.

This avoids building constants for earlier elements of the vector if
later elements will fail to fold, and allows CSEMIRBuilder::buildInstr
to avoid unconditionally building a copy from the result.

Use a new helper function MachineIRBuilder::buildBuildVectorConstant
to build a G_BUILD_VECTOR of G_CONSTANTs.

Differential Revision: https://reviews.llvm.org/D117758
2022-05-13 09:33:07 +01:00
Jon Roelofs e1c808b36e Fix zero-width bitfield extracts to emit 0
Fixes #55129
2022-05-03 14:46:42 -07:00
Matt Arsenault 40bc9112c0 GlobalISel: Relax handling of G_ASSERT_* with source register classes
The most common situation where G_ASSERT_ZEXT appears for AMDGPU is a
copy from a physical register, which happens to use set the actual
register class on the virtual register. After copy coalescing, the
assert's source operand had a vreg with a set class. The verifier was
strictly rejecting cases where the set class/bank weren't an exact
match. Additionally, RegBankSelect was also expecting a register bank
to be set on the register, not a class.

This is much stricter than regular copies so relax this behavior. This
now allows these 2 cases:

1. Source register has either class or bank, and the result does not
2. Source register has a register class, and the result is a register
with a matching bank.

This should avoid needing some kind of special handling to avoid
violating this constraint when folding copies.
2022-04-22 10:49:50 -04:00
Matt Arsenault 507259820a GlobalISel: Add LegalizeMutations to help use More/FewerElements 2022-04-19 21:04:32 -04:00
Jonas Paulsson 46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Matt Arsenault 3754f60112 GlobalISel: Implement MoreElements for select of vector conditions 2022-04-12 16:54:04 -04:00
Matt Arsenault 3f2cc7cc2b GlobalISel: Fix lowerSelect handling of boolean high bits
This was making several invalid assumptions about the incoming
select. First, it was assuming the incoming condition was either s1 or
already sign extended, not accounting for different boolean high bits
behavior between scalar and vector conditions. We only had a vector
boolean due to the intermediate step vector select, which is now
avoided.

Second, it was assuming it can use the result vector type as a boolean
mask. These types don't have anything to do with other, and only makes
sense in the context of the expansion to bit operations. Since these
logically are part of the same lowering, do the complete expansion in
a single step.

The added select_v4s1_s1 test does fail to legalize, since it seems
AArch64's vector legalization support is pretty incomplete.
2022-04-12 16:54:03 -04:00
Matt Arsenault 0e489926be GlobalISel: Handle widening addo/subo booleans
This will be tested in a future patch
2022-04-12 16:54:03 -04:00
Matt Arsenault 95c2bcbf8b GlobalISel: Handle widening umulo/smulo condition outputs 2022-04-12 16:54:03 -04:00
Matt Arsenault abe171df06 GlobalISel: Update mutationIsSane assert for scalable vectors 2022-04-12 16:54:03 -04:00
Matt Arsenault d1f97a3419 GlobalISel: Add memSizeNotByteSizePow2 legality helper
This is really a replacement for memSizeInBytesNotPow2 that actually
does what most every target wants. In particular, since s1 rounds to 1
byte, it wasn't lowered by this predicate. This results in targets
needing to think harder and add more matchers to catch all the
degenerate cases.

Also small bug fix that prevented the correct insertion of
G_ASSERT_ZEXT in the AArch64 use case.
2022-04-11 19:43:37 -04:00
Matt Arsenault 1416744f84 GlobalISel: Implement computeKnownBits for overflow bool results 2022-04-11 19:43:37 -04:00
Daniel Sanders 93977f37e6 Check if register class was changed in constrainOperandRegClass()
NFC
When no actual change happens there's no need to notify the
observers about the fact the register class is being constrained.
So we should avoid notifying observers when no change has
happened, because this can dramatically affect compile
time for particular test cases.

Reviewed By: dsanders, arsenm

Differential Revision: https://reviews.llvm.org/D122615
2022-04-05 11:55:07 -07:00
Abinav Puthan Purayil 898d5776ec [AMDGPU][GlobalISel] Scalarize add/sub with overflow ops in the legalizer
Differential Revision: https://reviews.llvm.org/D122803
2022-03-31 21:46:34 +05:30
serge-sans-paille 60ca256953 Cleanup include: Add missing header
Should fix https://lab.llvm.org/buildbot#builders/57/builds/16192 introduced by
02c28970b2
2022-03-23 15:15:56 +01:00
Benjamin Kramer 9a6e0afac5 Unbreak the build after 02c28970b2 2022-03-23 14:38:13 +01:00
serge-sans-paille 02c28970b2 Cleanup include: codegen second round
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122180
2022-03-23 13:54:00 +01:00
Kazu Hirata 1eada2adda [CodeGen] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC) 2022-03-20 23:11:06 -07:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Amara Emerson 8cbf18cb04 [GlobalISel] Fix store merging incorrectly merging volatile stores.
The existing volatile checks only handle aliasing hazards between stores,
but that isn't enough since by that point volatile stores may have already
been added to the current candidate group.
2022-03-14 13:48:51 -07:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Paul Robinson 7b85f0f32f [PS4] isPS4 and isPS4CPU are not meaningfully different 2022-03-03 11:36:59 -05:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Nikita Popov 87ebd9a36f [IR] Use CallBase::getParamElementType() (NFC)
As this method now exists on CallBase, use it rather than the
one on AttributeList.
2022-02-25 10:01:58 +01:00
Amara Emerson b09e63bad1 [AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops.
Differential Revision: https://reviews.llvm.org/D117160
2022-02-20 00:53:09 -08:00
Mircea Trofin c62eefb886 [nfc][codegen] Move RegisterBank[Info].cpp under CodeGen
Layering-wise, it seems RegisterBank stuff fits under CodeGen, like
other target abstraction.
In particular, TargetSubtargetInfo has a getRegBankInfo member, but
using that object requires making sure GlobalISel is linked, which is
not always the case (e.g. llvm-jitlink doesn't).

Differential Revision: https://reviews.llvm.org/D119053
2022-02-15 11:27:15 -08:00
Julien Pages dcb2da13f1 [AMDGPU] Add a new intrinsic to control fp_trunc rounding mode
Add a new llvm.fptrunc.round intrinsic to precisely control
the rounding mode when converting from f32 to f16.

Differential Revision: https://reviews.llvm.org/D110579
2022-02-11 12:08:23 -05:00
Jay Foad abda8d2229 [GlobalISel] CSE FP constants at -O0
At -O0 we claim to CSE constants only. I think this should apply to
G_FCONSTANT as well as G_CONSTANT.

Differential Revision: https://reviews.llvm.org/D119344
2022-02-10 09:17:11 +00:00
Matt Arsenault 5af0f097ba GlobalISel: Constant fold G_PTR_ADD
Some globals lower to literal addresses on AMDGPU.

This may be wrong for non-integral address spaces. I'm wondering if we
should just allow regular G_ADD to use pointer types, and reserve
G_PTR_ADD for non-integral address spaces.
2022-02-08 19:21:06 -05:00
Matt Arsenault 2af4a554fe GlobalISel: Constant fold FP bin ops in MIRBuilder
Might as well handle these if we're going to handle the integer ops
here.
2022-02-08 18:51:10 -05:00
Matt Arsenault 930f2498d4 GlobalISel: Constant fold integer min/max opcodes 2022-02-08 18:50:35 -05:00
Matt Arsenault 0877fbcc16 GlobalISel: Add FoldBinOpIntoSelect combine
This will do the combine in cases that should fold, but don't
now. e.g. we're relying on the CSEMIRBuilder's incomplete constant
folding. For instance it doesn't handle FP operations or vectors (and
we don't have separate constant folding combines either to catch
them).
2022-02-08 18:17:21 -05:00
Sheng 76c83e747f [GlobalISel] Add big endian support in CallLowering
When splitting values, CallLowering assumes Lo part goes first. But in big endian ISA such as M68k, Hi part goes first.

This patch fixes this.

Differential Revision: https://reviews.llvm.org/D116877
2022-02-08 14:43:38 +00:00
Sheng 146c7820d9 [GlobalISel][Legalizer] Support reducing load/store width in big endian order 2022-02-07 20:06:17 -05:00
Simon Pilgrim 5d3a86489f [GlobalISel] Move getOpcode() calls inside assert() to avoid (void)s. NFC.
Tidier solution to the unused variable warnings - we already do this in other places in this file.
2022-02-07 09:50:27 +00:00
Djordje Todorovic def10a2895 [GlobalIsel] Fix another "unused variable" warning 2022-02-07 09:32:22 +01:00
Djordje Todorovic eab395fa40 Fix the warning after D118805
A variable was used within assert() only.
2022-02-07 09:25:02 +01:00
Kazu Hirata 3a8c51480f [CodeGen] Use = default (NFC)
Identified with modernize-use-equals-default
2022-02-06 10:54:44 -08:00
Róbert Ágoston cd4ed08b5a [GlobalISel] Don't combine instructions which are fed by memory instructions using different size
Memory instructions like extending loads from the same address are not equal if
their size is not equal.

This fixes https://github.com/llvm/llvm-project/issues/53524.

Differential Revision: https://reviews.llvm.org/D118805
2022-02-04 15:00:47 -08:00
Jessica Paquette 9a61e731ff [GlobalISel] Combine (G_*ADDO x, 0) -> x + no carry out
Similar to the G_*MULO change.

The code for checking if a constant is legal/pre-legalize is shared between
these, and is kind of hairy. So, factor it out into a new function:
`isConstantLegalOrBeforeLegalizer`.

To make the refactoring clean, further refactor `isLegalOrBeforeLegalizer` into
a wrapper for two functions:

- `isPreLegalize`
- `isLegal`

This is a bit easier to read in general.

https://godbolt.org/z/KW7oszP1o

Differential Revision: https://reviews.llvm.org/D118655
2022-02-03 14:25:15 -08:00
Jessica Paquette c636899dc1 [GlobalISel] Combine: (G_*MULO x, 0) -> 0 + no carry out
Similar to the following combine in `DAGCombiner::visitMULO`:

```
  // fold (mulo x, 0) -> 0 + no carry out
  if (isNullOrNullSplat(N1))
    return CombineTo(N, DAG.getConstant(0, DL, VT),
                     DAG.getConstant(0, DL, CarryVT));
```

This fixes some generally poor codegen for `*mulo`:

https://godbolt.org/z/eTxYsvz8f

Differential Revision: https://reviews.llvm.org/D118635
2022-02-03 14:23:58 -08:00
Kazu Hirata 2bea207d26 [CodeGen] Use default member initialization (NFC)
Identified with modernize-use-default-member-init.
2022-01-30 12:32:51 -08:00
Matt Arsenault 2d670de84c GlobalISel: Avoid crash on asm with lying result types
The physical register in the asm has the wrong type for the declared
IR. It seems to work in the DAG by extracting the 4 elements that are
defined in the IR from the register, but that isn't handled here. This
doesn't seem to be a well tested path since other mismatched cases are
crashing the DAG asm handling.
2022-01-26 15:23:59 -05:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
Sebastian Neubauer 4723f3cf03 [AMDGPU][GlobalISel] Combine unmerge of undef
Fold (unmerge undef) -> undef, undef, ...

Differential Revision: https://reviews.llvm.org/D118138
2022-01-26 12:30:36 +01:00
Nikita Popov a3a2239aaa [GlobalISel] Avoid pointer element type access during InlineAsm lowering
Same change as has been made for the SDAG lowering.
2022-01-25 14:26:47 +01:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Matt Arsenault 99e8e17313 Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This reverts commit a97e20a3a8.
2022-01-24 09:26:52 -05:00
Nikita Popov 0d1308a7b7 [AArch64][GlobalISel] Support returned argument with multiple registers
The call lowering code assumed that a returned argument could only
consist of one register. Pass an ArrayRef<Register> instead of
Register to make sure that all parts get assigned.

Fixes https://github.com/llvm/llvm-project/issues/53315.

Differential Revision: https://reviews.llvm.org/D117866
2022-01-24 10:55:28 +01:00
Abinav Puthan Purayil 68b70d17d8 [GlobalISel] Fold or of shifts with constant amount to funnel shift.
This change folds (or (shl x, C0), (lshr y, C1)) to funnel shift iff C0
and C1 are constants where C0 + C1 is the bit-width of the shift
instructions.

Differential Revision: https://reviews.llvm.org/D116529
2022-01-24 10:43:32 +05:30
Lucas Prates 283f5a198a [GlobalISel] Fix incorrect sign extension when combining G_INTTOPTR and G_PTR_ADD
The GlobalISel combiner currently uses sign extension when manipulating
the LHS constant when combining a sequence of the following sequence of
machine instructions into a single constant:
```
  %0:_(s32) = G_CONSTANT i32 <CONSTANT>
  %1:_(p0) = G_INTTOPTR %0:_(s32)
  %2:_(s64) = G_CONSTANT i64 <CONSTANT>
  %3:_(p0) = G_PTR_ADD %1:_, %2:_(s64)
```

This causes an issue when the bit width of the first contant and the
target pointer size are different, as G_INTTOPTR has no sign extension
semantics.

This patch fixes this by capture an arbitrary precision in when matching
the constant, allowing the matching function to correctly zero extend
it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D116941
2022-01-20 17:02:52 +00:00
Daniel Thornburgh 2e2999cd44 [NFC] Test commit to verify commit access. 2022-01-18 18:03:26 -08:00
Matt Arsenault 5599c43124 GlobalISel: Swap order of operand checks in ConstantFoldVectorBinop
Since constants are canonicalized to the RHS, this is more likely to
exit early.
2022-01-18 17:21:02 -05:00
Matt Arsenault da72822763 GlobalISel: Fix CSEMIRBuilder mishandling constant folds of vectors
This was ignoring the requested result register, resulting in a
missing def when this happened in the IRTranslator. Fixes some crashes
and verifier errors at -O0.

Alternatively we could pass DstOps to the constant fold functions.
2022-01-18 17:21:02 -05:00
Nikita Popov c63a3175c2 [AttrBuilder] Remove ctor accepting AttributeList and Index
Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should exact the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.
2022-01-15 22:39:31 +01:00
James Y Knight a97e20a3a8 Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This commit sometimes causes a crash when compiling a vtable thunk. E.g.:

clang '--target=aarch64-grtev4-linux-gnu' -xc++ - -c -o /dev/null <<EOF
struct a {
  virtual int f();
};
struct c {
  virtual int &g() const;
};
struct d : a, c {
  int &g() const;
};
int &d::g() const {}
EOF

Some follow-up commits have been reverted as well:
Revert "IR: Make getRetAlign check callee function attributes"
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."

This reverts commit 4f414af6a7.
This reverts commit a5507d2e25.
This reverts commit 3d2d208f6a.
This reverts commit 07ddfa95e3.
2022-01-14 04:50:07 +00:00
Simon Pilgrim 4f414af6a7 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC. 2022-01-13 11:10:50 +00:00
Matt Arsenault 5a16306c09 GlobalISel: Always enable GISelKnownBits for InstructionSelect
This wasn't running at -O0, and causing crashes for AMDGPU. AMDGPU
needs this to match the addressing modes of stack access instructions,
which is even more important at -O0 than with optimizations.

It currently costs nothing to run ahead of time, so just always enable
it.
2022-01-12 18:57:24 -05:00
Matt Arsenault 07ddfa95e3 GlobalISel: Add G_ASSERT_ALIGN hint instruction
Insert it for call return values only for now, which is the only case
the DAG handles also.
2022-01-12 18:20:58 -05:00
Matt Arsenault 8a16201a0b GlobalISel: Fix insert point in localizer
This was inserting the new G_CONSTANT after the use, and the later
block scan would run off the end. Fix calling SkipPHIsAndLabels for no
apparent reason.
2022-01-12 13:44:05 -05:00
Petar Avramovic c8c5dc766b GlobalIsel: Fix fma combine when one of the operands comes from unmerge
Fma combine assumes that MRI.getVRegDef(Reg)->getOperand(0).getReg() = Reg
which is not true when Reg is defined by instruction with multiple defs
e.g. G_UNMERGE_VALUES.
Fix is to keep register and the instruction that defines register in
DefinitionAndSourceRegister and use when needed.

Differential Revision: https://reviews.llvm.org/D117032
2022-01-12 17:47:25 +01:00
Matt Arsenault 5a434ceafb GlobalISel: Use cloneVirtualRegister in localizer 2022-01-11 16:10:12 -05:00
Matt Arsenault 0ba4e4b500 GlobalISel: Pass DebugLoc to getFunctionLiveInPhysReg
Fixes crash in assertion about dropping debug info.
2022-01-10 13:50:52 -05:00
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Jay Foad 50fb44eebb [GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine
Change CombinerHelper::matchBitfieldExtractFromShrAnd to use
getPreferredShiftAmountTy for the shift-amount-like operands of G_UBFX
just like all the other G_[SU]BFX combines do. This better matches the
AMDGPU legality rules for these instructions.

Differential Revision: https://reviews.llvm.org/D116803
2022-01-08 09:20:44 +00:00
Jay Foad ff971873b3 [GlobalISel] Fix legality checks for G_UBFX combines
1. Fix CombinerHelper::matchBitfieldExtractFromAnd to check legality
   with the correct types for the G_UBFX that it builds.
2. Fix AMDGPUTargetLowering::isConstantUnsignedBitfieldExtractLegal to
   match the legality rules: result and first operand can be s32 or s64
   but the "shift amount" operands are always s32.
3. Add AMDGPU tests where the post-legalizer combiner would create
   illegal MIR without the above fixes.

Differential Revision: https://reviews.llvm.org/D116802
2022-01-08 09:20:44 +00:00
Kazu Hirata b932bdf59f [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-07 17:45:09 -08:00
Jay Foad 3f3fe4a5cf [GlobalISel] Fix typo Extact to Extract in function name. NFC. 2022-01-07 11:13:35 +00:00
Nikita Popov e4d1779990 [IR] Add ConstraintInfo::hasArg() helper (NFC)
Checking whether a constraint corresponds to an argument is a
recurring pattern.
2022-01-07 10:44:38 +01:00
Kazu Hirata e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887e.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Kazu Hirata fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Petar Avramovic 508e39afe0 GlobalISel: remove redundant line added in D114198. NFC 2021-12-27 12:14:13 +01:00
Kazu Hirata 2d303e6781 Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-12-24 23:17:54 -08:00
Fangrui Song ea2d4c5881 [GlobalISel] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off builds after D114198 2021-12-24 00:55:54 -08:00
Petar Avramovic 29f88b93fd [GlobalISel] Rework more/fewer elements for vectors
Artifact combiner is not able to access individual elements after using
LCMTy style merge/unmerge, extract and insert to change vector number of
elements (pad with undef or split to sub-vector instructions).
Use unmerge to individual elements instead and then merge elements into
requested types.
Change argument lowering for vectors and moreElementsVector to use
buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements.
FewerElementsVector had a few helpers that had different behavior,
introduce new helper for most of the opcodes.
FewerElementsVector helper is more flexible since it can create leftover
instruction smaller then requested type (useful in case target wants to
avoid pad with undef and use fewer registers). If target does not want
leftover of different type it should call more elements first.
Some helpers were performing more elements first to have split without
leftover. Opcodes that used this helper use clampMaxNumElementsStrict
(does more elements first) in LegalizerInfo to avoid test changes.
Fixes failures caused by failing to combine artifacts created during
more/fewer elements vector.

Differential Revision: https://reviews.llvm.org/D114198
2021-12-23 14:30:02 +01:00
Konstantin Schwarz a344653725 [GlobalISel] Fix IRTranslator for constexpr fcmp
The existing code assumed fcmp to always be an Instruction, but it can also be a ConstExpr.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115450
2021-12-10 08:49:12 +01:00
Mircea Trofin 91a0da0142 [NFC] Rename MachineFunction::DeleteMachineBasicBlock
Renamed to conform to coding style
2021-12-08 18:12:51 -08:00
Jack Andersen f108c7f59d [GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues.
Expanding on D109750.

Since `DBG_VALUE` instructions have final register validity determined in
`LDVImpl::handleDebugValue`, there is no apparent reason to immediately prune
unused register operands as their defs are erased. Consequently, this renders
`MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval` moot; gaining a
substantial performance improvement.

The only necessary changes involve making relevant passes consider invalid
DBG_VALUE vregs uses as valid.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D112852
2021-12-05 15:55:59 -05:00
Kazu Hirata 3aed282257 [CodeGen] Use range-based for loops (NFC) 2021-12-03 20:45:59 -08:00
Abinav Puthan Purayil bc5dbb0bae [GlobalISel] Add matchers for constant splat.
This change exposes isBuildVectorConstantSplat() to the llvm namespace
and uses it to implement the constant splat versions of
m_SpecificICst().

CombinerHelper::matchOrShiftToFunnelShift() can now work with vector
types and CombinerHelper::matchMulOBy2()'s match for a constant splat is
simplified.

Differential Revision: https://reviews.llvm.org/D114625
2021-11-30 15:18:50 +05:30
Mirko Brkusanin 0dd570ff56 [AMDGPU][GlobalISel] Transform (fsub (fpext (fneg (fmul x, y))), z) -> (fneg (fma (fpext x), (fpext y), z))
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98050
2021-11-29 16:27:22 +01:00
Mirko Brkusanin 37c2a2201d [AMDGPU][GlobalISel] Transform (fsub (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), (fneg z))
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98049
2021-11-29 16:27:22 +01:00
Mirko Brkusanin 5fe7fcd28e [AMDGPU][GlobalISel] Transform (fsub (fneg (fmul, x, y)), z) -> (fma (fneg x), y, (fneg z))
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98048
2021-11-29 16:27:22 +01:00
Mirko Brkusanin a782169270 [AMDGPU][GlobalISel] Transform (fsub (fmul x, y), z) -> (fma x, y, -z)
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D96614
2021-11-29 16:27:22 +01:00
Mirko Brkusanin e5e49a08f1 [AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fpext (fmul u, v))), z) -> (fma x, y, (fma (fpext u), (fpext v), z))
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98047
2021-11-29 16:27:21 +01:00
Mirko Brkusanin f732292536 [AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z))
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D97938
2021-11-29 16:27:21 +01:00
Mirko Brkusanin 8951136216 [AMDGPU][GlobalISel] Transform (fadd (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), z)
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D97937
2021-11-29 16:27:21 +01:00
Mirko Brkusanin 881840fc26 [AMDGPU][GlobalISel] Transform (fadd (fmul x, y), z) -> (fma x, y, z)
Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D93305
2021-11-29 16:27:21 +01:00
Abinav Puthan Purayil 4af45f10cc [GlobalISel] Fold or of shifts to funnel shift.
This change folds a basic funnel shift idiom:
- (or (shl x, amt), (lshr y, sub(bw, amt))) -> fshl(x, y, amt)
- (or (shl x, sub(bw, amt)), (lshr y, amt)) -> fshr(x, y, amt)

This also helps in folding to rotate shift if x and y are equal since we
already have a funnel shift to rotate combine.

Differential Revision: https://reviews.llvm.org/D114499
2021-11-26 17:05:29 +05:30
Kazu Hirata 259cd6f893 [llvm] Use range-based for loops (NFC) 2021-11-25 22:17:10 -08:00
Kazu Hirata bfd5dd1568 [llvm] Use range-based for loops (NFC) 2021-11-25 08:55:16 -08:00
Jameson Nash 0332d105b9 GlobalISel: remove assert that memcpy Src and Dst addrspace must be identical
The LangRef does not require these arguments to have the same type.

Differential Revision: https://reviews.llvm.org/D93154
2021-11-24 20:23:05 -05:00
Zarko Todorovski 95875d246a [LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm
Part of work to use more inclusive language in clang/llvm. Rewording
some comments and change function and variable names.
2021-11-24 17:29:55 -05:00
Kazu Hirata d45cb1d7ea [llvm] Use range-based for loops (NFC) 2021-11-23 08:54:48 -08:00