Commit Graph

1680 Commits

Author SHA1 Message Date
Amara Emerson d7fed7b899 [AArch64][GlobalISel] Fall back if disabling neon/fp in the translator.
The previous technique relied on early-exiting the legalizer predicate
initialization, leaving an empty rule table. That causes a fallback
for most instructions, but some have legacy rules defined like G_ZEXT
which can try continue, but then crash.

We should fall back earlier, in the translator, to avoid this issue.

Differential Revision: https://reviews.llvm.org/D98730
2021-03-17 15:08:08 -07:00
Matt Arsenault 6b76d82853 GlobalISel: Fix marking byval arguments as immutable
byval arguments need to be assumed writable. Only implicitly stack
passed arguments which aren't addressable in the IR can be assumed
immutable.

Mips is still broken since for some reason its doing its own thing
with the ValueHandlers (and x86 doesn't actually handle byval
arguments now, although some of the code is there).
2021-03-12 09:01:53 -05:00
Matt Arsenault 34471c3060 GlobalISel: Partially fix handling of byval arguments
This was essentially ignoring byval and treating them as a pointer
argument which needed to be loaded from. This should copy the frame
index value to the virtual register, not insert a load from the frame
index into the pointer value.

For AMDGPU, this was producing a load from the byval pointer argument,
to a pointer used for the byval arguments. I do not understand how
AArch64 managed to work before since it appears to be similarly
broken.

We could also change the ValueHandler API to avoid the extra copy from
the frame index, since currently it returns a new register.

I believe there is still an issue with outgoing byval arguments. These
should have a copy inserted in case the callee decided to overwrite
the memory.
2021-03-12 09:01:53 -05:00
Matt Arsenault cf5ecd5644 GlobalISel: Fix off by one in finding explicit byval alignment
For attribute sets, the return index is at 0, and arguments start at
1. getParamAlignment adds the offset of 1, so we need to convert from
attribute index back to IR index.
2021-03-11 10:23:08 -05:00
Christudasan Devadasan 4c6ab48fb1 GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM
It is good to have a combined `divrem` instruction when the
`div` and `rem` are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.

Reviewed By: arsenm, paquette

Differential Revision: https://reviews.llvm.org/D96013
2021-03-10 18:46:07 +05:30
Amara Emerson 55e760769b [GlobalISel] Fold away G_BUILD_VECTOR with all elements extracted.
If every element is extracted from a G_BUILD_VECTOR, pass through the source
registers. This is different to the extract(build_vector) combine because this
one tolerates multiple users as long as they're exhaustive.

Differential Revision: https://reviews.llvm.org/D97890
2021-03-09 11:34:26 -08:00
Amara Emerson e60ab72137 [AArch64][GlobalISel] Add combine for extract_vector_elt(build_vector, cst)
Differential Revision: https://reviews.llvm.org/D97835
2021-03-09 11:08:02 -08:00
Jessica Paquette 5c26be214d [AArch64][GlobalISel] Lower G_BUILD_VECTOR -> G_DUP
If we have

```
%vec = G_BUILD_VECTOR %reg, %reg, ..., %reg
```

Then lower it to

```
%vec = G_DUP %reg
```

Also update the selector to handle constant splats on G_DUP.

This will not combine when the splat is all zeros or ones. Tablegen-imported
patterns rely on these being G_BUILD_VECTOR.

Minor code size improvements on CTMark at -Os.

Also adds some utility functions to make it a bit easier to recognize splats,
and an AArch64-specific splat helper.

Differential Revision: https://reviews.llvm.org/D97731
2021-03-08 13:01:10 -08:00
Petar Avramovic d44f61f81c Reland [GlobalISel] Combine zext(trunc x) to x
Recommit 4112299ee7. Depends on
4c8fb7ddd6 which was reverted.

Combine zext(trunc x) to x when truncated bits are known to be zero.

Differential Revision: https://reviews.llvm.org/D96031
2021-03-05 11:05:37 +01:00
Petar Avramovic d7834556b7 Reland [GlobalISel] Start using vectors in GISelKnownBits
This is recommit of 4c8fb7ddd6.
MIR in one unit test had mismatched types.

For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D96122
2021-03-04 21:47:13 +01:00
Nico Weber 59beb1ef6d Revert "[GlobalISel] Combine zext(trunc x) to x"
This reverts commit 4112299ee7.
Seems to depend on 4c8fb7ddd6 which
is being reverted.
2021-03-04 10:13:40 -05:00
Nico Weber 4b1015361c Revert "[GlobalISel] Start using vectors in GISelKnownBits"
This reverts commit 4c8fb7ddd6.
Breaks check-llvm everywhere, see https://reviews.llvm.org/D96122
2021-03-04 10:13:40 -05:00
Petar Avramovic 4112299ee7 [GlobalISel] Combine zext(trunc x) to x
Combine zext(trunc x) to x when truncated bits are known to be zero.

Differential Revision: https://reviews.llvm.org/D96031
2021-03-04 15:05:23 +01:00
Petar Avramovic 4c8fb7ddd6 [GlobalISel] Start using vectors in GISelKnownBits
For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D96122
2021-03-04 15:05:23 +01:00
Matt Arsenault 78dcff4841 GlobalISel: Add default implementation of assignValueToReg
Refactor insertion of the asserting ops. This enables using them for
AMDGPU.

This code should essentially be the same for every target. Mips, X86
and ARM all have different code there now, but this seems to be an
accident. The assignment functions are called with different types
than they would be in the DAG, so this is all likely an assortment of
hacks to get around that.
2021-03-03 09:29:53 -05:00
Matt Arsenault fd82cbcf7d GlobalISel: Merge and cleanup more AMDGPU call lowering code
This merges more AMDGPU ABI lowering code into the generic call
lowering. Start cleaning up by factoring away more of the pack/unpack
logic into the buildCopy{To|From}Parts functions. These could use more
improvement, and the SelectionDAG versions are significantly more
complex, and we'll eventually have to emulate all of those cases too.

This is mostly NFC, but does result in some minor instruction
reordering. It also removes some of the limitations with mismatched
sizes the old code had. However, similarly to the merge on the input,
this is forcing gfx6/gfx7 to use the gfx8+ ABI (which is what we
actually want, but SelectionDAG is stuck using the weird emergent
ABI).

This also changes the load/store size for stack passed EVTs for
AArch64, which makes it consistent with the DAG behavior.
2021-03-02 17:31:13 -05:00
Amara Emerson 8a316045ed [AArch64][GlobalISel] Enable use of the optsize predicate in the selector.
To do this while supporting the existing functionality in SelectionDAG of using
PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis
dependencies to the instruction selector pass.

Then, use the predicate to generate constant pool loads for f32 materialization,
if we're targeting optsize/minsize.

Differential Revision: https://reviews.llvm.org/D97732
2021-03-02 12:55:51 -08:00
Nikita Popov c35761db0f [GlobalISel] Bail on G_PHI narrowing of odd types (PR48188)
The current narrowing code for G_PHI can only handle the case
where the size is a multiple of the narrow size. If this is not
the case, fall back to SDAG instead of asserting.

Original patch by shepmaster.

Differential Revision: https://reviews.llvm.org/D92446
2021-03-01 23:30:50 +01:00
Matt Arsenault 0131498402 GlobalISel: Remove dead code
Generic code should probably not introduce G_INSERT/G_EXTRACT. The
mirror unpackRegs should also be removed, but AMDGPU still has a use
remaining which needs to be fixed.
2021-03-01 17:06:43 -05:00
Matt Arsenault 6c260d3bc0 GlobalISel: Move splitToValueTypes to generic code
I copied the nearly identical function from AArch64 into AMDGPU, so
fix this duplication.

Mips and X86 have their own more exotic versions which should be
removed. However replacing those is better left for a separate patch
since it requires other changes to avoid regressions.
2021-03-01 08:58:18 -05:00
James Y Knight 6de6455752 Use getAlign() on atomicrmw/cmpxchg instructions, now that it's available.
These locations were missed as part of adding alignment to the
instructions, and were still making their own alignment assumptions.
2021-02-26 15:06:15 -05:00
Jay Foad a6be26710b [GlobalISel] Make more use of replaceSingleDefInstWithReg. NFC. 2021-02-23 17:08:34 +00:00
Cassie Jones 8f956a5e8f [GlobalISel] Implement narrowScalar for SADDE/SSUBE/UADDE/USUBE
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96673
2021-02-22 19:59:36 -05:00
Cassie Jones e1532649cb [GlobalISel] Implement narrowScalar for SADDO/SSUBO
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96672
2021-02-22 19:59:36 -05:00
Cassie Jones c63b33b792 [GlobalISel] Implement narrowScalar for UADDO/USUBO
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96671
2021-02-22 19:59:35 -05:00
Amara Emerson 212d6a95ab [GloblalISel] Support lowering <3 x i8> arguments in multiple parts.
Differential Revision: https://reviews.llvm.org/D97086
2021-02-22 13:58:44 -08:00
Amara Emerson 69ce291bcc [AArch64][GlobalISel] Support lowering <1 x i8> arguments.
We don't yet have working codegen for the resulting unmerges, and if
we did it would probably be horrible.

Differential Revision: https://reviews.llvm.org/D97035
2021-02-22 13:58:44 -08:00
Kazu Hirata 0b417ba20f [CodeGen] Use range-based for loops (NFC) 2021-02-20 21:46:02 -08:00
Matt Arsenault 62d946e133 GlobalISel: Merge some AMDGPU ABI lowering code to generic code
AMDGPU currently has a lot of pre-processing code to pre-split
argument types into 32-bit pieces before passing it to the generic
code in handleAssignments. This is a bit sloppy and also requires some
overly fancy iterator work when building the calls. It's better if all
argument marshalling code is handled directly in
handleAssignments. This handles more situations like decomposing large
element vectors into sub-element sized pieces.

This should mostly be NFC, but does change the generated code by
shifting where the initial argument packing instructions are placed. I
think this is nicer looking, since it now emits the packing code
directly after the relevant copies, rather than after the copies for
the remaining arguments.

This doubles down on gfx6/gfx7 using the gfx8+ ABI for 16-bit
types. This is ultimately the better option, but incompatible with the
DAG. Fixing this requires more work, especially for f16.
2021-02-18 17:26:55 -05:00
Jessica Paquette e6064a6418 [GlobalISel] Implement computeKnownBits for G_ASSERT_SEXT
Implementation is the same as G_SEXT_INREG.

Differential Revision: https://reviews.llvm.org/D96899
2021-02-17 14:00:36 -08:00
Jessica Paquette 26fb036559 [GlobalISel] Implement computeNumSignBits for G_ASSERT_SEXT
Same implementation as G_SEXT_INREG.

Add a testcase to combine-sext-inreg for a concrete example, and a testcase
to KnownBitsTest.

Differential Revision: https://reviews.llvm.org/D96897
2021-02-17 13:53:17 -08:00
Jessica Paquette 60aa646441 [GlobalISel] Add G_ASSERT_SEXT
This adds a G_ASSERT_SEXT opcode, similar to G_ASSERT_ZEXT. This instruction
signifies that an operation was already sign extended from a smaller type.

This is useful for functions with sign-extended parameters.

E.g.

```
define void @foo(i16 signext %x) {
 ...
}
```

This adds verifier, regbankselect, and instruction selection support for
G_ASSERT_SEXT equivalent to G_ASSERT_ZEXT.

Differential Revision: https://reviews.llvm.org/D96890
2021-02-17 13:10:34 -08:00
Matt Arsenault 392e0fcfd1 GlobalISel: Handle arguments partially passed on the stack
The API is a bit awkward since you need to index into an array in the
passed struct. I guess an alternative would be to pass all of the
individual fields.
2021-02-15 17:06:14 -05:00
Cassie Jones 97a1cdb156 [GlobalISel] Disable vector types in narrowScalarAddSub
The implementation for vectors is broken and doesn't seem to be used by
anything. Explicitly remove support for them, they can be added again
later when they're properly implemented.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D95699
2021-02-14 18:06:32 -05:00
Cassie Jones 36246388ba [GlobalISel] Extract a narrowScalarAddSub method. NFC
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D95426
2021-02-14 18:06:32 -05:00
Kazu Hirata 905cf88d18 [CodeGen] Use range-based for loops (NFC) 2021-02-12 23:44:33 -08:00
Amara Emerson 5d6d9b63a3 [GlobalISel] Propagate extends through G_PHIs into the incoming value blocks.
This combine tries to do inter-block hoisting of extends of G_PHIs, into the
originating blocks of the phi's incoming value. The idea is to expose further
optimization opportunities that are normally obscured by the PHI.

Some basic heuristics, and a target hook for AArch64 is added, to allow tuning.
E.g. if the extend is used by a G_PTR_ADD, it doesn't perform this combine
since it may be folded into the addressing mode during selection.

There are very minor code size improvements on AArch64 -Os, but the real benefit
is that it unlocks optimizations like AArch64 conditional compares on some
benchmarks.

Differential Revision: https://reviews.llvm.org/D95703
2021-02-12 11:52:52 -08:00
Petar Avramovic f0d65f4096 AMDGPU/GlobalISel: Calculate isKnownNeverNaN for fminnum and fmaxnum
Implements same logis as in SelectionDAG.
G_FMINNUM_IEEE and G_FMAXNUM_IEEE are never SNaN by definition and
never NaN when one operand is known non-NaN and other known non-SNaN.
G_FMINNUM and G_FMAXNUM are never NaN/SNaN when one of the operands
is known non-NaN/SNaN.

Differential Revision: https://reviews.llvm.org/D91716
2021-02-12 17:14:34 +01:00
Petar Avramovic 122c649c98 AMDGPU/GlobalISel: Check values of constants in isKnownNeverNaN
Differential Revision: https://reviews.llvm.org/D91714
2021-02-12 17:14:34 +01:00
Amara Emerson de035c18cf [GlobalISel] Fix sext_inreg(load) combine to not move the originating load.
The builder was using the extend user as the insertion point, which meant that
we were incorrectly "moving" the load from its original position, and therefore
could violate memory operation ordering.
2021-02-11 19:27:09 -08:00
Matt Arsenault b72a23650f GlobalISel: Fix using wrong calling convention for callees
This was taking the calling convention from the parent function,
instead of the callee. Avoids regressions in a future patch when the
caller and callee have different type breakdowns.

For some reason AArch64's lowerFormalArguments seems to intentionally
ignore the parent isVarArg.
2021-02-09 13:48:56 -05:00
Matt Arsenault 87e280110d GlobalISel: Use correct calling convention in handleAssignments
This was using the calling convention of the calling function, not the
callee. Avoids regressions in a future patch.
2021-02-08 17:09:28 -05:00
Amara Emerson ec41ed5b1b [AArch64][GlobalISel] Support the 'returned' parameter attribute.
On AArch64 (which seems to be the only target that supports it), this
attribute allows codegen to avoid saving/restoring the value in x0
across a call.

Gives a 0.1% geomean -Os code size improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D96099
2021-02-08 12:47:39 -08:00
Kazu Hirata 5438e079b1 [GlobalISel] Use ListSeparator (NFC) 2021-02-04 21:18:04 -08:00
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Justin Bogner 62ce4b048f [GlobalISel] Combine narrowScalar of G_ADD and G_SUB. NFC
These two cases have identical implementations other than an
unreachable part of `G_ADD` that checks if the scalar we're narrowing
is a vector. Combining them to avoid unnecessary divergence.
2021-02-03 11:06:04 -08:00
Jessica Paquette 02d4b365bf [GlobalISel] Check if branches use the same MBB in matchOptBrCondByInvertingCond
If the G_BR + G_BRCOND in this combine use the same MBB, then it will infinite
loop. Don't allow that to happen.

Differential Revision: https://reviews.llvm.org/D95895
2021-02-02 15:38:48 -08:00
Jessica Paquette 4809663334 [GlobalISel] Make sure G_ASSERT_ZEXT's src ends up with the same rc as dst
When replacing the dst reg with the src reg, we need to make sure that we
propagate the dst reg's register class through to the src.

Otherwise, we aren't meeting the requirements for G_ASSERT_ZEXT, and so the
verifier will fail.

Differential Revision: https://reviews.llvm.org/D95708
2021-02-01 09:46:35 -08:00
Tim Northover c2b322fc19 GlobalISel: check type size before getZExtValue()ing it.
Otherwise getZExtValue() asserts.
2021-02-01 12:43:33 +00:00
xgupta 94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Jessica Paquette d6656c3b25 [GlobalISel] Remove hint instructions in generic InstructionSelect code.
I think every target will want to remove these in the same way. Rather than
making them all implement the same code, let's just put this in
InstructionSelect.

Differential Revision: https://reviews.llvm.org/D95652
2021-01-29 11:20:07 -08:00
Jay Foad 5cf6412a27 [GlobalISel] Fix modifying a G_OR without notifying the observer
Remove the call to setFlags in favour of creating the instruction with
the correct flags in the first place, so we don't have to explicitly
notify the observer.

Differential Revision: https://reviews.llvm.org/D95681
2021-01-29 16:32:24 +00:00
Jessica Paquette d5736a2746 [GlobalISel] Implement regbankselect for G_ASSERT_ZEXT
This adds generic regbankselect support for G_ASSERT_ZEXT.

It inherits whatever register bank the source was given, always, on all targets.

I think that at the point where we run into these, the source register bank
should be decided.

This also adds some AArch64-specific code which makes sure we can handle
G_ASSERT_ZEXT when deciding on register banks for G_STORE, G_PHI, ... etc.

Differential Revision: https://reviews.llvm.org/D95649
2021-01-28 16:56:14 -08:00
Jessica Paquette f19971d1de [GlobalISel] Implement computeKnownBits for G_ASSERT_ZEXT
It's the same as the ZEXT/TRUNC case, except SrcBitWidth is given by the
immediate operand.

Update KnownBitsTest.cpp and a MIR test for a concrete example.

Differential Revision: https://reviews.llvm.org/D95566
2021-01-28 16:34:34 -08:00
Jessica Paquette daffab1985 Recommit "[GlobalISel] Walk through hints in getDefIgnoringCopies et al"
Recommit of 4580acf675

`Opc = DefMI->getOpcode()` was in the wrong place.
2021-01-28 14:43:00 -08:00
Jessica Paquette dcb5b5f1f2 Revert "[GlobalISel] Walk through hints in getDefIgnoringCopies et al"
This reverts commit 4580acf675.

Reverting while looking into some test failures.
2021-01-28 14:37:57 -08:00
Jessica Paquette 4580acf675 [GlobalISel] Walk through hints in getDefIgnoringCopies et al
Treat hint instructions like G_ASSERT_ZEXT like COPY instructions in helpers
which walk through copies.

This ensures that instructions like G_ASSERT_ZEXT won't impact any optimizations
that rely on these helpers.

Differential Revision: https://reviews.llvm.org/D95577
2021-01-28 14:27:00 -08:00
Cassie Jones f22f4557a7 [GlobalISel] Implement widenScalar for carry-in add/sub
These are widened to a wider UADDE/USUBE, with the overflow value
unused, and with the same synthesis of a new overflow value as for the
O operations.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D95326
2021-01-28 17:06:24 -05:00
Jessica Paquette 24261729a4 [GlobalISel] Add G_ASSERT_ZEXT
This adds a generic opcode which communicates that a type has already been
zero-extended from a narrower type.

This is intended to be similar to AssertZext in SelectionDAG.

For example,

```
%x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16
```

Signifies that the top 48 bits of %x are known to be 0.

This is useful in cases like this:

```
define i1 @zeroext_param(i8 zeroext %x) {
  %cmp = icmp ult i8 %x, -20
  ret i1 %cmp
}
```

In AArch64, `%x` must use a 32-bit register, which is then truncated to a 8-bit
value.

If we know that `%x` is already zero-ed out in the relevant high bits, we can
avoid the truncate.

Currently, in GISel, this looks like this:

```
_zeroext_param:
  and w8, w0, #0xff ; We don't actually need this!
  cmp w8, #236
  cset w0, lo
  ret
```

While SDAG does not produce the truncation, since it knows that it's
unnecessary:

```
_zeroext_param:
  cmp w0, #236
  cset w0, lo
  ret
```

This patch

- Adds G_ASSERT_ZEXT
- Adds MIRBuilder support for it
- Adds MachineVerifier support for it
- Documents it

It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There
should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.)

This allows us to skip over hints in the legalizer etc. These can then later
be selected like COPY instructions or removed.

Differential Revision: https://reviews.llvm.org/D95564
2021-01-28 13:58:37 -08:00
Jessica Paquette f36007e811 [GlobalISel] Implement computeKnownBits for G_SEXT_INREG
Just use the existing `Known.sextInReg` implementation.

- Update KnownBitsTest.cpp.
- Update combine-redundant-and.mir for a more concrete example.

Differential Revision: https://reviews.llvm.org/D95484
2021-01-26 15:01:38 -08:00
Amara Emerson cbed865e1e [GlobalISel][IRTranslator] Ignore the llvm.experimental.noalias.scope.decl intrinsic.
These don't generate any code.
2021-01-26 13:04:11 -08:00
Amara Emerson 03bce0bf4e [GlobalISel][Localizer] Don't localize phi operands which are used more than once in the phi.
The current algorithm just tries to localize defs as far as they can go, and in
the case of G_PHI operands, it clones the def into the predecessor block for
each incoming edge. When multiple edges have the same register value, this can
cause unnecessary code bloat, and inhibit later optimizations.

This change checks if a given phi operand is unique in the phi, if not the
def of that register is not localized to the predecessor.

Differential Revision: https://reviews.llvm.org/D95406
2021-01-25 17:48:04 -08:00
Mitch Phillips c9466ede7e Revert "Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method""
This reverts commit 554b3211fe.

Differential Revision: https://reviews.llvm.org/D95035
2021-01-25 16:22:22 -08:00
Cassie Jones aa8f3677f7 Recommit "[AArch64][GlobalISel] Implement widenScalar for signed overflow"
Implement widening for G_SADDO and G_SSUBO.
Add legalize-add/sub tests for narrow overflowing add/sub on AArch64.

Differential Revision: https://reviews.llvm.org/D95034
2021-01-25 16:57:20 -05:00
Mitch Phillips e3a7532cc9 Revert "[AArch64][GlobalISel] Implement widenScalar for signed overflow"
This reverts commit 541d98efa2.

Reason: Dependent patch 3dedad475d broke
UBSan on Android: http://lab.llvm.org:8011/#/builders/77/builds/3082
2021-01-22 14:32:11 -08:00
Mitch Phillips 554b3211fe Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method"
This reverts commit 2bb92bf451.

Dependent patch broke UBSan on Android:
3dedad475d
2021-01-22 14:32:11 -08:00
Cassie Jones 2bb92bf451 [GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method
The widenScalar implementation for signed and unsigned overflowing
operations were very similar: both are checked by truncating the result
and then re-sign/zero-extending it and checking that it matches the
computed operation.

Using a truncate + zero-extend for the unsigned case instead of manually
producing the AND instruction like before leads to an extra copy
instruction during legalization, but this should be harmless.

Differential Revision: https://reviews.llvm.org/D95035
2021-01-22 14:08:46 -08:00
Cassie Jones 541d98efa2 [AArch64][GlobalISel] Implement widenScalar for signed overflow
Implement widening for G_SADDO and G_SSUBO. Previously it was only
implemented for G_UADDO and G_USUBO. Also add legalize-add/sub tests for
narrow overflowing add/sub on AArch64.

Differential Revision: https://reviews.llvm.org/D95034
2021-01-21 22:55:42 -08:00
Kazu Hirata c5c4dbd279 [CodeGen] Use llvm::append_range (NFC) 2021-01-21 19:59:46 -08:00
Matt Arsenault 35c535a7df AArch64/GlobalISel: Factor out parametersInCSRMatch
Make this look more like the DAG handling and move to common code.

I also noticed AArch64 seems to not be properly adding the
physreg:virtreg mapping to the function live ins.
2021-01-21 10:32:48 -05:00
Mirko Brkusanin a6a72dfdf2 [AMDGPU][GlobalISel] Avoid selecting S_PACK with constants
If constants are hidden behind G_ANYEXT we can treat them same way as G_SEXT.
For that purpose we extend getConstantVRegValWithLookThrough with option
to handle G_ANYEXT same way as G_SEXT.

Differential Revision: https://reviews.llvm.org/D92219
2021-01-20 11:54:53 +01:00
Gabriel Hjort Åkerlund 2aeaaf841b [GlobalISel] Add missing operand update when copy is required
When constraining an operand register using constrainOperandRegClass(),
the function may emit a COPY in case the provided register class does
not match the current operand register class. However, the operand
itself is not updated to make use of the COPY, thereby resulting in
incorrect code. This patch fixes that bug by updating the machine
operand accordingly.

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D91244
2021-01-20 10:32:52 +01:00
Jessica Paquette cbf5246359 Fix buildbot after cfc6073017
Windows buildbots were not happy with using find_if + instructionsWithoutDebug.

In cfc6073017, instructionsWithoutDebug is not technically necessary. So,
just iterate over the block directly.

http://lab.llvm.org:8011/#/builders/127/builds/4732/steps/7/logs/stdio
2021-01-19 10:38:04 -08:00
Jessica Paquette cfc6073017 [GlobalISel] Combine (a[0]) | (a[1] << k1) | ...| (a[m] << kn) into a wide load
This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`.
(See D27861)

This tries to recognize patterns like below (assuming a little-endian target):

```
s8* x = ...
s32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
->
s32 val = *((i32)a)

s8* x = ...
s32 val = a[3] | (a[2] << 8) | (a[1] << 16) | (a[0] << 24)
->
s32 val = BSWAP(*((s32)a))
```

(This patch also handles the big-endian target case as well, in which the first
example above has a BSWAP, and the second example above does not.)

To recognize the pattern, this searches from the last G_OR in the expression
tree.

E.g.

```
    Reg   Reg
     \    /
      OR_1   Reg
       \    /
        OR_2
          \     Reg
           .. /
          Root
```

Each non-OR register in the tree is put in a list. Each register in the list is
then checked to see if it's an appropriate load + shift logic.

If every register is a load + potentially a shift, the combine checks if those
loads + shifts, when OR'd together, are equivalent to a wide load (possibly with
a BSWAP.)

To simplify things, this patch

(1) Only handles G_ZEXTLOADs (which appear to be the common case)
(2) Only works in a single MachineBasicBlock
(3) Only handles G_SHL as the bit twiddling to stick the small load into a
    specific location

An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from
test/CodeGen/AArch64/load-combine.ll)

At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3,
and a 0.4% improvement for CTMark/7zip-benchmark.

Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was
the first instruction in the block.

Differential Revision: https://reviews.llvm.org/D94350
2021-01-19 10:24:27 -08:00
Jay Foad 517196e569 [Analysis,CodeGen] Make use of KnownBits::makeConstant. NFC.
Differential Revision: https://reviews.llvm.org/D94588
2021-01-14 14:02:43 +00:00
Matt Arsenault d55d592a92 GlobalISel: Do not set observer of MachineIRBuilder in LegalizerHelper
This fixes double printing of insertion debug messages in the
legalizer.

Try to cleanup usage of observers. Currently the use of observers is
pretty hard to follow and it's not clear what is responsible for
them. Observers are referenced in 3 places:

1. In the MachineFunction
2. In the MachineIRBuilder
3. In the LegalizerHelper

The observers in the MachineFunction and MachineIRBuilder are both
called only on insertions, and are redundant with each other. The
source of the double printing was the same observer was added to both
the MachineFunction, and the MachineIRBuilder. One of these references
needs to be removed. Arguably observers in general should be fully
removed from one or the other, but it may be useful to have a local
observer in the MachineIRBuilder that is not added to the function's
observers. Alternatively, the wrapper observer could manage a local
observer in one place.

The LegalizerHelper only ever calls the observer on changing/changed
instructions, and never insertions. Logically these are two different
types of observers, for changes and for insertions.

Additionally, some places used the GISelObserverWrapper when they only
needed a single observer they could use directly.

Setting the observer in the LegalizerHelper constructor is not
flexible enough if the LegalizerHelper is constructed anywhere outside
the one used by the legalizer. AMDGPU calls the LegalizerHelper in
RegBankSelect, and needs to use a local observer to apply the regbank
to newly created instructions. Currently it accomplishes this by
constructing a local MachineIRBuilder. I'm trying to move the
MachineIRBuilder to be owned/maintained by the RegBankSelect pass
itself, but the locally constructed LegalizerHelper would reset the
observer.

Mips also has a special case use of the LegalizationArtifactCombiner
in applyMappingImpl; I think we do need to run the artifact combiner
during RegBankSelect, but in a more consistent way outside of
applyMappingImpl.
2021-01-13 10:44:31 -05:00
Kazu Hirata 12fc9ca3a4 [llvm] Remove redundant string initialization (NFC)
Identified with readability-redundant-string-init.
2021-01-12 21:43:46 -08:00
Kazu Hirata e3d3dbd339 [llvm] Ensure newlines at the end of files (NFC)
This patch eliminates pesky "No newline at end of file" messages from
git diff.
2021-01-10 09:24:57 -08:00
Christudasan Devadasan ae25a397e9 AMDGPU/GlobalISel: Enable sret demotion 2021-01-08 10:56:35 +05:30
Matt Arsenault 2cbbc6e87c GlobalISel: Fail legalization on narrowing extload below memory size 2021-01-07 17:40:34 -05:00
Matt Arsenault 1f9b6ef91f GlobalISel: Add combine for G_UREM by power of 2
Really I want this in the legalizer, but this is a start.
2021-01-07 16:36:35 -05:00
Kazu Hirata cfeecdf7b6 [llvm] Use llvm::all_of (NFC) 2021-01-06 18:27:36 -08:00
Christudasan Devadasan d68458bd56 [GlobalISel] Base implementation for sret demotion.
If the return values can't be lowered to registers
SelectionDAG performs the sret demotion. This patch
contains the basic implementation for the same in
the GlobalISel pipeline.

Furthermore, targets should bring relevant changes
during lowerFormalArguments, lowerReturn and
lowerCall to make use of this feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92953
2021-01-06 10:30:50 +05:30
Matt Arsenault a427f15d60 GlobalISel: Add isKnownToBeAPowerOfTwo helper function 2021-01-05 12:59:08 -05:00
Juneyoung Lee 5cdf6ed744 [CodeGen] recognize select form of and/ors when splitting branch conditions
Recently a few patches are made to move towards using select i1 instead of and/or i1 to represent "a && b"/"a || b" in C/C++.
"a && b" in C/C++ does not evaluate b if a is false whereas 'and a, b' in IR evaluates b and uses its result regardless of the result of a.
This is problematic because it can cause miscompilation if b was an erroneous operation (https://llvm.org/pr48353).
In C/C++, the result is simply false because b is not evaluated, but in IR the result is poison.
The discussion at D93065 has more context about this.

This patch makes two branch-splitting optimizations (one in SelectionDAGBuilder, one in CodeGenPrepare) recognize
select form of and/or as well using m_LogicalAnd/Or.
Since it is CodeGen, I think this is semantically ok (at least as safe as what codegen already did).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93853
2021-01-01 04:46:10 +09:00
Amara Emerson 7df3544e80 [GlobalISel] Fix assertion failures after "GlobalISel: Return APInt from getConstantVRegVal" landed.
APInt binary ops don't promote types but instead assert, which a combine was
relying on.
2020-12-26 23:51:44 -08:00
Kazu Hirata df812115e3 [CodeGen, Transforms] Use llvm::any_of (NFC) 2020-12-24 09:08:36 -08:00
Matt Arsenault 581d13f8ae GlobalISel: Return APInt from getConstantVRegVal
Returning int64_t was arbitrarily limiting for wide integer types, and
the functions should handle the full generality of the IR.

Also changes the full form which returns the originally defined
vreg. Add another wrapper for the common case of just immediately
converting to int64_t (arguably this would be useful for the full
return value case as well).

One possible issue with this change is some of the existing uses did
break without conversion to getConstantVRegSExtVal, and it's possible
some without adequate test coverage are now broken.
2020-12-22 22:23:58 -05:00
Matt Arsenault e7e7d371fd GlobalISel: Fix generic handling of single outgoing call arguments
Simply call the argument handler like is done for the incoming
case. This will allow removal of hacks in the AMDGPU call lowering in
a future change.
2020-12-15 17:00:27 -05:00
Amara Emerson a69b76c500 [GlobalISel][IRTranslator] Ensure branch probabilities are added when translating invoke edges.
This uses a straightforward port of findUnwindDestinations() from SelectionDAG.

Differential Revision: https://reviews.llvm.org/D93256
2020-12-14 23:36:54 -08:00
Amara Emerson 21de99d43c [[GlobalISel][IRTranslator] Fix a crash when the use of an extractvalue is a non-dominated metadata use.
We don't expect uses to come before defs in the CFG, so allocateVRegs() asserted.

Fixes PR48211
2020-12-12 14:58:54 -08:00
Fangrui Song b5ad32ef5c Migrate deprecated DebugLoc::get to DILocation::get
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.

The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D93087
2020-12-11 12:45:22 -08:00
Fangrui Song d928dfc6f9 [GlobalISel] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2020-11-30 18:31:42 -08:00
Fangrui Song 36fe1a9dea [GlobalISel] Fix -Wunused-variable 2020-11-30 18:25:54 -08:00
Amara Emerson 87ff156414 [AArch64][GlobalISel] Fix crash during legalization of a vector G_SELECT with scalar mask.
The lowering of vector selects needs to first splat the scalar mask into a vector
first.

This was causing a crash when building oggenc in the test suite.

Differential Revision: https://reviews.llvm.org/D91655
2020-11-30 16:37:49 -08:00
Mirko Brkusanin 4cf6dd518e [AMDGPU][GlobalISel] Fix lowerShlSat
RegBankSelect would crash on G_SELECT when type is not s1.

Differential Revision: https://reviews.llvm.org/D91437
2020-11-16 17:43:31 +01:00
Jessica Paquette b184a2eccf [GlobalISel] Add matchers for specific constants and a matcher for negations
It's fairly common to need matchers for a specific constant value, or for
common idioms like finding a negated register.

Add

- `m_SpecificICst`, which returns true when matching a specific value..
- `m_ZeroInt`, which returns true when an integer 0 is matched.
- `m_Neg`, which returns when a register is negated.

Also update a few places which use idioms related to the new matchers.

Differential Revision: https://reviews.llvm.org/D91397
2020-11-13 09:24:54 -08:00
Matt Arsenault c67e1a985f GlobalISel: Directly expose getDefSrcRegIgnoringCopies utility
It's useful to get both the instruction and register at the same time.
2020-11-13 11:07:04 -05:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Simon Pilgrim 1a62ca65c1 [KnownBits] Add KnownBits::commonBits helper. NFCI.
We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........
2020-11-11 12:15:54 +00:00
Mirko Brkusanin a75d6178b8 [GlobalISel] Add combine for (x | mask) -> x when (x | mask) == x
If we have a mask, and a value x, where (x | mask) == x, we can drop the OR
and just use x.

Differential Revision: https://reviews.llvm.org/D90952
2020-11-10 11:32:13 +01:00
Mirko Brkusanin fb36ab0a42 [GlobalISel] Expand combine for (x & mask) -> x when (x & mask) == x
We can use KnownBitsAnalysis to cover cases when mask is not trivial. It can
also help with cases when mask is not constant but can still be folded into
one. Since 'and' is comutative we should treat both operands as possible
replacements.

Differential Revision: https://reviews.llvm.org/D90674
2020-11-10 11:32:13 +01:00
Mirko Brkusanin 53ae95c946 [AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands
This sequence of instructions can be simplified if they are single use and
some operands are constants. Additional combines may be applied afterwards.

Differential Revision: https://reviews.llvm.org/D90223
2020-11-10 11:32:13 +01:00
Mirko Brkusanin de719586a8 [AMDGPU][GlobalISel] Fold a chain of two shift instructions with constant operands
Sequence of same shift instructions with constant operands can be combined into
a single shift instruction.

Differential Revision: https://reviews.llvm.org/D90217
2020-11-10 11:32:12 +01:00
Simon Pilgrim 7fe7c6d3be [GlobalISel] Don't use Register type for getNumOperands(). NFCI.
Copy+Paste typo - we were storing getNumOperands() opcounts in a Register type instead of just an unsigned.
2020-11-05 17:12:58 +00:00
Simon Pilgrim 546d002d7a [GlobalISel] ComputeKnownBits - use common KnownBits shift handling (PR44526)
Convert GISelKnownBits.computeKnownBitsImpl shift handling to use the common KnownBits implementations, which makes use of the known leading/trailing bits for shifted values in cases where we don't know the shift amount value, as detailed in https://blog.regehr.org/archives/1709

Differential Revision: https://reviews.llvm.org/D90527
2020-11-05 11:52:26 +00:00
Simon Pilgrim b25765792b Revert rGbbeb08497ce58 "Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation""
Updated the GISel KnownBits tests as KnownBits::computeForMul allows more accurate computation.
2020-11-05 10:39:53 +00:00
Fangrui Song bbeb08497c Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation"
This reverts commit 0b8711e1af which broke GlobalISelTests AArch64GISelMITest.TestKnownBits
2020-11-04 09:54:04 -08:00
Simon Pilgrim 0b8711e1af [GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation
Avoid code duplication
2020-11-04 17:25:24 +00:00
Aditya Nandakumar bed8394047 [GISel]: Few InsertVecElt combines
https://reviews.llvm.org/D88060

This adds the following combines
1) build_vector formation from insert_vec_elts
2) insert_vec_elts (build_vector) -> build_vector
2020-10-28 12:27:07 -07:00
David Sherwood 35a531fb45 [SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents
In certain places in llvm/lib/CodeGen we were relying upon the TypeSize
comparison operators when in fact the code was only ever expecting
either scalar values or fixed width vectors. I've changed some of these
places to use the equivalent scalar operator.

Differential Revision: https://reviews.llvm.org/D88482
2020-10-19 08:30:31 +01:00
Amara Emerson 6042c25b0a [GlobalISel] Add translation support for vector reduction intrinsics.
In order to prevent the ExpandReductions pass from expanding some intrinsics
before they get to codegen, I had to add a -disable-expand-reductions flag
for testing purposes.

Differential Revision: https://reviews.llvm.org/D89028
2020-10-16 10:17:53 -07:00
Aditya Nandakumar ef3d17482f [GISel] Add combine for constant G_PTR_ADD offsets.
https://reviews.llvm.org/D88865

This adds a single combine for GlobalISel to fold:

ptradd (inttoptr C1) C2
Into:

C1 + C2
Additionally, a small test for AArch64 is added.

Patch by pnappa.
2020-10-13 17:26:12 -07:00
Mirko Brkusanin 52ba4fa6aa [GlobalISel] Avoid making G_PTR_ADD with nullptr
When the first operand is a null pointer we can avoid making a G_PTR_ADD and
make a G_INTTOPTR with the offset operand.
This helps us avoid making add with 0 later on for targets such as AMDGPU.

Differential Revision: https://reviews.llvm.org/D87140
2020-10-13 13:02:55 +02:00
Konstantin Schwarz 7341123439 [GlobalISel][KnownBits] Early return on out of bound shift amounts
If the known shift amount is bigger than or equal to the bitwidth of the type of the value to be shifted,
the result is target dependent, so don't try to infer any bits.

This fixes a crash we've seen in one of our internal test suites.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D89232
2020-10-12 18:39:19 +02:00
Quentin Colombet fd8275e04a [GlobalISel] Add missing pass dependencies for IRTranslator
The IRTranslator depends on the branch probability info pass when the
optimization level is different than None and it depends all the time on
the StackProtector pass.

We have to explicitly call out pass dependencies otherwise the pass manager
may not be able to schedule the IRTranslator.

Before this patch, we were lucky because previous passes depend on the branch
probability info pass (like the Global Variable Optimization) and the stack
protector pass is initialized in initializeCodeGen.
However, if the target has a custom pipeline without any passes like Global
Variable Optimization, the pipeline creation will fail, at least because of
the branch probability info pass dependency (it is unlikely that
initializeCodeGen is not called).

This patch adds the missing dependencies to the IRTranslator.

Differential Revision: https://reviews.llvm.org/D89063
2020-10-08 13:57:21 -07:00
Amara Emerson c2bce848ec [GlobalISel] Fix CSEMIRBuilder silently allowing use-before-def.
If a CSEMIRBuilder query hits the instruction at the current insert point,
move insert point ahead one so that subsequent uses of the builder don't end up with
uses before defs.

This fix also shows that AMDGPU was also affected by this bug often, but got away
with it because it was using a G_IMPLICIT_DEF before the use.

Differential Revision: https://reviews.llvm.org/D88605
2020-10-05 11:00:00 -07:00
Matt Arsenault 5aa1119537 GlobalISel: Assert if MoreElements uses a non-vector type 2020-09-30 10:36:00 -04:00
Gabriel Hjort Åkerlund 43d239d0fa [GlobalISel] Fix incorrect setting of ValNo when splitting
Before, for each original argument i, ValNo was set to i + PartIdx, but
ValNo is intended to reflect the index of the value before splitting.
Hence, ValNo should always be set to i and not consider the PartIdx.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86511
2020-09-30 16:08:51 +02:00
Amara Emerson 1d54e75cf2 [GlobalISel] Fix multiply with overflow intrinsics legalization generating invalid MIR.
During lowering of G_UMULO and friends, the previous code moved the builder's
insertion point to be after the legalizing instruction. When that happened, if
there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder
would try to find that constant during the buildConstant(zero) call, and since
it dominates itself would return the iterator unchanged, even though the def
of the constant was *after* the current insertion point. This resulted in the
compare being generated *before* the constant which it was using.

There's no need to modify the insertion point before building the mul-hi or
constant. Delaying moving the insert point ensures those are built/CSEd before
the G_ICMP is built.

Fixes PR47679

Differential Revision: https://reviews.llvm.org/D88514
2020-09-29 18:40:58 -07:00
Dominik Montada 113114a5da [GlobalISel] fix widenScalarUnmerge if widen type is not a multiple of destination type
Fix creation of illegal unmerge when widen was requested to a type which
is not a multiple of the destination type. E.g. when trying to widen
an s48 unmerge to s64 the existing code would create an illegal unmerge
from s64 to s48.

Instead, create further unmerges to a GCD type, then use this to remerge
these intermediate results to the actual destinations.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88422
2020-09-29 15:52:20 +02:00
Amara Emerson 082321909e [GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.

Differential Revision: https://reviews.llvm.org/D88364
2020-09-28 14:00:46 -07:00
Jessica Paquette a52e78012a [GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:

```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```

Produce this:

```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```

as long as we are guaranteed to eliminate the original G_AND.

Also matches all commuted forms. E.g.

```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```

will be matched as well.

Differential Revision: https://reviews.llvm.org/D88104
2020-09-28 10:08:14 -07:00
Matt Arsenault e75afc9acf GlobalISel: Use unmerge when copying wide vectors to result registers
Avoid using G_EXTRACT and move towards a more consistent vector
legalization strategy.
2020-09-24 15:19:51 -04:00
Pushpinder Singh 41d6669f1f [GlobalISel][AMDGPU] Lower G_SMULH/G_UMULH
Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D85653
2020-09-23 22:25:29 -04:00
Eli Friedman 3f739f736b [SelectionDAG][GISel] Make LegalizeDAG lower FNEG using integer ops.
Previously, if a floating-point type was legal, but FNEG wasn't legal,
we would use FSUB.  Instead, we should use integer ops, to preserve the
semantics.  (Alternatively, there's a compiler-rt call we could use, but
there isn't much reason to use that.)

It turns out we actually are still using this obscure codepath in a few
cases: on some targets, we have "legal" floating-point types that don't
actually support any floating-point operations.  In particular, ARM and
AArch64 are using this path.

The implementation for SelectionDAG is pretty simple because we can
reuse the infrastructure from FCOPYSIGN.

See also 9a3dc3e, the corresponding change to type legalization.

Also includes a "bonus" change to STRICT_FSUB legalization, so we can
lower a STRICT_FSUB to a float libcall.

Includes the changes to both LegalizeDAG and GlobalISel so we don't have
inconsistent results in the future.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46792 .

Differential Revision: https://reviews.llvm.org/D84287
2020-09-23 14:10:33 -07:00
Matt Arsenault c463fd136e GlobalISel: Fix truncating shift amount in trunc (shl) combine
The shift amount type does not necessarily match the result type. This
was inserting a trunc from s32 to s32, which asserted. Just preserve
the original shift amount type which can be legalized later.
2020-09-23 09:07:50 -04:00
Amara Emerson 5d34d7f1a0 [GlobalISel] Add lowering support for G_ABS and use for AArch64.
Differential Revision: https://reviews.llvm.org/D87952
2020-09-18 16:17:18 -07:00
Amara Emerson 79b21fc187 [AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors.
For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break
down the source vector into the final source vectors of <2 x s64> using unmerge.
This fixes a crash due to using the wrong number of elements for the breakdown
type.

Also add some legalizer tests for explicitly G_FPTRUNC which we didn't have.

Differential Revision: https://reviews.llvm.org/D87814
2020-09-17 08:56:26 -07:00
Matt Arsenault 88bdcbbf1a GlobalISel: Lift store value widening restriction
This doesn't change the memory size and doesn't need to worry about
non-power-of-2 sizes.
2020-09-16 14:25:07 -04:00
Michael Kitzan c4e589b795 [GISel] Add new combines for unary FP instrs with constant operand
https://reviews.llvm.org/D86393

Patch adds five new `GICombinerRules`, one for each of the following unary
FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The
combine rules perform the FP operation on the constant operand and replace
the original instr with the result. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
2020-09-16 10:34:15 -07:00
Volkan Keles 79378b1b75 GlobalISel: Fix a failing combiner test
test/CodeGen/AArch64/GlobalISel/combine-trunc.mir was failing
due to the different order for evaluating function arguments.
This patch updates the related code to fix the issue.
2020-09-15 16:40:38 -07:00
Aditya Nandakumar 97203cfd6b [GISel] Add new GISel combiners for G_MUL
https://reviews.llvm.org/D87668

Patch adds two new GICombinerRules, one for G_MUL(X, 1) and another for G_MUL(X, -1).
G_MUL(X, 1) is an identity combine, and G_MUL(X, -1) gets replaced with G_SUB(0, X).
Patch additionally adds new combiner tests for the AArch64 target to test these
new combiner rules, as well as updates AMDGPU GISel tests.

Patch by mkitzan
2020-09-15 16:08:47 -07:00
Volkan Keles a4e35cc2ec GlobalISel: Add combines for G_TRUNC
https://reviews.llvm.org/D87050
2020-09-15 15:50:34 -07:00
Petar Avramovic 9b4fa85434 GlobalISel/IRTranslator resetTargetOptions based on function attributes
Update TargetMachine.Options with function attributes before we start
to generate MIR instructions. This allows access to correct function
attributes via TargetMachine.Options (it used to access attributes of
the function that was translated first).
This affects some existing tests with "no-nans-fp-math" attribute.
Follow-up on D87456.

Differential Revision: https://reviews.llvm.org/D87511
2020-09-15 10:26:09 +02:00
Quentin Colombet b3afad0463 [GlobalISel] Add a `X, Y = G_UNMERGE(G_ZEXT Z)` -> X = G_ZEXT Z; Y = 0 combine
Add a combiner helper to transform unmerge of zext into one zext and
a constant 0

Differential Revision: https://reviews.llvm.org/D87427
2020-09-14 17:27:23 -07:00
Quentin Colombet d2321129bd [GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z
Add a combiner helper that replaces G_UNMERGE where all the destination lanes
are dead except the first one with a G_TRUNC.

Differential Revision: https://reviews.llvm.org/D87174
2020-09-14 17:27:23 -07:00
Quentin Colombet a36278c2f8 [GlobalISel] Add G_UNMERGE(Cst) -> Cst1, Cst2, ... combine
Add a combiner helper that replaces G_UNMERGE of big constants into direct
use of smaller constants.

Differential Revision: https://reviews.llvm.org/D87166
2020-09-14 16:30:18 -07:00
Aditya Nandakumar 46f9137e43 [GISel]: Add combine for G_FABS to G_FABS
https://reviews.llvm.org/D87554

Patch adds one new GICombinerRule for G_FABS. The combine rule folds G_FABS(G_FABS(X)) to G_FABS(X).
Patch additionally adds new combiner tests for the AArch64 target to test this new combiner rule.

Patch by mkitzan.
2020-09-14 15:56:24 -07:00
Quentin Colombet 670c276232 [GlobalISel] Add G_UNMERGE_VALUES(G_MERGE_VALUES) combine
Add the matching and applying function to the combiner helper for
G_UNMERGE_VALUES(G_MERGE_VALUES).

This combine also supports any merge-like input nodes, like G_BUILD_VECTORS
and is robust against bitcasts in between int unmerge and merge nodes.

When the input type of the merge node and the output type of the unmerge
node are not the same, but the sizes are, the combine still applies but
creates bitcasts between the sources and the destinations instead of
reusing the destinations directly.

Long term, the artifact combiner should probably reuse that helper, but
as of today, it doesn't use any outside helper, so I kept it this way.

Differential Revision: https://reviews.llvm.org/D87117
2020-09-14 15:45:06 -07:00
Petar Avramovic 6e2a86ed5a AMDGPU/GlobalISel Check for NoNaNsFPMath in isKnownNeverSNaN
Check for NoNaNsFPMath function attribute in isKnownNeverSNaN.
Function attributes are in held in 'TargetMachine.Options'.
Among other things, this allows selection of some patterns imported
in D87351 since G_FCANONICALIZE is not generated when isKnownNeverSNaN
returns true in lowerFMinNumMaxNum.

However we notice some incorrect results since function attributes are
not correctly written in TargetMachine.Options when next function is
processed. Take a look at @v_test_no_global_nnans_med3_f32_pat0_srcmod0,
it has "no-nans-fp-math"="false" but TargetMachine.Options still has it
set to true since first function in test file had this attribute set to
true. This will be fixed in D87511.

Differential Revision: https://reviews.llvm.org/D87456
2020-09-14 12:11:00 +02:00
Reid Kleckner 2c73bef7fa Fix wrong comment about enabling optimizations to work around a bug 2020-09-10 16:45:20 -07:00
Reid Kleckner 4e3edef4b8 Use pragmas to work around MSVC x86_32 debug miscompile bug
Halide users reported this here: https://llvm.org/pr46176
I reported the issue to MSVC here:
https://developercommunity.visualstudio.com/content/problem/1179643/msvc-copies-overaligned-non-trivially-copyable-par.html

This codepath is apparently not covered by LLVM's unit tests, so I added
coverage in a unit test.

If we want to support this configuration going forward, it means that is
in general not safe to pass a SmallVector<T, N> by value if alignof(T)
is greater than 4. This doesn't appear to come up often because passing
a SmallVector by value is inefficient and not idiomatic: it copies the
inline storage. In this case, the SmallVector<LLT,4> is captured by
value by a lambda, and the lambda is passed by value into std::function,
and that's how we hit the bug.

Differential Revision: https://reviews.llvm.org/D87475
2020-09-10 14:50:01 -07:00
Volkan Keles d4bf90271f GlobalISel: Combine fneg(fneg x) to x
https://reviews.llvm.org/D87473
2020-09-10 12:57:38 -07:00
Amara Emerson e5784ef8f6 [GlobalISel] Enable usage of BranchProbabilityInfo in IRTranslator.
We weren't using this before, so none of the MachineFunction CFG edges had the
branch probability information added. As a result, block placement later in the
pipeline was flying blind.

This is enabled only with optimizations enabled like SelectionDAG.

Differential Revision: https://reviews.llvm.org/D86824
2020-09-09 14:31:12 -07:00
Amara Emerson 467a071285 [GlobalISel][IRTranslator] Generate better conditional branch lowering.
This is a port of the functionality from SelectionDAG, which tries to find
a tree of conditions from compares that are then combined using OR or AND,
before using that result as the input to a branch. Instead of naively
lowering the code as is, this change converts that into a sequence of
conditional branches on the sub-expressions of the tree.

Like SelectionDAG, we re-use the case block codegen functionality from
the switch lowering utils, which causes us to generate some different code.
The result of which I've tried to mitigate in earlier combine patches.

Differential Revision: https://reviews.llvm.org/D86665
2020-09-09 13:16:11 -07:00
Amara Emerson cc76da7ada [GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less.
This combine previously tried to take sequences like:
  %cond = G_ICMP pred, a, b
  G_BRCOND %cond, %truebb
  G_BR %falsebb
%truebb:
  ...
%falsebb:
  ...

and by inverting the compare predicate and swapping branch targets, delete the
G_BR and instead have a single conditional branch to the falsebb. Since in an
earlier patch we have a combine to fold not(icmp) into just an inverted icmp,
we don't need this combine to do as much. This patch instead generalizes the
combine by just looking for:
  G_BRCOND %cond, %truebb
  G_BR %falsebb
%truebb:
  ...
%falsebb:
  ...

and then inverting the condition using a not (xor). The xor can be folded away
in a separate combine. This change also lets us avoid some optimization code
in the IRTranslator.

I also think that deleting G_BRs in the combiner is unnecessary. That's
something that targets can decide to do at selection time and could simplify
generic code in future.

Differential Revision: https://reviews.llvm.org/D86664
2020-09-09 13:08:16 -07:00
Volkan Keles 1242dd330d GlobalISel: Combine `op undef, x` to 0
https://reviews.llvm.org/D86611
2020-09-08 09:46:38 -07:00
Jay Foad 713c2ad60c [GlobalISel] Extend not_cmp_fold to work on conditional expressions
Differential Revision: https://reviews.llvm.org/D86709
2020-09-07 09:31:08 +01:00
Jay Foad 5350e1b509 [KnownBits] Implement accurate unsigned and signed max and min
Use the new implementation in ValueTracking, SelectionDAG and
GlobalISel.

Differential Revision: https://reviews.llvm.org/D87034
2020-09-07 09:09:01 +01:00
Amara Emerson d0abc75749 [GlobalISel] Disable the indexed loads combine completely unless forced. NFC.
The post-index matcher, before it queries the target legality, walks uses
of some instructions which in pathological cases can be massive. Since
no targets actually support indexed loads yet, disable this to stop wasting
compile time on something which is going to fail anyway.
2020-09-05 21:04:03 -07:00
Simon Pilgrim 898e42db93 GlobalISel/Utils.h - remove unused includes. NFCI.
Twine is unused, and TargetLowering can be reduced to a forward declaration and moved to Utils.cpp
2020-09-03 15:59:12 +01:00
Sander de Smalen f13beac51b [AArch64][SVE] Preserve full vector regs over EH edge.
Unwinders may only preserve the lower 64bits of Neon and SVE registers,
as only the registers in the base ABI are guaranteed to be preserved
over the exception edge. The caller will need to preserve additional
registers for when the call throws an exception and the unwinder has
tried to recover state.

For  e.g.

    svint32_t bar(svint32_t);
    svint32_t foo(svint32_t x, bool *err) {
      try { bar(x); } catch (...) { *err = true; }
      return x;
    }

`z0` needs to be spilled before the call to `bar(x)` and reloaded before
returning from foo, as the exception handler may have clobbered z0.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84737
2020-09-02 10:54:18 +01:00
Amara Emerson 520ab710fb Revert "Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")"
This reverts commit 8693ddc743.

Re-committing with the test requiring asserts.
2020-09-01 14:29:04 -07:00
Jordan Rupprecht 8693ddc743 Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")
This reverts commit 8ad8f484b6. It causes crashes when running `ninja check-llvm-codegen-aarch64-globalisel`, e.g.
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132/steps/test-stage1-compiler/logs/stdio.
Note that the crash does not seem to reproduce in debug builds.

5ded444252 depends on this, so revert that too.
2020-09-01 13:31:57 -07:00
Amara Emerson 8ad8f484b6 [GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)
This is needed for an upcoming change to how we translate conditional branches
which might generate these.

Differential Revision: https://reviews.llvm.org/D86383
2020-09-01 10:57:17 -07:00
Matt Arsenault 32a8a10b42 GlobalISel: Implement computeNumSignBits for G_SELECT 2020-09-01 12:50:19 -04:00
Matt Arsenault 35c94d3f7e GlobalISel: Port smarter known bits for umin/umax from DAG 2020-09-01 12:50:15 -04:00
Matt Arsenault 759482ddaa GlobalISel: Implement computeKnownBits for G_BSWAP and G_BITREVERSE 2020-09-01 12:49:57 -04:00
Volkan Keles 061182b7ba GlobalISel: Add combines for extend operations
https://reviews.llvm.org/D86516
2020-09-01 08:50:06 -07:00
Matt Arsenault 9e7e1b2d4b GlobalISel: Implement computeNumSignBits for G_SEXTLOAD/G_ZEXTLOAD 2020-09-01 11:20:02 -04:00
Matt Arsenault 92090e8bd8 GlobalISel: Implement computeKnownBits for G_UNMERGE_VALUES 2020-09-01 11:19:27 -04:00
Matt Arsenault 1b201914b5 GlobalISel: Combine out redundant sext_inreg
The scalar tests don't work yet, since computeNumSignBits apparently
doesn't handle sextload yet, and sext folds into the load first.
2020-08-28 17:57:31 -04:00
Yonghong Song 443d352a1c [GlobalISel] fix a compilation error with gcc 6.3.0
With gcc 6.3.0, I hit the following compilation error:
  ../lib/CodeGen/GlobalISel/Combiner.cpp: In member function
      ‘bool llvm::Combiner::combineMachineInstrs(llvm::MachineFunction&,
       llvm::GISelCSEInfo*)’:
  ../lib/CodeGen/GlobalISel/Combiner.cpp:156:54: error: suggest parentheses
       around ‘&&’ within ‘||’ [-Werror=parentheses]
     assert(!CSEInfo || !errorToBool(CSEInfo->verify()) &&
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
                            "CSEInfo is not consistent. Likely missing calls to "
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                            "observer on mutations");

Fix the code as suggested by the compiler.
2020-08-28 09:16:52 -07:00
Matt Arsenault 5feca7c9c3 GlobalISel: Implement computeNumSignBits for G_SEXT_INREG 2020-08-27 19:44:37 -04:00
Matt Arsenault f08bbde83f Correctly revert "GlobalISel: Use & operator on KnownBits"
I mis-resolved the revert through moving the code to another function.
2020-08-27 19:08:31 -04:00
Matt Arsenault 6cf4f25670 Revert "GlobalISel: Use & operator on KnownBits"
This reverts commit e53b799779.

Confusingly, this does not simply and the two sets of known bits, but
implements known bits for the and operator.
2020-08-27 18:52:34 -04:00
Matt Arsenault abc99ab572 GlobalISel: Implement known bits for min/max 2020-08-27 16:56:17 -04:00
Matt Arsenault e53b799779 GlobalISel: Use & operator on KnownBits
Avoid repeating for zero and one
2020-08-27 14:07:18 -04:00
Matt Arsenault 531f7063ba GlobalISel: Implement known bits for G_MERGE_VALUES 2020-08-27 14:07:18 -04:00
Aditya Nandakumar db464a3dbf [GISel] Add new GISel combiners for G_SELECT
https://reviews.llvm.org/D83833

Patch adds two new GICombinerRules for G_SELECT. The rules include:
combining selects with undef comparisons into their first selectee value,
and to combine away selects with constant comparisons. Patch additionally
adds a new combiner test for the AArch64 target to test these new G_SELECT
combiner rules and the existing select_same_val combiner rule.

Patch by  mkitzan
2020-08-27 09:40:15 -07:00
Aditya Nandakumar 5c2db1655b [GISel]: Fix one more CSE Non determinism
https://reviews.llvm.org/D86676

Sometimes we can have the following code

 x:gpr(s32) = G_OP

Say we build G_OP2 to the same x and then delete the previous instruction. Using something like

 Register X = ...;
 auto NewMIB = CSEBuilder.buildOp2(X, ... args);

Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (ie it doesn't consider register bank/register class along with type).Unify the profiling by refactoring and calling the common method.

This was found by turning on the CSEInfo::verify in at the end of each of our GISel passes which turns inconsistent state/non determinism in CSEing into crashes which likely usually indicates missing calls to Observer on mutations (the most common case). Here non determinism usually means not cseing sometimes, but almost never about producing incorrect code.
Also this patch adds this verification at the end of the combiners as well.
2020-08-27 09:06:21 -07:00
Matt Arsenault 5207545a86 GlobalISel: IRTranslate minimum of pointer sizes on memcpy
I forgot to squash this with 0b7f6cc71a
2020-08-26 20:10:00 -04:00
Matt Arsenault 0b7f6cc71a GlobalISel: Add generic instructions for memory intrinsics
AArch64, X86 and Mips currently directly consumes these and custom
lowering to produce a libcall, but really these should follow the
normal legalization process through the libcall/lower action.
2020-08-26 20:08:45 -04:00
Matt Arsenault eb074088c9 GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD
This produces less work for addressing mode matching. I think this is
safe since I don't think machine IR is supposed to give the same
aliasing properties as getelementptr in the IR.
2020-08-26 08:57:15 -04:00
Matt Arsenault 517caca359 GlobalISel: Improve dead instruction debug printing
This was printing the "Is dead" on a separate line from the
instruction, which was harder to follow.
2020-08-24 10:12:00 -04:00
Matt Arsenault e1644a3779 GlobalISel: Reduce G_SHL width if source is extension
shl ([sza]ext x, y) => zext (shl x, y).

Turns expensive 64 bit shifts into 32 bit if it does not overflow the
source type:

This is a port of an AMDGPU DAG combine added in
5fa289f0d8. InstCombine does this
already, but we need to do it again here to apply it to shifts
introduced for lowered getelementptrs. This will help matching
addressing modes that use 32-bit offsets in a future patch.

TableGen annoyingly assumes only a single match data operand, so
introduce a reusable struct. However, this still requires defining a
separate GIMatchData for every combine which is still annoying.

Adds a morally equivalent function to the existing
getShiftAmountTy. Without this, we would have to do try to repeatedly
query the legalizer info and guess at what type to use for the shift.
2020-08-24 09:42:40 -04:00
Matt Arsenault 901e3317fe GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS
This switches from using G_EXTRACT in odd cases to widen with undef
and unmerge.
2020-08-22 10:25:53 -04:00
Justin Bogner 1283dca007 [GISel] Correct the known bits of G_ANYEXT
Known bits for G_ANYEXT was incorrectly using KnownBits::zext, causing
us to treat the high bits as zero even though they're (by definition)
unknown.

Differential Revision: https://reviews.llvm.org/D86323
2020-08-20 17:17:04 -07:00
Konstantin Schwarz 7497b861f4 [GlobalISel][IRTranslator] Support PHI instructions in landingpad blocks
The check for the landingpad instructions was overly restrictive. In optimimized builds PHI nodes can appear
before the landingpad instructions, resulting in a fallback to SelectionDAG.

This change relaxes the check to allow PHI nodes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86141
2020-08-20 10:49:31 +02:00
Matt Arsenault 31adc28d24 GlobalISel: Implement fewerElementsVector for G_CONCAT_VECTORS sources
This fixes <6 x s16> = G_CONCAT_VECTORS from <3 x s16> handling.
2020-08-19 18:53:24 -04:00
Matt Arsenault adbcc8e733 GlobalISel: Add TargetLowering member to LegalizerHelper 2020-08-19 14:50:35 -04:00
Matt Arsenault d64ad3f051 GlobalISel: Don't check for verifier enforced constraint
Loads are always required to have a single memory operand.
2020-08-19 14:15:38 -04:00
Matt Arsenault e95c08432a GlobalISel: Use Register 2020-08-19 13:45:31 -04:00
Jessica Paquette d25b12bdc3 [GlobalISel] Add combine for (x & mask) -> x when (x & mask) == x
If we have a mask, and a value x, where (x & mask) == x, we can drop the AND
and just use x.

This is about a 0.4% geomean code size improvement on CTMark at -O3 for AArch64.

In AArch64, this is most useful post-legalization. Patterns like this often
show up when legalizing s1s, which must be extended to larger types.

e.g.

```
%cmp:_(s32) = G_ICMP ...
%and:_(s32) = G_AND %cmp, 1
```

Since G_ICMP only produces a single bit, there's no reason to mask it with the
G_AND.

Differential Revision: https://reviews.llvm.org/D85463
2020-08-19 10:20:57 -07:00
Amara Emerson ed35344524 Use std::make_tuple instead of initializer lists to make a bot happy:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux
2020-08-18 14:55:52 -07:00
Jessica Paquette bf36e90295 [GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList
It's annoying to have to maintain multiple, nearly identical chains of if
statements which all set the same attributes.

Add a helper function, `addFlagsUsingAttrFn` which performs the attribute
setting.

Then, use wrappers for that function in `lowerCall` and `setArgFlags`.

(Note that the flag-setting code in `setArgFlags` was missing the returned
attribute. There's no selection for this yet, so no test. It's an example of
the kind of thing this lets us avoid, though.)

Differential Revision: https://reviews.llvm.org/D86159
2020-08-18 11:07:33 -07:00
Jessica Paquette f29e6277ad [GlobalISel][CallLowering] Don't tail call with non-forwarded explicit sret
Similar to this commit:

faf8065a99

Testcase is pretty much the same as

test/CodeGen/AArch64/tailcall-explicit-sret.ll

Except it uses i64 (since we don't handle the i1024 return values yet), and
doesn't have indirect tail call testcases (because we can't translate those
yet).

Differential Revision: https://reviews.llvm.org/D86148
2020-08-18 11:06:57 -07:00
Matt Arsenault 5a15f6628e GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT
Add unit tests since AMDGPU will only trigger this for gigantic
vectors, and won't use the annoying odd sized breakdown case.
2020-08-18 13:51:19 -04:00
Amara Emerson 04a6ea5d77 [GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x
This is restricted to single use loads, which if we fold to sextloads we can
find more optimal addressing modes on AArch64.

This also fixes an overload the MachineFunction::getMachineMemOperand() method
which was incorrectly using the MF alignment instead of the MMO alignment.

Differential Revision: https://reviews.llvm.org/D85966
2020-08-18 10:42:15 -07:00
Amara Emerson 40e269ea6d [GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c'
By detecting this sign extend pattern early, we can uncover opportunities for
more optimizations.

Differential Revision: https://reviews.llvm.org/D85965
2020-08-18 10:42:15 -07:00
Jessica Paquette 224a8c639e [GlobalISel][CallLowering] Look through call parameters for flags
We weren't looking through the parameters on calls at all.

E.g., say you had

```
declare i32 @zext(i32 zeroext %x)

...
%y = call i32 @zext(i32 %something)
...

```

At the point of the call, we wouldn't know that the %something should have the
zeroext attribute.

This sets flags in about the same way as
TargetLoweringBase::ArgListEntry::setAttributes.

Differential Revision: https://reviews.llvm.org/D86125
2020-08-18 08:48:56 -07:00
Matt Arsenault a128292b90 GlobalISel: Make type for lower action more consistently optional
Some of the lower implementations were relying on this, however the
type was not set depending on which form .lower* helper form you were
using. For instance, if you used an unconditonal lower(), the type was
never set. Most of the lower actions do not benefit from a type
parameter, and just expand in terms of the original operation's types.

However, some lowerings could benefit from an additional type hint to
combine a promotion and an expansion. An example of this is for
add/sub sat. The DAG integer legalization tries to use smarter
expansions directly when promoting the integer type, and doesn't
always produce the same instruction with a wider type.

Treat this as an optional hint argument, that only means something for
specific lower actions. It may be useful to generalize this mechanism
to pass a full list of type indexes and desired types, but I haven't
run into a case like that yet.
2020-08-17 16:24:55 -04:00
Matt Arsenault a275acc4a9 GlobalISel: Early continue to reduce loop indentation 2020-08-17 13:51:08 -04:00
Matt Arsenault 924f31bc3c GlobalISel: Remove unnecessary check for copy type
COPY isn't allowed to change the type, but can mix no type with type.
2020-08-17 09:19:25 -04:00
Matt Arsenault 04a288f0f0 GlobalISel: Remove unnecessary llvm:: 2020-08-15 12:12:50 -04:00
Matt Arsenault 5c5e6d951e TableGen/GlobalISel: Partially handle immAllOnesV/immAllZerosV
These should really match either G_BUILD_VECTOR or
G_BUILD_VECTOR_TRUNC, but there doesn't seem to be an existing
mechanism for matching alternative opcodes. There is GIM_SwitchOpcode,
but it seems to assume it's oly only used for matcher optimization.

I could also omit any opcode check and rely on the matcher directly
checking the opcode, but the table optimizer currently assumes there
has to be an opcode check.

Also doesn't try to handle undef elements like the DAG version.
2020-08-14 13:55:30 -04:00
Amara Emerson 2ff14957e8 [GlobalISel] Implement bit-test switch table optimization.
This is mostly a straight port from SelectionDAG. We re-use the actual bit-test
analysis part from SwitchLoweringUtils, which was factored out earlier to
support jump-tables.

Differential Revision: https://reviews.llvm.org/D85233
2020-08-12 11:31:39 -07:00
Jessica Paquette bebe6a6449 [GlobalISel] Combine (logic_op (op x...), (op y...)) -> (op (logic_op x, y))
This implements

```
(logic_op (op x...), (op y...)) -> (op (logic_op x, y))
```

when `op` is an extend, a shift, or an and.

This is similar to `DAGCombiner::hoistLogicOpWithSameOpcodeHands`
(with a bunch of missing cases, e.g. G_TRUNC, G_BITCAST, etc.)

This is implemented so it works both pre and post-legalization.

This also adds a general way to add a series of instructions in a combine.
(`applyBuildInstructionSteps`).

Differential Revision: https://reviews.llvm.org/D85050
2020-08-11 10:40:06 -07:00
Jay Foad fa2b836ea3 [GlobalISel] Add G_ABS
This is equivalent to the new llvm.abs intrinsic added by D84125 with
is_int_min_poison=0.

Differential Revision: https://reviews.llvm.org/D85718
2020-08-11 16:34:37 +01:00