Commit Graph

3643 Commits

Author SHA1 Message Date
Benjamin Kramer df4b9a3f4f Hide implementation details in namespaces.
llvm-svn: 372113
2019-09-17 12:56:29 +00:00
Graham Hunter 1a9195d817 [SVE][MVT] Fixed-length vector MVT ranges
* Reordered MVT simple types to group scalable vector types
    together.
  * New range functions in MachineValueType.h to only iterate over
    the fixed-length int/fp vector types.
  * Stopped backends which don't support scalable vector types from
    iterating over scalable types.

Reviewers: sdesmalen, greened

Reviewed By: greened

Differential Revision: https://reviews.llvm.org/D66339

llvm-svn: 372099
2019-09-17 10:19:23 +00:00
Kerry McLaughlin e55b3bf40e [SVE][Inline-Asm] Add constraints for SVE predicate registers
Summary:
Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin

Reviewed By: rovka

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66524

llvm-svn: 371967
2019-09-16 09:45:27 +00:00
Sjoerd Meijer b1e1a26e8e [AArch64] Some more FP16 FMA pattern matching
After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we
were still missing a few FP16 FMA patterns.

Differential Revision: https://reviews.llvm.org/D67576

llvm-svn: 371960
2019-09-16 07:32:13 +00:00
Jessica Paquette 727328ab63 [AArch64][GlobalISel] Tail call memory intrinsics
Because memory intrinsics are handled differently than other calls, we need to
check them for tail call eligiblity in the legalizer. This allows us to still
inline them when it's beneficial to do so, but also tail call when possible.

This adds simple tail calling support for when the intrinsic is followed by a
return.

It ports the attribute checks from `TargetLowering::isInTailCallPosition` into
a similarly-named function in LegalizerHelper.cpp. The target-specific
`isUsedByReturnOnly` hook is not ported here.

Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call
memory intrinsics.

Update legalize-memcpy-et-al.mir to have a case where we don't tail call.

Differential Revision: https://reviews.llvm.org/D67566

llvm-svn: 371893
2019-09-13 20:25:58 +00:00
Sebastian Pop d93e136be1 [aarch64] move custom isel of extract_vector_elt to td file - NFC
In preparation for def-pat selection of dot product instructions,
this patch moves the custom instruction selection of extract_vector_elt
to the td file. Without this change it is impossible to catch a pattern that
starts with an extract_vector_elt: the custom cpp code is executed first
ahead of the patterns in the td files that are only executed at the end of
the switch statement in SelectCode(Node).

With this patch applied, it becomes possible to select a different pattern
that starts with extract_vector_elt by selecting a higher complexity than
this pattern.

The patch has been tested on aarch64-linux with make check-all.

Differential Revision: https://reviews.llvm.org/D67497

llvm-svn: 371887
2019-09-13 19:28:30 +00:00
Tim Northover 52a89cc07d AArch64: fix EXPENSIVE_CHECKS for arm64_32.
For some reason I'd decided to mark the end-result of a GOT load as
dead. It's clearly not (necessarily).

llvm-svn: 371883
2019-09-13 18:55:38 +00:00
Jessica Paquette 14bfb56b1a [AArch64][GlobalISel] Add support for sibcalling callees with varargs
This adds support for tail calling callees with varargs, equivalent to how it
is done in AArch64ISelLowering.

This only works for sibling calls, and does not add the necessary support for
musttail with varargs. (See r345641 for equivalent ISelLowering support.) This
should be implemented when we stop falling back on musttail.

Update call-translator-tail-call.ll to show that we can now tail call varargs.

Differential Revision: https://reviews.llvm.org/D67518

llvm-svn: 371868
2019-09-13 16:10:19 +00:00
Sjoerd Meijer 395a86731d [AArch64] MachineCombiner FMA matching. NFC.
Follow-up of rL371321 that added some more FP16 FMA patterns, and an attempt to
reduce the copy-pasting and make this more readable.

Differential Revision: https://reviews.llvm.org/D67403

llvm-svn: 371818
2019-09-13 07:38:54 +00:00
Jessica Paquette 0c283cb504 [AArch64][GlobalISel] Support tail calling with swiftself parameters
Swiftself uses a callee-saved register. We can tail call when the register used
in the caller and callee is the same.

This behaviour is equivalent to that in `TargetLowering::parametersInCSRMatch`.

Update call-translator-tail-call.ll to verify that we can do this. When we
support inline assembly, we can write a check similar to the one in the
general swiftself.ll. For now, we need to verify that we get the correct COPY
instruction after call lowering.

Differential Revision: https://reviews.llvm.org/D67511

llvm-svn: 371788
2019-09-12 23:00:59 +00:00
Jessica Paquette a42070a6aa [AArch64][GlobalISel] Support sibling calls with outgoing arguments
This adds support for lowering sibling calls with outgoing arguments.

e.g

```
define void @foo(i32 %a)
```

Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`.
The only thing that is missing is a full port of
`TargetLowering::parametersInCSRMatch`. So, if we're using swiftself,
we'll never tail call.

- Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used
  for both outgoing and incoming arguments
- Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for
  stack arguments.
- Teach `lowerFormalArguments` to set the bytes in the caller's stack argument
  area. This is used later to check if the tail call's parameters will fit on
  the caller's stack.
- Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on
  the callee's outgoing arguments.

For testing:

- Update call-translator-tail-call to verify that we can now tail call with
  outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the
  size of the caller's stack
- Remove GISel-specific check lines from speculation-hardening.ll, since GISel
  now tail calls like the other selectors
- Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that
  test now
- Add a GISel test line to tailcall_misched_graph.ll since we tail call there
  now. Add specific check lines for GISel, since the debug output from the
  machine-scheduler differs with GlobalISel. The dependency still holds, but
  the output comes out in a different order.

Differential Revision: https://reviews.llvm.org/D67471

llvm-svn: 371780
2019-09-12 22:10:36 +00:00
Tim Northover f1c2892912 AArch64: support arm64_32, an ILP32 slice for watchOS.
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.

llvm-svn: 371722
2019-09-12 10:22:23 +00:00
Jessica Paquette e297ad1bd9 [GlobalISel][AArch64] Check caller for swifterror params in tailcall eligibility
Before, we only checked the callee for swifterror. However, we should also be
checking the caller to see if it has a swifterror parameter.

Since we don't currently handle outgoing arguments, this didn't show up in the
swifterror.ll testcase.

Also, remove the swifterror checks from call-translator-tail-call.ll, since
they are covered by the existing swifterror testing. Better to have it all in
one place.

Differential Revision: https://reviews.llvm.org/D67465

llvm-svn: 371692
2019-09-11 23:44:16 +00:00
Guillaume Chatelet 97264366fb [Alignment][NFC] use llvm::Align for AsmPrinter::EmitAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: dschuff, sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67443

llvm-svn: 371616
2019-09-11 13:37:35 +00:00
Jessica Paquette 469d42fcf6 [GlobalISel] When a tail call is emitted in a block, stop translating it
This fixes a crash in tail call translation caused by assume and lifetime_end
intrinsics.

It's possible to have instructions other than a return after a tail call which
will still have `Analysis::isInTailCallPosition` return true. (Namely,
lifetime_end and assume intrinsics.)

If we emit a tail call, we should stop translating instructions in the block.
Otherwise, we can end up emitting an extra return, or dead instructions in
general. This makes the verifier unhappy, and is generally unfortunate for
codegen.

This also removes the code from AArch64CallLowering that checks if we have a
tail call when lowering a return. This is covered by the new code now.

Also update call-translator-tail-call.ll to show that we now properly tail call
in the presence of lifetime_end and assume.

Differential Revision: https://reviews.llvm.org/D67415

llvm-svn: 371572
2019-09-10 23:34:45 +00:00
Jessica Paquette 2af5b193d5 [AArch64][GlobalISel] Support sibling calls with mismatched calling conventions
Add support for sibcalling calls whose calling convention differs from the
caller's.

- Port over `CCState::resultsCombatible` from CallingConvLower.cpp into
  CallLowering. This is used to verify that the way the caller and callee CC
  handle incoming arguments matches up.

- Add `CallLowering::analyzeCallResult`. This is basically a port of
  `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`.

- Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks
  that the calling conventions are compatible, and that the caller and callee
  preserve the same registers.

For testing:

- Update call-translator-tail-call.ll to show that we can now handle this.

- Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call
  when the regmasks don't line up.

Differential Revision: https://reviews.llvm.org/D67361

llvm-svn: 371570
2019-09-10 23:25:12 +00:00
Jessica Paquette bfb00e3d53 [GlobalISel][AArch64] Handle tail calls with non-void return types
Just return once you emit the call, which is exactly what SelectionDAG does in
this situation.

Update call-translator-tail-call.ll.

Also update dllimport.ll to show that we tail call here in GISel again. Add
-verify-machineinstrs to the GISel line too, to defend against verifier
failures.

Differential revision: https://reviews.llvm.org/D67282

llvm-svn: 371425
2019-09-09 17:15:56 +00:00
Cullen Rhodes 55244beeee [AArch64][SVE] Implement abs and neg intrinsics
Summary:
This patch implements two arithmetic intrinsics:

      * int_aarch64_sve_abs
      * int_aarch64_sve_neg

testing the support for scalable vector types in intrinsics added in D65930.

Reviewed By: greened

Differential Revision: https://reviews.llvm.org/D65931

llvm-svn: 371388
2019-09-09 11:21:14 +00:00
Tim Northover 36147adc0b GlobalISel: add combiner to form indexed loads.
Loosely based on DAGCombiner version, but this part is slightly simpler in
GlobalIsel because all address calculation is performed by G_GEP. That makes
the inc/dec distinction moot so there's just pre/post to think about.

No targets can handle it yet so testing is via a special flag that overrides
target hooks.

llvm-svn: 371384
2019-09-09 10:04:23 +00:00
Sebastian Pop eacb2c2c97 [aarch64] Add combine patterns for fp16 fmla
This patch enables generation of fused multiply add/sub for instructions operating on fp16.
Tested on aarch64-linux.

Differential Revision: https://reviews.llvm.org/D67297

llvm-svn: 371321
2019-09-07 20:24:51 +00:00
Amara Emerson a1cf4d9795 [AArch64][GlobalISel] Enable the localizer for optimized builds.
Despite the fact that the localizer's original motivation was to fix horrendous
constant spilling at -O0, shortening live ranges still has net benefits even
with optimizations enabled.

On an -Os build of CTMark, doing this improves code size by 0.5% geomean.

There are a few regressions, bullet increasing in size by 0.5%. One example from
bullet where code size increased slightly was due to GlobalISel actually now
generating the same code as SelectionDAG. So we actually have an opportunity
in future to implement better heuristics for localization and therefore be
*better* than SDAG in some cases. In relation to other optimizations though that
one is relatively minor.

Differential Revision: https://reviews.llvm.org/D67303

llvm-svn: 371266
2019-09-06 22:27:09 +00:00
Jessica Paquette 121d9114f5 [AArch64][GlobalISel] Always fall back on tail calls with -tailcallopt
-tailcallopt requires that we perform different stack adjustments than with
sibling calls. For example, the `@caller_to0_from8` function in
test/CodeGen/AArch64/tail-call.ll requires that we adjust SP. Without
-tailcallopt, this adjustment does not happen. With it, however, it is expected.

So, to ensure that adding sibling call support doesn't break -tailcallopt,
make CallLowering always fall back on possible tail calls when -tailcallopt
is passed in.

Update test/CodeGen/AArch64/tail-call.ll with a GlobalISel line to make sure
that we don't differ from the SDAG implementation at any point.

Differential Revision: https://reviews.llvm.org/D67245

llvm-svn: 371227
2019-09-06 16:49:13 +00:00
Guillaume Chatelet ad1cea0dda [Alignment][NFC] Use Align with TargetLowering::setPrefFunctionAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67267

llvm-svn: 371212
2019-09-06 15:03:49 +00:00
Guillaume Chatelet 9fcf066d0c [Alignment][NFC] Use Align with TargetLowering::setPrefLoopAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67278

llvm-svn: 371210
2019-09-06 14:51:15 +00:00
Guillaume Chatelet 4fc3ad9e13 [Alignment][NFC] Use Align with TargetLowering::setMinFunctionAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67229

llvm-svn: 371200
2019-09-06 12:48:34 +00:00
Jessica Paquette 20e8667098 Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls"
Recommit basic sibling call lowering (https://reviews.llvm.org/D67189)

The issue was that if you have a return type other than void, call lowering
will emit COPYs to get the return value after the call.

Disallow sibling calls other than ones that return void for now. Also
proactively disable swifterror tail calls for now, since there's a similar issue
with COPYs there.

Update call-translator-tail-call.ll to include test cases for each of these
things.

llvm-svn: 371114
2019-09-05 20:18:34 +00:00
Simon Pilgrim 071287c5a9 Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189
........
This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll

llvm-svn: 371051
2019-09-05 10:38:39 +00:00
Guillaume Chatelet aff45e4b23 [LLVM][Alignment] Make functions using log of alignment explicit
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:

 - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
 - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
 - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,

Reviewers: lattner, thegameg, courbet

Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65945

llvm-svn: 371045
2019-09-05 10:00:22 +00:00
Jessica Paquette b78324fc40 [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189

llvm-svn: 370996
2019-09-04 22:54:52 +00:00
Amara Emerson 2a2c25ba48 [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.

Differential Revision: https://reviews.llvm.org/D66185

llvm-svn: 370823
2019-09-03 21:42:32 +00:00
Amara Emerson fbaf425b79 [GlobalISel][CallLowering] Add support for splitting types according to calling conventions.
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.

Support for splitting for return types not yet implemented.

Differential Revision: https://reviews.llvm.org/D66180

llvm-svn: 370822
2019-09-03 21:42:28 +00:00
Jessica Paquette 15036acb05 [AArch64][GlobalISel] Don't import i64imm_32bit pattern at -O0
This pattern, when imported at -O0 adds an extra copy via the SUBREG_TO_REG.

This is because the SUBREG_TO_REG is not eliminated. At all other opt levels,
it is eliminated.

This is a 1% geomean code size savings at -O0 on CTMark.

Differential Revision: https://reviews.llvm.org/D67027

llvm-svn: 370789
2019-09-03 17:21:12 +00:00
Kerry McLaughlin 7b5c6b8d86 [SVE][Inline-Asm] Fix -Wimplicit-fallthrough in AArch64ISelLowering.cpp
Summary: Adds break to 'x' case in getRegForInlineAsmConstraint added by D66302, fixing the unintentional fallthrough.

Reviewers: sdesmalen, rovka, cameron.mcinally, greened, gribozavr, ruiu

Reviewed By: sdesmalen

Subscribers: bjope, javed.absar, tschuett, kristof.beyls, rkruppe, psnobl, llvm-commits, cfe-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67095

llvm-svn: 370769
2019-09-03 15:45:42 +00:00
Kerry McLaughlin da4ef9b4c8 [SVE][Inline-Asm] Support for SVE asm operands
Summary:
Adds the following inline asm constraints for SVE:
  - w: SVE vector register with full range, Z0 to Z31
  - x: Restricted to registers Z0 to Z15 inclusive.
  - y: Restricted to registers Z0 to Z7 inclusive.

This change also adds the "z" modifier to interpret a register as an SVE register.

Not all of the bitconvert patterns added by this patch are used, but they have been included here for completeness.

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, rengolin, cameron.mcinally, greened

Reviewed By: sdesmalen

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66302

llvm-svn: 370673
2019-09-02 16:12:31 +00:00
Evgeniy Stepanov 04647f5e22 MemTag: unchecked load/store optimization.
Summary:
MTE allows memory access to bypass tag check iff the address argument
is [SP, #imm]. This change takes advantage of this to demote uses of
tagged addresses to regular FrameIndex operands, reducing register
pressure in large functions.

MO_TAGGED target flag is used to signal that the FrameIndex operand
refers to memory that might be tagged, and needs to be handled with
care. Such operand must be lowered to [SP, #imm] directly, without a
scratch register.

The transformation pass attempts to predict when the offset will be
out of range and disable the optimization.
AArch64RegisterInfo::eliminateFrameIndex has an escape hatch in case
this prediction has been wrong, but it is quite inefficient and should
be avoided.

Reviewers: pcc, vitalybuka, ostannard

Subscribers: mgorny, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66457

llvm-svn: 370490
2019-08-30 17:23:02 +00:00
Jessica Paquette 04e657be28 [AArch64][GlobalISel] Select arithmetic extended register patterns
This teaches GISel to select patterns which fold an extend plus optional shift
into the addressing mode. In particular, adds and subs.

Factor out the arith extended register ComplexPatterns in AArch64InstrFormats.td
and create GISel equivalents.

Add some equivalent functions to the ones in AArch64ISelDAGToDAG:

- `selectArithExtendedRegister`
- `narrowExtendRegIfNeeded`
- `getExtendTypeForInst`

`getExtendTypeForInst` includes the checks for loads and stores. This will be
used for WRO addressing modes in loads + stores.

Teach selectCopy to properly handle subregister copies on the same bank in
order to support `narrowExtendRegIfNeeded`. The extended register must be a
GPR32, so we need to support same-bank subregister copies.

Fix a bug in getSubRegForClass which would cause registers on things like
GPR32common to end up getting ssub. Just change the check to look for FPR32
rather than GPR32.

For tests:

- Add select-arith-extended-reg.mir
- Update addsub_ext.ll to include GlobalISel checks

Differential Revision: https://reviews.llvm.org/D66835

llvm-svn: 370410
2019-08-29 21:53:58 +00:00
Matt Arsenault caff0a88dd GlobalISel: Add known bits to InstructionSelector
AMDGPU uses this for some addressing mode selection patterns. The
analysis run itself doesn't do anything so it seems easier to just
always require this than adding a way to opt in.

llvm-svn: 370388
2019-08-29 17:24:32 +00:00
Jessica Paquette ba04f5fac1 [GlobalISel][AArch64] Select llvm.aarch64.stxr* intrinsics.
Add a GISelPredicateCode to the stxr_* PatFrags in AArch64InstrAtomics.td.

This allows us to select these intrinsics.

Differential Revision: https://reviews.llvm.org/D65779

llvm-svn: 370382
2019-08-29 16:55:55 +00:00
Jessica Paquette b8b23a1648 [GlobalISel][AArch64] Use a GISelPredicateCode to select llvm.aarch64.stlxr.*
Remove manual selection code for this intrinsic and use a GISelPredicateCode
instead.

This allows us to fully select this intrinsic without any tricky custom C++
matching.

Differential Revision: https://reviews.llvm.org/D65780

llvm-svn: 370380
2019-08-29 16:45:19 +00:00
Jessica Paquette 87720ac8c8 [AArch64][GlobalISel] Select @llvm.aarch64.ldxr.* intrinsics
Same thing as D66897, but for ldxr.* instead. Add a GISelPredicateCode to the
ldxr_* definitions, which allows us to import them.

Add select-ldxr-intrin.mir, and update arm64-ldxr-stxr.ll.

Differential Revision: https://reviews.llvm.org/D66898

llvm-svn: 370378
2019-08-29 16:33:01 +00:00
Jessica Paquette c327daeea5 [AArch64][GlobalISel] Select @llvm.aarch64.ldaxr.* intrinsics
Add a GISelPredicateCode to ldaxr_*. This allows us to import the patterns for
@llvm.aarch64.ldaxr.*, and thus select them.

Add `isLoadStoreOfNumBytes` for the GISelPredicateCode, since each of these
intrinsics involves the same check.

Add select-ldaxr-intrin.mir, and update arm64-ldxr-stxr.ll.

Differential Revision: https://reviews.llvm.org/D66897

llvm-svn: 370377
2019-08-29 16:16:38 +00:00
Shiva Chen b39876d8cd [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall
The patch fixed the issue that RV64 didn't clear the upper bits
when return complex floating value with lp64 ABI.

float _Complex
complex_add(float _Complex a, float _Complex b)
{
   return a + b;
}

RealResult = zero_extend(RealA + RealB)
ImageResult = ImageA + ImageB
Return (RealResult | (ImageResult << 32))

The patch introduces shouldExtendTypeInLibCall target hook to suppress
the AssertZext generation when lowering floating LibCall.

Thanks to Eli's comments from the Bugzilla
https://bugs.llvm.org/show_bug.cgi?id=42820

Differential Revision: https://reviews.llvm.org/D65497

llvm-svn: 370275
2019-08-28 23:40:37 +00:00
Jessica Paquette af0bd41e06 [AArch64][GlobalISel] Fall back when translating musttail calls
These are currently translated as normal functions calls in AArch64.

Until we have proper tail call lowering, we shouldn't translate these.

Differential Revision: https://reviews.llvm.org/D66842

llvm-svn: 370225
2019-08-28 16:19:01 +00:00
Hans Wennborg cff90f07cb [SelectionDAG] Don't generate libcalls for wide shifts on Windows (PR42711)
Neither libgcc or compiler-rt are usually used on Windows, so these
functions can't be called.

Differential revision: https://reviews.llvm.org/D66880

llvm-svn: 370204
2019-08-28 13:55:10 +00:00
Amara Emerson e20b91c265 [GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC.
This change moves the actual stack pointer manipulation into the legalizer,
available to targets via lower(). The codegen is slightly different because
we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than
adding a negative amount via G_GEP.

Differential Revision: https://reviews.llvm.org/D66678

llvm-svn: 370104
2019-08-27 19:54:27 +00:00
Tim Northover a7f226f9db AArch64: avoid creating cycle in DAG for post-increment NEON ops.
Inserting a value into Visited has the effect of terminating a search for
predecessors if that node is seen. This is legitimate for the base address, and
acts as a slight performance optimization, but the vector-building node can be
paert of a legitimate cycle so we shouldn't stop searching there.

PR43056.

llvm-svn: 370036
2019-08-27 10:21:11 +00:00
Benjamin Kramer 16b322914a Use a bit of relaxed constexpr to make FeatureBitset costant intializable
This requires std::intializer_list to be a literal type, which it is
starting with C++14. The downside is that std::bitset is still not
constexpr-friendly so this change contains a re-implementation of most
of it.

Shrinks clang by ~60k.

llvm-svn: 369847
2019-08-24 15:02:44 +00:00
Jessica Paquette 83fe56b3b9 [AArch64][GlobalISel] Import XRO load/store patterns instead of custom selection
Instead of using custom C++ in `earlySelect` for loads and stores, just import
the patterns.

Remove `earlySelectLoad`, since we can just import the work it's doing.

Some minor changes to how `ComplexRendererFns` are returned for the XRO
addressing modes. If you add immediates in two steps, sometimes they are not
imported properly and you only end up with one immediate. I'm not sure if this
is intentional.

- Update load-addressing-modes.mir to include the instructions we can now
  import.

- Add a similar test, store-addressing-modes.mir to show which store opcodes we
  currently import, and show that we can pull in shifts etc.

- Update arm64-fastisel-gep-promote-before-add.ll to use FastISel instead of
  GISel. This test failed with GISel because GISel folds the gep into the load.
  The test checks that FastISel doesn't fold non-pointer-width adds into loads.
  GISel on the other hand, produces a G_CONSTANT of -128 for the add, and then
  a G_GEP, which must be pointer-width.

Note that we don't get STRBRoX right now. It seems like the importer can't
handle `FPR8Op:{ *:[Untyped] }:$Rt` source operands. So, those are not currently
supported.

Differential Revision: https://reviews.llvm.org/D66679

llvm-svn: 369806
2019-08-23 20:31:34 +00:00
Benjamin Kramer dc5f805d31 Do a sweep of symbol internalization. NFC.
llvm-svn: 369803
2019-08-23 19:59:23 +00:00
Simon Pilgrim c88408cf85 Use VT::getHalfNumVectorElementsVT helpers in a few places. NFCI.
llvm-svn: 369751
2019-08-23 12:37:09 +00:00