Commit Graph

2322 Commits

Author SHA1 Message Date
Tim Northover d1fd383b28 GlobalISel: handle 1-element aggregates during ABI lowering.
llvm-svn: 288706
2016-12-05 21:25:33 +00:00
Quentin Colombet 0e6cccfb53 [AArch64][RegisterBankInfo] Fix typo in the logic used in assert.
Thanks to David Binderman <dcb314@hotmail.com> for bringing it to my
attention.

llvm-svn: 288688
2016-12-05 19:02:37 +00:00
Diana Picus f11f042ecb [GlobalISel] Extract handleAssignments out of AArch64CallLowering
This function seems target-independent so far: all the target-specific behaviour
is isolated in the CCAssignFn and the ValueHandler (which we're also extracting
into the generic CallLowering).

The intention is to use this in the ARM backend.

Differential Revision: https://reviews.llvm.org/D27045

llvm-svn: 288658
2016-12-05 10:40:33 +00:00
Matthias Braun 1fbb0f6dd9 AArch64CollectLOH: Rewrite as block-local analysis.
Previously this pass was using up to 5% compile time in some cases which
is a bit much for what it is doing. The pass featured a full blown
data-flow analysis which in the default configuration was restricted to a
single block.

This rewrites the pass under the assumption that we only ever work on a
single block. This is done in a single pass maintaining a state machine
per general purpose register to catch LOH patterns.

Differential Revision: https://reviews.llvm.org/D27329

llvm-svn: 288561
2016-12-03 00:52:56 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Geoff Berry 7ffce7be0c [AArch64] Fold more spilled/refilled COPYs.
Summary:
Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all
full COPYs between register classes of the same size that are either
spilled or refilled.

Reviewers: MatzeB, qcolombet

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27271

llvm-svn: 288439
2016-12-01 23:43:55 +00:00
Tim Northover 5bb87b6769 AArch64: fix 128-bit cmpxchg at -O0 (again, again).
This time the issue is fortunately just a simple mistake rather than a horrible
design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for
comparing two 64-bit values, but they don't.

The fix is slightly clunkier in AArch64 because we can't use conditional
execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to
an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a
CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway.

Thanks to Anton Korobeynikov for pointing out the issue.

llvm-svn: 288418
2016-12-01 21:31:59 +00:00
Matthias Braun f23ef437cc Move FrameInstructions from MachineModuleInfo to MachineFunction
This is per function data so it is better kept at the function instead
of the module.

This is a necessary step to have machine module passes work properly.

Differential Revision: https://reviews.llvm.org/D27185

llvm-svn: 288291
2016-11-30 23:48:42 +00:00
Joel Jones 75818bc8f7 [AArch64] Refactor LSE support as feature separate from V8.1a support.
Summary:
This is preparation for ThunderX processors that have Large
System Extension (LSE) atomic instructions, but not the 
other instructions introduced by V8.1a.
This will mimic changes to GCC as described here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html

LSE instructions are: LD/ST<op>, CAS*, SWP

Reviewers: t.p.northover, echristo, jmolloy, rengolin

Subscribers: aemerson, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26621

llvm-svn: 288279
2016-11-30 22:25:24 +00:00
Matthias Braun c52fe2961c Clarify rules for reserved regs, fix aarch64 ones.
No test case necessary as the problematic condition is checked with the
newly introduced assertAllSuperRegsMarked() function.

Differential Revision: https://reviews.llvm.org/D26648

llvm-svn: 288277
2016-11-30 22:17:10 +00:00
Silviu Baranga aab65b155e [AArch64] Fix useful bits detection for BFM instructions
Summary:
When computing useful bits for a BFM instruction, we need
to take into consideration the case where both operands
of the BFM are equal and provide data that we need to track.

Not doing this can cause us to miss useful bits.
    
Fixes PR31138 (https://llvm.org/bugs/show_bug.cgi?id=31138)

Reviewers: t.p.northover, jmolloy

Subscribers: evandro, gberry, srhines, pirama, mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D27130

llvm-svn: 288253
2016-11-30 17:04:22 +00:00
Sanjay Patel 47f7f30df9 [AArch64] allow and-not-compare transform to form 'bics'
This target hook was added with D19087:
https://reviews.llvm.org/D19087

Differential Revision: https://reviews.llvm.org/D27221

llvm-svn: 288206
2016-11-29 22:28:58 +00:00
Chad Rosier d34c26eb08 [AArch64] Add a basic SchedMachineModel for Falkor.
Differential Revision: https://reviews.llvm.org/D26972

llvm-svn: 288194
2016-11-29 20:00:27 +00:00
Geoff Berry 7c078fc035 [AArch64] Fold spills of COPY of WZR/XZR
Summary:
In AArch64InstrInfo::foldMemoryOperandImpl, catch more cases where the
COPY being spilled is copying from WZR/XZR, but the source register is
not in the COPY destination register's regclass.

For example, when spilling:

  %vreg0 = COPY %XZR ; %vreg0:GPR64common

without this change, the code in TargetInstrInfo::foldMemoryOperand()
and canFoldCopy() that normally handles cases like this would fail to
optimize since %XZR is not in GPR64common.  So the spill code generated
would be:

  %vreg0 = COPY %XZR
  STR %vreg

instead of the new code generated:

  STR %XZR

Reviewers: qcolombet, MatzeB

Subscribers: mcrosier, aemerson, t.p.northover, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26976

llvm-svn: 288176
2016-11-29 18:28:32 +00:00
Matthias Braun 115efcd3d1 MachineScheduler: Export function to construct "default" scheduler.
This makes the createGenericSchedLive() function that constructs the
default scheduler available for the public API. This should help when
you want to get a scheduler and the default list of DAG mutations.

This also shrinks the list of default DAG mutations:
{Load|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer
added by default. Targets can easily add them if they need them. It also
makes it easier for targets to add alternative/custom macrofusion or
clustering mutations while staying with the default
createGenericSchedLive(). It also saves the callback back and forth in
TargetInstrInfo::enableClusterLoads()/enableClusterStores().

Differential Revision: https://reviews.llvm.org/D26986

llvm-svn: 288057
2016-11-28 20:11:54 +00:00
Kuba Mracek 06995e866b [xray] Add XRay support for Mach-O in CodeGen
Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support.

Differential Revision: https://reviews.llvm.org/D26983

llvm-svn: 287734
2016-11-23 02:07:04 +00:00
Tim Northover b64fb453ea CodeGen: simplify TargetMachine::getSymbol interface. NFC.
No-one actually had a mangler handy when calling this function, and
getSymbol itself went most of the way towards getting its own mangler
(with a local TLOF variable) so forcing all callers to supply one was
just extra complication.

llvm-svn: 287645
2016-11-22 16:17:20 +00:00
Chad Rosier ecc77273a0 [AArch64] Set the max interleave factor for Falkor.
llvm-svn: 287642
2016-11-22 14:25:02 +00:00
Chad Rosier 2abc29c593 [AArch64] Maximize 80-column. NFC.
llvm-svn: 287640
2016-11-22 14:12:09 +00:00
Geoff Berry e0bf52f394 [AArch64LoadStoreOptimizer] Don't treat write to XZR/WZR as a clobber.
Summary:
When searching for load/store instructions to pair/merge don't treat
writes to WZR/XZR as clobbers since they don't change the value read
from WZR/XZR (which is always 0).

Reviewers: mcrosier, junbuml, jmolloy, t.p.northover

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26921

llvm-svn: 287592
2016-11-21 22:51:10 +00:00
Dean Michael Berris 31761f300d [XRay][AArch64] Implemented a test for the compile-time sleds emitted, and fixed a bug in the jump instruction
This patch adds a test for the assembly code emitted with XRay
instrumentation. It also fixes a bug where the operand of a jump
instruction must be not the number of bytes to jump over, but rather the
number of 4-byte instructions.

Author: rSerge

Reviewers: dberris, rengolin

Differential Revision: https://reviews.llvm.org/D26805

llvm-svn: 287516
2016-11-21 03:01:43 +00:00
Benjamin Kramer ffd3715d16 Give some helper classes/functions internal linkage. NFC.
llvm-svn: 287462
2016-11-19 20:44:26 +00:00
Daniel Sanders 72db2a390a Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86.
Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
  emission of instructions that don't satisfy their predicates. One deliberate
  use is the SYNC instruction where the version with an operand is correctly
  defined as requiring MIPS32 while the version without an operand is defined
  as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
  MCCodeEmitter infrastructure.

Patches for ARM and Mips will follow.

Depends on D25617

Reviewers: tstellarAMD, jmolloy

Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits

Differential Revision: https://reviews.llvm.org/D25618

llvm-svn: 287439
2016-11-19 13:05:44 +00:00
Dean Michael Berris 3234d3a4bd [XRay] Support AArch64 in LLVM
This patch adds XRay support in LLVM for AArch64 targets.
This patch is one of a series:

Clang: https://reviews.llvm.org/D26415
compiler-rt: https://reviews.llvm.org/D26413

Author: rSerge

Reviewers: rengolin, dberris

Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown

Differential Revision: https://reviews.llvm.org/D26412

llvm-svn: 287209
2016-11-17 05:15:37 +00:00
Chris Bieneman 05c279fc4b [CMake] NFC. Updating CMake dependency specifications
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.

llvm-svn: 287206
2016-11-17 04:36:50 +00:00
Geoff Berry 8301c645c8 [AArch64] Handle vector types in replaceZeroVectorStore.
Summary:
Extend replaceZeroVectorStore to handle more vector type stores,
floating point zero vectors and set alignment more accurately on split
stores.

This is a follow-up change to r286875.

This change fixes PR31038.

Reviewers: MatzeB

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26682

llvm-svn: 287142
2016-11-16 19:35:19 +00:00
Matthias Braun 3d51cf0a2c AArch64: Use DeadRegisterDefinitionsPass before regalloc.
Doing this before register allocation reduces register pressure as we do
not even have to allocate a register for those dead definitions.

Differential Revision: https://reviews.llvm.org/D26111

llvm-svn: 287076
2016-11-16 03:38:27 +00:00
Chad Rosier 201fc1ed26 [AArch64] Add support for Qualcomm's Falkor CPU.
Differential Revision: https://reviews.llvm.org/D26673

llvm-svn: 287036
2016-11-15 21:34:12 +00:00
Haicheng Wu faee2b71a7 [AArch64] Lower multiplication by a constant int to shl+add+shl
Lower a = b * C where C = (2^n + 1) * 2^m to

add     w0, w0, w0, lsl n
lsl     w0, w0, m

Differential Revision: https://reviews.llvm.org/D229245

llvm-svn: 287019
2016-11-15 20:16:48 +00:00
Evandro Menezes 9fc54826e0 [AArch64] Compute the Newton series for reciprocals natively
Implement the Newton series for square root, its reciprocal and reciprocal
natively using the specialized instructions in AArch64 to perform each
series iteration.

Differential revision: https://reviews.llvm.org/D26518

llvm-svn: 286907
2016-11-14 23:29:01 +00:00
Geoff Berry e8de67abad [AArch64] Change some pointers to references. NFC.
Follow-up change to r286875.

llvm-svn: 286879
2016-11-14 19:59:11 +00:00
Geoff Berry 526c50588d [AArch64] Split 0 vector stores into scalar store pairs.
Summary:
Replace a splat of zeros to a vector store by scalar stores of WZR/XZR.
The load store optimizer pass will merge them to store pair stores.
This should be better than a movi to create the vector zero followed by
a vector store if the zero constant is not re-used, since one
instructions and one register live range will be removed.

For example, the final generated code should be:

  stp xzr, xzr, [x0]

instead of:

  movi v0.2d, #0
  str q0, [x0]

Reviewers: t.p.northover, mcrosier, MatzeB, jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26561

llvm-svn: 286875
2016-11-14 19:39:04 +00:00
Geoff Berry def4bfa9d9 [AArch64] Factor out transform code from split16BStore. NFC.
llvm-svn: 286874
2016-11-14 19:39:00 +00:00
Diana Picus bda7276120 GlobalISel: Fix indentation. NFC
llvm-svn: 286808
2016-11-14 10:25:43 +00:00
Chad Rosier 8ade03463e [AArch64] Update a FIXME comment to reflect current state. NFC.
llvm-svn: 286625
2016-11-11 19:52:45 +00:00
Geoff Berry 25fa4999ff [AArch64] Fix bugs in isel lowering replaceSplatVectorStore.
Summary:
Fix off-by-one indexing error in loop checking that inserted value was a
splat vector.

Add code to check that INSERT_VECTOR_ELT nodes constructing the splat
vector have the expected constant index values.

Reviewers: t.p.northover, jmolloy, mcrosier

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26409

llvm-svn: 286616
2016-11-11 19:25:20 +00:00
Chad Rosier d6e85ce3c3 [AArch64] Remove lots of redundant code. NFC.
llvm-svn: 286606
2016-11-11 17:49:34 +00:00
Chad Rosier 31ee813068 [AArch64] Early return and minor renaming/refactoring to ease code review. NFC.
llvm-svn: 286601
2016-11-11 17:07:37 +00:00
Chad Rosier 10c7aaaee9 [AArch64] Enable merging of adjacent zero stores for all subtargets.
This optimization merges adjacent zero stores into a wider store.

e.g.,

strh wzr, [x0]
strh wzr, [x0, #2]
; becomes
str wzr, [x0]

e.g.,

str wzr, [x0]
str wzr, [x0, #4]
; becomes
str xzr, [x0]

Previously, this was only enabled for Kryo and Cortex-A57.

Differential Revision: https://reviews.llvm.org/D26396

llvm-svn: 286592
2016-11-11 14:10:12 +00:00
Evandro Menezes 21f9ce1a0d [DAG Combiner] Fix the native computation of the Newton series for reciprocals
The generic infrastructure to compute the Newton series for reciprocal and
reciprocal square root was conceived to allow a target to compute the series
itself.  However, the original code did not properly consider this condition
if returned by a target.  This patch addresses the issues to allow a target
to compute the series on its own.

Differential revision: https://reviews.llvm.org/D22975

llvm-svn: 286523
2016-11-10 23:31:06 +00:00
Tim Northover a9105be437 GlobalISel: translate invoke and landingpad instructions
Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj
etc), but it should get things going.

llvm-svn: 286407
2016-11-09 22:39:54 +00:00
Matthias Braun c53cbbb1d1 AArch64DeadRegisterDefinitionsPass: Fix Changed flag
Fix a bug in the calculation of the changed flag introduced in r285488.

llvm-svn: 286293
2016-11-08 20:59:03 +00:00
Nirav Dave e833c6c61a [MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser.
Reviewers: t.p.northover, rengolin

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D26309

llvm-svn: 286265
2016-11-08 18:31:04 +00:00
Tim Northover 5f7dea85c2 GlobalISel: support selecting fpext/fptrunc instructions on AArch64.
llvm-svn: 286253
2016-11-08 17:44:07 +00:00
Roger Ferrer Ibanez 80c0f33c29 [AArch64] Fix incorrect CSEL node created
Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64
transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to
"a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have
the same type which can lead to a wrong CSEL node that crashes later
due to nonsensical copies.

Differential Revision: https://reviews.llvm.org/D26394

llvm-svn: 286231
2016-11-08 13:34:41 +00:00
Tim Northover 9ac0eba672 GlobalISel: support selecting G_SELECT on AArch64.
llvm-svn: 286185
2016-11-08 00:45:29 +00:00
Tim Northover 7d88da6a46 GlobalISel: constrain PHI registers on AArch64.
Self-referencing PHI nodes need their destination operands to be constrained
because nothing else is likely to do so. For now we just pick a register class
naively.

Patch mostly by Ahmed again.

llvm-svn: 286183
2016-11-08 00:34:06 +00:00
Sanjin Sijaric 6f020d91a1 [AArch64] Transfer memory operands when lowering vector load/store intrinsics
Summary:
Some vector loads and stores generated from AArch64 intrinsics alias each other
unnecessarily, preventing better scheduling.  We just need to transfer memory
operands during lowering.

Reviewers: mcrosier, t.p.northover, jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26313

llvm-svn: 286168
2016-11-07 22:39:02 +00:00
Davide Italiano 5df6066ec1 [AArch64] Remove dead store. Found by gcc7.
llvm-svn: 286137
2016-11-07 19:11:25 +00:00
Amara Emerson 614b44bbe9 This patch adds support for 16 bit floating point registers to the inline asm register selection on AArch64.
Without this patch, register allocation for the example below fails.

define half @test(half %a1, half %a2) #0 {
entry:
  %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1
  ret half %0
}

Patch by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D25080

llvm-svn: 286111
2016-11-07 15:42:12 +00:00
Chad Rosier d6daac4746 [AArch64] Removed the narrow load merging code in the ld/st optimizer.
This feature has been disabled for some time now, so remove cruft.

Differential Revision: https://reviews.llvm.org/D26248

llvm-svn: 286110
2016-11-07 15:27:22 +00:00
Peter Collingbourne 4e76019e34 Support: Remove MemoryObject and DataStreamer interfaces.
These interfaces are no longer used.

Differential Revision: https://reviews.llvm.org/D26222

llvm-svn: 285774
2016-11-02 00:08:37 +00:00
Alex Bradbury 58eba09949 [TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h
As it stands, the OperandMatchResultTy is only included in the generated
header if there is custom operand parsing. However, almost all backends
make use of MatchOperand_Success and friends from OperandMatchResultTy for
e.g. parseRegister. This is a pain when starting an AsmParser for a new
backend that doesn't yet have custom operand parsing. Move the enum to
MCTargetAsmParser.h.

This patch is a prerequisite for D23563

Differential Revision: https://reviews.llvm.org/D23496

llvm-svn: 285705
2016-11-01 16:32:05 +00:00
Tim Northover 037af52c8b GlobalISel: allow truncating pointer casts on AArch64.
llvm-svn: 285615
2016-10-31 18:31:09 +00:00
Tim Northover cdf23f1d93 GlobalISel: translate stack protector intrinsics
llvm-svn: 285614
2016-10-31 18:30:59 +00:00
Matthias Braun 7d78614ae9 AArch64DeadRegisterDefinitionsPass: Cleanup; NFC
- Fix doxygen file comment
- reduce indentation in loop
- Factor out some common subexpressions
- Move independent helper function out of class
- Fix Changed flag (this is not strictly NFC but a bugfix, but the flag
  seems ignored anyway)

llvm-svn: 285488
2016-10-29 01:03:41 +00:00
Evandro Menezes ca8370396a [AArch64] Create feature set for Samsung Exynos-M2
Since Exynos-M2 improved the FP square root unit a bit over the one in
Exynos-M1, it does not benefit from using the Newton series for such
operations.

llvm-svn: 285246
2016-10-26 22:06:20 +00:00
Chad Rosier 0c621fda0d [AArch64] Avoid materializing constant 1 when generating cneg instructions.
Instead of

 cmp w0, #1
 orr w8, wzr, #0x1
 cneg w0, w8, ne

we now generate

 cmp w0, #1
 csinv w0, w0, wzr, eq

PR28965

llvm-svn: 285217
2016-10-26 18:15:32 +00:00
Evandro Menezes 7696dc0685 [AArch64] Adjust the cost model for Exynos M1.
Modify the maximum jump table size.

llvm-svn: 285106
2016-10-25 20:05:42 +00:00
Evandro Menezes eff2bd9d4f [AArch64] Optionally use the Newton series for reciprocal estimation
Add support for estimating the square root or its reciprocal and division or
reciprocal using the combiner generic Newton series.

Differential revision: https://reviews.llvm.org/D25291

llvm-svn: 284986
2016-10-24 16:14:58 +00:00
Joel Jones 504bf334b0 AArch64 ILP32 relocations for assembly and ELF
Summary:
Add relocations for AArch64 ILP32. Includes:
  - Addition of definitions for R_AARCH32_*
  - Definition of new -target-abi: ilp32
  - Definition of data layout string
  - Tests for added relocations. Not comprehensive, but matches
    existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64".
  - Tests for llvm-readobj

Reviewers: zatrazz, peter.smith, echristo, t.p.northover

Subscribers: aemerson, rengolin, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25159

llvm-svn: 284973
2016-10-24 13:37:13 +00:00
Abderrazek Zaafrani 9daf8110c8 Set the vectorizer MaxInterleaveFactor for Exynos.
llvm-svn: 284839
2016-10-21 16:28:27 +00:00
Abderrazek Zaafrani 9f382f53d1 Test commit
llvm-svn: 284832
2016-10-21 15:24:08 +00:00
Bjorn Pettersson 9fcd605d1e [AArch64] Corrected spill size for DDD register class. NFCI
Summary:
The spill size was incorrectly set to 196 bits,
which isn't a multiple of 8. This problem was detected when
experimenting with asserts that the spill size should be a
multiple of the byte size.

New corrected value for the spill size is set to 192 bits.

Note that tablegen (RegisterInfoEmitter) will divide the
size set in the RegisterClass definition by 8. So this
change should not have any impact on the tablegen output
(trunc(192/8) == trunc(196/8) == 24 bytes).

Reviewers: t.p.northover

Subscribers: llvm-commits, aemerson, rengolin

Differential Revision: https://reviews.llvm.org/D25818

llvm-svn: 284814
2016-10-21 09:53:42 +00:00
Benjamin Kramer 2a8bef8769 Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

llvm-svn: 284721
2016-10-20 12:20:28 +00:00
Evandro Menezes ce8d60156c [AArch64] Avoid materializing 0.0 when generating FP SELECT
Transform `a == 0.0 ? 0.0 : x` to `a == 0.0 ? a : x` and `a != 0.0 ? x : 0.0`
to `a != 0.0 ? x : a` to avoid materializing 0.0 for FCSEL, since it does not
have to be materialized beforehand for FCMP, as it has a form that has 0.0
as an implicit operand.

Differential Revision: https://reviews.llvm.org/D24808

llvm-svn: 284531
2016-10-18 20:37:35 +00:00
Tim Northover 55782222c0 GlobalISel: select small binary operations on AArch64.
AArch64 actually supports many 8-bit operations under the definition used by
GlobalISel: the designated information-carrying bits of a GPR32 get the right
value if you just use the normal 32-bit instruction.

llvm-svn: 284526
2016-10-18 20:03:48 +00:00
Tim Northover 4494d69862 GlobalISel: support floating-point constants on AArch64.
Patch from Ahmed Bougacha.

llvm-svn: 284523
2016-10-18 19:47:57 +00:00
Tim Northover 020d104496 GlobalISel: support wider range of load/store sizes in AArch64.
llvm-svn: 284406
2016-10-17 18:36:53 +00:00
Tim Northover 69fa84a6e9 GlobalISel: rename legalizer components to match others.
The previous names were both misleading (the MachineLegalizer actually
contained the info tables) and inconsistent with the selector & translator (in
having a "Machine") prefix. This should make everything sensible again.

The only functional change is the name of a couple of command-line options.

llvm-svn: 284287
2016-10-14 22:18:18 +00:00
Quentin Colombet b3f5a8c644 [AArch64][RegisterBankInfo] Switch to fully static opds mapping for G_BITCAST.
NFC.

llvm-svn: 284146
2016-10-13 18:46:38 +00:00
Quentin Colombet 6b87a3109c [AArch64][RegisterBankInfo] Provide alternative mappings for 64-bit load
This allows RegBankSelect in greedy mode to get rid some of the cross
register bank copies when loads are involved in the chain of
computation.

llvm-svn: 284097
2016-10-13 01:01:23 +00:00
Quentin Colombet cd80e97e88 [AArch64][RegisterBankInfo] Provide alternative mappings for G_BITCASTs.
Thanks to this patch, RegBankSelect is able to get rid of some register
bank copies as demonstrated in the test case.

llvm-svn: 284094
2016-10-13 00:34:48 +00:00
Quentin Colombet 45c9c1432f [AArch64][RegisterBankInfo] Describe cross regbank copies statically.
NFC.

llvm-svn: 284091
2016-10-13 00:12:06 +00:00
Quentin Colombet 9e64919b7c [AArch64][RegisterBankInfo] Use static mapping for same bank G_BITCAST.
NFC.

llvm-svn: 284090
2016-10-13 00:12:04 +00:00
Quentin Colombet db643d9091 [AArch64][MachineLegalizer] Mark more G_BITCAST as legal.
Basically any vector types that fits in a 32-bit register is also valid
as far as copies are concerned.

llvm-svn: 284089
2016-10-13 00:12:01 +00:00
Quentin Colombet f760799c40 [AArch64][RegisterBankInfo] Bump the cost of vector loads.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284088
2016-10-13 00:11:59 +00:00
Quentin Colombet f35a8c5bdc [AArch64][RegisterBankInfo] Use a proper cost for cross regbank G_BITCASTs.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284087
2016-10-13 00:11:57 +00:00
Quentin Colombet 27b40356f7 [AArch64][RegisterBankInfo] Provide more realistic copy costs.
llvm-svn: 284086
2016-10-13 00:11:55 +00:00
Tim Northover fb8d989818 GlobalISel: support G_TRUNC selection on AArch64.
Ahmed's patch again.

llvm-svn: 284075
2016-10-12 22:49:15 +00:00
Tim Northover 69271c64d5 GlobalISel: support int <-> float conversions on AArch64.
More of Ahmed's work.

llvm-svn: 284074
2016-10-12 22:49:11 +00:00
Tim Northover 7dd378dd08 GlobalISel: select G_FCMP instructions on AArch64.
Another of Ahmed's patches.

llvm-svn: 284073
2016-10-12 22:49:07 +00:00
Tim Northover 6c02ad5e4f GlobalISel: support selection of G_ICMP on AArch64.
Patch from Ahmed Bougaca again.

llvm-svn: 284072
2016-10-12 22:49:04 +00:00
Tim Northover 5e3dbf326c GlobalISel: select G_BRCOND instructions on AArch64.
llvm-svn: 284071
2016-10-12 22:49:01 +00:00
Tim Northover 6aacd27cd7 GlobalISel: mark G_BRCOND on s1 as legal.
It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter.

llvm-svn: 284070
2016-10-12 22:48:36 +00:00
Quentin Colombet 9de30faeac [AArch64][InstrustionSelector] Teach the selector about G_BITCAST.
llvm-svn: 283973
2016-10-12 03:57:52 +00:00
Quentin Colombet cb629a897c [AArch64][InstructionSelector] Refactor the handling of copies.
Although Copies are not specific to preISel, we still have to assign them
a proper register class. However, given they are not constrained to
anything we do not have to handle the source register at the copy. It
will be properly mapped when reaching the related definition.

In the process, the handlong of G_ANYEXT is slightly modified as those
end up being selected as copy. The difference is that when register size
do not match on both sides, we need to insert SUBREG_TO_REG operation,
otherwise the post RA copy expansion will not be happy!

llvm-svn: 283972
2016-10-12 03:57:49 +00:00
Quentin Colombet 404e4350dc [AArch64][MachineLegalizer] Mark more bitcasts as legal.
Those are copies, we do not have to do any legalization action for them.

llvm-svn: 283970
2016-10-12 03:57:43 +00:00
Tim Northover c1d8c2bf8c GlobalISel: support same-size casts on AArch64.
Mostly Ahmed's work again, I'm just sprucing things up slightly before
committing.

llvm-svn: 283952
2016-10-11 22:29:23 +00:00
Tim Northover 3d38b3a4d1 GlobalISel: support selection of extend operations.
Patch mostly by Ahmed Bougaca.

llvm-svn: 283937
2016-10-11 20:50:21 +00:00
Diana Picus c93518db8c [AArch64] Allow label arithmetic with add/sub/cmp
Allow instructions such as 'cmp w0, #(end - start)' by folding the
expression into a constant. For ELF, we fold only if the symbols are in
the same section. For MachO, we fold if the expression contains only
symbols that are not linker visible.

Fixes https://llvm.org/bugs/show_bug.cgi?id=18920

Differential Revision: https://reviews.llvm.org/D23834

llvm-svn: 283862
2016-10-11 09:17:47 +00:00
Quentin Colombet d2623f8e38 [AArch64][InstructionSelector] Teach how to select FP load/store.
This patch allows to select 32 and 64-bit FP load and store.

llvm-svn: 283832
2016-10-11 00:21:14 +00:00
Quentin Colombet 0e5312787e [AArch64][InstructionSelector] Teach the selector how to handle vector OR.
This only adds the support for 64-bit vector OR. Adding more sizes is
not difficult, but it requires a bigger refactoring because ORs work on
any size, not necessarly the ones that match the width of the register
width. Right now, this is not expressed in the legalization, so don't
bother pushing the refactoring yet.

llvm-svn: 283831
2016-10-11 00:21:11 +00:00
Quentin Colombet d3126d5fb4 [AArch64][MachineLegalizer] Mark v2s32 G_LOAD as legal.
Actually every 64-bit loads are legal, but right now the API does not
offer a simple way to express that.

llvm-svn: 283829
2016-10-11 00:21:08 +00:00
Peter Collingbourne 0da86301ad Revert r283690, "MC: Remove unused entities."
llvm-svn: 283814
2016-10-10 22:49:37 +00:00
Tim Northover bdf1624367 GlobalISel: select G_GLOBAL_VALUE uses on AArch64.
llvm-svn: 283809
2016-10-10 21:50:00 +00:00
Tim Northover ad0acca544 GlobalISel: allow G_GLOBAL_VALUEs in AArch64 legalization.
llvm-svn: 283808
2016-10-10 21:49:53 +00:00
Tim Northover 2fda4b08ae GlobalISel: support selecting G_GEP instructions.
They're basically just an alias for G_ADD on AArch64.

llvm-svn: 283807
2016-10-10 21:49:49 +00:00
Tim Northover 4edc60d785 GlobalISel: support selecting constants on AArch64.
llvm-svn: 283806
2016-10-10 21:49:42 +00:00
Mehdi Amini f42454b94b Move the global variables representing each Target behind accessor function
This avoids "static initialization order fiasco"

Differential Revision: https://reviews.llvm.org/D25412

llvm-svn: 283702
2016-10-09 23:00:34 +00:00
Peter Collingbourne cc723cccab MC: Remove unused entities.
llvm-svn: 283691
2016-10-09 04:39:13 +00:00
Peter Collingbourne 5c924d7117 Target: Remove unused entities.
llvm-svn: 283690
2016-10-09 04:38:57 +00:00
Mehdi Amini 732afdd09a Turn cl::values() (for enum) from a vararg function to using C++ variadic template
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:

 va_start(ValueArgs, Desc);

with Desc being a StringRef.

Differential Revision: https://reviews.llvm.org/D25342

llvm-svn: 283671
2016-10-08 19:41:06 +00:00
Sebastian Pop eb65d72d9c [AArch64] Avoid generating indexed vector instructions for Exynos
Avoid generating indexed vector instructions for Exynos. This is needed for
fmla/fmls/fmul/fmulx. For example, the instruction

  fmla v0.4s, v1.4s, v2.s[1]

is less efficient than the instructions

  dup v2.4s, v2.s[1]
  fmla v0.4s, v1.4s, v2.4s

Patch written by Abderrazek Zaafrani.

Differential Revision: https://reviews.llvm.org/D21571

llvm-svn: 283663
2016-10-08 12:30:07 +00:00
Mehdi Amini a0016ec95f Use StringReg in TargetParser APIs (NFC)
llvm-svn: 283527
2016-10-07 08:37:29 +00:00
Matt Arsenault 36919a4f7c Move AArch64BranchRelaxation to generic code
llvm-svn: 283459
2016-10-06 15:38:53 +00:00
Matt Arsenault 0a3ea89e85 AArch64: Move remaining target specific BranchRelaxation bits to TII
llvm-svn: 283458
2016-10-06 15:38:09 +00:00
Matthias Braun 46a5238682 AArch64: Macrofusion: Split features, add missing combinations.
AArch64InstrInfo::shouldScheduleAdjacent() determines whether two
instruction can benefit from macroop fusion on apple CPUs. The list
turned out to be incomplete:
- the "rr" variants of the instructions were missing
- even the "rs" variants can have shift value == 0 and behave like the
  "rr" variants

This also splits the MacropFusion target feature into
ArithmeticBccFusion and ArithmeticCbzFusion.

Differential Revision: https://reviews.llvm.org/D25142

llvm-svn: 283243
2016-10-04 19:28:21 +00:00
Quentin Colombet 3a06701913 [AArch64][RegisterBankInfo] Add getSameKindofOperandsMapping.
Refactor the code so that the same function can be used for all
instructions with all the same operands for up to 3 operands.

This is going to be useful for cast instructions.
NFC.

llvm-svn: 283144
2016-10-03 20:20:13 +00:00
Matthias Braun a827ed8891 AArch64Subtarget: Remove unused CPUString field
llvm-svn: 283142
2016-10-03 20:17:02 +00:00
Mehdi Amini 48878ae579 Use StringRef in Datalayout API (NFC)
llvm-svn: 283013
2016-10-01 05:57:55 +00:00
Mehdi Amini 117296c0a0 Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
2016-10-01 02:56:57 +00:00
Eric Christopher 98983d0aff Remove TargetTriple from AArch64MCInstLower as it's used in few places
and can be pulled from the TargetMachine. NFC.

llvm-svn: 283000
2016-10-01 01:50:25 +00:00
Quentin Colombet a6119958ff [AArch64][RegisterBankInfo] Use the helper functions for the checks
This makes sure the helper functions work as expected.

NFC.

llvm-svn: 282961
2016-09-30 21:46:21 +00:00
Quentin Colombet 7c3fa8e361 [AArch64][RegisterBankInfo] Rename getValueMappingIdx to getValueMapping
We don't return index, we return the actual ValueMapping.

NFC.

llvm-svn: 282960
2016-09-30 21:46:19 +00:00
Quentin Colombet b4afac7b32 [AArch64][RegisterBankInfo] Compress the ValueMapping table a bit.
We don't need to have singleton ValueMapping on their own, we can just
reuse one of the elements of the 3-ops mapping.
This allows even more code sharing.

NFC.

llvm-svn: 282959
2016-09-30 21:46:17 +00:00
Quentin Colombet 7fc5fe41c5 [AArch64][RegisterBankInfo] Refactor the code to access AArch64::ValMapping
Use a helper function to access ValMapping. This should make the code
easier to understand and maintain.

NFC.

llvm-svn: 282958
2016-09-30 21:46:15 +00:00
Quentin Colombet 15dc25bb3d [AArch64][RegisterBankInfo] Rename getRegBankIdx to getRegBankIdxOffset
The function name did not make it clear that the returned value was an
offset to apply to a register bank index.

NFC.

llvm-svn: 282957
2016-09-30 21:46:12 +00:00
Quentin Colombet b2308987ab [AArch64][RegisterBankInfo] Use the static opds mapping for alt mappings
Avoid to rely on the dynamically allocated operands mapping for the
alternative mapping.
NFC.

llvm-svn: 282956
2016-09-30 21:45:56 +00:00
Quentin Colombet 4b36e0c409 [AArch64][RegisterBankInfo] Use static mapping for 3-operands instrs.
This uses a TableGen'ed like structure for all 3-operands instrs.
The output of the RegBankSelect pass should be identical but the
RegisterBankInfo will do less dynamic allocations.

llvm-svn: 282817
2016-09-30 00:10:00 +00:00
Quentin Colombet fdd303afe2 [AArch64][RegisterBankInfo] Add static value mapping for 3-op instrs.
This is the kind of input TableGen should generate at some point.
NFC.

llvm-svn: 282816
2016-09-30 00:09:58 +00:00
Quentin Colombet eb8d3da9a0 [AArch64][RegisterBankInfo] Check the statically created ValueMapping.
Make sure that the ValueMappings contain the value we expect at the
indices we expect.

NFC.

llvm-svn: 282815
2016-09-30 00:09:43 +00:00
Lei Liu 361615cfd0 AArch64: Set shift bit of TLSLE HI12 add instruction
Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model.  This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000.

Reviewers: t.p.northover, peter.smith, rovka

Subscribers: salim.nasser, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24702

llvm-svn: 282661
2016-09-29 01:05:48 +00:00
Quentin Colombet 40cbc27ff3 [RegisterBankInfo] Uniquely generate OperandsMapping.
This is a step toward statically allocate InstructionMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.

This should already improve compile time by getting rid of a bunch of
memmove of SmallVectors.

llvm-svn: 282643
2016-09-28 22:20:49 +00:00
Quentin Colombet c0f11a9fb8 [AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping.
Another step toward TableGen'ed like structure for the RegisterBankInfo
of AArch64. By doing this, we also save a bit of compile time for the
exact same output.

llvm-svn: 282550
2016-09-27 22:55:04 +00:00
Quentin Colombet caae9cd246 [AArch64][RegisterBankInfo] Fix copy/paste in comments.
NFC.

llvm-svn: 282549
2016-09-27 22:54:57 +00:00
Geoff Berry b124331db7 [TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg().
Summary:
The current implementation of isConstantPhysReg() checks for defs of
physical registers to determine if they are constant.  Some
architectures (e.g. AArch64 XZR/WZR) have registers that are constant
and may be used as destinations to indicate the generated value is
discarded, preventing isConstantPhysReg() from returning true.  This
change adds a TargetRegisterInfo hook that overrides the no defs check
for cases such as this.

Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy

Subscribers: junbuml, aemerson, mcrosier, rengolin

Differential Revision: https://reviews.llvm.org/D24570

llvm-svn: 282543
2016-09-27 22:17:27 +00:00
Geoff Berry 256fcf975f [AArch64] Improve add/sub/cmp isel of uxtw forms.
Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the
32-bit to 64-bit zero-extend can be done for free by taking advantage
of the 32-bit defining instruction zeroing the upper 32-bits of the X
register destination.  This enables better instruction selection in a
few cases, such as:

  sub x0, xzr, x8
  instead of:
  mov x8, xzr
  sub x0, x8, w9, uxtw

  madd x0, x1, x1, x8
  instead of:
  mul x9, x1, x1
  add x0, x9, w8, uxtw

  cmp x2, x8
  instead of:
  sub x8, x2, w8, uxtw
  cmp x8, #0

  add x0, x8, x1, lsl #3
  instead of:
  lsl x9, x1, #3
  add x0, x9, w8, uxtw

Reviewers: t.p.northover, jmolloy

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24747

llvm-svn: 282413
2016-09-26 15:34:47 +00:00
Evandro Menezes e45de8a5ec Add support to optionally limit the size of jump tables.
Many high-performance processors have a dedicated branch predictor for
indirect branches, commonly used with jump tables.  As sophisticated as such
branch predictors are, they tend to have well defined limits beyond which
their effectiveness is hampered or even nullified.  One such limit is the
number of possible destinations for a given indirect branches that such
branch predictors can handle.

This patch considers a limit that a target may set to the number of
destination addresses in a jump table.

Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar
<aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.

Differential revision: https://reviews.llvm.org/D21940

llvm-svn: 282412
2016-09-26 15:32:33 +00:00
Quentin Colombet fd8c95adf4 [RegisterBankInfo] Uniquely generate ValueMapping.
This is a step toward statically allocate ValueMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.

llvm-svn: 282324
2016-09-24 04:53:52 +00:00
Quentin Colombet fd0ab5c660 [AArch64][RegisterBankInfo] Sanity check TableGen'ed like inputs.
Make sure the entries written to mimic the behavior of TableGen are
sane.

llvm-svn: 282220
2016-09-23 00:59:07 +00:00
Quentin Colombet 5b16d931dc [AArch64][RegisterBankInfo] Switch to TableGen'ed like PartialMapping.
Statically instanciate the most common PartialMappings. This should
be closer to what the code would look like when TableGen support is
added for GlobalISel. As a side effect, this should improve compile
time.

llvm-svn: 282215
2016-09-23 00:14:36 +00:00
Quentin Colombet 0afa7d6b82 [RegisterBankInfo] Use array instead of SmallVector for BreakDown.
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.

We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.

llvm-svn: 282213
2016-09-23 00:14:30 +00:00
Tim Northover a5e38fa00d GlobalISel: handle stack-based parameters on AArch64.
llvm-svn: 282153
2016-09-22 13:49:25 +00:00
Quentin Colombet 6a76323c64 [RegisterBankInfo] Move to statically allocated RegisterBank.
This commit is basically the first step toward what will
RegisterBankInfo look when it gets TableGen'ed.

It introduces a XXXGenRegisterBankInfo.def file that is what TableGen
will issue at some point. Moreover, the RegBanks field in
RegisterBankInfo changed to reflect the static (compile time) aspect of
the information.

llvm-svn: 282131
2016-09-22 02:10:37 +00:00
Tim Northover 9a46718378 GlobalISel: produce correct code for signext/zeroext ABI flags.
We still don't really have an equivalent of "AssertXExt" in DAG, so we don't
exploit the guarantees on the receiving side yet, but this should produce
conservatively correct code on iOS ABIs.

llvm-svn: 282069
2016-09-21 12:57:45 +00:00
Tim Northover 862758ec14 GlobalISel: pass Function to lowerFormalArguments directly (NFC).
The only implementation that exists immediately looks it up anyway, and the
information is needed to handle various parameter attributes (stored on the
function itself).

llvm-svn: 282068
2016-09-21 12:57:35 +00:00
Diana Picus 2a3f066349 Revert "AArch64: Set shift bit of TLSLE HI12 add instruction"
This reverts commit r282057 because it broke the buildbots - see e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/12063

llvm-svn: 282058
2016-09-21 08:24:41 +00:00
Lei Liu 6c87f23526 AArch64: Set shift bit of TLSLE HI12 add instruction
Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model.  This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000.

Reviewers: t.p.northover, peter.smith, rovka

Subscribers: salim.nasser, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24702

llvm-svn: 282057
2016-09-21 07:41:41 +00:00
Evandro Menezes 9b5d89513b Revert part of "AArch64: Do not test for CPUs, use SubtargetFeatures"
This reverts part of commit 119e358d9635c8d1f3e7aee67e3ea3b8a62f8db6 by
removing FeatureUseRSqrt et al per request by Eric Christopher
<echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 282001
2016-09-20 19:02:09 +00:00
Evandro Menezes ba4926efde Revert "[AArch64] Use the reciprocal estimation machinery"
This reverts commit b7d42b0048f65346e9fa37fb65defeea7ce8c337 per request by
Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 282000
2016-09-20 19:02:06 +00:00
Evandro Menezes 61a1273d27 Revert "[AArch64] Properly validate the reciprocal estimation."
This reverts commit ad8ca1528242e2a4cb363e3779309e70eb7a430e per request by
Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 281999
2016-09-20 19:02:02 +00:00
Tim Northover b18ea162df GlobalISel: split aggregates for PCS lowering
This should match the existing behaviour for passing complicated struct and
array types, in particular HFAs come through like that from Clang.

For C & C++ we still need to somehow support all the weird ABI flags, or at
least those that are present in the IR (signext, byval, ...), and stack-based
parameter passing.

llvm-svn: 281977
2016-09-20 15:20:36 +00:00
Diana Picus a53660e4a3 [AArch64] Fix encoding for lsl #12 in add/sub immediates
Whenever an add/sub immediate needs a fixup, we set that immediate field to zero,
which is correct, but we also set the shift bits to zero, which is not true for
instructions that use lsl #12. This patch makes sure that if lsl #12 was used,
it will appear in the encoding of the instruction.

Differential Revision: https://reviews.llvm.org/D23930

llvm-svn: 281898
2016-09-19 11:10:18 +00:00
Nirav Dave 2364748a49 Defer asm errors to post-statement failure
Recommitting after fixing AsmParser initialization and X86 inline asm
error cleanup.

Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281762
2016-09-16 18:30:20 +00:00
Ahmed Bougacha 85ef4a1c47 [AArch64][GlobalISel] Add default regbank mapping for int<>FP.
llvm-svn: 281739
2016-09-16 15:12:46 +00:00
Ahmed Bougacha 7b3b2e7f65 [AArch64][GlobalISel] Add default regbank mapping for G_FCMP.
llvm-svn: 281738
2016-09-16 15:12:43 +00:00
Ahmed Bougacha 90637f6196 [AArch64][GlobalISel] Add default regbank mapping for FP ops.
These should have all their operands - even scalars - go on FPR.

llvm-svn: 281737
2016-09-16 15:12:40 +00:00
Ahmed Bougacha 7306313e6d [AArch64][GlobalISel] Add default regbank mappings for mixed-type ops.
We used to only support instructions with same-type operands.
Instead, use the per-register type information to map each
operand more accurately.

llvm-svn: 281734
2016-09-16 14:44:51 +00:00
Ahmed Bougacha b532360dd6 [AArch64][GlobalISel] Use the generic DefaultMapping as the default.
This lets generic logic handle the common case, instead of having to
implement applyMappingImpl for each instruction.

llvm-svn: 281720
2016-09-16 12:33:34 +00:00
Eric Christopher 4367c7fb9a Move the Mangler from the AsmPrinter down to TLOF and clean up the
TLOF API accordingly.

llvm-svn: 281708
2016-09-16 07:33:15 +00:00
Evandro Menezes 19b2aed308 [AArch64] Support for FP FMA when -ffp-contract=fast
Currently, the machine combiner can proceed matching when -ffast-math is on.
It should also match when only -ffp-contract=fast is specified as was the
case before when DAGCombiner was doing the job.

Patch by: Abderrazek Zaafrani <a.zaafrani@samsung.com>.

Differential Revision: https://reviews.llvm.org/D24366

llvm-svn: 281649
2016-09-15 19:55:23 +00:00
Tim Northover 22d82cf179 GlobalISel: legalize GEP instructions with small offsets.
llvm-svn: 281602
2016-09-15 11:02:19 +00:00
Tim Northover 4cf0a482bc GlobalISel: relax type constraints on G_ICMP to allow pointers.
llvm-svn: 281600
2016-09-15 10:40:38 +00:00
Tim Northover 32a078ad1a GlobalISel: remove "unsized" LLT
It was only really there as a sentinel when instructions had to have precisely
one type. Now that registers are typed, each register really has to have a type
that is sized.

llvm-svn: 281599
2016-09-15 10:09:59 +00:00
Tim Northover 5ae8350af6 GlobalISel: cache pointer sizes in LLT
Otherwise everything that needs to work out what size they are has to keep a
DataLayout handy, which is a bit silly and very annoying.

llvm-svn: 281597
2016-09-15 09:20:34 +00:00
Matt Arsenault 1b9fc8ed65 Finish renaming remaining analyzeBranch functions
llvm-svn: 281535
2016-09-14 20:43:16 +00:00
Matt Arsenault e8e0f5cac6 Make analyzeBranch family of instruction names consistent
analyzeBranch was renamed to use lowercase first, rename
the related set to match.

llvm-svn: 281506
2016-09-14 17:24:15 +00:00
Matt Arsenault a2b036e88b AArch64: Use TTI branch functions in branch relaxation
The main change is to return the code size from
InsertBranch/RemoveBranch.

Patch mostly by Tim Northover

llvm-svn: 281505
2016-09-14 17:23:48 +00:00
Sanjay Patel 1ed771f5d7 getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281495
2016-09-14 16:37:15 +00:00
Sanjay Patel b1f0a0f4a8 getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI
llvm-svn: 281493
2016-09-14 16:05:51 +00:00
Sanjay Patel 5f6bb6cd24 getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI
llvm-svn: 281490
2016-09-14 15:43:44 +00:00
Sanjay Patel bd6fca1419 getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281489
2016-09-14 15:21:00 +00:00
Tim Northover 1c7825fd79 GlobalISel: mark pointer stores as legal on AArch64.
llvm-svn: 281448
2016-09-14 08:28:54 +00:00
Matthias Braun 1af1414d4d AArch64: Cleanup tailcall CC check, enable swiftcc.
Cleanup/change the code that checks for possible tailcall conventions to
look the same as the one in the X86 target. This makes the distinction
between calling conventions that can guarnatee tailcalls and the ones
that may tailcall more obvious.

- Add Swift to the mayTailCall list
- PreserveMost seemed to be incorrectly part of the guarnteed tail call
  list, move it to the mayTailCall list.

llvm-svn: 281376
2016-09-13 19:27:38 +00:00
Nico Weber e204c48d16 Revert r281336 (and r281337), it caused PR30372.
llvm-svn: 281361
2016-09-13 18:17:00 +00:00
Nirav Dave 9fa8af2180 Defer asm errors to post-statement failure
Recommitting after fixing AsmParser Initialization.

Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281336
2016-09-13 13:55:06 +00:00
Diana Picus 4b97288184 [AArch64] Support stackmap/patchpoint in getInstSizeInBytes
We currently return 4 for stackmaps and patchpoints, which is very optimistic
and can in rare cases cause the branch relaxation pass to fail to relax certain
branches.

This patch causes getInstSizeInBytes to return a pessimistic estimate of the
size as the number of bytes requested in the stackmap/patchpoint. In the future,
we could provide a more accurate estimate by sharing some of the logic in
AArch64::LowerSTACKMAP/PATCHPOINT.

Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750

Differential Revision: https://reviews.llvm.org/D24073

llvm-svn: 281301
2016-09-13 07:45:17 +00:00
Eric Christopher 04c7db31e8 Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's causing errors on the sanitizer bots.
This reverts commit r281249.

llvm-svn: 281280
2016-09-13 00:19:29 +00:00
Nirav Dave c0c0f7a196 [MC] Defer asm errors to post-statement failure
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281249
2016-09-12 20:03:02 +00:00
Duncan P. N. Exon Smith 1872096f1e CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MI
Now that MachineBasicBlock::reverse_instr_iterator knows when it's at
the end (since r281168 and r281170), implement
MachineBasicBlock::reverse_iterator directly on top of an
ilist::reverse_iterator by adding an IsReverse template parameter to
MachineInstrBundleIterator.  This replaces another hard-to-reason-about
use of std::reverse_iterator on list iterators, matching the changes for
ilist::reverse_iterator from r280032 (see the "out of scope" section at
the end of that commit message).  MachineBasicBlock::reverse_iterator
now has a handle to the current node and has obvious invalidation
semantics.

r280032 has a more detailed explanation of how list-style reverse
iterators (invalidated when the pointed-at node is deleted) are
different from vector-style reverse iterators like std::reverse_iterator
(invalidated on every operation).  A great motivating example is this
commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp.

Note: If your out-of-tree backend deletes instructions while iterating
on a MachineBasicBlock::reverse_iterator or converts between
MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator,
you'll need to update your code in similar ways to r280032.  The
following table might help:

                  [Old]              ==>             [New]
        delete &*RI, RE = end()                   delete &*RI++
        RI->erase(), RE = end()                   RI++->erase()
      reverse_iterator(I)                 std::prev(I).getReverse()
      reverse_iterator(I)                          ++I.getReverse()
    --reverse_iterator(I)                            I.getReverse()
      reverse_iterator(std::next(I))                 I.getReverse()
                RI.base()                std::prev(RI).getReverse()
                RI.base()                         ++RI.getReverse()
              --RI.base()                           RI.getReverse()
     std::next(RI).base()                           RI.getReverse()

(For more details, have a look at r280032.)

llvm-svn: 281172
2016-09-11 18:51:28 +00:00
Justin Lebar adbf09e8cf [CodeGen] Split out the notions of MI invariance and MI dereferenceability.
Summary:
An IR load can be invariant, dereferenceable, neither, or both.  But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.

This patch splits up the notions of invariance and dereferenceability at
the MI level.  It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23371

llvm-svn: 281151
2016-09-11 01:38:58 +00:00
Tim Northover 25d1286e5a GlobalISel: remove G_TYPE and G_PHI
These instructions were only necessary when type information was stored in the
MachineInstr (because only generic MachineInstrs possessed a type). Now that
it's in MachineRegisterInfo, COPY and PHI work fine.

llvm-svn: 281037
2016-09-09 11:47:31 +00:00
Tim Northover 0f140c769a GlobalISel: move type information to MachineRegisterInfo.
We want each register to have a canonical type, which means the best place to
store this is in MachineRegisterInfo rather than on every MachineInstr that
happens to use or define that register.

Most changes following from this are pretty simple (you need an MRI anyway if
you're going to be doing any transformations, so just check the type there).
But legalization doesn't really want to check redundant operands (when, for
example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's
operand type field to encode these constraints and limit legalization's work.

As an added bonus, more validation is possible, both in MachineVerifier and
MachineIRBuilder (coming soon).

llvm-svn: 281035
2016-09-09 11:46:34 +00:00
Eric Christopher 98ddbdb563 AArch64 .arch directive - Include default arch attributes with extensions.
Fix the .arch asm parser to use the full set of features for the architecture
and any extensions on the command line. Add and update testcases accordingly
as well as add an extension that was used but not supported.

llvm-svn: 280971
2016-09-08 17:27:03 +00:00
Evandro Menezes 405c90e6cc [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for branches.

llvm-svn: 280736
2016-09-06 19:22:29 +00:00
Evandro Menezes 77e6b5d4e0 [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for stores.

llvm-svn: 280735
2016-09-06 19:22:27 +00:00
Evandro Menezes 199cad4f17 [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for loads.

llvm-svn: 280734
2016-09-06 19:22:19 +00:00
Tim Northover 8d8812c5d7 GlobalISel: add a G_PHI instruction to give phis a type.
They're another source of generic vregs, which are going to need a type on the
definition when we remove the register width from MachineRegisterInfo.

llvm-svn: 280412
2016-09-01 20:45:41 +00:00
Tim Northover 11a2354670 GlobalISel: use G_TYPE to annotate physregs with a type.
More preparation for dropping source types from MachineInstrs: regsters coming
out of already-selected code (i.e. non-generic instructions) don't have a type,
but that information is needed so we must add it manually.

This is done via a new G_TYPE instruction.

llvm-svn: 280292
2016-08-31 21:24:02 +00:00
Diana Picus 760c757633 Use abstraction in AArch64AsmPrinter::lowerSTACKMAP. NFCI
Use functionality from StackMapOpers instead of hardcoding an operand access.

llvm-svn: 280230
2016-08-31 12:43:49 +00:00
Diana Picus 16c818820b Typo fixes. NFC
llvm-svn: 280229
2016-08-31 12:43:44 +00:00
James Y Knight d7d9e1069b Replace incorrect "#ifdef DEBUG" with "#ifndef NDEBUG".
The former is simply wrong -- the code will either never be used or will
always be used, rather than being dependent upon whether it's built with
debug assertions enabled.

The macro DEBUG isn't ever set by the llvm build system. But, the macro
DEBUG(X) is defined (unconditionally) if you happen to include
llvm/Support/Debug.h.

The code in Value.h which was erroneously protected by the #ifdef DEBUG
didn't even compile -- you can't cast<> from an LLVMOpaqueValue
directly. Fortunately, it was never invoked, as Core.cpp included
Value.h before Debug.h.

The conditionalized code in AArch64CollectLOH.cpp was previously always
used, as it includes Debug.h.

llvm-svn: 280056
2016-08-30 03:16:16 +00:00
Tim Northover edb3c8ccb8 GlobalISel: legalize frem to a libcall on AArch64.
llvm-svn: 279988
2016-08-29 19:07:16 +00:00
Tim Northover fe5f89ba14 GlobalISel: rework CallLowering so that it can be used for libcalls too.
There should be no functional change here, I'm just making the implementation
of "frem" (to libcall) legalization easier for a followup.

llvm-svn: 279987
2016-08-29 19:07:08 +00:00
Evandro Menezes a8a25ca905 [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for loads.

llvm-svn: 279976
2016-08-29 16:04:37 +00:00
Quentin Colombet a94caa5673 [AArch64][CallLowering] Do not assert for not implemented part.
When doing the ABI lowering, report a failure to the caller instead of
asserting. This gives a chance for the caller to recover.

llvm-svn: 279890
2016-08-27 00:18:28 +00:00
Manman Ren 66b54e9f32 Swift Calling Convetion: add support for AArch64.
It will just be the same as the regular calling convention.

rdar://28029509

llvm-svn: 279853
2016-08-26 19:28:17 +00:00
Tim Northover 85cf564c51 AArch64: avoid assertion on illegal types in performFDivCombine.
In the code to detect fixed-point conversions and make use of AArch64's special
instructions, we weren't prepared for weird types. The fptosi direction got
fixed recently, but not the similar sitofp code.

llvm-svn: 279852
2016-08-26 18:52:31 +00:00
Chad Rosier 58f505ba24 [AArch64] Avoid materializing constant values when generating csel instructions.
Differential Revision: https://reviews.llvm.org/D23677

llvm-svn: 279849
2016-08-26 18:05:50 +00:00
Reid Kleckner a5b1eef846 [MC] Move .cv_loc management logic out of MCContext
MCContext already has many tasks, and separating CodeView out from it is
probably a good idea. The .cv_loc tracking was modelled on the DWARF
tracking which lived directly in MCContext.

Removes the inclusion of MCCodeView.h from MCContext.h, so now there are
only 10 build actions while I hack on CodeView support instead of 265.

llvm-svn: 279847
2016-08-26 17:58:37 +00:00
Tim Northover bc1701c7fb GlobalISel: mark G_FPEXT legal from float to double.
llvm-svn: 279845
2016-08-26 17:46:22 +00:00
Tim Northover 30bd36e3fc GlobalISel: mark G_FCMP legal on float & double.
llvm-svn: 279844
2016-08-26 17:46:19 +00:00
Tim Northover 051b8ad3d9 GlobalISel: simplify G_ICMP legalization regime.
It's unclear how the old

    %res(32) = G_ICMP { s32, s32 } intpred(eq), %0, %1

is actually different from an s1 verison

    %res(1) = G_ICMP { s1, s32 } intpred(eq), %0, %1

so we'll remove it for now.

llvm-svn: 279843
2016-08-26 17:46:17 +00:00
Tim Northover cecee56abb GlobalISel: legalize sdiv and srem operations.
llvm-svn: 279842
2016-08-26 17:46:13 +00:00
Tim Northover 7a753d9bec GlobalISel: legalize under-width divisions.
llvm-svn: 279841
2016-08-26 17:46:06 +00:00
Tim Northover 1d18a99a53 GlobalISel: mark selects legal
llvm-svn: 279840
2016-08-26 17:46:03 +00:00
Tim Northover 5d0eaa4e79 GlobalISel: mark float/int conversions legal
llvm-svn: 279839
2016-08-26 17:45:58 +00:00
Chad Rosier 39c1dbb845 [AArch64] Avoid materializing constant 1 by using csinc, rather than csel.
This is similar to what was done in r261675, but for CSINC rather than CSINV.

Differential Revision: https://reviews.llvm.org/D23892

llvm-svn: 279822
2016-08-26 14:01:55 +00:00
Tim Northover d8a6d7ce91 GlobalISel: mark overflow bit of overflow ops legal.
It's expected this will map to NZCV register class and be properly selectable.

llvm-svn: 279761
2016-08-25 17:37:41 +00:00
Tim Northover fe880a8801 GlobalISel: mark simple ops legal even on types < 32-bit.
The 32-bit variants of these operations don't depend on the bits not being
operated on, so they also naturally model operations narrower than the actual
register width.

llvm-svn: 279760
2016-08-25 17:37:39 +00:00
Tim Northover 7a1ec0141a GlobalISel: mark pointer constants as legal on AArch64.
llvm-svn: 279759
2016-08-25 17:37:35 +00:00
Tim Northover 438c77ca1a GlobalISel: perform multi-step legalization
llvm-svn: 279758
2016-08-25 17:37:32 +00:00
Tim Northover 2c4a838e24 GlobalISel: mark small extends as legal on AArch64
llvm-svn: 279757
2016-08-25 17:37:25 +00:00
Matthias Braun 1eb473680a MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.

Differential Revision: http://reviews.llvm.org/D23850

llvm-svn: 279698
2016-08-25 01:27:13 +00:00
George Burgess IV 381fc0ee3c Make some LLVM_CONSTEXPR variables const. NFC.
This patch changes LLVM_CONSTEXPR variable declarations to const
variable declarations, since LLVM_CONSTEXPR expands to nothing if the
current compiler doesn't support constexpr. In all of the changed
cases, it looks like the code intended the variable to be const instead
of sometimes-constexpr sometimes-not.

llvm-svn: 279696
2016-08-25 01:05:08 +00:00
Evandro Menezes 5395187fe5 [AArch64] Adjust the feature set for Exynos M1.
Enable zero cycle zeroing.

llvm-svn: 279648
2016-08-24 18:17:30 +00:00
Philip Reames e83c4b30ca [stackmaps] More extraction of common code [NFCI]
General cleanup before starting to work on the part I want to actually change.

llvm-svn: 279586
2016-08-23 23:33:29 +00:00
Tim Northover 6cd4b23a0f GlobalISel: legalize integer comparisons on AArch64.
Next step is doing both legalizations at the same time! Marvel at GlobalISel's
cunning.

llvm-svn: 279566
2016-08-23 21:01:26 +00:00
Tim Northover b3a0be4d38 GlobalISel: legalize conditional branches on AArch64.
llvm-svn: 279565
2016-08-23 21:01:20 +00:00
Tim Northover a01bece1dc GlobalISel: extend legalizer interface to handle multiple types.
Instructions like G_ICMP have multiple types that may need to be legalized (the
boolean output and nearly arbitrary inputs in this case). So the legalizer must
be capable of deciding what to do for each of them separately.

llvm-svn: 279554
2016-08-23 19:30:42 +00:00
Tim Northover 456a3c03ac GlobalISel: mark pointer casts legal on AArch64.
llvm-svn: 279553
2016-08-23 19:30:38 +00:00
Tim Northover 3c73e367c0 GlobalISel: legalize 1-bit load/store and mark 8/16 bit variants legal on AArch64.
llvm-svn: 279548
2016-08-23 18:20:09 +00:00
Matt Arsenault 567631bdd4 BranchRelaxation: Fix handling of blocks with multiple conditional
branches

Looping over all terminators exposed AArch64 tests hitting
an assert from analyzeBranch failing. I believe these cases
were miscompiled before.

e.g.
  fcmp s0, s1
  b.ne LBB0_1
  b.vc LBB0_2
  b LBB0_2
LBB0_1:
  ; Large block
LBB0_2:
 ; ...

Both of the individual conditional branches need to
be expanded, since neither can reach the final block.

Split the original block into ones which analyzeBranch
will be able to understand.

llvm-svn: 279499
2016-08-23 01:30:30 +00:00
Tim Northover a11be04769 GlobalISel: support legalization of G_FCONSTANTs
llvm-svn: 279341
2016-08-19 22:40:08 +00:00
Tim Northover ea904f9424 GlobalISel: teach legalizer how to handle integer constants.
llvm-svn: 279340
2016-08-19 22:40:00 +00:00
Saleem Abdulrasool dab786fb78 AArch64: remove extraneous padding
The structs BarrierOp, PrefetchOp, PSBHintOp are in AArch64AsmParser.cpp
(inside anonymous namespace).  This diff changes the order of fields and
removes the excessive padding (8 bytes).

Patch by Alexander Shaposhnikov!

llvm-svn: 279173
2016-08-18 22:35:06 +00:00
Michael Kuperstein 2bc3d4d46c [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.

Differential Revision: https://reviews.llvm.org/D23597

llvm-svn: 279129
2016-08-18 20:08:15 +00:00
Duncan P. N. Exon Smith 84c2da47f9 AArch64: Don't call getIterator() on iterators
Remove an unnecessary round-trip:

    iterator => operator->() => getIterator()

In some cases, the iterator is end(), so the dereference of operator->
is invalid (UB).

The testcase only crashes with r278974 (currently reverted to
investigate this), which adds an assertion for invalid dereferences of
ilist nodes.

Fixes PR29035.

llvm-svn: 279104
2016-08-18 17:58:09 +00:00
Ahmed Bougacha 33e19fe1c4 [AArch64][GlobalISel] Select floating-point binary ops.
There is no FREM instruction, but the others are straightforward.

llvm-svn: 279081
2016-08-18 16:05:11 +00:00
Ahmed Bougacha 1d0560b14d [AArch64][GlobalISel] Select G_SDIV/G_UDIV.
There is no REM instruction; that will require an expansion.
It's not obvious that should be done in select, rather than as a
(custom?) legalization.

llvm-svn: 279074
2016-08-18 15:17:13 +00:00
Justin Bogner cd1d5aaf2e Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Justin Bogner b03fd12cef Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Evandro Menezes 5a5b8dcd32 [AArch64] Adjust the scheduling model for Exynos M1.
Refine the model for the FP division unit.

llvm-svn: 278846
2016-08-16 20:35:01 +00:00
Evandro Menezes d03aff2e11 [AArch64] Adjust the scheduling model for Exynos M1.
Refine the model for the integer division unit.

llvm-svn: 278845
2016-08-16 20:34:58 +00:00
Ahmed Bougacha e4c03abddd [AArch64][GlobalISel] Select G_MUL.
llvm-svn: 278810
2016-08-16 14:37:46 +00:00
Ahmed Bougacha 59e160a19c [AArch64][GlobalISel] Factor out unsupported binop check. NFC.
We're going to need it for G_MUL, and, if other targets end up using
something similar, we can easily put it in the generic selector.

llvm-svn: 278808
2016-08-16 14:37:40 +00:00
Ahmed Bougacha 2ac5bf94bc [AArch64][GlobalISel] Select (variable) shifts.
For now, no support for immediates.

llvm-svn: 278804
2016-08-16 14:02:47 +00:00
Ahmed Bougacha 0306b5ef07 [AArch64][GlobalISel] Select p0 G_FRAME_INDEX.
And mark it as legal.

llvm-svn: 278802
2016-08-16 14:02:42 +00:00
Eli Friedman f184e4befc [AArch64LoadStoreOptimizer] Check aliasing correctly when creating paired loads/stores.
The existing code accidentally skipped the aliasing check in edge cases.

Differential revision: https://reviews.llvm.org/D23372

llvm-svn: 278562
2016-08-12 20:39:51 +00:00
Mike Aizatsky f4fdb5ddf3 [AArch64] Registering default MCInstrAnalysis
Even in this form it is useful: it can detect branch instructions.

https://github.com/google/sanitizers/issues/706

Subscribers: aemerson, rengolin

Differential Revision: https://reviews.llvm.org/D23426

llvm-svn: 278560
2016-08-12 20:28:05 +00:00
Eli Friedman 8585e9d33d [AArch64LoadStoreOpt] Handle offsets correctly for post-indexed paired loads.
Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction.

Differential revision: https://reviews.llvm.org/D23368

llvm-svn: 278559
2016-08-12 20:28:02 +00:00
Geoff Berry 22dfbc5637 [AArch64] Re-factor code shared by AArch64LoadStoreOpt and AArch64InstrInfo.
This re-factoring could cause the following slight changes in generated
code, though none were observed during testing:

- MachineScheduler could decide not to cluster some loads/stores if
  there are other load/stores with non-pairable opcodes that have the
  same base register and offset as a pairable set of load/stores.  One
  case of different MachineScheduler pairing did show up in my testing,
  but it wasn't due to this issue, but due
  BaseMemOpClusterMutation::clusterNeighboringMemOps() being unstable
  w.r.t. the order it considers memory operations.  See PR28942.

- The ImplicitNullChecks optimization could be done for more load/store
  opcodes.  This optimization isn't done for C/C++ code, so it didn't
  show up in my testing.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23365

llvm-svn: 278515
2016-08-12 15:26:00 +00:00
David Majnemer 91a02f5bee Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278477
2016-08-12 04:32:45 +00:00
David Majnemer 2d006e7673 Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278476
2016-08-12 04:32:42 +00:00
David Majnemer 562e82945e Use the range variant of find_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278443
2016-08-12 00:18:03 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
Matt Arsenault 76837df6ff AArch64: Assert on analyzeBranch failing
llvm-svn: 278366
2016-08-11 17:22:59 +00:00
Simon Pilgrim 5c91764af5 Fixed VS2015 (Update 3) warning - differing const/volatile qualifiers for overridden function
Dropped the const qualifier to match llvm::CallLowering::lowerCall

llvm-svn: 278329
2016-08-11 12:19:43 +00:00
Tim Northover 406024a108 GlobalISel: implement simple function calls on AArch64.
We're still limited in the arguments we support, but this at least handles the
basic cases.

llvm-svn: 278293
2016-08-10 21:44:01 +00:00
Silviu Baranga fa00ba3c1a [AArch64] PR28877: Don't assume we're running after legalization when creating vcvtfp2fxs
Summary:
The DAG combine transformation that was generating the
aarch64_neon_vcvtfp2fxs node was assuming that all
inputs where legal and wasn't accounting that the input
could be a v4f64 if we're trying to do the transformation
before legalization. We now bail out in this case.

All illegal types besides v4f64 were already rejected.

Fixes https://llvm.org/bugs/show_bug.cgi?id=28877.

Reviewers: jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23261

llvm-svn: 278002
2016-08-08 13:13:57 +00:00
Benjamin Kramer b7d3311c77 Move helpers into anonymous namespaces. NFC.
llvm-svn: 277916
2016-08-06 11:13:10 +00:00
Tim Northover 97d0cb3165 GlobalISel: IRTranslate PHI instructions
llvm-svn: 277835
2016-08-05 17:16:40 +00:00
Tim Northover 61c16142b4 GlobalISel: extend add widening to SUB, MUL, OR, AND and XOR.
These are the operations that are trivially identical. Division is omitted for
now because you need to use the correct sign/zero extension.

llvm-svn: 277775
2016-08-04 21:39:49 +00:00
Tim Northover 9656f1476c GlobalISel: implement narrowing for G_ADD.
llvm-svn: 277769
2016-08-04 20:54:13 +00:00
Tim Northover 2f32e7f0ac AArch64: don't assume all i128s are BUILD_PAIRs
It leads to a crash when they're not. I'm *sure* I've made this mistake before,
at least once.

llvm-svn: 277755
2016-08-04 19:32:28 +00:00
Tim Northover 1021d89398 AArch64: properly calculate cmpxchg status in FastISel.
We were relying on the misleadingly-names $status result to actually be the
status. Actually it's just a scratch register that may or may not be valid (and
is the inverse of the real ststus anyway). Success can be determined by
comparing the value loaded against the one we wanted to see for "cmpxchg
strong" loops like this.

Should fix PR28819.

llvm-svn: 277513
2016-08-02 20:22:36 +00:00
Ahmed Bougacha ad30db32e6 [AArch64][GlobalISel] Mark basic binops/memops as legal.
We currently use and test these, and select most of them. Mark them
as legal even though we don't go through the full ir->asm flow yet.

This doesn't currently have standalone tests, but the verifier will
soon learn to check that the regbankselect/select tests are legal.

llvm-svn: 277471
2016-08-02 15:10:28 +00:00
Matt Arsenault 6f1ae3c7db AArch64: Assert on branch displacement bits
llvm-svn: 277434
2016-08-02 08:56:52 +00:00
Matt Arsenault 5b54971ff9 AArch64: Consolidate branch inversion logic
llvm-svn: 277431
2016-08-02 08:30:06 +00:00
Matt Arsenault e8da145493 AArch64: BranchRelaxtion cleanups
Move some logic into TII.

llvm-svn: 277430
2016-08-02 08:06:17 +00:00
Matt Arsenault f7065e15f8 AArch64: Fix end iterator dereference
Not all blocks have terminators. I'm not sure how this wasn't
crashing before.

llvm-svn: 277427
2016-08-02 07:20:09 +00:00
Evandro Menezes 82e245a202 [AArch64] Add support for Samsung Exynos M2 (NFC).
llvm-svn: 277364
2016-08-01 18:39:45 +00:00
Diana Picus ab5a4c7dbb [AArch64] Return the correct size for TLSDESC_CALLSEQ
The branch relaxation pass is computing the wrong offsets because it assumes
TLSDESC_CALLSEQ eats up 4 bytes, when in fact it is lowered to an instruction
sequence taking up 16 bytes. This can become a problem in huge files with lots
of TLS accesses, as it may slowly move branch targets out of the range computed
by the branch relaxation pass.

Fixes PR24234 https://llvm.org/bugs/show_bug.cgi?id=24234

Differential Revision: https://reviews.llvm.org/D22870

llvm-svn: 277331
2016-08-01 08:38:49 +00:00
Diana Picus 850043b25a [AArch64] Register passes so they can be run by llc
Initialize all AArch64-specific passes in the TargetMachine so they can be run
by llc. This can lead to conflicts in opt with some command line options that
share the same name as the pass, so I took this opportunity to do some cleanups:
* rename all relevant command line options from "aarch64-blah" to
  "aarch64-enable-blah" and update the tests accordingly
* run clang-format on their declarations
* move all these declarations to a common place (the TargetMachine) as opposed
  to having them scattered around (AArch64BranchRelaxation and
  AArch64AddressTypePromotion were the only offenders)

llvm-svn: 277322
2016-08-01 05:56:57 +00:00
Tim Northover a51575ffa2 CodeGen: improve MachineInstrBuilder & MachineIRBuilder interface
For MachineInstrBuilder, having to manually use RegState::Define is ugly and
makes register definitions clunkier than they need to be, so this adds two
convenience functions: addDef and addUse.

For MachineIRBuilder, we want to avoid BuildMI's first-reg-is-def rule because
it's hidden away and causes bugs. So this patch switches buildInstr to
returning a MachineInstrBuilder and adding *all* operands via addDef/addUse.

NFC.

llvm-svn: 277176
2016-07-29 17:43:52 +00:00
Ahmed Bougacha 6db3cfe2da [AArch64][GlobalISel] Select G_XOR.
llvm-svn: 277173
2016-07-29 16:56:25 +00:00
Ahmed Bougacha 7adfac56b3 [AArch64][GlobalISel] Select G_LOAD/G_STORE.
Mostly straightforward as we ignore addressing modes and just
use the base + unsigned immediate offset (always 0) variants.

This currently fails to select extloads because we have yet to
agree on a representation.

llvm-svn: 277171
2016-07-29 16:56:16 +00:00
Sjoerd Meijer 0eb96ed0de TargetInstrInfo: add virtual function getInstSizeInBytes
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.

Differential Revision: https://reviews.llvm.org/D22885

llvm-svn: 277126
2016-07-29 08:16:16 +00:00
Matthias Braun 941a705b7b MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
Ahmed Bougacha 8550509b64 [AArch64][GlobalISel] Select G_BR.
This is the first unsized instruction we support; move down the
'sized' check to binops.

llvm-svn: 277007
2016-07-28 17:15:15 +00:00
Ahmed Bougacha d7748d6491 [AArch64][GlobalISel] Select GPR G_SUB.
llvm-svn: 277003
2016-07-28 16:58:35 +00:00
Ahmed Bougacha 61a7928dde [AArch64][GlobalISel] Select GPR G_AND.
llvm-svn: 277002
2016-07-28 16:58:31 +00:00
Ahmed Bougacha 46c05fc861 [GlobalISel] Remove types on selected insts instead of using LLT().
LLT() has a particular meaning: it's one invalid type. But we really
want selected instructions to have no type whatsoever.

Also verify that types don't linger after ISel, and enable the verifier
on the AArch64 select test.

llvm-svn: 277001
2016-07-28 16:58:27 +00:00
Sjoerd Meijer 89217f8835 TargetInstrInfo: rename GetInstSizeInBytes to getInstSizeInBytes. NFC
Differential Revision: https://reviews.llvm.org/D22925

llvm-svn: 276997
2016-07-28 16:32:22 +00:00
Zijiao Ma e56a53a9b3 Add unittests to {ARM | AArch64}TargetParser.
Add unittest to {ARM | AArch64}TargetParser,and by the way correct problems as below:
1.Correct a incorrect indexing problem in AArch64TargetParser. The architecture enumeration
 is shared across ARM and AArch64 in original implementation.But In the code,I just used the
 index which was offset by the ARM, and this would index into the array incorrectly. To make
 AArch64 has its own arch enum,or we will do a lot of slowly iterating.
2.Correct a spelling error. The parameter of llvm::AArch64::getArchExtName.
3.Correct a writing mistake, in llvm::ARM::parseArchISA.

Differential Revision: https://reviews.llvm.org/D21785

llvm-svn: 276957
2016-07-28 06:11:18 +00:00
Diana Picus c65d8bdcf2 Typo fix. NFC
llvm-svn: 276879
2016-07-27 15:13:25 +00:00
Ahmed Bougacha 6756a2c953 [GlobalISel] Introduce an instruction selector.
And implement it for AArch64, supporting x/w ADD/OR.

Differential Revision: https://reviews.llvm.org/D22373

llvm-svn: 276875
2016-07-27 14:31:55 +00:00
Ahmed Bougacha 5e402eec7b [AArch64] Mark various *Info classes as 'final'. NFC.
llvm-svn: 276874
2016-07-27 14:31:46 +00:00
Ahmed Bougacha b8459d14ef [AArch64] Define AArch64RegisterInfo as a class, not a struct. NFC.
llvm-svn: 276873
2016-07-27 14:31:40 +00:00
Tim Northover e2e0067352 GlobalISel[AArch64]: support pointer types in argument lowering.
They're basically i64 for AArch64, but we'll leave them intact for stranger
targets. Also add some tests for the (very few) other cases we can handle right
now.

llvm-svn: 276689
2016-07-25 21:01:17 +00:00
Joel Jones 373d7d30dd MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.

The corresponding change to clang is in: http://reviews.llvm.org/D16538

Patch by: Joel Jones

Differential Revision: https://reviews.llvm.org/D16213

llvm-svn: 276654
2016-07-25 17:18:28 +00:00
Chandler Carruth 488cb137a9 Fix a GCC error due to this member name also being a type name. This
should fix the build with GCC 4.9 at least. Not sure if this is the
right name or fix, but I've followed up on the original commit.

llvm-svn: 276522
2016-07-23 07:50:05 +00:00
George Burgess IV 8a457ddfd8 Fix include case. NFC.
llvm-svn: 276465
2016-07-22 20:15:19 +00:00
Tim Northover 33b07d6725 GlobalISel: implement legalization pass, with just one transformation.
This adds the actual MachineLegalizeHelper to do the work and a trivial pass
wrapper that legalizes all instructions in a MachineFunction. Currently the
only transformation supported is splitting up a vector G_ADD into one acting on
smaller vectors.

llvm-svn: 276461
2016-07-22 20:03:43 +00:00
David Majnemer 1182dd8ed9 [AArch64] Cleanup sign extend in genAlternativeCodeSequence
Use the machinery in MathExtras instead of rolling it by hand.

This fixes PR28624.

llvm-svn: 276366
2016-07-21 23:46:56 +00:00
Akira Hatanaka b8d2873d93 [AArch64][Inline-Asm] Return the 32-bit floating point register class
when constraint "w" is used on a 32-bit operand.

This enables compiling the following code, which used to error out in
the backend:

void foo1(int a) {
  asm volatile ("sqxtn h0, %s0\n" : : "w"(a):);
}

Fixes PR28633.

llvm-svn: 276344
2016-07-21 21:39:05 +00:00
Geoff Berry 4ff2e36d32 [AArch64] Load/store opt: Don't count transient instructions towards search limits.
Summary:
This change also changes findMatchingInsn and
findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account
when tracking register defs and uses, which could potentially inhibit
these optimizations in the presence of debug information.

Reviewers: mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22582

llvm-svn: 276293
2016-07-21 15:20:25 +00:00
Geoff Berry 24c81e8d7c [AArch64] Register AArch64LoadStoreOptimizer so it can be run by llc -run-pass. NFCI.
llvm-svn: 276193
2016-07-20 21:45:58 +00:00
Ahmed Bougacha a0cdd79070 [AArch64][FastISel] Select -O0 legal cmpxchg.
At -O0, cmpxchg survives AtomicExpand: it's mostly straightforward
to select it in fast-isel, and let the pseudo be expanded later.

extractvalues on the result are the tricky part: the generic logic
only works for legal types (and it would be painful to make it
support illegal types), so we can only support i32/i64 cmpxchg.

llvm-svn: 276183
2016-07-20 21:12:32 +00:00
Ahmed Bougacha b0674d1143 [AArch64][FastISel] Select atomic stores into STLR.
llvm-svn: 276182
2016-07-20 21:12:27 +00:00
Tim Northover 62ae568bbb GlobalISel: implement low-level type with just size & vector lanes.
This should be all the low-level instruction selection needs to determine how
to implement an operation, with the remaining context taken from the opcode
(e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math).

llvm-svn: 276158
2016-07-20 19:09:30 +00:00
David Majnemer 5d26127752 Revert "Disable this-return argument forwarding on ARM/AArch64"
Inference of the 'returned' attribute was fixed in r276008, lets try
turning the backend support back on.

This reverts commit r275677.

llvm-svn: 276081
2016-07-20 04:13:01 +00:00
Evandro Menezes 238fa76574 [AArch64] Properly validate the reciprocal estimation.
Add check for legal data types when expanding into a Newton series.

Differential Revision: https://reviews.llvm.org/D22267

llvm-svn: 276041
2016-07-19 22:31:11 +00:00
Pankaj Gode 1bfca191da [AArch64] PredictableSelectIsExpensive for Vulcan.
Adding PredictableSelectIsExpensive for Vulcan

Differential Revision: https://reviews.llvm.org/D22448

llvm-svn: 275978
2016-07-19 14:30:21 +00:00
Hal Finkel 04b5330ccd Disable this-return argument forwarding on ARM/AArch64
r275042 reverted function-attribute inference for the 'returned' attribute
because the feature triggered self-hosting failures on ARM and AArch64. James
Molloy determined that the this-return argument forwarding feature, which
directly ties the returned input argument to the returned value, was the cause.
It seems likely that this forwarding code contains, or triggers, a subtle bug.
Disabling for now until we can track that down.

llvm-svn: 275677
2016-07-16 07:07:29 +00:00
Junmo Park 45513a8eca Minor code cleanups. NFC.
llvm-svn: 275637
2016-07-15 22:42:52 +00:00
Justin Lebar 9c375817ac [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Justin Lebar 0af80cd6f0 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Jacques Pienaar 71c30a14b7 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00
Haicheng Wu f0b0127f60 [AArch64] Set COPY ZR isAsCheapAsAMove when needed.
If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now
only Kryo has both), set COPY (W|X)ZR isAsCheapAsAMove.

Differential Revision: http://reviews.llvm.org/D22360

llvm-svn: 275503
2016-07-15 00:27:01 +00:00
Justin Lebar b49185dd5f s/constexpr/LLVM_CONSTEXPR in AArch64InstrInfo.cpp.
Yet again.

llvm-svn: 275463
2016-07-14 20:08:23 +00:00
Evandro Menezes 77d470ff3c [AArch64] Adjust the scheduling model for Exynos-M1.
Enable use-postra-scheduler. (NFC)

llvm-svn: 275457
2016-07-14 19:25:46 +00:00
Justin Lebar 288b3376ae [CodeGen] Refactor MachineMemOperand::Flags's target-specific flags.
Summary:
Make the target-specific flags in MachineMemOperand::Flags real, bona
fide enum values.  This simplifies users, prevents various constants
from going out of sync, and avoids the false sense of security provided
by declaring static members in classes and then forgetting to define
them inside of cpp files.

Reviewers: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22372

llvm-svn: 275451
2016-07-14 18:15:20 +00:00
Justin Lebar a3b786a8c1 [CodeGen] Refactor MachineMemOperand's Flags enum.
Summary:
- Give it a shorter name (because we're going to refer to it often from
  SelectionDAG and friends).

- Split the flags and alignment into separate variables.

- Specialize FlagsEnumTraits for it, so we can do bitwise ops on it
  without losing type information.

- Make some enum values constants in MachineMemOperand instead.
  MOMaxBits should not be a valid Flag.

- Simplify some of the bitwise ops for dealing with Flags.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22281

llvm-svn: 275438
2016-07-14 17:07:44 +00:00
Haicheng Wu 711ca868fc [AArch64] Set FMOVS0 and FMOVD0 as isAsCheapAsAMove when needed.
If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now
only Kryo has both), set FMOVS0 and FMOVD0 isAsCheapAsAMove.

Differential Revision: http://reviews.llvm.org/D22256

llvm-svn: 275178
2016-07-12 15:31:41 +00:00
Haicheng Wu 1e39574e9f [Kryo] Enable ZCZeroing feature
This feature uses immediate #0 to zero a register.

Differential Revision: http://reviews.llvm.org/D19985

llvm-svn: 275143
2016-07-12 02:04:01 +00:00
Nirav Dave 8603062ee4 Fix branch relaxation in 16-bit mode.
Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation
to generate jumps with 16-bit sized immediates in 16-bit mode.

This fixes PR22097.

Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight

Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D20830

llvm-svn: 275068
2016-07-11 14:23:53 +00:00
Duncan P. N. Exon Smith ab53fd9b50 AArch64: Avoid implicit iterator conversions, NFC
Avoid implicit conversions from MachineInstrBundleInstr to MachineInstr*
in the AArch64 backend, mainly by preferring MachineInstr& over
MachineInstr* when a pointer isn't nullable.

llvm-svn: 274924
2016-07-08 20:29:42 +00:00
Duncan P. N. Exon Smith 25b132e93b Target: Avoid getFirstTerminator() => pointer, NFC
Stop using an implicit conversion from the return of
MachineBasicBlock::getFirstTerminator to MachineInstr*.  In two cases,
directly dereference to a MachineInstr& since later code assumes it's
valid.  In a third case, change to an iterator since later code checks
against MachineBasicBlock::end.

Although the fix for the third case avoids undefined behaviour, I expect
this doesn't cause a functionality change in practice (since the basic
block already has a terminator).

llvm-svn: 274898
2016-07-08 18:26:20 +00:00
Pankaj Gode 5d118a1676 [AArch64] Macro fusion of simple ALU ops with branches for Broadcom's Vulcan
Support for the macro fusion of simple ALU ops with branches for the Vulcan sub-target.

Patch by Meador Inge <meadori@gmail.com>

Differential Revision: http://reviews.llvm.org/D22042

llvm-svn: 274837
2016-07-08 11:13:59 +00:00
Chad Rosier 112d0e996b [AArch64] Change the preferred alignment for char and short to word alignment.
The commit reinstates r273279, which was informally approved.

Original Review: http://reviews.llvm.org/D21414

This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1.

llvm-svn: 274790
2016-07-07 20:02:18 +00:00
Chad Rosier 3972953efd Revert "[AArch64] Change the preferred alignment for char and short to word alignment"
This reverts commit r273279 as the change was not properly approved.

llvm-svn: 274768
2016-07-07 16:37:29 +00:00
Junmo Park 384d376545 fix documentation comment. NFC.
llvm-svn: 274704
2016-07-06 23:18:58 +00:00
Junmo Park 5e4bd2e7c4 Minor code cleanup. NFC.
llvm-svn: 274702
2016-07-06 23:15:18 +00:00
Matthias Braun ad0032a649 AArch64: Change modeling of zero cycle zeroing.
On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should
be used to zero a vector register. This was previously done at
instruction selection time, however the register coalescer sometimes
widened multiple vregs to the Q width because of that leading to extra
spills. This patch leaves the decision on how to zero a register to the
AsmPrinter phase where it doesn't affect register allocation anymore.

This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0.

This fixes http://llvm.org/PR27454, rdar://25866262

Differential Revision: http://reviews.llvm.org/D21826

llvm-svn: 274686
2016-07-06 21:39:33 +00:00
Matthias Braun 332bb5c236 AArch64: Replace a RegScavenger instance with LivePhysRegs
findScratchNonCalleeSaveRegister() just needs a simple liveness
analysis, use LivePhysRegs for that as it is simpler and does not depend
on the kill flags.

This commit adds a convenience function available() to LivePhysRegs:
This function returns true if the given register is not reserved and
neither the register nor any of its aliases are alive.

Differential Revision: http://reviews.llvm.org/D21865

llvm-svn: 274685
2016-07-06 21:31:27 +00:00
Tim Northover 449c15e1bd AArch64: try to fix optimized build failure.
I think the Ops filled out by Regex::match contain pointers into the temporary
std::string returned by StringRef::upper. Its lifetime is extended by the call
to match, but only until the end of that call (not to the uses of Ops later
on).

llvm-svn: 274586
2016-07-05 23:15:58 +00:00
Manman Ren 39b37c0f9d Fix an ordering problem in r274431
llvm-svn: 274582
2016-07-05 22:24:44 +00:00
Tim Northover e6ae6767d9 AArch64: TableGenerate system instruction operands.
The way the named arguments for various system instructions are handled at the
moment has a few problems:

  - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp
  - That weird Mapping class that I have no idea what I was on when I thought
    it was a good idea.
  - Searches are performed linearly through the entire list.
  - We print absolutely all registers in upper-case, even though some are
    canonically mixed case (SPSel for example).
  - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated
    to comments in our implementation, with a slightly opaque hex value
    indicating the canonical encoding LLVM will use.

This adds a new TableGen backend to produce efficiently searchable tables, and
switches AArch64 over to using that infrastructure.

llvm-svn: 274576
2016-07-05 21:23:04 +00:00
Balaram Makam d4acd7ed10 Revert r259387: "AArch64: Implement missed conditional compare sequences."
This reverts commit r259387 because it inserts illegal code after legalization
    in some backends where i64 OR type is illegal for example.

llvm-svn: 274573
2016-07-05 20:24:05 +00:00
Tim Northover 01dff9d18a AArch64: use correct SDValue # when looking for bitfield placement.
The other use really does only care about the SDNode (it checks the
opcode against a whitelist), but bitFieldPlacement can be misled if
the node produces multiple results.

Patch by Ismail Badawi.

llvm-svn: 274567
2016-07-05 18:02:57 +00:00
Benjamin Kramer 3bc1edf95b Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended.

llvm-svn: 274431
2016-07-02 11:41:39 +00:00
Craig Topper 2bd8b4b180 [CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended.
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.

llvm-svn: 274337
2016-07-01 06:54:47 +00:00
Duncan P. N. Exon Smith 632987296f Target: Remove unused arguments from overrideSchedPolicy, NFC
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator.  One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.

llvm-svn: 274304
2016-07-01 00:23:27 +00:00
Duncan P. N. Exon Smith e4f5e4f4d1 CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr.  In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

llvm-svn: 274287
2016-06-30 22:52:52 +00:00
Rafael Espindola d86e8bb0ed Delete MCCodeGenInfo.
MC doesn't really care about CodeGen stuff, so this was just
complicating target initialization.

llvm-svn: 274258
2016-06-30 18:25:11 +00:00
Rafael Espindola db6bd02185 Delete unused includes. NFC.
llvm-svn: 274225
2016-06-30 12:19:16 +00:00
Pankaj Gode f4b25547cf [AArch64] Add Broadcom Vulcan scheduling model.
Adding scheduling model for new Broadcom Vulcan core (ARMv8.1A).

Differential Revision: http://reviews.llvm.org/D21728

llvm-svn: 274213
2016-06-30 06:42:31 +00:00
Craig Topper bc56e3ba53 Use ShuffleVectorSDNode::isSplat member method instead of static method isSplatMask where the mask came directly from getMask() on a shuffle node.
llvm-svn: 274208
2016-06-30 04:38:51 +00:00
Duncan P. N. Exon Smith 9cfc75c214 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189
2016-06-30 00:01:54 +00:00
Matthias Braun ec3330328a AArch64: Remove unnecessary namespace llvm; NFC
llvm-svn: 273975
2016-06-28 00:54:33 +00:00
Rafael Espindola 3beef8d6db Move shouldAssumeDSOLocal to Target.
Should fix the shared library build.

llvm-svn: 273958
2016-06-27 23:15:57 +00:00
Evandro Menezes 3830479f41 [AArch64] Adjust the model for the vector by element FP multiplies on Exynos M1. (NFC)
llvm-svn: 273708
2016-06-24 18:58:54 +00:00
Evandro Menezes 62c70101c3 [AArch64] Model the cost of vector by element FP multiplies on Exynos M1. (NFC)
llvm-svn: 273630
2016-06-23 23:43:23 +00:00
Chad Rosier 8c106bcbe8 [AArch64] Remove an overly aggressive assert.
llvm-svn: 273458
2016-06-22 19:18:52 +00:00
Krzysztof Parzyszek e116d500a7 [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.

Differential Revision: http://reviews.llvm.org/D20376

llvm-svn: 273403
2016-06-22 12:54:25 +00:00
Rafael Espindola 2b7fef681f Delete more dead code.
Found by gcc 6.

llvm-svn: 273402
2016-06-22 12:44:16 +00:00
Haicheng Wu a783bac50b [Kryo] Enable loop prefetcher.
Differential Revision: http://reviews.llvm.org/D21535

llvm-svn: 273329
2016-06-21 22:47:56 +00:00
Evandro Menezes 230083ff9d [AArch64] Change the preferred alignment for char and short to word alignment
Differential Revision: http://reviews.llvm.org/D21414

llvm-svn: 273279
2016-06-21 15:55:18 +00:00
Silviu Baranga aee40fc61c [AArch64] Restore codegen for AArch64 Cortex-A72/A73 after NFCI
Summary:
Code generation for Cortex-A72/Cortex-A73 was accidentally changed
by r271555, which was a NFCI. The isCortexA57() predicate was not true
for Cortex-A72/Cortex-A73 before r271555 (since it was checking the CPU
string). Because Cortex-A72/Cortex-A73 inherit all features from Cortex-A57,
all decisions previously guarded by isCortexA57() are now taken.

This change restores the behaviour before r271555 by adding separate
ProcA72/ProcA73, which have the required features to preserve code
generation.

Reviewers: kristof.beyls, aadg, mcrosier, rengolin

Subscribers: mcrosier, llvm-commits, aemerson, t.p.northover, MatzeB, rengolin

Differential Revision: http://reviews.llvm.org/D21182

llvm-svn: 273277
2016-06-21 15:53:54 +00:00
David Majnemer e61e4bfd87 Replace silly uses of 'signed' with 'int'
llvm-svn: 273244
2016-06-21 05:10:24 +00:00
Evandro Menezes 8057265cc2 [AArch64] Adjust the loop buffer size for Exynos M1 (NFC)
llvm-svn: 273185
2016-06-20 18:39:41 +00:00
Pankaj Gode 0aab2e398a [AARCH64] Add support for Broadcom Vulcan
Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A).

Differential Revision: http://reviews.llvm.org/D21500

llvm-svn: 273148
2016-06-20 11:13:31 +00:00
NAKAMURA Takumi fe1202c4cb Untabify.
llvm-svn: 273129
2016-06-20 00:37:41 +00:00
Matt Arsenault f1c3906a5d AArch64: Fix range loop contradicting comment above it
llvm-svn: 272959
2016-06-16 21:21:49 +00:00
Tim Northover daa1c018b0 AArch64: allow MOV (imm) alias to be printed
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.

llvm-svn: 272865
2016-06-16 01:42:25 +00:00
Tim Northover 389a1e39ea AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints.
Of course the assembly was right but because the opcode was MOVZWi it was
encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and
would go horribly wrong on a CPU.

No idea how this bug survived this long. It seems nobody is using that aspect
of patchpoints.

llvm-svn: 272831
2016-06-15 20:33:36 +00:00
Pankaj Gode a67fea464c Test commit after access grant. Modified comment by adding a period.
llvm-svn: 272808
2016-06-15 17:24:52 +00:00
Kevin Enderby d2d2ce9b9f Update the AArch64ExternalSymbolizer to print literal strings as escaped strings
so it is the same as the MCExternalSymbolizer.

rdar://17349181

llvm-svn: 272588
2016-06-13 21:08:57 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Chad Rosier d1f6c840ee [AArch64] Move comments closer to relevant check. NFC.
llvm-svn: 272430
2016-06-10 20:49:18 +00:00
Chad Rosier c5083c2ccf [AArch64] Refactor a check earlier. NFC.
llvm-svn: 272429
2016-06-10 20:47:14 +00:00
Evandro Menezes a3a0a60cff [AArch64] Add preferred alignments for Exynos M1
Differential Revision: http://reviews.llvm.org/D21203

llvm-svn: 272400
2016-06-10 16:00:18 +00:00
Saleem Abdulrasool 6c19ffc8bc AArch64: support the `.arch` directive in the IAS
Add support to the AArch64 IAS for the `.arch` directive.  This allows the
assembly input to use architectural functionality in part of a file.  This is
used in existing code like BoringSSL.

Resolves PR26016!

llvm-svn: 272241
2016-06-09 02:56:40 +00:00
Benjamin Kramer c321e53402 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
Quentin Colombet d1cd30b218 [AArch64][RegisterBankInfo] G_OR are fine on either GPR or FPR.
Teach AArch64RegisterBankInfo that G_OR can be mapped on either GPR or
FPR for 64-bit or 32-bit values.

Add test cases demonstrating how this information is used to coalesce a
computation on a single register bank.

llvm-svn: 272170
2016-06-08 16:53:32 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Quentin Colombet a4ac7cdac2 [AArch64][RegisterBankInfo] Use the generic implementation of copyCost.
Long term we may want to give high cost at FPR to/from GPR copies.

llvm-svn: 272086
2016-06-08 01:24:00 +00:00
Quentin Colombet cfbdee2312 [RegisterBankInfo] Add a size argument for the cost of copy.
The cost of a copy may be different based on how many bits we have to
copy around. E.g., a 8-bit copy may be different than a 32-bit copy.

llvm-svn: 272084
2016-06-08 01:11:03 +00:00
Geoff Berry 486f49cc63 Reapply [AArch64] Fix isLegalAddImmediate() to return true for valid negative values.
Originally reviewed here: http://reviews.llvm.org/D17463

llvm-svn: 272023
2016-06-07 16:48:43 +00:00
Chad Rosier be879ea751 [AArch64] Spot SBFX-compatible code expressed with sign_extend.
This is very similar to r271677, but for extracts from i32 with the SIGN_EXTEND
acting on a arithmetic shift.

llvm-svn: 271717
2016-06-03 20:05:49 +00:00
Chad Rosier 2d658703e1 [AArch64] Spot SBFX-compatbile code expressed with sign_extend_inreg.
We were assuming all SBFX-like operations would have the shl/asr form, but often
when the field being extracted is an i8 or i16, we end up with a
SIGN_EXTEND_INREG acting on a shift instead.

This is a port of r213754 from ARM to AArch64.

llvm-svn: 271677
2016-06-03 15:00:09 +00:00
Sjoerd Meijer d906bf1369 RAS extensions are part of ARMv8.2-A. This change enables them by introducing a
new instruction to ARM and AArch64 targets and several system registers.

Patch by: Roger Ferrer Ibanez and Oliver Stannard

Differential Revision: http://reviews.llvm.org/D20282

llvm-svn: 271670
2016-06-03 14:03:27 +00:00
Sanjay Patel dba8b4c04d transform obscured FP sign bit ops into a fabs/fneg using TLI hook
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.

This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4

In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.

Differential Revision: http://reviews.llvm.org/D19391

llvm-svn: 271573
2016-06-02 20:01:37 +00:00
Matthias Braun 651cff42c4 AArch64: Do not test for CPUs, use SubtargetFeatures
Testing for specific CPUs has a number of problems, better use subtarget
features:
- When some tweak is added for a specific CPU it is often desirable for
  the next version of that CPU as well, yet we often forget to add it.
- It is hard to keep track of checks scattered around the target code;
  Declaring all target specifics together with the CPU in the tablegen
  file is a clear representation.
- Subtarget features can be tweaked from the command line.

To discourage people from using CPU checks in the future I removed the
isCortexXX(), isCyclone(), ... functions. I added an getProcFamily()
function for exceptional circumstances but made it clear in the comment
that usage is discouraged.

Reformat feature list in AArch64.td to have 1 feature per line in
alphabetical order to simplify merging and sorting for out of tree
tweaks.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D20762

llvm-svn: 271555
2016-06-02 18:03:53 +00:00
Geoff Berry 66f6b65fed [PEI, AArch64] Use empty spaces in stack area for local stack slot allocation.
Summary:
If the target requests it, use emptry spaces in the fixed and
callee-save stack area to allocate local stack objects.

AArch64: Change last callee-save reg stack object alignment instead of
size to leave a gap to take advantage of above change.

Reviewers: t.p.northover, qcolombet, MatzeB

Subscribers: rengolin, mcrosier, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D20220

llvm-svn: 271527
2016-06-02 16:22:07 +00:00
Sjoerd Meijer 0b7bb16e5b This adds support for Cortex-A73 as an available target.
Differential Revision: http://reviews.llvm.org/D20865

llvm-svn: 271508
2016-06-02 10:48:52 +00:00
Rafael Espindola 4d29099f7f Delete AArch64II::MO_CONSTPOOL.
A constant pool holding the address of a variable in equivalent to
a got entry. It produces exactly the same instruction sequence as a
got use and unlike a got use this is not uniqued by the linker.

llvm-svn: 271311
2016-05-31 18:31:14 +00:00
Matthias Braun bcfd23673b AArch64: Fix indentation
llvm-svn: 271084
2016-05-28 01:06:51 +00:00
Matthias Braun 27b6692fe2 AArch64Subtarget: Use default member initializers
llvm-svn: 271057
2016-05-27 22:14:09 +00:00
Benjamin Kramer 3e9a5d3468 Apply clang-tidy's misc-static-assert where it makes sense.
Also fold conditions into assert(0) where it makes sense. No functional
change intended.

llvm-svn: 270982
2016-05-27 11:36:04 +00:00
Chad Rosier 14aa2ad1f4 [AArch64] Generate rev16/rev32 from bswap + srl when upper bits are known zero.
Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high
16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to
(rotr (bswap i64 x), 32), if the high 32-bits of x are zero.

test_rev_w_srl16:            test_rev_w_srl16:
  and w8, w0, #0xffff          and     w8, w0, #0xffff
  rev w8, w8           --->    rev16   w0, w8
  lsr     w0, w8, #16

test_rev_x_srl32:            test_rev_x_srl32:
  rev x8, x8           --->    rev32   x0, x8
  lsr x0, x8, #32

llvm-svn: 270896
2016-05-26 19:41:33 +00:00
Chad Rosier 816a67da49 [AArch64] Generate a BFI/BFXIL from 'or (and X, MaskImm), OrImm'.
If and only if the value being inserted sets only known zero bits.

This combine transforms things like

  and w8, w0, #0xfffffff0
  movz w9, #5
  orr w0, w8, w9

into

  movz w8, #5
  bfxil w0, w8, #0, #4

The combine is tuned to make sure we always reduce the number of instructions.
We avoid churning code for what is expected to be performance neutral changes
(e.g., converted AND+OR to OR+BFI).

Differential Revision: http://reviews.llvm.org/D20387

llvm-svn: 270846
2016-05-26 13:27:56 +00:00
Rafael Espindola a224de06bc Use shouldAssumeDSOLocal on AArch64.
This reduces code duplication and now AArch64 also handles PIE.

llvm-svn: 270844
2016-05-26 12:42:55 +00:00
Rafael Espindola 6b93bf5783 Don't repeat name in comment and git-clang-format.
llvm-svn: 270785
2016-05-25 22:44:06 +00:00
Rafael Espindola 6b4baa5f58 Sort includes.
llvm-svn: 270769
2016-05-25 21:37:29 +00:00
Jun Bum Lim b21d4e17a2 [AArch64] Disable narrow load merge by default
Summary:
As this optimization converts two loads into one load with two shift instructions,
it could potentially hurt performance if a loop is arithmetic operation intensive.

Reviewers: t.p.northover, mcrosier, jmolloy

Subscribers: evandro, jmolloy, aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20172

llvm-svn: 270251
2016-05-20 18:45:49 +00:00
Chad Rosier 02f25a9565 [AArch64 ] Generate a BFXIL from 'or (and X, Mask0Imm),(and Y, Mask1Imm)'.
Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms is a shifted
mask (e.g., 0x000ffff0).  Both 'and's must have a single use.

This changes code like:

  and w8, w0, #0xffff000f
  and w9, w1, #0x0000fff0
  orr w0, w9, w8

into

  lsr w8, w1, #4
  bfi w0, w8, #4, #12

llvm-svn: 270063
2016-05-19 14:19:47 +00:00
Chad Rosier e006202a4d [AArch64] Push comment into function. NFC.
llvm-svn: 270003
2016-05-18 23:51:17 +00:00
Rafael Espindola 8c34dd8257 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

llvm-svn: 269988
2016-05-18 22:04:49 +00:00
Chad Rosier 91294c5bdc [AArch64] Minor refactoring. NFC.
llvm-svn: 269963
2016-05-18 17:43:11 +00:00
Rafael Espindola 38af4d6347 Trivial cleanups.
This just clang formats and cleans comments in an area I am about to
post a patch for review.

llvm-svn: 269946
2016-05-18 16:00:24 +00:00
Geoff Berry 74cb718ea9 [AArch64] Fix bug in large stack spill slot handling (PR27717)
Summary:
Fix bug in MachO path where a frame index offset would not be reserved
for handling large frames when an extra non-used callee-save register
was saved.  In the case where the extra register is reserved or not a
GPR (e.g. %FP in the MachO case), this would lead to the register
scavenger later failing when called from PrologEpilogInserter.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20185

llvm-svn: 269697
2016-05-16 20:52:28 +00:00
Chad Rosier c73d559df4 Use proper capitalization and punctuation per coding standards. NFC.
llvm-svn: 269652
2016-05-16 12:55:01 +00:00
Benjamin Kramer a65b610bd2 Move helper classes into anonymous namespaces. NFC.
llvm-svn: 269591
2016-05-15 15:18:11 +00:00
Chad Rosier 7e8dd51d50 [AArch64] Update local variable names to conform to coding standard. NFC.
llvm-svn: 269573
2016-05-14 18:56:28 +00:00
Chad Rosier 08d9908ea9 [AArch64] Simplify logic to reduce vertical space. NFC.
llvm-svn: 269512
2016-05-13 22:53:13 +00:00
Paul Osmialowski 4f5b3be7f1 add support for -print-imm-hex for AArch64
Most immediates are printed in Aarch64InstPrinter using 'formatImm' macro,
but not all of them.

Implementation contains following rules:

- floating point immediates are always printed as decimal
- signed integer immediates are printed depends on flag settings
  (for negative values 'formatImm' macro prints the value as i.e -0x01
  which may be convenient when imm is an address or offset)
- logical immediates are always printed as hex
- the 64-bit immediate for advSIMD, encoded in "a🅱️c:d:e:f:g:h" is always printed as hex
- the 64-bit immedaite in exception generation instructions like:
  brk, dcps1, dcps2, dcps3, hlt, hvc, smc, svc is always printed as hex
- the rest of immediates is printed depends on availability
  of -print-imm-hex

Signed-off-by: Maciej Gabka <maciej.gabka@arm.com>
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>

Differential Revision: http://reviews.llvm.org/D16929

llvm-svn: 269446
2016-05-13 18:00:09 +00:00
Justin Bogner 283e3bd793 SDAG: Implement Select instead of SelectImpl in AArch64DAGToDAGISel
This one has a lot of code churn, but it's all mechanical and
straightforward.

- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
  the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.

Part of llvm.org/pr26808.

llvm-svn: 269379
2016-05-12 23:10:30 +00:00
Justin Bogner 3525da7466 SDAG: Clean up dangling nodes in AArch64ISelDAGToDAG::SelectImpl
When we convert to the void Select interface, leaving unreferenced
nodes around won't be allowed anymore.

Part of llvm.org/pr26808.

llvm-svn: 269345
2016-05-12 20:54:27 +00:00
Chad Rosier 179480a4ea [AArch64] Give function a more appropriate name.
llvm-svn: 269335
2016-05-12 19:51:58 +00:00
Chad Rosier 042ac2c17f [AArch64] Minor refactoring to simplify future patch. NFC.
llvm-svn: 269329
2016-05-12 19:38:18 +00:00
Chad Rosier 39481ace40 [AArch64] Remove command-line option use for testing.
The EXTR combine has been in tree for over 2 years without complain, so go ahead
and remove the option.

llvm-svn: 269292
2016-05-12 13:27:24 +00:00
Chad Rosier 9926a5e31d [AArch64] Add support for unscaled narrow stores in getUsefulBitsForUse.
llvm-svn: 269263
2016-05-12 01:42:01 +00:00
Chad Rosier fe7bba4ee4 [AArch64] Remove floating-point narrow stores from getUsefulBitsForUse.
While not impossible, it's unlikely we'd be performing bitwise operations on FP
values.

llvm-svn: 269260
2016-05-12 01:04:15 +00:00
Chad Rosier 23a1a9a66d [AArch64] Improve getUsefulBitsForUse for narrow stores.
For narrow stores (e.g., strb, srth) we know the upper bits of the register are
unused/not useful. In some cases we can use this information to eliminate
unnecessary instructions.

For example, without this patch we generate (from the 2nd test case):

 ldr w8, [x0]
 and w8, w8, #0xfff0
 bfxil w8, w2, #16, #4
 strh w8, [x1]

and after the patch the 'and' is removed:

 ldr w8, [x0]
 bfxil w8, w2, #16, #4
 strh w8, [x1]
 ret

During the lowering of the bitfield insert instruction the 'and' is eliminated
because we know the upper 16-bits that are masked off are unused and the lower
4-bits that are masked off are overwritten by the insert itself. Therefore, the
'and' is unnecessary.

Differential Revision: http://reviews.llvm.org/D20175

llvm-svn: 269226
2016-05-11 20:19:54 +00:00
Weiming Zhao 095c271131 [AArch64] Fix DAG selection for cmps for fp16 type
Summary: When emitting comparison for fp16, in addition to promote the LHS and RHS to fp32, we need to change the VT as well.

Reviewers: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19922

llvm-svn: 269151
2016-05-11 01:26:32 +00:00
Tim Northover 9508a70adc AArch64: allow vN to represent 64-bit registers in inline asm.
Unlike xN/wN, the size of vN is genuinely ambiguous in the assembly, so we
should try to infer what was intended from the type. But only down to 64-bits
(vN can never represent sN, hN or bN).

llvm-svn: 269132
2016-05-10 22:26:45 +00:00
Jonas Paulsson 8e5b0c65cc [foldMemoryOperand()] Pass LiveIntervals to enable liveness check.
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.

In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.

Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861

llvm-svn: 269026
2016-05-10 08:09:37 +00:00
Matthias Braun 31d19d43c7 CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

llvm-svn: 269011
2016-05-10 03:21:59 +00:00
Silviu Baranga f60be28ed8 [AArch64] Implement lowering of the X constraint on AArch64
Summary:
This implements the lowering of the X constraint on
AArch64.

The default behaviour of the X constraint lowering is to
restrict it to "f". This is a problem because the "f"
constraint is not implemented on AArch64 and would be too
restrictive anyway. Therefore, the AArch64 hook will
lower this to "w" (if the operand is a floating point or
vector) or "r" otherwise.

The implementation is similar with the one added for
ARM (r267411).

This is the AArch64 side of the fix for http://llvm.org/PR26493

Reviewers: rengolin

Subscribers: aemerson, rengolin, llvm-commits, t.p.northover

Differential Revision: http://reviews.llvm.org/D19967

llvm-svn: 268907
2016-05-09 11:10:44 +00:00
Geoff Berry a5335647d5 [AArch64] Combine callee-save and local stack SP adjustment instructions.
Summary:
If a function needs to allocate both callee-save stack memory and local
stack memory, we currently decrement/increment the SP in two steps:
first for the callee-save area, and then for the local stack area.  This
changes the code to allocate them both at once at the very beginning/end
of the function.  This has two benefits:

1) there is one fewer sub/add micro-op in the prologue/epilogue

2) the stack adjustment instructions act as a scheduling barrier, so
moving them to the very beginning/end of the function increases post-RA
scheduler's ability to move instructions (that only depend on argument
registers) before any of the callee-save stores

This change can cause an increase in instructions if the original local
stack SP decrement could be folded into the first store to the stack.
This occurs when the first local stack store is to stack offset 0.  In
this case we are trading off one more sub instruction for one fewer sub
micro-op (along with benefits (2) and (3) above).

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18619

llvm-svn: 268746
2016-05-06 16:34:59 +00:00
Jun Bum Lim 33be4997ed [AArch64] Decouple zero store promotion from narrow ld merge. NFC.
Summary: This change refactors to decouple the zero store promotion from the narrow ld merge and add a flag (enable-narrow-ld-merge=true) to control the narrow ld merge optimization.

Reviewers: jmolloy, t.p.northover, mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19885

llvm-svn: 268744
2016-05-06 15:08:57 +00:00
Justin Bogner b012699741 SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

llvm-svn: 268693
2016-05-05 23:19:08 +00:00
Chad Rosier 777dc513a0 [AArch64] Remove unused MBP headers/dependency. NFC.
llvm-svn: 268682
2016-05-05 20:58:38 +00:00
Evandro Menezes d23324aab1 [AArch64] Add cheap as move instructions for Exynos M1
llvm-svn: 268549
2016-05-04 20:47:25 +00:00
Evandro Menezes bcb95cd0ed [AArch64] Use the reciprocal estimation machinery
This patch adds support for estimating the square root, its reciprocal and
division or reciprocal using the combiner generic reciprocal machinery.

llvm-svn: 268539
2016-05-04 20:18:27 +00:00
Matthias Braun e25bbd0bb8 AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZ
This fixes -verify-machineinstrs complaints when compiling
test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp

llvm-svn: 268360
2016-05-03 04:54:16 +00:00
Matthias Braun d1aabb2813 livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

llvm-svn: 268340
2016-05-03 00:24:32 +00:00