Commit Graph

2322 Commits

Author SHA1 Message Date
Chad Rosier d6daac4746 [AArch64] Removed the narrow load merging code in the ld/st optimizer.
This feature has been disabled for some time now, so remove cruft.

Differential Revision: https://reviews.llvm.org/D26248

llvm-svn: 286110
2016-11-07 15:27:22 +00:00
Peter Collingbourne 4e76019e34 Support: Remove MemoryObject and DataStreamer interfaces.
These interfaces are no longer used.

Differential Revision: https://reviews.llvm.org/D26222

llvm-svn: 285774
2016-11-02 00:08:37 +00:00
Alex Bradbury 58eba09949 [TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h
As it stands, the OperandMatchResultTy is only included in the generated
header if there is custom operand parsing. However, almost all backends
make use of MatchOperand_Success and friends from OperandMatchResultTy for
e.g. parseRegister. This is a pain when starting an AsmParser for a new
backend that doesn't yet have custom operand parsing. Move the enum to
MCTargetAsmParser.h.

This patch is a prerequisite for D23563

Differential Revision: https://reviews.llvm.org/D23496

llvm-svn: 285705
2016-11-01 16:32:05 +00:00
Tim Northover 037af52c8b GlobalISel: allow truncating pointer casts on AArch64.
llvm-svn: 285615
2016-10-31 18:31:09 +00:00
Tim Northover cdf23f1d93 GlobalISel: translate stack protector intrinsics
llvm-svn: 285614
2016-10-31 18:30:59 +00:00
Matthias Braun 7d78614ae9 AArch64DeadRegisterDefinitionsPass: Cleanup; NFC
- Fix doxygen file comment
- reduce indentation in loop
- Factor out some common subexpressions
- Move independent helper function out of class
- Fix Changed flag (this is not strictly NFC but a bugfix, but the flag
  seems ignored anyway)

llvm-svn: 285488
2016-10-29 01:03:41 +00:00
Evandro Menezes ca8370396a [AArch64] Create feature set for Samsung Exynos-M2
Since Exynos-M2 improved the FP square root unit a bit over the one in
Exynos-M1, it does not benefit from using the Newton series for such
operations.

llvm-svn: 285246
2016-10-26 22:06:20 +00:00
Chad Rosier 0c621fda0d [AArch64] Avoid materializing constant 1 when generating cneg instructions.
Instead of

 cmp w0, #1
 orr w8, wzr, #0x1
 cneg w0, w8, ne

we now generate

 cmp w0, #1
 csinv w0, w0, wzr, eq

PR28965

llvm-svn: 285217
2016-10-26 18:15:32 +00:00
Evandro Menezes 7696dc0685 [AArch64] Adjust the cost model for Exynos M1.
Modify the maximum jump table size.

llvm-svn: 285106
2016-10-25 20:05:42 +00:00
Evandro Menezes eff2bd9d4f [AArch64] Optionally use the Newton series for reciprocal estimation
Add support for estimating the square root or its reciprocal and division or
reciprocal using the combiner generic Newton series.

Differential revision: https://reviews.llvm.org/D25291

llvm-svn: 284986
2016-10-24 16:14:58 +00:00
Joel Jones 504bf334b0 AArch64 ILP32 relocations for assembly and ELF
Summary:
Add relocations for AArch64 ILP32. Includes:
  - Addition of definitions for R_AARCH32_*
  - Definition of new -target-abi: ilp32
  - Definition of data layout string
  - Tests for added relocations. Not comprehensive, but matches
    existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64".
  - Tests for llvm-readobj

Reviewers: zatrazz, peter.smith, echristo, t.p.northover

Subscribers: aemerson, rengolin, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25159

llvm-svn: 284973
2016-10-24 13:37:13 +00:00
Abderrazek Zaafrani 9daf8110c8 Set the vectorizer MaxInterleaveFactor for Exynos.
llvm-svn: 284839
2016-10-21 16:28:27 +00:00
Abderrazek Zaafrani 9f382f53d1 Test commit
llvm-svn: 284832
2016-10-21 15:24:08 +00:00
Bjorn Pettersson 9fcd605d1e [AArch64] Corrected spill size for DDD register class. NFCI
Summary:
The spill size was incorrectly set to 196 bits,
which isn't a multiple of 8. This problem was detected when
experimenting with asserts that the spill size should be a
multiple of the byte size.

New corrected value for the spill size is set to 192 bits.

Note that tablegen (RegisterInfoEmitter) will divide the
size set in the RegisterClass definition by 8. So this
change should not have any impact on the tablegen output
(trunc(192/8) == trunc(196/8) == 24 bytes).

Reviewers: t.p.northover

Subscribers: llvm-commits, aemerson, rengolin

Differential Revision: https://reviews.llvm.org/D25818

llvm-svn: 284814
2016-10-21 09:53:42 +00:00
Benjamin Kramer 2a8bef8769 Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

llvm-svn: 284721
2016-10-20 12:20:28 +00:00
Evandro Menezes ce8d60156c [AArch64] Avoid materializing 0.0 when generating FP SELECT
Transform `a == 0.0 ? 0.0 : x` to `a == 0.0 ? a : x` and `a != 0.0 ? x : 0.0`
to `a != 0.0 ? x : a` to avoid materializing 0.0 for FCSEL, since it does not
have to be materialized beforehand for FCMP, as it has a form that has 0.0
as an implicit operand.

Differential Revision: https://reviews.llvm.org/D24808

llvm-svn: 284531
2016-10-18 20:37:35 +00:00
Tim Northover 55782222c0 GlobalISel: select small binary operations on AArch64.
AArch64 actually supports many 8-bit operations under the definition used by
GlobalISel: the designated information-carrying bits of a GPR32 get the right
value if you just use the normal 32-bit instruction.

llvm-svn: 284526
2016-10-18 20:03:48 +00:00
Tim Northover 4494d69862 GlobalISel: support floating-point constants on AArch64.
Patch from Ahmed Bougacha.

llvm-svn: 284523
2016-10-18 19:47:57 +00:00
Tim Northover 020d104496 GlobalISel: support wider range of load/store sizes in AArch64.
llvm-svn: 284406
2016-10-17 18:36:53 +00:00
Tim Northover 69fa84a6e9 GlobalISel: rename legalizer components to match others.
The previous names were both misleading (the MachineLegalizer actually
contained the info tables) and inconsistent with the selector & translator (in
having a "Machine") prefix. This should make everything sensible again.

The only functional change is the name of a couple of command-line options.

llvm-svn: 284287
2016-10-14 22:18:18 +00:00
Quentin Colombet b3f5a8c644 [AArch64][RegisterBankInfo] Switch to fully static opds mapping for G_BITCAST.
NFC.

llvm-svn: 284146
2016-10-13 18:46:38 +00:00
Quentin Colombet 6b87a3109c [AArch64][RegisterBankInfo] Provide alternative mappings for 64-bit load
This allows RegBankSelect in greedy mode to get rid some of the cross
register bank copies when loads are involved in the chain of
computation.

llvm-svn: 284097
2016-10-13 01:01:23 +00:00
Quentin Colombet cd80e97e88 [AArch64][RegisterBankInfo] Provide alternative mappings for G_BITCASTs.
Thanks to this patch, RegBankSelect is able to get rid of some register
bank copies as demonstrated in the test case.

llvm-svn: 284094
2016-10-13 00:34:48 +00:00
Quentin Colombet 45c9c1432f [AArch64][RegisterBankInfo] Describe cross regbank copies statically.
NFC.

llvm-svn: 284091
2016-10-13 00:12:06 +00:00
Quentin Colombet 9e64919b7c [AArch64][RegisterBankInfo] Use static mapping for same bank G_BITCAST.
NFC.

llvm-svn: 284090
2016-10-13 00:12:04 +00:00
Quentin Colombet db643d9091 [AArch64][MachineLegalizer] Mark more G_BITCAST as legal.
Basically any vector types that fits in a 32-bit register is also valid
as far as copies are concerned.

llvm-svn: 284089
2016-10-13 00:12:01 +00:00
Quentin Colombet f760799c40 [AArch64][RegisterBankInfo] Bump the cost of vector loads.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284088
2016-10-13 00:11:59 +00:00
Quentin Colombet f35a8c5bdc [AArch64][RegisterBankInfo] Use a proper cost for cross regbank G_BITCASTs.
This does not change anything yet, because we do not offer any
alternative mapping.

llvm-svn: 284087
2016-10-13 00:11:57 +00:00
Quentin Colombet 27b40356f7 [AArch64][RegisterBankInfo] Provide more realistic copy costs.
llvm-svn: 284086
2016-10-13 00:11:55 +00:00
Tim Northover fb8d989818 GlobalISel: support G_TRUNC selection on AArch64.
Ahmed's patch again.

llvm-svn: 284075
2016-10-12 22:49:15 +00:00
Tim Northover 69271c64d5 GlobalISel: support int <-> float conversions on AArch64.
More of Ahmed's work.

llvm-svn: 284074
2016-10-12 22:49:11 +00:00
Tim Northover 7dd378dd08 GlobalISel: select G_FCMP instructions on AArch64.
Another of Ahmed's patches.

llvm-svn: 284073
2016-10-12 22:49:07 +00:00
Tim Northover 6c02ad5e4f GlobalISel: support selection of G_ICMP on AArch64.
Patch from Ahmed Bougaca again.

llvm-svn: 284072
2016-10-12 22:49:04 +00:00
Tim Northover 5e3dbf326c GlobalISel: select G_BRCOND instructions on AArch64.
llvm-svn: 284071
2016-10-12 22:49:01 +00:00
Tim Northover 6aacd27cd7 GlobalISel: mark G_BRCOND on s1 as legal.
It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter.

llvm-svn: 284070
2016-10-12 22:48:36 +00:00
Quentin Colombet 9de30faeac [AArch64][InstrustionSelector] Teach the selector about G_BITCAST.
llvm-svn: 283973
2016-10-12 03:57:52 +00:00
Quentin Colombet cb629a897c [AArch64][InstructionSelector] Refactor the handling of copies.
Although Copies are not specific to preISel, we still have to assign them
a proper register class. However, given they are not constrained to
anything we do not have to handle the source register at the copy. It
will be properly mapped when reaching the related definition.

In the process, the handlong of G_ANYEXT is slightly modified as those
end up being selected as copy. The difference is that when register size
do not match on both sides, we need to insert SUBREG_TO_REG operation,
otherwise the post RA copy expansion will not be happy!

llvm-svn: 283972
2016-10-12 03:57:49 +00:00
Quentin Colombet 404e4350dc [AArch64][MachineLegalizer] Mark more bitcasts as legal.
Those are copies, we do not have to do any legalization action for them.

llvm-svn: 283970
2016-10-12 03:57:43 +00:00
Tim Northover c1d8c2bf8c GlobalISel: support same-size casts on AArch64.
Mostly Ahmed's work again, I'm just sprucing things up slightly before
committing.

llvm-svn: 283952
2016-10-11 22:29:23 +00:00
Tim Northover 3d38b3a4d1 GlobalISel: support selection of extend operations.
Patch mostly by Ahmed Bougaca.

llvm-svn: 283937
2016-10-11 20:50:21 +00:00
Diana Picus c93518db8c [AArch64] Allow label arithmetic with add/sub/cmp
Allow instructions such as 'cmp w0, #(end - start)' by folding the
expression into a constant. For ELF, we fold only if the symbols are in
the same section. For MachO, we fold if the expression contains only
symbols that are not linker visible.

Fixes https://llvm.org/bugs/show_bug.cgi?id=18920

Differential Revision: https://reviews.llvm.org/D23834

llvm-svn: 283862
2016-10-11 09:17:47 +00:00
Quentin Colombet d2623f8e38 [AArch64][InstructionSelector] Teach how to select FP load/store.
This patch allows to select 32 and 64-bit FP load and store.

llvm-svn: 283832
2016-10-11 00:21:14 +00:00
Quentin Colombet 0e5312787e [AArch64][InstructionSelector] Teach the selector how to handle vector OR.
This only adds the support for 64-bit vector OR. Adding more sizes is
not difficult, but it requires a bigger refactoring because ORs work on
any size, not necessarly the ones that match the width of the register
width. Right now, this is not expressed in the legalization, so don't
bother pushing the refactoring yet.

llvm-svn: 283831
2016-10-11 00:21:11 +00:00
Quentin Colombet d3126d5fb4 [AArch64][MachineLegalizer] Mark v2s32 G_LOAD as legal.
Actually every 64-bit loads are legal, but right now the API does not
offer a simple way to express that.

llvm-svn: 283829
2016-10-11 00:21:08 +00:00
Peter Collingbourne 0da86301ad Revert r283690, "MC: Remove unused entities."
llvm-svn: 283814
2016-10-10 22:49:37 +00:00
Tim Northover bdf1624367 GlobalISel: select G_GLOBAL_VALUE uses on AArch64.
llvm-svn: 283809
2016-10-10 21:50:00 +00:00
Tim Northover ad0acca544 GlobalISel: allow G_GLOBAL_VALUEs in AArch64 legalization.
llvm-svn: 283808
2016-10-10 21:49:53 +00:00
Tim Northover 2fda4b08ae GlobalISel: support selecting G_GEP instructions.
They're basically just an alias for G_ADD on AArch64.

llvm-svn: 283807
2016-10-10 21:49:49 +00:00
Tim Northover 4edc60d785 GlobalISel: support selecting constants on AArch64.
llvm-svn: 283806
2016-10-10 21:49:42 +00:00
Mehdi Amini f42454b94b Move the global variables representing each Target behind accessor function
This avoids "static initialization order fiasco"

Differential Revision: https://reviews.llvm.org/D25412

llvm-svn: 283702
2016-10-09 23:00:34 +00:00
Peter Collingbourne cc723cccab MC: Remove unused entities.
llvm-svn: 283691
2016-10-09 04:39:13 +00:00
Peter Collingbourne 5c924d7117 Target: Remove unused entities.
llvm-svn: 283690
2016-10-09 04:38:57 +00:00
Mehdi Amini 732afdd09a Turn cl::values() (for enum) from a vararg function to using C++ variadic template
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:

 va_start(ValueArgs, Desc);

with Desc being a StringRef.

Differential Revision: https://reviews.llvm.org/D25342

llvm-svn: 283671
2016-10-08 19:41:06 +00:00
Sebastian Pop eb65d72d9c [AArch64] Avoid generating indexed vector instructions for Exynos
Avoid generating indexed vector instructions for Exynos. This is needed for
fmla/fmls/fmul/fmulx. For example, the instruction

  fmla v0.4s, v1.4s, v2.s[1]

is less efficient than the instructions

  dup v2.4s, v2.s[1]
  fmla v0.4s, v1.4s, v2.4s

Patch written by Abderrazek Zaafrani.

Differential Revision: https://reviews.llvm.org/D21571

llvm-svn: 283663
2016-10-08 12:30:07 +00:00
Mehdi Amini a0016ec95f Use StringReg in TargetParser APIs (NFC)
llvm-svn: 283527
2016-10-07 08:37:29 +00:00
Matt Arsenault 36919a4f7c Move AArch64BranchRelaxation to generic code
llvm-svn: 283459
2016-10-06 15:38:53 +00:00
Matt Arsenault 0a3ea89e85 AArch64: Move remaining target specific BranchRelaxation bits to TII
llvm-svn: 283458
2016-10-06 15:38:09 +00:00
Matthias Braun 46a5238682 AArch64: Macrofusion: Split features, add missing combinations.
AArch64InstrInfo::shouldScheduleAdjacent() determines whether two
instruction can benefit from macroop fusion on apple CPUs. The list
turned out to be incomplete:
- the "rr" variants of the instructions were missing
- even the "rs" variants can have shift value == 0 and behave like the
  "rr" variants

This also splits the MacropFusion target feature into
ArithmeticBccFusion and ArithmeticCbzFusion.

Differential Revision: https://reviews.llvm.org/D25142

llvm-svn: 283243
2016-10-04 19:28:21 +00:00
Quentin Colombet 3a06701913 [AArch64][RegisterBankInfo] Add getSameKindofOperandsMapping.
Refactor the code so that the same function can be used for all
instructions with all the same operands for up to 3 operands.

This is going to be useful for cast instructions.
NFC.

llvm-svn: 283144
2016-10-03 20:20:13 +00:00
Matthias Braun a827ed8891 AArch64Subtarget: Remove unused CPUString field
llvm-svn: 283142
2016-10-03 20:17:02 +00:00
Mehdi Amini 48878ae579 Use StringRef in Datalayout API (NFC)
llvm-svn: 283013
2016-10-01 05:57:55 +00:00
Mehdi Amini 117296c0a0 Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
2016-10-01 02:56:57 +00:00
Eric Christopher 98983d0aff Remove TargetTriple from AArch64MCInstLower as it's used in few places
and can be pulled from the TargetMachine. NFC.

llvm-svn: 283000
2016-10-01 01:50:25 +00:00
Quentin Colombet a6119958ff [AArch64][RegisterBankInfo] Use the helper functions for the checks
This makes sure the helper functions work as expected.

NFC.

llvm-svn: 282961
2016-09-30 21:46:21 +00:00
Quentin Colombet 7c3fa8e361 [AArch64][RegisterBankInfo] Rename getValueMappingIdx to getValueMapping
We don't return index, we return the actual ValueMapping.

NFC.

llvm-svn: 282960
2016-09-30 21:46:19 +00:00
Quentin Colombet b4afac7b32 [AArch64][RegisterBankInfo] Compress the ValueMapping table a bit.
We don't need to have singleton ValueMapping on their own, we can just
reuse one of the elements of the 3-ops mapping.
This allows even more code sharing.

NFC.

llvm-svn: 282959
2016-09-30 21:46:17 +00:00
Quentin Colombet 7fc5fe41c5 [AArch64][RegisterBankInfo] Refactor the code to access AArch64::ValMapping
Use a helper function to access ValMapping. This should make the code
easier to understand and maintain.

NFC.

llvm-svn: 282958
2016-09-30 21:46:15 +00:00
Quentin Colombet 15dc25bb3d [AArch64][RegisterBankInfo] Rename getRegBankIdx to getRegBankIdxOffset
The function name did not make it clear that the returned value was an
offset to apply to a register bank index.

NFC.

llvm-svn: 282957
2016-09-30 21:46:12 +00:00
Quentin Colombet b2308987ab [AArch64][RegisterBankInfo] Use the static opds mapping for alt mappings
Avoid to rely on the dynamically allocated operands mapping for the
alternative mapping.
NFC.

llvm-svn: 282956
2016-09-30 21:45:56 +00:00
Quentin Colombet 4b36e0c409 [AArch64][RegisterBankInfo] Use static mapping for 3-operands instrs.
This uses a TableGen'ed like structure for all 3-operands instrs.
The output of the RegBankSelect pass should be identical but the
RegisterBankInfo will do less dynamic allocations.

llvm-svn: 282817
2016-09-30 00:10:00 +00:00
Quentin Colombet fdd303afe2 [AArch64][RegisterBankInfo] Add static value mapping for 3-op instrs.
This is the kind of input TableGen should generate at some point.
NFC.

llvm-svn: 282816
2016-09-30 00:09:58 +00:00
Quentin Colombet eb8d3da9a0 [AArch64][RegisterBankInfo] Check the statically created ValueMapping.
Make sure that the ValueMappings contain the value we expect at the
indices we expect.

NFC.

llvm-svn: 282815
2016-09-30 00:09:43 +00:00
Lei Liu 361615cfd0 AArch64: Set shift bit of TLSLE HI12 add instruction
Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model.  This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000.

Reviewers: t.p.northover, peter.smith, rovka

Subscribers: salim.nasser, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24702

llvm-svn: 282661
2016-09-29 01:05:48 +00:00
Quentin Colombet 40cbc27ff3 [RegisterBankInfo] Uniquely generate OperandsMapping.
This is a step toward statically allocate InstructionMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.

This should already improve compile time by getting rid of a bunch of
memmove of SmallVectors.

llvm-svn: 282643
2016-09-28 22:20:49 +00:00
Quentin Colombet c0f11a9fb8 [AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping.
Another step toward TableGen'ed like structure for the RegisterBankInfo
of AArch64. By doing this, we also save a bit of compile time for the
exact same output.

llvm-svn: 282550
2016-09-27 22:55:04 +00:00
Quentin Colombet caae9cd246 [AArch64][RegisterBankInfo] Fix copy/paste in comments.
NFC.

llvm-svn: 282549
2016-09-27 22:54:57 +00:00
Geoff Berry b124331db7 [TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg().
Summary:
The current implementation of isConstantPhysReg() checks for defs of
physical registers to determine if they are constant.  Some
architectures (e.g. AArch64 XZR/WZR) have registers that are constant
and may be used as destinations to indicate the generated value is
discarded, preventing isConstantPhysReg() from returning true.  This
change adds a TargetRegisterInfo hook that overrides the no defs check
for cases such as this.

Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy

Subscribers: junbuml, aemerson, mcrosier, rengolin

Differential Revision: https://reviews.llvm.org/D24570

llvm-svn: 282543
2016-09-27 22:17:27 +00:00
Geoff Berry 256fcf975f [AArch64] Improve add/sub/cmp isel of uxtw forms.
Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the
32-bit to 64-bit zero-extend can be done for free by taking advantage
of the 32-bit defining instruction zeroing the upper 32-bits of the X
register destination.  This enables better instruction selection in a
few cases, such as:

  sub x0, xzr, x8
  instead of:
  mov x8, xzr
  sub x0, x8, w9, uxtw

  madd x0, x1, x1, x8
  instead of:
  mul x9, x1, x1
  add x0, x9, w8, uxtw

  cmp x2, x8
  instead of:
  sub x8, x2, w8, uxtw
  cmp x8, #0

  add x0, x8, x1, lsl #3
  instead of:
  lsl x9, x1, #3
  add x0, x9, w8, uxtw

Reviewers: t.p.northover, jmolloy

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24747

llvm-svn: 282413
2016-09-26 15:34:47 +00:00
Evandro Menezes e45de8a5ec Add support to optionally limit the size of jump tables.
Many high-performance processors have a dedicated branch predictor for
indirect branches, commonly used with jump tables.  As sophisticated as such
branch predictors are, they tend to have well defined limits beyond which
their effectiveness is hampered or even nullified.  One such limit is the
number of possible destinations for a given indirect branches that such
branch predictors can handle.

This patch considers a limit that a target may set to the number of
destination addresses in a jump table.

Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar
<aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.

Differential revision: https://reviews.llvm.org/D21940

llvm-svn: 282412
2016-09-26 15:32:33 +00:00
Quentin Colombet fd8c95adf4 [RegisterBankInfo] Uniquely generate ValueMapping.
This is a step toward statically allocate ValueMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.

llvm-svn: 282324
2016-09-24 04:53:52 +00:00
Quentin Colombet fd0ab5c660 [AArch64][RegisterBankInfo] Sanity check TableGen'ed like inputs.
Make sure the entries written to mimic the behavior of TableGen are
sane.

llvm-svn: 282220
2016-09-23 00:59:07 +00:00
Quentin Colombet 5b16d931dc [AArch64][RegisterBankInfo] Switch to TableGen'ed like PartialMapping.
Statically instanciate the most common PartialMappings. This should
be closer to what the code would look like when TableGen support is
added for GlobalISel. As a side effect, this should improve compile
time.

llvm-svn: 282215
2016-09-23 00:14:36 +00:00
Quentin Colombet 0afa7d6b82 [RegisterBankInfo] Use array instead of SmallVector for BreakDown.
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.

We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.

llvm-svn: 282213
2016-09-23 00:14:30 +00:00
Tim Northover a5e38fa00d GlobalISel: handle stack-based parameters on AArch64.
llvm-svn: 282153
2016-09-22 13:49:25 +00:00
Quentin Colombet 6a76323c64 [RegisterBankInfo] Move to statically allocated RegisterBank.
This commit is basically the first step toward what will
RegisterBankInfo look when it gets TableGen'ed.

It introduces a XXXGenRegisterBankInfo.def file that is what TableGen
will issue at some point. Moreover, the RegBanks field in
RegisterBankInfo changed to reflect the static (compile time) aspect of
the information.

llvm-svn: 282131
2016-09-22 02:10:37 +00:00
Tim Northover 9a46718378 GlobalISel: produce correct code for signext/zeroext ABI flags.
We still don't really have an equivalent of "AssertXExt" in DAG, so we don't
exploit the guarantees on the receiving side yet, but this should produce
conservatively correct code on iOS ABIs.

llvm-svn: 282069
2016-09-21 12:57:45 +00:00
Tim Northover 862758ec14 GlobalISel: pass Function to lowerFormalArguments directly (NFC).
The only implementation that exists immediately looks it up anyway, and the
information is needed to handle various parameter attributes (stored on the
function itself).

llvm-svn: 282068
2016-09-21 12:57:35 +00:00
Diana Picus 2a3f066349 Revert "AArch64: Set shift bit of TLSLE HI12 add instruction"
This reverts commit r282057 because it broke the buildbots - see e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/12063

llvm-svn: 282058
2016-09-21 08:24:41 +00:00
Lei Liu 6c87f23526 AArch64: Set shift bit of TLSLE HI12 add instruction
Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model.  This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000.

Reviewers: t.p.northover, peter.smith, rovka

Subscribers: salim.nasser, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24702

llvm-svn: 282057
2016-09-21 07:41:41 +00:00
Evandro Menezes 9b5d89513b Revert part of "AArch64: Do not test for CPUs, use SubtargetFeatures"
This reverts part of commit 119e358d9635c8d1f3e7aee67e3ea3b8a62f8db6 by
removing FeatureUseRSqrt et al per request by Eric Christopher
<echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 282001
2016-09-20 19:02:09 +00:00
Evandro Menezes ba4926efde Revert "[AArch64] Use the reciprocal estimation machinery"
This reverts commit b7d42b0048f65346e9fa37fb65defeea7ce8c337 per request by
Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 282000
2016-09-20 19:02:06 +00:00
Evandro Menezes 61a1273d27 Revert "[AArch64] Properly validate the reciprocal estimation."
This reverts commit ad8ca1528242e2a4cb363e3779309e70eb7a430e per request by
Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 281999
2016-09-20 19:02:02 +00:00
Tim Northover b18ea162df GlobalISel: split aggregates for PCS lowering
This should match the existing behaviour for passing complicated struct and
array types, in particular HFAs come through like that from Clang.

For C & C++ we still need to somehow support all the weird ABI flags, or at
least those that are present in the IR (signext, byval, ...), and stack-based
parameter passing.

llvm-svn: 281977
2016-09-20 15:20:36 +00:00
Diana Picus a53660e4a3 [AArch64] Fix encoding for lsl #12 in add/sub immediates
Whenever an add/sub immediate needs a fixup, we set that immediate field to zero,
which is correct, but we also set the shift bits to zero, which is not true for
instructions that use lsl #12. This patch makes sure that if lsl #12 was used,
it will appear in the encoding of the instruction.

Differential Revision: https://reviews.llvm.org/D23930

llvm-svn: 281898
2016-09-19 11:10:18 +00:00
Nirav Dave 2364748a49 Defer asm errors to post-statement failure
Recommitting after fixing AsmParser initialization and X86 inline asm
error cleanup.

Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281762
2016-09-16 18:30:20 +00:00
Ahmed Bougacha 85ef4a1c47 [AArch64][GlobalISel] Add default regbank mapping for int<>FP.
llvm-svn: 281739
2016-09-16 15:12:46 +00:00
Ahmed Bougacha 7b3b2e7f65 [AArch64][GlobalISel] Add default regbank mapping for G_FCMP.
llvm-svn: 281738
2016-09-16 15:12:43 +00:00
Ahmed Bougacha 90637f6196 [AArch64][GlobalISel] Add default regbank mapping for FP ops.
These should have all their operands - even scalars - go on FPR.

llvm-svn: 281737
2016-09-16 15:12:40 +00:00
Ahmed Bougacha 7306313e6d [AArch64][GlobalISel] Add default regbank mappings for mixed-type ops.
We used to only support instructions with same-type operands.
Instead, use the per-register type information to map each
operand more accurately.

llvm-svn: 281734
2016-09-16 14:44:51 +00:00
Ahmed Bougacha b532360dd6 [AArch64][GlobalISel] Use the generic DefaultMapping as the default.
This lets generic logic handle the common case, instead of having to
implement applyMappingImpl for each instruction.

llvm-svn: 281720
2016-09-16 12:33:34 +00:00
Eric Christopher 4367c7fb9a Move the Mangler from the AsmPrinter down to TLOF and clean up the
TLOF API accordingly.

llvm-svn: 281708
2016-09-16 07:33:15 +00:00
Evandro Menezes 19b2aed308 [AArch64] Support for FP FMA when -ffp-contract=fast
Currently, the machine combiner can proceed matching when -ffast-math is on.
It should also match when only -ffp-contract=fast is specified as was the
case before when DAGCombiner was doing the job.

Patch by: Abderrazek Zaafrani <a.zaafrani@samsung.com>.

Differential Revision: https://reviews.llvm.org/D24366

llvm-svn: 281649
2016-09-15 19:55:23 +00:00
Tim Northover 22d82cf179 GlobalISel: legalize GEP instructions with small offsets.
llvm-svn: 281602
2016-09-15 11:02:19 +00:00
Tim Northover 4cf0a482bc GlobalISel: relax type constraints on G_ICMP to allow pointers.
llvm-svn: 281600
2016-09-15 10:40:38 +00:00
Tim Northover 32a078ad1a GlobalISel: remove "unsized" LLT
It was only really there as a sentinel when instructions had to have precisely
one type. Now that registers are typed, each register really has to have a type
that is sized.

llvm-svn: 281599
2016-09-15 10:09:59 +00:00
Tim Northover 5ae8350af6 GlobalISel: cache pointer sizes in LLT
Otherwise everything that needs to work out what size they are has to keep a
DataLayout handy, which is a bit silly and very annoying.

llvm-svn: 281597
2016-09-15 09:20:34 +00:00
Matt Arsenault 1b9fc8ed65 Finish renaming remaining analyzeBranch functions
llvm-svn: 281535
2016-09-14 20:43:16 +00:00
Matt Arsenault e8e0f5cac6 Make analyzeBranch family of instruction names consistent
analyzeBranch was renamed to use lowercase first, rename
the related set to match.

llvm-svn: 281506
2016-09-14 17:24:15 +00:00
Matt Arsenault a2b036e88b AArch64: Use TTI branch functions in branch relaxation
The main change is to return the code size from
InsertBranch/RemoveBranch.

Patch mostly by Tim Northover

llvm-svn: 281505
2016-09-14 17:23:48 +00:00
Sanjay Patel 1ed771f5d7 getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281495
2016-09-14 16:37:15 +00:00
Sanjay Patel b1f0a0f4a8 getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI
llvm-svn: 281493
2016-09-14 16:05:51 +00:00
Sanjay Patel 5f6bb6cd24 getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI
llvm-svn: 281490
2016-09-14 15:43:44 +00:00
Sanjay Patel bd6fca1419 getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281489
2016-09-14 15:21:00 +00:00
Tim Northover 1c7825fd79 GlobalISel: mark pointer stores as legal on AArch64.
llvm-svn: 281448
2016-09-14 08:28:54 +00:00
Matthias Braun 1af1414d4d AArch64: Cleanup tailcall CC check, enable swiftcc.
Cleanup/change the code that checks for possible tailcall conventions to
look the same as the one in the X86 target. This makes the distinction
between calling conventions that can guarnatee tailcalls and the ones
that may tailcall more obvious.

- Add Swift to the mayTailCall list
- PreserveMost seemed to be incorrectly part of the guarnteed tail call
  list, move it to the mayTailCall list.

llvm-svn: 281376
2016-09-13 19:27:38 +00:00
Nico Weber e204c48d16 Revert r281336 (and r281337), it caused PR30372.
llvm-svn: 281361
2016-09-13 18:17:00 +00:00
Nirav Dave 9fa8af2180 Defer asm errors to post-statement failure
Recommitting after fixing AsmParser Initialization.

Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281336
2016-09-13 13:55:06 +00:00
Diana Picus 4b97288184 [AArch64] Support stackmap/patchpoint in getInstSizeInBytes
We currently return 4 for stackmaps and patchpoints, which is very optimistic
and can in rare cases cause the branch relaxation pass to fail to relax certain
branches.

This patch causes getInstSizeInBytes to return a pessimistic estimate of the
size as the number of bytes requested in the stackmap/patchpoint. In the future,
we could provide a more accurate estimate by sharing some of the logic in
AArch64::LowerSTACKMAP/PATCHPOINT.

Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750

Differential Revision: https://reviews.llvm.org/D24073

llvm-svn: 281301
2016-09-13 07:45:17 +00:00
Eric Christopher 04c7db31e8 Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's causing errors on the sanitizer bots.
This reverts commit r281249.

llvm-svn: 281280
2016-09-13 00:19:29 +00:00
Nirav Dave c0c0f7a196 [MC] Defer asm errors to post-statement failure
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281249
2016-09-12 20:03:02 +00:00
Duncan P. N. Exon Smith 1872096f1e CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MI
Now that MachineBasicBlock::reverse_instr_iterator knows when it's at
the end (since r281168 and r281170), implement
MachineBasicBlock::reverse_iterator directly on top of an
ilist::reverse_iterator by adding an IsReverse template parameter to
MachineInstrBundleIterator.  This replaces another hard-to-reason-about
use of std::reverse_iterator on list iterators, matching the changes for
ilist::reverse_iterator from r280032 (see the "out of scope" section at
the end of that commit message).  MachineBasicBlock::reverse_iterator
now has a handle to the current node and has obvious invalidation
semantics.

r280032 has a more detailed explanation of how list-style reverse
iterators (invalidated when the pointed-at node is deleted) are
different from vector-style reverse iterators like std::reverse_iterator
(invalidated on every operation).  A great motivating example is this
commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp.

Note: If your out-of-tree backend deletes instructions while iterating
on a MachineBasicBlock::reverse_iterator or converts between
MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator,
you'll need to update your code in similar ways to r280032.  The
following table might help:

                  [Old]              ==>             [New]
        delete &*RI, RE = end()                   delete &*RI++
        RI->erase(), RE = end()                   RI++->erase()
      reverse_iterator(I)                 std::prev(I).getReverse()
      reverse_iterator(I)                          ++I.getReverse()
    --reverse_iterator(I)                            I.getReverse()
      reverse_iterator(std::next(I))                 I.getReverse()
                RI.base()                std::prev(RI).getReverse()
                RI.base()                         ++RI.getReverse()
              --RI.base()                           RI.getReverse()
     std::next(RI).base()                           RI.getReverse()

(For more details, have a look at r280032.)

llvm-svn: 281172
2016-09-11 18:51:28 +00:00
Justin Lebar adbf09e8cf [CodeGen] Split out the notions of MI invariance and MI dereferenceability.
Summary:
An IR load can be invariant, dereferenceable, neither, or both.  But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.

This patch splits up the notions of invariance and dereferenceability at
the MI level.  It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23371

llvm-svn: 281151
2016-09-11 01:38:58 +00:00
Tim Northover 25d1286e5a GlobalISel: remove G_TYPE and G_PHI
These instructions were only necessary when type information was stored in the
MachineInstr (because only generic MachineInstrs possessed a type). Now that
it's in MachineRegisterInfo, COPY and PHI work fine.

llvm-svn: 281037
2016-09-09 11:47:31 +00:00
Tim Northover 0f140c769a GlobalISel: move type information to MachineRegisterInfo.
We want each register to have a canonical type, which means the best place to
store this is in MachineRegisterInfo rather than on every MachineInstr that
happens to use or define that register.

Most changes following from this are pretty simple (you need an MRI anyway if
you're going to be doing any transformations, so just check the type there).
But legalization doesn't really want to check redundant operands (when, for
example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's
operand type field to encode these constraints and limit legalization's work.

As an added bonus, more validation is possible, both in MachineVerifier and
MachineIRBuilder (coming soon).

llvm-svn: 281035
2016-09-09 11:46:34 +00:00
Eric Christopher 98ddbdb563 AArch64 .arch directive - Include default arch attributes with extensions.
Fix the .arch asm parser to use the full set of features for the architecture
and any extensions on the command line. Add and update testcases accordingly
as well as add an extension that was used but not supported.

llvm-svn: 280971
2016-09-08 17:27:03 +00:00
Evandro Menezes 405c90e6cc [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for branches.

llvm-svn: 280736
2016-09-06 19:22:29 +00:00
Evandro Menezes 77e6b5d4e0 [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for stores.

llvm-svn: 280735
2016-09-06 19:22:27 +00:00
Evandro Menezes 199cad4f17 [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for loads.

llvm-svn: 280734
2016-09-06 19:22:19 +00:00
Tim Northover 8d8812c5d7 GlobalISel: add a G_PHI instruction to give phis a type.
They're another source of generic vregs, which are going to need a type on the
definition when we remove the register width from MachineRegisterInfo.

llvm-svn: 280412
2016-09-01 20:45:41 +00:00
Tim Northover 11a2354670 GlobalISel: use G_TYPE to annotate physregs with a type.
More preparation for dropping source types from MachineInstrs: regsters coming
out of already-selected code (i.e. non-generic instructions) don't have a type,
but that information is needed so we must add it manually.

This is done via a new G_TYPE instruction.

llvm-svn: 280292
2016-08-31 21:24:02 +00:00
Diana Picus 760c757633 Use abstraction in AArch64AsmPrinter::lowerSTACKMAP. NFCI
Use functionality from StackMapOpers instead of hardcoding an operand access.

llvm-svn: 280230
2016-08-31 12:43:49 +00:00
Diana Picus 16c818820b Typo fixes. NFC
llvm-svn: 280229
2016-08-31 12:43:44 +00:00
James Y Knight d7d9e1069b Replace incorrect "#ifdef DEBUG" with "#ifndef NDEBUG".
The former is simply wrong -- the code will either never be used or will
always be used, rather than being dependent upon whether it's built with
debug assertions enabled.

The macro DEBUG isn't ever set by the llvm build system. But, the macro
DEBUG(X) is defined (unconditionally) if you happen to include
llvm/Support/Debug.h.

The code in Value.h which was erroneously protected by the #ifdef DEBUG
didn't even compile -- you can't cast<> from an LLVMOpaqueValue
directly. Fortunately, it was never invoked, as Core.cpp included
Value.h before Debug.h.

The conditionalized code in AArch64CollectLOH.cpp was previously always
used, as it includes Debug.h.

llvm-svn: 280056
2016-08-30 03:16:16 +00:00
Tim Northover edb3c8ccb8 GlobalISel: legalize frem to a libcall on AArch64.
llvm-svn: 279988
2016-08-29 19:07:16 +00:00
Tim Northover fe5f89ba14 GlobalISel: rework CallLowering so that it can be used for libcalls too.
There should be no functional change here, I'm just making the implementation
of "frem" (to libcall) legalization easier for a followup.

llvm-svn: 279987
2016-08-29 19:07:08 +00:00
Evandro Menezes a8a25ca905 [AArch64] Adjust the scheduling model for Exynos M1.
Further refine the model for loads.

llvm-svn: 279976
2016-08-29 16:04:37 +00:00
Quentin Colombet a94caa5673 [AArch64][CallLowering] Do not assert for not implemented part.
When doing the ABI lowering, report a failure to the caller instead of
asserting. This gives a chance for the caller to recover.

llvm-svn: 279890
2016-08-27 00:18:28 +00:00
Manman Ren 66b54e9f32 Swift Calling Convetion: add support for AArch64.
It will just be the same as the regular calling convention.

rdar://28029509

llvm-svn: 279853
2016-08-26 19:28:17 +00:00
Tim Northover 85cf564c51 AArch64: avoid assertion on illegal types in performFDivCombine.
In the code to detect fixed-point conversions and make use of AArch64's special
instructions, we weren't prepared for weird types. The fptosi direction got
fixed recently, but not the similar sitofp code.

llvm-svn: 279852
2016-08-26 18:52:31 +00:00
Chad Rosier 58f505ba24 [AArch64] Avoid materializing constant values when generating csel instructions.
Differential Revision: https://reviews.llvm.org/D23677

llvm-svn: 279849
2016-08-26 18:05:50 +00:00
Reid Kleckner a5b1eef846 [MC] Move .cv_loc management logic out of MCContext
MCContext already has many tasks, and separating CodeView out from it is
probably a good idea. The .cv_loc tracking was modelled on the DWARF
tracking which lived directly in MCContext.

Removes the inclusion of MCCodeView.h from MCContext.h, so now there are
only 10 build actions while I hack on CodeView support instead of 265.

llvm-svn: 279847
2016-08-26 17:58:37 +00:00
Tim Northover bc1701c7fb GlobalISel: mark G_FPEXT legal from float to double.
llvm-svn: 279845
2016-08-26 17:46:22 +00:00
Tim Northover 30bd36e3fc GlobalISel: mark G_FCMP legal on float & double.
llvm-svn: 279844
2016-08-26 17:46:19 +00:00
Tim Northover 051b8ad3d9 GlobalISel: simplify G_ICMP legalization regime.
It's unclear how the old

    %res(32) = G_ICMP { s32, s32 } intpred(eq), %0, %1

is actually different from an s1 verison

    %res(1) = G_ICMP { s1, s32 } intpred(eq), %0, %1

so we'll remove it for now.

llvm-svn: 279843
2016-08-26 17:46:17 +00:00
Tim Northover cecee56abb GlobalISel: legalize sdiv and srem operations.
llvm-svn: 279842
2016-08-26 17:46:13 +00:00
Tim Northover 7a753d9bec GlobalISel: legalize under-width divisions.
llvm-svn: 279841
2016-08-26 17:46:06 +00:00
Tim Northover 1d18a99a53 GlobalISel: mark selects legal
llvm-svn: 279840
2016-08-26 17:46:03 +00:00
Tim Northover 5d0eaa4e79 GlobalISel: mark float/int conversions legal
llvm-svn: 279839
2016-08-26 17:45:58 +00:00
Chad Rosier 39c1dbb845 [AArch64] Avoid materializing constant 1 by using csinc, rather than csel.
This is similar to what was done in r261675, but for CSINC rather than CSINV.

Differential Revision: https://reviews.llvm.org/D23892

llvm-svn: 279822
2016-08-26 14:01:55 +00:00
Tim Northover d8a6d7ce91 GlobalISel: mark overflow bit of overflow ops legal.
It's expected this will map to NZCV register class and be properly selectable.

llvm-svn: 279761
2016-08-25 17:37:41 +00:00
Tim Northover fe880a8801 GlobalISel: mark simple ops legal even on types < 32-bit.
The 32-bit variants of these operations don't depend on the bits not being
operated on, so they also naturally model operations narrower than the actual
register width.

llvm-svn: 279760
2016-08-25 17:37:39 +00:00
Tim Northover 7a1ec0141a GlobalISel: mark pointer constants as legal on AArch64.
llvm-svn: 279759
2016-08-25 17:37:35 +00:00
Tim Northover 438c77ca1a GlobalISel: perform multi-step legalization
llvm-svn: 279758
2016-08-25 17:37:32 +00:00
Tim Northover 2c4a838e24 GlobalISel: mark small extends as legal on AArch64
llvm-svn: 279757
2016-08-25 17:37:25 +00:00
Matthias Braun 1eb473680a MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.

Differential Revision: http://reviews.llvm.org/D23850

llvm-svn: 279698
2016-08-25 01:27:13 +00:00
George Burgess IV 381fc0ee3c Make some LLVM_CONSTEXPR variables const. NFC.
This patch changes LLVM_CONSTEXPR variable declarations to const
variable declarations, since LLVM_CONSTEXPR expands to nothing if the
current compiler doesn't support constexpr. In all of the changed
cases, it looks like the code intended the variable to be const instead
of sometimes-constexpr sometimes-not.

llvm-svn: 279696
2016-08-25 01:05:08 +00:00
Evandro Menezes 5395187fe5 [AArch64] Adjust the feature set for Exynos M1.
Enable zero cycle zeroing.

llvm-svn: 279648
2016-08-24 18:17:30 +00:00
Philip Reames e83c4b30ca [stackmaps] More extraction of common code [NFCI]
General cleanup before starting to work on the part I want to actually change.

llvm-svn: 279586
2016-08-23 23:33:29 +00:00
Tim Northover 6cd4b23a0f GlobalISel: legalize integer comparisons on AArch64.
Next step is doing both legalizations at the same time! Marvel at GlobalISel's
cunning.

llvm-svn: 279566
2016-08-23 21:01:26 +00:00
Tim Northover b3a0be4d38 GlobalISel: legalize conditional branches on AArch64.
llvm-svn: 279565
2016-08-23 21:01:20 +00:00
Tim Northover a01bece1dc GlobalISel: extend legalizer interface to handle multiple types.
Instructions like G_ICMP have multiple types that may need to be legalized (the
boolean output and nearly arbitrary inputs in this case). So the legalizer must
be capable of deciding what to do for each of them separately.

llvm-svn: 279554
2016-08-23 19:30:42 +00:00
Tim Northover 456a3c03ac GlobalISel: mark pointer casts legal on AArch64.
llvm-svn: 279553
2016-08-23 19:30:38 +00:00
Tim Northover 3c73e367c0 GlobalISel: legalize 1-bit load/store and mark 8/16 bit variants legal on AArch64.
llvm-svn: 279548
2016-08-23 18:20:09 +00:00
Matt Arsenault 567631bdd4 BranchRelaxation: Fix handling of blocks with multiple conditional
branches

Looping over all terminators exposed AArch64 tests hitting
an assert from analyzeBranch failing. I believe these cases
were miscompiled before.

e.g.
  fcmp s0, s1
  b.ne LBB0_1
  b.vc LBB0_2
  b LBB0_2
LBB0_1:
  ; Large block
LBB0_2:
 ; ...

Both of the individual conditional branches need to
be expanded, since neither can reach the final block.

Split the original block into ones which analyzeBranch
will be able to understand.

llvm-svn: 279499
2016-08-23 01:30:30 +00:00
Tim Northover a11be04769 GlobalISel: support legalization of G_FCONSTANTs
llvm-svn: 279341
2016-08-19 22:40:08 +00:00
Tim Northover ea904f9424 GlobalISel: teach legalizer how to handle integer constants.
llvm-svn: 279340
2016-08-19 22:40:00 +00:00
Saleem Abdulrasool dab786fb78 AArch64: remove extraneous padding
The structs BarrierOp, PrefetchOp, PSBHintOp are in AArch64AsmParser.cpp
(inside anonymous namespace).  This diff changes the order of fields and
removes the excessive padding (8 bytes).

Patch by Alexander Shaposhnikov!

llvm-svn: 279173
2016-08-18 22:35:06 +00:00
Michael Kuperstein 2bc3d4d46c [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.

Differential Revision: https://reviews.llvm.org/D23597

llvm-svn: 279129
2016-08-18 20:08:15 +00:00
Duncan P. N. Exon Smith 84c2da47f9 AArch64: Don't call getIterator() on iterators
Remove an unnecessary round-trip:

    iterator => operator->() => getIterator()

In some cases, the iterator is end(), so the dereference of operator->
is invalid (UB).

The testcase only crashes with r278974 (currently reverted to
investigate this), which adds an assertion for invalid dereferences of
ilist nodes.

Fixes PR29035.

llvm-svn: 279104
2016-08-18 17:58:09 +00:00
Ahmed Bougacha 33e19fe1c4 [AArch64][GlobalISel] Select floating-point binary ops.
There is no FREM instruction, but the others are straightforward.

llvm-svn: 279081
2016-08-18 16:05:11 +00:00
Ahmed Bougacha 1d0560b14d [AArch64][GlobalISel] Select G_SDIV/G_UDIV.
There is no REM instruction; that will require an expansion.
It's not obvious that should be done in select, rather than as a
(custom?) legalization.

llvm-svn: 279074
2016-08-18 15:17:13 +00:00
Justin Bogner cd1d5aaf2e Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Justin Bogner b03fd12cef Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Evandro Menezes 5a5b8dcd32 [AArch64] Adjust the scheduling model for Exynos M1.
Refine the model for the FP division unit.

llvm-svn: 278846
2016-08-16 20:35:01 +00:00
Evandro Menezes d03aff2e11 [AArch64] Adjust the scheduling model for Exynos M1.
Refine the model for the integer division unit.

llvm-svn: 278845
2016-08-16 20:34:58 +00:00
Ahmed Bougacha e4c03abddd [AArch64][GlobalISel] Select G_MUL.
llvm-svn: 278810
2016-08-16 14:37:46 +00:00
Ahmed Bougacha 59e160a19c [AArch64][GlobalISel] Factor out unsupported binop check. NFC.
We're going to need it for G_MUL, and, if other targets end up using
something similar, we can easily put it in the generic selector.

llvm-svn: 278808
2016-08-16 14:37:40 +00:00
Ahmed Bougacha 2ac5bf94bc [AArch64][GlobalISel] Select (variable) shifts.
For now, no support for immediates.

llvm-svn: 278804
2016-08-16 14:02:47 +00:00
Ahmed Bougacha 0306b5ef07 [AArch64][GlobalISel] Select p0 G_FRAME_INDEX.
And mark it as legal.

llvm-svn: 278802
2016-08-16 14:02:42 +00:00
Eli Friedman f184e4befc [AArch64LoadStoreOptimizer] Check aliasing correctly when creating paired loads/stores.
The existing code accidentally skipped the aliasing check in edge cases.

Differential revision: https://reviews.llvm.org/D23372

llvm-svn: 278562
2016-08-12 20:39:51 +00:00
Mike Aizatsky f4fdb5ddf3 [AArch64] Registering default MCInstrAnalysis
Even in this form it is useful: it can detect branch instructions.

https://github.com/google/sanitizers/issues/706

Subscribers: aemerson, rengolin

Differential Revision: https://reviews.llvm.org/D23426

llvm-svn: 278560
2016-08-12 20:28:05 +00:00
Eli Friedman 8585e9d33d [AArch64LoadStoreOpt] Handle offsets correctly for post-indexed paired loads.
Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction.

Differential revision: https://reviews.llvm.org/D23368

llvm-svn: 278559
2016-08-12 20:28:02 +00:00
Geoff Berry 22dfbc5637 [AArch64] Re-factor code shared by AArch64LoadStoreOpt and AArch64InstrInfo.
This re-factoring could cause the following slight changes in generated
code, though none were observed during testing:

- MachineScheduler could decide not to cluster some loads/stores if
  there are other load/stores with non-pairable opcodes that have the
  same base register and offset as a pairable set of load/stores.  One
  case of different MachineScheduler pairing did show up in my testing,
  but it wasn't due to this issue, but due
  BaseMemOpClusterMutation::clusterNeighboringMemOps() being unstable
  w.r.t. the order it considers memory operations.  See PR28942.

- The ImplicitNullChecks optimization could be done for more load/store
  opcodes.  This optimization isn't done for C/C++ code, so it didn't
  show up in my testing.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23365

llvm-svn: 278515
2016-08-12 15:26:00 +00:00
David Majnemer 91a02f5bee Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278477
2016-08-12 04:32:45 +00:00
David Majnemer 2d006e7673 Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278476
2016-08-12 04:32:42 +00:00
David Majnemer 562e82945e Use the range variant of find_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278443
2016-08-12 00:18:03 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
Matt Arsenault 76837df6ff AArch64: Assert on analyzeBranch failing
llvm-svn: 278366
2016-08-11 17:22:59 +00:00
Simon Pilgrim 5c91764af5 Fixed VS2015 (Update 3) warning - differing const/volatile qualifiers for overridden function
Dropped the const qualifier to match llvm::CallLowering::lowerCall

llvm-svn: 278329
2016-08-11 12:19:43 +00:00
Tim Northover 406024a108 GlobalISel: implement simple function calls on AArch64.
We're still limited in the arguments we support, but this at least handles the
basic cases.

llvm-svn: 278293
2016-08-10 21:44:01 +00:00
Silviu Baranga fa00ba3c1a [AArch64] PR28877: Don't assume we're running after legalization when creating vcvtfp2fxs
Summary:
The DAG combine transformation that was generating the
aarch64_neon_vcvtfp2fxs node was assuming that all
inputs where legal and wasn't accounting that the input
could be a v4f64 if we're trying to do the transformation
before legalization. We now bail out in this case.

All illegal types besides v4f64 were already rejected.

Fixes https://llvm.org/bugs/show_bug.cgi?id=28877.

Reviewers: jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23261

llvm-svn: 278002
2016-08-08 13:13:57 +00:00
Benjamin Kramer b7d3311c77 Move helpers into anonymous namespaces. NFC.
llvm-svn: 277916
2016-08-06 11:13:10 +00:00
Tim Northover 97d0cb3165 GlobalISel: IRTranslate PHI instructions
llvm-svn: 277835
2016-08-05 17:16:40 +00:00
Tim Northover 61c16142b4 GlobalISel: extend add widening to SUB, MUL, OR, AND and XOR.
These are the operations that are trivially identical. Division is omitted for
now because you need to use the correct sign/zero extension.

llvm-svn: 277775
2016-08-04 21:39:49 +00:00
Tim Northover 9656f1476c GlobalISel: implement narrowing for G_ADD.
llvm-svn: 277769
2016-08-04 20:54:13 +00:00
Tim Northover 2f32e7f0ac AArch64: don't assume all i128s are BUILD_PAIRs
It leads to a crash when they're not. I'm *sure* I've made this mistake before,
at least once.

llvm-svn: 277755
2016-08-04 19:32:28 +00:00
Tim Northover 1021d89398 AArch64: properly calculate cmpxchg status in FastISel.
We were relying on the misleadingly-names $status result to actually be the
status. Actually it's just a scratch register that may or may not be valid (and
is the inverse of the real ststus anyway). Success can be determined by
comparing the value loaded against the one we wanted to see for "cmpxchg
strong" loops like this.

Should fix PR28819.

llvm-svn: 277513
2016-08-02 20:22:36 +00:00
Ahmed Bougacha ad30db32e6 [AArch64][GlobalISel] Mark basic binops/memops as legal.
We currently use and test these, and select most of them. Mark them
as legal even though we don't go through the full ir->asm flow yet.

This doesn't currently have standalone tests, but the verifier will
soon learn to check that the regbankselect/select tests are legal.

llvm-svn: 277471
2016-08-02 15:10:28 +00:00
Matt Arsenault 6f1ae3c7db AArch64: Assert on branch displacement bits
llvm-svn: 277434
2016-08-02 08:56:52 +00:00
Matt Arsenault 5b54971ff9 AArch64: Consolidate branch inversion logic
llvm-svn: 277431
2016-08-02 08:30:06 +00:00
Matt Arsenault e8da145493 AArch64: BranchRelaxtion cleanups
Move some logic into TII.

llvm-svn: 277430
2016-08-02 08:06:17 +00:00
Matt Arsenault f7065e15f8 AArch64: Fix end iterator dereference
Not all blocks have terminators. I'm not sure how this wasn't
crashing before.

llvm-svn: 277427
2016-08-02 07:20:09 +00:00
Evandro Menezes 82e245a202 [AArch64] Add support for Samsung Exynos M2 (NFC).
llvm-svn: 277364
2016-08-01 18:39:45 +00:00
Diana Picus ab5a4c7dbb [AArch64] Return the correct size for TLSDESC_CALLSEQ
The branch relaxation pass is computing the wrong offsets because it assumes
TLSDESC_CALLSEQ eats up 4 bytes, when in fact it is lowered to an instruction
sequence taking up 16 bytes. This can become a problem in huge files with lots
of TLS accesses, as it may slowly move branch targets out of the range computed
by the branch relaxation pass.

Fixes PR24234 https://llvm.org/bugs/show_bug.cgi?id=24234

Differential Revision: https://reviews.llvm.org/D22870

llvm-svn: 277331
2016-08-01 08:38:49 +00:00
Diana Picus 850043b25a [AArch64] Register passes so they can be run by llc
Initialize all AArch64-specific passes in the TargetMachine so they can be run
by llc. This can lead to conflicts in opt with some command line options that
share the same name as the pass, so I took this opportunity to do some cleanups:
* rename all relevant command line options from "aarch64-blah" to
  "aarch64-enable-blah" and update the tests accordingly
* run clang-format on their declarations
* move all these declarations to a common place (the TargetMachine) as opposed
  to having them scattered around (AArch64BranchRelaxation and
  AArch64AddressTypePromotion were the only offenders)

llvm-svn: 277322
2016-08-01 05:56:57 +00:00
Tim Northover a51575ffa2 CodeGen: improve MachineInstrBuilder & MachineIRBuilder interface
For MachineInstrBuilder, having to manually use RegState::Define is ugly and
makes register definitions clunkier than they need to be, so this adds two
convenience functions: addDef and addUse.

For MachineIRBuilder, we want to avoid BuildMI's first-reg-is-def rule because
it's hidden away and causes bugs. So this patch switches buildInstr to
returning a MachineInstrBuilder and adding *all* operands via addDef/addUse.

NFC.

llvm-svn: 277176
2016-07-29 17:43:52 +00:00
Ahmed Bougacha 6db3cfe2da [AArch64][GlobalISel] Select G_XOR.
llvm-svn: 277173
2016-07-29 16:56:25 +00:00
Ahmed Bougacha 7adfac56b3 [AArch64][GlobalISel] Select G_LOAD/G_STORE.
Mostly straightforward as we ignore addressing modes and just
use the base + unsigned immediate offset (always 0) variants.

This currently fails to select extloads because we have yet to
agree on a representation.

llvm-svn: 277171
2016-07-29 16:56:16 +00:00
Sjoerd Meijer 0eb96ed0de TargetInstrInfo: add virtual function getInstSizeInBytes
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.

Differential Revision: https://reviews.llvm.org/D22885

llvm-svn: 277126
2016-07-29 08:16:16 +00:00
Matthias Braun 941a705b7b MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
Ahmed Bougacha 8550509b64 [AArch64][GlobalISel] Select G_BR.
This is the first unsized instruction we support; move down the
'sized' check to binops.

llvm-svn: 277007
2016-07-28 17:15:15 +00:00
Ahmed Bougacha d7748d6491 [AArch64][GlobalISel] Select GPR G_SUB.
llvm-svn: 277003
2016-07-28 16:58:35 +00:00
Ahmed Bougacha 61a7928dde [AArch64][GlobalISel] Select GPR G_AND.
llvm-svn: 277002
2016-07-28 16:58:31 +00:00
Ahmed Bougacha 46c05fc861 [GlobalISel] Remove types on selected insts instead of using LLT().
LLT() has a particular meaning: it's one invalid type. But we really
want selected instructions to have no type whatsoever.

Also verify that types don't linger after ISel, and enable the verifier
on the AArch64 select test.

llvm-svn: 277001
2016-07-28 16:58:27 +00:00
Sjoerd Meijer 89217f8835 TargetInstrInfo: rename GetInstSizeInBytes to getInstSizeInBytes. NFC
Differential Revision: https://reviews.llvm.org/D22925

llvm-svn: 276997
2016-07-28 16:32:22 +00:00
Zijiao Ma e56a53a9b3 Add unittests to {ARM | AArch64}TargetParser.
Add unittest to {ARM | AArch64}TargetParser,and by the way correct problems as below:
1.Correct a incorrect indexing problem in AArch64TargetParser. The architecture enumeration
 is shared across ARM and AArch64 in original implementation.But In the code,I just used the
 index which was offset by the ARM, and this would index into the array incorrectly. To make
 AArch64 has its own arch enum,or we will do a lot of slowly iterating.
2.Correct a spelling error. The parameter of llvm::AArch64::getArchExtName.
3.Correct a writing mistake, in llvm::ARM::parseArchISA.

Differential Revision: https://reviews.llvm.org/D21785

llvm-svn: 276957
2016-07-28 06:11:18 +00:00
Diana Picus c65d8bdcf2 Typo fix. NFC
llvm-svn: 276879
2016-07-27 15:13:25 +00:00
Ahmed Bougacha 6756a2c953 [GlobalISel] Introduce an instruction selector.
And implement it for AArch64, supporting x/w ADD/OR.

Differential Revision: https://reviews.llvm.org/D22373

llvm-svn: 276875
2016-07-27 14:31:55 +00:00
Ahmed Bougacha 5e402eec7b [AArch64] Mark various *Info classes as 'final'. NFC.
llvm-svn: 276874
2016-07-27 14:31:46 +00:00
Ahmed Bougacha b8459d14ef [AArch64] Define AArch64RegisterInfo as a class, not a struct. NFC.
llvm-svn: 276873
2016-07-27 14:31:40 +00:00
Tim Northover e2e0067352 GlobalISel[AArch64]: support pointer types in argument lowering.
They're basically i64 for AArch64, but we'll leave them intact for stranger
targets. Also add some tests for the (very few) other cases we can handle right
now.

llvm-svn: 276689
2016-07-25 21:01:17 +00:00
Joel Jones 373d7d30dd MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.

The corresponding change to clang is in: http://reviews.llvm.org/D16538

Patch by: Joel Jones

Differential Revision: https://reviews.llvm.org/D16213

llvm-svn: 276654
2016-07-25 17:18:28 +00:00
Chandler Carruth 488cb137a9 Fix a GCC error due to this member name also being a type name. This
should fix the build with GCC 4.9 at least. Not sure if this is the
right name or fix, but I've followed up on the original commit.

llvm-svn: 276522
2016-07-23 07:50:05 +00:00
George Burgess IV 8a457ddfd8 Fix include case. NFC.
llvm-svn: 276465
2016-07-22 20:15:19 +00:00
Tim Northover 33b07d6725 GlobalISel: implement legalization pass, with just one transformation.
This adds the actual MachineLegalizeHelper to do the work and a trivial pass
wrapper that legalizes all instructions in a MachineFunction. Currently the
only transformation supported is splitting up a vector G_ADD into one acting on
smaller vectors.

llvm-svn: 276461
2016-07-22 20:03:43 +00:00
David Majnemer 1182dd8ed9 [AArch64] Cleanup sign extend in genAlternativeCodeSequence
Use the machinery in MathExtras instead of rolling it by hand.

This fixes PR28624.

llvm-svn: 276366
2016-07-21 23:46:56 +00:00
Akira Hatanaka b8d2873d93 [AArch64][Inline-Asm] Return the 32-bit floating point register class
when constraint "w" is used on a 32-bit operand.

This enables compiling the following code, which used to error out in
the backend:

void foo1(int a) {
  asm volatile ("sqxtn h0, %s0\n" : : "w"(a):);
}

Fixes PR28633.

llvm-svn: 276344
2016-07-21 21:39:05 +00:00
Geoff Berry 4ff2e36d32 [AArch64] Load/store opt: Don't count transient instructions towards search limits.
Summary:
This change also changes findMatchingInsn and
findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account
when tracking register defs and uses, which could potentially inhibit
these optimizations in the presence of debug information.

Reviewers: mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22582

llvm-svn: 276293
2016-07-21 15:20:25 +00:00
Geoff Berry 24c81e8d7c [AArch64] Register AArch64LoadStoreOptimizer so it can be run by llc -run-pass. NFCI.
llvm-svn: 276193
2016-07-20 21:45:58 +00:00
Ahmed Bougacha a0cdd79070 [AArch64][FastISel] Select -O0 legal cmpxchg.
At -O0, cmpxchg survives AtomicExpand: it's mostly straightforward
to select it in fast-isel, and let the pseudo be expanded later.

extractvalues on the result are the tricky part: the generic logic
only works for legal types (and it would be painful to make it
support illegal types), so we can only support i32/i64 cmpxchg.

llvm-svn: 276183
2016-07-20 21:12:32 +00:00
Ahmed Bougacha b0674d1143 [AArch64][FastISel] Select atomic stores into STLR.
llvm-svn: 276182
2016-07-20 21:12:27 +00:00
Tim Northover 62ae568bbb GlobalISel: implement low-level type with just size & vector lanes.
This should be all the low-level instruction selection needs to determine how
to implement an operation, with the remaining context taken from the opcode
(e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math).

llvm-svn: 276158
2016-07-20 19:09:30 +00:00
David Majnemer 5d26127752 Revert "Disable this-return argument forwarding on ARM/AArch64"
Inference of the 'returned' attribute was fixed in r276008, lets try
turning the backend support back on.

This reverts commit r275677.

llvm-svn: 276081
2016-07-20 04:13:01 +00:00
Evandro Menezes 238fa76574 [AArch64] Properly validate the reciprocal estimation.
Add check for legal data types when expanding into a Newton series.

Differential Revision: https://reviews.llvm.org/D22267

llvm-svn: 276041
2016-07-19 22:31:11 +00:00
Pankaj Gode 1bfca191da [AArch64] PredictableSelectIsExpensive for Vulcan.
Adding PredictableSelectIsExpensive for Vulcan

Differential Revision: https://reviews.llvm.org/D22448

llvm-svn: 275978
2016-07-19 14:30:21 +00:00
Hal Finkel 04b5330ccd Disable this-return argument forwarding on ARM/AArch64
r275042 reverted function-attribute inference for the 'returned' attribute
because the feature triggered self-hosting failures on ARM and AArch64. James
Molloy determined that the this-return argument forwarding feature, which
directly ties the returned input argument to the returned value, was the cause.
It seems likely that this forwarding code contains, or triggers, a subtle bug.
Disabling for now until we can track that down.

llvm-svn: 275677
2016-07-16 07:07:29 +00:00
Junmo Park 45513a8eca Minor code cleanups. NFC.
llvm-svn: 275637
2016-07-15 22:42:52 +00:00
Justin Lebar 9c375817ac [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Justin Lebar 0af80cd6f0 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Jacques Pienaar 71c30a14b7 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00
Haicheng Wu f0b0127f60 [AArch64] Set COPY ZR isAsCheapAsAMove when needed.
If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now
only Kryo has both), set COPY (W|X)ZR isAsCheapAsAMove.

Differential Revision: http://reviews.llvm.org/D22360

llvm-svn: 275503
2016-07-15 00:27:01 +00:00
Justin Lebar b49185dd5f s/constexpr/LLVM_CONSTEXPR in AArch64InstrInfo.cpp.
Yet again.

llvm-svn: 275463
2016-07-14 20:08:23 +00:00
Evandro Menezes 77d470ff3c [AArch64] Adjust the scheduling model for Exynos-M1.
Enable use-postra-scheduler. (NFC)

llvm-svn: 275457
2016-07-14 19:25:46 +00:00
Justin Lebar 288b3376ae [CodeGen] Refactor MachineMemOperand::Flags's target-specific flags.
Summary:
Make the target-specific flags in MachineMemOperand::Flags real, bona
fide enum values.  This simplifies users, prevents various constants
from going out of sync, and avoids the false sense of security provided
by declaring static members in classes and then forgetting to define
them inside of cpp files.

Reviewers: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22372

llvm-svn: 275451
2016-07-14 18:15:20 +00:00
Justin Lebar a3b786a8c1 [CodeGen] Refactor MachineMemOperand's Flags enum.
Summary:
- Give it a shorter name (because we're going to refer to it often from
  SelectionDAG and friends).

- Split the flags and alignment into separate variables.

- Specialize FlagsEnumTraits for it, so we can do bitwise ops on it
  without losing type information.

- Make some enum values constants in MachineMemOperand instead.
  MOMaxBits should not be a valid Flag.

- Simplify some of the bitwise ops for dealing with Flags.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22281

llvm-svn: 275438
2016-07-14 17:07:44 +00:00
Haicheng Wu 711ca868fc [AArch64] Set FMOVS0 and FMOVD0 as isAsCheapAsAMove when needed.
If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now
only Kryo has both), set FMOVS0 and FMOVD0 isAsCheapAsAMove.

Differential Revision: http://reviews.llvm.org/D22256

llvm-svn: 275178
2016-07-12 15:31:41 +00:00
Haicheng Wu 1e39574e9f [Kryo] Enable ZCZeroing feature
This feature uses immediate #0 to zero a register.

Differential Revision: http://reviews.llvm.org/D19985

llvm-svn: 275143
2016-07-12 02:04:01 +00:00
Nirav Dave 8603062ee4 Fix branch relaxation in 16-bit mode.
Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation
to generate jumps with 16-bit sized immediates in 16-bit mode.

This fixes PR22097.

Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight

Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D20830

llvm-svn: 275068
2016-07-11 14:23:53 +00:00
Duncan P. N. Exon Smith ab53fd9b50 AArch64: Avoid implicit iterator conversions, NFC
Avoid implicit conversions from MachineInstrBundleInstr to MachineInstr*
in the AArch64 backend, mainly by preferring MachineInstr& over
MachineInstr* when a pointer isn't nullable.

llvm-svn: 274924
2016-07-08 20:29:42 +00:00
Duncan P. N. Exon Smith 25b132e93b Target: Avoid getFirstTerminator() => pointer, NFC
Stop using an implicit conversion from the return of
MachineBasicBlock::getFirstTerminator to MachineInstr*.  In two cases,
directly dereference to a MachineInstr& since later code assumes it's
valid.  In a third case, change to an iterator since later code checks
against MachineBasicBlock::end.

Although the fix for the third case avoids undefined behaviour, I expect
this doesn't cause a functionality change in practice (since the basic
block already has a terminator).

llvm-svn: 274898
2016-07-08 18:26:20 +00:00
Pankaj Gode 5d118a1676 [AArch64] Macro fusion of simple ALU ops with branches for Broadcom's Vulcan
Support for the macro fusion of simple ALU ops with branches for the Vulcan sub-target.

Patch by Meador Inge <meadori@gmail.com>

Differential Revision: http://reviews.llvm.org/D22042

llvm-svn: 274837
2016-07-08 11:13:59 +00:00
Chad Rosier 112d0e996b [AArch64] Change the preferred alignment for char and short to word alignment.
The commit reinstates r273279, which was informally approved.

Original Review: http://reviews.llvm.org/D21414

This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1.

llvm-svn: 274790
2016-07-07 20:02:18 +00:00
Chad Rosier 3972953efd Revert "[AArch64] Change the preferred alignment for char and short to word alignment"
This reverts commit r273279 as the change was not properly approved.

llvm-svn: 274768
2016-07-07 16:37:29 +00:00
Junmo Park 384d376545 fix documentation comment. NFC.
llvm-svn: 274704
2016-07-06 23:18:58 +00:00
Junmo Park 5e4bd2e7c4 Minor code cleanup. NFC.
llvm-svn: 274702
2016-07-06 23:15:18 +00:00
Matthias Braun ad0032a649 AArch64: Change modeling of zero cycle zeroing.
On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should
be used to zero a vector register. This was previously done at
instruction selection time, however the register coalescer sometimes
widened multiple vregs to the Q width because of that leading to extra
spills. This patch leaves the decision on how to zero a register to the
AsmPrinter phase where it doesn't affect register allocation anymore.

This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0.

This fixes http://llvm.org/PR27454, rdar://25866262

Differential Revision: http://reviews.llvm.org/D21826

llvm-svn: 274686
2016-07-06 21:39:33 +00:00
Matthias Braun 332bb5c236 AArch64: Replace a RegScavenger instance with LivePhysRegs
findScratchNonCalleeSaveRegister() just needs a simple liveness
analysis, use LivePhysRegs for that as it is simpler and does not depend
on the kill flags.

This commit adds a convenience function available() to LivePhysRegs:
This function returns true if the given register is not reserved and
neither the register nor any of its aliases are alive.

Differential Revision: http://reviews.llvm.org/D21865

llvm-svn: 274685
2016-07-06 21:31:27 +00:00
Tim Northover 449c15e1bd AArch64: try to fix optimized build failure.
I think the Ops filled out by Regex::match contain pointers into the temporary
std::string returned by StringRef::upper. Its lifetime is extended by the call
to match, but only until the end of that call (not to the uses of Ops later
on).

llvm-svn: 274586
2016-07-05 23:15:58 +00:00
Manman Ren 39b37c0f9d Fix an ordering problem in r274431
llvm-svn: 274582
2016-07-05 22:24:44 +00:00
Tim Northover e6ae6767d9 AArch64: TableGenerate system instruction operands.
The way the named arguments for various system instructions are handled at the
moment has a few problems:

  - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp
  - That weird Mapping class that I have no idea what I was on when I thought
    it was a good idea.
  - Searches are performed linearly through the entire list.
  - We print absolutely all registers in upper-case, even though some are
    canonically mixed case (SPSel for example).
  - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated
    to comments in our implementation, with a slightly opaque hex value
    indicating the canonical encoding LLVM will use.

This adds a new TableGen backend to produce efficiently searchable tables, and
switches AArch64 over to using that infrastructure.

llvm-svn: 274576
2016-07-05 21:23:04 +00:00
Balaram Makam d4acd7ed10 Revert r259387: "AArch64: Implement missed conditional compare sequences."
This reverts commit r259387 because it inserts illegal code after legalization
    in some backends where i64 OR type is illegal for example.

llvm-svn: 274573
2016-07-05 20:24:05 +00:00
Tim Northover 01dff9d18a AArch64: use correct SDValue # when looking for bitfield placement.
The other use really does only care about the SDNode (it checks the
opcode against a whitelist), but bitFieldPlacement can be misled if
the node produces multiple results.

Patch by Ismail Badawi.

llvm-svn: 274567
2016-07-05 18:02:57 +00:00
Benjamin Kramer 3bc1edf95b Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended.

llvm-svn: 274431
2016-07-02 11:41:39 +00:00
Craig Topper 2bd8b4b180 [CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended.
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.

llvm-svn: 274337
2016-07-01 06:54:47 +00:00
Duncan P. N. Exon Smith 632987296f Target: Remove unused arguments from overrideSchedPolicy, NFC
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator.  One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.

llvm-svn: 274304
2016-07-01 00:23:27 +00:00
Duncan P. N. Exon Smith e4f5e4f4d1 CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr.  In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

llvm-svn: 274287
2016-06-30 22:52:52 +00:00
Rafael Espindola d86e8bb0ed Delete MCCodeGenInfo.
MC doesn't really care about CodeGen stuff, so this was just
complicating target initialization.

llvm-svn: 274258
2016-06-30 18:25:11 +00:00
Rafael Espindola db6bd02185 Delete unused includes. NFC.
llvm-svn: 274225
2016-06-30 12:19:16 +00:00
Pankaj Gode f4b25547cf [AArch64] Add Broadcom Vulcan scheduling model.
Adding scheduling model for new Broadcom Vulcan core (ARMv8.1A).

Differential Revision: http://reviews.llvm.org/D21728

llvm-svn: 274213
2016-06-30 06:42:31 +00:00
Craig Topper bc56e3ba53 Use ShuffleVectorSDNode::isSplat member method instead of static method isSplatMask where the mask came directly from getMask() on a shuffle node.
llvm-svn: 274208
2016-06-30 04:38:51 +00:00
Duncan P. N. Exon Smith 9cfc75c214 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189
2016-06-30 00:01:54 +00:00
Matthias Braun ec3330328a AArch64: Remove unnecessary namespace llvm; NFC
llvm-svn: 273975
2016-06-28 00:54:33 +00:00
Rafael Espindola 3beef8d6db Move shouldAssumeDSOLocal to Target.
Should fix the shared library build.

llvm-svn: 273958
2016-06-27 23:15:57 +00:00
Evandro Menezes 3830479f41 [AArch64] Adjust the model for the vector by element FP multiplies on Exynos M1. (NFC)
llvm-svn: 273708
2016-06-24 18:58:54 +00:00
Evandro Menezes 62c70101c3 [AArch64] Model the cost of vector by element FP multiplies on Exynos M1. (NFC)
llvm-svn: 273630
2016-06-23 23:43:23 +00:00
Chad Rosier 8c106bcbe8 [AArch64] Remove an overly aggressive assert.
llvm-svn: 273458
2016-06-22 19:18:52 +00:00
Krzysztof Parzyszek e116d500a7 [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.

Differential Revision: http://reviews.llvm.org/D20376

llvm-svn: 273403
2016-06-22 12:54:25 +00:00
Rafael Espindola 2b7fef681f Delete more dead code.
Found by gcc 6.

llvm-svn: 273402
2016-06-22 12:44:16 +00:00
Haicheng Wu a783bac50b [Kryo] Enable loop prefetcher.
Differential Revision: http://reviews.llvm.org/D21535

llvm-svn: 273329
2016-06-21 22:47:56 +00:00
Evandro Menezes 230083ff9d [AArch64] Change the preferred alignment for char and short to word alignment
Differential Revision: http://reviews.llvm.org/D21414

llvm-svn: 273279
2016-06-21 15:55:18 +00:00
Silviu Baranga aee40fc61c [AArch64] Restore codegen for AArch64 Cortex-A72/A73 after NFCI
Summary:
Code generation for Cortex-A72/Cortex-A73 was accidentally changed
by r271555, which was a NFCI. The isCortexA57() predicate was not true
for Cortex-A72/Cortex-A73 before r271555 (since it was checking the CPU
string). Because Cortex-A72/Cortex-A73 inherit all features from Cortex-A57,
all decisions previously guarded by isCortexA57() are now taken.

This change restores the behaviour before r271555 by adding separate
ProcA72/ProcA73, which have the required features to preserve code
generation.

Reviewers: kristof.beyls, aadg, mcrosier, rengolin

Subscribers: mcrosier, llvm-commits, aemerson, t.p.northover, MatzeB, rengolin

Differential Revision: http://reviews.llvm.org/D21182

llvm-svn: 273277
2016-06-21 15:53:54 +00:00
David Majnemer e61e4bfd87 Replace silly uses of 'signed' with 'int'
llvm-svn: 273244
2016-06-21 05:10:24 +00:00
Evandro Menezes 8057265cc2 [AArch64] Adjust the loop buffer size for Exynos M1 (NFC)
llvm-svn: 273185
2016-06-20 18:39:41 +00:00
Pankaj Gode 0aab2e398a [AARCH64] Add support for Broadcom Vulcan
Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A).

Differential Revision: http://reviews.llvm.org/D21500

llvm-svn: 273148
2016-06-20 11:13:31 +00:00
NAKAMURA Takumi fe1202c4cb Untabify.
llvm-svn: 273129
2016-06-20 00:37:41 +00:00
Matt Arsenault f1c3906a5d AArch64: Fix range loop contradicting comment above it
llvm-svn: 272959
2016-06-16 21:21:49 +00:00
Tim Northover daa1c018b0 AArch64: allow MOV (imm) alias to be printed
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.

llvm-svn: 272865
2016-06-16 01:42:25 +00:00
Tim Northover 389a1e39ea AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints.
Of course the assembly was right but because the opcode was MOVZWi it was
encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and
would go horribly wrong on a CPU.

No idea how this bug survived this long. It seems nobody is using that aspect
of patchpoints.

llvm-svn: 272831
2016-06-15 20:33:36 +00:00
Pankaj Gode a67fea464c Test commit after access grant. Modified comment by adding a period.
llvm-svn: 272808
2016-06-15 17:24:52 +00:00
Kevin Enderby d2d2ce9b9f Update the AArch64ExternalSymbolizer to print literal strings as escaped strings
so it is the same as the MCExternalSymbolizer.

rdar://17349181

llvm-svn: 272588
2016-06-13 21:08:57 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Chad Rosier d1f6c840ee [AArch64] Move comments closer to relevant check. NFC.
llvm-svn: 272430
2016-06-10 20:49:18 +00:00
Chad Rosier c5083c2ccf [AArch64] Refactor a check earlier. NFC.
llvm-svn: 272429
2016-06-10 20:47:14 +00:00
Evandro Menezes a3a0a60cff [AArch64] Add preferred alignments for Exynos M1
Differential Revision: http://reviews.llvm.org/D21203

llvm-svn: 272400
2016-06-10 16:00:18 +00:00
Saleem Abdulrasool 6c19ffc8bc AArch64: support the `.arch` directive in the IAS
Add support to the AArch64 IAS for the `.arch` directive.  This allows the
assembly input to use architectural functionality in part of a file.  This is
used in existing code like BoringSSL.

Resolves PR26016!

llvm-svn: 272241
2016-06-09 02:56:40 +00:00
Benjamin Kramer c321e53402 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
Quentin Colombet d1cd30b218 [AArch64][RegisterBankInfo] G_OR are fine on either GPR or FPR.
Teach AArch64RegisterBankInfo that G_OR can be mapped on either GPR or
FPR for 64-bit or 32-bit values.

Add test cases demonstrating how this information is used to coalesce a
computation on a single register bank.

llvm-svn: 272170
2016-06-08 16:53:32 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Quentin Colombet a4ac7cdac2 [AArch64][RegisterBankInfo] Use the generic implementation of copyCost.
Long term we may want to give high cost at FPR to/from GPR copies.

llvm-svn: 272086
2016-06-08 01:24:00 +00:00
Quentin Colombet cfbdee2312 [RegisterBankInfo] Add a size argument for the cost of copy.
The cost of a copy may be different based on how many bits we have to
copy around. E.g., a 8-bit copy may be different than a 32-bit copy.

llvm-svn: 272084
2016-06-08 01:11:03 +00:00
Geoff Berry 486f49cc63 Reapply [AArch64] Fix isLegalAddImmediate() to return true for valid negative values.
Originally reviewed here: http://reviews.llvm.org/D17463

llvm-svn: 272023
2016-06-07 16:48:43 +00:00
Chad Rosier be879ea751 [AArch64] Spot SBFX-compatible code expressed with sign_extend.
This is very similar to r271677, but for extracts from i32 with the SIGN_EXTEND
acting on a arithmetic shift.

llvm-svn: 271717
2016-06-03 20:05:49 +00:00
Chad Rosier 2d658703e1 [AArch64] Spot SBFX-compatbile code expressed with sign_extend_inreg.
We were assuming all SBFX-like operations would have the shl/asr form, but often
when the field being extracted is an i8 or i16, we end up with a
SIGN_EXTEND_INREG acting on a shift instead.

This is a port of r213754 from ARM to AArch64.

llvm-svn: 271677
2016-06-03 15:00:09 +00:00
Sjoerd Meijer d906bf1369 RAS extensions are part of ARMv8.2-A. This change enables them by introducing a
new instruction to ARM and AArch64 targets and several system registers.

Patch by: Roger Ferrer Ibanez and Oliver Stannard

Differential Revision: http://reviews.llvm.org/D20282

llvm-svn: 271670
2016-06-03 14:03:27 +00:00
Sanjay Patel dba8b4c04d transform obscured FP sign bit ops into a fabs/fneg using TLI hook
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.

This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4

In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.

Differential Revision: http://reviews.llvm.org/D19391

llvm-svn: 271573
2016-06-02 20:01:37 +00:00
Matthias Braun 651cff42c4 AArch64: Do not test for CPUs, use SubtargetFeatures
Testing for specific CPUs has a number of problems, better use subtarget
features:
- When some tweak is added for a specific CPU it is often desirable for
  the next version of that CPU as well, yet we often forget to add it.
- It is hard to keep track of checks scattered around the target code;
  Declaring all target specifics together with the CPU in the tablegen
  file is a clear representation.
- Subtarget features can be tweaked from the command line.

To discourage people from using CPU checks in the future I removed the
isCortexXX(), isCyclone(), ... functions. I added an getProcFamily()
function for exceptional circumstances but made it clear in the comment
that usage is discouraged.

Reformat feature list in AArch64.td to have 1 feature per line in
alphabetical order to simplify merging and sorting for out of tree
tweaks.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D20762

llvm-svn: 271555
2016-06-02 18:03:53 +00:00
Geoff Berry 66f6b65fed [PEI, AArch64] Use empty spaces in stack area for local stack slot allocation.
Summary:
If the target requests it, use emptry spaces in the fixed and
callee-save stack area to allocate local stack objects.

AArch64: Change last callee-save reg stack object alignment instead of
size to leave a gap to take advantage of above change.

Reviewers: t.p.northover, qcolombet, MatzeB

Subscribers: rengolin, mcrosier, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D20220

llvm-svn: 271527
2016-06-02 16:22:07 +00:00
Sjoerd Meijer 0b7bb16e5b This adds support for Cortex-A73 as an available target.
Differential Revision: http://reviews.llvm.org/D20865

llvm-svn: 271508
2016-06-02 10:48:52 +00:00
Rafael Espindola 4d29099f7f Delete AArch64II::MO_CONSTPOOL.
A constant pool holding the address of a variable in equivalent to
a got entry. It produces exactly the same instruction sequence as a
got use and unlike a got use this is not uniqued by the linker.

llvm-svn: 271311
2016-05-31 18:31:14 +00:00
Matthias Braun bcfd23673b AArch64: Fix indentation
llvm-svn: 271084
2016-05-28 01:06:51 +00:00
Matthias Braun 27b6692fe2 AArch64Subtarget: Use default member initializers
llvm-svn: 271057
2016-05-27 22:14:09 +00:00
Benjamin Kramer 3e9a5d3468 Apply clang-tidy's misc-static-assert where it makes sense.
Also fold conditions into assert(0) where it makes sense. No functional
change intended.

llvm-svn: 270982
2016-05-27 11:36:04 +00:00
Chad Rosier 14aa2ad1f4 [AArch64] Generate rev16/rev32 from bswap + srl when upper bits are known zero.
Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high
16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to
(rotr (bswap i64 x), 32), if the high 32-bits of x are zero.

test_rev_w_srl16:            test_rev_w_srl16:
  and w8, w0, #0xffff          and     w8, w0, #0xffff
  rev w8, w8           --->    rev16   w0, w8
  lsr     w0, w8, #16

test_rev_x_srl32:            test_rev_x_srl32:
  rev x8, x8           --->    rev32   x0, x8
  lsr x0, x8, #32

llvm-svn: 270896
2016-05-26 19:41:33 +00:00
Chad Rosier 816a67da49 [AArch64] Generate a BFI/BFXIL from 'or (and X, MaskImm), OrImm'.
If and only if the value being inserted sets only known zero bits.

This combine transforms things like

  and w8, w0, #0xfffffff0
  movz w9, #5
  orr w0, w8, w9

into

  movz w8, #5
  bfxil w0, w8, #0, #4

The combine is tuned to make sure we always reduce the number of instructions.
We avoid churning code for what is expected to be performance neutral changes
(e.g., converted AND+OR to OR+BFI).

Differential Revision: http://reviews.llvm.org/D20387

llvm-svn: 270846
2016-05-26 13:27:56 +00:00
Rafael Espindola a224de06bc Use shouldAssumeDSOLocal on AArch64.
This reduces code duplication and now AArch64 also handles PIE.

llvm-svn: 270844
2016-05-26 12:42:55 +00:00
Rafael Espindola 6b93bf5783 Don't repeat name in comment and git-clang-format.
llvm-svn: 270785
2016-05-25 22:44:06 +00:00
Rafael Espindola 6b4baa5f58 Sort includes.
llvm-svn: 270769
2016-05-25 21:37:29 +00:00
Jun Bum Lim b21d4e17a2 [AArch64] Disable narrow load merge by default
Summary:
As this optimization converts two loads into one load with two shift instructions,
it could potentially hurt performance if a loop is arithmetic operation intensive.

Reviewers: t.p.northover, mcrosier, jmolloy

Subscribers: evandro, jmolloy, aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20172

llvm-svn: 270251
2016-05-20 18:45:49 +00:00
Chad Rosier 02f25a9565 [AArch64 ] Generate a BFXIL from 'or (and X, Mask0Imm),(and Y, Mask1Imm)'.
Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms is a shifted
mask (e.g., 0x000ffff0).  Both 'and's must have a single use.

This changes code like:

  and w8, w0, #0xffff000f
  and w9, w1, #0x0000fff0
  orr w0, w9, w8

into

  lsr w8, w1, #4
  bfi w0, w8, #4, #12

llvm-svn: 270063
2016-05-19 14:19:47 +00:00
Chad Rosier e006202a4d [AArch64] Push comment into function. NFC.
llvm-svn: 270003
2016-05-18 23:51:17 +00:00
Rafael Espindola 8c34dd8257 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

llvm-svn: 269988
2016-05-18 22:04:49 +00:00
Chad Rosier 91294c5bdc [AArch64] Minor refactoring. NFC.
llvm-svn: 269963
2016-05-18 17:43:11 +00:00
Rafael Espindola 38af4d6347 Trivial cleanups.
This just clang formats and cleans comments in an area I am about to
post a patch for review.

llvm-svn: 269946
2016-05-18 16:00:24 +00:00
Geoff Berry 74cb718ea9 [AArch64] Fix bug in large stack spill slot handling (PR27717)
Summary:
Fix bug in MachO path where a frame index offset would not be reserved
for handling large frames when an extra non-used callee-save register
was saved.  In the case where the extra register is reserved or not a
GPR (e.g. %FP in the MachO case), this would lead to the register
scavenger later failing when called from PrologEpilogInserter.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20185

llvm-svn: 269697
2016-05-16 20:52:28 +00:00
Chad Rosier c73d559df4 Use proper capitalization and punctuation per coding standards. NFC.
llvm-svn: 269652
2016-05-16 12:55:01 +00:00
Benjamin Kramer a65b610bd2 Move helper classes into anonymous namespaces. NFC.
llvm-svn: 269591
2016-05-15 15:18:11 +00:00
Chad Rosier 7e8dd51d50 [AArch64] Update local variable names to conform to coding standard. NFC.
llvm-svn: 269573
2016-05-14 18:56:28 +00:00
Chad Rosier 08d9908ea9 [AArch64] Simplify logic to reduce vertical space. NFC.
llvm-svn: 269512
2016-05-13 22:53:13 +00:00
Paul Osmialowski 4f5b3be7f1 add support for -print-imm-hex for AArch64
Most immediates are printed in Aarch64InstPrinter using 'formatImm' macro,
but not all of them.

Implementation contains following rules:

- floating point immediates are always printed as decimal
- signed integer immediates are printed depends on flag settings
  (for negative values 'formatImm' macro prints the value as i.e -0x01
  which may be convenient when imm is an address or offset)
- logical immediates are always printed as hex
- the 64-bit immediate for advSIMD, encoded in "a🅱️c:d:e:f:g:h" is always printed as hex
- the 64-bit immedaite in exception generation instructions like:
  brk, dcps1, dcps2, dcps3, hlt, hvc, smc, svc is always printed as hex
- the rest of immediates is printed depends on availability
  of -print-imm-hex

Signed-off-by: Maciej Gabka <maciej.gabka@arm.com>
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>

Differential Revision: http://reviews.llvm.org/D16929

llvm-svn: 269446
2016-05-13 18:00:09 +00:00
Justin Bogner 283e3bd793 SDAG: Implement Select instead of SelectImpl in AArch64DAGToDAGISel
This one has a lot of code churn, but it's all mechanical and
straightforward.

- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
  the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.

Part of llvm.org/pr26808.

llvm-svn: 269379
2016-05-12 23:10:30 +00:00
Justin Bogner 3525da7466 SDAG: Clean up dangling nodes in AArch64ISelDAGToDAG::SelectImpl
When we convert to the void Select interface, leaving unreferenced
nodes around won't be allowed anymore.

Part of llvm.org/pr26808.

llvm-svn: 269345
2016-05-12 20:54:27 +00:00
Chad Rosier 179480a4ea [AArch64] Give function a more appropriate name.
llvm-svn: 269335
2016-05-12 19:51:58 +00:00
Chad Rosier 042ac2c17f [AArch64] Minor refactoring to simplify future patch. NFC.
llvm-svn: 269329
2016-05-12 19:38:18 +00:00
Chad Rosier 39481ace40 [AArch64] Remove command-line option use for testing.
The EXTR combine has been in tree for over 2 years without complain, so go ahead
and remove the option.

llvm-svn: 269292
2016-05-12 13:27:24 +00:00
Chad Rosier 9926a5e31d [AArch64] Add support for unscaled narrow stores in getUsefulBitsForUse.
llvm-svn: 269263
2016-05-12 01:42:01 +00:00
Chad Rosier fe7bba4ee4 [AArch64] Remove floating-point narrow stores from getUsefulBitsForUse.
While not impossible, it's unlikely we'd be performing bitwise operations on FP
values.

llvm-svn: 269260
2016-05-12 01:04:15 +00:00
Chad Rosier 23a1a9a66d [AArch64] Improve getUsefulBitsForUse for narrow stores.
For narrow stores (e.g., strb, srth) we know the upper bits of the register are
unused/not useful. In some cases we can use this information to eliminate
unnecessary instructions.

For example, without this patch we generate (from the 2nd test case):

 ldr w8, [x0]
 and w8, w8, #0xfff0
 bfxil w8, w2, #16, #4
 strh w8, [x1]

and after the patch the 'and' is removed:

 ldr w8, [x0]
 bfxil w8, w2, #16, #4
 strh w8, [x1]
 ret

During the lowering of the bitfield insert instruction the 'and' is eliminated
because we know the upper 16-bits that are masked off are unused and the lower
4-bits that are masked off are overwritten by the insert itself. Therefore, the
'and' is unnecessary.

Differential Revision: http://reviews.llvm.org/D20175

llvm-svn: 269226
2016-05-11 20:19:54 +00:00
Weiming Zhao 095c271131 [AArch64] Fix DAG selection for cmps for fp16 type
Summary: When emitting comparison for fp16, in addition to promote the LHS and RHS to fp32, we need to change the VT as well.

Reviewers: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19922

llvm-svn: 269151
2016-05-11 01:26:32 +00:00
Tim Northover 9508a70adc AArch64: allow vN to represent 64-bit registers in inline asm.
Unlike xN/wN, the size of vN is genuinely ambiguous in the assembly, so we
should try to infer what was intended from the type. But only down to 64-bits
(vN can never represent sN, hN or bN).

llvm-svn: 269132
2016-05-10 22:26:45 +00:00
Jonas Paulsson 8e5b0c65cc [foldMemoryOperand()] Pass LiveIntervals to enable liveness check.
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.

In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.

Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861

llvm-svn: 269026
2016-05-10 08:09:37 +00:00
Matthias Braun 31d19d43c7 CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

llvm-svn: 269011
2016-05-10 03:21:59 +00:00
Silviu Baranga f60be28ed8 [AArch64] Implement lowering of the X constraint on AArch64
Summary:
This implements the lowering of the X constraint on
AArch64.

The default behaviour of the X constraint lowering is to
restrict it to "f". This is a problem because the "f"
constraint is not implemented on AArch64 and would be too
restrictive anyway. Therefore, the AArch64 hook will
lower this to "w" (if the operand is a floating point or
vector) or "r" otherwise.

The implementation is similar with the one added for
ARM (r267411).

This is the AArch64 side of the fix for http://llvm.org/PR26493

Reviewers: rengolin

Subscribers: aemerson, rengolin, llvm-commits, t.p.northover

Differential Revision: http://reviews.llvm.org/D19967

llvm-svn: 268907
2016-05-09 11:10:44 +00:00
Geoff Berry a5335647d5 [AArch64] Combine callee-save and local stack SP adjustment instructions.
Summary:
If a function needs to allocate both callee-save stack memory and local
stack memory, we currently decrement/increment the SP in two steps:
first for the callee-save area, and then for the local stack area.  This
changes the code to allocate them both at once at the very beginning/end
of the function.  This has two benefits:

1) there is one fewer sub/add micro-op in the prologue/epilogue

2) the stack adjustment instructions act as a scheduling barrier, so
moving them to the very beginning/end of the function increases post-RA
scheduler's ability to move instructions (that only depend on argument
registers) before any of the callee-save stores

This change can cause an increase in instructions if the original local
stack SP decrement could be folded into the first store to the stack.
This occurs when the first local stack store is to stack offset 0.  In
this case we are trading off one more sub instruction for one fewer sub
micro-op (along with benefits (2) and (3) above).

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18619

llvm-svn: 268746
2016-05-06 16:34:59 +00:00
Jun Bum Lim 33be4997ed [AArch64] Decouple zero store promotion from narrow ld merge. NFC.
Summary: This change refactors to decouple the zero store promotion from the narrow ld merge and add a flag (enable-narrow-ld-merge=true) to control the narrow ld merge optimization.

Reviewers: jmolloy, t.p.northover, mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19885

llvm-svn: 268744
2016-05-06 15:08:57 +00:00
Justin Bogner b012699741 SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

llvm-svn: 268693
2016-05-05 23:19:08 +00:00
Chad Rosier 777dc513a0 [AArch64] Remove unused MBP headers/dependency. NFC.
llvm-svn: 268682
2016-05-05 20:58:38 +00:00
Evandro Menezes d23324aab1 [AArch64] Add cheap as move instructions for Exynos M1
llvm-svn: 268549
2016-05-04 20:47:25 +00:00
Evandro Menezes bcb95cd0ed [AArch64] Use the reciprocal estimation machinery
This patch adds support for estimating the square root, its reciprocal and
division or reciprocal using the combiner generic reciprocal machinery.

llvm-svn: 268539
2016-05-04 20:18:27 +00:00
Matthias Braun e25bbd0bb8 AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZ
This fixes -verify-machineinstrs complaints when compiling
test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp

llvm-svn: 268360
2016-05-03 04:54:16 +00:00
Matthias Braun d1aabb2813 livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

llvm-svn: 268340
2016-05-03 00:24:32 +00:00
Matthias Braun 24f26e6d91 LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion,
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

llvm-svn: 268336
2016-05-03 00:08:46 +00:00
Chad Rosier 9d1a556125 Cleanup comments. NFC.
llvm-svn: 268236
2016-05-02 14:56:21 +00:00
Chad Rosier 7b6001ee0f Cleanup comments. NFC.
llvm-svn: 268235
2016-05-02 14:50:30 +00:00
Craig Topper 33772c5375 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Craig Topper 3b4842b56f [AArch64] Expand CTTZ for all vector types.
llvm-svn: 267837
2016-04-28 01:58:21 +00:00
Ahmed Bougacha 5a3bf6a4a9 [AArch64] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.
We run after PEI.
Found via inspection; no obvious testcase.

Follow-up to r266339.

llvm-svn: 267780
2016-04-27 20:33:05 +00:00
Ahmed Bougacha 9e71425f54 [AArch64] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.

Follow-up to r266339.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267779
2016-04-27 20:33:02 +00:00
Gerolf Hoflehner 50426191d7 [DAGCombiner] Follow coding convention for function name (NFC)
llvm-svn: 267745
2016-04-27 17:27:16 +00:00
Matthew Simpson 47bd3994b7 Add parentheses to silence buildbot warning
llvm-svn: 267734
2016-04-27 16:25:04 +00:00
Matthew Simpson e5dfb08fcb [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

llvm-svn: 267725
2016-04-27 15:20:21 +00:00
Ahmed Bougacha 128f8732a5 [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
Differential Revision: http://reviews.llvm.org/D17176

llvm-svn: 267606
2016-04-26 21:15:30 +00:00
Craig Topper c5551bfc26 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Junmo Park 3c65acf87e Remove MinLatency in SchedMachineModel. NFC.
Summary:
We don't use MinLatency any more since r184032.

Reviewers: atrick, hfinkel, mcrosier

Differential Revision: http://reviews.llvm.org/D19474

llvm-svn: 267502
2016-04-26 00:37:46 +00:00
Andrew Kaylor 1ac98bb088 Add optimization bisect opt-in calls for AArch64 passes
Differential Revision: http://reviews.llvm.org/D19394

llvm-svn: 267479
2016-04-25 21:58:52 +00:00
Quentin Colombet abe2d016cf Re-apply r267206 with a fix for the encoding problem: when the immediate of
log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit
variant cannot encode it. Therefore, set the subreg part accordingly.

[AArch64] Fix optimizeCondBranch logic.

The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267465
2016-04-25 20:54:08 +00:00
Nick Lewycky d595d4265b Fix typo in comment. NFC
llvm-svn: 267354
2016-04-24 17:55:57 +00:00
Gerolf Hoflehner 01b3a6184a [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.

Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267328
2016-04-24 05:14:01 +00:00
Renato Golin 179d1f5dad Revert "[AArch64] Fix optimizeCondBranch logic."
This reverts commit r267206, as it broke self-hosting on AArch64.

llvm-svn: 267294
2016-04-23 19:30:52 +00:00
Quentin Colombet 10768ab09e [AArch64] Fix optimizeCondBranch logic.
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267206
2016-04-22 20:09:58 +00:00
Quentin Colombet 658d9dbe56 [AArch64] When creating MRS instruction, make sure the destination register is
declared as a definition.

This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll.

llvm-svn: 267185
2016-04-22 18:46:17 +00:00
Quentin Colombet 9598f10104 [AArch64][AdvSIMDScalar] Update the kill flags correctly.
We used to simply set the kill flags to true when transforming a scalar
instruction to a vector one.
SrcScalar1 = copy SrcVector1
... = opScalar SrcScalar1
=>
SrcScalar1 = copy SrcVector1
... = opVector SrcVector1<kill>

This is obviously wrong. The proper update consists in:
1. Propagate the kill status from the copy to the new opVector
2. Reset the kill status on the copy, since the live-range of
   SrcVector1 got extended.

This fixes some of the machine verifier errors for AArch64 with make check.

llvm-svn: 267180
2016-04-22 18:09:14 +00:00
Daniel Sanders 591c379563 Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127
2016-04-22 09:37:26 +00:00
Gerolf Hoflehner b32f11fc62 [MachineCombiner] Support for floating-point FMA on ARM64
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267098
2016-04-22 02:15:19 +00:00
Quentin Colombet c320fb4eae [RegisterBankInfo] Change the API for the verify methods.
Return bool instead of void so that it is natural to put the calls into
asserts.

llvm-svn: 267033
2016-04-21 18:34:43 +00:00
Evgeny Astigeevich fd89fe0dd3 [AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr
AArch64InstrInfo::optimizeCompareInstr has bug PR27158 which causes generation of incorrect code.
A compare instruction is substituted with another instruction which does not
produce the same flags as the original compare instruction.
This patch contains:
1. Fix of the bug.
2. A regression test in MIR.
3. A new test to check that SUBS is replaced by SUB.

Differential Revision: http://reviews.llvm.org/D18838

llvm-svn: 266969
2016-04-21 08:54:08 +00:00
Marcin Koscielnicki 3fdc257d6a [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic.
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer.  I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.

This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.

Differential Revision: http://reviews.llvm.org/D19098

llvm-svn: 266818
2016-04-19 20:51:05 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Chad Rosier 1fbe9bcab4 [AArch64] Add load/store pair instructions to getMemOpBaseRegImmOfsWidth().
This improves AA in the MI schduler when reason about paired instructions.

Phabricator Revision: http://reviews.llvm.org/D17098
PR26358

llvm-svn: 266462
2016-04-15 18:09:10 +00:00
Geoff Berry c376406669 [AArch64] Add MMOs to callee-save load/store instructions.
Summary:
Without MMOs, the callee-save load/store instructions were treated as
volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer.

Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17661

llvm-svn: 266439
2016-04-15 15:16:19 +00:00
Jun Bum Lim 4c5bd58ebe [MachineScheduler]Add support for store clustering
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.

This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().

llvm-svn: 266437
2016-04-15 14:58:38 +00:00
Craig Topper 18e69f4f63 Use MVT instead of EVT to remove a bunch of unnecessary calls to getSimpleVT.
llvm-svn: 266414
2016-04-15 06:20:21 +00:00
Tom Stellard cef0fe4245 [GlobalISel] Move GISelAccessor class into public headers
Reviewers: qcolombet

Subscribers: joker.eph, vkalintiris, llvm-commits

Differential Revision: http://reviews.llvm.org/D19120

llvm-svn: 266348
2016-04-14 17:45:38 +00:00
Tom Stellard b72a65ff53 [GlobalISel] Coding style and whitespace fixes
Reviewers: qcolombet

Subscribers: joker.eph, llvm-commits, vkalintiris

Differential Revision: http://reviews.llvm.org/D19119

llvm-svn: 266342
2016-04-14 17:23:33 +00:00
Tim Northover cdf1529c01 AArch64: expand cmpxchg after regalloc at -O0.
FastRegAlloc works only at the basic-block level and spills all live-out
registers. Unfortunately for a stack-based cmpxchg near the spill slots, this
can perpetually clear the exclusive monitor, which means the cmpxchg will never
succeed.

I believe the only way to handle this within LLVM is by expanding the loop
post-regalloc. We don't want this in general because it severely limits the
optimisations that can be done, so we limit this to -O0 compilations.

It's an ugly hack, and about the one good point in the whole mess is that we
can treat all cmpxchg operations in the most naive way possible (seq_cst, no
clrex faff) without affecting correctness.

Should fix PR25526.

llvm-svn: 266339
2016-04-14 17:03:29 +00:00
Matthias Braun 46b0f03e12 TargetLowering: Factor out common code for tail call eligibility checking; NFC
llvm-svn: 266270
2016-04-14 01:10:42 +00:00
Matthias Braun 74a0bd319a AArch64: Use a callee save registers for swiftself parameters
It is very likely that the swiftself parameter is alive throughout most
functions function so putting it into a callee save register should
avoid spills for the callers with only a minimum amount of extra spills
in the callees.

Currently the generated code is correct but unnecessarily spills and
reloads arguments passed in callee save registers, I will address this
in upcoming patches.

This also adds a missing check that for tail calls the preserved value
of the caller must be the same as the callees parameter.

Differential Revision: http://reviews.llvm.org/D19007

llvm-svn: 266251
2016-04-13 21:43:16 +00:00
Evandro Menezes 8d53f88162 [AArch64] Disable LDP/STP for quads
Disable LDP/STP for quads on Exynos M1 as they are not as efficient as pairs
of regular LDR/STR.

Patch by Abderrazek Zaafrani <a.zaafrani@samsung.com>.

llvm-svn: 266223
2016-04-13 18:31:45 +00:00
Tim Northover b8a1ecfc62 AArch64: don't create instructions that write to xzr/wzr twice.
These are unpredictable even on AArch64.

Patch by Yichao Yu.

llvm-svn: 266206
2016-04-13 16:25:39 +00:00
Evandro Menezes 551af44e31 [AArch64] Fuse AES{D,E}/AESMC for Exynos M1. (NFC)
llvm-svn: 266144
2016-04-12 22:42:36 +00:00
Matthias Braun dff243e597 AArch64: Drive-by cleanup
llvm-svn: 266035
2016-04-12 02:16:13 +00:00
Manman Ren 5751814eda Swift Calling Convention: swifterror target support.
Differential Revision: http://reviews.llvm.org/D18716

llvm-svn: 265997
2016-04-11 21:08:06 +00:00
Tim Shen 0012756489 [SSP] Remove llvm.stackprotectorcheck.
This is a cleanup patch for SSP support in LLVM. There is no functional change.
llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
actually lowering it in SelectBasicBlock; rather, it adds check code in
FinishBasicBlock, ignoring the position where the intrinsic is inserted
(See FindSplitPointForStackProtector()).

llvm-svn: 265851
2016-04-08 21:26:31 +00:00
Quentin Colombet 88f7f6bc4f [AArch64] Fix a typo in the register class to register bank mapping.
For GPR family we want the GPR register bank, not FPR!

llvm-svn: 265743
2016-04-07 23:10:14 +00:00
Quentin Colombet 846219ae10 [AArch64] Get rid of some GlobalISel ifdefs.
llvm-svn: 265725
2016-04-07 21:24:40 +00:00
Quentin Colombet 6cc73ce808 [AArch64] gcc does not like litteral without quotes even on preprocessor macros.
llvm-svn: 265720
2016-04-07 20:49:15 +00:00
Quentin Colombet 789ad56248 [AArch64][CallLowering] Do not build the API if GlobalISel is not built.
This gets rid of some ifdefs and dummy implementations that were here
just to fill the blanks.

llvm-svn: 265719
2016-04-07 20:47:51 +00:00
Quentin Colombet d4131814b3 [GlobalISel] Add RegBankSelect hooks into the pass pipeline.
Now, RegBankSelect will happen after the IRTranslation and the target
may optionally add additional passes in between.

llvm-svn: 265716
2016-04-07 20:27:33 +00:00
Quentin Colombet d21115876c [RegisterBank] Rename RegisterBank::contains into RegisterBank::covers.
llvm-svn: 265695
2016-04-07 17:09:39 +00:00
Quentin Colombet b073c12912 [AArch64] Teach RegisterBankInfo about the CC register bank.
We need to cover each register class with a register bank.

llvm-svn: 265629
2016-04-07 00:39:29 +00:00
Quentin Colombet cbc353a422 [AArch64] Teach RegisterBankInfo about the mapping of register classes
on register banks.

llvm-svn: 265626
2016-04-07 00:14:30 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
James Y Knight 037b9894bd Put quotes around #error string.
GCC reports "missing terminating ' character", even when it's being
skipped by preprocessing.

llvm-svn: 265590
2016-04-06 19:52:32 +00:00
Quentin Colombet 4f03c0b806 [AArch64] Change the CMake to avoid to build GlobalISel related APIs
when GISel is not built.
The positive side effects are:
- We do not have to define dummy implementation
- We do not have to do weird gymnastic to avoid like issues (like
  missing constructor or vtable for the base classes)

llvm-svn: 265570
2016-04-06 17:38:12 +00:00
Quentin Colombet c17f744001 [AArch64] Teach the subtarget how to get to the RegisterBankInfo.
Rework the access to GlobalISel APIs to contain how much of
the APIs we need to access for the final executable to build when
GlobalISel is not built.

This prevents massive usage of ifdefs in various places. Now, all the
GlobalISel ifdefs will be happing only in AArch64TargetMachine.cpp.

llvm-svn: 265567
2016-04-06 17:26:03 +00:00
Quentin Colombet d1d324b2ae [AArch64] Use the default constructor of RegisterBankInfo when GlobalISel is not built.
This will avoid link-time error as the defautl constructor of RegisterBankInfo is
the only one available when GlobalISel is not built.

llvm-svn: 265549
2016-04-06 15:53:13 +00:00
Evgeny Astigeevich 9c24ebfa6d [AArch64][CodeGen] NFC refactor AArch64InstrInfo::optimizeCompareInstr to prepare it for fixing a bug in it
AArch64InstrInfo::optimizeCompareInstr has a bug which causes generation of incorrect code (PR#27158).
The patch refactors the function to simplify reviewing the fix of the bug.

1. Function name ‘modifiesConditionCode’ is changed to ‘areCFlagsAccessedBetweenInstrs’
   to reflect that the function can check modifying accesses, reading accesses or both.
2. Function ‘AArch64InstrInfo::optimizeCompareInstr’
   - Documented the function
   - Cmp_NZCV is DeadNZCVIdx to reflect that it is an operand index of dead NZCV
   - The code for the case of substituting CmpInstr is put into separate
     functions the main of them is ‘substituteCmpInstr’.

Differential Revision: http://reviews.llvm.org/D18609

llvm-svn: 265531
2016-04-06 11:39:00 +00:00
Matthias Braun 8e594fdf19 AArch64: Fix compile error
Fixed to adapt a use of enterBasicBlock() in my last commit (because I
had follow on patches in my repository that change the code).

llvm-svn: 265513
2016-04-06 02:59:44 +00:00
Matthias Braun 7dc03f060e RegisterScavenger: Take a reference as enterBasicBlock() argument.
Make it obvious that the argument cannot be nullptr.
Remove an unnecessary nullptr check in initRegState.

llvm-svn: 265511
2016-04-06 02:47:09 +00:00
NAKAMURA Takumi 285c8ff753 AArch64CodeGen: Make AArch64RegisterBankInfo.cpp optional along LLVM_BUILD_GLOBAL_ISEL.
llvm-svn: 265499
2016-04-06 01:18:08 +00:00
Quentin Colombet 5300950f3a [AArch64] Initial implementation of the targeting of the register bank information.
llvm-svn: 265489
2016-04-05 23:34:59 +00:00
Evgeniy Stepanov dde29e2799 Faster stack-protector for Android/AArch64.
Bionic has a defined thread-local location for the stack protector
cookie. Emit a direct load instead of going through __stack_chk_guard.

llvm-svn: 265481
2016-04-05 22:41:50 +00:00
Matthias Braun 870c34f0cf ARM, AArch64, X86: Check preserved registers for tail calls.
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.

This situation happens with calling conventions like preserver_mostcc or
cxx_fast_tls. It was explicitely handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.

Related to rdar://24207743

Differential Revision: http://reviews.llvm.org/D18680

llvm-svn: 265329
2016-04-04 18:56:13 +00:00
Derek Schuff 1dbf7a571f Add MachineFunctionProperty checks for AllVRegsAllocated for target passes
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.

Reviewers: qcolombet

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18525

llvm-svn: 265313
2016-04-04 17:09:25 +00:00
Saleem Abdulrasool 85b43639b1 AArch64: support .cpu directive
Add support for the AArch64 .cpu directive.  This is a slightly involved
directive since the parameter is actually a variable encoded string.  The
general structure is:

  <cpu>[[+-]<feature>]*

We now map some of the supported string names for features for internal
representation of feature flags.  If we encounter one which we do not support,
bail out as we cannot validate the assembly any longer.

Resolves PR27010.

llvm-svn: 265240
2016-04-02 19:29:52 +00:00
Tim Northover 5dad9df9f7 AArch64: avoid clobbering SP for dead MOVimm pseudos.
We were producing ORR, which actually defines a GPR32sp rather than a GPR32.

Should fix PR23209.

llvm-svn: 265198
2016-04-01 23:14:52 +00:00
Chad Rosier 8787a81023 [AArch64] Fix a typo. NFC.
llvm-svn: 265160
2016-04-01 17:34:38 +00:00
Oliver Stannard a5520b02a5 [AArch64] Better errors for out-of-range fixups
When a fixup that can be resolved by the assembler is out of range, we should
report an error in the source, rather than crashing.

Differential Revision: http://reviews.llvm.org/D18402

llvm-svn: 265120
2016-04-01 09:14:50 +00:00
Matthias Braun cc7fba40fe AArch64ISelLowering: Remove unused variables/arguments; NFC
llvm-svn: 265098
2016-04-01 02:49:17 +00:00
Jun Bum Lim 760afcb338 [AArch64] Allow loads with imp-def to be handled in getMemOpBaseRegImmOfsWidth()
Summary:
This change will allow loads with imp-def to be clustered in machine-scheduler pass.
areMemAccessesTriviallyDisjoint() can also handle loads with imp-def.

Reviewers: mcrosier, jmolloy, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18665

llvm-svn: 265051
2016-03-31 20:53:47 +00:00
Hans Wennborg e1a2e90ffa Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

llvm-svn: 265036
2016-03-31 18:33:38 +00:00
Jun Bum Lim cf9744367b [AArch64] Handle missing store pair opportunity
Summary:
This change will handle missing store pair opportunity where the first store
instruction stores zero followed by the non-zero store. For example, this change
will convert :

  str wzr, [x8]
  str w1, [x8, #4]
into:
  stp wzr, w1, [x8]

Reviewers: jmolloy, t.p.northover, mcrosier

Subscribers: flyingforyou, aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18570

llvm-svn: 265021
2016-03-31 14:47:24 +00:00
Matthias Braun 8d41436004 CodeGen: Factor out code for tail call result compatibility check; NFC
llvm-svn: 264959
2016-03-30 22:46:04 +00:00
Chad Rosier f7ac5f28ab [AArch64] Fix warnings pointed out by Hal.
llvm-svn: 264882
2016-03-30 18:08:51 +00:00
Adam Nemet fb8fbba584 [Aarch64] Turn on the LoopDataPrefetch pass for Cyclone
llvm-svn: 264811
2016-03-30 00:21:29 +00:00
Adam Nemet 1428d41f9a [LoopDataPrefetch] Centralize the tuning cl::opts under the pass
This is effectively NFC, minus the renaming of the options
(-cyclone-prefetch-distance -> -prefetch-distance).

The change was requested by Tim in D17943.

llvm-svn: 264806
2016-03-29 23:45:52 +00:00
Manman Ren f46262e0b7 Swift Calling Convention: add swiftself attribute.
Differential Revision: http://reviews.llvm.org/D17866

llvm-svn: 264754
2016-03-29 17:37:21 +00:00
Haicheng Wu 6a6bc750d5 [AArch64] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize
Mimic what x86 does when optimizing sdiv/udiv for minsize.

llvm-svn: 264606
2016-03-28 18:17:07 +00:00
Chad Rosier 59bcbba6b4 [AArch64] Fix typo. NFC.
llvm-svn: 264408
2016-03-25 14:37:43 +00:00
Chad Rosier 85c8594056 [AArch64] Replace return 0 with return false. NFC.
llvm-svn: 264185
2016-03-23 20:07:28 +00:00
Oliver Stannard aa77b1e025 [AArch64] Replace some uses of report_fatal_error with reportError in AArch64 ELF object writer
If we can't handle a relocation type, report it as an error in the source,
rather than asserting. I've added a more descriptive message and a test for the
only cases of this that I've been able to trigger.

Differential Revision: http://reviews.llvm.org/D18388

llvm-svn: 264156
2016-03-23 13:45:03 +00:00
Duncan P. N. Exon Smith 20be876a64 Fix -Wdocumentation warnings from r263853
Thanks to chapuni for catching this.

llvm-svn: 263993
2016-03-21 22:13:44 +00:00
Chad Rosier cf173ffb46 [AArch64] Add a helpful assert. NFC.
llvm-svn: 263965
2016-03-21 18:04:10 +00:00
Chad Rosier 4aeab5fbf2 [AArch64] Fix a -Wdocumentation warning. NFC.
llvm-svn: 263942
2016-03-21 13:43:58 +00:00
Manman Ren 4865d89653 [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.
Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller
and callee have mismatched calling conventions.

llvm-svn: 263856
2016-03-18 23:41:51 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Duncan P. N. Exon Smith c3fa1eded2 AArch64: Don't modify other modules in AArch64PromoteConstant
Avoid modifying other modules in `AArch64PromoteConstant` when the
constant is `ConstantData` (a horrible accident, I'm sure, caught by an
experimental follow-up to r261464).

Previously, this walked through all the users of a constant, but that
reaches into other modules when the constant doesn't depend transitively
on a `GlobalValue`!  Since we're walking instructions anyway, just
modify the instructions we actually see.

As a drive-by, instead of storing `Use` and getting the instructions
again via `Use::getUser()` (which is not a constantant time lookup),
store `std::pair<Instruction, unsigned>`.  Besides being cheaper, this
makes it easier to drop use-lists form `ConstantData` in the future.
(I threw this in because I was touching all the code anyway.)

Because the patch completely changes the traversal logic, it looks
like a rewrite of the pass, but the core logic is all the same (or
should be, minus the out-of-module changes).  In other words, there
should be NFC as long as the LLVMContext only has a single Module.

I didn't think of a good way to test this, but I hope to submit a patch
eventually that makes walking these use-lists illegal/impossible.

llvm-svn: 263853
2016-03-18 23:30:54 +00:00
Chad Rosier cdfd7e7201 [AArch64] Enable more load clustering in the MI Scheduler.
This patch adds unscaled loads and sign-extend loads to the TII
getMemOpBaseRegImmOfs API, which is used to control clustering in the MI
scheduler. This is done to create more opportunities for load pairing.  I've
also added the scaled LDRSWui instruction, which was missing from the scaled
instructions. Finally, I've added support in shouldClusterLoads for clustering
adjacent sext and zext loads that too can be paired by the load/store optimizer.

Differential Revision: http://reviews.llvm.org/D18048

llvm-svn: 263819
2016-03-18 19:21:02 +00:00
Adam Nemet 709e3046ee [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead
Summary:
It can hurt performance to prefetch ahead too much.  Be conservative for
now and don't prefetch ahead more than 3 iterations on Cyclone.

Reviewers: hfinkel

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17949

llvm-svn: 263772
2016-03-18 00:27:43 +00:00
Adam Nemet 6d8beeca53 [LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses
Summary:
And use this TTI for Cyclone.  As it was explained in the original RFC
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW
prefetcher work up to 2KB strides.

I am also adding tests for this and the previous change (D17943):

* Cyclone prefetching accesses with a large stride
* Cyclone not prefetching accesses with a small stride
* Generic Aarch64 subtarget not prefetching either

Reviewers: hfinkel

Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17945

llvm-svn: 263771
2016-03-18 00:27:38 +00:00
Adam Nemet 53e758fc55 [Aarch64] Add pass LoopDataPrefetch for Cyclone
Summary:
This wires up the pass for Cyclone but keeps it off for now because we
need a few more TTIs.

The getPrefetchMinStride value is not very well tuned right now but it
works well with CFP2006/433.milc which motivated this.

Tests will be added as part of the upcoming large-stride prefetching
patch.

Reviewers: t.p.northover

Subscribers: llvm-commits, aemerson, hfinkel, rengolin

Differential Revision: http://reviews.llvm.org/D17943

llvm-svn: 263770
2016-03-18 00:27:29 +00:00
Chad Rosier 27c352d26d [AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC.
http://reviews.llvm.org/D18125
Patch by Aditya Kumar.

llvm-svn: 263461
2016-03-14 18:24:34 +00:00
Chad Rosier 6d98655070 [AArch64] Break the dependency between FP and SP when possible.
When the SP in not changed because of realignment/VLAs etc., we restore the SP
by using the previous value of SP and not the FP. Breaking the dependency will
help in cases when the epilog of a callee is close to the epilog of the caller;
for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of
the callee.

http://reviews.llvm.org/D18060
Patch by Aditya Kumar and Evandro Menezes.

llvm-svn: 263458
2016-03-14 18:17:41 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Ahmed Bougacha 171f7b9986 [AArch64] Don't blindly lower f16/f128 FCCMPs.
Instead, extend f16 (like we do when lowering a standalone SETCC),
and let f128 be legalized to the RT calls.

Fixes PR26803.

llvm-svn: 263301
2016-03-11 22:02:58 +00:00
Tim Northover 6092de5075 AArch64: only try to use scaled fcvt ops on legal vector types.
Before we ended up calling getSimpleVectorType on a <3 x float>, which
asserted.

llvm-svn: 263169
2016-03-10 23:02:21 +00:00
Tim Northover 00e2dcec02 AArch64: remove pseudo-instructions used only for their patterns.
There's no real reason for these pseudos to exist, we should be writing real
patterns even if it is slightly less convenient. NFC.

llvm-svn: 263141
2016-03-10 18:46:12 +00:00
Balaram Makam e9b2725287 [AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2
Summary:
Peephole optimization that generates a single TBZ/TBNZ instruction
for test and branch sequences like in the example below. This handles
the cases that miss folding of AND into TBZ/TBNZ during ISelLowering of BR_CC

Examples:
   and  w8, w8, #0x400
   cbnz w8, L1
 to
   tbnz w8, #10, L1

Reviewers: MatzeB, jmolloy, mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17942

llvm-svn: 263136
2016-03-10 17:54:55 +00:00
Roman Levenstein 2792b3f02f Add support for a preserve_most calling convention to the AArch64 backend.
This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.

There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention.

Differential Revision: http://reviews.llvm.org/D18016

llvm-svn: 263092
2016-03-10 04:35:09 +00:00