Commit Graph

10118 Commits

Author SHA1 Message Date
Diana Picus 4ed0ee7b5f [ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI
Legal if we have hardware support for floating point, libcalls
otherwise.

Also add the necessary support for libcalls in the legalizer helper.

llvm-svn: 323726
2018-01-30 07:54:52 +00:00
Daniel Sanders 08464524c3 [ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern
Summary:
Apparently, we missed on constraining register classes of VReg-operands of all the instructions
built from a destination pattern but the root (top-level) one. The issue exposed itself
while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped
into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained,
while nested VTOSIZS (or rather its destination virtual register to be exact) does not.

Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI.

https://bugs.llvm.org/show_bug.cgi?id=35965
rdar://problem/36886530

Patch by Roman Tereshin

Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan

Reviewed By: dsanders, qcolombet, rovka

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42565

llvm-svn: 323692
2018-01-29 21:09:12 +00:00
Daniel Sanders 9ade5592d9 [globalisel] Make LegalizerInfo::LegalizeAction available outside of LegalizerInfo. NFC
Summary:
The improvements to the LegalizerInfo discussed in D42244 require that
LegalizerInfo::LegalizeAction be available for use in other classes. As such,
it needs to be moved out of LegalizerInfo. This has been done separately to the
next patch to minimize the noise in that patch.

llvm-svn: 323669
2018-01-29 17:37:29 +00:00
Sjoerd Meijer 3ddb7fb663 [ARM] FP16Pat and FullFP16Pat patterns. NFC.
Create and use FP16Pat FullFP16Pat helper patterns to make the difference
explicit.

Differential Revision: https://reviews.llvm.org/D42634

llvm-svn: 323640
2018-01-29 11:28:06 +00:00
Momchil Velikov d2cc6fd90b [ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative
load instruction

The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR`
register class. The function serves to emit a `tLDRspi` instruction and
certainly any subset of the `tGPR` register class is a valid destination of the
load.

Differential revision: https://reviews.llvm.org/D42535

llvm-svn: 323514
2018-01-26 10:20:58 +00:00
Sjoerd Meijer 011de9c0ca [ARM] Armv8.2-A FP16 code generation (part 1/3)
This is the groundwork for Armv8.2-A FP16 code generation .

Clang passes and returns _Float16 values as floats, together with the required
bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318.
We will implement half-precision argument passing/returning lowering in the ARM
backend soon, but for now this means that this:

_Float16 sub(_Float16 a, _Float16 b) {
  return a + b;
}

gets lowered to this:

define float @sub(float %a.coerce, float %b.coerce) {
entry:
  %0 = bitcast float %a.coerce to i32
  %tmp.0.extract.trunc = trunc i32 %0 to i16
  %1 = bitcast i16 %tmp.0.extract.trunc to half
  <SNIP>
  %add = fadd half %1, %3
  <SNIP>
}

When FullFP16 is *not* supported, we don't make f16 a legal type, and we get
legalization for "free", i.e. nothing changes and everything works as before.
And also f16 argument passing/returning is handled.

When FullFP16 is supported, we do make f16 a legal type, and have 2 places that
we need to patch up: f16 argument passing and returning, which involves minor
tweaks to avoid unnecessary code generation for some bitcasts.

As a "demonstrator" that this works for the different FP16, FullFP16, softfp
modes, etc., I've added match rules to the VSUB instruction description showing
that we can codegen this instruction from IR, but more importantly, also to
some conversion instructions. These conversions were causing issue before in
the FP16 and FullFP16 cases.

I've also added match rules to the VLDRH and VSTRH desriptions, so that we can
actually compile the entire half-precision sub code example above. This showed
that these loads and stores had the wrong addressing mode specified: AddrMode5
instead of AddrMode5FP16, which turned out not be implemented at all, so that
has also been added.

This is the minimal patch that shows all the different moving parts. In patch
2/3 I will add some efficient lowering of bitcasts, and in 2/3 I will add the
remaining Armv8.2-A FP16 instruction descriptions.


Thanks to Sam Parker and Oliver Stannard for their help and reviews!


Differential Revision: https://reviews.llvm.org/D38315

llvm-svn: 323512
2018-01-26 09:26:40 +00:00
Weiming Zhao 665784f170 [ARM] Expand long shifts for Thumb1 to __aeabi_ calls
Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1.

Reviewers: samparker

Reviewed By: samparker

Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42401

llvm-svn: 323354
2018-01-24 18:00:57 +00:00
Martin Storsjo 4ed94a06ac [ARM] Call __chkstk for dynamic stack allocation in all windows environments
This matches what MSVC does for alloca() function calls on ARM.
Even if MSVC doesn't support VLAs at the language level, it does
support the alloca function.

On the clang level, both the _alloca() (when emulating MSVC, which is
what the alloca() function expands to) and __builtin_alloca() builtin
functions, and VLAs, map to the same LLVM IR "alloca" function - so
within LLVM they're not distinguishable from each other.

Differential Revision: https://reviews.llvm.org/D42292

llvm-svn: 323308
2018-01-24 06:40:11 +00:00
Joel Galenson 1d89cd2bb4 [ARM] Cleanup part of ARMBaseInstrInfo::optimizeCompareInstr (NFCI).
As noted in another review, this loop is confusing.  This commit cleans it up
somewhat.

Differential Revision: https://reviews.llvm.org/D42312

llvm-svn: 323136
2018-01-22 17:53:47 +00:00
Marina Yatsina 0bf841ac2a Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files.
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40333

Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56
llvm-svn: 323095
2018-01-22 10:06:50 +00:00
Marina Yatsina 3d8efa4f0c Rename ExecutionDepsFix files to ExecutionDomainFix
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40332

Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8
llvm-svn: 323094
2018-01-22 10:06:33 +00:00
Marina Yatsina 6fc2aaae8d Separate ExecutionDepsFix into 4 parts:
1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains).
2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings.
3. BreakFalseDeps - Breaks false dependencies.
4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks.

This also included the following changes to ExcecutionDepsFix original logic:
1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class.
2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class.

Additional changes in affected files:
1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate.
2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate.

Additional refactoring changes will follow.

This commit is (almost) NFC.
The only functional change is that now BreakFalseDeps will break dependency for all register classes.
Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet.
In a future commit several instructions (and tests) will be added.

This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40330

Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275
llvm-svn: 323087
2018-01-22 10:05:23 +00:00
Joel Galenson dbc724f764 [ARM] Fix perf regression in compare optimization.
Fix a performance regression caused by r322737.

While trying to make it easier to replace compares with existing adds and
subtracts, I accidentally stopped it from doing so in some cases.  This should
fix that.  I'm also fixing another potential bug in that commit.

Differential Revision: https://reviews.llvm.org/D42263

llvm-svn: 322972
2018-01-19 17:46:27 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Reid Kleckner 1aa9061c5f [CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64
Every known PE COFF target emits /EXPORT: linker flags into a .drective
section. The AsmPrinter should handle this.

While we're at it, use global_values() and emit each export flag with
its own .ascii directive. This should make the .s file output more
readable.

llvm-svn: 322788
2018-01-17 23:55:23 +00:00
Joel Galenson bbcaf4ac5c [ARM] Optimize {s,u}mul.with.overflow.
This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG.

Differential revision: https://reviews.llvm.org/D40922

llvm-svn: 322738
2018-01-17 19:19:05 +00:00
Joel Galenson fe7fa40869 [ARM] Optimize {s,u}{add,sub}.with.overflow.
The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with a "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other).

Differential revision: https://reviews.llvm.org/D38378

llvm-svn: 322737
2018-01-17 19:19:05 +00:00
Diana Picus 01bcfd2112 [ARM GlobalISel] Rename local variable. NFC
llvm-svn: 322667
2018-01-17 15:25:37 +00:00
Diana Picus c62a16234b [ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR
llvm-svn: 322657
2018-01-17 14:14:14 +00:00
Diana Picus 65ed364fac [ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC
Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware
support, but only for conversions between float and double.

Also add the necessary boilerplate so that the LegalizerHelper can
introduce the required libcalls. This also works only for float and
double, but isn't too difficult to extend when the need arises.

llvm-svn: 322651
2018-01-17 13:34:10 +00:00
Diana Picus 2dc5405693 [ARM GlobalISel] Map G_FMA to FPR
llvm-svn: 322367
2018-01-12 12:06:01 +00:00
Diana Picus e74243d473 [ARM GlobalISel] Legalize G_FMA
For hard float with VFP4, it is legal. Otherwise, we use libcalls.

This needs a bit of support in the LegalizerHelper for soft float
because we didn't handle G_FMA libcalls yet. The support is trivial, as
the only difference between G_FMA and other libcalls that we already
handle is that it has 3 input operands rather than just 2.

llvm-svn: 322366
2018-01-12 11:30:45 +00:00
Andre Vieira 5627c218e1 [ARM] Add codegen for SMMULR, SMMLAR and SMMLSR
This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR
instructions from equivalent IR patterns.

Differential Revision: https://reviews.llvm.org/D41775

llvm-svn: 322361
2018-01-12 09:24:41 +00:00
Andre Vieira 26b9de9ebb [ARM] Fix erroneous availability of SMMLS for Armv7-M
Differential Revision: https://reviews.llvm.org/D41855

llvm-svn: 322360
2018-01-12 09:21:09 +00:00
Matthias Braun ea4359e922 PeepholeOptimizer: Fix for vregs without defs
The PeepholeOptimizer would fail for vregs without a definition. If this
was caused by an undef operand abort to keep the code simple (so we
don't need to add logic everywhere to replicate the undef flag).

Differential Revision: https://reviews.llvm.org/D40763

llvm-svn: 322319
2018-01-11 22:30:43 +00:00
Evgeniy Stepanov 5223b5d9d6 [arm] Implement Target Operand Flag MIR serialization.
Reviewers: efriedma, pcc

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D39975

llvm-svn: 322312
2018-01-11 21:37:58 +00:00
Diana Picus 0ed7513c83 [ARM GlobalISel] Map G_FNEG to the FPR bank
llvm-svn: 322169
2018-01-10 11:13:31 +00:00
Diana Picus f949a0abac [ARM GlobalISel] Legalize G_FNEG for s32 and s64
For hard float, it is legal.

For soft float, we need to lower to 0 - x first, and then we can use the
libcall for G_FSUB. This is undoing some of the canonicalization
performed by the IRTranslator (which introduces G_FNEG when it sees a
0 - x). Ideally, that canonicalization would be performed by a
pre-legalizer pass that would allow targets to opt out of this behaviour
rather than dance around it in the legalizer.

llvm-svn: 322168
2018-01-10 10:45:34 +00:00
Diana Picus 8f14886630 [ARM GlobalISel] Legalize s32/s64 G_FCONSTANT
Legal for hard float.
Change to G_CONSTANT for soft float (but preserve the binary
representation).

llvm-svn: 322164
2018-01-10 10:01:49 +00:00
Diana Picus 734a5e8912 [ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits
Make G_CONSTANT narrow for any scalars larger than 32 bits.

llvm-svn: 322162
2018-01-10 09:32:01 +00:00
Francis Visoiu Mistrih 7d9bef8f5c [CodeGen] Don't print "pred:" and "opt:" in -debug output
In -debug output we print "pred:" whenever a MachineOperand is a
predicate operand in the instruction descriptor, and "opt:" whenever a
MachineOperand is an optional def in the instruction descriptor.

Differential Revision: https://reviews.llvm.org/D41870

llvm-svn: 322096
2018-01-09 17:31:07 +00:00
Momchil Velikov ac7c5c1d92 [ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz
The patch makes the unwind information not mention registers, which were pushed
solely for the purpose of saving stack adjustment instructions.

Differential revision: https://reviews.llvm.org/D41300
Fixes https://bugs.llvm.org/show_bug.cgi?id=35379

llvm-svn: 321996
2018-01-08 14:47:19 +00:00
Momchil Velikov d17dabca31 [ARM] Fix PR35481
This patch allows `r7` to be used, regardless of its use as a frame pointer, as
a temporary register when popping `lr`, and also falls back to using a high
temporary register if, for some reason, we weren't able to find a suitable low
one.

Differential revision: https://reviews.llvm.org/D40961
Fixes https://bugs.llvm.org/show_bug.cgi?id=35481

llvm-svn: 321989
2018-01-08 11:32:37 +00:00
Reid Kleckner 5619669a5a Fix -Wsign-compare warnings on Windows
These arise because enums are 'int' by default.

llvm-svn: 321887
2018-01-05 19:53:51 +00:00
Momchil Velikov 7efdd090e2 [ARM] Issue an erorr when non-general-purpose registers are used in address operands
Currently the assembler would accept, e.g. `ldr r0, [s0, #12]` and similar.
This patch add checks that only general-purpose registers are used in address
operands, shifted registers, and shift amounts.

Differential revision: https://reviews.llvm.org/D39910

llvm-svn: 321866
2018-01-05 13:28:10 +00:00
Oliver Stannard 7d9198b296 [ARM] Fix endianness of Thumb .inst.w directive
Wide Thumb2 instructions should be emitted into the object file as pairs of
16-bit words of the appropriate endianness, not one 32-bit word.

Differential revision: https://reviews.llvm.org/D41185

llvm-svn: 321799
2018-01-04 13:56:40 +00:00
Diana Picus 865f7fecb2 [ARM GlobalISel] Select G_PHI
Select G_PHI to PHI and manually constrain the result register. This is
very similar to how COPY is handled, so extract and reuse some of that
code.

llvm-svn: 321797
2018-01-04 13:09:25 +00:00
Diana Picus c768bbe2e7 [ARM GlobalISel] Legalize scalar G_PHI
Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard
float. Widen any smaller types.

llvm-svn: 321795
2018-01-04 13:09:14 +00:00
Diana Picus 37ae9f68a4 [ARM GlobalISel] Fix selection of pointer constants
We used to handle G_CONSTANT with pointer type by forcing the type of
the result register to s32 and then letting TableGen handle it.
Unfortunately, setting the type only works for generic virtual
registers, that haven't yet been constrained to a register class (e.g.
those used only by a COPY later on). If the result register has already
been constrained as a use of a previously selected instruction, then
setting the type will assert.

It would be nice to be able to teach TableGen to select pointer
constants the same as integer constants, but since it's such an edge
case (at the moment the only pointer constant that we're generally
interested in is 0, and that is mostly used for comparisons and selects,
which are also not supported by TableGen) it's probably not worth the
effort right now. Instead, handle pointer constants with some trivial
handwritten code.

llvm-svn: 321793
2018-01-04 10:54:57 +00:00
Alex Bradbury 46db78b743 [ARM][NFC] Avoid recreating MCSubtargetInfo in ARMAsmBackend
After D41349, we can now directly access MCSubtargetInfo from 
createARM*AsmBackend. This patch makes use of this, avoiding the need to 
create a fresh MCSubtargetInfo (which was previously always done with a blank 
CPU and feature string). Given the total size of the change remains pretty 
tiny and we're removing the old explicit destructor, I changed the STI field 
to a reference rather than a pointer.

Differential Revision: https://reviews.llvm.org/D41693

llvm-svn: 321707
2018-01-03 13:46:21 +00:00
Alex Bradbury b22f751fa7 Thread MCSubtargetInfo through Target::createMCAsmBackend
Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. 
D20830 threaded an MCSubtargetInfo reference through 
MCAsmBackend::relaxInstruction, but this isn't the only function that would 
benefit from access. This patch removes the Triple and CPUString arguments 
from createMCAsmBackend and replaces them with MCSubtargetInfo.

This patch just changes the interface without making any intentional 
functional changes. Once in, several cleanups are possible:
* Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend
* Support 16-bit instructions when valid in MipsAsmBackend::writeNopData
* Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl
* Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221)

This change initially exposed PR35686, which has since been resolved in r321026.

Differential Revision: https://reviews.llvm.org/D41349

llvm-svn: 321692
2018-01-03 08:53:05 +00:00
Sanjoy Das 26d11ca4b0 (Re-landing) Expose a TargetMachine::getTargetTransformInfo function
Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321375
2017-12-22 18:21:59 +00:00
Diana Picus 28a6d0e639 [ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32
Mark conversions between pointers and 32-bit scalars as legal, map them
to the GPR and select to a simple COPY.

llvm-svn: 321356
2017-12-22 13:05:51 +00:00
Diana Picus 68773859c8 [ARM GlobalISel] Support pointer constants
Pointer constants are pretty rare, since we usually represent them as
integer constants and then cast to pointer. One notable exception is the
null pointer constant, which is represented directly as a G_CONSTANT 0
with pointer type. Mark it as legal and make sure it is selected like
any other integer constant.

llvm-svn: 321354
2017-12-22 11:09:18 +00:00
Eli Friedman 39ed9a602b [Inliner] Restrict soft-float inlining penalty.
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.

While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.

Differential Revision: https://reviews.llvm.org/D41522

llvm-svn: 321332
2017-12-22 02:08:08 +00:00
Sam Parker 98727bc261 [ARM] Armv8-R DFB instruction
Implement MC support for the Armv8-R 'Data Full Barrier' instruction.

Differential Revision: https://reviews.llvm.org/D41430

llvm-svn: 321256
2017-12-21 11:17:49 +00:00
Sanjoy Das 747d1114d6 Revert "Expose a TargetMachine::getTargetTransformInfo function"
This reverts commit r321234.  It breaks the -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 321243
2017-12-21 02:34:39 +00:00
Sanjoy Das 0c3de350b4 Expose a TargetMachine::getTargetTransformInfo function
Summary:
This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321234
2017-12-21 01:06:58 +00:00
Joel Galenson 6f4e827e4c [ARM] Optimize {s,u}{add,sub}.with.overflow.
The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG.  This commit ports that code to the ARM backend.

Differential revision: https://reviews.llvm.org/D35635

llvm-svn: 321224
2017-12-20 22:25:39 +00:00
Diana Picus 75ce852abe [ARM GlobalISel] Fix assertion in RegBankSelect
We get an assertion in RegBankSelect for code along the lines of
my_32_bit_int = my_64_bit_int, which tends to translate into a 64-bit
load, followed by a G_TRUNC, followed by a 32-bit store. This appears in
a couple of places in the test-suite.

At the moment, the legalizer doesn't distinguish between integer and
floating point scalars, so a 64-bit load will be marked as legal for
targets with VFP, and so will the rest of the sequence, leading to a
slightly bizarre G_TRUNC reaching RegBankSelect.

Since the current support for 64-bit integers is rather immature, this
patch works around the issue by explicitly handling this case in
RegBankSelect and InstructionSelect. In the future, we may want to
revisit this decision and make sure 64-bit integer loads are narrowed
before reaching RegBankSelect.

llvm-svn: 321165
2017-12-20 11:27:10 +00:00
Florian Hahn c3aa6d83fd [ARM] Lower unsigned saturation to USAT
Summary:
Implement lower of unsigned saturation on an interval [0, k] where k + 1 is a power of two using USAT instruction in a similar way to how [~k, k] is lowered using SSAT on ARM models that supports it.

Patch by Marten Svanfeldt

Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn

Reviewed By: fhahn

Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D41348

llvm-svn: 321164
2017-12-20 11:13:57 +00:00
Adrian Prantl 0e6694d111 Silence a bunch of implicit fallthrough warnings
llvm-svn: 321114
2017-12-19 22:05:25 +00:00
David Green 110844d21c [ARM] Register the Thumb2SizeReducePass. NFC
Also adds a simple test case.

llvm-svn: 321072
2017-12-19 12:19:08 +00:00
Matthias Braun a4852d2c19 X86/AArch64/ARM: Factor out common sincos_stret logic; NFCI
Note:
- X86ISelLowering: setLibcallName(SINCOS) was superfluous as
  InitLibcalls() already does it.
- ARMISelLowering: Setting libcallnames for sincos/sincosf seemed
  superfluous as in the darwin case it wouldn't be used while for all
  other cases InitLibcalls already does it.

llvm-svn: 321036
2017-12-18 23:19:42 +00:00
Diana Picus 8ee540c01a [ARM GlobalISel] Fix G_(UN)MERGE_VALUES handling after r319524
r319524 has made more G_MERGE_VALUES/G_UNMERGE_VALUES pairs legal than
are supported by the rest of the pipeline. Restrict that to only the
cases that we can currently handle: packing 32-bit values into 64-bit
ones, when we have hardware FP.

llvm-svn: 320980
2017-12-18 13:22:28 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Sanjay Patel 0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Geoff Berry dcc646e40b [ARM] Fix isRenamable flag setting on expanded VSTMDIA opcode.
Fixes expensive-check ARM buildbot failure.

llvm-svn: 320718
2017-12-14 18:06:25 +00:00
Francis Visoiu Mistrih 5df3bbf3e6 [CodeGen] Print global addresses as @foo in both MIR and debug output
Work towards the unification of MIR and debug output by printing
`@foo` instead of `<ga:@foo>`.

Also print target flags in the MIR format since most of them are used on
global address operands.

Only debug syntax is affected.

llvm-svn: 320682
2017-12-14 10:03:09 +00:00
Michael Zolotukhin 67b04bd8ac Recover some overzealously removed includes.
llvm-svn: 320648
2017-12-13 22:21:02 +00:00
Michael Zolotukhin caf9ea6aa0 Remove redundant includes from lib/Target/ARM.
llvm-svn: 320635
2017-12-13 21:31:17 +00:00
Geoff Berry 60c431022e [MachineOperand][MIR] Add isRenamable to MachineOperand.
Summary:
Add isRenamable() predicate to MachineOperand.  This predicate can be
used by machine passes after register allocation to determine whether it
is safe to rename a given register operand.  Register operands that
aren't marked as renamable may be required to be assigned their current
register to satisfy constraints that are not captured by the machine
IR (e.g. ABI or ISA constraints).

Reviewers: qcolombet, MatzeB, hfinkel

Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39400

llvm-svn: 320503
2017-12-12 17:53:59 +00:00
Roger Ferrer Ibanez 5ea0f2501f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564
 - fixes PR35103

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 320355
2017-12-11 12:13:45 +00:00
Francis Visoiu Mistrih a8a83d150f [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022
2017-12-07 10:40:31 +00:00
Nirav Dave 7d8f3e0c93 [ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.

 * Properly truncate store merge constants
 * Disable merging of truncated stores floating points
 * Ensure merges of constant stores into a single vector are
   constructed from legal elements.

Reviewers: eastig, efriedma

Reviewed By: eastig

Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319899
2017-12-06 15:30:13 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Pablo Barrio 2b4385846c Fix function pointer tail calls in armv8-M.base
Summary:
The compiler fails with the following error message:

fatal error: error in backend: ran out of registers during
register allocation

Tail call optimization for Armv8-M.base fails to meet all the required
constraints when handling calls to function pointers where the
arguments take up r0-r3. This is because the pointer to the
function to be called can only be stored in r0-r3, but these are
all occupied by arguments. This patch makes sure that tail call
optimization does not try to handle this type of calls.

Reviewers: chill, MatzeB, olista01, rengolin, efriedma

Reviewed By: olista01, efriedma

Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40706

llvm-svn: 319664
2017-12-04 16:55:49 +00:00
Oliver Stannard 7ab60605f8 Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands
This is causing a failure in the llvm-clang-x86_64-expensive-checks-win
buildbot, and I can't reproduce it locally, so reverting until I can work out
what is wrong.

llvm-svn: 319654
2017-12-04 13:42:22 +00:00
Oliver Stannard 7cd4db94f8 [Asm, ARM] Add fallback diag for multiple invalid operands
This adds a "invalid operands for instruction" diagnostic for
instructions where there is an instruction encoding with the correct
mnemonic and which is available for this target, but where multiple
operands do not match those which were provided. This makes it clear
that there is some combination of operands that is valid for the current
target, which the default diagnostic of "invalid instruction" does not.

Since this is a very general error, we only emit it if we don't have a
more specific error.

Differential revision: https://reviews.llvm.org/D36747

llvm-svn: 319649
2017-12-04 12:02:32 +00:00
Martin Storsjo c85cc41801 [ARM] Allow using emulated tls on platforms other than ELF
This matches how it is done on X86.

This allows using emulated tls on windows; in MinGW environments,
native tls isn't supported at the moment.

Differential Revision: https://reviews.llvm.org/D40769

llvm-svn: 319643
2017-12-04 09:08:55 +00:00
Nirav Dave 3e76e1e89e [DAG][ARM] Revert "Reenable post-legalize store merge"
due to failures in AArch and ARM code gen.

llvm-svn: 319587
2017-12-01 21:55:47 +00:00
Nirav Dave eb2b24fded [ARM][DAG] Reenable post-legalize store merge
Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case.

Reviewers: eastig, efriedma

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319547
2017-12-01 14:49:26 +00:00
Volkan Keles a32ff00b00 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Diana Picus f003d9ff95 [ARM GlobalISel] Bail out for byval
Fallback if we have a byval parameter or argument since we don't support
them yet.

llvm-svn: 319428
2017-11-30 12:23:44 +00:00
Francis Visoiu Mistrih 93ef145862 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

llvm-svn: 319427
2017-11-30 12:12:19 +00:00
Nirav Dave bafaa53c4d [ARM][DAG] Revert Disable post-legalization store merge for ARM
Partially reverting enabling of post-legalization store merge
(r319036) for just ARM backend as it is causing incorrect code
in some Thumb2 cases.

llvm-svn: 319331
2017-11-29 18:06:13 +00:00
Diana Picus 863b5b05f1 [ARM GlobalISel] Fix selecting G_BRCOND
When lowering a G_BRCOND, we generate a TSTri of the condition against
1, which sets the flags, and then a Bcc which branches based on the
value of the flags.

Unfortunately, we were using the wrong condition code to check whether
we need to branch (EQ instead of NE), which caused all our branches to
do the opposite of what they were intended to do. This patch fixes the
issue by using the correct condition code.

llvm-svn: 319313
2017-11-29 14:20:06 +00:00
Oliver Stannard 9ea2eaeb50 [ARM] Add support for armv7e-m to the .arch directive
This will allow compilation of assembly files targeting armv7e-m without having
to specify the Tag_CPU_arch attribute as a workaround.

Differential revision: https://reviews.llvm.org/D40370

Patch by Ian Tessier!

llvm-svn: 319303
2017-11-29 10:12:15 +00:00
Francis Visoiu Mistrih 9d7bb0cb40 [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187
2017-11-28 17:15:09 +00:00
Francis Visoiu Mistrih 9d419d3b0c [CodeGen] Rename functions PrintReg* to printReg*
LLVM Coding Standards:
  Function names should be verb phrases (as they represent actions), and
  command-like function should be imperative. The name should be camel
  case, and start with a lower case letter (e.g. openFile() or isFoo()).

Differential Revision: https://reviews.llvm.org/D40416

llvm-svn: 319168
2017-11-28 12:42:37 +00:00
Matthias Braun 5d01e708e1 ARM: Fix PR32578
https://llvm.org/PR32578

I simplified and converted the reproducer into a lit test.

Patch by Vedant Kumar!

llvm-svn: 319130
2017-11-28 01:17:52 +00:00
Momchil Velikov bd2c7eb923 [ARM] Fix an off-by-one error when restoring LR for 16-bit Thumb
The commit https://reviews.llvm.org/rL318143 computes incorrectly to offset to
restore LR from.

The number of tPOP operands is 2 (condition) + 2 (implicit def and use of SP) +
count of the popped registers. We need to load LR from just past the last
register, hence the correct offset should be either getNumOperands() - 4 and
getNumExplicitOperands() - 2 (multiplied by 4).

Differential revision: https://reviews.llvm.org/D40305

llvm-svn: 319014
2017-11-27 10:13:14 +00:00
Diana Picus c01f7f131b [ARM GlobalISel] Support G_FDIV for s32 and s64
TableGen already generates code for selecting a G_FDIV, so we only need
to add a test.

For the legalizer and reg bank select, we do the same thing as for the
other floating point binary operations: either mark as legal if we have
a FP unit or lower to a libcall, and map to the floating point
registers.

llvm-svn: 318915
2017-11-23 13:26:07 +00:00
Diana Picus 9faa09b21e [ARM GlobalISel] Support G_FMUL for s32 and s64
TableGen already generates code for selecting a G_FMUL, so we only need
to add a test for that part.

For the legalizer and reg bank select, we do the same thing as the other
floating point binary operators: either mark as legal if we have a FP
unit or lower to a libcall, and map to the floating point registers.

llvm-svn: 318910
2017-11-23 12:44:20 +00:00
Oliver Stannard 9cb89f6611 [ARM] Remove pre-UAL FLDM/FSTM aliases
These are pre-UAL syntax, and we don't support any other pre-UAL instructions,
with the exception of FLDMX/FSTMX, which don't have a UAL equivalent. Therefore
there's no reason to keep them or their AsmParser hacks around.

With the AsmParser hacks removed, the FLDMX and FSTMX instructions get the same
operand diagnostics as the UAL instructions.

Differential revision: https://reviews.llvm.org/D39196

llvm-svn: 318777
2017-11-21 16:20:25 +00:00
Oliver Stannard 1e6d4b9e62 [ARM] Don't omit non-default predication code
This was causing the (invalid) predicated versions of the NEON VRINTX and
VRINTZ instructions to be accepted, with the condition code being ignored.

Also, there is no NEON VRINTR instruction, so that part of the check was not
necessary.

Differential revision: https://reviews.llvm.org/D39193

llvm-svn: 318771
2017-11-21 15:34:15 +00:00
Oliver Stannard 1e73e95f3c [Asm] Improve "too few operands" errors
- We can still emit this error if the actual instruction has two or more
  operands missing compared to the expected one.
- We should only emit this error once per instruction.

Differential revision: https://reviews.llvm.org/D36746

llvm-svn: 318770
2017-11-21 15:16:50 +00:00
Oliver Stannard d6ca9879ba [ARM] Add diagnostics for SPR/DPR lists
Differential revision: https://reviews.llvm.org/D39195

llvm-svn: 318766
2017-11-21 15:06:01 +00:00
Martell Malone 51d82696db [ARM] Use SEH exceptions on thumbv7-windows
Reviewers: mstorsjo

Differential Revision: https://reviews.llvm.org/D40286

llvm-svn: 318756
2017-11-21 11:30:20 +00:00
Eugene Leviant 6bc35a93e6 [MI scheduler] Fix VADD and VSUB in cortex-a57 model
This patch fixes instregex for interger vector add/sub instructions

Differential revision: https://reviews.llvm.org/D40254

llvm-svn: 318749
2017-11-21 11:01:28 +00:00
Martin Storsjo b4c907edd7 [ARM] Use dwarf exception handling on MinGW
Enabling and using dwarf exceptions seems like an easier path
to take, than to make the COFF/ARM backend output EHABI directives.
Previously, no EH model was enabled at all on this target.

There's no point in setting UseIntegratedAssembler to false since
GNU binutils doesn't support Windows on ARM, and since we don't
need to support external assembler, we don't need to use register
numbers in cfi directives.

Differential Revision: https://reviews.llvm.org/D39532

llvm-svn: 318510
2017-11-17 08:04:40 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Yi Kong 39bcd4ed3e [ARM] 't' asm constraint should accept i32
't' constraint normally only accepts f32 operands, but for VCVT the
operands can be i32. LLVM is overly restrictive and rejects asm like:

  float foo() {
    float result;
    __asm__ __volatile__(
      "vcvt.f32.s32 %[result], %[arg1]\n"
      : [result]"=t"(result)
      : [arg1]"t"(0x01020304) );
    return result;
  }

Relax the value type for 't' constraint to either f32 or i32.

Differential Revision: https://reviews.llvm.org/D40137

llvm-svn: 318472
2017-11-16 23:38:17 +00:00
Daniel Sanders f76f315436 [globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler

Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
  step due to a lack of a portable 'cat' command. It should be the
  concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
  changes

Depends on D39742

Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka

Reviewed By: rovka

Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39747

llvm-svn: 318356
2017-11-16 00:46:35 +00:00
Daniel Sanders 725584e26d Add backend name to Target to enable runtime info to be fed back into TableGen
Summary:
Make it possible to feed runtime information back to tablegen to enable
profile-guided tablegen-eration, detection of untested tablegen definitions, etc.

Being a cross-compiler by nature, LLVM will potentially collect data for multiple
architectures (e.g. when running 'ninja check'). We therefore need a way for
TableGen to figure out what data applies to the backend it is generating at the
time. This patch achieves that by including the name of the 'def X : Target ...'
for the backend in the TargetRegistry.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D39742

llvm-svn: 318352
2017-11-15 23:55:44 +00:00
Momchil Velikov 4a91fb93db [ARM] Split Arm jump table branch into i12 and rs suffixed versions
This is a refactoring/cleanup of Arm `addrmode2` operand class. The patch
removes it completely.

Differential Revision: https://reviews.llvm.org/D39832

llvm-svn: 318291
2017-11-15 12:02:55 +00:00
Martin Storsjo 4629f52312 [ARM, AArch64] Fix an assert message, Darwin isn't the only target supporting TLS. NFC.
llvm-svn: 318184
2017-11-14 19:57:59 +00:00
Tim Northover 5cdc4f9c33 ARM: correctly update CFG when splitting BB to fix branch.
Because the block-splitting code is multi-purpose, we have to meddle with the
branches when using it to fixup a conditional branch destination. We got the
code right, but forgot to update the CFG so the verifier complained when
expensive checks were on.

Probably harmless since constant-islands comes so late, but best to fix it
anyway.

llvm-svn: 318148
2017-11-14 11:43:54 +00:00
Diana Picus 21a42bcc0b [ARM GlobalISel] Remove C++ code for G_CONSTANT
Get rid of the handwritten instruction selector code for handling
G_CONSTANT. This code wasn't checking all the preconditions correctly
anyway, so it's better to leave it to TableGen, which can handle at
least some cases correctly (e.g. MOVi, MOVi16, folding into binary
operations). Also add tests to cover those cases.

llvm-svn: 318146
2017-11-14 11:20:32 +00:00
Momchil Velikov dc86e1444d [ARM] Fix incorrect conversion of a tail call to an ordinary call
When we emit a tail call for Armv8-M, but then discover that the caller needs to
save/restore `LR`, we convert the tail call to an ordinary one, since restoring
`LR` takes extra instructions, which may negate the benefits of the tail
call. If the callee, however, takes stack arguments, this conversion is
incorrect, since nothing has been done to pass the stack arguments.

Thus the patch reverts https://reviews.llvm.org/rL294000

Also, we improve the instruction sequence for popping `LR` in the case when we
couldn't immediately find a scratch low register, but we can use as a temporary
one of the callee-saved low registers and restore `LR` before popping other
callee-saves.

Differential Revision: https://reviews.llvm.org/D39599

llvm-svn: 318143
2017-11-14 10:36:52 +00:00
Evgeniy Stepanov 76d5ac4906 [arm] Fix Unnecessary reloads from GOT.
Summary:
This fixes PR35221.
Use pseudo-instructions to let MachineCSE hoist global address computation.

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39871

llvm-svn: 318081
2017-11-13 20:45:38 +00:00
Momchil Velikov 842aa90192 [ARM] Place jump table as the first operand in additions
When generating table jump code for switch statements, place the jump
table label as the first operand in the various addition instructions
in order to enable addressing mode selectors to better match index
computation and possibly fold them into the addressing mode of the
table entry load instruction.

Differential revision: https://reviews.llvm.org/D39752

llvm-svn: 318033
2017-11-13 11:56:48 +00:00
Mandeep Singh Grang d104673257 [llvm] Remove redundant return [NFC]
Reviewers: davidxl, olista01, Eugene.Zelenko

Reviewed By: Eugene.Zelenko

Subscribers: sdardis, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39917

llvm-svn: 317995
2017-11-12 03:47:50 +00:00
Jonas Paulsson 4b017e682d [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Kristof Beyls af9814a1fc [GlobalISel] Enable legalizing non-power-of-2 sized types.
This changes the interface of how targets describe how to legalize, see
the below description.

1. Interface for targets to describe how to legalize.

In GlobalISel, the API in the LegalizerInfo class is the main interface
for targets to specify which types are legal for which operations, and
what to do to turn illegal type/operation combinations into legal ones.

For each operation the type sizes that can be legalized without having
to change the size of the type are specified with a call to setAction.
This isn't different to how GlobalISel worked before. For example, for a
target that supports 32 and 64 bit adds natively:

  for (auto Ty : {s32, s64})
    setAction({G_ADD, 0, s32}, Legal);

or for a target that needs a library call for a 32 bit division:

  setAction({G_SDIV, s32}, Libcall);

The main conceptual change to the LegalizerInfo API, is in specifying
how to legalize the type sizes for which a change of size is needed. For
example, in the above example, how to specify how all types from i1 to
i8388607 (apart from s32 and s64 which are legal) need to be legalized
and expressed in terms of operations on the available legal sizes
(again, i32 and i64 in this case). Before, the implementation only
allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0,
s128}, NarrowScalar).  A worse limitation was that if you'd wanted to
specify how to legalize all the sized types as allowed by the LLVM-IR
LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times
and probably would need a lot of memory to store all of these
specifications.

Instead, the legalization actions that need to change the size of the
type are specified now using a "SizeChangeStrategy".  For example:

   setLegalizeScalarToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerAndNarrowToLargest);

This example indicates that for type sizes for which there is a larger
size that can be legalized towards, do it by Widening the size.
For example, G_ADD on s17 will be legalized by first doing WidenScalar
to make it s32, after which it's legal.
The "NarrowToLargest" indicates what to do if there is no larger size
that can be legalized towards. E.g. G_ADD on s92 will be legalized by
doing NarrowScalar to s64.

Another example, taken from the ARM backend is:
   for (unsigned Op : {G_SDIV, G_UDIV}) {
     setLegalizeScalarToDifferentSizeStrategy(Op, 0,
         widenToLargerTypesUnsupportedOtherwise);
     if (ST.hasDivideInARMMode())
       setAction({Op, s32}, Legal);
     else
       setAction({Op, s32}, Libcall);
   }

For this example, G_SDIV on s8, on a target without a divide
instruction, would be legalized by first doing action (WidenScalar,
s32), followed by (Libcall, s32).

The same principle is also followed for when the number of vector lanes
on vector data types need to be changed, e.g.:

   setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal);
   setLegalizeVectorElementToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerTypesUnsupportedOtherwise);

As currently implemented here, vector types are legalized by first
making the vector element size legal, followed by then making the number
of lanes legal. The strategy to follow in the first step is set by a
call to setLegalizeVectorElementToDifferentSizeStrategy, see example
above.  The strategy followed in the second step
"moreToWiderTypesAndLessToWidest" (see code for its definition),
indicating that vectors are widened to more elements so they map to
natively supported vector widths, or when there isn't a legal wider
vector, split the vector to map it to the widest vector supported.

Therefore, for the above specification, some example legalizations are:
  * getAction({G_ADD, LLT::vector(3, 3)})
    returns {WidenScalar, LLT::vector(3, 8)}
  * getAction({G_ADD, LLT::vector(3, 8)})
    then returns {MoreElements, LLT::vector(8, 8)}
  * getAction({G_ADD, LLT::vector(20, 8)})
    returns {FewerElements, LLT::vector(16, 8)}


2. Key implementation aspects.

How to legalize a specific (operation, type index, size) tuple is
represented by mapping intervals of integers representing a range of
size types to an action to take, e.g.:

       setScalarAction({G_ADD, LLT:scalar(1)},
                       {{1, WidenScalar},  // bit sizes [ 1, 31[
                        {32, Legal},       // bit sizes [32, 33[
                        {33, WidenScalar}, // bit sizes [33, 64[
                        {64, Legal},       // bit sizes [64, 65[
                        {65, NarrowScalar} // bit sizes [65, +inf[
                       });

Please note that most of the code to do the actual lowering of
non-power-of-2 sized types is currently missing, this is just trying to
make it possible for targets to specify what is legal, and how non-legal
types should be legalized.  Probably quite a bit of further work is
needed in the actual legalizing and the other passes in GlobalISel to
support non-power-of-2 sized types.

I hope the documentation in LegalizerInfo.h and the examples provided in the
various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well
enough how this is meant to be used.

This drops the need for LLT::{half,double}...Size().


Differential Revision: https://reviews.llvm.org/D30529

llvm-svn: 317560
2017-11-07 10:34:34 +00:00
David Blaikie 1be62f0327 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Diana Picus acf4bf21ab [ARM GlobalISel] Move the check for Thumb higher up
We're currently bailing out for Thumb targets while lowering formal
parameters, but there used to be some other checks before it, which
could've caused some functions (e.g. those without formal parameters) to
sneak through unnoticed.

llvm-svn: 317312
2017-11-03 10:30:12 +00:00
Sam Parker 242052c6b4 [ARM] and, or, xor and add with shl combine
The generic dag combiner will fold:

(shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
(shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)

This can create constants which are too large to use as an immediate.
Many ALU operations are also able of performing the shl, so we can
unfold the transformation to prevent a mov imm instruction from being
generated.

Other patterns, such as b + ((a << 1) | 510), can also be simplified
in the same manner.

Differential Revision: https://reviews.llvm.org/D38084

llvm-svn: 317197
2017-11-02 10:43:10 +00:00
Roger Ferrer Ibanez 9dfbc10522 Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY"
That change causes PR35103, so reverting until I figure it out.

llvm-svn: 317092
2017-11-01 14:06:57 +00:00
Javed Absar 5cde1ccb29 [GlobalISel|ARM] : Allow legalizing G_FSUB
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261

llvm-svn: 316902
2017-10-30 13:51:56 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
David Blaikie 8699f71310 Add a few missing headers for modularization/IWYU/etc
Several cases where class definitions are required for DenseMap pointer
traits handling.

llvm-svn: 316803
2017-10-27 22:12:46 +00:00
David Blaikie 6265130054 InstructionSelectorImpl.h: Modularize/remove ODR violations by using a static member function to expose the debug name
llvm-svn: 316715
2017-10-26 23:39:54 +00:00
Eli Friedman d5dfb62de7 [ARM] Honor -mfloat-abi for libcall calling convention
As far as I can tell, this matches gcc: -mfloat-abi determines the
calling convention for all functions except those explicitly defined as
soft-float in the ARM RTABI.

This change only affects cases where the user specifies -mfloat-abi to
override the default calling convention derived from the target triple.

Fixes https://bugs.llvm.org//show_bug.cgi?id=34530.

Differential Revision: https://reviews.llvm.org/D38299

llvm-svn: 316708
2017-10-26 21:42:32 +00:00
Yichao Yu 221dae31a5 Clear LastMappingSymbols and LastEMS(Info) when resetting the ARM(AArch64)ELFStreamer
Summary:
This causes a segfault on ARM when (I think) the pass manager is used multiple times.

Reset set the (last) current section to NULL without saving the corresponding LastEMSInfo back into the map. The next use of the streamer then save the LastEMSInfo for the NULL section leaving the LastEMSInfo mapping for the last current section (the one that was there before the reset) NULL which cause the LastEMSInfo to be set to NULL when the section is being used again.

The reuse of the section (pointer) might mean that the map was holding dangling pointers previously which is why I went for clearing the map and resetting the info, making it as similar to the state right after the constructor run as possible. The AArch64 one doesn't have segfault (since LastEMS isn't a pointer) but it seems to have the same issue.

The segfault is likely caused by https://reviews.llvm.org/D30724 which turns LastEMSInfo into a pointer. As mentioned above, it seems that the actual issue was older though.

No test is included since the test is believed to be too complicated for such an obvious fix and not worth doing.

Reviewers: llvm-commits, shankare, t.p.northover, peter.smith, rengolin

Reviewed By: rengolin

Subscribers: mgorny, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38588

llvm-svn: 316679
2017-10-26 17:36:43 +00:00
Craig Topper 0551556ed2 [AsmParser][TableGen] Add VariantID argument to the generated mnemonic spell check function so it can use the correct table based on variant.
I'm considering implementing the mnemonic spell checker for x86, and that would require the separate intel and att variants.

llvm-svn: 316641
2017-10-26 06:46:41 +00:00
Craig Topper 2a06028c0a [AsmParser][TableGen] Make the generated mnemonic spell checker function a file local static function.
Also only emit in targets that specificially request it. This is required so we don't get an unused static function error.

llvm-svn: 316640
2017-10-26 06:46:40 +00:00
Diana Picus b35022121d [ARM GlobalISel] Fix call opcodes
We were generating BLX for all the calls, which was incorrect in most
cases. Update ARMCallLowering to generate BL for direct calls, and BLX,
BX_CALL or BMOVPCRX_CALL for indirect calls.

llvm-svn: 316570
2017-10-25 11:42:40 +00:00
Sam Parker 1f742117bd [ARM] OrCombineToBFI function
Extract the functionality to combine OR to BFI into its own function.

Differential Revision: https://reviews.llvm.org/D39001

llvm-svn: 316563
2017-10-25 08:37:33 +00:00
Sam Parker ccb209bb97 [ARM] Swap cmp operands for automatic shifts
Swap the compare operands if the lhs is a shift and the rhs isn't,
as in arm and T2 the shift can be performed by the compare for its
second operand.

Differential Revision: https://reviews.llvm.org/D39004

llvm-svn: 316562
2017-10-25 08:33:06 +00:00
David Blaikie c70b392e49 ARMAddressingModes.h: Don't mark header functions as file local
llvm-svn: 316517
2017-10-24 21:29:21 +00:00
Oliver Stannard 03ded27bbc [ARM] Error for invalid shift in memory operand
Report a diagnostic when we fail to parse a shift in a memory operand because
the shift type is not an identifier. Without this, we were silently ignoring
the whole instruction.

Differential revision: https://reviews.llvm.org/D39237

llvm-svn: 316441
2017-10-24 14:19:08 +00:00
Oliver Stannard ce256a3a01 [ARM] Replace development diagnostics with normal DEBUG macro
* Remove the -arm-asm-parser-dev-diags option.
* Use normal DEBUG(dbgs()) printing for the extra development information about
  missing diagnostics.

Differential Revision: https://reviews.llvm.org/D39194

llvm-svn: 316423
2017-10-24 09:46:56 +00:00
Oliver Stannard 6d5a5b98ab [ARM] tSETEND needs IsThumb
This is the Thumb encoding, so the Requires list must include IsThumb.

No test because we happen to select the ARM one first, but that's just luck.

Differential Revision: https://reviews.llvm.org/D39190

llvm-svn: 316421
2017-10-24 09:03:33 +00:00
Oliver Stannard c507b370a1 [ARM] Remove tCPS alias which just crashed
This alias caused a crash when trying to print the "cps #0" instruction in a
diagnostic for thumbv6 (which doesn't have that instruction).
	    
The comment was incorrect, this instruction is UNPREDICTABLE if no flag bits
are set, so I don't think it's worth keeping.

Differential Revision: https://reviews.llvm.org/D39191

llvm-svn: 316420
2017-10-24 08:55:36 +00:00
Sam Parker 487ab86942 [ARM] Allow unrolling of multi-block loops.
Before, loop unrolling was only enabled for loops with a single
block. This restriction has been removed and replaced by:
- allow a maximum of two exiting blocks,
- a four basic block limit for cores with a branch predictor.

Differential Revision: https://reviews.llvm.org/D38952

llvm-svn: 316313
2017-10-23 08:05:14 +00:00
Momchil Velikov d6a4ab3d49 [ARM] Dynamic stack alignment for 16-bit Thumb
This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When
targeting processors, which support only the 16-bit Thumb instruction set
the compiler ignores the alignment attributes of automatic variables and may
silently generate incorrect code.

Differential revision: https://reviews.llvm.org/D38143

llvm-svn: 316289
2017-10-22 11:56:35 +00:00
Eugene Leviant 27b226fb65 [ARM] Use post-RA MI scheduler when +use-misched is set
Differential revision: https://reviews.llvm.org/D39100

llvm-svn: 316214
2017-10-20 14:29:17 +00:00
Andre Vieira d4a25707f0 [ARM] Fix disassembly for conditional VMRS and VMSR instructions in ARM mode
Differential Revision: https://reviews.llvm.org/D38347

llvm-svn: 316085
2017-10-18 14:47:37 +00:00
NAKAMURA Takumi 6f43bd4bde Untabify.
llvm-svn: 316079
2017-10-18 13:31:28 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Matthias Braun bb8507e63c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Matthias Braun 3a9c114b24 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Lang Hames 2241ffa43c [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that,
and allows us to remove the last instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315531
2017-10-11 23:34:47 +00:00
Oliver Stannard 4191b9eaea [Asm] Add debug tracing in table-generated assembly matcher
This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.

The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.

llvm-svn: 315445
2017-10-11 09:17:43 +00:00
Lang Hames 02d330548d [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that,
and allows us to remove another instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315410
2017-10-11 01:57:21 +00:00
Lang Hames 232cdb48fc [MC] Add another missing <memory> include left out of r315327.
llvm-svn: 315332
2017-10-10 16:59:01 +00:00
Lang Hames 60fbc7cc38 [MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriter
functions.

This makes the ownership of the resulting MCObjectWriter clear, and allows us
to remove one instance of MCObjectStreamer's bizarre "holding ownership via
someone else's reference" trick.

llvm-svn: 315327
2017-10-10 16:28:07 +00:00
Oliver Stannard 30b732c942 [ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs
Previously, the code that implemented the GNU assembler aliases for the
LDRD and STRD instructions (where the second register is omitted)
assumed that the input was a valid instruction. This caused assertion
failures for every example in ldrd-strd-gnu-bad-inst.s.

This improves this code so that it bails out if the instruction is not
in the expected format, the check bails out, and the asm parser is run
on the unmodified instruction.

It also relaxes the alias on thumb targets, so that unaligned pairs of
registers can be used. The restriction that Rt must be even-numbered
only applies to the ARM versions of these instructions.

Differential revision: https://reviews.llvm.org/D36732

llvm-svn: 315305
2017-10-10 12:38:22 +00:00
Oliver Stannard cd3306f62f [ARM, Asm] Add diagnostics for floating-point register operands
This adds diagnostic strings for the ARM floating-point register
classes, which will be used when these classes are expected by the
assembler, but the provided operand is not valid.

One of these, DPR, requires C++ code to select the correct error
message, as that class contains different registers depending on the
FPU. The rest can all have their diagnostic strings stored in the
tablegen decription of them.

Differential revision: https://reviews.llvm.org/D36693

llvm-svn: 315304
2017-10-10 12:35:09 +00:00
Oliver Stannard bbad419e94 [ARM, Asm] Add diagnostics for general-purpose register operands
This adds diagnostic strings for the ARM general-purpose register
classes, which will be used when these classes are expected by the
assembler, but the provided operand is not valid.

One of these, rGPR, requires C++ code to select the correct error
message, as that class contains different registers in pre-v8 and v8
targets. The rest can all have their diagnostic strings stored in the
tablegen description of them.

Differential revision: https://reviews.llvm.org/D36692

llvm-svn: 315303
2017-10-10 12:31:53 +00:00
Lang Hames 77dff39cb4 [MC] Plumb unique_ptr<MCWinCOFFObjectTargetWriter> through
createWinCOFFObjectWriter to WinCOFFObjectWriter's constructor.

Fixes the same ownership issue for COFF that r315245 did for MachO:
WinCOFFObjectWriter takes ownership of its MCWinCOFFObjectTargetWriter, so we
want to pass this through to the constructor via a unique_ptr, rather than a
raw ptr.

llvm-svn: 315257
2017-10-10 00:50:29 +00:00
Lang Hames dcb312bdb9 [MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter to
ELFObjectWriter's constructor.

Fixes the same ownership issue for ELF that r315245 did for MachO:
ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to
pass this through to the constructor via a unique_ptr, rather than a raw ptr.

llvm-svn: 315254
2017-10-09 23:53:15 +00:00
Lang Hames 9b206a7d60 [MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriter
to MCObjectWriter's constructor.

MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this
patch plumbs that ownership relationship through the constructor (which
previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter
function.

llvm-svn: 315245
2017-10-09 22:38:13 +00:00
Aditya Nandakumar c3bfc81a1f [GISel]: Fix generation of illegal COPYs during CallLowering
We end up creating COPY's that are either truncating/extending and this
should be illegal.

https://reviews.llvm.org/D37640

Patch for X86 and ARM by igorb, rovka

llvm-svn: 315240
2017-10-09 20:07:43 +00:00
Diana Picus e393bc72ee [ARM] GlobalISel: Select shifts
Unfortunately TableGen doesn't handle this yet:
Unable to deduce gMIR opcode to handle Src (which is a leaf).

Just add some temporary hand-written code to generate the proper MOVsr.

llvm-svn: 315071
2017-10-06 15:39:16 +00:00
Diana Picus a81a4b17e5 [ARM] GlobalISel: Map shift operands to GPRs
llvm-svn: 315067
2017-10-06 14:52:43 +00:00
Diana Picus 2c95730450 [ARM] GlobalISel: Mark shifts as legal for s32
The new legalize combiner introduces shifts all over the place, so we
should support them sooner rather than later.

llvm-svn: 315064
2017-10-06 14:30:05 +00:00
Oliver Stannard 878216dd05 [ARM] Add diag string for movw/movt immediates in assembly
This adds diagnostics for invalid immediate operands to the MOVW and MOVT
instructions (ARM and Thumb).

Differential revision: https://reviews.llvm.org/D31879

llvm-svn: 314888
2017-10-04 09:24:54 +00:00
Oliver Stannard 5a7aae3a80 [ARM, Asm] Change grammar of immediate operand diagnostics
Currently, our diagnostics for assembly operands are not consistent.
Some start with (for example) "immediate operand must be ...",
and some with "operand must be an immediate ...". I think the latter
form is preferable for a few reasons:
* It's unambiguous that it is referring to the expected type of operand, not
  the type the user provided. For example, the user could provide an register
  operand, and get a message taking about an operand is if it is already an
  immediate, just not in the accepted range.
* It allows us to have a consistent style once we add diagnostics for operands
  that could take two forms, for example a label or pc-relative memory operand.

Differential revision: https://reviews.llvm.org/D36689

llvm-svn: 314887
2017-10-04 09:18:07 +00:00
Oliver Stannard 0d5c792223 [ARM] Use table-gen'd assembly operand diags in ARM asm parser
This switches the ARM AsmParser to use assembly operand diagnostics from
tablegen, rather than a switch statement on the ARMMatchResultTy. It
moves the existing diagnostic strings to tablegen, but adds no new ones,
so this is NFC except for one diagnostic string that had an off-by-1 error
in the hand-written switch statement.

Differential revision: https://reviews.llvm.org/D31607

llvm-svn: 314804
2017-10-03 14:38:52 +00:00
Oliver Stannard 55114fd9f0 [ARM, Asm] Use correct source location for register tokens
tryParseRegister advances the lexer, so we need to take copies of the start and
end locations of the register operand before calling it.

Previously, the caret in the diagnostic pointer to the comma after the r0
operand in the test, rather than the start of the operand.

Differential revision: https://reviews.llvm.org/D31537

llvm-svn: 314799
2017-10-03 14:30:58 +00:00
Oliver Stannard 68aa7de517 [ARM, Asm] Fix ubsan failure caused by out-of-range enum value
In this code, we use ~0U as a sentinel value for any operand class that doesn't
have a user-friendly error message, but this value isn't in range of the
MatchClassKind enum, so we need to ensure it does not get passed to isSubclass.

llvm-svn: 314793
2017-10-03 12:45:18 +00:00
Oliver Stannard 5daee987fd [ARM, Asm] Remove dead code causing MSan failure.
r314779 caused ErrorInfo to be red uninitialised, but also made this code dead,
so it can just be removed.

llvm-svn: 314791
2017-10-03 12:28:28 +00:00
Oliver Stannard e093bad472 [ARM] Use new assembler diags for ARM
This converts the ARM AsmParser to use the new assembly matcher error
reporting mechanism, which allows errors to be reported for multiple
instruction encodings when it is ambiguous which one the user intended
to use.

By itself this doesn't improve many error messages, because we don't have
diagnostic text for most operand types, but as we add that then this will allow
more of those diagnostic strings to be used when they are relevant.

Differential revision: https://reviews.llvm.org/D31530

llvm-svn: 314779
2017-10-03 10:26:11 +00:00
Sjoerd Meijer 7a22a4948f ISel type legalization: add debug messages. NFCI.
This adds some more debug messages to the type legalizer and functions
like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make
the debug messages a little bit more informative and useful.

Differential Revision: https://reviews.llvm.org/D38450

llvm-svn: 314773
2017-10-03 08:54:15 +00:00
Jonas Paulsson c9e363ac69 [SystemZ] implement shouldCoalesce()
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.

If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).

This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610

llvm-svn: 314516
2017-09-29 14:31:39 +00:00
Sam Parker 963da5b119 [ARM] v8.3-a complex number support
New instructions are added to AArch32 and AArch64 to aid
floating-point multiplication and addition of complex numbers, where
the complex numbers are packed in a vector register as a pair of
elements. The Imaginary part of the number is placed in the more
significant element, and the Real part of the number is placed in the
less significant element.

This patch adds assembler for the ARM target.

Differential Revision: https://reviews.llvm.org/D36789

llvm-svn: 314511
2017-09-29 13:11:33 +00:00
Matthias Braun 51687912a4 ARM: Fix cases where CSI Restored bit is not cleared
LR is an untypical callee saved register in that it is restored into a
different register (PC) and thus does not live-out of the return block.
This case requires the `Restored` flag in CalleeSavedInfo to be cleared.

This fixes a number of cases where this wasn't handled correctly yet.

llvm-svn: 314471
2017-09-28 23:12:06 +00:00
Martin Storsjo d6218cc385 [ARM] Restore the right frame pointer register in Int_eh_sjlj_longjmp
In setupEntryBlockAndCallSites in CodeGen/SjLjEHPrepare.cpp,
we fetch and store the actual frame pointer, but on return via
the longjmp intrinsic, it always was restored into the r7 variable.

On windows, the frame pointer should be restored into r11 instead of r7.

On Darwin (where sjlj exception handling is used by default), the frame
pointer is always r7, both in arm and thumb mode, and likewise, on
windows, the frame pointer always is r11.

On linux however, if sjlj exception handling is enabled (which it isn't
by default), libcxxabi and the user code can be built in differing modes
using different registers as frame pointer. Therefore, when restoring
registers on a platform where we don't always use the same register
depending on code mode, restore both r7 and r11.

Differential Revision: https://reviews.llvm.org/D38253

llvm-svn: 314451
2017-09-28 19:04:30 +00:00
Martin Storsjo adceba59a2 [ARM] Fix SJLJ exception handling when manually chosen on a platform where it isn't default
Differential Revision: https://reviews.llvm.org/D38252

llvm-svn: 314450
2017-09-28 19:04:14 +00:00
Sam Parker 211f47aa37 [ARM] isTruncateFree fix
I implemented isTruncateFree in rL313533, this patch fixes the logic
to match my comment, as the previous logic was too general. Now the
only truncates that are free are i64 -> i32.

Differential Revision: https://reviews.llvm.org/D38234

llvm-svn: 314280
2017-09-27 08:30:45 +00:00
Eli Friedman edee9999c4 Revert r312724 ("[ARM] Remove redundant vcvt patterns.").
It leads to some improvements, but also a regression for the simple
case, so it's not clearly a good idea.

test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference.

Ultimately, the right solution is probably to custom-lower fp-to-int
conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast.
It's hard to do the right thing when the implicit bitcast isn't visible
to DAG transforms.

llvm-svn: 314169
2017-09-25 22:07:33 +00:00
Craig Topper 5bc10ede53 [SelectionDAG] Teach simplifyDemandedBits to handle shifts by constant splat vectors
This teach simplifyDemandedBits to handle constant splat vector shifts.

This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount.

I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here.

I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change.

Differential Revision: https://reviews.llvm.org/D37665

llvm-svn: 314139
2017-09-25 19:26:08 +00:00
Arnold Schwaighofer b45717adda ARM: One more fix for swifterror CSR set
We use a differently ordered CSR set if the frame pointer is pushed. Add a
matching ..._SwiftError version.

llvm-svn: 314128
2017-09-25 17:51:33 +00:00
Benjamin Kramer a23c1a37d0 [ARM] Fix -Wdangling-else warning.
A ternary is clearer here. No functionality change.

llvm-svn: 314123
2017-09-25 17:35:38 +00:00
Arnold Schwaighofer ae4de58a5b ARM: Use the proper swifterror CSR list on platforms other than darwin
Noticed by inspection

llvm-svn: 314121
2017-09-25 17:19:50 +00:00
Andre Vieira 640527f7f1 [ARM] Fix assembly and disassembly for VMRS/VMSR
Reviewed by: t.p.northover
Differential Revision: https://reviews.llvm.org/D36306

llvm-svn: 313979
2017-09-22 12:17:42 +00:00
Simon Pilgrim 2b1c3bb25d [ARM] Add missing selection patterns for vnmla
For the following function:

  double fn1(double d0, double d1, double d2) {
    double a = -d0 - d1 * d2;
    return a;
  }

on ARM, LLVM generates code along the lines of

  vneg.f64  d0, d0
  vmls.f64  d0, d1, d2

i.e., a negate and a multiply-subtract.

The attached patch adds instruction selection patterns to allow it to generate the single instruction

  vnmla.f64  d0, d1, d2

(multiply-add with negation) instead, like GCC does.

Committed on behalf of @gergo- (Gergö Barany)

Differential Revision: https://reviews.llvm.org/D35911

llvm-svn: 313972
2017-09-22 09:50:52 +00:00
Eugene Zelenko 076468c0d0 [ARM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 313823
2017-09-20 21:35:51 +00:00
Jonathan Roelofs 85908aa84b [ARM] Relax 'cpsie'/'cpsid' flag parsing.
The ARM docs suggest in examples that the flags can have either case, and there
are applications in the wild that (libopencm3, for example) that expect to be
able to use the uppercase spelling.

https://reviews.llvm.org/D37953

llvm-svn: 313680
2017-09-19 21:23:19 +00:00
Roger Ferrer Ibanez 8d0180c955 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313618
2017-09-19 09:05:39 +00:00
Sam Parker 71efbe4c68 [ARM] Implement isTruncateFree
Implement the isTruncateFree hooks, lifted from AArch64, that are
used by TargetTransformInfo. This allows simplifycfg to reduce the
test case into a single basic block.

Differential Revision: https://reviews.llvm.org/D37516

llvm-svn: 313533
2017-09-18 14:28:51 +00:00
Sjoerd Meijer 4e6df15962 [ARM] Fix for indexed dot product instruction descriptions
The indexed dot product instructions only accept the lower 16 D-registers as
the indexed register, but we were e.g. incorrectly accepting:

vudot.u8 d16,d16,d18[0]

Differential Revision: https://reviews.llvm.org/D37968

llvm-svn: 313531
2017-09-18 14:17:57 +00:00
Hans Wennborg 8c1eb106bd Revert r313009 "[ARM] Use ADDCARRY / SUBCARRY"
This was causing PR34045 to fire again.

> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
>  - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
>  - lowering is done by first converting the boolean value into the carry flag
>    using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
>    using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
>    operations does the actual addition.
>  - for subtraction, given that ISD::SUBCARRY second result is actually a
>    borrow, we need to invert the value of the second operand and result before
>    and after using ARMISD::SUBE. We need to invert the carry result of
>    ARMISD::SUBE to preserve the semantics.
>  - given that the generic combiner may lower ISD::ADDCARRY and
>    ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
>    as well otherwise i64 operations now would require branches. This implies
>    updating the corresponding test for unsigned.
>  - add new combiner to remove the redundant conversions from/to carry flags
>    to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
>  - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192

Also revert follow-up r313010:

> [ARM] Fix typo when creating ISD::SUB nodes
>
> In D35192, I accidentally introduced a typo when creating ISD::SUB nodes,
> giving them two values instead of one.
>
> This fails when the merge_values combiner finds one of these nodes.
>
> This change fixes PR34564.
>
> Differential Revision: https://reviews.llvm.org/D37690

llvm-svn: 313044
2017-09-12 16:24:17 +00:00
Roger Ferrer Ibanez 9df2527b0b [ARM] Fix typo when creating ISD::SUB nodes
In D35192, I accidentally introduced a typo when creating ISD::SUB nodes,
giving them two values instead of one.

This fails when the merge_values combiner finds one of these nodes.

This change fixes PR34564.

Differential Revision: https://reviews.llvm.org/D37690

llvm-svn: 313010
2017-09-12 07:42:28 +00:00
Roger Ferrer Ibanez 4f92b4162f [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313009
2017-09-12 07:40:09 +00:00
Hans Wennborg 075e5a2e2b Revert r312898 "[ARM] Use ADDCARRY / SUBCARRY"
It caused PR34564.

> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
>  - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
>  - lowering is done by first converting the boolean value into the carry flag
>    using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
>    using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
>    operations does the actual addition.
>  - for subtraction, given that ISD::SUBCARRY second result is actually a
>    borrow, we need to invert the value of the second operand and result before
>    and after using ARMISD::SUBE. We need to invert the carry result of
>    ARMISD::SUBE to preserve the semantics.
>  - given that the generic combiner may lower ISD::ADDCARRY and
>    ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
>    as well otherwise i64 operations now would require branches. This implies
>    updating the corresponding test for unsigned.
>  - add new combiner to remove the redundant conversions from/to carry flags
>    to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
>  - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 312980
2017-09-11 23:52:02 +00:00
Andre Vieira c429aabb91 [ARM] Enable the use of SVC anywhere in an IT block
Differential Revision: https://reviews.llvm.org/D37374

llvm-svn: 312908
2017-09-11 11:11:17 +00:00
Roger Ferrer Ibanez 12b20f2307 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 312898
2017-09-11 07:38:05 +00:00
Benjamin Kramer 6ef976d5e1 [ARM] Remove redundant vcvt patterns.
These don't add any value as they're just compositions of existing
patterns. However, they can confuse the cost logic in ISel, leading to
duplicated vcvt instructions like in PR33199.

llvm-svn: 312724
2017-09-07 14:52:26 +00:00
Saleem Abdulrasool 5fba8ba9cc ARM: track globals promoted to coalesced const pool entries
Globals that are promoted to an ARM constant pool may alias with another
existing constant pool entry. We need to keep a reference to all globals
that were promoted to each constant pool value so that we can emit a
distinct label for each promoted global. These labels are necessary so
that debug info can refer to the promoted global without an undefined
reference during linking.

Patch by Stephen Crane!

llvm-svn: 312692
2017-09-07 04:00:13 +00:00
Matthias Braun c9056b834d Insert IMPLICIT_DEFS for undef uses in tail merging
Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.

To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:
- computeLiveIns() just computes the live-ins in a LivePhysRegs set.
- addLiveIns() applies the live-ins to a block live-in list.
- computeAndAddLiveIns() is a convenience function combining the other
  two functions and behaving like computeLiveIns() before this patch.

Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D37034

llvm-svn: 312668
2017-09-06 20:45:24 +00:00
Eli Friedman c22c699882 [ARM] Make ARMExpandPseudo add implicit uses for predicated instructions
Missing these could potentially screw up post-ra scheduling.

Issue found by inspection, so I don't have a real testcase. Included
test just verifies the expected operands after expansion.

Differential Revision: https://reviews.llvm.org/D35156

llvm-svn: 312589
2017-09-05 22:54:06 +00:00
Eli Friedman 06d0ee734a [ARM] Register ARMExpandPseudo pass.
This allows -run-pass etc. to refer to it.

(Split off from D35156.)

llvm-svn: 312587
2017-09-05 22:45:23 +00:00
Diana Picus ac15473cdd [ARM] GlobalISel: Minor cleanups in inst selector
Use the STI member of ARMInstructionSelector instead of
TII.getSubtarget() and also make use of STI's methods instead of
checking the object format manually.

llvm-svn: 312522
2017-09-05 08:22:47 +00:00
Diana Picus abb088691b [ARM] GlobalISel: Support global variables for RWPI
In RWPI code, globals that are not read-only are accessed relative to
the SB register (R9). This is achieved by explicitly generating an ADD
instruction between SB and an offset that we either load from a constant
pool or movw + movt into a register.

llvm-svn: 312521
2017-09-05 07:57:41 +00:00
Diana Picus f959791189 [ARM] GlobalISel: Support ROPI global variables
In the ROPI relocation model, read-only variables are accessed relative
to the PC. We use the (MOV|LDRLIT)_ga_pcrel pseudoinstructions for this.

llvm-svn: 312323
2017-09-01 11:13:39 +00:00
Oliver Stannard d771f6cb16 [ARM] Add 2-operand assembly aliases for Thumb1 ADD/SUB
This adds 2-operand assembly aliases for these instructions:
  add r0, r1    =>   add r0, r0, r1
  sub r0, r1    =>   sub r0, r0, r1

Previously this syntax was only accepted for Thumb2 targets, where the
wide versions of the instructions were used.

This patch allows the 2-operand syntax to be used for Thumb1 targets,
and selects the narrow encoding when it is used for Thumb2 targets.

Differential revision: https://reviews.llvm.org/D37377

llvm-svn: 312321
2017-09-01 10:47:25 +00:00
Diana Picus 1d101d76c0 Move static helper into ARMTargetLowering. NFC
This exposes the isReadOnly(GlobalValue *) in the ARMTargetLowering so
we can make use of it in GlobalISel as well.

llvm-svn: 312320
2017-09-01 10:44:48 +00:00
Sam Parker b036757f3d [ARM] Reverse PostRASched subtarget feature logic
Replace the UsePostRAScheduler SubtargetFeature with
DisablePostRAScheduler, which is then used by Swift and Cyclone.
This patch maintains enabling PostRA scheduling for other Thumb2
capable cores and/or for functions which are being compiled in Arm
mode.

Differential Revision: https://reviews.llvm.org/D37055

llvm-svn: 312226
2017-08-31 08:57:51 +00:00
Benjamin Kramer 79d53febcf [ARM] Replace fixed-size SmallSet with a bitset.
It's smaller. No functionality change.

llvm-svn: 312180
2017-08-30 22:28:30 +00:00
Brian Gesiak 3332976478 [ARM] Use Swift error registers on non-Darwin targets
Summary:
Remove a check for `ARMSubtarget::isTargetDarwin` when determining
whether to use Swift error registers, so that Swift errors work
properly on non-Darwin ARM32 targets (specifically Android).

Before this patch, generated code would save and restores ARM register r8 at
the entry and returns of a function that throws. As r8 is used as a virtual
return value for the object being thrown, this gets overwritten by the restore,
and calling code is unable to catch the error. In turn this caused Swift code
that used `do`/`try`/`catch` to work improperly on Android ARM32 targets.

Addresses Swift bug report https://bugs.swift.org/browse/SR-5438.

Patch by John Holdsworth.

Reviewers: manmanren, rjmccall, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: srhines, aschwaighofer, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35835

llvm-svn: 312164
2017-08-30 20:03:54 +00:00
Javed Absar 5766b8eea0 [ARM] - Tidy-up ARMAsmPrinter.cpp
Change to range-loop where missing.

Reviwewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37199

llvm-svn: 311993
2017-08-29 10:04:18 +00:00
Diana Picus c9f29c62cc [ARM] GlobalISel: Select globals in PIC mode
Support the selection of G_GLOBAL_VALUE in the PIC relocation model. For
simplicity we use the same pseudoinstructions for both Darwin and ELF:
(MOV|LDRLIT)_ga_pcrel(_ldr).

This is new for ELF, so it requires a small update to the ARM pseudo
expansion pass to make sure it adds the correct constant pool modifier
and add-current-address in the case of ELF.

Differential Revision: https://reviews.llvm.org/D36507

llvm-svn: 311992
2017-08-29 09:47:55 +00:00
Joerg Sonnenberger 0f76a35c5e Fix ARMv4 support
ARMv4 doesn't support the "BX" instruction, which has been introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.

Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.

Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.

Differential Revision: https://reviews.llvm.org/D37030

llvm-svn: 311921
2017-08-28 20:20:47 +00:00
Geoff Berry 75c4ae3066 [ARM] Fix bug in ARMLoadStoreOptimizer when kill flags are missing.
Summary:
ARMLoadStoreOpt::FixInvalidRegPairOp() was only checking if one of the
load destination registers to be split overlapped with the base register
if the base register was marked as killed.  Since kill flags may not
always be present, this can lead to incorrect code.

This bug was exposed by my MachineCopyPropagation change D30751 breaking
the sanitizer-x86_64-linux-android buildbot.

Also clean up some dead code and add an assert that a register offset is
never encountered by this code, since it does not handle them correctly.

Reviewers: MatzeB, qcolombet, t.p.northover

Subscribers: aemerson, javed.absar, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37164

llvm-svn: 311907
2017-08-28 19:03:45 +00:00
NAKAMURA Takumi a1e97a77f5 Untabify.
llvm-svn: 311875
2017-08-28 06:47:47 +00:00
Javed Absar b81fa9932a [ARM] Tidy-up condition-code support functions
Move condition code support functions to Utils and remove code duplication.

Reviewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37179

llvm-svn: 311860
2017-08-27 20:38:28 +00:00
Javed Absar 17ee7c0977 [ARM] Tidy-up ARMAsmParser. NFC.
Simplify getDRegFromQReg function

Reviewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37118

llvm-svn: 311850
2017-08-27 14:46:57 +00:00
Evgeny Astigeevich 540a39adf7 [ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes
ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes
for the Thumb1 target. This causes generation of redundant code and affects
performance.

This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106

Differential Revision: https://reviews.llvm.org/D36467

llvm-svn: 311649
2017-08-24 10:00:25 +00:00
Tim Northover 4bafa16748 ARM: use internal relocations for local symbols after all.
Switching to external relocations for ARM-mode branches (to allow Thumb
interworking when the offset is unencodable) causes calls to temporary symbols
to be miscompiled and instead go to the parent externally visible symbol.

Calling a temporary never happens in compiled code, but can occasionally in
hand-written assembly.

llvm-svn: 311611
2017-08-23 22:07:10 +00:00
Aditya Nandakumar efd8a84cd5 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Florian Hahn 214e13d949 [ARM] Add missing patterns for insert_subvector.
Summary: In some cases, shufflevector instruction can be transformed involving insert_subvector instructions. The ARM backend was missing some insert_subvector patterns, causing a failure during instruction selection. AArch64 has similar patterns.

Reviewers: t.p.northover, olista01, javed.absar, rengolin

Reviewed By: javed.absar

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36796

llvm-svn: 311543
2017-08-23 10:20:59 +00:00
Matthias Braun 55bc9b3f9e TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Sam Parker 6dc3fcb1c6 [ARM][AArch64] v8.3-A Javascript Conversion
Armv8.3-A adds instructions that convert a double-precision floating
point number to a signed 32-bit integer with round towards zero,
designed for improving Javascript performance.

Differential Revision: https://reviews.llvm.org/D36785

llvm-svn: 311448
2017-08-22 11:08:21 +00:00
Renato Golin f63d701669 [ARM] Call setBooleanContents(ZeroOrOneBooleanContent)
The ARM backend should call setBooleanContents so that it can
use known bits to make some optimizations.

Review: D35821

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311446
2017-08-22 11:02:37 +00:00
Chandler Carruth b866178067 Fix a typo in r311435.
llvm-svn: 311437
2017-08-22 09:20:52 +00:00
Alex Bradbury 080f6976c0 Use report_fatal_error for unsupported calling conventions
The calling convention can be specified by the user in IR. Failing to support 
a particular calling convention isn't a programming error, and so relying on 
llvm_unreachable to catch and report an unsupported calling convention is not 
appropriate.

Differential Revision: https://reviews.llvm.org/D36830

llvm-svn: 311435
2017-08-22 09:11:41 +00:00
Sam Parker b252ffd2cc [ARM][AArch64] Cortex-A75 and Cortex-A55 support
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667

llvm-svn: 311316
2017-08-21 08:43:06 +00:00
Martin Storsjo d606d2a643 [ARM] Factorize the calculation of WhichResult in isV*Mask. NFC.
Differential Revision: https://reviews.llvm.org/D36930

llvm-svn: 311260
2017-08-19 20:26:51 +00:00
Martin Storsjo 91522ffa12 [ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

llvm-svn: 311258
2017-08-19 19:47:48 +00:00
Matthias Braun 91bd3ad128 ARMRegsiterInfo: Define more ssub indexes; NFC
This doesn't really change anything as Tablegen would have inferred
those indices anyway; defining them gives us shorter names that are
easier to read while debugging (i.e. "ssub_4" rather than
"dsub2_then_ssub_0")

llvm-svn: 311218
2017-08-19 01:21:11 +00:00
Tim Northover 14302fcb24 ARM: use an external relocation for calls from MachO ARM mode.
The internal (__text-relative) relocation risks the offset not being encodable
if the destination is Thumb.

llvm-svn: 311187
2017-08-18 19:13:56 +00:00
Sam Parker 04a7db5915 [ARM] Add PostRAScheduler option
This patch adds the option to allow also using the PostRA scheduler,
which brings the ARM backend inline with AArch64 targets. The
SchedModel can also set 'PostRAScheduler', as the R52 does, so also
query this property in the overridden function.

Differential Revision: https://reviews.llvm.org/D36866

llvm-svn: 311162
2017-08-18 14:27:51 +00:00
Saleem Abdulrasool dd8c16b58e ARM: mark CPSR as clobbered for Windows VLAs
When lowering a VLA, we emit a __chstk call.  However, this call can
internally clobber CPSR.  We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`.  In such a case, the CPSR could be clobbered, and the
check invalidated.  When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case.  Mark the register as clobbered to fix a possible
state corruption.

llvm-svn: 311061
2017-08-17 02:42:24 +00:00
Sam Parker 84fd0c3bf2 [ARM] Improve loop unrolling for Cortex-M
- Set the default runtime unroll count to 4 and use the newly added
  UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.

Differential Revision: https://reviews.llvm.org/D36134

llvm-svn: 310997
2017-08-16 07:42:44 +00:00
Quentin Colombet 61d71a138b Reapply "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310425, thus reapplying r310335 with a fix for link
issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON.

Original commit message:
[GlobalISel] Remove the GISelAccessor API.

Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.

NFC.

----
The fix for the link issue consists in adding the GlobalISel library in
the list of dependencies for the AArch64 unittests. This dependency
comes from the use of AArch64Subtarget that needs to know how
to destruct the GISel related APIs when being detroyed.

Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and
understand the problem.

llvm-svn: 310969
2017-08-15 22:31:51 +00:00
Javed Absar 37b2286564 [ARM] Tidy-up Cortex-A15 DPR-SPR optimizer implementation
Modernise the code with range-loops etc

Reviewed by: @fhahn, @rovka
Differential Revision: https://reviews.llvm.org/D36502

llvm-svn: 310807
2017-08-14 01:38:01 +00:00
Craig Topper 2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Florian Hahn a5ba4ee8bc [Triple] Add isThumb and isARM functions.
Summary:
isThumb returns true for Thumb triples (little and big endian), isARM
returns true for ARM triples (little and big endian).
There are a few more checks using arm/thumb that are not covered by
those functions, e.g. that the architecture is either ARM or Thumb
(little endian) or ARM/Thumb little endian only.

Reviewers: javed.absar, rengolin, kristof.beyls, t.p.northover

Reviewed By: rengolin

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D34682

llvm-svn: 310781
2017-08-12 17:40:18 +00:00
Sjoerd Meijer 7426c97bc6 [ARM] Assembler support for the ARMv8.2a dot product instructions
Commit r310480 added the AArch64 ARMv8.2a dot product instructions;
this adds the AArch32 instructions.

Differential Revision: https://reviews.llvm.org/D36575

llvm-svn: 310701
2017-08-11 09:52:30 +00:00
Eli Friedman e9499dd4c7 [ARM] Clarify legal addressing modes for ARM and Thumb2. NFC
The existing code is very clever, but not clear, which seems
like the wrong tradeoff here.

Differential Revision: https://reviews.llvm.org/D36559

llvm-svn: 310653
2017-08-10 19:31:27 +00:00
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Sam Parker 9d95764c3b [ARM][AArch64] ARMv8.3-A enablement
The beta ARMv8.3 ISA specifications have been released for AArch64
and AArch32, these can be found at:
https://developer.arm.com/products/architecture/a-profile/exploration-tools

An introduction to this architecture update can be found at:
https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions

This patch is the first in a series which will add ARM v8.3-A support
in LLVM and Clang. It adds the necessary changes that create targets
for both the ARM and AArch64 backends.

Differential Revision: https://reviews.llvm.org/D36514

llvm-svn: 310561
2017-08-10 09:41:00 +00:00
Matthias Braun a88587ce0c ARM: Fix CMP_SWAP expansion
Clean up after my misguided attempt in r304267 to "fix" CMP_SWAP
returning an uninitialized status value.

- I was always using tMOVi8 to zero the status register which cannot
  encode higher register numbers and llvm would silently miscompile)

- Nobody was ever looking at that status value outside the expansion.
  ARMDAGToDAGISel::SelectCMP_SWAP() the only place creating CMP_SWAP
  instructions was not mapping anything to it. (The cmpxchg status value
  from llvm IR is lowered to a manual comparison after the CMP_SWAP)

So this:
- Renames the register from "status" to "temp" it make it obvious that
  it isn't used outside the expansion.
- Remove the zeroing status/temp register.
- Keep the live-in list improvements from r304267

Fixes http://llvm.org/PR34056

llvm-svn: 310534
2017-08-09 22:22:05 +00:00
Florian Hahn d68bc7ae8d [ARM] Emit error when ARM exec mode is not available.
Summary:
A similar error message has been removed from the ARMTargetMachineBase
constructor in r306939. With this patch, we generate an error message
for the example below, compiled with -mcpu=cortex-m0, which does not
have ARM execution mode.

    __attribute__((target("arm"))) int foo(int a, int b)
    {
        return a + b % a;
    }

    __attribute__((target("thumb"))) int bar(int a, int b)
    {
        return a + b % a;
    }

By adding this error message to ARMBaseTargetMachine::getSubtargetImpl,
we can deal with functions that set -thumb-mode in target-features.
At the moment it seems like Clang does not have access to target-feature
specific information, so adding the error message to the frontend will
be harder.

Reviewers: echristo, richard.barton.arm, t.p.northover, rengolin, efriedma

Reviewed By: echristo, efriedma

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D35627

llvm-svn: 310486
2017-08-09 15:39:10 +00:00
Florian Hahn fc4b3951e9 [ARM] Remove FeatureNoARM implies ModeThumb.
Summary:
By removing FeatureNoARM implies ModeThumb, we can detect cases where a
function's target-features contain -thumb-mode (enables ARM codegen for the
function), but the architecture does not support ARM mode. Previously, the
implication caused the FeatureNoARM bit to be cleared for functions with
-thumb-mode, making the assertion in ARMSubtarget::ARMSubtarget [1]
pointless for such functions.

This assertion is the only guard against generating ARM code for
architectures without ARM codegen support. Is there a place where we
could easily generate error messages for the user? At the moment, we
would generate ARM code for Thumb-only architectures. X86 has the same
behavior as ARM, as in it only has an assertion and no error message,
but I think for ARM an error message would be helpful. What do you
think?

For the example below, `llc -mtriple=armv7m-eabi test.ll -o -` will
generate ARM assembler (or fail with an assertion error with this patch).
Note that if we run the resulting assembler through llvm-mc, we get
an appropriate error message, but not when codegen is handled
through clang.

```
define void @bar() #0 {
entry:
  ret void
}

attributes #0 = { "target-features"="-thumb-mode" }
```

[1] c1f7b54cef/lib/Target/ARM/ARMSubtarget.cpp (L147)

Reviewers: t.p.northover, rengolin, peter.smith, aadg, silviu.baranga, richard.barton.arm, echristo

Reviewed By: rengolin, echristo

Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35569

llvm-svn: 310476
2017-08-09 13:53:28 +00:00
Quentin Colombet 8dd90fb54b Revert "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310115.

It causes a linker failure for the one of the unittests of AArch64 on one
of the linux bot:
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429

: && /home/fedora/gcc/install/gcc-7.1.0/bin/g++   -fPIC
-fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W
-Wno-unused-parameter -Wwrite-strings -Wcast-qual
-Wno-missing-field-initializers -pedantic -Wno-long-long
-Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment
-ffunction-sections -fdata-sections -O2
-L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined
-Wl,-O3 -Wl,--gc-sections
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o  -o
unittests/Target/AArch64/AArch64Tests
lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn
lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn
lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn
lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn
lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread
lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread
-Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib
&& :
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0):
undefined reference to `vtable for llvm::LegalizerInfo'
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8):
undefined reference to `vtable for llvm::RegisterBankInfo'

The particularity of this bot is that it is built with
BUILD_SHARED_LIBS=ON

However, I was not able to reproduce the problem so far.
Reverting to unblock the bot.

llvm-svn: 310425
2017-08-08 22:22:30 +00:00
Tim Northover f370f2e3c6 Revert "[ARM] Fix assembly and disassembly for VMRS/VMSR"
This reverts r310243. Only MVFR2 is actually restricted to v8 and it'll be a
little while before we can get a proper fix together. Better that we allow
incorrect code than reject correct in the meantime.

llvm-svn: 310384
2017-08-08 17:16:46 +00:00
Andre Vieira 7dffb9bfa6 [ARM] Fix assembly and disassembly for VMRS/VMSR
This patch addresses two issues with assembly and disassembly for VMRS/VMSR:

1.currently VMRS/VMSR instructions accessing fpsid, mvfr{0-2} and fpexc, are
  accepted for non ARMv8-A targets.

2. all VMRS/VMSR instructions accept writing/reading to PC and SP, when only
   ARMv7-A and ARMv8-A should be allowed to write/read to SP and none to PC.

This patch addresses those issues and adds tests for these cases.

Differential Revision: https://reviews.llvm.org/D36306

llvm-svn: 310243
2017-08-07 08:41:05 +00:00
Florian Hahn d51a35e339 [ARM] The ARM backend is MachineVerifier clean now.
Summary: Thanks everyone involved in fixing the outstanding issues.

Reviewers: rovka, MatzeB, efriedma

Reviewed By: MatzeB

Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36153

llvm-svn: 310180
2017-08-05 15:14:06 +00:00
Quentin Colombet c046208c52 [GlobalISel] Remove the GISelAccessor API.
Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.

NFC.

llvm-svn: 310115
2017-08-04 20:15:46 +00:00
Javed Absar 9cda599151 [ARM] Use searchable-table for banked registers
This is a continuation of https://reviews.llvm.org/D36219

This patch uses reverse mapping (encoding->name) in
ARMInstPrinter::printBankedRegOperand to get rid of
hard-coded values (as pointed out by @olista01).

Reviewed by: @fhahn, @rovka, @olista01
Differential Revision: https://reviews.llvm.org/D36260

llvm-svn: 310072
2017-08-04 17:10:11 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Nico Weber bfde70b097 Revert r309923, it caused PR34045.
llvm-svn: 309950
2017-08-03 15:41:26 +00:00
Diana Picus 930e6ec8f3 [ARM] GlobalISel: Select simple G_GLOBAL_VALUE instructions
Add support in the instruction selector for G_GLOBAL_VALUE for ELF and
MachO for the static relocation model. We don't handle Windows yet
because that's Thumb-only, and we don't handle Thumb in general at the
moment.

Support for PIC, ROPI, RWPI and TLS will be added in subsequent commits.

Differential Revision: https://reviews.llvm.org/D35883

llvm-svn: 309927
2017-08-03 09:14:59 +00:00
Roger Ferrer Ibanez 854980341b [ARM] Use ADDCARRY / SUBCARRY
This patch:

- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
  using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value
  using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
  operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
  borrow, we need to invert the value of the second operand and result before
  and after using ARMISD::SUBE. We need to invert the carry result of
  ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
  ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering
  as well otherwise i64 operations now would require branches. This implies
  updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
  to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 309923
2017-08-03 07:45:10 +00:00
Rafael Espindola 79e238afee Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Javed Absar 054d1aef43 [ARM] Tidy up banked registers encoding
Moves encoding (SYSm) information of banked registers to ARMSystemRegister.td,
where it rightly belongs and forms a single point of reference in the code.

Reviewed by: @fhahn, @rovka, @olista01
Differential Revision: https://reviews.llvm.org/D36219

llvm-svn: 309910
2017-08-03 01:24:12 +00:00
Strahinja Petrovic 25e9e1b866 [ARM] Add the option to directly access TLS pointer
This patch enables choice for accessing thread local
storage pointer (like '-mtp' in gcc).

Differential Revision: https://reviews.llvm.org/D34408

llvm-svn: 309381
2017-07-28 12:54:57 +00:00
Matthias Braun c618a466f1 ARMFrameLowering: Only set ExtraCSSpill for actually unused registers.
The code assumed that unclobbered/unspilled callee saved registers are
unused in the function. This is not true for callee saved registers that are
also used to pass parameters such as swiftself.

rdar://33401922

llvm-svn: 309350
2017-07-28 01:36:32 +00:00
Florian Hahn e3583bdf91 [ARM] Add use-misched feature, to enable the MachineScheduler.
Summary:
This change makes it easier to experiment with the MachineScheduler in
the ARM backend and also makes it very explicit which CPUs use the
MachineScheduler (currently only swift and cyclone).



Reviewers: MatzeB, t.p.northover, javed.absar

Reviewed By: MatzeB

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35935

llvm-svn: 309316
2017-07-27 19:56:44 +00:00
Florian Hahn 67ddd1d08f [TargetParser] Use enum classes for various ARM kind enums.
Summary:
Using c++11 enum classes ensures that only valid enum values are used
for ArchKind, ProfileKind, VersionKind and ISAKind. This removes the
need for checks that the provided values map to a proper enum value,
allows us to get rid of AK_LAST and prevents comparing values from
different enums. It also removes a bunch of static_cast
from unsigned to enum values and vice versa, at the cost of introducing
static casts to access AArch64ARCHNames and ARMARCHNames by ArchKind.

FPUKind and ArchExtKind are the only remaining old-style enum in
TargetParser.h. I think it's beneficial to keep ArchExtKind as old-style
enum, but FPUKind can be converted too, but this patch is quite big, so
could do this in a follow-up patch. I could also split this patch up a
bit, if people would prefer that.

Reviewers: rengolin, javed.absar, chandlerc, rovka

Reviewed By: rovka

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35882

llvm-svn: 309287
2017-07-27 16:27:56 +00:00
Florian Hahn db479524dd [ARM] Mark labels in skipAlignedDPRCS2Spills as fallthrough (NFC).
The comment at the top of the switch statement indicates that the
fall-through behavior is intentional. By using LLVM_FALLTHROUGH,
-Wimplicit-fallthrough are silenced, which is enabled by default in GCC
7.

llvm-svn: 309272
2017-07-27 14:37:17 +00:00
Daniel Sanders 8e82af2be6 Re-commit: r309094 [globalisel][tablegen] Fuse the generated tables together.
Summary:
Now that we have control flow in place, fuse the per-rule tables into a
single table. This is a compile-time saving at this point. However, this will
also enable the optimization of a table so that similar instructions can be
tested together, reducing the time spent on the matching the code.

This is NFC in terms of externally visible behaviour but some internals have
changed slightly. State.MIs is no longer reset between each rule that is
attempted because it's not necessary to do so. As a consequence of this the
restriction on the order that instructions are added to State.MIs has been
relaxed to only affect recorded instructions that require new elements to be
added to the vector. GIM_RecordInsn can now write to any element from 1 to
State.MIs.size() instead of just State.MIs.size().

The compile-time regressions from the last commit were caused by the ARM target
including a non-const variable (zero_reg) in the table and therefore generating
an initializer for it. That variable is now const.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35681

llvm-svn: 309264
2017-07-27 11:03:45 +00:00
Evandro Menezes b3ed4bcb8f [ARM] Minor cosmetic edits (NFC)
Change the order of a case and the description for Exynos Mx processors.

llvm-svn: 309184
2017-07-26 21:28:20 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Rafael Espindola e06e4df8be Simplify. NFC.
llvm-svn: 309141
2017-07-26 17:27:27 +00:00
Diana Picus a5d6518e93 [ARM] GlobalISel: Map G_GLOBAL_VALUE to GPR
A G_GLOBAL_VALUE is basically a pointer, so it should live in the GPR.

llvm-svn: 309101
2017-07-26 11:01:13 +00:00
Diana Picus b1fd784936 [ARM] GlobalISel: Mark G_GLOBAL_VALUE as legal
llvm-svn: 309090
2017-07-26 09:25:15 +00:00
Zvi Rackover 1b73682243 TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI.
Changing mask argument type from const SmallVectorImpl<int>& to
ArrayRef<int>.

This came up in D35700 where a mask is received as an ArrayRef<int> and
we want to pass it to TargetLowering::isShuffleMaskLegal().
Also saves a few lines of code.

llvm-svn: 309085
2017-07-26 08:06:58 +00:00
Eric Christopher 97ae58686f Update the comments on default subtargets based on feedback.
llvm-svn: 309041
2017-07-25 22:21:08 +00:00
Sam Parker 19a08e42a8 [ARM] Enable partial and runtime unrolling
Enable runtime and partial loop unrolling of simple loops without
calls on M-class cores. The thresholds are calculated based on
whether the target is Thumb or Thumb-2.

Differential Revision: https://reviews.llvm.org/D34619

llvm-svn: 308956
2017-07-25 08:51:30 +00:00
Erich Keane d8f61f8f7e Remove Bitrig: LLVM Changes
Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned.

Differential Revision: https://reviews.llvm.org/D35707

llvm-svn: 308799
2017-07-21 22:48:47 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Javed Absar e9599e39fe [ARM] Simplify ExpandPseudoInst. NFC.
Remove headers not required and convert to range-loop

Reviewed by: @mcrosier
Differential Revision: https://reviews.llvm.org/D35626

llvm-svn: 308607
2017-07-20 12:35:37 +00:00
Javed Absar 2cb0c95031 [ARM] Unify handling of M-Class system registers
This patch cleans up and fixes issues in the M-Class system register handling:

1. It defines the system registers and the encoding (SYSm values) in one place:
   a new ARMSystemRegister.td using SearchableTable, thereby removing the
   hand-coded values which existed in multiple places.

2. Some system registers e.g. BASEPRI_MAX_NS which do not exist were being allowed!
   Ref: ARMv6/7/8M architecture reference manual.

Reviewed by: @t.p.northover, @olist01, @john.brawn
Differential Revision: https://reviews.llvm.org/D35209

llvm-svn: 308456
2017-07-19 12:57:16 +00:00
Javed Absar 5b8e487b47 [ARM|CodeGen] Improve the code in FastISel
Cleaned up the code in FastISel a bit.
Had to add make_range to MCInstrDesc as that was needed and seems missing.

Reviewed by: @t.p.northover
Differential Revision: https://reviews.llvm.org/D35494

llvm-svn: 308291
2017-07-18 10:19:48 +00:00
Diana Picus da25d5b8b0 [ARM] GlobalISel: Support G_(S|U)REM for s8 and s16
Widen to s32, and then do whatever Lowering/Custom/Libcall action the
subtarget wants.

llvm-svn: 308285
2017-07-18 10:07:01 +00:00
Javed Absar dd2c29ef83 [CodeGen] Add begin-end iterators to MachineInstr
Convert iteration over operands to range-loop.

Reviewed by: @rovka, @echristo
Differential Revision: https://reviews.llvm.org/D35419

llvm-svn: 308173
2017-07-17 13:15:26 +00:00
Hiroshi Inoue a9ee279e70 fix typos in comments; NFC
llvm-svn: 308126
2017-07-16 07:48:48 +00:00
Diana Picus 87a7067983 [ARM] GlobalISel: Support G_BRCOND
Insert a TSTri to set the flags and a Bcc to branch based on their
values. This is a bit inefficient in the (common) cases where the
condition for the branch comes from a compare right before the branch,
since we set the flags both as part of the compare lowering and as part
of the branch lowering. We're going to live with that until we settle on
a principled way to handle this kind of situation, which occurs with
other patterns as well (combines might be the way forward here).

llvm-svn: 308009
2017-07-14 09:46:06 +00:00
Sam Parker 2893448576 [ARM] Allow rematerialization of ARM Thumb literal pool loads
Constants are crucial for code size in the ARM Thumb-1 instruction
set. The 16 bit instruction size often does not offer enough space
for immediate arguments. This means that additional instructions are
frequently used to load constants into registers. Since constants are
hoisted, this can lead to significant register spillage if they are
used multiple times in a single function. This can be avoided by
rematerialization, i.e. recomputing a constant instead of reloading
it from the stack. This patch fixes the rematerialization of literal
pool loads in the ARM Thumb instruction set.

Patch by Philip Ginsbach

Differential Revision: https://reviews.llvm.org/D33936

llvm-svn: 308004
2017-07-14 08:23:56 +00:00
Eric Christopher 4e332c7cf1 Add a set of comments explaining why getSubtargetImpl() is deleted on these targets.
llvm-svn: 307999
2017-07-14 04:33:43 +00:00
Diana Picus c452175642 [ARM] GlobalISel: Support G_BR
This boils down to not crashing in reg bank select due to the lack of
register operands on this instruction, and adding some tests. The
instruction selection is already covered by the TableGen'erated code.

llvm-svn: 307904
2017-07-13 11:09:34 +00:00
Javed Absar d32f9c8190 [ARM] Tidy up and organise better ARM.td. NFC.
This patch tidies up and organises ARM.td
so that it is easier to understandand
and extend in the future.

Reviewed by: @hahn, @rovka
Differential Revision: https://reviews.llvm.org/D35248

llvm-svn: 307897
2017-07-13 10:24:30 +00:00
Diana Picus f69d7b0495 Fixup r307893: Silence warning
Silence unused variable warning in release builds.
*sigh*

llvm-svn: 307896
2017-07-13 09:52:06 +00:00
Diana Picus 6860a60c07 [ARM] GlobalISel: Move local variable. NFC
Move a local variable from outside a switch to inside every case that
needs it (which isn't all of the cases, of course).

llvm-svn: 307893
2017-07-13 09:30:08 +00:00
Florian Hahn 4adcfcf1d6 [ARM] Inline callee if its target-features are a subset of the caller
Summary:
Similar to X86, it should be safe to inline callees if their
target-features are a subset of the caller. As some subtarget features
provide different instructions depending on whether they are set or
unset (e.g. ThumbMode and ModeSoftFloat), we use a whitelist of
target-features describing hardware capabilities only.

Reviewers: kristof.beyls, rengolin, t.p.northover, SjoerdMeijer, peter.smith, silviu.baranga, efriedma

Reviewed By: SjoerdMeijer, efriedma

Subscribers: dschuff, efriedma, aemerson, sdardis, javed.absar, arichardson, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D34697

llvm-svn: 307889
2017-07-13 08:26:17 +00:00
John Brawn 97cc283117 [ARM] Adjust ifcvt heuristic for the diamond ifcvt case
When we have a diamond ifcvt the fallthough block will have a branch at the end
of it that disappears when predicated, so discount it from the predication cost.

Differential Revision: https://reviews.llvm.org/D34952

llvm-svn: 307788
2017-07-12 13:23:10 +00:00
Diana Picus 995746da03 [ARM] GlobalISel: Simplify inst selector code. NFC
Refactor CmpHelper into something simpler. It was overkill to use
templates for this - instead, use a simple CmpConstants structure to
hold the opcodes and other constants that are different when selecting
int / float / double comparisons. Also, extract some of the helpers that
were in CmpHelper into ARMInstructionSelector and make use of some of
them when selecting other things than just compares.

llvm-svn: 307766
2017-07-12 10:31:16 +00:00
Diana Picus 21014df5e0 [ARM] GlobalISel: Select s64 G_FCMP
Very similar to how we select s32 G_FCMP, the only thing that is
different is the exact opcodes that we use.

llvm-svn: 307763
2017-07-12 09:01:54 +00:00
Rafael Espindola 1beb702ba2 Fully fix the movw/movt addend.
The issue is not if the value is pcrel. It is whether we have a
relocation or not.

If we have a relocation, the static linker will select the upper
bits. If we don't have a relocation, we have to do it.

llvm-svn: 307730
2017-07-11 23:18:25 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Martin Storsjo 0e83e85f63 [ARM, ELF] Don't shift movt relocation offsets
For ELF, a movw+movt pair is handled as two separate relocations.
If an offset should be applied to the symbol address, this offset is
stored as an immediate in the instruction (as opposed to stored as an
offset in the relocation itself).

Even though the actual value stored in the movt immediate after linking
is the top half of the value, we need to store the unshifted offset
prior to linking. When the relocation is made during linking, the offset
gets added to the target symbol value, and the upper half of the value
is stored in the instruction.

This makes sure that movw+movt with offset symbols get properly
handled, in case the offset addition in the lower half should be
carried over to the upper half.

This makes the output from the additions to the test case match
the output from GNU binutils.

For COFF and MachO, the movw/movt relocations are handled as a pair,
and the overflow from the lower half gets carried over to the movt,
so they should keep the shifted offset just as before.

Differential Revision: https://reviews.llvm.org/D35242

llvm-svn: 307713
2017-07-11 21:07:10 +00:00
Diana Picus 069da27f49 [ARM] GlobalISel: Add reg mapping for s64 G_FCMP
Map the result into GPR and the operands into FPR.

llvm-svn: 307653
2017-07-11 11:47:45 +00:00
Peter Smith a2e5ecc1f3 [ARM] ldr pc,=expression should be allowed in Thumb2
This change allows the pc to be used as a destination register for the
pseudo instruction LDR pc,=expression . The pseudo instruction must not be
transformed into a MOV, but it can use the Thumb2 LDR (literal) instruction
to a constant pool entry. See (A7.7.43 from ARMv7M ARM ARM).

Differential Revision: https://reviews.llvm.org/D34751

llvm-svn: 307640
2017-07-11 09:47:12 +00:00
Diana Picus 443135c6eb [ARM] GlobalISel: Fix oversight in G_FCMP legalization
We used to forget to erase the original instruction when replacing a
G_FCMP true/false. Fix this bug and make sure the tests check for it.

llvm-svn: 307639
2017-07-11 09:43:51 +00:00
Diana Picus b57bba8316 [ARM] GlobalISel: Legalize s64 G_FCMP
Same as the s32 version, for both hard and soft float.

llvm-svn: 307633
2017-07-11 08:50:01 +00:00
Nirav Dave 4dcad5dc6b Add DAG argument to canMergeStoresTo NFC.
llvm-svn: 307583
2017-07-10 20:25:54 +00:00
Javed Absar fb3210aa05 [ARM] Tidy up ARMBaseRegisterInfo implementation. NFC
Clean up ARMBaseRegisterInfo implementation a bit.
Differential Revision: https://reviews.llvm.org/D35116

llvm-svn: 307531
2017-07-10 10:42:55 +00:00
Simon Pilgrim e2d84d953e [ARM] Fix -Wimplicit-fallthrough warning. NFCI.
llvm-svn: 307480
2017-07-08 18:42:04 +00:00
Simon Pilgrim cb07d67a5c Fix some more -Wimplicit-fallthrough warnings. NFCI.
llvm-svn: 307411
2017-07-07 16:40:06 +00:00
Matthew Simpson 12eaef75ce [ARM] Implement interleaved access bug fix from r306334
r306334 fixed a bug in AArch64 dealing with wide interleaved accesses having
pointer types. The bug also exists in ARM, so this patch copies over the fix.

llvm-svn: 307409
2017-07-07 16:15:05 +00:00
Simon Pilgrim ce1fb22c6a [Arm] Fix -Wimplicit-fallthrough warnings. NFCI.
llvm-svn: 307375
2017-07-07 10:05:45 +00:00
Diana Picus 77367378ac [ARM] GlobalISel: Fixup r307365
Rename member DebugLoc -> DbgLoc (so it doesn't conflict with the class
name).

llvm-svn: 307366
2017-07-07 08:53:27 +00:00
Diana Picus 5b91653840 [ARM] GlobalISel: Select hard G_FCMP for s32
We lower to a sequence consisting of:
- MOVi 0 into a register
- VCMPS to do the actual comparison and set the VFP flags
- FMSTAT to move the flags out of the VFP unit
- MOVCCi to either use the "zero register" that we have previously set
  with the MOVi, or move 1 into the result register, based on the values
  of the flags

As was the case with soft-float, for some predicates (one, ueq) we
actually need two comparisons instead of just one. When that happens, we
generate two VCMPS-FMSTAT-MOVCCi sequences and chain them by means of
using the result of the first MOVCCi as the "zero register" for the
second one. This is a bit overkill, since one comparison followed by
two non-flag-setting conditional moves should be enough. In any case,
the backend manages to CSE one of the comparisons away so it doesn't
matter much.

Note that unlike SelectionDAG and FastISel, we always use VCMPS, and not
VCMPES. This makes the code a lot simpler, and it also seems correct
since the LLVM Lang Ref defines simple true/false returns if the
operands are QNaN's. For SNaN's, even VCMPS throws an Invalid Operand
exception, so they won't be slipping through unnoticed.

Implementation-wise, this introduces a template so we can share the same
code that we use for handling integer comparisons, since the only
differences are in the details (exact opcodes to be used etc). Hopefully
this will be easy to extend to s64 G_FCMP.

llvm-svn: 307365
2017-07-07 08:39:04 +00:00
Diana Picus c3a9c34761 [ARM] GlobalISel: Map s32 G_FCMP in reg bank select
Map hard G_FCMP operands to FPR and the result to GPR.

llvm-svn: 307245
2017-07-06 09:57:46 +00:00
Diana Picus d0104eaae8 [ARM] GlobalISel: Legalize G_FCMP for s32
This covers both hard and soft float.

Hard float is easy, since it's just Legal.

Soft float is more involved, because there are several different ways to
handle it based on the predicate: one and ueq need not only one, but two
libcalls to get a result. Furthermore, we have large differences between
the values returned by the AEABI and GNU functions.

AEABI functions return a nice 1 or 0 representing true and respectively
false. GNU functions generally return a value that needs to be compared
against 0 (e.g. for ogt, the value returned by the libcall is > 0 for
true).  We could introduce redundant comparisons for AEABI as well, but
they don't seem easy to remove afterwards, so we do different processing
based on whether or not the result really needs to be compared against
something (and just truncate if it doesn't).

llvm-svn: 307243
2017-07-06 09:09:33 +00:00
Diana Picus cd460c89c4 [ARM] GlobalISel: Widen s1, s8, s16 G_CONSTANT
Get the legalizer to widen small constants.

llvm-svn: 307239
2017-07-06 08:04:16 +00:00
Diana Picus fc1675eb16 [GlobalISel] Refactor Legalizer helpers for libcalls
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.

llvm-svn: 307149
2017-07-05 12:57:24 +00:00
Sjoerd Meijer 6d14fdf62d [AsmParser] Mnemonic Spell Corrector
This implements suggesting other mnemonics when an invalid one is specified,
for example:

$ echo "adXd r1,r2,#3" | llvm-mc -triple arm
<stdin>:1:1: error: invalid instruction, did you mean: add, qadd?
adXd r1,r2,#3
^

The implementation is target agnostic, but as a first step I have added it only
to the ARM backend; so the ARM backend is a good example if someone wants to
enable this too for another target.

Differential Revision: https://reviews.llvm.org/D33128

llvm-svn: 307148
2017-07-05 12:39:13 +00:00
Diana Picus 3e8851a1b4 [ARM] GlobalISel: Extract tiny helper. NFC
Extract functionality for determining if the target uses AEABI.

llvm-svn: 307145
2017-07-05 11:53:51 +00:00
Hiroshi Inoue 79f8933f23 fix trivial typos in comments; NFC
llvm-svn: 307094
2017-07-04 16:35:26 +00:00
Daniel Sanders 6ab0daade8 [globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.

The following patches will expand on this further to fully fix the regressions.

Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33758

llvm-svn: 307079
2017-07-04 14:35:06 +00:00
Eric Christopher 3df231a1f7 Remove the default ARMSubtarget from the ARM TargetMachine.
This enables us to ensure better LTO and code generation in the face of module linking.
Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing.

llvm-svn: 306939
2017-07-01 03:41:53 +00:00
Eric Christopher 015dc2094e Rewrite ARM execute only support to avoid the use of a command line flag and unqualified ARMSubtarget lookup.
Paired with a clang commit to use the new behavior.

llvm-svn: 306927
2017-07-01 02:55:22 +00:00
Quentin Colombet 51b7af3e14 [ARM] Move GISel accessor initialization from TargetMachine to Subtarget.
NFC

llvm-svn: 306920
2017-07-01 00:45:45 +00:00
Rafael Espindola 76287ab3a0 Rename and adjust processFixupValue.
It was not processing any value. All that it ever did was force
relocations, so name it shouldForceRelocation.

llvm-svn: 306906
2017-06-30 22:47:27 +00:00
Tim Northover 2b5f03aa12 ARM: fix big-endian 64-bit cmpxchg.
On big-endian machines the high and low parts of the value accessed by ldrexd
and strexd are swapped around. To account for this we swap inputs and outputs
in ISelLowering.

Patch by Bharathi Seshadri.

llvm-svn: 306865
2017-06-30 19:51:02 +00:00
Kristof Beyls b539ea5393 [GlobalISel] Make multi-step legalization work.
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.

This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.

Differential Revision: https://reviews.llvm.org/D32529

llvm-svn: 306806
2017-06-30 08:26:20 +00:00
Eric Christopher ee837a59f7 Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser.
Modeled Triple::GNU after front end code (aapcs abi) and  updated tests that expect apcs abi.

Based heavily on a patch by Ana Pazos!

llvm-svn: 306768
2017-06-30 00:03:54 +00:00
Eugene Leviant 6269d39f44 [llvm-objdump] Handle invalid instruction gracefully on ARM
Differential revision: https://reviews.llvm.org/D34813

llvm-svn: 306687
2017-06-29 15:38:47 +00:00
Florian Hahn 08fdd040b5 [ARM] Add tGPRwithpc register class and use it for TBB/THH
Summary:
TBB and THH allow using a Thumb GPR or the PC as destination operand.
A few machine verifier failures where due to those instructions not
expecting PC as destination operand.

Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add
test coverage even if expensive checks are disabled.



Reviewers: MatzeB, t.p.northover, jmolloy

Reviewed By: MatzeB

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34610

llvm-svn: 306654
2017-06-29 08:45:31 +00:00
Rafael Espindola 920fa14011 Don't repeat names and reformat. NFC.
llvm-svn: 306556
2017-06-28 16:00:16 +00:00
John Brawn 75d76e5e95 [ARM] Improve if-conversion for M-class CPUs without branch predictors
The current heuristic in isProfitableToIfCvt assumes we have a branch predictor,
and so gives the wrong answer in some cases when we don't. This patch adds a
subtarget feature to indicate that a subtarget has no branch predictor, and
changes the heuristic in isProfitableToiIfCvt when it's present. This gives a
slight overall improvement in a set of embedded benchmarks on Cortex-M4 and
Cortex-M33.

Differential Revision: https://reviews.llvm.org/D34398

llvm-svn: 306547
2017-06-28 14:11:15 +00:00
Kristof Beyls eecb353d0e [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8).
The benchmarking summarized in
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed
this is beneficial for a wide range of cores.

As is to be expected, quite a few small adaptations are needed to the
regressions tests, as the difference in scheduling results in:
- Quite a few small instruction schedule differences.
- A few changes in register allocation decisions caused by different
 instruction schedules.
- A few changes in IfConversion decisions, due to a difference in
 instruction schedule and/or the estimated cost of a branch mispredict.

llvm-svn: 306514
2017-06-28 07:07:03 +00:00
Diana Picus 0e74a134f8 [ARM] GlobalISel: Support G_SELECT for pointers
All we need to do is mark it as legal, otherwise it's just like s32.

llvm-svn: 306390
2017-06-27 10:29:50 +00:00
Diana Picus 7145d22f81 [ARM] GlobalISel: Support G_SELECT for i32
* Mark as legal for (s32, i1, s32, s32)
* Map everything into GPRs
* Select to two instructions: a CMP of the condition against 0, to set
  the flags, and a MOVCCr to select between the two inputs based on the
  flags that we've just set

llvm-svn: 306382
2017-06-27 09:19:51 +00:00
Rafael Espindola 6418856127 Simplify the processFixupValue interface. NFC.
llvm-svn: 306202
2017-06-24 05:22:28 +00:00
Rafael Espindola f351292141 Remove redundant argument.
llvm-svn: 306189
2017-06-24 00:26:57 +00:00
Rafael Espindola 801b42de31 ARM: move some logic from processFixupValue to applyFixup.
processFixupValue is called on every relaxation iteration. applyFixup
is only called once at the very end. applyFixup is then the correct
place to do last minute changes and value checks.

While here, do proper range checks again for fixup_arm_thumb_bl. We
used to do it, but dropped because of thumb2. We now do it again, but
use the thumb2 range.

llvm-svn: 306177
2017-06-23 22:52:36 +00:00
Rafael Espindola 58173b9720 COFF: Produce an error on invalid pcrel relocs.
X86_64 COFF only has support for 32 bit pcrel relocations. Produce an
error on all others.

Note that gnu as has extended the relocation values to support
this. It is not clear if we should support the gnu extension.

llvm-svn: 306082
2017-06-23 04:07:44 +00:00
Florian Hahn 5991b5be74 [ARM] Create relocations for beq.w branches to ARM function syms.
Summary:
The ARM ELF ABI requires the linker to do interworking for wide
conditional branches from Thumb code to ARM code. 

That was pointed out by @peter.smith in the comments for D33436.

Reviewers: rafael, peter.smith, echristo

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, peter.smith

Differential Revision: https://reviews.llvm.org/D34447

llvm-svn: 306009
2017-06-22 15:32:41 +00:00
Kristof Beyls 9665249fd8 Don't conditionalize Neon instructions, even in IT blocks.
This has been deprecated since ARMARM v7-AR, release C.b, published back
in 2012.

This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was
introduced to check that conditionalization of Neon instructions did
happen when generating Thumb2. However, the test had evolved and was no
longer testing that. Rather than trying to adapt that test, this commit
introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can
now use the MIR framework to write nicer/more maintainable tests.

llvm-svn: 305998
2017-06-22 12:11:38 +00:00
John Brawn ed78aaf093 [ARM] Add .w aliases of MOV with shifted operand
These appear to have been simply missing.

Differential Revision: https://reviews.llvm.org/D34461

llvm-svn: 305993
2017-06-22 10:30:53 +00:00
John Brawn 192f74a84d [ARM] Clean up choice of narrow instructions in ARMAsmParser, NFC
This patch makes a couple of changes to how we decide whether to use the narrow
or wide encoding of thumb2 instructions:
 * Common out the detection of the .w qualifier
 * Check for the CPSR operand in a consistent way

Differential Revision: https://reviews.llvm.org/D34460

llvm-svn: 305992
2017-06-22 10:29:31 +00:00
Florian Hahn b489e56ae2 [ARM] Add macro fusion for AES instructions.
Summary:
This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES
instructions back to back and adds FeatureFuseAES to enable the feature.

Reviewers: evandro, javed.absar, rengolin, t.p.northover

Reviewed By: javed.absar

Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34142

llvm-svn: 305988
2017-06-22 09:39:36 +00:00
Rafael Espindola 88d9e37ec8 Use a MutableArrayRef. NFC.
llvm-svn: 305968
2017-06-21 23:06:53 +00:00
Alexandros Lamprineas 2b2b420563 [ARM] Support constant pools in data when generating execute-only code.
Resubmission of r305387, which was reverted at r305390. The Address
Sanitizer caught a stack-use-after-scope of a Twine variable. This
is now fixed by passing the Twine directly as a function parameter.

The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305776
2017-06-20 07:20:52 +00:00
Diana Picus 78aaf7db04 [ARM] GlobalISel: Support G_ICMP for s8 and s16
Widen to s32 (like all other binary ops).

llvm-svn: 305683
2017-06-19 11:47:28 +00:00
Diana Picus 621894ac76 [ARM] GlobalISel: Support G_ICMP for i32 and pointers
Add support throughout the pipeline:
- mark as legal for s32 and pointers
- map to GPRs
- lower to a sequence of instructions, which moves 0 or 1 into the
  result register based on the flags set by a CMPrr

We have copied from FastISel a helper function which maps CmpInst
predicates into ARMCC codes. Ideally, we should be able to move it
somewhere that both FastISel and GlobalISel can use.

llvm-svn: 305672
2017-06-19 09:40:51 +00:00
Diana Picus 02e11010b2 [ARM] GlobalISel: Add support for i32 modulo
Add support for modulo for targets that have hardware division and for
those that don't. When hardware division is not available, we have to
choose the correct libcall to use. This is generally straightforward,
except for AEABI.

The AEABI variant is trickier than the other libcalls because it
returns { quotient, remainder }, instead of just one value like the
other libcalls that we've seen so far. Therefore, we need to use custom
lowering for it. However, we don't want to have too much special code,
so we refactor the target-independent code in the legalizer by adding a
helper for replacing an instruction with a libcall. This helper is used
by the legalizer itself when dealing with simple calls, and also by the
custom ARM legalization for the more complicated AEABI divmod calls.

llvm-svn: 305459
2017-06-15 10:53:31 +00:00
Diana Picus 8fd1601d32 [ARM] GlobalISel: Lower only homogeneous struct args
Lowering mixed struct args, params and returns used G_INSERT, which is a
bit more convoluted to support through the entire pipeline. Since they
don't occur that often in practice, it's probably wiser to leave them
out until later.

Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which
has good support in the legalizer. These occur e.g. as the return of
__aeabi_idivmod, so it's nice to be able to support them.

llvm-svn: 305458
2017-06-15 09:42:02 +00:00
Alexandros Lamprineas 1c15ee2631 Revert "[ARM] Support constant pools in data when generating execute-only code."
This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0.

I've upset a buildbot which runs the address sanitizer:
ERROR: AddressSanitizer: stack-use-after-scope
lib/Target/ARM/ARMISelLowering.cpp:2690
That Twine variable is used illegally.

llvm-svn: 305390
2017-06-14 15:00:08 +00:00
Alexandros Lamprineas c582d6e133 [ARM] Support constant pools in data when generating execute-only code.
The ARM backend asserts against constant pool lowering when it generates
execute-only code in order to prevent the generation of constant pools in
the text section. It appears that target independent optimizations might
generate DAG nodes that represent constant pools. By lowering such nodes
as global addresses we don't violate the semantics of execute-only code
and also it is guaranteed that execute-only behaves correct with the
position-independent addressing modes that support execute-only code.

Differential Revision: https://reviews.llvm.org/D33773

llvm-svn: 305387
2017-06-14 13:22:41 +00:00
Oliver Stannard 852fbd2fea [ARM] Add scheduling classes for VFNM[AS]
The VFNM[AS] instructions did not have scheduling information attached, which
was causing assertion failures with the Cortex-A57 scheduling model and
-fp-contract=fast, because the Cortex-A57 sched model claims to be complete.

Differential Revision: https://reviews.llvm.org/D34139

llvm-svn: 305288
2017-06-13 13:04:32 +00:00
Daniel Neilson c0112ae8da Const correctness for TTI::getRegisterBitWidth
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.

Reviewers: chandlerc, rnk, reames

Reviewed By: reames

Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D33903

llvm-svn: 305189
2017-06-12 14:22:21 +00:00
Javed Absar 9e1ff8654f [ARM] Custom machine-scheduler. NFCI.
This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka. 
Differential Revision: https://reviews.llvm.org/D34039

llvm-svn: 305078
2017-06-09 14:07:21 +00:00
Oliver Stannard ad0973557c [ARM] Add scheduling info for VFMS
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.

Differential Revision: https://reviews.llvm.org/D34040

llvm-svn: 305064
2017-06-09 09:19:09 +00:00
Florian Hahn 28a61d64e2 [ARM] Use FixupKind variable in processFixupValue (cleanup, NFC).
llvm-svn: 304905
2017-06-07 12:58:08 +00:00
Diana Picus 0b4190a9d6 [ARM] GlobalISel: Purge G_SEQUENCE
According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.

This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.

There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.

llvm-svn: 304902
2017-06-07 12:35:05 +00:00
Diana Picus 0196427b03 [ARM] GlobalISel: Support G_XOR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select to EORrr via TableGen'erated code

llvm-svn: 304898
2017-06-07 11:57:30 +00:00
Diana Picus eeb0aad8e4 [ARM] GlobalISel: Support G_OR
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select ORRrr thanks to TableGen'erated code

llvm-svn: 304890
2017-06-07 10:14:23 +00:00
Diana Picus 8445858a93 [ARM] GlobalISel: Support G_AND
This is identical to the support for the other binary operators:
- widen to s32
- map into GPR
- select ANDrr (via TableGen'erated code)

llvm-svn: 304885
2017-06-07 09:17:41 +00:00
Florian Hahn 9afd9d9254 [ARM] Create relocations for unconditional branches.
Summary:
Relocations are required for unconditional branches to function symbols with
different execution mode. Without this patch, incorrect branches are
generated for tail calls between functions with different execution
mode.


Reviewers: peter.smith, rafael, echristo, kristof.beyls

Reviewed By: peter.smith

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33898

llvm-svn: 304882
2017-06-07 08:54:47 +00:00
Zachary Turner 264b5d9e88 Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Eugene Zelenko fb69e66cff [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304839
2017-06-06 22:22:41 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Peter Smith d16c55de6d [ARM] Add curly braces around switch case [NFC]
My previous commit r304702 introduced a new case into a switch statement.
This case defined a variable but I forgot to add the curly brackets around the
case to limit the scope.

This change puts the curly braces back in so that the next person that adds a
case doesn't get a build failure. Thanks to avieira for the spot.

Differential Revision: https://reviews.llvm.org/D33931

llvm-svn: 304785
2017-06-06 10:22:49 +00:00
Diana Picus 0091cc3528 [ARM] GlobalISel: Constrain callee register on indirect calls
When lowering calls, we generate instructions with machine opcodes
rather than generic ones. Therefore, we need to constrain the register
classes of the operands.

Also enable the machine verifier on the arm-irtranslator.ll test, since
that would've caught this issue.

Fixes (part of) PR32146.

llvm-svn: 304712
2017-06-05 12:54:53 +00:00
Peter Smith adde667007 [ARM] Support fixup for Thumb2 modified immediate
This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand
"T2SOImm". The fixup permits code such as:
.L1:
 sub r3, r3, #.L2 - .L1
.L2:
to assemble in Thumb2 as well as in ARM state.
    
The operand predicate isT2SOImm() explicitly doesn't match expressions
containing :upper16: and :lower16: as expressions with these operators
must match the movt and movw instructions.
    
The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the
fixup delays the error message till after the assembler has quit due to
the other errors.
    
As the mov instruction shares the t2_so_imm_asmoperand mov instructions
with a non constant expression now match t2MOVi rather than t2MOVi16 so the
error message is slightly different.
    
Fixes PR28647

Differential Revision: https://reviews.llvm.org/D33492

llvm-svn: 304702
2017-06-05 09:37:12 +00:00
Diana Picus e7aa90987d [ARM] GlobalISel: Support struct params/returns
Very very similar to the support for arrays. As with arrays, we don't
support returning large structs that wouldn't fit in R0-R3. Most
front-ends would likely use sret arguments for that anyway.

The only significant difference is that when splitting a struct, we need
to make sure we set the correct original alignment on each member,
otherwise it may get split incorrectly between stack and registers.

llvm-svn: 304536
2017-06-02 10:16:48 +00:00
Javed Absar 4ae7e81233 [ARM] Cortex-A57 scheduling model for ARM backend (AArch32)
This patch implements the Cortex-A57 scheduling model.
The main code is in ARMScheduleA57.td, ARMScheduleA57WriteRes.td.
Small changes in cpp,.h files to support required scheduling predicates.

Scheduling model implemented according to:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf.

Patch by : Andrew Zhogin (submitted on his behalf, as requested).
Rewiewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls.
Differential Revision: https://reviews.llvm.org/D28152

llvm-svn: 304530
2017-06-02 08:53:19 +00:00
Florian Hahn fca7b8348f [ARM] Create relocations for Thumb functions calling ARM fns in ELF.
Summary:
Without using a fixup in this case, BL will be used instead of BLX to
call internal ARM functions from Thumb functions.

Reviewers: rafael, t.p.northover, peter.smith, kristof.beyls

Reviewed By: peter.smith

Subscribers: srhines, echristo, aemerson, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33436

llvm-svn: 304413
2017-06-01 13:50:57 +00:00
Matthias Braun d6a36ae282 TargetMachine: Indicate whether machine verifier passes.
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.

This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!

Differential Revision: https://reviews.llvm.org/D33696

llvm-svn: 304320
2017-05-31 18:41:23 +00:00
Matthias Braun 05eeadbfd1 ARM: Fix cmpxchg O0 expansion
This is the equivalent of r304048 for ARM:

- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304267
2017-05-31 01:21:35 +00:00
Matthias Braun 0dba4e3509 ARM: Do not add reserved registers to block livein lists; NFC
llvm-svn: 304266
2017-05-31 01:21:30 +00:00
Matthias Braun 5e394c3d6f TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.

While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.

llvm-svn: 304247
2017-05-30 21:36:41 +00:00
Matthias Braun 700603555a ARM: Add missing flags to TBB_[JH]T pseudo instructions
NFC except for calming down the machine verifier in some cases.

llvm-svn: 304227
2017-05-30 18:52:33 +00:00
Craig Topper f6d4dc5b4a [SelectionDAG] Set ISD::FPOWI to Expand by default
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

Differential Revision: https://reviews.llvm.org/D33530

llvm-svn: 304215
2017-05-30 15:27:55 +00:00
Diana Picus 0c05cce4e0 [ARM] GlobalISel: Extract helper. NFCI.
Create a helper to deal with the common code for merging incoming values
together after they've been split during call lowering. There's likely
more stuff that can be commoned up here, but we'll leave that for later.

llvm-svn: 304143
2017-05-29 09:09:54 +00:00
Diana Picus bf4aed2c38 [ARM] GlobalISel: Support array returns
These are a bit rare in practice, but they don't require anything
special compared to array parameters, so support them as well.

llvm-svn: 304137
2017-05-29 08:19:19 +00:00
Diana Picus 8cca8cb0ce [ARM] GlobalISel: Support array parameters/arguments
Clang coerces structs into arrays, so it's a good idea to support them.
Most of the support boils down to getting the splitToValueTypes helper
to actually split types. We then use G_INSERT/G_EXTRACT to deal with the
parts.

llvm-svn: 304132
2017-05-29 07:01:52 +00:00
Matthias Braun ac4307c41e LivePhysRegs: Rework constructor + documentation; NFC
- Take reference instead of pointer to a TRI that cannot be nullptr.
- Improve documentation comments.

llvm-svn: 304038
2017-05-26 21:51:00 +00:00
John Brawn 9009d2905d [ARM] Fix lowering of misaligned memcpy/memset
Currently getOptimalMemOpType returns i32 for large enough sizes without
checking for alignment, leading to poor code generation when misaligned accesses
aren't permitted as we generate a word store then later split it up into byte
stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for
memset we splat the memset value into a word then immediately split it up
again.

Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type
to use, but also fix a bug there where it wasn't correctly checking if
misaligned memory accesses are allowed.

Differential Revision: https://reviews.llvm.org/D33442

llvm-svn: 303990
2017-05-26 13:59:12 +00:00
Florian Hahn d211fe7c26 [ARM] Remove ThumbTargetMachines. (NFC)
Summary:
Thumb code generation is controlled by ARMSubtarget and the concrete
ThumbLETargetMachine and ThumbBETargetMachine are not needed.

Eric Christopher suggested removing the unneeded target machines in
https://reviews.llvm.org/D33287.

I think it still makes sense to keep separate TargetMachines for big and
little endian as we probably do not want to have different endianess for
difference functions in a single compilation unit. The MIPS backend has
two separate TargetMachines for big and little endian as well. 

Reviewers: echristo, rengolin, kristof.beyls, t.p.northover

Reviewed By: echristo

Subscribers: aemerson, javed.absar, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D33318

llvm-svn: 303733
2017-05-24 10:18:57 +00:00
Javed Absar a32e3a1acf [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI
This patch adds missing scheds for Neon VLDx/VSTx instructions.
This will help one write schedulers easier/faster in the future for ARM sub-targets.
Existing models will not affected by this patch.
Reviewed by: Renato Golin, Diana Picus
Differential Revision: https://reviews.llvm.org/D33120

llvm-svn: 303717
2017-05-24 05:32:48 +00:00
Oleg Ranevskyy 09df0020fc [ARM] Temporarily disable globals promotion to constant pools to prevent miscompilation
Summary:
A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries.

The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed.

Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1.

Reviewers: efriedma, jmolloy

Reviewed By: efriedma

Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar

Differential Revision: https://reviews.llvm.org/D33446

llvm-svn: 303679
2017-05-23 19:38:37 +00:00
Nirav Dave 6c910c0dd8 [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.
llvm-svn: 303673
2017-05-23 18:53:02 +00:00
James Molloy 6110be9759 Re-apply r302416: [ARM] Clear the constant pool cache on explicit .ltorg directives
Re-applying now that PR32825 which was raised on the commit this fixed up is now known to have also been fixed by this commit.

Original commit message:
    Multiple ldr pseudoinstructions with the same constant value will
    reuse the same constant pool entry. However, if the constant pool
    is explicitly flushed with a .ltorg directive, we should not try
    to reference constants in the previous pool any longer, since they
    may be out of range.

    This fixes assembling hand-written assembler source which repeatedly
    loads the same constant value, across a binary size larger than the
    pc-relative fixup range for ldr instructions (4096 bytes). Such
    assembler source already uses explicit .ltorg instructions to emit
    constant pools with regular intervals. However if we try to reuse
    constants emitted in earlier pools, they end up out of range.

    This makes the output of the testcase match what binutils gas does
    (prior to this patch, it would fail to assemble).

    Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 303540
2017-05-22 09:42:07 +00:00
James Molloy 5cc75ae8f9 Revert "[ARM] Clear the constant pool cache on explicit .ltorg directives"
This reverts commit r302416. This was a fixup for r286006, which has now been reverted so this doesn't apply (either in concept or in code).

This commit itself has no problems, but the underlying issue it was fixing has now disappeared from the codebase.

llvm-svn: 303536
2017-05-22 08:49:28 +00:00
Francis Visoiu Mistrih 8b61764cbb [LegacyPassManager] Remove TargetMachine constructors
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.

The patterns replaced here are:

* Passes handling a null TargetMachine call
  `getAnalysisIfAvailable<TargetPassConfig>`.

* Passes not handling a null TargetMachine
  `addRequired<TargetPassConfig>` and call
  `getAnalysis<TargetPassConfig>`.

* MachineFunctionPasses now use MF.getTarget().

* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.

This fixes a crash when running `llc -start-before prologepilog`.

PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.

Related to PR30324.

Differential Revision: https://reviews.llvm.org/D33222

llvm-svn: 303360
2017-05-18 17:21:13 +00:00
Diana Picus eafa4aa910 Reland r303247: [ARM] GlobalISel: Remove dead instruction selection code
It only failed on llvm-clang-x86_64-expensive-checks-win, probably
because the TableGen stuff hasn't been regenerated.
Requires a clean build.

llvm-svn: 303252
2017-05-17 12:42:52 +00:00
Diana Picus 36e4ba0f6e Revert "[ARM] GlobalISel: Remove dead instruction selection code"
This reverts commit r303247 because the tests are failing on some bots.
Sorry!

llvm-svn: 303249
2017-05-17 11:56:07 +00:00
Diana Picus 68d21c864e [ARM] GlobalISel: Remove dead instruction selection code
We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove
the hand-written versions.

llvm-svn: 303247
2017-05-17 11:39:26 +00:00
Francis Visoiu Mistrih b52e036600 BitVector: add iterators for set bits
Differential revision: https://reviews.llvm.org/D32060

llvm-svn: 303227
2017-05-17 01:07:53 +00:00
Renato Golin d69570e017 Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"
Revert "[ARM] Mark LEApcrel as not having side effects"

This reverts commit r303054 and r303053, as they broke the ARM
self-hosting buildbots:

http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349

http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845

Offline investigation on course.

llvm-svn: 303193
2017-05-16 17:59:07 +00:00
John Brawn 9486becf09 [ARM] Mark LEApcrel instructions as isAsCheapAsAMove
Doing this means that if an LEApcrel is used in two places we will rematerialize
instead of generating two MOVs. This is particularly useful for printfs using
the same format string, where we want to generate an address into a register
that's going to get corrupted by the call.

Differential Revision: https://reviews.llvm.org/D32858

llvm-svn: 303054
2017-05-15 11:57:54 +00:00
John Brawn 43132c46a6 [ARM] Mark LEApcrel as not having side effects
Doing this lets us hoist it out of loops, and I've also marked it as
rematerializable the same as the thumb1 and thumb2 counterparts.

It looks like it being marked as such was just a mistake, as the commit that
made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the
LEApcrelJT instructions were marked as having side-effects, so it looks like
the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was
accidentally marked as such also.

Differential Revision: https://reviews.llvm.org/D32857

llvm-svn: 303053
2017-05-15 11:50:21 +00:00
Diana Picus 9cfbc6d94f [ARM][GlobalISel] Legalize narrow scalar ops by widening
This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB
and MUL to 32 bits since we only have TableGen patterns for 32 bits.
See the commit message for r292827 for more details.

At this point we could just remove some of the tests for regbankselect
and instruction-select, since we're not going to see any narrow
operations at those levels anymore. Instead I decided to update them
with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences
generated by the legalizer.

llvm-svn: 302782
2017-05-11 09:45:57 +00:00
Diana Picus 657bfd3302 [ARM][GlobalISel] Support for G_ANYEXT
G_ANYEXT can be introduced by the legalizer when widening scalars. Add
support for it in the register bank info (same mapping as everything
else) and in the instruction selector.

When selecting it, we treat it as a COPY, just like G_TRUNC. On this
occasion we get rid of some assertions in selectCopy so we can reuse it.
This shouldn't be a problem at the moment since we're not supporting any
complicated cases (e.g. FPR, different register banks). We might want to
separate the paths when we do.

llvm-svn: 302778
2017-05-11 08:28:31 +00:00
Serge Guelton e38003f839 Suppress all uses of LLVM_END_WITH_NULL. NFC.
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.

Differential Revision: https://reviews.llvm.org/D32541

llvm-svn: 302571
2017-05-09 19:31:13 +00:00
Tim Shen 04de70d3a7 [Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used
for determining those two booleans.

The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.

Differential Revision: https://reviews.llvm.org/D32762

llvm-svn: 302539
2017-05-09 15:27:17 +00:00
Aaron Ballman 3234647df6 Amend r302535; ifndef and ifdef are different, as it turns out.
llvm-svn: 302537
2017-05-09 15:12:03 +00:00
Aaron Ballman 06297e839a ARMRegisterBankInfo.h requires LLVM_BUILD_GLOBAL_ISEL to be defined. If it is not defined, then ARMGenRegisterBank.inc is not table generated and the inclusion of this header causes the build to fail.
llvm-svn: 302535
2017-05-09 14:59:48 +00:00
Serge Pavlov d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Diana Picus 95640a1c4d [ARM GlobalISel] Remove hand-written G_FADD selection
Remove the code selecting G_FADD - now that TableGen can handle more
opcodes, it's not needed anymore.

llvm-svn: 302511
2017-05-09 08:32:42 +00:00
Tim Northover c48c993b75 ARM: use divmod libcalls on embedded MachO platforms too.
The separated libcalls are implemented in terms of __divmodsi4 and __udivmodsi4
anyway, so we should always use them if possible.

llvm-svn: 302462
2017-05-08 20:00:14 +00:00
Craig Topper 8297e52285 [ARM] Use a Changed flag to avoid making a pass's return value dependent on a compare with a Statistic object.
Statistic compile to always be 0 in release build so this compare would always return false. And in the debug builds Statistic are global variables and remember their values across pass runs. So this compare returns true anytime the pass runs after the first time it modifies something.

This was found after reviewing all usages of comparison operators on a Statistic object. We had some internal code that did a compare with a statistic that caused a mismatch in output between debug and release builds. So we did an audit out of paranoia.

llvm-svn: 302450
2017-05-08 18:02:51 +00:00
Simon Pilgrim f5ca255d18 [ARM][NEON] Add support for ISD::ABS lowering
Update NEON int_arm_neon_vabs intrinsic to use the ISD::ABS opcode directly

Added constant folding tests.

Differential Revision: https://reviews.llvm.org/D32938

llvm-svn: 302417
2017-05-08 10:37:34 +00:00
Martin Storsjo fd4c158a84 [ARM] Clear the constant pool cache on explicit .ltorg directives
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.

This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.

This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).

Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 302416
2017-05-08 10:26:24 +00:00
Quentin Colombet 245994d968 [RegisterBankInfo] Uniquely allocate instruction mapping.
This is a step toward having statically allocated instruciton mapping.
We are going to tablegen them eventually, so let us reflect that in
the API.

NFC.

llvm-svn: 302316
2017-05-05 22:48:22 +00:00
Matthias Braun 4682ac6c83 ARM: Compute MaxCallFrame size early
This exposes a method in MachineFrameInfo that calculates
MaxCallFrameSize and calls it after instruction selection in the ARM
target.

This avoids
ARMBaseRegisterInfo::canRealignStack()/ARMFrameLowering::hasReservedCallFrame()
giving different answers in early/late phases of codegen.

The testcase shows a particular nasty example result of that where we
would fail to properly align an alloca.

Differential Revision: https://reviews.llvm.org/D32622

llvm-svn: 302303
2017-05-05 22:04:05 +00:00
Craig Topper f0aeee01c3 [KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits.
This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.

Differential Revision: https://reviews.llvm.org/D32637

llvm-svn: 302262
2017-05-05 17:36:09 +00:00
John Brawn 1b74f8c51f [ARM] Add support for ORR and ORN instruction substitutions
Recently support was added for substituting one intruction for another by
negating or inverting the immediate, but ORR and ORN were missed so this patch
adds them.

This one is slightly different to the others in that ORN only exists in thumb,
so we only do the substitution in thumb.

Differential Revision: https://reviews.llvm.org/D32534

llvm-svn: 302224
2017-05-05 11:31:25 +00:00
Sam Parker df337704f0 [ARM] ACLE Chapter 9 intrinsics
Added the integer data processing intrinsics from ACLE v2.1 Chapter 9
but I have missed out the saturation_occurred intrinsics for now. For
the instructions that read and write the GE bits, a chain is included
and the only instruction that reads these flags (sel) is only
selectable via the implemented intrinsic.

Differential Revision: https://reviews.llvm.org/D32281

llvm-svn: 302126
2017-05-04 07:31:28 +00:00
Reid Kleckner a0b45f4bfc [IR] Abstract away ArgNo+1 attribute indexing as much as possible
Summary:
Do three things to help with that:
- Add AttributeList::FirstArgIndex, which is an enumerator currently set
  to 1. It allows us to change the indexing scheme with fewer changes.
- Add addParamAttr/removeParamAttr. This just shortens addAttribute call
  sites that would otherwise need to spell out FirstArgIndex.
- Remove some attribute-specific getters and setters from Function that
  take attribute list indices.  Most of these were only used from
  BuildLibCalls, and doesNotAlias was only used to test or set if the
  return value is malloc-like.

I'm happy to split the patch, but I think they are probably easier to
review when taken together.

This patch should be NFC, but it sets the stage to change the indexing
scheme to this, which is more convenient when indexing into an array:
  0: func attrs
  1: retattrs
  2...: arg attrs

Reviewers: chandlerc, pete, javed.absar

Subscribers: david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D32811

llvm-svn: 302060
2017-05-03 18:17:31 +00:00
Tim Northover 4a01ffbd6a ARM: avoid handing a deleted node back to TableGen during ISel.
When we replaced the multiplicand the destination node might already exist.
When that happens the original gets CSEd and deleted. However, it's actually
used as the offset so nonsense is produced.

Should fix PR32726.

llvm-svn: 301983
2017-05-02 22:45:19 +00:00
Tim Northover f9d8eee3db ARM: add arm1176j-f processor
I doubt anyone actually uses it, and I'm not even entirely convinced it exists
myself; but it is our default for "clang -arch armv6". Functionally, if it does
exist it's identical to the arm1176jz-f from LLVM's point of view (the
difference is apparently in the "Security Extensions").

llvm-svn: 301962
2017-05-02 19:06:13 +00:00
Diana Picus 8abcbbb24b [ARM] GlobalISel: Use TableGen instruction selector
Emit and use the TableGen instruction selector for ARM. At the moment,
this allows us to remove the hand-written code for selecting G_SDIV and
G_UDIV.

Future commits will focus on increasing the code coverage for it and
removing more dead code from the current instruction selector.

llvm-svn: 301905
2017-05-02 09:40:49 +00:00
Reid Kleckner 6652a52e2b Use Argument::hasAttribute and AttributeList::ReturnIndex more
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.

NFC

llvm-svn: 301666
2017-04-28 18:37:16 +00:00
Diana Picus 6f975692e5 [ARM] GlobalISel: fixup r301632
Actually remove ARMInstructionSelector.h... Forgot to stage the removal
in the previous commit.

llvm-svn: 301633
2017-04-28 09:20:31 +00:00
Diana Picus 674888d84c [ARM] GlobalISel: Get rid of ARMInstructionSelector.h. NFC.
Declare the ARMInstructionSelector in an anonymous namespace, to make it
more in line with the other targets which were migrated to this in
r299637 in order to avoid TableGen'erated headers being included in
non-GlobalISel builds.

llvm-svn: 301632
2017-04-28 09:10:38 +00:00
Craig Topper d0af7e8ab8 [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Diana Picus 4f46be327c [ARM] GlobalISel: Fix extended stack operands
Fix a crash when trying to extend a value passed as a sign- or
zero-extended stack parameter. The cause of the crash was that we were
setting the size of the loaded value to 32 bits, and then tyring to
extend again to 32 bits.

This patch addresses the issue by also introducing a G_TRUNC after the
load. This will leave the unused bits to their original values set by
the caller, while being consistent about the types. For values that are
not extended, we just use a smaller load.

llvm-svn: 301531
2017-04-27 10:23:30 +00:00
Krzysztof Parzyszek 44e25f37ae Move size and alignment information of regclass to TargetRegisterInfo
1. RegisterClass::getSize() is split into two functions:
   - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
   - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
   - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783

llvm-svn: 301221
2017-04-24 18:55:33 +00:00
Diana Picus f53865daa4 [ARM] GlobalISel: Legalize s8 and s16 G_(S|U)DIV
We have to widen the operands to 32 bits and then we can either use
hardware division if it is available or lower to a libcall otherwise.

At the moment it is not enough to set the Legalizer action to
WidenScalar, since for libcalls it won't know what to do (it won't be
able to find what size to widen to, because it will find Libcall and not
Legal for 32 bits). To hack around this limitation, we request Custom
lowering, and as part of that we widen first and then we run another
legalizeInstrStep on the widened DIV.

llvm-svn: 301166
2017-04-24 09:12:19 +00:00
Diana Picus b70e88bdec [ARM] GlobalISel: Support G_(S|U)DIV for s32
Add support for both targets with hardware division and without. For
hardware division we have to add support throughout the pipeline
(legalizer, reg bank select, instruction select). For targets without
hardware division, we only need to mark it as a libcall.

llvm-svn: 301164
2017-04-24 08:20:05 +00:00
Diana Picus 95a8aa93e2 [ARM] GlobalISel: Select G_CONSTANT with CImm operands
When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm
operand. We used to just leave the G_CONSTANT operand unchanged, which
works in some cases (such as the GEP offsets that we create when
referring to stack slots). However, in many other places the G_CONSTANTs
are created with CImm operands. This patch makes sure to handle those as
well, and to error out gracefully if in the end we don't end up with an
Imm operand.

Thanks to Oliver Stannard for reporting this issue.

llvm-svn: 301162
2017-04-24 06:30:56 +00:00
Daniel Sanders 2deea1878e [globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.

In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.

The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
	InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
	return OptionalComplexRendererFn(
	       [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.

As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.

Depends on D31418

Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31761

llvm-svn: 301079
2017-04-22 15:11:04 +00:00
Hans Wennborg 9b9a5358dd Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows.  This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.

llvm-svn: 301047
2017-04-21 21:48:41 +00:00
Hans Wennborg 04593000d8 Revert r301040 "X86: Don't emit zero-byte functions on Windows"
This broke almost all bots. Reverting while fixing.

llvm-svn: 301041
2017-04-21 21:10:37 +00:00
Hans Wennborg cb3e810714 X86: Don't emit zero-byte functions on Windows
Empty functions can lead to duplicate entries in the Guard CF Function
Table of a binary due to multiple functions sharing the same RVA,
causing the kernel to refuse to load that binary.

We had a terrific bug due to this in Chromium.

It turns out we were already doing this for Mach-O in certain
situations. This patch expands the code for that in
AsmPrinter::EmitFunctionBody() and renames
TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it
seems it was used for not just Mach-O anyway.

Differential Revision: https://reviews.llvm.org/D32330

llvm-svn: 301040
2017-04-21 20:58:12 +00:00
Tim Northover e31cf3f824 ARM: make sure we use all entries in a vector before forming a vpaddl.
Otherwise there's some mismatch, and we'll either form an illegal type or an
illegal node.

Thanks to Eli Friedman for pointing out the problem with my original solution.

llvm-svn: 301036
2017-04-21 20:35:52 +00:00
Tim Northover 1061ccca8c ARM: don't try to create an i8 -> i32 vpaddl.
DAG combine was mistakenly assuming that the step-up it was looking at was
always a doubling, but it can sometimes be a larger extension in which case
we'd crash.

llvm-svn: 301002
2017-04-21 17:21:59 +00:00
Diana Picus 64a33431eb [ARM] GlobalISel: Add support for G_TRUNC
Select them as copies. We only select if both the source and the
destination are on the same register bank, so this shouldn't cause any
trouble.

llvm-svn: 300971
2017-04-21 13:16:50 +00:00
Diana Picus f941ec0ecc [ARM] GlobalISel: Make struct arguments fail elegantly
The condition in isSupportedType didn't handle struct/array arguments
properly. Fix the check and add a test to make sure we use the fallback
path in this kind of situation. The test deals with some common cases
where the call lowering should error out. There are still some issues
here that need to be addressed (tail calls come to mind), but they can
be addressed in other patches.

llvm-svn: 300967
2017-04-21 11:53:01 +00:00
Artyom Skrobov 8d9643009f [Thumb1] The recently added tADCS and tSBCS pseudo-instructions were missing `Uses = [CPSR]`
Summary: Thanks to Oliver Stannard for helping catch this.

Reviewers: olista01, efriedma

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31815

llvm-svn: 300951
2017-04-21 07:35:21 +00:00
Tim Northover 46e58354da ARM: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300904
2017-04-20 21:56:52 +00:00
Tim Northover 8b1240b0f0 ARM: handle post-indexed NEON ops where the offset isn't the access width.
Before, we assumed that any ConstantInt offset was precisely the access width,
so we could use the "[rN]!" form. ISelLowering only ever created that kind, but
further simplification during combining could lead to unexpected constants and
incorrect codegen.

Should fix PR32658.

llvm-svn: 300878
2017-04-20 19:54:02 +00:00
Weiming Zhao 962c5a3aec [Thumb-1] Fix corner cases for compressed jump tables
Summary:
When synthesized TBB/TBH is expanded, we need to avoid the case of:
   BaseReg is redefined after the load of branching target. E.g.:

    %R2 = tLEApcrelJT <jt#1>
    %R1 =  tLDRr %R1, %R2    ==> %R2 = tLEApcrelJT <jt#1>
    %R2 = tLDRspi %SP, 12        %R2 = tLDRspi %SP, 12
    tBR_JTr %R1                  tTBB_JT %R2, %R1
`
Reviewers: jmolloy

Reviewed By: jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D32250

llvm-svn: 300870
2017-04-20 18:37:14 +00:00
Benjamin Kramer 58dadd59d9 Fix use-after-frees on memory allocated in a Recycler.
This will become asan errors once the patch lands that poisons the
memory after free. The x86 change is a hack, but I don't see how to
solve this properly at the moment.

llvm-svn: 300867
2017-04-20 18:29:14 +00:00
John Brawn 66719f63d0 [ARM] Fix handling of mapping symbols when changing sections
ChangeSection incorrectly registers LastEMSInfo as belonging to the previous
section, not the current section. This happens to work when changing sections
using .section, as the previous section is set to the current section before
the call to ChangeSection, but not when using .popsection.

Differential Revision: https://reviews.llvm.org/D32225

llvm-svn: 300831
2017-04-20 10:18:13 +00:00
Diana Picus 7c6dee9f16 [ARM] Rename HW div feature to HW div Thumb. NFCI.
The hardware div feature refers only to Thumb, but because of its name
it is tempting to use it to check for hardware division in general,
which may cause problems in ARM mode. See https://reviews.llvm.org/D32005.

This patch adds "Thumb" to its name, to make its scope clear. One
notable place where I haven't made the change is in the feature flag
(used with -mattr), which is still hwdiv. Changing it would also require
changes in a lot of tests, including clang tests, and it doesn't seem
like it's worth the effort.

Differential Revision: https://reviews.llvm.org/D32160

llvm-svn: 300827
2017-04-20 09:38:25 +00:00
Matthias Braun 8aaa368d00 ARMFrameLowering: Reserve emergency spill slot for large arguments
Re-commit after revert in r300668. Changed getMaxFPOffset() to a
more conservative heuristic instead of trying to be clever and missing
for some exotic calling conventions.

We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300761
2017-04-19 21:11:44 +00:00
Eli Friedman 70ad2751d5 [ARM] Remove redundant computeKnownBits helper.
Move the BFI logic to computeKnownBitsForTargetNode, and delete
the redundant CMOV logic.

This is intended as a cleanup, but it's probably possible to construct
a case where moving the BFI logic allows more combines.

Differential Revision: https://reviews.llvm.org/D31795

llvm-svn: 300752
2017-04-19 20:50:57 +00:00
Eli Friedman f281d490cc [ARM] Use TableGen patterns to select vtbl. NFC.
Differential Revision: https://reviews.llvm.org/D32103

llvm-svn: 300749
2017-04-19 20:39:39 +00:00
Tim Northover ff168c68dc ARM: TLS calling convention doesn't preserve r9 or r12 on Darwin.
llvm-svn: 300726
2017-04-19 18:07:54 +00:00
Renato Golin 742aed8683 Revert "ARMFrameLowering: Reserve emergency spill slot for large arguments"
This reverts commit r300639, as it broke self-hosting on ARM. PR32709.

llvm-svn: 300668
2017-04-19 09:02:52 +00:00
Diana Picus 49472ff1cf [ARM] GlobalISel: Add support for G_MUL
Support G_MUL, very similar to G_ADD and G_SUB. The only difference is
in the instruction selector, where we have to select either MUL or MULv5
depending on the target.

llvm-svn: 300665
2017-04-19 07:29:46 +00:00
Serge Pavlov 5943a96d81 ARM: Use methods to access data stored with frame instructions
In r300196 several methods were added to TarfetInstrInfo to access
data stored with call frame setup/destroy instructions. This change
replaces calls to getOperand with calls to such special methods in
ARM target.

Differential Revision: https://reviews.llvm.org/D32127

llvm-svn: 300655
2017-04-19 03:12:05 +00:00
Matthias Braun 661d3d4b00 ARMFrameLowering: Reserve emergency spill slot for large arguments
We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300639
2017-04-19 01:16:07 +00:00
Matt Arsenault 3138075dd4 DAG: Make mayBeEmittedAsTailCall parameter const
llvm-svn: 300603
2017-04-18 21:16:46 +00:00
Oliver Stannard 7ad2e8aae1 [ARM] Add hardware build attributes in assembler
In the assembler, we should emit build attributes based on the target
selected with command-line options. This matches the GNU assembler's
behaviour. We only do this for build attributes which describe the
hardware that is expected to be available, not the ones that describe
ABI compatibility.

This is done by moving some of the attribute emission code to
ARMTargetStreamer, so that it can be shared between the assembly and
code-generation code paths. Since the assembler only creates a
MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to
check raw features, and not use the convenience functions in
ARMSubtarget.

If different attributes are later specified using the .eabi_attribute
directive, then they will take precedence, as happens when the same
.eabi_attribute is specified twice.

This must be enabled by an option, because we don't want to do this when
parsing inline assembly. The attributes would match the ones emitted at
the start of the file, so wouldn't actually change the emitted object
file, but the extra directives would be added to every inline assembly
block when emitting assembly, which we'd like to avoid.

The majority of the changes in the build-attributes.ll test are just
re-ordering the directives, because the hardware attributes are now
emitted before the ABI ones. However, I did fix one bug which I spotted:
Tag_CPU_arch_profile was not being emitted for v6M.

Differential revision: https://reviews.llvm.org/D31812

llvm-svn: 300547
2017-04-18 12:52:35 +00:00
Diana Picus a3a0cccb2c [ARM] GlobalISel: Add support for G_SUB
Support G_SUB throughout the GlobalISel pipeline. It is exactly the same
as G_ADD, nothing fancy.

llvm-svn: 300546
2017-04-18 12:35:28 +00:00
Diana Picus e2626bb7c2 [ARM] Check for correct HW div when lowering divmod
For subtargets that use the custom lowering for divmod, e.g. gnueabi,
we used to check if the subtarget has hardware divide and then lower to
a div-mul-sub sequence if true, or to a libcall if false.

However, judging by the usage of hasDivide vs hasDivideInARMMode, it
seems that hasDivide only refers to Thumb. For instance, in the
ARMTargetLowering constructor, the code that specifies whether to use
libcalls for (S|U)DIV looks like this:

bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
                                      : Subtarget->hasDivideInARMMode();

In the case of divmod for arm-gnueabi, using only hasDivide() to
determine what to do means that instead of lowering to __aeabi_idivmod
to get the remainder, we lower to div-mul-sub and then further lower the
div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
not in Thumb, we generate a libcall instead of using it (this is not an
issue in practice since AFAICT none of the cores that we support have
hardware divide in ARM but not Thumb).

This patch fixes the code dealing with custom lowering to take into
account the mode (Thumb or ARM) when deciding whether or not hardware
division is available.

Differential Revision: https://reviews.llvm.org/D32005

llvm-svn: 300536
2017-04-18 08:32:27 +00:00
Reid Kleckner fb502d2f5e [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

llvm-svn: 300367
2017-04-14 20:19:02 +00:00
Andrew V. Tischenko 75745d0c3e This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs.
The details are here: https://reviews.llvm.org/D30941

llvm-svn: 300311
2017-04-14 07:44:23 +00:00
Jonas Paulsson fccc7d66c3 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052
2017-04-12 11:49:08 +00:00
Sam Parker 83b64654fd [ARM] Refactor Thumb2 sat instructions
Refactor the USAT, SSAT, USAT16 and SSAT16 instruction descriptions
for Thumb2.

Differential Revision: https://reviews.llvm.org/D31933

llvm-svn: 299945
2017-04-11 14:42:08 +00:00
Diana Picus 1314a2889c GlobalISel: Allow legalizing G_FADD to a libcall
Use the same handling in the generic legalizer code as for the other
libcalls (G_FREM, G_FPOW).

Enable it on ARM for float and double so we can test it.

llvm-svn: 299931
2017-04-11 10:52:34 +00:00
Matthew Simpson 1468d3e04e [ARM/AArch64] Ensure valid vector element types for interleaved accesses
This patch refactors and strengthens the type checks performed for interleaved
accesses. The primary functional change is to ensure that the interleaved
accesses have valid element types. The added test cases previously failed
because the element type is f128.

Differential Revision: https://reviews.llvm.org/D31817

llvm-svn: 299864
2017-04-10 18:34:37 +00:00
Diana Picus 3ff82c8cb7 [ARM] GlobalISel: Support G_FPOW for float and double
Legalize to a libcall.

llvm-svn: 299841
2017-04-10 09:27:39 +00:00
Eli Friedman 75631c97ba [ARM] Prefer BIC over BFC in ARM mode.
BIC is generally faster, and it can put the output in a different
register from the input.

We already do this in Thumb2 mode; not sure why the equivalent fix
never got applied to ARM mode.

Differential Revision: https://reviews.llvm.org/D31797

llvm-svn: 299803
2017-04-07 22:01:23 +00:00
Diana Picus 3c608448e1 [ARM] GlobalISel: Support frem for 64-bit values
Legalize to a libcall.

llvm-svn: 299756
2017-04-07 10:50:02 +00:00
Diana Picus a5bab61a8d [ARM] GlobalISel: Support frem for 32-bit values
Legalize to a libcall.
On this occasion, also start allowing soft float subtargets. For the
moment G_FREM is the only legal floating point operation for them.

llvm-svn: 299753
2017-04-07 09:41:39 +00:00
Yi Kong 60b5a1cd17 Revert "Revert "[ARM] Add Kryo to available targets""
This reverts commit dc9458d5a747a02a9a8f198b84c2b92a6939a8dd.

Added missing case for PreISelOperandLatencyAdjustment.

llvm-svn: 299724
2017-04-06 22:47:47 +00:00
Huihui Zhang 98240e9643 [SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL
In LowerMUL, the chain information is not preserved for the new
created Load SDNode.

For example, if a Store alias with one of the operand of Mul.
The Load for that operand need to be scheduled before the Store.
The dependence is recorded in the chain of Store, in TokenFactor.
However, when lowering MUL, the SDNodes for the new Loads for
VMULL are not updated in the TokenFactor for the Store. Thus the
chain is not preserved for the lowered VMULL.

llvm-svn: 299701
2017-04-06 20:22:51 +00:00
Yi Kong 5e7059b702 Revert "[ARM] Add Kryo to available targets"
This reverts commit 942d6e6f58bf7e63810dd7cbcbce1fdfa5ebc6d4.

Build breakage.

llvm-svn: 299689
2017-04-06 19:16:14 +00:00
Yi Kong 2b622b1fc1 [ARM] Add Kryo to available targets
Summary:
Host CPU detection now supports Kryo, so we need to recognize it in ARM
target.

Reviewers: mcrosier, t.p.northover, rengolin, echristo, srhines

Reviewed By: t.p.northover, echristo

Subscribers: aemerson

Differential Revision: https://reviews.llvm.org/D31775

llvm-svn: 299674
2017-04-06 18:10:08 +00:00
David Green 1b4b59a415 [ARM] Remove a dead ADD during the creation of TBBs
During the optimisation of jump tables in the constant island pass,
an extra ADD could be left over, now dead but not removed.

Differential Revision: https://reviews.llvm.org/D31389

llvm-svn: 299634
2017-04-06 08:32:47 +00:00
Matthias Braun 44047427b1 ARMFrameLowering: Slight cleanups; NFC
llvm-svn: 299562
2017-04-05 16:58:41 +00:00
Sanjay Patel b2f1621bb1 [DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401)
This is a generic combine enabled via target hook to reduce icmp logic as discussed in:
https://bugs.llvm.org/show_bug.cgi?id=32401

It's likely that other targets will want to enable this hook for scalar transforms, 
and there are probably other patterns that can use bitwise logic to reduce comparisons.

Note that we are missing an IR canonicalization for these patterns, and we will probably
prefer the pair-of-compares form in IR (shorter, more likely to fold).

Differential Revision: https://reviews.llvm.org/D31483

llvm-svn: 299542
2017-04-05 14:09:39 +00:00
Alex Bradbury 866113c2ea Add MCContext argument to MCAsmBackend::applyFixup for error reporting
A number of backends (AArch64, MIPS, ARM) have been using
MCContext::reportError to report issues such as out-of-range fixup values in
their TgtAsmBackend. This is great, but because MCContext couldn't easily be
threaded through to the adjustFixupValue helper function from its usual
callsite (applyFixup), these backends ended up adding an MCContext* argument
and adding another call to applyFixup to processFixupValue. Adding an
MCContext parameter to applyFixup makes this unnecessary, and even better -
applyFixup can take a reference to MCContext rather than a potentially null
pointer.

Differential Revision: https://reviews.llvm.org/D30264

llvm-svn: 299529
2017-04-05 10:16:14 +00:00
Weiming Zhao 74a7fa0594 Reland r298901 with modifications (reverted in r298932)
Dont emit Mapping symbols for sections that contain only data.

Summary:
Dont emit mapping symbols for sections that contain only data.

Reviewers: rengolin, weimingz, kparzysz, t.p.northover, peter.smith

Reviewed By: t.p.northover

Patched by Shankar Easwaran <shankare@codeaurora.org>

Subscribers: alekseyshl, t.p.northover, llvm-commits

Differential Revision: https://reviews.llvm.org/D30724

llvm-svn: 299392
2017-04-03 21:50:04 +00:00
Sjoerd Meijer 1179470ff8 ARMAsmParser: clean up of isImmediate functions
- we are now using immediate AsmOperands so that the range check functions are
  tablegen'ed.
- Big bonus is that error messages become much more accurate, i.e. instead of a
  useless "invalid operand" error message it will not say that the immediate
  operand must in range [x,y], which is why regression tests needed updating.

More tablegen operand descriptions could probably benefit from using
immediateAsmOperand, but this is a first good step to get rid of most of the
nearly identical range check functions. I will address the remaining immediate
operands in next clean ups.

Differential Revision: https://reviews.llvm.org/D31333

llvm-svn: 299358
2017-04-03 14:50:04 +00:00
Simon Pilgrim 37b536e4b3 [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.

Differential Revision: https://reviews.llvm.org/D31249

llvm-svn: 299201
2017-03-31 11:24:16 +00:00
Weiming Zhao da4d12a8e5 Revert "Dont emit Mapping symbols for sections that contain only data."
It breaks some lld tests.

This reverts commit 3a50eea6d9732ab40e9a7aebe6be777b53a8b35c.

llvm-svn: 298932
2017-03-28 17:15:11 +00:00
Weiming Zhao 320848458b Dont emit Mapping symbols for sections that contain only data.
Summary:
Dont emit mapping symbols for sections that contain only data.

Patched by Shankar Easwaran <shankare@codeaurora.org>

Reviewers: rengolin, peter.smith, weimingz, kparzysz, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, llvm-commits

Differential Revision: https://reviews.llvm.org/D30724

llvm-svn: 298901
2017-03-28 05:40:36 +00:00
Javed Absar 3d59437093 Improve machine schedulers for in-order processors
This patch enables schedulers to specify instructions that 
cannot be issued with any other instructions.
It also fixes BeginGroup/EndGroup.

Reviewed by: Andrew Trick
Differential Revision: https://reviews.llvm.org/D30744

llvm-svn: 298885
2017-03-27 20:46:37 +00:00
Eli Friedman 95ddd18703 [ARM] Fix mixup between Lo and Hi in SMLALBB formation.
llvm-svn: 298752
2017-03-25 00:13:24 +00:00
Pirama Arumuga Nainar bc26482717 [ARM] Fix computeKnownBits for ARMISD::CMOV
Summary:
The true and false operands for the CMOV are operands 0 and 1.
ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2
instead.  This can cause CMOV instructions to be incorrectly folded into
BFI if value set by the CMOV is another CMOV, whose known bits are
computed incorrectly.

This patch fixes the issue and adds a test case.

Reviewers: kristof.beyls, jmolloy

Subscribers: llvm-commits, aemerson, srhines, rengolin

Differential Revision: https://reviews.llvm.org/D31265

llvm-svn: 298624
2017-03-23 16:47:47 +00:00
Davide Italiano 6974dd6412 [ARM] Reduce code duplication by factoring out in a lambda. NFCI.
llvm-svn: 298572
2017-03-23 01:34:45 +00:00
Artyom Skrobov 92c0653095 Reapply r298417 "[ARM] Recommit the glueless lowering of addc/adde in Thumb1"
The UB in t2_so_imm_neg conversion has been addressed under D31242 / r298512

This reverts commit r298482.

llvm-svn: 298562
2017-03-22 23:35:51 +00:00
Artyom Skrobov ee66d6e18e [ARM] simplifying t2_so_imm_neg as suggested by Eli Friedman in D31242 (NFC)
llvm-svn: 298559
2017-03-22 23:12:59 +00:00
Artyom Skrobov 50a066b313 [ARM] t2_so_imm_neg had a subtle bug in the conversion, and could trigger UB by negating (int)-2147483648. By pure luck, none of the pre-existing tests triggered this; so I'm adding one.
Summary: Thanks to Vitaly Buka for helping catch this.

Reviewers: rengolin, jmolloy, efriedma, vitalybuka

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D31242

llvm-svn: 298512
2017-03-22 15:09:30 +00:00
Vitaly Buka e69c137f90 Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648."
Fails check-llvm with ubsan

This reverts commit r298417.

llvm-svn: 298482
2017-03-22 05:07:44 +00:00
Artyom Skrobov 40a4f40679 [ARM] Recommit the glueless lowering of addc/adde in Thumb1,
including the amended (no UB anymore) fix for adding/subtracting -2147483648.

This reverts r298328 "[ARM] Revert r297443 and r297820."
and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648""

llvm-svn: 298417
2017-03-21 18:39:41 +00:00
Reid Kleckner b518054b87 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393
2017-03-21 16:57:19 +00:00
Sanne Wouda 2409c6403d [ARM] [Assembler] Support negative immediates for A32, T32 and T16
Summary:
To support negative immediates for certain arithmetic instructions, the
instruction is converted to the inverse instruction with a negated (or inverted)
immediate. For example, "ADD r0, r1, #FFFFFFFF" cannot be encoded as an ADD
instruction.  However, "SUB r0, r1, #1" is equivalent.

These conversions are different from instruction aliases.  An alias maps
several assembler instructions onto one encoding.  A conversion, however, maps
an *invalid* instruction--e.g. with an immediate that cannot be represented in
the encoding--to a different (but equivalent) instruction.

Several instructions with negative immediates were being converted already, but
this was not systematically tested, nor did it cover all instructions.

This patch implements all possible substitutions for ARM, Thumb1 and
Thumb2 assembler and adds tests.  It also adds a feature flag
(-mattr=+no-neg-immediates) to turn these substitutions off.  This is
helpful for users who want their code to assemble to exactly what they
wrote.

Reviewers: t.p.northover, rovka, samparker, javed.absar, peter.smith, rengolin

Reviewed By: javed.absar

Subscribers: aadg, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D30571

llvm-svn: 298380
2017-03-21 14:59:17 +00:00
Eli Friedman 76732acc23 [ARM] Revert r297443 and r297820.
The glueless lowering of addc/adde in Thumb1 has known serious
miscompiles (see https://reviews.llvm.org/D31081), and r297820
causes an infinite loop for certain constructs.  It's not
clear when they will be fixed, so let's just take them out
of the tree for now.

(I resolved a small conflict with r297453.)

llvm-svn: 298328
2017-03-21 00:26:39 +00:00
Vadzim Dambrouski ba789cbd3d [ARM] Fix PR32130: Handle promotion of zero sized constants.
The special case of zero sized values was previously not handled correctly.
This patch handles this by not promoting if the size is zero.

Patch by Tim Neumann.

Differential Revision: https://reviews.llvm.org/D31116

llvm-svn: 298320
2017-03-20 22:59:57 +00:00
Diana Picus d79253a9f7 [GlobalISel] Use the correct calling conv for calls
This commit adds a parameter that lets us pass in the calling convention
of the call to CallLowering::lowerCall. This allows us to handle
situations where the calling convetion of the callee is different from
that of the caller.

Differential Revision: https://reviews.llvm.org/D31039

llvm-svn: 298254
2017-03-20 14:40:18 +00:00
Matthias Braun e6ff30b696 ExecutionDepsFix: Let targets specialize the pass; NFC
Let targets specialize the pass with the register class so we can get a
parameterless default constructor and can put the pass into the pass
registry to enable testing with -run-pass=.

llvm-svn: 298184
2017-03-18 05:08:58 +00:00
Matthias Braun e9f8209e87 ExecutionDepsFix: Normalize names; NFC
Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and
ExecutionDepsFix to the last one.

llvm-svn: 298183
2017-03-18 05:05:40 +00:00
Nirav Dave ac6081cb67 Make library calls sensitive to regparm module flag (Fixes PR3997).
Reviewers: mkuper, rnk

Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D27050

llvm-svn: 298179
2017-03-18 00:44:07 +00:00
Nirav Dave 6de2c77944 Capitalize ArgListEntry fields. NFC.
llvm-svn: 298178
2017-03-18 00:43:57 +00:00
Andre Vieira 913ffeb5ba [ARM] Fix triple format in test branch disassemble test
Fixing triple format in the tests added for the branch label fix for Thumb
Targets. Also recommitting previously approved patch, see
https://reviews.llvm.org/D30943.

Reviewed by: samparker

Differential Revision: https://reviews.llvm.org/D30987

llvm-svn: 298056
2017-03-17 09:37:10 +00:00
Eli Friedman da228fee0c [ARM] Use alias analysis in ARMPreAllocLoadStoreOpt.
This allows the optimization to rearrange loads and stores more
aggressively. This doesn't really affect performance, but it helps
codesize.

Differential Revision: https://reviews.llvm.org/D30839

llvm-svn: 298021
2017-03-17 00:34:26 +00:00
Reid Kleckner 45707d4d5a Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.

In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).

Reviewed By: chandlerc

Differential Revision: https://reviews.llvm.org/D31052

llvm-svn: 298010
2017-03-16 22:59:15 +00:00
Matthias Braun e959544517 TargetInstrInfo: Provide default implementation of isTailCall().
In fact this default implementation should be the only implementation,
keep it virtual for now to accomodate targets that don't model flags
correctly.

Differential Revision: https://reviews.llvm.org/D30747

llvm-svn: 297980
2017-03-16 20:02:30 +00:00
Tim Northover 0d98b03b9f ARM: avoid clobbering register in v6 jump-table expansion.
If we got unlucky with register allocation and actual constpool placement, we
could end up producing a tTBB_JT with an index that's already been clobbered.

Technically, we might be able to fix this situation up with a MOV, but I think
the constant islands pass is complex enough without having to deal with more
weird edge-cases.

llvm-svn: 297871
2017-03-15 18:38:13 +00:00
Sanjay Patel fa929a2134 Cyle -> Cycle; NFCI
llvm-svn: 297846
2017-03-15 15:37:42 +00:00
Artyom Skrobov e72e1ba434 Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"
This reverts r297820 which apparently fails on A15 hosts.

llvm-svn: 297842
2017-03-15 14:50:43 +00:00
Sam Parker db20d48336 Reverting r297821 due to breaking lld test.
llvm-svn: 297838
2017-03-15 14:06:42 +00:00
Sam Parker 274472f7c5 [ARM] Fix for branch label disassembly for Thumb
Different MCInstrAnalysis classes for arm and thumb mode, each with
their own evaluateBranch implementation. I added a test case and
fixed the coff-relocations test to use '<label>:' rather than
'<label>' in the CHECK-LABEL entries, since the ones without the
colon would match branch targets. Might be worth noticing that
llvm-objdump does not lookup the relocation and thus assigns it a
target depending on the encoded immediate which #0, so it thinks it
branches to the next instruction.

Committed on behalf of Andre Vieira (avieira).

Differential Revision: https://reviews.llvm.org/D30943

llvm-svn: 297821
2017-03-15 10:21:23 +00:00
Artyom Skrobov 3fa5fd1dd2 [Thumb1] Fix the bug when adding/subtracting -2147483648
Differential Revision: https://reviews.llvm.org/D30829

llvm-svn: 297820
2017-03-15 10:19:16 +00:00
Sam Parker 654cb8263a [ARM] Enable SMLAL[B|T] isel
Enable the selection of the 64-bit signed multiply accumulate
instructions which operate on 16-bit operands. These are enabled for
ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb
architectures.

Differential Revision: https://reviews.llvm.org/D30044

llvm-svn: 297809
2017-03-15 08:27:11 +00:00
Daniel Sanders 8a4bae9993 [globalisel][tblgen] Add support for ComplexPatterns
Summary:
Adds a new kind of MachineOperand: MO_Placeholder.
This operand must not appear in the MIR and only exists as a way of
creating an 'uninitialized' operand until a matcher function overwrites it.

Depends on D30046, D29712

Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet

Reviewed By: qcolombet

Subscribers: dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D30089

llvm-svn: 297782
2017-03-14 21:32:08 +00:00
Evgeniy Stepanov 43dcf4d330 Fix asm printing of associated sections.
Make MCSectionELF::AssociatedSection be a link to a symbol, because
that's how it works in the assembly, and use it in the asm printer.

llvm-svn: 297769
2017-03-14 19:28:51 +00:00
Eli Friedman caea769f11 [ARM] Replace some C++ selection code with TableGen patterns. NFC.
Differential Revision: https://reviews.llvm.org/D30794

llvm-svn: 297768
2017-03-14 18:43:37 +00:00
Artyom Skrobov 2de256642b Fix typo in comment
llvm-svn: 297742
2017-03-14 14:13:19 +00:00
Oliver Stannard 6ee22c41f8 [ARM] Diagnose ARM MOVT without :lower16: or :upper16: expression
This instruction was missing from the list of opcodes that we check, so we were
hitting an llvm_unreachable in ARMMCCodeEmitter.cpp for the ARM MOVT
instruction, rather than the diagnostic that is emitted for the other MOVW/MOVT
instructions.

Differential revision: https://reviews.llvm.org/D30936

llvm-svn: 297739
2017-03-14 13:50:10 +00:00
Artyom Skrobov 283316b5c0 De-duplicate the two implementations of ARMBaseInstrInfo::isProfitableToIfCvt() [NFC]
Reviewers: congh, rengolin

Subscribers: aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D30934

llvm-svn: 297738
2017-03-14 13:38:45 +00:00
Sam Parker 916b1ba617 [ARM] Move SMULW[B|T] isel to DAG Combine
Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using tablegen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandBits. Added a couple of
extra tests.

Differential Revision: https://reviews.llvm.org/D30708

llvm-svn: 297716
2017-03-14 09:13:22 +00:00
Nirav Dave 54e22f33d9 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting with compiler time improvements

    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 297695
2017-03-14 00:34:14 +00:00
Artyom Skrobov bf19d4bc29 [Thumb1] combine ADDC/SUBC with a negative immediate
Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400

Reviewers: efriedma, jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30829

llvm-svn: 297682
2017-03-13 22:36:14 +00:00
Diana Picus 94db2e288b [ARM] GlobalISel: Support SP in regbankselect
We used to hit an unreachable in getRegBankFromRegClass when dealing with the
stack pointer. This commit adds support for the GPRsp reg class.

llvm-svn: 297621
2017-03-13 14:28:34 +00:00
Sjoerd Meijer aea3a990a2 ARMDisassembler: loop over ARM decode tables
Loop over the ARM decode tables; this is a clean-up to reduce some code
duplication.

Differential Revision: https://reviews.llvm.org/D30814

llvm-svn: 297608
2017-03-13 09:41:10 +00:00
Artyom Skrobov 94fb0bb65f imm_comp_XFORM (defined in ARMInstrThumb.td) duplicates imm_not_XFORM (defined in ARMInstrInfo.td)
Reviewers: grosbach, rengolin, jmolloy

Reviewed By: jmolloy

Subscribers: aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D30782

llvm-svn: 297456
2017-03-10 13:21:12 +00:00
Artyom Skrobov 0cc80c1f5a Refactor the multiply-accumulate combines to act on
ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE].

Summary:
This allows for some simplification because the combines
are no longer limited to just one go at the node before
it gets legalized into an ARM target-specific one.

Reviewers: jmolloy, rogfer01

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30401

llvm-svn: 297453
2017-03-10 12:41:33 +00:00
Artyom Skrobov 0c93ceb5d8 For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,
same as already done for ARM and Thumb2.

Reviewers: jmolloy, rogfer01, efriedma

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30400

llvm-svn: 297443
2017-03-10 07:40:27 +00:00
Sjoerd Meijer 7f1a982d3d [ARM] remove FIXMEs and add vcmp MC test
Minor cleanup in ARMInstrVFP.td: removed some FIXMEs and added a MC test for
vcmp that was actually missing.

Differential Revision: https://reviews.llvm.org/D30745

llvm-svn: 297376
2017-03-09 13:28:37 +00:00
John Brawn eba9fdac7e [ARM] Correct handling of LSL #0 in an IT block
The check for LSL #0 in an IT block was checking if operand 4 was zero, but
operand 4 is the condition code operand so it was actually checking for LSLEQ.
Fix this by checking operand 3, which really is the immediate operand, and add
some tests.

Differential Revision: https://reviews.llvm.org/D30692

llvm-svn: 297142
2017-03-07 14:42:03 +00:00
Ranjeet Singh 3d0af578cc [ARM] Reapply r296865 "[ARM] fpscr read/write intrinsics not aware of each other""
The original patch r296865 was reverted as it broke the chromium builds for
Android https://bugs.llvm.org/show_bug.cgi?id=32134, this patch reapplies
r296865 with a fix to make sure it doesn't cause the build regression.

The problem was that intrinsic selection on int_arm_get_fpscr was failing in
ISel this was because the code to manually select this intrinsic still thought
it was the version with no side-effects (INTRINSIC_WO_CHAIN) which is wrong as
it doesn't semantically match the definition in the tablegen code which says it
does have side-effects, I've fixed this by updating the intrinsic type to
INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on
Hans original reproducer.

Differential Revision: https://reviews.llvm.org/D30645

llvm-svn: 297137
2017-03-07 11:17:53 +00:00
Artyom Skrobov 1388e2f792 In Thumb1, materialize a move between low registers as a `movs`, if CPSR isn't live.
Summary: Previously, it had always been materialized as a push/pop sequence.

Reviewers: labrinea, jroelofs

Reviewed By: jroelofs

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30648

llvm-svn: 297134
2017-03-07 09:38:16 +00:00
Tim Northover c2c545b8f7 GlobalISel: restrict G_EXTRACT instruction to just one operand.
A bit more painful than G_INSERT because it was more widely used, but this
should simplify the handling of extract operations in most locations.

llvm-svn: 297100
2017-03-06 23:50:28 +00:00
Krzysztof Parzyszek cc31871dc4 Make TargetInstrInfo::isPredicable take a const reference, NFC
llvm-svn: 296901
2017-03-03 18:30:54 +00:00
Chandler Carruth ce52b80744 [SDAG] Revert r296476 (and r296486, r296668, r296690).
This patch causes compile times for some patterns to explode. I have
a (large, unreduced) test case that slows down by more than 20x and
several test cases slow down by 2x. I'm sending some of the test cases
directly to Nirav and following up with more details in the review log,
but this should unblock anyone else hitting this.

llvm-svn: 296862
2017-03-03 10:02:25 +00:00
Eli Friedman bb821276d0 [ARM] Fix insert point for store rescheduling.
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation
as LastOp.

This patch fixes some cases where we would move stores to the wrong
insert point.

Re-commit with a fix to increment NumMove in the right place.

Differential Revision: https://reviews.llvm.org/D30124

llvm-svn: 296815
2017-03-02 21:39:39 +00:00
Matthew Simpson aee9771ae2 [ARM/AArch64] Update costs for interleaved accesses with wide types
After r296750, we're able to match interleaved accesses having types wider than
128 bits. This patch updates the associated TTI costs.

Differential Revision: https://reviews.llvm.org/D29675

llvm-svn: 296751
2017-03-02 15:15:35 +00:00
Matthew Simpson 1bfa159db9 [ARM/AArch64] Support wide interleaved accesses
This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types
to interleaved access intrinsics as long as the types are multiples of the
vector register width. A "wide" access will now be mapped to multiple
interleave intrinsics similar to the way in which non-interleaved accesses with
illegal types are legalized into multiple accesses. I'll update the associated
TTI costs (in getInterleavedMemoryOpCost) as a follow-on.

Differential Revision: https://reviews.llvm.org/D29466

llvm-svn: 296750
2017-03-02 15:11:20 +00:00
Eli Friedman 933863ce61 Revert r296708; causing test failures on ARM hosts.
Original commit message:

[ARM] Fix insert point for store rescheduling.
    
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation as
LastOp.
    
This patch fixes some cases where we would sink stores for no reason.

llvm-svn: 296718
2017-03-02 00:08:50 +00:00
Eli Friedman 1c9216b003 [ARM] Fix insert point for store rescheduling.
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation as
LastOp.
    
This patch fixes some cases where we would sink stores for no reason.
    
Differential Revision: https://reviews.llvm.org/D30124

llvm-svn: 296708
2017-03-01 23:20:29 +00:00
Eli Friedman 28c2c0e311 [ARM] Check correct instructions for load/store rescheduling.
This code starts from the high end of the sorted vector of offsets, and
works backwards: it tries to find contiguous offsets, process them, then
pops them from the end of the vector. Most of the code agrees with this
order of processing, but one loop doesn't: it instead processes elements
from the low end of the vector (which are nodes with unrelated offsets).
Fix that loop to process the correct elements.
    
This has a few implications. One, we don't incorrectly return early when
processing multiple groups of offsets in the same block (which allows
rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert
point for loads, so they're correctly sorted (which affects the
scheduling of vldm-liveness.ll). I think it might also impact some of
the heuristics slightly.
    
Differential Revision: https://reviews.llvm.org/D30368

llvm-svn: 296701
2017-03-01 22:56:20 +00:00
Diana Picus 3841522259 clang-format r296631
Apparently I forgot to run it after fixing up some things...

llvm-svn: 296634
2017-03-01 15:54:21 +00:00
Diana Picus 9c52309b37 [ARM] GlobalISel: Lower call params that need extensions
Lower i1, i8 and i16 call parameters by extending them before storing them on
the stack. Also make sure we encode the correct, extended size in the
corresponding memory operand, and that we compute the correct stack size in the
end.

The latter is a bit more complicated because we used to compute the stack size
in the getStackAddress method, based on the Size and Offset of the parameters.
However, if the last parameter is sign extended, we'd be using the wrong,
non-extended size, and we'd end up with a smaller stack than we need to hold the
extended value. Instead of hacking this up based on the value of Size in
getStackAddress, we move our stack size handling logic to assignArg, where we
have access to the CCState which knows everything we could possibly want to know
about the stack. This way we don't need to duplicate any knowledge or resort to
any ugly hacks.

On this same occasion, update the IRTranslator test to check the sizes of the
stores everywhere, not just for sign extended paramteres.

llvm-svn: 296631
2017-03-01 15:35:14 +00:00
Oliver Stannard 5d35b9e56c [ARM] Fix parsing of special register masks
This parsing code was incorrectly checking for invalid characters, so an
invalid instruction like:
  msr spsr_w, r0
would be emitted as:
  msr spsr_cxsf, r0

Differential revision: https://reviews.llvm.org/D30462

llvm-svn: 296607
2017-03-01 10:51:04 +00:00
Eli Friedman 36795239f5 [ARM] Don't generate deprecated T1 STM.
This prevents generating stm r1!, {r0, r1} on Thumb1, where value
stored for r1 is UNKONWN.

Patch by Zhaoshi Zheng.

Differential Revision: https://reviews.llvm.org/D27910

llvm-svn: 296538
2017-02-28 23:32:55 +00:00
Nirav Dave f830dec3f2 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 296476
2017-02-28 14:24:15 +00:00
Diana Picus 1ffca2aeaf [ARM] GlobalISel: Lower i32 and fp call parameters on the stack
Lower i32, float and double parameters that need to live on the stack. This
boils down to creating some G_GEPs starting from the stack pointer and storing
the values there. During the process we also keep track of the stack size and
use the final value in the ADJCALLSTACKDOWN/UP instructions.

We currently assert for smaller types, since they usually require extensions.
They will be handled in a separate patch.

llvm-svn: 296473
2017-02-28 14:17:53 +00:00
Diana Picus 5a7203a0af [ARM] GlobalISel: Select 32-bit G_CONSTANT
Put it into a register by means of a MOVi.

llvm-svn: 296471
2017-02-28 13:05:42 +00:00
Diana Picus 5b8514559e [ARM] GlobalISel: Add mapping for G_CONSTANT
Like G_FRAME_INDEX, G_CONSTANT has one register operand and one non-register
operand.

llvm-svn: 296469
2017-02-28 12:13:58 +00:00
Diana Picus e6beac6742 [ARM] GlobalISel: Legalize 32-bit constants
llvm-svn: 296468
2017-02-28 11:33:46 +00:00
Diana Picus 9d07094913 [ARM] GlobalISel: Select G_GEP
At this point, G_GEP is just an add, so we treat it exactly like a G_ADD.

llvm-svn: 296462
2017-02-28 10:14:38 +00:00
Oliver Stannard 85d4d5b493 [ARM] Diagnose PC-writing instructions in IT blocks
In Thumb2, instructions which write to the PC are UNPREDICTABLE if they are in
an IT block but not the last instruction in the block.

Previously, we only diagnosed this for LDM instructions, this patch extends the
diagnostic to cover all of the relevant instructions.

Differential Revision: https://reviews.llvm.org/D30398

llvm-svn: 296459
2017-02-28 10:04:36 +00:00
Diana Picus 566a15d749 [ARM] GlobalISel: Add reg bank mapping for G_GEP
This should be the same as the mapping for G_ADD etc.

llvm-svn: 296455
2017-02-28 09:35:10 +00:00
Diana Picus 8598b17076 [ARM] GlobalISel: Legalize G_GEP with 32-bit offsets
At the moment we're only interested in GEPs for putting call parameters on the
stack, so we'll stick to 32-bit offsets.

llvm-svn: 296452
2017-02-28 09:02:42 +00:00
Sanjay Patel ae7873fe55 [ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition
The transform in question claims to be doing:

// fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c))

...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node
for the sext/zext patterns.

This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(),
so I was seeing infinite loops with my draft of a patch applied.

The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll
is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's 
not affected by this code change.

Differential Revision:
https://reviews.llvm.org/D30355

llvm-svn: 296389
2017-02-27 21:30:54 +00:00
John Brawn c97b714ffb [ARM] LSL #0 is an alias of MOV
Currently we handle this correctly in arm, but in thumb we don't which leads to
an unpredictable instruction being emitted for LSL #0 in an IT block and SP not
being permitted in some cases when it should be.

For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the
.td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to
get the IT handling right. We also need to adjust the handling of
MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We
should also adjust it to allow SP in the same way that it is allowed in
MOV rd, rn, but I haven't done that here because it looks like it would take
quite a lot of work to get right.

Additionally correct the selection of the 16-bit shift instructions in
processInstruction, where it was checking if the two registers were equal when
it should have been checking if they were low. It appears that previously this
code was never executed and the 16-bit encoding was selected by default, but
the other changes I've done here have somehow made it start being used.

Differential Revision: https://reviews.llvm.org/D30294

llvm-svn: 296342
2017-02-27 14:40:51 +00:00
Nirav Dave 73cd0194cf Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r296252 until 256-bit operations are more efficiently generated in X86.

llvm-svn: 296279
2017-02-26 01:27:32 +00:00
Nirav Dave beabf456df In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 296252
2017-02-25 11:43:58 +00:00
Diana Picus 3b99c64ba1 [ARM] GlobalISel: Select G_STORE
Same as selecting G_LOAD.

llvm-svn: 296122
2017-02-24 14:01:27 +00:00
Diana Picus 1f432f995a [ARM] GlobalISel: Add reg bank mappings for stores
Same as the ones for loads.

llvm-svn: 296115
2017-02-24 13:07:25 +00:00
Diana Picus a2b632a353 [ARM] GlobalISel: Legalize stores
Allow the same types that we allow for loads.

llvm-svn: 296108
2017-02-24 11:28:24 +00:00
Diana Picus c21d1e5d94 Revert "[ARM] GlobalISel: Legalize stores"
This reverts commit r296103 because the test broke on one of the bots. Sorry!

llvm-svn: 296104
2017-02-24 10:35:39 +00:00
Diana Picus a5f1cfd1a7 [ARM] GlobalISel: Legalize stores
Allow the same types that we allow for loads.

llvm-svn: 296103
2017-02-24 10:19:23 +00:00
Tim Northover 063a56e81c ARM: make sure FastISel bails on f64 operations for Cortex-M4.
FastISel wasn't checking the isFPOnlySP subtarget feature before emitting
double-precision operations, so it got completely invalid CodeGen for doubles
on Cortex-M4F.

The normal ISel testing wasn't spectacular either so I added a second RUN line
to improve that while I was in the area.

llvm-svn: 296031
2017-02-23 22:35:00 +00:00
Diana Picus a8cb0cd8f2 [ARM] GlobalISel: Lower call returns
Introduce a common ValueHandler for call returns and formal arguments, and
inherit two different versions for handling the differences (at the moment the
only difference is the way physical registers are marked as used).

llvm-svn: 295973
2017-02-23 14:18:41 +00:00
Diana Picus a606713c33 [ARM] GlobalISel: Lower call parameters in regs
Add support for lowering calls with parameters than can fit into regs.  Use the
same ValueHandler that we used for function returns, but rename it to match its
new, extended purpose.

llvm-svn: 295971
2017-02-23 13:25:43 +00:00
Kristof Beyls 5ac6adbb6d Fix assertion failure in ARMConstantIslandPass.
The ARMConstantIslandPass didn't have support for handling accesses to
constant island objects through ARM::t2LDRBpci instructions. This adds
support for that.

This fixes PR31997.

llvm-svn: 295964
2017-02-23 12:24:55 +00:00
Roger Ferrer Ibanez 56db97d4de [ARM] Fix constant islands pass.
The pass tries to fix a spill of LR that turns out to be unnecessary.
So it removes the tPOP but forgets to remove tPUSH.
This causes the stack be misaligned upon returning the function.

Thus, remove the tPUSH as well in this case.

Differential Revision: https://reviews.llvm.org/D30207

llvm-svn: 295816
2017-02-22 09:06:21 +00:00
Javed Absar b672722810 [ARM] Classification Improvements to ARM Sched-Models. NFCI.
This patch adds missing sched classes for Thumb2 instructions.
This has been missing so far, and as a consequence, machine
scheduler models for individual sub-targets have tended to
be larger than they needed to be. These patches should help
write schedulers better and faster in the future
for ARM sub-targets.

Reviewer: Diana Picus
Differential Revision: https://reviews.llvm.org/D29953

llvm-svn: 295811
2017-02-22 07:22:57 +00:00
Dan Gohman 18eafb6c68 [WebAssembly] Add skeleton MC support for the Wasm container format
This just adds the basic skeleton for supporting a new object file format.
All of the actual encoding will be implemented in followup patches.

Differential Revision: https://reviews.llvm.org/D26722

llvm-svn: 295803
2017-02-22 01:23:18 +00:00
Evgeniy Stepanov 1fd19c6e5d Fix PR31896.
Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset).

llvm-svn: 295762
2017-02-21 20:17:34 +00:00
John Brawn a6e95e1652 [ARM] Correct SP/PC handling in t2MOVr
PC isn't allowed in the source operand of t2MOVr, so change the register class
to one without PC. SP handling is slightly trickier and changes depending on if
we're in ARMv8, so do that in checkTargetMatchPredicate.

Differential Revision: https://reviews.llvm.org/D30199

llvm-svn: 295732
2017-02-21 16:41:29 +00:00
Diana Picus 613b65696a [ARM] GlobalISel: Lower calls to void() functions
For now, we hardcode a BLX instruction, and generate an ADJCALLSTACKDOWN/UP pair
with amount 0.

llvm-svn: 295716
2017-02-21 11:33:59 +00:00
Diana Picus 1c33c9f0b0 [ARM] GlobalISel: Don't select atomic loads
There used to be a check in the IRTranslator that prevented us from having to
deal with atomic loads/stores. That check has been removed in r294993 and the
AArch64 backend was updated accordingly. This commit does the same thing for the
ARM backend.

In general, in the ARM backend we introduce fences during the atomic expand
pass, so we don't have to worry about atomics, *except* for the 32-bit ARMv8
target, which handles atomics more like AArch64. Since we don't want to worry
about that yet, just bail out of instruction selection if we find any atomic
loads.

llvm-svn: 295662
2017-02-20 14:45:58 +00:00
Artyom Skrobov 4592f6206c In Thumb1 mode, the custom lowering for ARMISD::CMPZ could never emit tADDi3
Reviewers: jmolloy, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D30097

llvm-svn: 295478
2017-02-17 18:59:16 +00:00
Sam Parker 58af0c55d2 [ARM] Replace HasT2ExtractPack with HasDSP
Removed the HasT2ExtractPack feature and replaced its references
with HasDSP. This then allows the Thumb2 extend instructions to be
selected for ARMv8M +dsp. These instruction descriptions have also
been refactored and more target tests have been added for their isel.

Differential Revision: https://reviews.llvm.org/D29623

llvm-svn: 295452
2017-02-17 15:42:44 +00:00
Diana Picus e836878bf1 [ARM] GlobalISel: Clean up some helpers
Return invalid opcodes when some of the helpers in the instruction selection
pass can't handle a given combination.

llvm-svn: 295446
2017-02-17 13:44:19 +00:00
Diana Picus 38699dbac5 [ARM] GlobalISel: Check mappings used by reg bank select
Add some asserts to make sure we're using the mappings that we think we're
using. This is to keep us from accidentally breaking functionality while moving
to TableGen'erated mappings.

llvm-svn: 295441
2017-02-17 13:14:25 +00:00
Diana Picus 7cab0786bd [ARM] GlobalISel: Use Subtarget in Legalizer
Start using the Subtarget to make decisions about what's legal. In particular,
we only mark floating point operations as legal if we have VFP2, which is
something we should've done from the very start.

llvm-svn: 295439
2017-02-17 11:25:17 +00:00
Diana Picus 1540b06ef8 [ARM] GlobalISel: Select floating point loads
llvm-svn: 295321
2017-02-16 14:10:50 +00:00
Diana Picus b1701e0b05 [ARM] GlobalISel: Select G_SEQUENCE and G_EXTRACT
Since they're only used for passing around double precision floating point
values into the general purpose registers, we'll lower them to VMOVDRR and
VMOVRRD.

llvm-svn: 295310
2017-02-16 12:19:57 +00:00
Diana Picus 6beef3c087 [ARM] GlobalISel: Select double G_FADD and copies
Just use VADDD if available, bail out if not.

llvm-svn: 295309
2017-02-16 12:19:52 +00:00
Diana Picus 9b32faa821 [ARM] GlobalISel: Assert that we don't use the FPR bank if we don't have VFP
llvm-svn: 295308
2017-02-16 11:25:09 +00:00
Diana Picus a93803b9fe [ARM] GlobalISel: Add reg bank mappings for G_SEQUENCE and G_EXTRACT
Support G_SEQUENCE and G_EXTRACT as needed for passing double precision floating
point values in the soft-fp float mode.

llvm-svn: 295306
2017-02-16 11:00:31 +00:00
Diana Picus 7f82c87022 [ARM] GlobalISel: Make the FPR bank 64-bit wide
Also add mappings for single and double precision FP, and use them for G_FADD
and G_LOAD.

llvm-svn: 295302
2017-02-16 10:12:49 +00:00
Diana Picus 21c3d8e0fc [ARM] GlobalISel: Legalize 64-bit G_FADD and G_LOAD
For now we just mark them as legal all the time and let the other passes bail
out if they can't handle it. In the future, we'll want to move more of the
brains into the legalizer.

llvm-svn: 295300
2017-02-16 09:09:49 +00:00
Diana Picus ca6a890d7f [ARM] GlobalISel: Lower double precision FP args
For the hard float calling convention, we just use the D registers.

For the soft-fp calling convention, we use the R registers and move values
to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we
make sure to honor the endianness of the target, since the CCAssignFn doesn't do
that for us.

For pure soft float targets, we still bail out because we don't support the
libcalls yet.

llvm-svn: 295295
2017-02-16 07:53:07 +00:00
James Molloy 0ae2202235 [ARM] Fix crash caused by r294945
I'd missed a creator of FCMP nodes - duplicateCmp().

Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite.

llvm-svn: 294968
2017-02-13 17:18:00 +00:00
Sanne Wouda 490d4a6da6 [CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base
Summary:
The attached test case fails with "fatal error: error in backend:
misaligned pc-relative fixup value" as the jump table is misaligned.
The EmitAlignment existed already for ARM and Thumb-1 code, but was
missing for Thumb-2.

The test checks that the fatal error disappears when generating an obj
file, as well as checking the align directive is there when producing an
asm file.


Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker

Reviewed By: samparker

Subscribers: samparker, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D29650

llvm-svn: 294950
2017-02-13 14:07:45 +00:00
James Molloy 92497542e7 [Thumb-1] TBB generation: spot redefinitions of index register
We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that
a particular register in that sequence is killed (so it can be clobbered by the pseudo).

We weren't noticing if an errant MOV or other instruction had infiltrated the
sequence we were walking. If it had, and it defined the register we've already
identified as killed, it makes it live across the tBR_JT and thus unclobberable.

Notice this case and bail out.

llvm-svn: 294949
2017-02-13 14:07:39 +00:00
James Molloy 9b3b899669 [ARM] Register ConstantIslands with the pass manager
This allows us to use -stop-before/-stop-after/-run-pass - we can now write
.mir tests.

llvm-svn: 294948
2017-02-13 14:07:25 +00:00
James Molloy d508789668 [ARM] Use VCMP, not VCMPE, for floating point equality comparisons
When generating a floating point comparison we currently unconditionally
generate VCMPE. This has the sideeffect of setting the cumulative Invalid
bit in FPSCR if any of the operands are QNaN.

It is expected that use of a relational predicate on a QNaN value should
raise Invalid. Quoting from the C standard:

  The relational and equality operators support the usual mathematical
  relationships between numeric values. For any ordered pair of numeric
  values exactly one of relationships the less, greater, equal and is true.
  Relational operators may raise the floating-point exception when argument
  values are NaNs.

The standard doesn't explicitly state the expectation for equality operators,
but the implication and obvious expectation is that equality operators
should not raise Invalid on a QNaN input, as those predicates are wholly
defined on unordered inputs (to return not equal).

Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if
QNaN should raise Invalid, and pipe that through to TableGen.

llvm-svn: 294945
2017-02-13 12:32:47 +00:00
Ahmed Bougacha 8425f453ef [ARM] Make f16 interleaved accesses expensive.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Teach the cost model to consider f16 interleaved operations as
expensive.  Otherwise, we are all but guaranteed to end up with
a large block of scalarized vector code.

llvm-svn: 294819
2017-02-11 01:53:04 +00:00
Ahmed Bougacha fc979dc9dd [ARM] Don't lower f16 interleaved accesses.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Reject f16 interleaved accesses.  If we try to emit the f16 intrinsics,
we'll just end up with a selection failure.

llvm-svn: 294818
2017-02-11 01:53:00 +00:00
John Brawn e60f4e4b8d [ARM] Fix incorrect mask bits in MSR encoding for write_register intrinsic
In the encoding of system registers in the M-class MSR instruction the mask bits
should be 2 for registers that don't take a _<bits> qualifier (the instruction
is unpredictable otherwise), and should also be 2 if the register takes a
_<bits> qualifier but it's not present as no _<bits> is an alias for _nzcvq.

Differential Revision: https://reviews.llvm.org/D29828

llvm-svn: 294762
2017-02-10 17:41:08 +00:00
Matthias Braun 2bef2a08c0 Fix syntax error
llvm-svn: 294678
2017-02-10 00:09:20 +00:00
Matthias Braun 62e1e8531b ARMSubtarget.h: Change to one line per enum element; NFC
Change syntax to have enum elements sorted alphabetically and one per
line as that is more merge/cherry pick friendly.

llvm-svn: 294677
2017-02-10 00:06:44 +00:00
George Burgess IV ccf11c2f9f [ARM] Add support for armv7ve triple in llvm (PR31358).
Gcc supports target armv7ve which is armv7-a with virtualization
extensions. This change adds support for this in llvm for gcc
compatibility.

Also remove redundant FeatureHWDiv, FeatureHWDivARM for a few models as
this is specified automatically by FeatureVirtualization.

Patch by Manoj Gupta.

Differential Revision: https://reviews.llvm.org/D29472

llvm-svn: 294661
2017-02-09 23:29:14 +00:00
Diana Picus 7232af352f [ARM] GlobalISel: Lower single precision FP args
Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets
because we don't support libcalls yet.

llvm-svn: 294584
2017-02-09 13:09:59 +00:00
Arnold Schwaighofer 26f016f143 SwiftCC: swifterror register cannot be as the base register
Functions that have a dynamic alloca require a base register which is defined to
be X19 on AArch64 and r6 on ARM.  We have defined the swifterror register to be
the same register. Use a different callee save register for swifterror instead:

 X21 on AArch64
 R8 on ARM

rdar://30433803

llvm-svn: 294551
2017-02-09 01:52:17 +00:00
Arnold Schwaighofer db7bbcbe78 [ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns'
We mark X0 as preserved by a call that passes the returned parameter.

 x0 = ...
 fun(x0) // no implicit def of x0

This no longer is valid if we pass the parameter in a different register then
the returned value as is the case with a swiftself parameter (passed in x20).

x20 = ...
fun(x20) // there should be an implict def of x8

rdar://30425845

llvm-svn: 294527
2017-02-08 22:30:47 +00:00
Eugene Zelenko 41e7e34c49 [ARM] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MC headers dependencies.

llvm-svn: 294525
2017-02-08 22:19:56 +00:00
Diana Picus 4fa83c03fd [ARM] GlobalISel: Add FPR reg bank
Add a register bank for floating point values and select simple instructions
using them (add, copies from GPR).

This assumes that the hardware can cope with a single precision add (VADDS)
instruction, so the legalizer will treat G_FADD as legal and the instruction
selector will refuse to select if the hardware doesn't support it. In the future
we'll want to be more careful about this, and legalize to libcalls if we have to
use soft float.

llvm-svn: 294442
2017-02-08 13:23:04 +00:00
Christof Douma d3ed8380e0 [ARM] Make RWPI use movw/movt when available
When constructing global address literals while targeting the RWPI
relocation model. LLVM currently only uses literal pools. If MOVW/MOVT
instructions are available we can use these instead. Beside being more
efficient it allows -arm-execute-only to work with
-relocation-model=RWPI as well.

When we generate MOVW/MOVT for global addresses when targeting the RWPI
relocation model, we need to use base relative relocations. This patch
does the needed plumbing in MC to generate these for MOVW/MOVT.

Differential Revision: https://reviews.llvm.org/D29487

Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d
llvm-svn: 294298
2017-02-07 13:07:12 +00:00
Daniel Sanders c3ac566754 [globalisel][arm] Tablegen-erate current Register Bank Information.
Summary:
This patch tablegen-erates the ARM register bank information so that the
static tables added in D27807 no longer need to be maintained.

Depends on D27338

Reviewers: t.p.northover, rovka, ab, qcolombet, aditya_nandakumar

Reviewed By: rovka

Subscribers: aemerson, rengolin, mgorny, dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D28567

llvm-svn: 294124
2017-02-05 12:07:55 +00:00
Eugene Zelenko 07dc38f67a [ARM] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294052
2017-02-03 21:48:12 +00:00
Sanne Wouda a994185757 [ARM] Change TCReturn to tBL if tailcall optimization fails.
Summary:
The tail call optimisation is performed before register allocation, so
at that point we don't know if LR is being spilt or not. If LR was spilt
to the stack, then we cannot do a tail call optimisation. That would
involve popping back into LR which is not possible in Thumb1 code.

Reviewers: rengolin, jmolloy, rovka, olista01

Reviewed By: olista01

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D29020

llvm-svn: 294000
2017-02-03 11:15:53 +00:00
Reid Kleckner c35139ec0d [CodeGen] Remove dead call-or-prologue enum from CCState
This enum has been dead since Olivier Stannard re-implemented ARM byval
handling in r202985 (2014).

llvm-svn: 293943
2017-02-02 21:58:22 +00:00
Rafael Espindola 13a79bbfe5 Change how we handle section symbols on ELF.
On ELF every section can have a corresponding section symbol. When in
an assembly file we have

.quad .text

the '.text' refers to that symbol.

The way we used to handle them is to leave .text an undefined symbol
until the very end when the object writer would map them to the
actual section symbol.

The problem with that is that anything before the end would see an
undefined symbol. This could result in bad diagnostics
(test/MC/AArch64/label-arithmetic-diags-elf.s), or incorrect results
when using the asm streamer (est/MC/Mips/expansion-jal-sym-pic.s).

Fixing this will also allow using the section symbol earlier for
setting sh_link of SHF_METADATA sections.

This patch includes a few hacks to avoid changing our behaviour when
handling conflicts between section symbols and other symbols. I
reported pr31850 to track that.

llvm-svn: 293936
2017-02-02 21:26:06 +00:00
Javed Absar bb8dcc6aec [ARM] Classification Improvements to ARM Sched-Model. NFCI.
This is the second in the series of patches to enable adding
of machine sched-models for ARM processors easier and compact.
This patch focuses on integer instructions and adds missing
sched definitions.

Reviewers: rovka, rengolin
Differential Revision: https://reviews.llvm.org/D29127

llvm-svn: 293935
2017-02-02 21:08:12 +00:00
Nirav Dave 93f9d5ce04 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r293893 which is miscompiling lua on ARM and
bootstrapping for x86-windows.

llvm-svn: 293915
2017-02-02 18:24:55 +00:00
Nirav Dave 4442667fc5 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting after fixing X86 inc/dec chain bug.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 293893
2017-02-02 14:39:42 +00:00
Diana Picus 32cd9b434c [ARM] GlobalISel: Lower pointer args and returns
It is important to change the ArgInfo's type from pointer to integer, otherwise
the CC assign function won't know what to do. Instead of hacking it up, we use
ComputeValueVTs and introduce some of the helpers that we will need later on for
lowering more complex types.

llvm-svn: 293889
2017-02-02 14:01:00 +00:00
Diana Picus 0c11c7b5c7 [ARM] GlobalISel: Error out instead of asserting
Allow unknown types in TLI.getValueType, otherwise we get asserts for certain
types that we do not support yet (instead of returning that we don't support
them and falling through the normal error path).

llvm-svn: 293888
2017-02-02 14:00:54 +00:00
Diana Picus fc19a8ff07 [ARM] GlobalISel: Legalize loading pointers
Make it legal to load pointer values. Also check that pointers are assigned
to the GPR reg bank by default.

llvm-svn: 293886
2017-02-02 13:20:49 +00:00
Matthew Simpson ba5cf9dfee [LV] Move interleaved access helper functions to VectorUtils (NFC)
This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.

Differential Revision: https://reviews.llvm.org/D29398

llvm-svn: 293788
2017-02-01 17:45:46 +00:00
Javed Absar e5ad87e939 [ARM] Enable Cortex-M23 and Cortex-M33 support.
Add both cores to the target parser and TableGen. Test that eabi
attributes are set correctly for both cores. Additionally, test the
absence and presence of MOVT in Cortex-M23 and Cortex-M33, respectively.

Committed on behalf of Sanne Wouda.
Reviewers : rengolin, olista01.

Differential Revision: https://reviews.llvm.org/D29073

llvm-svn: 293761
2017-02-01 11:55:03 +00:00
Sam Parker 9bf658d5fe [ARM] Avoid using ARM instructions in Thumb mode
The Requires class overrides the target requirements of an instruction,
rather than adding to them, so all ARM instructions need to include the
IsARM predicate when they have overwitten requirements.

This caused the swp and swpb instructions to be allowed in thumb mode
assembly, and the ARM encoding of CDP to be selected in codegen (which
is different for conditional instructions).

Differential Revision: https://reviews.llvm.org/D29283

llvm-svn: 293634
2017-01-31 14:35:01 +00:00
Eugene Zelenko 342257ea92 [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293578
2017-01-31 00:56:17 +00:00
Saleem Abdulrasool 5282eed06c ARM: support `-mlong-calls` with AEABI TLS on ELF
Support lowering AEABI TLS access (__aeabi_read_tp) with long calls.
This requires adjusting the call sequence to use an indirect call to get
full addressability.

Resolves PR31769!

llvm-svn: 293433
2017-01-29 16:46:22 +00:00
Matthias Braun 8c209aa877 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Eugene Zelenko e79c077ef9 [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293348
2017-01-27 23:58:02 +00:00
Saleem Abdulrasool 26c00e3700 ARM: fix vectorized division on WoA
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks.  However, it is
designed to only handle non-vectorized division.  ARM has custom
lowering for vectorized division as that can avoid loading registers
with the values and invoke a division routine for each one, preferring
to lower using NEON instructions.  Fall back to the custom lowering for
the NEON instructions if we encounter a vectorized division.

Resolves PR31778!

llvm-svn: 293259
2017-01-27 03:41:53 +00:00
Quentin Colombet 89dbea06f1 [ARM][LegalizerInfo] Specify the type of the opcode.
This is to fix the win7 bot that does not seem to be very
good at infering the type when it gets used in an initiliazer list.

llvm-svn: 293248
2017-01-27 01:30:46 +00:00
Eugene Zelenko e6cf4374b0 [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293229
2017-01-26 23:40:06 +00:00
Nirav Dave d32a421f75 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r293184 which is failing in LTO builds

llvm-svn: 293188
2017-01-26 16:46:13 +00:00
Serge Rogatch e09ba748cf [XRay][Arm32] Reduce the portion of the stub and implement more staging for tail calls - in LLVM
Summary:
This patch provides more staging for tail calls in XRay Arm32 . When the logging part of XRay is ready for tail calls, its support in the core part of XRay Arm32 may be as easy as changing the number passed to the handler from 1 to 2.
Coupled patch:
- https://reviews.llvm.org/D28674

Reviewers: dberris, rengolin

Reviewed By: dberris

Subscribers: llvm-commits, iid_iunknown, aemerson, rengolin, dberris

Differential Revision: https://reviews.llvm.org/D28673

llvm-svn: 293185
2017-01-26 16:17:03 +00:00
Nirav Dave de6516c466 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
* Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 293184
2017-01-26 16:02:24 +00:00
Diana Picus 278c722e6d [ARM] GlobalISel: Load i1, i8 and i16 args from stack
Add support for loading i1, i8 and i16 arguments from the stack, with or without
the ABI extension flags.

When the ABI extension flags are present, we load a 4-byte value, otherwise we
preserve the size of the load and let the instruction selector replace it with a
LDRB/LDRH. This generates the same thing as DAGISel.

Differential Revision: https://reviews.llvm.org/D27803

llvm-svn: 293163
2017-01-26 09:20:47 +00:00
Jonas Paulsson 8e2f948ef0 [TargetTransformInfo] Refactor and improve getScalarizationOverhead()
Refactoring to remove duplications of this method.

New method getOperandsScalarizationOverhead() that looks at the present unique
operands and add extract costs for them. Old behaviour was to just add extract
costs for one operand of the type always, which still happens in
getArithmeticInstrCost() if no operands are provided by the caller.

This is a good start of improving on this, but there are more places
that can be improved by using getOperandsScalarizationOverhead().

Review: Hal Finkel
https://reviews.llvm.org/D29017

llvm-svn: 293155
2017-01-26 07:03:25 +00:00
Martin Bohme 8396e14e7f [ARM] GlobalISel: Fix stack-use-after-scope bug.
Summary:
Lifetime extension wasn't triggered on the result of BuildMI because the
reference was non-const. However, instead of adding a const, I've
removed the reference entirely as RVO should kick in anyway.

Reviewers: rovka, bkramer

Reviewed By: bkramer

Subscribers: aemerson, rengolin, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D29124

llvm-svn: 293059
2017-01-25 14:28:19 +00:00
Diana Picus d83df5d372 [ARM] GlobalISel: Support i1 add and ABI extensions
Add support for:
* i1 add
* i1 function arguments, if passed through registers
* i1 returns, with ABI signext/zeroext

Differential Revision: https://reviews.llvm.org/D27706

llvm-svn: 293035
2017-01-25 08:47:40 +00:00
Diana Picus 8b6c6bedcb [ARM] GlobalISel: Support i8/i16 ABI extensions
At the moment, this means supporting the signext/zeroext attribute on the return
type of the function. For function arguments, signext/zeroext should be handled
by the caller, so there's nothing for us to do until we start lowering calls.

Note that this does not include support for other extensions (i8 to i16), those
will be added later.

Differential Revision: https://reviews.llvm.org/D27705

llvm-svn: 293034
2017-01-25 08:10:40 +00:00
Diana Picus 1d8eaf4387 [ARM] GlobalISel: Bail out on Thumb. NFC
Thumb is not supported yet, so bail out early.

llvm-svn: 293029
2017-01-25 07:08:53 +00:00
Javed Absar 00cce41752 [ARM] Classification Improvements to ARM Sched-Models. NFCI.
This is a series of patches to enable adding of machine sched
models for ARM processors easier and compact. They define new
sched-readwrites for groups of ARM instructions. This has been
missing so far, and as a consequence, machine scheduler models
for individual sub-targets have tended to be larger than they
needed to be. 

The current patch focuses on floating-point instructions.

Reviewers: Diana Picus (rovka), Renato Golin (rengolin)

Differential Revision: https://reviews.llvm.org/D28194

llvm-svn: 292825
2017-01-23 20:20:39 +00:00
Matthias Braun 856548a616 ARM: tLDR_postidx should be marked mayLoad
This fixes -verify-machineinstrs complaints.

llvm-svn: 292629
2017-01-20 18:30:28 +00:00
Sjoerd Meijer 2db2a947f6 [Thumb] Add support for tMUL in the compare instruction peephole optimizer.
We also want to optimise tests like this: return a*b == 0.  The MULS
instruction is flag setting, so we don't need the CMP instruction but can
instead branch on the result of the MULS. The generated instructions sequence
for this example was: MULS, MOVS, MOVS, CMP. The MOVS instruction load the
boolean values resulting from the select instruction, but these MOVS
instructions are flag setting and were thus preventing this optimisation. Now
we first reorder and move the MULS to before the CMP and generate sequence
MOVS, MOVS, MULS, CMP so that the optimisation could trigger. Reordering of the
MULS and MOVS is safe to do because the subsequent MOVS instructions just set
the CPSR register and don't use it, i.e. the CPSR is dead.

Differential Revision: https://reviews.llvm.org/D27990

llvm-svn: 292608
2017-01-20 13:10:12 +00:00
Diana Picus bd66b7dc87 [ARM] Use helpers for adding pred / CC operands. NFC
Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0)
and replace with add(condCodeOp()) and add(predOps()). This should make it
easier to understand what those operands represent (without having to look at
the definition of the instruction that we're adding to).

Differential Revision: https://reviews.llvm.org/D27984

llvm-svn: 292587
2017-01-20 08:15:24 +00:00
Serge Rogatch f83d2a25bf [XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier
Summary:
Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future.
This patch is one of a series, see also
- https://reviews.llvm.org/D28623

Reviewers: rengolin, dberris

Reviewed By: dberris

Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown

Differential Revision: https://reviews.llvm.org/D28624

llvm-svn: 292516
2017-01-19 20:24:23 +00:00
Daniel Sanders d64d5024a4 Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292478
2017-01-19 11:15:55 +00:00
Chad Rosier 771db6f895 [Assembler] Fix crash when assembling .quad for AArch32.
A 64-bit relocation does not exist in 32-bit ARMELF. Report an error
instead of crashing.

PR23870
Patch by Sanne Wouda (sanwou01).
Differential Revision: https://reviews.llvm.org/D28851

llvm-svn: 292373
2017-01-18 15:02:54 +00:00
Florian Hahn 8485cecd3f [thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.
Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.

This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484


Reviewers: rnk, bkramer, olista01

Reviewed By: olista01

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D28829

llvm-svn: 292372
2017-01-18 15:01:22 +00:00
Daniel Sanders af76f989b5 Re-revert: [globalisel] Tablegen-erate current Register Bank Information
More missing guards. My build didn't notice it due to a stale file left over
from a Global ISel build.

llvm-svn: 292369
2017-01-18 14:26:12 +00:00
Daniel Sanders 517b61cb69 Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292367
2017-01-18 14:17:50 +00:00
Sam Parker df7c6ef96f [ARM] Create objdump subtarget from build attrs
Enable an ELFObjectFile to read the its arm build attributes to
produce a target triple with a specific ARM architecture.
llvm-objdump now uses this functionality to automatically produce
a more accurate target.

Differential Revision: https://reviews.llvm.org/D28769

llvm-svn: 292366
2017-01-18 13:52:12 +00:00
Renato Golin 03c5e69d07 Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier"
This reverts commit r292210, as it broke the Thumb buldbot with:

clang-5.0: error: the clang compiler does not support '-fxray-instrument
on thumbv7-unknown-linux-gnueabihf'.

llvm-svn: 292357
2017-01-18 09:08:43 +00:00
Tim Northover d943354216 GlobalISel: correctly handle varargs
Some platforms (notably iOS) use a different calling convention for unnamed vs
named parameters in varargs functions, so we need to keep track of this
information when translating calls.

Since not many platforms are involved, the guts of the special handling is in
the ValueHandler class (with a generic implementation that should work for most
targets).

llvm-svn: 292283
2017-01-17 22:30:10 +00:00
Serge Rogatch 50be6b45a9 [XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier
Summary:
Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future.
This patch is one of a series, see also
- https://reviews.llvm.org/D28623

Reviewers: rengolin, dberris

Reviewed By: dberris

Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown

Differential Revision: https://reviews.llvm.org/D28624

llvm-svn: 292210
2017-01-17 11:52:10 +00:00
Daniel Sanders a83a1a69c5 Revert r292132: [globalisel] Tablegen-erate current Register Bank Information'...
Several buildbots encountered a crash in tablegen when building this commit.
Reverting while I investigate the cause.

llvm-svn: 292136
2017-01-16 15:34:43 +00:00
Daniel Sanders ab8194def0 [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292132
2017-01-16 15:20:43 +00:00
Ivan Krasin 1ed7896c1b Revert r291903 and r291898. Reason: they break check-lld on the bots.
Summary:
Revert [ARM] Fix ubig32_t read in ARMAttributeParser

Now using support functions to read data instead of trying to
perform casts.
===========================================================

Revert [ARM] Enable objdump to construct triple for ARM

Now that The ARMAttributeParser has been moved into the library,
it has been modified so that it can parse the attributes without
printing them and stores them in a map. ELFObjectFile now queries
the attributes to fill out the architecture details of a provided
triple for 'arm' and 'thumb' targets. llvm-objdump uses this new
functionality.

Subscribers: llvm-commits, samparker, aemerson, mgorny

Differential Revision: https://reviews.llvm.org/D28683

llvm-svn: 291911
2017-01-13 16:45:15 +00:00
Saleem Abdulrasool 6ef45916c6 ARM: match GCC's behaviour for builtins
GCC changes the CC between the user-code and the builtins based on the
value of `-target` rather than `-mfloat-abi`.  When a HF target is used,
the VFP variant of the AAPCS CC is used.  Otherwise, the AAPCS variant
is used.  In all cases, the AEABI functions use the AAPCS CC.  Adjust
the calling convention based on the target.

Resolves PR30543!

llvm-svn: 291909
2017-01-13 16:25:33 +00:00
Benjamin Kramer 061f4a5fe6 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Sam Parker 770ceb69ba [ARM] Enable objdump to construct triple for ARM
Now that The ARMAttributeParser has been moved into the library,
it has been modified so that it can parse the attributes without
printing them and stores them in a map. ELFObjectFile now queries
the attributes to fill out the architecture details of a provided
triple for 'arm' and 'thumb' targets. llvm-objdump uses this new
functionality.

Differential Revision: https://reviews.llvm.org/D28281

llvm-svn: 291898
2017-01-13 11:04:21 +00:00
Diana Picus a2c59149e1 [ARM] CodeGen: Replace AddDefaultT1CC and AddNoT1CC. NFC
For AddDefaultT1CC, we add a new helper t1CondCodeOp, which creates the
appropriate register operand. For AddNoT1CC, we use the existing condCodeOp
helper - we only had two uses of AddNoT1CC, so at this point it's probably not
worth having yet another helper just for them.

Differential Revision: https://reviews.llvm.org/D28603

llvm-svn: 291894
2017-01-13 10:37:37 +00:00
Diana Picus 8a73f5562f [ARM] CodeGen: Remove AddDefaultCC. NFC.
Replace all uses of AddDefaultCC with add(condCodeOp()).
The transformation has been done automatically with a custom tool based on Clang
AST Matchers + RefactoringTool.

Differential Revision: https://reviews.llvm.org/D28557

llvm-svn: 291893
2017-01-13 10:18:01 +00:00
Diana Picus 116bbab4e4 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

llvm-svn: 291891
2017-01-13 09:58:52 +00:00
Diana Picus 4f8c3e1882 [ARM] CodeGen: Remove AddDefaultPred. NFC.
Replace all uses of AddDefaultPred with MachineInstrBuilder::add(predOps()).
This makes the code building MachineInstrs more readable, because it allows us
to write code like:

MIB.addSomeOperand(blah)
   .add(predOps())
   .addAnotherOperand(blahblah)

instead of

AddDefaultPred(MIB.addSomeOperand(blah))
    .addAnotherOperand(blahblah)

This commit also adds the predOps helper in the ARM backend, as well as the add
method taking a variable number of operands to the MachineInstrBuilder.

The transformation has been done mostly automatically with a custom tool based
on Clang AST Matchers + RefactoringTool.

Differential Revision: https://reviews.llvm.org/D28555

llvm-svn: 291890
2017-01-13 09:37:56 +00:00
Saleem Abdulrasool 555e5980a5 ARM: slightly more table driven libcall setup
Switch some additional library call setup to be table driven.  This
makes it more immediately obvious what the library call looks like.
This is important for ARM since the calling conventions for the builtins
change based on the target/libcall name.  NFC

llvm-svn: 291789
2017-01-12 18:46:11 +00:00
Daniel Sanders b7391dd3b4 [globalisel] Move as much RegisterBank initialization to the constructor as possible
Summary:
The register bank is now entirely initialized in the constructor. However,
we still have the hardcoded number of register classes which will be
dealt with in the TableGen patch (D27338) since we do not have access
to this information to resolve this at this stage. The number of register
classes is known to the TRI and to TableGen but the RegisterBank
constructor is too early for the former and too late for the latter.
This will be fixed when the data is tablegen-erated.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27809

llvm-svn: 291770
2017-01-12 16:11:23 +00:00
Daniel Sanders ae03595bfb [globalisel] Initialize RegisterBanks with static data.
Summary:
Refactor the RegisterBank initialization to use static data. This requires
GlobalISel implementations to rewrite calls to createRegisterBank() and
addRegBankCoverage() into a call to setRegBankData().

Out of tree targets can use diff 4 of D27807
(https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump
the register classes and other data that needs to be provided to
setRegBankData(). This is the method that was used to generate the static data
in this patch.

Tablegen-eration of this static data will follow after some refactoring.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27807
Differential Revision: https://reviews.llvm.org/D27808

llvm-svn: 291768
2017-01-12 15:32:10 +00:00
Eli Friedman 3a03742c37 [ARM] More aggressive matching for vpadd and vpaddl.
The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.

Differential Revision: https://reviews.llvm.org/D27779

llvm-svn: 291693
2017-01-11 19:33:38 +00:00
Mohammed Agabaria 2c96c43388 [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. 
In case if the real operands bitwidth <= 16.

Differential Revision: https://reviews.llvm.org/D28104 

llvm-svn: 291657
2017-01-11 08:23:37 +00:00
Eugene Zelenko c4ad1ce068 [Target] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291641
2017-01-11 01:45:03 +00:00
Chad Rosier d0114fc1dd [ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse.
Testing already covered by CodeGen/ARM/rbit.ll

llvm-svn: 291587
2017-01-10 19:23:51 +00:00
Mohammed Agabaria 23599ba794 Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and need some extra cost for address computation handling.
This code seems to be target dependent which may not be the same for all targets.
Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'.

Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general.

Differential Revision: https://reviews.llvm.org/D27518

llvm-svn: 291106
2017-01-05 14:03:41 +00:00
Dean Michael Berris f7e7b938ea [XRay] Merge instrumentation point table emission code into AsmPrinter.
Summary:
No need to have this per-architecture.  While there, unify 32-bit ARM's
behaviour with what changed elsewhere and start function names lowercase
as per the coding standards.  Individual entry emission code goes to the
entry's own class.

Fully tested on amd64, cross-builds on both ARMs and PowerPC.

Reviewers: dberris

Subscribers: aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D28209

llvm-svn: 290858
2017-01-03 04:30:21 +00:00
Aaron Ballman 58a61e723e Caught a simple typo. I do not know of a way to test this, but it seems like an unlikely thing to regress in the future.
llvm-svn: 290757
2016-12-30 15:57:56 +00:00
Eli Friedman d03df8145f [ARM] Implement isExtractSubvectorCheap.
See https://reviews.llvm.org/D6678 for the history of
isExtractSubvectorCheap. Essentially the same considerations apply
to ARM.

This temporarily breaks the formation of vpadd/vpaddl in certain cases;
AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles.
See https://reviews.llvm.org/D27779 for followup fix.

Differential Revision: https://reviews.llvm.org/D27774

llvm-svn: 290198
2016-12-20 20:05:07 +00:00
Daniel Jasper 24218d5993 Silence unused warning.
llvm-svn: 290109
2016-12-19 14:24:22 +00:00
Diana Picus 97ae95c3a8 [ARM] GlobalISel: Lower i8 and i16 register args
This allows lowering i8 and i16 arguments if they can fit in the registers. Note
that the lowering is incomplete - ABI extensions are handled in a subsequent
patch.

(Last part of)
Differential Revision: https://reviews.llvm.org/D27704

llvm-svn: 290106
2016-12-19 14:08:02 +00:00
Diana Picus 5a724452a0 [ARM] GlobalISel: Allow i8 and i16 adds
Teach the instruction selector and legalizer that it's ok to have adds with 8 or
16-bit integers.

This is the second part of https://reviews.llvm.org/D27704

llvm-svn: 290105
2016-12-19 14:07:56 +00:00
Diana Picus 36aa09fa3c [ARM] GlobalISel: Select i8 and i16 copies
Teach the instruction selector that it's ok to copy small values from physical
registers.

First part of https://reviews.llvm.org/D27704

llvm-svn: 290104
2016-12-19 14:07:50 +00:00
Diana Picus 1437f6d710 [ARM] GlobalISel: Lower more than 4 arguments
This adds support for lowering more than 4 arguments (although still i32 only).
It uses the handleAssignments / ValueHandler infrastructure extracted from
the AArch64 backend in r288658.

Differential Revision: https://reviews.llvm.org/D27195

llvm-svn: 290098
2016-12-19 11:55:41 +00:00
Diana Picus 519807f7be [ARM] GlobalISel: Support loading from the stack
Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit
scalars only). This will be useful for functions that need to pass arguments on
the stack.

First part of https://reviews.llvm.org/D27195.

llvm-svn: 290096
2016-12-19 11:26:31 +00:00
Eli Friedman f624ec27b7 [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.

Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).

Recommiting with fix to avoid forming vld1dup if the type of the load
doesn't match the type of the vdup (see
https://llvm.org/bugs/show_bug.cgi?id=31404).

Differential Revision: https://reviews.llvm.org/D27694

llvm-svn: 289972
2016-12-16 18:44:08 +00:00
Benjamin Kramer 24bf8689dd [GlobalISel] Silence unused variable warnings in Release builds.
llvm-svn: 289941
2016-12-16 13:13:03 +00:00
Diana Picus 812caee65a [ARM] GlobalISel: Select add i32, i32
Add the minimal support necessary to select a function that returns the sum of
two i32 values.

This includes some support for argument/return lowering of i32 values through
registers, as well as the handling of copy and add instructions throughout the
GlobalISel pipeline.

Differential Revision: https://reviews.llvm.org/D26677

llvm-svn: 289940
2016-12-16 12:54:46 +00:00