Commit Graph

8149 Commits

Author SHA1 Message Date
Roman Divacky 6fd64ff577 Add a FIXME as requested by Renato Golin.
llvm-svn: 223390
2014-12-04 21:39:24 +00:00
Asiri Rathnayake 13cef35cba Fix yet another unseen regression caused by r223113
r223113 added support for ARM modified immediate assembly syntax. Which
assumes all immediate operands are prefixed with a '#'. This assumption
is wrong as per the ARMARM - which recommends that all '#' characters be
treated optional. The current patch fixes this regression and adds a test
case. A follow-up patch will expand the test coverage to other instructions.

llvm-svn: 223381
2014-12-04 19:34:59 +00:00
Jonathan Roelofs 300d8ffdf2 Fix thumbv4t indirect calls
So there are a couple of issues with indirect calls on thumbv4t. First, the most
'obvious' instruction, 'blx' isn't available until v5t. And secondly, the
next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code
because the saved off pc has its thumb bit cleared, so when the callee returns
we end up in ARM mode.... yuck.

The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it.

We could cut down on code size by sharing the landing pads between call sites
that are close enough, but for the moment let's do correctness first and look at
performance later.


Patch by: Iain Sandoe

http://reviews.llvm.org/D6519

llvm-svn: 223380
2014-12-04 19:34:50 +00:00
Asiri Rathnayake d33304b3ad Fix a minor regression introduced in r223113
r223113 added support for ARM modified immediate assembly syntax. That patch
has broken support for immediate expressions, as in:
    add r0, #(4 * 4)
It wasn't caught because we don't have any tests for this feature. This patch
fixes this regression and adds test cases.

llvm-svn: 223366
2014-12-04 14:49:07 +00:00
Rafael Espindola 5403da4569 Revert "[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090>"
This reverts commit r223356.

It was failing check-all (MC/ARM/thumb.s in particular).

llvm-svn: 223363
2014-12-04 14:10:20 +00:00
Jyoti Allur b24d0abfe3 [Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090>
llvm-svn: 223356
2014-12-04 11:52:49 +00:00
Matt Arsenault 4e27343eec Allow target to specify prefix for labels
Use the MCAsmInfo instead of the DataLayout, and allow
specifying a custom prefix for labels specifically. HSAIL
requires that labels begin with @, but global symbols with &.

llvm-svn: 223323
2014-12-04 00:06:57 +00:00
Roman Divacky fdf0560997 Change the name to be in style.
llvm-svn: 223255
2014-12-03 18:39:44 +00:00
Charlie Turner f02c92489a Emit ABI_FP_rounding attribute.
LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When
the user has specified this option, the Tag_ABI_FP_rounding attribute should be
emitted with value 1. This option currently does not appear to disable
transformations and optimizations that assume default floating point rounding
behavior, AFAICT, but the intention should be recorded in the build attributes,
regardless of what the compiler actually does with the intention.

Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e
llvm-svn: 223218
2014-12-03 08:12:26 +00:00
Roman Divacky 7e6b5955d4 Introduce CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu parsing.
Previously .cpu directive in ARM assembler didnt switch to the new CPU and
therefore acted as a nop. This implemented real action for .cpu and eg. 
allows to assembler FreeBSD kernel with -integrated-as.

llvm-svn: 223147
2014-12-02 20:03:22 +00:00
Asiri Rathnayake cdfa931db9 Remove unused function.
Removing an unused function which is causing one of the build bots to fail.
This was introduced in the commit r223113. A proper cleanup of the so_imm
tblgen defintion (made redundant by the mod_imm definition) needs to happen
soon.

llvm-svn: 223115
2014-12-02 12:09:55 +00:00
Asiri Rathnayake a0199b9a59 Add support for ARM modified-immediate assembly syntax.
Certain ARM instructions accept 32-bit immediate operands encoded as a 8-bit
integer value (0-255) and a 4-bit rotation (0-30, even). Current ARM assembly
syntax support in LLVM allows the decoded (32-bit) immediate to be specified
as a single immediate operand for such instructions:

mov r0, #4278190080

The ARMARM defines an extended assembly syntax allowing the encoding to be made
more explicit, as in:

mov r0, #255, #8 ; (same 32-bit value as above)

The behaviour of the two instructions can be different w.r.t flags, which is
documented under "Modified immediate constants" in ARMARM. This patch enables
support for this extended syntax at the MC layer.

llvm-svn: 223113
2014-12-02 10:53:20 +00:00
Charlie Turner 15f91c5240 Emit Tag_ABI_FP_denormal correctly in fast-math mode.
The default ARM floating-point mode does not support IEEE 754 mode exactly. Of
relevance to this patch is that input denormals are flushed to zero. The way in
which they're flushed to zero depends on the architecture,

  * For VFPv2, it is implementation defined as to whether the sign of zero is
    preserved.
  * For VFPv3 and above, the sign of zero is always preserved when a denormal
    is flushed to zero.

When FP support has been disabled, the strategy taken by this patch is to
assume the software support will mirror the behaviour of the hardware support
for the target *if it existed*. That is, for architectures which can only have
VFPv2, it is assumed the software will flush to positive zero. For later
architectures it is assumed the software will flush to zero preserving sign.

Change-Id: Icc5928633ba222a4ba3ca8c0df44a440445865fd
llvm-svn: 223110
2014-12-02 08:22:29 +00:00
Tim Northover 3024b5535c ARM: lower tail calls correctly when using GHC calling convention.
Patch by Ben Gamari.

llvm-svn: 223055
2014-12-01 17:46:39 +00:00
Charlie Turner 30895f9ab8 Add post-decode checking of HVC instruction.
Add checkDecodedInstruction for post-decode checking of instructions, to catch
the corner cases like HVC that don't fit into the general pattern. Needed to
check for an invalid condition field in instruction encoding despite HVC not
taking a predicate.

Patch by Matthew Wahab.

Change-Id: I48e28de981d7a9e43569594da3c45fb478b4f795
llvm-svn: 222992
2014-12-01 08:50:27 +00:00
Charlie Turner 7de905cd17 Add Thumb HVC and ERET virtualisation extension instructions.
Patch by Matthew Wahab.

Change-Id: I131f71c1150d5fa797066a18e09d526c19bf9016
llvm-svn: 222990
2014-12-01 08:39:19 +00:00
Charlie Turner 4d88ae2002 Add ARM ERET and HVC virtualisation extension instructions.
Patch by Matthew Wahab.

Change-Id: Iad75f078fbaa4ecc7d7a4820ad9b3930679cbbbb
llvm-svn: 222989
2014-12-01 08:33:28 +00:00
Charlie Turner db6c5e7afa Fix wrong encoding of MRSBanked.
Patch by Matthew Wahab.

Change-Id: Ia2a001ca2760028ea360fe77b56f203a219eefbc
llvm-svn: 222920
2014-11-28 15:01:06 +00:00
Tim Northover a38e5cbf20 Stop using ArrayRef of a const type.
I *think* this is what the GCC bots are complaining about.

llvm-svn: 222905
2014-11-27 21:29:20 +00:00
Tim Northover 3c55ccac48 AArch64: treat [N x Ty] as a block during procedure calls.
The AAPCS treats small structs and homogeneous floating (or vector) aggregates
specially, and guarantees they either get passed as a contiguous block of
registers, or prevent any future use of those registers and get passed on the
stack.

This concept can fit quite neatly into LLVM's own type system, mapping an HFA
to [N x float] and so on, and small structs to [N x i64]. Doing so allows
front-ends to emit AAPCS compliant code without having to duplicate the
register counting logic.

llvm-svn: 222903
2014-11-27 21:02:42 +00:00
Charlie Turner 8d43369163 Stop uppercasing build attribute data.
The string data for string-valued build attributes were being unconditionally
uppercased. There is no mention in the ARM ABI addenda about case conventions,
so it's technically implementation defined as to whether the data are
capitialised in some way or not. However, there are good reasons not to
captialise the data.

  * It's less work.
  * Some vendors may legitimately have case-sensitive checks for these
    attributes which would fail on LLVM generated object files.
  * There could be locale issues with uppercasing.

The original reasons for uppercasing appear to have stemmed from an
old codesourcery toolchain behaviour, see

http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/87133

This patch makes the object file emitted no longer captialise string
data, it encodes as seen in the assembly source.

Change-Id: Ibe20dd6e60d2773d57ff72a78470839033aa5538
llvm-svn: 222882
2014-11-27 12:13:56 +00:00
Craig Topper c50d64b07b Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files.
llvm-svn: 222801
2014-11-26 00:46:26 +00:00
Simon Pilgrim a279410ede Tidied up target triple OS detection. NFC
Use Triple::isOS*() helper functions where possible.

llvm-svn: 222622
2014-11-22 19:12:10 +00:00
Joerg Sonnenberger 02b13a8d9b Fix transformation of add with pc argument to adr for non-immediate
arguments.

llvm-svn: 222587
2014-11-21 22:39:34 +00:00
Craig Topper 61e88f44f9 Remove a bunch of unnecessary typecasts to 'const TargetRegisterClass *'
llvm-svn: 222509
2014-11-21 05:58:21 +00:00
Reid Kleckner 343c395f11 Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/
Follow up to r221940, where I must not have caught em all. NFC

llvm-svn: 222481
2014-11-20 23:51:47 +00:00
Reid Kleckner 357600eab5 Add out of line virtual destructors to all LLVMTargetMachine subclasses
These recently all grew a unique_ptr<TargetLoweringObjectFile> member in
r221878.  When anyone calls a virtual method of a class, clang-cl
requires all virtual methods to be semantically valid. This includes the
implicit virtual destructor, which triggers instantiation of the
unique_ptr destructor, which fails because the type being deleted is
incomplete.

This is just part of the ongoing saga of PR20337, which is affecting
Blink as well. Because the MSVC ABI doesn't have key functions, we end
up referencing the vtable and implicit destructor on any virtual call
through a class. We don't actually end up emitting the dtor, so it'd be
good if we could avoid this unneeded type completion work.

llvm-svn: 222480
2014-11-20 23:37:18 +00:00
Jyoti Allur 5b9f35220e [ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32
llvm-svn: 222414
2014-11-20 05:58:11 +00:00
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
David Blaikie 5106ce7897 Remove StringMap::GetOrCreateValue in favor of StringMap::insert
Having two ways to do this doesn't seem terribly helpful and
consistently using the insert version (which we already has) seems like
it'll make the code easier to understand to anyone working with standard
data structures. (I also updated many references to the Entry's
key and value to use first() and second instead of getKey{Data,Length,}
and get/setValue - for similar consistency)

Also removes the GetOrCreateValue functions so there's less surface area
to StringMap to fix/improve/change/accommodate move semantics, etc.

llvm-svn: 222319
2014-11-19 05:49:42 +00:00
Reid Kleckner d970702ab3 Revert "ADT: correctly report isMSVCEnvironment for windows itanium"
This reverts commit r222180.

llvm-svn: 222188
2014-11-17 22:55:59 +00:00
Saleem Abdulrasool 76f2c77070 ADT: correctly report isMSVCEnvironment for windows itanium
The itanium environment on Windows uses MSVC and is a MSVC environment.  Report
this correctly.

llvm-svn: 222180
2014-11-17 22:13:26 +00:00
Oliver Stannard 970b0d576c [Thumb1] Re-write emitThumbRegPlusImmediate
This was motivated by a bug which caused code like this to be
miscompiled:
  declare void @take_ptr(i8*)
  define void @test() {
    %addr1.32 = alloca i8
    %addr2.32 = alloca i32, i32 1028
    call void @take_ptr(i8* %addr1)
    ret void
  }

This was emitting the following assembly to get the value of %addr1:
  add r0, sp, #1020
  add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
  add r0, sp, #1020
  add r0, sp, #8

This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.

llvm-svn: 222125
2014-11-17 11:18:10 +00:00
Tim Northover 603d316517 ARM: refactor .cfi_def_cfa_offset emission.
We use to track quite a few "adjusted" offsets through the FrameLowering code
to account for changes in the prologue instructions as we went and allow the
emission of correct CFA annotations. However, we were missing a couple of cases
and the code was almost impenetrable.

It's easier to just add any stack-adjusting instruction to a list and emit them
together.

llvm-svn: 222057
2014-11-14 22:45:33 +00:00
Tim Northover 9d2d218f49 ARM: correctly calculate the offset of FP in its push.
When we folded the DPR alignment gap into a push, we weren't noting the extra
distance from the beginning of the push to the FP, and so FP ended up pointing
at an incorrect offset.

The .cfi_def_cfa_offset directives are still wrong in this case, but I think
that can be improved by refactoring.

llvm-svn: 222056
2014-11-14 22:45:31 +00:00
Aditya Nandakumar 3053155652 We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed.
llvm-svn: 221926
2014-11-13 21:29:21 +00:00
Tim Northover 631cc9ce1a ARM: allow constpool entry to be moved to the user's block in all cases.
Normally entries can only move to a lower address, but when that wasn't viable,
the user's block was considered anyway. Unfortunately, it went via
createNewWater which wasn't designed to handle the case where there's already
an island after the block.

Unfortunately, the test we have is slow and fragile, and I couldn't reduce it
to anything sane even with the @llvm.arm.space intrinsic. The test change here
is recreating the previous one after the change.

rdar://problem/18545506

llvm-svn: 221905
2014-11-13 17:58:53 +00:00
Tim Northover ab85dcc7b8 ARM: avoid duplicating branches during constant islands.
We were using a naive heuristic to determine whether a basic block already had
an unconditional branch at the end. This mostly corresponded to reality
(assuming branches got optimised) because there's not much point in a branch to
the next block, but could go wrong.

llvm-svn: 221904
2014-11-13 17:58:51 +00:00
Tim Northover 650b0ee53b ARM: add @llvm.arm.space intrinsic for testing ConstantIslands.
Creating tests for the ConstantIslands pass is very difficult, since it depends
on precise layout details. Having the ability to precisely inject a number of
bytes into the stream helps greatly.

llvm-svn: 221903
2014-11-13 17:58:48 +00:00
Aditya Nandakumar a27193297f This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively
llvm-svn: 221878
2014-11-13 09:26:31 +00:00
Rafael Espindola 7fc5b87480 Pass an ArrayRef to MCDisassembler::getInstruction.
With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t>
instead of a MemoryObject.

Even on X86 there is a maximum size an instruction can have. Given
that, it seems way simpler and more efficient to just pass an ArrayRef
to the disassembler instead of a MemoryObject and have it do a virtual
call every time it wants some extra bytes.

llvm-svn: 221751
2014-11-12 02:04:27 +00:00
Tom Roeder eb7a303d1b Add Forward Control-Flow Integrity.
This commit adds a new pass that can inject checks before indirect calls to
make sure that these calls target known locations. It supports three types of
checks and, at compile time, it can take the name of a custom function to call
when an indirect call check fails. The default failure function ignores the
error and continues.

This pass incidentally moves the function JumpInstrTables::transformType from
private to public and makes it static (with a new argument that specifies the
table type to use); this is so that the CFI code can transform function types
at call sites to determine which jump-instruction table to use for the check at
that site.

Also, this removes support for jumptables in ARM, pending further performance
analysis and discussion.

Review: http://reviews.llvm.org/D4167
llvm-svn: 221708
2014-11-11 21:08:02 +00:00
Rafael Espindola 961d469445 MCAsmParserExtension has a copy of the MCAsmParser. Use it.
Base classes were storing a second copy.

llvm-svn: 221667
2014-11-11 05:18:41 +00:00
Rafael Espindola 4aa6bea7a2 Misc style fixes. NFC.
This fixes a few cases of:

* Wrong variable name style.
* Lines longer than 80 columns.
* Repeated names in comments.
* clang-format of the above.

This make the next patch a lot easier to read.

llvm-svn: 221615
2014-11-10 18:11:10 +00:00
Tilmann Scheller 30c5ca25a5 [ARM] Remove more dead code.
Dead code identified by the Clang static analyzer.

llvm-svn: 221372
2014-11-05 17:45:04 +00:00
Tilmann Scheller c339992338 [ARM] Remove another redundant assignment.
Found by the Clang static analyzer.

llvm-svn: 221368
2014-11-05 17:34:04 +00:00
Tilmann Scheller 219ad28076 [ARM] Remove redundant assignment.
Found by the Clang static analyzer.

llvm-svn: 221366
2014-11-05 17:28:19 +00:00
Tilmann Scheller f2572c5097 [ARM] Remove dead code identified by the Clang static analyzer.
llvm-svn: 221358
2014-11-05 17:10:43 +00:00
Oliver Stannard 9e89d8cc5c [ARM] Honor FeatureD16 in the assembler and disassembler
Some ARM FPUs only have 16 double-precision registers, rather than the
normal 32. LLVM represents this with the D16 target feature. This is
currently used by CodeGen to avoid using high registers when they are
not available, but the assembler and disassembler do not.

I fix this in the assmebler and disassembler rather than the
InstrInfo.td files, as the latter would require a large number of
changes everywhere one of the floating-point instructions is referenced
in the backend. This solution is similar to the one used for
co-processor numbers and MSR masks.

llvm-svn: 221341
2014-11-05 12:06:39 +00:00
Tim Northover dc0d9e46a5 ARM: try to add extra CS-register whenever stack alignment >= 8.
We currently try to push an even number of registers to preserve 8-byte
alignment during a function's prologue, but only when the stack alignment is
prcisely 8. Many of the reasons for doing it apply also when that alignment > 8
(the extra store is often free, and can save another stack adjustment, though
less frequently for 16-byte stack alignment).

llvm-svn: 221321
2014-11-05 00:27:20 +00:00
Tim Northover 228c943f31 ARM/Dwarf: correctly align stack before callee-saved VPRs
We were making an attempt to do this by adding an extra callee-saved GPR (so
that there was an even number in the list), but when that failed we went ahead
and pushed anyway.

This had a couple of potential issues:
  + The .cfi directives we emit misplaced dN because they were based on
    PrologEpilogInserter's calculation.
  + Unaligned stores can be less efficient.
  + Unaligned stores can actually fault (likely only an issue in niche cases,
    but possible).

This adds a final explicit stack adjustment if all other options fail, so that
the actual locations of the registers match up with where they should be.

llvm-svn: 221320
2014-11-05 00:27:13 +00:00
Akira Hatanaka b961534818 [ARM, inline-asm] Fix ARMTargetLowering::getRegForInlineAsmConstraint to return
register class tGPRRegClass if the target is thumb1.

This commit fixes a crash that occurs during register allocation which was
triggered when a virtual register defined by an inline-asm instruction had to
be spilled.
 
rdar://problem/18740489

llvm-svn: 221178
2014-11-03 20:37:04 +00:00
Charlie Turner 1d8cc909cc Remove the cortex-a9-mp CPU.
This CPU definition is redundant. The Cortex-A9 is defined as
supporting multiprocessing extensions. Remove its definition and
update appropriate tests.

LLVM defines both a cortex-a9 CPU and a cortex-a9-mp CPU. The only
difference between the two CPU definitions in ARM.td is that
cortex-a9-mp contains the feature FeatureMP for multiprocessing
extensions.

This is redundant since the Cortex-A9 is defined as having
multiprocessing extensions in the TRMs. armcc also defines the
Cortex-A9 as having multiprocessing extensions by default.

Change-Id: Ifcadaa6c322be0a33d9d2a39cfdd7da1d75981a7
llvm-svn: 221166
2014-11-03 17:38:00 +00:00
Daniel Sanders 8104b75c9f Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC.
Reviewers: rnk

Reviewed By: rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D5978

llvm-svn: 221061
2014-11-01 19:32:23 +00:00
Rafael Espindola 246c4fb5d9 Remove redundant calls to isMaterializable.
This removes calls to isMaterializable in the following cases:

* It was redundant with a call to isDeclaration now that isDeclaration returns
  the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.

llvm-svn: 221050
2014-11-01 16:46:18 +00:00
Reid Kleckner da00cf5f73 Work around bugs in MSVC "14" CTP 3's conversion logic
It appears to ignore or find ambiguous MachineInstrBuilder's conversion
operators that allow conversion to MachineInstr* and
MachineBasicBlock::bundle_iterator.

As a workaround, add an explicit way to get the MachineInstr.

llvm-svn: 221017
2014-10-31 23:19:46 +00:00
Quentin Colombet c32615dfef [CodeGenPrepare] Move extractelement close to store if they can be combined.
This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
The optimization may promote any scalar operations to vector operations in the
way to make that possible.


** Context **

Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur copy from one
register file to another. These copies are not coalescable and may be expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.


** Motivating Example **

Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
 %in1 = load <2 x i32>* %addr1, align 8
 %extract = extractelement <2 x i32> %in1, i32 1
 %out = or i32 %extract, 1
 store i32 %out, i32* %dest, align 4
 ret void
}

As it is, this IR generates the following assembly on armv7:
  vldr  d16, [r0]            @vector load  
  vmov.32 r0, d16[1]  @ cross-register-file copy: 20 cycles
  orr r0, r0, #1           @ scalar bitwise or
  str r0, [r1]               @ scalar store
  bx  lr

Whereas we could generate much faster code:
  vldr  d16, [r0]               @ vector load
  vorr.i32  d16, #0x1     @ vector bitwise or
  vst1.32 {d16[1]}, [r1:32] @ vector extract + store
  bx  lr

Half of the computation made in the vector is useless, but this allows to get
rid of the expensive cross-register-file copy.


** Proposed Solution **

To avoid this cross-register-copy penalty, we promote the scalar operations to
vector operations. The penalty will be removed if we manage to promote the whole
chain of computation in the vector domain.
Currently, we do that only when the chain of computation ends by a store and the
target is able to combine an extract with a store.

Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and so, extracted as some point[1]. Moreover,
this is customary that targets feature stores that perform a vector extract (see
AArch64 and X86 for instance).

The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate for this level of details and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically in TargetTransformInfo, everything that is legal has a cost of 1,
whereas, even if a vector type is legal, usually a vector operation is slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counter balanced by the saving of the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.

For now, the optimization is just enabled for ARM prior than v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. Because of these two factors, we limit the effects of
the inaccuracy. Indeed, I did not want to build up a fancy cost model with block
frequency and everything on top of that.

[1] We can imagine targets that can combine an extractelement with  other
instructions than just stores. If we want to go into that direction, the current
interfaces must be augmented and, moreover, I think this becomes a global isel
problem.

Differential Revision: http://reviews.llvm.org/D5921

<rdar://problem/14170854>

llvm-svn: 220978
2014-10-31 17:52:53 +00:00
Oliver Stannard 79efe41a0c [ARM] Select VMAXNM and VMINNM regardless of operand order
Currently, the ARM backend will select the VMAXNM and VMINNM for these C
expressions:
  (a < b) ? a : b
  (a > b) ? a : b
but not these expressions:
  (a > b) ? b : a
  (a < b) ? b : a

This patch allows all of these expressions to be matched.

llvm-svn: 220671
2014-10-27 09:23:02 +00:00
Renato Golin 6fb9c2ea70 Do not emit intermediate register for zero FP immediate
This updates check for double precision zero floating point constant to allow
use of instruction with immediate value rather than temporary register.
Currently "a == 0.0", where "a" is of "double" type generates:

vmov.i32        d16, #0x0
vcmpe.f64       d0, d16

With this change it becomes:

vcmpe.f64        d0, #0

Patch by Sergey Dmitrouk.

llvm-svn: 220486
2014-10-23 15:31:50 +00:00
Oliver Stannard 39a85abddf [Thumb2] Improve disassembly of memory hints
Currently, the ARM disassembler will disassemble the Thumb2 memory hint
instructions (PLD, PLDW and PLI), even for targets which do not have
these instructions. This patch adds the required checks to the
disassmebler.

llvm-svn: 220472
2014-10-23 08:52:58 +00:00
Akira Hatanaka 2ee0e9e6ee [ARM, stack protector] If supported, use armv7 instructions.
This commit enables using movt/movw to load the stack guard address:

movw r0, :lower16:(L_g3$non_lazy_ptr-(LPC0_0+8))
movt r0, :upper16:(L_g3$non_lazy_ptr-(LPC0_0+8))
ldr r0, [pc, r0]

Previously a pc-relative load was emitted:

ldr r0, LCPI0_0
ldr r0, [pc, r0]

rdar://problem/18740489

llvm-svn: 220470
2014-10-23 04:17:05 +00:00
Jyoti Allur 3b68607eac [Thumb/Thumb2] Implement restrictions on SP in register list on LDM, STM variants in thumb mode
llvm-svn: 220379
2014-10-22 10:41:14 +00:00
Oliver Stannard cdb8db8d3c [ARM] NEON 32-bit scalar moves are also available in VFPv2
The 32-bit variants of the NEON scalar<->GPR move instructions are
also available in VFPv2. The 8- and 16-bit variants do require NEON.

Note that the checks in the test file are all -DAG because they are
checking a mixture of stdout and stderr, and the ordering is not
guaranteed.

llvm-svn: 220288
2014-10-21 11:49:14 +00:00
Oliver Stannard 38e6d45a46 [Thumb2] LDRS?[BH] cannot load to the PC
The Thumb2 LDRS?[BH] instructions are not valid when the destination
register is the PC (these encodings are used for preload hints).

llvm-svn: 220278
2014-10-21 09:14:15 +00:00
Tim Northover 23075ccee7 ARM: rework Thumb1 frame index rewriting
The previous code had a few problems, motivating the choices here.

1. It could create instructions clobbering CPSR, but the incoming MachineInstr
   didn't reflect this. A potential source of corruption. This is why the patch
   has a new PseudoInst for before lowering.
2. Similarly, there was some code to handle the incoming instruction not being
   ARMCC::AL, but this would have caused massive problems if it was actually
   invoked when a complex offset needing more than one instruction was requested.
3. It wasn't designed to handle unaligned pointers (or offsets). These should
   probably be minimised anyway, but the code needs to deal with them properly
   regardless.
4. It had some rather dubious ad-hoc code to avoid calling
   emitThumbRegPlusImmediate, a function which should be designed to do precisely
   this job.

We seem to cover the common cases correctly now, and hopefully can enhance
emitThumbRegPlusImmediate to handle any extra optimisations we need to add in
future.

llvm-svn: 220236
2014-10-20 21:28:41 +00:00
Oliver Stannard 6672f37ed2 [Thumb2] RFE, SRS and "SUBS pc, lr" are undefined on v7M
These instructions are related to the v7[AR] exception model, and are
not defined on v7M.

llvm-svn: 220204
2014-10-20 15:37:35 +00:00
Oliver Stannard e8f63a54b4 [ARM] Do not select SMULW[BT] or SMLAW[BT]
The current instruction selection patterns for SMULW[BT] and SMLAW[BT]
are incorrect. These instructions multiply a 32-bit and a 16-bit value
(both signed) and return the top 32 bits of the 48-bit result. This
preserves the 16 bits of overflow, whereas the patterns they currently
match truncate the result to 16 bits then sign extend.

To select these instructions, we would need to match an ISD::SMUL_LOHI,
a sign extend, two shifts and an or. There is no way to match SMUL_LOHI
in an instruction pattern as it defines multiple values, so this would
have to be done in C++. I have raised
http://llvm.org/bugs/show_bug.cgi?id=21297 to cover allowing correct
selection of these instructions.

This fixes http://llvm.org/bugs/show_bug.cgi?id=19396

llvm-svn: 220196
2014-10-20 11:30:35 +00:00
Oliver Stannard fce039240a [Thumb] Fix crash in Thumb1RegisterInfo::rewriteFrameIndex
This function can, for some offsets from the SP, split one instruction
into two. Since it re-uses the original instruction as the first
instruction of the result, we need ensure its result register is not
marked as dead before we use it in the second instruction.

llvm-svn: 220194
2014-10-20 11:00:18 +00:00
Bob Wilson 1e1f13862e Use triple predicate functions instead of checking values directly. NFC.
llvm-svn: 220155
2014-10-19 00:39:30 +00:00
Akira Hatanaka 0d0c78180d ARM: Fix a bug which was causing convergence failure in constant-island pass.
The bug is in ARMConstantIslands::createNewWater where the upper bound of the
new water split point is computed:

// This could point off the end of the block if we've already got constant
// pool entries following this block; only the last one is in the water list.
// Back past any possible branches (allow for a conditional and a maximally
// long unconditional).
if (BaseInsertOffset + 8 >= UserBBI.postOffset()) {
  BaseInsertOffset = UserBBI.postOffset() - UPad - 8;
  DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset));
}

The split point is supposed to be somewhere between the machine instruction that
loads from the constant pool entry and the end of the basic block, before branch
instructions. The code above is fine if the basic block is large enough and
there are a sufficient number of instructions following the machine instruction.
However, if the machine instruction is near the end of the basic block,
BaseInsertOffset can point to the machine instruction or another instruction
that precedes it, and this can lead to convergence failure.

This commit fixes this bug by ensuring BaseInsertOffset is larger than the
offset of the instruction following the constant-loading instruction.

rdar://problem/18581150

llvm-svn: 220015
2014-10-17 01:31:47 +00:00
Rafael Espindola 7b61ddfa6e Simplify handling of --noexecstack by using getNonexecutableStackSection.
llvm-svn: 219799
2014-10-15 16:12:52 +00:00
Tim Northover e9ff4c29b9 ARM: drop check for triple that's no longer used.
Early attempts to support AAPCS bare metal MachO targets based the decision on
the CPU being compiled for. This was not a particularly great idea and we've
got a better option now, but this check remained.

No functional change for any target we care about.

llvm-svn: 219767
2014-10-15 01:05:01 +00:00
Tim Northover cf6ce0c8f7 ARM: remove ARM/Thumb distinction for preferred alignment.
Thumb1 has legitimate reasons for preferring 32-bit alignment of types
i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be
a multiple of 4. However, this is a trade-off betweem code size and RAM usage;
the DataLayout string is not the best place to represent it even if desired.

So this patch removes the extra Thumb requirements, hopefully making ARM and
Thumb completely compatible in this respect.

llvm-svn: 219734
2014-10-14 22:12:17 +00:00
Tim Northover 9a4c043d67 ARM: allow misaligned local variables in Thumb1 mode.
There's no hard requirement on LLVM to align local variable to 32-bits, so the
Thumb1 frame handling needs to be able to deal with variables that are only
naturally aligned without falling over.

llvm-svn: 219733
2014-10-14 22:12:14 +00:00
Tim Northover aa09ac6e83 ARM: set preferred aggregate alignment to 32 universally.
Before, ARM and Thumb mode code had different preferred alignments, which could
lead to some rather unexpected results. There's justification for reducing it
from the default 64-bits (wasted space), but I don't think there is for going
below 32-bits.

There's no actual ABI change here, just to reassure people.

llvm-svn: 219719
2014-10-14 20:57:26 +00:00
Eric Christopher 7c558cf4d6 Grab the subtarget info off of the MachineFunction rather than
indirecting through the TargetMachine.

llvm-svn: 219674
2014-10-14 08:44:19 +00:00
Eric Christopher 4c67d5a1e3 Include map into the A15SDOptimizer rather than pick it up
transitively from the DFAPacketizer via TargetInstrInfo.h.

llvm-svn: 219652
2014-10-14 01:13:51 +00:00
Renato Golin 16ea8ba3bc Adds support for the Cortex-A17 to the ARM backend
Patch by Matthew Wahab.

llvm-svn: 219606
2014-10-13 10:22:19 +00:00
Benjamin Kramer 3e67db92bc MC: Bit pack MCSymbolData.
On x86_64 this brings it from 80 bytes to 64 bytes. Also make any member
variables private and clean up uses to go through the existing accessors.

NFC.

llvm-svn: 219573
2014-10-11 15:07:21 +00:00
Benjamin Kramer 2c3778dc51 Remove a compiler bug workaround from 2007. The affected versions of gcc are long gone.
NFC.

llvm-svn: 219433
2014-10-09 19:50:39 +00:00
Bob Wilson 9868d71ffe Use triple's isiOS() and isOSDarwin() methods.
These methods are already used in lots of places. This makes things more
consistent. NFC.

llvm-svn: 219386
2014-10-09 05:43:30 +00:00
Renato Golin 0595a26c25 Emit unaligned access build attribute for ARM
Patch by Charlie Turner.

llvm-svn: 219301
2014-10-08 12:26:22 +00:00
Renato Golin bab5ace6aa Refactor isThumb1Only() && isMClass() into a predicate called isV6M()
This must be enforced for all v6M cores, not just the cortex-m0,
irregardless of the user-specified alignment.

Patch by Charlie Turner.

llvm-svn: 219300
2014-10-08 12:26:16 +00:00
Renato Golin 51dc3f4701 Simplify switch statement in ARM subtarget align access
This switch can be reduced to a simpler if/else statement.

Patch by Charlie Turner.

llvm-svn: 219299
2014-10-08 12:26:13 +00:00
Eric Christopher b17140de35 Cache TargetLowering on SelectionDAGISel and update previous
calls to getTargetLowering() with the cached variable.

llvm-svn: 219284
2014-10-08 07:32:17 +00:00
NAKAMURA Takumi c62436c60a ARMInstPrinter.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
llvm-svn: 219172
2014-10-06 23:48:04 +00:00
Tim Northover ea964f53c3 ARM: silence unused variable warning
llvm-svn: 219128
2014-10-06 17:26:36 +00:00
Tim Northover 8997fedfc6 ARM: remove dead InstPrinting code
This instruction form is handled by different AsmOperands now, so the code is
completely dead (and wrong anyway).

llvm-svn: 219127
2014-10-06 17:10:13 +00:00
Eric Christopher 3faf2f1e02 Add subtarget caches to aarch64, arm, ppc, and x86.
These will make it easier to test further changes to the
code generation and optimization pipelines as those are
moved to subtargets initialized with target feature and
target cpu.

llvm-svn: 219106
2014-10-06 06:45:36 +00:00
Benjamin Kramer e12a6bac32 Eliminate some deep std::vector copies. NFC.
llvm-svn: 218999
2014-10-03 18:33:16 +00:00
Renato Golin 4e31ae1051 Revert 202433 - Provide a target override for the latest regalloc heuristic
That commit was introduced in order to help investigate a problem in ARM
codegen breaking from commit 202304 (Add a limit to the heuristic that register
allocates instructions in local order). Recent analisys indicated that the
problem no longer exists, so I'm reverting this change.

See PR18996.

llvm-svn: 218981
2014-10-03 12:20:53 +00:00
Eric Christopher 5312afe7e1 constify TargetMachine argument.
llvm-svn: 218930
2014-10-03 00:17:59 +00:00
Eric Christopher a94e592e49 We can grab the options struct from the TargetMachine, no need to
pass it down in the constructor.

llvm-svn: 218929
2014-10-03 00:10:03 +00:00
Tim Northover 5d72c5de02 ARM: allow copying of CPSR when all else fails.
As with x86 and AArch64, certain situations can arise where we need to spill
CPSR in the middle of a calculation. These should be avoided where possible
(MRS/MSR is rather expensive), which ARM is actually better at than the other
two since it tries to Glue defs to uses, but as a last ditch effort, copying is
better than crashing.

rdar://problem/18011155

llvm-svn: 218789
2014-10-01 19:21:03 +00:00
Oliver Stannard d4e0a4fd2c [ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5
Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions
when targeting ARMv8, but they are actually present on any target with
FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an
M-profile core, but they have the same instructions so we model them
both as FPARMv8 in the ARM backend.

llvm-svn: 218763
2014-10-01 13:13:18 +00:00
Oliver Stannard 37e4daab05 [ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM)
The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and
FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be
modelled using the same target feature, and all double-precision
operations are already disabled by the fp-only-sp target features.

llvm-svn: 218747
2014-10-01 09:02:17 +00:00
Oliver Stannard a4eba5ad70 [Thumb2] ldrexd and strexd are not defined on v7M
The Thumb2 ldrexd and strexd instructions are not defined for
M-class architectures.

llvm-svn: 218603
2014-09-29 10:57:29 +00:00
Renato Golin 36c626e33f Elide repeated register operand in Thumb1 instructions
This patch makes the ARM backend transform 3 operand instructions such as
'adds/subs' to the 2 operand version of the same instruction if the first
two register operands are the same.

Example: 'adds r0, r0, #1' will is transformed to 'adds r0, #1'.

Currently for some instructions such as 'adds' if you try to assemble
'adds r0, r0, #8' for thumb v6m the assembler would throw an error message
because the immediate cannot be encoded using 3 bits.

The backend should be smart enough to transform the instruction to
'adds r0, #8', which allows for larger immediate constants.

Patch by Ranjeet Singh.

llvm-svn: 218521
2014-09-26 16:14:29 +00:00
Tom Stellard 1fa1ce6112 ARM: Remove unneeded check for MI->hasPostISelHook()
llvm-svn: 218459
2014-09-25 18:59:23 +00:00
Renato Golin f5dd1dacb6 Add aliases for VAND imm to VBIC ~imm
On ARM NEON, VAND with immediate (16/32 bits) is an alias to VBIC ~imm with
the same type size. Adding that logic to the parser, and generating VBIC
instructions from VAND asm files.

This patch also fixes the validation routines for NEON splat immediates which
were wrong.

Fixes PR20702.

llvm-svn: 218450
2014-09-25 11:31:24 +00:00
Oliver Stannard 3256b26ef2 [Thumb2] BXJ should be undefined for v7M, v8A
The Thumb2 BXJ instruction (Branch and Exchange Jazelle) is not
defined for v7M or v8A. It is defined for all other Thumb2-supporting
architectures (v6T2, v7A and v7R).

llvm-svn: 218445
2014-09-25 10:02:05 +00:00
Moritz Roth f5d0c7c2c0 [Thumb] Make load/store optimizer less conservative.
If it's safe to clobber the condition flags, we can do a few extra things:
it's then possible to reset the base register writeback using a SUBS, so
we can try to merge even if the base register isn't dead after the merged
instruction.

This is effectively a (heavily bug-fixed) rewrite of r208992.

llvm-svn: 218386
2014-09-24 16:35:50 +00:00
Oliver Stannard 1ae8b476f4 [Thumb] 32-bit encodings of 'cps' are not valid for v7M
v7M only allows the 16-bit encoding of the 'cps' (Change Processor
State) instruction, and does not have the 32-bit encoding which is
valid from v6T2 onwards.

llvm-svn: 218382
2014-09-24 14:20:01 +00:00
Robin Morisset dedef3325f Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder
Summary:
The goal is to eventually remove all the code related to getInsertFencesForAtomic
in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works
mostly by accident because the backends are overly conservative), and repeats the
same logic that goes in emitLeading/TrailingFence.

In this patch, I make AtomicExpandPass insert the fences as it knows better
where to put them. Because this requires getting the fences and not just
passing an IRBuilder around, I had to change the return type of
emitLeading/TrailingFence.
This code only triggers on ARM for now. Because it is earlier in the pipeline
than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so
SelectionDAGBuilder does not add barriers anymore on ARM.

If this patch is accepted I plan to implement emitLeading/TrailingFence for all
backends that setInsertFencesForAtomic(true), which will allow both making them
less conservative and simplifying SelectionDAGBuilder once they are all using
this interface.

This should not cause any functionnal change so the existing tests are used
and not modified.

Test Plan: make check-all, benefits from existing tests of atomics on ARM

Reviewers: jfb, t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5179

llvm-svn: 218329
2014-09-23 20:31:14 +00:00
Robin Morisset a7b357fed1 Just add a fixme about a possibly faster implementation of some atomic loads on some ARM processors
llvm-svn: 218326
2014-09-23 18:33:21 +00:00
Lang Hames d5f496d57c [MCJIT] Nuke MachineRelocation and MachineCodeEmitter. Now that the old JIT is
gone they're no longer needed.

llvm-svn: 218320
2014-09-23 18:08:47 +00:00
Quentin Colombet 17799fedb7 [ARM] Do not perform a tail call when the caller returns several values.
The fix is slightly different then x86 (see r216117) because the number of values
attached to a return can vary even for a single returned value (e.g., f64 yields
two returned values).

<rdar://problem/18352998>

llvm-svn: 218076
2014-09-18 21:17:50 +00:00
Robin Morisset 5349e8e532 Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"
Summary:
This patch was originally in D5304 (I could not find a way to reopen that revision).
It was accepted, commited and broke the build bots because the overloading of
the constructor of ArrayRef for braced initializer lists is not supported by all
toolchains. I then reverted it, and propose this fixed version that uses a plain
C array instead in makeDMB (that array is then converted implicitly to an
ArrayRef, but that is not behind an ifdef). Could someone confirm me whether
initialization lists for plain C arrays are supported by every toolchain used
to build llvm ? Otherwise I can just initialize the array in the old way:
args[0] = ...; .. ; args[5] = ...;

Below is the description of the original patch:
```
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.

Thanks to luqmana for having noticed this bug.
```

Test Plan: Added more cases to atomic-load-store.ll + make check-all

Reviewers: jfb, t.p.northover, luqmana

Subscribers: llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D5386

llvm-svn: 218066
2014-09-18 18:56:04 +00:00
Aaron Ballman 0bb041b5f4 Reverting NFC changes from r218050. Instead, the warning was disabled for GCC in r218059, so these changes are no longer required.
llvm-svn: 218062
2014-09-18 17:34:23 +00:00
Aaron Ballman 11fa97fa32 Fixing a bunch of -Woverloaded-virtual warnings due to hiding getSubtargetImpl from the base class. NFC.
llvm-svn: 218050
2014-09-18 13:27:14 +00:00
Saleem Abdulrasool bfdfb14a8f ARM: prevent crash on ELF directives on COFF
Certain directives are unsupported on Windows (some of which could/should be
supported).  We would not diagnose the use but rather crash during the emission
as we try to access the Target Streamer.  Add an assertion to prevent creating a
NULL reference (which is not permitted under C++) as well as a test to ensure
that we can diagnose the disabled directives.

llvm-svn: 218014
2014-09-18 04:28:29 +00:00
Saleem Abdulrasool 8c61c6c0f9 ARM: use a more precise check for MachO
Rather than relying on support for a specific directive to determine if we are
targeting MachO, explicitly check the output format.

As an additional bonus, cleanup the caret diagnostic for the non-MachO case and
avoid the spurious error caused by not discarding the statement.

llvm-svn: 218012
2014-09-18 03:49:55 +00:00
Robin Morisset bf26f8fd56 Revert "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"
It is breaking the build on the buildbots but works fine on my machine, I revert
while trying to understand what happens (it appears to depend on the compiler used
to build, I probably used a C++11 feature that is not perfectly supported by some
of the buildbots).

This reverts commit feb3176c4d006f99af8b40373abd56215a90e7cc.

llvm-svn: 217973
2014-09-17 18:09:13 +00:00
Robin Morisset 1c8a457575 [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors
Summary:
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.

Thanks to luqmana for having noticed this bug.

Test Plan: Added more cases to atomic-load-store.ll + make check-all

Reviewers: jfb, t.p.northover, luqmana

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5304

llvm-svn: 217965
2014-09-17 17:41:16 +00:00
Richard Trieu 1fbe1a8ba7 | -> ||
No functional change.

llvm-svn: 217934
2014-09-17 01:47:52 +00:00
Robin Morisset 25c8e318e4 [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).

Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.

llvm-svn: 217928
2014-09-17 00:06:58 +00:00
Moritz Roth eef9f4dc74 ARM load/store optimizer: Don't materialize a new base register with
ADDS/SUBS unless it's safe to clobber the condition flags.

If the merged instructions are in a range where the CPSR is live,
e.g. between a CMP -> Bcc, we can't safely materialize a new base
register.

This problem is quite rare, I couldn't come up with a test case and I've
never actually seen this happen in the tests I'm running - there is a
potential trigger for this in LNT/oggenc (spills being inserted between
a CMP/Bcc), but at the moment this isn't being merged. I'll try to
reduce that into a small test case once I've committed my upcoming patch
to make merging less conservative.

llvm-svn: 217881
2014-09-16 16:25:07 +00:00
Joe Abbey 8e72eb780e ARMAsmBackend uses a factory method to generate binary file format specific
objects.  There were a few FIXMEs in ARMAsmBackend.cpp suggesting the class
definitions should be in a separate file.  Starting with ARMAsmBackend, the
class definition has been put in a header file, and #includes reduced.  Each
sub-type of ARMAsmBackend is now in its own header file.

Derived types have been painted with a different color of bike-shed:

  s/DarwinARMAsmBackend/ARMAsmBackendDarwin/g
  s/ARMWinCOFFAsmBackend/ARMAsmBackendWinCOFF/g
  s/ELFARMAsmBackend/ARMAsmBackendELF/g

Finally, clang-format has been run across ARMAsmBackend.cpp

llvm-svn: 217866
2014-09-16 09:18:23 +00:00
James Molloy a9f47b6bae [ARM] Teach the cost model that cross-class copies are costly.
Cross-class copies being expensive is actually a trait of the microarchitecture, but as I haven't yet seen an example of a microarchitecture where they're cheap it seems best to just enable this by default, covering the non-mcpu build case.

llvm-svn: 217674
2014-09-12 13:29:40 +00:00
Sanjay Patel b653de1ada Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable.
"Unroll" is not the appropriate name for this variable. Clang already uses 
the term "interleave" in pragmas and metadata for this.

Differential Revision: http://reviews.llvm.org/D5066

llvm-svn: 217528
2014-09-10 17:58:16 +00:00
Tim Northover ba1d704229 ARM: don't size-reduce STMs using the LR register.
The only Thumb-1 multi-store capable of using LR is the PUSH instruction, which
translates to STMDB, so we shouldn't convert STMIAs.

Patch by Sergey Dmitrouk.

llvm-svn: 217498
2014-09-10 12:53:28 +00:00
Renato Golin 63e27980da ARM: Negative offset support problem
This patch is to permit a negative offset usage for a non frame access.

Patch by Igor Oblakov.

llvm-svn: 217431
2014-09-09 09:57:59 +00:00
Tim Northover c879d06a85 ARM: cover all sub-architecture enumerators to keep compiler happy.
No change in behaviour (hopefully).

llvm-svn: 217233
2014-09-05 07:56:46 +00:00
Aaron Ballman 169eeb913d Silencing a usually-helpful-but-braindead-silly-in-this-case sign mismatch warning with MSVC. NFC.
llvm-svn: 217143
2014-09-04 11:52:24 +00:00
Robin Morisset ed3d48f161 Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.

This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.

Test Plan: make check

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5035

llvm-svn: 217080
2014-09-03 21:29:59 +00:00
Robin Morisset a47cb411dc Use target-dependent emitLeading/TrailingFence instead of the target-independent insertLeading/TrailingFence (in AtomicExpandPass)
Fixes two latent bugs:
- There was no fence inserted before expanded seq_cst load (unsound on Power)
- There was only a fence release before seq_cst stores (again unsound, in particular on Power)
    It is not even clear if this is correct on ARM swift processors (where release fences are
    DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift
    as it is not clear whether it is incorrect. I would love to get documentation stating
    whether it is correct or not.
These two bugs were not triggered because Power is not (yet) using this pass, and these
behaviours happen to be (mostly?) working on ARM
(although they completely butchered the semantics of the llvm IR).

See:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html
for an example of the problems that can be caused by the second of these bugs.

I couldn't see a way of fixing these in a completely target-independent way without
adding lots of unnecessary fences on ARM, hence the target-dependent parts of this
patch.

This patch implements the new target-dependent parts only for ARM (the default
of not doing anything is enough for AArch64), other architectures will use this
infrastructure in later patches.

llvm-svn: 217076
2014-09-03 21:01:03 +00:00
Juergen Ributzka 88e32517c4 [FastISel][tblgen] Rename tblgen generated FastISel functions. NFC.
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.

Reviewed by Eric

llvm-svn: 217075
2014-09-03 20:56:59 +00:00
Juergen Ributzka 5b8bb4d7dd [FastISel] Rename public visible FastISel functions. NFC.
This commit renames the following public FastISel functions:
LowerArguments -> lowerArguments
SelectInstruction -> selectInstruction
TargetSelectInstruction -> fastSelectInstruction
FastLowerArguments -> fastLowerArguments
FastLowerCall -> fastLowerCall
FastLowerIntrinsicCall -> fastLowerIntrinsicCall
FastEmitZExtFromI1 -> fastEmitZExtFromI1
FastEmitBranch -> fastEmitBranch
UpdateValueMap -> updateValueMap
TargetMaterializeConstant -> fastMaterializeConstant
TargetMaterializeAlloca -> fastMaterializeAlloca
TargetMaterializeFloatZero -> fastMaterializeFloatZero
LowerCallTo -> lowerCallTo

Reviewed by Eric

llvm-svn: 217074
2014-09-03 20:56:52 +00:00
Eric Christopher b68e25330b Remove resetSubtargetFeatures as it is unused.
llvm-svn: 217071
2014-09-03 20:36:31 +00:00
Benjamin Kramer 8c90fd71f7 Add override to overriden virtual methods, remove virtual keywords.
No functionality change. Changes made by clang-tidy + some manual cleanup.

llvm-svn: 217028
2014-09-03 11:41:21 +00:00
Renato Golin e07a22ac14 Only emit movw on ARMv6T2+
Fix PR18364.

Patch by Dimitry Andric.

llvm-svn: 216989
2014-09-02 22:45:13 +00:00
Eric Christopher 79cc1e3ae7 Reinstate "Nuke the old JIT."
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reinstates commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 216982
2014-09-02 22:28:02 +00:00
Pete Cooper 1175945710 Change MCSchedModel to be a struct of statically initialized data.
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour

Reviewed by Andy Trick and Chandler C

llvm-svn: 216919
2014-09-02 17:43:54 +00:00
JF Bastien 12cc99eb13 Add missing override on ARMAsmBackend's dtor.
Test Plan: ninja check && ninja clang-test

Subscribers: aemerson

Differential Revision: http://reviews.llvm.org/D5075

llvm-svn: 216912
2014-09-02 16:26:55 +00:00
Renato Golin 92c816c68f Thumb2 M-class MSR instruction support changes
This patch implements a few changes related to the Thumb2 M-class MSR instruction:
 * better handling of unpredictable encodings,
 * recognition of the _g and _nzcvqg variants by the asm parser only if the DSP
   extension is available, preferred output of MSR APSR moves with the _<bits>
   suffix for v7-M.

Patch by Petr Pavlu.

llvm-svn: 216874
2014-09-01 11:25:07 +00:00
Craig Topper 6dc4a8bc2c Fix some cases where StringRef was being passed by const reference. Remove const from some other StringRefs since its implicitly const already.
llvm-svn: 216820
2014-08-30 16:48:02 +00:00
Robin Morisset 039781ef26 Fix typos in comments, NFC
Summary: Just fixing comments, no functional change.

Test Plan: N/A

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5130

llvm-svn: 216784
2014-08-29 21:53:01 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Yi Kong ebaa150e23 ARM: Add patterns for dbg
llvm-svn: 216451
2014-08-26 12:47:26 +00:00
Chad Rosier e62f365458 [AArch32] Add patterns for VCVT{A,N,P,M}.
Patterns for lowering libm calls to VCVT{A,N,P,M} are also included.
Phabricator Revision: http://reviews.llvm.org/D5033

llvm-svn: 216388
2014-08-25 16:56:33 +00:00
Karthik Bhat 7f33ff7dea Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)

llvm-svn: 216371
2014-08-25 04:56:54 +00:00
Craig Topper 4627679cec Use range based for loops to avoid needing to re-mention SmallPtrSet size.
llvm-svn: 216351
2014-08-24 23:23:06 +00:00
Chad Rosier ad7c910ecf Revert "ARM: improve RTABI 4.2 conformance on Linux"
This reverts commit r215862 due to nightly failures.  Will work on getting a
reduced test case, but I wanted to get our bots green in the meantime.

llvm-svn: 216325
2014-08-23 18:29:43 +00:00
Chad Rosier d2959362fb Revert "ARM: mark missing functions from RTABI"
This reverts commit r215863.

llvm-svn: 216324
2014-08-23 18:29:40 +00:00
Reid Kleckner 2d9bb65b3d ARM / x86_64 varargs: Don't save regparms in prologue without va_start
There's no need to do this if the user doesn't call va_start. In the
future, we're going to have thunks that forward these register
parameters with musttail calls, and they won't need these spills for
handling va_start.

Most of the test suite changes are adding va_start calls to existing
tests to keep things working.

llvm-svn: 216294
2014-08-22 21:59:26 +00:00
Quentin Colombet d358e84d9c [ARM] Move the implementation of the target hooks related to copy-related
instruction from ARMInstrInfo to ARMBaseInstrInfo.
That way, thumb mode can also benefit from the advanced copy optimization.

<rdar://problem/12702965>

llvm-svn: 216274
2014-08-22 18:05:22 +00:00
Robin Morisset 59c23cd946 Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aim at making it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details

The command line option is "atomic-expand"

llvm-svn: 216231
2014-08-21 21:50:01 +00:00
Moritz Roth dfdda0d41c Thumb1 load/store optimizer: Improve code to materialize new base register.
There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only
the latter supports using different source and destination registers, so
whenever we materialize a new base register (at a certain offset) we'd do
so by moving the base register value to the new register and then adding in
place. This patch changes the code to use a single tADDi3 if the offset is
small enough to fit in 3 bits.

Differential Revision: http://reviews.llvm.org/D5006

llvm-svn: 216193
2014-08-21 17:11:03 +00:00
Jonathan Roelofs 5e98ff967b Add a thread-model knob for lowering atomics on baremetal & single threaded systems
http://reviews.llvm.org/D4984

llvm-svn: 216182
2014-08-21 14:35:47 +00:00
Oliver Stannard 51b1d460cb [ARM] Enable DP copy, load and store instructions for FPv4-SP
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.

This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.

llvm-svn: 216172
2014-08-21 12:50:31 +00:00
Craig Topper 71b7b68b74 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 216158
2014-08-21 05:55:13 +00:00
Quentin Colombet 84f15bd1b0 [ARM] Mark VSETLNi32 with the InsertSubreg property and implement the related
target hook.

This patch teaches the compiler that:
dX = VSETLNi32 dY, rZ, imm
is the same as:
dX = INSERT_SUBREG dY, rZ, translateImmToSubIdx(imm)

<rdar://problem/12702965>

llvm-svn: 216143
2014-08-21 00:10:52 +00:00
Jonathan Roelofs 44937d98a3 Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence
On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to
be avoided. This patch trades simplicity for implementation time at the expense
of performance... As they say: correctness first, then performance.

See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few
ideas on how to make this better.

llvm-svn: 216138
2014-08-20 23:38:50 +00:00
Quentin Colombet deb82eab3e [ARM] Mark VMOVRRD with the ExtractSubreg property and implement the related
target hook.

This patch teaches the compiler that:
rX, rY = VMOVRRD dZ
is the same as:
rX = EXTRACT_SUBREG dZ, ssub_0
rY = EXTRACT_SUBREG dZ, ssub_1

<rdar://problem/12702965>

llvm-svn: 216132
2014-08-20 22:16:19 +00:00
Yi Kong c655f0c898 ARM: Fix codegen for rbit intrinsic
LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic.
According to ARM ARM, rbit only takes register as argument, not immediate.
The correct instruction should be rbit <Rd>, <Rm>.

The bug was originally introduced in r211057.

Differential Revision: http://reviews.llvm.org/D4980

llvm-svn: 216064
2014-08-20 10:40:20 +00:00
Alexey Samsonov f17f03e00e Hide two different AlignMode enums in anonymous namespaces. This bug is reported by UBSan.
llvm-svn: 216001
2014-08-19 18:40:39 +00:00
Robin Morisset b155f529fc Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends
Summary:
Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends
These helper functions are introduced in D4844.
Depends D4844

Test Plan: make check-all passes

Reviewers: jfb

Subscribers: aemerson, llvm-commits, mcrosier, reames

Differential Revision: http://reviews.llvm.org/D4937

llvm-svn: 215902
2014-08-18 16:48:58 +00:00
Oliver Stannard 12993dd916 [ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage
Externally-defined functions with weak linkage should not be
tail-called on ARM or AArch64, as the AAELF spec requires normal calls
to undefined weak functions to be replaced with a NOP or jump to the
next instruction. The behaviour of branch instructions in this
situation (as used for tail calls) is implementation-defined, so we
cannot rely on the linker replacing the tail call with a return.

llvm-svn: 215890
2014-08-18 12:42:15 +00:00
Tim Northover 26bb14e6a7 TableGen: allow use of uint64_t for available features mask.
ARM in particular is getting dangerously close to exceeding 32 bits worth of
possible subtarget features. When this happens, various parts of MC start to
fail inexplicably as masks get truncated to "unsigned".

Mostly just refactoring at present, and there's probably no way to test.

llvm-svn: 215887
2014-08-18 11:49:42 +00:00
Craig Topper 6230691c91 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

llvm-svn: 215870
2014-08-18 00:24:38 +00:00
Craig Topper 5229cfd163 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 215868
2014-08-17 23:47:00 +00:00
Saleem Abdulrasool 3fd996ef5c ARM: mark missing functions from RTABI
Simply indicate the functions that are part of the runtime library that we do
not setup libcalls for.  This is merely for ease of identification.  NFC.

llvm-svn: 215863
2014-08-17 22:51:04 +00:00
Saleem Abdulrasool 017bd57fce ARM: improve RTABI 4.2 conformance on Linux
The set of functions defined in the RTABI was separated for no real reason.
This brings us closer to proper utilisation of the functions defined by the
RTABI.  It also sets the ground for correctly emitting function calls to AEABI
functions on all AEABI conforming platforms.

The previously existing lie on the behaviour of __ldivmod and __uldivmod is
propagated as it is beyond the scope of the change.

The changes to the test are due to the fact that we now use the divmod functions
which return both the quotient and remainder and thus we no longer need to
invoke two functions on Linux (making it closer to EABI's behaviour).

llvm-svn: 215862
2014-08-17 22:51:02 +00:00
Saleem Abdulrasool 740be89f51 ARM: whitespace
Whitespace fix, NFC.

llvm-svn: 215861
2014-08-17 22:50:59 +00:00
Saleem Abdulrasool 78c44725f8 ARM: correct toggling behaviour
This was a thinko.  The intent was to flip the explicit bits that need toggling
rather than all bits.  This would result in incorrect behaviour (which now is
tested).

Thanks to Nico Weber for pointing this out!

llvm-svn: 215846
2014-08-17 19:20:38 +00:00
Nico Weber ae050bb057 arm asm: Let .fpu enable instructions, PR20447.
I'm not very happy with duplicating the fpu->feature mapping in ARMAsmParser.cpp
and in clang's driver. See the bug for a patch that doesn't do that, and the
review thread [1] for why this duplication exists.

1: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/231052.html
llvm-svn: 215811
2014-08-16 05:37:51 +00:00
Robin Morisset d18cda620c Fix typos in comments
llvm-svn: 215777
2014-08-15 22:17:28 +00:00
Chad Rosier b1bbf6f8ce [AArch32] Add support for FP rounding operations for ARMv8/AArch32.
Phabricator Revision: http://reviews.llvm.org/D4935

llvm-svn: 215772
2014-08-15 21:38:16 +00:00
Moritz Roth 8f3765625e ARM: Fix and re-enable load/store optimizer for Thumb1.
In a previous iteration of the pass, we would try to compensate for
writeback by updating later instructions and/or inserting a SUBS to
reset the base register if necessary.
Since such a SUBS sets the condition flags it's not generally safe to do
this. For now, only merge LDR/STRs if there is no writeback to the base
register (LDM that loads into the base register) or the base register is
killed by one of the merged instructions. These cases are clear wins
both in terms of instruction count and performance.

Also add three new test cases, and update the existing ones accordingly.

llvm-svn: 215729
2014-08-15 17:00:30 +00:00
Moritz Roth 378a43bfe0 ARM load/store optimizer: Compute BaseKill correctly.
This adds some code back that was deleted in r92053. The location of the
last merged memory operation needs to be kept up-to-date since MemOps
may be in a different order to the original instruction stream to
allow merging (since registers need to be in ascending order). Also
simplify the logic to determine BaseKill using findRegisterUseOperandIdx
to use an equivalent function call instead.

llvm-svn: 215728
2014-08-15 17:00:20 +00:00
Juergen Ributzka 5df8603dfd [FastISel][ARM] Fix a think-o in my previous commit (r215682).
We actually need to return the register into which we materialized the constant
and not just "true" for success. This code is currently partially dead, that is
why it didn't trigger any failures yet. Once I change the order of the constant
materialization this code will be fully exercised.

llvm-svn: 215727
2014-08-15 16:59:46 +00:00
Rafael Espindola d610ba99cb Remove HasLEB128.
We already require CFI, so it should be safe to require .leb128 and .uleb128.

llvm-svn: 215712
2014-08-15 14:01:07 +00:00
Tim Northover ee843ef0fa ARM: implement MRS/MSR (banked reg) system instructions.
These are system-only instructions for CPUs with virtualization
extensions, allowing a hypervisor easy access to all of the various
different AArch32 registers.

rdar://problem/17861345

llvm-svn: 215700
2014-08-15 10:47:12 +00:00
Juergen Ributzka 81db58e177 [FastISel][ARM] Fall-back to constant pool loads when materializing an i32 constant.
FastEmit_i won't always succeed to materialize an i32 constant and just fail.
This would trigger a fall-back to SelectionDAG, which is really not necessary.

This fix will first fall-back to a constant pool load to materialize the constant
before giving up for good.

This fixes <rdar://problem/18022633>.

llvm-svn: 215682
2014-08-14 23:29:49 +00:00
Juergen Ributzka a5b083853c [FastISel][ARM] Use MOVT/MOVW if the subtarget requests it.
This change is also in preparation for a future change to make sure that
the constant materialization uses MOVT/MOVW when available and not a load
from the constant pool.

llvm-svn: 215584
2014-08-13 21:42:19 +00:00
Juergen Ributzka 2cbcf7aad9 [FastISel][ARM] Fix a bug in the integer materialization code.
getRegClassFor returns the incorrect register class when in Thumb2 mode.
This fix simply manually selects the register class as in the code just a few
lines above.

There is no test case for this code, because the code is currently
unreachable. This will be changed in a future commit and existing test
cases will exercise this code.

llvm-svn: 215583
2014-08-13 21:39:18 +00:00
Benjamin Kramer a7c40ef022 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

llvm-svn: 215558
2014-08-13 16:26:38 +00:00
Justin Bogner c0087f3611 IR: Print a newline when dumping Types
Type::dump() doesn't print a newline, which makes for a poor
experience in a debugger. This looks like it was an ommission
considering Value::dump() two lines above, so I've changed Type to add
a newline as well.

Of the two in-tree callers, one added a newline anyway, and I've
updated the other one to use Type::print instead.

llvm-svn: 215421
2014-08-12 03:24:59 +00:00
Quentin Colombet 55fd3ba33e [ARM] Mark VMOVDRR with the RegSequence property and implement the related
target hook.

This patch teaches the compiler that:
dX = VMOVDRR rY, rZ
is the same as:
dX = REG_SEQUENCE rY, ssub_0, rZ, ssub_1

<rdar://problem/12702965>

llvm-svn: 215404
2014-08-11 22:56:22 +00:00
Saleem Abdulrasool 27c78bf131 ARM: try harder to detect non-IT eligible instructions
For many Thumb-1 register register instructions, setting the CPSR is not
permitted inside an IT block.  We would not correctly flag those instructions.
The previous change to identify this scenario was insufficient as it did not
actually catch all the instances.  The current list is formed by manual
inspection of the ARMv6M ARM.

The change to the Thumb2 IT block test is due to the fact that the new more
stringent checking of the MIs results in the If Conversion pass being prevented
from executing (since not all the instructions in the BB are predicable).  This
results in code gen changes.

Thanks to Tim Northover for pointing out that the previous patch was
insufficient and hinting that the use of the v6M ARM would be much easier to use
than the v7 or v8!

llvm-svn: 215382
2014-08-11 20:13:25 +00:00
Oliver Stannard 11790b2dac ARM: __gnu_h2f_ieee and __gnu_f2h_ieee always use the soft-float calling convention
By default, LLVM uses the "C" calling convention for all runtime
library functions. The half-precision FP conversion functions use the
soft-float calling convention, and are needed for some targets which
use the hard-float convention by default, so must have their calling
convention explicitly set.

llvm-svn: 215348
2014-08-11 09:12:32 +00:00
Saleem Abdulrasool ed8885b402 ARM: correct isPredicable for MULS in ThHUMB mode
The ARM ARM states that CPSR may not be updated by a MUL in thumb mode.  Due to
an ordering of Thumb 2 Size Reduction and If Conversion, we would end up
generating a THUMB MULS inside an IT block.

The If Conversion pass uses the TTI isPredicable method to ensure that it can
transform a Basic Block.  However, because we only check for IT handling on
Thumb2 functions, we may miss some cases.  Even then, it only validates that the
CPSR is not *live* rather than it is not accessed.  This corrects the handling
for that particular case since the same restriction does not hold on the vast
majority of the instructions.

This does prevent the IfConversion optimization from kicking in in certain
cases, but generating correct code is more valuable.  Addresses PR20555.

llvm-svn: 215328
2014-08-10 22:20:37 +00:00
Joerg Sonnenberger 752b91bd82 If available, pass down the Fixup object to EvaluateAsRelocatable.
At least on PowerPC, the interpretation of certain modifiers depends on
the context they appear in.

llvm-svn: 215310
2014-08-10 11:35:12 +00:00
Eric Christopher b9fd9ed37e Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 215154
2014-08-07 22:02:54 +00:00
Rafael Espindola f8b27c41e8 Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

llvm-svn: 215111
2014-08-07 14:21:18 +00:00
Pete Cooper c18261d467 Fix a whole bunch of binary literals which were the wrong size. All were being silently zero extended to the correct width.
The commit after this changes { } and 0bxx literals to be of type bits<n> and not int.  This means we need to write exactly the right number of bits, and not rely on the values being silently zero extended for us.

llvm-svn: 215082
2014-08-07 05:46:54 +00:00
Eric Christopher b5217507c7 Remove the target machine from CCState. Previously it was only used
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.

llvm-svn: 214988
2014-08-06 18:45:26 +00:00
Tim Northover 2a417b96d4 ARM: do not generate BLX instructions on Cortex-M CPUs.
Particularly on MachO, we were generating "blx _dest" instructions on M-class
CPUs, which don't actually exist. They happen to get fixed up by the linker
into valid "bl _dest" instructions (which is why such a massive issue has
remained largely undetected), but we shouldn't rely on that.

llvm-svn: 214959
2014-08-06 11:13:14 +00:00
Tim Northover d4d294dd51 ARM-MachO: materialize callee address correctly on v4t.
llvm-svn: 214958
2014-08-06 11:13:06 +00:00
Rafael Espindola b8141d55b9 Remove a virtual function from TargetMachine. NFC.
llvm-svn: 214929
2014-08-05 22:10:21 +00:00
Jonathan Roelofs ef84bda531 Re-apply r214881: Fix return sequence on armv4 thumb
This reverts r214893, re-applying r214881 with the test case relaxed a bit to
satiate the build bots.

POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748

llvm-svn: 214928
2014-08-05 21:32:21 +00:00
Jonathan Roelofs 064eb5a177 Revert r214881 because it broke lots of build-bots
llvm-svn: 214893
2014-08-05 17:36:05 +00:00
Jonathan Roelofs f5fad3767b Fix return sequence on armv4 thumb
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748

llvm-svn: 214881
2014-08-05 17:13:17 +00:00
Keith Walker 1045717584 Specify that the thumb setend and blx <immed> instructions are not valid on an m-class target
llvm-svn: 214871
2014-08-05 15:11:59 +00:00
Keith Walker 292aa3d5f7 Define stc2/stc2l/ldc2/ldc2l as thumb2 instructions
llvm-svn: 214868
2014-08-05 14:58:05 +00:00
Eric Christopher fc6de428c8 Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

llvm-svn: 214838
2014-08-05 02:39:49 +00:00
Renato Golin bc0b0378c5 Allow CP10/CP11 operations on ARMv5/v6
Those registers are VFP/NEON and vector instructions should be used instead,
but old cores rely on those co-processors to enable VFP unwinding. This change
was prompted by the libc++abi's unwinding routine and is also present in many
legacy low-level bare-metal code that we ought to compile/assemble.

Fixing bug PR20025 and allowing PR20529 to proceed with a fix in libc++abi.

llvm-svn: 214802
2014-08-04 23:21:56 +00:00
Eric Christopher d913448b38 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Akira Hatanaka dc08c30df9 [ARM] In dynamic-no-pic mode, ARM's post-RA pseudo expansion was incorrectly
expanding pseudo LOAD_STATCK_GUARD using instructions that are normally used
in pic mode. This patch fixes the bug.

<rdar://problem/17886592>

llvm-svn: 214614
2014-08-02 05:40:40 +00:00
Chandler Carruth 3707dda904 [SDAG] Let the DAG combiner take care of dead nodes rather than manually
deleting them. This already seems to work, as no tests fail without
this.

llvm-svn: 214601
2014-08-02 00:19:10 +00:00
Eric Christopher 6c05d9135f Add a non-const subtarget returning function to the target machine
so that we can use it to get the old-style JIT out of the subtarget.

This code should be removed when the old-style JIT is removed
(imminently).

llvm-svn: 214560
2014-08-01 21:18:01 +00:00
Juergen Ributzka 4c018a12a3 [FastISel][ARM] Do not emit stores for undef arguments.
This is a followup patch for r214366, which added the same behavior to the
AArch64 and X86 FastISel code. This fix reproduces the already existing
behavior of SelectionDAG in FastISel.

llvm-svn: 214531
2014-08-01 18:04:14 +00:00
James Molloy 137ce60ecf Allow only disassembling of M-class MSR masks that the assembler knows how to assemble back.
Note: The current code in DecodeMSRMask() rejects the unpredictable A/R MSR mask '0000' with Fail. The code in the patch follows this style and rejects unpredictable M-class MSR masks also with Fail (instead of SoftFail). If SoftFail is preferred in this case then additional changes to ARMInstPrinter (to print non-symbolic masks) and ARMAsmParser (to parse non-symbolic masks) will be needed.

Patch by Petr Pavlu!

llvm-svn: 214505
2014-08-01 12:42:11 +00:00
Tilmann Scheller 7cc0ed48f0 [ARM] Make the assembler reject unpredictable pre/post-indexed ARM LDRB/LDRSB instructions.
The ARM ARM prohibits LDRB/LDRSB instructions with writeback into the destination register. With this commit this constraint is now enforced and we stop assembling LDRH/LDRSH instructions with unpredictable behavior.

llvm-svn: 214500
2014-08-01 12:08:04 +00:00
Tilmann Scheller 8ff079c16b [ARM] Make the assembler reject unpredictable pre/post-indexed ARM LDRH/LDRSH instructions.
The ARM ARM prohibits LDRH/LDRSH instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling LDRH/LDRSH instructions with unpredictable behavior.

llvm-svn: 214499
2014-08-01 11:33:47 +00:00
Tilmann Scheller 8ba74305da [ARM] Make the assembler reject unpredictable pre/post-indexed ARM LDR instructions.
The ARM ARM prohibits LDR instructions with writeback into the destination register. With this commit this constraint is now enforced and we stop assembling LDR instructions with unpredictable behavior.

llvm-svn: 214498
2014-08-01 11:08:51 +00:00
Louis Gerbarg 67474e3755 Make sure no loads resulting from load->switch DAGCombine are marked invariant
Currently when DAGCombine converts loads feeding a switch into a switch of
addresses feeding a load the new load inherits the isInvariant flag of the left
side. This is incorrect since invariant loads can be reordered in cases where it
is illegal to reoarder normal loads.

This patch adds an isInvariant parameter to getExtLoad() and updates all call
sites to pass in the data if they have it or false if they don't. It also
changes the DAGCombine to use that data to make the right decision when
creating the new load.

llvm-svn: 214449
2014-07-31 21:45:05 +00:00
Pete Cooper 95709e5604 Fix bit initializer which was one bit too long, but worked so long as we silently dropped the leading 0
llvm-svn: 214372
2014-07-31 01:43:51 +00:00
Tim Northover 4e13a61413 ARM: add __aeabi_d2h for truncation on AEABI systems
ARM does actually define the name for this conversion, so we should use it on
"-eabi" platforms.

llvm-svn: 214176
2014-07-29 09:56:45 +00:00
Saleem Abdulrasool 8988c2a524 ARM: correct handling of features in arch_extension
The subtarget information is the ultimate source of truth for the feature set
that is enabled at this point.  We would previously not propagate the feature
information to the subtarget.  While this worked for the most part (features
would be enabled/disabled as requested), if another operation that changed the
feature bits was encountered (such as a mode switch via a .arm or .thumb
directive), we would end up resetting the behaviour of the architectural
extensions.

Handling this properly requires a slightly more complicated handling.  We need
to check if the feature is now being toggled.  If so, only then do we toggle the
features.  In return, we no longer have to calculate the feature bits ourselves.

The test changes are mostly to the diagnosis, which is now more uniform (a nice
side effect!).  Add an additional test to ensure that we handle this case
properly.

Thanks to Nico Weber for alerting me to this issue!

llvm-svn: 214057
2014-07-27 19:07:09 +00:00
Saleem Abdulrasool 45cf67b8e9 ARM: convert loop to range based
Convert a loop to use range based iteration.  Rename structure members to help
naming, and make structure definition anonymous.  NFC.

llvm-svn: 214056
2014-07-27 19:07:05 +00:00
Matt Arsenault 6f2a526101 Add alignment value to allowsUnalignedMemoryAccess
Rename to allowsMisalignedMemoryAccess.

On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.

llvm-svn: 214055
2014-07-27 17:46:40 +00:00
Nico Weber a822d94f57 Wrap to 80 columns, no behavior change.
llvm-svn: 213975
2014-07-25 21:37:41 +00:00
Akira Hatanaka e5b6e0d231 [stack protector] Fix a potential security bug in stack protector where the
address of the stack guard was being spilled to the stack.

Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register. 

<rdar://problem/12475629>

llvm-svn: 213967
2014-07-25 19:31:34 +00:00
Amara Emerson 115d2df8a4 [ARM] Emit ABI_PCS_R9_use build attribute.
Patch by Ben Foster!

Differential Revision: http://reviews.llvm.org/D4657

llvm-svn: 213944
2014-07-25 14:03:14 +00:00
Akira Hatanaka 16e47ff42e [ARM] In thumb mode, emit directive ".code 16" before file level inline
assembly instructions.

This is necessary to ensure ARM assembler switches to Thumb mode before it
starts assembling the file level inline assembly instructions at the beginning
of a .s file.

<rdar://problem/17757232>

llvm-svn: 213924
2014-07-25 05:12:49 +00:00
Hal Finkel cc39b67530 AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

llvm-svn: 213859
2014-07-24 12:16:19 +00:00
NAKAMURA Takumi 98d18be5fe Prune dependency to MC from each target disassembler.
llvm-svn: 213856
2014-07-24 11:45:11 +00:00
Tilmann Scheller 96ef72e54a [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRH instructions.
The ARM ARM prohibits STRH instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRH instructions with unpredictable behavior.

llvm-svn: 213850
2014-07-24 09:55:46 +00:00
NAKAMURA Takumi 9c3bd7618a Update library dependencies.
llvm-svn: 213832
2014-07-24 02:10:42 +00:00
Rafael Espindola 45bcf8a59c Fix the build when building with only the ARM backend.
llvm-svn: 213814
2014-07-23 22:54:28 +00:00
Tim Northover 14ff2df05c ARM: spot SBFX-compatbile code expressed with sign_extend_inreg
We were assuming all SBFX-like operations would have the shl/asr form, but
often when the field being extracted is an i8 or i16, we end up with a
SIGN_EXTEND_INREG acting on a shift instead. Simple enough to check for though.

llvm-svn: 213754
2014-07-23 13:59:12 +00:00
Tim Northover 7ad2a0e0c2 ARM: add patterns for [su]xta[bh] from just a shift.
Although the final shifter operand is a rotate, this actually only matters for
the half-word extends when the amount == 24. Otherwise folding a shift in is
just as good.

llvm-svn: 213753
2014-07-23 13:59:07 +00:00
Tilmann Scheller 2727279117 [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRB instructions.
The ARM ARM prohibits STRB instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRB instructions with unpredictable behavior.

llvm-svn: 213750
2014-07-23 13:03:47 +00:00
Tilmann Scheller 3352a58ddc [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STR instructions.
The ARM ARM prohibits STR instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STR instructions with unpredictable behavior.

llvm-svn: 213745
2014-07-23 12:38:17 +00:00
Tilmann Scheller c28f0d587d [ARM] Add earlyclobber constraint to pre/post-indexed ARM STRH instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STRH instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined, however it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well, since at some point someone might create instances of them programmatically and then the constraint is definitely needed.

llvm-svn: 213729
2014-07-23 08:12:51 +00:00
Ulrich Weigand 46797c6960 [MC] Pass MCSymbolData to needsRelocateWithSymbol
As discussed in a previous checking to support the .localentry
directive on PowerPC, we need to inspect the actual target symbol
in needsRelocateWithSymbol to make the appropriate decision based
on that symbol's st_other bits.

Currently, needsRelocateWithSymbol does not get the target symbol.
However, it is directly available to its sole caller.  This patch
therefore simply extends the needsRelocateWithSymbol by a new
parameter "const MCSymbolData &SD", passes in the target symbol,
and updates all derived implementations.

In particular, in the PowerPC implementation, this patch removes
the FIXME added by the previous checkin.

llvm-svn: 213487
2014-07-20 23:15:06 +00:00
Saleem Abdulrasool c4e00289a7 ARM: correct WoA __builtin_alloca handling on O0
When performing a dynamic stack adjustment without optimisations, we would mark
SP as def and R4 as kill.  This occurred as part of the expansion of a
WIN__CHKSTK SDNode which indicated the proper handling of SP and R4.  The result
would be that we would double define SP as part of an operation, which is
obviously incorrect.

Furthermore, the VTList for the chain had an incorrect parameter type of i32
instead of Other.

Correct these to permit proper lowering of __builtin_alloca at -O0.

llvm-svn: 213442
2014-07-19 01:29:51 +00:00
David Peixotto ae5ba76221 MC: support different sized constants in constant pools
On AArch64 the pseudo instruction ldr <reg>, =... supports both
32-bit and 64-bit constants. Add support for 64 bit constants for
the pools to support the pseudo instruction fully.

Changes the AArch64 ldr-pseudo tests to use 32-bit registers and
adds tests with 64-bit registers.

Patch by Janne Grunau!

Differential Revision: http://reviews.llvm.org/D4279

llvm-svn: 213387
2014-07-18 16:05:14 +00:00
Tim Northover 4e80b584fe ARM: support legalisation of "fptrunc ... to half" operations.
llvm-svn: 213373
2014-07-18 13:01:19 +00:00
Renato Golin e48d9dc15e Suppress 'not handled in switch' warning
llvm-svn: 213371
2014-07-18 12:13:04 +00:00
Tilmann Scheller 0fc933d6b8 [ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STR instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined, however it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well, since at some point someone might create instances of them programmatically and then the constraint is definitely needed.

This fixes PR20323.

llvm-svn: 213369
2014-07-18 12:05:49 +00:00
Renato Golin c17a07b36a Refactor ARM subarchitecture parsing
Re-commit of a patch to rework the triple parsing on ARM to a more sane
model.

Patch by Gabor Ballabas.

llvm-svn: 213367
2014-07-18 12:00:48 +00:00
Tim Northover 53f3bcf0e9 ARM: support direct f16 <-> f64 conversions
ARMv8 has instructions to handle it, otherwise a libcall is needed.

llvm-svn: 213254
2014-07-17 11:27:04 +00:00
Tim Northover fd7e424935 CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

llvm-svn: 213248
2014-07-17 10:51:23 +00:00
Chris Bieneman df4b763be5 [RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo.
llvm-svn: 213188
2014-07-16 20:13:31 +00:00
Chris Bieneman 80a866a316 Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations.
No functional code changes.

llvm-svn: 213169
2014-07-16 16:27:31 +00:00
Sanjay Patel a2f658d69d Move Post RA Scheduling flag bit into SchedMachineModel
Refactoring; no functional changes intended

    Removed PostRAScheduler bits from subtargets (X86, ARM).
    Added PostRAScheduler bit to MCSchedModel class.
    This bit is set by a CPU's scheduling model (if it exists).
    Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
    Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
    Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
    Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
    Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: 
       a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. 
       b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. 
       c. PPC overrides the CPU's postRA settings by enabling postRA for everything. 
       d. X86 is the only target that actually has postRA specified via sched model info.

Differential Revision: http://reviews.llvm.org/D4217

llvm-svn: 213101
2014-07-15 22:39:58 +00:00
Chris Bieneman 03695ab57e [RegisterCoalescer] Add new subtarget hook allowing targets to opt-out of coalescing.
The coalescer is very aggressive at propagating constraints on the register classes, and the register allocator doesn’t know how to split sub-registers later to recover. This patch provides an escape valve for targets that encounter this problem to limit coalescing.

This patch also implements such for ARM to lower register pressure when using lots of large register classes. This works around PR18825.

llvm-svn: 213078
2014-07-15 17:18:41 +00:00
Renato Golin b8a86c43c0 Revert "Refactor ARM subarchitecture parsing"
This reverts commit 7b4a6882467e7fef4516a0cbc418cbfce0fc6f6d.

llvm-svn: 212521
2014-07-08 10:06:16 +00:00
Renato Golin 1e9c282cd1 Refactor ARM subarchitecture parsing
According to a FIXME in ARMMCTargetDesc.cpp the ARM version parsing should be
in the Triple helper class.

Patch by: Gabor Ballabas

llvm-svn: 212479
2014-07-07 20:01:11 +00:00
Saleem Abdulrasool 763f9a50a5 ARM: properly lower dllimport'ed global values
This completes the handling for DLL import storage symbols when lowering
instructions.  A DLL import storage symbol must have an additional load
performed prior to use.  This is applicable to variables and functions.

This is particularly important for non-function symbols as it is possible to
handle function references by emitting a thunk which performs the translation
from the unprefixed __imp_ symbol to the proper symbol (although, this is a
non-optimal lowering).  For a variable symbol, no such thunk can be
accommodated.

llvm-svn: 212431
2014-07-07 05:18:35 +00:00
Saleem Abdulrasool 220a044888 ARM: correctly mangle dllimport symbols
Add support for tracking DLLImport storage class information on a per symbol
basis in the ARM instruction selection.  Use that information to correctly
mangle the symbol (dllimport symbols are referenced via *__imp_<name>).

llvm-svn: 212430
2014-07-07 05:18:30 +00:00
Saleem Abdulrasool 1eb4a28b44 ARM: unify symbol name retrieval
Ensure that all paths that retrieve the symbol name go through GetARMGVSymbol
rather than getSymbol.  This is desirable so that any global symbol mangling can
be centralised to this function.  The motivation for this is handling of symbols
that are marked as having dll import dll storage.  Such a symbol requires an
extra load that is currently handled in the backend and a __imp_ prefix on the
symbol name.

llvm-svn: 212429
2014-07-07 05:18:22 +00:00
Tim Northover 1bc367a41b ARM: when falling back to scattered relocs, keep the type.
The linker relies on relocation type info (e.g. is it a branch?) to perform the
correct actions, so we should keep that even when we end up using a scattered
relocation for whatever reason.

rdar://problem/17553104

llvm-svn: 212333
2014-07-04 10:58:05 +00:00
Eric Christopher c1058df66f Move function dependent resetting of a subtarget variable out of the
subtarget. This involved having the movt predicate take the current
function - since we care about size in instruction selection for
whether or not to use movw/movt take the function so we can check
the attributes. This required adding the current MachineFunction to
FastISel and propagating through.

llvm-svn: 212309
2014-07-04 01:55:26 +00:00
Eric Christopher 2f991c9ee1 Remove caching of the target machine and initialization of the
subtarget from ARMISelDAGtoDAG. The former is unnecessary and the
latter is initialized on each runOnMachineFunction.

llvm-svn: 212297
2014-07-03 22:24:49 +00:00
Yi Kong 93e52da641 [ARM] Implement ISB memory barrier intrinsic
Adds support for __builtin_arm_isb. Also corrects DSB and ISB instructions
modelling by adding has-side-effects property.

llvm-svn: 212276
2014-07-03 16:00:41 +00:00
Juergen Ributzka 3bd03c7099 [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.
The argument list vector is never used after it has been passed to the
CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and
pretty much as cheap as keeping a pointer to it.

llvm-svn: 212135
2014-07-01 22:01:54 +00:00
Scott Douglass 7650a9b871 ARM: take care not to set the ThumbFunc bit on TLS data symbols
This fixes LNT SingleSource/UnitTests/Threads with -mthumb.

Differential Revision: http://reviews.llvm.org/D4324

llvm-svn: 212029
2014-06-30 09:37:24 +00:00
Saleem Abdulrasool 4a1e409a4d ARM: use symbolic name for constant
This just changes the constant value to the symbolic name corresponding to it.
NFC.

llvm-svn: 212011
2014-06-30 03:11:14 +00:00
Eric Christopher 83e0723457 Remove extraneous includes from the target machines.
llvm-svn: 211800
2014-06-26 19:30:05 +00:00
Eric Christopher 80b24ef429 Move all of the ARM subtarget features down onto the subtarget
rather than the target machine.

llvm-svn: 211799
2014-06-26 19:30:02 +00:00
Eric Christopher 45fb7b6397 Move the frame lowering constructors out of line to avoid circular
includes.

llvm-svn: 211798
2014-06-26 19:29:59 +00:00
Renato Golin ac561c3ac7 Added parsing co-processor names starting with "cr"
Additional compliant GAS names for coprocessor register name
are enabled for all instruction with parameter MCK_CoprocReg:
LDC,LDC2,STC,STC2,CDP,CDP2,MCR,MCR2,MCRR,MCRR2,MRC,MRC2,MRRC,MRRC2

Patch by Andrey Kuharev.

llvm-svn: 211776
2014-06-26 13:10:53 +00:00
Rafael Espindola e2c6624475 Move expression visitation logic up to MCStreamer.
Remove the duplicate from MCRecordStreamer. No functionality change.

llvm-svn: 211714
2014-06-25 15:45:33 +00:00
Rafael Espindola 2be1281d43 Simplify the visitation of target expressions. No functionality change.
llvm-svn: 211707
2014-06-25 15:29:54 +00:00
Christian Pirker c6308f59b2 ARM: Fix TPsoft for Thumb mode
Reviewed at http://reviews.llvm.org/D4230

llvm-svn: 211601
2014-06-24 15:45:59 +00:00
Christian Pirker 6f81e75dab ARMEB: Vector extend operations
Reviewed at http://reviews.llvm.org/D4043

llvm-svn: 211520
2014-06-23 18:05:53 +00:00
Tim Northover 2099862a50 ARM: mark UBFX as not allowing PC.
Strictly, it's unpredictable. But we don't quite model that yet and an error is
better than ignoring the issue. This one somehow got left out before though.

rdar://problem/15997748

llvm-svn: 211490
2014-06-23 09:20:02 +00:00
Rafael Espindola 1fc003e6c5 Allow a target to create a null streamer.
Targets can assume that a target streamer is present, so they have to be able
to construct a null streamer in order to set the target streamer in it to.

Fixes a crash when using the null streamer with arm.

llvm-svn: 211358
2014-06-20 13:11:28 +00:00
Oliver Stannard 5dc2934ba2 Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size.
Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size based on
module flags metadata.

llvm-svn: 211349
2014-06-20 10:08:11 +00:00
Karthik Bhat e03a25da70 Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 

llvm-svn: 211339
2014-06-20 04:32:48 +00:00
Eric Christopher c40e5edbbc Add a new subtarget hook for whether or not we'd like to enable
the atomic load linked expander pass to run for a particular
subtarget. This requires a check of the subtarget and so save
the TargetMachine rather than only TargetLoweringInfo and update
all callers.

llvm-svn: 211314
2014-06-19 21:03:04 +00:00
Alp Toker 1d099d9339 Fix typos
llvm-svn: 211304
2014-06-19 19:41:26 +00:00
Craig Topper 35b2f75733 Convert some assert(0) to llvm_unreachable or fold an 'if' condition into the assert.
llvm-svn: 211254
2014-06-19 06:10:58 +00:00
Eric Christopher 3d19f1388f Move ARMJITInfo off of the TargetMachine and down onto the subtarget.
This required untangling a mess of headers that included around.

This a recommit of r210953 with a fix for the removed accessor
for JITInfo.

llvm-svn: 211233
2014-06-18 22:48:09 +00:00
Weiming Zhao 8c89973462 [ARM] [MC] Refactor the constant pool classes
ARMTargetStreamer implements ConstantPool and AssmeblerConstantPools
to keep track of assembler-generated constant pools that are used for
ldr-pseudo.

When implementing ldr-pseudo for AArch64, these two classes can be reused.
So this patch factors them out from ARM target to the general MC lib.

llvm-svn: 211198
2014-06-18 18:17:25 +00:00
Craig Topper 2a30d7889f Replace some assert(0)'s with llvm_unreachable.
llvm-svn: 211141
2014-06-18 05:05:13 +00:00
James Molloy c1fd09ba2c Fix memory leak of RegScavenger accidentally added in r211037.
llvm-svn: 211097
2014-06-17 12:31:41 +00:00
Jim Grosbach 07393ba31b ARM: intrinsic support for rbit.
We already have an ARMISD node. Create an intrinsic to map to it so we can
add support for the frontend __rbit() intrinsic.

rdar://9283021

llvm-svn: 211057
2014-06-16 21:55:30 +00:00
Eric Christopher daca3cc54a Since the DataLayout is always found off of the subtarget go ahead
and query the base target machine implementation for it.

llvm-svn: 211055
2014-06-16 21:18:27 +00:00
Tim Northover b45c3b74b4 ARM: implement correct atomic operations on v7M
ARM v7M has ldrex/strex but not ldrexd/strexd. This means 32-bit
operations should work as normal, but 64-bit ones are almost certainly
doomed.

Patch by Phoebe Buckheister.

llvm-svn: 211042
2014-06-16 18:49:36 +00:00
James Molloy f6419cfb14 Refactor the disabling of Thumb-1 LDM/STM generation
Originally I switched the LD/ST optimizer off in TargetMachine as it was previously, but Eric has suggested he'd prefer that it be short-circuited in the pass itself.

No functionality change.

llvm-svn: 211037
2014-06-16 16:42:53 +00:00
Christian Pirker 2cc1cf0d4b ARMEB: Fix trunc store for vector types
Reviewed at http://reviews.llvm.org/D4135

llvm-svn: 211010
2014-06-16 09:17:30 +00:00
Eric Christopher f6db93ab81 Temporarily revert r210953 in an attempt to bring the ARM buildbots
back.

llvm-svn: 210996
2014-06-15 19:55:14 +00:00
Eric Christopher fb0c26c696 Remove InstrItineraryData off of the TargetMachine - it's already
on the subtarget and just forward the accessor.

llvm-svn: 210955
2014-06-13 23:11:13 +00:00
Eric Christopher a0cdc005dd Move ARMJITInfo off of the TargetMachine and down onto the subtarget.
This required untangling a mess of headers that included around.

llvm-svn: 210953
2014-06-13 23:04:46 +00:00
Eric Christopher f047bfd115 The hazard recognizer only needs a subtarget, not a target machine
so make it take one. Fix up all users accordingly.

llvm-svn: 210948
2014-06-13 22:38:52 +00:00
Oliver Stannard b5e596f7c3 ARM: Fix fastcc calling convention for Thumb1
When targetting Thumb1 on a processor which has a VFP unit (which
is not accessible from Thumb1), we were converting the fastcc calling
convention to AAPCS-VFP, which is not possible.

llvm-svn: 210889
2014-06-13 08:33:03 +00:00
Eric Christopher 030294e4c5 Move ARMSelectionDAGInfo from the TargetMachine to the subtarget.
llvm-svn: 210862
2014-06-13 00:20:39 +00:00
Eric Christopher a47f6804d2 Move to a private function to initialize subtarget dependencies
so we can use initializer lists for the ARMSubtarget and then
use this to initialize a moved DataLayout on the subtarget from
the TargetMachine.

llvm-svn: 210861
2014-06-13 00:20:35 +00:00
Eric Christopher 70e005a171 Have ARMSelectionDAGInfo take a DataLayout as it's argument as the
DAG has access to the subtarget and TargetSelectionDAGInfo only
needs a DataLayout.

llvm-svn: 210859
2014-06-12 23:39:49 +00:00
Saleem Abdulrasool 65ca57a418 CodeGen: enable mov.w/mov.t pairs with minsize for WoA
Windows on ARM uses COFF/PE which is intrinsically position independent.  For
the case of 32-bit immediates, use a pair-wise relocation as otherwise we may
exceed the range of operators.  This fixes a code generation crash when using
-Oz when targeting Windows on ARM.

llvm-svn: 210814
2014-06-12 20:06:33 +00:00
James Molloy 1417b0be3e Disable the load/store optimization pass for Thumb-1.
Moritz's changes have improved codegen a lot, but further testing showed significant correctness problems. Disable by default until these have been worked out.

Patch by Moritz Roth!

llvm-svn: 210789
2014-06-12 15:18:33 +00:00
Jim Grosbach 7a930bf9ef ARM: honor hex immediate formatting for ldr/str i12 offsets.
Previously we would always print the offset as decimal, regardless of
the formatting requested. Now we use the formatImm() helper so the value
is printed as the client (LLDB in the motivating example) requested.

Before:
ldr.w r8, [sp, #180] @ always

After:
ldr.w r8, [sp, #0xb4] @ when printing hex immediates
ldr.w r8, [sp, #0180] @ when printing decimal immediates

rdar://17237103

llvm-svn: 210701
2014-06-11 20:26:45 +00:00
Renato Golin 65eea557ae Fix a bug in the Thumb1 ARM Load/Store optimizer
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.

This patch fixes PR19972. Patch by Moritz Roth.

llvm-svn: 210542
2014-06-10 16:39:21 +00:00
Saleem Abdulrasool abac6e92a0 ARM: add VLA extension for WoA Itanium ABI
The armv7-windows-itanium environment is nearly identical to the MSVC ABI. It
has a few divergences, mostly revolving around the use of the Itanium ABI for
C++. VLA support is one of the extensions that are amongst the set of the
extensions.

This adds support for proper VLA emission for this environment. This is
somewhat similar to the handling for __chkstk emission on X86 and the large
stack frame emission for ARM. The invocation style for chkstk is still
controlled via the -mcmodel flag to clang.

Make an explicit note that this is an extension.

llvm-svn: 210489
2014-06-09 20:18:42 +00:00
David Blaikie 960ea3f018 AsmMatchers: Use unique_ptr to manage ownership of MCParsedAsmOperand
I saw at least a memory leak or two from inspection (on probably
untested error paths) and r206991, which was the original inspiration
for this change.

I ran this idea by Jim Grosbach a few weeks ago & he was OK with it.
Since it's a basically mechanical patch that seemed sufficient - usual
post-commit review, revert, etc, as needed.

llvm-svn: 210427
2014-06-08 16:18:35 +00:00
Saleem Abdulrasool 90386ad60a ARM: correct assertion for long-calls on WoA
COFF/PE, so the relocation model is never static.  Loosen the assertion
accordingly.  The relocation can still be emitted properly, as it will be
converted to an IMAGE_REL_ARM_ADDR32 which will be resolved by the loader
taking the base relocation into account.  This is necessary to permit the
emission of long calls which can be controlled via the -mlong-calls option in
the driver.

llvm-svn: 210399
2014-06-07 20:29:27 +00:00
Eric Christopher 0dd8d486b3 Have TargetSelectionDAGInfo take a DataLayout initializer rather than
a TargetMachine since the only thing it wants is DataLayout.

llvm-svn: 210366
2014-06-06 19:04:48 +00:00
Tom Roeder 44cb65fff1 Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.

This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.

llvm-svn: 210280
2014-06-05 19:29:43 +00:00
Andrew Trick 8d2ee37f31 Add a subtarget hook: enablePostMachineScheduler.
As requested by AArch64 subtargets.

Note that this will have no effect until the
AArch64 target actually enables the pass like this:
substitutePass(&PostRASchedulerID, &PostMachineSchedulerID);

As soon as armv7 switches over, PostMachineScheduler will become the
default postRA scheduler, so this won't be necessary any more.
Targets using the old postRA schedule would then do:
substitutePass(&PostMachineSchedulerID, &PostRASchedulerID);

llvm-svn: 210167
2014-06-04 07:06:27 +00:00
Christian Pirker 762b2c624f ARMEB: Fix function return type f64
Reviewed at http://reviews.llvm.org/D3968

llvm-svn: 209990
2014-06-01 09:30:52 +00:00
Alp Toker 5f83e477c3 Update a couple of header inclusion guards
llvm-svn: 209980
2014-05-31 21:26:09 +00:00
Eric Christopher 8995833a34 Have the TLOF creation take a Triple rather than needing a subtarget.
llvm-svn: 209937
2014-05-31 00:07:32 +00:00
Tim Northover 86f60b7266 ARM: use AAPCS-style prologues for embedded MachO.
Darwin prologues save their GPRs in two stages: a narrow push of r0-r7 & lr,
followed by a wide push of the remaining registers if there are any. AAPCS uses
a single push.w instruction.

It turns out that, on average, enough registers get pushed that code is smaller
in the AAPCS prologue, which is a nice property for M-class programmers. They
also have other options available for back-traces, so can hopefully deal with
the fact that FP & LR aren't adjacent in memory.

rdar://problem/15909583

llvm-svn: 209895
2014-05-30 13:23:06 +00:00
Tim Northover b4ddc0845a ARM & AArch64: make use of common cmpxchg idioms after expansion
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.

When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.

This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.

I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.

rdar://problem/16227836

llvm-svn: 209883
2014-05-30 10:09:59 +00:00
Amara Emerson ceeb1c4830 [ARM] Emit correct build attributes for the relocation models.
Patch by Asiri Rathnayake.

llvm-svn: 209656
2014-05-27 13:30:21 +00:00
Tim Northover 4f1909f1da ARM: teach AAPCS-VFP to deal with Cortex-M4.
Cortex-M4 only has single-precision floating point support, so any LLVM
"double" type will have been split into 2 i32s by now. Fortunately, the
consecutive-register framework turns out to be precisely what's needed to
reconstruct the double and follow AAPCS-VFP correctly!

rdar://problem/17012966

llvm-svn: 209650
2014-05-27 10:43:38 +00:00
Tim Northover f9e798ba6a Segmented stacks: omit __morestack call when there's no frame.
Patch by Florian Zeitz

llvm-svn: 209436
2014-05-22 13:03:43 +00:00
Saleem Abdulrasool 2bd1262a32 ARM: introduce llvm.arm.undefined intrinsic
This intrinsic permits the emission of platform specific undefined sequences.
ARM has reserved the 0xde opcode which takes a single integer parameter (ignored
by the CPU).  This permits the operating system to implement custom behaviour on
this trap.  The llvm.arm.undefined intrinsic is meant to provide a means for
generating the target specific behaviour from the frontend.  This is
particularly useful for Windows on ARM which has made use of a series of these
special opcodes.

llvm-svn: 209390
2014-05-22 04:46:46 +00:00
Eric Christopher 0e6e7cf385 Override runOnMachineFunction for ARMISelDAGToDAG so that we can
reset the subtarget on each function.

llvm-svn: 209386
2014-05-22 02:00:27 +00:00
Eric Christopher 89f18805f4 Fix typo.
llvm-svn: 209377
2014-05-22 01:21:44 +00:00
Saleem Abdulrasool 0bd31835ea MC: correct IMAGE_REL_ARM_MOV32T relocation emission
This corrects the emission of IMAGE_REL_ARM_MOV32T relocations.  Previously, we
were avoiding the high portion of the relocation too early.  If there was a
section-relative relocation with an offset greater than 16-bits (65535), you
would end up truncating the high order bits of the offset.  Allow the current
relocation representation to flow through out the MC layer to the object writer.
Use the new ability to restrict recorded relocations to avoid emitting the
relocation into the final object.

llvm-svn: 209337
2014-05-21 23:17:56 +00:00
Saleem Abdulrasool 8d60fdc50d ARM: correct bundle generation for MOV32T relocations
Although the previous code would construct a bundle and add the correct elements
to it, it would not finalise the bundle.  This resulted in the InternalRead
markers not being added to the MachineOperands nor, more importantly, the
externally visible defs to the bundle itself.  So, although the bundle was not
exposing the def, the generated code would be correct because there was no
optimisations being performed.  When optimisations were enabled, the post
register allocator would kick in, and the hazard recognizer would reorder
operations around the load which would define the value being operated upon.

Rather than manually constructing the bundle, simply construct and finalise the
bundle via the finaliseBundle call after both MIs have been emitted.  This
improves the code generation with optimisations where IMAGE_REL_ARM_MOV32T
relocations are emitted.

The changes to the other tests are the result of the bundle generation
preventing the scheduler from hoisting the moves across the loads.  The net
effect of the generated code is equivalent, but, is much more identical to what
is actually being lowered.

llvm-svn: 209267
2014-05-21 01:25:24 +00:00
Christian Pirker 875629f713 ARMEB: Additional test files for ARM fixups
llvm-svn: 209200
2014-05-20 09:24:37 +00:00
Benjamin Kramer f3ad23551d SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not.
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.

llvm-svn: 209123
2014-05-19 13:12:38 +00:00
Saleem Abdulrasool 8bfb192ecd ARM: make libcall setup more table driven
Rather than create a series of function calls to setup the library calls, create
a table with the information and just use the table to drive the configuration
of the library calls.  This makes it easier to both inspect the list as well as
to modify it.  NFC.

llvm-svn: 209089
2014-05-18 16:39:11 +00:00
Saleem Abdulrasool a521845381 ARM: improve WoA ABI conformance for frame register
Windows on ARM uses R11 for the frame pointer even though the environment is a
pure Thumb-2, thumb-only environment.  Replicate this behaviour to improve
Windows ABI compatibility.  This register is used for fast stack walking, and
thus is part of the Windows ABI.

llvm-svn: 209085
2014-05-18 04:12:52 +00:00
Saleem Abdulrasool f11f4b4e20 ARM: consolidate frame pointer register knowledge
Use the ARMBaseRegisterInfo to query the frame register.  The base register info
is aware of the frame register that is used for the frame pointer.  Use that to
determine the frame register rather than duplicating the knowledge.  Although,
the code path is slightly different in that it may return SP, that can only
occur if the frame pointer has been omitted in the machine function, which is
supposed to contain the desired value in that case.

llvm-svn: 209084
2014-05-18 03:18:09 +00:00
Saleem Abdulrasool f3a5a5c546 Target: remove old constructors for CallLoweringInfo
This is mostly a mechanical change changing all the call sites to the newer
chained-function construction pattern.  This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions.  No functional change beyond the removal of the old
constructors are intended.

llvm-svn: 209082
2014-05-17 21:50:17 +00:00
Saleem Abdulrasool 6d11b7cd7a ARM: whitespace
Remove some whitespace.  NFC.

llvm-svn: 209079
2014-05-17 21:49:54 +00:00
Saleem Abdulrasool 46fed305db ARM: use the proper target object format for WoA
WoA uses COFF, not ELF.  ARMISelLowering::createTLOF would previously return ELF
for any non-MachO platform.  This was a missed site when the original change for
target format support for Windows on ARM was done.

llvm-svn: 209057
2014-05-17 04:28:08 +00:00
James Molloy a70697e10e Re-enable inline memcpy expansion for Thumb1.
Patch by Moritz Roth!

llvm-svn: 208994
2014-05-16 14:24:22 +00:00
James Molloy 556763d2ef Fix the Load/Store optimization pass to work with Thumb1.
Patch by Moritz Roth!

llvm-svn: 208992
2014-05-16 14:14:30 +00:00
James Molloy 92a15078f1 Enable the Load/Store optimization pass for Thumb1 but make it return immediately for now.
Patch by Moritz Roth!

llvm-svn: 208991
2014-05-16 14:11:38 +00:00
James Molloy bb73c23ffa Fix a few comment typos and style issues.
Patch by Moritz Roth!

llvm-svn: 208990
2014-05-16 14:08:46 +00:00
Saleem Abdulrasool 056fc3da4a ARM: add some integer/floating point conversion libcalls
Add some Windows on ARM specific library calls.  These are provided by msvcrt,
and can be used to perform integer to floating-point conversions (and
vice-versa) mirroring similar functions in the RTABI.

llvm-svn: 208949
2014-05-16 05:41:33 +00:00
Tim Northover d8d65a69cf TableGen/ARM64: print aliases even if they have syntax variants.
To get at least one use of the change (and some actual tests) in with its
commit, I've enabled the AArch64 & ARM64 NEON mov aliases.

llvm-svn: 208867
2014-05-15 11:16:32 +00:00
Jonathan Roelofs 4971b40d7e Fix some dyslexia in an assert message
llvm-svn: 208842
2014-05-15 02:24:50 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Christian Pirker 6692e7c116 ARM-BE: test files for vector argument passing
Reviewed at http://reviews.llvm.org/D3766

llvm-svn: 208793
2014-05-14 16:59:44 +00:00
Saleem Abdulrasool 27351f2022 ARM: implement support for the UDF mnemonic
The UDF instruction is a reserved undefined instruction space.  The assembler
mnemonic was introduced with ARM ARM rev C.a.  The instruction is not predicated
and the immediate constant is ignored by the CPU.  Add support for the three
encodings for this instruction.

The changes to the invalid instruction test is due to the fact that the invalid
instructions actually overlap with the undefined instruction.  Introduction of
the new instruction results in a partial decode as an undefined sequence.  Drop
the tests as they are invalid instruction patterns anyways.

llvm-svn: 208751
2014-05-14 03:47:39 +00:00
Christian Pirker 39db7ec81f ARMEB: Fix byte order of EH frame unwinding instructions, with modified test file
This commit was already commited as revision rL208689 and discussd in
phabricator revision D3704.
But the test file was crashing on OS X and windows.

I fixed the test file in the same way as in rL208340.

llvm-svn: 208711
2014-05-13 16:44:30 +00:00
Rafael Espindola 2e7eceb317 Revert "ARMEB: Fix byte order of EH frame unwinding instructions"
This reverts commit r208689.

The test was crashing on OS X and windows.

llvm-svn: 208704
2014-05-13 15:19:56 +00:00
Christian Pirker ea3514ecdb ARMEB: Fix byte order of EH frame unwinding instructions
llvm-svn: 208689
2014-05-13 11:41:49 +00:00
Louis Gerbarg efdcf23736 Add support bswap16 to/from memory compiling to rev16 on ARM/Thumb
The current patterns for REV16 misses mostn __builtin_bswap16() due to
legalization promoting the operands to from load/stores toi32s and then
truncing/extending them. This patch adds new patterns that catch the resultant
DAGs and codegens them to rev16 instructions. Tests included.

rdar://15353652

llvm-svn: 208620
2014-05-12 19:53:52 +00:00
Christian Pirker 238c7c165b ARM: Implement big endian bit-conversion for NEON type
llvm-svn: 208538
2014-05-12 11:19:20 +00:00
Hal Finkel f0e086a0bc Pass the value type to TLI::getRegisterByName
We must validate the value type in TLI::getRegisterByName, because if we
don't and the wrong type was used with the IR intrinsic, then we'll assert
(because we won't be able to find a valid register class with which to
construct the requested copy operation). For PPC64, additionally, the type
information is necessary to decide between the 64-bit register and the 32-bit
subregister.

No functionality change.

llvm-svn: 208508
2014-05-11 19:29:07 +00:00
Hal Finkel b33e9872a0 Add 'override' to getRegisterByName in *ISelLowering.h
No functionality change intended.

llvm-svn: 208507
2014-05-11 19:28:55 +00:00
Louis Gerbarg 3342bf1451 Add custom lowering for add/sub with overflow intrinsics to ARM
This patch adds support to ARM for custom lowering of the
llvm.{u|s}add.with.overflow.i32 intrinsics for i32/i64. This is particularly useful
for handling idiomatic saturating math functions as generated by
InstCombineCompare.

Test cases included.

rdar://14853450

llvm-svn: 208435
2014-05-09 17:02:49 +00:00
Oliver Stannard c24f2171ca ARM: HFAs must be passed in consecutive registers
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA, however this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083) prevents this.

llvm-svn: 208413
2014-05-09 14:01:47 +00:00
Saleem Abdulrasool 40bca0afab ARM: support PIC on Windows on ARM
Handle lowering of global addresses for PIC mode compilation on Windows.  Always
use the movw/movt load to load the address as Windows on ARM requires ARMv7+ and
is a pure Thumb environment.

llvm-svn: 208385
2014-05-09 00:58:32 +00:00
Christian Pirker b5728191c2 ARM big endian function argument passing
llvm-svn: 208316
2014-05-08 14:06:24 +00:00
Saleem Abdulrasool fc6b85b185 ARM: support FK_SecRel_2 relocations on WoA
This adds FK_SecRel_2 relocation support to ARM.  This enables the building of
object files for armv7-windows-msvc which enables CodeView line tables for
debugging as opposed to armv7-windows-itanium which currently uses DWARF.

llvm-svn: 208273
2014-05-08 01:35:57 +00:00
Rafael Espindola 566fcfe69b Remove the UseCFI option from createAsmStreamer.
We were already always passing true, this just removes the option.

llvm-svn: 208205
2014-05-07 13:00:43 +00:00
Joerg Sonnenberger cf86ce136c Allow using normal .eh_frame based unwinding on ARM. Use the same
encodings as x86. Use this exception model for NetBSD.

llvm-svn: 208166
2014-05-07 07:49:34 +00:00
Saleem Abdulrasool 985dcf18a9 ARM: mark additional instructions as MachineFrameSetup
Mark up additional instructions which are part of the function prologue as
MachineFrameSetup.  These instructions are part of the function prologue,
emitted by the PEI pass to setup the stack for use in the activating frame.

llvm-svn: 208153
2014-05-07 03:03:31 +00:00
Saleem Abdulrasool acd0338c61 ARM: fix WoA PEI instruction selection
The ARM::BLX instruction is an ARM mode instruction.  The Windows on ARM target
is limited to Thumb instructions.  Correctly use the thumb mode tBLXr
instruction.  This would manifest as an errant write into the object file as the
instruction is 4-bytes in length rather than 2.  The result would be a corrupted
object file that would eventually result in an executable that would crash at
runtime.

llvm-svn: 208152
2014-05-07 03:03:27 +00:00
Joerg Sonnenberger 818e725158 If a function needs a frame pointer, but r11 (aka fp) has not been used,
remove it from the list of unspilled registers. Otherwise the following
attempt to keep the stack aligned by picking an extra GPR register to
spill will not work as it picks up r11.

llvm-svn: 208129
2014-05-06 20:43:01 +00:00
Renato Golin c7aea40ec6 Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

llvm-svn: 208104
2014-05-06 16:51:25 +00:00
Christian Pirker fdce7cea93 ARM: For thumb fixups store halfwords high first and low second
llvm-svn: 208076
2014-05-06 10:05:11 +00:00
Saleem Abdulrasool e8a7afef86 CodeGen: correct memset emittance for WoA
Windows on ARM does not conform to AEABI.  However, memset would be emitted
using the AEABI signature, resulting in inverted parameters.  Handle this
special case appropriately.

llvm-svn: 207943
2014-05-04 23:13:21 +00:00
Saleem Abdulrasool 729c7a08fb MC: support FK_SecRel_4 for Windows on ARM
Add handling for FK_SecRel_4 (4-byte section relative relocations).  These are
used by the generation of DWARF debug information (the abbrevations use section
relative relocations).  This will also be used in generation of CodeView line
tables.

llvm-svn: 207941
2014-05-04 23:13:15 +00:00
Rafael Espindola 3d082fa507 Fix pr19645.
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.

This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.

Last but not least, this found a case where we were being bug by bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.

llvm-svn: 207920
2014-05-03 19:57:04 +00:00
Rafael Espindola 4a04294882 Don't force symbols to be globals in .thumb_set.
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given

.thumb_set foo, bar

we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.

Producing an undefined reference to bar is a general difference from MC and gas.
For example, given

a = b

gas will produce an undefined reference to b, MC will not. I would be surprised
if any code depends on this, but it it does, we should fix the general
difference, not special case .thumb_set.

llvm-svn: 207757
2014-05-01 12:45:43 +00:00
Richard Barton 3db1d580b3 Correction to assert statemtent to allow 32-bit unsigned numbers with the top bit set.
This fixes an ARM assembler crash - regression test added.

llvm-svn: 207747
2014-05-01 11:37:44 +00:00
Saleem Abdulrasool 7158303ad7 ARM: fix memory leak, simplify WoA stack probing
This fixes the memory leak introduced with the initial addition of support for
WoA stack probing.  Now that the pseudo-instruction expansion can handle an
external symbol, use that to generate the load which simplifies the logic as
well as avoids the memory leak.

llvm-svn: 207737
2014-05-01 04:19:59 +00:00
Saleem Abdulrasool d6c0ba3787 ARM: support expanding external symbols in 32-bit moves
This enhances the expansion of the mov32imm pseudo-instruction to support an
external symbol reference.  This is motivated by a simplification of the stack
probe emission for Windows on ARM (and fixing a leak).

llvm-svn: 207736
2014-05-01 04:19:56 +00:00
Joerg Sonnenberger 0f90c95ccf If necessary for indirect encodings, emit stubs.
llvm-svn: 207730
2014-05-01 00:25:15 +00:00
Joerg Sonnenberger 3c10817b92 Prepare support of Itanium ABI on ARM as opposed to EHABI by
conditionally emitting .fnstart and friends only for EHABI.

llvm-svn: 207718
2014-04-30 22:43:13 +00:00
Craig Topper 2d2aa0ca1f Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently.
llvm-svn: 207616
2014-04-30 07:17:30 +00:00
Saleem Abdulrasool 25947c318b ARM: support stack probe emission for Windows on ARM
This introduces the stack lowering emission of the stack probe function for
Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where
any page allocation which crosses a page boundary of the following guard page
will cause a page fault. This page fault must be handled by the kernel to
ensure that the page is faulted in. If this does not occur and a write access
any memory beyond that, the page fault will go unserviced, resulting in an
abnormal program termination.

The watermark for the stack probe appears to be at 4080 bytes (for
accommodating the stack guard canaries and stack alignment) when SSP is
enabled.  Otherwise, the stack probe is emitted on the page size boundary of
4096 bytes.

llvm-svn: 207615
2014-04-30 07:05:07 +00:00
Saleem Abdulrasool 0aca1c30c6 ARM: print COFF function header for Windows on ARM
Emit the COFF header when printing out the function.  This is important as the
header contains two important pieces of information: the storage class for the
symbol and the symbol type information.  This bit of information is required for
the linker to correctly identify the type of symbol that it is dealing with.

llvm-svn: 207613
2014-04-30 06:14:25 +00:00
Saleem Abdulrasool ef550a6d01 ARM: move llvm_unreachable use
When building with -Werror=covered-switch-default (as on the buildbots), the
build would fail since all cases are covered by the switch.  Move the
llvm_unreachable to the end of the function as an annotation.

llvm-svn: 207609
2014-04-30 05:12:41 +00:00
Saleem Abdulrasool f8222631a5 ARM: partially handle 32-bit relocations for WoA
IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise
relocation is not split up and reordered. When expanding the mov32imm
pseudo-instruction, create a bundle if the machine operand is referencing an
address.  This helps ensure that the relocatable address load is not reordered
by subsequent passes.

Unfortunately, this only partially handles the case as the Constant Island Pass
occurs after the instructions are unbundled and does not properly handle
bundles.  That is a more fundamental issue with the pass itself and beyond the
scope of this change.

llvm-svn: 207608
2014-04-30 04:54:58 +00:00
Joerg Sonnenberger dd18d5b0f6 Parse and create GOT_PREL relocations.
llvm-svn: 207526
2014-04-29 13:42:02 +00:00
Rafael Espindola b60c829a2a Centralize the handling of the thumb bit.
This patch centralizes the handling of the thumb bit around
MCStreamer::isThumbFunc and makes isThumbFunc handle aliases.

This fixes a corner case, but the main advantage is having just one
way to check if a MCSymbol is thumb or not. This should still be
refactored to be ARM only, but at least now it is just one predicate
that has to be refactored instead of 3 (isThumbFunc,
ELF_Other_ThumbFunc, and SF_ThumbFunc).

llvm-svn: 207522
2014-04-29 12:46:50 +00:00
Tim Northover 2372301bcf ARM: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207517
2014-04-29 10:06:05 +00:00
Craig Topper 9d74a5a5f1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves.
llvm-svn: 207511
2014-04-29 07:58:41 +00:00
Tim Northover 6ad1f5c817 ARM: stop passing unused values up the TableGen hierarchy.
It's bad enough that I have to look up 5 different levels of TableGen class
definitions to work out what bits go where in a simple NEON instruction anyway,
without having to keep track of umpteen unused parameters.

llvm-svn: 207420
2014-04-28 13:53:00 +00:00
Craig Topper 8c0b4d0791 Convert more SelectionDAG functions to use ArrayRef.
llvm-svn: 207397
2014-04-28 05:57:50 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Rafael Espindola 466d66358d Add emitThumbSet to the arm target streamer.
This fixes the asm printer implementation and lets the parser be unaware of
what .thumb_set is.

llvm-svn: 207381
2014-04-27 20:23:58 +00:00
Craig Topper 481fb2879f Convert SelectionDAG::SelectNodeTo to use ArrayRef.
llvm-svn: 207377
2014-04-27 19:21:11 +00:00
Craig Topper 64941d9786 Convert SelectionDAG::getMergeValues to use ArrayRef.
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Rafael Espindola 4c6f61302e Avoid using MCSymbolData on the asm streamer.
Only the object streamers need to track if a symbol should be marked thumb or
not. This ports the ELF case. The COFF case is not ported since it is currently
not working for some other reason (I will report a bug).

llvm-svn: 207366
2014-04-27 17:10:46 +00:00
Saleem Abdulrasool 0ea5d091c7 ARM: MSVC does not support = default
Explicitly "implement" the destructor as MSVC does not support defaulted methods
yet.

llvm-svn: 207350
2014-04-27 05:28:10 +00:00
Saleem Abdulrasool 84b952b677 Add WoA object file emission support
Introduce support for WoA PE/COFF object file emission from LLVM.  Add the new
target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM
specific behaviour of PE/COFF object emission.  ARM exception information is not
yet emitted and is a TODO item.

The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific
relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer.
The MC layer needs to be updated to deal with the relocation adjustments.
Branch relocations are adjusted by 4 bytes (unlikely their ELF counterparts).

Minor tweaks to switch multiple conditional checks into equivalent switch
statements.  The ObjectFileInfo is updated to relax the object file setup for
Windows COFF.  Move the architecture checks into an assertion.  Windows COFF is
currently only supported on x86, x86_64, and ARM (thumb).  Rather than
defaulting to ELF, we will refuse to generate an object file.  This is better
though as you do not get an (arbitrary) object file which is different from the
request.

llvm-svn: 207345
2014-04-27 03:48:22 +00:00
Saleem Abdulrasool 6d6fee9cbc ARM: Support SingleParameterDotFile on WoA
Currently, the integrated assembler is the only choice for assembling Windows on
ARM binaries.  IAS supports the .file <filename> directive which emits the file
symbol into the resulting object binary.  Mark the GNU COFF information to
indicate support for this feature.

llvm-svn: 207341
2014-04-27 03:47:57 +00:00
Craig Topper 206fcd450a Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
llvm-svn: 207329
2014-04-26 19:29:41 +00:00
Craig Topper 48d114bed1 Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Benjamin Kramer 4dae598bc8 DAGCombiner: Turn divs of vector splats into vectorized multiplications.
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.

I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.

llvm-svn: 207315
2014-04-26 12:06:28 +00:00
Saleem Abdulrasool 99f0d458c3 ARM: remove @llvm.arm.sevl
This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic
which provides a generic, extensible manner for adding hint instructions.  This
functionality can now be represented as @llvm.arm.hint(i32 5).

llvm-svn: 207246
2014-04-25 17:51:25 +00:00
Saleem Abdulrasool 7e7c2f9ca6 ARM: provide a new generic hint intrinsic
Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into
the instruction stream. This is particularly useful for generating IR from a
compiler where the user may inject an intrinsic (e.g. __yield). These are then
pattern substituted into the correct instruction which already existed.

llvm-svn: 207242
2014-04-25 17:24:24 +00:00
Craig Topper 062a2baef0 [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Reid Kleckner 5772b77789 Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

llvm-svn: 207143
2014-04-24 20:14:34 +00:00
David Blaikie 908f4d4bf5 Spread some const around for non-mutating uses of MCSymbolData.
I discovered this const-hole while attempting to coalesnce the Symbol
and SymbolMap data structures. There's some pending issues with that,
but I figured this change was easy to flush early.

llvm-svn: 207124
2014-04-24 16:59:40 +00:00
Stepan Dyatkovskiy 00dcc0f53c Fix for PR18921, "vmov" part.
Added support for bytes replication feature, so it could be GAS compatible.

E.g. instructions below:
"vmov.i32 d0, 0xffffffff"
"vmvn.i32 d0, 0xabababab"
"vmov.i32 d0, 0xabababab"
"vmov.i16 d0, 0xabab"
are incorrect, but we could deal with such cases.

For first one we should emit:
"vmov.i8 d0, 0xff"
For second one ("vmvn"):
"vmov.i8 d0, 0x54"
For last two instructions it should emit:
"vmov.i8 d0, 0xab"

P.S.: In ARMAsmParser.cpp I have also fixed few nearby style issues in old code.
Just for keeping method bodies in harmony with themselves.

llvm-svn: 207080
2014-04-24 06:03:01 +00:00
Evgeniy Stepanov 0a951b775e Create MCTargetOptions.
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.

Patch by Yuri Gorshenin.

llvm-svn: 206971
2014-04-23 11:16:03 +00:00
Kevin Enderby 96918bc406 Fix the assembler to print a better relocatable expression error
diagnostic that includes location information.

Currently if one has this assembly:

	.quad (0x1234 + (4 * SOME_VALUE))

where SOME_VALUE is undefined ones gets the less than
useful error message with no location information:

% clang -c x.s
clang -cc1as: fatal error: error in backend: expected relocatable expression

With this fix one now gets a more useful error message
with location information:

% clang -c x.s 
x.s:5:8: error: expected relocatable expression
 .quad (0x1234 + (4 * SOME_VALUE))
       ^

To do this I plumbed the SMLoc through the MCObjectStreamer
EmitValue() and EmitValueImpl() interfaces so it could be used
when creating the MCFixup.

rdar://12391022

llvm-svn: 206906
2014-04-22 17:27:29 +00:00
Tim Northover 978d25f391 ARM: disable emission of __XYZvfp in soft-float environment.
The point of these calls is to allow Thumb-1 code to make use of the VFP unit
to perform its operations. This is not desirable with -msoft-float, since most
of the reasons you'd want that apply equally to the runtime library.

rdar://problem/13766161

llvm-svn: 206874
2014-04-22 10:10:09 +00:00
Chandler Carruth 84e68b2994 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Target/...
edition.

llvm-svn: 206842
2014-04-22 02:41:26 +00:00
Chandler Carruth d174b72a28 [cleanup] Lift using directives, DEBUG_TYPE definitions, and even some
system headers above the includes of generated '.inc' files that
actually contain code. In a few targets this was already done pretty
consistently, but it wasn't done *really* consistently anywhere. It is
strictly cleaner IMO and necessary in a bunch of places where the
DEBUG_TYPE is referenced from the generated code. Consistency with the
necessary places trumps. Hopefully the build bots are OK with the
movement of intrin.h...

llvm-svn: 206838
2014-04-22 02:03:14 +00:00
Chandler Carruth e96dd8975f [Modules] Make Support/Debug.h modular. This requires it to not change
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.

This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:

- Header files that need to provide a DEBUG_TYPE for some inline code
  can do so by defining the macro before their inline code and undef-ing
  it afterward so the macro does not escape.

- We no longer have rampant ODR violations due to including headers with
  different DEBUG_TYPE definitions. This may be mostly an academic
  violation today, but with modules these types of violations are easy
  to check for and potentially very relevant.

Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.

The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.

llvm-svn: 206822
2014-04-21 22:55:11 +00:00
Benjamin Kramer d2da720ead [C++11] Replace OwningPtr with std::unique_ptr in places where it doesn't break the API.
No functionality change.

llvm-svn: 206740
2014-04-21 09:34:48 +00:00
Alp Toker 9844434151 Remove some empty statements
Cleanup only.

llvm-svn: 206710
2014-04-19 23:56:35 +00:00
Kevin Enderby b7e51f6af5 Change the ARM assembler to require a :lower16: or :upper16 on non-constant
expressions for mov instructions instead of silently truncating by default.

For the ARM assembler, we want to avoid misleadingly allowing something
like "mov r0, <symbol>" especially when we turn it into a movw and the
expression <symbol> does not have a :lower16: or :upper16" as part of the
expression.  We don't want the behavior of silently truncating, which can be
unexpected and lead to bugs that are difficult to find since this is an easy
mistake to make.

This does change the previous behavior of llvm but actually matches an
older gnu assembler that would not allow this but print less useful errors
of like “invalid constant (0x927c0) after fixup” and “unsupported relocation on
symbol foo”.  The error for llvm is "immediate expression for mov requires
:lower16: or :upper16" with correct location information on the operand
as shown in the added test cases.

rdar://12342160

llvm-svn: 206669
2014-04-18 23:06:39 +00:00
Tim Northover 037f26f212 Atomics: promote ARM's IR-based atomics pass to CodeGen.
Still only 32-bit ARM using it at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.

At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).

llvm-svn: 206485
2014-04-17 18:22:47 +00:00
Craig Topper abb4ac7f87 Convert SelectionDAG::getVTList to use ArrayRef
llvm-svn: 206357
2014-04-16 06:10:51 +00:00
Tim Northover 2f553f326a FastISel: constrain the RegClass of operands when emitting instructions.
ARM64 suffered multiple -verify-machineinstr failures (principally over the
xsp/xzr issue) because FastISel was completely ignoring which subset of the
general-purpose registers each instruction required.

More fixes are coming in ARM64 specific FastISel, but this should cover the
generic problems.

llvm-svn: 206283
2014-04-15 13:59:49 +00:00
Lang Hames a1bc0f5662 [MC] Require an MCContext when constructing an MCDisassembler.
This patch re-introduces the MCContext member that was removed from
MCDisassembler in r206063, and requires that an MCContext be passed in at
MCDisassembler construction time. (Previously the MCContext member had been
initialized in an ad-hoc fashion after construction). The MCCContext member
can be used by MCDisassembler sub-classes to construct constant or
target-specific MCExprs.

This patch updates disassemblers for in-tree targets, and provides the
MCRegisterInfo instance that some disassemblers were using through the
MCContext (previously those backends were constructing their own
MCRegisterInfo instances).

llvm-svn: 206241
2014-04-15 04:40:56 +00:00
Benjamin Kramer 44a53da346 Spell the specialization namespace correctly.
Not sure why clang didn't diagnose this (GCC does).

llvm-svn: 206117
2014-04-12 18:45:24 +00:00
Benjamin Kramer 30120c0626 Make helper static and place random global into the llvm namespace.
llvm-svn: 206116
2014-04-12 18:39:57 +00:00
Kevin Enderby 488f20b64e For the ARM integrated assembler add checking of the
alignments on vld/vst instructions.  And report errors for
alignments that are not supported.

While this is a large diff and an big test case, the changes
are very straight forward.  But pretty much had to touch
all vld/vst instructions changing the addrmode to one of the
new ones that where added will do the proper checking for
the specific instruction.

FYI, re-committing this with a tweak so MemoryOp's default
constructor is trivial and will work with MSVC 2012. Thanks
to Reid Kleckner and Jim Grosbach for help with the tweak.

rdar://11312406

llvm-svn: 205986
2014-04-10 20:18:58 +00:00
Reid Kleckner 2d4a69e9c9 Revert "For the ARM integrated assembler add checking of the alignments on vld/vst instructions. And report errors for alignments that are not supported."
It doesn't build with MSVC 2012, because MSVC doesn't allow union
members that have non-trivial default constructors.  This change added
'SMLoc AlignmentLoc' to MemoryOp, which made MemoryOp's default ctor
non-trivial.

This reverts commit r205930.

llvm-svn: 205944
2014-04-10 00:52:14 +00:00
Kevin Enderby c296ecd96c For the ARM integrated assembler add checking of the
alignments on vld/vst instructions.  And report errors for
alignments that are not supported.

While this is a large diff and an big test case, the changes
are very straight forward.  But pretty much had to touch
all vld/vst instructions changing the addrmode to one of the
new ones that where added will do the proper checking for
the specific instruction.

rdar://11312406

llvm-svn: 205930
2014-04-09 21:32:59 +00:00
Alp Toker 16f98b255d Fix some doc and comment typos
llvm-svn: 205899
2014-04-09 14:47:27 +00:00
Saleem Abdulrasool fedfc2a63f ARM MC: 80 column
llvm-svn: 205833
2014-04-09 06:18:26 +00:00
Saleem Abdulrasool 5019a978ae ARM MC: sort source files in CMakeLists
llvm-svn: 205832
2014-04-09 06:18:23 +00:00
Kevin Enderby d88fec3d3a Fix the ARM VLD3 (single 3-element structure to all lanes)
size 16 double-spaced registers instruction printing.

This:
	vld3.16 {d0[], d2[], d4[]}, [r4]!

was being printed as:

	vld3.16	{d0[], d1[], d2[]}, [r4]!

rdar://16531387

llvm-svn: 205779
2014-04-08 18:00:52 +00:00
Saleem Abdulrasool dd979e6457 ARM: consolidate MachO checks for ARM asm parser
This consolidates the duplicated MachO checks in the directive parsing for
various directives that are unsupported for Mach-O.  The error message change is
unimportant as this restores the behaviour to that prior to the addition of the
new directive handling.  Furthermore, use a more direct check for MachO
targeting rather than an indirect feature check of the assembler.

Also simplify the test execution command to avoid temporary files.  Further more,
perform the check in both object and assembly emission.

Whether all non-applicable directives are handled is another question.  .fnstart
is marked as being unsupported, however, the complementary .fnend is not.  The
additional unwinding directives are also still honoured.  This change does not
change that, though, it would be good to validate and mark them as being
unsupported if they are unsupported for the MachO emission.

llvm-svn: 205678
2014-04-05 22:09:51 +00:00
Stepan Dyatkovskiy 3f1fa3d545 Fix for PR18921 (LDRD/STRD part)::
Removed "GNU Assembler extension (compatibility)" definitions from ARMInstrInfo.td
Fixed ARMAsmParser::ParseInstruction GNU compatability branch, so it also works for thumb mode from now.
Added new tests.

llvm-svn: 205622
2014-04-04 10:17:56 +00:00
Stepan Dyatkovskiy a09bd2379c Fixed register class in STRD instruction for Thumb2 mode.
llvm-svn: 205612
2014-04-04 08:14:13 +00:00
Craig Topper 840beec2d0 Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
llvm-svn: 205610
2014-04-04 05:16:06 +00:00
Jim Grosbach 537f3ed838 ARM: Range based for-loop over block predecessors.
No functional change.

llvm-svn: 205604
2014-04-04 02:11:03 +00:00
Jim Grosbach f92e8f5a8b ARM: Use range-based for loops in frame lowering.
No functional change.

llvm-svn: 205602
2014-04-04 02:10:55 +00:00
Jim Grosbach bb1af943bb Tidy up. 80 columns.
llvm-svn: 205584
2014-04-03 23:43:22 +00:00
Jim Grosbach 1a59711505 Tidy up. Trailing whitespace.
llvm-svn: 205583
2014-04-03 23:43:18 +00:00
Tim Northover 01b4aa9437 ARM: tell LLVM about zext properties of ldrexb/ldrexh
Implementing this via ComputeMaskedBits has two advantages:
  + It actually works. DAGISel doesn't deal with the chains properly
    in the previous pattern-based solution, so they never trigger.
  + The information can be used in other DAG combines, as well as the
    trivial "get rid of truncs". For example if the trunc is in a
    different basic block.

rdar://problem/16227836

llvm-svn: 205540
2014-04-03 15:10:35 +00:00
Tim Northover 70450c59a4 ARM: skip cmpxchg failure barrier if ordering is monotonic.
The terminal barrier of a cmpxchg expansion will be either Acquire or
SequentiallyConsistent. In either case it can be skipped if the
operation has Monotonic requirements on failure.

rdar://problem/15996804

llvm-svn: 205535
2014-04-03 13:06:54 +00:00
Tim Northover c882eb0723 ARM: expand atomic ldrex/strex loops in IR
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).

Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:

1. an atomicrmw followed by using the *new* value can be more
   efficient. As an IR pass, simple CSE could handle this
   efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
   in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
   optimisation.

I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with (at least ARM64/AArch64, though I think
PPC & Mips could benefit too).

llvm-svn: 205525
2014-04-03 11:44:58 +00:00
Stepan Dyatkovskiy 6207a4dadc PR19320:
The trouble as in ARMAsmParser, in ParseInstruction method. It assumes that ARM::R12 + 1 == ARM::SP.
It is wrong, since ARM::<Register> codes are generated by tablegen and actually could be any random numbers.

llvm-svn: 205524
2014-04-03 11:29:15 +00:00
Silviu Baranga a3106e6847 [ARM] When generating a vpaddl node the input lane type is not always the type of the
add operation since extract_vector_elt can perform an extend operation. Get the input lane
type from the vector on which we're performing the vpaddl operation on and extend or
truncate it to the output type of the original add node.

llvm-svn: 205523
2014-04-03 10:44:27 +00:00
Oliver Stannard 92e0fc0484 ARM: Use __STACK_LIMIT symbol for segmented stacks
We cannot use STACK_LIMIT, as it is not reserved for the compiler
by the C spec.

llvm-svn: 205516
2014-04-03 08:45:16 +00:00
Saleem Abdulrasool cd1308296e ARM: update subtarget information for Windows on ARM
Update the subtarget information for Windows on ARM.  This enables using the MC
layer to target Windows on ARM.

llvm-svn: 205459
2014-04-02 20:32:05 +00:00
Jim Grosbach 36c4953348 Simplify resolveFrameIndex() signature.
Just pass a MachineInstr reference rather than an MBB iterator.
Creating a MachineInstr& is the first thing every implementation did
anyway.

llvm-svn: 205453
2014-04-02 19:28:18 +00:00
Jim Grosbach 4a1a9ce5e6 ARM: cortex-m0 doesn't support unaligned memory access.
Unlike other v6+ processors, cortex-m0 never supports unaligned accesses.
From the v6m ARM ARM:

"A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned
access occurs."

rdar://16491560

llvm-svn: 205452
2014-04-02 19:28:13 +00:00
Oliver Stannard b14c625111 ARM: Add support for segmented stacks
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.

llvm-svn: 205430
2014-04-02 16:10:33 +00:00
Renato Golin d93295ea56 Remove duplicated DMB instructions
ARM specific optimiztion, finding places in ARM machine code where 2 dmbs
follow one another, and eliminating one of them.

Patch by Reinoud Elhorst.

llvm-svn: 205409
2014-04-02 09:03:43 +00:00
Christian Pirker dc9ff75554 ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ThumbLE/ThumbBE
llvm-svn: 205317
2014-04-01 15:19:30 +00:00
Tim Northover 0feb91ef15 ARM: teach LLVM that Cortex-A7 is very similar to A8.
llvm-svn: 205314
2014-04-01 14:10:07 +00:00
Tim Northover 1351030801 ARM: add cyclone CPU with ZeroCycleZeroing feature.
The Cyclone CPU is similar to swift for most LLVM purposes, but does have two
preferred instructions for zeroing a VFP register. This teaches LLVM about
them.

llvm-svn: 205309
2014-04-01 13:22:02 +00:00
Saleem Abdulrasool 2070088bef ARM: fix typo
llvm-svn: 205233
2014-03-31 18:09:10 +00:00
Christian Pirker 6d301b8ff2 ARM: change parameter names of the ELFARMAsmBackend constructor
I removed the underscore at the beginning of the parameter name,
because of a comment from Tim.

llvm-svn: 205215
2014-03-31 16:06:39 +00:00
Stepan Dyatkovskiy df657cc1d5 Recommitted fix for PR18931, with extended tests set.
Issue subject: Crash using integrated assembler with immediate arithmetic

Fix description:
Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated on asm parsing stage,
since it is impossible to resolve labels on this stage. In the end of stage we still have
expression (MCExpr).
Then, when we want to encode it, we expect it to be an immediate, but it still an expression.
Patch introduces a Fixup (MCFixup instance), that is processed after main encoding stage.

llvm-svn: 205094
2014-03-29 13:12:40 +00:00
Rafael Espindola 5904e12bfa Completely rewrite ELFObjectWriter::RecordRelocation.
I started trying to fix a small issue, but this code has seen a small fix too
many.

The old code was fairly convoluted. Some of the issues it had:

* It failed to check if a symbol difference was in the some section when
  converting a relocation to pcrel.
* It failed to check if the relocation was already pcrel.
* The pcrel value computation was wrong in some cases (relocation-pc.s)
* It was missing quiet a few cases where it should not convert symbol
  relocations to section relocations, leaving the backends to patch it up.
* It would not propagate the fact that it had changed a relocation to pcrel,
  requiring a quiet nasty work around in ARM.
* It was missing comments.

llvm-svn: 205076
2014-03-29 06:26:49 +00:00
Rafael Espindola 3e3de5e353 Add const.
llvm-svn: 205013
2014-03-28 16:06:09 +00:00
Christian Pirker 2a11160956 Add ARM big endian Target (armeb, thumbeb)
Reviewed at http://llvm-reviews.chandlerc.com/D3095

llvm-svn: 205007
2014-03-28 14:35:30 +00:00
Rafael Espindola c03f44ca8a Remove another unused argument.
llvm-svn: 204961
2014-03-27 20:49:35 +00:00
Rafael Espindola 9ab380122a Remove unused argument.
llvm-svn: 204956
2014-03-27 20:41:17 +00:00
Stepan Dyatkovskiy e8747e30ef Rejected r204899 and r204900 due to remaining test failures on cmake-llvm-x86_64-linux buildbot.
llvm-svn: 204901
2014-03-27 08:38:18 +00:00
Stepan Dyatkovskiy 3530003008 Fix for pr18931: Crash using integrated assembler with immediate arithmetic
Fix description:
Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated on asm parsing stage,
since it is impossible to resolve labels on this stage. In the end of stage we still have
expression (MCExpr).
Then, when we want to encode it, we expect it to be an immediate, but it still an expression.
Patch introduces a Fixup (MCFixup instance), that is processed after main encoding stage.

llvm-svn: 204899
2014-03-27 07:49:39 +00:00
Jiangning Liu 1d3f2c7c82 ARM: raise error message when complex SO expressions can't really be
solved as a constant at compilation time.

llvm-svn: 204898
2014-03-27 07:42:58 +00:00
Kevin Enderby 5611398b6b Fix a problem with the ARM assembler incorrectly matching a
vector list parameter that is using all lanes "{d0[], d2[]}" but can
match and instruction with a ”{d0, d2}" parameter.

I’m finishing up a fix for proper checking of the unsupported
alignments on vld/vst instructions and ran into this.  Thus I don’t
have a test case at this time.  And adding all code that will
demonstrate the bug would obscure the very simple one line fix.
So if you would indulge me on not having a test case at this
time I’ll instead offer up a detailed explanation of what is
going on in this commit message.

This instruction:

	vld2.8  {d0[], d2[]}, [r4:64]

is not legal as the alignment can only be 16 when the size is 8.
Per this documentation:

A8.8.325 VLD2 (single 2-element structure to all lanes)
 <align> The alignment. It can be one of:
16 2-byte alignment, available only if <size> is 8, encoded as a = 1.
32 4-byte alignment, available only if <size> is 16, encoded as a = 1.
64 8-byte alignment, available only if <size> is 32, encoded as a = 1.
omitted Standard alignment, see Unaligned data access on page A3-108.

So when code is added to the llvm integrated assembler to not match
that instruction because of the alignment it then goes on to try to match
other instructions and comes across this:

	vld2.8  {d0, d2}, [r4:64]

and and matches it. This is because of the method
ARMOperand::isVecListDPairSpaced() is missing the check of the Kind.
In this case the Kind is k_VectorListAllLanes . While the name of the method
may suggest that this is OK it really should check that the Kind is
k_VectorList.

As the method ARMOperand::isDoubleSpacedVectorAllLanes() is what was
used to match {d0[], d2[]}  and correctly checks the Kind:

  bool isDoubleSpacedVectorAllLanes() const {
    return Kind == k_VectorListAllLanes && VectorList.isDoubleSpaced;
  }

where the original ARMOperand::isVecListDPairSpaced() does not check
the Kind:

  bool isVecListDPairSpaced() const {
    if (isSingleSpacedVectorList()) return false;
    return (ARMMCRegisterClasses[ARM::DPairSpcRegClassID]
              .contains(VectorList.RegNum));
  }

Jim Grosbach has reviewed the change and said:  Yep, that sounds right. …
And by "right" I mean, "wow, that's a nasty latent bug I'm really, really
glad to see fixed." :)

rdar://16436683

llvm-svn: 204861
2014-03-26 21:54:11 +00:00
Kevin Enderby 8108f38437 Fix the ARM VST4 (single 4-element structure from one lane)
size 16 double-spaced registers instruction printing.

This:
	vld4.16 {d17[1], d19[1], d21[1], d23[1]}, [r7]!

was being printed as:

	vld4.16 {d17[1], d18[1], d19[1], d20[1]}, [r7]!

rdar://16435096

llvm-svn: 204847
2014-03-26 19:35:40 +00:00
Tim Northover 1ff5f29fb5 ARM: add intrinsics for the v8 ldaex/stlex
We've already got versions without the barriers, so this just adds IR-level
support for generating the new v8 ones.

rdar://problem/16227836

llvm-svn: 204813
2014-03-26 14:39:31 +00:00
Renato Golin 93010e687f Change @llvm.clear_cache default to call rt-lib
After some discussion on IRC, emitting a call to the library function seems
like a better default, since it will move from a compiler internal error to
a linker error, that the user can work around until LLVM is fixed.

I'm also adding a note on the responsibility of the user to confirm that
the cache was cleared on platforms where nothing is done.

llvm-svn: 204806
2014-03-26 14:01:32 +00:00
Renato Golin c0a3c1d66b Add @llvm.clear_cache builtin
Implementing the LLVM part of the call to __builtin___clear_cache
which translates into an intrinsic @llvm.clear_cache and is lowered
by each target, either to a call to __clear_cache or nothing at all
incase the caches are unified.

Updating LangRef and adding some tests for the implemented architectures.
Other archs will have to implement the method in case this builtin
has to be compiled for it, since the default behaviour is to bail
unimplemented.

A Clang patch is required for the builtin to be lowered into the
llvm intrinsic. This will be done next.

llvm-svn: 204802
2014-03-26 12:52:28 +00:00
Kevin Enderby 89299400ac Fix crashes when assembler directives are used that are not
for Mach-O object files by generating an error instead.

rdar://16335232

llvm-svn: 204687
2014-03-25 00:05:50 +00:00
Arnaud A. de Grandmaison 1182600f20 ARM: no need to update SplatBits as it is not used
llvm-svn: 204575
2014-03-23 21:14:32 +00:00
Nuno Lopes 31617266ea remove a bunch of unused private methods
found with a smarter version of -Wunused-member-function that I'm playwing with.
Appologies in advance if I removed someone's WIP code.

 include/llvm/CodeGen/MachineSSAUpdater.h            |    1 
 include/llvm/IR/DebugInfo.h                         |    3 
 lib/CodeGen/MachineSSAUpdater.cpp                   |   10 --
 lib/CodeGen/PostRASchedulerList.cpp                 |    1 
 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    |   10 --
 lib/IR/DebugInfo.cpp                                |   12 --
 lib/MC/MCAsmStreamer.cpp                            |    2 
 lib/Support/YAMLParser.cpp                          |   39 ---------
 lib/TableGen/TGParser.cpp                           |   16 ---
 lib/TableGen/TGParser.h                             |    1 
 lib/Target/AArch64/AArch64TargetTransformInfo.cpp   |    9 --
 lib/Target/ARM/ARMCodeEmitter.cpp                   |   12 --
 lib/Target/ARM/ARMFastISel.cpp                      |   84 --------------------
 lib/Target/Mips/MipsCodeEmitter.cpp                 |   11 --
 lib/Target/Mips/MipsConstantIslandPass.cpp          |   12 --
 lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp              |   21 -----
 lib/Target/NVPTX/NVPTXISelDAGToDAG.h                |    2 
 lib/Target/PowerPC/PPCFastISel.cpp                  |    1 
 lib/Transforms/Instrumentation/AddressSanitizer.cpp |    2 
 lib/Transforms/Instrumentation/BoundsChecking.cpp   |    2 
 lib/Transforms/Instrumentation/MemorySanitizer.cpp  |    1 
 lib/Transforms/Scalar/LoopIdiomRecognize.cpp        |    8 -
 lib/Transforms/Scalar/SCCP.cpp                      |    1 
 utils/TableGen/CodeEmitterGen.cpp                   |    2 
 24 files changed, 2 insertions(+), 261 deletions(-)

llvm-svn: 204560
2014-03-23 17:09:26 +00:00
Craig Topper a9253267a9 Prune includes in ARM target.
llvm-svn: 204548
2014-03-22 23:51:00 +00:00
Saleem Abdulrasool 44419fc3cd ARM IAS: properly handle function entries in .thumb
When a label is parsed, check if there is type information available for the
label.  If so, check if the symbol is a function.  If the symbol is a function
and we are in thumb mode and no explicit thumb_func has been emitted, adjust the
symbol data to indicate that the function definition is a thumb function.

The application of this inferencing is improved value handling in the object
file (the required thumb bit is set on symbols which are thumb functions).  It
also helps improve compatibility with binutils.

The one complication that arises from this handling is the MCAsmStreamer.  The
default implementation of getOrCreateSymbolData in MCStreamer does not support
tracking the symbol data.  In order to support the semantics of thumb functions,
track symbol data in assembly streamer.  Although O(n) in number of labels in
the TU, this is already done in various other streamers and as such the memory
overhead is not a practical concern in this scenario.

llvm-svn: 204544
2014-03-22 19:26:18 +00:00
Jiangning Liu db55b02e1c This reverts commit r203762, "ARM: support emission of complex SO expressions".
The commit r203762 introduced silent failure for complext SO expression, and it's even worse than compiler crash.

llvm-svn: 204427
2014-03-21 02:51:01 +00:00
Weiming Zhao 0152485679 Fix PR19136: [ARM] Fix Folding SP Update into vpush/vpop
Sicne MBB->computeRegisterLivenes() returns Dead for sub regs like s0,
d0 is used in vpop instead of updating sp, which causes s0 dead before
its use.

This patch checks the liveness of each subreg to make sure the reg is
actually dead.

llvm-svn: 204411
2014-03-20 23:28:16 +00:00
Saleem Abdulrasool 39f773f939 Reapply 'ARM IAS: support .thumb_set'
Re-apply the change after it was reverted to do conflicts due to another change
being reverted.

llvm-svn: 204306
2014-03-20 06:05:33 +00:00
Hao Liu 40b5ab8e5b [ARM]Fix an assertion failure in A15SDOptimizer about DPair reg class by treating DPair as QPR.
llvm-svn: 204304
2014-03-20 05:36:59 +00:00
Rafael Espindola 7fadc0ea7d Look through variables when computing relocations.
Given

bar = foo + 4
	.long bar

MC would eat the 4. GNU as includes it in the relocation. The rule seems to be
that a variable that defines a symbol is used in the relocation and one that
does not define a symbol is evaluated and the result included in the relocation.

Fixing this unfortunately required some other changes:

* Since the variable is now evaluated, it would prevent the ELF writer from
  noticing the weakref marker the elf streamer uses. This patch then replaces
  that with a VariantKind in MCSymbolRefExpr.

* Using VariantKind then requires us to look past other VariantKind to see

	.weakref	bar,foo
	call	bar@PLT

  doing this also fixes

	zed = foo +2
	call zed@PLT

  so that is a good thing.

* Looking past VariantKind means that the relocation selection has to use
  the fixup instead of the target.

This is a reboot of the previous fixes for MC. I will watch the sanitizer
buildbot and wait for a build before adding back the previous fixes.

llvm-svn: 204294
2014-03-20 02:12:01 +00:00