Commit Graph

1037 Commits

Author SHA1 Message Date
Saleem Abdulrasool abac6e92a0 ARM: add VLA extension for WoA Itanium ABI
The armv7-windows-itanium environment is nearly identical to the MSVC ABI. It
has a few divergences, mostly revolving around the use of the Itanium ABI for
C++. VLA support is one such extension.

This adds support for proper VLA emission for this environment. This is
somewhat similar to the handling for __chkstk emission on X86 and the large
stack frame emission for ARM. The invocation style for chkstk is still
controlled via the -mcmodel flag to clang.
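
A minimal IR sketch of what triggers this path (illustrative only, not from
the commit; the function names are invented):

    define void @use_vla(i32 %n) {
      ; dynamically sized alloca => stack probing via __chkstk on WoA
      %vla = alloca i8, i32 %n
      call void @consume(i8* %vla)
      ret void
    }
    declare void @consume(i8*)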

Make an explicit note that this is an extension.

llvm-svn: 210489
2014-06-09 20:18:42 +00:00
Saleem Abdulrasool 90386ad60a ARM: correct assertion for long-calls on WoA
Windows on ARM uses COFF/PE, so the relocation model is never static.  Loosen the assertion
accordingly.  The relocation can still be emitted properly, as it will be
converted to an IMAGE_REL_ARM_ADDR32 which will be resolved by the loader
taking the base relocation into account.  This is necessary to permit the
emission of long calls which can be controlled via the -mlong-calls option in
the driver.

llvm-svn: 210399
2014-06-07 20:29:27 +00:00
Christian Pirker 762b2c624f ARMEB: Fix function return type f64
Reviewed at http://reviews.llvm.org/D3968

llvm-svn: 209990
2014-06-01 09:30:52 +00:00
Eric Christopher 8995833a34 Have the TLOF creation take a Triple rather than needing a subtarget.
llvm-svn: 209937
2014-05-31 00:07:32 +00:00
Tim Northover 4f1909f1da ARM: teach AAPCS-VFP to deal with Cortex-M4.
Cortex-M4 only has single-precision floating point support, so any LLVM
"double" type will have been split into 2 i32s by now. Fortunately, the
consecutive-register framework turns out to be precisely what's needed to
reconstruct the double and follow AAPCS-VFP correctly!

rdar://problem/17012966

llvm-svn: 209650
2014-05-27 10:43:38 +00:00
Benjamin Kramer f3ad23551d SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not.
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.
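
For illustration (not part of the commit), a vector bswap that can now be
legalized to a byte shuffle instead of being unrolled:

    declare <4 x i32> @llvm.bswap.v4i32(<4 x i32>)

    define <4 x i32> @swap_words(<4 x i32> %x) {
      ; becomes a byte shuffle; on ARM/ARM64 this matches vrev32.8
      %r = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %x)
      ret <4 x i32> %r
    }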

llvm-svn: 209123
2014-05-19 13:12:38 +00:00
Saleem Abdulrasool 8bfb192ecd ARM: make libcall setup more table driven
Rather than create a series of function calls to set up the library calls, create
a table with the information and just use the table to drive the configuration
of the library calls.  This makes it easier to both inspect the list as well as
to modify it.  NFC.

llvm-svn: 209089
2014-05-18 16:39:11 +00:00
Saleem Abdulrasool f11f4b4e20 ARM: consolidate frame pointer register knowledge
Use the ARMBaseRegisterInfo to query the frame register.  The base register info
is aware of the frame register that is used for the frame pointer.  Use that to
determine the frame register rather than duplicating the knowledge.  Although
the code path is slightly different in that it may return SP, that can only
occur if the frame pointer has been omitted in the machine function, in which
case SP is supposed to contain the desired value.

llvm-svn: 209084
2014-05-18 03:18:09 +00:00
Saleem Abdulrasool f3a5a5c546 Target: remove old constructors for CallLoweringInfo
This is mostly a mechanical change, converting all the call sites to the newer
chained-function construction pattern.  This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions.  No functional change beyond the removal of the old
constructors is intended.

llvm-svn: 209082
2014-05-17 21:50:17 +00:00
Saleem Abdulrasool 46fed305db ARM: use the proper target object format for WoA
WoA uses COFF, not ELF.  ARMISelLowering::createTLOF would previously return ELF
for any non-MachO platform.  This site was missed when target format support
for Windows on ARM was originally added.

llvm-svn: 209057
2014-05-17 04:28:08 +00:00
Saleem Abdulrasool 056fc3da4a ARM: add some integer/floating point conversion libcalls
Add some Windows on ARM specific library calls.  These are provided by msvcrt,
and can be used to perform integer to floating-point conversions (and
vice-versa) mirroring similar functions in the RTABI.

llvm-svn: 208949
2014-05-16 05:41:33 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Christian Pirker 6692e7c116 ARM-BE: test files for vector argument passing
Reviewed at http://reviews.llvm.org/D3766

llvm-svn: 208793
2014-05-14 16:59:44 +00:00
Christian Pirker 238c7c165b ARM: Implement big endian bit-conversion for NEON types
llvm-svn: 208538
2014-05-12 11:19:20 +00:00
Hal Finkel f0e086a0bc Pass the value type to TLI::getRegisterByName
We must validate the value type in TLI::getRegisterByName, because if we
don't and the wrong type was used with the IR intrinsic, then we'll assert
(because we won't be able to find a valid register class with which to
construct the requested copy operation). For PPC64, additionally, the type
information is necessary to decide between the 64-bit register and the 32-bit
subregister.

No functionality change.

llvm-svn: 208508
2014-05-11 19:29:07 +00:00
Louis Gerbarg 3342bf1451 Add custom lowering for add/sub with overflow intrinsics to ARM
This patch adds support to ARM for custom lowering of the
llvm.{u|s}add.with.overflow intrinsics for i32/i64. This is particularly useful
for handling idiomatic saturating math functions as generated by
InstCombineCompare.
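
A rough sketch of the saturating-add idiom this helps (editor's
illustration, not one of the included test cases; only positive
saturation is shown):

    declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

    define i32 @sat_add(i32 %a, i32 %b) {
      %s = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue { i32, i1 } %s, 0
      %ovf = extractvalue { i32, i1 } %s, 1
      ; clamp to INT32_MAX when the add overflows
      %r = select i1 %ovf, i32 2147483647, i32 %sum
      ret i32 %r
    }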

Test cases included.

rdar://14853450

llvm-svn: 208435
2014-05-09 17:02:49 +00:00
Oliver Stannard c24f2171ca ARM: HFAs must be passed in consecutive registers
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA; however, this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083), prevents this.

llvm-svn: 208413
2014-05-09 14:01:47 +00:00
Saleem Abdulrasool 40bca0afab ARM: support PIC on Windows on ARM
Handle lowering of global addresses for PIC mode compilation on Windows.  Always
use a movw/movt pair to load the address, as Windows on ARM requires ARMv7+ and
is a pure Thumb environment.

llvm-svn: 208385
2014-05-09 00:58:32 +00:00
Christian Pirker b5728191c2 ARM big endian function argument passing
llvm-svn: 208316
2014-05-08 14:06:24 +00:00
Renato Golin c7aea40ec6 Implementing named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.
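
Illustrative use of the intrinsic (editor's sketch using current IR
metadata syntax, not an excerpt from the patch):

    declare i32 @llvm.read_register.i32(metadata)

    define i32 @current_sp() {
      ; reads the register named by the metadata string
      %sp = call i32 @llvm.read_register.i32(metadata !0)
      ret i32 %sp
    }

    !0 = !{!"sp"}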

llvm-svn: 208104
2014-05-06 16:51:25 +00:00
Craig Topper 64941d9786 Convert SelectionDAG::getMergeValues to use ArrayRef.
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Craig Topper 206fcd450a Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
llvm-svn: 207329
2014-04-26 19:29:41 +00:00
Craig Topper 48d114bed1 Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Benjamin Kramer 4dae598bc8 DAGCombiner: Turn divs of vector splats into vectorized multiplications.
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.
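
For instance (an illustrative fragment, not one of the new tests), a splat
divisor like this no longer has to be scalarized:

    define <4 x i32> @div_by_7(<4 x i32> %x) {
      ; lowered via multiply-high plus shifts rather than four scalar divides
      %r = udiv <4 x i32> %x, <i32 7, i32 7, i32 7, i32 7>
      ret <4 x i32> %r
    }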

I had to disable the mulhi nodes on ARM because there aren't any patterns
for them. As far as I know, ARM has instructions for getting the high part of
a multiply, so this should be fixed.

llvm-svn: 207315
2014-04-26 12:06:28 +00:00
Craig Topper 062a2baef0 [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Reid Kleckner 5772b77789 Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with conservative IR
verification rules that ensure that tail call optimization is possible.
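
A minimal sketch of the marker (editor's example, not from the change):

    define i32 @forward(i32 %x) {
      ; the verifier requires that a ret immediately follow a musttail call
      %r = musttail call i32 @impl(i32 %x)
      ret i32 %r
    }
    declare i32 @impl(i32)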

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

llvm-svn: 207143
2014-04-24 20:14:34 +00:00
Tim Northover 978d25f391 ARM: disable emission of __XYZvfp in soft-float environment.
The point of these calls is to allow Thumb-1 code to make use of the VFP unit
to perform its operations. This is not desirable with -msoft-float, since most
of the reasons you'd want that apply equally to the runtime library.

rdar://problem/13766161

llvm-svn: 206874
2014-04-22 10:10:09 +00:00
Chandler Carruth 84e68b2994 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Target/...
edition.

llvm-svn: 206842
2014-04-22 02:41:26 +00:00
Tim Northover 037f26f212 Atomics: promote ARM's IR-based atomics pass to CodeGen.
Still only used by 32-bit ARM at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.

At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).

llvm-svn: 206485
2014-04-17 18:22:47 +00:00
Craig Topper abb4ac7f87 Convert SelectionDAG::getVTList to use ArrayRef
llvm-svn: 206357
2014-04-16 06:10:51 +00:00
Craig Topper 840beec2d0 Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
llvm-svn: 205610
2014-04-04 05:16:06 +00:00
Jim Grosbach 1a59711505 Tidy up. Trailing whitespace.
llvm-svn: 205583
2014-04-03 23:43:18 +00:00
Tim Northover 01b4aa9437 ARM: tell LLVM about zext properties of ldrexb/ldrexh
Implementing this via ComputeMaskedBits has two advantages:
  + It actually works. DAGISel doesn't deal with the chains properly
    in the previous pattern-based solution, so they never trigger.
  + The information can be used in other DAG combines, as well as the
    trivial "get rid of truncs" (for example, when the trunc is in a
    different basic block).
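
A sketch of the kind of redundancy this exposes (illustrative, not from the
commit):

    declare i32 @llvm.arm.ldrex.p0i8(i8*)

    define i32 @load_byte(i8* %p) {
      ; ldrexb already zero-extends, so this masking can now fold away
      %v = call i32 @llvm.arm.ldrex.p0i8(i8* %p)
      %masked = and i32 %v, 255
      ret i32 %masked
    }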

rdar://problem/16227836

llvm-svn: 205540
2014-04-03 15:10:35 +00:00
Tim Northover c882eb0723 ARM: expand atomic ldrex/strex loops in IR
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).

Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:

1. an atomicrmw followed by using the *new* value can be more
   efficient. As an IR pass, simple CSE could handle this
   efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
   in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
   optimisation.

I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with (at least ARM64/AArch64, though I think
PPC & Mips could benefit too).
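
Roughly, the expansion rewrites an atomicrmw such as

    %old = atomicrmw add i32* %addr, i32 1 seq_cst

into an explicit exclusive-load/store-conditional loop in IR. A simplified
sketch (editor's illustration; the real expansion also places barriers
according to the memory ordering):

    declare i32 @llvm.arm.ldrex.p0i32(i32*)
    declare i32 @llvm.arm.strex.p0i32(i32, i32*)

    define i32 @atomic_inc(i32* %addr) {
    entry:
      br label %loop
    loop:
      %loaded = call i32 @llvm.arm.ldrex.p0i32(i32* %addr)
      %new = add i32 %loaded, 1
      %failed = call i32 @llvm.arm.strex.p0i32(i32 %new, i32* %addr)
      %retry = icmp ne i32 %failed, 0
      br i1 %retry, label %loop, label %done
    done:
      ret i32 %loaded
    }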

llvm-svn: 205525
2014-04-03 11:44:58 +00:00
Silviu Baranga a3106e6847 [ARM] When generating a vpaddl node, the input lane type is not always the type of the
add operation, since extract_vector_elt can perform an extend operation. Get the input lane
type from the vector on which we're performing the vpaddl operation, and extend or
truncate it to the output type of the original add node.

llvm-svn: 205523
2014-04-03 10:44:27 +00:00
Saleem Abdulrasool cd1308296e ARM: update subtarget information for Windows on ARM
Update the subtarget information for Windows on ARM.  This enables using the MC
layer to target Windows on ARM.

llvm-svn: 205459
2014-04-02 20:32:05 +00:00
Oliver Stannard b14c625111 ARM: Add support for segmented stacks
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.

llvm-svn: 205430
2014-04-02 16:10:33 +00:00
Tim Northover 1ff5f29fb5 ARM: add intrinsics for the v8 ldaex/stlex
We've already got versions without the barriers, so this just adds IR-level
support for generating the new v8 ones.
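
Illustrative IR for the new acquiring load-exclusive (editor's example, not
from the commit):

    declare i32 @llvm.arm.ldaex.p0i32(i32*)

    define i32 @load_acq_excl(i32* %p) {
      ; like ldrex, but with acquire semantics (v8 ldaex)
      %v = call i32 @llvm.arm.ldaex.p0i32(i32* %p)
      ret i32 %v
    }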

rdar://problem/16227836

llvm-svn: 204813
2014-03-26 14:39:31 +00:00
Arnaud A. de Grandmaison 1182600f20 ARM: no need to update SplatBits as it is not used
llvm-svn: 204575
2014-03-23 21:14:32 +00:00
Craig Topper a9253267a9 Prune includes in ARM target.
llvm-svn: 204548
2014-03-22 23:51:00 +00:00
Saleem Abdulrasool 0d96f3dd6e ARM: honour -f{no-,}optimize-sibling-calls
Use the options in ARMISelLowering to control whether tail calls are
optimised or not.  Previously, this option was entirely ignored on the ARM
target and only honoured on x86.

This option is mostly useful in profiling scenarios.  The default remains that
tail call optimisations will be applied.

llvm-svn: 203577
2014-03-11 15:09:54 +00:00
Saleem Abdulrasool b720a6bab7 ARM: remove ancient -arm-tail-calls option
This option is from 2010, designed to work around a linker issue on Darwin for
ARM.  According to grosbach this is no longer an issue and this option can
safely be removed.

llvm-svn: 203576
2014-03-11 15:09:49 +00:00
Tim Northover 445dd58aae ARM: simplify EmitAtomicBinary64
ATOMIC_STORE operations always get here as a lowered ATOMIC_SWAP, so there's no
need for any code to handle them specially.

There should be no functionality change so no tests.

llvm-svn: 203567
2014-03-11 13:19:55 +00:00
Tim Northover e94a518a22 IR: add a second ordering operand to cmpxchg for failure
The syntax for "cmpxchg" should now look something like:

	cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).

rdar://problem/15996804

llvm-svn: 203559
2014-03-11 10:48:52 +00:00
Oliver Stannard d55e115b58 ARM: Correctly align arguments after a byval struct is passed on the stack
llvm-svn: 202985
2014-03-05 15:25:27 +00:00
Benjamin Kramer b6d0bd48bd [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Mark Seaborn be266aa325 Use 16 byte stack alignment for NaCl on ARM
NaCl's ARM ABI uses 16 byte stack alignment, so set that in
ARMSubtarget.cpp.

Using 16 byte alignment exposes an issue in code generation in which a
varargs function leaves a 4 byte gap between the values of r1-r3 saved
to the stack and the following arguments that were passed on the
stack.  (Previously, this code only needed to support 4 byte and 8
byte alignment.)

With this issue, llc generated:

varargs_func:
        sub     sp, sp, #16
        push    {lr}
        sub     sp, sp, #12
        add     r0, sp, #16   // Should be 20
        stm     r0, {r1, r2, r3}
        ldr     r0, .LCPI0_0  // Address of va_list
        add     r1, sp, #16
        str     r1, [r0]
        bl      external_func

Fix the bug by checking for "Align > 4".  Also simplify the code by
using OffsetToAlignment(), and update comments.

Differential Revision: http://llvm-reviews.chandlerc.com/D2677

llvm-svn: 201497
2014-02-16 18:59:48 +00:00
Tim Northover b0430415e6 ARM: use natural LLVM IR for vshll instructions
Similarly to the vshrn instructions, these are simple zext/sext + shift
operations. Using normal LLVM IR should allow for better code, and more sharing
with the AArch64 backend.
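
A sketch of the natural-IR form (editor's illustration): the widening
shift-left-long becomes an extend followed by a plain shift:

    define <8 x i16> @shift_left_long(<8 x i8> %x) {
      ; vshll.u8 #3 expressed as zext + shl
      %w = zext <8 x i8> %x to <8 x i16>
      %r = shl <8 x i16> %w, <i16 3, i16 3, i16 3, i16 3, i16 3, i16 3, i16 3, i16 3>
      ret <8 x i16> %r
    }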

llvm-svn: 201093
2014-02-10 16:20:29 +00:00
Tim Northover 170daafe01 ARM: use LLVM IR to represent the vshrn operation
vshrn is just the combination of a right shift and a truncate (and the limits
on the immediate value actually mean the signedness of the shift doesn't
matter). Using that representation allows us to get rid of an ARM-specific
intrinsic, share more code with AArch64 and hopefully get better code out of
the mid-end optimisers.
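
Correspondingly (again an editor's illustration), vshrn in natural IR:

    define <8 x i8> @shift_right_narrow(<8 x i16> %x) {
      ; vshrn.i16 #5 as shift + truncate; lshr vs ashr makes no difference
      ; here because the truncate discards the bits that would differ
      %s = lshr <8 x i16> %x, <i16 5, i16 5, i16 5, i16 5, i16 5, i16 5, i16 5, i16 5>
      %r = trunc <8 x i16> %s to <8 x i8>
      ret <8 x i8> %r
    }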

llvm-svn: 201085
2014-02-10 14:04:07 +00:00
Matt Arsenault 25793a3f22 Add address space argument to allowsUnalignedMemoryAccess.
On R600, some address spaces have more strict alignment
requirements than others.

llvm-svn: 200887
2014-02-05 23:15:53 +00:00