Commit Graph

8528 Commits

Author SHA1 Message Date
Artyom Skrobov e6f1b7f094 Replace a string comparison in ARMSubtarget.h with a tablegen entry in ARM.td (NFC)
Reviewers: rengolin, t.p.northover

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D18393

llvm-svn: 264165
2016-03-23 16:18:13 +00:00
Peter Collingbourne 86b9fbe980 ARM: Better codegen for 64-bit compares.
This introduces a custom lowering for ISD::SETCCE (introduced in r253572)
that allows us to emit a short code sequence for 64-bit compares.

Before:

	push	{r7, lr}
	cmp	r0, r2
	mov.w	r0, #0
	mov.w	r12, #0
	it	hs
	movhs	r0, #1
	cmp	r1, r3
	it	ge
	movge.w	r12, #1
	it	eq
	moveq	r12, r0
	cmp.w	r12, #0
	bne	.LBB1_2
@ BB#1:                                 @ %bb1
	bl	f
	pop	{r7, pc}
.LBB1_2:                                @ %bb2
	bl	g
	pop	{r7, pc}

After:

	push	{r7, lr}
	subs	r0, r0, r2
	sbcs.w	r0, r1, r3
	bge	.LBB1_2
@ BB#1:                                 @ %bb1
	bl	f
	pop	{r7, pc}
.LBB1_2:                                @ %bb2
	bl	g
	pop	{r7, pc}

Saves around 80KB in Chromium's libchrome.so.

Some notes on this patch:

- I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I
  introduced (nothing else needs them). However, they are necessary in
  order to avoid poor codegen, and they seem similar to existing combines
  in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to
  (brcond Compare)).

- No support for Thumb-1. This is in principle possible, but we'd need
  to implement ARMISD::SUBE for Thumb-1.

Differential Revision: http://reviews.llvm.org/D15256

llvm-svn: 263962
2016-03-21 18:00:02 +00:00
Renato Golin 2b6b7ffd6c [ARM] Add Cortex-A32 support
Adding Cortex-A32 as an available target in the ARM backend.

Patch by Sam Parker.

llvm-svn: 263956
2016-03-21 17:29:01 +00:00
Manman Ren a3a019cf90 [CXX_FAST_TLS] Fix issues in ARM.
We need to be careful on which registers can be explicitly handled
via copies. Prologue, Epilogue use physical registers and if one belongs
to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites
it after the explicit copies.

llvm-svn: 263857
2016-03-18 23:44:37 +00:00
Manman Ren 4865d89653 [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.
Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller
and callee have mismatched calling conventions.

llvm-svn: 263856
2016-03-18 23:41:51 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Tim Northover 498c56c240 ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering.
llvm-svn: 263741
2016-03-17 20:10:28 +00:00
Saleem Abdulrasool 071a099102 ARM: Revert SVN r253865, 254158, fix windows division
The two changes together weakened the test and caused a regression with division
handling in MSVC mode.  They were applied to avoid an assertion being triggered
in the block frequency analysis.  However, the underlying problem was simply
being masked rather than solved properly.  Address the actual underlying problem
and revert the changes.  Rather than analyze the cause of the assertion, the
division failure was assumed to be an overflow.

The underlying issue was a subtle bug in the BB construction in the emission of
the div-by-zero check (WIN__DBZCHK).  We did not construct the proper successor
information in the basic blocks, nor did we update the PHIs associated with the
basic block when we split them.  This would result in assertions being triggered
in the block frequency analysis pass.

Although the original tests are being removed, the tests themselves performed
very little in terms of validation but merely tested that we did not assert when
generating code.  Update this with new tests that actually ensure that we do not
regress on the code generation.

llvm-svn: 263714
2016-03-17 14:10:49 +00:00
James Y Knight f44fc5219f Tweak some atomics functions in preparation for larger changes; NFC.
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
  '__sync' libcalls and '__atomic' libcalls, and this function is for
  the '__sync' ones.

- getInsertFencesForAtomic() has been replaced with
  shouldInsertFencesForAtomic(Instruction), so that the decision can be
  made per-instruction. This functionality will be used soon.

- emitLeadingFence/emitTrailingFence are no longer called if
  shouldInsertFencesForAtomic returns false, and thus don't need to
  check the condition themselves.

llvm-svn: 263665
2016-03-16 22:12:04 +00:00
Davide Italiano dfdf278ebf [MC] Rename TLSDESC as it's not ARM specific.
Similarly to what was done for TLSCALL in r263515.

llvm-svn: 263564
2016-03-15 17:29:52 +00:00
Davide Italiano 249c45d92e [MC] Rename TLSCALL as it's not ARM specific.
`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with 
_ARM_ since this is how it was originally implemented.
The X86_64 version is exactly the same so there's no reason
to create a new variant, we can just rename the existing
one to be machine-independent.
This generalization is the first step to implement support
for GNU2 TLS dialect in MC.

Differential Revision:  http://reviews.llvm.org/D18160

llvm-svn: 263515
2016-03-15 00:25:22 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Peter Collingbourne aba16fca5d ARM: Support relative references using the PREL31 symbol variant.
Differential Revision: http://reviews.llvm.org/D17937

llvm-svn: 263156
2016-03-10 19:30:18 +00:00
Alexandros Lamprineas 843164242e [ARM] Cortex-R8 support
This patch adds Cortex-R8 to Target Parser and TableGen.
It also adds CodeGen tests for the build attributes.

Patch by Pablo Barrio.

Differential Revision: http://reviews.llvm.org/D17925

llvm-svn: 263132
2016-03-10 17:38:41 +00:00
Saleem Abdulrasool 1632fe1f77 ARM: follow up improvements for SVN r263118
The initial change was insufficiently complete for always getting the semantics
of __builtin_longjmp correct.  The builtin is translated into a
`tInt_eh_sjlj_longjmp` DAG node.  This node set R7 as clobbered.  However, the
code would then follow up with a clobber of R11.  I had failed to notice the
imp-def,kill on R7 in the isel.  Unfortunately, it seems that it is not possible
to conditionalise the Defs list via an !if.  Instead, construct a new parallel
WIN node and prefer that when targeting windows.  This ensures that we now both
correctly model the __builtin_longjmp as well as construct the frame in a more
ABI conformant manner.

llvm-svn: 263123
2016-03-10 16:26:37 +00:00
Saleem Abdulrasool 8b30f9854e ARM: correct __builtin_longjmp on WoA
WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast
to AAPCS which states r7.  This adjusts __builtin_longjmp to not clobber r7 and
to properly restore the frame pointer on execution.

llvm-svn: 263118
2016-03-10 15:11:09 +00:00
Roman Levenstein 2792b3f02f Add support for a preserve_most calling convention to the AArch64 backend.
This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.

There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention.

Differential Revision: http://reviews.llvm.org/D18016

llvm-svn: 263092
2016-03-10 04:35:09 +00:00
Artyom Skrobov 5ddea6a8e9 [ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC)
Reviewers: t.p.northover, grosbach, resistor

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17636

llvm-svn: 262936
2016-03-08 16:23:54 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Matthias Braun f290912d22 ARM: Introduce conservative load/store optimization mode
Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which
means that unaligned loads/stores do not trap and even extensive testing
will not catch these bugs. However the multi/double variants are not
affected by this bit and will still trap. In effect a more aggressive
load/store optimization will break existing (bad) code.

These bugs do not necessarily manifest in the broken code where the
misaligned pointer is formed but often later in perfectly legal code
where it is accessed. This means recompiling system libraries (which
have no alignment bugs) with a newer compiler will break existing
applications (with alignment bugs) that worked before.

So (under protest) I implemented this safe mode which limits the
formation of multi/double operations to cases that are not affected by
user code (stack operations like spills/reloads) or cases where the
normal operations trap anyway (floating point load/stores). It is
disabled by default.

Differential Revision: http://reviews.llvm.org/D17015

llvm-svn: 262504
2016-03-02 19:20:00 +00:00
Matthias Braun 17cb57995e TableGen: Check scheduling models for completeness
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:

- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model

Typical steps necessary to complete a model:

- Ensure all pseudo instructions that are expanded before machine
  scheduling (usually everything handled with EmitYYY() functions in
  XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
  resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.

Differential Revision: http://reviews.llvm.org/D17747

llvm-svn: 262384
2016-03-01 20:03:21 +00:00
Duncan P. N. Exon Smith fd8cc23220 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Tim Northover aa35bd26c7 ARM: disallow pc as a base register in Thumb2 memory ops.
These should all be deferring to the "OP (literal)" variant according to the
ARM ARM.

llvm-svn: 261895
2016-02-25 16:54:52 +00:00
Tim Northover b9edc5ca21 ARM: fix handling of movw/movt relocations with addend.
We were emitting only one half of a the paired relocations needed for these
instructions because we decided that an offset needed a scattered relocation.
In fact, movw/movt relocations can be paired without being scattered.

llvm-svn: 261679
2016-02-23 20:20:23 +00:00
Weiming Zhao 5e0c3ebe9e Fix PR25339: ARM Constant Island
Summary:
Currently, the ARM Constant Island may not converge (or not converge quickly).
This patch let it move to the closest water after the user if it doesn't converge after 15 iterations.

This address https://llvm.org/bugs/show_bug.cgi?id=25339

Reviewers: t.p.northover, srhines, kristof.beyls, aadg, rengolin

Subscribers: weimingz, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D16890

llvm-svn: 261665
2016-02-23 18:39:19 +00:00
Junmo Park 453f4aa4dd [ARM] fix initialization of PredictableSelectIsExpensive
Summary:
If we want classify OoO or not, using getSchedModel().isOutOfOrder()
could be more proper way than using Subtarget->isLikeA9().

Reviewers: jmolloy, rengolin

Differential Revision: http://reviews.llvm.org/D17433

llvm-svn: 261623
2016-02-23 09:56:58 +00:00
Duncan P. N. Exon Smith 6307eb5518 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith d84f600653 CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()...
This is a little embarrassing.

When I reverted r261504 (getIterator() => getInstrIterator()) in
r261567, I did a `git grep` to see if there were new calls to
`getInstrIterator()` that I needed to migrate.  There were 10-20 hits,
and I blindly did a `sed ...` before calling `ninja check`.

However, these were `MachineInstrBundleIterator::getInstrIterator()`,
which predated r261567.  Perhaps coincidentally, these had an identical
name and return type.

This commit undoes my careless sed and restores
`MachineBasicBlock::iterator::getInstrIterator()`.

llvm-svn: 261577
2016-02-22 21:30:15 +00:00
Duncan P. N. Exon Smith c5b668deb8 Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html

I'll recommit if we get consensus that it's the right direction.

llvm-svn: 261567
2016-02-22 20:49:58 +00:00
Duncan P. N. Exon Smith dc0848c029 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Junmo Park 1108ab059c Minor code cleanups. NFC.
llvm-svn: 261294
2016-02-19 01:46:04 +00:00
Ahmed Bougacha 93cff7fb82 [CodeGen] Document and use getConstant's splat-building feature. NFC.
Differential Revision: http://reviews.llvm.org/D17229

llvm-svn: 260901
2016-02-15 18:07:29 +00:00
Ahmed Bougacha f8dfb47c02 [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Saleem Abdulrasool f36005a358 ARM: support TLS for WoA
Add support for TLS access for Windows on ARM.  This generates a similar access
to MSVC for ARM.

The changes to the tablegen data is needed to support loading an external symbol
global that is not for a call.  The adjustments to the DAG to DAG transforms are
needed to preserve the 32-bit move.

llvm-svn: 259676
2016-02-03 18:21:59 +00:00
Renato Golin 6027dd38ef [ARM] Move GNUEABI divmod to __aeabi_divmod*
The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores
which happens to be a lot faster than __divsi3/__modsi3 when the core
has hardware divide instructions. Do the same here.

Fixes PR26450.

llvm-svn: 259657
2016-02-03 16:10:54 +00:00
Sjoerd Meijer ffe19f5245 Removed FeatureVFPOnlySP from the Cortex-R7 processor model
description and changed the regression test accordingly.
The default configuration of a Cortex-R7 is to implement the
VFPv3-D16 architecture and the feature line as it was is too
restrictive.

llvm-svn: 259480
2016-02-02 09:28:20 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Yaron Keren eb2a25467e Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.
clang part in r259232, this is the LLVM part of the patch.

llvm-svn: 259240
2016-01-29 20:50:44 +00:00
Tim Northover c4093c3ced ARM: don't mangle DAG constant if it has more than one use
The basic optimisation was to convert (mul $LHS, $complex_constant) into
roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected
to be cheaper. The original logic checks that the mul only has one use (since
we're mangling $complex_constant), but when used in even more complex
addressing modes there may be an outer addition that can pick up the wrong
value too.

I *think* the ARM addressing-mode problem is actually unreachable at the
moment, but that depends on complex assessments of the profitability of
pre-increment addressing modes so I've put a real check in there instead of an
assertion.

llvm-svn: 259228
2016-01-29 19:18:46 +00:00
Alexandros Lamprineas 8c26e7c647 [ARM] Emit trap instruction using .inst directive
The trap instruction is emitted as a data-in-text rather
than an instruction. This patch uses the .inst directive
for emitting trap.

Differential Revision: http://reviews.llvm.org/D16684

llvm-svn: 259182
2016-01-29 10:23:32 +00:00
Tim Northover 042a6c1fe1 ARMv7k: base ABI decision on v7k Arch rather than watchos OS.
Various bits we want to use the new ABI actually compile with "-arch armv7k
-miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how
slices work.

llvm-svn: 258975
2016-01-27 19:32:29 +00:00
Benjamin Kramer 391be792f2 One more batch of self-containing headers.
llvm-svn: 258974
2016-01-27 19:29:56 +00:00
Benjamin Kramer b32a5042bd Don't put classes in headers into anonymous namespaces.
You want ODR violations? That's how you get ODR violations.

llvm-svn: 258973
2016-01-27 19:29:42 +00:00
Benjamin Kramer 45275a4d3c Make more headers self-contained.
A lot of this comes from the new complete type requirement of DenseMap.

llvm-svn: 258956
2016-01-27 18:03:37 +00:00
Benjamin Kramer f9172fd4ac Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/
It's a SelectionDAG thing, not a Target thing.

llvm-svn: 258939
2016-01-27 16:32:26 +00:00
Benjamin Kramer b3e8a6d2b8 Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs.
llvm-svn: 258917
2016-01-27 10:01:28 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Benjamin Kramer f57c1977c1 Reflect the MC/MCDisassembler split on the include/ level.
No functional change, just moving code around.

llvm-svn: 258818
2016-01-26 16:44:37 +00:00
Bradley Smith d27a6a7072 [ARM] Add DSP build attribute and extension targeting
This patch was originally committed as r257885, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258683
2016-01-25 11:26:11 +00:00
Bradley Smith f277c8a5ea [ARM] Add new system registers to ARMv8-M Baseline/Mainline
This patch was originally committed as r257884, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258682
2016-01-25 11:25:36 +00:00
Bradley Smith fed3e4ac00 [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
This patch was originally committed as r257883, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258681
2016-01-25 11:24:47 +00:00
Oliver Stannard 65b85382f6 [ARM] Add ARMv8.2-A FP16 scalar instructions
This was originally committed as r255762, but reverted as it broke windows
bots. Re-commitiing the exact same patch, as the underlying cause was fixed by
r258677.

ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.

These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.

Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.

New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.

Differential Revision: http://reviews.llvm.org/D15038

llvm-svn: 258678
2016-01-25 10:26:26 +00:00
Oliver Stannard 9f68749eba [ARM] Operands for PKHTB alias should be swapped
When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of
the input operands needs to be swapped.

Differential Revision: http://reviews.llvm.org/D16288

llvm-svn: 258044
2016-01-18 11:56:35 +00:00
Manuel Jacob 190577ac81 [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: dsanders, llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16273

llvm-svn: 258023
2016-01-17 22:37:39 +00:00
Manman Ren e5f807f928 CXX_FAST_TLS calling convention: fix issue on ARM.
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.

PR26136

llvm-svn: 257930
2016-01-15 20:24:11 +00:00
Reid Kleckner d4a0d18899 Revert "[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline"
This reverts commit r257883.

Somehow this didn't make it into r257916.

llvm-svn: 257919
2016-01-15 18:55:12 +00:00
Reid Kleckner 47f2452da8 # This is a combination of 2 commits.
# The first commit's message is:

Revert "[ARM] Add DSP build attribute and extension targeting"

This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc.

# This is the 2nd commit message:

Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"

This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5.

llvm-svn: 257916
2016-01-15 18:31:29 +00:00
Bradley Smith 48b93e1f21 [ARM] Add DSP build attribute and extension targeting
llvm-svn: 257885
2016-01-15 10:28:25 +00:00
Bradley Smith 42f6e90a43 [ARM] Add new system registers to ARMv8-M Baseline/Mainline
llvm-svn: 257884
2016-01-15 10:28:03 +00:00
Bradley Smith 618712df04 [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257883
2016-01-15 10:27:14 +00:00
Bradley Smith 433c22e35c [ARM] Add ARMv8-A semaphore/atomic instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257882
2016-01-15 10:26:51 +00:00
Bradley Smith a1189106d5 [ARM] Add B.W and CBZ instructions to ARMv8-M Baseline
llvm-svn: 257881
2016-01-15 10:26:17 +00:00
Bradley Smith 519563e371 [ARM] Add SDIV/UDIV instructions to ARMv8-M Baseline
llvm-svn: 257880
2016-01-15 10:25:35 +00:00
Bradley Smith d9a99ce53d [ARM] Add MOVW/MOVT instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257879
2016-01-15 10:25:14 +00:00
Bradley Smith e26f799422 [ARM] Add ARMv8-M Baseline/Mainline LLVM targeting
llvm-svn: 257878
2016-01-15 10:24:39 +00:00
Bradley Smith 4c21cba72b [ARM] Split out ARMv8-A semaphores and atomics and ARMv7 clrex as separate features
llvm-svn: 257877
2016-01-15 10:23:46 +00:00
Rui Ueyama da00f2fdf4 Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
Benjamin Kramer fc1f7d893e [ARM] Use the efficient version of BitVector::set and a static_assert.
No functional change intended.

llvm-svn: 257766
2016-01-14 14:33:04 +00:00
Rafael Espindola 8340f94df1 Convert a few assert failures into proper errors.
Fixes PR25944.

llvm-svn: 257697
2016-01-13 22:56:57 +00:00
Ana Pazos 359cab3bb3 Guard fabs to bfc convert with V6T2 flag
Summary:
BFC instructions are available in ARMv6T2 and above.


Reviewers: t.p.northover

Subscribers: aemerson

Differential Revision: http://reviews.llvm.org/D16076

llvm-svn: 257546
2016-01-13 00:03:35 +00:00
Quentin Colombet f8e3030794 [ARM] Mark VMOV with immediate: isAsCheapAsMove.
VMOVs are not strictly speaking cheap, but they are as expensive as a vector
copy (VORR), so we should prefer rematerialization over splitting when it
applies.

rdar://problem/23754176

llvm-svn: 257545
2016-01-13 00:02:40 +00:00
Keno Fischer 00021429d4 [ARM] Fix several state persistence bugs
Summary:
This fixes three bugs, in all of which state is not or incorrecly reset between
objects (i.e. when reusing the same pass manager to create multiple object
files):
1) AttributeSection needs to be reset to nullptr, because otherwise the backend
   will try to emit into the old object file's attribute section causing a
   segmentation fault.
2) MappingSymbolCounter needs to be reset, otherwise the second object file
   will start where the first one left off.
3) The MCStreamer base class resets the Streamer's e_flags settings. Since
   EF_ARM_EABI_VER5 is set on streamer creation, we need to set it again
   after the MCStreamer was rest.

Also rename Reset (uppser case) to EHReset to avoid confusion with
reset (lower case).

Reviewers: rengolin
Differential Revision: http://reviews.llvm.org/D15950

llvm-svn: 257473
2016-01-12 13:38:15 +00:00
Manman Ren 5e9e65e705 CXX_FAST_TLS calling convention: performance improvement for ARM.
This is the same change on ARM as r255821 on AArch64.
rdar://9001553

llvm-svn: 257424
2016-01-12 00:47:18 +00:00
Manman Ren 1602605bf8 CXX_FAST_TLS calling convention: Add support for ARM on Darwin.
rdar://9001553

llvm-svn: 257417
2016-01-11 23:50:43 +00:00
Weiming Zhao 4b3b13d3bc RBIT Instruction only available for ARMv6t2 and above.
Summary:
r255334 matches bit-reverse pattern in InstCombine and generates calls to Instrinsic::bitreverse.

RBIT instruction is only available for ARMv6t2 and above. This patch has the intrinsic expanded during legalization for ARMv4 and ARMv5.

Patch by Z. Zheng <zhaoshiz@codeaurora.org>

Reviewers: apazos, jmolloy, weimingz

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D15932

llvm-svn: 257188
2016-01-08 18:43:41 +00:00
Weiming Zhao 48c033e021 Disable shrink-wrap for Thumb1
Summary: In ARMConstantIslandPass, which runs after Shrink Wrap pass, long jumps will be fixed up as BL (tBfar) which depends on spilling LR in epilogue.  However, shrink-wrap may remove the LR, which causes issues when the function returns.

Reviewers: qcolombet, rengolin

Subscribers: aemerson, rengolin

Differential Revision: http://reviews.llvm.org/D15984

llvm-svn: 257187
2016-01-08 18:37:43 +00:00
Eric Christopher b793230797 Add some testing for thumb1 and thumb2 inline asm immediate constraints
and fix a couple of bugs on inspection.

Also fixes PR26061.

llvm-svn: 257122
2016-01-08 00:34:44 +00:00
Tim Northover bd41cf880c ARM: support TLS accesses on Darwin platforms
Darwin TLS accesses most closely resemble ELF's general-dynamic situation,
since they have to be able to handle all possible situations. The descriptors
and so on are obviously slightly different though.

llvm-svn: 257039
2016-01-07 09:03:03 +00:00
Philip Reames c86ed0055d Extract helper function to merge MemoryOperand lists [NFC]
In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder.

I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface.

Differential Revision: http://reviews.llvm.org/D15757

llvm-svn: 256909
2016-01-06 04:39:03 +00:00
MinSeong Kim a7385ebf78 [AArch64] Add support for Samsung Exynos-M1
Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A).

Differential Revision: http://reviews.llvm.org/D15663

llvm-svn: 256828
2016-01-05 12:51:59 +00:00
Craig Topper e30b8ca149 Use std::is_sorted and std::none_of instead of manual loops. NFC
llvm-svn: 256719
2016-01-03 19:43:40 +00:00
Artyom Skrobov 2aca0c622a [Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'
Summary:
* avoid generating POP {LR} in Thumb1 epilogues
* combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass
* combine POP {LR} + B + BX LR -> POP {PC} on v5T+

Test cases by Ana Pazos

Differential Revision: http://reviews.llvm.org/D15707

llvm-svn: 256523
2015-12-28 21:40:45 +00:00
Roman Divacky 73fc84761f Support clrex instruction on ARMv6k. Patch by Andrew Turner.
llvm-svn: 256505
2015-12-28 17:47:23 +00:00
Craig Topper daf2e3ff7a Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC
llvm-svn: 256427
2015-12-25 22:10:01 +00:00
David Majnemer 03e2cc3007 [MC, COFF] Support link /incremental conditionally
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker.  This results
in us initialing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).

Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a-priori that the build will
not rely on /incremental.

llvm-svn: 256203
2015-12-21 22:09:27 +00:00
Adrian Prantl 5d9acc2443 Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when merging
instructions.

As noted in PR24563.
rdar://problem/23963293

llvm-svn: 256183
2015-12-21 19:25:03 +00:00
Chad Rosier 353d71914a Remove extra whitespace. NFC.
llvm-svn: 256173
2015-12-21 18:08:05 +00:00
Weiming Zhao 613c6862fa Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions for Thumb2
Summary:
r250697 fixed the mapping for ARM mode. We have to do the same for Thumb2 otherwise the same llvm.arm.ssat() will generate different saturating amount for ARM and Thumb.

r250697: http://reviews.llvm.org/rL250697

Reviewers: rmaprath

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D15653

llvm-svn: 256115
2015-12-20 06:41:44 +00:00
Reid Kleckner 187d33ee74 Revert "[ARM] Add ARMv8.2-A FP16 scalar instructions"
This reverts commit r255762.

llvm-svn: 255806
2015-12-16 19:21:03 +00:00
Oliver Stannard 2de8c16913 [ARM] Add ARMv8.2-A FP16 vector instructions
ARMv8.2-A adds 16-bit floating point versions of all existing SIMD
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

Note that VFP without SIMD is not a valid combination for any version of
ARMv8-A, but I have ensured that these instructions all depend on both
FeatureNEON and FeatureFullFP16 for consistency.

Differential Revision: http://reviews.llvm.org/D15039

llvm-svn: 255764
2015-12-16 12:37:39 +00:00
Oliver Stannard 48568cbe18 [ARM] Add ARMv8.2-A FP16 scalar instructions
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.

These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.

Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.

New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.

Differential Revision: http://reviews.llvm.org/D15038

llvm-svn: 255762
2015-12-16 11:35:44 +00:00
Cong Hou c106989fd5 Normalize MBB's successors' probabilities in several locations.
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).


Differential revision: http://reviews.llvm.org/D15259

llvm-svn: 255455
2015-12-13 09:26:17 +00:00
Saleem Abdulrasool 778c268594 ARM: only emit EABI attributes on EABI targets
EABI attributes should only be emitted on EABI targets.  This prevents the
emission of the optimization goals EABI attribute on Windows ARM.

llvm-svn: 255448
2015-12-13 05:27:45 +00:00
Hal Finkel cd8664c3c2 Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html

it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.

r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

llvm-svn: 255387
2015-12-11 23:11:52 +00:00
Matthias Braun 60d69e2865 CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()
computeRegisterLiveness() was broken in that it reported dead for a
register even if a subregister was alive. I assume this was because the
results of analayzePhysRegs() are hard to understand with respect to
subregisters.

This commit: Changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).

Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling
it and removing workarounds for the bug.

This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033

Differential Revision: http://reviews.llvm.org/D15320

llvm-svn: 255362
2015-12-11 19:42:09 +00:00
Tim Northover d91d635b36 ARM: don't use a deleted node as the BaseReg in complex pattern.
We mutated the DAG, which invalidated the node we were trying to use
as a base register. Sometimes we got away with it, but other times the
node really did get deleted before it was finished with.

Should fix PR25733

llvm-svn: 255120
2015-12-09 15:54:50 +00:00