Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend.
Differential Revision: http://llvm-reviews.chandlerc.com/D2036
llvm-svn: 193599
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.
rdar://problem/14207019
llvm-svn: 191766
These were pretty straightforward instructions, with some assembly support
required for HLT.
The ARM assembler is keen to split the instruction mnemonic into a
(non-existent) 'H' instruction with the LT condition code. An exception for
HLT is needed.
HLT follows the same rules as BKPT when in IT blocks, so the special BKPT
hadling code has been adapted to handle HLT also.
Regression tests added including diagnostic tests for out of range immediates
and illegal condition codes, as well as negative tests for pre-ARMv8.
llvm-svn: 190053
Solution is not sufficient to prevent 'mov pc, lr' being emitted for jump table code.
Test case doesn't trigger the added functionality.
llvm-svn: 190047
This improves code generation for jump tables by avoiding the emission of "mov pc, lr" which could fool the processor into believing this is a return from a function causing mispredicts. The code generation logic for jump tables uses ADR to materialize the address of the jump target.
Patch by Daniel Stewart!
llvm-svn: 190043
Fix a few things in one swoop.
# Add some negative tests.
# Fix some formatting issues.
# Add some missing IsThumb / ARMv8
# Fix some outs / ins mistakes.
llvm-svn: 189490
Back in the mists of time (2008), it seems TableGen couldn't handle the
patterns necessary to match ARM's CMOV node that we convert select operations
to, so we wrote a lot of fairly hairy C++ to do it for us.
TableGen can deal with it now: there were a few minor differences to CodeGen
(see tests), but nothing obviously worse that I could see, so we should
probably address anything that *does* come up in a localised manner.
llvm-svn: 188995
According to the ARM specification, "mov" is a valid mnemonic for all Thumb2 MOV encodings.
To achieve this, the patch adds one instruction alias with a special range condition to avoid collision with the Thumb1 MOV.
llvm-svn: 188901
The Thumb2 add immediate is in fact defined for SP. The manual is misleading as it points to a different section for add immediate with SP, however the encoding is the same as for add immediate with register only with the SP operand hard coded. As such add immediate with SP and add immediate with register can safely be treated as the same instruction.
All the patch does is adjust a register constraint on an instruction alias.
llvm-svn: 188676
There are many Thumb instructions which take 12-bit immediates encoded in a special
8-byte value + 4-byte rotator form. Not all numbers are represented, and it's legal
to transform an assembly instruction to be able to encode the immediate.
For example: AND and BIC are complementary instructions; one can switch the AND
to a BIC as long as the immediate is complemented.
The intent is to switch one instruction into its complementary one when the immediate
cannot be encoded in the form requested in the original assembly and when the
complementary immediate is encodable.
The patch addresses two issues:
1. definition of t2SOImmNot immediate - it has to check that the orignal value is
not encoded naturally
2. t2AND and t2BIC instruction aliases which should use the Thumb2 SOImm operand
rather than the ARM one.
llvm-svn: 188548
1. The offset range for Thumb1 PC relative loads is [0..1020] and not [-1024..1020]
2. Thumb2 PC relative loads may define the PC, so the restriction placed on target register is removed
3. Removes unneeded alias between "ldr.n" and t1LDRpci. ".n" is actually stripped by both tablegen
and the ASM parser, so this alias rule really does nothing
llvm-svn: 188466
In Thumb1, only one variant is supported: CPS{effect} {flags}
Thumb2 supports three:
CPS{effect}.W {flags}
CPS{effect} {flags} {mode}
CPS {mode}
Canonically, .W should be used only when ambiguity is present between encodings of different width.
The wide suffix is still accepted for the latter two forms via aliases.
llvm-svn: 188071
The long encoding for Thumb2 unconditional branches is broken.
Additionally, there is no range checking for target operands; as such
for instructions originating in assembly code, only short Thumb encodings
are generated, regardless of the bitsize needed for the offset.
Adding range checking is non trivial due to the representation of Thumb
branch instructions. There is no true difference between conditional and
unconditional branches in terms of operands and syntax - even unconditional
branches have a predicate which is expected to match that of the IT block
they are in. Yet, the encodings and the permitted size of the offset differ.
Due to this, for any mnemonic there are really 4 encodings to choose for.
The problem cannot be handled in the parser alone or by manipulating td files.
Because the parser builds first a set of match candidates and then checks them
one by one, whatever tablegen-only solution might be found will ultimately be
dependent of the parser's evaluation order. What's worse is that due to the fact
that all branches have the same syntax and the same kinds of operands, that
order is governed by the lexicographical ordering of the names of operand
classes...
To circumvent all this, any necessary disambiguation is added to the instruction
validation pass.
llvm-svn: 188067
This is the only Thumb2 instruction defined with "t" prefix; all other Thumb2 instructions have "t2" prefix (e.g. "t2CDP2" which is defined immediately afterwards).
Patch by Artyom Skrobov.
llvm-svn: 187973
While the .td entry is nice and all, it takes a pretty gross hack in
ARMAsmParser::ParseInstruction() because of handling of other "subs"
instructions to get it to match. Ran it by Jim Grosbach and he said it was
about what he expected to make this work given the existing code.
rdar://14214063
llvm-svn: 187530
instructions. With this patch:
1. ldr.n is recognized as mnemonic for the short encoding
2. ldr.w is recognized as menmonic for the long encoding
3. ldr will map to either short or long encodings depending on the size of the offset
llvm-svn: 186831
After Ulrich's r180677 (thanks!) TableGen is intelligent enough to
handle tied constraints involving complex operands properly, so
virtually all of the ARM custom converters are now unnecessary.
llvm-svn: 186810
This adds an instruction alias to make the assembler recognize the alternate literal form: pli [PC, #+/-<imm>]
See A8.8.129 in the ARM ARM (DDI 0406C.b).
Fixes <rdar://problem/14403733>.
llvm-svn: 186459
Propagate the fix from r185712 to Thumb2 codegen as well. Original
commit message applies here as well:
A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and
packs them in the bottom half of "x". An arithmetic and logic shift are
only equivalent in this context if the shift amount is 16. We would be
shifting in ones into the bottom 16bits instead of zeros if "y" is
negative.
rdar://14338767
llvm-svn: 185982
1. it should accept only 4-byte aligned addresses
2. the maximum offset should be 1020
3. it should be encoded with the offset scaled by two bits
llvm-svn: 185528
Unfortunately this addresses two issues (by the time I'd disentangled the logic
it wasn't worth putting it back to half-broken):
+ Coprocessor instructions should all be predicable in Thumb mode.
+ BKPT should never be predicable.
llvm-svn: 184965
When using a positive offset, literal loads where encoded
as if it was negative, because:
- The sign bit was not assigned to an operand
- The addrmode_imm12 operand was not encoding the sign bit correctly
This patch also makes the assembler look at the .w/.n specifier for
loads.
llvm-svn: 184182
Tidy up three places where the register class for ARM and Thumb wasn't
restrictive enough:
- No PC dest for reg-reg add/orr/sub.
- No PC dest for shifts.
- No PC or SP for Thumb2 reg-imm add.
I encountered this while combining FastISel with
-verify-machineinstrs. These instructions defined registers whose
classes weren't restrictive enough, and the uses failed
verification. They're also undefined in the ISA, or would produce code
that FastISel wouldn't want. This doesn't fix the register class
narrowing issue (where uses should restrict definitions), and isn't
thorough, but it's a small step in the right direction.
llvm-svn: 182863
"hint" space for Thumb actually overlaps the encoding space of the CPS
instruction. In actuality, hints can be defined as CPS instructions where imod
and M bits are all nil.
Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe,
sev) in DecodeT2CPSInstruction.
This commit adds a proper diagnostic message for Imm0_4 and updates all tests.
Patch by Mihail Popa <Mihail.Popa@arm.com>.
llvm-svn: 180617
According to the ARM reference manual, constant offsets are mandatory for pre-indexed addressing modes.
The MC disassembler was not obeying this when the offset is 0.
It was producing instructions like: str r0, [r1]!.
Correct syntax is: str r0, [r1, #0]!.
This change modifies the dumping of operands so that the offset is always printed, regardless of its value, when pre-indexed addressing mode is used.
Patch by Mihail Popa <Mihail.Popa@arm.com>
llvm-svn: 179398
These instructions aren't universally available, but depend on a specific
extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new
feature is appropriate.
This also enables the feature by default on A-class cores which usually have
these extensions, to avoid breaking existing code and act as a sensible
default.
llvm-svn: 179171
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
if it's not possible to materialize an integer immediate with a single
instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
llvm-svn: 169791
This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of
a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag.
rdar://12028498
llvm-svn: 167963
mov lr, pc
b.w _foo
The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.
rdar://12663632
llvm-svn: 167657
When the operand is a plain immediate rather than a label, print it
as [pc, #imm] like we do for the Thumb2 wide encoding variant.
rdar://12154503
llvm-svn: 166991
is 24 bits not 20 and the decoding needed to correctly handle converting the
J1 and J2 bits to their I1 and I2 values to reconstruct the displacement.
llvm-svn: 166982
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.
rdar://12559385
llvm-svn: 166613
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 163136
This wasn't the right way to enforce ordering of atomics.
We are already setting the isVolatile bit on memory operands of atomic
operations which is good enough to enforce the correct ordering.
llvm-svn: 162732
It is not safe to use normal LDR instructions because they may be
reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag
that prevents reordering.
Atomic loads are also prevented from participating in rematerialization
and load folding.
llvm-svn: 162713
It's not clear that they should be marked as such, but tbb formation
fails if t2LEApcrelJT is hoisted of of a loop.
This doesn't change the flags on these instructions,
UnmodeledSideEffects was already inferred from the missing pattern.
llvm-svn: 162603
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.
Then the pseudo-instructions can go away.
llvm-svn: 162061
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.
This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.
llvm-svn: 161994
When predicating this instruction:
Rd = ADD Rn, Rm
We need an extra operand to represent the value given to Rd when the
predicate is false:
Rd = ADDCC Rfalse, Rn, Rm, pred
The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.
Previously, Rd and Rn were tied, but that is not required.
Compare to MOVCC:
Rd = MOVCC Rfalse, Rtrue, pred
llvm-svn: 161955
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 161581
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.
llvm-svn: 161218
Function argument registers are added to the call SDNode, but
InstrEmitter now knows how to make those operands implicit, and the call
instruction doesn't have to be variadic.
Explicit register operands should only be those that are encoded in the
instruction, implicit register operands are for extra dependencies like
call argument and return values.
llvm-svn: 160188
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=> sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=> movw r1, #65535
sub r0, r0, r1
rdar://11726136
llvm-svn: 159057
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFI, WFI, SEV and YIELD to be
assembly aliases of that.
rdar://11600518
llvm-svn: 158674
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.
<rdar://problem/11481151>
llvm-svn: 158659
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>
llvm-svn: 158302
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits.
<rdar://problem/11481151>
llvm-svn: 157966
the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM as deprecated it use. Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025
llvm-svn: 157019
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.
rdar://11216577
llvm-svn: 154411
We had special instructions for iOS because r9 is call-clobbered, but
that is represented dynamically by the register mask operands now, so
there is no need for the pseudo-instructions.
llvm-svn: 154144
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.
rdar://11192734
llvm-svn: 154123
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.
<rdar://problem/11182914>
llvm-svn: 154033