Summary:
LDRAA and LDRAB in their writeback variant should softfail when the same
register is used as result and base.
This patch adds a custom decoder that catches such case and emits a
warning when it occurs.
Differential Revision: https://reviews.llvm.org/D82541
This patch implements builtins for the following prototypes:
unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long)
unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long)
vector unsigned long long vec_cntlzm (vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_cnttzm (vector unsigned long long, vector unsigned long long)
Differential Revision: https://reviews.llvm.org/D80941
This patch implements builtins for the following prototypes for the VSX Permute
Control Vector Generate with Mask Instructions:
vector unsigned char vec_genpcvm (vector unsigned char, const int);
vector unsigned short vec_genpcvm (vector unsigned short, const int);
vector unsigned int vec_genpcvm (vector unsigned int, const int);
vector unsigned long long vec_genpcvm (vector unsigned long long, const int);
Differential Revision: https://reviews.llvm.org/D81774
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector signed char vec_clrr (vector unsigned char a, unsigned int n);
```
Differential Revision: https://reviews.llvm.org/D81707
We were missing the modrm byte this instruction has according
to current Intel SDM. Experiments with gcc indicate that different
modrm values are chosen based on 2 operands so I've added those
as well.
I think our previous implementation was based on an older behavior of
binutils that has since been changed.
These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8
respectively. But hardware ignore bits 2:0. So 0xe9-0xef is treated
the same as 0xe8. Similar for the other two.
Fixing this required adding 8 new formats to the X86 instructions
to convey this information. Could have gotten away with 3, but
adding all 8 made for a more logical conversion from format to
modrm encoding.
I renumbered the format encodings to keep the register modrm
formats grouped together.
This patch implements builtins for the following prototypes:
vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
unsigned long long __builtin_pextd (unsigned long long, unsigned long long);
Revision Depends on D80758
Differential Revision: https://reviews.llvm.org/D80935
It seems to be a hardware defect that the half inline constants do not
work as expected for the 16-bit integer operations (the inverse does
work correctly). Experimentation seems to show these are really
reading the 32-bit inline constants, which can be observed by writing
inline asm using op_sel to see what's in the high half of the
constant. Theoretically we could fold the high halves of the 32-bit
constants using op_sel.
The *_asm_all.s MC tests are broken, and I don't know where the script
to autogenerate these are. I started manually fixing it, but there's
just too many cases to fix. This also does break the
assembler/disassembler support for these values, and I'm not sure what
to do about it. These are still valid encodings, so it seems like you
should be able to use them in some way. If you wrote assembly using
them, you could have really meant it (perhaps to read the high bits
with op_sel?). The disassembler will print the invalid literal
constant which will fail to re-assemble. The behavior is also
different depending on the use context. Consider this example, which
was previously accepted and encoded using the inline constant:
v_mad_i16 v5, v1, -4.0, v3
; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04]
In contexts where an inline immediate is required (such as on gfx8/9),
this will now be rejected. For gfx10, this will produce the literal
encoding and change the printed format:
v_mad_i16 v5, v1, 0xc400, v3
; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00]
This is just another variation of the issue that we don't perfectly
handle round trip assembly/disassembly due to not tracking how
immediates were encoded. This doesn't matter much in practice, since
compilers don't emit the suboptimal encoding. I doubt any users are
relying on this behavior (although I did make use of the old behavior
to figure out what was wrong).
Fixes bug 46302.
Summary:
We have defined MTSPR/MFSPR and MTSPR8/MFSPR8, but we only defined
mtspr/mfspr InstAlias for some MTSPR/MFSPR.
This patch is to add the InstAlias definitions for MTSPR8/MFSPR8,
and add the some new mtspr/mfspr InstAlias we may use.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D77531
This patch adds support for Vector Multiply-Sum Unsigned Doubleword Modulo
instruction; vmsumudm.
Differential Revision: https://reviews.llvm.org/D80294
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79224
Summary:
AArch64's system register ERXTS_EL1 is present in the backend as a
component of the Arm Reliability, Availability and Serviceability (RAS)
extension. However, it has been removed from the specification before
its final release.
This patch removes the register.
Reviewers: SjoerdMeijer, DavidSpickett
Reviewed By: DavidSpickett
Subscribers: DavidSpickett, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79007
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch32 and Assembly Parsing
D77872 has already added the MC representations of the instructions so that
they can be used in code gen; this patch fills in the details needed to
make assembly parsing work, and adds tests for asm and disasm
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, simon_tatham
Reviewed By: simon_tatham
Subscribers: simon_tatham, ostannard, kristof.beyls, hiraditya,
danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77874
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)
No IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin
Reviewed By: kmclaughlin
Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77871
This implements the instruction analysis required to print branch
targets as part of llvm-objdump's disassembly.
Note, this only handles those branches which can be analyzed in a single
instruction, a future patch will handle multiple-instruction patterns,
such as AUIPC/LUI+JALR instruction pairs.
Differential Revision: https://reviews.llvm.org/D77567
From Arm v8 Architecture Reference Manual F5.1.84 LDREXD
The ldrexd instruction in Arm state has the following conditions:
t = UInt(Rt); t2 = t + 1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;
In when Rt is odd or if Rt is 14 (making t2 15).
In the implementation when the pair is the UNPREDICTABLE R14_R15 we
would ideally return SOFT_FAIL. We can't because there is no R14_R15
value for us to return so we fail early returning FAIL.
The early return for registers outside the bounds of the table means
the check for Rt == 14 (0xE) redundant which causes a static analyzer
to flag the condition as never being true.
To fix the warning I've removed the check and replaced with a comment
explaining the difference with the specification.
Fixes pr41660
Differential Revision: https://reviews.llvm.org/D77463
Summary:
This patch upstreams support the optional ARMv8.0 Data Gathering Hint (DGH)
extension, which adds the Data Gathering Hint instruction to the hint
space.
See ARMv8.0-DGH in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, danielkiss, samparker
Reviewed By: SjoerdMeijer
Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77097
Summary:
This patch upstreams support for the ARMv8.6A Enhanced Counter Virtualization
(ECV) extension, which adds 6 new system registers.
See ARMv8.6-ECV in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, pcc, ab, chill
Reviewed By: SjoerdMeijer
Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77094
Summary:
This patch upstreams support for the ARMv8.6A Fine Grain Traps (FGT)
extension, which adds 5 new system registers.
See ARMv8.6-FGT in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, momchil.velikov
Reviewed By: SjoerdMeijer
Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76991
Summary:
This patch upstreams v8.6A activity monitors virtualization
assembler support, which consists of 32 new system
registers (two groups, each with 16 numbered registers).
See ARMv8.6-AMU in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, john.brawn, ostannard
Reviewed By: ostannard
Subscribers: LukeGeeson, dnsampaio, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76998
Summary:
Below InstAlias have been redefined, this patch is to remove the repeated
definition.
mtdec/mfdec mtsdr1/mfsdr1 mtsrr0/mfsrr0 mtsrr1/mfsrr1 mtasr
Reviewed By: nemanjai, steven.zhang
Differential Revision: https://reviews.llvm.org/D75821
Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
in addition to the GCC patch for the 8..6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html
In detail this patch
- march options for armv8.6-a
- BFloat16 assembly
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson
Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson
Reviewed By: SjoerdMeijer
Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D76062
Summary:
This patch adds assembly-level support for a new Arm M-profile
architecture extension, Custom Datapath Extension (CDE).
A brief description of the extension is available at
https://developer.arm.com/architectures/instruction-sets/custom-instructions
The latest specification for CDE is currently a beta release and is
available at
https://static.docs.arm.com/ddi0607/aa/DDI0607A_a_armv8m_arm_supplement_cde.pdf
CDE allows chip vendors to add custom CPU instructions. The CDE
instructions re-use the same encoding space as existing coprocessor
instructions (such as MRC, MCR, CDP etc.). Each coprocessor in range
cp0-cp7 can be configured as either general purpose (GCP) or custom
datapath (CDEv1). This configuration is defined by the CPU vendor and
is provided to LLVM using 8 subtarget features: cdecp0 ... cdecp7.
The semantics of CDE instructions are implementation-defined, but the
instructions are guaranteed to be pure (that is, they are stateless,
they do not access memory or any registers except their explicit
inputs/outputs).
CDE requires the CPU to support at least Armv8.0-M mainline
architecture. CDE includes 3 sets of instructions:
* Instructions that operate on general purpose registers and NZCV
flags
* Instructions that operate on the S or D register file (require
either FP or MVE extension)
* Instructions that operate on the Q register file, require MVE
The user-facing names that can be specified on the command line are
the same as the 8 subtarget feature names. For example:
$ clang -target arm-none-none-eabi -march=armv8m.main+cdecp0+cdecp3
tells the compiler that the coprocessors 0 and 3 are configured as
CDEv1 and the remaining coprocessors are configured as GCP (which is
the default).
Reviewers: simon_tatham, ostannard, dmgreen, eli.friedman
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74044
This reverts commit 80a34ae311 with fixes.
Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
This reverts commit 80a34ae311 with fixes.
On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.
There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
This patch makes the following System Registers Read Only:
- CurrentEL
- ICH_MISR_EL2
- PMBIDR_EL1
- PMSIDR_EL1
as found in:
https://developer.arm.com/docs/ddi0595/e/aarch64-system-registers
Relative line numbers were also added to the tests so we get more
informative error messages on failure.
Change-Id: I963b4f01ca5737b58f9e8e7abe9ca1d99e328758
The registers TRCEXTINSELR and TRCEXTINSELR0 are distinct registers,
defined by separate extension specifications (ETM and ETE,
respectively), yet they use the same encoding in MSR/MRS.
When performing a system register lookup by encoding, we would
essentially return a random one, depending on the number, relative
position in the TableGen file, whether the TableGen records for system
registers are named or not, and, if they are named, depending on
record (not register!) name as well.
This patch works around the issue by explictly checking for the
TRCEXTINSELR/TRCEXTINSELR0 encoding and always returning TRCEXTINSELR.
Differential Revision: https://reviews.llvm.org/D74074
A previous patch should have added pld and pstd and any support code in
the backend that is required for prefixed load and store type operations.
This patch adds a number of additional prefixed load and store type
instructions for the future CPU.
Differential Revision: https://reviews.llvm.org/D72577
This adds some missing MVE opcodes to evaluateBranch, which results in
llvm-objdump being able to print the PC relative branch target as an
annotation.
Differential Revision: https://reviews.llvm.org/D73553
Add the prefixed instructions pld and pstd to future CPU. These are load and
store instructions that require new operand types that are 34 bits. This patch
adds the two instructions as well as the operand types required.
Note that this patch also makes a minor change to tablegen to account for the
fact that some instructions are going to require shifts greater than 31 bits
for the new 34 bit instructions.
Differential Revision: https://reviews.llvm.org/D72574
Future CPU will include support for prefixed instructions.
These prefixed instructions are formed by a 4 byte prefix
immediately followed by a 4 byte instruction effectively
making an 8 byte instruction. The new instruction paddi
is a prefixed form of addi.
This patch adds paddi and all of the support required
for that instruction. The majority of the patch deals with
supporting the new prefixed instructions. The addition of
paddi is mainly to allow for testing.
Differential Revision: https://reviews.llvm.org/D72569
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.
** The reason it fixes PR35547 is
`exit` does cleanup including calling class destructor whereas `abort`
does not do any cleanup. In multithreading environment such as ThinLTO or JIT,
threads may share states which mostly are ManagedStatic<>. If faulting thread
tearing down a class when another thread is using it, there are chances of
memory corruption. This is bad 1. It will stop error reporting like pretty
stack printer; 2. The memory corruption is distracting and nondeterministic in
terms of error message, and corruption type (depending one the timing, it
could be double free, heap free after use, etc.).
Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola
Reviewed By: rnk, MaskRay
Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D67847
Summary:
This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).
Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma, andreadb
Reviewed By: efriedma
Subscribers: gbedwell, john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70680
Summary:
This adds assembler tests for cases that were previously only in the
disassembler tests, and vice versa.
Reviewers: rampitec, arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72592
Summary:
This adds assembler tests for cases that were previously only in the
disassembler tests, and vice versa.
Reviewers: rampitec, arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72561
Summary:
This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).
Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma
Reviewed By: efriedma
Subscribers: john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70680
Summary:
In rG643ac6c0420b, the syntax `ldraa x1, [x0]!` was added as an alias
for `ldraa x1, [x0, #0]!`. That syntax is less obvious in meaning, and
also will not be accepted by assemblers that haven't been updated yet.
So it would be better not to emit it as the preferred disassembly for
that instruction.
This change lowers the EmitPriority of the new alias so that the more
explicit syntax `[x0, #0]!` is preferred by the disassembler. The new
syntax is still accepted by the assembler.
Reviewers: ab, ostannard
Reviewed By: ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70813
The instruction definition has been retroactively expanded to
allow for an alias for '[xN, 0]!' as '[xN]!'.
That wouldn't make sense on LDR, but does for LDRA.
The Overflow version of XO-Form instruction uses the SO, OV and
OV32 special registers.
This changes modifies existing multiclasses and instruction
definitions to allow for the use of the XER register to record
the various types if overflow from possible add, subtract and
multiply instructions. It then modifies the existing instructions
as to use these multiclasses as needed.
Patch By: Kamau Bridgeman
Differential Revision: https://reviews.llvm.org/D66902
`saa` and `saad` are 32-bit and 64-bit store atomic add instructions.
memory[base] = memory[base] + rt
These instructions are available for "Octeon+" CPU. The patch adds support
for both instructions to MIPS assembler and diassembler and introduces new
CPU type - "octeon+".
Next patches will implement `.set arch=octeon+` directive and `AFL_EXT_OCTEONP`
ISA extension flag support.
Differential Revision: https://reviews.llvm.org/D69849
addOperand() method of AMDGPU disassembler returns SoftFail
on error. All instances which may lead to that place are
an impossible encdoing, not something which is possible to
encode, but semantically incorrect as described for SoftFail.
Then tablegen generates a check of the following form:
if (Decode...(..) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
Since we can only return Success and SoftFail that is dead
code as detected by the static code analyzer.
Solution: return Fail as it should be.
See https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69819
Summary:
The PMMIR_EL1 register is present in Armv8.4 with PMU extension.
This patch adds support for it.
Reviewers: t.p.northover, dnsampaio
Reviewed By: dnsampaio
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68940
llvm-svn: 375228
Summary:
Renames `ExprType` to the more apt `BlockType` and adds a variant for
multivalue blocks. Currently non-void blocks are only generated at the
end of functions where the block return type needs to agree with the
function return type, and that remains true for multivalue
blocks. That invariant means that the actual signature does not need
to be stored in the block signature `MachineOperand` because it can be
inferred by `WebAssemblyMCInstLower` from the return type of the
parent function. `WebAssemblyMCInstLower` continues to lower block
signature operands to immediates when possible but lowers multivalue
signatures to function type symbols. The AsmParser and Disassembler
are updated to handle multivalue block types as well.
Reviewers: aheejin, dschuff, aardappel
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68889
llvm-svn: 374933
The recently announced IBM z15 processor implements the architecture
already supported as "arch13" in LLVM. This patch adds support for
"z15" as an alternate architecture name for arch13.
The patch also uses z15 in a number of places where we used arch13
as long as the official name was not yet announced.
llvm-svn: 372435
microMIPS jump and link exchange instruction stores a target in a
26-bits field. Despite other microMIPS JAL instructions these bits
are target address shifted right 2 bits [1]. The patch fixes the
JALX instruction decoding and uses 2-bit shift.
[1] MIPS Architecture for Programmers Volume II-B: The microMIPS32 Instruction Set
Differential Revision: https://reviews.llvm.org/D67320
llvm-svn: 371428
The family of 'dual-accumulating' vector multiply-add instructions
(VMLADAV, VMLALDAV and VRMLALDAVH) can all operate on both signed and
unsigned integer types, and they all have an 'exchange' variant (with
an X in the name) that modifies which pairs of vector lanes in the two
inputs are multiplied together. But there's a clause in the spec that
says that the X variants //don't// operate on unsigned integer types,
only signed. You can have X, or unsigned, or neither, but not both.
We didn't notice that clause when we implemented the MC support for
these instructions, so LLVM believes that things like VMLADAVX.U8 do
exist, contradicting the spec. Here I fix that by conditioning them
out in Tablegen.
In order to do that, I've reversed the nesting order of the Tablegen
multiclasses for those instructions. Previously, the innermost
multiclass generated the X and not-X variants, and the one outside
that generated the A and not-A variants. Now X is done by the outer
multiclass, which allows me to bypass that one when I only want the
two not-X variants.
Changing the multiclass nesting order also changes the names of the
instruction ids unless I make a special effort not to. I decided that
while I was changing them anyway I'd make them look nicer; so now the
instructions have names like MVE_VMLADAVs32 or MVE_VMLADAVaxs32,
instead of cumbersome _noacc_noexch suffixes.
The corresponding multiply-subtract instructions are unaffected. Those
don't accept unsigned types at all, either in the spec or in LLVM.
Reviewers: ostannard, dmgreen
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67214
llvm-svn: 371405
Specify the Unpredictable bits, and return softfails when appropriate.
Patch by Mark Murray!
Differential revision: https://reviews.llvm.org/D66939
llvm-svn: 371374
Decoding of VMSR doesn't diagnose some unpredictable encodings, as the unpredictable bits are not correctly set.
Diff-reduce this instruction's internals WRT VMRS so I can see the differences better. Mostly this is s/src/Rt/g.
Fill in the "should-be-(0)" bits.
Designate the Unpredictable{} bits for both VMRS and VMSR.
Patch by Mark Murray!
Differential revision: https://reviews.llvm.org/D66938
llvm-svn: 370729
Summary:
Reported in https://github.com/opencv/opencv/issues/15413.
We have serveral extended mnemonics for Move To/From Vector-Scalar Register Instructions
eg: mffprd,mtfprd etc.
We only support one of them, this patch add the others.
Reviewers: nemanjai, steven.zhang, hfinkel, #powerpc
Reviewed By: hfinkel
Subscribers: wuzish, qcolombet, hiraditya, kbarton, MaskRay, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66963
llvm-svn: 370411
Summary:
According to the Armv8.1-M manual CSEL, CSINC, CSINV and CSNEG are
"constrained unpredictable" when SP is used as the source register Rn.
The assembler should diagnose this case.
Reviewers: momchil.velikov, dmgreen, ostannard, simon_tatham, t.p.northover
Reviewed By: ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65505
llvm-svn: 367433
Embedded Trace Extension and Trace Buffer Extension are optional
future architecture extensions.
(cf. https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools)
Their system registers are documented here:
https://developer.arm.com/docs/ddi0601/a
ETE shares register names with ETM. One exception is the ETE
TRCEXTINSELR0 register, which has the same encoding as the ETM
TRCEXTINSELR register (but different semantics). This patch treats
them as aliases: the assembler will accept both names, emitting
identical encoding, and the disassembler will keep disassembling
to TRCEXRINSELR.
Differential Revision: https://reviews.llvm.org/D63707
llvm-svn: 367093
Summary:
According to the new Armv8-M specification
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the
instructions SQRSHRL and UQRSHLL now have an additional immediate
operand <saturate>. The new assembly syntax is:
SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm
UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm
where <saturate> can be either 64 (the existing behavior) or 48, in
that case the result is saturated to 48 bits.
The new operand is encoded as follows:
#64 Encoded as sat = 0
#48 Encoded as sat = 1
sat is bit 7 of the instruction bit pattern.
This patch adds a new assembler operand class MveSaturateOperand which
implements parsing and encoding. Decoding is implemented in
DecodeMVEOverlappingLongShift.
Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer
Reviewed By: simon_tatham
Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64810
llvm-svn: 366555
Summary:
We agreed to rename `except_ref` to `exnref` for consistency with other
reference types in
https://github.com/WebAssembly/exception-handling/issues/79. This also
renames WebAssemblyInstrExceptRef.td to WebAssemblyInstrRef.td in order
to use the file for other reference types in future.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64703
llvm-svn: 366145
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Assembler/disassembler support for new instructions.
- CodeGen for new instructions, including new LLVM intrinsics.
- Scheduler description for the new processor.
- Detection of arch13 as host processor.
Note: No currently available Z system supports the arch13
architecture. Once new systems become available, the
official system name will be added as supported -march name.
llvm-svn: 365932
The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't
actually exist: the Armv8.1-M architecture spec only lists signed
forms of that instruction. The unsigned ones were added in error: they
existed in an early draft of the spec, but they were removed before
the public version, and we missed that particular spec change.
Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH.
Reviewers: miyuki
Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64502
llvm-svn: 365747
This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751
llvm-mc aborted when disassembling tabortdc.
This patch try to clean up TM related DAGs.
* Fixes the problem by remove explicit output of cr0, and put it as implicit def.
* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
* Update the TCHECK operand and int_ppc_tcheck accordingly.
* Add some builtin test and disassembly tests.
* Remove unused CRRC0/crrc0
Differential Revision: https://reviews.llvm.org/D61935
llvm-svn: 364544
The BF and WLS/WLSTP instructions have various branch-offset fields
occupying different positions and lengths in the instruction encoding,
and all of them were decoded at disassembly time by the function
DecodeBFLabelOffset() which returned SoftFail if the offset was zero.
In fact, it's perfectly fine and not even a SoftFail for most of those
offset fields to be zero. The only one that can't be zero is the 4-bit
field labelled `boff` in the architecture spec, occupying bits {26-23}
of the BF instruction family. If that one is zero, the encoding
overlaps other instructions (WLS, DLS, LETP, VCTP), so it ought to be
a full Fail.
Fixed by adding an extra template parameter to DecodeBFLabelOffset
which controls whether a zero offset is accepted or rejected. Adjusted
existing tests (only in error messages for bad disassemblies); added
extra tests to demonstrate zero offsets being accepted in all the
right places, and a few demonstrating rejection of zero `boff`.
Reviewers: DavidSpickett, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63864
llvm-svn: 364533
Different versions of the Arm architecture disallow the use of generic
coprocessor instructions like MCR and CDP on different sets of
coprocessors. This commit centralises the check of the coprocessor
number so that it's consistent between assembly and disassembly, and
also updates it for the new restrictions in Arm v8.1-M.
New tests added that check all the coprocessor numbers; old tests
updated, where they used a number that's now become illegal in the
context in question.
Reviewers: DavidSpickett, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63863
llvm-svn: 364532
In the `CSEL Rd,Rm,Rn` instruction family (also including CSINC, CSINV
and CSNEG), the architecture lists it as CONSTRAINED UNPREDICTABLE
(i.e. SoftFail) to use SP in the Rd or Rm slot, but outright illegal
to use it in the Rn slot, not least because some encodings of that
form are used by MVE instructions such as UQRSHLL.
MC was treating all three slots the same, as SoftFail. So the only
reason UQRSHLL was disassembled correctly at all was because the MVE
decode table is separate from the Thumb2 one and takes priority; if
you turned off MVE, then encodings such as `[0x5f,0xea,0x0d,0x83]`
would disassemble as spurious CSELs.
Fixed by inventing another version of the `GPRwithZR` register class,
which disallows SP completely instead of just SoftFailing it.
Reviewers: DavidSpickett, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63862
llvm-svn: 364531
This adds some extra RUN lines to existing test files, to check that
things that worked in previous architecture versions haven't
accidentally stopped working in 8.1-M. Also we add some new tests: a
test of scalar floating point instructions that could be easily
confused with the similar-looking vector ones at assembly time, a test
of basic load/store/move access to the FP registers (which has to work
even in integer-only MVE); and one final check of the really obvious
case where turning off MVE should make sure MVE instructions really
are rejected.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62682
llvm-svn: 364293
This final batch includes the tail-predicated versions of the
low-overhead loop instructions (LETP); the VPSEL instruction to select
between two vector registers based on the predicate mask without
having to open a VPT block; and VPNOT which complements the predicate
mask in place.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62681
llvm-svn: 364292
This adds the rest of the vector memory access instructions. It
includes contiguous loads/stores, with an ordinary addressing mode
such as [r0,#offset] (plus writeback variants); gather loads and
scatter stores with a scalar base address register and a vector of
offsets from it (written [r0,q1] or similar); and gather/scatters with
a vector of base addresses (written [q0,#offset], again with
writeback). Additionally, some of the loads can widen each loaded
value into a larger vector lane, and the corresponding stores narrow
them again.
To implement these, we also have to add the addressing modes they
need. Also, in AsmParser, the `isMem` query function now has
subqueries `isGPRMem` and `isMVEMem`, according to which kind of base
register is used by a given memory access operand.
I've also had to add an extra check in `checkTargetMatchPredicate` in
the AsmParser, without which our last-minute check of `rGPR` register
operands against SP and PC was failing an assertion because Tablegen
had inserted an immediate 0 in place of one of a pair of tied register
operands. (This matches the way the corresponding check for `MCK_rGPR`
in `validateTargetOperandClass` is guarded.) Apparently the MVE load
instructions were the first to have ever triggered this assertion, but
I think only because they were the first to have a combination of the
usual Arm pre/post writeback system and the `rGPR` class in particular.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62680
llvm-svn: 364291
Summary:
The LLVM disassembler assumes that the unused src0 operand of v_nop is
zero. Other tools can put another value in that field, which is still
valid. This commit fixes the LLVM disassembler to recognize such an
encoding as v_nop, in the same way as we already do for s_getpc.
Differential Revision: https://reviews.llvm.org/D63724
Change-Id: Iaf0363eae26ff92fc4ebc716216476adbff37a6f
llvm-svn: 364208
This adds the family of loads and stores with names like VLD20.8 and
VST42.32, which load and store parts of multiple q-registers in such a
way that executing both VLD20 and VLD21, or all four of VLD40..VLD43,
will distribute 2 or 4 vectors' worth of memory data across the lanes
of the same number of registers but in a transposed order.
In addition to the Tablegen descriptions of the instructions
themselves, this patch also adds encode and decode support for the
QQPR and QQQQPR register classes (representing the range of loaded or
stored vector registers), and tweaks to the parsing system for lists
of vector registers to make it return the right format in this case
(since, unlike NEON, MVE regards q-registers as primitive, and not
just an alias for two d-registers).
llvm-svn: 364172
These instructions let you load half a vector register at once from
two general-purpose registers, or vice versa.
The assembly syntax for these instructions mentions the vector
register name twice. For the move _into_ a vector register, the MC
operand list also has to mention the register name twice (once as the
output, and once as an input to represent where the unchanged half of
the output register comes from). So we can conveniently assign one of
the two asm operands to be the output $Qd, and the other $QdSrc, which
avoids confusing the auto-generated AsmMatcher too much. For the move
_from_ a vector register, there's no way to get round the fact that
both instances of that register name have to be inputs, so we need a
custom AsmMatchConverter to avoid generating two separate output MC
operands. (And even that wouldn't have worked if it hadn't been for
D60695.)
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62679
llvm-svn: 364041
This adds the `MVE_qDest_rSrc` superclass and all its instances, plus
a few other instructions that also take a scalar input register or two.
I've also belatedly added custom diagnostic messages to the operand
classes for odd- and even-numbered GPRs, which required matching
changes in two of the existing MVE assembly test files.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62678
llvm-svn: 364040
Summary:
This adds the `MVE_qDest_qSrc` superclass and all instructions that
inherit from it. It's not the complete class of _everything_ with a
q-register as both destination and source; it's a subset of them that
all have similar encodings (but it would have been hopelessly unwieldy
to call it anything like MVE_111x11100).
This category includes add/sub with carry; long multiplies; halving
multiplies; multiply and accumulate, and some more complex
instructions.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62677
llvm-svn: 364037
Summary:
These take a pair of vector register to compare, and a comparison type
(written in the form of an Arm condition suffix); they output a vector
of booleans in the VPR register, where predication can conveniently
use them.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62676
llvm-svn: 364027
This includes integer arithmetic of various kinds (add/sub/multiply,
saturating and not), and the immediate forms of VMOV and VMVN that
load an immediate into all lanes of a vector.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62674
llvm-svn: 363936
This includes all the obvious bitwise operations (AND, OR, BIC, ORN,
MVN) in register-to-register forms, and the immediate forms of
AND/OR/BIC/ORN; byte-order reverse instructions; and the VMOVs that
access a single lane of a vector.
Some of those VMOVs (specifically, the ones that access a 32-bit lane)
share an encoding with existing instructions that were disassembled as
accessing half of a d-register (e.g. `vmov.32 r0, d1[0]`), but in
8.1-M they're now written as accessing a quarter of a q-register (e.g.
`vmov.32 r0, q0[2]`). The older syntax is still accepted by the
assembler.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62673
llvm-svn: 363838
Vector load/store instructions support an optional alignment field
that the compiler can use to provide known alignment info to the
hardware. If the field is used (and the information is correct),
the hardware may be able (on some models) to perform faster memory
accesses than otherwise.
This patch adds support for alignment hints in the assembler and
disassembler, and fills in known alignment during codegen.
llvm-svn: 363806
This includes saturating and non-saturating shifts, both with
immediate shift count and with the shift counts given by another
vector register; VSHLC (in which the bits shifted out of each active
vector lane are shifted in to the next active lane); and also VMOVL,
which is enough like an immediate shift that it didn't fit too badly
in this category.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62672
llvm-svn: 363696
Summary:
These form a small family of their own, to go with the floating-point
VMINNM/VMAXNM instructions added in a previous commit.
They introduce the first of many special cases in the mnemonic
recognition code, because VMIN with the E suffix used by the VPT
predication system needs to avoid being interpreted as the nonexistent
instruction 'VMI' with an ordinary 'NE' condition suffix.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62671
llvm-svn: 363695
This is the family of vector instructions that combine all the lanes
in their input vector(s), and output a value in one or two GPRs.
Differential Revision: https://reviews.llvm.org/D62670
llvm-svn: 363403
This commit prepares the way to start adding the main collection of
MVE instructions, which operate on the 128-bit vector registers.
The most obvious thing that's needed, and the simplest, is to add the
MQPR register class, which is like the existing QPR except that it has
fewer registers in it.
The more complicated part: MVE defines a system of vector predication,
in which instructions operating on 128-bit vector registers can be
constrained to operate on only a subset of the lanes, using a system
of prefix instructions similar to the existing Thumb IT, in that you
have one prefix instruction which designates up to 4 following
instructions as subject to predication, and within that sequence, the
predicate can be inverted by means of T/E suffixes ('Then' / 'Else').
To support instructions of this type, we've added two new Tablegen
classes `vpred_n` and `vpred_r` for standard clusters of MC operands
to add to a predicated instruction. Both include a flag indicating how
the instruction is predicated at all (options are T, E and 'not
predicated'), and an input register field for the register controlling
the set of active lanes. They differ from each other in that `vpred_r`
also includes an input operand for the previous value of the output
register, for instructions that leave inactive lanes unchanged.
`vpred_n` lacks that extra operand; it will be used for instructions
that don't preserve inactive lanes in their output register (either
because inactive lanes are zeroed, as the MVE load instructions do, or
because the output register isn't a vector at all).
This commit also adds the family of prefix instructions themselves
(VPT / VPST), and all the machinery needed to work with them in
assembly and disassembly (e.g. generating the 't' and 'e' mnemonic
suffixes on disassembled instructions within a predicated block)
I've added a couple of demo instructions that derive from the new
Tablegen base classes and use those two operand clusters. The bulk of
the vector instructions will come in followup commits small enough to
be manageable. (One exception is that I've added the full version of
`isMnemonicVPTPredicable` in the AsmParser, because it seemed
pointless to carefully split it up.)
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62669
llvm-svn: 363258
This introduces a new decoding table for MVE instructions, and starts
by adding the family of scalar shift instructions that are part of the
MVE architecture extension: saturating shifts within a single GPR, and
long shifts across a pair of GPRs (both saturating and normal).
Some of these shift instructions have only 3-bit register fields in
the encoding, with the low bit fixed. So they can only address an odd
or even numbered GPR (depending on the operand), and therefore I add
two new register classes, GPREven and GPROdd.
Differential Revision: https://reviews.llvm.org/D62668
Change-Id: Iad95d5f83d26aef70c674027a184a6b1e0098d33
llvm-svn: 363051
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.
To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need a new
addressing mode.
The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Reviewed By: samparker
Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62667
llvm-svn: 363039
These caused a build failure because I managed not to notice they
depended on a later unpushed commit in my current stack. Sorry about
that.
llvm-svn: 362956
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.
To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need some new
addressing modes.
The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Reviewed By: samparker
Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62667
llvm-svn: 362953
The family of 32-bit Thumb instruction encodings that include t2ORR,
t2AND and t2EOR are all listed in the ArmARM as having (0) in bit 15.
The Tablegen descriptions of those instructions listed them as ?. This
change tightens that up by making them into 0 + Unpredictable.
In the specific case of t2ORR, we tighten it up still further by
making the zero bit mandatory. This change comes from Arm v8.1-M, in
which encodings with that bit equal to 1 will now be used for
different instructions.
Reviewers: dmgreen, samparker, SjoerdMeijer, efriedma
Reviewed By: dmgreen, efriedma
Subscribers: efriedma, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60705
llvm-svn: 362470
We require d/q suffixes on the memory form of these instructions to disambiguate the memory size.
We don't require it on the register forms, but need to support parsing both with and without it.
Previously we always printed the d/q suffix on the register forms, but it's redundant and
inconsistent with gcc and objdump.
After this patch we should support the d/q for parsing, but not print it when its unneeded.
llvm-svn: 360085