llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	d0ec61c9ff	[Target] Remove unused forward declarations (NFC)	2022-08-07 00:16:16 -07:00
David Spickett	5d14873249	[llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping Fixes https://github.com/llvm/llvm-project/issues/56484 H registers are 16 bit views of AArch64's Neon registers and B are the 8 bit views. msvc does not support 16 bit float (some mention in DirectX but I couldn't find a way to get to it) so for lack of a better reference I'm using: `85c9b41b33/server/references/dia/include/cvconst.h` (the other microsoft-pdb repo is no longer up to date) Luckily clang does support fp16 so a test is added for that. There is no 8 bit float type so I had to get creative with the test case. We're not testing for correct debug info here just that we can select the B register and not crash in the process. For FPCR it's never going to be passed as an argument so I've not added a test for it. It is included to keep our list looking the same as the reference. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D129774	2022-07-19 09:33:13 +00:00
David Green	3e0bf1c7a9	[CodeGen] Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. This allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Recommitted with some fixes for the leftover MCII variables in release builds. Differential Revision: https://reviews.llvm.org/D129506	2022-07-14 09:33:28 +01:00
David Green	95252133e1	Revert "Move instruction predicate verification to emitInstruction" This reverts commit `e2fb8c0f4b` as it does not build for Release builds, and some buildbots are giving more warning than I saw locally. Reverting to fix those issues.	2022-07-13 13:28:11 +01:00
David Green	e2fb8c0f4b	Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. This allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Differential Revision: https://reviews.llvm.org/D129506	2022-07-13 12:53:32 +01:00
David Sherwood	8f9d73fbd6	[NFC][AArch64] Minor refactor of AArch64InstPrinter::printMatrixTileList We can remove the MatrixZADRegisterTable table of tile registers and just calculate the register index directly. Differential Revision: https://reviews.llvm.org/D127757	2022-06-15 09:52:24 +01:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Martin Storsjö	e71b07e468	[MC] [Win64EH] Wrap the epilog instructions in a struct. NFC. For ARM SEH, the epilogs will need a little more associated data than just the plain list of opcodes. This is a preparatory refactoring for D125645. Differential Revision: https://reviews.llvm.org/D125879	2022-06-01 11:25:48 +03:00
Fangrui Song	9ee15bba47	[MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFC	2022-05-26 00:14:08 -07:00
serge-sans-paille	fb67d683db	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `7030654296` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D126417	2022-05-26 08:12:34 +02:00
Caroline Concatto	8765ad42cd	[AArch64][SME][NFC] Add implicit operands for SME instructions in the disassembly. This patch simplifies the switch statement in getInstruction to add implicit operands (register ZA and Immediate equal to zero) in the SME operands when disassembly. The register ZA and the zero immediate can be added by checking the operand in MCInstDesc. Differential Revision: https://reviews.llvm.org/D125534	2022-05-20 10:29:21 +01:00
David Green	1379b15099	[AArch64] Fix the generation of BE Nops Big endian Nops were being generated as d5 03 20 1f fnmadd s21, s30, s0, s0, getting the bytes of the NOP in the wrong order. This switches the bytes to not be dependent on the endianness. Differential Revision: https://reviews.llvm.org/D125980	2022-05-20 09:31:00 +01:00
Tim Northover	04e5b7fd17	AArch64: fall back to DWARF instead of crashing on weird .cfi directives CodeGen will only produce fixed format prologues, but hand-written assembly can have .cfi directives in any combination they want. This should cause a fallback to DWARF rather than an assertion failure (or an incorrect compact unwind if assertions are disabled).	2022-05-18 11:42:42 +01:00
Martin Storsjö	2d8ce08b09	[AArch64] Stop creating unnecessary label MCSymbols for each Windows unwind opcode. NFC. These labels aren't needed in the ARM version of WinEH tables, as each unwind opcode maps to a specific instruction (each opcode is assumed to represent one instruction), and the written tables don't contain offsets like on x86_64. Differential Revision: https://reviews.llvm.org/D125369	2022-05-12 15:23:04 +03:00
CHIANG, YU-HSUN (Tommy Chiang, oToToT)	4a31af88a2	[MC][AArch64] Enable '+v8a' when nothing specified for MCSubtargetInfo Since D110065, the 'R' profile support is added to LLVM. It turns the `generic` cpu into the intersection of v8-a and v8-r. However, this makes some backward compatibility problems. The original patch makes the clang driver implicitly pass -march=armv8-a when only the triple is specified. Since it only applies to clang, other tools like llvm-objdump still faces the backward compatibility problem. This patch applies the same idea to MC related tools by enabling '+v8a' feature when nothing is specified (both CPU and FS are empty) for MCSubtargetInfo creation. This patch should fix PR53956. Reviewed by: labrinea Differential Revision: https://reviews.llvm.org/D124319	2022-04-29 04:53:22 +08:00
Tim Northover	901831a4e6	Revert "AArch64: take compact unwind frame size from last CFI instruction." It was on ToT when I pushed and committed unintentionally.	2022-04-11 12:25:58 +01:00
Tim Northover	4120a3abdd	AArch64: take compact unwind frame size from last CFI instruction. Asynchronous exception support for the prologue means that there can be multiple .cfi_def_cfa_offset instructions in a single function, which tripped up an assertion in the compact unwind generator. In reality the compact unwind format is far too restrictive to represent asynchronous frames so if we ever wanted that on Darwin we'd fall back to DWARF (possibly keeping compact unwind around for synchronous users). So the compact format should continue to represent the synchronous situation, and the assertion can be removed.	2022-04-11 12:24:48 +01:00
Fangrui Song	cfbd5c8e4a	[AArch64] Allow .variant_pcs before the symbol is registered glibc sysdeps/aarch64/tst-vpcs-mod.S has something like: ``` .variant_pcs vpcs_call .global vpcs_call ``` This is supported by GNU as but leads to an error in MC. Use getOrCreateSymbol to support a not-yet-registered symbol: call `registerSymbol` to ensure the symbol exists even if there is no binding directive/label, to match GNU as. While here, improve tests to check (1) a local symbol can get STO_AARCH64_VARIANT_PCS (2) undefined .variant_pcs (3) an alias does not inherit STO_AARCH64_VARIANT_PCS. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122507	2022-03-28 17:52:27 -07:00
Momchil Velikov	a6d238536d	[AArch64] Fallback to DWARF when trying to emit compact unwind info with multiple CFA offset adjustments Instead of asserting, fallback to emitting DWARF unwind info when an attempt is made to output compact unwind info for a function with multiple adjustments to the CFA offset. Multiple adjustments of SP are common and with instruction precise unwind tables these may translate into multiple `.cfi_def_cfa_offset` directives. Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1302998 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D121017	2022-03-23 15:32:42 +00:00
Sander de Smalen	ef9816e43c	[AArch64][SME] Don't infer -neon from +streaming-sve. In Streaming SVE mode full NEON is not available, even though this is implied from armv8-a. LLVM previously inferred that NEON needed to be disabled when setting +streaming-sve, but there is no need to infer this from +streaming-sve, because we can explicitly disable NEON using LLVM's attribute mechanism. This is specifically relevant because +streaming-sve is not a user-facing feature, but rather an LLVM internal feature. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D120809	2022-03-02 17:33:06 +00:00
Adrian Kuegel	d74f15faff	[AArch64][NFC] Fix unused-lambda-capture warning. Differential Revision: https://reviews.llvm.org/D120041	2022-02-17 14:09:58 +01:00
Simon Pilgrim	f1877eb1bb	AArch64_MC::isQForm - Fix MSVC 'no default capture mode' lambda warning	2022-02-17 11:41:47 +00:00
Pavel Kosov	f3809b20f2	[AArch64][SchedModels] Handle virtual registers in FP/NEON predicates Current implementation of Check[HSDQ]Form predicates doesn’t handle virtual registers and therefore isn’t useful for pre-RA scheduling. Patch fixes this implementing two function predicates: CheckQForm for checking that instruction writes 128-bit NEON register and CheckFpOrNEON which checks that instruction writes FP register (any width). The latter supersedes Check[HSD]Form predicates which are not used individually. OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D114642	2022-02-17 13:41:05 +03:00
Shao-Ce SUN	2aed07e96c	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 13:10:09 +08:00
serge-sans-paille	ef736a1c39	Cleanup LLVMMC headers There's a few relevant forward declarations in there that may require downstream adding explicit includes: llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h Counting preprocessed lines required to rebuild llvm-project on my setup: before: 1052436830 after: 1049293745 Which is significant and backs up the change in addition to the usual benefits of decreasing coupling between headers and compilation units. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119244	2022-02-09 11:09:17 +01:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a consequence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
Kazu Hirata	41bfac6aed	[Target] Remove unused forward declarations (NFC)	2022-01-02 10:20:15 -08:00
Kazu Hirata	d395befa65	[llvm] Use range-based for loops (NFC)	2021-12-11 11:29:12 -08:00
Kazu Hirata	262dd1e42d	[llvm] Use range-based for loops (NFC)	2021-12-02 09:27:47 -08:00
Eli Friedman	c964afb2c8	[AArch64] Diagnose large adrp offset on Windows. On Windows, this relocation can only encode a 21-bit offset. Make sure we emit an error, instead of silently truncating the offset. Found investigating https://bugs.llvm.org/show_bug.cgi?id=52378 Differential Revision: https://reviews.llvm.org/D113051	2021-11-02 15:11:22 -07:00
Alexandros Lamprineas	8689f5e6e7	[AArch64] Add support for the 'R' architecture profile. This change introduces subtarget features to predicate certain instructions and system registers that are available only on 'A' profile targets. Those features are not present when targeting a generic CPU, which is the default processor. In other words the generic CPU now means the intersection of 'A' and 'R' profiles. To maintain backwards compatibility we enable the features that correspond to -march=armv8-a when the architecture is not explicitly specified on the command line. References: https://developer.arm.com/documentation/ddi0600/latest Differential Revision: https://reviews.llvm.org/D110065	2021-10-27 12:32:30 +01:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Jingu Kang	30caca39f4	Third Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit `fc36fb4d23` with bug fixes. Differential Revision: https://reviews.llvm.org/D109963	2021-10-08 11:28:49 +01:00
Cullen Rhodes	42ba79b7b0	[AArch64][SME] Update tile slice index offset Changes in architecture revision 00eac1: * Tile slice index offset no longer prefixed with '#'. * The syntax for 128-bit (.Q) ZA tile slice accesses must now include an explicit zero index. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D111212	2021-10-07 08:55:10 +00:00
David Spickett	fc36fb4d23	Revert "Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"" This reverts commit `13f3c39f36`. Due to test failures in stage 2 clang tests on AArch64 bots.	2021-10-06 08:39:48 +00:00
Jingu Kang	13f3c39f36	Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit `c07f709969` with bug fixes. Differential Revision: https://reviews.llvm.org/D109963	2021-09-30 09:27:08 +01:00
Kazu Hirata	9a640a1cb8	[AArch64] Remove redundant declaration createAArch64ObjectTargetStreamer (NFC) Note that createAArch64ObjectTargetStreamer is declared in AArch64TargetStreamer.h and defined in AArch64TargetStreamer.cpp. Identified with readability-redundant-declaration.	2021-09-29 09:08:41 -07:00
Sterling Augustine	c07f709969	Revert "Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"" This reverts commit `73a196a11c`. Causes crashes as reported in https://reviews.llvm.org/D109963	2021-09-28 18:02:06 -07:00
Jingu Kang	73a196a11c	Recommit "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts the revert commit `f85d8a5bed` with bug fixes. Original message: MOVi32imm + ANDWrr ==> ANDWri + ANDWri MOVi64imm + ANDXrr ==> ANDXri + ANDXri The mov pseudo instruction could be expanded to multiple mov instructions later. In this case, try to split the constant operand of mov instruction into two bitmask immediates. It makes only two AND instructions instead of multiple mov + and instructions. Added a peephole optimization pass on MIR level to implement it. Differential Revision: https://reviews.llvm.org/D109963	2021-09-28 15:26:29 +01:00
Jingu Kang	f85d8a5bed	Revert "[AArch64] Split bitmask immediate of bitwise AND operation" This reverts commit `864b206796`. Reverting due to error on buildbots.	2021-09-28 13:28:09 +01:00
Jingu Kang	864b206796	[AArch64] Split bitmask immediate of bitwise AND operation MOVi32imm + ANDWrr ==> ANDWri + ANDWri MOVi64imm + ANDXrr ==> ANDXri + ANDXri The mov pseudo instruction could be expanded to multiple mov instructions later. In this case, try to split the constant operand of mov instruction into two bitmask immediates. It makes only two AND instructions instead of multiple mov + and instructions. Added a peephole optimization pass on MIR level to implement it. Differential Revision: https://reviews.llvm.org/D109963	2021-09-28 11:57:43 +01:00
Peter Smith	e63455d5e0	[MC] Use local MCSubtargetInfo in writeNops On some architectures such as Arm and X86 the encoding for a nop may change depending on the subtarget in operation at the time of encoding. This change replaces the per module MCSubtargetInfo retained by the targets AsmBackend in favour of passing through the local MCSubtargetInfo in operation at the time. On Arm using the architectural NOP instruction can have a performance benefit on some implementations. For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to limit the chances of this causing problems in the future. I've not done this for other targets such as X86 as there is more frequent use of the MCSubtargetInfo and it looks to be for stable properties that we would not expect to vary per function. This change required threading STI through MCNopsFragment and MCBoundaryAlignFragment. I've attempted to take into account the in tree experimental backends. Differential Revision: https://reviews.llvm.org/D45962	2021-09-07 15:46:19 +01:00
Cullen Rhodes	09507b5325	[AArch64][SME] Disable NEON in streaming mode In streaming mode most of the NEON instruction set is illegal, disable NEON when compiling with `+streaming-sve`, unless NEON is explicitly requested. Subsequent patches will add support for the small subset of NEON instructions that are legal in streaming mode. Reviewed By: paulwalker-arm, david-arm Differential Revision: https://reviews.llvm.org/D107902	2021-08-16 07:56:48 +00:00
Arthur Eubanks	ad25344620	[MC][CodeGen] Emit constant pools earlier Previously we would emit constant pool entries for ldr inline asm at the very end of AsmPrinter::doFinalization(). However, if we're emitting dwarf aranges, that would end all sections with aranges. Then if we have constant pool entries to be emitted in those same sections, we'd hit an assert that the section has already been ended. We want to emit constant pool entries before emitting dwarf aranges. This patch splits out arm32/64's constant pool entry emission into its own MCTargetStreamer virtual method. Fixes PR51208 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D107314	2021-08-03 20:55:31 -07:00
Jason Molenda	0d8cd4e2d5	[AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted Add a comment when there is a shifted value, add x9, x0, #291, lsl #12 ; =1191936 but not when the immediate value is unshifted, subs x9, x0, #256 ; =256 when the comment adds nothing additional to the reader. Differential Revision: https://reviews.llvm.org/D107196	2021-08-03 02:28:46 -07:00
Cullen Rhodes	2e27c4e1f1	[AArch64][SME] Add zero instruction This patch adds the zero instruction for zeroing a list of 64-bit element ZA tiles. The instruction takes a list of up to eight tiles ZA0.D-ZA7.D, which must be in order, e.g. zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d} zero {za1.d,za3.d,za5.d,za7.d} The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which are mapped to corresponding 64-bit element tiles in accordance with the architecturally defined mapping between different element size tiles, e.g. * Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing all eight 64-bit element tiles ZA0.D to ZA7.D. * Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D. The preferred disassembly of this instruction uses the shortest list of tile names that represent the encoded immediate mask, e.g. * An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and ZA5.D is disassembled as {ZA0.S, ZA1.S}. * An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and ZA6.D is disassembled as {ZA0.H}. * An all-ones immediate is disassembled as {ZA}. * An all-zeros immediate is disassembled as an empty list {}. This patch adds the MatrixTileList asm operand and related parsing to support this. Depends on D105570. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105575	2021-07-27 08:35:45 +00:00
Cullen Rhodes	15af3aaa2e	[AArch64][SME] Add system registers and related instructions This patch adds the new system registers introduced in SME: - ID_AA64SMFR0_EL1 (ro) SME feature identifier. - SMCR_ELx (r/w) streaming mode control register for configuring effective SVE Streaming SVE Vector length when the PE is in Streaming SVE mode. - SVCR (r/w) streaming vector control register, visible at all exception levels. Provides access to PSTATE.SM and PSTATE.ZA using MSR and MRS instructions. - SMPRI_EL1 (r/w) streaming mode execution priority register. - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register. - SMIDR_EL1 (ro) streaming mode identification register. - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread SME context. - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for labelling memory accesses performed in streaming mode. Also added in this patch are the SME mode change instructions. Three MSR immediate instructions are implemented to set or clear PSTATE.SM, PSTATE.ZA, or both respectively: - MSR SVCRSM, #<imm1> - MSR SVCRZA, #<imm1> - MSR SVCRSMZA, #<imm1> The following smstart/smstop aliases are also implemented for convenience: smstart -> MSR SVCRSMZA, #1 smstart sm -> MSR SVCRSM, #1 smstart za -> MSR SVCRZA, #1 smstop -> MSR SVCRSMZA, #0 smstop sm -> MSR SVCRSM, #0 smstop za -> MSR SVCRZA, #0 The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105576	2021-07-20 08:06:26 +00:00
Cullen Rhodes	99eb96f031	[AArch64][SME] Add load and store instructions This patch adds support for following contiguous load and store instructions: * LD1B, LD1H, LD1W, LD1D, LD1Q * ST1B, ST1H, ST1W, ST1D, ST1Q A new register class and operand is added for the 32-bit vector select register W12-W15. The differences in the following tests which have been re-generated are caused by the introduction of this register class: * llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll * llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir * llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir * llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir D88663 attempts to resolve the issue with the store pair test differences in the AArch64 load/store optimizer. The GlobalISel differences are caused by changes in the enum values of register classes, tests have been updated with the new values. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: CarolineConcatto Differential Revision: https://reviews.llvm.org/D105572	2021-07-16 10:11:10 +00:00
Cullen Rhodes	c08dabb0f4	[AArch64][SME] Add matrix register definitions and parsing support SME introduces the ZA array, a new piece of architectural register state consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the implementation defined Streaming SVE vector length and SVLb is the number of 8-bit elements in a vector of SVL bits. SME instructions consist of three types of matrix operands: * Tiles: a ZA tile is a square, two-dimensional sub-array of elements within the ZA array. These tiles make up the larger accumulator array and the granularity varies based on the element size, i.e. - ZAQ0..ZAQ15 (smallest tile granule) - ZAD0..ZAD7 - ZAS0..ZAS3 - ZAH0..ZAH1 or ZAB0 (largest tile granule, single tile) * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v' to tell how the vector at [reg+offset] is laid out in the tile, horizontally or vertically. E.g. za1h.h or za15v.q, which corresponds to vectors in registers ZAH1 and ZAQ15, respectively. * Accumulator matrix: this is the entire accumulator array ZA. This patch adds the register classes and related operands and parsing for SME instructions operating on the accumulator array. The ADDHA and ADDVA instructions which operate on tiles are also added in this patch to make some use of the code added, later patches will make use of the other operands introduced here. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Co-authored by: Sander de Smalen (@sdesmalen) Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105570	2021-07-14 08:25:49 +00:00
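
The tile mapping described in the `zero` instruction commit (`2e27c4e1f1`) can be illustrated with a minimal Python sketch. This is not the LLVM implementation: the names `TILE_COUNT`, `tile_mask` and `zero_mask` are hypothetical, and the general rule (a tile ZAn of a given element size covers every 64-bit tile whose index is congruent to n modulo the number of tiles of that size) is an assumption extrapolated from the examples in that commit message.

```python
# Hypothetical sketch, not LLVM code. Bit d of the 8-bit mask stands for the
# 64-bit element tile ZAd.D (d = 0..7).

# Number of tiles for each element-size suffix (ZAB0, ZAH0..1, ZAS0..3, ZAD0..7).
TILE_COUNT = {"b": 1, "h": 2, "s": 4, "d": 8}


def tile_mask(name: str) -> int:
    """Return the mask of 64-bit tiles covered by a tile name such as 'za0.s'."""
    name = name.lower()
    if name == "za":
        # The whole array is equivalent to all eight 64-bit tiles.
        return 0xFF
    base, suffix = name.split(".")
    index = int(base[2:])  # strip the leading "za"
    count = TILE_COUNT[suffix]
    if not 0 <= index < count:
        raise ValueError(f"no such tile: {name}")
    # Assumed rule: ZAn.<suffix> covers ZAd.D for every d with d % count == n.
    return sum(1 << d for d in range(8) if d % count == index)


def zero_mask(tiles) -> int:
    """OR together the masks of every tile in the list."""
    mask = 0
    for tile in tiles:
        mask |= tile_mask(tile)
    return mask
```

Under these assumptions, `zero_mask(["za0.s", "za1.s"])` covers the same tiles as ZA0.D, ZA1.D, ZA4.D and ZA5.D, which matches the disassembly example in the commit message.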


420 Commits