llvm-project

Commit Graph

Author	SHA1	Message	Date
Joe Nash	729467acef	[AMDGPU] gfx11 LDSDIR instructions MC support Contributors: Carl Ritson <carl.ritson@amd.com> Patch 8/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125498 Reviewed By: critson, rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D125820	2022-05-19 10:08:47 -04:00
Sheng	a5d618b393	[M68k][Disassembler] Fix decoding conflict This diff fixes the decoding conflict between these pairs of instructions: ADD(16|32)dd / ADD(16|32)dr SUB(16|32)dd / SUB(16|32)dr AND(16|32)dd / AND(16|32)dr OR(16|32)dd / OR(16|32)dr Reviewed By: ricky26 Differential Revision: https://reviews.llvm.org/D125861	2022-05-19 09:10:50 +08:00
Dmitry Preobrazhensky	32ca9bd7b5	[AMDGPU][MC][GFX940] Correct tied operand decoding for smfmac opcodes Differential Revision: https://reviews.llvm.org/D125790	2022-05-18 15:39:30 +03:00
Ivan Kosarev	140ad30b24	[AMDGPU][MC][GFX10] Add missing s_scratch_load tests. Completes https://reviews.llvm.org/D125117 Reviewed By: dp, arsenm Differential Revision: https://reviews.llvm.org/D125753	2022-05-18 11:11:10 +01:00
Stanislav Mekhanoshin	a09af86693	[AMDGPU] Enable FLAT LDS DMA on gfx9/10 before gfx940 We always had global and scratch loads to LDS in the gfx9, but did not handle it. These were available via the 'lds' encoding bit. In gfx940 this bit was reused as 'svs' which resulted in new '_lds' opcodes effectively pushing this bit into the opcode, but functionally it is the same. These instructions are also available on gfx10. Differential Revision: https://reviews.llvm.org/D125126	2022-05-17 12:16:37 -07:00
Joe Nash	d21b9b4946	[AMDGPU] gfx11 scalar alu instructions MC layer support for SOP(scalar alu operations) including encoding support for s_delay_alu and s_sendmsg_rtn. Contributors: Jay Foad <jay.foad@amd.com> Patch 7/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125319 Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D125498	2022-05-17 13:35:41 -04:00
Joe Nash	c70259405c	[AMDGPU] gfx11 BUF Instructions Includes MachineCode layer support and tests, and MIR tests not requiring CodeGen pass changes. Includes a small change in SMInstructions.td to correct encoded bits. Contributors: Petar Avramovic <Petar.Avramovic@amd.com> Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com> Depends on D125316 Patch 6/N for upstreaming of AMDGPU gfx11 architecture. Reviewed By: dp, Petar.Avramovic Differential Revision: https://reviews.llvm.org/D125319	2022-05-16 09:41:40 -04:00
Sheng	cf0b6df6db	[M68k][Disassembler] Adopt the new variable length decoder This is an example usage of D120958. After these patches are landed, we can strip off the codebeads officially. Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D120960	2022-05-15 08:44:58 +08:00
Ivan Kosarev	cb67b2ccc4	[AMDGPU][GFX10] Support base+soffset+offset SMEM stores. Also makes another step towards resolving https://github.com/llvm/llvm-project/issues/38652 Reviewed By: foad, dp Differential Revision: https://reviews.llvm.org/D125380	2022-05-12 08:48:05 +01:00
Ivan Kosarev	88f04bdbd8	[AMDGPU][GFX10] Support base+soffset+offset SMEM loads. Also makes a step towards resolving https://github.com/llvm/llvm-project/issues/38652 Reviewed By: foad, dp Differential Revision: https://reviews.llvm.org/D125117	2022-05-10 16:17:14 +01:00
Simon Pilgrim	c0840799e3	[MC][X86] Add vcmpps disassembler tests for Issue #41491 We were missing coverage for vcmpps imm, vreg, vreg, mreg {mreg} patterns	2022-05-06 15:39:17 +01:00
Philipp Tomsich	64816e68f4	[AArch64] Support for Ampere1 core Add support for the Ampere Computing Ampere1 core. Ampere1 implements the AArch64 state and is compatible with ARMv8.6-A. Differential Revision: https://reviews.llvm.org/D117112	2022-05-03 15:54:02 +01:00
CHIANG, YU-HSUN (Tommy Chiang, oToToT)	4a31af88a2	[MC][AArch64] Enable '+v8a' when nothing specified for MCSubtargetInfo Since D110065, the 'R' profile support is added to LLVM. It turns the `generic` cpu into the intersection of v8-a and v8-r. However, this causes some backward compatibility problems. The original patch makes the clang driver implicitly pass -march=armv8-a when only the triple is specified. Since it only applies to clang, other tools like llvm-objdump still face the backward compatibility problem. This patch applies the same idea to MC related tools by enabling the '+v8a' feature when nothing is specified (both CPU and FS are empty) for MCSubtargetInfo creation. This patch should fix PR53956. Reviewed by: labrinea Differential Revision: https://reviews.llvm.org/D124319	2022-04-29 04:53:22 +08:00
Stanislav Mekhanoshin	00d84a9f92	[AMDGPU] Remove vdata from buffer to lds load Differential Revision: https://reviews.llvm.org/D124485	2022-04-26 17:16:26 -07:00
Ulrich Weigand	1283ccb610	Support z16 processor name The recently announced IBM z16 processor implements the architecture already supported as "arch14" in LLVM. This patch adds support for "z16" as an alternate architecture name for arch14.	2022-04-21 19:58:22 +02:00
Dmitry Preobrazhensky	b4231ac4be	[AMDGPU][GFX90A+] Disabled ds_ordered_count and exp Differential Revision: https://reviews.llvm.org/D124087	2022-04-21 13:16:44 +03:00
Dmitry Preobrazhensky	ab18e1a533	[AMDGPU][GFX10] Enabled op_sel for v_add_nc_u16 and v_sub_nc_u16 Differential Revision: https://reviews.llvm.org/D123594	2022-04-13 13:48:42 +03:00
Shengchen Kan	fcade8e91e	[X86][test] Add encoding/decoding tests for VEX instruction w/ address-size prefix This patch also contains a regression test for D122448 Reviewed By: hvdijk, RKSimon Differential Revision: https://reviews.llvm.org/D122449	2022-04-13 12:50:25 +08:00
Dmitry Preobrazhensky	1f6aa90386	[AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand Added the following helpers: depctr_hold_cnt(...) depctr_sa_sdst(...) depctr_va_vdst(...) depctr_va_sdst(...) depctr_va_ssrc(...) depctr_va_vcc(...) depctr_vm_vsrc(...) Differential Revision: https://reviews.llvm.org/D123022	2022-04-07 17:03:44 +03:00
Simon Tatham	82bd0bd24f	[AArch64] Make PMMIR_EL1 read-only. The Arm architecture reference manual (ARM DDI 0487H.a section D13.5.12) lists every field in the register as RO, and does not list an MSR instruction that writes it. So we should be defining it as an ROSysReg, not an RWSysReg. Reviewed By: vhscampos Differential Revision: https://reviews.llvm.org/D123111	2022-04-05 11:09:56 +01:00
Min-Yih Hsu	18b38ff6c7	[M68k] Adopt VarLenCodeEmitter for move instructions The `move` instruction has one of the most complicated sets of variants, so we're refactoring it first before finishing up the rest of the data instructions in a separate patch. Note that since we're introducing more `move` variants, the codegen actually got improved in terms of code size.	2022-04-04 23:02:27 -07:00
Min-Yih Hsu	fccdc5618d	[M68k] Adopt VarLenCodeEmitter for shift / rotate instructions This patch is covered by existing MC tests.	2022-04-03 22:52:32 -07:00
Stefan Pintilie	2e55bc9f3c	[PowerPC] Set the special DSCR with a compiler option. Add a compiler option and the instructions required to set the special Data Stream Control Register (DSCR). The special register will not be set by default. Original patch by: Muhammad Usman Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D117013	2022-03-31 14:06:30 -05:00
Simon Pilgrim	4a33b9ece0	[MC][X86] Ensure all opcode tests are sorted by instruction name Noticed while reviewing D122449	2022-03-30 11:08:11 +01:00
Stanislav Mekhanoshin	6e3e14f600	[AMDGPU] Support gfx940 smfmac instructions Differential Revision: https://reviews.llvm.org/D122191	2022-03-24 12:40:42 -07:00
Stanislav Mekhanoshin	27439a7642	[AMDGPU] New gfx940 mfma instructions Differential Revision: https://reviews.llvm.org/D122044	2022-03-24 12:12:52 -07:00
Stanislav Mekhanoshin	72c1a0d9c2	[AMDGPU] Allow v_accvgpr_write to use SGPR on gfx90a This is undocumented, but it should work. Differential Revision: https://reviews.llvm.org/D122252	2022-03-22 13:52:29 -07:00
Stanislav Mekhanoshin	d9ac55fab2	[AMDGPU] New MFMA names for existing instructions Old names are supported as aliases. _1k MFMA got new opcodes. Differential Revision: https://reviews.llvm.org/D121741	2022-03-17 13:05:36 -07:00
Stanislav Mekhanoshin	522b259976	[AMDGPU] Allow v_accvgpr_write to use SGPR src on gfx940 Differential Revision: https://reviews.llvm.org/D121843	2022-03-17 12:12:06 -07:00
Amir Ayupov	2c4e38fa6f	[X86] Emit REX prefix immediately before the opcode Fix prefix emission order to emit REX immediately before the opcode (SDM vol2, 2.1, Figure 2-1). According to SDM vol2 2.2.1, "Other placements are ignored". This fix has a side effect of outputting segment override prefix in a different order than previously (benign). Follow-up to https://reviews.llvm.org/D120592 Reviewed By: skan, craig.topper Differential Revision: https://reviews.llvm.org/D120871	2022-03-16 08:30:31 -07:00
Amir Ayupov	1d3719820f	[X86] Preserve redundant Address-Size override prefix Print and emit redundant Address-Size override prefix if it's set on the instruction. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D120592	2022-03-16 08:30:29 -07:00
Stanislav Mekhanoshin	8dd3d1cf1f	[AMDGPU] Add symbolic names for gfx940 HWREGs The HWREG namespace now overlaps with gfx10. Thus the patch is longer than necessary to just support new names. It also needs to handle proper error messages, i.e. to issue a "specified hardware register is not supported on this GPU" message. This may need a major refactoring in the future. Differential Revision: https://reviews.llvm.org/D121418	2022-03-14 16:13:33 -07:00
Stanislav Mekhanoshin	23499103f7	[AMDGPU] Support for gfx940 flat lds opcodes Differential Revision: https://reviews.llvm.org/D121414	2022-03-14 15:46:19 -07:00
Stanislav Mekhanoshin	1f53f20fc1	[AMDGPU] Support gfx940 v_lshl_add_u64 instruction Differential Revision: https://reviews.llvm.org/D121401	2022-03-14 15:45:42 -07:00
Stanislav Mekhanoshin	36fe3f13a9	[AMDGPU] flat scratch SVS addressing mode for gfx940 Both VADDR and SADDR are used in SVS mode. Differential Revision: https://reviews.llvm.org/D121254	2022-03-14 15:23:36 -07:00
Stanislav Mekhanoshin	6181458662	[AMDGPU] gfx940 MUBUF format changes Differential Revision: https://reviews.llvm.org/D121234	2022-03-11 11:36:49 -08:00
Stanislav Mekhanoshin	932f628121	[AMDGPU] new gfx940 fp atomics Differential Revision: https://reviews.llvm.org/D121028	2022-03-07 12:32:02 -08:00
Stanislav Mekhanoshin	e7b362d75d	[AMDGPU] Add v_mov_b64 gfx940 opcode Differential Revision: https://reviews.llvm.org/D121023	2022-03-07 12:07:12 -08:00
Stanislav Mekhanoshin	8992b50e2f	[AMDGPU] gfx940 uses new names for coherency bits Differential Revision: https://reviews.llvm.org/D120855	2022-03-07 11:50:07 -08:00
Stanislav Mekhanoshin	2c830c8fab	[AMDGPU] gfx940: support V_FMAMK_F32 and V_FMAAK_F32 Differential Revision: https://reviews.llvm.org/D120769	2022-03-07 11:31:01 -08:00
Simon Tatham	54dafd38c5	[AArch64] Move FeatureSpecRestrict into core 8.0-R architecture. It was included in HasV8_0rOps when D88660 first introduced that architecture definition. In D118045 I moved it out of there and into ProcessorFeatures.R82, so that -mcpu=cortex-r82 would continue to behave the same as before but -march=armv8-r would include only the mandatory parts of the architecture. In fact, that was a mistake. Firstly, Cortex-R82 _doesn't_ implement that feature, so it makes no sense to deliberately enable it for that CPU in particular. But also, it's an extension that only adds system registers, and we're generally more relaxed about where we enable those (because kernel developers find it useful to write sysreg-access instructions after runtime checking, and because sysreg accesses aren't manufactured during code generation so the risk is small). So, in line with that usual AArch64 policy, FeatureSpecRestrict ought to be considered part of 8.0-R for LLVM purposes. So I'm moving it back into HasV8_0rOps, where it started out. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D120830	2022-03-07 15:55:08 +00:00
Aakanksha	840695814a	[AMDGPU] Add gfx1036 target Differential Revision: https://reviews.llvm.org/D120846	2022-03-02 23:26:38 +00:00
Stefan Pintilie	eb1c5a9862	[PowerPC] Add the Power10 LXVKQ instruction. Add the Power 10 instruction LXVKQ. This patch was taken from an original patch by: Yi-Hong Lyu Reviewed By: lei Differential Revision: https://reviews.llvm.org/D117507	2022-02-23 08:48:59 -06:00
Min-Yih Hsu	4986a41f58	[M68k] Adopt VarLenCodeEmitter for bits instructions And introduce operand encoding fragments (i.e. MxEncMemOp record) for addressing modes 'o' and 'e'.	2022-02-17 14:16:19 -08:00
Sheng	4306fbff9c	Revert "Revert "[M68k] Adopt VarLenCodeEmitter for control instructions"" This reverts commit `69a7d49de6`. llvm/test/MC/M68k/Relaxations/branch.s needs disassembler support. So I disabled it temporarily	2022-02-16 17:41:49 +08:00
Sheng	69a7d49de6	Revert "[M68k] Adopt VarLenCodeEmitter for control instructions" This reverts commit `9ffd498fcb`. This patch introduces a regression on MC/M68k/Relaxations/branch.s	2022-02-16 17:09:46 +08:00
Sheng	9ffd498fcb	[M68k] Adopt VarLenCodeEmitter for control instructions Refactor the instructions in M68kInstrControl.td to use the VarLenCodeEmitter. This patch is tested by the existing test cases. Reviewed By: myhsu, ricky26 Differential Revision: https://reviews.llvm.org/D119665	2022-02-16 12:54:20 +08:00
Stefan Pintilie	a601db30c6	[PowerPC] Remove the LDMX instruction. The LDMX instruction was to be potentially added in P9 but it was never added in either ISA 3.0 or ISA 3.1. This patch removes that instruction as it is currently still an invalid instruction. Reviewed By: lei Differential Revision: https://reviews.llvm.org/D118074	2022-02-14 17:03:48 -06:00
Min-Yih Hsu	08f2b0dcf6	[M68k] Adopt the new VarLenCodeEmitterGen for arithmetic instructions This patch refactors all the existing M68k arithmetic instructions to use the new VarLenCodeEmitterGen infrastructure. This patch is tested by the existing MC test cases. Note that one of the codegen tests needed to be updated because the ordering of two equivalent instructions was switched. Differential Revision: https://reviews.llvm.org/D115234	2022-02-11 09:31:12 -08:00
Stanislav Mekhanoshin	d3b87e4a1c	[AMDGPU] HWRegs TMA and TBA also supported on gfx9 Differential Revision: https://reviews.llvm.org/D118860	2022-02-03 09:36:10 -08:00
Shao-Ce SUN	005fd8aa70	[RISCV] Add support for Zihintpause extension Add support for the 'pause' hint instruction as an alias for 'fence w, 0'. To do this allow the 'fence' operands pred and succ to be set to 0 (the empty set). This will also allow future hints to be encoded as 'fence 0, <x>' and 'fence <x>, 0'. This patch is revised from @mundaym's D93019. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117789	2022-02-03 20:55:47 +08:00
Ting Wang	6f25cb8685	[PowerPC] Add the Power10 XS[MAX|MIN]CQP instruction Add the Power 10 instruction XS[MAX|MIN]CQP. Reviewed By: shchenz, amyk Differential Revision: https://reviews.llvm.org/D118036	2022-01-26 23:00:43 -05:00
Simon Tatham	f302e0b5dd	[AArch64] Exclude optional features from HasV8_0rOps. The following SubtargetFeatures are removed from the definition of HasV8_0rOps, on the grounds that they are optional in Armv8.4-A, and therefore (by the definition of Armv8.0-R) also optional in v8.0-R: * performance monitoring: FeaturePerfMon * cryptography: FeatureSM4 and FeatureSHA3 * half-precision FP: FeatureFullFP16, FeatureFP16FML * speculation control: FeatureSSBS, FeaturePredRes, FeatureSB, FeatureSpecRestrict This isn't the full set of features that are listed as optional in the spec. FeatureCCIDX and FeatureTRACEV8_4 are also optional. But LLVM includes those in HasV8_3aOps and HasV8_4aOps respectively (I think on the grounds that the system registers they enable are useful to be able to access after a runtime check), and so for consistency, I've left those in HasV8_0rOps too. After this commit, HasV8_0rOps is a strict subset of HasV8_4aOps (but missing features that are not in Armv8.0-R at all). The definition of Cortex-R82 is correspondingly updated to add most of the features that I've removed from base Armv8.0-R (with the exception of the cryptography ones), since that particular implementation of v8.0-R does have them. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118045	2022-01-25 10:54:59 +00:00
Min-Yih Hsu	07be76f2ae	[M68k][Disassembler][NFC] Re-organize test files Put test cases of each instruction category into their own files. NFC.	2022-01-25 13:10:15 +08:00
Simon Tatham	a4ac40e92f	[AArch64] Remove PRBAR0_ELn and PRLAR0_ELn sysregs. The Armv8-R.64 architecture defines numbered MPU region registers with indices 1-15, not 0-15. So there's no such register as PRBAR0_EL2 or PRLAR0_EL1 (for example). The encodings that they would occupy are used for the unnumbered PRBAR_ELn and PRLAR_ELn registers. Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D117755	2022-01-20 13:37:58 +00:00
Simon Tatham	19b9cd4eae	[MC] Add a disassembly test for Armv8-R sysregs. This is the counterpart to llvm/test/MC/AArch64/armv8r-sysreg.s, checking all the same encodings when fed to the disassembler.	2022-01-20 13:37:58 +00:00
Simon Tatham	e35a3f188f	[AArch64] Adding "armv8.8-a" memcpy/memset support. This family of instructions includes CPYF (copy forward), CPYB (copy backward), SET (memset) and SETG (memset + initialise MTE tags), with some sub-variants to indicate whether address translation is done in a privileged or unprivileged way. For the copy instructions, you can separately specify the read and write translations (so that kernels can safely use these instructions in syscall handlers, to memcpy between the calling process's user-space memory map and the kernel's own privileged one). The unusual thing about these instructions is that they write back to multiple registers, because they perform an implementation-defined amount of copying each time they run, and write back to _all_ the address and size registers to indicate how much remains to be done (and the code is expected to loop on them until the size register becomes zero). But this is no problem in LLVM - you just define each instruction to have multiple outputs, multiple inputs, and a set of constraints tying their register numbers together appropriately. This commit introduces a special subtarget feature called MOPS (after the name the spec gives to the CPU id field), which is a dependency of the top-level 8.8-A feature, and uses that to enable most of the new instructions. The SETMG instructions also depend on MTE (and the test checks that). Differential Revision: https://reviews.llvm.org/D116157	2022-01-05 14:44:24 +00:00
Simon Tatham	8c1e520c90	[AArch64] Adding "armv8.8-a" BC instruction. This instruction is described in the Arm A64 Instruction Set Architecture documentation available here: https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/BC-cond--Branch-Consistent-conditionally-?lang=en FEAT_HBC "Hinted Conditional Branches" is listed in the 2021 A-Profile Architecture Extensions: https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools/feature-names-for-a-profile 'BC.cc', where 'cc' is any ordinary condition code, is an instruction that looks exactly like B.cc (the normal conditional branch), except that bit 4 of the encoding is 1 rather than 0, which hints something to the branch predictor (specifically, that this branch is expected to be highly consistent, even though _which way_ it will consistently go is not known at compile time). This commit introduces a special subtarget feature for HBC, which is a dependency of the top-level 8.8-A feature, and uses that to enable the new BC instruction. Differential Revision: https://reviews.llvm.org/D116156	2022-01-03 12:33:51 +00:00
Ties Stuij	63eb7ff47d	[ARM] Implement PAC return address signing mechanism for PACBTI-M This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things: - PAC and AUT instructions are generated. - They're part of the stack frame setup, so that shrink-wrapping can move them inwards to cover only part of a function - The auth code generated by PAC is saved across subroutine calls so that AUT can find it again to check - PAC is emitted before stacking registers (so that the SP it signs is the one on function entry). - The new pseudo-register ra_auth_code is mentioned in the DWARF frame data - With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT validates the corresponding value of SP - Emit correct unwind information when PAC is replaced by PACBTI - Handle tail calls correctly Some notes: We make the assembler accept the `.save {ra_auth_code}` directive that is emitted by the compiler when it saves a register that contains a return address authentication code. For EHABI we need to have the `FrameSetup` flag on the instruction and handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit `.save {ra_auth_code}`, instead of `.save {r12}`. For PACBTI-M, the instruction which computes return address PAC should use SP value before adjustment for the argument registers save area (used for variadic functions and when a parameter is split between stack and register), but at the same time it should be after the instruction that saves FPCXT when compiling a CMSE entry function. This patch moves the varargs SP adjustment after the FPCXT save (they are never enabled at the same time), so in a following patch handling of the `PAC` instruction can be placed between them. Epilogue emission code adjusted in a similar manner. PACBTI-M code generation should not emit any instructions for architectures v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases is handled separately by a future ticket. Note on tail calls: If the called function has four arguments that occupy registers `r0`-`r3`, the only option for holding the function pointer itself is `r12`, but this register is used to keep the PAC during function prologue/epilogue and clobbers the function pointer. When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to keep six values - the four function arguments, the function pointer and the PAC, which is obviously impossible. One option would be to authenticate the return address before all callee-saved registers are restored, so we have a scratch register to temporarily keep the value of `r12`. The issue with this approach is that it violates a fundamental invariant that PAC is computed using CFA as a modifier. It would also mean using separate instructions to pop `lr` and the rest of the callee-saved registers, which would offset the advantages of doing a tail call. Instead, this patch disables indirect tail calls when the called function takes four or more arguments and the return address sign and authentication is enabled for the caller function, conservatively assuming the caller function would spill LR. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Ties Stuij Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D112429	2021-12-07 10:15:19 +00:00
Ties Stuij	5cff77c23f	[clang][ARM] PACBTI-M assembly support Introduce assembly support for Armv8.1-M PACBTI extension. This is an optional extension in v8.1-M. There are 10 new system registers and 5 new instructions, all predicated on the feature. The attribute for llvm-mc is called "pacbti". For armclang, an architecture extension also called "pacbti" was created. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Victor Campos - Ties Stuij Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D112420	2021-11-30 09:28:18 +00:00
Dmitry Preobrazhensky	91f4650ebb	[AMDGPU][MC][GFX10] Corrected global_atomic_fcmpswap* Corrected src data size of global_atomic_fcmpswap and global_atomic_fcmpswap_x2 opcodes. Differential Revision: https://reviews.llvm.org/D113746	2021-11-15 12:51:12 +03:00
Alexandros Lamprineas	8689f5e6e7	[AArch64] Add support for the 'R' architecture profile. This change introduces subtarget features to predicate certain instructions and system registers that are available only on 'A' profile targets. Those features are not present when targeting a generic CPU, which is the default processor. In other words the generic CPU now means the intersection of 'A' and 'R' profiles. To maintain backwards compatibility we enable the features that correspond to -march=armv8-a when the architecture is not explicitly specified on the command line. References: https://developer.arm.com/documentation/ddi0600/latest Differential Revision: https://reviews.llvm.org/D110065	2021-10-27 12:32:30 +01:00
Joe Nash	b4b7e605a6	[AMDGPU] Support shared literals in FMAMK/FMAAK These instructions should allow src0 to be a literal with the same value as the mandatory other literal. Enable it by introducing an operand that defers adding its value to the MI when decoding till the mandatory literal is parsed. Reviewed By: dp, foad Differential Revision: https://reviews.llvm.org/D111067 Change-Id: I22b0ae0d35bad17b6f976808e48bffe9a6af70b7	2021-10-11 13:09:54 -04:00
Dmitry Preobrazhensky	3500e7d2b0	[AMDGPU][MC][GFX7][GFX10] Corrected image_atomic_fcmpswap Differential Revision: https://reviews.llvm.org/D109616	2021-09-21 18:06:02 +03:00
Dmitry Preobrazhensky	b8e7f53208	[AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics Differential Revision: https://reviews.llvm.org/D109614	2021-09-21 16:23:20 +03:00
Victor Campos	79f9c79aaf	[AArch64][MC] Merge FeaturePMU into FeaturePerfMon FeaturePMU was created in AArch64 to accommodate one missing system register, PMMIR_EL1, in commit `ffcd7698ae`. However, the Performance Monitors extension already had a target feature, which is called FeaturePerfMon. Therefore, FeaturePMU is redundant. This patch removes FeaturePMU and merges its contents into FeaturePerfMon. Reviewed By: dnsampaio Differential Revision: https://reviews.llvm.org/D109246	2021-09-06 14:56:49 +01:00
Wang, Pengfei	ab40dbfe03	[X86] AVX512FP16 instructions enabling 6/6 Enable FP16 complex FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105269	2021-08-30 13:08:45 +08:00
Thomas Johnson	8c3886b0ec	[ARC] Add ADC (addition with carry) and SBC (subtraction with carry) instructions Differential Revision: https://reviews.llvm.org/D108672	2021-08-25 07:46:15 -07:00
Thomas Johnson	ce1dc9d647	[ARC] Add codegen for the readcyclecounter intrinsic along with disassembly for associated instructions Differential Revision: https://reviews.llvm.org/D108598	2021-08-24 11:53:20 -07:00
Wang, Pengfei	c728bd5bba	[X86] AVX512FP16 instructions enabling 5/6 Enable FP16 FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105268	2021-08-24 09:07:19 +08:00
Wang, Pengfei	b088536ce9	[X86] AVX512FP16 instructions enabling 4/6 Enable FP16 unary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105267	2021-08-22 08:59:35 +08:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Carl Ritson	99c790dc21	[AMDGPU] Make BVH isel consistent with other MIMG opcodes Suffix opcodes with _gfx10. Remove direct references to architecture specific opcodes. Add a BVH flag and apply this to disassembly. Fix a number of disassembly errors on gfx90a target caused by previous incorrect BVH detection code. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D108117	2021-08-17 10:42:22 +09:00
Craig Topper	b82ce77b2b	[X86] Support avx512fp16 compare instructions in the IntelInstPrinter. This enables printing of the mnemonics that contain the predicate in the Intel printer. This requires accounting for the memory size that is explicitly printed in Intel syntax. Those changes have been synced to the ATT printer as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108093	2021-08-16 12:31:36 +08:00
Wang, Pengfei	f1de9d6dae	[X86] AVX512FP16 instructions enabling 2/6 Enable FP16 binary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105264	2021-08-15 08:56:33 +08:00
Thomas Johnson	b821086876	[ARC] Add codegen for count trailing zeros intrinsic for the ARC backend Differential Revision: https://reviews.llvm.org/D107828	2021-08-10 12:07:35 -07:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Min-Yih Hsu	2167e237ee	[M68k] Update disassembler test case following up ADD / ADDA changes Update disassembler test case to reflect the changes on ADD/ADDA instruction separation.	2021-08-08 14:20:46 -07:00
Mark Schimmel	e622c99f30	[ARC] Add norm/normh instructions with disassembly tests Add disassembler support for the NORM and NORMH instructions. These instructions only exist when the ARC processor is configured with the "norm" extension. Differential Revision: https://reviews.llvm.org/D107118	2021-07-29 17:54:52 -07:00
Thomas Johnson	cc238a6e03	[ARC] Add additional mov immediate instruction formats with a fix for u6 decoding Differential Revision: https://reviews.llvm.org/D107088	2021-07-29 16:41:55 -07:00
Lei Huang	64a15817a0	[PowerPC]Add addex instruction definition and MC tests Add td definitions and asm/disasm tests for the addex instruction introduced in ISA 3.0. Reviewed By: nemanjai, amyk, NeHuang Differential Revision: https://reviews.llvm.org/D106666	2021-07-26 14:55:38 -05:00
Ulrich Weigand	8cd8120a7b	[SystemZ] Add support for new cpu architecture - arch14 This patch adds support for the next-generation arch14 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch14 as host processor. - Assembler/disassembler support for new instructions. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10304. Note: No currently available Z system supports the arch14 architecture. Once new systems become available, the official system name will be added as supported -march name.	2021-07-26 16:57:28 +02:00
Thomas Johnson	51d8e67e88	[ARC] Add tablegen definition for the Find Leading Set (FLS) instruction Differential Revision: https://reviews.llvm.org/D106602	2021-07-22 17:42:25 -07:00
Thomas Johnson	1cda1e6186	[ARC] Add disassembly for the conditioned RSUB immediate instruction Differential Revision: https://reviews.llvm.org/D106497	2021-07-22 11:34:39 -07:00
Carl Ritson	6efb3220b4	[AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr. Previously these were rounded up to VReg_256 this saves VGPRs. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D103800	2021-07-22 10:42:15 +09:00
Igor Kudrin	657e067bb5	[ARMInstPrinter] Print the target address of a branch instruction This follows other patches that changed printing immediate values of branch instructions to target addresses, see D76580 (x86), D76591 (PPC), D77853 (AArch64). As observing immediate values might sometimes be useful, they are printed as comments for branch instructions. // llvm-objdump -d output (before) 000200b4 <_start>: 200b4: ff ff ff fa blx #-4 <thumb> 000200b8 <thumb>: 200b8: ff f7 fc ef blx #-8 <_start> // llvm-objdump -d output (after) 000200b4 <_start>: 200b4: ff ff ff fa blx 0x200b8 <thumb> @ imm = #-4 000200b8 <thumb>: 200b8: ff f7 fc ef blx 0x200b4 <_start> @ imm = #-8 // GNU objdump -d. 000200b4 <_start>: 200b4: faffffff blx 200b8 <thumb> 000200b8 <thumb>: 200b8: f7ff effc blx 200b4 <_start> Differential Revision: https://reviews.llvm.org/D104701	2021-06-30 16:35:28 +07:00
Lucas Prates	88b1135e72	[Aarch64] Adding support for Armv9-A Realm Management Extension This adds support for Armv9-A's Realm Management Extension, including three new system registers - MFAR_EL3, GPCCR_EL3 and GPTBR_EL3 - and four new TLBI instructions. The reference for the Realm Management Extension can be found at: https://developer.arm.com/documentation/ddi0615/aa. Based on patches by Victor Campos. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D104773	2021-06-28 13:45:22 +01:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Kai Luo	1c450c3d7e	[PowerPC] Export 16 byte load-store instructions Export `lq`, `stq`, `lqarx` and `stqcx.` in preparation for implementing 16-byte lock free atomic operations on AIX. Add a new register class `g8prc` for these instructions, since these instructions require even-odd register pair. Reviewed By: nemanjai, jsji, #powerpc Differential Revision: https://reviews.llvm.org/D103010	2021-06-15 01:56:10 +00:00
Carl Ritson	f8816c7400	[AMDGPU] Add v5f32/VReg_160 support for MIMG instructions Avoid having to round up to v8f32/VReg_256 when only 5 VGPRs are required for a MIMG address operand. Maintain _V8 instruction variants of pseudo instructions allowing assembly prior to GFX10 to work as-is. Currently the validator can tell for GFX10 what the correct size is, so will disallow oversize address registers. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D103672	2021-06-08 11:11:40 +09:00
Jay Foad	9e9edede18	[AMDGPU] Fix MC tests for v_fmaak_f16 and v_fmamk_f16 This looks like a mistake when the tests were committed in r363946. There were two sets of tests for the f32 variant of these instructions, instead of one set for f16 and one set for f32. Differential Revision: https://reviews.llvm.org/D103699	2021-06-07 10:42:52 +01:00
Dmitry Preobrazhensky	13c6568c6e	[AMDGPU][MC][GFX90A] Corrected DS_GWS opcodes Corrected DS_GWS opcodes to use even aligned registers. Differential Revision: https://reviews.llvm.org/D103185	2021-05-26 21:31:50 +03:00
Stanislav Mekhanoshin	f4c0fdc6c9	[AMDGPU] Set unused dst_sel to '?' in the encoding This is to allow disasm with any bits in the unused fields. Differential Revision: https://reviews.llvm.org/D102526	2021-05-17 08:38:52 -07:00
Alexandros Lamprineas	1079870971	[llvm-mc][AArch64] HINT instruction disassembled as BTI The Arm Architecture Reference Manual says that the SystemHintOp_BTI opcode is preferred when CRm:op2 matches 0100:xx0, but llvm-mc currently accepts 0100:xxx, which isn't right. Differential Revision: https://reviews.llvm.org/D102415	2021-05-14 10:05:37 +01:00
David Stuttard	72d570ca08	[AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling A16 support for image instructions assembly/disassembly (gfx10) was missing Also refactor MIMG op addr size calcs to common function We'd got 3 places where the same operation was being done. One test is now marked XFAIL until a related codegen patch is in place Differential Revision: https://reviews.llvm.org/D102231 Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb	2021-05-14 09:25:44 +01:00
Aakanksha Patil	464e4dc50f	[AMDGPU] Add gfx1034 target Differential Revision: https://reviews.llvm.org/D102306	2021-05-13 14:25:18 -04:00
Min-Yih Hsu	fc86e6d188	[ARM][disassembler] Fix incorrect number of MCOperands generated by the disassembler Try to fix bug 49974. This patch fixes two issues: 1. BL does not use predicate (BL_pred is the predicate version of BL), so we shouldn't add predicate operands in DecodeBranchImmInstruction. 2. Inside DecodeT2AddSubSPImm, we shouldn't add predicate operands into the MCInst because ARMDisassembler::AddThumbPredicate will do that for us. However, we should handle CC-out operand for t2SUBspImm and t2AddspImm. Differential Revision: https://reviews.llvm.org/D100585	2021-04-25 11:55:10 -07:00
Ricky Taylor	2221185776	[M68k] Implement Disassembler This is an implementation of a disassembler for M68k. Differential Revision: https://reviews.llvm.org/D98540	2021-04-19 22:24:12 +01:00
Stefan Pintilie	f28cb01be0	[PowerPC] Add ROP Protection Instructions for PowerPC There are four new PowerPC instructions that are introduced in Power 10. They are hashst, hashchk, hashstp, hashchkp. These instructions will be used for ROP Protection. This patch adds the four instructions. Reviewed By: nemanjai, amyk, #powerpc Differential Revision: https://reviews.llvm.org/D99375	2021-04-15 11:38:38 -05:00
Thomas Lively	ea8dd3ee2e	[WebAssembly] Update v128.any_true In the final SIMD spec, there is only a single v128.any_true instruction, rather than one for each lane interpretation because the semantics do not depend on the lane interpretation. Differential Revision: https://reviews.llvm.org/D100241	2021-04-11 11:13:16 -07:00

1995 Commits