llvm-project

Commit Graph

Author	SHA1	Message	Date
Dmitry Preobrazhensky	b26afab9d1	[AMDGPU][MC][GFX11] Correct src0 for dpp variants of v_cvt_*_e64 Differential Revision: https://reviews.llvm.org/D127847	2022-06-16 13:48:43 +03:00
Joe Nash	989bd57f98	[AMDGPU] gfx11 support add_f16 The instruction was skipped in the earlier large patch adding VOP2, https://reviews.llvm.org/D126917. Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127697	2022-06-14 08:59:45 -04:00
Dmitry Preobrazhensky	365d827f65	[AMDGPU][MC][GFX11] Correct ds_swizzle_b32 Enable offset parsing. Differential Revision: https://reviews.llvm.org/D127404	2022-06-14 12:58:03 +03:00
Joe Nash	fd3304ef85	[AMDGPU] gfx11 EXECZ and VCCZ are no longer allowed to be used as sources to SALU and VALU instructions. Contributors: Baptiste Saleil <baptiste.saleil@amd.com> Patch 20/N for upstreaming of AMDGPU gfx11 architecture Depends on D126989 Reviewed By: rampitec, foad, #amdgpu Differential Revision: https://reviews.llvm.org/D127143	2022-06-10 10:03:43 -04:00
Ivan Kosarev	60d6fbb621	[AMDGPU][GFX9][GFX10] Support base+soffset+offset SMEM atomics. Resolves a part of https://github.com/llvm/llvm-project/issues/38652 Reviewed By: dp Differential Revision: https://reviews.llvm.org/D127314	2022-06-10 13:22:41 +01:00
Joe Nash	be1082c6d5	[AMDGPU] gfx11 VOPC instructions Supports encoding existing instructions on gfx11 and MC support for the new VOPC dpp instructions. Patch 19/N for upstreaming of AMDGPU gfx11 architecture Depends on D126978 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D126989	2022-06-09 15:22:42 -04:00
Joe Nash	40f35cef89	[AMDGPU] gfx11 VOP3P instruction MC support Includes dpp versions of VOP3P instructions. Patch 18/N for upstreaming of AMDGPU gfx11 architecture Depends on D126917 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D126978	2022-06-08 13:32:01 -04:00
Joe Nash	086a9c1062	Reland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support The reverted dependent commit is now relanded, so reland this. Includes dpp instructions and vop1/vop2 promoted to vop3 Patch 17/N for upstreaming of AMDGPU gfx11 architecture Depends on D126483 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D126917	2022-06-08 11:10:57 -04:00
Joe Nash	e243ead6fc	Reland [AMDGPU] gfx11 vop3dpp instructions There was an issue with encoding wide (>64 bit) instructions on BigEndian hosts, which is fixed in D127195. Therefore reland this. gfx11 adds the ability to use dpp modifiers on vop3 instructions. This patch adds machine code layer support for that. The MCCodeEmitter is changed to use APInt instead of uint64_t to support these wider instructions. Patch 16/N for upstreaming of AMDGPU gfx11 architecture Differential Revision: https://reviews.llvm.org/D126483	2022-06-07 14:49:13 -04:00
Joe Nash	eaed07eb7e	Revert "[AMDGPU] gfx11 vop3dpp instructions" This reverts commit `99a83b1286`.	2022-06-06 17:12:09 -04:00
Joe Nash	f617f89e5b	Revert "[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support" This reverts commit `6079804498`.	2022-06-06 17:11:35 -04:00
Ivan Kosarev	facbfb121a	[AMDGPU][GFX9+] Support base+soffset+offset s_atc_probe's. Resolves part of https://github.com/llvm/llvm-project/issues/38652 Reviewed By: dp Differential Revision: https://reviews.llvm.org/D126791	2022-06-06 16:46:22 +01:00
Ivan Kosarev	79ec1e8fd6	[AMDGPU][GFX9][GFX10] Support base+soffset+offset s_dcache_discard's. Resolves part of https://github.com/llvm/llvm-project/issues/38652 Reviewed By: dp Differential Revision: https://reviews.llvm.org/D126766	2022-06-06 16:32:16 +01:00
Joe Nash	6079804498	[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support Includes dpp instructions and vop1/vop2 promoted to vop3 Patch 17/N for upstreaming of AMDGPU gfx11 architecture Depends on D126483 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D126917	2022-06-06 09:57:59 -04:00
Joe Nash	99a83b1286	[AMDGPU] gfx11 vop3dpp instructions gfx11 adds the ability to use dpp modifiers on vop3 instructions. This patch adds machine code layer support for that. The MCCodeEmitter is changed to use APInt instead of uint64_t to support these wider instructions. Patch 16/N for upstreaming of AMDGPU gfx11 architecture Depends on D126475 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D126483	2022-06-06 09:34:59 -04:00
Joe Nash	3732cd59be	[AMDGPU] gfx11 vop3 and inherited vop instructions This patch includes MC layer support for VOP3 encoded instructions and generic VOP support classes. Some VOP1 and VOP2 instructions which share an encoding with gfx10 and are using the AssemblerPredicate = isGFX10Plus are also enabled. That predicate will be changed to isGFX10Only in a later patch. Patch 15/N for upstreaming of AMDGPU gfx11 architecture. Depends on D126468 Reviewed By: dp Differential Revision: https://reviews.llvm.org/D126475	2022-06-02 14:03:02 -04:00
Joe Nash	e4870c8357	[AMDGPU] gfx11 ds instructions MC layer support for ds instructions Contributors: Piotr Sobczak <Piotr.Sobczak@amd.com> Patch 14/N for upstreaming of AMDGPU gfx11 architecture. Depends on D126463 Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D126468	2022-06-02 13:36:56 -04:00
Joe Nash	e8860bee28	[AMDGPU] gfx11 Image instructions MC layer support for instructions in the MIMG encoding(Image instructions). Contributors: Carl Ritson <carl.ritson@amd.com> Patch 13/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125992 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D126463	2022-05-31 10:53:35 -04:00
Ivan Kosarev	082822b381	[AMDGPU][GFX9] Support base+soffset+offset SMEM stores. Reviewed By: dp Differential Revision: https://reviews.llvm.org/D126388	2022-05-30 10:27:57 +01:00
Ivan Kosarev	b0ccf38b01	[AMDGPU][GFX9] Support base+soffset+offset SMEM loads. Resolves part of https://github.com/llvm/llvm-project/issues/38652 Reviewed By: dp Differential Revision: https://reviews.llvm.org/D125700	2022-05-26 12:42:33 +01:00
Joe Nash	835e09c4c3	[AMDGPU] gfx11 FLAT Instructions MachineCode Support for FLAT type instructions Contributors: Sebastian Neubauer <sebastian.neubauer@amd.com> Patch 12/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125989 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D125992	2022-05-25 15:29:39 -04:00
Joe Nash	ef1ea5ac01	[AMDGPU] gfx11 vinterp instructions MC support A new instruction encoding. Some of these instructions were previously VOP3 encoded. Contributors: Carl Ritson <carl.ritson@amd.com> Patch 11/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125824 Reviewed By: critson Differential Revision: https://reviews.llvm.org/D125989	2022-05-25 14:59:16 -04:00
Joe Nash	1a51ab766f	[AMDGPU] gfx11 export instructions Contributors: Jay Foad <jay.foad@amd.com> Dmitry Preobrazhensky <d-pre@mail.ru> Patch 10/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125822 Reviewed By: dp Differential Revision: https://reviews.llvm.org/D125824	2022-05-25 14:44:09 -04:00
Ivan Kosarev	1586e1dc95	[AMDGPU][MC][GFX11] Support base+soffset+offset SMEM loads. Reviewed By: dp Differential Revision: https://reviews.llvm.org/D126207	2022-05-24 15:13:14 +01:00
Sheng	09865ae95d	[NFC][M68k][test] Add disassembler tests for move instructions	2022-05-22 10:35:13 +08:00
Joe Nash	ac2ff258d6	[AMDGPU] gfx11 scalar memory instructions Contributors: Mirko Brkusanin <Mirko.Brkusanin@amd.com> Patch 9/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125820 Reviewed By: kosarev, #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D125822	2022-05-19 10:27:47 -04:00
Joe Nash	729467acef	[AMDGPU] gfx11 LDSDIR instructions MC support Contributors: Carl Ritson <carl.ritson@amd.com> Patch 8/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125498 Reviewed By: critson, rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D125820	2022-05-19 10:08:47 -04:00
Sheng	a5d618b393	[M68k][Disassembler] Fix decoding conflict This diff fixes decoding conflict between these pair of instructions: ADD(16\|32)dd / ADD(16\|32)dr SUB(16\|32)dd / SUB(16\|32)dr AND(16\|32)dd / AND(16\|32)dr OR(16\|32)dd / OR(16\|32)dr Reviewed By: ricky26 Differential Revision: https://reviews.llvm.org/D125861	2022-05-19 09:10:50 +08:00
Dmitry Preobrazhensky	32ca9bd7b5	[AMDGPU][MC][GFX940] Correct tied operand decoding for smfmac opcodes Differential Revision: https://reviews.llvm.org/D125790	2022-05-18 15:39:30 +03:00
Ivan Kosarev	140ad30b24	[AMDGPU][MC][GFX10] Add missing s_scratch_load tests. Completes https://reviews.llvm.org/D125117 Reviewed By: dp, arsenm Differential Revision: https://reviews.llvm.org/D125753	2022-05-18 11:11:10 +01:00
Stanislav Mekhanoshin	a09af86693	[AMDGPU] Enable FLAT LDS DMA on gfx9/10 before gfx940 We always had global and scratch loads to LDS in the gfx9, but did not handle it. These were available via the 'lds' encoding bit. In gfx940 this bit was reused as 'svs' which resulted in new '_lds' opcodes effectively pushing this bit into the opcode, but functionally it is the same. These instructions are also available on gfx10. Differential Revision: https://reviews.llvm.org/D125126	2022-05-17 12:16:37 -07:00
Joe Nash	d21b9b4946	[AMDGPU] gfx11 scalar alu instructions MC layer support for SOP(scalar alu operations) including encoding support for s_delay_alu and s_sendmsg_rtn. Contributors: Jay Foad <jay.foad@amd.com> Patch 7/N for upstreaming of AMDGPU gfx11 architecture. Depends on D125319 Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D125498	2022-05-17 13:35:41 -04:00
Joe Nash	c70259405c	[AMDGPU] gfx11 BUF Instructions Includes MachineCode layer support and tests, and MIR tests not requiring CodeGen pass changes. Includes a small change in SMInstructions.td to correct encoded bits. Contributors: Petar Avramovic <Petar.Avramovic@amd.com> Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com> Depends on D125316 Patch 6/N for upstreaming of AMDGPU gfx11 architecture. Reviewed By: dp, Petar.Avramovic Differential Revision: https://reviews.llvm.org/D125319	2022-05-16 09:41:40 -04:00
Sheng	cf0b6df6db	[M68k][Disassembler] Adopt the new variable length decoder This is an example usage of D120958. After these patches are landed, we can strip off the codebeads officially. Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D120960	2022-05-15 08:44:58 +08:00
Ivan Kosarev	cb67b2ccc4	[AMDGPU][GFX10] Support base+soffset+offset SMEM stores. Also makes another step towards resolving https://github.com/llvm/llvm-project/issues/38652 Reviewed By: foad, dp Differential Revision: https://reviews.llvm.org/D125380	2022-05-12 08:48:05 +01:00
Ivan Kosarev	88f04bdbd8	[AMDGPU][GFX10] Support base+soffset+offset SMEM loads. Also makes a step towards resolving https://github.com/llvm/llvm-project/issues/38652 Reviewed By: foad, dp Differential Revision: https://reviews.llvm.org/D125117	2022-05-10 16:17:14 +01:00
Simon Pilgrim	c0840799e3	[MC][X86] Add vcmpps disassembler tests for Issue #41491 We were missing coverage for vcmpps imm, vreg, vreg, mreg {mreg} patterns	2022-05-06 15:39:17 +01:00
Philipp Tomsich	64816e68f4	[AArch64] Support for Ampere1 core Add support for the Ampere Computing Ampere1 core. Ampere1 implements the AArch64 state and is compatible with ARMv8.6-A. Differential Revision: https://reviews.llvm.org/D117112	2022-05-03 15:54:02 +01:00
CHIANG, YU-HSUN (Tommy Chiang, oToToT)	4a31af88a2	[MC][AArch64] Enable '+v8a' when nothing specified for MCSubtargetInfo D110065 added 'R' profile support to LLVM. It turns the `generic` cpu into the intersection of v8-a and v8-r. However, this causes some backward compatibility problems. The original patch makes the clang driver implicitly pass -march=armv8-a when only the triple is specified. Since that only applies to clang, other tools like llvm-objdump still face the backward compatibility problem. This patch applies the same idea to MC related tools by enabling the '+v8a' feature when nothing is specified (both CPU and FS are empty) for MCSubtargetInfo creation. This patch should fix PR53956. Reviewed by: labrinea Differential Revision: https://reviews.llvm.org/D124319	2022-04-29 04:53:22 +08:00
Stanislav Mekhanoshin	00d84a9f92	[AMDGPU] Remove vdata from buffer to lds load Differential Revision: https://reviews.llvm.org/D124485	2022-04-26 17:16:26 -07:00
Ulrich Weigand	1283ccb610	Support z16 processor name The recently announced IBM z16 processor implements the architecture already supported as "arch14" in LLVM. This patch adds support for "z16" as an alternate architecture name for arch14.	2022-04-21 19:58:22 +02:00
Dmitry Preobrazhensky	b4231ac4be	[AMDGPU][GFX90A+] Disabled ds_ordered_count and exp Differential Revision: https://reviews.llvm.org/D124087	2022-04-21 13:16:44 +03:00
Dmitry Preobrazhensky	ab18e1a533	[AMDGPU][GFX10] Enabled op_sel for v_add_nc_u16 and v_sub_nc_u16 Differential Revision: https://reviews.llvm.org/D123594	2022-04-13 13:48:42 +03:00
Shengchen Kan	fcade8e91e	[X86][test] Add encoding/decoding tests for VEX instruction w/ address-size prefix This patch also contains a regression test for D122448 Reviewed By: hvdijk, RKSimon Differential Revision: https://reviews.llvm.org/D122449	2022-04-13 12:50:25 +08:00
Dmitry Preobrazhensky	1f6aa90386	[AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand Added the following helpers: depctr_hold_cnt(...) depctr_sa_sdst(...) depctr_va_vdst(...) depctr_va_sdst(...) depctr_va_ssrc(...) depctr_va_vcc(...) depctr_vm_vsrc(...) Differential Revision: https://reviews.llvm.org/D123022	2022-04-07 17:03:44 +03:00
Simon Tatham	82bd0bd24f	[AArch64] Make PMMIR_EL1 read-only. The Arm architecture reference manual (ARM DDI 0487H.a section D13.5.12) lists every field in the register as RO, and does not list an MSR instruction that writes it. So we should be defining it as an ROSysReg, not an RWSysReg. Reviewed By: vhscampos Differential Revision: https://reviews.llvm.org/D123111	2022-04-05 11:09:56 +01:00
Min-Yih Hsu	18b38ff6c7	[M68k] Adopt VarLenCodeEmitter for move instructions The `move` instruction has one of the most complicated sets of variants, so we're refactoring it first before finishing up the rest of the data instructions in a separate patch. Note that since we're introducing more `move` variants, the codegen actually got improved in terms of code size.	2022-04-04 23:02:27 -07:00
Min-Yih Hsu	fccdc5618d	[M68k] Adopt VarLenCodeEmitter for shift / rotate instructions This patch is covered by existing MC tests.	2022-04-03 22:52:32 -07:00
Stefan Pintilie	2e55bc9f3c	[PowerPC] Set the special DSCR with a compiler option. Add a compiler option and the instructions required to set the special Data Stream Control Register (DSCR). The special register will not be set by default. Original patch by: Muhammad Usman Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D117013	2022-03-31 14:06:30 -05:00
Simon Pilgrim	4a33b9ece0	[MC][X86] Ensure all opcode tests are sorted by instruction name Noticed while reviewing D122449	2022-03-30 11:08:11 +01:00
Stanislav Mekhanoshin	6e3e14f600	[AMDGPU] Support gfx940 smfmac instructions Differential Revision: https://reviews.llvm.org/D122191	2022-03-24 12:40:42 -07:00
Stanislav Mekhanoshin	27439a7642	[AMDGPU] New gfx940 mfma instructions Differential Revision: https://reviews.llvm.org/D122044	2022-03-24 12:12:52 -07:00
Stanislav Mekhanoshin	72c1a0d9c2	[AMDGPU] Allow v_accvgpr_write to use SGPR on gfx90a This is undocumented, but it should work. Differential Revision: https://reviews.llvm.org/D122252	2022-03-22 13:52:29 -07:00
Stanislav Mekhanoshin	d9ac55fab2	[AMDGPU] New MFMA names for existing instructions Old names are supported as aliases. _1k MFMA got new opcodes. Differential Revision: https://reviews.llvm.org/D121741	2022-03-17 13:05:36 -07:00
Stanislav Mekhanoshin	522b259976	[AMDGPU] Allow v_accvgpr_write to use SGPR src on gfx940 Differential Revision: https://reviews.llvm.org/D121843	2022-03-17 12:12:06 -07:00
Amir Ayupov	2c4e38fa6f	[X86] Emit REX prefix immediately before the opcode Fix prefix emission order to emit REX immediately before the opcode (SDM vol2, 2.1, Figure 2-1). According to SDM vol2 2.2.1, "Other placements are ignored". This fix has a side effect of outputting segment override prefix in a different order than previously (benign). Follow-up to https://reviews.llvm.org/D120592 Reviewed By: skan, craig.topper Differential Revision: https://reviews.llvm.org/D120871	2022-03-16 08:30:31 -07:00
Amir Ayupov	1d3719820f	[X86] Preserve redundant Address-Size override prefix Print and emit redundant Address-Size override prefix if it's set on the instruction. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D120592	2022-03-16 08:30:29 -07:00
Stanislav Mekhanoshin	8dd3d1cf1f	[AMDGPU] Add symbolic names for gfx940 HWREGs The HWREG namespace now overlaps with gfx10, so the patch is longer than would be needed just to support the new names. It also needs to handle proper error messages, i.e. to issue a "specified hardware register is not supported on this GPU" message. This may need a major refactoring in the future. Differential Revision: https://reviews.llvm.org/D121418	2022-03-14 16:13:33 -07:00
Stanislav Mekhanoshin	23499103f7	[AMDGPU] Support for gfx940 flat lds opcodes Differential Revision: https://reviews.llvm.org/D121414	2022-03-14 15:46:19 -07:00
Stanislav Mekhanoshin	1f53f20fc1	[AMDGPU] Support gfx940 v_lshl_add_u64 instruction Differential Revision: https://reviews.llvm.org/D121401	2022-03-14 15:45:42 -07:00
Stanislav Mekhanoshin	36fe3f13a9	[AMDGPU] flat scratch SVS addressing mode for gfx940 Both VADDR and SADDR are used in SVS mode. Differential Revision: https://reviews.llvm.org/D121254	2022-03-14 15:23:36 -07:00
Stanislav Mekhanoshin	6181458662	[AMDGPU] gfx940 MUBUF format changes Differential Revision: https://reviews.llvm.org/D121234	2022-03-11 11:36:49 -08:00
Stanislav Mekhanoshin	932f628121	[AMDGPU] new gfx940 fp atomics Differential Revision: https://reviews.llvm.org/D121028	2022-03-07 12:32:02 -08:00
Stanislav Mekhanoshin	e7b362d75d	[AMDGPU] Add v_mov_b64 gfx940 opcode Differential Revision: https://reviews.llvm.org/D121023	2022-03-07 12:07:12 -08:00
Stanislav Mekhanoshin	8992b50e2f	[AMDGPU] gfx940 uses new names for coherency bits Differential Revision: https://reviews.llvm.org/D120855	2022-03-07 11:50:07 -08:00
Stanislav Mekhanoshin	2c830c8fab	[AMDGPU] gfx940: support V_FMAMK_F32 and V_FMAAK_F32 Differential Revision: https://reviews.llvm.org/D120769	2022-03-07 11:31:01 -08:00
Simon Tatham	54dafd38c5	[AArch64] Move FeatureSpecRestrict into core 8.0-R architecture. It was included in HasV8_0rOps when D88660 first introduced that architecture definition. In D118045 I moved it out of there and into ProcessorFeatures.R82, so that -mcpu=cortex-r82 would continue to behave the same as before but -march=armv8-r would include only the mandatory parts of the architecture. In fact, that was a mistake. Firstly, Cortex-R82 _doesn't_ implement that feature, so it makes no sense to deliberately enable it for that CPU in particular. But also, it's an extension that only adds system registers, and we're generally more relaxed about where we enable those (because kernel developers find it useful to write sysreg-access instructions after runtime checking, and because sysreg accesses aren't manufactured during code generation so the risk is small). So, in line with that usual AArch64 policy, FeatureSpecRestrict ought to be considered part of 8.0-R for LLVM purposes. So I'm moving it back into HasV8_0rOps, where it started out. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D120830	2022-03-07 15:55:08 +00:00
Aakanksha	840695814a	[AMDGPU] Add gfx1036 target Differential Revision: https://reviews.llvm.org/D120846	2022-03-02 23:26:38 +00:00
Stefan Pintilie	eb1c5a9862	[PowerPC] Add the Power10 LXVKQ instruction. Add the Power 10 instruction LXVKQ. This patch was taken from an original patch by: Yi-Hong Lyu Reviewed By: lei Differential Revision: https://reviews.llvm.org/D117507	2022-02-23 08:48:59 -06:00
Min-Yih Hsu	4986a41f58	[M68k] Adopt VarLenCodeEmitter for bits instructions And introduce operand encoding fragments (i.e. MxEncMemOp record) for addressing modes 'o' and 'e'.	2022-02-17 14:16:19 -08:00
Sheng	4306fbff9c	Revert "Revert "[M68k] Adopt VarLenCodeEmitter for control instructions"" This reverts commit `69a7d49de6`. llvm/test/MC/M68k/Relaxations/branch.s needs disassembler support. So I disabled it temporarily	2022-02-16 17:41:49 +08:00
Sheng	69a7d49de6	Revert "[M68k] Adopt VarLenCodeEmitter for control instructions" This reverts commit `9ffd498fcb`. This patch introduces a regression on MC/M68k/Relaxations/branch.s	2022-02-16 17:09:46 +08:00
Sheng	9ffd498fcb	[M68k] Adopt VarLenCodeEmitter for control instructions Refactor the instructions in M68kInstrControl.td to use the VarLenCodeEmitter. This patch is tested by the existing test cases. Reviewed By: myhsu, ricky26 Differential Revision: https://reviews.llvm.org/D119665	2022-02-16 12:54:20 +08:00
Stefan Pintilie	a601db30c6	[PowerPC] Remove the LDMX instruction. The LDMX instruction was to be potentially added in P9 but it was never added in either ISA 3.0 or ISA 3.1. This patch removes that instruction as it is currently still an invalid instruction. Reviewed By: lei Differential Revision: https://reviews.llvm.org/D118074	2022-02-14 17:03:48 -06:00
Min-Yih Hsu	08f2b0dcf6	[M68k] Adopt the new VarLenCodeEmitterGen for arithmetic instructions This patch refactors all the existing M68k arithmetic instructions to use the new VarLenCodeEmitterGen infrastructure. This patch is tested by the existing MC test cases. Note that one of the codegen tests needed to be updated because the ordering of two equivalent instructions was switched. Differential Revision: https://reviews.llvm.org/D115234	2022-02-11 09:31:12 -08:00
Stanislav Mekhanoshin	d3b87e4a1c	[AMDGPU] HWRegs TMA and TBA also supported on gfx9 Differential Revision: https://reviews.llvm.org/D118860	2022-02-03 09:36:10 -08:00
Shao-Ce SUN	005fd8aa70	[RISCV] Add support for Zihintpause extension Add support for the 'pause' hint instruction as an alias for 'fence w, 0'. To do this allow the 'fence' operands pred and succ to be set to 0 (the empty set). This will also allow future hints to be encoded as 'fence 0, <x>' and 'fence <x>, 0'. This patch is revised from @mundaym's D93019. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117789	2022-02-03 20:55:47 +08:00
Ting Wang	6f25cb8685	[PowerPC] Add the Power10 XS[MAX\|MIN]CQP instruction Add the Power 10 instruction XS[MAX\|MIN]CQP. Reviewed By: shchenz, amyk Differential Revision: https://reviews.llvm.org/D118036	2022-01-26 23:00:43 -05:00
Simon Tatham	f302e0b5dd	[AArch64] Exclude optional features from HasV8_0rOps. The following SubtargetFeatures are removed from the definition of HasV8_0rOps, on the grounds that they are optional in Armv8.4-A, and therefore (by the definition of Armv8.0-R) also optional in v8.0-R: * performance monitoring: FeaturePerfMon * cryptography: FeatureSM4 and FeatureSHA3 * half-precision FP: FeatureFullFP16, FeatureFP16FML * speculation control: FeatureSSBS, FeaturePredRes, FeatureSB, FeatureSpecRestrict This isn't the full set of features that are listed as optional in the spec. FeatureCCIDX and FeatureTRACEV8_4 are also optional. But LLVM includes those in HasV8_3aOps and HasV8_4aOps respectively (I think on the grounds that the system registers they enable are useful to be able to access after a runtime check), and so for consistency, I've left those in HasV8_0rOps too. After this commit, HasV8_0rOps is a strict subset of HasV8_4aOps (but missing features that are not in Armv8.0-R at all). The definition of Cortex-R82 is correspondingly updated to add most of the features that I've removed from base Armv8.0-R (with the exception of the cryptography ones), since that particular implementation of v8.0-R does have them. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118045	2022-01-25 10:54:59 +00:00
Min-Yih Hsu	07be76f2ae	[M68k][Disassembler][NFC] Re-organize test files Put test cases of each instruction category into their own files. NFC.	2022-01-25 13:10:15 +08:00
Simon Tatham	a4ac40e92f	[AArch64] Remove PRBAR0_ELn and PRLAR0_ELn sysregs. The Armv8-R.64 architecture defines numbered MPU region registers with indices 1-15, not 0-15. So there's no such register as PRBAR0_EL2 or PRLAR0_EL1 (for example). The encodings that they would occupy are used for the unnumbered PRBAR_ELn and PRLAR_ELn registers. Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D117755	2022-01-20 13:37:58 +00:00
Simon Tatham	19b9cd4eae	[MC] Add a disassembly test for Armv8-R sysregs. This is the counterpart to llvm/test/MC/AArch64/armv8r-sysreg.s, checking all the same encodings when fed to the disassembler.	2022-01-20 13:37:58 +00:00
Simon Tatham	e35a3f188f	[AArch64] Adding "armv8.8-a" memcpy/memset support. This family of instructions includes CPYF (copy forward), CPYB (copy backward), SET (memset) and SETG (memset + initialise MTE tags), with some sub-variants to indicate whether address translation is done in a privileged or unprivileged way. For the copy instructions, you can separately specify the read and write translations (so that kernels can safely use these instructions in syscall handlers, to memcpy between the calling process's user-space memory map and the kernel's own privileged one). The unusual thing about these instructions is that they write back to multiple registers, because they perform an implementation-defined amount of copying each time they run, and write back to _all_ the address and size registers to indicate how much remains to be done (and the code is expected to loop on them until the size register becomes zero). But this is no problem in LLVM - you just define each instruction to have multiple outputs, multiple inputs, and a set of constraints tying their register numbers together appropriately. This commit introduces a special subtarget feature called MOPS (after the name the spec gives to the CPU id field), which is a dependency of the top-level 8.8-A feature, and uses that to enable most of the new instructions. The SETMG instructions also depend on MTE (and the test checks that). Differential Revision: https://reviews.llvm.org/D116157	2022-01-05 14:44:24 +00:00
Simon Tatham	8c1e520c90	[AArch64] Adding "armv8.8-a" BC instruction. This instruction is described in the Arm A64 Instruction Set Architecture documentation available here: https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/BC-cond--Branch-Consistent-conditionally-?lang=en FEAT_HBC "Hinted Conditional Branches" is listed in the 2021 A-Profile Architecture Extensions: https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools/feature-names-for-a-profile 'BC.cc', where 'cc' is any ordinary condition code, is an instruction that looks exactly like B.cc (the normal conditional branch), except that bit 4 of the encoding is 1 rather than 0, which hints something to the branch predictor (specifically, that this branch is expected to be highly consistent, even though _which way_ it will consistently go is not known at compile time). This commit introduces a special subtarget feature for HBC, which is a dependency of the top-level 8.8-A feature, and uses that to enable the new BC instruction. Differential Revision: https://reviews.llvm.org/D116156	2022-01-03 12:33:51 +00:00
Ties Stuij	63eb7ff47d	[ARM] Implement PAC return address signing mechanism for PACBTI-M This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things: - PAC and AUT instructions are generated. - They're part of the stack frame setup, so that shrink-wrapping can move them inwards to cover only part of a function - The auth code generated by PAC is saved across subroutine calls so that AUT can find it again to check - PAC is emitted before stacking registers (so that the SP it signs is the one on function entry). - The new pseudo-register ra_auth_code is mentioned in the DWARF frame data - With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT validates the corresponding value of SP - Emit correct unwind information when PAC is replaced by PACBTI - Handle tail calls correctly Some notes: We make the assembler accept the `.save {ra_auth_code}` directive that is emitted by the compiler when it saves a register that contains a return address authentication code. For EHABI we need to have the `FrameSetup` flag on the instruction and handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit `.save {ra_auth_code}`, instead of `.save {r12}`. For PACBTI-M, the instruction which computes return address PAC should use SP value before adjustment for the argument registers save area (used for variadic functions and when a parameter is split between stack and register), but at the same time it should be after the instruction that saves FPCXT when compiling a CMSE entry function. This patch moves the varargs SP adjustment after the FPCXT save (they are never enabled at the same time), so in a following patch handling of the `PAC` instruction can be placed between them. Epilogue emission code is adjusted in a similar manner. PACBTI-M code generation should not emit any instructions for architectures v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases is handled separately by a future ticket. Note on tail calls: If the called function has four arguments that occupy registers `r0`-`r3`, the only option for holding the function pointer itself is `r12`, but this register is used to keep the PAC during function prologue/epilogue and clobbers the function pointer. When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to keep six values - the four function arguments, the function pointer and the PAC, which is obviously impossible. One option would be to authenticate the return address before all callee-saved registers are restored, so we have a scratch register to temporarily keep the value of `r12`. The issue with this approach is that it violates a fundamental invariant that PAC is computed using CFA as a modifier. It would also mean using separate instructions to pop `lr` and the rest of the callee-saved registers, which would offset the advantages of doing a tail call. Instead, this patch disables indirect tail calls when the called function takes four or more arguments and return address signing and authentication is enabled for the caller function, conservatively assuming the caller function would spill LR. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Ties Stuij Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D112429	2021-12-07 10:15:19 +00:00
Ties Stuij	5cff77c23f	[clang][ARM] PACBTI-M assembly support Introduce assembly support for Armv8.1-M PACBTI extension. This is an optional extension in v8.1-M. There are 10 new system registers and 5 new instructions, all predicated on the feature. The attribute for llvm-mc is called "pacbti". For armclang, an architecture extension also called "pacbti" was created. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Victor Campos - Ties Stuij Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D112420	2021-11-30 09:28:18 +00:00
Dmitry Preobrazhensky	91f4650ebb	[AMDGPU][MC][GFX10] Corrected global_atomic_fcmpswap* Corrected src data size of global_atomic_fcmpswap and global_atomic_fcmpswap_x2 opcodes. Differential Revision: https://reviews.llvm.org/D113746	2021-11-15 12:51:12 +03:00
Alexandros Lamprineas	8689f5e6e7	[AArch64] Add support for the 'R' architecture profile. This change introduces subtarget features to predicate certain instructions and system registers that are available only on 'A' profile targets. Those features are not present when targeting a generic CPU, which is the default processor. In other words the generic CPU now means the intersection of 'A' and 'R' profiles. To maintain backwards compatibility we enable the features that correspond to -march=armv8-a when the architecture is not explicitly specified on the command line. References: https://developer.arm.com/documentation/ddi0600/latest Differential Revision: https://reviews.llvm.org/D110065	2021-10-27 12:32:30 +01:00
Joe Nash	b4b7e605a6	[AMDGPU] Support shared literals in FMAMK/FMAAK These instructions should allow src0 to be a literal with the same value as the mandatory other literal. Enable it by introducing an operand that defers adding its value to the MI when decoding till the mandatory literal is parsed. Reviewed By: dp, foad Differential Revision: https://reviews.llvm.org/D111067 Change-Id: I22b0ae0d35bad17b6f976808e48bffe9a6af70b7	2021-10-11 13:09:54 -04:00
Dmitry Preobrazhensky	3500e7d2b0	[AMDGPU][MC][GFX7][GFX10] Corrected image_atomic_fcmpswap Differential Revision: https://reviews.llvm.org/D109616	2021-09-21 18:06:02 +03:00
Dmitry Preobrazhensky	b8e7f53208	[AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics Differential Revision: https://reviews.llvm.org/D109614	2021-09-21 16:23:20 +03:00
Victor Campos	79f9c79aaf	[AArch64][MC] Merge FeaturePMU into FeaturePerfMon FeaturePMU was created in AArch64 to accommodate one missing system register, PMMIR_EL1, in commit `ffcd7698ae`. However, the Performance Monitors extension already had a target feature, which is called FeaturePerfMon. Therefore, FeaturePMU is redundant. This patch removes FeaturePMU and merges its contents into FeaturePerfMon. Reviewed By: dnsampaio Differential Revision: https://reviews.llvm.org/D109246	2021-09-06 14:56:49 +01:00
Wang, Pengfei	ab40dbfe03	[X86] AVX512FP16 instructions enabling 6/6 Enable FP16 complex FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105269	2021-08-30 13:08:45 +08:00
Thomas Johnson	8c3886b0ec	[ARC] Add ADC (addition with carry) and SBC (subtraction with carry) instructions Differential Revision: https://reviews.llvm.org/D108672	2021-08-25 07:46:15 -07:00
Thomas Johnson	ce1dc9d647	[ARC] Add codegen for the readcyclecounter intrinsic along with disassembly for associated instructions Differential Revision: https://reviews.llvm.org/D108598	2021-08-24 11:53:20 -07:00
Wang, Pengfei	c728bd5bba	[X86] AVX512FP16 instructions enabling 5/6 Enable FP16 FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105268	2021-08-24 09:07:19 +08:00
Wang, Pengfei	b088536ce9	[X86] AVX512FP16 instructions enabling 4/6 Enable FP16 unary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105267	2021-08-22 08:59:35 +08:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Carl Ritson	99c790dc21	[AMDGPU] Make BVH isel consistent with other MIMG opcodes Suffix opcodes with _gfx10. Remove direct references to architecture specific opcodes. Add a BVH flag and apply this to disassembly. Fix a number of disassembly errors on gfx90a target caused by previous incorrect BVH detection code. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D108117	2021-08-17 10:42:22 +09:00
Craig Topper	b82ce77b2b	[X86] Support avx512fp16 compare instructions in the IntelInstPrinter. This enables printing of the mnemonics that contain the predicate in the Intel printer. This requires accounting for the memory size that is explicitly printed in Intel syntax. Those changes have been synced to the ATT printer as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108093	2021-08-16 12:31:36 +08:00

2021 Commits