Commit Graph

146 Commits

Author SHA1 Message Date
Abinav Puthan Purayil d96361d714 [AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map
This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.

Differential Revision: https://reviews.llvm.org/D128344
2022-07-18 10:07:13 +05:30
Dmitry Preobrazhensky 2a6532d542 [AMDGPU][MC][GFX11] Correct disassembly of *_e64_dpp opcodes which support op_sel
These opcodes cannot be disassembled because op_sel operand is missing - it must be added manually.
See https://github.com/llvm/llvm-project/issues/56512 for detailed issue analysis.

Differential Revision: https://reviews.llvm.org/D129637
2022-07-15 13:11:59 +03:00
Piotr Sobczak 4874838a63 [AMDGPU] gfx11 WMMA instruction support
gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756
2022-06-30 11:13:45 -04:00
Joe Nash 07b7fada73 [AMDGPU] gfx11 VOPD instructions MC support
VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.

A VOPD instruction is constituted of an X component (for which there are
13 possible opcodes) and a Y component (for which there are the 13 X
opcodes plus 3 more). Most of the complexity in defining and parsing
a VOPD operation arises from the possible different total numbers of
operands and deferred parsing of certain operands depending on the
constituent X and Y opcodes.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D128218
2022-06-24 11:08:39 -04:00
Dmitry Preobrazhensky dcb24f93af [AMDGPU][MC][GFX11] Correct disassembly of VOP3.DPP8 opcodes
Fix bug #56163.
Add W32/W64 tests for all VOP3.DPP opcodes.

Differential Revision: https://reviews.llvm.org/D128369
2022-06-23 13:07:45 +03:00
Dmitry Preobrazhensky 485e8b4f63 [AMDGPU][MC][GFX11] Correct disassembly of DPP variants of VOPC64 opcodes
Fix bugs https://github.com/llvm/llvm-project/issues/56091, https://github.com/llvm/llvm-project/issues/56065.

Differential Revision: https://reviews.llvm.org/D128075
2022-06-20 14:23:07 +03:00
Joe Nash be1082c6d5 [AMDGPU] gfx11 VOPC instructions
Supports encoding existing instrutions on gfx11 and MC support for the new VOPC
dpp instructions.

Patch 19/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126978

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126989
2022-06-09 15:22:42 -04:00
Joe Nash 40f35cef89 [AMDGPU] gfx11 VOP3P instruction MC support
Includes dpp versions of VOP3P instructions.

Patch 18/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126917

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126978
2022-06-08 13:32:01 -04:00
Joe Nash 086a9c1062 Reland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
The reverted dependent commit is now relanded, so reland this.
Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917
2022-06-08 11:10:57 -04:00
Joe Nash e243ead6fc Reland [AMDGPU] gfx11 vop3dpp instructions
There was an issue with encoding wide (>64 bit) instructions on
BigEndian hosts, which is fixed in D127195. Therefore reland this.

gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Differential Revision: https://reviews.llvm.org/D126483
2022-06-07 14:49:13 -04:00
Joe Nash eaed07eb7e Revert "[AMDGPU] gfx11 vop3dpp instructions"
This reverts commit 99a83b1286.
2022-06-06 17:12:09 -04:00
Joe Nash f617f89e5b Revert "[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support"
This reverts commit 6079804498.
2022-06-06 17:11:35 -04:00
Joe Nash 6079804498 [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917
2022-06-06 09:57:59 -04:00
Joe Nash 99a83b1286 [AMDGPU] gfx11 vop3dpp instructions
gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126475

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126483
2022-06-06 09:34:59 -04:00
Joe Nash e8860bee28 [AMDGPU] gfx11 Image instructions
MC layer support for instructions in the MIMG encoding(Image
instructions).

Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 13/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125992

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126463
2022-05-31 10:53:35 -04:00
Maksim Panchenko bed9efed71 [MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand()
MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter
to specify either the instruction size or the operand size depending on
the architecture. However, for proper symbolic disassembly on X86, we
need to know both sizes, as an instruction can have two operands, and
the instruction size cannot be reliably calculated based on the operand
offset and its size. Hence, split Size into OpSize and InstSize.

For X86, the new interface allows to fix a couple of issues:
  * Correctly adjust the value of PC-relative operands.
  * Set operand size to zero when the operand is specified implicitly.

Differential Revision: https://reviews.llvm.org/D126101
2022-05-25 13:44:32 -07:00
Joe Nash ef1ea5ac01 [AMDGPU] gfx11 vinterp instructions MC support
A new instruction encoding. Some of these instructions were previously VOP3
encoded.

Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 11/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125824

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D125989
2022-05-25 14:59:16 -04:00
Joe Nash 1a51ab766f [AMDGPU] gfx11 export instructions
Contributors:
Jay Foad <jay.foad@amd.com>
Dmitry Preobrazhensky <d-pre@mail.ru>

Patch 10/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125822

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D125824
2022-05-25 14:44:09 -04:00
Dmitry Preobrazhensky 32ca9bd7b5 [AMDGPU][MC][GFX940] Correct tied operand decoding for smfmac opcodes
Differential Revision: https://reviews.llvm.org/D125790
2022-05-18 15:39:30 +03:00
Joe Nash d21b9b4946 [AMDGPU] gfx11 scalar alu instructions
MC layer support for SOP(scalar alu operations) including encoding
support for s_delay_alu and s_sendmsg_rtn.

Contributors:
Jay Foad <jay.foad@amd.com>

Patch 7/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125319

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125498
2022-05-17 13:35:41 -04:00
Joe Nash c70259405c [AMDGPU] gfx11 BUF Instructions
Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.

Contributors:
Petar Avramovic <Petar.Avramovic@amd.com>
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

Depends on D125316

Patch 6/N for upstreaming of AMDGPU gfx11 architecture.

Reviewed By: dp, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D125319
2022-05-16 09:41:40 -04:00
Sheng c644488a8b Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h`
The name `MCFixedLenDisassembler.h` is out of date after D120958.

Rename it as `MCDecoderOps.h` to reflect the change.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D124987
2022-05-15 08:44:58 +08:00
Joe Nash a0a406b257 [AMDGPU] gfx11 Decode wider instructions. NFC
Refactor to pass a templatized size parameter to the decoder to allow wider than
64bit decodes in a later patch.

Contributors:
Jay Foad <jay.foad@amd.com>

Depends on D125261

Patch 5/N for upstreaming of AMDGPU gfx11 architecture.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D125316
2022-05-11 11:05:58 -04:00
Maksim Panchenko 4ae9745af1 [Disassember][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Stanislav Mekhanoshin 6e3e14f600 [AMDGPU] Support gfx940 smfmac instructions
Differential Revision: https://reviews.llvm.org/D122191
2022-03-24 12:40:42 -07:00
Stanislav Mekhanoshin 522b259976 [AMDGPU] Allow v_accvgpr_write to use SGPR src on gfx940
Differential Revision: https://reviews.llvm.org/D121843
2022-03-17 12:12:06 -07:00
serge-sans-paille ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Joe Nash b4b7e605a6 [AMDGPU] Support shared literals in FMAMK/FMAAK
These instructions should allow src0 to be a literal with the same
value as the mandatory other literal. Enable it by introducing an
operand that defers adding its value to the MI when decoding till
the mandatory literal is parsed.

Reviewed By: dp, foad

Differential Revision: https://reviews.llvm.org/D111067

Change-Id: I22b0ae0d35bad17b6f976808e48bffe9a6af70b7
2021-10-11 13:09:54 -04:00
Reid Kleckner 89b57061f7 Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454
2021-10-08 14:51:48 -07:00
Carl Ritson 99c790dc21 [AMDGPU] Make BVH isel consistent with other MIMG opcodes
Suffix opcodes with _gfx10.
Remove direct references to architecture specific opcodes.
Add a BVH flag and apply this to diassembly.
Fix a number of disassembly errors on gfx90a target caused by
previous incorrect BVH detection code.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D108117
2021-08-17 10:42:22 +09:00
Carl Ritson 6efb3220b4 [AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions
Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr.
Previously these were rounded up to VReg_256 this saves VGPRs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D103800
2021-07-22 10:42:15 +09:00
Carl Ritson f8816c7400 [AMDGPU] Add v5f32/VReg_160 support for MIMG instructions
Avoid having to round up to v8f32/VReg_256 when only 5 VGPRs are
required for a MIMG address operand.

Maintain _V8 instruction variants of pseudo instructions allowing
assembly prior to GFX10 to work as-is.  Currently the validator
can tell for GFX10 what the correct size is, so will disallow
oversize address registers.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103672
2021-06-08 11:11:40 +09:00
Stanislav Mekhanoshin 6fb02596a2 [AMDGPU] Add support for architected flat scratch
Add support for the readonly flat Scratch register initialized
by the SPI.

Differential Revision: https://reviews.llvm.org/D102432
2021-05-14 10:53:48 -07:00
David Stuttard 72d570ca08 [AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling
A16 support for image instructions assembly/disassembly (gfx10) was missing

Also refactor MIMG op addr size calcs to common function

We'd got 3 places where the same operation was being done.

One test is now marked XFAIL until a related codegen patch is in place

Differential Revision: https://reviews.llvm.org/D102231

Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb
2021-05-14 09:25:44 +01:00
Tim Renouf 8710eff6c3 [MC][AMDGPU][llvm-objdump] Synthesized local labels in disassembly
1. Add an accessor function to MCSymbolizer to retrieve addresses
   referenced by a symbolizable operand, but not resolved to a symbol.
   That way, the caller can synthesize labels at those addresses and
   then retry disassembling the section.

2. Implement that in AMDGPU -- a failed symbol lookup results in the
   address being added to a vector returned by the new function.

3. Use that in llvm-objdump when using MCSymbolizer (which only happens
   on AMDGPU) and SymbolizeOperands is on.

Differential Revision: https://reviews.llvm.org/D101145

Change-Id: I19087c3bbfece64bad5a56ee88bcc9110d83989e
2021-04-26 13:56:36 +01:00
Joe Nash 919236e608 [AMDGPU] NFC, Comment in disassembler for dpp8
Gives reasoning for convertDPP8.
Also corrects typo in Operand type comment.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D100665

Change-Id: I33ff269db8072d83e5e0ecdbfb731d6000fc26c4
2021-04-16 16:21:47 -04:00
Konstantin Zhuravlyov f4ace63737 AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
  - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
    - Add kernarg_size to kernel descriptor
    - Change trap handler ABI to no longer move queue pointer into s[0:1]
  - Cleanup ELF definitions
    - Add V2, V3, V4 suffixes to make a clear distinction for code object version
    - Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638
2021-03-24 11:54:05 -04:00
Stanislav Mekhanoshin 3bffb1cd0e [AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

Additional advantage that parser will accept these flags in any order unlike
now.

Differential Revision: https://reviews.llvm.org/D96469
2021-03-15 13:00:59 -07:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Kazu Hirata 49231c1f80 [llvm] Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2021-01-22 23:25:05 -08:00
Petar Avramovic 4ab704d628 [AMDGPU][MC] Add tfe disassembler support MIMG opcodes
With tfe on there can be a vgpr write to vdata+1.
Add tablegen support for 5 register vdata store.
This is required for 4 register vdata store with tfe.

Differential Revision: https://reviews.llvm.org/D94960
2021-01-20 10:37:09 +01:00
Jay Foad 18cb7441b6 [AMDGPU] Simpler names for arch-specific ttmp registers. NFC.
Rename the *_gfx9_gfx10 ttmp registers to *_gfx9plus for simplicity,
and use the corresponding isGFX9Plus predicate to decide when to use
them instead of the old *_vi versions.

Differential Revision: https://reviews.llvm.org/D94975
2021-01-19 18:47:14 +00:00
Kazu Hirata b934160aaa [Target] Use llvm::find_if (NFC) 2021-01-07 20:29:36 -08:00
dfukalov 6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Tony d5ea8f7010 [AMDGPU] Clarify scratch initialization
- Clarify documentation on initializing scratch.
- Rename compute_pgm_rsrc2 field for enabling scratch from
  ENABLE_SGPR_PRIVATE_SEGMENT_WAVEFRONT_OFFSET to
  ENABLE_PRIVATE_SEGMENT to match hardware definition.

Differential Revision: https://reviews.llvm.org/D93271
2020-12-15 20:14:20 +00:00
Jay Foad 4f87d30a06 [AMDGPU] Introduce and use isGFX10Plus. NFC.
It's more future-proof to use isGFX10Plus from the start, on the
assumption that future architectures will be based on current
architectures.

Also make use of the existing isGFX9Plus in a few places.

Differential Revision: https://reviews.llvm.org/D92092
2020-11-26 09:02:36 +00:00
Stanislav Mekhanoshin f738aee0bb [AMDGPU] Add default 1 glc operand to rtn atomics
This change adds a real glc operand to the return atomic
instead of just string " glc" in the middle of the asm
string.

Improves asm parser diagnostics.

Differential Revision: https://reviews.llvm.org/D90730
2020-11-05 10:41:59 -08:00
Jay Foad edc37baca6 [AMDGPU] Add MC layer support for v_fmac_legacy_f32
This instruction was introduced in GFX10.3, reusing the opcode of
v_mac_legacy_f32 from GFX10.1.

Differential Revision: https://reviews.llvm.org/D89247
2020-10-13 21:57:33 +01:00
Jay Foad 7238faa4ae [AMDGPU] Add patterns for mad/mac legacy f32 instructions
Note that all subtargets up to GFX10.1 have v_mad_legacy_f32, but GFX8/9
lack v_mac_legacy_f32. GFX10.3 has no mad/mac f32 instructions at all.

Differential Revision: https://reviews.llvm.org/D88890
2020-10-08 15:20:06 +01:00
Ronak Chauhan 528057c197 [AMDGPU] Support disassembly for AMDGPU kernel descriptors
Decode AMDGPU Kernel descriptors as assembler directives.

Reviewed By: scott.linder, jhenderson, kzhuravl

Differential Revision: https://reviews.llvm.org/D80713
2020-10-07 20:39:43 +05:30