Commit Graph

71 Commits

Author SHA1 Message Date
Jay Foad 77e63b25f9 [AMDGPU] Fix assertion failure on mad with negative immediate addend
Without this, the new test case would fail with:

AMDGPUInstPrinter.cpp:545: void llvm::AMDGPUInstPrinter::printImmediate64(uint64_t, const llvm::MCSubtargetInfo &, llvm::raw_ostream &): Assertion `isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882' failed.

Differential Revision: https://reviews.llvm.org/D128435
2022-06-27 09:49:20 +01:00
Joe Nash be1082c6d5 [AMDGPU] gfx11 VOPC instructions
Supports encoding existing instrutions on gfx11 and MC support for the new VOPC
dpp instructions.

Patch 19/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126978

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126989
2022-06-09 15:22:42 -04:00
Joe Nash 086a9c1062 Reland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
The reverted dependent commit is now relanded, so reland this.
Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917
2022-06-08 11:10:57 -04:00
Joe Nash e243ead6fc Reland [AMDGPU] gfx11 vop3dpp instructions
There was an issue with encoding wide (>64 bit) instructions on
BigEndian hosts, which is fixed in D127195. Therefore reland this.

gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Differential Revision: https://reviews.llvm.org/D126483
2022-06-07 14:49:13 -04:00
Joe Nash eaed07eb7e Revert "[AMDGPU] gfx11 vop3dpp instructions"
This reverts commit 99a83b1286.
2022-06-06 17:12:09 -04:00
Joe Nash f617f89e5b Revert "[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support"
This reverts commit 6079804498.
2022-06-06 17:11:35 -04:00
Joe Nash 6079804498 [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917
2022-06-06 09:57:59 -04:00
Joe Nash 99a83b1286 [AMDGPU] gfx11 vop3dpp instructions
gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126475

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126483
2022-06-06 09:34:59 -04:00
Joe Nash 835e09c4c3 [AMDGPU] gfx11 FLAT Instructions
MachineCode Support for FLAT type instructions

Contributors:
Sebastian Neubauer <sebastian.neubauer@amd.com>

Patch 12/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125989

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D125992
2022-05-25 15:29:39 -04:00
Joe Nash ef1ea5ac01 [AMDGPU] gfx11 vinterp instructions MC support
A new instruction encoding. Some of these instructions were previously VOP3
encoded.

Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 11/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125824

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D125989
2022-05-25 14:59:16 -04:00
Joe Nash 729467acef [AMDGPU] gfx11 LDSDIR instructions MC support
Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 8/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125498

Reviewed By: critson, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D125820
2022-05-19 10:08:47 -04:00
Joe Nash d21b9b4946 [AMDGPU] gfx11 scalar alu instructions
MC layer support for SOP(scalar alu operations) including encoding
support for s_delay_alu and s_sendmsg_rtn.

Contributors:
Jay Foad <jay.foad@amd.com>

Patch 7/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125319

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125498
2022-05-17 13:35:41 -04:00
Joe Nash c70259405c [AMDGPU] gfx11 BUF Instructions
Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.

Contributors:
Petar Avramovic <Petar.Avramovic@amd.com>
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

Depends on D125316

Patch 6/N for upstreaming of AMDGPU gfx11 architecture.

Reviewed By: dp, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D125319
2022-05-16 09:41:40 -04:00
Ivan Kosarev bf5fc0d603 [AMDGPU][NFC] Remove unused function.
Introduced in
https://reviews.llvm.org/rG229d5e669bbbe7ca38ad832627a9809405939f1b

and then became unused in
https://reviews.llvm.org/D19584

Reviewed By: foad, dp

Differential Revision: https://reviews.llvm.org/D125385
2022-05-12 08:52:06 +01:00
Ivan Kosarev 88f04bdbd8 [AMDGPU][GFX10] Support base+soffset+offset SMEM loads.
Also makes a step towards resolving
https://github.com/llvm/llvm-project/issues/38652

Reviewed By: foad, dp

Differential Revision: https://reviews.llvm.org/D125117
2022-05-10 16:17:14 +01:00
Dmitry Preobrazhensky 1f6aa90386 [AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand
Added the following helpers:

    depctr_hold_cnt(...)
    depctr_sa_sdst(...)
    depctr_va_vdst(...)
    depctr_va_sdst(...)
    depctr_va_ssrc(...)
    depctr_va_vcc(...)
    depctr_vm_vsrc(...)

Differential Revision: https://reviews.llvm.org/D123022
2022-04-07 17:03:44 +03:00
Dmitry Preobrazhensky 1d817a1448 [AMDGPU][MC][NFC] Refactored sendmsg(...) handling
Differential Revision: https://reviews.llvm.org/D121995
2022-03-21 15:37:30 +03:00
Stanislav Mekhanoshin 0a79e1f30a [AMDGPU] reuse blgp as neg in 2 mfma operations on gfx940
GFX940 repurposes BLGP as NEG only in DGEMM MFMA.

Differential Revision: https://reviews.llvm.org/D121745
2022-03-18 12:56:51 -07:00
Stanislav Mekhanoshin 8992b50e2f [AMDGPU] gfx940 uses new names for coherency bits
Differential Revision: https://reviews.llvm.org/D120855
2022-03-07 11:50:07 -08:00
Dmitry Preobrazhensky b5fb7e485e [AMDGPU][MC] Corrected disassembly of s_waitcnt
s_waitcnt with default expcnt, vmcnt and lgkmcnt values was disassembled without arguments.
See https://github.com/llvm/llvm-project/issues/52716

Differential Revision: https://reviews.llvm.org/D117305
2022-01-17 20:22:03 +03:00
Joe Nash b4b7e605a6 [AMDGPU] Support shared literals in FMAMK/FMAAK
These instructions should allow src0 to be a literal with the same
value as the mandatory other literal. Enable it by introducing an
operand that defers adding its value to the MI when decoding till
the mandatory literal is parsed.

Reviewed By: dp, foad

Differential Revision: https://reviews.llvm.org/D111067

Change-Id: I22b0ae0d35bad17b6f976808e48bffe9a6af70b7
2021-10-11 13:09:54 -04:00
Daniil Fukalov 48958d02d2 [NFC][AMDGPU] Reduce includes dependencies.
1. Splitted out some parts of R600 target to separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()`
   and `R600TargetMachine::getSubtargetImpl()` had different return value type
   than base class.
4. Minor forward declarations cleanup.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D108596
2021-08-25 12:01:55 +03:00
Stanislav Mekhanoshin 28f1d018b1 [AMDGPU] Fix 64 bit DPP validation
AMDGPUAsmParser::isSupportedDPPCtrl() was failing to correctly
find a DPP register operand, regadless of the position it is
always src0. Moved this check into a new validateDPP() method
where we have full instruction already. In particular it was
failing to reject this case:

v_cvt_u32_f64 v5, v[0:1] quad_perm:[0,2,1,1] row_mask:0xf bank_mask:0xf

Essentially it was broken for any case where size of dst and
src0 differ.

It also improves the diagnostics with a proper error message.

The check in the InstPrinter also drops verification of the dst
register as it does not have anything to do with the dpp operand.

Differential Revision: https://reviews.llvm.org/D101930
2021-05-06 08:40:26 -07:00
Dmitry Preobrazhensky 67b39661c8 [AMDGPU][MC][NFC] Removed extra spaces
Fixed bugs 49646, 49647.

Differential Revision: https://reviews.llvm.org/D100173
2021-04-12 13:33:19 +03:00
Sebastian Neubauer 36138db116 [AMDGPU] IsFlatScratch/Global -> FlatScratch/Global
Remove 'Is' from IsFlatScratch/Global. NFC

Differential Revision: https://reviews.llvm.org/D100108
2021-04-09 11:20:31 +02:00
Dmitry Preobrazhensky 0f5ebbcc7f [AMDGPU][MC] Added flag to identify VOP instructions which have a single variant
By convention, VOP1/2/C instructions which can be promoted to VOP3 have _e32 suffix while promoted instructions have _e64 suffix. Instructions which have a single variant should have no _e32/_e64 suffix. Unfortunately there was no simple way to identify single variant instructions - it was implemented by a hack. See bug https://bugs.llvm.org/show_bug.cgi?id=39086.

This fix simplifies handling of single VOP instructions by adding a dedicated flag.

Differential Revision: https://reviews.llvm.org/D99408
2021-04-01 13:53:12 +03:00
Stanislav Mekhanoshin 3bffb1cd0e [AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

Additional advantage that parser will accept these flags in any order unlike
now.

Differential Revision: https://reviews.llvm.org/D96469
2021-03-15 13:00:59 -07:00
Jay Foad 67f0620831 [AMDGPU] Update s_sendmsg messages
Update the list of s_sendmsg messages known to the assembler and
disassembler and validate the ones that were added or removed in gfx9
and gfx10.

Differential Revision: https://reviews.llvm.org/D97295
2021-02-24 13:07:00 +00:00
Dmitry Preobrazhensky 4813518092 [AMDGPU][MC] Corrected bound_ctrl for compatibility with sp3
Enabled "bound_ctrl:1" and disabled "bound_ctrl:-1" syntax.
Corrected printer to output "bound_ctrl:1" instead of "bound_ctrl:0".
See bug 35397 for detailed issue description.

Differential Revision: https://reviews.llvm.org/D97048
2021-02-22 14:59:40 +03:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Dan Gohman 698c6b0a09 [WebAssembly] Support single-floating-point immediate value
As mentioned in TODO comment, casting double to float causes NaNs to change bits.
To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode.

Patch by Yuta Saito.

Differential Revision: https://reviews.llvm.org/D77384
2021-02-04 18:05:06 -08:00
Dmitry Preobrazhensky 745064e36b [AMDGPU][MC] Refactored exp tgt handling
Summary:
- Separated tgt encoding from parsing;
- Separated tgt decoding from printing;
- Improved errors handling;
- Disabled leading zeroes in index. The following code is no longer accepted: exp pos00 v3, v2, v1, v0

Reviewers: arsenm, rampitec, foad

Differential Revision: https://reviews.llvm.org/D95216
2021-01-26 14:54:15 +03:00
Jay Foad 49dce85584 [AMDGPU] Simplify AMDGPUInstPrinter::printExpSrcN. NFC.
Change-Id: Idd7f47647bc0faa3ad6f61f44728c0f20540ec00
2021-01-19 10:39:56 +00:00
dfukalov 6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Jay Foad 4f87d30a06 [AMDGPU] Introduce and use isGFX10Plus. NFC.
It's more future-proof to use isGFX10Plus from the start, on the
assumption that future architectures will be based on current
architectures.

Also make use of the existing isGFX9Plus in a few places.

Differential Revision: https://reviews.llvm.org/D92092
2020-11-26 09:02:36 +00:00
Simon Pilgrim a4d3691d55 Fix MSVC signed/unsigned comparison warning. NFCI. 2020-11-13 10:20:48 +00:00
Jay Foad d7d6ac5624 [AMDGPU] Define and use names for export targets. NFC.
Differential Revision: https://reviews.llvm.org/D91289
2020-11-12 19:57:14 +00:00
Stanislav Mekhanoshin c9d6fe6f7d [AMDGPU] Improve FLAT scratch detection
We were useing too broad check for isFLATScratch() which also
includes FLAT global.

Differential Revision: https://reviews.llvm.org/D90505
2020-11-02 11:37:33 -08:00
Jay Foad 9cee87d72a [AMDGPU] Fix double space in disassembly of ds_gws_sema_* with gds
By setting up the AsmStrings correctly we can remove some special cases
from AMDGPUInstPrinter::printOffset.

Differential Revision: https://reviews.llvm.org/D90307
2020-10-29 17:31:59 +00:00
Jay Foad a442fad911 [AMDGPU] Fix double space in disassembly of s_set_gpr_idx_mode
Differential Revision: https://reviews.llvm.org/D90374
2020-10-29 14:54:33 +00:00
Jay Foad e9dd2c4fe2 [AMDGPU] Fix double space in disassembly of some DPP instructions
Differential Revision: https://reviews.llvm.org/D90373
2020-10-29 14:54:33 +00:00
Jay Foad 0ca4124798 [AMDGPU] Make more use of printNamedBit in AMDGPUInstPrinter. NFC. 2020-10-26 14:03:35 +00:00
Dmitry Preobrazhensky 6b8948922c [AMDGPU][MC] Added support of SP3 syntax for MTBUF format modifier
Currently supported LLVM MTBUF syntax is shown below. It is not compatible with SP3.

    op     dst, addr, rsrc, FORMAT, soffset

This change adds support for SP3 syntax:

    op     dst, addr, rsrc, soffset SP3FORMAT

In addition to being compatible with SP3, this syntax allows using symbolic names for data, numeric and unified formats. Below is a list of added syntax variants.

format:<expression>
format:[<numeric-format-name>,<data-format-name>]
format:[<data-format-name>,<numeric-format-name>]
format:[<data-format-name>]
format:[<numeric-format-name>]
format:[<unified-format-name>]

The last syntax variant is supported for GFX10 only.

See llvm bug 37738

Reviewers: arsenm, rampitec, vpykhtin

Differential Revision: https://reviews.llvm.org/D84026
2020-07-24 16:41:03 +03:00
Matt Arsenault 6f437117af AMDGPU: Don't assert on f16 inv2pi immediates pre-gfx8
v_cvt_f32_f16 can still accept this value as a literal constant. This
showed up in GlobalISel since it doesn't have constant folding for
G_FPEXT.
2020-07-22 13:59:03 -04:00
Dmitry Preobrazhensky 0b8fd77ad9 [AMDGPU][MC] Corrected decoding of 16-bit literals
16-bit literals are encoded as 32-bit values. If high 16-bits of the value is 0xFFFF, the decoded instruction cannot be reassembled.

For example, the following code

0xff,0x04,0x04,0x52,0xcd,0xab,0xff,0xff

was decoded as

v_mul_lo_u16_e32 v2, 0xffffabcd, v2

However this literal is actually a 64-bit constant 0x00000000ffffabcd which violates requirements described in the documentation - the truncation is not safe.

This change corrects decoding to make reassembly possible.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D84098
2020-07-22 17:20:43 +03:00
Dmitry Preobrazhensky e122eba185 [AMDGPU][MC] Corrected MTBUF parsing and decoding
MTBUF implementation has many issues and this change addresses most of these:
- refactored duplicated code;
- hardcoded constants moved out of high-level code;
- fixed a decoding error when nfmt or dfmt are zero (bug 36932);
- corrected parsing of operand separators (bug 46403);
- corrected handling of missing operands (bug 46404);
- corrected handling of out-of-range modifiers (bug 46421);
- corrected default value (bug 46467).

Reviewers: arsenm, rampitec, vpykhtin, artem.tamazov, kzhuravl

Differential Revision: https://reviews.llvm.org/D83760
2020-07-15 19:46:00 +03:00
Matt Arsenault 5f5f566b26 AMDGPU: Don't use 16-bit FP inline constants in integer operands
It seems to be a hardware defect that the half inline constants do not
work as expected for the 16-bit integer operations (the inverse does
work correctly). Experimentation seems to show these are really
reading the 32-bit inline constants, which can be observed by writing
inline asm using op_sel to see what's in the high half of the
constant. Theoretically we could fold the high halves of the 32-bit
constants using op_sel.

The *_asm_all.s MC tests are broken, and I don't know where the script
to autogenerate these are. I started manually fixing it, but there's
just too many cases to fix. This also does break the
assembler/disassembler support for these values, and I'm not sure what
to do about it. These are still valid encodings, so it seems like you
should be able to use them in some way. If you wrote assembly using
them, you could have really meant it (perhaps to read the high bits
with op_sel?). The disassembler will print the invalid literal
constant which will fail to re-assemble. The behavior is also
different depending on the use context. Consider this example, which
was previously accepted and encoded using the inline constant:

  v_mad_i16 v5, v1, -4.0, v3
  ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04]

In contexts where an inline immediate is required (such as on gfx8/9),
this will now be rejected. For gfx10, this will produce the literal
encoding and change the printed format:
  v_mad_i16 v5, v1, 0xc400, v3
  ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00]

This is just another variation of the issue that we don't perfectly
handle round trip assembly/disassembly due to not tracking how
immediates were encoded. This doesn't matter much in practice, since
compilers don't emit the suboptimal encoding. I doubt any users are
relying on this behavior (although I did make use of the old behavior
to figure out what was wrong).

Fixes bug 46302.
2020-06-17 19:14:10 -04:00
Simon Pilgrim b05b69e056 AMDGPUInstPrinter.cpp - add CommandLine.h include. NFC.
Fixes implicit dependency that will be exposed by a future patch.
2020-05-24 14:17:04 +01:00
Stanislav Mekhanoshin 54d6dfe996 [AMDGPU] Drop 16 bit subreg suffixes on print
We do not want to break asm syntax. These suffixes are
quite useful for debugging, so add an option to print
them. Right now it is NFC.

Differential Revision: https://reviews.llvm.org/D79435
2020-05-06 08:14:10 -07:00
Dmitry Preobrazhensky 5998baccb9 [AMDGPU][MC][GFX9+] Enabled 21-bit signed offsets for SMEM instructions
Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D79288
2020-05-06 14:13:10 +03:00