Commit Graph

618 Commits

Author SHA1 Message Date
Ivan Kosarev ad1d60c3be [FileCheck] Catch missspelled directives.
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125604
2022-05-26 11:37:19 +01:00
Joe Nash 835e09c4c3 [AMDGPU] gfx11 FLAT Instructions
MachineCode Support for FLAT type instructions

Contributors:
Sebastian Neubauer <sebastian.neubauer@amd.com>

Patch 12/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125989

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D125992
2022-05-25 15:29:39 -04:00
Joe Nash ef1ea5ac01 [AMDGPU] gfx11 vinterp instructions MC support
A new instruction encoding. Some of these instructions were previously VOP3
encoded.

Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 11/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125824

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D125989
2022-05-25 14:59:16 -04:00
Joe Nash 1a51ab766f [AMDGPU] gfx11 export instructions
Contributors:
Jay Foad <jay.foad@amd.com>
Dmitry Preobrazhensky <d-pre@mail.ru>

Patch 10/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125822

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D125824
2022-05-25 14:44:09 -04:00
Ivan Kosarev 1586e1dc95 [AMDGPU][MC][GFX11] Support base+soffset+offset SMEM loads.
Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D126207
2022-05-24 15:13:14 +01:00
Dmitry Preobrazhensky 818cc9b285 [AMDGPU][MC][GFX940] Disable v_mac_f32_dpp
Differential Revision: https://reviews.llvm.org/D126070
2022-05-23 15:49:44 +03:00
Dmitry Preobrazhensky f598dfb3bf [AMDGPU][MC][GFX8+] Correct SMEM offset parsing
Differential Revision: https://reviews.llvm.org/D125907
2022-05-20 14:00:34 +03:00
Joe Nash ac2ff258d6 [AMDGPU] gfx11 scalar memory instructions
Contributors:
Mirko Brkusanin <Mirko.Brkusanin@amd.com>

Patch 9/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125820

Reviewed By: kosarev, #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125822
2022-05-19 10:27:47 -04:00
Joe Nash 729467acef [AMDGPU] gfx11 LDSDIR instructions MC support
Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 8/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125498

Reviewed By: critson, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D125820
2022-05-19 10:08:47 -04:00
Dmitry Preobrazhensky 44673278e0 [AMDGPU][MC][GFX940] Add SMFMAC aliases
Differential Revision: https://reviews.llvm.org/D125888
2022-05-19 13:40:48 +03:00
Dmitry Preobrazhensky 169416c64a [AMDGPU][MC][GFX7] Disable cache policy modifiers with SMRD
Differential Revision: https://reviews.llvm.org/D125799
2022-05-18 15:17:49 +03:00
Ivan Kosarev 140ad30b24 [AMDGPU][MC][GFX10] Add missing s_scratch_load tests.
Completes
https://reviews.llvm.org/D125117

Reviewed By: dp, arsenm

Differential Revision: https://reviews.llvm.org/D125753
2022-05-18 11:11:10 +01:00
Stanislav Mekhanoshin a09af86693 [AMDGPU] Enable FLAT LDS DMA on gfx9/10 before gfx940
We always had global and scratch loads to LDS in the gfx9,
but did not handle it. These were available via the 'lds'
encoding bit. In gfx940 this bit was reused as 'svs' which
resulted in new '_lds' opcodes effectively pushing this
bit into the opcode, but functionally it is the same. These
instructions are also available on gfx10.

Differential Revision: https://reviews.llvm.org/D125126
2022-05-17 12:16:37 -07:00
Joe Nash d21b9b4946 [AMDGPU] gfx11 scalar alu instructions
MC layer support for SOP(scalar alu operations) including encoding
support for s_delay_alu and s_sendmsg_rtn.

Contributors:
Jay Foad <jay.foad@amd.com>

Patch 7/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125319

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125498
2022-05-17 13:35:41 -04:00
Stanislav Mekhanoshin 332b73fe12 [AMDGPU] Revert wide LDS DMA support.
This reverts ffbee7acdc, see also bug 37653 which it was fixing.
The bug claims this is an undocumented feature which actually works.
In the reality it is documented as not working for a good reason.
It likely does something, but it is useless anyway. These instructions
write into the LDS. The LDS address is:

M0 + inst_offset + (TIDinWave * 4).

For a store wider than a DWORD neighboring lanes will overwrite each
other.

Differential Revision: https://reviews.llvm.org/D125409
2022-05-16 11:23:35 -07:00
Joe Nash c70259405c [AMDGPU] gfx11 BUF Instructions
Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.

Contributors:
Petar Avramovic <Petar.Avramovic@amd.com>
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

Depends on D125316

Patch 6/N for upstreaming of AMDGPU gfx11 architecture.

Reviewed By: dp, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D125319
2022-05-16 09:41:40 -04:00
Ivan Kosarev cb67b2ccc4 [AMDGPU][GFX10] Support base+soffset+offset SMEM stores.
Also makes another step towards resolving
https://github.com/llvm/llvm-project/issues/38652

Reviewed By: foad, dp

Differential Revision: https://reviews.llvm.org/D125380
2022-05-12 08:48:05 +01:00
Ivan Kosarev 88f04bdbd8 [AMDGPU][GFX10] Support base+soffset+offset SMEM loads.
Also makes a step towards resolving
https://github.com/llvm/llvm-project/issues/38652

Reviewed By: foad, dp

Differential Revision: https://reviews.llvm.org/D125117
2022-05-10 16:17:14 +01:00
Stanislav Mekhanoshin 00d84a9f92 [AMDGPU] Remove vdata from buffer to lds load
Differential Revision: https://reviews.llvm.org/D124485
2022-04-26 17:16:26 -07:00
Dmitry Preobrazhensky 81af32b9a3 [AMDGPU][MC][NFC][GFX940] Corrected an error position
Differential Revision: https://reviews.llvm.org/D124099
2022-04-21 14:04:46 +03:00
Dmitry Preobrazhensky b4231ac4be [AMDGPU][GFX90A+] Disabled ds_ordered_count and exp
Differential Revision: https://reviews.llvm.org/D124087
2022-04-21 13:16:44 +03:00
Dmitry Preobrazhensky e01dbabdd1 [AMDGPU][MC] Corrected error message "image data size does not match dmask and tfe"
Differential Revision: https://reviews.llvm.org/D123929
2022-04-19 13:52:58 +03:00
Dmitry Preobrazhensky 5c0bf1303e [AMDGPU][MC][GFX10] Removed unsupported 64bit DPP opcodes
Removed 64bit DPP opcodes from asm matcher tables.

Differential Revision: https://reviews.llvm.org/D123611
2022-04-13 14:43:40 +03:00
Dmitry Preobrazhensky ab18e1a533 [AMDGPU][GFX10] Enabled op_sel for v_add_nc_u16 and v_sub_nc_u16
Differential Revision: https://reviews.llvm.org/D123594
2022-04-13 13:48:42 +03:00
Anshil Gandhi 528aa09010 [AMDGPU][Codegen] Unsupported image sample texture map instructions
Disables image_sample_*_g16 instructions on architectures lacking g16 support. This patch fixes the issue 54672.

Differential Revision: https://reviews.llvm.org/D123461
2022-04-12 10:38:59 -06:00
Dmitry Preobrazhensky 1f6aa90386 [AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand
Added the following helpers:

    depctr_hold_cnt(...)
    depctr_sa_sdst(...)
    depctr_va_vdst(...)
    depctr_va_sdst(...)
    depctr_va_ssrc(...)
    depctr_va_vcc(...)
    depctr_vm_vsrc(...)

Differential Revision: https://reviews.llvm.org/D123022
2022-04-07 17:03:44 +03:00
Stanislav Mekhanoshin 6e3e14f600 [AMDGPU] Support gfx940 smfmac instructions
Differential Revision: https://reviews.llvm.org/D122191
2022-03-24 12:40:42 -07:00
Stanislav Mekhanoshin 27439a7642 [AMDGPU] New gfx940 mfma instructions
Differential Revision: https://reviews.llvm.org/D122044
2022-03-24 12:12:52 -07:00
Stanislav Mekhanoshin 72c1a0d9c2 [AMDGPU] Allow v_accvgpr_write to use SGPR on gfx90a
This is undocumented, but it should work.

Differential Revision: https://reviews.llvm.org/D122252
2022-03-22 13:52:29 -07:00
Dmitry Preobrazhensky 1d817a1448 [AMDGPU][MC][NFC] Refactored sendmsg(...) handling
Differential Revision: https://reviews.llvm.org/D121995
2022-03-21 15:37:30 +03:00
Stanislav Mekhanoshin 43c4d915a3 [AMDGPU] Added gfx940 mfma dst constraint test. NFC. 2022-03-18 13:34:35 -07:00
Stanislav Mekhanoshin 4570527e72 [AMDGPU] Disable some MFMA instructions on gfx940
Differential Revision: https://reviews.llvm.org/D121956
2022-03-18 13:19:12 -07:00
Stanislav Mekhanoshin 0a79e1f30a [AMDGPU] reuse blgp as neg in 2 mfma operations on gfx940
GFX940 repurposes BLGP as NEG only in DGEMM MFMA.

Differential Revision: https://reviews.llvm.org/D121745
2022-03-18 12:56:51 -07:00
Stanislav Mekhanoshin d9ac55fab2 [AMDGPU] New MFMA names for existing instructions
Old names are supported as aliases.
_1k MFMA got new opcodes.

Differential Revision: https://reviews.llvm.org/D121741
2022-03-17 13:05:36 -07:00
Stanislav Mekhanoshin 522b259976 [AMDGPU] Allow v_accvgpr_write to use SGPR src on gfx940
Differential Revision: https://reviews.llvm.org/D121843
2022-03-17 12:12:06 -07:00
Stanislav Mekhanoshin c4500de255 [AMDGPU] gfx940: disable OP_SEL on V_DOT instructions
Differential Revision: https://reviews.llvm.org/D121634
2022-03-14 17:02:00 -07:00
Stanislav Mekhanoshin 8dd3d1cf1f [AMDGPU] Add symbolic names for gfx940 HWREGs
The namespaces of HWREGs is now overlapping with gfx10. Thus the
patch is longer than necessary to just support new names. It also
need to handle proper error messages, i.e. to issue a "specified
hardware register is not supported on this GPU" message.

This may need a major refactoring in the future.

Differential Revision: https://reviews.llvm.org/D121418
2022-03-14 16:13:33 -07:00
Stanislav Mekhanoshin 23499103f7 [AMDGPU] Support for gfx940 flat lds opcodes
Differential Revision: https://reviews.llvm.org/D121414
2022-03-14 15:46:19 -07:00
Stanislav Mekhanoshin 1f53f20fc1 [AMDGPU] Support gfx940 v_lshl_add_u64 instruction
Differential Revision: https://reviews.llvm.org/D121401
2022-03-14 15:45:42 -07:00
Stanislav Mekhanoshin 36fe3f13a9 [AMDGPU] flat scratch SVS addressing mode for gfx940
Both VADDR and SADDR are used in SVS mode.

Differential Revision: https://reviews.llvm.org/D121254
2022-03-14 15:23:36 -07:00
Stanislav Mekhanoshin 6181458662 [AMDGPU] gfx940 MUBUF format changes
Differential Revision: https://reviews.llvm.org/D121234
2022-03-11 11:36:49 -08:00
Jacob Lambert 5160447f58 [AMDGPU] Add gfx10 assembler directive to specify shared VGPR count
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D105507
2022-03-07 14:27:41 -08:00
Stanislav Mekhanoshin 932f628121 [AMDGPU] new gfx940 fp atomics
Differential Revision: https://reviews.llvm.org/D121028
2022-03-07 12:32:02 -08:00
Stanislav Mekhanoshin e7b362d75d [AMDGPU] Add v_mov_b64 gfx940 opcode
Differential Revision: https://reviews.llvm.org/D121023
2022-03-07 12:07:12 -08:00
Stanislav Mekhanoshin 8992b50e2f [AMDGPU] gfx940 uses new names for coherency bits
Differential Revision: https://reviews.llvm.org/D120855
2022-03-07 11:50:07 -08:00
Stanislav Mekhanoshin 2c830c8fab [AMDGPU] gfx940: support V_FMAMK_F32 and V_FMAAK_F32
Differential Revision: https://reviews.llvm.org/D120769
2022-03-07 11:31:01 -08:00
Stanislav Mekhanoshin 13256caf02 [AMDGPU] Added hsa-gfx90a-v3.s test. NFC. 2022-03-03 11:17:50 -08:00
Stanislav Mekhanoshin f769899115 [AMDGPU] Add test for instructions unsupported on gfx940. NFC. 2022-03-03 11:01:29 -08:00
Stanislav Mekhanoshin b05918f2f7 [AMDGPU] Removed XFAIL from hsa-gfx940-v3.s. NFC.
Handling of big endian was fixed in D88858.
2022-03-02 15:38:14 -08:00
Aakanksha 840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin 35ec58d8c0 [AMDGPU] gfx940 removes all image instructions
Differential Revision: https://reviews.llvm.org/D120763
2022-03-02 13:55:26 -08:00
Stanislav Mekhanoshin 2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Jacob Lambert 7470244475 [AMDGPU] Add agpr_count to metadata and AsmParser
gfx90a allows the number of ACC registers (AGPRs) to be set
independently to the VGPR registers. For both HSA and PAL metadata, we
now include an "agpr_count" key to report the number of AGPRs set for
supported devices (gfx90a, gfx908, as determined by hasMAIInsts()).
This is collected from SIProgramInfo.NumAccVGPR for both HSA and PAL.
The AsmParser also now recognizes ".kernel.agpr_count" for supported
devices.

Differential Revision: https://reviews.llvm.org/D116140
2022-02-16 15:17:23 -08:00
Dmitry Preobrazhensky 6655c5a6bb [AMDGPU][MC][GFX10] Added an alias for HW_REG_HW_ID1
Enabled HW_REG_HW_ID as an alias for HW_REG_HW_ID1. This is required for compatibility with existing code.

Differential Revision: https://reviews.llvm.org/D119939
2022-02-16 19:45:44 +03:00
Stanislav Mekhanoshin d3b87e4a1c [AMDGPU] HWRegs TMA and TBA also supported on gfx9
Differential Revision: https://reviews.llvm.org/D118860
2022-02-03 09:36:10 -08:00
Fangrui Song 426437d1fe [MC] Add MCAsmParser::parseRParen to improve consistency and simplify code
Some diagnostics are more verbose but they don't seem to be more useful than
simple `expected ')'`
2022-01-27 00:37:49 -08:00
Stanislav Mekhanoshin 409c4436f9 [AMDGPU] Validate dst and src2 non-overlapping restriction in asm
Differential Revision: https://reviews.llvm.org/D118089
2022-01-26 15:14:06 -08:00
Matt Arsenault e6564f39c7 AMDGPU: Emit user sgpr count directives in text asm
We were emitting these in the object file but not printing them.
2022-01-26 13:51:12 -05:00
Dmitry Preobrazhensky c7ca4c6365 [AMDGPU][GFX10][MC] Updated symbolic names of internal HW registers
GFX10 no longer support HW_ID. It has been replaced with HW_ID1 and HW_ID2.
See bug 52904: https://github.com/llvm/llvm-project/issues/52904

Differential Revision: https://reviews.llvm.org/D117313
2022-01-17 20:29:10 +03:00
Dmitry Preobrazhensky b5fb7e485e [AMDGPU][MC] Corrected disassembly of s_waitcnt
s_waitcnt with default expcnt, vmcnt and lgkmcnt values was disassembled without arguments.
See https://github.com/llvm/llvm-project/issues/52716

Differential Revision: https://reviews.llvm.org/D117305
2022-01-17 20:22:03 +03:00
Dmitry Preobrazhensky 91f4650ebb [AMDGPU][MC][GFX10] Corrected global_atomic_fcmpswap*
Corrected src data size of global_atomic_fcmpswap and global_atomic_fcmpswap_x2 opcodes.

Differential Revision: https://reviews.llvm.org/D113746
2021-11-15 12:51:12 +03:00
Joe Nash b4b7e605a6 [AMDGPU] Support shared literals in FMAMK/FMAAK
These instructions should allow src0 to be a literal with the same
value as the mandatory other literal. Enable it by introducing an
operand that defers adding its value to the MI when decoding till
the mandatory literal is parsed.

Reviewed By: dp, foad

Differential Revision: https://reviews.llvm.org/D111067

Change-Id: I22b0ae0d35bad17b6f976808e48bffe9a6af70b7
2021-10-11 13:09:54 -04:00
Qiu Chaofan 573531fb1f Fix typo of colon to semicolon in lit tests 2021-10-09 10:03:50 +08:00
Dmitry Preobrazhensky 3500e7d2b0 [AMDGPU][MC][GFX7][GFX10] Corrected image_atomic_fcmpswap
Differential Revision: https://reviews.llvm.org/D109616
2021-09-21 18:06:02 +03:00
Dmitry Preobrazhensky b8e7f53208 [AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics
Differential Revision: https://reviews.llvm.org/D109614
2021-09-21 16:23:20 +03:00
Michael Liao b0402a35fc [amdgpu] Add 64-bit PC support when expanding unconditional branches.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106445
2021-07-26 14:50:30 -04:00
Carl Ritson 6efb3220b4 [AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions
Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr.
Previously these were rounded up to VReg_256 this saves VGPRs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D103800
2021-07-22 10:42:15 +09:00
Fangrui Song aa3df8ddcd [test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers) 2021-07-15 10:26:21 -07:00
Hafiz Abid Qadeer b205f2bb89 [AMDGPU] Handle s_branch to another section.
Currently, if target of s_branch instruction is in another section, it will fail with the error of undefined label.  Although in this case, the label is not undefined but present in another section. This patch tries to handle this issue. So while handling fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned.

This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get an crash for this scenario. The crash is fixed now but the we still get an undefined label error.  Jumps to other section can arise with hold/cold splitting.

A patch to handle the relocation in lld will follow shortly.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105760
2021-07-13 12:17:47 +01:00
Fangrui Song a9854045f6 [test] Change -t to --syms and -s to -S for llvm-readobj RUN lines
-s and -t will be changed to improve consistency with llvm-readelf.
The inconsistency issue regularly contributes to confusion using the two tools.
2021-06-29 11:50:31 -07:00
Aakanksha Patil 3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Brendon Cahoon 294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Brendon Cahoon 211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Brendon Cahoon ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Carl Ritson c8bbfb8cf5 [AMDGPU] Allow oversize vaddr in GFX10 MIMG assembly
As a follow up to D103672, we should allow vaddr to be larger than
required when assembling GFX10 MIMG instructions.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D103733
2021-06-08 11:57:07 +09:00
Carl Ritson f8816c7400 [AMDGPU] Add v5f32/VReg_160 support for MIMG instructions
Avoid having to round up to v8f32/VReg_256 when only 5 VGPRs are
required for a MIMG address operand.

Maintain _V8 instruction variants of pseudo instructions allowing
assembly prior to GFX10 to work as-is.  Currently the validator
can tell for GFX10 what the correct size is, so will disallow
oversize address registers.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103672
2021-06-08 11:11:40 +09:00
Jay Foad 9e9edede18 [AMDGPU] Fix MC tests for v_fmaak_f16 and v_fmamk_f16
This looks like a mistake when the tests were committed in r363946.
There were two sets of tests for the f32 variant of these instructions,
instead of one set for f16 and one set for f32.

Differential Revision: https://reviews.llvm.org/D103699
2021-06-07 10:42:52 +01:00
Dmitry Preobrazhensky 13c6568c6e [AMDGPU][MC][GFX90A] Corrected DS_GWS opcodes
Corrected DS_GWS opcodes to use even aligned registers.

Differential Revision: https://reviews.llvm.org/D103185
2021-05-26 21:31:50 +03:00
David Stuttard 72d570ca08 [AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling
A16 support for image instructions assembly/disassembly (gfx10) was missing

Also refactor MIMG op addr size calcs to common function

We'd got 3 places where the same operation was being done.

One test is now marked XFAIL until a related codegen patch is in place

Differential Revision: https://reviews.llvm.org/D102231

Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb
2021-05-14 09:25:44 +01:00
Aakanksha Patil 464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00
Stanislav Mekhanoshin 28f1d018b1 [AMDGPU] Fix 64 bit DPP validation
AMDGPUAsmParser::isSupportedDPPCtrl() was failing to correctly
find a DPP register operand, regadless of the position it is
always src0. Moved this check into a new validateDPP() method
where we have full instruction already. In particular it was
failing to reject this case:

v_cvt_u32_f64 v5, v[0:1] quad_perm:[0,2,1,1] row_mask:0xf bank_mask:0xf

Essentially it was broken for any case where size of dst and
src0 differ.

It also improves the diagnostics with a proper error message.

The check in the InstPrinter also drops verification of the dst
register as it does not have anything to do with the dpp operand.

Differential Revision: https://reviews.llvm.org/D101930
2021-05-06 08:40:26 -07:00
David Stuttard 8f2948731e [AMDGPU][AsmParser] Correct the order of optional operands to mimg
Ordering of operands was incorrect meaning that a16 operand was treated as tfe

Differential Revision: https://reviews.llvm.org/D101618

Change-Id: I3b15e71ef5ff625f19f52823414ab684d76aca33
2021-05-04 13:14:48 +01:00
Dmitry Preobrazhensky bcc29e0fcf [AMDGPU][MC] Corrected parsing of carry in/out operands in VOP3
Disabled constants as carry in/out operands. See bug 48711.

Differential Revision: https://reviews.llvm.org/D100642
2021-04-19 13:42:31 +03:00
Stanislav Mekhanoshin 37878de503 Disable use of SCC bit from asm
Differential Revision: https://reviews.llvm.org/D100069
2021-04-07 15:32:17 -07:00
Dmitry Preobrazhensky cd953434f2 [AMDGPU][MC][GFX10][GFX90A] Corrected _e32/_e64 suffices
Fixed bugs https://bugs.llvm.org//show_bug.cgi?id=49643, https://bugs.llvm.org//show_bug.cgi?id=49644, https://bugs.llvm.org//show_bug.cgi?id=49645.

Differential Revision: https://reviews.llvm.org/D99413
2021-04-01 14:21:00 +03:00
Konstantin Zhuravlyov f4ace63737 AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
  - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
    - Add kernarg_size to kernel descriptor
    - Change trap handler ABI to no longer move queue pointer into s[0:1]
  - Cleanup ELF definitions
    - Add V2, V3, V4 suffixes to make a clear distinction for code object version
    - Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638
2021-03-24 11:54:05 -04:00
Matt Arsenault a0f5aad6d7 AMDGPU: Fix allowing immediates for tail call pseudo.
The pseudo was using SSrc_b64, so it allowed folding immediates into
the destination operand for a tail call to null. However, this is not
a valid operand for the s_setpc_b64 this will be lowered to. Avoids
printing the operand as an invalid immediate.

Avoids a regression when tail calls are enabled in GlobalISel (somehow
tail calls to null get deleted in the DAG).
2021-03-21 13:14:04 -04:00
Stanislav Mekhanoshin 961e4384f4 [AMDGPU] Support SCC on buffer atomics
Differential Revision: https://reviews.llvm.org/D98731
2021-03-18 09:56:14 -07:00
Dmitry Preobrazhensky 596db9934b [AMDGPU][MC] Disabled lds_direct for GFX90a
Fixed bug 49382.

Differential Revision: https://reviews.llvm.org/D98626
2021-03-16 13:52:36 +03:00
Stanislav Mekhanoshin 3bffb1cd0e [AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

Additional advantage that parser will accept these flags in any order unlike
now.

Differential Revision: https://reviews.llvm.org/D96469
2021-03-15 13:00:59 -07:00
Carl Ritson c07f2025e4 [AMDGPU] Restrict image_msaa_load to MSAA dimension types
This instruction is only valid on 2D MSAA and 2D MSAA Array
surfaces.  Remove intrinsic support for other dimension types,
and block assembly for unsupported dimensions.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D98397
2021-03-12 09:47:24 +09:00
Stanislav Mekhanoshin 9931b1f7a4 [AMDGPU] Disable SCC bit on fp atomics
Differential Revision: https://reviews.llvm.org/D98221
2021-03-10 12:36:09 -08:00
Stanislav Mekhanoshin e08f278f5b [AMDGPU] Cleanup test checks. NFC. 2021-03-08 16:05:25 -08:00
Jay Foad 99682bc039 Revert "Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030""
This reverts commit e58d68fcd0.

This reinstates commit fc28f600e5
with a fix to initialize HasShaderCyclesRegister. See
https://reviews.llvm.org/D97928.
2021-03-06 09:00:01 +00:00
Mitch Phillips e58d68fcd0 Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030"
Broke the ASan/MSan buildbots. See more comments in the original patch,
https://reviews.llvm.org/D97928.

Build failure at http://lab.llvm.org:8011/#/builders/5/builds/5327

This reverts commit fc28f600e5.
2021-03-05 18:24:59 -08:00
Jay Foad fc28f600e5 [AMDGPU] Restore the s_memtime instruction in gfx1030
gfx1030 added a new way to implement readcyclecounter using the
SHADER_CYCLES hardware register, but the s_memtime instruction still
exists, so the MC layer should still accept it and the
llvm.amdgcn.s.memtime intrinsic should still work.

Differential Revision: https://reviews.llvm.org/D97928
2021-03-05 20:19:11 +00:00
Joe Nash 5531f24cc2 [AMDGPU] Make OMod explicit for V_CVT_{U,I}*
Make OMod explicit instead of implied by HasModifiers in the
operand list. Requires explicitly setting HasOMod=1 for
irregular OMod usage in instruction V_CVT_{U,I}*

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D97587

Change-Id: I230e1476f529e816eec60e242531f23a99e3839f
2021-03-02 13:32:06 -05:00
Dmitry Preobrazhensky 28f164bca7 [AMDGPU][MC][GFX9+] Corrected encoding of op_sel_hi for unused operands in VOP3P
Corrected encoding of VOP3P op_sel_hi for unused operands. See bug 49363.

Differential Revision: https://reviews.llvm.org/D97689
2021-03-02 13:02:25 +03:00
Jay Foad aab709f090 [AMDGPU] Add more PAL metadata register names
Add all the registers that are currently used by
LLPC: https://github.com/GPUOpen-Drivers/llpc

This only affects disassembly of PAL metadata generated by LLPC and
similar frontends.

Differential Revision: https://reviews.llvm.org/D95619
2021-02-24 13:37:05 +00:00
Jay Foad 67f0620831 [AMDGPU] Update s_sendmsg messages
Update the list of s_sendmsg messages known to the assembler and
disassembler and validate the ones that were added or removed in gfx9
and gfx10.

Differential Revision: https://reviews.llvm.org/D97295
2021-02-24 13:07:00 +00:00