Commit Graph

618 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin 35ec58d8c0 [AMDGPU] gfx940 removes all image instructions
Differential Revision: https://reviews.llvm.org/D120763
2022-03-02 13:55:26 -08:00
Stanislav Mekhanoshin 2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Jacob Lambert 7470244475 [AMDGPU] Add agpr_count to metadata and AsmParser
gfx90a allows the number of ACC registers (AGPRs) to be set
independently to the VGPR registers. For both HSA and PAL metadata, we
now include an "agpr_count" key to report the number of AGPRs set for
supported devices (gfx90a, gfx908, as determined by hasMAIInsts()).
This is collected from SIProgramInfo.NumAccVGPR for both HSA and PAL.
The AsmParser also now recognizes ".kernel.agpr_count" for supported
devices.

Differential Revision: https://reviews.llvm.org/D116140
2022-02-16 15:17:23 -08:00
Dmitry Preobrazhensky 6655c5a6bb [AMDGPU][MC][GFX10] Added an alias for HW_REG_HW_ID1
Enabled HW_REG_HW_ID as an alias for HW_REG_HW_ID1. This is required for compatibility with existing code.

Differential Revision: https://reviews.llvm.org/D119939
2022-02-16 19:45:44 +03:00
Stanislav Mekhanoshin d3b87e4a1c [AMDGPU] HWRegs TMA and TBA also supported on gfx9
Differential Revision: https://reviews.llvm.org/D118860
2022-02-03 09:36:10 -08:00
Fangrui Song 426437d1fe [MC] Add MCAsmParser::parseRParen to improve consistency and simplify code
Some diagnostics are more verbose but they don't seem to be more useful than
simple `expected ')'`
2022-01-27 00:37:49 -08:00
Stanislav Mekhanoshin 409c4436f9 [AMDGPU] Validate dst and src2 non-overlapping restriction in asm
Differential Revision: https://reviews.llvm.org/D118089
2022-01-26 15:14:06 -08:00
Matt Arsenault e6564f39c7 AMDGPU: Emit user sgpr count directives in text asm
We were emitting these in the object file but not printing them.
2022-01-26 13:51:12 -05:00
Dmitry Preobrazhensky c7ca4c6365 [AMDGPU][GFX10][MC] Updated symbolic names of internal HW registers
GFX10 no longer support HW_ID. It has been replaced with HW_ID1 and HW_ID2.
See bug 52904: https://github.com/llvm/llvm-project/issues/52904

Differential Revision: https://reviews.llvm.org/D117313
2022-01-17 20:29:10 +03:00
Dmitry Preobrazhensky b5fb7e485e [AMDGPU][MC] Corrected disassembly of s_waitcnt
s_waitcnt with default expcnt, vmcnt and lgkmcnt values was disassembled without arguments.
See https://github.com/llvm/llvm-project/issues/52716

Differential Revision: https://reviews.llvm.org/D117305
2022-01-17 20:22:03 +03:00
Dmitry Preobrazhensky 91f4650ebb [AMDGPU][MC][GFX10] Corrected global_atomic_fcmpswap*
Corrected src data size of global_atomic_fcmpswap and global_atomic_fcmpswap_x2 opcodes.

Differential Revision: https://reviews.llvm.org/D113746
2021-11-15 12:51:12 +03:00
Joe Nash b4b7e605a6 [AMDGPU] Support shared literals in FMAMK/FMAAK
These instructions should allow src0 to be a literal with the same
value as the mandatory other literal. Enable it by introducing an
operand that defers adding its value to the MI when decoding till
the mandatory literal is parsed.

Reviewed By: dp, foad

Differential Revision: https://reviews.llvm.org/D111067

Change-Id: I22b0ae0d35bad17b6f976808e48bffe9a6af70b7
2021-10-11 13:09:54 -04:00
Qiu Chaofan 573531fb1f Fix typo of colon to semicolon in lit tests 2021-10-09 10:03:50 +08:00
Dmitry Preobrazhensky 3500e7d2b0 [AMDGPU][MC][GFX7][GFX10] Corrected image_atomic_fcmpswap
Differential Revision: https://reviews.llvm.org/D109616
2021-09-21 18:06:02 +03:00
Dmitry Preobrazhensky b8e7f53208 [AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics
Differential Revision: https://reviews.llvm.org/D109614
2021-09-21 16:23:20 +03:00
Michael Liao b0402a35fc [amdgpu] Add 64-bit PC support when expanding unconditional branches.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106445
2021-07-26 14:50:30 -04:00
Carl Ritson 6efb3220b4 [AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions
Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr.
Previously these were rounded up to VReg_256 this saves VGPRs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D103800
2021-07-22 10:42:15 +09:00
Fangrui Song aa3df8ddcd [test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers) 2021-07-15 10:26:21 -07:00
Hafiz Abid Qadeer b205f2bb89 [AMDGPU] Handle s_branch to another section.
Currently, if target of s_branch instruction is in another section, it will fail with the error of undefined label.  Although in this case, the label is not undefined but present in another section. This patch tries to handle this issue. So while handling fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned.

This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get an crash for this scenario. The crash is fixed now but the we still get an undefined label error.  Jumps to other section can arise with hold/cold splitting.

A patch to handle the relocation in lld will follow shortly.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105760
2021-07-13 12:17:47 +01:00
Fangrui Song a9854045f6 [test] Change -t to --syms and -s to -S for llvm-readobj RUN lines
-s and -t will be changed to improve consistency with llvm-readelf.
The inconsistency issue regularly contributes to confusion using the two tools.
2021-06-29 11:50:31 -07:00
Aakanksha Patil 3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Brendon Cahoon 294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Brendon Cahoon 211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Brendon Cahoon ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Carl Ritson c8bbfb8cf5 [AMDGPU] Allow oversize vaddr in GFX10 MIMG assembly
As a follow up to D103672, we should allow vaddr to be larger than
required when assembling GFX10 MIMG instructions.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D103733
2021-06-08 11:57:07 +09:00
Carl Ritson f8816c7400 [AMDGPU] Add v5f32/VReg_160 support for MIMG instructions
Avoid having to round up to v8f32/VReg_256 when only 5 VGPRs are
required for a MIMG address operand.

Maintain _V8 instruction variants of pseudo instructions allowing
assembly prior to GFX10 to work as-is.  Currently the validator
can tell for GFX10 what the correct size is, so will disallow
oversize address registers.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D103672
2021-06-08 11:11:40 +09:00
Jay Foad 9e9edede18 [AMDGPU] Fix MC tests for v_fmaak_f16 and v_fmamk_f16
This looks like a mistake when the tests were committed in r363946.
There were two sets of tests for the f32 variant of these instructions,
instead of one set for f16 and one set for f32.

Differential Revision: https://reviews.llvm.org/D103699
2021-06-07 10:42:52 +01:00
Dmitry Preobrazhensky 13c6568c6e [AMDGPU][MC][GFX90A] Corrected DS_GWS opcodes
Corrected DS_GWS opcodes to use even aligned registers.

Differential Revision: https://reviews.llvm.org/D103185
2021-05-26 21:31:50 +03:00
David Stuttard 72d570ca08 [AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling
A16 support for image instructions assembly/disassembly (gfx10) was missing

Also refactor MIMG op addr size calcs to common function

We'd got 3 places where the same operation was being done.

One test is now marked XFAIL until a related codegen patch is in place

Differential Revision: https://reviews.llvm.org/D102231

Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb
2021-05-14 09:25:44 +01:00
Aakanksha Patil 464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00
Stanislav Mekhanoshin 28f1d018b1 [AMDGPU] Fix 64 bit DPP validation
AMDGPUAsmParser::isSupportedDPPCtrl() was failing to correctly
find a DPP register operand, regadless of the position it is
always src0. Moved this check into a new validateDPP() method
where we have full instruction already. In particular it was
failing to reject this case:

v_cvt_u32_f64 v5, v[0:1] quad_perm:[0,2,1,1] row_mask:0xf bank_mask:0xf

Essentially it was broken for any case where size of dst and
src0 differ.

It also improves the diagnostics with a proper error message.

The check in the InstPrinter also drops verification of the dst
register as it does not have anything to do with the dpp operand.

Differential Revision: https://reviews.llvm.org/D101930
2021-05-06 08:40:26 -07:00
David Stuttard 8f2948731e [AMDGPU][AsmParser] Correct the order of optional operands to mimg
Ordering of operands was incorrect meaning that a16 operand was treated as tfe

Differential Revision: https://reviews.llvm.org/D101618

Change-Id: I3b15e71ef5ff625f19f52823414ab684d76aca33
2021-05-04 13:14:48 +01:00
Dmitry Preobrazhensky bcc29e0fcf [AMDGPU][MC] Corrected parsing of carry in/out operands in VOP3
Disabled constants as carry in/out operands. See bug 48711.

Differential Revision: https://reviews.llvm.org/D100642
2021-04-19 13:42:31 +03:00
Stanislav Mekhanoshin 37878de503 Disable use of SCC bit from asm
Differential Revision: https://reviews.llvm.org/D100069
2021-04-07 15:32:17 -07:00
Dmitry Preobrazhensky cd953434f2 [AMDGPU][MC][GFX10][GFX90A] Corrected _e32/_e64 suffices
Fixed bugs https://bugs.llvm.org//show_bug.cgi?id=49643, https://bugs.llvm.org//show_bug.cgi?id=49644, https://bugs.llvm.org//show_bug.cgi?id=49645.

Differential Revision: https://reviews.llvm.org/D99413
2021-04-01 14:21:00 +03:00
Konstantin Zhuravlyov f4ace63737 AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
  - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
    - Add kernarg_size to kernel descriptor
    - Change trap handler ABI to no longer move queue pointer into s[0:1]
  - Cleanup ELF definitions
    - Add V2, V3, V4 suffixes to make a clear distinction for code object version
    - Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638
2021-03-24 11:54:05 -04:00
Matt Arsenault a0f5aad6d7 AMDGPU: Fix allowing immediates for tail call pseudo.
The pseudo was using SSrc_b64, so it allowed folding immediates into
the destination operand for a tail call to null. However, this is not
a valid operand for the s_setpc_b64 this will be lowered to. Avoids
printing the operand as an invalid immediate.

Avoids a regression when tail calls are enabled in GlobalISel (somehow
tail calls to null get deleted in the DAG).
2021-03-21 13:14:04 -04:00
Stanislav Mekhanoshin 961e4384f4 [AMDGPU] Support SCC on buffer atomics
Differential Revision: https://reviews.llvm.org/D98731
2021-03-18 09:56:14 -07:00
Dmitry Preobrazhensky 596db9934b [AMDGPU][MC] Disabled lds_direct for GFX90a
Fixed bug 49382.

Differential Revision: https://reviews.llvm.org/D98626
2021-03-16 13:52:36 +03:00
Stanislav Mekhanoshin 3bffb1cd0e [AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

Additional advantage that parser will accept these flags in any order unlike
now.

Differential Revision: https://reviews.llvm.org/D96469
2021-03-15 13:00:59 -07:00
Carl Ritson c07f2025e4 [AMDGPU] Restrict image_msaa_load to MSAA dimension types
This instruction is only valid on 2D MSAA and 2D MSAA Array
surfaces.  Remove intrinsic support for other dimension types,
and block assembly for unsupported dimensions.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D98397
2021-03-12 09:47:24 +09:00
Stanislav Mekhanoshin 9931b1f7a4 [AMDGPU] Disable SCC bit on fp atomics
Differential Revision: https://reviews.llvm.org/D98221
2021-03-10 12:36:09 -08:00
Stanislav Mekhanoshin e08f278f5b [AMDGPU] Cleanup test checks. NFC. 2021-03-08 16:05:25 -08:00
Jay Foad 99682bc039 Revert "Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030""
This reverts commit e58d68fcd0.

This reinstates commit fc28f600e5
with a fix to initialize HasShaderCyclesRegister. See
https://reviews.llvm.org/D97928.
2021-03-06 09:00:01 +00:00
Mitch Phillips e58d68fcd0 Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030"
Broke the ASan/MSan buildbots. See more comments in the original patch,
https://reviews.llvm.org/D97928.

Build failure at http://lab.llvm.org:8011/#/builders/5/builds/5327

This reverts commit fc28f600e5.
2021-03-05 18:24:59 -08:00
Jay Foad fc28f600e5 [AMDGPU] Restore the s_memtime instruction in gfx1030
gfx1030 added a new way to implement readcyclecounter using the
SHADER_CYCLES hardware register, but the s_memtime instruction still
exists, so the MC layer should still accept it and the
llvm.amdgcn.s.memtime intrinsic should still work.

Differential Revision: https://reviews.llvm.org/D97928
2021-03-05 20:19:11 +00:00
Joe Nash 5531f24cc2 [AMDGPU] Make OMod explicit for V_CVT_{U,I}*
Make OMod explicit instead of implied by HasModifiers in the
operand list. Requires explicitly setting HasOMod=1 for
irregular OMod usage in instruction V_CVT_{U,I}*

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D97587

Change-Id: I230e1476f529e816eec60e242531f23a99e3839f
2021-03-02 13:32:06 -05:00
Dmitry Preobrazhensky 28f164bca7 [AMDGPU][MC][GFX9+] Corrected encoding of op_sel_hi for unused operands in VOP3P
Corrected encoding of VOP3P op_sel_hi for unused operands. See bug 49363.

Differential Revision: https://reviews.llvm.org/D97689
2021-03-02 13:02:25 +03:00
Jay Foad aab709f090 [AMDGPU] Add more PAL metadata register names
Add all the registers that are currently used by
LLPC: https://github.com/GPUOpen-Drivers/llpc

This only affects disassembly of PAL metadata generated by LLPC and
similar frontends.

Differential Revision: https://reviews.llvm.org/D95619
2021-02-24 13:37:05 +00:00
Jay Foad 67f0620831 [AMDGPU] Update s_sendmsg messages
Update the list of s_sendmsg messages known to the assembler and
disassembler and validate the ones that were added or removed in gfx9
and gfx10.

Differential Revision: https://reviews.llvm.org/D97295
2021-02-24 13:07:00 +00:00
Jay Foad 64831fb089 [AMDGPU] Rename a prefix for sanity. NFC. 2021-02-23 14:53:27 +00:00
Dmitry Preobrazhensky 4813518092 [AMDGPU][MC] Corrected bound_ctrl for compatibility with sp3
Enabled "bound_ctrl:1" and disabled "bound_ctrl:-1" syntax.
Corrected printer to output "bound_ctrl:1" instead of "bound_ctrl:0".
See bug 35397 for detailed issue description.

Differential Revision: https://reviews.llvm.org/D97048
2021-02-22 14:59:40 +03:00
Stanislav Mekhanoshin 3d10ec0d6a [AMDGPU] Temporary remove test
Remove hsa-gfx90a-v3.s until D95638. It unexpectedly passes
on s390x.
2021-02-17 22:41:04 -08:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Fangrui Song 962b29d716 ELFObjectWriter: Don't sort non-local symbols
As we don't sort local symbols, don't sort non-local symbols.  This makes
non-local symbols appear in their register order, which matches GNU as. The
register order is nice in that you can write tests with interleaved CHECK
prefixes, e.g.

```
// CHECK: something about foo
.globl foo
foo:
// CHECK: something about bar
.globl bar
bar:
```

With the lexicographical order, the user needs to place lexicographical smallest
symbol first or keep CHECK prefixes in one place.
2021-02-13 10:32:27 -08:00
Carl Ritson e5b0b434f6 [AMDGPU] Refactor MIMG tables to better handle hardware variants
Add mimgopc object to represent the opcode allowing different
opcodes for different hardware variants.
This enables image_atomic_fcmpswap, image_atomic_fmin, and
image_atomic_fmax on GFX10

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D96309
2021-02-11 13:22:41 +09:00
Dmitry Preobrazhensky 05433a8d03 [AMDGPU][MC] Corrected error position for invalid dim modifiers
Fixed bug 49054.

Differential Revision: https://reviews.llvm.org/D96117
2021-02-08 14:32:28 +03:00
Dmitry Preobrazhensky 168ccc8ecb [AMDGPU][MC][GFX10] Improved errors reporting for invalid MIMG NSA operands
Differential Revision: https://reviews.llvm.org/D96118
2021-02-08 14:04:28 +03:00
Fangrui Song 980d28d955 ELFObjectWriter: Don't sort local symbols
GNU as does not sort local symbols. This has several advantages:

* The .symtab order is roughly the symbol occurrence order.
* The closest preceding STT_SECTION symbol is the definition of a local symbol.
* The closest preceding STT_FILE symbol is the defining file of a local symbol, if there are multiple default-version .file directives. (Not implemented in MC.)
2021-02-07 15:47:10 -08:00
Dmitry Preobrazhensky 586df38478 [AMDGPU][MC] Corrected parsing of optional modifiers
Fixed bugs in parsing of "no*" modifiers and improved errors handling.
See https://bugs.llvm.org/show_bug.cgi?id=41282.

Differential Revision: https://reviews.llvm.org/D95675
2021-02-02 14:52:29 +03:00
Dmitry Preobrazhensky 99b5631649 [AMDGPU][MC] Corrected error position for invalid operands
Generic parser may report an incorrect error position when an offending operand is followed by a comma.
See bug 48884 for details: https://bugs.llvm.org/show_bug.cgi?id=48884.

Differential Revision: https://reviews.llvm.org/D95674
2021-02-01 14:31:08 +03:00
Jay Foad 164c6de530 [AMDGPU] Test all register names known to AMDGPUPALMetadata
Differential Revision: https://reviews.llvm.org/D95684
2021-01-29 16:16:26 +00:00
Austin Kerbow 2291bd137d [AMDGPU] Update subtarget features for new target ID support
Support for XNACK and SRAMECC is not static on some GPUs. We must be able
to differentiate between different scenarios for these dynamic subtarget
features.

The possible settings are:

- Unsupported: The GPU has no support for XNACK/SRAMECC.
- Any: Preference is unspecified. Use conservative settings that can run anywhere.
- Off: Request support for XNACK/SRAMECC Off
- On: Request support for XNACK/SRAMECC On

GCNSubtarget will track the four options based on the following criteria. If
the subtarget does not support XNACK/SRAMECC we say the setting is
"Unsupported". If no subtarget features for XNACK/SRAMECC are requested we
must support "Any" mode. If the subtarget features XNACK/SRAMECC exist in the
feature string when initializing the subtarget, the settings are "On/Off".

The defaults are updated to be conservatively correct, meaning if no setting
for XNACK or SRAMECC is explicitly requested, defaults will be used which
generate code that can be run anywhere. This corresponds to the "Any" setting.

Differential Revision: https://reviews.llvm.org/D85882
2021-01-26 11:25:51 -08:00
Dmitry Preobrazhensky 745064e36b [AMDGPU][MC] Refactored exp tgt handling
Summary:
- Separated tgt encoding from parsing;
- Separated tgt decoding from printing;
- Improved errors handling;
- Disabled leading zeroes in index. The following code is no longer accepted: exp pos00 v3, v2, v1, v0

Reviewers: arsenm, rampitec, foad

Differential Revision: https://reviews.llvm.org/D95216
2021-01-26 14:54:15 +03:00
Dmitry Preobrazhensky 558b3bbb5b [AMDGPU][MC] Improved errors handling for SDWA operands
Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D95212
2021-01-25 19:02:53 +03:00
Mircea Trofin c042aff886 [NFC] Disallow unused prefixes under llvm/test
This patch sets the default for llvm tests, with the exception of tests
under Reduce, because quite a few of them use 'FileCheck' as parameter
to a tool, and including a flag as that parameter would complicate
matters.

The rest of the patch undo-es the lit.local.cfg changes we progressively
introduced as temporary measure to avoid regressions under various
directories.

Differential Revision: https://reviews.llvm.org/D95111
2021-01-21 20:31:52 -08:00
Dmitry Preobrazhensky 55c557a5d2 [AMDGPU][MC] Refactored parsing of dpp ctrl
Summary of changes:
- simplified code to improve maintainability;
- replaced lex() with higher level parser functions;
- improved errors handling.

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D94777
2021-01-18 18:14:19 +03:00
Dmitry Preobrazhensky 911961c9c1 [AMDGPU][MC][GFX10] Improved dpp8 errors handling
Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D94756
2021-01-18 15:02:31 +03:00
Mircea Trofin 585612355c [NFC] Disallow unused prefixes under MC/AMDGPU
This patches remaining tests, and patches lit.local.cfg to block future
such cases (until we flip FileCheck's flag)

Differential Revision: https://reviews.llvm.org/D94556
2021-01-12 15:24:44 -08:00
Mircea Trofin 55f2eeebc9 [NFC] Disallow unused prefixes in MC/AMDGPU
1 out of 2 patches.

Differential Revision: https://reviews.llvm.org/D94553
2021-01-12 14:31:22 -08:00
Joe Nash 60466fad2d [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10
It was removed in GFX10 GPUs, but LLVM could
generate it.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D94020

Change-Id: Id1c716d71313edcfb768b2b175a6789ef9b01f3c
2021-01-05 11:59:57 -05:00
Dmitry Preobrazhensky 6d02d12e17 [AMDGPU][MC][NFC] Added more tests for flat_global
Restored tests from 7898803c63
2020-12-28 23:00:56 +03:00
Dmitry Preobrazhensky c7ff2c0da1 [AMDGPU][MC][NFC] Split large asm tests into smaller chunks
The following large tests have been split into smaller parts by instruction formats:

    gfx7_asm_all.s
    gfx8_asm_all.s
    gfx9_asm_all.s
    gfx10_asm_all.s

This change results in noticeable lit testing speedup.
For example, on a debug Windows build, split asm tests are run 3.5 times faster.
2020-12-28 20:22:38 +03:00
Dmitry Preobrazhensky 8c25bb3d0d [AMDGPU][MC] Improved errors handling for v_interp* operands
See bug 48596 (https://bugs.llvm.org/show_bug.cgi?id=48596)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D93757
2020-12-28 16:15:48 +03:00
Dmitry Preobrazhensky a323682dcb [AMDGPU][MC][NFC] Lit tests cleanup
See bug 48513

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D93550
2020-12-21 20:04:02 +03:00
Sebastian Neubauer 409a2f0f9e [AMDGPU] Allow no saddr for global addtid insts
I think the global_load/store_dword_addtid instructions support
switching off the scalar address.
Add assembler and disassembler support for this.

Differential Revision: https://reviews.llvm.org/D93288
2020-12-16 10:01:40 +01:00
Sebastian Neubauer 91445979be [AMDGPU] Unify flat offset logic
Move getNumFlatOffsetBits from AMDGPUAsmParser and SIInstrInfo into
AMDGPUBaseInfo.

Differential Revision: https://reviews.llvm.org/D93287
2020-12-15 14:59:59 +01:00
Sebastian Neubauer 7898803c63 [AMDGPU][NFC] Add more global_atomic_cmpswap tests 2020-12-15 14:47:33 +01:00
Georgii Rymar 98a4289810 [llvm-readobj] - For SHT_REL relocations, don't display an addend.
This is https://bugs.llvm.org/show_bug.cgi?id=44257.

In LLVM style we always print `0` as addend when dumping
SHT_REL relocations. It is confusing, this patch stops
printing it as the first comment on the bug page suggests.

Differential revision: https://reviews.llvm.org/D93033
2020-12-14 12:03:00 +03:00
Scott Linder 9260a99999 [MC][AMDGPU] Consume EndOfStatement in asm parser
Avoids spurious newlines showing up in the output when emitting assembly
via MC.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D92690
2020-12-09 21:45:55 +00:00
Scott Linder f5f4b8b60f [AMDGPU][MC] Restore old error position for "too few operands"
Revert part of https://reviews.llvm.org/D92084 to make it simpler to
start consuming the EndOfStatement token within AMDGPU's
ParseInstruction in a future patch. This also brings us back to what
every other target currently does.

A future change to move the position back to the end of the statement
would likely need to audit all of the AMDGPUOperand SMLoc ranges, and
determine the SMLoc for the last character of the last operand.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D92960
2020-12-09 21:09:47 +00:00
Petar Avramovic 3a042dcd2e [AMDGPU] Fix default value of glc for mubuf rtn atomics
Mubuf rtn atomics use GLC_1 thus default value for glc operand
should be -1, see https://reviews.llvm.org/D90730.
This allows us to report error when rtn atomic requires glc=1
but does not have glc operand in input.

Differential Revision: https://reviews.llvm.org/D92654
2020-12-07 14:00:08 +01:00
Dmitry Preobrazhensky a0b3a9391c [AMDGPU][MC] Improved diagnostics message for sym/expr operands
See bug 48295 (https://bugs.llvm.org/show_bug.cgi?id=48295)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D92088
2020-12-05 14:05:53 +03:00
Dmitry Preobrazhensky e97dd11977 [AMDGPU][MC] Corrected error position for invalid MOVREL src
See bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D92084
2020-12-05 13:23:14 +03:00
Dmitry Preobrazhensky ce44bf2cf2 [AMDGPU][MC] Improved diagnostic messages
See bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91794
2020-11-23 16:15:05 +03:00
Dmitry Preobrazhensky e4effef330 [AMDGPU][MC] Improved diagnostic messages for invalid literals
See bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91793
2020-11-23 15:48:06 +03:00
Dmitry Preobrazhensky 65f3e121fe [AMDGPU][MC] Corrected error position for some operands and modifiers
Partially fixes bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91412
2020-11-16 16:11:23 +03:00
Dmitry Preobrazhensky 0bee8c784b [AMDGPU][MC] Corrected error position for swizzle()
Partially fixes bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91408
2020-11-16 14:37:57 +03:00
Dmitry Preobrazhensky 89df8fc0d7 [AMDGPU][MC] Corrected error position for hwreg() and sendmsg()
Partially fixes bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D91407
2020-11-16 14:25:07 +03:00
Scott Linder d5f2c3e7c0 [NFC][AMDGPU] Clean up some lit test prefixes
Replace some instances of "ALL" with "GCN" where it applies. Committed
as obvious.
2020-11-11 17:12:37 +00:00
Stanislav Mekhanoshin 544ef42e40 [AMDGPU] Set default op_sel_hi on accvgpr read/write
These are opsel opcodes with op_sel actually being ignored.
As a such op_sel_hi needs to be set to default 1 even though
these bits are ignored. This is compatibility change.

Differential Revision: https://reviews.llvm.org/D91202
2020-11-10 13:07:29 -08:00
Dmitry Preobrazhensky f4cc511303 [AMDGPU][MC] Added tests for checking error position
See bug 47519: https://bugs.llvm.org/show_bug.cgi?id=47519

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D90925
2020-11-09 16:19:28 +03:00
Jay Foad d61f2cfb9f [AMDGPU] Simplify exp target parsing
Treat any identifier as a potential exp target and diagnose them all the
same way as "invalid exp target"s.

Differential Revision: https://reviews.llvm.org/D90947
2020-11-06 16:09:34 +00:00
Jay Foad 75a026e93b [AMDGPU] Run exp tests on GFX9 and GFX10 too. NFC. 2020-11-06 15:03:05 +00:00
Stanislav Mekhanoshin f738aee0bb [AMDGPU] Add default 1 glc operand to rtn atomics
This change adds a real glc operand to the return atomic
instead of just string " glc" in the middle of the asm
string.

Improves asm parser diagnostics.

Differential Revision: https://reviews.llvm.org/D90730
2020-11-05 10:41:59 -08:00
Tim Renouf 89d41f3a2b [AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447

Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03 16:27:48 +00:00
Jay Foad 5b91a6a88b [AMDGPU] Allow some modifiers on VOP3B instructions
V_DIV_SCALE_F32/F64 are VOP3B encoded so they can't use the ABS src
modifier, but they can still use NEG and the usual output modifiers.

This partially reverts 3b99f12a4e "AMDGPU: Remove modifiers from v_div_scale_*".

Differential Revision: https://reviews.llvm.org/D90296
2020-10-28 21:54:14 +00:00
Stanislav Mekhanoshin 6ddadf9901 [AMDGPU] flat scratch ST addressing mode on gfx10
GFX10 enables third addressing mode for flat scratch instructions,
an ST mode. In that mode both register operands are omitted and
only swizzled offset is used in addition to flat_scratch base.

Differential Revision: https://reviews.llvm.org/D89501
2020-10-19 15:29:52 -07:00
Tony ceb9940b39 [AMDGPU] Correct hsa-diag-v3.s test
- Use file_check -LABEL markers to prevent false positives being
  reported due to messages from different tests causing success to be
  reported.

- Add checks for all the run commands for more robust testing.

- Add checks for the absence of errors.

- Name and order tests more sensibly.

Differential Revision: https://reviews.llvm.org/D89635
2020-10-19 17:08:13 +00:00
Stanislav Mekhanoshin d1beb95d12 [AMDGPU] gfx1032 target
Differential Revision: https://reviews.llvm.org/D89487
2020-10-15 12:41:18 -07:00
Konstantin Zhuravlyov 3fdf3b1539 AMDGPU: Update AMDHSA code object version handling
Differential Revision: https://reviews.llvm.org/D89076
2020-10-14 13:04:27 -04:00
Jay Foad edc37baca6 [AMDGPU] Add MC layer support for v_fmac_legacy_f32
This instruction was introduced in GFX10.3, reusing the opcode of
v_mac_legacy_f32 from GFX10.1.

Differential Revision: https://reviews.llvm.org/D89247
2020-10-13 21:57:33 +01:00
Jay Foad cdf0214845 [AMDGPU] v_mac_legacy_f32 does not support DPP
Differential Revision: https://reviews.llvm.org/D89245
2020-10-13 10:03:00 +01:00
Jay Foad acd0dd3a62 [AMDGPU] Use lowercase for subtarget feature names in RUN lines 2020-10-13 09:02:09 +01:00
Jay Foad b8901230c0 [AMDGPU] Use @LINE for error checking in gfx10 assembler tests 2020-10-12 16:10:20 +01:00
Dmitry Preobrazhensky 1e75668821 [AMDGPU][MC][GFX1030] Disabled v_mac_f32
See bug 47741 <https://bugs.llvm.org/show_bug.cgi?id=47741>

Reviewers: nhaehnle, rampitec

Differential Revision: https://reviews.llvm.org/D89000
2020-10-08 14:00:52 +03:00
Jay Foad fc819b6925 [AMDGPU] Use @LINE for error checking in gfx10.3 assembler tests 2020-10-07 15:48:01 +01:00
Dmitry Preobrazhensky 4a7e7620d6 [AMDGPU][MC] Improved diagnostics for instructions with missing features
Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D88887
2020-10-07 16:31:29 +03:00
Scott Linder bf5c1d92d9 [AMDGPU] Fix remaining kernel descriptor test
Follow up on e4a9e4ef55 to fix a test I missed in the original patch.
Committed as obvious.
2020-10-06 18:45:04 +00:00
Scott Linder e4a9e4ef55 [AMDGPU] Emit correct kernel descriptor on big-endian hosts
Previously we wrote multi-byte values out as-is from host memory. Use
the `emitIntN` helpers in `MCStreamer` to produce a valid descriptor
irrespective of the host endianness.

Reviewed By: arsenm, rochauha

Differential Revision: https://reviews.llvm.org/D88858
2020-10-06 17:29:38 +00:00
Dmitry Preobrazhensky e2452f57fa [AMDGPU][MC] Added detection of unsupported instructions
Implemented identification of unsupported instructions; improved errors reporting.

See bug 42590.

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D88211
2020-10-06 16:44:27 +03:00
Stanislav Mekhanoshin a45cdb311f [AMDGPU] gfx1030 test update. NFC. 2020-09-16 13:56:16 -07:00
Stanislav Mekhanoshin 91f503c3af [AMDGPU] gfx1030 RT support
Differential Revision: https://reviews.llvm.org/D87782
2020-09-16 11:40:58 -07:00
Dmitry Preobrazhensky 95b7040e43 [AMDGPU][MC] Improved diagnostic messages for invalid registers
Corrected parser to issue meaningful error messages for invalid and malformed registers.

See bug 41303: https://bugs.llvm.org/show_bug.cgi?id=41303

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D87234
2020-09-09 16:44:03 +03:00
Dmitry Preobrazhensky ecde200209 [AMDGPU][MC] Corrected parser to avoid generation of excessive error messages
Summary of changes:
- Changed parser to eliminate generation of excessive error messages;
- Corrected lit tests to match all expected error messages;
- Corrected lit tests to guard against unwanted extra messages (added option "--implicit-check-not=error:");
- Added missing checks and fixed some typos in tests.

See bug 46907: https://bugs.llvm.org/show_bug.cgi?id=46907

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D86940
2020-09-02 19:42:18 +03:00
Matt Arsenault a7455652c0 AMDGPU: Fix global atomic saddr operand class 2020-08-15 12:12:28 -04:00
Matt Arsenault 47af1ac69a AMDGPU: Correct definitions for global saddr instructions
The VGPR component is a 32-bit offset, not 64-bits.

I'm not sure what the correct syntax is for this. This maintains the
vaddr position and leaves saddr in the end "off" position. This is
particularly terrible for stores, since the operand order is now <vgpr
offset>, <data>, <sgpr base>, splitting the pointer operands. I
suppose this is a logical consequence from the mistake of not putting
the data operand first. I'm not sure what sp3 does.
2020-08-15 12:11:57 -04:00
Stanislav Mekhanoshin ea7d0e2996 [AMDGPU] gfx1031 target
Differential Revision: https://reviews.llvm.org/D85337
2020-08-05 12:36:26 -07:00
Dmitry Preobrazhensky 6b8948922c [AMDGPU][MC] Added support of SP3 syntax for MTBUF format modifier
Currently supported LLVM MTBUF syntax is shown below. It is not compatible with SP3.

    op     dst, addr, rsrc, FORMAT, soffset

This change adds support for SP3 syntax:

    op     dst, addr, rsrc, soffset SP3FORMAT

In addition to being compatible with SP3, this syntax allows using symbolic names for data, numeric and unified formats. Below is a list of added syntax variants.

format:<expression>
format:[<numeric-format-name>,<data-format-name>]
format:[<data-format-name>,<numeric-format-name>]
format:[<data-format-name>]
format:[<numeric-format-name>]
format:[<unified-format-name>]

The last syntax variant is supported for GFX10 only.

See llvm bug 37738

Reviewers: arsenm, rampitec, vpykhtin

Differential Revision: https://reviews.llvm.org/D84026
2020-07-24 16:41:03 +03:00
Matt Arsenault 6f437117af AMDGPU: Don't assert on f16 inv2pi immediates pre-gfx8
v_cvt_f32_f16 can still accept this value as a literal constant. This
showed up in GlobalISel since it doesn't have constant folding for
G_FPEXT.
2020-07-22 13:59:03 -04:00
Elvina Yakubova b36a3e6140 [llvm-readobj] Update tests because of changes in llvm-readobj behavior
This patch updates tests using llvm-readobj and llvm-readelf, because
soon reading from stdin will be achievable only via a '-' as described
here: https://bugs.llvm.org/show_bug.cgi?id=46400. Patch with changes to
llvm-readobj behavior is here: https://reviews.llvm.org/D83704

Differential Revision: https://reviews.llvm.org/D83912

Reviewed by: jhenderson, MaskRay, grimar
2020-07-20 10:39:04 +01:00
Dmitry Preobrazhensky 2e87acac9b [AMDGPU] Removed s_mov_regrd and mov_fed opcodes
These opcodes are not intended for public use.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81659
2020-07-17 19:52:54 +03:00
Matt Arsenault 79f67cae91 AMDGPU: Rename add/sub with carry out instructions
The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to have the gfx9/gfx10 names. We were using the original SI/CI
v_add_i32/v_sub_i32 names. Later targets reintroduced these names as
carryless instructions with a saturating clamp bit, which we do not
define. Do this rename so we can unambiguously add these missing
instructions.

The carry-in versions should also be renamed, but at least those had a
consistent _u32 name to begin with. The 16-bit instructions were also
renamed, but aren't ambiguous.

This does regress assembler error message quality in some cases. In
mismatched wave32/wave64 situations, this will switch from
"unsupported instruction" to "invalid operand", with the error
pointing at the wrong position. I couldn't quite follow how the
assembler selects these, but the previous behavior seemed accidental
to me. It looked like there was a partial attempt to handle this which
was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it
isn't used for anything).
2020-07-16 13:16:30 -04:00
Dmitry Preobrazhensky e122eba185 [AMDGPU][MC] Corrected MTBUF parsing and decoding
MTBUF implementation has many issues and this change addresses most of these:
- refactored duplicated code;
- hardcoded constants moved out of high-level code;
- fixed a decoding error when nfmt or dfmt are zero (bug 36932);
- corrected parsing of operand separators (bug 46403);
- corrected handling of missing operands (bug 46404);
- corrected handling of out-of-range modifiers (bug 46421);
- corrected default value (bug 46467).

Reviewers: arsenm, rampitec, vpykhtin, artem.tamazov, kzhuravl

Differential Revision: https://reviews.llvm.org/D83760
2020-07-15 19:46:00 +03:00
Matt Arsenault 4c22f5f804 AMDGPU: Add @LINE to assembler error test checks
It was basically impossible to figure out where the failure point was
2020-07-14 18:37:50 -04:00
Matt Arsenault 31f4e43f3f AMDGPU: Remove .value_type from kernel metadata
This doesn't appear used for anything, and is emitted incorrectly
based on the description. This also depends on the IR type, and
pointee element type.
2020-07-10 18:16:31 -04:00
Dmitry Preobrazhensky 129ab77384 [AMDGPU][MC][NFC] Updated and enabled MC lit tests
Updated tests disabled by change 5f5f566.

5f5f566b26
2020-06-19 16:27:40 +03:00
Matt Arsenault 5f5f566b26 AMDGPU: Don't use 16-bit FP inline constants in integer operands
It seems to be a hardware defect that the half inline constants do not
work as expected for the 16-bit integer operations (the inverse does
work correctly). Experimentation seems to show these are really
reading the 32-bit inline constants, which can be observed by writing
inline asm using op_sel to see what's in the high half of the
constant. Theoretically we could fold the high halves of the 32-bit
constants using op_sel.

The *_asm_all.s MC tests are broken, and I don't know where the script
to autogenerate these are. I started manually fixing it, but there's
just too many cases to fix. This also does break the
assembler/disassembler support for these values, and I'm not sure what
to do about it. These are still valid encodings, so it seems like you
should be able to use them in some way. If you wrote assembly using
them, you could have really meant it (perhaps to read the high bits
with op_sel?). The disassembler will print the invalid literal
constant which will fail to re-assemble. The behavior is also
different depending on the use context. Consider this example, which
was previously accepted and encoded using the inline constant:

  v_mad_i16 v5, v1, -4.0, v3
  ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04]

In contexts where an inline immediate is required (such as on gfx8/9),
this will now be rejected. For gfx10, this will produce the literal
encoding and change the printed format:
  v_mad_i16 v5, v1, 0xc400, v3
  ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00]

This is just another variation of the issue that we don't perfectly
handle round trip assembly/disassembly due to not tracking how
immediates were encoded. This doesn't matter much in practice, since
compilers don't emit the suboptimal encoding. I doubt any users are
relying on this behavior (although I did make use of the old behavior
to figure out what was wrong).

Fixes bug 46302.
2020-06-17 19:14:10 -04:00
Stanislav Mekhanoshin 9ee272f13d [AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
2020-06-15 16:18:05 -07:00
Dmitry Preobrazhensky f47e27e260 [AMDGPU][MC][GFX908] Corrected src0 of v_accvgpr_write to accept only VGPRs and inline constants.
This change disables use of special SGPR registers like scc, vccz, execz, etc as operands of v_accvgpr_write.

See bug 45414: https://bugs.llvm.org/show_bug.cgi?id=45414

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D80530
2020-05-28 15:10:55 +03:00
Dmitry Preobrazhensky 77aec3b4c0 [AMDGPU][MC][GFX8+] Enabled clamp for v_add_u16, v_sub_u16 and v_subrev_u16
See https://bugs.llvm.org/show_bug.cgi?id=45926

Reviewers: arsenm, rampitec, vpykhtin

Differential Revision: https://reviews.llvm.org/D80430
2020-05-25 19:55:38 +03:00
Dmitry Preobrazhensky 933ebc4078 [AMDGPU][MC][GFX8+] Enabled clamp for v_mul_i32_i24_e64 and v_mul_u32_u24_e64
See bug 45925: https://bugs.llvm.org/show_bug.cgi?id=45925

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D80287
2020-05-22 14:11:31 +03:00
Dmitry Preobrazhensky f997370d9c [AMDGPU][MC] Corrected branch relocation handling to detect undefined labels
Fixed ELF object writer to die gracefully when an undefined label is encountered in a branch instruction.
See https://bugs.llvm.org/show_bug.cgi?id=41914.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D79943
2020-05-18 14:04:58 +03:00
Dmitry Preobrazhensky 18a5428e60 [AMDGPU][MC][GFX9+] Enabled clamp for v_add_i32 and v_sub_i32
See bug 45830: https://bugs.llvm.org/show_bug.cgi?id=45830

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D79585
2020-05-13 14:17:20 +03:00
Dmitry Preobrazhensky 5998baccb9 [AMDGPU][MC][GFX9+] Enabled 21-bit signed offsets for SMEM instructions
Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D79288
2020-05-06 14:13:10 +03:00
Jay Foad 1f32e7367c [AMDGPU] Fix test failures caused by dbdffe3ee9. 2020-04-22 14:19:21 +01:00
Fangrui Song ecd6d7254e [test] llvm/test/: change llvm-objdump single-dash long options to double-dash options
As announced here: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html

Grouped option syntax (POSIX Utility Conventions) does not play well with -long-option
A subsequent change will reject -long-option.
2020-03-15 17:46:23 -07:00
Fangrui Song 71e2ca6e32 [llvm-objdump] -d: print `00000000 <foo>:` instead of `00000000 foo:`
The new behavior matches GNU objdump. A pair of angle brackets makes tests slightly easier.

`.foo:` is not unique and thus cannot be used in a `CHECK-LABEL:` directive.
Without `-LABEL`, the CHECK line can match the `Disassembly of section`
line and causes the next `CHECK-NEXT:` to fail.

```
Disassembly of section .foo:

0000000000001634 .foo:
```

Bdragon: <> has metalinguistic connotation. it just "feels right"

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D75713
2020-03-05 18:05:28 -08:00
Jay Foad 13f8be68e0 [AMDGPU] Use @LINE for error checking in gfx10 assembler tests
Summary:
This is a rework of D72611, using @LINE to check that errors are
reported against the right instruction instead of adding lots of extra
*-ERR-NEXT: check lines.

Reviewers: rampitec, arsenm, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74227
2020-02-07 18:27:07 +00:00
Hubert Tong af3c243e99 [tests] Use host-based XFAIL for test/MC/AMDGPU/hsa-gfx10-v3.s
Summary:
This patch applies D60551 to an additional file. In particular, the test
is currently marked XFAIL for a number of big-endian targets; however,
the failure is actually dependent on the host endianness instead. The
test actually specifies a specific target triple.

Reviewers: rampitec, xingxue, daltenty

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, fedor.sergeev, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73192
2020-01-23 17:55:32 -05:00
Matt Arsenault bd7658a212 AMDGPU: Partially directly select llvm.amdgcn.interp.p1.f16
The 16 bank LDS case is complicated due to using multiple
instructions. If I attempt to write a pattern for it, the generated
selector incorrectly places the copy to m0 after the first
instruction, so that needs to be separately addressed.

Also fix not gluing the copy to m0 to the second operation in the
second half of the 16 bank lowering.
2020-01-15 08:58:58 -05:00
Jay Foad 440ce5164f [AMDGPU] Remove duplicate gfx10 assembler and disassembler tests
Summary: Depends on D72611.

Reviewers: rampitec, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72616
2020-01-14 08:20:51 +00:00
Jay Foad 0950de264e [AMDGPU] Improve error checking in gfx10 assembler tests
Summary:
This adds checks that the expected error was actually reported against
the correct instruction, and fixes a couple of problems that that showed
up: one incorrect W32-ERR:

 v_cmp_class_f16_sdwa vcc, v1, v2 src0_sel:DWORD src1_sel:DWORD
 // W64: encoding: [0xf9,0x04,0x1e,0x7d,0x01,0x00,0x06,0x06]
-// W32-ERR: error: invalid operand for instruction
+// W32-ERR: error: {{instruction not supported on this GPU|invalid operand for instruction}}

and one missing W32-ERR:

 v_cmp_class_f16_sdwa s[6:7], v1, v2 src0_sel:DWORD src1_sel:DWORD
 // W64: encoding: [0xf9,0x04,0x1e,0x7d,0x01,0x86,0x06,0x06]
+// W32-ERR: error: invalid operand for instruction

Reviewers: rampitec, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72611
2020-01-14 08:20:50 +00:00
Jay Foad 63c3691f79 [AMDGPU] Add gfx9 assembler and disassembler test cases
Summary:
This adds assembler tests for cases that were previously only in the
disassembler tests, and vice versa.

Reviewers: rampitec, arsenm, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72592
2020-01-14 08:20:28 +00:00
Jay Foad 241f330d6b [AMDGPU] Add gfx8 assembler and disassembler test cases
Summary:
This adds assembler tests for cases that were previously only in the
disassembler tests, and vice versa.

Reviewers: rampitec, arsenm, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72561
2020-01-12 21:12:48 +00:00
Dmitry Preobrazhensky edd9f70163 [AMDGPU][MC][GFX10] Enabled v_movrel*[sdwa|dpp|dpp8] opcodes
See https://bugs.llvm.org/show_bug.cgi?id=43712

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D70170
2019-11-18 17:23:40 +03:00
Dmitry Preobrazhensky e25bc5e024 [AMDGPU][MC] Corrected src0 for v_movrelsd_b32 and v_movrelsd_2_b32
See https://bugs.llvm.org/show_bug.cgi?id=40903

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D69888
2019-11-08 16:38:56 +03:00
Dmitry Preobrazhensky b8042dbe2b [AMDGPU][MC][GFX10] Added v_interp_[p1/p2/mov]_f32_e64
See https://bugs.llvm.org/show_bug.cgi?id=43747

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D69348
2019-10-28 15:03:43 +03:00
Dmitry Preobrazhensky 6c7d7eebda [AMDGPU][MC][GFX10] Added sdwa/dpp versions of v_cndmask_b32
See https://bugs.llvm.org/show_bug.cgi?id=43608

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D69096

llvm-svn: 375241
2019-10-18 14:49:53 +00:00
Dmitry Preobrazhensky 7d325fe57b [AMDGPU][MC][GFX9] Corrected parsing of v_cndmask_b32_sdwa
See https://bugs.llvm.org/show_bug.cgi?id=43607

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D69095

llvm-svn: 375231
2019-10-18 13:31:53 +00:00