Commit Graph

100 Commits

Author SHA1 Message Date
Jay Foad d7762a3b36 [AMDGPU] Increase instruction cache line size to 128 bytes for GFX11
Differential Revision: https://reviews.llvm.org/D128189
2022-06-20 14:25:10 +01:00
Fangrui Song adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Joe Nash ea3c9a87d3 [AMDGPU] gfx11 add bits to COMPUTE_PGM_RSRC3
Contributors:
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 21/N for upstreaming of AMDGPU gfx11 architecture

Depends on D127143

Reviewed By: rampitec, #amdgpu, kzhuravl

Differential Revision: https://reviews.llvm.org/D127241
2022-06-10 13:07:14 -04:00
Fangrui Song 15d82c62dc [MC] De-capitalize MCStreamer functions
Follow-up to c031378ce0 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
Joe Nash 813e521e55 [AMDGPU] Add gfx11 subtarget ELF definition
This is the first patch of a series to upstream support for the new
subtarget.

Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 1/N for upstreaming AMDGPU gfx11 architectures.

Reviewed By: foad, kzhuravl, #amdgpu

Differential Revision: https://reviews.llvm.org/D124536
2022-04-29 12:27:17 -04:00
Jacob Lambert 5160447f58 [AMDGPU] Add gfx10 assembler directive to specify shared VGPR count
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D105507
2022-03-07 14:27:41 -08:00
Aakanksha 840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin 2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Sebastian Neubauer 6527b2a4d5 [AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235
2022-02-18 15:05:21 +01:00
serge-sans-paille ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Changpeng Fang 1194b9cdda AMDGPU {NFC}: Add code object v5 support and generate metadata for implicit kernel args
Summary:
  Add code object v5 support (deafult is still v4)
  Generate metadata for implicit kernel args for the new ABI
  Set the metadata version to be 1.2

Reviewers:
  t-tye, b-sumner, arsenm, and bcahoon

Fixes:
  SWDEV-307188, SWDEV-307189

Differential Revision:
  https://reviews.llvm.org/D118272
2022-01-31 18:07:47 -08:00
Matt Arsenault e6564f39c7 AMDGPU: Emit user sgpr count directives in text asm
We were emitting these in the object file but not printing them.
2022-01-26 13:51:12 -05:00
Aakanksha Patil 3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Brendon Cahoon 294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Brendon Cahoon 211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Brendon Cahoon ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Stanislav Mekhanoshin 6fb02596a2 [AMDGPU] Add support for architected flat scratch
Add support for the readonly flat Scratch register initialized
by the SPI.

Differential Revision: https://reviews.llvm.org/D102432
2021-05-14 10:53:48 -07:00
Aakanksha Patil 464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00
Konstantin Zhuravlyov f4ace63737 AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
  - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
    - Add kernarg_size to kernel descriptor
    - Change trap handler ABI to no longer move queue pointer into s[0:1]
  - Cleanup ELF definitions
    - Add V2, V3, V4 suffixes to make a clear distinction for code object version
    - Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638
2021-03-24 11:54:05 -04:00
Jay Foad 288ea820cf [AMDGPU] Refactor AMDGPUTargetStreamer::EmitCodeEnd
Refactor and add comments to explain where the magic numbers come from
in terms of the instruction cache line size. NFC.

Differential Revision: https://reviews.llvm.org/D98266
2021-03-09 19:02:18 +00:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Simon Pilgrim f82cff31d3 [AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI.
Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions
2021-01-26 16:09:39 +00:00
dfukalov 6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Tony d5ea8f7010 [AMDGPU] Clarify scratch initialization
- Clarify documentation on initializing scratch.
- Rename compute_pgm_rsrc2 field for enabling scratch from
  ENABLE_SGPR_PRIVATE_SEGMENT_WAVEFRONT_OFFSET to
  ENABLE_PRIVATE_SEGMENT to match hardware definition.

Differential Revision: https://reviews.llvm.org/D93271
2020-12-15 20:14:20 +00:00
Tim Renouf 89d41f3a2b [AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447

Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03 16:27:48 +00:00
Tim Renouf ee3e642627 [AMDGPU] Add gfx90c target
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.

Differential Revision: https://reviews.llvm.org/D90419

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-11-03 16:27:43 +00:00
Stanislav Mekhanoshin d1beb95d12 [AMDGPU] gfx1032 target
Differential Revision: https://reviews.llvm.org/D89487
2020-10-15 12:41:18 -07:00
Tim Renouf 666ef0db20 [AMDGPU] Add gfx602, gfx705, gfx805 targets
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:

* The "Oland" and "Hainan" variants of gfx601 are now split out into
  gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
  that to avoid using the shaderZExport workaround on gfx602.

* One variant of gfx703 is now split out into gfx705. LLPC and other
  front-ends could use that to avoid using the
  shaderSpiCsRegAllocFragmentation workaround on gfx705.

* The "TongaPro" variant of gfx802 is now split out into gfx805.
  TongaPro has a faster 64-bit shift than its former friends in gfx802,
  and a subtarget feature could be set up for that to take advantage of
  it. This commit does not make that change; it just adds the target.

V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
    so fix the GPUKind order.

Differential Revision: https://reviews.llvm.org/D88916

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-10-10 17:22:22 +01:00
Scott Linder e4a9e4ef55 [AMDGPU] Emit correct kernel descriptor on big-endian hosts
Previously we wrote multi-byte values out as-is from host memory. Use
the `emitIntN` helpers in `MCStreamer` to produce a valid descriptor
irrespective of the host endianness.

Reviewed By: arsenm, rochauha

Differential Revision: https://reviews.llvm.org/D88858
2020-10-06 17:29:38 +00:00
Steven Perron eed6476a87 Reset PAL metadata when AMDGPU traget stream finishes
If the same stream object is used for multiple compiles, the PAL metadata from eariler compilations will leak into later one.  See https://github.com/GPUOpen-Drivers/llpc/issues/882 for how this is happening in LLPC.

No tests were added because multiple compiles will have to happen using the same pass manager, and I do not see a setup for that on the LLVM side.  Let me know if there is a good way to test this.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D85667
2020-08-17 10:56:11 -04:00
Stanislav Mekhanoshin ea7d0e2996 [AMDGPU] gfx1031 target
Differential Revision: https://reviews.llvm.org/D85337
2020-08-05 12:36:26 -07:00
Guillaume Chatelet 52911428ef [Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82743
2020-06-29 11:56:06 +00:00
Stanislav Mekhanoshin 9ee272f13d [AMDGPU] Add gfx1030 target
Differential Revision: https://reviews.llvm.org/D81886
2020-06-15 16:18:05 -07:00
Sebastian Neubauer 1de4e56933 [AMDGPU] Don't mark the .note section as ALLOC
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.

On AMDHSA, .note is still marked as ALLOC, apparently this is currently
needed for OpenCL (see https://reviews.llvm.org/D74995).

Differential Revision: https://reviews.llvm.org/D76278
2020-05-05 14:21:45 +02:00
Fangrui Song 692e0c9648 [MC] Add MCStreamer::emitInt{8,16,32,64}
Similar to AsmPrinter::emitInt{8,16,32,64}.
2020-02-29 09:40:21 -08:00
Mark Searles d3e170c438 Revert "[AMDGPU] Don’t marke the .note section as ALLOC"
This reverts commit 977cd661cf.

It breaks OpenCL testing. OpenCL Runtime is using PT_LOAD information
to calculate memory for global variables. This commit should be relanded once
the OpenCL runtime stops relying on PT_LOAD information for calculating global
variable memory size.

Differential Revision: https://reviews.llvm.org/D74995
2020-02-21 16:08:30 -08:00
Sebastian Neubauer 977cd661cf [AMDGPU] Don’t marke the .note section as ALLOC
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.

Differential Revision: https://reviews.llvm.org/D74600
2020-02-20 15:14:48 +01:00
Fangrui Song 774971030d [MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex} 2020-02-14 23:08:40 -08:00
Fangrui Song 6d2d589b06 [MC] De-capitalize another set of MCStreamer::Emit* functions
Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc
2020-02-14 19:26:52 -08:00
Fangrui Song a55daa1461 [MC] De-capitalize some MCStreamer::Emit* functions 2020-02-14 19:11:53 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Simon Pilgrim ea2c159f96 [AMDGPU] Fix "use of uninitialized variable" static analyzer warning. NFCI.
Add "unreachable" default case to AMDGPUTargetStreamer::getArchNameFromElfMach
2020-01-06 16:36:56 +00:00
Stanislav Mekhanoshin c43784ff26 [AMDGPU] Increase kernel padding
To support prefetch mode 3 we need to pad current
cacheline and fill 3 cachelines after. Current padding
is only sufficient for mode 2.

Differential Revision: https://reviews.llvm.org/D65236

llvm-svn: 366938
2019-07-24 19:40:13 +00:00
Stanislav Mekhanoshin 22b2c3d651 [AMDGPU] gfx908 target
Differential Revision: https://reviews.llvm.org/D64429

llvm-svn: 365525
2019-07-09 18:10:06 +00:00
Nicolai Haehnle 08e8cb5760 AMDGPU/MC: Add .amdgpu_lds directive
Summary:
The directive defines a symbol as an group/local memory (LDS) symbol.
LDS symbols behave similar to common symbols for the purposes of ELF,
using the processor-specific SHN_AMDGPU_LDS as section index.

It is the linker and/or runtime loader's job to "instantiate" LDS symbols
and resolve relocations that reference them.

It is not possible to initialize LDS memory (not even zero-initialize
as for .bss).

We want to be able to link together objects -- starting with relocatable
objects, but possible expanding to shared objects in the future -- that
access LDS memory in a flexible way.

LDS memory is in an address space that is entirely separate from the
address space that contains the program image (code and normal data),
so having program segments for it doesn't really make sense.

Furthermore, we want to be able to compile multiple kernels in a
compilation unit which have disjoint use of LDS memory. In that case,
we may want to place LDS symbols differently for different kernels
to save memory (LDS memory is very limited and physically private to
each kernel invocation), so we can't simply place LDS symbols in a
.lds section.

Hence this solution where LDS symbols always stay undefined.

Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f

Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61493

llvm-svn: 364296
2019-06-25 11:51:35 +00:00
Stanislav Mekhanoshin 5d00c3060e [AMDGPU] gfx1010 wave32 metadata
Differential Revision: https://reviews.llvm.org/D63207

llvm-svn: 363577
2019-06-17 16:48:56 +00:00
Stanislav Mekhanoshin c43e67bfff [AMDGPU] gfx1011/gfx1012 targets
Differential Revision: https://reviews.llvm.org/D63307

llvm-svn: 363344
2019-06-14 00:33:31 +00:00
Stanislav Mekhanoshin 41bbe101a2 [AMDGPU] gfx1010 s_code_end generation
Also add some missing metadata in the streamer.

Differential Revision: https://reviews.llvm.org/D61531

llvm-svn: 359937
2019-05-03 21:26:39 +00:00
Stanislav Mekhanoshin cee607e414 [AMDGPU] Add gfx1010 target definitions
Differential Revision: https://reviews.llvm.org/D61041

llvm-svn: 359113
2019-04-24 17:03:15 +00:00
Tim Renouf e7bd52f86e [AMDGPU] Added MsgPack format PAL metadata
Summary:
PAL metadata now supports both the old linear reg=val pairs format and
the new MsgPack format.

The MsgPack format uses YAML as its textual representation. On output to
YAML, a mnemonic name is provided for some hardware registers.

Differential Revision: https://reviews.llvm.org/D57028

Change-Id: I2bbaabaaca4b3574f7e03b80fbef7c7a69d06a94
llvm-svn: 356591
2019-03-20 18:47:21 +00:00