llvm-project

Commit Graph

Author	SHA1	Message	Date
Sebastian Neubauer	2b08f6af62	[AMDGPU] Improve register computation for indirect calls First, collect the register usage in each function, then apply the maximum register usage of all functions to functions with indirect calls. This is more accurate than guessing the maximum register usage without looking at the actual usage. As before, assume that indirect calls will hit a function in the current module. Differential Revision: https://reviews.llvm.org/D105839	2021-07-20 13:48:50 +02:00
Sebastian Neubauer	db646de3ee	[AMDGPU] Set optional PAL metadata Set informational fields in the .shader_functions table. Also correct the documentation, .scratch_memory_size and .lds_size are integers. Differential Revision: https://reviews.llvm.org/D105116	2021-07-06 11:58:00 +02:00
Carl Ritson	98f48723f2	[AMDGPU] Add 224-bit vector types and link 192-bit types to MVTs Add SReg_224, VReg_224, AReg_224, etc. Link 224-bit types with v7i32/v7f32. Link existing 192-bit types to newly added v3i64/v3f64/v6i32/v6f32. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D104622	2021-06-24 12:41:22 +09:00
Matt Arsenault	ad4a18251a	AMDGPU: Fix assert on m0_lo16/m0_hi16 These get added (redundantly) to the bundle expanded for indirect register accesses. We hit this path only when there is a call in the function.	2021-06-18 18:48:53 -04:00
madhur13490	c27e8141b3	[AMDGPU][IndirectCalls] Fix register usage propagation for indirect/external calls This patch computes max SGPRs and VGPRs used by module in presence of indirect calls and makes that as register requirement for functions/kernels which makes indirect calls. This patch also refactors code AMDGPUSubTarget.cpp which add a "base" variants of getMaxNumSGPRs which is used by MachineFunction and new Function version. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103636	2021-06-12 11:59:34 +05:30
madhur13490	3c874ce427	[AMDGPU][NFC] Remove author's name from codebase This must have made to code by accident. Differential Revision: https://reviews.llvm.org/D103484	2021-06-02 00:51:48 +05:30
Stanislav Mekhanoshin	6fb02596a2	[AMDGPU] Add support for architected flat scratch Add support for the readonly flat Scratch register initialized by the SPI. Differential Revision: https://reviews.llvm.org/D102432	2021-05-14 10:53:48 -07:00
Konstantin Zhuravlyov	f4ace63737	AMDGPU: Add target id and code object v4 support - Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object) - Add kernarg_size to kernel descriptor - Change trap handler ABI to no longer move queue pointer into s[0:1] - Cleanup ELF definitions - Add V2, V3, V4 suffixes to make a clear distinction for code object version - Consolidate note names Differential Revision: https://reviews.llvm.org/D95638	2021-03-24 11:54:05 -04:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
dfukalov	560d7e0411	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036	2021-01-20 22:22:45 +03:00
dfukalov	6a87e9b08b	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813	2021-01-07 22:22:05 +03:00
Stanislav Mekhanoshin	eb66bf0802	[AMDGPU] Print SCRATCH_EN field after the kernel Differential Revision: https://reviews.llvm.org/D93353	2020-12-15 22:44:30 -08:00
Sebastian Neubauer	5733167f54	[AMDGPU] Mark amdgpu_gfx functions as module entry function - Allows lds allocations - Writes resource usage into COMPUTE_PGM_RSRC1 registers in PAL metadata Differential Revision: https://reviews.llvm.org/D92946	2020-12-14 10:43:39 +01:00
Jay Foad	4f87d30a06	[AMDGPU] Introduce and use isGFX10Plus. NFC. It's more future-proof to use isGFX10Plus from the start, on the assumption that future architectures will be based on current architectures. Also make use of the existing isGFX9Plus in a few places. Differential Revision: https://reviews.llvm.org/D92092	2020-11-26 09:02:36 +00:00
Sebastian Neubauer	edd675643d	[AMDGPU] Emit stack frame size in metadata Add .shader_functions to pal metadata, which contains the stack frame size for all non-entry-point functions. Differential Revision: https://reviews.llvm.org/D90036	2020-11-25 16:30:02 +01:00
Matt Arsenault	79f75468b4	AMDGPU: Fix counting kernel arguments towards register usage Also use DataLayout to get type size. Relying on the IR type size is also pretty broken here, since this won't perfectly capture how types are legalized.	2020-11-20 21:23:33 -05:00
Stanislav Mekhanoshin	cf6565f6d0	[AMDGPU] Enable multi-dword flat scratch load/stores Differential Revision: https://reviews.llvm.org/D91384	2020-11-12 13:38:56 -08:00
Sebastian Neubauer	a022b1ccd8	[AMDGPU] Add amdgpu_gfx calling convention Add a calling convention called amdgpu_gfx for real function calls within graphics shaders. For the moment, this uses the same calling convention as other calls in amdgpu, with registers excluded for return address, stack pointer and stack buffer descriptor. Differential Revision: https://reviews.llvm.org/D88540	2020-11-09 16:51:44 +01:00
Sebastian Neubauer	1124bf4ab7	[AMDGPU] Set rsrc1 flags for graphics shaders Before they were only set for compute kernels and compute shaders but not for other shaders. Differential Revision: https://reviews.llvm.org/D89399	2020-11-04 12:25:41 +01:00
Matt Arsenault	1ed4caff1d	AMDGPU: Lower the threshold reported for maximum stack size exceeded Check the actual maximum supported stack size for a kernel.	2020-10-21 12:06:27 -04:00
Sebastian Neubauer	dd3f7a494a	[AMDGPU] Add a message to an assert	2020-10-16 13:03:20 +02:00
Konstantin Zhuravlyov	3fdf3b1539	AMDGPU: Update AMDHSA code object version handling Differential Revision: https://reviews.llvm.org/D89076	2020-10-14 13:04:27 -04:00
Dmitry Preobrazhensky	1c9d681092	[AMDGPU][CODEGEN] Added support of new inline assembler constraints Added support for constraints 'I', 'J', 'B', 'C', 'DA', 'DB'. See https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D81651	2020-07-02 17:20:15 +03:00
Guillaume Chatelet	52911428ef	[Alignment][NFC] Migrate AMDGPU backend to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82743	2020-06-29 11:56:06 +00:00
Matt Arsenault	6f09bb7da2	AMDGPU: Don't pass MachineFunction if only the IR Function is used	2020-06-18 11:06:46 -04:00
Matt Arsenault	5e007fe998	AMDGPU: Support non-entry block static sized allocas OpenMP emits these for some reason, so handle them. Assume these use 4096 bytes by default, with a flag to override this. Also change the related stack assumption for calls to have a flag.	2020-05-27 18:46:10 -04:00
Dmitry Preobrazhensky	b087b91c91	[AMDGPU][CODEGEN] Added 'A' constraint for inline assembler Summary: 'A' constraint requires an immediate int or fp constant that can be inlined in an instruction encoding. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D78494	2020-05-25 14:23:34 +03:00
Matt Arsenault	2e82667f60	AMDGPU: Define mode register This should eventually model FP mode constraints as well as the other special fields it tracks.	2020-05-23 13:24:42 -04:00
Michael Liao	4ee5a04187	[amdgpu] Fix check of VCC. Summary: - Need to include checking on the new 16-bit subregs. Reviewers: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79498	2020-05-06 14:16:37 -04:00
Stanislav Mekhanoshin	8a30460697	[AMDGPU] Define AGPR subregs These are only needed as VGPR counterpart. Differential Revision: https://reviews.llvm.org/D78597	2020-04-28 15:30:43 -07:00
Stanislav Mekhanoshin	46a75436f8	[AMDGPU] Define special SGPR subregs These are used in SReg_32 and when we start to use SGPR_LO16 there will be compaints that not all registers in RC support all subreg indexes. For now it is NFC. Unused regunits are reserved so that verifier does not complain about missing phys reg live-ins. Differential Revision: https://reviews.llvm.org/D78591	2020-04-28 14:57:46 -07:00
Stanislav Mekhanoshin	395d93358e	Revert "[AMDGPU] Define special SGPR subregs" This reverts commit `1baaa080e0`.	2020-04-28 13:53:15 -07:00
Stanislav Mekhanoshin	1baaa080e0	[AMDGPU] Define special SGPR subregs These are used in SReg_32 and when we start to use SGPR_LO16 there will be compaints that not all registers in RC support all subreg indexes. For now it is NFC. Unused regunits are reserved so that verifier does not complain about missing phys reg live-ins. Differential Revision: https://reviews.llvm.org/D78591	2020-04-28 13:34:24 -07:00
Jay Foad	dbdffe3ee9	[AMDGPU] Add 192-bit register classes Differential Revision: https://reviews.llvm.org/D78312	2020-04-22 13:10:37 +01:00
Jay Foad	d625b4b081	[AMDGPU] Add missing AReg classes Add 96-bit, 160-bit and 256-bit AReg classes to match VReg and SReg. NFC as far as I know, but it may avoid weird legalization problems. Differential Revision: https://reviews.llvm.org/D78348	2020-04-22 13:10:37 +01:00
Stanislav Mekhanoshin	2e94a64b57	[AMDGPU] Define 16 bit SGPR subregs These are needed as a counterpart for VGPR subregs even though there are no scalar instructions which can operate 16 bit values. When we are materializing a constant that is done into an SGPR and that SGPR may/will be copied into a 16 bit VGPR subreg. Such copy is illegal. There are also similar problems if a source operand of a 16 bit VALU instruction is an SGPR. In addition we need to get a register with a lo16 subregister of an SGPR RC during selection and this fails as well. All of that makes me believe we need these subregisters as a syntactic glue. Differential Revision: https://reviews.llvm.org/D78250	2020-04-16 10:31:39 -07:00
Matt Arsenault	4ea1baf6a0	AMDGPU: Initial, crude support for indirect calls This isn't really usable, and requires using the -amdgpu-fixed-function-abi flag to work. Assumes a uniform call target, and will hit a verifier error if the call target ends up in a VGPR. Also doesn't attempt to do anything sensible for the reported register/stack usage.	2020-03-18 12:03:48 -04:00
Guillaume Chatelet	d000655a8c	[Alignment][NFC] Deprecate getMaxAlignment Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76348	2020-03-18 14:48:45 +01:00
Sebastian Neubauer	4327a9b46b	[AMDGPU] Use progbits type for .AMDGPU.disasm section The note section type implies a specific format that this section does not have thus tools like readelf fail here. Progbits has no format and another pipeline compiler already sets the type to progbits. Differential Revision: https://reviews.llvm.org/D75913	2020-03-12 09:08:11 +01:00
Fangrui Song	692e0c9648	[MC] Add MCStreamer::emitInt{8,16,32,64} Similar to AsmPrinter::emitInt{8,16,32,64}.	2020-02-29 09:40:21 -08:00
Fangrui Song	774971030d	[MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex}	2020-02-14 23:08:40 -08:00
Fangrui Song	6d2d589b06	[MC] De-capitalize another set of MCStreamer::Emit* functions Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc	2020-02-14 19:26:52 -08:00
Fangrui Song	a55daa1461	[MC] De-capitalize some MCStreamer::Emit* functions	2020-02-14 19:11:53 -08:00
Fangrui Song	1d49eb00d9	[AsmPrinter] De-capitalize all AsmPrinter::Emit* but EmitInstruction Similar to rL328848.	2020-02-13 17:06:24 -08:00
Fangrui Song	0bc77a0f0d	[AsmPrinter] De-capitalize some AsmPrinter::Emit* functions Similar to rL328848.	2020-02-13 13:38:33 -08:00
Fangrui Song	0dce409cee	[AsmPrinter] De-capitalize Emit{Function,BasicBlock]* and Emit{Start,End}OfAsmFile	2020-02-13 13:22:49 -08:00
Matt Arsenault	1024b73ef5	AMDGPU: Split denormal mode tracking bits Prepare to accurately track the future denormal-fp-math attribute changes. The way to actually set these separately is not wired in yet. This is just a mechanical change, and mostly still assumes the input and output mode match. This should be refined for some cases. For example, fcanonicalize lowering should use the flushing variant if either input or output flushing is enabled	2020-02-04 10:44:21 -08:00
Tom Stellard	0dbcb36394	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 36221 nm after/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439	2020-01-14 19:46:52 -08:00
James Henderson	d68904f957	[NFC] Fix trivial typos in comments Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.	2020-01-06 10:50:26 +00:00
Matt Arsenault	db0ed3e429	AMDGPU: Refactor treatment of denormal mode Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and be moved into a separate function attribute. This patch is still NFC. The denormal mode remains as a subtarget feature for now, but make the necessary changes to switch to using an attribute.	2019-11-19 19:55:43 +05:30

1 2 3 4 5 ...

256 Commits