llvm-project

Commit Graph

Author	SHA1	Message	Date
Konstantin Pyzhov	72e8754916	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 09:05:58 -04:00
Konstantin Pyzhov	51dc028314	Revert `e1730cfeb3`	2020-04-06 05:56:11 -04:00
Konstantin Pyzhov	e1730cfeb3	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 05:10:37 -04:00
Scott Linder	60b1967c39	[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in the entry function prologue. This allows us to remove the scratch wave offset register from the calling convention ABI. As part of this change, allow the use of an inline constant zero for the SOffset of MUBUF instructions accessing the stack in entry functions when a frame pointer is not requested/required. Entry functions with calls still need to set up the calling convention ABI stack pointer register, and reference it in order to address arguments of called functions. The ABI stack pointer register remains unswizzled, but is now wave-relative instead of queue-relative. Non-entry functions also use an inline constant zero SOffset for wave-relative scratch access, but continue to use the stack and frame pointers as before. When the stack or frame pointer is converted to a swizzled offset it is now scaled directly, as the scratch wave offset no longer needs to be subtracted first. Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling convention. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75138	2020-03-19 15:35:16 -04:00
Matt Arsenault	37c452a289	AMDGPU/GlobalISel: Adjust branch target when lowering loop intrinsic This needs to steal the branch target like the other control flow intrinsics.	2020-02-18 06:35:40 -08:00
Matt Arsenault	54137bbaaf	GlobalISel: Allow running localizer earlier This required legal and regbankselected MIR for seemingly no reason. For AMDGPU this wouldn't see legalized G_GLOBAL_VALUEs.	2020-02-17 11:24:06 -08:00
Matt Arsenault	e174c278ca	AMDGPU/GlobalISel: Fix mapping G_ICMP with constrained result When SI_IF is inserted, it constrains the source register with a register class, which was quite likely a G_ICMP. This was incorrectly treating it as a scalar, and then applyMappingImpl would end up producing invalid MIR since this was unexpected. Also fix not using all VGPR sources for vcc outputs.	2020-02-12 16:19:45 -08:00
cdevadas	e53a9d96e6	Resubmit: [AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092	2020-01-22 13:18:32 +09:00
Nicolai Hähnle	a80291ce10	Revert "[AMDGPU] Invert the handling of skip insertion." This reverts commit `0dc6c249bf`. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Mesa.	2020-01-21 09:17:25 +01:00
cdevadas	0dc6c249bf	[AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092	2020-01-15 15:18:16 +05:30
Matt Arsenault	ca19d7a399	AMDGPU/GlobalISel: Fix branch targets when emitting SI_IF The branch target needs to be changed depending on whether there is an unconditional branch or not. Loops also need to be similarly fixed, but compiling a simple testcase end to end requires another set of patches that aren't upstream yet.	2020-01-13 12:51:05 -05:00

11 Commits