llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanislav Mekhanoshin	c80d8a8cea	[AMDGPU] MachineLICM cannot hoist VALU MachineLoop::isLoopInvariant() returns false for all VALU because of the exec use. Check TII::isIgnorableUse() to allow hoisting. That unfortunately results in higher register consumption since MachineLICM does not adequately estimate pressure. Therefore I think it shall only be enabled after D107677 even though it does not depend on it. Differential Revision: https://reviews.llvm.org/D107859	2021-10-20 11:47:24 -07:00
Joe Nash	3ce1b9631a	[AMDGPU] Switch PostRA sched to MachineSched Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D109536 Change-Id: Ia86ba2ae168f12fb34b4d8efdab491f84d936cde	2021-09-14 15:11:27 -04:00
Matt Arsenault	d719f1c3cc	AMDGPU: Add alloc priority to global ranges The requested register class priorities weren't respected globally. Not sure why this is a target option, and not just the expected behavior (recently added in `1a6dc92be7`). This avoids an allocation failure when many wide tuple spills are introduced. I think this is a workaround since I would not expect the allocation priority to be required, and only a performance hint. The allocator should be smarter about when only a subregister needs to be spilled and restored. This does regress a couple of degenerate store stress lit tests which shouldn't be too important.	2021-08-10 13:12:34 -04:00
Tony Tye	7f19aa73c2	[AMDGPU] Update gfx90a memory model support Update AMDGPU gfx90a memory model to make coarse grain memory allocations consistent when fine grained system scope atomic acquire and release is performed. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D105137	2021-06-30 04:05:22 +00:00
Stanislav Mekhanoshin	45764efb69	[AMDGPU] Do not check denorm for LDS FP atomic with unsafe flag This is already how it is handled for global and flat atomics. Differential Revision: https://reviews.llvm.org/D102366	2021-05-17 16:53:09 -07:00
Stanislav Mekhanoshin	8f98356bb5	[AMDGPU] Only allow global fp atomics with unsafe option Previously we were allowing to use FP atomics without -amdgpu-unsafe-fp-atomics option if a scope is less than system. This is not safe just as well if we have UC memory. This change only allows global and flat FP atomics with the unsafe option. Consequently that makes a check for denorm mode redundant since we skip it with the unsafe option and do not have a way to produce these instructions without it anyway. Differential Revision: https://reviews.llvm.org/D102347	2021-05-13 08:52:20 -07:00
Austin Kerbow	f5199d7ae0	[AMDGPU] Revise handling of preexisting waitcnt Preexisting waitcnt may not update the scoreboard if the instruction being examined needed to wait on fewer counters than what was encoded in the old waitcnt instruction. Fixing this results in the elimination of some redundant waitcnt. These changes also enable combining consecutive waitcnt into a single S_WAITCNT or S_WAITCNT_VSCNT instruction. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D100281	2021-05-05 17:21:33 -07:00
Stanislav Mekhanoshin	189310a140	[AMDGPU] Allow -amdgpu-unsafe-fp-atomics to ignore denorm mode Fixes: SWDEV-274276 Differential Revision: https://reviews.llvm.org/D100072	2021-04-08 12:46:36 -07:00
Tony Tye	4658cd4c18	[AMDGPU] Update gfx90a memory model support Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D100070	2021-04-07 22:17:58 +00:00
Stanislav Mekhanoshin	30b3aab329	Copy syncscope when expanding atomicrmw into cmpxchg loop Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902	2021-04-05 17:29:38 -07:00
Stanislav Mekhanoshin	574a9dabc6	[AMDGPU] Always expand system scope fp atomics on gfx90a FP atomics in system scope cannot be used and shall always be expanded in a CAS loop. Differential Revision: https://reviews.llvm.org/D98085	2021-03-10 12:35:23 -08:00
Matt Arsenault	589223e044	AMDGPU: Remove special case in shouldCoalesce Unaligned registers are now constrained with classes, rather than specially reserving a subset of the whole class.	2021-02-24 14:49:44 -05:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00

13 Commits