Commit Graph

6 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin 555d8f4ef5 [AMDGPU] Bundle loads before post-RA scheduler
We are relying on atrificial DAG edges inserted by the
MemOpClusterMutation to keep loads and stores together in the
post-RA scheduler. This does not work all the time since it
allows to schedule a completely independent instruction in the
middle of the cluster.

Removed the DAG mutation and added pass to bundle already
clustered instructions. These bundles are unpacked before the
memory legalizer because it does not work with bundles but also
because it allows to insert waitcounts in the middle of a store
cluster.

Removing artificial edges also allows a more relaxed scheduling.

Differential Revision: https://reviews.llvm.org/D72737
2020-01-24 11:33:38 -08:00
Amaury Sechet 05f56a1ddd [AMDGPU] Automatically generate various tests. NFC
llvm-svn: 369787
2019-08-23 17:58:49 +00:00
Ron Lieberman f48e43bbf7 [AMDGPU] Disable SReg Global LD/ST, perf regression
Differential Revision: https://reviews.llvm.org/D55093

llvm-svn: 348014
2018-11-30 18:29:17 +00:00
Ron Lieberman cac749ac88 [AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST
Add a pass to fixup various vector ISel issues.
Currently we handle converting GLOBAL_{LOAD|STORE}_*
and GLOBAL_Atomic_* instructions into their _SADDR variants.
This involves feeding the sreg into the saddr field of the new instruction.

llvm-svn: 347008
2018-11-16 01:13:34 +00:00
Mark Searles 72da47df25 run post-RA hazard recognizer pass late
Memory legalizer, waitcnt, and shrink  passes can perturb the instructions,
which means that the post-RA hazard recognizer pass should run after them.
Otherwise, one of those passes may invalidate the work done by the hazard
recognizer. Note that this has adverse side-effect that any consecutive
S_NOP 0's, emitted by the hazard recognizer, will not be shrunk into a
single S_NOP <N>. This should be addressed in a follow-on patch.

Differential Revision: https://reviews.llvm.org/D49288

llvm-svn: 337154
2018-07-16 10:02:41 +00:00
Stanislav Mekhanoshin 739174c4be [AMDGPU] Construct memory clauses before RA
Memory clauses are formed into bundles in presence of xnack.
Their source operands are marked as early-clobber.

This allows to allocate distinct source and destination registers
within a clause and prevent breaking the clause with s_nop in the
hazard recognizer.

Clauses are undone before post-RA scheduler to allow some rescheduling,
which will not break the clause since artificial edges are created in
the dag to keep memory operations together. Yet this allows a better
ILP in some cases.

Differential Revision: https://reviews.llvm.org/D47511

llvm-svn: 333691
2018-05-31 20:13:51 +00:00