memory-barrier instructions to providing targets and developers a convenient
way to explicitly declare which instructions are memory-barriers.
Differential Revision: https://reviews.llvm.org/D116779
This implementation allows mca to model the desired behaviour of the s_waitcnt
instruction. This patch also adds the RetireOOO flag to the AMDGPU instructions
within the scheduling model. This flag is only used by mca and allows
instructions to finish out-of-order which helps mca's simulations more closely
model the actual device.
Differential Revision: https://reviews.llvm.org/D104730
This commit also makes some slight changes to the scheduling model for AMDGPU to set the RetireOOO flag for all scheduling classes.
This flag is only used by llvm-mca and allows instructions to retire out of order.
See the differential link below for a deeper explanation of everything.
Differential Revision: https://reviews.llvm.org/D104730
Change --max-timeline-cycles=0 to mean no limit on the number of cycles.
Use this in AMDGPU tests to show all instructions in the timeline view
instead of having it arbitrarily truncated.
Differential Revision: https://reviews.llvm.org/D104846
Instructions on the transcendental unit are executed in parallel to the
normal VALU, so add this as an extra resource.
This doesn't seem to have any effect, but it should be more correct.
Differential Revision: https://reviews.llvm.org/D100123
Instructions that have more uops than the processor's IssueWidth are
issued in multiple cycles.
The patch fixes PR49712.
Differential Revision: https://reviews.llvm.org/D99339
This is a follow-up for:
D98604 [MCA] Ensure that writes occur in-order
When instructions are aligned by the order of writes, they retire
in-order naturally. There is no need for an RCU, so it is disabled.
Differential Revision: https://reviews.llvm.org/D98628
Coyp SchedRW from pseudos to real instructions so that llvm-mca has
access to it. This is NFC for normal compiler codegen, which schedules
pseudos not real instructions.
Add an llvm-mca test for some high latency double-precision instructions
as a smoke test.
Differential Revision: https://reviews.llvm.org/D99187