This brings the MachineInstrs in line with the corresponding intrinsics
which have side effects but do not access memory. It also matches how
BUF cache invalidation instructions are defined.
The lit test changes are just because the machine scheduler previously
treated them like loads, and added an artificial scheduling edge from
them to the exit SU, which caused them to be scheduled earlier.
Differential Revision: https://reviews.llvm.org/D126074
It is already marked as having side effects, at least in MIR. It does
not interact with anything else that is modelled as a memory access
either in IR or MachineIR.
Differential Revision: https://reviews.llvm.org/D125985
Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.
Contributors:
Petar Avramovic <Petar.Avramovic@amd.com>
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>
Depends on D125316
Patch 6/N for upstreaming of AMDGPU gfx11 architecture.
Reviewed By: dp, Petar.Avramovic
Differential Revision: https://reviews.llvm.org/D125319
Adding IntrHasSideEffects to @llvm.amdgcn.s.memtime and
@llvm.amdgcn.s.memrealtime means that we can stop pretending they read
and write memory, and similarly for the corresponding pseudo
instructions.
This should stop these intrinsics from being rescheduled past all other
instructions, even ones which don't load or store.
See also https://reviews.llvm.org/D58635.
Differential Revision: https://reviews.llvm.org/D115227
When used as a non-leaf node, TableGen does not currently use the type
of a ComplexPattern for type inference, which also means it does not
check it doesn't conflict with the use. This differs from when used as a
leaf value, where the type is used for inference. Fixing that
discrepancy is something I intend to upstream as a subsequent review.
AMDGPU currently has several ComplexPatterns that are used in contexts
where they're expected to be an iPTR, and where using an iPTR instead of
a fixed-width integer type matters. With my locally-patched TableGen,
none of these mismatches result in type contradictions, but do change
the patterns and cause various failures to select. These changes to the
ComplexPatterns' types reflect how they are actually used, result in
bit-for-bit identical TableGen output (without my local TableGen patch),
and ensure that with improved type inference AMDGPU's backend will
continue to work.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D109032
This does not affect codegen, which only tests these flags on Pseudo
instructions, but might help llvm-mca which has to work with Real
instructions. In particular setting LGKM_CNT on DS instructions helps
with the problem identified in D104149.
Differential Revision: https://reviews.llvm.org/D104293
Coyp SchedRW from pseudos to real instructions so that llvm-mca has
access to it. This is NFC for normal compiler codegen, which schedules
pseudos not real instructions.
Add an llvm-mca test for some high latency double-precision instructions
as a smoke test.
Differential Revision: https://reviews.llvm.org/D99187
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.
Additional advantage that parser will accept these flags in any order unlike
now.
Differential Revision: https://reviews.llvm.org/D96469
gfx1030 added a new way to implement readcyclecounter using the
SHADER_CYCLES hardware register, but the s_memtime instruction still
exists, so the MC layer should still accept it and the
llvm.amdgcn.s.memtime intrinsic should still work.
Differential Revision: https://reviews.llvm.org/D97928
We did not have atomic flags on SMRD, did not copy TSFlags
to real instructions, and did not have ret/noret atomic map.
At the moment it is NFC, but needed for D96469.
Differential Revision: https://reviews.llvm.org/D96823
We have this subtarget feature so it makes sense to use it here. This is
NFC because it's always defined by default on GFX8+.
Differential Revision: https://reviews.llvm.org/D93202
The usage of the Imm out argument from SelectSMRDOffset is pretty
confusing. Stop trying to reject CI immediates in the case where the
offset field can be used. It's not an illegal way to encode the
immediate, so just prefer the better encoding pattern with
AddedComplexity.
We probably don't even really need the different opcodes for the
different offset types anymore, but that will be more work to cleanup.
The SMRD non-buffer load patterns could also use a cleanup to be done
separately.
There's not much value to this separate node from the intrinsic. Make
the operand structure the same as the intrinsic, so we can reuse the
same pattern for GlobalISel.
We are duplicating predicates if several parts of the combined
predicate list contain the same condition. Added code to deduplicate
the list.
We have AssemblerPredicates and AssemblerPredicate in the
PredicateControl, but we never use AssemblerPredicates with an
actual list, so this one is dropped.
This addresses the first part of the llvm bug 43886:
https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69815
We have done some predicate and feature refactoring lately but
did not upstream it. This is to sync.
Differential revision: https://reviews.llvm.org/D60292
llvm-svn: 357791