For most DPP instructions, the old operand stores the value that was in
the current lane before the DPP operation, and is tied to the
destination. For VOPC DPP, this is unnecessary and incorrect.
There appears to have been a latent bug related to D122737 with
SIInstrInfo::isOperandLegal. If you checked if a register operand was legal
when the InstructionDesc expected an immediate, it reported that is valid.
Its fix is necessary for and tested in this patch.
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.org/D130040
Supports encoding existing instrutions on gfx11 and MC support for the new VOPC
dpp instructions.
Patch 19/N for upstreaming of AMDGPU gfx11 architecture
Depends on D126978
Reviewed By: rampitec, #amdgpu
Differential Revision: https://reviews.llvm.org/D126989
On GFX10.3 targets, the following instruction sequence
v_cmp_* SGPR, ...
s_and_saveexec ..., SGPR
leads to a fairly long stall caused by a VALU write to a SGPR and having the
following SALU wait for the SGPR.
An equivalent sequence is to save the exec mask manually instead of letting
s_and_saveexec do the work and use a v_cmpx instruction instead to do the
comparison.
This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.
It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.
Same as D119696 including a buildbot and MIR test fix.
Reviewed By: critson
Differential Revision: https://reviews.llvm.org/D122332
On GFX10.3 targets, the following instruction sequence
v_cmp_* SGPR, ...
s_and_saveexec ..., SGPR
leads to a fairly long stall caused by a VALU write to a SGPR and having the
following SALU wait for the SGPR.
An equivalent sequence is to save the exec mask manually instead of letting
s_and_saveexec do the work and use a v_cmpx instruction instead to do the
comparison.
This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.
It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.
Reviewed By: sebastian-ne, critson
Differential Revision: https://reviews.llvm.org/D119696
Coyp SchedRW from pseudos to real instructions so that llvm-mca has
access to it. This is NFC for normal compiler codegen, which schedules
pseudos not real instructions.
Add an llvm-mca test for some high latency double-precision instructions
as a smoke test.
Differential Revision: https://reviews.llvm.org/D99187
Removed "implicit def VCC" from declarations of AMDGPU VOPC instructions since they do not implicitly write to VCC in SDWA mode.
Differential Revision: https://reviews.llvm.org/D89168
This is the groundwork required to implement strictfp. For now, this
should be NFC for regular instructoins (many instructions just gain an
extra use of a reserved register). Regalloc won't rematerialize
instructions with reads of physical registers, but we were suffering
from that anyway with the exec reads.
Should add it for all the related FP uses (possibly with some
extras). I did not add it to either the gpr index mode instructions
(or every single VALU instruction) since it's a ridiculous feature
already modeled as an arbitrary side effect.
Also work towards marking instructions with FP exceptions. This
doesn't actually set the bit yet since this would start to change
codegen. It seems nofpexcept is currently not implied from the regular
IR FP operations. Add it to some MIR tests where I think it might
matter.
We are duplicating predicates if several parts of the combined
predicate list contain the same condition. Added code to deduplicate
the list.
We have AssemblerPredicates and AssemblerPredicate in the
PredicateControl, but we never use AssemblerPredicates with an
actual list, so this one is dropped.
This addresses the first part of the llvm bug 43886:
https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69815
Do not generate non-existing sdwa instructions. It reduces the
number of generated instructions by 185.
Differential Revision: https://reviews.llvm.org/D69010
llvm-svn: 375016
We have done some predicate and feature refactoring lately but
did not upstream it. This is to sync.
Differential revision: https://reviews.llvm.org/D60292
llvm-svn: 357791
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
In some cases we do not copy implicit defs from pseudo to real
VOP instructions. It has no visible impact at the moment thus no
tests are affected or added.
Differential Revision: https://reviews.llvm.org/D41783
llvm-svn: 322496
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.
Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.
llvm-svn: 314742
Summary:
Previously, CodeGen checked first src operand type to determine if omod is supported by instruction. This isn't correct for some instructions: e.g. V_CMP_EQ_F32 has floating-point src operands but desn't support omod.
Changed .td files to check if dst operand instead of src operand.
Reviewers: arsenm, vpykhtin
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D35350
llvm-svn: 308179
Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9.
Reviewers: dp, arsenm, vpykhtin
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov
Differential Revision: https://reviews.llvm.org/D34026
llvm-svn: 305886
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction).
Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code).
Added LIT tests.
llvm-svn: 296873
Summary: This is needed for later SDWA support in CodeGen.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27412
llvm-svn: 290338
Summary: Real instruction should copy constraints from real instruction. This allows auto-generated disassembler to correctly process tied operands.
Reviewers: nhaustov, vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27847
llvm-svn: 290336