Tests don't cover the masked input path since non-kernel arguments
aren't lowered yet.
Test is copied directly from the existing test, with 2 additions.
llvm-svn: 364833
Replace the brcond for the 2 cases that act as branches. For now
follow how the current system works, although I think we can
eventually get rid of the pseudos.
llvm-svn: 364832
This needs to be extended to s32, and expanded into cmp+select. This
is relying on the fact that widenScalar happens to leave the
instruction in place, but this isn't a guaranteed property of
LegalizerHelper.
llvm-svn: 364831
The condition register bank must be scc or vcc so that a copy will be
inserted, which will be lowered to a compare.
Currently greedy unnecessarily forces using a VCC select.
llvm-svn: 364825
Summary:
ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.
Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276
Reviewers: mareko, arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63716
llvm-svn: 364815
Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.
llvm-svn: 364806
There are several things broken, but at least emit the right thing for
gfx9.
The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.
llvm-svn: 364804
Summary:
The stride should depend on the wave size, not the hardware generation.
Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be
relevant.
Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6
Reviewers: arsenm, rampitec, mareko
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63808
llvm-svn: 364788
This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.
Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.
llvm-svn: 364761
isLoopExiting should only be called for blocks in the loop. A follow
up patch makes this requirement an assertion.
I've updated the usage here, to only match for actual exit blocks. Previously,
it would also match blocks not in the loop.
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D63980
llvm-svn: 364750