This allows computeKnownBits to see the constant being loaded.
This recovers the rv64zbp test case changes from D127520.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127679
Rather than emitting a MachineSDNode from lowering. Let isel match it.
This is consistent with the RISCVISD::HI and ADD_LO nodes that were
also added. Having them both the same will make D127679 consistent.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127714
Instead add RISCVISD opcodes that will be selected to LUI/ADDI
during isel.
I'm looking into maybe moving doPeepholeLoadStoreADDI into isel.
Having the ADDI as a RISCVISD node will make it visible to isel.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127713
We were incorrectly creating a VRGATHER node with i1 vector type. We
could support this by promoting the mask to i8 and truncating it, but
for now I want to prevent the crash.
Fixes PR56007.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127681
This simplifies the isel code by removing the manual load creation.
It also improves our ability to use 0 strided loads for vector splats.
There is an assumption here that Mask and ShiftedMask constants are
cheap enough that they don't become constant pool loads so that our
isel optimizations involving And still work. I believe those constants
are 3 instructions in the worst case.
The rv64zbp-intrinsic.ll changes is a regression caused by intrinsics
being expanded to RISCVISD also occuring during lowering. So the optimizations
were only happening during the last DAGCombine, which can't see through the
load. I believe we can fix this test by implementing
TargetLowering::getTargetConstantFromLoad for RISC-V or by adding the intrinsic
to computeKnownBitsForTargetNode to enable earlier DAG combine. Since Zbp is not
a ratified extension, I don't view these as blocking this patch.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127520
This prevents them from being assumed legal by the cost model.
This matches what is done for AArch64 SVE.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D123799
Based on D24038.
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D126181
We enable a custom handler to optimize conversions between scalars
and fixed vectors. Unfortunately, the custom handler picks up scalar
to scalar conversions as well. If the scalar types are both legal,
we wouldn't match any of the fixed vector cases and would return SDValue()
causing the LegalizeDAG to expand the bitcast through memory.
This patch fixes this by checking if it's a scalar to scalar conversion
and returns `Op` if both types are legal.
Differential Revision: https://reviews.llvm.org/D126739
When lowering GlobalAddressNodes, we were removing a non-zero offset and
creating a separate ADD.
It already comes out of SelectionDAGBuilder with a separate ADD. The
ADD was being removed by DAGCombiner.
This patch disables the DAG combine so we don't have to reverse it.
Test changes all look to be instruction order changes. Probably due
to different DAG node ordering.
Differential Revision: https://reviews.llvm.org/D126558
A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a processor to indicate a preference for unaligned load/stores, so add a subtarget feature.
There doesn't appear to be a formal extension for unaligned support. The RISCV Profiles (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva20u64-profile) docs use the name Zicclsm, but a) that doesn't appear to actually been standardized, and b) isn't quite what we want here anyway due to the perf comment.
Instead, we can follow precedent from other backends and have a feature flag for the existence of misaligned load/stores with sufficient performance that user code should actually use them.
Differential Revision: https://reviews.llvm.org/D126085
This patch tries to solve the incoordination between the direct and intermediate cast caused by D123975.
This patch replaces ISD::FP_EXTEND and ISD::FP_ROUND with RVV VL op in the lowering of FP scalable vector direct cast to unify with the intermediate cast.
And it also changes the FP widenning pattern with the VL op.
Differential Revision: https://reviews.llvm.org/D125364
Update test to check MIR after finalize-isel instead of debug output.
This is of course not the only place we should preserve FMF, but
it's the most obvious one.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D126306
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
During early gather/scatter enablement two different approaches
were taken to represent scaled indices:
* A Scale operand whereby byte_offsets = Index * Scale
* An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType)
Having multiple representations is bad as shown by this patch which
fixes instances where the two are out of sync. The dedicated scale
operand is more flexible and pervasive so this patch removes the
UNSCALED values from IndexType. This means all indices are scaled
but the scale can be one, hence unscaled. SDNodes now use the scale
operand to answer the "isScaledIndex" question.
I toyed with the idea of keeping the UNSCALED enums and helper
functions but because they will have no uses and force SDNodes to
validate the set of supported values I figured it's best to remove
them. We can re-add them if there's a real need. For similar
reasons I've kept the IndexType enum when a bool could be used as I
think being explicitly looks better.
Depends On D123347
Differential Revision: https://reviews.llvm.org/D123381
This patch replaces some for-each set with the new arrayref argument API, since it already used an array in defination, I think this change won't cause any ambiguity.
Differential Revision: https://reviews.llvm.org/D125455
When building the final merged node, we were using the original chain
rather than the output chain of the new operation. After some collapsing
of the chain this could cause the loads be incorrectly scheduled respect
to later stores.
This was uncovered by SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c
of the llvm testsuite.
https://reviews.llvm.org/D125560
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125323
This hook determines if SimplifySetcc transforms (X & (C l>>/<< Y))
==/!= 0 into ((X <</l>> Y) & C) ==/!= 0. Where C is a constant and
X might be a constant.
The default implementation favors doing the transform if X is not
a constant. Otherwise the code is left alone. There is a provision
that if the target supports a bit test instruction then the transform
will favor ((1 << Y) & X) ==/!= 0. RISCV does not say it has a variable
bit test operation.
RISCV with Zbs does have a BEXT instruction that performs (X >> Y) & 1.
Without Zbs, (X >> Y) & 1 still looks preferable to ((1 << Y) & X) since
we can fold use ANDI instead of putting a 1 in a register for SLL.
This patch overrides this hook to favor bit extract patterns and
otherwise falls back to the "do the transform if X is not a constant"
heuristic.
I've added tests where both C and X are constants with both the shl form
and lshr form. I've also added a test for a switch statement that lowers
to a bit test. That was my original motivation for looking at this.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124639
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.
I don't think there is any precedent for type promotion checking
users to decide how to promote. Instead, I've added this DAG combine to
do it before type legalization.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124109
Rather than VP_SEXT/VP_ZEXT/VP_TRUNC, having
VP_SIGN_EXTEND/VP_ZERO_EXTEND/VP_TRUNCATE better matches their non-VP
counterparts.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125298
This patch adds rvv codegen support for vp.fpext. The lowering of fp_round, vp.fptrunc, fp_extend and vp.fpext share most code so use a common lowering function to handle these four.
And this patch changes the intermediate cast from ISD::FP_EXTEND/ISD::FP_ROUND to the RVV VL version op RISCVISD::FP_EXTEND_VL and RISCVISD::FP_ROUND_VL for scalable vectors.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123975
This improves opportunities to use bset/bclr/binv. Unfortunately,
there are no W versions of these instrcutions so this isn't always
a clear win. If we use SLLW we get free sign extend and shift masking,
but need to put a 1 in a register and can't remove an or/xor. If
we use bset/bclr/binv we remove the immediate materializationg and
logic op, but might need a mask on the shift amount and sext.w.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D124096
We can't shift-right negative numbers to divide them, so avoid emitting
such sequences. Use negative numerators as a proxy for this situation, since
the indices are always non-negative.
An alternative strategy could be to add a compiler flag to emit division
instructions, which would at least allow us to test the VID sequence
matching itself.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123796
There's an existing generic combine that does this for legal types.
This patch adds a RISCV specific combine for W instructions.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D123983
This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.
The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.
Fixes the tests introduced in D123785.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123786
This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123841
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.
This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122951
This was added before Zve extensions were defined. I think users
should use Zve32x or Zve32f now. Though we will lose support for limiting
ELEN to 16 or 8, but I hope no one was using that.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123418
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.
Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.
In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.
Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.
Portions of this code were taken from the VP reference patches.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122743
We can do this conversion by converting the same sized integer type, then compare the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D122678
If we expand (uaddo X, 1) we previously expanded the overflow calculation
as (X + 1) <u X. This potentially increases the live range of X and
can prevent X+1 from reusing the register that previously held X.
Since we're adding 1, overflow only occurs if X was UINT_MAX in which
case (X+1) would be 0. So this patch adds a special case to expand
the overflow calculation to (X+1) == 0.
This seems to help with uaddo intrinsics that get introduced by
CodeGenPrepare after LSR. Alternatively, we could block the uaddo
transform in CodeGenPrepare for this case.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122933
The splat_vector will be legalized to build_vector eventually
anyway. This patch makes it take fewer steps.
Unfortunately, this results in some codegen changes. It looks
like it comes down to how the nodes were ordered in the topological
sort for isel. Because the build_vector is created earlier we end up
with a different ordering of nodes.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D122185