If we're adding a constant that can't use addi, we try a few tricks,
one of which is using li+sh3add. We should not do this if lui+add
would work, for example when adding 8192. Using sh3add prevents
folding a sext.w to form addw, which increases the instruction count.
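A hypothetical example (not taken from the patch) of the 8192 case:

long add8192(int x) {
  // 8192 doesn't fit in addi's 12-bit immediate, but lui+addw covers it and
  // the addw folds the sign extension; li+sh3add would leave a separate
  // sext.w behind.
  return (long)(x + 8192);
}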
We can use slli.uw by C followed by sh1add. The same can be done
for multiples of 5 and 9 using sh2add and sh3add. We need to make
sure that C is less than 32 to stay in bounds of the 5-bit immediate
for slli.uw.
We have existing patterns for (mul X, 3<<C) that use sh1add
followed by slli. That order doesn't allow the and to be folded.
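A hypothetical example of a multiply by 3<<C with a zero-extended operand:

unsigned long mul24(unsigned x) {
  // 24 == 3 << 3; folding the zero extension into slli.uw lets this become
  // slli.uw followed by sh1add rather than needing a separate zext.
  return (unsigned long)x * 24;
}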
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D130146
Some more complex cases require checking the relationship of
operands on different nodes of the match. They also require
additional instructions to be created. Using a ComplexPattern
gives us that flexibility.
I'll be adding another pattern in a future patch.
This handles the code we get for
int foo(int* x, unsigned y) {
return x[y >> 1];
}
The shift right and the shl will get DAG combined into
(shl (and X, 0xfffffffe), 1). We have custom isel to match the
shl+and, but with Zba the (add (shl X, 1), Y) part will get
matched and leave the and to be iseled by itself. This commit
adds a larger pattern that includes the and.
Similar for SH2ADD and SH3ADD.
This is what we get from
int foo(int* x, unsigned y) {
return x[y >> 1];
}
This allows us to avoid materializing 0xFFFFFFFE into a register.
The immediate range check for CSImm12MulBy8 included some values
covered by CSImm12MulBy4. I assume CSImm12MulBy4 had priority due
to pattern order in the td file, but this makes the priority
explicit in the predicate.
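A hypothetical constant that only the CSImm12MulBy8 predicate should accept
(8200 = 1025 * 8 with 1025 in simm12 range, while 8200 / 4 = 2050 is not):

long add_const(long x) {
  // Could be selected as li of 1025 followed by sh3add instead of
  // materializing 8200 with lui+addi.
  return x + 8200;
}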
This pattern is what we get after DAG combine for C code like this:

short *ptr1, *ptr2, *ptr3;
short foo() {
  unsigned diff = ptr1 - ptr2;
  return ptr3[diff];
}
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126588
This reverts commit dfe513ae1b.
Tests have been changed to avoid the type legalization bug being
fixed in D126036.
Original commit message:
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
This reverts commit 86f7d7074a.
The test cases added for this exposed a pre-existing bug that is failing
the expensive checks bot. Reverting so I can revert that patch.
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
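A hypothetical case where the mask survives DAG combine because the AND has
two uses:

unsigned long shift_pair(unsigned long x, unsigned long y, unsigned amt) {
  // amt & 63 is redundant for 64-bit shifts on RV64, but because the and
  // feeds both shifts SimplifyDemandedBits leaves it alone;
  // selectShiftMaskXLen can still strip it from each shift amount.
  unsigned masked = amt & 63;
  return (x << masked) | (y >> masked);
}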
Type legalize narrow RISCVISD::GREV/GORC with constant to a larger
type without switching to W. Detect sext_inreg+gorci/grevi with a
uimm5 immediate during isel to emit GREVIW/GORCIW.
This allows us to better propagate known bits information through
extended bits after type legalization. It will also simplify a
change I'm considering for BREV8 with Zbkb.
A future patch will add computeKnownBits support for GORC.
A further improvement here would be to use hasAllWUsers and
doPeepholeSExtW like we do for SLLIW, but I don't think we have
the test coverage for that yet.
setgt X, -1 is the canonical form of setge X, 0. We can swap the
select operands and use setlt X, X0 when selecting CMOV. This
avoids materializing the -1 in a register.
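A hypothetical example of the kind of select this affects:

int pick(int x, int a, int b) {
  // x >= 0 canonicalizes to setgt x, -1; swapping the select arms lets it
  // be checked against the zero register instead of materializing -1.
  return x >= 0 ? a : b;
}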
I think the i32 in the pattern prevents this from matching on RV64,
but using IsRV32 is safer.
Add tests for RV64 to make sure we don't print zip or unzip
because we incorrectly picked ZIP_RV32/UNZIP_RV32.
Move some combine patterns to DAG combine, addressing a FIXME left in
RISCVInstrInfoZb.td.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D119527
We can use the RISCVISD::GREV encoding that swaps the bits in
each byte. This allows it to use the existing computeKnownBits
support for RISCVISD::GREV.
We already have an ISD opcode for the more general GREV/GREVI
instruction. We can just use it with the encoding that corresponds
to the behavior of brev8. This is similar to what we do for orc.b
where we use the GORC ISD opcode.
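For reference, a sketch of the swap-the-bits-in-each-byte behavior that this
GREV encoding (and brev8) performs on a 64-bit value:

unsigned long brev8_ref(unsigned long x) {
  unsigned long out = 0;
  for (int i = 0; i < 8; ++i) {
    unsigned b = (x >> (8 * i)) & 0xff;
    unsigned r = 0;
    // Reverse the bit order within this byte.
    for (int j = 0; j < 8; ++j)
      r |= ((b >> j) & 1u) << (7 - j);
    out |= (unsigned long)r << (8 * i);
  }
  return out;
}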
Where the instruction mnemonic contains a dot, we name the corresponding
instruction in the .td file using a _ in place of the dot, e.g. LR_W
rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that
convention.
This patch adds support for the Zbkx extension from the K extension (v1.0.0) in the MC layer.
Instructions with the same functionality and the same encoding are defined in the bitmanip extension.
It defines {Xperm8, Xperm4} as instruction aliases for the xperm.* instructions in the Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format.
D94999 (https://reviews.llvm.org/D94999) is the patch that introduced the xperm.* instructions.
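For reference, a sketch of the xperm8 (byte permute) semantics on RV64,
assuming the ratified Zbkx definition:

unsigned long xperm8_ref(unsigned long rs1, unsigned long rs2) {
  unsigned long rd = 0;
  for (int i = 0; i < 8; ++i) {
    // Each byte of rs2 selects a byte of rs1; out-of-range indices give 0.
    unsigned idx = (rs2 >> (8 * i)) & 0xff;
    unsigned long val = idx < 8 ? (rs1 >> (8 * idx)) & 0xff : 0;
    rd |= val << (8 * i);
  }
  return rd;
}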
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117889
The Zbk* extensions have some overlap with Zb*, so they have been placed in this file.
Reviewed By: VincentWu
Differential Revision: https://reviews.llvm.org/D117958
Zbkb makes some encodings of the general grevi, shfli, and
unshfli instructions legal, so we added separate instructions for
those encodings to improve the diagnostics for the assembler and
disassembler. To be consistent we should always use these separate
instructions whenever those specific encodings of grevi/shfli/unshfli
occur. So this patch adds specific isel patterns to override the generic
isel patterns for these cases. The same was done previously for rev8
and zext.h in Zbb.
This commit adds instruction support for `zbkb`, which is defined in the scalar cryptography extension version v1.0.0 (which has already been ratified).
Most of the zbkb instruction definitions reuse parts of the zbp and zbb definitions, so this patch just modifies some of the instruction aliases and predicates.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117640
Remove the isel patterns for fshl/fshr with a constant shift amount and
replace them with isel patterns for fsl/fsr with a constant.
This hack was trying to preserve as much optimization opportunity
for fshl/fshr by constant as possible, but the conversion to
RISCVISD::FSR/FSL happens so late it probably isn't worth much.
The new isel patterns are needed by D117468 anyway.
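A hypothetical example of a funnel shift by a constant, which DAG combine
recognizes as fshl and which these patterns could select to the Zbt funnel
shift instructions:

unsigned long funnel5(unsigned long x, unsigned long y) {
  // Equivalent to fshl(x, y, 5): the low 59 bits of x shifted up, with the
  // top 5 bits of y filling in below.
  return (x << 5) | (y >> 59);
}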
This reverts the revert commit e328385739.
The accidental demanded bits change has been removed. The demanded bits
code itself was removed in a pre-commit since it isn't tested.
Original commit message:
Previously we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.
Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
Previously we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.
Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
Zbc extension:
CLMUL/CLMULR/CLMULH are grouped together and defined in one scheduling class.
Zbs extension:
BCLR/BSET/BINV/BEXT are grouped together and defined in one scheduling class.
BCLRI/BSETI/BINVI/BEXTI are grouped together and defined in one scheduling class.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117538