These methods don't access any state from RISCVInstrInfo. Make them
free functions in the RISCV namespace.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D127583
The patch is a replacement of D125199. PseudoReadVL with vtype has worry for
computing same vtypes of VLEFF/VLSEGFF in two different places, DAGToDAG and
InsertVSETVLI. VLEFF/VLSEGFF MI with VL output still could provide the vtype of
VLEFF/VLSEGFF to the users of its VL.
The patch names the new pseudo as original VLEFF/VLSEGFF name suffixed "_VL" and
expand them in RISCVInsertVSETVLI pass.
This patch also reverts commit 4537aae0d5,
"[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.".
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126794
Instead of matching opcodes to know the format to emit, use an
enum value that we can get from the RISCVMatInt::Inst class.
Change the consumers to use fully covered switches so that we get
a compiler warning if a new kind is added. With the opcode checks
it was easier to forget to update one of the 3 consumers.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126317
This patch adds minimal support for lowering an read.register intrinsic with vlenb as the argument. Note that vlenb is an implementation constant, so it is never allocatable.
This was split off a patch to eventually replace PseudoReadVLENB with a COPY MI because doing so revealed a couple of optimization opportunities which really seemed to warrant individual patches and tests. To write those patches, I need a way to write the tests involving vlenb, and read.register seemed like the right testing hook.
Differential Revision: https://reviews.llvm.org/D125552
The patch make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.
It's useful to get the vtypes of locations of PseudoReadVL without finding the
corresponding VLEFF/VLSEGFF.
It could simplify optimizations in RISCVInsertVSETVLI like D123581.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125199
Enable default outlining when the function has the minsize attribute.
`addr-label.ll` crashed after enabling this, so a barrier is added before
instruction selection as a workaround.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122213
No reason to special case simm12, movImm handles all immediates.
This also fixe a bug that we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.
We can use SH1ADD, SH2ADD, SH3ADD to multipy by 3, 5, and 9 respectively.
We could extend this to 3, 5, or 9 multiplied by a power 2 by also
emitting a SLLI.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124824
This patch adds custom MIR operand comments to VTYPE immediate operands
in VSETVLI instructions and SEW/LMUL operands in vector codegen pseudo
instructions. The result is intended to be more human-readable and
hopefully maintainable when working with MIR, particularly when
writing or reading test cases.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124187
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.
This is a recommit of 0d40688, which was reverted in ce83883 as
related precommit test 360d44e caused some errors.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122634
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122634
This reverts commit 23a5073600.
Although this patch achieved better codegen in most cases, it is really
important to accurately describe the cost of instructions. So I revert it.
It's not particularly user-friendly to have to call `initLRU` everywhere. Also,
it wasn't particularly great that the LRU for registers used in a sequence was
also initialized by `initLRU`.
This patch hides this stuff behind some helper functions:
* `isAvailableAcrossAndOutOfSeq`
* `isAnyUnavailableAcrossOrOutOfSeq`
* `isAvailableInsideSeq`
This allows the user to avoid calling `initLRU` explicitly. Also, it allows
us to separate initializing the used-in-sequence LRU from the main LRU.
Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this
refactor requires that we de-const the Candidate there.
Some other quality-of-code improvements:
* LRUs in outliner::Candidate now have more descriptive names
* Use `Register` instead of `unsigned` in some places
* Improve readability in some places by using ranges rather than `std::for_each`
This is a preparatory commit for a larger compile time related change for the
AArch64 outliner.
A LUI instruction with flag RISCVII::MO_HI is usually used in conjunction
with ADDI, and jointly complete address computation. To bind the cost
evaluation of address computation, the LUI should not be regarded as a cheap
move separately, which is consistent with ADDI.
In this test case, it improves the unroll-loop code that the rematerialization
of array's base address miss MachineCSE with Heuristics #1 at isProfitableToCSE.
Reviewed By: asb, frasercrmck
Differential Revision: https://reviews.llvm.org/D118216
Based on the discussion in D61884, this was done to enable compressed
instructions by giving freedom to pick a compressible register.
Integer materializing can generate LUI, ADDI, ADDIW, SLLI and some
Zb* instructions. C.LI, C.LUI, C.ADDI, C.ADDIW, and C.SLLI all have a 5-bit
register encoding. The Zb* instructions aren't compressible. Based on
that I don't think compressibility of the register is a concern.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D118741
Fix the pseudos to have the correct size in the MCInstrDesc description.
Inspired by D118009 and D117970.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D118175
Where the instruction mnemonic contains a dot, we name the corresponding
instruction in the .td file using a _ in the place of the dot. e.g. LR_W
rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that
convention.
This commit is currently implementing supports for scalar cryptography extension for LLVM according to version v1.0.0 of [K Ext specification](https://github.com/riscv/riscv-crypto/releases)(scala crypto has been ratified already). Currently, we are implementing the MC (Machine Code) layer of his extension and the majority of work is done under `llvm/lib/Target/RISCV` directory. There are also some test files in `llvm/test/MC/RISCV` directory.
Remove the subfeature of Zbk* which conflict with b extensions to reduce the size of the patch.
(Zbk* will be resubmit after this patch has been merged)
**Co-author:**@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98136
For floating point specific vector instructions, we don't need
pseudos for mf8.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D116460
For .vf instructions, we don't need MF8 pseudos for f16. We don't
need MF8 or MF4 pseudos for f32. Or MF8, MF4, MF2 for f64.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D116437
The implicit defines may come from a partial define in an instruction.
It does not mean the defining instruction and the COPY instruction have
the same vl and vtype. When the source comes from the implicit defines,
do not convert the whole register copies to vmv.v.v.
Differential Revision: https://reviews.llvm.org/D115866
In order to support constrained FP intrinsics we need to model FRM
dependency. Whether or not a instruction uses FRM is based on a 3
bit field in the instruction. Because of this we can't add
'Uses = [FRM]' to the tablegen descriptions.
This patch examines the immediate after isel and adds an implicit
use of FRM. This idea came from Roger Ferrer Ibanez.
Other ideas:
We could be overly conservative and just pretend all instructions with
frm field read the FRM register. Or we could have pseudoinstructions
for CodeGen with rounding mode.
Reviewed By: asb, frasercrmck, arcbbb
Differential Revision: https://reviews.llvm.org/D115555
MachineOutliner may outline a "patchable-function-entry" function whose body has
a TargetOpcode::PATCHABLE_FUNCTION_ENTER MachineInstr. This is incorrect because
the special code sequence must stay unchanged to be used at run-time.
Avoid outlining PATCHABLE_FUNCTION_ENTER. While here, avoid outlining FENTRY_CALL too
(which doesn't reproduce currently) to allow phase ordering flexibility.
Fixes#52635
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D115614
Delegate updating of LiveIntervals to each target's
convertToThreeAddress implementation, instead of repairing LiveIntervals
after the fact in TwoAddressInstruction::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D113493
If we know the source operand of COPY is defined by a vector instruction
with tail agnostic and the same LMUL and there is no vsetvli between
COPY and the define instruction to change the vl and vtype, we could use
vmv.v.v or vmv.v.i to copy vector registers to get better performance than
the whole vector register move instructions.
If the source of COPY is from vmv.v.i, we could use vmv.v.i for the
COPY.
This patch only considers all these instructions within one basic block.
Case 1:
```
bb.0:
...
VSETVLI # The first VSETVLI before COPY and VOP.
... # Use this VSETVLI to check LMUL and tail agnostic.
...
vy = VOP va, vb # Define vy.
... # There is no vsetvli between VOP and COPY.
vx = COPY vy
```
Case 2:
```
bb.0:
...
VSETVLI # The first VSETVLI before VOP.
... # Use this VSETVLI to check LMUL and tail agnostic.
...
vy = VOP va, vb # Define vy.
... # There is no vsetvli to change vl between VOP and COPY.
...
VSETVLI # The first VSETVLI before COPY.
... # This VSETVLI does not change vl and vtype.
...
vx = COPY vy
```
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>
Differential Revision: https://reviews.llvm.org/D103510
- When an unconditional branch is expanded into an indirect branch, if
there is no scavenged register, an SGPR pair needs spilling to enable
the destination PC calculation. In addition, before jumping into the
destination, that clobbered SGPR pair need restoring.
- As SGPR cannot be spilled to or restored from memory directly, the
spilling/restoring of that SGPR pair reuses the regular SGPR spilling
support but without spilling it into memory. As that spilling and
restoring points are fully controlled, we only need to spill that SGPR
into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take additional restore block,
where the restoring code is filled. After that, the relaxation will
place that restore block directly before the destination block and
insert an unconditional branch in any fall-through block into the
destination block.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D106449
This function was copied from ARM where register pairs/triples/quads can wrap around the 32 encoding space. So register 31 can pair with register 0. This is not true for RISCV vectors. The spec specifically mentions the possibility of a future encoding that has more than 32 registers.
This patch removes the modulo from the code and directly checks that destination register is in the source register range and not the beginning of the range. Though I don't expect an identity copy will occur.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D111467
Use SH1ADD/SH2ADD/SH3ADD along with LUI+ADDI to compose int32*3,
int32*5 and int32*9.
Reviewed By: craig.topper, luismarques
Differential Revision: https://reviews.llvm.org/D111484
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
This reverts commit 1f16191906.
We're seeing some issues with this internally. It seems that when
the spill is created by register allocation, the GPR doesn't get
allocated and an assertion fires during virtual register rewriting.
The .mir test case contains the spill before register allocation so
register allocation sees it as any other instruction.
This simplifies the API and addresses a FIXME in
TwoAddressInstructionPass::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D110229
Don't outline machine instructions which are using jump table indexes
since they are materialized as local labels (like the already handled
case of constant pools).
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D109436
The expansion of these pseudos creates ADD instructions. Those
ADDs modify a GPR so that it is no longer contains the same value
as the input base pointer. Therefore, I believe we should have a
GPR as a Def on these instructions and expansion should get the
destination register for the ADDs from that operand.
At least in our tests here this works out so that register
scavenging picks the same register as the base pointer.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D109405
This patch fixes an issue where RISCV's `findCommutedOpIndices` would
incorrectly return the pseudo `CommuteAnyOperandIndex` as a commutable
operand index, rather than fixing a specific index.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D108206
Previously we converted ISD condition codes to integers and stored
them directly in our MIR instructions. The ISD enum kind of belongs
to SelectionDAG so that seems like incorrect layering.
This patch instead uses a CondCode node on RISCV::SELECT_CC until
isel and then converts it from ISD encoding to a RISCV specific value.
This value can be converted to/from the RISCV branch opcodes in the
RISCV namespace.
My larger motivation is to possibly support a microarchitectural
feature of some CPUs where a short forward branch over a single
instruction can be predicated internally. This will require a new
pseudo instruction for select that needs to carry a branch condition
and live probably until RISCVExpandPseudos. At that point it can be
expanded to control flow without other instructions ending up in the
predicated basic block. Using an ISD encoding in RISCVExpandPseudos
doesn't seem like correct layering.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D107400
Use a tail policy operand instead. Inspired by the work in D105092,
but without the intrinsic interface changes.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D106512
If the upper 32 bits are zero and bit 31 is set, we might be able to
use zext.w to fill in the zeros after using an lui and/or addi.
Most of this patch is plumbing the subtarget features into the constant
materialization.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105509
This patch teaches the compiler to generate code to handle larger RVV
stack sizes and stack offsets which resolve an amount larger than 2047
vector registers in size.
The previous behaviour was asserting on such large values as it was only
able to materialize the constant by feeding it to the 12-bit immediate
of an `ADDI` instruction. The compiler can now materialize this amount
into a temporary register before continuing with the computation.
A test case for this scenario is included which also checks that the
temporary register used to materialize the amount doesn't require an
additional spill slot over what we're already reserving for RVV code.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D104727