Original patch by @rogfer01.
This patch supports vector truncates, which on RVV must be done in a
series of instructions truncating by one power-of-two at a time. This is
done through custom-lowering and a custom node to avoid LLVM
re-combining the split TRUNCATE nodes.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94796
The vcompress intrinsic is defined such that it requires a tail
undisturbed policy. This patch makes it so we can use the tail
agnostic policy if the user has passed vundefined to the dest
operand.
We need to do something similar for masked policy, but we need
annotation of which instructions use the mask policy first.
Not sure if this is sufficient for scheduling or if we'll need to
select different pseudos that don't have a tied def.
Reviewed By: evandro
Differential Revision: https://reviews.llvm.org/D94566
According to "9. Vector Memory Alignment Constraints" in V
specification, the alignment of vector memory access is aligned to the
size of the element. In our current implementation, we support ELEN up
to 64. We could assume the alignment of vector registers is 64 under the
assumption.
Differential Revision: https://reviews.llvm.org/D94751
SimplifyDemandedBits can remove set bits from immediates from instructions
like AND/OR/XOR. This can prevent them from being efficiently
codegened on RISCV.
This adds an initial version that tries to keep or form 12 bit
sign extended immediates for AND operations to enable use of ANDI.
If that doesn't work we'll try to create a 32 bit sign extended immediate
to use LUI+ADDIW.
More optimizations are possible for different size immediates or
different operations. But this is a good starting point that already
has test coverage.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94628
I noticed in D94450 that there were quite a few places where we generate
the sequence:
```
xN <- comparison ...
xN <- xor xN, 1
bnez xN, symbol
```
Given we know the XOR will be used by BRCOND, which only looks at the lowest
bit, I think we can remove the XOR and just invert the branch condition in
these cases?
The case mostly seems to come up in floating point tests, where there is often
more logic to combine the results of multiple SETCCs, rather than a single
(BRCOND (SETCC ...) ...).
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94535
Some FP compares expand to a sequence ending with (xor X, 1) to invert the result. If
the consumer is a select_cc we can likely get rid of this xor by fixing
up the select_cc condition.
This patch combines (select_cc (xor X, 1), 0, setne, trueV, falseV) -
(select_cc X, 0, seteq, trueV, falseV) if we can prove X is 0/1.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D94546
MCTargetDesc includes headers from Utils and Utils includes headers
from MCTargetDesc. So from a library layering perspective it makes sense
for them to be in the same library. I guess the other option might be to
move the tablegen includes from RISCVMCTargetDesc.h to RISCVBaseInfo.h
so that RISCVBaseInfo.h didn't need to include RISCVMCTargetDesc.h.
Everything else that depends on Utils also depends on MCTargetDesc so
having one library seemed simpler.
Differential Revision: https://reviews.llvm.org/D93168
This patch custom lowers ISD::VSCALE into a csrr vlenb followed
by a shift right by 3 followed by a multiply by the scale amount.
I've added computeKnownBits support to indicate that the csrr vlenb
always produces 3 trailng bits of 0s so the shift right is "exact".
This allows the shift and multiply sequence to be nicely optimized
into a single shift or removed completely when the scale amount is
a power of 2.
The non power of 2 case multiplying by 24 is still producing
suboptimal code. We could remove the right shift and use a
multiply by 3. Hopefully we can improve DAG combine to fix that
since it's not unique to this sequence.
This replaces D94144.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D94249
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.
Reviewed By: lenary, craig.topper
Differential Revision: https://reviews.llvm.org/D93767
We can use a 0 immediate to avoid needing to materialize 0 into
an FPR first.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94459
Define the `vfclass` IR intrinsics for the respective V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Differential Revision: https://reviews.llvm.org/D94356
Original patch by @rogfer01.
This patch adds ISel patterns for the above operations to the
corresponding vector/vector and vector/scalar RVV instructions, as well
as extra patterns to match operand-swapped scalar/vector vfrsub and
vfrdiv.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94408
Original patch by @rogfer01.
All ordered comparisons except ONE are supported natively, and all
unordered comparisons except UNE are expanded into sequences involving
explicit NaN checks and mask arithmetic.
Additionally, we expand GT,OGT,GE,OGE to their swapped-operand versions, and
pattern-match those back to the "original", swapping operands once more. This
way we catch both operations and both "vf" and "fv" forms with fewer patterns.
Also add support for floating-point splat_vector, with an optimization for
splatting fpimm0.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94242
The Pseudo class sets isCodeGenOnly=1 which causes the asm strings
in the pseudos to be ignored. I think this is why the aliases are
needed at all.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94024
This patch moves all but the BaseInstr to bits in TSFlags.
For the index fields, we can just use a bit to indicate their presence.
The locations of the operands are well defined.
This reduces the llc binary by about 32K on my build. It also
removes the binary search of the table from the custom inserter.
Instead we just check that the SEW op is present.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D94375
This makes the mask align with the position of the bits in TSFlags
which is a little more logical.
I might be adding more fields to TSFlags and some might be single
bits where just ANDing with mask to test the bit would make sense.
While there rename TargetFlags in validateInstruction to reflect
that it's just the constraint bits.
We currently have about 7000 opcodes in the RISCVGenInstrInfo.inc
enum. We can use uint16_t to store these values. We would need to
grow by nearly 9x before we run out of space so this should last
for a little while.
This reduces the llc binary by 32K.
Original patch by @rogfer01.
The RVV integer comparison instructions are defined in such a way that
many LLVM operations are defined by using the "opposite" comparison
instruction and swapping the operands. This is done in this patch in
most cases, except for the mappings where the immediate range must be
adjusted to accomodate:
va < i --> vmsle{u}.vi vd, va, i-1, vm
va >= i --> vmsgt{u}.vi vd, va, i-1, vm
That is left for future optimization; this patch supports all operations
but in the case of the missing mappings the immediate will be moved to
a scalar register first.
Since there are so many condition codes and operand cases to check, it
was decided to reduce the test burden by only testing the "vscale x 8"
vector types.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94168
The TableGen immAllOnesV and immAllZerosV helpers implicitly wrapped the
ISD::isBuildVectorAll(Ones|Zeros) helper functions. This was inhibiting
their use for targets such as RISC-V which use ISD::SPLAT_VECTOR. In
particular, RISC-V had to define its own 'vnot' fragment.
In order to extend the scope of these nodes to include support for
ISD::SPLAT_VECTOR, two new ISD predicate functions have been introduced:
ISD::isConstantSplatVectorAll(Ones|Zeros). These effectively supersede
the older "isBuildVector" predicates, which are now simple wrappers for
the new functions. They pass a defaulted boolean toggle which preserves
the old behaviour. It is hoped that in time all call-sites can be ported
to the "isConstantSplatVector" functions.
While the use of ISD::isBuildVectorAll(Ones|Zeros) has not changed, the
behaviour of the TableGen immAll(Ones|Zeros)V **has**. To test the new
functionality, the custom RISC-V TableGen fragment has been removed and
replaced with the built-in 'vnot'. To test their use as pattern-roots, two
splat patterns have been updated accordingly.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94223
This is a first change needed to fix a crash in which the emergency
spill splot ends being out of reach. This happens when we run the
register scavenger after we have eliminated the frame indexes. The fix
for the actual crash will come in a later change.
This change removes an extra stack size increase we do in
RISCVFrameLowering::determineFrameLayout.
We don't have to change the size of the stack here as
PEI::calculateFrameObjectOffsets is already doing this with the right
size accounting the extra alignment.
Differential Revision: https://reviews.llvm.org/D89237
1. Break MUL with specific constant to a SLLI and an ADD/SUB on riscv32
with the M extension.
2. Break MUL with specific constant to two SLLI and an ADD/SUB, if the
constant needs a pair of LUI/ADDI to construct.
Reviewed by: craig.topper
Differential Revision: https://reviews.llvm.org/D93619
Define the `vfsqrt` IR intrinsics for the respective V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Differential Revision: https://reviews.llvm.org/D93745
The patterns that want to use 'vnot' use a custom PatFrag. This is
because 'vnot' uses immAllOnesV which implicitly uses BUILD_VECTOR
rather than SPLAT_VECTOR.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94078
nvxXi1 types are legal with V extension and that's the result
vmseq/vmsne/vmslt/etc instructions return.
No test cases yet because the setcc isel patterns aren't in
and we'll need more than basic tests to observe this. I locally
tested that this plus D947078, D94168, D94142, and D94149
was enough to be able to handle the overflow result from
llvm.sadd.overflow.
There is no test coverage for the mulhs or mulhu patterns as I can't get
the DAGCombiner to generate them for scalable vectors. There are a few
places in that still need updating for that to work. I left the patterns
in regardless as they are correct.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94073
If the return values can't be lowered to registers
SelectionDAG performs the sret demotion. This patch
contains the basic implementation for the same in
the GlobalISel pipeline.
Furthermore, targets should bring relevant changes
during lowerFormalArguments, lowerReturn and
lowerCall to make use of this feature.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D92953
ComplexPatterns are kind of weird, they don't call any of the predicates on their operands. And their "complexity" used for tablegen ordering purposes in the matcher table is hand specified.
This started as an attempt to just use sext_inreg + SLOIPat to implement SLOIW just to have one less Select function. The matching for the or+shl is the same as long as you know the immediate is less than 32 for SLOIW. But that didn't work out because using uimm5 with SLOIPat didn't do anything if it was a ComplexPattern.
I realized I could just use a PatFrag with the opcodes I wanted to match and an immediate predicate would then evaluate correctly. This also computes the complexity just like any other pattern does. Then I just needed to check the constraints on the immediates in the predicate. Conveniently the predicate is evaluated after the fragment has been matched. So the structure has already been checked, we just need to find the constants.
I'll note that this is unusual, I didn't find any other targets looking through operands in PatFrag predicate. There is a PredicateCodeUsesOperands feature that can be used to collect the operands into an array that is used by AMDGPU/VOP3Instructions.td. I believe that feature exists to handle commuted matching, but since the nodes here use constants, they aren't ever commuted
Differential Revision: https://reviews.llvm.org/D91901
vmsltu.vi v0, v1, 0 is always false there is no unsigned number
less than 0. vmsleu.vi v0, v1, -1 on the other hand is always true
since -1 will be considered unsigned max and all numbers are <=
unsigned max.
A similar problem exists for vmsgeu.vi v0, v1, 0 which is always true,
but becomes vmsgtu.vi v0, v1, -1 which is always false.
To match the GNU assembler we'll emit vmsne.vv and vmseq.vv with
the same register for these cases instead.
I'm using AsmParserOnly pseudo instructions here because we can't
match an explicit immediate in an InstAlias. And we can't use a
AsmOperand for the zero because the output we want doesn't use an
immediate so there's nowhere to name the AsmOperand we want to use.
To keep the implementations similar I'm also handling signed with
pseudo instructions even though they don't have this issue. This
way we can avoid the special renderMethod that decremented by 1 so
the immediate we see for the pseudo instruction in processInstruction
is 0 and not -1. Another option might have been to have a different
simm5_plus1 operand for the unsigned case or just live with the
immediate being pre-decremented. I felt this way was clearer, but I'm
open to other opinions.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94035
This alias for andi x, 255 was recently added to the spec. If we
print it, code we output can't be compiled with -fno-integrated-as
unless the GNU assembler is also a version that supports alias.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D93826
There are vmsle(u).vx and vmsle(u).vi instructions, but there is
only vmslt(u).vx and no vmslt(u).vi. vmslt(u).vi can be emulated
for some immediates by decrementing the immediate and using vmsle(u).vi.
To avoid the user needing to know about this, this patch does this
conversion.
The assembler does the same thing for vmslt(u).vi and vmsge(u).vi
pseudoinstructions. There is no vmsge(u).vx intrinsic or
instruction so this patch is limited to vmslt(u).
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94070
With the i32 these patterns will only fire on RV32, but they
don't look RV32 specific.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D93843
We could expand vmsge{u}.vx pseudo instructions in RISCVAsmParser.
It is more appropriate to expand it before encoding.
Differential Revision: https://reviews.llvm.org/D93968
Define intrinsics:
1. vfcvt.xu.f.v/vfcvt.x.f.v
2. vfcvt.rtz.xu.f.v/vfcvt.rtz.x.f.v
3. vfcvt.f.xu.v/vfcvt.f.x.v
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93933
Define intrinsics:
1. vfncvt.xu.f.w/vfncvt.x.f.w
2. vfncvt.rtz.xu.f.w/vfncvt.rtz.x.f.w
3. vfncvt.f.xu.w/vfncvt.f.x.w
4. vfncvt.f.f.w/vfncvt.rod.f.f.w
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93932
Define intrinsics:
1. vfwcvt.xu.f.v/vfwcvt.x.f.v
2. vfwcvt.rtz.xu.f.v/vfwcvt.rtz.x.f.v
3. vfwcvt.f.xu.v/vfwcvt.f.x.v
4. vfwcvt.f.f.v
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93855
This patch defines vcompress intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential revision: https://reviews.llvm.org/D93809
Define vsext/vzext intrinsics.and lower to V instructions.
Define new fraction register class fields in LMULInfo and a
NoReg to present invalid LMUL register classes.
Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93893
This complements the existing RVV ISel patterns for arithmetic, bitwise
and shifts with the remaining operations in those categories: sub, and,
xor, sra.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93852
If the destination is tied, then user has some control of the
register used for input. They would have the ability to control
the value of any tail elements. By using tail agnostic we take
this option away from them.
Its not clear that the intrinsics are defined such that this isn't
supposed to work. And undisturbed is a valid implementation for agnostic
so code wouldn't even fail to work on all systems if we always used
agnostic.
The vcompress intrinsic is defined to require tail undisturbed so
at minimum we need this for that instruction or need to redefine
the intrinsic.
I've made an exception here for vmv.s.x/fmv.s.f and reduction
instructions which only write to element 0 regardless of the tail
policy. This allows us to keep the agnostic policy on those which
should allow better redundant vsetvli removal.
An enhancement would be to check for undef input and keep the
agnostic policy, but we don't have good test coverage for that yet.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D93878
The spec for these instructions include this note. "The destination register
cannot overlap either the source register or the mask register ('v0') if the
instruction is masked." So we need earlyclobber to enforce this constraint.
I've regenerated the tests with update_llc_test_checks.py to show the
effects of the earlyclobber.
Reviewed By: khchen, frasercrmck
Differential Revision: https://reviews.llvm.org/D93867
Define vmclr.m/vmset.m intrinsics and lower to vmxor.mm/vmxnor.mm.
Ideally all rvv pseudo instructions could be implemented in C header,
but those two instructions don't take an input, codegen can not guarantee
that the source register becomes the same as the destination.
We expand pseduo-v-inst into corresponding v-inst in
RISCVExpandPseudoInsts pass.
Reviewed By: craig.topper, frasercrmck
Differential Revision: https://reviews.llvm.org/D93849
Define those intrinsics and lower to V instructions.
Use update_llc_test_checks.py for viota.m tests to check
earlyclobber is applied correctly.
mask viota.m tests uses the same argument as input and mask for
avoid dependency of D93364.
We work with @rogfer01 from BSC to come out this patch.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D93823
This patch extends the pattern-matching capability of vector-splatted
constants. When illegally-typed constants are legalized they are
canonically sign-extended to XLenVT. This preserves the sign and allows
us to match simm5. If they were zero-extended for whatever reason we'd
lose that ability: e.g. `(i8 -1) -> (XLenVT 255)` would not be matched
under the current logic.
To address this we first manually sign-extend the splatted constant from
the vector element type to int64_t. This preserves the semantics while
removing any implicitly-truncated bits.
The corresponding logic for uimm5 was not updated, the rationale being
that neither sign- nor zero-extending a legal uimm5 immediate should
change that (unless we expect actual "garbage" upper bits).
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93837
We weren't consistently marking unary instructions as OneInput
and vid.v is really ZeroInput but we had no way to mark that.
This patch improves this by removing the error prone OneInput constraint.
Instead we just always look for the mask in the last operand.
It appears that the "CheckReg" variable used for the check on the broken
instruction was unitialized or garbage because it was also used for
VS1/VS2 constraints. I've scoped the variable locally to each check now.
I've gone through and set NoConstraint on instructions that don't have
a real VMConstraint and don't have a mask as the last operand.
I've also removed the unused enum values in RISCVBaseInfo.h. We
never use them in C++ and we have separate versions in a td file.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D93784
Define vwredsumu/vwredsum/vfwredosum/vfwredsum
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D93807
Define vpopc/vfirst intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93795
Define vector mask-register logical intrinsics and lower them
to V instructions. Also define pseudo instructions vmmv.m
and vmnot.m.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D93705
This patch defines vrgather intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential revision: https://reviews.llvm.org/D93797
integer group:
vredsum/vredmaxu/vredmax/vredminu/vredmin/vredand/vredor/vredxor
float group:
vfredosum/vfredsum/vfredmax/vfredmin
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D93746
This patch extends the SDNode ISel support for RVV from only the
vector/vector instructions to include the vector/scalar and
vector/immediate forms.
It uses splat_vector to carry the scalar in each case, except when
XLEN<SEW (RV32 SEW=64) when a custom node `SPLAT_VECTOR_I64` is used for
type-legalization and to encode the fact that the value is sign-extended
to SEW. When the scalar is a full 64-bit value we use a sequence to
materialize the constant into the vector register.
The non-intrinsic ISel patterns have also been split into their own
file.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93312
Also include a special case pattern to use vmv.v.x vd, zero when
the argument is 0.0.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D93672
This patch defines vfwmacc, vfwnmacc, vfwmsc, vfwnmsac intrinsics
and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential Revision: https://reviews.llvm.org/D93693
Define vmerge/vfmerge intrinsics and lower to V instructions.
Include support for vector-vector vfmerge by vmerge.vvm.
We work with @rogfer01 from BSC to come out this patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93674
Define the vfmin, vfmax IR intrinsics for the respective V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Differential Revision: https://reviews.llvm.org/D93673
This patch defines vfmadd/vfnmacc, vfmsac/vfnmsac, vfmadd/vfnmadd,
and vfmsub/vfnmsub lower to V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential Revision: https://reviews.llvm.org/D93691
This patch defines vwmacc[u|su|us] intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential Revision: https://reviews.llvm.org/D93675
This patch enables jump table lowering in the RISC-V backend.
In addition to the test case included, the new lowering was
tested by compiling the OCaml runtime and running it under qemu.
Differential Revision: https://reviews.llvm.org/D92097
Define vector compare intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93368
Define vleff intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93516
This defines vmadd, vmacc, vnmsub, and vnmsac intrinsics and
lower to V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential Revision: https://reviews.llvm.org/D93632
Define the `vand`, `vor` and `vxor` IR intrinsics for the respective V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Differential Revision: https://reviews.llvm.org/D93574
CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the
intention clearer and matches the direction of reverted r240130 (to drop the
unneeded parameters).
No behavior change.
This patch base on D93366, and define vector fixed-point intrinsics.
1. vaaddu/vaadd/vasubu/vasub
2. vsmul
3. vssrl/vssra
4. vnclipu/vnclip
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Differential Revision: https://reviews.llvm.org/D93508
Define vector vfwmul intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93584
Define vector vfwadd/vfwsub intrinsics and lower them to V
instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93583
Define vector vfsgnj/vfsgnjn/vfsgnjx intrinsics and lower them to V
instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93581
Define vector vfmul/vfdiv/vfrdiv intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93580
Define vlxe/vsxe intrinsics and lower to vlxei<EEW>/vsxei<EEW>
instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D93471
To support OpenCL, which typically uses SPIR as an IR, non-zero address
spaces must be accounted for. This patch makes the RISC-V target assume
no-op address space casts across the board, which effectively removes
the need to support addrspacecast instructions in the backend.
For a RISC-V implementation with different configurations or specialized
address spaces where casts aren't no-ops, the function can be adjusted
as required.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D93536
This patch adds two IR intrinsics for vsetvli instruction. One to set the vector length to a user specified value and one to set it to vlmax. The vlmax uses the X0 source register encoding.
Clang builtins will follow in a separate patch
Differential Revision: https://reviews.llvm.org/D92973
The default behavior for any_extend of a constant is to zero extend.
This occurs inside of getNode rather than allowing type legalization
to promote the constant which would sign extend. By using sign extend
with getNode the constant will be sign extended. This gives a better
chance for isel to find a simm5 immediate since all xlen bits are
examined there.
For instructions that use a uimm5 immediate, this change only affects
constants >= 128 for i8 or >= 32768 for i16. Constants that large
already wouldn't have been eligible for uimm5 and would need to use a
scalar register.
If the instruction isn't able to use simm5 or the immediate is
too large, we'll need to materialize the immediate in a register.
As far as I know constants with all 1s in the upper bits should
materialize as well or better than all 0s.
Longer term we should probably have a SEW aware PatFrag to ignore
the bits above SEW before checking simm5.
I updated about half the test cases in some tests to use a negative
constant to get coverage for this.
Reviewed By: evandro
Differential Revision: https://reviews.llvm.org/D93487
This time with tests.
Original message:
Similar to D93365, but for floating point. No need for special ISD opcodes
though. We can directly isel these from intrinsics. I had to use anyfloat_ty
instead of anyvector_ty in the intrinsics to make LLVMVectorElementType not
crash when imported into the -gen-dag-isel tablegen backend.
Differential Revision: https://reviews.llvm.org/D93426
Similar to D93365, but for floating point. No need for special ISD opcodes
though. We can directly isel these from intrinsics. I had to use anyfloat_ty
instead of anyvector_ty in the intrinsics to make LLVMVectorElementType not
crash when imported into the -gen-dag-isel tablegen backend.
Differential Revision: https://reviews.llvm.org/D93426
This adds intrinsics for vmv.x.s and vmv.s.x.
I've used stricter type constraints on these intrinsics than what we've been doing on the arithmetic intrinsics so far. This will allow us to not need to pass the scalar type to the Intrinsic::getDeclaration call when creating these intrinsics.
A custom ISD is used for vmv.x.s in order to implement the change in computeNumSignBitsForTargetNode which can remove sign extends on the result.
I also modified the MC layer description of these instructions to show the tied source/dest operand. This is different than what we do for masked instructions where we drop the tied source operand when converting to MC. But it is a more accurate description of the instruction. We can't do this for masked instructions since we use the same MC instruction for masked and unmasked. Tools like llvm-mca operate in the MC layer and rely on ins/outs and Uses/Defs for analysis so I don't know if we'll be able to maintain the current behavior for masked instructions. So I went with the accurate description here since it was easy.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D93365
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Craig Topper <craig.topper@sifive.com>
Differential Revision: https://reviews.llvm.org/D93514
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93366
Define vlse/vsse intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93445
Define vector widening mul intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93381
Define vector mul/div/rem intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93380
If users want to use vector floating point instructions, they need to
specify 'F' extension additionally.
Differential Revision: https://reviews.llvm.org/D93282
Define vle/vse intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93359
Refine tablegen pattern for vector load/store, and follow
D93012 to separate masked and unmasked definitions for
pseudo load/store instructions.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93284
Define vfadd/vfsub/vfrsub intrinsics and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93291
Define vwadd/vwaddu/vwsub/vwsubu intrinsics and lower to V instructions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93108
This moves the vtype decoding and printing to RISCVBaseInfo. This keeps all of
the decoding code in the same area as the encoding code. This will make it
easier to change the decoding for the 1.0 spec in the future.
We're now sharing the printing with the debug output for operands in the
assembler. This also fixes that debug output to include the tail and mask
agnostic bits. Since the printing code works on the vtype immediate value, we
now encode the immediate during parsing and store just the immediate in the
operand.
Add simple pass for removing redundant vsetvli instructions within a basic block. This handles the case where the AVL register and VTYPE immediate are the same and no other instructions that change VTYPE or VL are between them.
There are going to be more opportunities for improvement in this space as we development more complex tests.
Differential Revision: https://reviews.llvm.org/D92679
The compiler is making no effort to preserve upper elements. To do so would require another source operand tied with the destination and a different intrinsic interface to give control of this source to the programmer.
This patch changes the tail policy to agnostic so that the CPU doesn't need to make an effort to preserve them.
This is consistent with the RVV intrinsic spec here https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#configuration-setting
Differential Revision: https://reviews.llvm.org/D93080
Use RegisterClass::contains instead of going through getMinimalPhysRegClass
and hasSuperClassEq.
Remove the special case for NoRegister. It's identical to the
handling for any other regsiter that isn't VRM2/M4/M8.
There is an in-progress proposal for the following pseudo-instructions
in the assembler, to complement the existing `sext.w` rv64i instruction:
- sext.b
- sext.h
- zext.b
- zext.h
- zext.w
The `.b` and `.h` variants are available with rv32i and rv64i, and `zext.w` is
only available with `rv64i`.
These are implemented primarily as pseudo-instructions, as these instructions
expand to multiple real instructions. In the case of `zext.b`, this expands to a
single rv32/64i instruction, so it is implemented with an InstAlias (like
`sext.w` is on rv64i).
The proposal is available here: https://github.com/riscv/riscv-asm-manual/pull/61
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D92793
If SETUNE isn't legal, UO can use the NOT of the SETO expansion.
Removes some complex isel patterns. Most of the test changes are
from using XORI instead of SEQZ.
Differential Revision: https://reviews.llvm.org/D92008
The register operand was not being marked as a def when it should be. No tests
for this in the main branch as there are not yet any pseudos without a
non-negative VLIndex.
Also change the type of a virtual register operand from unsigned to Register
and adjust formatting.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D92823
This merges the SEW and LMUL enums that each used into singles enums in RISCVBaseInfo.h. The patch also adds a new encoding helper to take SEW, LMUL, tail agnostic, mask agnostic and turn it into a vtype immediate.
I also stopped storing the Encoding in the VTYPE operand in the assembler. It is easy to calculate when adding the operand which should only happen once per instruction.
Differential Revision: https://reviews.llvm.org/D92813
We can use these instructions for single bit immediates that are too large for ANDI/ORI/CLRI.
The _10 test cases are to make sure that we still use ANDI/ORI/CLRI for small immediates.
Differential Revision: https://reviews.llvm.org/D92262
-Reject an "mf1" lmul
-Make sure tail agnostic is exactly "tu" or "ta" not just that it starts with "tu" or "ta"
-Make sure mask agnostic is exactly "mu" or "ma" not just that it starts with "mu" or "ma"
Differential Revision: https://reviews.llvm.org/D92805
APInt's string constructor asserts on error. Since this is the parser and we don't yet know if the string is a valid integer we shouldn't use that.
Instead use StringRef::getAsInteger which returns a bool to indicate success or failure.
Since we no longer need APInt, use 'unsigned' instead.
Differential Revision: https://reviews.llvm.org/D92801
This node returns 2 results and uses a chain. As long as we use a DAG as part of the pseudo instruction definition where we can use the "set" operator, it looks like tablegen can handle use a pattern for this without a problem. I believe the original implementation was copied from PowerPC.
This also fixes the pseudo instruction so that it is marked as having side effects to match the definition of CSRRS and the RV64 instruction. And we don't need to explicitly clear mayLoad/mayStore since those can be inferred now.
Differential Revision: https://reviews.llvm.org/D92786
A rotate by half the bitwidth swaps the bottom and top half which is the same as one of the MSB GREVI stage.
We have to do this as a special combine because we prefer to keep (rotl/rotr X, BitWidth/2) as a rotate rather than a single stage GREVI.
Differential Revision: https://reviews.llvm.org/D92286
On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.
The companion RFC (http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html) gives lots of details on the overall strategy, but we summarize it here:
LLVM IR involving vector types is going to be selected using pseudo instructions (only MachineInstr). These pseudo instructions contain dummy operands to represent the vector type being operated and the vector length for the operation.
These two dummy operands, as set by instruction selection, will be used by the custom inserter to prepend every operation with an appropriate vsetvli instruction that ensures the vector architecture is properly configured for the operation. Not in this patch: later passes will remove the redundant vsetvli instructions.
Register classes of tuples of vector registers are used to represent vector register groups (LMUL > 1).
Those pseudos are eventually lowered into the actual instructions when emitting the MCInsts.
About the patch:
Because there is a bit of initial infrastructure required, this is the minimal patch that allows us to select instructions for 3 LLVM IR instructions: load, add and store vectors of integers. LLVM IR operations have "whole-vector" semantics (as in they generate values for all the elements).
Later patches will extend the information represented in TableGen.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Co-Authored-by: Craig Topper <craig.topper@sifive.com>
Differential Revision: https://reviews.llvm.org/D89449
This makes the llvm-objdump output much more readable and closer to binutils objdump. This builds on D76591
It requires changing the OperandType for certain immediates to "OPERAND_PCREL" so tablegen will generate code to pass the instruction's address. This means we can't do the generic check on these instructions in verifyInstruction any more. Should I add it back with explicit opcode checks? Or should we add a new operand flag to control the passing of address instead of matching the name?
Differential Revision: https://reviews.llvm.org/D92147
Rather than having a different opcode for RV32 and RV64. Let's just say the integer type is XLenVT and use a single opcode for both modes.
Differential Revision: https://reviews.llvm.org/D92538
Internally the pass skips any function with the optnone attribute. But that still requires checking each function. If the opt level is set to None we might as well just skip putting in the pipeline at all. This what is already done for many of the passes added by TargetPassConfig.
Differential Revision: https://reviews.llvm.org/D92511
So that instructions like `lla a5, (0xFF + end) - 4` (supported by GNU as) can
be parsed.
Add a missing test that an operand like `foo + foo` is not allowed.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D92293
This enables bswap/bitreverse to combine with other GREVI patterns or each other without needing to add more special cases to the DAG combine or new DAG combines.
I've also enabled the existing GREVI combine for GREVIW so that it can pick up the i32 bswap/bitreverse on RV64 after they've been type legalized to GREVIW.
Differential Revision: https://reviews.llvm.org/D92253
GORCI performs an OR between each stage. So we need to ensure only
one stage is active before doing this combine.
Initial attempts at finding a test case for this failed due to
the order things get combined. It's most likely that we'll form
one stage of GREVI then combine to GORCI before the two stages of
GREVI are able to be formed and combined with each other to form
a multi stage GREVI.
Differential Revision: https://reviews.llvm.org/D92289
Not sure why bswap was treated specially. This also applies to bitreverse
or generic grevi. We can improve this in future patches.
For now I just wanted to get the consistency and the test coverage
as I plan to make some other changes around bswap.
We had an zexti32 after a sign_extend_inreg. The AND X, 0xffffffff
part of the zexti32 should never occur since SimplifyDemandedBits
from the sign_extend_inreg would have removed it.
We also had sexti32 as the root node of a pattern, but SelectionDAGISel
matches assertsext early before the tablegen based patterns are
evaluated.
These patterns are using zexti32 which matches either assertzexti32
or (and X, 0xffffffff). But if we match (and X, 0xffffffff) it will
remove the AND and the inputs may no longer have the zero bits
needed to guarantee the result has enough zeros.
This commit changes the patterns to only match assertzexti32.
I'm not sure how to test the broken case since the DIVUW/REMUW nodes
are created during type legalization, but type legalization won't
create an (and X, 0xfffffffff) directly on the inputs.
I've also changed the zexti32 on the root of the pattern to just
checking for AND. We were previously also matching assertzexti32,
but I doubt that pattern would ever occur.
Start with an assumption that FMA is faster than Fmul+FAdd. If thats not true
on some particular implementation we can add a tuning parameter in the future.
I've update the fmuladd test cases and added new test cases for fast math flag
based contraction.
Differential Revision: https://reviews.llvm.org/D91987
This is the logically correct thing to do. But it generates worse
code for i32 umin/umax on the rv64 due to type legalize requesting
zext even though the arguments are sext. Maybe we can teach type
legalizer to use sext for umin/umax for RISCV.
It's also producing possibly worse code on i64 on RV32 since we
still end up with selects that become branches. But this seems
like something we could improve in type legalization or DAG combine.
Hopefully this makes D92095 work for RISCV with Zbb.
This adds custom opcodes for FSLW/FSRW so we can type legalize
fshl/fshr without needing to match a sign_extend_inreg.
I've used the operand order from fshl/fshr to make the isel
pattern similar to the non-W form. It was also hard to decide
another order since the register instruction has the shift amount
as the second operand, but the immediate instruction has it as
the third operand.
Differential Revision: https://reviews.llvm.org/D91479
This is a special calling convention to be used by the GHC compiler.
Patch by Andreas Schwab (schwab)
Differential Revision: https://reviews.llvm.org/D89788
X86 was already specially marking fma as commutable which allowed
tablegen to autogenerate commuted patterns. This moves it to the target
independent definition and fix up the targets to remove now
unneeded patterns.
Unfortunately, the tests change because the commuted version of
the patterns are generating operands in a different than the
explicit patterns.
Differential Revision: https://reviews.llvm.org/D91842
We generate two 4 byte loads or two stores as part of the expansion.
Previously the MemOperand was set the same for both to cover the
full 8 bytes. Now we set a separate 4 byte mem operand for each
with a 4 byte offset for the high part.
Prior to this the DefaultMode was never selected, but RISCVGenDAGISel.inc, RISCVGenRegisterInfo.inc, RISCVGenGlobalISel.inc all ended up with extra table entries for that mode.
This patch removes the RV32 and uses DefaultMode for RV32. This impressively reduces the size of my release+asserts llc binary by about 270K. About 15K from RISCVGenDAGISel.inc, 1-2K from RISCVGenRegisterInfo.inc, but the vast majority from RISCVGenGlobalISel.inc.
Differential Revision: https://reviews.llvm.org/D90973
Previously we required a sra to pattern match these properly in isel. If the consumer didn't need the result sign extended we'll have an srl instead of sra and fail to match.
This patch switches to custom legalizing to GREVIW using portions of D91259.
Differential Revision: https://reviews.llvm.org/D91457
This should result in better utilization of RORIW since we
don't need to look for a SIGN_EXTEND_INREG that may not exist.
Also remove rotl/rotr isel matching to GREVI and just prefer RORI.
This is to keep consistency so we don't have to match ROLW/RORW
to GREVIW as well. I imagine RORI/RORIW performance will be the
same or better than GREVI.
Differential Revision: https://reviews.llvm.org/D91449
This moves the recognition of GREVI and GORCI from TableGen patterns
into a DAGCombine. This is done primarily to match "deeper" patterns in
the future, like (grevi (grevi x, 1) 2) -> (grevi x, 3).
TableGen is not best suited to matching patterns such as these as the compile
time of the DAG matchers quickly gets out of hand due to the expansion of
commutative permutations.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D91259
@tangxingxin1008 found a bug that regard vadd.vv v1, v3, a0 as a valid V
instruction. We should remove the VRegAsmOperand operand class and use
VR register class directly.
Patched by: tangxingxin1008, Hsiangkai
Differential Revision: https://reviews.llvm.org/D91712
This patch factors out the part of printInstruction that gets the
mnemonic string for a given MCInst. This is intended to be used
subsequently for the instruction-mix remarks to display the final
mnemonic (D90040).
Unfortunately making `getMnemonic` available to the AsmPrinter
seems to require making it virtual. Not sure if there's a way around
that with the current layering of the AsmPrinters.
Reviewed By: Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D90039
We need to make sure the upper 32 bits are all ones to ensure the result is properly sign extended. Previously we only checked the lower 32 bits of the mask. I've also added a check that the shift amount is less than 32. Without that the original code asserts inside maskLeadingOnes if the SROI check is removed or the SROIW pattern is checked first. I've refactored the code to use early outs to reduce nesting.
I've also updated SLOIW matching with the same changes, but I couldn't find a broken test case with the existing code.
Differential Revision: https://reviews.llvm.org/D90961
Similar to the X86 and AMDGPU targets, this uses a macro to cut down on
repetitive and error-prone code when converting RISCVISD node names to
strings in getTargetNodeName.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D91414
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
-Use MCRegister instead of Register in MC layer.
-Move some enums from RISCVInstrInfo.h to RISCVBaseInfo.h to be with other TSFlags bits.
Differential Revision: https://reviews.llvm.org/D91114
The fshl and fshr intrinsics are defined to modulo their shift amount by the bitwidth of one of their inputs. The FSR/FSL instructions read one extra bit from the shift amount. If that bit is set the inputs are swapped. In order to preserve the semantics of the llvm intrinsics we need to make sure that the extra bit isn't set. DAG combine or instcombine may have removed any mask that was originally present.
We could be smarter here and try to use computeKnownBits to check if the bit is known zero, but wanted to start with correctness.
Differential Revision: https://reviews.llvm.org/D90905
We were creating RISCVISD::SELECT_CC nodes with Glue output that was never being used, and the tablegen SDNode had the SDNPInGlue flag instead of the SDNPOutGlue flag.
Since we don't seem to need the Glue just get rid of it from both places.
Differential Revision: https://reviews.llvm.org/D91199
This uses the shiftop PatFrags to handle the masked shift amount
and unmasked shift amount cases. That also checks XLen as part
of the masked amount check so we don't need separate RV32 and RV64
patterns.
Differential Revision: https://reviews.llvm.org/D91016
Bitconvert requires the bitwidth to match on both sides. On RV64
the GPR size is i64 so bitconvert between f32 isn't possible. The
node should never be generated so the pattern won't ever match, but
moving the patterns under IsRV32 makes it more obviously impossible.
It also moves it to a similar location to the patterns for the
custom nodes we use for RV64.
The multiply part of FMA is commutable, but TargetSelectionDAG.td
doesn't have it marked as commutable so tablegen won't automatically
create the additional patterns.
So manually add commuted patterns.
D80526 added custom lowering to pick the si lib call on RV64, but this custom handling is only enabled when the F and D extension are both disabled. This prevents the si library call from being used for double when F is enabled but D is not.
This patch changes the behavior so we always enable the Custom hook on RV64 and decide in ReplaceNodeResults if we should emit a libcall based on whether the FP type should be softened or not.
Differential Revision: https://reviews.llvm.org/D90817
The _F and _D registers are already sub/super registers. When one gets allocated all its aliases are already marked as allocated. We don't need to explicitly shadow it too.
I believe shadow is for calling conventions like 64-bit Windows on X86 where have rules like this
CCIfType<[i32], CCAssignToRegWithShadow<[ECX , EDX , R8D , R9D ],
[XMM0, XMM1, XMM2, XMM3]>>
For that calling convention the argument number determines which register is used regardless of how many scalars or vectors came before it.
Removing this removes a question I had in D90738.
Differential Revision: https://reviews.llvm.org/D90801
There is no FSLI instruction, but we can emulate it using FSRI by swapping operands and subtracting the immediate from the bitwidth.
Differential Revision: https://reviews.llvm.org/D90826
To accommodate frame layouts that have both fixed and scalable objects
on the stack, describing a stack location or offset using a pointer + uint64_t
is not sufficient. For this reason, we've introduced the StackOffset class,
which models both the fixed- and scalable sized offsets.
The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset,
so that this can be used in other interfaces, such as to eliminate frame indices
in PEI or to emit Debug locations for variables on the stack.
This patch is purely mechanical and doesn't change the behaviour of how
the result of this function is used for fixed-sized offsets. The patch adds
various checks to assert that the offset has no scalable component, as frame
offsets with a scalable component are not yet supported in various places.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D90018
The operations in these patterns shouldn't be effected by sign
bits. And the pattern is starting from a sign_extend_inreg so
we aren't expecting sign bits to be passed through either.
Differential Revision: https://reviews.llvm.org/D90739
fsl/fsr take their shift amount in $rs2 or an immediate. The
sources are $rs1 and $rs3.
fshl/fshr ISD opcodes both concatenate operand 0 in the high bits and
operand 1 in the lower bits. fshl returns the high bits after
shifting and fshr returns the low bits. So a shift amount of 0
returns operand 0 for fshl and operand 1 for fshr.
fsl/fsr concatenate their operands in different orders such that
$rs1 will be returned for a shift amount of 0. So $rs1 needs to
come from operand 0 of fshl and operand 1 of fshr.
Differential Revision: https://reviews.llvm.org/D90735
riscv_sllw/srlw only reads the lower 32 bits of the first operand.
And the lower 5 bits of the second operands. Whether the upper
32 bits of the input are sign bits or not doesn't matter.
Also use ineg and not to shorten the patterns.
Differential Revision: https://reviews.llvm.org/D90668
We need to ensure the upper 32 bits of the mask are zero.
So that the srl shifts zeroes into the lower 32 bits.
Differential Revision: https://reviews.llvm.org/D90585
We don't need custom matching, we just a need a predicate to check
the immediate is greater than 32. We can use the existing ImmSub32
to adjust the immediate.
I've also used the new predicate in the other location that used
ImmSub32. I tried to create a test case where we would break without
the greater than 32 check on that pattern, but DAG combine defeated me.
Still seemed safer to have it.
Differential Revision: https://reviews.llvm.org/D90546
DAGCombine doesn't canonicalize rotl/rotr with immediate so we
need patterns for both.
Remove the custom matcher for rotl to RORI and just use a SDNodeXForm
to convert the immediate instead. Doing this gives priority to the
rev32/rev16 versions of grevi over rori since an explicit immediate
is more precise than any immediate. I also added rotr patterns for
rev32/rev16. And removed the (or (shl), (shr)) patterns that should be
combined to rotl by DAG combine.
There is at least one other grev pattern that probably needs a
another rotr pattern, but we need more test coverage first.
Differential Revision: https://reviews.llvm.org/D90575
ADDI often has a frameindex in operand 1, but consumers of this
interface, such as MachineSink, tend to call getReg() on the Destination
and Source operands, leading to the following crash when building
FreeBSD after this implementation was added in 8cf6778d30:
```
clang: llvm/include/llvm/CodeGen/MachineOperand.h:359: llvm::Register llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
#0 0x00007f4286f9b4d0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) llvm/lib/Support/Unix/Signals.inc:563:0
#1 0x00007f4286f9b587 PrintStackTraceSignalHandler(void*) llvm/lib/Support/Unix/Signals.inc:630:0
#2 0x00007f4286f9926b llvm::sys::RunSignalHandlers() llvm/lib/Support/Signals.cpp:71:0
#3 0x00007f4286f9ae52 SignalHandler(int) llvm/lib/Support/Unix/Signals.inc:405:0
#4 0x00007f428646ffd0 (/lib/x86_64-linux-gnu/libc.so.6+0x3efd0)
#5 0x00007f428646ff47 raise /build/glibc-2ORdQG/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
#6 0x00007f42864718b1 abort /build/glibc-2ORdQG/glibc-2.27/stdlib/abort.c:81:0
#7 0x00007f428646142a __assert_fail_base /build/glibc-2ORdQG/glibc-2.27/assert/assert.c:89:0
#8 0x00007f42864614a2 (/lib/x86_64-linux-gnu/libc.so.6+0x304a2)
#9 0x00007f428d4078e2 llvm::MachineOperand::getReg() const llvm/include/llvm/CodeGen/MachineOperand.h:359:0
#10 0x00007f428d8260e7 attemptDebugCopyProp(llvm::MachineInstr&, llvm::MachineInstr&) llvm/lib/CodeGen/MachineSink.cpp:862:0
#11 0x00007f428d826442 performSink(llvm::MachineInstr&, llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::SmallVectorImpl<llvm::MachineInstr*>&) llvm/lib/CodeGen/MachineSink.cpp:918:0
#12 0x00007f428d826e27 (anonymous namespace)::MachineSinking::SinkInstruction(llvm::MachineInstr&, bool&, std::map<llvm::MachineBasicBlock*, llvm::SmallVector<llvm::MachineBasicBlock*, 4u>, std::less<llvm::MachineBasicBlock*>, std::allocator<std::pair<llvm::MachineBasicBlock* const, llvm::SmallVector<llvm::MachineBasicBlock*, 4u> > > >&) llvm/lib/CodeGen/MachineSink.cpp:1073:0
#13 0x00007f428d824a2c (anonymous namespace)::MachineSinking::ProcessBlock(llvm::MachineBasicBlock&) llvm/lib/CodeGen/MachineSink.cpp:410:0
#14 0x00007f428d824513 (anonymous namespace)::MachineSinking::runOnMachineFunction(llvm::MachineFunction&) llvm/lib/CodeGen/MachineSink.cpp:340:0
```
Thus, check that operand 1 is also a register in the condition.
Reviewed By: arichardson, luismarques
Differential Revision: https://reviews.llvm.org/D89090
The code is looking for (sext_inreg (or (shl X, C2), (shr (and Y, C3), C1))).
We need to ensure X and Y are the same.
Differential Revision: https://reviews.llvm.org/D90580
As discussed on D90322, some MSVC builds are failing with is_trivially_copyable static asserts (see D86126) - we can avoid this by not using the std::pair<unsigned,unsigned> which held both the FP+DP Registers, just handle the FP register and convert to DP on the fly.
This reverts 781917254d and recommits
781917254d.
I've changed getRegForInlineAsmConstraint to not use a std::pair
of Register in a previous commit. Hopefully that fixes the reported
issue with expensive checks on Windows. I'm still not sure exactly
why this commit removing an include affected a different file.
Original message:
RISCVRegisterInfo.h is part of the CodeGen layer. The Utils library
is intended to be shared with the MC layer so shouldn't use files
from the CodeGen layer.
The register enum names are already available from
RISCVMCTargetDesc.h. It appears what was coming from this include
was a transitive include of the Register class which I've replaced
with MCRegister. Register has a constructor from MCRegister so it
should be convertible.
The return value of this interface still uses an 'unsigned' on all
targets. So we convert Register back to unsigned at the end.
I'm hoping this will prevent the issue that caused the revert of
D90322.
Just return the new node, which is the standard practice.
I also noticed what appeared to be an unnecessary attempt at
creating an ANY_EXTEND where the type should already be correct.
I replace with an assert to verify the type.
Differential Revision: https://reviews.llvm.org/D90444
This combine makes two calls to SimplifyDemandedBits, one for the LHS and one
for the RHS. If the LHS call returns true, we don't make the RHS call. When
SimplifyDemandedBits makes a change, it will add the nodes around the change to
the DAG combiner worklist. If the simplification happens on the first recursion
step, the N will get added to the worklist. But if the simplification happens
deeper in the recursion, then N will not be revisited until the next time the
DAG combiner runs.
This patch explicitly addes N to the worklist anytime a Simplification is made.
Without this we might miss additional simplifications on the LHS or never
simplify the RHS. Special care also needs to be taken to not add N if it has
been CSEd by the simplification. There are similar examples in DAGCombiner and
the X86 target, but I don't have a test for it for RISC-V. I've also returned
SDValue(N, 0) instead of SDValue() so DAGCombiner knows a change was made and
will update its Statistic variable.
The test here was constructed so that 2 simplifications happen to the LHS.
Without this fix one happens in the post type legalization DAG combine and the
other happens after LegalizeDAG. This prevents the RHS from ever being
simplified causing the left and right shift to clear the upper 32 bits of the
RHS to be left behind.
Differential Revision: https://reviews.llvm.org/D90339
RISCVRegisterInfo.h is part of the CodeGen layer. The Utils library
is intended to be shared with the MC layer so shouldn't use files
from the CodeGen layer.
The register enum names are already available from
RISCVMCTargetDesc.h. It appears what was coming from this include
was a transitive include of the Register class which I've replaced
with MCRegister. Register has a constructor from MCRegister so it
should be convertible.
- The goal of this patch is improve option compatible with RISCV-V GCC,
-mcpu support on GCC side will sent patch in next few days.
- -mtune only affect the pipeline model and non-arch/extension related
target feature, e.g. instruction fusion; in td file it called
TuneFeatures, which is introduced by X86 back-end[1].
- -mtune accept all valid option for -mcpu and extra alias processor
option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
option compatible with RISCV-V GCC.
- Processor alias for -mtune will resolve according the current target arch,
rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
Implement vmsge{u}.vx pseudo instruction.
According to RISC-V V specification, there are different scenarios for this
pseudo instruction. I list them below.
unmasked va >= x
pseudoinstruction: vmsge{u}.vx vd, va, x
expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
masked va >= x, vd != v0
pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t
expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
masked va >= x, vd == v0
pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt
expansion: vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
Use pseudo instruction to model vmsge{u}.vx. The pseudo instruction will convert
to different expansion according to the condition.
Differential Revision: https://reviews.llvm.org/D84732
Changes TTI function getIntImmCostInst to take an additional Instruction parameter,
which enables us to be able to check it is part of a min(max())/max(min()) pattern that will match SSAT.
We can then mark the constant used as free to prevent it being hoisted so SSAT can still be generated.
Required minor changes in some non-ARM backends to allow for the optional parameter to be included.
Differential Revision: https://reviews.llvm.org/D87457
Scheduling information is of little value when they may disrupt the
pipeline. This patch allows omitting the scheduling information for CSR
instructions while still setting `SchedMachineModel::CompleteModel`. For
specific cases, any scheduling information added will be used by the
scheduler.
Differential revision: https://reviews.llvm.org/D85366
This does not result in changes for any of the current tests, but it might
improve debug information in some cases.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D86522
Currenlty assume x18 is used as pointer to shadow call stack. User shall pass
flags:
"-fsanitize=shadow-call-stack -ffixed-x18"
Runtime supported is needed to setup x18.
If SCS is desired, all parts of the program should be built with -ffixed-x18 to
maintain inter-operatability.
There's no particuluar reason that we must use x18 as SCS pointer. Any register
may be used, as long as it does not have designated purpose already, like RA or
passing call arguments.
Differential Revision: https://reviews.llvm.org/D84414
We weren't using this before, so none of the MachineFunction CFG edges had the
branch probability information added. As a result, block placement later in the
pipeline was flying blind.
This is enabled only with optimizations enabled like SelectionDAG.
Differential Revision: https://reviews.llvm.org/D86824
There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None.
So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline.
This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations.
Differential Revision: https://reviews.llvm.org/D86744
Since the canonical floatig-point move is fsgnj rd, rs, rs, we should
handle this case in RISCVInstrInfo::isAsCheapAsAMove().
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D86518
The isTriviallyRematerializable hook is only called for instructions that are
tagged as isAsCheapAsAMove. Since ADDI 0 is used for "mv" it should definitely
be marked with "isAsCheapAsAMove". This change avoids one stack spill in most of
the atomic-rmw.ll tests functions. It also avoids stack spills in two of our
out-of-tree CHERI tests.
ORI/XORI with zero may or may not be the same as a move micro-architecturally,
but since we are already doing it for register == x0, we might as well
do the same if the immediate is zero.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D86480
Implements the assemble and disassemble support of RISCV Vector
extension zvamo instructions, base on the 0.9 spec version.
Reviewed by HsiangKai
Differential Revision: https://reviews.llvm.org/D85069
PseudoBRIND had seemingly inherited incorrect annotations denoting it as
a call instruction and that it defines X1/ra. This caused excess
save/restore code to be emitted for ra.
Differential Revision: https://reviews.llvm.org/D86286
In SelectionDAGBuilder always translate the fshl and fshr intrinsics to
FSHL and FSHR (or ROTL and ROTR) instead of lowering them to shifts and
ORs. Improve the legalization of FSHL and FSHR to avoid code quality
regressions.
Differential Revision: https://reviews.llvm.org/D77152
This ensures that we never encode an instruction which is unavailable,
such as if we explicitly insert a forbidden instruction when lowering.
This is particularly important on RISC-V given its high degree of
modularity, and will become increasingly important as new standard
extensions appear.
Reviewed By: asb, lenary
Differential Revision: https://reviews.llvm.org/D85015
This implements the assemble and disassemble support of RISCV Vector
extension Zvlsseg instructions, base on the 0.9 spec version.
Reviewed by HsiangKai
Differential Revision: https://reviews.llvm.org/D84416
The RISC-V Privileged Specification 1.11 defines `mcountinhibit`, which
has the same numeric CSR value as `mucounteren` from 1.09.1. This patch
enables the use of the old `mucounteren` name.
Patch by Yuichi Sugiyama.
Reviewed By: lenary, jrtc27, pzheng
Differential Revision: https://reviews.llvm.org/D85067
This fixes the "Unable to insert indirect branch" fatal error sometimes
seen when generating position-independent code.
Patch by msizanoen1
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D84833
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.
This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.
One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.
I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.
Differential Revision: https://reviews.llvm.org/D85165
Summary:
1. gcc uses `-march` and `-mtune` flag to chose arch and
pipeline model, but clang does not have `-mtune` flag,
we uses `-mcpu` to chose both infos.
2. Add SiFive e31 and u54 cpu which have default march
and pipeline model.
3. Specific `-mcpu` with rocket-rv[32|64] would select
pipeline model only, and use the driver's arch choosing
logic to get default arch.
Reviewers: lenary, asb, evandro, HsiangKai
Reviewed By: lenary, asb, evandro
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D71124
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the ternary subset (zbt subextension) of the
experimental B extension of RISC-V.
It adds also the correspondent codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79875
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the single-bit subset (zbs subextension) of
the experimental B extension of RISC-V.
It adds also the correspondent codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79874
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions belonging to both the permutation and the base
subsets of the experimental B extension of RISC-V.
It adds also the correspondent codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79873
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the permutation subset (zbp subextension) of
the experimental B extension of RISC-V.
It adds also the correspondent codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79871
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the base subset (zbb subextension) of the
experimental B extension of RISC-V.
It adds also the correspondent codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79870
Summary:
Without these, the generic branch relaxation pass will underestimate the
range required for branches spanning these and we can end up with
"fixup value out of range" errors rather than relaxing the branches.
Some of the instructions in the expansion may end up being compressed
but exactly determining that is awkward, and these conservative values
should be safe, if slightly suboptimal in rare cases.
Reviewers: asb, lenary, luismarques, lewis-revill
Reviewed By: asb, luismarques
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77443
Because of the layout of stores (that don't have a destination operand)
this check is exactly the same as the one in
RISCVInstrInfo::isLoadFromStackSlot.
Differential Revision: https://reviews.llvm.org/D81805
The GlobalISelEmitter is stricter about matching timm instruction
outputs to timm inputs (although in an accidental sort of way that
doesn't hit a proper import failure error). Also, apparently no
intrinsic patterns were importing since the ID enum declaration was
missing.
Since the `RISCVExpandPseudo` pass has been split from
`RISCVExpandAtomicPseudo` pass, it would be nice to run the former as
early as possible (The latter has to be run as late as possible to
ensure correctness). Running earlier means we can reschedule these pairs
as we see fit.
Running earlier in the machine pass pipeline is good, but would mean
teaching many more passes about `hasLabelMustBeEmitted`. Splitting the
basic blocks also pessimises possible optimisations because some
optimisations are MBB-local, and others are disabled if the block has
its address taken (which is notionally what `hasLabelMustBeEmitted`
means).
This patch uses a new approach of setting the pre-instruction symbol on
the AUIPC instruction to a temporary symbol and referencing that. This
avoids splitting the basic block, but allows us to reference exactly the
instruction that we need to. Notionally, this approach seems more
correct because we do actually want to address a specific instruction.
This then allows the pass to be moved much earlier in the pass pipeline,
before both scheduling and register allocation. However, to do so we
must leave the MIR in SSA form (by not redefining registers), and so use
a virtual register for the intermediate value. By using this virtual
register, this pass now has to come before register allocation.
Reviewed By: luismarques, asb
Differential Revision: https://reviews.llvm.org/D82988
For an addition with an immediate in specific ranges, a pair of
addi-addi can be generated instead of the ordinary lui-addi-add serial.
Reviewed By: MaskRay, luismarques
Differential Revision: https://reviews.llvm.org/D82262
... to shift/add or shift/sub.
Do not enable it on riscv32 with the M extension where decomposeMulByConstant
may not be an optimization.
Reviewed By: luismarques, MaskRay
Differential Revision: https://reviews.llvm.org/D82660
We can often fold an ADDI into the offset of load/store instructions:
(load (addi base, off1), off2) -> (load base, off1+off2)
(store val, (addi base, off1), off2) -> (store val, base, off1+off2)
This is possible when the off1+off2 continues to fit the 12-bit immediate.
We remove the previous restriction where we would never fold the ADDIs if
the load/stores had nonzero offsets. We now do the fold the the resulting
constant still fits a 12-bit immediate, or if off1 is a variable's address
and we know based on that variable's alignment that off1+offs2 won't overflow.
Differential Revision: https://reviews.llvm.org/D79690
The pass to split atomic and non-atomic RISC-V pseudo-instructions was itself
split into two passes in D79635 / commit rG2cb0644f90b7, with the splitting of
non-atomic instructions being moved to the PreSched2 phase. A comment was
added to D79635 detailing a case where this caused problems, so this commit
moves the non-atomic split pass back to the PreEmitPass2 phase. This allows
the bulk of the changes from D79635 to remain committed, while addressing the
the reported problem (the pass split is now almost NFC). Once the root problem
is fixed we can move the (non-atomic) instruction splitting pass back to
earlier in the pipeline.
The pass to split atomic and non-atomic RISC-V pseudo-instructions was itself
split into two passes in D79635 / commit rG2cb0644f90b7, with the splitting of
non-atomic instructions being moved to the PreSched2 phase. A comment was
added to D79635 detailing a case where this caused problems, so this commit
moves the non-atomic split pass back to the PreEmitPass2 phase. This allows
the bulk of the changes from D79635 to remain committed, while addressing the
the reported problem (the pass split is now almost NFC). Once the root problem
is fixed we can move the (non-atomic) instruction splitting pass back to
earlier in the pipeline.
Summary:
This implements two hooks that attempt to avoid control flow for RISC-V. RISC-V
will lower SELECTs into control flow, which is not a great idea.
The hook `hasMultipleConditionRegisters()` turns off the following
DAGCombiner folds:
select(C0|C1, x, y) <=> select(C0, x, select(C1, x, y))
select(C0&C1, x, y) <=> select(C0, select(C1, x, y), y)
The second hook `setJumpIsExpensive` controls a flag that has a similar purpose
and is used in CodeGenPrepare and the SelectionDAGBuilder.
Both of these have the effect of ensuring more logic is done before fewer jumps.
Note: with the `B` extension, we may be able to lower select into a conditional
move instruction, so at some point these hooks will need to be guarded based on
enabled extensions.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D79268
Extracts the atomic pseudo-instructions' splitting from `riscv-expand-pseudo`
/ `RISCVExpandPseudo` into its own pass, `riscv-expand-atomic-pseudo` /
`RISCVExpandAtomicPseudo`. This allows for the expansion of atomic operations
to continue to happen late (the new pass is added in `addPreEmitPass2`, so
those expansions continue to happen in the same place), while the remaining
pseudo-instructions can now be expanded earlier and benefit from more
optimization passes. The nonatomics pass is now added in `addPreSched2`.
Differential Revision: https://reviews.llvm.org/D79635
Assemble/disassemble RISC-V V extension instructions according to
latest version spec in https://github.com/riscv/riscv-v-spec/.
I have tested this patch using GNU toolchain. The encoding is aligned
to GNU assembler output. In this patch, there is a test case for each
instruction at least.
The V register definition is just for assemble/disassemble. Its type
is not important in this stage. I think it will be reviewed and modified
as we want to do codegen for scalable vector types.
This patch does not include Zvamo, Zvlsseg, and Zvediv.
Differential revision: https://reviews.llvm.org/D69987
Since i32 is not legal in riscv64,
it always promoted to i64 before emitting lib call and
for conversions like float/double to int and float/double to unsigned int
wrong lib call was emitted. This commit fix it using custom lowering.
Differential Revision: https://reviews.llvm.org/D80526
Currently, some fairly arbitrary subset of overriden methods in
RISCVISelLowering are private rather than public (which is the
visibility they have in TargetLowering). I suspect this is a holdover
from too closely copying another backend.
D78545 pointed out this can be difficult for some downstream patches,
and nobody has come forward to suggest a reason for keeping the
visibility as-is.
This commit simply makes all overridden methods match the public
visiblity of the parent.
Differential Revision: https://reviews.llvm.org/D79928
Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given
Differential Revision: https://reviews.llvm.org/D79537
Summary:
RISC-V uses a post-select peephole pass to optimise
`(load/store (ADDI $reg, %lo(addr)), 0)` into `(load/store $reg, %lo(addr))`.
This peephole wasn't firing for accesses to constant pools, which is how we
materialise most floating point constants.
This adds support for the constantpool case, which improves code generation for
lots of small FP loading examples. I have not added any tests because this
structure is well-covered by the `fp-imm.ll` testcases, as well as almost
all other uses of floating point constants in the RISC-V backend tests.
Reviewed By: luismarques, asb
Differential Revision: https://reviews.llvm.org/D79523
Summary:
RISC-V uses a post-select peephole pass to optimise
`(load/store (ADDI $reg, %lo(addr)), 0)` into `(load/store $reg, %lo(addr))`.
This peephole wasn't firing for accesses to constant pools, which is how we
materialise most floating point constants.
This adds support for the constantpool case, which improves code generation for
lots of small FP loading examples. I have not added any tests because this
structure is well-covered by the `fp-imm.ll` testcases, as well as almost
all other uses of floating point constants in the RISC-V backend tests.
Reviewed By: luismarques, asb
Differential Revision: https://reviews.llvm.org/D79523
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.
Removing getAlignment() will be done as a follow up.
Differential Revision: https://reviews.llvm.org/D79436
Summary:
The RISC-V debug register was named dscratch in a previous draft of the RISC-V
debug mode spec. The number of registers has been increased to 2 in the latest
ratified version of the debug mode spec and the registers were named dscratch0
and dscratch1. We still support using the old register name "dscratch", but it
would be disassembled as "dscratch0" with this change.
Reviewers: apazos, asb, lenary, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, sameer.abuasal, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78764
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.
RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html
Differential Revision: https://reviews.llvm.org/D79002
Summary:
The current lowering of `select` on RISC-V uses a branch instruction to load a
register with one or other value. This is inefficient, especially in the case of
small constants that can be computed easily.
By implementing the TargetLowering::convertSelectOfConstantsToMath hook, some of
the simpler cases are covered that let us avoid introducing a branch in these
cases.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D79260
Summary:
This patch addresses some weird assembly sequences we were seeing during
comparing floats. In particular, comparing a float to itself tells you whether
it is NaN or not, which we were doing correctly, but with an extra unneeded
`and` instruction.
This patch specialises the existing patterns to remove the `and` instructions
when both their operands are the same.
Reviewed By: luismarques, asb
Differential Revision: https://reviews.llvm.org/D78908
Preserving liveness can be useful even late in the pipeline, if we're
doing substantial optimization work afterwards. (See, for example,
D76065.) Teach MachineOutliner how to correctly set live-ins on the
basic block in outlined functions.
Differential Revision: https://reviews.llvm.org/D78605
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
1) The first argument's type is `const MCInst &`, the third
argument's type is `MCInst &`, but they may be aliased to the same
variable
2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
loop.
In this patch, we drop the thrid argument, and let `relaxInstruction`
directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and
breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon.
Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain
Reviewed By: Razer6, MaskRay, bcain
Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78364
This adds the instruction encoding and mnenomics for the proposed
RISC-V Bit Manipulation extension (version 0.92). It is implemented with
each category of instruction as its own target feature, with the 'b'
extension feature enabling all options. Since this extension is not yet
ratified, all target features are prefixed with 'experimental-' to note
their status.
Differential Revision: https://reviews.llvm.org/D65649
This implements the instruction analysis required to print branch
targets as part of llvm-objdump's disassembly.
Note, this only handles those branches which can be analyzed in a single
instruction, a future patch will handle multiple-instruction patterns,
such as AUIPC/LUI+JALR instruction pairs.
Differential Revision: https://reviews.llvm.org/D77567
Summary:
Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised
with GetPromotedInteger, which leaves the upper bits of the value
undefind. Since this is used for comparing in an LR/SC loop with a
full-width comparison, we must sign extend it. We introduce a new
getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since
many targets have compare-and-swap instructions (or pseudos) that
correctly handle an any-extend input, and the existing function
determines the extension of the result, whereas we are concerned with
the input.
This is related to https://reviews.llvm.org/D58829, which solved the
issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler
ATOMIC_CMP_SWAP.
Reviewers: asb, lenary, efriedma
Reviewed By: asb
Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74453
For the downstream RISCV maintenance, it would be easier to inherent
RISCVISelDAGToDAG by including header and only override the method that needs
to be customized for the provider non-standard ISA extension without touching
RISCVISelDAGToDAG.cpp which may cause conflict when upgrading the downstream
LLVM version.
Differential Revision: https://reviews.llvm.org/D77117
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77059
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: dylanmckay, sdardis, nemanjai, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76551
-fuse-init-array is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false.
The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called.
clang -target i386 -c a.c
clang -target x86_64 -c a.c
This cleanup fixes this as a bonus.
X86SpeculativeLoadHardeningPass::tracePredStateThroughCall can call
MCContext::createTempSymbol before TargetLoweringObjectFileELF::Initialize().
We need to call TargetLoweringObjectFileELF::Initialize() ealier.
test/CodeGen/X86/speculative-load-hardening-indirect.ll
Differential Revision: https://reviews.llvm.org/D71360
UseInitArray is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false.
The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called.
clang -target i386 -c a.c
clang -target x86_64 -c a.c
This cleanup fixes this as a bonus.
Differential Revision: https://reviews.llvm.org/D71360
This reverts commit e9f22fd429.
When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70
failing tests with this revision, and 29 without this revision.
Floating point positive zero can be selected using fmv.w.x / fmv.d.x /
fcvt.d.w and the zero source register.
Differential Revision: https://reviews.llvm.org/D75729
This patch generates TableGen descriptions for the specified register
banks which contain a list of register sizes corresponding to the
available HwModes. The appropriate size is used during codegen according
to the current HwMode. As this HwMode was not available on generation,
it is set upon construction of the RegisterBankInfo class. Targets
simply need to provide the HwMode argument to the
<target>GenRegisterBankInfo constructor.
The RISC-V RegisterBankInfo constructor has been updated accordingly
(plus an unused argument removed).
Differential Revision: https://reviews.llvm.org/D76007
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76348
For context, the proposed RISC-V bit manipulation extension has a subset
of instructions which require one of two SubtargetFeatures to be
enabled, 'zbb' or 'zbp', and there is no defined feature which both of
these can imply to use as a constraint either (see comments in D65649).
AssemblerPredicates allow multiple SubtargetFeatures to be declared in
the "AssemblerCondString" field, separated by commas, and this means
that the two features must both be enabled. There is no equivalent to
say that _either_ feature X or feature Y must be enabled, short of
creating a dummy SubtargetFeature for this purpose and having features X
and Y imply the new feature.
To solve the case where X or Y is needed without adding a new feature,
and to better match a typical TableGen style, this replaces the existing
"AssemblerCondString" with a dag "AssemblerCondDag" which represents the
same information. Two operators are defined for use with
AssemblerCondDag, "all_of", which matches the current behaviour, and
"any_of", which adds the new proposed ORing features functionality.
This was originally proposed in the RFC at
http://lists.llvm.org/pipermail/llvm-dev/2020-February/139138.html
Changes to all current backends are mechanical to support the replaced
functionality, and are NFCI.
At this stage, it is illegal to combine features with ands and ors in a
single AssemblerCondDag. I suspect this case is sufficiently rare that
adding more complex changes to support it are unnecessary.
Differential Revision: https://reviews.llvm.org/D74338
The patch fixes some typos and introduces ReadFMemBase, ReadFSGNJ32,
ReadFSGNJ64, WriteFSGNJ32, WriteFSGNJ64, ReadFMinMax32, ReadFMinMax64,
WriteFMinMax32, WriteFMinMax64, so the target CPU with different pipeline model
could use them to describe latency.
Differential Revision: https://reviews.llvm.org/D75515
When running under LTO, it is common to not specify the architecture
spec, which is used for setting up the target machine, and instead rely
on features specified in each function to generate the correct
instructions.
This works for the code generator, but the RISC-V backend uses the
AsmPrinter to do instruction compression, which does not see these
features but instead uses a MCSubtargetInfo object to see whether
compression is enabled. Since this is configured based on the
TargetMachine at startup, it will result in compressed instructions not
being emitted when it has not been given the 'c' TargetFeature, but the
function has it.
This changes the RISCVAsmPrinter to re-initialize the STI feature set
based on the current MachineFunction, such that compressed instructions
are now correctly emitted regardless of the method used to enable them.
Differential revision: https://reviews.llvm.org/D73339
Implement TargetLowering callback mayBeEmittedAsTailCall for riscv in CodeGenPrepare,
which will duplicate return instructions to enable tailcall optimization.
Differential Revision: https://reviews.llvm.org/D73699
CallPreservedMask is used to describe the register liveness after a
function call. The function call in an interrupt handler should use the same
CallPreservedMask as normal functions. So that only callee save registers
can live through the function call.
This patch adds the support required for using the __riscv_save and
__riscv_restore libcalls to implement a size-optimization for prologue
and epilogue code, whereby the spill and restore code of callee-saved
registers is implemented by common functions to reduce code duplication.
Logic is also included to ensure that if both this optimization and
shrink wrapping are enabled then the prologue and epilogue code can be
safely inserted into the basic blocks chosen by shrink wrapping.
Differential Revision: https://reviews.llvm.org/D62686
Summary:
Add a new method (tryParseRegister) that attempts to parse a register specification.
MASM allows the use of IFDEF <register>, as well as IFDEF <symbol>. To accommodate this, we make it possible to check whether a register specification can be parsed at the current location, without failing the entire parse if it can't.
Reviewers: thakis
Reviewed By: thakis
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73486
ADDI(C.ADDI) may achieve better code size than XORI, since XORI has no C extension.
This patch transforms two patterns and gets almost equivalent results.
Differential Revision: https://reviews.llvm.org/D71774
When the FP exists, the FP base CFI directive offset should take the size of variable arguments into account.
Differential Revision: https://reviews.llvm.org/D73862
Remove code from LegalizeTypes that allowed this to work.
We were already using BUILD_PAIR for this in some places so this
standardizes on a single way to do this.
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73885
Summary:
Implements the jump pseudo-instruction, which is used in e.g. the Linux kernel.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73178
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Pipeline scheduler model for the RISC-V Rocket micro-architecture using the
MIScheduler interface. Support for both 32 and 64-bit Rocket cores is
implemented.
Differential revision: https://reviews.llvm.org/D68685
Summary:
Previously, we would erroneously turn %pcrel_lo(label), where label has
a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as
evaluatePCRelLo would believe the target independent logic was going to
fold it. Moreover, even if that were fixed, shouldForceRelocation lacks
an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value
and check the symbol, so we would then erroneously constant-fold the
%pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same
sequence also occurs for symbols with global binding, which is triggered
in real-world code.
Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to
avoid these kinds of issues. All the resolution logic happens in one
place, with no coordination required between RISCAsmBackend and
RISCVMCExpr to ensure they implement the same logic twice. Although the
implementation of %pcrel_hi can be left as target independent, we make
it target dependent to ensure that they are handled identically to
%pcrel_lo, otherwise we risk one of them being constant folded but the
other being preserved. This also allows us to properly support fixup
pairs where the instructions are in different fragments.
Reviewers: asb, lenary, efriedma
Reviewed By: efriedma
Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73211
1. if users don't specific -mattr, the default target-feature come
from IR attribute.
2. fixed bug and re-land this patch
Reviewers: lenary, asb
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70837
These names have been changed from CamelCase to camelCase, but there were
many places (comments mostly) that still used the old names.
This change is NFC.
Except AMDGPU/R600RegisterInfo (a bunch of MIR tests seem to have
problems), every target overrides it with true. PostMachineScheduler
requires livein information. Not providing it can cause assertion
failures in ScheduleDAGInstrs::addSchedBarrierDeps().
This include file was created in October and has a "using namespace llvm". This seems to get exposed to other include files and finally onto cpp files. While this somewhat okay for llvm itself, its bad for other projects that use llvm as a library and includes a header file that picks this up. This was found by ISPC which has some class names at gloal scope with the same names as LLVM.
It looks like RISCV accidentally became dependent on this. I fixed it by reordering some includes in the RISCV code, but maybe we want to change the TableGenEmitter to put "namespace llvm {" in the generated file instead? But we probably want to do the simplest thing first so we can merge it to 10.0.
Differential Revision: https://reviews.llvm.org/D72895
if users don't specific -mattr, the default target-feature come
from IR attribute.
Reviewers: lenary, asb
Reviewed By: lenary, asb
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70837
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.
A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.
This patch reduces the number of public symbols in libLLVM.so by about
25%. This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so
One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.
Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278
Reviewers: chandlerc, beanz, mgorny, rnk, hans
Reviewed By: rnk, hans
Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D54439
Enabling shrink wrapping requires ensuring the insertion point of the
epilogue is correct for MBBs without a terminator, in which case the
instruction to adjust the stack pointer is the last instruction in the
block.
Differential Revision: https://reviews.llvm.org/D62190
Summary: These seem to be the machine operand types currently needed by the
RISC-V target.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72275
Summary:
It is useful to keep statistics on how many instructions we have
compressed, so we can see if future changes are increasing or decreasing this
number.
Reviewers: asb, luismarques
Reviewed By: asb, luismarques
Subscribers: xbolva00, sameer.abuasal, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67495
Summary:
AMO memory operands use a custom parser in order to accept both (reg)
and 0(reg). However, the validation predicate used for these operands
was only checking that they were registers, and not the register class,
so non-GPRs (such as FPRs) were also accepted. Thus, fix this by making
the predicate check that they are GPRs.
Reviewers: asb, lenary
Reviewed By: asb, lenary
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72471
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.
If we ever have the needs to add verbose log to disassemblers, we can
record log with a member function, instead of passing it around as an
argument.
Only PPC seems to be using it, and only checks some simple cases and
doesn't distinguish between FP. Just switch to using LLT to simplify
use from GlobalISel.
Summary:
This is analogous to D58943, which correctly finds the corresponding
fixup. However, when linker relaxations are disabled and we evaluate the
fixup, we need to also ensure we use an offset of 0 rather than the size
of the previous fragment.
Reviewers: asb, efriedma, lenary
Reviewed By: efriedma
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71978
printInst prints a branch/call instruction as `b offset` (there are many
variants on various targets) instead of `b address`.
It is a convention to use address instead of offset in most external
symbolizers/disassemblers. This difference makes `llvm-objdump -d`
output unsatisfactory.
Add `uint64_t Address` to printInst(), so that it can pass the argument to
printInstruction(). `raw_ostream &OS` is moved to the last to be
consistent with other print* methods.
The next step is to pass `Address` to printInstruction() (generated by
tablegen from the instruction set description). We can gradually migrate
targets to print addresses instead of offsets.
In any case, downstream projects which don't know `Address` can pass 0 as
the argument.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D72172
When the "disable-tail-calls" attribute was added, checks were added for
it in various backends. Now this code has proliferated, and it is
something the target is responsible for checking. Move that
responsibility back to the ISels (fast, global, and SD).
There's no major functionality change, except for targets that never
implemented this check.
This LLVM attribute was originally added in
d9699bc7bd (2015).
Reviewers: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D72118
This allows us to delete InlineAsm::Constraint_i workarounds in
SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and
TargetLowering::getInlineAsmMemConstraint overrides.
They were introduced to X86 in r237517 to prevent crashes for
constraints like "=*imr". They were later copied to other targets.
Summary: Instead of crashing due to the `llvm_unreachable`, provide a proper
error when invalid fixups/relocations are encountered.
Reviewers: asb, lenary
Reviewed By: asb
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71536
This patch enables the machine outliner for RISC-V and adds the
necessary logic for checking whether sequences can be safely outlined,
and describing how they should be outlined. Outlined functions are
called using the register t0 (x5) as the return address register, which
must be available for an occurrence of a sequence to be safely outlined.
Differential Revision: https://reviews.llvm.org/D66210
expected failed test (RV32IF-ILP32F) will be fixed in a subsequent patch.
Reviewers: efriedma, lenary, asb
Reviewed By: efriedma, lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70116
This fixes an assertion failure that triggers inside
getMemOperandWithOffset when Machine Sinking calls it on a MachineInstr
that is not a memory operation.
Different backends implement getMemOperandWithOffset differently: some
return false on non-memory MachineInstrs, others assert.
The Machine Sinking pass in at least SinkingPreventsImplicitNullCheck
relies on getMemOperandWithOffset to return false on non-memory
MachineInstrs, instead of asserting.
This patch updates the documentation on getMemOperandWithOffset that it
should return false on any MachineInstr it cannot handle, instead of
asserting. It also adapts the in-tree backends accordingly where
necessary.
Differential Revision: https://reviews.llvm.org/D71359
Currently -fuse-init-array option is not effective when target triple
does not specify os, on x86,x86_64.
i.e.
// -fuse-init-array is not honored.
$ clang -target i386 -fuse-init-array test.c -S
// -fuse-init-array is honored.
$ clang -target i386-linux -fuse-init-array test.c -S
This patch fixes first case.
And does cleanup.
Reviewers: rnk, craig.topper, fhahn, echristo
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71360
Summary:
This copy ensures that debug location information is kept for
compressed instructions. There are places where both compressInstruction and
uncompressInstruction are called that were not doing this copy, discarding some
debug info.
This change merely moves the copy into the generated file, so you cannot forget
to copy the location over when compressing or uncompressing.
Reviewers: asb, luismarques
Reviewed By: luismarques
Subscribers: sameer.abuasal, aprantl, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67493
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
object file size.
- Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
Soon Intrinsic::ID will be a plain integer, so this overload will not be
possible.
Rename both overloads to ensure that downstream targets observe this as
a build failure instead of a runtime failure.
Split off from D71320
Reviewers: efriedma
Differential Revision: https://reviews.llvm.org/D71381
This adds support for printing improved missing feature error messages
from the assembler, which now indicates which feature caused the parse
to fail.
Differential Revision: https://reviews.llvm.org/D69899
Summary:
Forcing Local Exec TLS requires the use of copy relocations. Copy
relocations need special handling in the runtime linker when being used
against TLS symbols, which is present in glibc, but not in FreeBSD nor
musl, and so cannot be relied upon. Moreover, copy relocations are a
hack that embed the size of an object in the ABI when it otherwise
wouldn't be, and break protected symbols (which are expected to be DSO
local), whilst also wasting space, thus they should be avoided whenever
possible. As discussed in D70398, RISC-V should move away from forcing
Local Exec, and instead use Initial Exec like other targets, with
possible linker relaxation to follow. The RISC-V GCC maintainers also
intend to adopt this more-conventional behaviour (see
https://github.com/riscv/riscv-elf-psabi-doc/issues/122).
Reviewers: asb, MaskRay
Reviewed By: MaskRay
Subscribers: emaste, krytarowski, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, llvm-commits, bsdjhb
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70649
Summary: Adds tablegen patterns to explicitly handle fcopysign where the
magnitude and sign arguments have different types, due to the sign value casts
being removed the by DAGCombiner. Support for RV32IF follows in a separate
commit. Adds tests for all relevant scenarios except RV32IF.
Reviewers: lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70678
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
Summary:
The RISC-V backend used to generate `add <reg>, x0, <reg>` in a few
instances. It seems most places no longer generate this sequence.
This is semantically equivalent to `addi <reg>, <reg>, 0`, but the
latter has the advantage of being noted to be the canonical instruction
to be used for moves (which microarchitectures can and should recognise
as such).
The changed testcases use instruction aliases - `mv <reg>, <reg>` is an
alias for `addi <reg>, <reg>, 0`.
Reviewers: luismarques
Reviewed By: luismarques
Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70124
Summary: Removes CFI CFA directives that could incorrectly propagate
beyond the basic block they were inteded for. Specifically it removes
the epilogue CFI directives. See the branch_and_tail_call test for an
example of the issue. Should fix the stack unwinding issues caused by
the incorrect directives.
Reviewers: asb, lenary, shiva0217
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69723
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Summary: Removes CFI CFA directives that could incorrectly propagate
beyond the basic block they were inteded for. Specifically it removes
the epilogue CFI directives. See the branch_and_tail_call test for an
example of the issue. Should fix the stack unwinding issues caused by
the incorrect directives.
Reviewers: asb, lenary, shiva0217
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69723
Summary: When using the split sp adjustment and using the frame-pointer
we were still emitting CFI CFA directives based on the sp value. The
final sp-based offset also didn't reflect the two-stage sp adjust. There
remain CFI issues that aren't related to the split sp adjustment, and
thus will be addressed in a separate patch.
Reviewers: asb, lenary, shiva0217
Reviewed By: lenary, shiva0217
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69385
Summary: Adds tests necessary to properly show the impact of other
patches that affect the emission of CFI directives.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69721
The following testcase
function:
.Lpcrel_label1:
auipc a0, %pcrel_hi(other_function)
addi a1, a0, %pcrel_lo(.Lpcrel_label1)
.p2align 2 # Causes a new fragment to be emitted
.type other_function,@function
other_function:
ret
exposes an odd behaviour in which only the %pcrel_hi relocation is
evaluated but not the %pcrel_lo.
$ llvm-mc -triple riscv64 -filetype obj t.s | llvm-objdump -d -r -
<stdin>: file format ELF64-riscv
Disassembly of section .text:
0000000000000000 function:
0: 17 05 00 00 auipc a0, 0
4: 93 05 05 00 mv a1, a0
0000000000000004: R_RISCV_PCREL_LO12_I other_function+4
0000000000000008 other_function:
8: 67 80 00 00 ret
The reason seems to be that in RISCVAsmBackend::shouldForceRelocation we
only consider the fragment but in RISCVMCExpr::evaluatePCRelLo we
consider the section. This usually works but there are cases where the
section may still be the same but the fragment may be another one. In
that case we end forcing a %pcrel_lo relocation without any %pcrel_hi.
This patch makes RISCVAsmBackend::shouldForceRelocation use the section,
if any, to determine if the relocation must be forced or not.
Differential Revision: https://reviews.llvm.org/D60657
Summary: Introduces the `InstrInfo::areMemAccessesTriviallyDisjoint`
hook. The test could check for instruction reorderings, but to avoid
being brittle it just checks instruction dependencies.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67046
Summary: The hook should work for any RISC-V register. Non-allocatable registers
do not need to be reserved, for the remaining the hook will only succeed
if you pass clang the -ffixed-xX flag. This builds upon D67185, which
currently only allows reserving GPRs.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69130
Summary:
Until this commit, these have lowered to a call to abort().
`llvm.trap()` now lowers to `unimp`, which should trap on all systems.
`llvm.debugtrap()` now lowers to `ebreak`, which is exactly what this
instruction is for.
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69390
Complete fp16 support by ensuring that load extension / truncate store
operations are properly expanded.
Reviewers: asb, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D69246
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
This adds support for reserving GPRs such that the compiler will not
choose a register for register allocation. The implementation follows
the same design as for AArch64; each reserved register becomes a target
feature and used for getting the reserved registers for a given
MachineFunction. The backend checks that it does not need to write to
any reserved register; if it does a relevant error is generated.
Differential Revision: https://reviews.llvm.org/D67185
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68993
llvm-svn: 375084
Summary:
This patch implements the `TargetInstrInfo::verifyInstruction` hook for RISC-V. Currently the hook verifies the machine instruction's immediate operands, to check if the immediates are within the expected bounds. Without the hook invalid immediates are not detected except when doing assembly parsing, so they are silently emitted (including being truncated when emitting object code).
The bounds information is specified in tablegen by using the `OperandType` definition, which sets the `MCOperandInfo`'s `OperandType` field. Several RISC-V-specific immediate operand types were created, which extend the `MCInstrDesc`'s `OperandType` `enum`.
To have the hook called with `llc` pass it the `-verify-machineinstrs` option. For Clang add the cmake build config `-DLLVM_ENABLE_EXPENSIVE_CHECKS=True`, or temporarily patch `TargetPassConfig::addVerifyPass`.
Review concerns:
- The patch adds immediate operand type checks that cover at least the base ISA. There are several other operand types for the C extension and one type for the F/D extensions that were left out of this initial patch because they introduced further design concerns that I felt were best evaluated separately.
- Invalid register classes (e.g. passing a GPR register where a GPRC is expected) are already caught, so were not included.
- This design makes the more abstract `MachineInstr` verification depend on MC layer definitions, which arguably is not the cleanest design, but is in line with how things are done in other parts of the target and LLVM in general.
- There is some duplication of logic already present in the `MCOperandPredicate`s. Since the `MachineInstr` and `MCInstr` notions of immediates are fundamentally different, this is currently necessary.
Reviewers: asb, lenary
Reviewed By: lenary
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67397
llvm-svn: 375006
LLVM may annotate the function with fastcc if there has only one caller
and there're no other caller out of the module and the function is not
naked or contain variable arguments.
The fastcc functions could pass the arguments by the caller saved registers.
Differential Revision: https://reviews.llvm.org/D68559
llvm-svn: 374857
We would like to split the SP adjustment to reduce the instructions in
prologue and epilogue as the following case. In this way, the offset of
the callee saved register could fit in a single store.
add sp,sp,-2032
sw ra,2028(sp)
sw s0,2024(sp)
sw s1,2020(sp)
sw s3,2012(sp)
sw s4,2008(sp)
add sp,sp,-64
Differential Revision: https://reviews.llvm.org/D68011
llvm-svn: 373688
These old aliases were renamed, but are still used by some projects (eg newlib).
Differential Revision: https://reviews.llvm.org/D68392
llvm-svn: 373618
The new names for FPRs ensure that the Register values within the same class are
enumerated consecutively (the order is determined by the `LessRecordRegister`
function object). Where there were tables mapping between 32- and 64-bit FPRs
(and vice versa) this patch replaces them with Register arithmetic. The
enumeration order between different register classes is expected to continue to
be arbitrary, although it does impact the conversion from the (overloaded) asm
FPR names to Register values, and therefore might require updates to the target
if the sorting algorithm is changed. Static asserts were added to ensure that
changes to the ordering that would impact the current implementation are
detected.
Differential Revision: https://reviews.llvm.org/D67423
llvm-svn: 373096
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
Most of the test changes are trivial instruction reorderings and differing
register allocations, without any obvious performance impact.
Differential Revision: https://reviews.llvm.org/D66973
llvm-svn: 372106
Summary:
Now that llvm-objdump allows target-specific options, we match the
`no-aliases` and `numeric` options for RISC-V, as supported by GNU objdump.
This is done by overriding the variables used for the command-line options, so
that the command-line options are still supported.
This patch updates all tests using `llvm-objdump -riscv-no-aliases` to use
`llvm-objdump -M no-aliases`.
Reviewers: luismarques, asb
Reviewed By: luismarques, asb
Subscribers: pzheng, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66139
llvm-svn: 371534
Summary:
This is an option primarily to use during testing. Instead of always
printing registers using their ABI names, this allows a user to request they
are printed with their architectural name.
This is then used in the register constraint tests to ensure the mapping
between architectural and abi names is correct.
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: pzheng, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65950
llvm-svn: 371531
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67267
llvm-svn: 371212
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67229
llvm-svn: 371200
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:
- `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
- `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
- `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,
Reviewers: lattner, thegameg, courbet
Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65945
llvm-svn: 371045
The patch fixed the issue that RV64 didn't clear the upper bits
when return complex floating value with lp64 ABI.
float _Complex
complex_add(float _Complex a, float _Complex b)
{
return a + b;
}
RealResult = zero_extend(RealA + RealB)
ImageResult = ImageA + ImageB
Return (RealResult | (ImageResult << 32))
The patch introduces shouldExtendTypeInLibCall target hook to suppress
the AssertZext generation when lowering floating LibCall.
Thanks to Eli's comments from the Bugzilla
https://bugs.llvm.org/show_bug.cgi?id=42820
Differential Revision: https://reviews.llvm.org/D65497
llvm-svn: 370275
Prefer `MCFixupKind` where possible and add getTargetKind() to
convert to `unsigned` when needed rather than scattering cast
operators around the place.
Differential Revision: https://reviews.llvm.org/D59890
llvm-svn: 369720
The hint instructions are enabled by default (if the standard C extension is
enabled). To disable them pass -mattr=-rvc-hints.
Differential Revision: https://reviews.llvm.org/D62592
llvm-svn: 369528
Follow binutils in using RISCV_32_PCREL for the FDE initial location. As
explained in the relevant binutils commit
<a6cbf936e3>,
the ADD/SUB pair of relocations is problematic in the presence of linker
relaxation.
This patch has the same end goal as D64715 but includes test changes and
avoids adding a new global VariantKind to MCExpr.h (preferring
RISCVMCExpr VKs like the rest of the RISC-V backend).
Differential Revision: https://reviews.llvm.org/D66419
llvm-svn: 369375
The current behavior of shouldForceRelocation forces relocations for the
majority of fixups when relaxation is enabled. This makes sense for
fixups which incorporate symbols but is unnecessary for simple data
fixups where the fixup target is already resolved to an absolute value.
Differential Revision: https://reviews.llvm.org/D63404
Patch by Edward Jones.
llvm-svn: 369257
Only in public interfaces that have not yet been converted should there remain
registers with unsigned type.
Differential Revision: https://reviews.llvm.org/D66252
llvm-svn: 369114
This patch allows symbols followed by an expression for an offset to be
parsed as bare symbols.
Differential Revision: https://reviews.llvm.org/D57332
llvm-svn: 369097
This allows arguments with the constraint A to be lowered to input nodes
for RISC-V, which implies a memory address stored in a register.
This patch adds the minimal amount of code required to get operands with
the right constraints to compile.
https://reviews.llvm.org/D54296
llvm-svn: 369095
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
Summary:
Ana Pazos reported a bug where we were not checking that an APInt would
fit into 64-bits before calling `getSExtValue()`. This caused asserts when
compiling large constants, such as i128s, as happens when compiling compiler-rt.
This patch adds a testcase and makes the callback less error-prone.
Reviewers: apazos, asb, luismarques
Reviewed By: luismarques
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66081
llvm-svn: 368572
Summary:
Clang will replace references to registers using ABI names in inline
assembly constraints with references to architecture names, but other
frontends do not. LLVM uses the regular assembly parser to parse inline asm,
so inline assembly strings can contain references to registers using their
ABI names.
This patch adds support for parsing constraints using either the ABI name or
the architectural register name. This means we do not need to implement the
ABI name replacement code in every single frontend, especially those like
Rust which are a very thin shim on top of LLVM IR's inline asm, and that
constraints can more closely match the assembly strings they refer to.
Reviewers: asb, simoncook
Reviewed By: simoncook
Subscribers: hiraditya, rbar, johnrusso, JDevlieghere, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65947
llvm-svn: 368303
Summary:
Currently the RISC-V backend does not realign the stack. This can be an issue even for the RV32I/RV64I ABIs (where the stack is 16-byte aligned), though is rare. It will be much more comment with RV32E (though the alignment requirements for common data types remain under-documented...).
This patch adds minimal support for stack realignment. It should cope with large realignments. It will error out if the stack needs realignment and variable sized objects are present.
It feels like a lot of the code like getFrameIndexReference and determineFrameLayout could be refactored somehow, as right now it feels fiddly and brittle. We also seem to allocate a lot more memory than GCC does for equivalent C code.
Reviewers: asb
Reviewed By: asb
Subscribers: wwei, jrtc27, s.egerton, MaskRay, Jim, lenary, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62007
llvm-svn: 368300
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, jfb, jakehehrlich
Reviewed By: jfb
Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65514
llvm-svn: 367828
Summary:
GCC Accepts both (reg) and 0(reg) for atomic instruction memory
operands. These instructions do not allow for an offset in their
encoding, so in the latter case, the 0 is silently dropped.
Due to how we have structured the RISCVAsmParser, the easiest way to add
support for parsing this offset is to add a custom AsmOperand and
parser. This parser drops all the parens, and just keeps the register.
This commit also adds a custom printer for these operands, which matches
the GCC canonical printer, printing both `(a0)` and `0(a0)` as `(a0)`.
Reviewers: asb, lewis-revill
Reviewed By: asb
Subscribers: s.egerton, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65205
llvm-svn: 367553
Summary:
This adds the 'f' inline assembly constraint, as supported by GCC. An
'f'-constrained operand is passed in a floating point register. Exactly
which kind of floating-point register (32-bit or 64-bit) is decided
based on the operand type and the available standard extensions (-f and
-d, respectively).
This patch adds support in both the clang frontend, and LLVM itself.
Reviewers: asb, lewis-revill
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65500
llvm-svn: 367403
This adds the required extension to RISC-V's getRegForInlineAsmConstraint
in order to be able to correctly distringuish between the 32 and 64-bit
floating point registers when the generic fX name appears in inlineasm
clobber contraints. It also adds a check to validate that callee saved
floating point registers are only saved in this case when a hard-float
ABI is selected.
Differential Revision: https://reviews.llvm.org/D64751
llvm-svn: 367397
For llvm/test/MC/RISCV/rv64i-aliases-invalid.s, UBSan reports:
lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp:371:9: runtime error:
load of value 3879186881, which is not a valid value for type
'RISCVMCExpr::VariantKind'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp:371:9 in
It turns out that evaluateConstantImm does not set `VK` and it remains
unitialized when doing comparisons in `isImmXLenLI()`.
Differential Revision: https://reviews.llvm.org/D65347
llvm-svn: 367230
It is necessary to generate fixups in .debug_frame or .eh_frame as
relaxation is enabled due to the address delta may be changed after
relaxation.
There is an opcode with 6-bits data in debug frame encoding. So, we
also need 6-bits fixup types.
Differential Revision: https://reviews.llvm.org/D58335
llvm-svn: 366524
It is necessary to generate fixups in .debug_frame or .eh_frame as
relaxation is enabled due to the address delta may be changed after
relaxation.
There is an opcode with 6-bits data in debug frame encoding. So, we
also need 6-bits fixup types.
Differential Revision: https://reviews.llvm.org/D58335
llvm-svn: 366442
The canonical GNU form of JALR resembles a load/store instruction rather
than placing the immediate offset as a separate argument, so match this
behaviour. Also add parser-only aliases for the three-operand form, and
add other shorter aliases also emitted by GNU tools.
Differential Revision: https://reviews.llvm.org/D55277
Patch by James Clarke.
llvm-svn: 366179
RISCVAsmBackend::shouldInsertExtraNopBytesForCodeAlign() assumed that the
align specified would be greater than or equal to the minimum nop length, but
that is not always the case - for example if a user specifies ".align 0" in
assembly.
Differential Revision: https://reviews.llvm.org/D63274
Patch by Edward Jones.
llvm-svn: 366176
The bool result of shouldInsertExtraNopBytesForCodeAlign() is not checked but
the returned nop count is unconditionally read even though it could be
uninitialized.
Differential Revision: https://reviews.llvm.org/D63285
Patch by Edward Jones.
llvm-svn: 366175
Since PseudoCALL defines AsmString, it can be generated from assembly,
and so code-gen patterns should be defined separately to be consistent
with the style of the RISCV backend. Other pseudo-instructions exist
that have code-gen patterns defined directly, but these instructions are
purely for code-gen and cannot be written in assembly.
Differential Revision: https://reviews.llvm.org/D64012
Patch by James Clarke.
llvm-svn: 366174
Previously, this function didn't check the IsPCRel argument. But doing so is a
useful check for errors, and also seemingly necessary for FK_Data_4 (which we
produce a R_RISCV_32_PCREL relocation for if IsPCRel).
Other than R_RISCV_32_PCREL, this should be NFC. Future exception handling
related patches will include tests that capture this behaviour.
llvm-svn: 366172
Summary:
Useful for jumps, such as `j .`.
I am not sure who should review this. Do not hesitate to change the reviewers if needed.
Reviewers: asb, jrtc27, lenary
Reviewed By: lenary
Subscribers: MaskRay, lenary, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63669
Patch by John LLVM (JohnLLVM)
llvm-svn: 365881
Summary:
There was an error being thrown from isDesirableToCommuteWithShift in
some tests. This was tracked down to the method being called before
legalisation, with an extended value type, not a machine value type.
In the case I diagnosed, the error was only hit with an instruction sequence
involving `i24`s in the add and shift. `i24` is not a Machine ValueType, it is
instead an Extended ValueType which was causing the issue.
I have added a test to cover this case, and fixed the error in the callback.
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64425
llvm-svn: 365511
APInt::getSExtValue will assert if getMinSignedBits() > 64. This can happen,
for instance, if examining an i128. Avoid this assertion by checking
Imm.getMinSignedBits() <= 64 before doing
getTLI()->isLegalAddImmediate(Imm.getSExtValue()). We could directly check
getMinSignedBits() <= 12 but it seems better to reuse the isLegalAddImmediate
helper for this.
Differential Revision: https://reviews.llvm.org/D64390
llvm-svn: 365462
Defines RISCV registers for getExceptionPointerRegister() and
getExceptionSelectorRegister().
Differential Revision: https://reviews.llvm.org/D63411
Patch by Edward Jones.
Modified by Alex Bradbury to add CHECK lines to exception-pointer-register.ll.
llvm-svn: 365301
On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock
cycles executed by the core, from an arbitrary point in the past. This
matches the intended semantics of `@llvm.readcyclecounter()`, which we
currently leave to the default lowering (to the constant 0).
With this patch, we will now correctly lower this intrinsic to the
intended semantics, using the user-space instruction `rdcycle`. On
64-bit targets, we can directly lower to this instruction.
On 32-bit targets, we need to do more, as `rdcycle` only returns the low
32-bits of the `cycle` CSR. In this case, we perform a custom lowering,
based on the PowerPC lowering, using `rdcycleh` to obtain the high
32-bits of the `cycle` CSR. This custom lowering inserts a new basic
block which detects overflow in the high 32-bits of the `cycle` CSR
during reading (because multiple instructions are required to read). The
emitted assembly matches the suggested assembly in the RISC-V
specification.
Differential Revision: https://reviews.llvm.org/D64125
llvm-svn: 365201
This patch adds the PseudoCALLReg instruction which allows using an
explicit register operand as the destination for the return address.
GCC can successfully parse this form of the call instruction, which
would be used for calls to functions which do not use ra as the return
address register, such as the __riscv_save libcalls. This patch forms
the first part of an implementation of -msave-restore for RISC-V.
Differential Revision: https://reviews.llvm.org/D62685
llvm-svn: 364403
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().
llvm-svn: 364191
Summary:
LLVM Allows Targets to provide information that guides optimisations
made to LLVM IR. This is done with callbacks on a TargetTransformInfo object.
This patch adds a TargetTransformInfo class for RISC-V. This will allow us to
implement RISC-V specific callbacks as they become necessary.
This commit also adds the getIntImmCost callbacks, and tests them with a simple
constant hoisting test. Our immediate costs are on the conservative side, for
the moment, but we prevent hoisting in most circumstances anyway.
Previous review was on D63007
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63433
llvm-svn: 364046
This patch allows immediates (and CSR alias immediates) which start with
a tilde token or an exclaim (!) token to be parsed as intended.
Differential Revision: https://reviews.llvm.org/D57320
llvm-svn: 363783
Since the parser attempts to parse an operand as a register with
parentheses before parsing it as an immediate, immediates in
parentheses should not be parsed by parseRegister. However in the case
where the immediate does not start with an identifier, the LParen is not
unlexed and so the RParen causes an unexpected token error.
This patch adds the missing UnLex, and modifies the existing UnLex to
not use a buffered token, as it should always be unlexing an LParen.
Differential Revision: https://reviews.llvm.org/D57319
llvm-svn: 363782
This patch adds lowering for global TLS addresses for the TLS models of
InitialExec, GlobalDynamic, LocalExec and LocalDynamic.
LocalExec support required using a 4-operand add instruction, which uses
the fourth operand to express a relocation on the symbol. The necessary
fixup is emitted when the instruction is emitted.
Differential Revision: https://reviews.llvm.org/D55305
llvm-svn: 363771
Summary:
DAGCombine will normally turn a `(shl (add x, c1), c2)` into `(add (shl x, c2), c1 << c2)`, where `c1` and `c2` are constants. This can be prevented by a callback in TargetLowering.
On RISC-V, materialising the constant `c1 << c2` can be more expensive than materialising `c1`, because materialising the former may take more instructions, and may use a register, where materialising the latter would not.
This patch implements the hook in RISCVTargetLowering to prevent this transform, in the cases where:
- `c1` fits into the immediate field in an `addi` instruction.
- `c1` takes fewer instructions to materialise than `c1 << c2`.
In future, DAGCombine could do the check to see whether `c1` fits into an add immediate, which might simplify more targets hooks than just RISC-V.
Reviewers: asb, luismarques, efriedma
Reviewed By: asb
Subscribers: xbolva00, lebedev.ri, craig.topper, lewis-revill, Jim, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62857
llvm-svn: 363736
This patch adds support for generating calls through the procedure
linkage table where required for a given ExternalSymbol or GlobalAddress
callee.
Differential Revision: https://reviews.llvm.org/D55304
llvm-svn: 363686
Some GEPs were not being split, presumably because that split would just be
undone by the DAGCombiner. Not performing those splits can prevent important
optimizations, such as preventing the element indices / member offsets from
being (partially) folded into load/store instruction immediates. This patch:
- Makes the splits also occur in the cases where the base address and the GEP
are in the same BB.
- Ensures that the DAGCombiner doesn't reassociate them back again.
Differential Revision: https://reviews.llvm.org/D60294
llvm-svn: 363544
In order to generate correct debug frame information, it needs to
generate CFI information in prologue and epilog.
Differential Revision: https://reviews.llvm.org/D61773
llvm-svn: 363120
This patch allows lowering of PIC addresses by using PC-relative
addressing for DSO-local symbols and accessing the address through the
global offset table for non-DSO-local symbols.
Differential Revision: https://reviews.llvm.org/D55303
llvm-svn: 363058
This validates and lowers arguments to inline asm nodes which have the
constraints I, J & K, with the following semantics (equivalent to GCC):
I: Any 12-bit signed immediate.
J: Immediate integer zero only.
K: Any 5-bit unsigned immediate.
Differential Revision: https://reviews.llvm.org/D54093
llvm-svn: 363054
This reverts r362990 (git commit 374571301d)
This was causing linker warnings on Darwin:
ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.
llvm-svn: 363028
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.
A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.
This patch reduces the number of public symbols in libLLVM.so by about
25%. This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so
One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.
Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278
Reviewers: chandlerc, beanz, mgorny, rnk, hans
Reviewed By: rnk, hans
Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D54439
llvm-svn: 362990
Summary:
This allows some integer bitwise operations to instead be performed by
hardware fp instructions. This is correct because the RISC-V spec
requires the F and D extensions to use the IEEE-754 standard
representation, and fp register loads and stores to be bit-preserving.
This is tested against the soft-float ABI, but with hardware float
extensions enabled, so that the tests also ensure the optimisation also
fires in this case.
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62900
llvm-svn: 362790
This patch adds the pseudo instructions la.tls.ie and la.tls.gd, used in
the initial-exec and global-dynamic TLS models respectively when
addressing a global. The pseudo instructions are expanded in the
assembly parser.
llvm-svn: 361499
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.
llvm-svn: 360732
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure
llvm-svn: 360505
This patch adds support for parsing and assembling the %tls_ie_pcrel_hi
and %tls_gd_pcrel_hi modifiers.
Differential Revision: https://reviews.llvm.org/D55342
llvm-svn: 358994
When not optimizing for minimum size (-Oz) we custom lower wide shifts
(SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall.
Differential Revision: https://reviews.llvm.org/D59477
llvm-svn: 358498
RISCVMCCodeEmitter::expandAddTPRel asserts that the second operand must be
x4/tp. As we are not currently checking this in the RISCVAsmParser, the assert
is easy to trigger due to wrong assembly input.
This patch does a late check of this constraint.
An alternative could be using a singleton register class for x4/tp similar to
the current one for sp. Unfortunately it does not result in a good diagnostic.
Because add is an overloaded mnemonic, if no matching is possible, the
diagnostic of the first failing alternative seems to be used as the diagnostic
itself. This means that this case the %tprel_add is diagnosed as an invalid
operand (because the real add instruction only has 3 operands).
Differential Revision: https://reviews.llvm.org/D60528
llvm-svn: 358183
Because of gp = sdata_start_address + 0x800, gp with signed twelve-bit offset
could covert most of the small data section. Linker relaxation could transfer
the multiple data accessing instructions to a gp base with signed twelve-bit
offset instruction.
Differential Revision: https://reviews.llvm.org/D57493
llvm-svn: 358150
Summary:
The InlineAsm::AsmDialect is only required for X86; no architecture
makes use of it and as such it gets passed around between arch-specific
and general code while being unused for all architectures but X86.
Since the AsmDialect is queried from a MachineInstr, which we also pass
around, remove the additional AsmDialect parameter and query for it deep
in the X86AsmPrinter only when needed/as late as possible.
This refactor should help later planned refactors to AsmPrinter, as this
difference in the X86AsmPrinter makes it harder to make AsmPrinter more
generic.
Reviewers: craig.topper
Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60488
llvm-svn: 358101
This patch adds support in the MC layer for parsing and assembling the
4-operand add instruction needed for TLS addressing. This also involves
parsing the %tprel_hi, %tprel_lo and %tprel_add operand modifiers.
Differential Revision: https://reviews.llvm.org/D55341
llvm-svn: 357698
This patch allows symbols appended with @plt to parse and assemble with the
R_RISCV_CALL_PLT relocation.
Differential Revision: https://reviews.llvm.org/D55335
Patch by Lewis Revill.
llvm-svn: 357470
This patch replaces the addition of VK_RISCV_CALL in RISCVMCCodeEmitter by
creating the RISCVMCExpr when tail/call are parsed, or in the codegen case
when the callee symbols are created.
This required adding a new CallSymbol operand to allow only adding
VK_RISCV_CALL to tail/call instructions.
This patch will allow further expansion of parsing and codegen to easily
include PLT symbols which must generate the R_RISCV_CALL_PLT relocation.
Differential Revision: https://reviews.llvm.org/D55560
Patch by Lewis Revill.
llvm-svn: 357396
This patch adds an implementation of a PC-relative addressing sequence to be
used when -mcmodel=medium is specified. With absolute addressing, a 'medium'
codemodel may cause addresses to be out of range. This is because while
'medium' implies a 2 GiB addressing range, this 2 GiB can be at any offset as
opposed to 'small', which implies the first 2 GiB only.
Note that LLVM/Clang currently specifies code models differently to GCC, where
small and medium imply the same functionality as GCC's medlow and medany
respectively.
Differential Revision: https://reviews.llvm.org/D54143
Patch by Lewis Revill.
llvm-svn: 357393
Adds a `seto` pattern expansion. Without it the lowerings of `fcmp one` and
`fcmp ord` would be inefficient due to an unoptimized double negation.
Differential Revision: https://reviews.llvm.org/D59699
llvm-svn: 357378
A pcrel_lo will point to the associated pcrel_hi fixup which in turn points to
the real target. RISCVMCExpr::evaluatePCRelLo will work around this
indirection in order to allow the fixup to be evaluate properly. However, if
relocations are forced (e.g. due to linker relaxation is enabled) then its
evaluation is undesired and will result in a relocation with the wrong target.
This patch modifies evaluatePCRelLo so it will not try to evaluate if the
fixup will be forced as a relocation. A new helper method is added to
RISCVAsmBackend to query this.
Differential Revision: https://reviews.llvm.org/D59686
llvm-svn: 357374
This patch adds support for the RISC-V hard float ABIs, building on top of
rL355771, which added basic target-abi parsing and MC layer support. It also
builds on some re-organisations and expansion of the upstream ABI and calling
convention tests which were recently committed directly upstream.
A number of aspects of the RISC-V float hard float ABIs require frontend
support (e.g. flattening of structs and passing int+fp for fp+fp structs in a
pair of registers), and will be addressed in a Clang patch.
As can be seen from the tests, it would be worthwhile extending
RISCVMergeBaseOffsets to handle constant pool as well as global accesses.
Differential Revision: https://reviews.llvm.org/D59357
llvm-svn: 357352
The SplitF64 node is used on RV32D to convert an f64 directly to a pair of i32
(necessary as bitcasting to i64 isn't legal). When performed on a ConstantFP,
this will result in a FP load from the constant pool followed by a store to
the stack and two integer loads from the stack (necessary as there is no way
to directly move between f64 FPRs and i32 GPRs on RV32D). It's always cheaper
to just materialise integers for the lo and hi parts of the FP constant, so do
that instead.
llvm-svn: 357341
Adds two patterns to improve the codegen of GPR value comparisons with small
constants. Instead of first loading the constant into another register and then
doing an XOR of those registers, these patterns directly use the constant as an
XORI immediate.
llvm-svn: 356990
The RISC-V ISA defines RV32E as an alternative "base" instruction set
encoding, that differs from RV32I by having only 16 rather than 32 registers.
This patch adds basic definitions for RV32E as well as MC layer support
(assembling, disassembling) and tests. The only supported ABI on RV32E is
ILP32E.
Add a new RISCVFeatures::validate() helper to RISCVUtils which can be called
from codegen or MC layer libraries to validate the combination of TargetTriple
and FeatureBitSet. Other targets have similar checks (e.g. erroring if SPE is
enabled on PPC64 or oddspreg + o32 ABI on Mips), but they either duplicate the
checks (Mips), or fail to check for both codegen and MC codepaths (PPC).
Codegen for the ILP32E ABI support and RV32E codegen are left for a future
patch/patches.
Differential Revision: https://reviews.llvm.org/D59470
llvm-svn: 356744
This patch optimizes the emission of a sequence of SELECTs with the same
condition, avoiding the insertion of unnecessary control flow. Such a sequence
often occurs when a SELECT of values wider than XLEN is legalized into two
SELECTs with legal types. We have identified several use cases where the
SELECTs could be interleaved with other instructions. Therefore, we extend the
sequence to include non-SELECT instructions if we are able to detect that the
non-SELECT instructions do not impact the optimization.
This patch supersedes https://reviews.llvm.org/D59096, which attempted to
address this issue by introducing a new SelectionDAG node. Hat tip to Eli
Friedman for his feedback on how to best handle this issue.
Differential Revision: https://reviews.llvm.org/D59355
Patch by Luís Marques.
llvm-svn: 356741
Indicates in the TargetLowering interface that conversions from CC logic to
bitwise logic are allowed. Adds tests that show the benefit when optimization
opportunities are detected. Also adds tests that show that when the optimization
is not applied correct code is generated (but opportunities for other
optimizations remain).
Differential Revision: https://reviews.llvm.org/D59596
Patch by Luís Marques.
llvm-svn: 356740
RISCVAsmParser::ParseRegister is called from AsmParser::parseRegisterOrNumber,
which in turn is called when processing CFI directives. The RISC-V
implementation wasn't setting RegNo, and so was incorrect. This patch address
that and adds cfi directive tests that demonstrate the fix. A follow-up patch
will factor out the register parsing logic shared between ParseRegister and
parseRegister.
llvm-svn: 356329
The CSR renaming further prepares the way for an upcoming patch adding support for more
RISC-V ABIs.
Modify RISCVRegisterInfo::getCalleeSavedRegs and
RISCVRegisterInfo::getReservedRegs to do MF->getSubtarget<RISCVSubtarget>()
once rather than multiple times.
llvm-svn: 356123
This follows similar logic in the ARM and Mips backends, and allows the free
use of s0 in functions without a dedicated frame pointer. The changes in
callee-saved-gprs.ll most clearly show the effect of this patch.
llvm-svn: 356063
RISCVDisassembler was incorrectly using sizeof(Arr) when it should have used
sizeof(Arr)/sizeof(Arr[0]). Update to use array_lengthof instead.
llvm-svn: 356035
If a symbol points to the end of a fragment, instead of searching for
fixups in that fragment, search in the next fragment.
Fixes spurious assembler error with subtarget change next to "la"
pseudo-instruction, or expanded equivalent.
Alternate proposal to fix the problem discussed in
https://reviews.llvm.org/D58759.
Testcase by Ana Pazos.
Differential Revision: https://reviews.llvm.org/D58943
llvm-svn: 355946
AtomicCmpSwapWithSuccess is legalised into an AtomicCmpSwap plus a comparison.
This requires an extension of the value which, by default, is a
zero-extension. When we later lower AtomicCmpSwap into a PseudoCmpXchg32 and then expanded in
RISCVExpandPseudoInsts.cpp, the lr.w instruction does a sign-extension.
This mismatch of extensions causes the comparison to fail when the compared
value is negative. This change overrides TargetLowering::getExtendForAtomicOps
for RISC-V so it does a sign-extension instead.
Differential Revision: https://reviews.llvm.org/D58829
Patch by Ferran Pallarès Roca.
llvm-svn: 355869
The RISC-V Assembly Programmer's Manual defines fp as another alias of x8.
However, our tablegen rules only recognise s0. This patch adds fp as another
alias of x8. GCC also accepts fp.
Differential Revision: https://reviews.llvm.org/D59209
Patch by Ferran Pallarès Roca.
llvm-svn: 355867
This patch adds proper handling of -target-abi, as accepted by llvm-mc and
llc. Lowering (codegen) for the hard-float ABIs will follow in a subsequent
patch. However, this patch does add MC layer support for the hard float and
RVE ABIs (emission of the appropriate ELF flags
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#-file-header).
ABI parsing must be shared between codegen and the MC layer, so we add
computeTargetABI to RISCVUtils. A warning will be printed if an invalid or
unrecognized ABI is given.
Differential Revision: https://reviews.llvm.org/D59023
llvm-svn: 355771
Summary:
Floating-point CSRs should be accessible even when F extension is not enabled.
But pseudo instructions that access floating point CSRs still require the F extension.
GNU tools already implement this behavior. RISC-V spec is pending update to reflect
this behavior and to extend it to pseudo instructions that access floating point CSRs.
Reviewers: asb
Reviewed By: asb
Subscribers: asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, llvm-commits
Differential Revision: https://reviews.llvm.org/D58932
llvm-svn: 355753
Allow load/store instructions with implied zero offset for compatibility with
GNU assembler.
Differential Revision: https://reviews.llvm.org/D57141
Patch by James Clarke.
llvm-svn: 354581
Summary:
Those pseudo-instructions are making load/store instructions able to
load/store from/to a symbol, and its always using PC-relative addressing
to generating a symbol address.
Reviewers: asb, apazos, rogfer01, jrtc27
Differential Revision: https://reviews.llvm.org/D50496
llvm-svn: 354430
This patch also introduces the emitAuipcInstPair helper, which is then used
for both emitLoadAddress and emitLoadLocalAddress.
Differential Revision: https://reviews.llvm.org/D55325
Patch by James Clarke.
llvm-svn: 354111
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.
This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.
There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.
Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
Differential Revision: https://reviews.llvm.org/D53765
llvm-svn: 353563
This patch:
* Adds necessary RV64D codegen patterns
* Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI)
Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type.
Differential Revision: https://reviews.llvm.org/D53237
llvm-svn: 352833
This requires a little extra work due tothe fact i32 is not a legal type. When
call lowering happens post-legalisation (e.g. when an intrinsic was inserted
during legalisation). A bitcast from f32 to i32 can't be introduced. This is
similar to the challenges with RV32D. To handle this, we introduce
target-specific DAG nodes that perform bitcast+anyext for f32->i64 and
trunc+bitcast for i64->f32.
Differential Revision: https://reviews.llvm.org/D53235
llvm-svn: 352807
Linker relaxation may change code size. We need to fix up the alignment
of alignment directive in text section by inserting Nops and R_RISCV_ALIGN
relocation type. So then linker could satisfy the alignment by removing Nops.
To do this:
1. Add shouldInsertExtraNopBytesForCodeAlign target hook to calculate
the Nops we need to insert.
2. Add shouldInsertFixupForCodeAlign target hook to insert
R_RISCV_ALIGN fixup type.
Differential Revision: https://reviews.llvm.org/D47755
llvm-svn: 352616
DAGCombiner::visitBITCAST will perform:
fold (bitconvert (fneg x)) -> (xor (bitconvert x), signbit)
fold (bitconvert (fabs x)) -> (and (bitconvert x), (not signbit))
As shown in double-bitmanip-dagcombines.ll, this can be advantageous. But
RV32FD doesn't use bitcast directly (as i64 isn't a legal type), and instead
uses RISCVISD::SplitF64. This patch adds an equivalent DAG combine for
SplitF64.
llvm-svn: 352247
Follow the same custom legalisation strategy as used in D57085 for
variable-length shifts (see that patch summary for more discussion). Although
we may lose out on some late-stage DAG combines, I think this custom
legalisation strategy is ultimately easier to reason about.
There are some codegen changes in rv64m-exhaustive-w-insts.ll but they are all
neutral in terms of the number of instructions.
Differential Revision: https://reviews.llvm.org/D57096
llvm-svn: 352171
The previous DAG combiner-based approach had an issue with infinite loops
between the target-dependent and target-independent combiner logic (see
PR40333). Although this was worked around in rL351806, the combiner-based
approach is still potentially brittle and can fail to select the 32-bit shift
variant when profitable to do so, as demonstrated in the pr40333.ll test case.
This patch instead introduces target-specific SelectionDAG nodes for
SHLW/SRLW/SRAW and custom-lowers variable i32 shifts to them. pr40333.ll is a
good example of how this approach can improve codegen.
This adds DAG combine that does SimplifyDemandedBits on the operands (only
lower 32-bits of first operand and lower 5 bits of second operand are read).
This seems better than implementing SimplifyDemandedBitsForTargetNode as there
is no guarantee that would be called (and it's not for e.g. the anyext return
test cases). Also implements ComputeNumSignBitsForTargetNode.
There are codegen changes in atomic-rmw.ll and atomic-cmpxchg.ll but the new
instruction sequences are semantically equivalent.
Differential Revision: https://reviews.llvm.org/D57085
llvm-svn: 352169
Previously we had names like 'Call' or 'Tail'. This potentially clashes with
the naming scheme used elsewhere in RISCVInstrInfo.td. Many other backends
would use names like AArch64call or PPCtail. I prefer the SystemZ approach,
which uses prefixed all-lowercase names. This matches the naming scheme used
for target-independent SelectionDAG nodes.
llvm-svn: 351823
Avoid the infinite loop caused by the target DAG combine converting ANYEXT to
SIGNEXT and the target-independent DAG combine logic converting back to
ANYEXT. Do this by not adding the new node to the worklist.
Committing directly as this definitely doesn't make the problem any worse, and
I intend to follow-up with a patch that avoids this custom combiner logic
altogether and just lowers the i32 operations to a target-specific
SelectionDAG node. This should be easier to reason about and improve codegen
quality in some cases (though may miss out on some later DAG combines).
llvm-svn: 351806
This broke the RISCV build, and even with that fixed, one of the RISCV
tests behaves surprisingly differently with asserts than without,
leaving there no clear test pattern to use. Generally it seems bad for
hte IR to differ substantially due to asserts (as in, an alloca is used
with asserts that isn't needed without!) and nothing I did simply would
fix it so I'm reverting back to green.
This also required reverting the RISCV build fix in r351782.
llvm-svn: 351796
The break isn't strictly needed yet as there is no subsequent entry in the
case. But adding to prevent mistakes further down the road.
llvm-svn: 351785
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
In order to support codegen RV64A, this patch:
* Introduces masked atomics intrinsics for atomicrmw operations and cmpxchg
that use the i64 type. These are ultimately lowered to masked operations
using lr.w/sc.w, but we need to use these alternate intrinsics for RV64
because i32 is not legal
* Modifies RISCVExpandPseudoInsts.cpp to handle PseudoAtomicLoadNand64 and
PseudoCmpXchg64
* Modifies the AtomicExpandPass hooks in RISCVTargetLowering to sext/trunc as
needed for RV64 and to select the i64 intrinsic IDs when necessary
* Adds appropriate patterns to RISCVInstrInfoA.td
* Updates test/CodeGen/RISCV/atomic-*.ll to show RV64A support
This ends up being a fairly mechanical change, as the logic for RV32A is
effectively reused.
Differential Revision: https://reviews.llvm.org/D53233
llvm-svn: 351422
As discussed on llvm-dev
<http://lists.llvm.org/pipermail/llvm-dev/2018-December/128497.html>, we have
to be careful when trying to select the *w RV64M instructions. i32 is not a
legal type for RV64 in the RISC-V backend, so operations have been promoted by
the time they reach instruction selection. Information about whether the
operation was originally a 32-bit operations has been lost, and it's easy to
write incorrect patterns.
Similarly to the variable 32-bit shifts, a DAG combine on ANY_EXTEND will
produce a SIGN_EXTEND if this is likely to result in sdiv/udiv/urem being
selected (and so save instructions to sext/zext the input operands).
Differential Revision: https://reviews.llvm.org/D53230
llvm-svn: 350993
This restores support for selecting the SLLW/SRLW/SRAW instructions, which was
removed in rL348067 as the previous patterns made some unsafe assumptions.
Also see the related llvm-dev discussion
<http://lists.llvm.org/pipermail/llvm-dev/2018-December/128497.html>
Ultimately I didn't introduce a custom SelectionDAG node, but instead added a
DAG combine that inserts an AssertZext i5 on the shift amount for an i32
variable-length shift and also added an ANY_EXTEND DAG-combine which will
instead produce a SIGN_EXTEND for an i32 variable-length shift, increasing the
opportunity to safely select SLLW/SRLW/SRAW.
There are obviously different ways of addressing this (a number discussed in
the llvm-dev thread), so I'd welcome further feedback and comments.
Note that there are now some cases in
test/CodeGen/RISCV/rv64i-exhaustive-w-insts.ll where sraw/srlw/sllw is
selected even though sra/srl/sll could be used without any extra instructions.
Given both are semantically equivalent, there doesn't seem a good reason to
prefer one vs the other. Given that would require more logic to still select
sra/srl/sll in those cases, I've left it preferring the *w variants.
Differential Revision: https://reviews.llvm.org/D56264
llvm-svn: 350992
This further improves compatibility with GNU as, allowing input such as the
following to be assembled:
.equ CONST, 0x123456
li a0, CONST
addi a0, a0, %lo(CONST)
.equ CONST, 1
slli a0, a0, CONST
Note that we don't have perfect compatibility with gas, as it will avoid
emitting a relocation in this case:
addi a0, a0, %lo(CONST2)
.equ CONST2, 0x123456
Thanks to Shiva Chen for suggesting a better way to approach this during review.
Differential Revision: https://reviews.llvm.org/D52298
llvm-svn: 350831
This is a update to D43157 to correctly handle fixup_riscv_pcrel_lo12.
Notable changes:
Rebased onto trunk
Handle and test S-type
Test case pcrel-hilo.s is merged into relocations.s
D43157 description:
VK_RISCV_PCREL_LO has to be handled specially. The MCExpr inside is
actually the location of an auipc instruction with a VK_RISCV_PCREL_HI fixup
pointing to the real target.
Differential Revision: https://reviews.llvm.org/D54029
Patch by Chih-Mao Chen and Michael Spencer.
llvm-svn: 349764
Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd).
The criteria for choosing whether a fused add or subtract is used, as well as
whether the product is negated or not, is whether some of the arguments to the
llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd
instructions were added to avoid the negation being performed using a xor
trick, which prevented the proper FMA forms from being selected and thus
tested.
The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 -
rs3), but they should be correct. The misleading names were inherited from
MIPS, where the negation happens after computing the sum.
The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions,
as that depends on TargetLowering::isFMAFasterthanFMulAndFAdd.
Some comments in the test files about what type of instructions are there
tested were updated, to better reflect the current content of those test
files.
Differential Revision: https://reviews.llvm.org/D54205
Patch by Luís Marques.
llvm-svn: 349023
Adds fatal errors for any target that does not support the Tiny or Kernel
codemodels by rejigging the getEffectiveCodeModel calls.
Differential Revision: https://reviews.llvm.org/D50141
llvm-svn: 348585
As noted by Eli Friedman <https://reviews.llvm.org/D52977?id=168629#1315291>,
the RV64I shift patterns for SLLW/SRLW/SRAW make some incorrect assumptions.
SRAW assumed that (sext_inreg foo, i32) could only be produced when
sign-extended an i32. However, it can be produced by input such as:
define i64 @tricky_ashr(i64 %a, i64 %b) {
%1 = shl i64 %a, 32
%2 = ashr i64 %1, 32
%3 = ashr i64 %2, %b
ret i64 %3
}
It's important not to select sraw in the above case, because sraw only uses
bits lower 5 bits from the shift, while a shift of 32-63 would be valid.
Similarly, the patterns for srlw assumed (and foo, 0xffffffff) would only be
produced when zero-extending a value that was originally i32 in LLVM IR. This
is obviously incorrect.
This patch removes the SLLW/SRLW/SRAW shift patterns for the time being and
adds test cases that would demonstrate a miscompile if the incorrect patterns
were re-added.
llvm-svn: 348067
This patch adds CSR instructions aliases for the cases where the instruction
takes an immediate operand but the alias doesn't have the i suffix. This is
necessary for gas/gcc compatibility.
gas doesn't do a similar conversion for fsflags or fsrm, so this should be
complete.
Differential Revision: https://reviews.llvm.org/D55008
Patch by Luís Marques.
llvm-svn: 347991
This patch adds support for UNIMP in both 32- and 16-bit forms. The 32-bit
form can be seen as a variant of the ECALL/EBREAK/etc. family of instructions.
The 16-bit form is just all zeroes, which isn't a valid RISC-V instruction,
but still follows the 16-bit instruction form (i.e. bits 0-1 != 11).
Until recently unimp was undocumented and supported just by binutils, which
printed unimp for either the 16 or 32-bit form. Both forms are now documented
<https://github.com/riscv/riscv-asm-manual/pull/20> and binutils now supports
c.unimp <https://sourceware.org/ml/binutils-cvs/2018-11/msg00179.html>.
Differential Revision: https://reviews.llvm.org/D54316
Patch by Luís Marques.
llvm-svn: 347988
DAGTypeLegalizer::PromoteSetCCOperands currently prefers to zero-extend
operands when it is able to do so. For some targets this is more expensive
than a sign-extension, which is also a valid choice. Introduce the
isSExtCheaperThanZExt hook and use it in the new SExtOrZExtPromotedInteger
helper. On RISC-V, we prefer sign-extension for FromTy == MVT::i32 and ToTy ==
MVT::i64, as it can be performed using a single instruction.
Differential Revision: https://reviews.llvm.org/D52978
llvm-svn: 347977
As discussed in the RFC
<http://lists.llvm.org/pipermail/llvm-dev/2018-October/126690.html>, 64-bit
RISC-V has i64 as the only legal integer type. This patch introduces patterns
to support codegen of the new instructions
introduced in RV64I: addiw, addiw, subw, sllw, slliw, srlw, srliw, sraw,
sraiw, ld, sd.
Custom selection code is needed for srliw as SimplifyDemandedBits will remove
lower bits from the mask, meaning the obvious pattern won't work:
def : Pat<(sext_inreg (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt), i32),
(SRLIW GPR:$rs1, uimm5:$shamt)>;
This is sufficient to compile and execute all of the GCC torture suite for
RV64I other than those files using frameaddr or returnaddr intrinsics
(LegalizeDAG doesn't know how to promote the operands - a future patch
addresses this).
When promoting i32 sltu/sltiu operands, it would be more efficient to use
sign-extension rather than zero-extension for RV64. A future patch adds a hook
to allow this.
Differential Revision: https://reviews.llvm.org/D52977
llvm-svn: 347973
Utilise a similar ('late') lowering strategy to D47882. The changes to
AtomicExpandPass allow this strategy to be utilised by other targets which
implement shouldExpandAtomicCmpXchgInIR.
All cmpxchg are lowered as 'strong' currently and failure ordering is ignored.
This is conservative but correct.
Differential Revision: https://reviews.llvm.org/D48131
llvm-svn: 347914
This adds support in the RISCVAsmParser the storing of Subtarget feature bits to a stack so that they can be pushed/popped to enable/disable multiple features at once.
Differential Revision: https://reviews.llvm.org/D46424
Patch by Lewis Revill.
llvm-svn: 347774
The RISC-V ISA manual was updated on 2018-11-07 (commit 00557c3) to define a
new compressed instruction format, RVC format CA (no actual instruction
encodings were changed). This patch updates the RISC-V backend to define the
new format, and to use it in the relevant instructions.
Differential Revision: https://reviews.llvm.org/D54302
Patch by Luís Marques.
llvm-svn: 347043
This commit introduces support for materialising 64-bit constants for RV64I,
making use of the RISCVMatInt::generateInstSeq helper in order to share logic
for immediate materialisation with the MC layer (where it's used for the li
pseudoinstruction).
test/CodeGen/RISCV/imm.ll is updated to test RV64, and gains new 64-bit
constant tests. It would be preferable if anyext constant returns were sign
rather than zero extended (see PR39092). This patch simply adds an explicit
signext to the returns in imm.ll.
Further optimisations for constant materialisation are possible, most notably
for mask-like values which can be generated my loading -1 and shifting right.
A future patch will standardise on the C++ codepath for immediate selection on
RV32 as well as RV64, and then add further such optimisations to
RISCVMatInt::generateInstSeq in order to benefit both RV32 and RV64 for
codegen and li expansion.
Differential Revision: https://reviews.llvm.org/D52962
llvm-svn: 347042
C.EBREAK was defined with hasSideEffects = 0, which is incorrect and
inconsistent with the non-compressed instruction form. This patch corrects
this oversight.
This wouldn't cause codegen issues, as compressed instructions are only ever
generated by converting the non-compressed form as an MCInst. But having
correct flags is still worthwhile.
Differential Revision: https://reviews.llvm.org/D54256
Patch by Luís Marques.
llvm-svn: 346959
Mark the FREM SelectionDAG node as Expand, which is necessary in order to
support the frem IR instruction on RISC-V. This is expanded into a library
call. Adds the corresponding test. Previously, this would have triggered an
assertion at instruction selection time.
Differential Revision: https://reviews.llvm.org/D54159
Patch by Luís Marques.
llvm-svn: 346958
Logic to load 32-bit and 64-bit immediates is currently present in
RISCVAsmParser::emitLoadImm in order to support the li pseudoinstruction. With
the introduction of RV64 codegen, there is a greater benefit of sharing
immediate materialisation logic between the MC layer and codegen. The
generateInstSeq helper allows this by producing a vector of simple structs
representing the chosen instructions. This can then be consumed in the MC
layer to produce MCInsts or at instruction selection time to produce
appropriate SelectionDAG node. Sharing this logic means that both the li
pseudoinstruction and codegen can benefit from future optimisations, and
that this logic can be used for materialising constants during RV64 codegen.
This patch does contain a behaviour change: addi will now be produced on RV64
when no lui is necessary to materialise the constant. In that case addiw takes
x0 as the source register, so is semantically identical to addi.
Differential Revision: https://reviews.llvm.org/D52961
llvm-svn: 346937
This extends the .option support from D45864 to enable/disable the relax
feature flag from D44886
During parsing of the relax/norelax directives, the RISCV::FeatureRelax
feature bits of the SubtargetInfo stored in the AsmParser are updated
appropriately to reflect whether relaxation is currently enabled in the
parser. When an instruction is parsed, the parser checks if relaxation is
currently enabled and if so, gets a handle to the AsmBackend and sets the
ForceRelocs flag. The AsmBackend uses a combination of the original
RISCV::FeatureRelax feature bits set by e.g -mattr=+/-relax and the
ForceRelocs flag to determine whether to emit relocations for symbol and
branch diffs. Diff relocations should therefore only not be emitted if the
relax flag was not set on the command line and no instruction was ever parsed
in a section with relaxation enabled to ensure correct diffs are emitted.
Differential Revision: https://reviews.llvm.org/D46423
Patch by Lewis Revill.
llvm-svn: 346655
A number of intrinsics, such as llvm.sin.f32, would result in a failure to
select. This patch adds expansions for the relevant selection DAG nodes, as
well as exhaustive testing for all f32 and f64 intrinsics.
The codegen for FMA remains a TODO item, pending support for the various
RISC-V FMA instruction variants.
The llvm.minimum.f32.* and llvm.maximum.* tests are commented-out, pending
upstream support for target-independent expansion, as discussed in
http://lists.llvm.org/pipermail/llvm-dev/2018-November/127408.html.
Differential Revision: https://reviews.llvm.org/D54034
Patch by Luís Marques.
llvm-svn: 346034
SelectionDAGBuilder::visitShift will always zero-extend a shift amount when it
is promoted to the ShiftAmountTy. This results in zero-extension (masking)
which is unnecessary for RISC-V as the shift operations only read the lower 5
or 6 bits (RV32 or RV64).
I initially proposed adding a getExtendForShiftAmount hook so the shift amount
can be any-extended (D52975). @efriedma explained this was unsafe, so I have
instead eliminate the unnecessary and operations at instruction selection time
in a manner similar to X86InstrCompiler.td.
Differential Revision: https://reviews.llvm.org/D53224
llvm-svn: 344432
Summary:
Instruction with 0 in fence field being disassembled as fence , iorw.
Printing "unknown" to match GAS behavior.
This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.
Reviewers: asb
Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, asb
Differential Revision: https://reviews.llvm.org/D51828
llvm-svn: 344309
A pattern was present for addi rd, x0, simm6 but not addiw which is
semantically identical when the source register is x0. This patch addresses
that, and the benefit can be seen in rv64c-aliases-valid.s.
llvm-svn: 343911
lowerGlobalAddress, lowerBlockAddress, and insertIndirectBranch contain
overzealous checks for is64Bit. These functions are all safe as-implemented
for RV64.
llvm-svn: 343781
f32 values passed on the stack would previously cause an assertion in
unpackFromMemLoc.. This would only trigger in the presence of the F extension
making f32 a legal type. Otherwise the f32 would be legalized.
This patch fixes that by keeping LocVT=f32 when a float is passed on the
stack. It also adds test coverage for this case, and tests that also
demonstrate lw/sw/flw/fsw will be selected when most profitable. i.e. there is
no unnecessary i32<->f32 conversion in registers.
llvm-svn: 343756
r343712 performed this optimisation during instruction selection. As Eli
Friedman pointed out in post-commit review, implementing this as a DAGCombine
might allow opportunities for further optimisations.
llvm-svn: 343741
There was some duplicated logic for using the LocInfo of a CCValAssign in
order to convert from the ValVT to LocVT or vice versa. Resolve this by
factoring out convertLocVTFromValVT from unpackFromRegLoc. Also rename
packIntoRegLoc to the more appropriate convertValVTToLocVT and call these
helper functions consistently.
llvm-svn: 343737
Although we can't write a tablegen pattern to remove redundant
splitf64+buildf64 pairs due to the multiple return values, we can handle it
with some C++ selection code. This is simpler than removing them after
instruction selection through RISCVDAGToDAGISel::PostprocessISelDAG, as was
done previously.
llvm-svn: 343712
The patterns as defined are correct only when XLen==32.
This is another preparatory patch for a set of patches that flesh out RV64
codegen.
llvm-svn: 343679
1. brcond operates on an condition.
2. atomic_fence and the pseudo AMO instructions should all take xlen immediates
This allows the same definitions and patterns to work for RV64 (XLenVT==i64).
llvm-svn: 343678
This is a trivial refactoring that I'm committing now as it makes a patch I'm
about to post for review easier to follow. There is some overlap between
evaluateConstantImm and addExpr in RISCVAsmParser. This patch allows
evaluateConstantImm to be reused from addExpr to remove this overlap. The
benefit will be greater when a future patch adds extra code to allows
immediates to be evaluated from constant symbols (e.g. `.equ CONST, 0x1234`).
No functional change intended.
llvm-svn: 342641
Examples such as `jal a3`, `j a3` and `jal a3, a3` are accepted by gas
but rejected by LLVM MC. This patch rectifies this. I introduce
RISCVAsmParser::parseJALOffset to ensure that symbol names that coincide with
register names can safely be parsed. This is made a somewhat fiddly due to the
single-operand alias form (see the comment in parseJALOffset for more info).
Differential Revision: https://reviews.llvm.org/D52029
llvm-svn: 342629
Introduce a new RISCVExpandPseudoInsts pass to expand atomic
pseudo-instructions after register allocation. This is necessary in order to
ensure that register spills aren't introduced between LL and SC, thus breaking
the forward progress guarantee for the operation. AArch64 does something
similar for CmpXchg (though only at O0), and Mips is moving towards this
approach (see D31287). See also [this mailing list
post](http://lists.llvm.org/pipermail/llvm-dev/2016-May/099490.html) from
James Knight, which summarises the issues with lowering to ll/sc in IR or
pre-RA.
See the [accompanying RFC
thread](http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html) for an
overview of the lowering strategy.
Differential Revision: https://reviews.llvm.org/D47882
llvm-svn: 342534
This allows the hard-coded shouldForceImmediate logic to be removed because
the generated MatchOperandParserImpl makes use of the current context (i.e.
the current mnemonic) to determine parsing behaviour, and so won't first try
to parse a register before parsing a symbol name.
No functional change is intended. gas accepts immediate arguments for call,
tail and lla. This patch doesn't address this discrepancy.
Differential Revision: https://reviews.llvm.org/D51733
llvm-svn: 342488
addi a0, a0, foo and lw a0, foo(a0) and similar are now rejected. An explicit
%lo and %pcrel_lo modifier is required. This matches gas behaviour.
llvm-svn: 342487
Reject bare symbols and accept only %pcrel_hi(sym) for auipc and %hi(sym) for
lui. Also test valid operand modifiers in rv32i-valid.s.
Note this is slightly stricter than gas, which will accept either %pcrel_hi or
%hi for both lui and auipc.
Differential Revision: https://reviews.llvm.org/D51731
llvm-svn: 342486
Summary:
Fixed assertions due to invalid fixup when encoding compressed instructions
(c.addi, c.addiw, c.li, c.andi) with bare symbols with/without modifiers.
This matches GAS behavior as well.
This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.
Reviewers: asb
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb
Differential Revision: https://reviews.llvm.org/D52005
llvm-svn: 342160
Summary:
The illegal instruction 0x00 0x00 is being wrongly decoded as
c.addi4spn with 0 immediate.
The invalid instruction 0x01 0x61 is being wrongly decoded as
c.addi16sp with 0 immediate.
This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.
Reviewers: asb
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb
Differential Revision: https://reviews.llvm.org/D51815
llvm-svn: 342159
Disassemblers cannot depend on main target headers. The same is true for
MCTargetDesc, but there's a lot more cleanup needed for that.
llvm-svn: 341822