ALU seems a little vague. FAdd felt more precise even though it
also include FSUB instructions.
Reviewed By: monkchiang
Differential Revision: https://reviews.llvm.org/D133632
At least based on the lit tests, the coalescer sometimes fails to
propagate the copy from X0 into the branch instruction. This patch
does it manually during isel. The majority of the changes are from
the select patterns.
Some of the changes are just register allocation changes. Only
the Select change affects the whether a b*z instruction is generated
in the tests. I changed the branch pattern for consistency.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D130809
This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D93298
This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D93298
This idea has come up in several reviews -- D115978 and D105902 -- so I
can't take any credit for the idea. Instead of using a constant pool to
lower -0.0, we can emit a sequence of two instructions:
fmv.[hwd].x freg, zero
fsgnjn.[hsd] freg, freg, freg
This is only done when the floating-point type is legal.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D117687
This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).
FEQ matches well to STRICT_FSETCC oeq.
FLT/FLE matches well to STRICT_FSETCCS olt/ole.
Others require commuting operands or multiple instructions.
STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE,
but we need to save/restore FFLAGS around them to avoid spurious
exceptions. I've implemented pseudo instructions with a
CustomInserter to insert the save/restore CSR instructions.
Unfortunately, this doesn't honor exceptions for signaling NANs
but I'm not sure if signaling nans are really supported by the
constrained intrinsics.
STRICT_FSETCC one and ueq expand to a pair of FLT instructions
with a save/restore of fflags around each. This could be improved
in the future.
There may be some opportunities to generate better code for strict
comparisons mixed with nonans fast math flags. I've left FIXMEs in
the .td files for that.
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D116694
This patch adds isel support for STRICT_LRINT/LLRINT/LROUND/LLROUND.
It also adds test cases for f32 and f64 constrained intrinsics that
correspond to the intrinsics in float-intrinsics.ll and
double-intrinsics.ll. Support for promoting the integer argument of
STRICT_FPOWI was added.
I've skipped adding tests for f16 intrinsics, since we don't have libcalls
for them and we have inconsistent support for promoting them in LegalizeDAG.
This will need to be examined more closely.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D116323
This adds support for strict conversions between fp types and between
integer and fp.
NOTE: RISCV has static rounding mode instructions, but the constrainted
intrinsic metadata is not used to select static rounding modes. Dynamic
rounding mode is always used.
Differential Revision: https://reviews.llvm.org/D115997
Test that STRICT_FMINNUM/FMAXNUM are lowered to libcalls for f32/f64.
The RISC-V instructions don't match the behavior of fmin/fmax libcalls
with respect to SNaN.
Promoting FMINNUM/FMAXNUM for f16 needs more work outside of the
RISC-V backend.
Reviewed By: asb, arcbbb
Differential Revision: https://reviews.llvm.org/D115680
Instead of having unary instruction include a 'let' in their class
body, add rs2val as a template parameter. Then we can use a let
in FPUnaryOp_r and FPUnaryOp_r_frm. This reduces the overall
verbosity of the FP files.
Reviewed By: achieveartificialintelligence
Differential Revision: https://reviews.llvm.org/D115537
By adding the register class and funct as template parameters we
can share the classes with all 3 extensions.
I've used "let SchedRW =" to avoid repeating scheduler classes on
multiple lines where we previously inherited from the Sched class.
A subsequent patch will add mayRaiseFPException and FRM dependencies.
Reducing the number of classes means less repeating for those changes.
This of course conflicts with the Zfinx patch D93298.
Reviewed By: achieveartificialintelligence
Differential Revision: https://reviews.llvm.org/D115469
The fcvt fp to integer instructions saturate if their input is
infinity or out of range, but the instructions produce a maximum
integer for nan instead of 0 required for the ISD opcodes.
This means we can use the instructions to do the saturating
conversion, but we'll need to fix up the nan case at the end.
We can probably improve the i8 and i16 default codegen as well,
but I'll leave that for a follow up.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D107230
I stumbled onto a case where our (sext_inreg (assertzexti32 (fptoui X)), i32)
isel pattern can cause an fcvt.wu and fcvt.lu to be emitted if
the assertzexti32 has an additional user. If we add a one use check
it would just cause a fcvt.lu followed by a sext.w when only need
a fcvt.wu to satisfy both users.
To mitigate this I've added custom isel and new ISD opcodes for
fcvt.wu. This allows us to keep know it started life as a conversion
to i32 without needing to match multiple nodes. ComputeNumSignBits
has been taught that this new nodes produces 33 sign bits. To
prevent regressions when we need to zero extend the result of an
(i32 (fptoui X)), I've added a DAG combine to convert it to an
(i64 (fptoui X)) before type legalization. In most cases this would
happen in InstCombine, but a zero_extend can be created for function
returns or arguments.
To keep everything consistent I've added new nodes for fptosi as well.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D106346
These are fp->int conversions using either RMM or dynamic rounding modes.
The lround and lrint opcodes have a return type of either i32 or
i64 depending on sizeof(long) in the frontend which should follow
xlen. llround/llrint should always return i64 so we'll need a libcall
for those on rv32.
The frontend will only emit the intrinsics if -fno-math-errno is in
effect otherwise a libcall will be emitted which will not use
these ISD opcodes.
gcc also does this optimization.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D105206
Rename RVInstR4 as used by F/D/Zfh extensions to RVInstR4Frm.
Introduce new RVInstR4 that takes funct3 as a parameter.
Add new format classes for FSRI and FSRIW instead of trying to
bend RVInstR4 to use a shamt overlayed on rs2 and funct2.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D100427
This previously made references to 2.3-draft which was a short
lived version number in 2017. It was replaced by date based
versions leading up to ratification.
This patch uses the latest ratified version number and just says
what the behavior is. Nothing here is in flux.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D100878
It's unlikely that FMADD and FMSUB would have different scheduling
information so merge them.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D99140
We just started using a ComplexPattern for sexti32. This updates
zexti32 to match.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D97231
An i64 AssertZExt from a type smaller than i32 has at least 33
leading zeros which mean it has at least 33 sign bits.
Since we have a couple patterns that use two sexti32, I've
switched to a ComplexPattern so tablegen didn't have to generate
9 different permutations.
As noted in the FIXME, maybe we should just call computeNumSignBits,
but we don't have tests that benefit from that yet.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D97130
This stops tablegen from generating patterns with the opposite type
in the opposite HwMode. This just adds wasted bytes to the isel table.
This reduces the isel table by about 1800 bytes.
If SETUNE isn't legal, UO can use the NOT of the SETO expansion.
Removes some complex isel patterns. Most of the test changes are
from using XORI instead of SEQZ.
Differential Revision: https://reviews.llvm.org/D92008
Rather than having a different opcode for RV32 and RV64. Let's just say the integer type is XLenVT and use a single opcode for both modes.
Differential Revision: https://reviews.llvm.org/D92538
We had an zexti32 after a sign_extend_inreg. The AND X, 0xffffffff
part of the zexti32 should never occur since SimplifyDemandedBits
from the sign_extend_inreg would have removed it.
We also had sexti32 as the root node of a pattern, but SelectionDAGISel
matches assertsext early before the tablegen based patterns are
evaluated.
X86 was already specially marking fma as commutable which allowed
tablegen to autogenerate commuted patterns. This moves it to the target
independent definition and fix up the targets to remove now
unneeded patterns.
Unfortunately, the tests change because the commuted version of
the patterns are generating operands in a different than the
explicit patterns.
Differential Revision: https://reviews.llvm.org/D91842
The multiply part of FMA is commutable, but TargetSelectionDAG.td
doesn't have it marked as commutable so tablegen won't automatically
create the additional patterns.
So manually add commuted patterns.
Summary:
This patch addresses some weird assembly sequences we were seeing during
comparing floats. In particular, comparing a float to itself tells you whether
it is NaN or not, which we were doing correctly, but with an extra unneeded
`and` instruction.
This patch specialises the existing patterns to remove the `and` instructions
when both their operands are the same.
Reviewed By: luismarques, asb
Differential Revision: https://reviews.llvm.org/D78908
Floating point positive zero can be selected using fmv.w.x / fmv.d.x /
fcvt.d.w and the zero source register.
Differential Revision: https://reviews.llvm.org/D75729
The patch fixes some typos and introduces ReadFMemBase, ReadFSGNJ32,
ReadFSGNJ64, WriteFSGNJ32, WriteFSGNJ64, ReadFMinMax32, ReadFMinMax64,
WriteFMinMax32, WriteFMinMax64, so the target CPU with different pipeline model
could use them to describe latency.
Differential Revision: https://reviews.llvm.org/D75515
Pipeline scheduler model for the RISC-V Rocket micro-architecture using the
MIScheduler interface. Support for both 32 and 64-bit Rocket cores is
implemented.
Differential revision: https://reviews.llvm.org/D68685
Summary: Adds tablegen patterns to explicitly handle fcopysign where the
magnitude and sign arguments have different types, due to the sign value casts
being removed the by DAGCombiner. Support for RV32IF follows in a separate
commit. Adds tests for all relevant scenarios except RV32IF.
Reviewers: lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70678
Adds a `seto` pattern expansion. Without it the lowerings of `fcmp one` and
`fcmp ord` would be inefficient due to an unoptimized double negation.
Differential Revision: https://reviews.llvm.org/D59699
llvm-svn: 357378
Allow load/store instructions with implied zero offset for compatibility with
GNU assembler.
Differential Revision: https://reviews.llvm.org/D57141
Patch by James Clarke.
llvm-svn: 354581
Summary:
Those pseudo-instructions are making load/store instructions able to
load/store from/to a symbol, and its always using PC-relative addressing
to generating a symbol address.
Reviewers: asb, apazos, rogfer01, jrtc27
Differential Revision: https://reviews.llvm.org/D50496
llvm-svn: 354430
This patch:
* Adds necessary RV64D codegen patterns
* Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI)
Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type.
Differential Revision: https://reviews.llvm.org/D53237
llvm-svn: 352833
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd).
The criteria for choosing whether a fused add or subtract is used, as well as
whether the product is negated or not, is whether some of the arguments to the
llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd
instructions were added to avoid the negation being performed using a xor
trick, which prevented the proper FMA forms from being selected and thus
tested.
The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 -
rs3), but they should be correct. The misleading names were inherited from
MIPS, where the negation happens after computing the sum.
The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions,
as that depends on TargetLowering::isFMAFasterthanFMulAndFAdd.
Some comments in the test files about what type of instructions are there
tested were updated, to better reflect the current content of those test
files.
Differential Revision: https://reviews.llvm.org/D54205
Patch by Luís Marques.
llvm-svn: 349023
The patterns as defined are correct only when XLen==32.
This is another preparatory patch for a set of patches that flesh out RV64
codegen.
llvm-svn: 343679
These are produced by GCC and supported by GAS, but not currently contained in
the pseudoinstruction listing in the RISC-V ISA manual.
llvm-svn: 335127
Also add double-prevoius-failure.ll which captures a test case that at one
point triggered a compiler crash, while developing calling convention support
for f64 on RV32D with soft-float ABI.
llvm-svn: 329877