Commit Graph

30 Commits

Author SHA1 Message Date
Haocong.Lu bd653f6406 [RISCV] Use shift for zero extension when Zbb and Zbp are not enabled
Now AND is used for zero extension when both Zbb and Zbp are not enabled.
It may be better to use shift operation if the trailing ones mask exceeds simm12.

This patch optimzes LUI+ADDI+AND to SLLI+SRLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116720
2022-01-11 02:37:03 +00:00
Craig Topper b645bcd98a [RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, (XLen-16), (XLen-16) + C) optimization.
This can be generalized to (srl (and X, C2), C) ->
(srli (slli X, (XLen-C3), (XLen-C3) + C). Where C2 is a mask with
C3 trailing ones.

This can avoid constant materialization for C2. This is beneficial
even when C2 can be selected to ANDI because the SLLI can become
C.SLLI, but C.ANDI cannot cover all the immediates of ANDI.

This also enables CSE in some cases of i8 sdiv by constant codegen.
2022-01-09 23:37:10 -08:00
wangpc 41454ab256 [RISCV] Use constant pool for large integers
For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by constant), we may
need about 4~8 instructions to build them.
In the same time, it just takes two instructions to load
constants (with extra cycles to access memory), so it may be
profitable to put these integers into constant pool.

Reviewed By: asb, craig.topper

Differential Revision: https://reviews.llvm.org/D114950
2021-12-31 14:48:48 +08:00
Craig Topper 6f7de819b9 [RISCV] Use MULHU for more division by constant cases.
D113805 improved handling of i32 divu/remu on RV64. The basic idea
from that can be extended to (mul (and X, C2), C1) where C2 is any
mask constant.

We can replace the and with an SLLI by shifting by the number of
leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C1)
bits. This will give the full product XLen additional trailing zeros,
putting the result in the output of MULHU. If we can't use ANDI,
ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register.

The downside is it make take 1 additional instruction to create C1.
But since that's not on the critical path, it can hopefully be
interleaved with other operations.

The previous tablegen pattern is replaced by custom isel code.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D115310
2021-12-09 09:10:14 -08:00
wangpc af0ecfccae [RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112692
2021-11-22 14:01:37 +08:00
Craig Topper 02bed66cd5 [RISCV] Improve codegen for i32 udiv/urem by constant on RV64.
The division by constant optimization often produces constants that
are uimm32, but not simm32. These constants require 3 or 4 instructions
to materialize without Zba.

Since these instructions are often used by a multiply with a LHS
that needs to be zero extended with an AND, we can switch the MUL
to a MULHU by shifting both inputs left by 32. Once we shift the
constant left, the upper 32 bits no longer need to be 0 so constant
materialization is free to use LUI+ADDIW. This reduces the constant
materialization from 4 instructions to 3 in some cases while also
reducing the zero extend of the LHS from 2 shifts to 1.

Differential Revision: https://reviews.llvm.org/D113805
2021-11-12 14:49:10 -08:00
Craig Topper d9ba1a9c5c [RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are used.
We normally select these when the root node is a sext_inreg, but
SimplifyDemandedBits can sometimes bypass the sext_inreg for some
users. This can create situation where sext_inreg+add/sub/mul/shl
is selected to a W instruction, and then the add/sub/mul/shl is
separately selected to a non-W instruction with the same inputs.

This patch tries to detect when it would still be ok to use a W
instruction without the sext_inreg by checking the direct users.
This can allow the W instruction to CSE with one created for a
sext_inreg+add/sub/mul/shl. To minimize complexity and cost of
checking, we make no attempt to determine if the CSE will happen
and just always use a W instruction when we can.

Differential Revision: https://reviews.llvm.org/D107658
2021-08-18 10:22:00 -07:00
Craig Topper 98d4adc2d1 [RISCV] Add custom isel to select (and (srl X, C1), C2) and (and (shl X, C1), C2)
Replace some existing isel patterns that are covered by the new
code. SLLIUWPat has been removed in favor of folding its root case
into the new code. The other uses in isel patterns for shXadd.uw
have been switched to using hardcoded AND masks.

This is based on the original version of D49585 from ARM. The final
version of that was made a DAG combine, but I've chosen to keep it
as custom isel. I'm not convinced DAG combine is as good with
shift pairs as it is with and+shift. I saw some issues optimizing
the shifts created by vscale lowering if an and isn't created for
from a shift pair.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D106230
2021-07-20 08:53:55 -07:00
Craig Topper 00c1cc867f [RISCV] Add more i32 srem/sdiv with power of 2 constant tests. NFC
Add a small power 2 srem test to match existing sdiv test. Add
larger power of 2 test to both.

The larger constant test shows materialization of a constant
for an AND in the RV64 code. We should be using W shift instructions
to match the RV32 code.
2021-07-18 00:21:14 -07:00
Craig Topper 1e670dc7d7 [RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant.
We don't really have optimizations for division with a constant
LHS. If we don't use a W instruction we end up needing to sign
or zero extend the RHS to use the 64-bit instruction.

I had to sign_extend i32 constants on the LHS instead of using
any_extend which becomes zero_extend. If we don't do this, constants
that were originally negative become harder to materialize. I think
this problem exists for more of our W instruction cases. For example
(i32 (shl -1, X)), but we don't have lit tests. I'll work on that
as a follow up.

I also left a FIXME for enabling W instruction for RHS constants
under -Oz.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D105769
2021-07-13 10:33:57 -07:00
Craig Topper 86109fa9e8 [RISCV] Add test cases for div/rem with constant left hand side. NFC
Some of these would produce better code if we used W instructions,
but constant LHS currently prevents that.
2021-07-10 17:22:40 -07:00
Craig Topper 010f0f000f Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions."
I thought this might help with another optimization I was
thinking about, but I don't think it will. So it just wastes
compile time calling computeKnownBits for no benefit.

This reverts commit 81b2f95971.
2021-06-27 10:33:43 -07:00
Craig Topper 81b2f95971 [RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions. 2021-06-26 11:57:26 -07:00
Craig Topper ff902080a9 [RISCV] Use SLLI/SRLI instead of SLLIW/SRLIW for (srl (and X, 0xffff), C) custom isel on RV64.
We don't need the sign extending behavior here and SLLI/SRLI
are able to compress to C.SLLI/C.SRLI.
2021-04-11 13:59:51 -07:00
Craig Topper 5744502a13 [TargetLowering][RISCV][AArch64][PowerPC] Enable BuildUDIV/BuildSDIV on illegal types before type legalization if we can find a larger legal type that supports MUL.
If we wait until the type is legalized, we'll lose information
about the orginal type and need to use larger magic constants.
This gets especially bad on RISCV64 where i64 is the only legal
type.

I've limited this to simple scalar types so it only works for
i8/i16/i32 which are most likely to occur. For more odd types
we might want to do a small promotion to a type where MULH is legal
instead.

Unfortunately, this does prevent some urem/srem+seteq matching since
that still require legal types.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D96210
2021-02-11 09:43:13 -08:00
Craig Topper 16fb1c7aae [RISCV] Add i8/i16 test cases to div.ll and i8/i16/i64 to rem.ll. NFC
This improves our coverage of these operations and shows that we
use really large constants for division by constant on i8/i16
especially on RV64. The issue is that BuildSDIV/BuildUDIV are
limited to legal types so we have to promote to i64 before it
kicks in. At that point we've lost the range information for the
original type.
2021-02-04 16:46:23 -08:00
Michael Munday e28b6a60bc [RISCV][NFC] Regenerate RISCV CodeGen tests
Regenerated using:

./llvm/utils/update_llc_test_checks.py -u llvm/test/CodeGen/RISCV/*.ll

This has added comments to spill-related instructions and added @plt to
some symbols.

Differential Revision: https://reviews.llvm.org/D92841
2020-12-09 19:42:49 +00:00
Luis Marques 3d0fbafd0b [RISCV] Switch to the Machine Scheduler
Most of the test changes are trivial instruction reorderings and differing
register allocations, without any obvious performance impact.

Differential Revision: https://reviews.llvm.org/D66973

llvm-svn: 372106
2019-09-17 11:15:35 +00:00
Luis Marques 2d550d19b3 Revert Patch from Phabricator
This reverts r372092 (git commit e38695a025)

llvm-svn: 372104
2019-09-17 10:52:09 +00:00
Luis Marques e38695a025 Patch from Phabricator
llvm-svn: 372092
2019-09-17 09:43:08 +00:00
Alex Bradbury 61aa940074 [RISCV] Introduce codegen patterns for RV64M-only instructions
As discussed on llvm-dev
<http://lists.llvm.org/pipermail/llvm-dev/2018-December/128497.html>, we have
to be careful when trying to select the *w RV64M instructions. i32 is not a
legal type for RV64 in the RISC-V backend, so operations have been promoted by
the time they reach instruction selection. Information about whether the
operation was originally a 32-bit operations has been lost, and it's easy to
write incorrect patterns.

Similarly to the variable 32-bit shifts, a DAG combine on ANY_EXTEND will
produce a SIGN_EXTEND if this is likely to result in sdiv/udiv/urem being
selected (and so save instructions to sext/zext the input operands).

Differential Revision: https://reviews.llvm.org/D53230

llvm-svn: 350993
2019-01-12 07:43:06 +00:00
Shiva Chen d58bd8dc4a [RISCV] Expand function call to "call" pseudoinstruction
To do this:
1. Change GlobalAddress SDNode to TargetGlobalAddress to avoid legalizer
   split the symbol.

2. Change ExternalSymbol SDNode to TargetExternalSymbol to avoid legalizer
   split the symbol.

3. Let PseudoCALL match direct call with target operand TargetGlobalAddress
   and TargetExternalSymbol.

Differential Revision: https://reviews.llvm.org/D44885

llvm-svn: 330827
2018-04-25 14:19:12 +00:00
Alex Bradbury 921383828e [RISCV] Codegen support for the standard RV32M instruction set extension
llvm-svn: 322843
2018-01-18 12:36:38 +00:00
Alex Bradbury 7d6aa1f7ae [RISCV] Implement frame pointer elimination
llvm-svn: 322839
2018-01-18 11:34:02 +00:00
Alex Bradbury d3263aa1df [RISCV][NFC] Add nounwind to functions in div.ll and mul.ll
Committing this separately to minimise irrelevant changes for an upcoming 
patch.

llvm-svn: 322825
2018-01-18 09:41:14 +00:00
Alex Bradbury 59136ffab1 [RISCV] Enable emission of alias instructions by default
This patch switches the default for -riscv-no-aliases to false
and updates all affected MC and CodeGen tests. As recommended in
D41071, MC tests use the canonical instructions and the CodeGen
tests use the aliases.

Additionally, for the f and d instructions with rounding mode,
the tests for the aliased versions are moved and tightened such
that they can actually detect if alias emission is enabled.
(see D40902 for context)

Differential Revision: https://reviews.llvm.org/D41225

Patch by Mario Werner.

llvm-svn: 320797
2017-12-15 09:47:01 +00:00
Alex Bradbury b014e3de52 [RISCV] Implement prolog and epilog insertion
As frame pointer elimination isn't implemented until a later patch and we make 
extensive use of update_llc_test_checks.py, this changes touches a lot of the 
RISC-V tests.

Differential Revision: https://reviews.llvm.org/D39849

llvm-svn: 320357
2017-12-11 12:34:11 +00:00
Alex Bradbury 660bcceccf [RISCV] Support lowering FrameIndex
Introduces the AddrFI "addressing mode", which is necessary simply because 
it's not possible to write a pattern that directly matches a frameindex.

Ensure callee-saved registers are accessed relative to the stackpointer. This
is necessary as callee-saved register spills are performed before the frame
pointer is set.

Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can 
make use of it in the RISC-V backend.

Differential Revision: https://reviews.llvm.org/D39848

llvm-svn: 320353
2017-12-11 11:53:54 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Alex Bradbury ffc435e9c7 [RISCV] Support and tests for a variety of additional LLVM IR constructs
Previous patches primarily ensured that codegen was possible for the standard
RISC-V instructions. However, there are a number of IR inputs that wouldn't be
appropriately lowered. This patch both adds test cases and supports lowering
for a number of these cases:
* Improved sext/zext/trunc support
* Support for setcc variants that don't map directly to RISC-V instructions
* Lowering mul, and hence support for external symbols
* addc, adde, subc, sube
* mulhs, srem, mulhu, urem, udiv, sdiv
* {srl,sra,shl}_parts
* brind
* br_jt
* bswap, ctlz, cttz, ctpop
* rotl, rotr
* BlockAddress operands

Differential Revision: https://reviews.llvm.org/D29938

llvm-svn: 318737
2017-11-21 08:11:03 +00:00