Commit Graph

58 Commits

Author SHA1 Message Date
Craig Topper c50457f3e4 [RISCV] Make the code in MatchSLLIUW ignore the lower bits of the AND mask where the shift has guaranteed zeros.
This avoids being dependent on SimplifyDemandedBits having cleared
those bits.

It could make sense to teach SimplifyDemandedBits to keep all
lower bits 1 in an AND mask when possible. This could be
implemented with slli+srli in the general case rather than
needing to materialize the constant.
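A minimal sketch of the idea, with a made-up helper name and an assumed check (not the patch's actual code): when matching (and (shl X, C), Mask) as SLLI.UW, the low C bits of Mask are don't-cares because the shift already guarantees they are zero.

    // Hypothetical helper: ignore the low ShAmt bits of the AND mask, since
    // the shl has already forced them to zero.
    #include <cstdint>
    static bool maskMatchesSLLIUW(uint64_t Mask, unsigned ShAmt) {
      uint64_t Expected = UINT64_C(0xFFFFFFFF) << ShAmt; // mask an exact pattern would need
      uint64_t DontCare = (UINT64_C(1) << ShAmt) - 1;    // bits guaranteed zero by the shift
      return (Mask | DontCare) == (Expected | DontCare);
    }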
2021-01-24 00:34:45 -08:00
Hsiangkai Wang 66a49aef69 [RISCV] Implement vsoxseg/vsuxseg intrinsics.
Define vsoxseg/vsuxseg intrinsics and pseudo instructions.
Lower vsoxseg/vsuxseg intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94940
2021-01-23 08:54:56 +08:00
Hsiangkai Wang 97e33feb08 [RISCV] Implement vloxseg/vluxseg intrinsics.
Define vloxseg/vluxseg intrinsics and pseudo instructions.
Lower vloxseg/vluxseg intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94903
2021-01-23 08:54:56 +08:00
Hsiangkai Wang a8b96eadfd [RISCV] Implement vssseg intrinsics.
Define vssseg intrinsics and pseudo instructions. Lower vssseg
intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94863
2021-01-21 11:51:35 +08:00
Hsiangkai Wang e5e329023b [RISCV] Implement vlsseg intrinsics.
Define vlsseg intrinsics and pseudo instructions. Lower vlsseg intrinsics
to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94763
2021-01-21 11:51:35 +08:00
Hsiangkai Wang 47228f7854 [RISCV] Implement vsseg intrinsics.
Define vsseg intrinsics and pseudo instructions. Lower vsseg intrinsics
to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94688
2021-01-21 11:51:35 +08:00
Hsiangkai Wang 8ca4b174d7 [RISCV] Implement vlseg intrinsics.
For Zvlsseg, we need contiguous vector registers for the values. We need
to define new register classes for the different combinations of (number
of fields and LMUL). For example,

when the number of fields (NF) = 3 and LMUL = 2, the values will be assigned
to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ...

We define the vlseg intrinsics with multiple outputs. There is no way to
describe the codegen patterns with multiple outputs in the tablegen
files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to
extract the output values.

The multiple scalable vector values will be put into a struct. This
patch depends on the support for scalable vector structs.

Differential Revision: https://reviews.llvm.org/D94229
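For illustration only (not code from the patch), the NF=3, LMUL=2 tuples above can be enumerated like this, assuming a group needs NF*LMUL consecutive vector registers and must start at a multiple of LMUL:

    // Illustrative sketch: print the (NF=3, LMUL=2) register tuples.
    #include <cstdio>
    int main() {
      const unsigned NF = 3, LMUL = 2, NumVRegs = 32;
      for (unsigned Start = 0; Start + NF * LMUL <= NumVRegs; Start += LMUL)
        std::printf("(V%uM2, V%uM2, V%uM2)\n", Start, Start + LMUL, Start + 2 * LMUL);
      return 0;
    }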
2021-01-20 14:26:04 +08:00
Craig Topper 387d3c2479 [RISCV] Merge Utils library into MCTargetDesc
MCTargetDesc includes headers from Utils and Utils includes headers
from MCTargetDesc. So from a library layering perspective it makes sense
for them to be in the same library. I guess the other option might be to
move the tablegen includes from RISCVMCTargetDesc.h to RISCVBaseInfo.h
so that RISCVBaseInfo.h didn't need to include RISCVMCTargetDesc.h.
Everything else that depends on Utils also depends on MCTargetDesc so
having one library seemed simpler.

Differential Revision: https://reviews.llvm.org/D93168
2021-01-14 11:47:30 -08:00
Craig Topper 7b5a0e2f88 [RISCV] Move shift ComplexPatterns and custom isel to PatFrags with predicates
ComplexPatterns are kind of weird: they don't call any of the predicates on their operands, and their "complexity", used for ordering in the tablegen matcher table, is hand-specified.

This started as an attempt to just use sext_inreg + SLOIPat to implement SLOIW just to have one less Select function. The matching for the or+shl is the same as long as you know the immediate is less than 32 for SLOIW. But that didn't work out because using uimm5 with SLOIPat didn't do anything if it was a ComplexPattern.

I realized I could just use a PatFrag with the opcodes I wanted to match and an immediate predicate would then evaluate correctly. This also computes the complexity just like any other pattern does. Then I just needed to check the constraints on the immediates in the predicate. Conveniently the predicate is evaluated after the fragment has been matched. So the structure has already been checked, we just need to find the constants.

I'll note that this is unusual; I didn't find any other targets looking through operands in a PatFrag predicate. There is a PredicateCodeUsesOperands feature that can be used to collect the operands into an array that is used by AMDGPU/VOP3Instructions.td. I believe that feature exists to handle commuted matching, but since the nodes here use constants, they aren't ever commuted.

Differential Revision: https://reviews.llvm.org/D91901
2021-01-05 11:37:48 -08:00
Fraser Cormack d85a198e85 [RISCV] Pattern-match more vector-splatted constants
This patch extends the pattern-matching capability of vector-splatted
constants. When illegally-typed constants are legalized they are
canonically sign-extended to XLenVT. This preserves the sign and allows
us to match simm5. If they were zero-extended for whatever reason we'd
lose that ability: e.g. `(i8 -1) -> (XLenVT 255)` would not be matched
under the current logic.

To address this we first manually sign-extend the splatted constant from
the vector element type to int64_t. This preserves the semantics while
removing any implicitly-truncated bits.

The corresponding logic for uimm5 was not updated, the rationale being
that neither sign- nor zero-extending a legal uimm5 immediate should
change that (unless we expect actual "garbage" upper bits).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93837
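A minimal sketch of the manual sign-extension step described above (the helper name is illustrative; SignExtend64 is the usual LLVM utility):

    // Illustrative only: sign-extend the raw splat value from the element
    // width to 64 bits, so e.g. a raw i8 value of 255 becomes -1 and can
    // still be matched as simm5.
    #include "llvm/Support/MathExtras.h"
    static int64_t signExtendSplat(uint64_t Raw, unsigned EltBits) {
      return llvm::SignExtend64(Raw, EltBits);
    }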
2020-12-28 07:11:10 +00:00
Fraser Cormack 1a7ac29a89 [RISCV] Add ISel support for RVV vector/scalar forms
This patch extends the SDNode ISel support for RVV from only the
vector/vector instructions to include the vector/scalar and
vector/immediate forms.

It uses splat_vector to carry the scalar in each case, except when
XLEN < SEW (RV32 with SEW=64), where a custom node `SPLAT_VECTOR_I64` is
used for type legalization and to encode the fact that the value is
sign-extended to SEW. When the scalar is a full 64-bit value we use a
sequence to materialize the constant into the vector register.

The non-intrinsic ISel patterns have also been split into their own
file.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93312
2020-12-23 20:16:18 +00:00
Craig Topper 69c8d121f7 [RISCV] Add intrinsics for vsetvli instruction
This patch adds two IR intrinsics for the vsetvli instruction: one sets the vector length to a user-specified value and one sets it to vlmax. The vlmax form uses the X0 source register encoding.

Clang builtins will follow in a separate patch

Differential Revision: https://reviews.llvm.org/D92973
2020-12-18 12:10:09 -08:00
Craig Topper aaa925795f [RISCV] Use SDLoc created early in RISCVDAGToDAGISel::Select instead of recreating it in multiple cases in the switch. NFC 2020-12-08 21:13:25 -08:00
Craig Topper 3e86fbc971 [RISCV] Replace custom isel code for RISCVISD::READ_CYCLE_WIDE with isel pattern
This node returns 2 results and uses a chain. As long as we use a DAG as part of the pseudo instruction definition where we can use the "set" operator, it looks like tablegen can handle using a pattern for this without a problem. I believe the original implementation was copied from PowerPC.

This also fixes the pseudo instruction so that it is marked as having side effects to match the definition of CSRRS and the RV64 instruction. And we don't need to explicitly clear mayLoad/mayStore since those can be inferred now.

Differential Revision: https://reviews.llvm.org/D92786
2020-12-08 10:23:37 -08:00
Hsiangkai Wang f7bc7c2981 [RISCV] Support Zfh half-precision floating-point extension.
Support "Zfh" extension according to
https://github.com/riscv/riscv-isa-manual/blob/zfh/src/zfh.tex

Differential Revision: https://reviews.llvm.org/D90738
2020-12-03 09:16:33 +08:00
Craig Topper 78767b7f8e [RISCV] Add RISCVISD::ROLW/RORW; use those for custom legalizing i32 rotl/rotr on RV64IZbb.
This should result in better utilization of RORIW since we
don't need to look for a SIGN_EXTEND_INREG that may not exist.

Also remove rotl/rotr isel matching to GREVI and just prefer RORI.
This is to keep consistency so we don't have to match ROLW/RORW
to GREVIW as well. I imagine RORI/RORIW performance will be the
same or better than GREVI.

Differential Revision: https://reviews.llvm.org/D91449
2020-11-20 10:25:47 -08:00
Craig Topper 124c93c528 [RISCV] When matching SROIW, check all 64 bits of the OR mask
We need to make sure the upper 32 bits are all ones to ensure the result is properly sign extended. Previously we only checked the lower 32 bits of the mask. I've also added a check that the shift amount is less than 32. Without that the original code asserts inside maskLeadingOnes if the SROI check is removed or the SROIW pattern is checked first. I've refactored the code to use early outs to reduce nesting.

I've also updated SLOIW matching with the same changes, but I couldn't find a broken test case with the existing code.

Differential Revision: https://reviews.llvm.org/D90961
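A hedged sketch of the tightened check (helper name and exact form are assumed, not copied from the patch): all 64 bits of the OR mask are examined, the upper 32 must be ones so the result stays sign-extended, and the shift amount must be below 32.

    // Not the actual code: illustrative full-64-bit mask check for SROIW.
    #include <cstdint>
    static bool isValidSROIWMask(uint64_t Mask, unsigned ShAmt) {
      if (ShAmt == 0 || ShAmt >= 32)
        return false;
      uint64_t UpperOnes = UINT64_C(0xFFFFFFFF) << 32;  // keeps the result sign-extended
      uint64_t LowerOnes =                              // ones shifted in from bit 31 down
          (UINT64_C(0xFFFFFFFF) << (32 - ShAmt)) & UINT64_C(0xFFFFFFFF);
      return Mask == (UpperOnes | LowerOnes);
    }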
2020-11-16 10:08:15 -08:00
Craig Topper 857563eaf0 [RISCV] Check all 64-bits of the mask in SelectRORIW.
We need to ensure the upper 32 bits of the mask are zero,
so that the srl shifts zeroes into the lower 32 bits.

Differential Revision: https://reviews.llvm.org/D90585
2020-11-04 10:15:30 -08:00
Craig Topper 3701e33a22 [RISCV] Remove custom isel for (srl (shl val, 32), imm). Use pattern instead. NFCI
We don't need custom matching, we just need a predicate to check
the immediate is greater than 32. We can use the existing ImmSub32
to adjust the immediate.

I've also used the new predicate in the other location that used
ImmSub32. I tried to create a test case where we would break without
the greater than 32 check on that pattern, but DAG combine defeated me.
Still seemed safer to have it.

Differential Revision: https://reviews.llvm.org/D90546
2020-11-04 09:59:14 -08:00
Craig Topper 00eff96e1d [RISCV] Add missing patterns for rotr with immediate for Zbb/Zbp extensions.
DAGCombine doesn't canonicalize rotl/rotr with immediate so we
need patterns for both.

Remove the custom matcher for rotl to RORI and just use a SDNodeXForm
to convert the immediate instead. Doing this gives priority to the
rev32/rev16 versions of grevi over rori since an explicit immediate
is more precise than any immediate. I also added rotr patterns for
rev32/rev16. And removed the (or (shl), (shr)) patterns that should be
combined to rotl by DAG combine.

There is at least one other grev pattern that probably needs
another rotr pattern, but we need more test coverage first.

Differential Revision: https://reviews.llvm.org/D90575
2020-11-03 10:04:52 -08:00
Craig Topper 9ac2910093 [RISCV] Make SelectRORIW handle the commutability of OR.
The SHL and SRL could be in opposite order so account for that.

Differential Revision: https://reviews.llvm.org/D90586
2020-11-02 09:32:54 -08:00
Craig Topper 7142ec3aaf [RISCV] When matching RORIW, make sure the same input is given to both shifts.
The code is looking for (sext_inreg (or (shl X, C2), (shr (and Y, C3), C1))).
We need to ensure X and Y are the same.

Differential Revision: https://reviews.llvm.org/D90580
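A rough illustration of the added check (structure and names are assumed): once the (sext_inreg (or (shl X, C2), (srl (and Y, C3), C1))) shape has matched, X and Y must be the same node.

    // Illustrative only: both shifts must see the same source value.
    #include "llvm/CodeGen/SelectionDAGNodes.h"
    static bool shiftsShareSource(llvm::SDValue Or) {
      llvm::SDValue Shl = Or.getOperand(0);                // (shl X, C2)
      llvm::SDValue And = Or.getOperand(1).getOperand(0);  // (and Y, C3) under the srl
      return Shl.getOperand(0) == And.getOperand(0);       // X == Y?
    }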
2020-11-02 09:12:40 -08:00
Jay Foad 0819a6416f [SelectionDAG] Better legalization for FSHL and FSHR
In SelectionDAGBuilder always translate the fshl and fshr intrinsics to
FSHL and FSHR (or ROTL and ROTR) instead of lowering them to shifts and
ORs. Improve the legalization of FSHL and FSHR to avoid code quality
regressions.

Differential Revision: https://reviews.llvm.org/D77152
2020-08-21 10:32:49 +01:00
lewis-revill c9c955ada8 [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbt asm instructions
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the ternary subset (zbt subextension) of the
experimental B extension of RISC-V.
It also adds the corresponding codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79875
2020-07-15 12:19:34 +01:00
lewis-revill 6144f0a1e5 [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbbp asm instructions
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions belonging to both the permutation and the base
subsets of the experimental B extension of RISC-V.
It also adds the corresponding codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79873
2020-07-15 12:19:34 +01:00
lewis-revill e2692f0ee7 [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm instructions
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the base subset (zbb subextension) of the
experimental B extension of RISC-V.
It also adds the corresponding codegen tests.

This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Differential Revision: https://reviews.llvm.org/D79870
2020-07-15 12:19:34 +01:00
Ben Shi 1e9d0811c9 [RISCV] optimize addition with a pair of (addi imm)
For an addition with an immediate in specific ranges, a pair of
addi-addi can be generated instead of the ordinary lui-addi-add sequence.

Reviewed By: MaskRay, luismarques

Differential Revision: https://reviews.llvm.org/D82262
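A sketch of the split under assumed bounds (the exact ranges handled by the patch may differ): an immediate just outside the simm12 range is split into two simm12 halves so two ADDIs replace the LUI+ADDI+ADD sequence.

    // Illustrative only: split Imm into two simm12 parts when possible.
    #include <cstdint>
    static bool splitAddImmediate(int64_t Imm, int64_t &First, int64_t &Second) {
      if (Imm > 2047 && Imm <= 4094) {    // just above the positive simm12 limit
        First = 2047;
        Second = Imm - 2047;
        return true;
      }
      if (Imm < -2048 && Imm >= -4096) {  // just below the negative simm12 limit
        First = -2048;
        Second = Imm + 2048;
        return true;
      }
      return false;
    }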
2020-07-07 18:57:28 -07:00
Luís Marques 61c2a0bb82 [RISCV] Fold ADDIs into load/stores with nonzero offsets
We can often fold an ADDI into the offset of load/store instructions:

   (load (addi base, off1), off2) -> (load base, off1+off2)
   (store val, (addi base, off1), off2) -> (store val, base, off1+off2)

This is possible when off1+off2 still fits in the 12-bit immediate.
We remove the previous restriction where we would never fold the ADDIs if
the load/stores had nonzero offsets. We now do the fold if the resulting
constant still fits in a 12-bit immediate, or if off1 is a variable's address
and we know based on that variable's alignment that off1+off2 won't overflow.

Differential Revision: https://reviews.llvm.org/D79690
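A minimal sketch of the constant-offset case (the alignment-based case is omitted; isInt<12> is the usual LLVM simm12 check):

    // Illustrative only: the fold is legal when the combined offset still
    // fits the 12-bit signed immediate of the load/store.
    #include "llvm/Support/MathExtras.h"
    static bool canFoldAddiOffset(int64_t Off1, int64_t Off2) {
      return llvm::isInt<12>(Off1 + Off2);
    }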
2020-07-06 17:32:57 +01:00
Sam Elliott 969e703427 [RISCV] Support Constant Pools in Load/Store Peephole
Summary:
RISC-V uses a post-select peephole pass to optimise
`(load/store (ADDI $reg, %lo(addr)), 0)` into `(load/store $reg, %lo(addr))`.
This peephole wasn't firing for accesses to constant pools, which is how we
materialise most floating point constants.

This adds support for the constantpool case, which improves code generation for
lots of small FP loading examples. I have not added any tests because this
structure is well-covered by the `fp-imm.ll` testcases, as well as almost
all other uses of floating point constants in the RISC-V backend tests.

Reviewed By: luismarques, asb

Differential Revision: https://reviews.llvm.org/D79523
2020-05-11 19:20:38 +01:00
Sam Elliott 3242e5653a Revert "[RISCV] Support Constant Pools in Load/Store Peephole"
This reverts commit fe69dfebcf, due to
a slight change in the API.
2020-05-11 18:14:05 +01:00
Sam Elliott fe69dfebcf [RISCV] Support Constant Pools in Load/Store Peephole
Summary:
RISC-V uses a post-select peephole pass to optimise
`(load/store (ADDI $reg, %lo(addr)), 0)` into `(load/store $reg, %lo(addr))`.
This peephole wasn't firing for accesses to constant pools, which is how we
materialise most floating point constants.

This adds support for the constantpool case, which improves code generation for
lots of small FP loading examples. I have not added any tests because this
structure is well-covered by the `fp-imm.ll` testcases, as well as almost
all other uses of floating point constants in the RISC-V backend tests.

Reviewed By: luismarques, asb

Differential Revision: https://reviews.llvm.org/D79523
2020-05-11 18:01:18 +01:00
Shiva Chen af0cd9073c [RISCV] Split RISCVISelDAGToDAG.cpp to RISCVISelDAGToDAG.h and RISCVISelDAGToDAG.cpp
For downstream RISCV maintenance, it would be easier to inherit from
RISCVDAGToDAGISel by including the header and only override the methods that
need to be customized for a vendor's non-standard ISA extension, without
touching RISCVISelDAGToDAG.cpp, which may cause conflicts when upgrading the
downstream LLVM version.

Differential Revision: https://reviews.llvm.org/D77117
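A rough sketch of the downstream pattern this split enables (the vendor class and its handling are hypothetical):

    // Hypothetical downstream backend: include the new header and override
    // only what the vendor extension needs, leaving RISCVISelDAGToDAG.cpp untouched.
    #include "RISCVISelDAGToDAG.h"

    class VendorDAGToDAGISel : public llvm::RISCVDAGToDAGISel {
    public:
      using RISCVDAGToDAGISel::RISCVDAGToDAGISel;
      void Select(llvm::SDNode *Node) override {
        // Handle vendor-specific opcodes here, then defer to the base selector.
        RISCVDAGToDAGISel::Select(Node);
      }
    };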
2020-04-01 11:30:21 +08:00
Fangrui Song 5edb40c022 [SelectionDAG] Disallow indirect "i" constraint
This allows us to delete InlineAsm::Constraint_i workarounds in
SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and
TargetLowering::getInlineAsmMemConstraint overrides.

They were introduced to X86 in r237517 to prevent crashes for
constraints like "=*imr". They were later copied to other targets.
2019-12-29 16:50:42 -08:00
Simon Pilgrim 29a5a6eed0 Fix uninitialized variable warning. NFCI. 2019-11-13 14:40:21 +00:00
Luis Marques 2d0cd6cac8 [RISCV] Fix static analysis issues
Unlikely to be problematic but still worth fixing.

Differential Revision: https://reviews.llvm.org/D67640

llvm-svn: 372391
2019-09-20 13:48:02 +00:00
Lewis Revill 7abf863f76 [RISCV] Lower inline asm constraint A for RISC-V
This allows arguments with the constraint A to be lowered to input nodes
for RISC-V, which implies a memory address stored in a register.

This patch adds the minimal amount of code required to get operands with
the right constraints to compile.

https://reviews.llvm.org/D54296

llvm-svn: 369095
2019-08-16 10:28:34 +00:00
Sam Elliott b2c9eed0d7 [RISCV] Support @llvm.readcyclecounter() Intrinsic
On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock
cycles executed by the core, from an arbitrary point in the past. This
matches the intended semantics of `@llvm.readcyclecounter()`, which we
currently leave to the default lowering (to the constant 0).

With this patch, we will now correctly lower this intrinsic to the
intended semantics, using the user-space instruction `rdcycle`. On
64-bit targets, we can directly lower to this instruction.

On 32-bit targets, we need to do more, as `rdcycle` only returns the low
32-bits of the `cycle` CSR. In this case, we perform a custom lowering,
based on the PowerPC lowering, using `rdcycleh` to obtain the high
32-bits of the `cycle` CSR. This custom lowering inserts a new basic
block which detects overflow in the high 32-bits of the `cycle` CSR
during reading (because multiple instructions are required to read). The
emitted assembly matches the suggested assembly in the RISC-V
specification.

Differential Revision: https://reviews.llvm.org/D64125

llvm-svn: 365201
2019-07-05 12:35:21 +00:00
Alex Bradbury 1b9cd446f7 [RISCV][NFC] Add break to case statement in RISCVDAGToDAGISel::Select
The break isn't strictly needed yet as there is no subsequent entry in the
case, but it is added to prevent mistakes further down the road.

llvm-svn: 351785
2019-01-22 07:22:00 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Alex Bradbury bc96a98ed0 [RISCV] Introduce codegen patterns for instructions introduced in RV64I
As discussed in the RFC 
<http://lists.llvm.org/pipermail/llvm-dev/2018-October/126690.html>, 64-bit 
RISC-V has i64 as the only legal integer type. This patch introduces patterns
to support codegen of the new instructions introduced in RV64I: addw, addiw,
subw, sllw, slliw, srlw, srliw, sraw, sraiw, ld, sd.

Custom selection code is needed for srliw as SimplifyDemandedBits will remove 
lower bits from the mask, meaning the obvious pattern won't work:

def : Pat<(sext_inreg (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt), i32),
          (SRLIW GPR:$rs1, uimm5:$shamt)>;

This is sufficient to compile and execute all of the GCC torture suite for
RV64I other than those files using frameaddr or returnaddr intrinsics 
(LegalizeDAG doesn't know how to promote the operands - a future patch 
addresses this).

When promoting i32 sltu/sltiu operands, it would be more efficient to use 
sign-extension rather than zero-extension for RV64. A future patch adds a hook 
to allow this.

Differential Revision: https://reviews.llvm.org/D52977

llvm-svn: 347973
2018-11-30 09:38:44 +00:00
Alex Bradbury 2146e8fb1e [RISCV] Constant materialisation for RV64I
This commit introduces support for materialising 64-bit constants for RV64I,
making use of the RISCVMatInt::generateInstSeq helper in order to share logic
for immediate materialisation with the MC layer (where it's used for the li
pseudoinstruction).

test/CodeGen/RISCV/imm.ll is updated to test RV64, and gains new 64-bit
constant tests. It would be preferable if anyext constant returns were sign
rather than zero extended (see PR39092). This patch simply adds an explicit
signext to the returns in imm.ll.

Further optimisations for constant materialisation are possible, most notably
for mask-like values which can be generated by loading -1 and shifting right.
A future patch will standardise on the C++ codepath for immediate selection on
RV32 as well as RV64, and then add further such optimisations to
RISCVMatInt::generateInstSeq in order to benefit both RV32 and RV64 for
codegen and li expansion.

Differential Revision: https://reviews.llvm.org/D52962

llvm-svn: 347042
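For illustration (not from the patch), a "mask-like" value in this sense is a contiguous run of ones in the low bits, i.e. something that could be produced by loading -1 and shifting right:

    // Illustrative only: detect values of the form ~0ull >> ShAmt.
    #include <cstdint>
    static bool isMaskLike(uint64_t Val, unsigned &ShAmt) {
      if (Val == 0 || (Val & (Val + 1)) != 0)  // low bits must be a solid run of ones
        return false;
      ShAmt = 0;
      for (uint64_t V = Val; !(V >> 63); V <<= 1)
        ++ShAmt;                               // count the leading zero bits
      return true;                             // Val == ~0ull >> ShAmt
    }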
2018-11-16 10:14:16 +00:00
Alex Bradbury 5ac0a2fc48 [RISCV] Handle redundant SplitF64+BuildPairF64 pairs in a DAGCombine
r343712 performed this optimisation during instruction selection. As Eli 
Friedman pointed out in post-commit review, implementing this as a DAGCombine 
might allow opportunities for further optimisations.

llvm-svn: 343741
2018-10-03 23:30:16 +00:00
Alex Bradbury ce9049952f [RISCV][NFCI] Handle redundant splitf64+buildpairf64 pairs during instruction selection
Although we can't write a tablegen pattern to remove redundant 
splitf64+buildpairf64 pairs due to the multiple return values, we can handle it
with some C++ selection code. This is simpler than removing them after 
instruction selection through RISCVDAGToDAGISel::PostprocessISelDAG, as was 
done previously.

llvm-svn: 343712
2018-10-03 20:12:10 +00:00
Alex Bradbury d33ffe9bb1 [RISCV][NFC] Refactor RISCVDAGToDAGISel::Select
Introduce and use a switch on the opcode.

llvm-svn: 343688
2018-10-03 13:13:13 +00:00
Sameer AbuAsal 9b65ffb097 [RISCV] Add machine function pass to merge base + offset
Summary:
   In r333455 we added a peephole to fix the corner cases that result
   from separating base + offset lowering of global address. The
   peephole didn't handle some of the cases because it only has a basic
   block view instead of a function level view.

   This patch replaces that logic with a machine function pass. In
   addition to handling the original cases it handles uses of the global
   address across blocks in the function and folding an offset from LW/SW
   instructions. This pass won't run for OptNone compilation, so there
   will be a negative impact overall vs the old approach at O0.

Reviewers: asb, apazos, mgrang

Reviewed By: asb

Subscribers: MartinMosbeck, brucehoult, the_o, rogfer01, mgorny, rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, llvm-commits, edward-jones

Differential Revision: https://reviews.llvm.org/D47857

llvm-svn: 335786
2018-06-27 20:51:42 +00:00
Sameer AbuAsal 97684419e8 [RISCV] Add peepholes for Global Address lowering patterns
Summary:
  Base and offset are always separated when a GlobalAddress node is lowered
  (rL332641) as an optimization to reduce instruction count. However, this
  optimization is not profitable if the Global Address ends up being used in
  only one instruction.

  This patch adds peephole optimizations that merge an offset of
  an address calculation into the LUI %hi and ADDI %lo of the lowering sequence.

  The peephole handles three patterns:

 1) ADDI (ADDI (LUI %hi(global)) %lo(global)), offset
     --->
      ADDI (LUI %hi(global + offset)) %lo(global + offset).

   This generates:
   lui a0, hi (global + offset)
   add a0, a0, lo (global + offset)

   Instead of

   lui a0, hi (global)
   addi a0, lo (global)
   addi a0, offset

   This pattern is for cases when the offset is small enough to fit in the
   immediate field of ADDI (less than 12 bits).

 2) ADD ((ADDI (LUI %hi(global)) %lo(global)), (LUI hi_offset))
     --->
      offset = hi_offset << 12
      ADDI (LUI %hi(global + offset)) %lo(global + offset)

   Which generates the ASM:

   lui  a0, hi(global + offset)
   addi a0, lo(global + offset)

   Instead of:

   lui  a0, hi(global)
   addi a0, lo(global)
   lui a1, (offset)
   add a0, a0, a1

   This pattern is for cases when the offset doesn't fit in an immediate field
   of ADDI but the lower 12 bits are all zeros.

 3) ADD ((ADDI (LUI %hi(global)) %lo(global)), (ADDI lo_offset, (LUI hi_offset)))
     --->
        offset = global + offhi20<<12 + offlo12
        ADDI (LUI %hi(global + offset)) %lo(global + offset)

   Which generates the ASM:

   lui  a1, %hi(global + offset)
   addi a1, %lo(global + offset)

   Instead of:

   lui  a0, hi(global)
   addi a0, lo(global)
   lui a1, (offhi20)
   addi a1, (offlo12)
   add a0, a0, a1

   This pattern is for cases when the offset doesn't fit in an immediate field
   of ADDI and both the lower 12 bits and the high 20 bits are non-zero.

    Reviewers: asb

    Reviewed By: asb

    Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos,
  niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang

llvm-svn: 333455
2018-05-29 19:34:54 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change DOCS as the regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Craig Topper 781aa181ab Fix a bunch of places where operator-> was used directly on the return from dyn_cast.
Inspired by r331508, I did a grep and found these.

Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa.

llvm-svn: 331577
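An illustrative before/after of the pattern being cleaned up (the node type here is just an example):

    // Calling operator-> directly on dyn_cast dereferences a null pointer if
    // the cast fails; cast<> asserts instead, and isa<> is the right tool when
    // only a boolean type check is wanted.
    #include "llvm/CodeGen/SelectionDAGNodes.h"
    using namespace llvm;
    uint64_t getImm(SDValue Op) {
      // Bad:  dyn_cast<ConstantSDNode>(Op)->getZExtValue();  // crashes if Op isn't a constant
      // Good (when the type is guaranteed by earlier checks):
      return cast<ConstantSDNode>(Op)->getZExtValue();
    }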
2018-05-05 01:57:00 +00:00
Alex Bradbury 099c720426 Revert "[RISCV] implement li pseudo instruction"
Reverts rL330224, while issues with the C extension and missed common
subexpression elimination opportunities are addressed. Neither of these issues
are visible in current RISC-V backend unit tests, which clearly need
expanding.

llvm-svn: 330281
2018-04-18 19:02:31 +00:00
Alex Bradbury 480b7bc906 [RISCV] implement li pseudo instruction
The implementation follows the MIPS backend and expands the
pseudo instruction directly during asm parsing. As a result, only
real MC instructions are emitted to the MCStreamer. Additionally,
PseudoLI instructions are emitted during codegen. The actual
expansion to real instructions is performed during MI to MC lowering
and is similar to the expansion performed by the GNU Assembler.

Differential Revision: https://reviews.llvm.org/D41949
Patch by Mario Werner.

llvm-svn: 330224
2018-04-17 21:56:40 +00:00