Only handle immediates that would produce an ADDI or ADDIW of Lo12
as the final instruction in their materialization.
As the test change show this removes immediates that materialize
with lui+addiw that is not the same as lui+addi.
Offsets in the range [-4095,-2049] or [2048, 4094] are split into
two ADDIs. One of the ADDIs will be folded into the load/store
immediate through an post-isel peephole.
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.
This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126576
When lowering GlobalAddressNodes, we were removing a non-zero offset and
creating a separate ADD.
It already comes out of SelectionDAGBuilder with a separate ADD. The
ADD was being removed by DAGCombiner.
This patch disables the DAG combine so we don't have to reverse it.
Test changes all look to be instruction order changes. Probably due
to different DAG node ordering.
Differential Revision: https://reviews.llvm.org/D126558
This is a followup to D124231.
We can fold the ADDIW in this pattern if we can prove that LUI+ADDI
would have produced the same result as LUI+ADDIW.
This pattern occurs because constant materialization prefers LUI+ADDIW
for all simm32 immediates. Only immediates in the range
0x7ffff800-0x7fffffff require an ADDIW. Other simm32 immediates
work with LUI+ADDI.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124693
Most of the test changes are trivial instruction reorderings and differing
register allocations, without any obvious performance impact.
Differential Revision: https://reviews.llvm.org/D66973
llvm-svn: 372106
As discussed in the RFC
<http://lists.llvm.org/pipermail/llvm-dev/2018-October/126690.html>, 64-bit
RISC-V has i64 as the only legal integer type. This patch introduces patterns
to support codegen of the new instructions
introduced in RV64I: addiw, addiw, subw, sllw, slliw, srlw, srliw, sraw,
sraiw, ld, sd.
Custom selection code is needed for srliw as SimplifyDemandedBits will remove
lower bits from the mask, meaning the obvious pattern won't work:
def : Pat<(sext_inreg (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt), i32),
(SRLIW GPR:$rs1, uimm5:$shamt)>;
This is sufficient to compile and execute all of the GCC torture suite for
RV64I other than those files using frameaddr or returnaddr intrinsics
(LegalizeDAG doesn't know how to promote the operands - a future patch
addresses this).
When promoting i32 sltu/sltiu operands, it would be more efficient to use
sign-extension rather than zero-extension for RV64. A future patch adds a hook
to allow this.
Differential Revision: https://reviews.llvm.org/D52977
llvm-svn: 347973