be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.
movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0
instead of
cvtss2sd (%rdi), %xmm0
An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0
llvm-svn: 91672
The change in SelectionDAGBuilder is needed to allow using bitcasts to convert
between f64 (the default type for ARM "d" registers) and 64-bit Neon vector
types. Radar 7457110.
llvm-svn: 91649
Fold (zext (and x, cst)) -> (and (zext x), cst)
DAG combiner likes to optimize expression in the other way so this would end up cause an infinite looping.
llvm-svn: 91574
in local register allocator. If a reg-reg copy has a phys reg
input and a virt reg output, and this is the last use of the phys
reg, assign the phys reg to the virt reg. If a reg-reg copy has
a phys reg output and we need to reload its spilled input, reload
it directly into the phys reg than passing it through another reg.
Following 76208, there is sometimes no dependency between the def of
a phys reg and its use; this creates a window where that phys reg
can be used for spilling (this is true in linear scan also). This
is bad and needs to be fixed a better way, although 76208 works too
well in practice to be reverted. However, there should normally be
no spilling within inline asm blocks. The patch here goes a long way
towards making this actually be true.
llvm-svn: 91485
1. Only perform (zext (shl (zext x), y)) -> (shl (zext x), y) when y is a constant. This makes sure it remove at least one zest.
2. If the shift is a left shift, make sure the original shift cannot shift out bits.
llvm-svn: 91399
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.
The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.
This patch takes care of a few more cases that r90163 missed.
llvm-svn: 90502
both source operands. In the canonical form, the 2nd operand is changed to an
undef and the shuffle mask is adjusted to only reference elements from the 1st
operand. Radar 7434842.
llvm-svn: 90417
- A valno should be set HasRedefByEC if there is an early clobber def in the middle of its live ranges. It should not be set if the def of the valno is defined by an early clobber.
- If a physical register def is tied to an use and it's an early clobber, it just means the HasRedefByEC is set since it's still one continuous live range.
- Add a couple of missing checks for HasRedefByEC in the coalescer. In general, it should not coalesce a vr with a physical register if the physical register has a early clobber def somewhere. This is overly conservative but that's the price for using such a nasty inline asm "feature".
llvm-svn: 90269