- For now, loads with the [r, r] addressing mode are modeled the same as the
[r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should
identify the former case and reduce the output latency by 1 (a sketch follows
the list below).
- Also identify the [r, r << 2] case. This special form of the shifter
addressing mode is "free".
llvm-svn: 117519
the LDR instructions have. This makes the literal/register forms of the
instructions explicit and allows us to assign scheduling itineraries
appropriately. rdar://8477752
llvm-svn: 117505
explicit about the operands. Split out the different variants into separate
instructions. This gives us the ability to, among other things, assign
different scheduling itineraries to the variants. rdar://8477752.
llvm-svn: 117409
"long latency" enough to hoist even if it may increase spilling. Reloading
a value from spill slot is often cheaper than performing an expensive
computation in the loop. For X86, that means machine LICM will hoist
SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
instructions.
- Enable register pressure aware machine LICM by default.
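The trade-off being weighed, roughly; the names and the comparison below are
illustrative, not the actual MachineLICM hooks:

    // Hoisting an invariant instruction is worthwhile when re-executing it on
    // every iteration costs more than the worst case hoisting can cause,
    // namely one reload from a spill slot.
    static bool hoistDespiteRegisterPressure(unsigned DefCycles,
                                             unsigned ReloadCycles) {
      // e.g. SQRT/DIV on X86, or long-latency VFP/NEON ops on ARM, easily
      // exceed the few cycles a reload from the stack costs.
      return DefCycles > ReloadCycles;
    }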
llvm-svn: 116781
allow targets to correctly compute latency for cases where static scheduling
itineraries aren't sufficient, e.g. variable_ops instructions such as
ARM::ldm.
This also allows targets without scheduling itineraries to compute operand
latencies, e.g. X86 can return (approximate) latencies for high-latency
instructions such as division.
- Compute operand latencies for operands defined by load multiple instructions,
e.g. ldm, and for those used by store multiple instructions, e.g. stm. A hedged
sketch of a target-side override follows below.
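A hedged sketch of such a target-side override; the virtual signature shown and
the latency numbers are assumptions for illustration:

    // A target with no scheduling itineraries can still report a rough latency
    // for the few instructions where it matters and defer to the default
    // implementation for everything else.
    int X86InstrInfo::getOperandLatency(const InstrItineraryData *ItinData,
                                        const MachineInstr *DefMI, unsigned DefIdx,
                                        const MachineInstr *UseMI,
                                        unsigned UseIdx) const {
      switch (DefMI->getOpcode()) {
      case X86::DIV32r:
      case X86::IDIV32r:
        return 20;   // illustrative figure for a high-latency divide
      default:
        return TargetInstrInfo::getOperandLatency(ItinData, DefMI, DefIdx,
                                                  UseMI, UseIdx);
      }
    }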
llvm-svn: 115755
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.
llvm-svn: 115364
cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability.
Adjust CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.
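Roughly, the comparison being modeled (the names, the 50/50 branch split, and
the constants are assumptions, not the in-tree cost model):

    // Predicating a diamond always executes both halves; branching executes
    // one half plus an expected misprediction penalty that grows with
    // pipeline depth (large on Cortex-A8, smaller on Cortex-A9).
    static bool ifConversionProfitable(unsigned TCycles, unsigned FCycles,
                                       unsigned MispredictPenalty,
                                       float MispredictProb) {
      float PredicatedCost = TCycles + FCycles;
      float BranchedCost = 0.5f * (TCycles + FCycles) +
                           MispredictPenalty * MispredictProb;
      return PredicatedCost <= BranchedCost;
    }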
llvm-svn: 114995
Rather than having arbitrary cutoffs, actually try to cost model the conversion.
For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.
llvm-svn: 114973
I am unable to write a test for this case; help is solicited, though...
What I did was to tickle the code in the debugger and verify that we do the right thing.
llvm-svn: 114430
into OptimizeCompareInstr.
This necessitates passing CmpValue around,
so widen the virtual functions to accommodate it.
No functionality changes.
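For reference, the rough shape of the widened hooks; the exact parameter lists
here are assumed for illustration rather than copied from the tree:

    // AnalyzeCompare additionally reports the value the comparison is made
    // against, and OptimizeCompareInstr receives it so it can decide whether
    // a compare against zero may be folded into a flag-setting instruction.
    virtual bool AnalyzeCompare(const MachineInstr *MI, unsigned &SrcReg,
                                int &CmpMask, int &CmpValue) const;
    virtual bool OptimizeCompareInstr(MachineInstr *CmpInstr, unsigned SrcReg,
                                      int CmpMask, int CmpValue,
                                      const MachineRegisterInfo *MRI) const;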
llvm-svn: 114428
Recognize VLD1q64Pseudo as a stack slot load.
Reject these if they are loading or storing a subregister. The API (and
VirtRegRewriter) doesn't know how to deal with that.
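A hedged sketch of the shape of that check; the operand layout of the pseudo is
an assumption here, not the exact in-tree code:

    // Treat VLD1q64Pseudo with a frame-index address as a stack slot load, but
    // bail out when only a subregister is defined: neither this API nor
    // VirtRegRewriter can describe a partial reload.
    unsigned ARMBaseInstrInfo::isLoadFromStackSlot(const MachineInstr *MI,
                                                   int &FrameIndex) const {
      if (MI->getOpcode() == ARM::VLD1q64Pseudo) {
        if (MI->getOperand(0).getSubReg())    // partial (subregister) def: reject
          return 0;
        if (MI->getOperand(1).isFI()) {       // addrmode base (assumed index)
          FrameIndex = MI->getOperand(1).getIndex();
          return MI->getOperand(0).getReg();
        }
      }
      return 0;
    }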
llvm-svn: 113985
encountered while building llvm-gcc for arm. This is probably the same issue
that the ppc buildbot hit. llvm::prior works on a MachineBasicBlock::iterator,
not a plain MachineInstr.
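A minimal illustration of the distinction, assuming a tree of this vintage where
llvm::prior lives in llvm/ADT/STLExtras.h:

    // prior() takes and returns an iterator; a raw MachineInstr* is not one.
    static MachineInstr *instrBefore(MachineBasicBlock &MBB,
                                     MachineBasicBlock::iterator I) {
      if (I == MBB.begin())
        return 0;                 // nothing precedes the first instruction
      return &*llvm::prior(I);    // fine: iterator in, iterator out
    }
    // By contrast, llvm::prior(SomeMachineInstrPtr) does not compile.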
llvm-svn: 113983
backing out the following to get it back to green,
so I can investigate in peace:
svn merge -c -113840 llvm/test/CodeGen/ARM/arm-and-tst-peephole.ll
svn merge -c -113876 -c -113839 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm-svn: 113980
by morphing the 'and' to its recording form 'andS'.
This is basically a test commit into this area, to
see whether the bots like me. Several generalizations
can be applied and various avenues of code simplification
are open. I'll introduce those as I go.
I am aware of stylistic input from Bill Wendling, about
where to put the analysis complexity, but I am positive
that we can move things around easily and will find a
satisfactory solution.
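A hedged sketch of the morph itself; the operand index of the optional CPSR def
and the surrounding plumbing are assumptions for illustration:

    // Make the 'and' flag-setting ('ands') and delete the now-redundant
    // tst/cmp against zero that consumed its result.
    static void morphAndToAndS(MachineInstr *AndMI, MachineInstr *CmpMI) {
      MachineOperand &CCOut = AndMI->getOperand(5); // optional cc_out (assumed index)
      CCOut.setReg(ARM::CPSR);
      CCOut.setIsDef(true);
      CmpMI->eraseFromParent();
    }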
llvm-svn: 113839
iterator when an optimization took place. This allows us to do more insane
things with the code than just removing an instruction or two.
llvm-svn: 113640