register for local access when it's closer to the stack slot being refererenced
than the stack pointer. Make sure to take into account any argument frame
SP adjustments that are in affect at the time.
rdar://8256090
llvm-svn: 110366
simplify the call frame pseudo instructions. In that situation, the
calculations for estimating the stack size will be way off, leading to
not having an emergency spill slot when we need one. It should be possible
to be more precise about tracking the adjustment values, but not really
necessary for correctness. Upcoming cleanups for PEI in general will
render that moot.
llvm-svn: 110258
Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding.
Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed.
llvm-svn: 110152
reference registers past the end of the NEON register file, and report them
as invalid instead of asserting when trying to print them. PR7746.
llvm-svn: 109933
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
mls r0,r9,r0,sp
instead of:
mov r2, sp
mls r0, r9, r0, r2
This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.
PR7499
llvm-svn: 109842
integers with mov + vdup. 8003375. This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).
llvm-svn: 109799
This assumption is not satisfied due to global mergeing.
Workaround the issue by temporary disablinge mergeing of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716
llvm-svn: 109423
comments explaining why it was wrong. 8225024.
Fix the real problem in 8213383: the code that splits very large
blocks when no other place to put constants can be found was not
considering the case that the block contained a Thumb tablejump.
llvm-svn: 109282
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win on # of instructions. It's runtime
neutral for most of the tests. But for some kernels with high register
pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
54 and sped up by 20%.
llvm-svn: 109279
ARM/PPC/MSP430-specific code (which are the only targets that
implement the hook) can directly reference their target-specific
instrinfo classes.
llvm-svn: 109171
This is probably not the best way to implement "Force LR to
be spilled if the Thumb function size is > 2048." do this,
it should use the branch shortening infrastructure, but I'm
just preserving functionality here.
llvm-svn: 109165
mov pc, r1
.align 2
LJTI0_0_0:
.long LBB0_14
This fixes rdar://8213383. No test case since it's not possible to come up with a suitable small one.
llvm-svn: 109076
it should set the jump table encloding the EK_Inline. This prevents
a second, unused, copy of the table from being emitted after the function
body. PR6581.
llvm-svn: 108730
it should set the jump table encloding the EK_Inline. This prevents
a second, unused, copy of the table from being emitted after the function
body. PR7499.
llvm-svn: 108722
- Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this.
llvm-svn: 108664
stack realignment on ARM.
Also check for function attributes as we do on X86 as well as
make explicit that we're checking can as well as needs in this function.
llvm-svn: 108582
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
llvm-svn: 108465
instructions use different values (e.g., 2-byte or 4-byte alignment).
Also fix ARMInstPrinter to print these alignments as bits instead of bytes.
llvm-svn: 108386
instructions already have implicit defs of LR. The comment suggests that
this is intended to fix something like pr6111, but it doesn't really do
that either.
llvm-svn: 108186
The only folding these load/store architectures can do is converting COPY into a
load or store, and the target independent part of foldMemoryOperand already
knows how to do that.
llvm-svn: 108099
correct alignment information, which simplifies ExpandRes_VAARG a bit.
The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:
* The 's' in target data: If this is set to the minimal alignment of any
argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
example.
* The getTransientStackAlignment method. It is possible for an architecture to
have argument less aligned than what we maintain the stack pointer.
llvm-svn: 108072