Commit Graph

2875 Commits

Author SHA1 Message Date
Jim Grosbach f50693d1ab For local variables in functions with a frame pointer, use FP as a base
register for local access when it's closer to the stack slot being refererenced
than the stack pointer. Make sure to take into account any argument frame
SP adjustments that are in affect at the time.

rdar://8256090

llvm-svn: 110366
2010-08-05 19:27:37 +00:00
Bob Wilson b1021395b8 Fix indentation.
llvm-svn: 110363
2010-08-05 19:00:21 +00:00
Bob Wilson 72de307116 Add an ARM RSCrr instruction for disassembly only.
Partial fix for PR7792.

llvm-svn: 110361
2010-08-05 18:59:36 +00:00
Bob Wilson adb93e56a3 Add an ARM RSBrr instruction for disassembly only.
Partial fix for PR7792.

llvm-svn: 110358
2010-08-05 18:23:43 +00:00
Chandler Carruth e6ca1cfef7 Silence a GCC warning about && and || without explicit parentheses. This
preserves the existing behavior, as it seems a concious choice to allow RS to
be null and BigStack marked true.

llvm-svn: 110307
2010-08-05 03:04:21 +00:00
Bob Wilson 97886d59d1 ARM "rrx" shift operands do not have an immediate. PR7790.
llvm-svn: 110292
2010-08-05 00:34:42 +00:00
Jim Grosbach 8aaadea8ef and back in. false alarm on the tests from another unrelated local change.
llvm-svn: 110269
2010-08-04 22:46:09 +00:00
Devang Patel a52ddc496a Implement target specific getDebugValueLocation().
llvm-svn: 110267
2010-08-04 22:39:39 +00:00
Jim Grosbach 8732d966e1 oops. revert for a moment to clean up tests first.
llvm-svn: 110259
2010-08-04 22:12:43 +00:00
Jim Grosbach 22be317fe4 Reserve a stack slot if the function adjusts the stack but doesn't
simplify the call frame pseudo instructions. In that situation, the
calculations for estimating the stack size will be way off, leading to
not having an emergency spill slot when we need one. It should be possible
to be more precise about tracking the adjustment values, but not really
necessary for correctness. Upcoming cleanups for PEI in general will
render that moot.

llvm-svn: 110258
2010-08-04 22:10:15 +00:00
Dale Johannesen 21f13209f8 Remove switch for disabling ARM tail calls. They
seem to be working correctly.  No functional change.

llvm-svn: 110226
2010-08-04 18:07:17 +00:00
Bob Wilson 79daf7e0ae Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA
(absolute difference with accumulate) intrinsics.  Radar 8228576.

llvm-svn: 110170
2010-08-04 00:12:08 +00:00
Nate Begeman b69b182191 Add support for getting & setting the FPSCR application register on ARM when VFP is enabled.
Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding.
Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed.

llvm-svn: 110152
2010-08-03 21:31:55 +00:00
Daniel Dunbar 727be43a3d Silence some -Asserts uninitialized variable warnings.
llvm-svn: 109956
2010-07-31 21:08:54 +00:00
Bob Wilson b128824b60 Move newlines before inline jumptables from the asm strings in .td files to
the jtblock_operand print methods.  This avoids extra newlines in the
disassembler's output.  PR7757.

llvm-svn: 109948
2010-07-31 06:28:10 +00:00
Bob Wilson cd5fc7bef1 Add support for disassembling VMVN (immediate) instructions. PR7747.
llvm-svn: 109946
2010-07-31 05:57:44 +00:00
Evan Cheng 59069ec784 Add -disable-shifter-op to disable isel of shifter ops. On Cortex-a9 the shifts cost extra instructions so it might be better to emit them separately to take advantage of dual-issues.
llvm-svn: 109934
2010-07-30 23:33:54 +00:00
Bob Wilson eb7b21f3eb Add a check in the ARM disassembler for NEON instructions that would
reference registers past the end of the NEON register file, and report them
as invalid instead of asserting when trying to print them.  PR7746.

llvm-svn: 109933
2010-07-30 23:27:59 +00:00
Bob Wilson 4320e2d1bb Add the __TEXT,__StaticInit section to the list of sections emitted at the
beginning on ARM Darwin assembly files so that it won't be placed after
debug sections.  Radar 8252813.

llvm-svn: 109879
2010-07-30 19:55:47 +00:00
Jim Grosbach d343166a0b Many Thumb2 instructions can reference the full ARM register set (i.e.,
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0,r9,r0,sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499

llvm-svn: 109842
2010-07-30 02:41:01 +00:00
Nate Begeman c4a96c0e8c Add builtins for ssat/usat, similar to RealView's __ssat and __usat intrinsics.
llvm-svn: 109813
2010-07-29 22:48:09 +00:00
Bob Wilson 728eb292eb Refactor ARM-specific DAG combining in preparation for adding some more
transformations.

llvm-svn: 109800
2010-07-29 20:34:14 +00:00
Dale Johannesen 2bff50546c Implement vector constants which are splat of
integers with mov + vdup.  8003375.  This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).

llvm-svn: 109799
2010-07-29 20:10:08 +00:00
Bob Wilson a9bf1b1493 Don't assert on an unrecognized BrMiscFrm instruction.
PR7745.

llvm-svn: 109788
2010-07-29 18:29:28 +00:00
Nate Begeman 7010a71ac4 Add intrinsics __builtin_arm_qadd & __builtin_arm_qsub to allow access to the QADD & QSUB instructions.
Behave identically to __qadd & __qsub RealView instruction intrinsics.

llvm-svn: 109770
2010-07-29 17:56:55 +00:00
Jim Grosbach c445a7d29b ARM mode version of r109693. Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138
llvm-svn: 109696
2010-07-28 23:25:44 +00:00
Jim Grosbach 716a596cf7 Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138
llvm-svn: 109693
2010-07-28 23:17:45 +00:00
Jim Grosbach de0874a4bc Remove dead prototype
llvm-svn: 109691
2010-07-28 23:16:12 +00:00
Eli Friedman f902befe8e And a bit more non-ASCII stuff.
llvm-svn: 109458
2010-07-26 22:28:18 +00:00
Anton Korobeynikov 1e0d76bfd1 Drop some non-ascii stuff
llvm-svn: 109456
2010-07-26 22:23:07 +00:00
Anton Korobeynikov b61a6f2742 Add a note
llvm-svn: 109448
2010-07-26 21:48:35 +00:00
Anton Korobeynikov 6bcea068db Currently EH lowering code expects typeinfo to be global only.
This assumption is not satisfied due to global mergeing.
Workaround the issue by temporary disablinge mergeing of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716

llvm-svn: 109423
2010-07-26 18:45:39 +00:00
Evan Cheng 23b05d1cf5 ARM fastisel isn't ready.
llvm-svn: 109421
2010-07-26 18:32:55 +00:00
Douglas Gregor 8f452bc291 Remove extraneous semicolon
llvm-svn: 109373
2010-07-25 17:34:42 +00:00
Douglas Gregor 8fcfe7aa51 Unbreak CMake build
llvm-svn: 109372
2010-07-25 17:10:14 +00:00
Anton Korobeynikov 19edda0323 Hook in GlobalMerge pass
llvm-svn: 109359
2010-07-24 21:52:08 +00:00
Jim Grosbach 0acbcb1a60 Use the appropriate register class for an i32 when adding ARM::LR to the
function live in set. This will give us tGPR for Thumb1 and GPR otherwise,
so the copy will be spillable. rdar://8224931

llvm-svn: 109293
2010-07-23 23:50:35 +00:00
Dale Johannesen c17dd5790b Revert 109076. It is wrong and was causing regressions. Add some
comments explaining why it was wrong.  8225024.

Fix the real problem in 8213383: the code that splits very large
blocks when no other place to put constants can be found was not
considering the case that the block contained a Thumb tablejump.

llvm-svn: 109282
2010-07-23 22:50:23 +00:00
Evan Cheng df907f4594 - Allow target to specify when is register pressure "too high". In most cases,
it's too late to start backing off aggressive latency scheduling when most
  of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
  For ARM, this is almost always a win on # of instructions. It's runtime
  neutral for most of the tests. But for some kernels with high register
  pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
  54 and sped up by 20%.

llvm-svn: 109279
2010-07-23 22:39:59 +00:00
Chris Lattner 749ca32da1 eliminate the TargetInstrInfo::GetInstSizeInBytes hook.
ARM/PPC/MSP430-specific code (which are the only targets that
implement the hook) can directly reference their target-specific
instrinfo classes.

llvm-svn: 109171
2010-07-22 21:27:00 +00:00
Chris Lattner dab6888bb1 switch a private implementation of GetFunctionSizeInBytes.
This is probably not the best way to implement "Force LR to 
be spilled if the Thumb function size is > 2048." do this, 
it should use the branch shortening infrastructure, but I'm
just preserving functionality here.

llvm-svn: 109165
2010-07-22 21:14:33 +00:00
Xerxes Ranby ff66cd43c4 ARMv4 JIT forgets to set the lr register when making a indirect function call. Fixes PR7608
llvm-svn: 109125
2010-07-22 17:28:34 +00:00
Chandler Carruth a1d7516cb7 Mark an assert-only variable as used.
llvm-svn: 109091
2010-07-22 08:02:25 +00:00
Chandler Carruth 2f8db38bb3 Fix the generated file name for CMake.
llvm-svn: 109090
2010-07-22 08:00:52 +00:00
Chandler Carruth 3180f9f55f Attempt to fix linking issues with CMake. Please review other CMake users,
especially on other platforms. Is there a better way to fix this.

llvm-svn: 109084
2010-07-22 06:27:45 +00:00
Owen Anderson 14646cc074 Update CMake files.
llvm-svn: 109081
2010-07-22 06:00:01 +00:00
Evan Cheng 3fabe07d4c Fix constant island pass's handling of tBR_JTr. The offset of the instruction does not have to be 4-byte aligned. Rather, it's the offset + 2 that must be aligned since the instruction expands into:
mov     pc, r1
        .align  2
LJTI0_0_0:
        .long    LBB0_14

This fixes rdar://8213383. No test case since it's not possible to come up with a suitable small one.

llvm-svn: 109076
2010-07-22 02:09:47 +00:00
Evan Cheng 285903853f More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Jim Grosbach 965a73a28c For ARM/Darwin, add a dwarf entry indicating whether a function is arm or thumb
rdar://8202967

llvm-svn: 109057
2010-07-21 23:03:52 +00:00
Eric Christopher 84bdfd80df Baby steps towards ARM fast-isel.
llvm-svn: 109047
2010-07-21 22:26:11 +00:00
Rafael Espindola 4277e14dc4 Fix calling convention on ARM if vfp2+ is enabled.
llvm-svn: 109009
2010-07-21 11:38:30 +00:00
Evan Cheng a77f3d3b37 Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
llvm-svn: 108991
2010-07-21 06:09:07 +00:00
Chris Lattner cbe9856fce prune #includes a little.
llvm-svn: 108929
2010-07-20 21:17:29 +00:00
Jim Grosbach 3680f70c9d Using BIC for immediates needs an extra bump for its complexity to get
instruction selection to prefer it when possible. rdar://7903972

llvm-svn: 108844
2010-07-20 16:07:04 +00:00
Jim Grosbach 9c7708cc1b Removed un-used code.
llvm-svn: 108841
2010-07-20 14:51:32 +00:00
Eric Christopher 4adaccf0bf Constify some arguments.
llvm-svn: 108812
2010-07-20 06:52:21 +00:00
Daniel Dunbar 0aff8033c6 Update CMake files.
llvm-svn: 108787
2010-07-20 00:08:13 +00:00
Chris Lattner b792b463af sink the arm implementations of ASmPrinter and MCInstLower
out of the AsmPrinter directory into libarm.  Now the
ARM InstPrinters depend jsut on the MC stuff, not on vmcore
or codegen.

llvm-svn: 108783
2010-07-19 23:44:46 +00:00
Evan Cheng 10f99a3490 ARM has to provide its own TargetLowering::findRepresentativeClass because its scalar floating point registers alias its vector registers.
llvm-svn: 108761
2010-07-19 22:15:08 +00:00
Jim Grosbach 8d3ba7349c Since ARM emits inline jump tables as part of the ConstantIsland pass,
it should set the jump table encloding the EK_Inline. This prevents
a second, unused, copy of the table from being emitted after the function
body. PR6581.

llvm-svn: 108730
2010-07-19 17:20:38 +00:00
Jim Grosbach d9ad52adff revert so I can get the right PR# in the log message.
llvm-svn: 108727
2010-07-19 17:19:40 +00:00
Jim Grosbach c685756cfb Since ARM emits inline jump tables as part of the ConstantIsland pass,
it should set the jump table encloding the EK_Inline. This prevents
a second, unused, copy of the table from being emitted after the function
body. PR7499.

llvm-svn: 108722
2010-07-19 17:18:28 +00:00
Daniel Dunbar 419197cc4d Target: Give the TargetAsmParser access to the TargetMachine.
- Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this.

llvm-svn: 108664
2010-07-19 00:33:49 +00:00
Jim Grosbach b97e2bbe32 Add combiner patterns to more effectively utilize the BFI (bitfield insert)
instruction for non-constant operands. This includes the case referenced
in the README.txt regarding a bitfield copy.

llvm-svn: 108608
2010-07-17 03:30:54 +00:00
Jim Grosbach 6e3b5fa91c add BFI to getTargetNodeName()
llvm-svn: 108603
2010-07-17 01:50:57 +00:00
Jim Grosbach adc81f8ee8 Fix logic think-o
llvm-svn: 108601
2010-07-17 01:22:19 +00:00
Eric Christopher 83f250f005 Remove unnecessary check that was subsumed into canRealignStack.
llvm-svn: 108588
2010-07-17 00:33:04 +00:00
Eric Christopher 24e3aa011a Make more explicit and add some currently disabled error messages for
stack realignment on ARM.

Also check for function attributes as we do on X86 as well as
make explicit that we're checking can as well as needs in this function.

llvm-svn: 108582
2010-07-17 00:27:24 +00:00
Jim Grosbach 11013eda5a Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction
and a combine pattern to use it for setting a bit-field to a constant
value. More to come for non-constant stores.

llvm-svn: 108570
2010-07-16 23:05:05 +00:00
Jakob Stoklund Olesen 8289f78569 Remove the isMoveInstr() hook.
llvm-svn: 108567
2010-07-16 22:35:46 +00:00
Jakob Stoklund Olesen 54bcf5049e Use a small local function for a single remaining late isMoveInstr call in
Thumb2ITBlockPass.

llvm-svn: 108564
2010-07-16 22:35:32 +00:00
Bill Wendling 499f797cdd Rename DBG_LABEL PROLOG_LABEL, because it's only used during prolog emission and
thus is a much more meaningful name.

llvm-svn: 108563
2010-07-16 22:20:36 +00:00
Evan Cheng 55f0c6b9fc Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Eli Friedman e4be4308a9 Random note about bswap.
llvm-svn: 108396
2010-07-15 02:20:38 +00:00
Bob Wilson 0b9aafddc5 Remove restriction on NEON alignment values. Some of the NEON ld/st
instructions use different values (e.g., 2-byte or 4-byte alignment).
Also fix ARMInstPrinter to print these alignments as bits instead of bytes.

llvm-svn: 108386
2010-07-14 23:54:43 +00:00
Benjamin Kramer 92d8998348 Don't pass StringRef by reference.
llvm-svn: 108366
2010-07-14 22:38:02 +00:00
Jim Grosbach a90af1ba38 Improve 64-subtraction of immediates when parts of the immediate can fit
in the literal field of an instruction. E.g.,
long long foo(long long a) {
  return a - 734439407618LL;
}

rdar://7038284

llvm-svn: 108339
2010-07-14 17:45:16 +00:00
Bob Wilson 1aef53403f Add missing address register update to t2LDM_RET instruction.
Patch by Brian Lucas. PR7636.

llvm-svn: 108332
2010-07-14 16:02:13 +00:00
Eli Friedman c4d70125ee A couple potential optimizations inspired by comment 4 in PR6773.
llvm-svn: 108328
2010-07-14 06:58:26 +00:00
Bob Wilson bad47f62f6 Add support for NEON VMVN immediate instructions.
llvm-svn: 108324
2010-07-14 06:31:50 +00:00
Bob Wilson bd54a53628 The bits in the cmode field of 32-bit VMOV immediate instructions all depend
of the value of the immediate.

llvm-svn: 108323
2010-07-14 06:30:44 +00:00
Bob Wilson 103a0dcfe1 Add an ARM-specific DAG combining to avoid redundant VDUPLANE nodes.
Radar 7373643.

llvm-svn: 108303
2010-07-14 01:22:12 +00:00
Bob Wilson a3f1901531 Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent
NEON VMOV-immediate instructions.  This simplifies some things.

llvm-svn: 108275
2010-07-13 21:16:48 +00:00
Evan Cheng 0cc4ad983d Extend the r107852 optimization which turns some fp compare to code sequence using only i32 operations. It now optimize some f64 compares when fp compare is exceptionally slow (e.g. cortex-a8). It also catches comparison against 0.0.
llvm-svn: 108258
2010-07-13 19:27:42 +00:00
Evan Cheng 58066e337d Add an ARM "feature". Cortex-a8 fp comparison is very slow (> 20 cycles).
llvm-svn: 108256
2010-07-13 19:21:50 +00:00
Bob Wilson c1c6f4796e Move NEON "modified immediate" encode/decode into ARMAddressingModes.h to
avoid replicated code.

llvm-svn: 108227
2010-07-13 04:44:34 +00:00
Bob Wilson 8a2bdc8231 Remove some code that doesn't appear to do anything. All the ARM call
instructions already have implicit defs of LR.  The comment suggests that
this is intended to fix something like pr6111, but it doesn't really do
that either.

llvm-svn: 108186
2010-07-12 20:22:45 +00:00
Duncan Sands 41b4a6b36a Convert some tab stops into spaces.
llvm-svn: 108130
2010-07-12 08:16:59 +00:00
Jakob Stoklund Olesen 0961c55161 RISC architectures get their memory operand folding for free.
The only folding these load/store architectures can do is converting COPY into a
load or store, and the target independent part of foldMemoryOperand already
knows how to do that.

llvm-svn: 108099
2010-07-11 19:19:13 +00:00
Rafael Espindola 1da1cfccb1 Make getPhysicalRegisterRegClass non-virtual. Should be able to remove it soon.
llvm-svn: 108094
2010-07-11 16:49:10 +00:00
Jakob Stoklund Olesen d7b33002dd Replace copyRegToReg with copyPhysReg for ARM.
llvm-svn: 108078
2010-07-11 06:33:54 +00:00
Rafael Espindola a76eccf815 Fix va_arg for doubles. With this patch VAARG nodes always contain the
correct alignment information, which simplifies ExpandRes_VAARG a bit.

The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:

* The 's' in target data: If this is set to the minimal alignment of any
  argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
  example.
* The getTransientStackAlignment method. It is possible for an architecture to
  have argument less aligned than what we maintain the stack pointer.

llvm-svn: 108072
2010-07-11 04:01:49 +00:00
Chandler Carruth d162d85688 Add parentheses yet again to satisfy GCC's warnings.
llvm-svn: 108043
2010-07-10 12:06:22 +00:00
Jakob Stoklund Olesen 7a7b55eb67 Automatically fold COPY instructions into stack load/store.
llvm-svn: 108012
2010-07-09 20:43:13 +00:00
Jim Grosbach 2a5725b1a3 In the presence of variable sized objects, allocate an emergency spill slot.
rdar://8131327

llvm-svn: 108008
2010-07-09 20:27:06 +00:00
Bob Wilson 88a4e6dc0e Print "dregpair" NEON operands with a space between them, for readability and
consistency with other instructions that have lists of register operands.

llvm-svn: 107944
2010-07-09 00:47:20 +00:00
Evan Cheng 0f54854a1d Check for FiniteOnlyFPMath as well.
llvm-svn: 107904
2010-07-08 20:12:24 +00:00
Bob Wilson 181e5af248 The NEONPreAllocPass should never have to assign fixed registers anymore.
This pass can go away entirely soon.

llvm-svn: 107892
2010-07-08 17:45:26 +00:00
Bob Wilson 1eade1a327 For big-endian systems, VLD2/VST2 with 32-bit vector elements will swap the
words within the 64-bit D registers.  Use VLD1/VST1 with 64-bit elements
instead.

llvm-svn: 107890
2010-07-08 17:44:00 +00:00
Bob Wilson 6c25043493 Clean up a comment.
llvm-svn: 107882
2010-07-08 16:54:45 +00:00