Commit Graph

4802 Commits

Author SHA1 Message Date
Bruno Cardoso Lopes 3c4d652210 We already support 256-bit packed ADD, SUB, DIV, MUL. Add testcases.
llvm-svn: 135099
2011-07-13 22:28:55 +00:00
Bruno Cardoso Lopes 9613b64916 Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more
general version of X86ISD::ANDNP also opened the room for a little bit
of refactoring.

llvm-svn: 135088
2011-07-13 21:36:51 +00:00
Eli Friedman 344ec79715 Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile.
<rdar://problem/9763308>

llvm-svn: 135084
2011-07-13 21:29:53 +00:00
Bruno Cardoso Lopes 1021b4a9dd AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd
llvm-svn: 135023
2011-07-13 01:15:33 +00:00
Evan Cheng f863e3fb73 Improve codegen for select's:
if (x != 0) x = 1
if (x == 1) x = 1

Previous codegen looks like this:
        mov     r1, r0
        cmp     r1, #1
        mov     r0, #0
        moveq   r0, #1

The naive lowering select between two different values. It should recognize the
test is equality test so it's more a conditional move rather than a select:
        cmp     r0, #1
        movne   r0, #0

rdar://9758317

llvm-svn: 135017
2011-07-13 00:42:17 +00:00
Jim Grosbach ade1fb17d4 Improve test cases from r134746.
Use memory barriers to force if-conversion off for these tests instead of
the internal llc command line option ifcvt-limit.

llvm-svn: 134986
2011-07-12 16:06:01 +00:00
Andrew Trick 1b9d9b6b7a Comment correction.
llvm-svn: 134958
2011-07-12 03:39:22 +00:00
Jim Grosbach 581da64241 Simplify printing of ARM shifted immediates.
Print shifted immediate values directly rather than as a payload+shifter
value pair. This makes for more readable output assembly code, simplifies
the instruction printer, and is consistent with how Thumb immediates are
 displayed.

llvm-svn: 134902
2011-07-11 16:48:36 +00:00
NAKAMURA Takumi 5b304432bb test/CodeGen/PowerPC/vector.ll: Tweak redirection >%t >%t to >%t >>%t. See also r134814 (test/CodeGen/X86/vector.ll).
llvm-svn: 134900
2011-07-11 16:21:52 +00:00
Cameron Zwarich 61715740cd Add a missing test for r134882.
llvm-svn: 134889
2011-07-11 08:35:17 +00:00
Chris Lattner b1ed91f397 Land the long talked about "type system rewrite" patch. This
patch brings numerous advantages to LLVM.  One way to look at it
is through diffstat:
 109 files changed, 3005 insertions(+), 5906 deletions(-)

Removing almost 3K lines of code is a good thing.  Other advantages
include:

1. Value::getType() is a simple load that can be CSE'd, not a mutating
   union-find operation.
2. Types a uniqued and never move once created, defining away PATypeHolder.
3. Structs can be "named" now, and their name is part of the identity that
   uniques them.  This means that the compiler doesn't merge them structurally
   which makes the IR much less confusing.
4. Now that there is no way to get a cycle in a type graph without a named
   struct type, "upreferences" go away.
5. Type refinement is completely gone, which should make LTO much MUCH faster
   in some common cases with C++ code.
6. Types are now generally immutable, so we can use "Type *" instead 
   "const Type *" everywhere.

Downsides of this patch are that it removes some functions from the C API,
so people using those will have to upgrade to (not yet added) new API.  
"LLVM 3.0" is the right time to do this.

There are still some cleanups pending after this, this patch is large enough
as-is.

llvm-svn: 134829
2011-07-09 17:41:24 +00:00
Chris Lattner 2522d2df06 more tests not making the jump into the brave new world.
llvm-svn: 134820
2011-07-09 16:57:10 +00:00
NAKAMURA Takumi 9c6a679d3b test/CodeGen/X86/vector.ll: Tweak temporary output to appease Win32 hosts.
With Lit (not bash) in a test, multiple redirects >%t might open(%t, "w") multiple. It can be avoided if latter redirect is >>%t.

It might work even if ">/dev/null" were used.

llvm-svn: 134814
2011-07-09 10:22:28 +00:00
Jakob Stoklund Olesen bf6afec312 Hoist spills within a basic block.
Try to move spills as early as possible in their basic block. This can
help eliminate interferences by shortening the live range being
spilled.

This fixes PR10221.

llvm-svn: 134776
2011-07-09 00:25:03 +00:00
Evan Cheng fd7e3fcad3 Fix broken x86_64 tests which specify non-64-bit cpu's.
llvm-svn: 134756
2011-07-08 22:29:33 +00:00
Eli Friedman e2f76c4ade Default 64-bit target features and SSE2 on when a triple specifies x86-64. Clean up all the other hacks which are now unnecessary.
llvm-svn: 134753
2011-07-08 22:16:47 +00:00
Jim Grosbach 7471937ad7 Make tBX_RET and tBX_RET_vararg predicable.
The normal tBX instruction is predicable, so there's no reason the
pseudos for using it as a return shouldn't be. Gives us some nice code-gen
improvements as can be seen by the test changes. In particular, several
tests now have to disable if-conversion because it works too well and defeats
the test.

llvm-svn: 134746
2011-07-08 21:50:04 +00:00
Julien Lerouge 112fcc164a Add _allrem, _aullrem and _allmul to the runtime for MSVC.
http://llvm.org/bugs/show_bug.cgi?id=10305

llvm-svn: 134744
2011-07-08 21:40:25 +00:00
Cameron Zwarich f03fa189ca Add an intrinsic and codegen support for fused multiply-accumulate. The intent
is to use this for architectures that have a native FMA instruction.

llvm-svn: 134742
2011-07-08 21:39:21 +00:00
Jakob Stoklund Olesen 4931bbc671 Be more aggressive about following hints.
RAGreedy::tryAssign will now evict interference from the preferred
register even when another register is free.

To support this, add the EvictionCost struct that counts how many hints
are broken by an eviction. We don't want to break one hint just to
satisfy another.

Rename canEvict to shouldEvict, and add the first bit of eviction policy
that doesn't depend on spill weights: Always make room in the preferred
register as long as the evictees can be split and aren't already
assigned to their preferred register.

Also make the CSR avoidance more accurate. When looking for a cheaper
register it is OK to use a new volatile register. Only CSR aliases that
have never been used before should be avoided.

llvm-svn: 134735
2011-07-08 20:46:18 +00:00
Jim Grosbach dbfb29d6c0 Use ARMPseudoExpand for ARM tail calls.
llvm-svn: 134719
2011-07-08 18:50:22 +00:00
Benjamin Kramer 9960a25006 Emit a more efficient magic number multiplication for exact sdivs.
We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building.

  struct foo { char x[24]; };
  long bar(struct foo *a, struct foo *b) { return a-b; }
is now compiled into
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  sarl	$3, %eax
  imull	$-1431655765, %eax, %eax
instead of
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  movl	$715827883, %ecx
  imull	%ecx
  movl	%edx, %eax
  shrl	$31, %eax
  sarl	$2, %edx
  addl	%eax, %edx
  movl	%edx, %eax

llvm-svn: 134695
2011-07-08 10:31:30 +00:00
Jakob Stoklund Olesen bbe2a5cfff Fix more register allocation sensitive tests.
llvm-svn: 134667
2011-07-08 00:24:06 +00:00
Jakob Stoklund Olesen b138c44276 Remove a test that no longer makes sense.
It was testing a linear scan feature:

  Test if linearscan is unfavoring registers for allocation to allow
  more reuse of reloads from stack slots.

The greedy register allocator doesn't access any stack slots in this
function, so the linear scan feature was not being tested.

llvm-svn: 134666
2011-07-08 00:24:03 +00:00
Nick Lewycky 9badf60203 Let the inline asm 'q' constraint match float, and on 64-bit double too.
Fixes PR9602!

llvm-svn: 134665
2011-07-08 00:19:27 +00:00
Eric Christopher 7a2a0f80de Go ahead and emit the barrier on x86-64 even without sse2. The
processor supports it just fine.

Fixes PR9675 and rdar://9740801

llvm-svn: 134664
2011-07-08 00:04:56 +00:00
Eric Christopher 9721396dab Add support for the X86 'l' constraint.
Fixes PR10149 and rdar://9738585

llvm-svn: 134648
2011-07-07 22:29:07 +00:00
Evan Cheng 13bcc6c1c7 Add Mode64Bit feature and sink it down to MC layer.
llvm-svn: 134641
2011-07-07 21:06:52 +00:00
Evan Cheng 8b2bda09a5 Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests.
llvm-svn: 134590
2011-07-07 03:55:05 +00:00
Lang Hames 2bbdc0bcda Added a testcase for PR10220.
llvm-svn: 134573
2011-07-07 00:36:02 +00:00
Jakub Staszak 3f158fdf6e Introduce "expect" intrinsic instructions.
llvm-svn: 134516
2011-07-06 18:22:43 +00:00
Dan Gohman ad3e8fda3b Revert r134366 and add an explicit triple to make this test host-independent.
llvm-svn: 134447
2011-07-05 22:09:19 +00:00
Jakob Stoklund Olesen bbad3bceb7 Fix PR10277.
Remat during spilling triggers dead code elimination. If a phi-def
becomes unused, that may also cause live ranges to split into separate
connected components.

This type of splitting is different from normal live range splitting. In
particular, there may not be a common original interval.

When the split range is its own original, make sure that the new
siblings are also their own originals. The range being split cannot be
used as an original since it doesn't cover the new siblings.

llvm-svn: 134413
2011-07-05 15:38:41 +00:00
NAKAMURA Takumi bb2f28f41c test/CodeGen/X86/lsr-nonaffine.ll: Relax expressions for Win64 CC to appease Win32 hosts.
llvm-svn: 134366
2011-07-03 09:26:14 +00:00
Chandler Carruth a6e593b4eb FileCheck-ize another test. Reduces the llc invocations from 8 to 1, and
makes one of the tests actually mean something (as the string 'add' will
always appear in the output of this file).

llvm-svn: 134358
2011-07-02 21:34:52 +00:00
Chandler Carruth a33e630c55 FileCheck-ize another X86 test, making it more precisely verify the
desired result based on the comments in the file.

llvm-svn: 134354
2011-07-02 20:43:16 +00:00
Chandler Carruth 959fe548d7 FileCheck-ize and simplify RUN lines.
llvm-svn: 134352
2011-07-02 20:43:11 +00:00
Chandler Carruth 02bece4957 FileCheck-ize
llvm-svn: 134351
2011-07-02 20:43:08 +00:00
Chandler Carruth 91c4008373 FileCheck-ize and tighten up assertions to only check the relevant sections.
llvm-svn: 134350
2011-07-02 20:43:04 +00:00
Chandler Carruth 4b15dd38a8 FileCheck-ize and cleanup IR.
llvm-svn: 134349
2011-07-02 20:43:01 +00:00
Chandler Carruth a7440ff322 FileCheck-ize
llvm-svn: 134348
2011-07-02 20:42:59 +00:00
Chandler Carruth 5c87df3179 Remove a grep that is already checked with FileCheck.
llvm-svn: 134346
2011-07-02 20:42:56 +00:00
Chandler Carruth 106ae72933 FileCheck-ize
llvm-svn: 134345
2011-07-02 20:42:53 +00:00
Chandler Carruth fbb7c9ba06 FileCheck-ize and modernize IR.
llvm-svn: 134344
2011-07-02 20:42:50 +00:00
Chandler Carruth 81275c3029 FileCheck-ize and simplify RUNs.
llvm-svn: 134343
2011-07-02 20:42:48 +00:00
Chandler Carruth f0e2e37518 FileCheck-ize and modernize the RUN line.
llvm-svn: 134342
2011-07-02 20:42:44 +00:00
Chandler Carruth abd8a8f3ae FileCheck-ize, tightening checks and avoiding a temporary file.
llvm-svn: 134341
2011-07-02 20:42:42 +00:00
Chandler Carruth 6e4d90f7c6 FileCheck-ize, tightening checks and avoiding a temporary file.
llvm-svn: 134340
2011-07-02 20:42:39 +00:00
Chandler Carruth ff0e32536e FileCheck-ize
llvm-svn: 134339
2011-07-02 20:42:36 +00:00
Chandler Carruth d954bb7ebb FileCheck-ize
llvm-svn: 134338
2011-07-02 20:42:33 +00:00
Chandler Carruth 362bff3bd3 FileCheck-ize a test, avoiding a temporary file.
llvm-svn: 134337
2011-07-02 20:42:31 +00:00
Chandler Carruth f2a29b726f FileCheck-ize and simplify this test.
llvm-svn: 134336
2011-07-02 20:42:28 +00:00
Chandler Carruth 7e44b420e1 FileCheck-ize
llvm-svn: 134335
2011-07-02 20:42:25 +00:00
Chandler Carruth 144cf1a974 FileCheck-ize another codegen test.
llvm-svn: 134334
2011-07-02 20:42:22 +00:00
Chandler Carruth 1d815f5373 Partially FileCheck-ize a test to remove a weird quoting situation.
llvm-svn: 134333
2011-07-02 20:42:20 +00:00
Chandler Carruth 8a3f20abac FileCheck-ize another test, and upgrade its syntax a bit.
llvm-svn: 134332
2011-07-02 20:42:17 +00:00
Chandler Carruth 38d367e473 FileCheck-ize another codegen test, tightening it up.
llvm-svn: 134331
2011-07-02 20:42:14 +00:00
Chandler Carruth bf252382b8 FileCheck-ize another test, making it much more precise for testing the
individual cases, while hard coding less about registers in use.

llvm-svn: 134330
2011-07-02 20:42:11 +00:00
Chandler Carruth 308b3b66b1 FileCheck-ize another test. This one is more clear and runs fewer
commands as a result.

llvm-svn: 134329
2011-07-02 20:42:08 +00:00
Chandler Carruth 334faf8f1a FileCheck-ize a test, no functionality changed.
llvm-svn: 134328
2011-07-02 20:42:06 +00:00
Jakob Stoklund Olesen 54f7c59c1a Better diagnostics when inline asm fails to allocate.
asm.c:2:7: error: ran out of registers during register allocation
  asm(""::"r"(0), "r"(1), "r"(2), "r"(3), "r"(4), "r"(5), "r"(6), "r"(7), "r"(8), "r"(9));
        ^

llvm-svn: 134310
2011-07-02 07:17:37 +00:00
Eric Christopher 2eca9d5ddf Be less specific about register allocation ordering.
llvm-svn: 134308
2011-07-02 04:06:41 +00:00
Eric Christopher a8a56f7e5c TargetConstant immediates won't be placed into registers so tighten
up the valid constant check earlier.

rdar://9692967

llvm-svn: 134286
2011-07-01 23:04:38 +00:00
Dan Gohman a293f24a0d Teach IVUsers to stop at non-affine expressions unless they are both
outside the loop and reducible.

This more completely hides them from LSR, which isn't usually able to
do anything meaningful with non-affine expressions anyway, and this
consequently hides them from SCEVExpander, which is acutely unprepared
for non-affine expressions.

Replace test/CodeGen/X86/lsr-nonaffine.ll with a new test that tests
the new behavior.

This works around the bug in PR10117 / rdar://problem/9633149, and is
generally an improvement besides.

llvm-svn: 134268
2011-07-01 22:05:19 +00:00
Jim Grosbach cf1464d943 ARMv7M vs. ARMv7E-M support.
The DSP instructions in the Thumb2 instruction set are an optional extension
in the Cortex-M* archtitecture. When present, the implementation is considered
an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."

Add a subtarget feature hook for the v7e-m instructions and hook it up. The
cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is
a v7e-m implementation.

rdar://9572992

llvm-svn: 134261
2011-07-01 21:12:19 +00:00
Eric Christopher 29f1db85dd Add support for the 'j' immediate constraint. This is conditionalized on
supporting the instruction that the constraint is for 'movw'.

Part of rdar://9119939

llvm-svn: 134222
2011-07-01 01:00:07 +00:00
Eric Christopher c011d31543 Add support for the ARM 't' register constraint. And another testcase
for the 'x' register constraint.

Part of rdar://9119939

llvm-svn: 134220
2011-07-01 00:30:46 +00:00
Eric Christopher f1c74595aa Add support for the 'x' constraint.
Part of rdar://9307836 and rdar://9119939

llvm-svn: 134215
2011-07-01 00:14:47 +00:00
Jakob Stoklund Olesen d0e2352b65 Fix a problem with fast-isel return values introduced in r134018.
We would put the return value from long double functions in the wrong
register.

This fixes gcc.c-torture/execute/conversion.c

llvm-svn: 134205
2011-06-30 23:42:18 +00:00
Eric Christopher f45daac30f Add support for the 'h' constraint.
Part of rdar://9119939

llvm-svn: 134203
2011-06-30 23:23:01 +00:00
Jim Grosbach b98ab91e39 Thumb1 register to register MOV instruction is predicable.
Fix a FIXME and allow predication (in Thumb2) for the T1 register to
register MOV instructions. This allows some better codegen with
if-conversion (as seen in the test updates), plus it lays the groundwork
for pseudo-izing the tMOVCC instructions.

llvm-svn: 134197
2011-06-30 22:10:46 +00:00
Jim Grosbach 353da73186 Pseudo-ize the t2LDMIA_RET instruction.
It's just a t2LDMIA_UPD instruction with extra codegen properties, so it
doesn't need the encoding information. As a side-benefit, we now correctly
recognize for instruction printing as a 'pop' instruction.

llvm-svn: 134173
2011-06-30 18:25:42 +00:00
Eric Christopher c932173773 Fix a small thinko for constant i64 lock/orq optimization where we
we didn't have an opcode for 64-bit constant or expressions.

Fixes rdar://9692967

llvm-svn: 134121
2011-06-30 00:48:30 +00:00
Devang Patel 0eada03216 Revert r133953 for now.
llvm-svn: 134116
2011-06-29 23:50:13 +00:00
Cameron Zwarich 34c8f51d65 In the ARM global merging pass, allow extraneous alignment specifiers. This pass
already makes the assumption, which is correct on ARM, that a type's alignment is
less than its alloc size. This improves codegen with Clang (which inserts a lot of
extraneous alignment specifiers) and fixes <rdar://problem/9695089>.

llvm-svn: 134106
2011-06-29 22:24:25 +00:00
Benjamin Kramer d2a84f6a63 Don't depend on the optimization reverted in r134067.
llvm-svn: 134068
2011-06-29 14:07:18 +00:00
Benjamin Kramer 8665f8d916 Revert a part of r126557 which could create unschedulable DAGs.
llvm-svn: 134067
2011-06-29 13:47:25 +00:00
Jakob Stoklund Olesen 7297e7e223 Clean up the handling of the x87 fp stack to make it more robust.
Drop the FpMov instructions, use plain COPY instead.

Drop the FpSET/GET instruction for accessing fixed stack positions.
Instead use normal COPY to/from ST registers around inline assembly, and
provide a single new FpPOP_RETVAL instruction that can access the return
value(s) from a call. This is still necessary since you cannot tell from
the CALL instruction alone if it returns anything on the FP stack. Teach
fast isel to use this.

This provides a much more robust way of handling fixed stack registers -
we can tolerate arbitrary FP stack instructions inserted around calls
and inline assembly. Live range splitting could sometimes break x87 code
by inserting spill code in unfortunate places.

As a bonus we handle floating point inline assembly correctly now.

llvm-svn: 134018
2011-06-28 18:32:28 +00:00
Roman Divacky 4394e68c24 Implement ISD::VAARG lowering on PPC32.
llvm-svn: 134005
2011-06-28 15:30:42 +00:00
Jakob Stoklund Olesen 74dd400410 FileCheckize a couple of tests.
Also and add a test for popping dead return values and avoid testing the
spill precision.

llvm-svn: 133997
2011-06-28 06:25:03 +00:00
Chandler Carruth e2a1b16963 FileCheck-ize a test that had the strangest TCL quote I've seen yet: an
opening single quote with no closing single quote, and with {} quotes
"inside" of it. This broke some of our tools that scrape test cases.

Also, while here, make the test actually assert what the comment says it
asserts. This was essentially authored by Nick Lewycky, and merely typed
in by myself. Let me know if this is still missing the mark, but the
previous test only succeeded due to the improper quoting preventing
*anything* from matching the grep -- it had a '4(%...)' sequence in the
output!

llvm-svn: 133980
2011-06-28 02:03:10 +00:00
Evan Cheng b7d00313dc Remove the experimental (and unused) pre-ra splitting pass. Greedy regalloc can split live ranges.
llvm-svn: 133962
2011-06-27 23:40:45 +00:00
Devang Patel 4dc034df1d During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
llvm-svn: 133953
2011-06-27 22:32:04 +00:00
Eric Christopher aff7ed55bc Allow lr in the register options here.
llvm-svn: 133935
2011-06-27 20:31:01 +00:00
Jakob Stoklund Olesen f68976071d Move all inline-asm-fpstack tests to a single file.
Also fix some of the tests that were actually testing wrong behavior -
An input operand in {st} is only popped by the inline asm when {st} is
also in the clobber list.

The original bug reports all had ~{st} clobbers as they should.

llvm-svn: 133916
2011-06-27 17:27:37 +00:00
Dan Bailey b49b736519 PTX: corrected tests that were failing
llvm-svn: 133875
2011-06-25 19:41:17 +00:00
Dan Bailey b7ee561399 PTX: Reverting implementation of i8.
The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is
not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support
boolean values.

llvm-svn: 133873
2011-06-25 18:16:28 +00:00
Chad Rosier f3e11190f3 Test case for r133858 (tail call optimize in the presence of byval).
llvm-svn: 133863
2011-06-25 02:44:56 +00:00
Devang Patel f071d72c44 Handle debug info for i128 constants.
llvm-svn: 133821
2011-06-24 20:46:11 +00:00
Dan Bailey dd01c4ac7a PTX: Add support for i8 type and introduce associated .b8 registers
The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates.

llvm-svn: 133814
2011-06-24 19:27:10 +00:00
Chad Rosier fa8d89327f The Neon VCVT (between floating-point and fixed-point, Advanced SIMD)
instructions can be used to match combinations of multiply/divide and VCVT 
(between floating-point and integer, Advanced SIMD).  Basically the VCVT 
immediate operand that specifies the number of fraction bits corresponds to a 
floating-point multiply or divide by the corresponding power of 2.

For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a 
combination of VMUL and VCVT (floating-point to integer) as follows:

Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
  vmul.f32        d16, d17, d16
  vcvt.s32.f32    d16, d16
becomes:
  vcvt.s32.f32    d16, d16, #3

Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a 
combinations of VCVT (integer to floating-point) and VDIV as follows:

Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
  vcvt.f32.s32    d16, d16
  vdiv.f32        d16, d17, d16
becomes:
  vcvt.f32.s32    d16, d16, #3

llvm-svn: 133813
2011-06-24 19:23:04 +00:00
Akira Hatanaka 35792089e7 Change the chain input of nodes that load the address of a function. This change
enables SelectionDAG::getLoad at MipsISelLowering.cpp:1914 to return a
pre-existing node instead of redundantly create a new node every time it is
called.

llvm-svn: 133811
2011-06-24 19:01:25 +00:00
Akira Hatanaka ca88b4abec Prevent generation of redundant addiu instructions that compute address of
static variables or functions. 

llvm-svn: 133803
2011-06-24 17:55:19 +00:00
Justin Holewinski 6cdd72a9ca PTX: Always use registers for return values, but use .param space for device
parameters if SM >= 2.0

- Update test cases to be more robust against register allocation changes
- Bump up the number of registers to 128 per type
- Include Python script to re-generate register file with any number of
  registers

llvm-svn: 133736
2011-06-23 18:10:13 +00:00
Justin Holewinski a4aecf3dc4 PTX: Fixup test cases for device param changes
llvm-svn: 133735
2011-06-23 18:10:08 +00:00
Andrew Trick 67ff0718a4 lit support for REQUIRES: asserts.
Take #2. Don't piggyback on the existing config.build_mode. Instead,
define a new lit feature for each build feature we need (currently
just "asserts"). Teach both autoconf'd and cmake'd Makefiles to define
this feature within test/lit.site.cfg. This doesn't require any lit
harness changes and should be more robust across build systems.

llvm-svn: 133664
2011-06-22 23:23:19 +00:00
Rafael Espindola 2496c1f1f8 Reenable tail duplication of bb with just an unconditional jump, but
don't remove blocks that have their address taken.

llvm-svn: 133659
2011-06-22 22:31:57 +00:00
Nick Lewycky 90e6a4e5d5 Needs a triple.
llvm-svn: 133634
2011-06-22 19:42:14 +00:00
Nick Lewycky 6208a2fd66 Emit trailing padding on constant vectors when TargetData says that the vector
is larger than the sum of the elements (including per-element padding).

llvm-svn: 133631
2011-06-22 18:55:03 +00:00
Justin Holewinski 6fafebfb6a PTX: Add signed integer comparisons
llvm-svn: 133599
2011-06-22 02:09:50 +00:00