Commit Graph

3932 Commits

Author SHA1 Message Date
Benjamin Kramer 25e6e06e42 Try to reuse the value when lowering memset.
This allows us to compile:
  void test(char *s, int a) {
    __builtin_memset(s, a, 15);
  }
into 1 mul + 3 stores instead of 3 muls + 3 stores.

llvm-svn: 122710
2011-01-02 19:57:05 +00:00
Benjamin Kramer 2fdea4c8f1 Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors.
We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that doesn't support the multiply or it is slow (p4) if someone cares
enough.

Example code:
  void test(char *s, int a) {
      __builtin_memset(s, a, 4);
  }
before:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    movl  %eax, %ecx
    shll  $8, %ecx
    orl %eax, %ecx
    movl  %ecx, %eax
    shll  $16, %eax
    orl %ecx, %eax
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret
after:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    imull $16843009, %eax, %eax   ## imm = 0x1010101
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret

llvm-svn: 122707
2011-01-02 19:44:58 +00:00
Rafael Espindola abe3eaa481 Fix darwin bots.
llvm-svn: 122672
2011-01-01 21:58:41 +00:00
Rafael Espindola d606e54757 Add support for the 'H' modifier.
llvm-svn: 122667
2011-01-01 20:58:46 +00:00
Anton Korobeynikov 879be84aa5 Update the test
llvm-svn: 122666
2011-01-01 20:57:26 +00:00
Che-Liang Chiou 5451fc9195 ptx: remove reg-reg addressing mode and st.const
llvm-svn: 122653
2011-01-01 11:58:58 +00:00
Che-Liang Chiou 15e8d2c5e7 ptx: add store instruction
llvm-svn: 122652
2011-01-01 10:50:37 +00:00
Che-Liang Chiou 3ee0501338 ptx: add state spaces
llvm-svn: 122638
2010-12-30 10:41:27 +00:00
NAKAMURA Takumi dbacfb73e0 test/CodeGen/X86/negative-sin.ll: FileCheck-ize.
llvm-svn: 122619
2010-12-29 03:58:47 +00:00
NAKAMURA Takumi 74835a22d8 test/CodeGen/X86/fp-in-intregs.ll: FileCheck-ize.
llvm-svn: 122618
2010-12-29 03:58:36 +00:00
Bob Wilson 36be00ceb3 Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions.
If the basic block containing the BCCi64 (or BCCZi64) instruction ends with
an unconditional branch, that branch needs to be deleted before appending
the expansion of the BCCi64 to the end of the block.

llvm-svn: 122521
2010-12-23 22:45:49 +00:00
Andrew Trick 033efdf4d7 Fixes PR8823: add-with-overflow-128.ll
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472
2010-12-23 03:15:51 +00:00
Benjamin Kramer 1f4dfbbcb0 DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code.
example code:
unsigned foo(unsigned x, unsigned y) {
  if (x != 0) y--;
  return y;
}

before:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
    notl  %eax                    ## encoding: [0xf7,0xd0]
    addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
    ret                           ## encoding: [0xc3]

after:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
    adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
    ret                           ## encoding: [0xc3]

llvm-svn: 122455
2010-12-22 23:17:45 +00:00
Benjamin Kramer 6020ed9d99 X86: Lower a select directly to a setcc_carry if possible.
int test(unsigned long a, unsigned long b) { return -(a < b); }
compiles to
  _test:                              ## @test
    cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
    sbbl  %eax, %eax                  ## encoding: [0x19,0xc0]
    ret                               ## encoding: [0xc3]
instead of
  _test:                              ## @test
    xorl  %ecx, %ecx                  ## encoding: [0x31,0xc9]
    cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
    movl  $-1, %eax                   ## encoding: [0xb8,0xff,0xff,0xff,0xff]
    cmovael %ecx, %eax                ## encoding: [0x0f,0x43,0xc1]
    ret                               ## encoding: [0xc3]

llvm-svn: 122451
2010-12-22 23:09:28 +00:00
Che-Liang Chiou aaedf8be1c ptx: add ld instruction and test
llvm-svn: 122398
2010-12-22 10:38:51 +00:00
Chris Lattner cafc1e60bb Fix a bug in ReduceLoadWidth that wasn't handling extending
loads properly.  We miscompiled the testcase into:

_test:                                  ## @test
	movl	$128, (%rdi)
	movzbl	1(%rdi), %eax
	ret

Now we get a proper:

_test:                                  ## @test
	movl	$128, (%rdi)
	movsbl	(%rdi), %eax
	movzbl	%ah, %eax
	ret

This fixes PR8757.

llvm-svn: 122392
2010-12-22 08:02:57 +00:00
Dale Johannesen a94e36bbee Reapply 122353-122355 with fixes. 122354 was wrong;
the shift type was needed one place, the shift count
type another.  The transform in 123555 had the same
problem.

llvm-svn: 122366
2010-12-21 21:55:50 +00:00
Benjamin Kramer f6ddc4a1de Add some x86 specific dagcombines for conditional increments.
(add Y, (sete  X, 0)) -> cmp X, 1; adc  0, Y
(add Y, (setne X, 0)) -> cmp X, 1; sbb -1, Y
(sub (sete  X, 0), Y) -> cmp X, 1; sbb  0, Y
(sub (setne X, 0), Y) -> cmp X, 1; adc -1, Y

for
  unsigned foo(unsigned a, unsigned b) {
    if (a == 0) b++;
    return b;
  }
we now get:
  foo:
    cmpl  $1, %edi
    movl  %esi, %eax
    adcl  $0, %eax
    ret
instead of:
  foo:
    testl %edi, %edi
    sete  %al
    movzbl  %al, %eax
    addl  %esi, %eax
    ret

llvm-svn: 122364
2010-12-21 21:41:44 +00:00
Dale Johannesen 87c47499c6 Revert 122353-122355 for the moment, they broke stuff.
llvm-svn: 122360
2010-12-21 21:22:27 +00:00
Dale Johannesen caf42aa6a4 Add a new transform to DAGCombiner.
llvm-svn: 122355
2010-12-21 20:10:51 +00:00
Dale Johannesen fa5dc82fda Get the type of a shift from the shift, not from its shift
count operand.  These should be the same but apparently are
not always, and this is cleaner anyway.  This improves the
code in an existing test.

llvm-svn: 122354
2010-12-21 20:06:19 +00:00
Bob Wilson 1a20c2aedd Add ARM-specific DAG combining to cast i64 vector element load/stores to f64.
Type legalization splits up i64 values into pairs of i32 values, which leads
to poor quality code when inserting or extracting i64 vector elements.
If the vector element is loaded or stored, it can be treated as an f64 value
and loaded or stored directly from a VPR register.  Use the pre-legalization
DAG combiner to cast those vector elements to f64 types so that the type
legalizer won't mess them up.  Radar 8755338.

llvm-svn: 122319
2010-12-21 06:43:19 +00:00
Dale Johannesen 0a291a36f2 Cosmetic changes.
llvm-svn: 122259
2010-12-20 20:10:50 +00:00
Chris Lattner 17a06b7efa temporarily disable this: PR8823.
llvm-svn: 122222
2010-12-20 02:11:23 +00:00
Chris Lattner 5c00d41688 now that addc/adde are gone, "ADDC" in the X86 backend uses EFLAGS results,
the same as setcc.  Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS).  This is
a step towards finishing off PR5443.  In the testcase in that bug we now  get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	sbbq	%rcx, %rcx
	testb	$1, %cl
	setne	%dl
	ret

instead of:

	movq	%rdi, %rax
	addq	%rsi, %rax
	movl	$0, %ecx
	adcq	$0, %rcx
	testq	%rcx, %rcx
	setne	%dl
	ret

llvm-svn: 122219
2010-12-20 01:37:09 +00:00
Chris Lattner 46b9efcad7 We lower setb to sbb with the hope that the and will go away, when it
doesn't, match it back to setb.

On a 64-bit version of the testcase before we'd get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	sbbb	%dl, %dl
	andb	$1, %dl
	ret

now we get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	setb	%dl
	ret

llvm-svn: 122217
2010-12-20 01:16:03 +00:00
Mon P Wang 075a16b09e Add comment for testcase for 122206
llvm-svn: 122210
2010-12-20 00:54:26 +00:00
Mon P Wang 1064992c84 Prevents PerformShuffleCombine from creating a node with an illegal type after legalize types
has run, e.g., prevent creating an i64 node from a v2i64 when i64 is not a legal type.

llvm-svn: 122206
2010-12-19 23:55:53 +00:00
Chris Lattner 9edf3f50bf improve the setcc -> setcc_carry optimization to happen more
consistently by moving it out of lowering into dag combine.

Add some missing patterns for matching away extended versions of setcc_c.

llvm-svn: 122201
2010-12-19 22:08:31 +00:00
Chris Lattner ff392ab3ed now that generic vector types aren't selected onto MMX registers, these
tests don't need -disable-mmx.

llvm-svn: 122188
2010-12-19 20:12:58 +00:00
Chris Lattner 405c28bc7d add a general coverage test for overflow intrinsics.
llvm-svn: 122185
2010-12-19 20:01:13 +00:00
Chris Lattner 77a8a71414 fix PR8642: if a critical edge has a PHI value that can trap,
isel is *required* to split the edge.  PHI values get evaluated
on the edge, not in their predecessor block.

llvm-svn: 122170
2010-12-19 04:58:57 +00:00
Chris Lattner 2b43f2df2b move this test into the ARM test so that it is only run when the arm backend
is enabled.

llvm-svn: 122163
2010-12-19 02:58:14 +00:00
Anton Korobeynikov 3eb4fedecf Restore the behavior of frame lowering before my refactoring.
It turns out that ppc backend has really weird interdependencies
over different hooks and all stuff is fragile wrt small changes.
This should fix PR8749

llvm-svn: 122155
2010-12-18 19:53:14 +00:00
Benjamin Kramer 4b4f3557de Just rename the functions, relying on matching a instruction that has the same name as a symbol is way too fragile.
llvm-svn: 122154
2010-12-18 14:23:57 +00:00
Benjamin Kramer 5f6219d929 Test more than just label names and make test work on non-x86 hosts.
llvm-svn: 122153
2010-12-18 14:07:28 +00:00
Bob Wilson 00871c71e9 Fix result type of Neon floating-point comparisons against zero.
The result vector elements are always integers.  Radar 8782191.

llvm-svn: 122112
2010-12-18 00:04:33 +00:00
Bill Wendling 3fff1fd49b During local stack slot allocation, the materializeFrameBaseRegister function
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.

Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does all ready. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)

<rdar://problem/8782198>

llvm-svn: 122104
2010-12-17 23:09:14 +00:00
Bob Wilson 5408144add Fix a DAGCombiner crash when folding binary vector operations with constant
BUILD_VECTOR operands where the element type is not legal.  I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.

llvm-svn: 122102
2010-12-17 23:06:49 +00:00
Bob Wilson 494d7f0367 Combine several vector-related DAGCombiner tests.
llvm-svn: 122101
2010-12-17 23:06:46 +00:00
Nate Begeman 97b72c99d2 Add support for matching psign & plendvb to the x86 target
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts

llvm-svn: 122098
2010-12-17 22:55:37 +00:00
Dale Johannesen cd538afa52 Add a transform to DAG Combiner. This improves the
code for the case where 32-bit divide by constant is
turned into 64-bit multiply by constant.  8771012.

llvm-svn: 122090
2010-12-17 21:45:49 +00:00
Kalle Raiskila affe15fd67 Don't feed 19 bit immediates to ILA.
Patch (slightly modified) by Visa Putkinen.

llvm-svn: 122052
2010-12-17 09:36:09 +00:00
Bob Wilson bfc6904fc6 Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation.
Radar 8776599

llvm-svn: 122018
2010-12-17 01:21:12 +00:00
Jason W Kim 4c1386add9 1. ARM/MC/ELF: A few more ELF relocs for .o
2. Fixed EmitLocalCommonSymbol for ELF (Yes, they exist. :)
   Test added.

llvm-svn: 121951
2010-12-16 03:12:17 +00:00
Jim Grosbach bfef309d11 Thumb1 had two patterns for the same load-from-constant-pool instruction.
Canonicalize on tLDRpci and remove tLDRcp.

llvm-svn: 121920
2010-12-15 23:52:36 +00:00
Eric Christopher 347f4c32e8 Don't handle -arm-long-calls in fast isel for now.
llvm-svn: 121919
2010-12-15 23:47:29 +00:00
Evan Cheng b7ff5a0f20 Teach machine cse to commute instructions.
llvm-svn: 121903
2010-12-15 22:16:21 +00:00
Bob Wilson fa27a8621c Add Neon VCVT instructions for f32 <-> f16 conversions.
Clang is now providing intrinsics for these and so we need to support them
in the backend.  Radar 8068427.

llvm-svn: 121902
2010-12-15 22:14:12 +00:00
Wesley Peck 0c558b2080 Lower the MBlaze target specific calling conventions for "interrupt_handler"
and "save_volatiles" correctly. This completes the custom calling convention
functionality changes for the MBlaze backend that were started in 121888.

llvm-svn: 121891
2010-12-15 20:27:28 +00:00