Commit Graph

4720 Commits

Author SHA1 Message Date
Duncan Sands 61c5708b51 Fix the other problem reported in PR8582. Testcase and patch by
Nadav Rotem.

llvm-svn: 122983
2011-01-06 23:45:22 +00:00
Eric Christopher e516af753b Add some fairly duplicated code to let type legalization split illegal
typed atomics. This will lower exclusively to libcalls at the moment.

llvm-svn: 122979
2011-01-06 22:28:56 +00:00
Evan Cheng 3ae2b79aa3 Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy
etc. takes an option OptSize. If OptSize is true, it would return
the inline limit for functions with attribute OptSize.

llvm-svn: 122952
2011-01-06 06:52:41 +00:00
Evan Cheng c052ba7ff3 Revert r122936. I'll re-implement the change.
llvm-svn: 122949
2011-01-06 06:17:53 +00:00
Evan Cheng 06536e7158 r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing
up freebsd bootloader. However, this doesn't make much sense for Darwin, whose
-Os is meant to optimize for size only if it doesn't hurt performance.
rdar://8821501

llvm-svn: 122936
2011-01-06 01:04:47 +00:00
Evan Cheng ac730dd2d1 Avoid zero extend bit test operands to pointer type if all the masks fit in
the original type of the switch statement key.
rdar://8781238

llvm-svn: 122935
2011-01-06 01:02:44 +00:00
Evan Cheng 260acf32ee Optimize:
r1025 = s/zext r1024, 4
  r1026 = extract_subreg r1025, 4
to:
  r1026 = copy r1024

llvm-svn: 122925
2011-01-05 23:06:49 +00:00
Eric Christopher c673b21a87 80-cols.
llvm-svn: 122909
2011-01-05 21:45:56 +00:00
Eric Christopher 988518109d Remove TODO, these appear to be implemented.
llvm-svn: 122849
2011-01-04 22:31:50 +00:00
Benjamin Kramer 25e6e06e42 Try to reuse the value when lowering memset.
This allows us to compile:
  void test(char *s, int a) {
    __builtin_memset(s, a, 15);
  }
into 1 mul + 3 stores instead of 3 muls + 3 stores.

llvm-svn: 122710
2011-01-02 19:57:05 +00:00
Benjamin Kramer 2fdea4c8f1 Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors.
We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that doesn't support the multiply or it is slow (p4) if someone cares
enough.

Example code:
  void test(char *s, int a) {
      __builtin_memset(s, a, 4);
  }
before:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    movl  %eax, %ecx
    shll  $8, %ecx
    orl %eax, %ecx
    movl  %ecx, %eax
    shll  $16, %eax
    orl %ecx, %eax
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret
after:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    imull $16843009, %eax, %eax   ## imm = 0x1010101
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret

llvm-svn: 122707
2011-01-02 19:44:58 +00:00
Andrew Trick 5ce945ca3a Minor cleanup related to my latest scheduler changes.
llvm-svn: 122545
2010-12-24 07:10:19 +00:00
Andrew Trick c94056692a Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck.
llvm-svn: 122544
2010-12-24 06:46:50 +00:00
Andrew Trick 10ffc2b6c2 Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.

llvm-svn: 122541
2010-12-24 05:03:26 +00:00
Andrew Trick c416ba612b whitespace
llvm-svn: 122539
2010-12-24 04:28:06 +00:00
Chris Lattner 11a33811b6 flags -> glue for selectiondag
llvm-svn: 122509
2010-12-23 17:24:32 +00:00
Chris Lattner f647e95b9a sdisel flag -> glue.
llvm-svn: 122507
2010-12-23 17:13:18 +00:00
Andrew Trick 528fad91d2 Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue.
llvm-svn: 122491
2010-12-23 05:42:20 +00:00
Andrew Trick a52f325c35 Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle.
llvm-svn: 122474
2010-12-23 04:16:14 +00:00
Andrew Trick 12acde11cb In CheckForLiveRegDef use TRI->getOverlaps.
llvm-svn: 122473
2010-12-23 03:43:21 +00:00
Andrew Trick 033efdf4d7 Fixes PR8823: add-with-overflow-128.ll
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472
2010-12-23 03:15:51 +00:00
Jeffrey Yasskin 9b43f33620 Change all self assignments X=X to (void)X, so that we can turn on a
new gcc warning that complains on self-assignments and
self-initializations.

llvm-svn: 122458
2010-12-23 00:58:24 +00:00
Benjamin Kramer 1f4dfbbcb0 DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code.
example code:
unsigned foo(unsigned x, unsigned y) {
  if (x != 0) y--;
  return y;
}

before:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
    notl  %eax                    ## encoding: [0xf7,0xd0]
    addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
    ret                           ## encoding: [0xc3]

after:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
    adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
    ret                           ## encoding: [0xc3]

llvm-svn: 122455
2010-12-22 23:17:45 +00:00
Chris Lattner cafc1e60bb Fix a bug in ReduceLoadWidth that wasn't handling extending
loads properly.  We miscompiled the testcase into:

_test:                                  ## @test
	movl	$128, (%rdi)
	movzbl	1(%rdi), %eax
	ret

Now we get a proper:

_test:                                  ## @test
	movl	$128, (%rdi)
	movsbl	(%rdi), %eax
	movzbl	%ah, %eax
	ret

This fixes PR8757.

llvm-svn: 122392
2010-12-22 08:02:57 +00:00
Chris Lattner 9a499e96eb more cleanups, move a check for "roundedness" earlier to reject
unhanded cases faster and simplify code.

llvm-svn: 122391
2010-12-22 08:01:44 +00:00
Chris Lattner 222374d886 reduce indentation and improve comments, no functionality change.
llvm-svn: 122389
2010-12-22 07:36:50 +00:00
Andrew Trick fbb3ed8774 In DelayForLiveRegsBottomUp, handle instructions that read and write
the same physical register. Simplifies the fix from the previous
checkin r122211.

llvm-svn: 122370
2010-12-21 22:27:44 +00:00
Andrew Trick 2085a96513 whitespace
llvm-svn: 122368
2010-12-21 22:25:04 +00:00
Dale Johannesen a94e36bbee Reapply 122353-122355 with fixes. 122354 was wrong;
the shift type was needed one place, the shift count
type another.  The transform in 123555 had the same
problem.

llvm-svn: 122366
2010-12-21 21:55:50 +00:00
Dale Johannesen 87c47499c6 Revert 122353-122355 for the moment, they broke stuff.
llvm-svn: 122360
2010-12-21 21:22:27 +00:00
Dale Johannesen caf42aa6a4 Add a new transform to DAGCombiner.
llvm-svn: 122355
2010-12-21 20:10:51 +00:00
Dale Johannesen fa5dc82fda Get the type of a shift from the shift, not from its shift
count operand.  These should be the same but apparently are
not always, and this is cleaner anyway.  This improves the
code in an existing test.

llvm-svn: 122354
2010-12-21 20:06:19 +00:00
Dale Johannesen d64931df77 Shift by the word size is invalid IR; don't create it.
llvm-svn: 122353
2010-12-21 20:00:06 +00:00
Chris Lattner 2a7ff99979 fix some typos
llvm-svn: 122349
2010-12-21 18:05:22 +00:00
Stuart Hastings 83cce8e7ab Fix indentation, add comment.
llvm-svn: 122345
2010-12-21 17:16:58 +00:00
Stuart Hastings 8c5bfcaa29 Missing logic for nested CALLSEQ_START/END.
llvm-svn: 122342
2010-12-21 17:07:24 +00:00
Chris Lattner 3e5fbd74ed rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Chris Lattner 17f906be96 improve "cannot yet select" errors a trivial amount: now
they are just as useless, but at least a bit more gramatical

llvm-svn: 122305
2010-12-21 02:07:03 +00:00
Dale Johannesen 0a291a36f2 Cosmetic changes.
llvm-svn: 122259
2010-12-20 20:10:50 +00:00
Chris Lattner 0b3ca50ebb implement type legalization promotion support for SMULO and UMULO, giving
ARM (and other 32-bit-only) targets support for i8 and i16 overflow 
multiplies.  The generated code isn't great, but this at least fixes
CodeGen/Generic/overflow.ll when running on ARM hosts.

llvm-svn: 122221
2010-12-20 02:05:39 +00:00
Chris Lattner 981afd206b Fix a bug in the scheduler's handling of "unspillable" vregs.
Imagine we see:

EFLAGS = inst1
EFLAGS = inst2 FLAGS
gpr = inst3 EFLAGS

Previously, we would refuse to schedule inst2 because it clobbers
the EFLAGS of the predecessor.  However, it also uses the EFLAGS
of the predecessor, so it is safe to emit.  SDep edges ensure that
the right order happens already anyway.

This fixes 2 testsuite crashes with the X86 patch I'm going to
commit next.

llvm-svn: 122211
2010-12-20 00:55:43 +00:00
Chris Lattner 0cfe884874 the result of CheckForLiveRegDef is dead, remove it.
llvm-svn: 122209
2010-12-20 00:51:56 +00:00
Chris Lattner ed69c6e4b9 reduce indentation, no functionality change.
llvm-svn: 122208
2010-12-20 00:50:16 +00:00
Nick Lewycky 0de20af7ba Add missing standard headers. Patch by Joerg Sonnenberger!
llvm-svn: 122193
2010-12-19 20:43:38 +00:00
Chris Lattner 440b2804ff teach MaskedValueIsZero how to analyze ADDE. This is
enough to teach it that ADDE(0,0) is known 0 except the 
low bit, for example.

llvm-svn: 122191
2010-12-19 20:38:28 +00:00
Chris Lattner 77a8a71414 fix PR8642: if a critical edge has a PHI value that can trap,
isel is *required* to split the edge.  PHI values get evaluated
on the edge, not in their predecessor block.

llvm-svn: 122170
2010-12-19 04:58:57 +00:00
Bob Wilson 5408144add Fix a DAGCombiner crash when folding binary vector operations with constant
BUILD_VECTOR operands where the element type is not legal.  I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.

llvm-svn: 122102
2010-12-17 23:06:49 +00:00
Dale Johannesen cd538afa52 Add a transform to DAG Combiner. This improves the
code for the case where 32-bit divide by constant is
turned into 64-bit multiply by constant.  8771012.

llvm-svn: 122090
2010-12-17 21:45:49 +00:00
Bob Wilson bfc6904fc6 Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation.
Radar 8776599

llvm-svn: 122018
2010-12-17 01:21:12 +00:00
Chris Lattner 15090e1eb0 take care of some todos, transforming [us]mul_lohi into
a wider mul if the wider mul is legal.

llvm-svn: 121848
2010-12-15 06:04:19 +00:00