Commit Graph

1384 Commits

Author SHA1 Message Date
Evan Cheng f1bd5fcdb4 More work to allow dag combiner to promote 16-bit ops to 32-bit.
llvm-svn: 101621
2010-04-17 06:13:15 +00:00
Evan Cheng f037f87bde (i32 sext_in_reg (i32 aext (i16 x)), i16) -> (i32 sext x). No known test case until -promote-16bit is enabled.
llvm-svn: 101551
2010-04-16 22:26:19 +00:00
Evan Cheng af56facacd Adding support for dag combiner to promote operations for profit. This requires target specific queries. For example, x86 should promote i16 to i32 when it does not impact load folding.
x86 support is off by default. It can be enabled with -promote-16bit.

Work in progress.

llvm-svn: 101448
2010-04-16 06:14:10 +00:00
Chris Lattner 3245afdf05 enhance the load/store narrowing optimization to handle a
tokenfactor in between the load/store.  This allows us to 
optimize test7 into:

_test7:                                 ## @test7
## BB#0:                                ## %entry
	movl	(%rdx), %eax
                                        ## kill: SIL<def> ESI<kill>
	movb	%sil, 5(%rdi)
	ret

instead of:

_test7:                                 ## @test7
## BB#0:                                ## %entry
	movl	4(%esp), %ecx
	movl	$-65281, %eax           ## imm = 0xFFFFFFFFFFFF00FF
	andl	4(%ecx), %eax
	movzbl	8(%esp), %edx
	shll	$8, %edx
	addl	%eax, %edx
	movl	12(%esp), %eax
	movl	(%eax), %eax
	movl	%edx, 4(%ecx)
	ret

llvm-svn: 101355
2010-04-15 06:10:49 +00:00
Chris Lattner 6ebd8674eb teach codegen to turn trunc(zextload) into load when possible.
This doesn't occur much at all, it only seems to formed in the case
when the trunc optimization kicks in due to phase ordering.  In that
case it is saves a few bytes on x86-32.

llvm-svn: 101350
2010-04-15 05:40:59 +00:00
Chris Lattner f9b2e3c68a add a simple dag combine to replace trivial shl+lshr with
and.  This happens with the store->load narrowing stuff.

llvm-svn: 101348
2010-04-15 05:28:43 +00:00
Chris Lattner 4041ab6e00 Implement rdar://7860110 (also in target/readme.txt) narrowing
a load/or/and/store sequence into a narrower store when it is
safe.  Daniel tells me that clang will start producing this sort
of thing with bitfields, and this does  trigger a few dozen times
on 176.gcc produced by llvm-gcc even now.

This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll 
into:

        movl    %eax, 36(%rdi)

instead of:

        movl    $4294967295, %eax       ## imm = 0xFFFFFFFF
        andq    32(%rdi), %rax
        shlq    $32, %rcx
        addq    %rax, %rcx
        movq    %rcx, 32(%rdi)

and each of the testcases into a single store.  Each of them used
to compile into craziness like this:

_test4:
	movl	$65535, %eax            ## imm = 0xFFFF
	andl	(%rdi), %eax
	shll	$16, %esi
	addl	%eax, %esi
	movl	%esi, (%rdi)
	ret

llvm-svn: 101343
2010-04-15 04:48:01 +00:00
Dan Gohman bcaf681cde Add const qualifiers to CodeGen's use of LLVM IR constructs.
llvm-svn: 101334
2010-04-15 01:51:59 +00:00
Dan Gohman ecd40a34e2 Remove unnecessary parens.
llvm-svn: 101010
2010-04-12 02:24:01 +00:00
Ted Kremenek d87bd77586 Fix -Wsign-compare warning (issued by clang++).
llvm-svn: 100799
2010-04-08 18:49:30 +00:00
Chris Lattner 6855d62768 fix 80 col violation, patch by Alastair Lynn
llvm-svn: 100639
2010-04-07 18:13:33 +00:00
Evan Cheng 43cd9e3845 Fix sdisel memcpy, memset, memmove lowering:
1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless it's fast.
3. Fix some memcpy lowering logic bug related to when to optimize a
   load from constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
   ops more effectively.
rdar://7774704

llvm-svn: 100090
2010-04-01 06:04:33 +00:00
Chris Lattner 4ec0b670d5 fix PR6533 by updating the br(xor) code to remember the case
when it looked past a trunc.

llvm-svn: 98203
2010-03-10 23:46:44 +00:00
Dan Gohman 703b12d62f Fix another bitwidth calculation to handle vector types; based on a
patch by Micah Villmow for PR6572.

llvm-svn: 98188
2010-03-10 21:04:53 +00:00
Dan Gohman e14c4087a3 Fix more code to work properly with vector operands. Based on
a patch my Micah Villmow for PR6465.

llvm-svn: 97692
2010-03-04 00:23:16 +00:00
Bill Wendling c8d3add052 Use APInt instead of zext value.
llvm-svn: 97631
2010-03-03 01:58:01 +00:00
Bill Wendling af13d82945 This test case:
long test(long x) { return (x & 123124) | 3; }

Currently compiles to:

_test:
        orl     $3, %edi
        movq    %rdi, %rax
        andq    $123127, %rax
        ret

This is because instruction and DAG combiners canonicalize

  (or (and x, C), D) -> (and (or, D), (C | D))

However, this is only profitable if (C & D) != 0. It gets in the way of the
3-addressification because the input bits are known to be zero.

llvm-svn: 97616
2010-03-03 00:35:56 +00:00
Dan Gohman 4cec543952 Fix several places to handle vector operands properly.
Based on a patch by Micah Villmow for PR6438.

llvm-svn: 97538
2010-03-02 02:14:38 +00:00
Evan Cheng 228c31f045 Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap.
llvm-svn: 97310
2010-02-27 07:36:59 +00:00
Daniel Dunbar 4811d004be Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in
the hopes of fixing PPC bootstrap.

llvm-svn: 97040
2010-02-24 17:05:47 +00:00
Evan Cheng 328a607490 Re-apply 96540 and 96556 with fixes.
llvm-svn: 97011
2010-02-24 01:42:31 +00:00
Duncan Sands d0bf6f640f Revert commits 96556 and 96640, because commit 96556 breaks the
dragonegg self-host build.  I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier.  The
symptom of the 96556 miscompile is the following crash:

  llvm[3]: Compiling AlphaISelLowering.cpp for Release build
  cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
  Stack dump:
  0.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
  g++: Internal error: Aborted (program cc1plus)

This occurs when building LLVM using LLVM built by LLVM (via
dragonegg).  Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM.  Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.

Found by bisection.

r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines

Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines

Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96672
2010-02-19 11:30:41 +00:00
Evan Cheng 0ceb68a552 Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

llvm-svn: 96556
2010-02-18 02:13:50 +00:00
David Greene 39c6d01879 Add non-temporal flags and remove an assumption of default arguments.
llvm-svn: 96240
2010-02-15 17:00:31 +00:00
Dan Gohman 4a618827de Fix "the the" and similar typos.
llvm-svn: 95781
2010-02-10 16:03:48 +00:00
Mon P Wang d74e0023c5 Improve EXTRACT_VECTOR_ELT patch based on comments from Duncan
llvm-svn: 95012
2010-02-01 22:15:09 +00:00
Mon P Wang 72c60c73af Fixed a couple of optimization with EXTRACT_VECTOR_ELT that assumes the result
type is the same as the element type of the vector.  EXTRACT_VECTOR_ELT can
be used to extended the width of an integer type.  This fixes a bug for
Generic/vector-casts.ll on a ppc750.

llvm-svn: 94990
2010-02-01 19:03:18 +00:00
Evan Cheng 555f61bf58 Implement cond ? -1 : 0 with sbb.
llvm-svn: 94490
2010-01-26 02:00:44 +00:00
Dan Gohman 954f49014d Fold (add x, shl(0 - y, n)) -> sub(x, shl(y, n)), to simplify some code
that SCEVExpander can produce when running on behalf of LSR.

llvm-svn: 93949
2010-01-19 23:30:49 +00:00
Evan Cheng 88b65bc835 Canonicalize -1 - x to ~x.
Instcombine does this but apparently there are situations where this pattern will escape the optimizer and / or created by isel. Here is a case that's seen in JavaScriptCore:
  %t1 = sub i32 0, %a
  %t2 = add i32 %t1, -1
The dag combiner pattern: ((c1-A)+c2) -> (c1+c2)-A
will fold it to -1 - %a.

llvm-svn: 93773
2010-01-18 21:38:44 +00:00
Dan Gohman dd5286dc63 Fix a codegen abort seen in 483.xalancbmk.
llvm-svn: 93417
2010-01-14 03:08:49 +00:00
Mon P Wang ec57c81e64 Disable transformation of select of two loads to a select of address and then a load if the
loads are not in the default address space because the transformation discards src value info.

llvm-svn: 93180
2010-01-11 20:12:49 +00:00
Dan Gohman 6bd3ef82ff Revert an earlier change to SIGN_EXTEND_INREG for vectors. The VTSDNode
really does need to be a vector type, because
TargetLowering::getOperationAction for SIGN_EXTEND_INREG uses that type,
and it needs to be able to distinguish between vectors and scalars.

Also, fix some more issues with legalization of vector casts.

llvm-svn: 93043
2010-01-09 02:13:55 +00:00
Chris Lattner dab2cd543f Fix rdar://7517201, a regression introduced by r92849.
When folding a and(any_ext(load)) both the any_ext and the
load have to have only a single use.

This removes the anyext-uses.ll testcase which started failing
because it is unreduced and unclear what it is testing.

llvm-svn: 92950
2010-01-07 21:59:23 +00:00
Chris Lattner 88de38453f factor this code better and reduce nesting at the same
time, no functionality change.

llvm-svn: 92948
2010-01-07 21:53:27 +00:00
Evan Cheng 166a4e6caa Teach dag combine to fold the following transformation more aggressively:
(OP (trunc x), (trunc y)) -> (trunc (OP x, y))

Unfortunately this simple change causes dag combine to infinite looping. The problem is the shrink demanded ops optimization tend to canonicalize expressions in the opposite manner. That is badness. This patch disable those optimizations in dag combine but instead it is done as a late pass in sdisel.

This also exposes some deficiencies in dag combine and x86 setcc / brcond lowering. Teach them to look pass ISD::TRUNCATE in various places.

llvm-svn: 92849
2010-01-06 19:38:29 +00:00
Bill Wendling 03f0af372c Don't assign the shift the same type as the variable being shifted. This could
result in illegal types for the SHL operator.

llvm-svn: 92797
2010-01-05 22:39:10 +00:00
David Greene fe5c3524c7 Change errs() to dbgs().
llvm-svn: 92578
2010-01-05 01:25:00 +00:00
Evan Cheng b175de6356 Increase opportunities to optimize (brcond (srl (and c1), c2)).
llvm-svn: 91717
2009-12-18 21:31:31 +00:00
Evan Cheng aadf060b92 Revert this dag combine change:
Fold (zext (and x, cst)) -> (and (zext x), cst)

DAG combiner likes to optimize expression in the other way so this would end up cause an infinite looping.

llvm-svn: 91574
2009-12-17 00:40:05 +00:00
Evan Cheng 852c486946 Make 91378 more conservative.
1. Only perform (zext (shl (zext x), y)) -> (shl (zext x), y) when y is a constant. This makes sure it remove at least one zest.
2. If the shift is a left shift, make sure the original shift cannot shift out bits.

llvm-svn: 91399
2009-12-15 03:00:32 +00:00
Evan Cheng d1521ef40c Fold (zext (and x, cst)) -> (and (zext x), cst).
llvm-svn: 91380
2009-12-15 00:52:11 +00:00
Evan Cheng ca7c690d3b Propagate zest through logical shift.
llvm-svn: 91378
2009-12-15 00:41:36 +00:00
Dan Gohman cecad35728 Fix integer cast code to handle vector types.
llvm-svn: 91362
2009-12-14 23:40:38 +00:00
Dan Gohman 1d459e4937 Implement vector widening, splitting, and scalarizing for SIGN_EXTEND_INREG.
llvm-svn: 91158
2009-12-11 21:31:27 +00:00
Evan Cheng f5938d5d27 Move isConsecutiveLoad to SelectionDAG. It's not target dependent and it's primary used by selectdag passes.
llvm-svn: 90922
2009-12-09 01:36:00 +00:00
Evan Cheng 34a23ea371 Refactor InferAlignment out of DAGCombine.
llvm-svn: 90917
2009-12-09 01:04:59 +00:00
Nate Begeman 9655f84662 Don't pull vector sext through both hands of a logical operation, since doing so prevents the fusion of vector sext and setcc into vsetcc.
Add a testcase for the above transformation.
Fix a bogus use of APInt noticed while tracking this down.

llvm-svn: 90423
2009-12-03 07:11:29 +00:00
Jakob Stoklund Olesen 32042f9475 Don't call getValueType() on a null SDValue
llvm-svn: 90415
2009-12-03 05:15:35 +00:00
Dan Gohman 82e80019a5 Remove the optimizations that convert BRCOND and BR_CC into
unconditional branches or fallthroghes. Instcombine/SimplifyCFG
should be simplifying branches with known conditions.

This fixes some problems caused by these transformations not
updating the MachineBasicBlock CFG.

llvm-svn: 89017
2009-11-17 00:47:23 +00:00
Dan Gohman a951526510 Remove an unneeded #include.
llvm-svn: 86601
2009-11-09 22:28:30 +00:00
Dan Gohman ba8735d25a When discarding SrcValue information, discard all of it so that code
that uses this information knows to behave conservatively.

llvm-svn: 85654
2009-10-31 14:14:04 +00:00
Dan Gohman 14ca753e28 Don't call SDNode::isPredecessorOf when it isn't necessary. If the load's
chains have no users, they can't be predecessors of the condition.

llvm-svn: 85394
2009-10-28 15:28:02 +00:00
Nick Lewycky 974e12b2d3 Remove includes of Support/Compiler.h that are no longer needed after the
VISIBILITY_HIDDEN removal.

llvm-svn: 85043
2009-10-25 06:57:41 +00:00
Nick Lewycky 02d5f77d26 Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces.
Chris claims we should never have visibility_hidden inside any .cpp file but
that's still not true even after this commit.

llvm-svn: 85042
2009-10-25 06:33:48 +00:00
Anton Korobeynikov a6faf60831 Fix invalid for vector types fneg(bitconvert(x)) => bitconvert(x ^ sign)
transform.

llvm-svn: 84683
2009-10-20 21:37:45 +00:00
Nate Begeman a3ed9edd40 More heuristics for Combiner-AA. Still catches all important cases, but
compile time penalty on gnugo, the worst case in MultiSource, is down to
about 2.5% from 30%

llvm-svn: 83824
2009-10-12 05:53:58 +00:00
Nate Begeman 18150d5abc Fix combiner-aa issue with bases which are different, but can alias.
Previously, it treated GV+28 GV+0 as different bases, and assumed they could
not alias.

llvm-svn: 82753
2009-09-25 06:05:26 +00:00
Dan Gohman 203d53ed79 Use getStoreSize() instead of getStoreSizeInBits()/8.
llvm-svn: 82656
2009-09-23 21:07:02 +00:00
Dan Gohman 08c0a95ac6 Rename several variables from EVT to more descriptive names, now that EVT
is also the name of their type, as declarations like "EVT EVT" look
really odd.

llvm-svn: 82654
2009-09-23 21:02:20 +00:00
Nate Begeman 879d8f1c3e Substantially speed up combiner-aa in the following ways:
1. Switch from an std::set to a SmallPtrSet for visited chain nodes.
2. Do not force the recursive flattening of token factor nodes, regardless of
   use count.
3. Immediately process newly created TokenFactor nodes.

Also, improve combiner-aa by teaching it that loads to non-overlapping offsets
of relatively aligned objects cannot alias.

These changes result in a >5x speedup for combiner-aa on most testcases.

llvm-svn: 81816
2009-09-15 00:18:30 +00:00
Bob Wilson 39f51320ca Don't swap the operands of a subtraction when trying to create a
post-decrement load/store.

llvm-svn: 81464
2009-09-10 22:09:31 +00:00
Duncan Sands 2fbeaf084f Remove some unused variables and methods warned about by
icc (#177, partial).  Patch by Erick Tryzelaar.

llvm-svn: 81106
2009-09-06 08:33:48 +00:00
Chris Lattner 4dc3edde9f remove a few DOUTs here and there.
llvm-svn: 79832
2009-08-23 06:35:02 +00:00
Eli Friedman 79ba8f2edc Add check for completeness. Note that this doesn't actually have any
effect with the way the current code is structured.

llvm-svn: 79792
2009-08-23 00:14:19 +00:00
Eli Friedman 1e008c173a PR4737: Fix a nasty bug in load narrowing with non-power-of-two types.
llvm-svn: 79415
2009-08-19 08:46:10 +00:00
Owen Anderson 117c9e8497 Add contexts to some of the MVT APIs. No functionality change yet, just the infrastructure work needed to get the contexts to where they need to be first.
llvm-svn: 78759
2009-08-12 00:36:31 +00:00
Owen Anderson 9f94459d24 Split EVT into MVT and EVT, the former representing _just_ a primitive type, while
the latter is capable of representing either a primitive or an extended type.

llvm-svn: 78713
2009-08-11 20:47:22 +00:00
Dan Gohman 9d26c85bdc Fix a bug in the DAGCombiner's handling of multiple linked
MERGE_VALUES nodes. Replacing the result values with the
operands in one MERGE_VALUES node may cause another
MERGE_VALUES node be CSE'd with the first one, and bring
its uses along, so that the first one isn't dead, as this
code expects. Fix this by iterating until the node is
really dead. This fixes PR4699.

llvm-svn: 78619
2009-08-10 23:43:19 +00:00
Dan Gohman 733a64db57 Fix a bug where DAGCombine was producing an illegal ConstantFP
node after legalize, and remove the workaround code from the
ARM backend.

llvm-svn: 78615
2009-08-10 23:15:10 +00:00
Owen Anderson 53aa7a960c Rename MVT to EVT, in preparation for splitting SimpleValueType out into its own struct type.
llvm-svn: 78610
2009-08-10 22:56:29 +00:00
Dan Gohman b717091e69 Make this comment more closely reflect the code.
llvm-svn: 78569
2009-08-10 16:50:32 +00:00
Jakob Stoklund Olesen dc6bccbaa6 Don't build illegal ops in DAGCombiner::SimplifyBinOpWithSameOpcodeHands().
Blackfin supports and/or/xor on i32 but not on i16. Teach
DAGCombiner::SimplifyBinOpWithSameOpcodeHands to not produce illegal nodes
after legalize ops.

llvm-svn: 78497
2009-08-08 20:42:17 +00:00
Dan Gohman 5758e1e92a Fix a few places in DAGCombiner that were creating all-ones-bits
and high-bits values in ways that weren't correct for integer
types wider than 64 bits. This fixes a miscompile in
PPMacroExpansion.cpp in clang on x86-64.

llvm-svn: 78295
2009-08-06 09:18:59 +00:00
Dan Gohman 3f323847bc Avoid forming a SELECT_CC in a type that the target doesn't
support. This isn't immediately interesting, because Legalize
ends up lowering SELECT_CC if the target doesn't support it,
but this simplifies the process.

Also, if the SELECT_CC would be expanded in Legalize, it
can potentially end up with two copies of the condition
expression. By leaving it as SELECT+SETCC, the SELECT can be
expanded into two SELECTs that use a single SETCC.

The two comparisons are usually CSE'd, but depending on
when various expressions get legalized, the comparison
expression could involve calls to library functions, such
that the comparison expression may not be able to be CSE'd.
This will be needed by a future patch.

llvm-svn: 77896
2009-08-02 16:19:38 +00:00
Owen Anderson 4056ca9568 Move types back to the 2.5 API.
llvm-svn: 77516
2009-07-29 22:17:13 +00:00
Owen Anderson c2c7932c64 Change ConstantArray to 2.5 API.
llvm-svn: 77347
2009-07-28 18:32:17 +00:00
Jakob Stoklund Olesen 1ae0736830 Add support for promoting SETCC operations.
llvm-svn: 76987
2009-07-24 18:22:59 +00:00
Evan Cheng a7bb55ebb6 Fix a dagga combiner bug: avoid creating illegal constant.
Is this really a winning transformation?
fold (shl (srl x, c1), c2) -> (shl (and x, (shl -1, c1)), (sub c2, c1)) or                                                                              
                              (srl (and x, (shl -1, c1)), (sub c1, c2))

llvm-svn: 76535
2009-07-21 05:40:15 +00:00
Owen Anderson f945a9ed07 Move a few more convenience factory functions from Constant to LLVMContext.
llvm-svn: 75840
2009-07-15 21:51:10 +00:00
Torok Edwin fbcc663cbf llvm_unreachable->llvm_unreachable(0), LLVM_UNREACHABLE->llvm_unreachable.
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").

llvm-svn: 75640
2009-07-14 16:55:14 +00:00
Torok Edwin 56d0659726 assert(0) -> LLVM_UNREACHABLE.
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.

llvm-svn: 75379
2009-07-11 20:10:48 +00:00
Torok Edwin ccb29cd290 Convert more assert(0)+abort() -> LLVM_UNREACHABLE,
and abort()/exit() -> llvm_report_error().

llvm-svn: 75363
2009-07-11 13:10:19 +00:00
Owen Anderson 0504e0a222 Thread LLVMContext through MVT and related parts of SDISel.
llvm-svn: 75153
2009-07-09 17:57:24 +00:00
Chris Lattner 4ac607332d dag combine sext(setcc) -> vsetcc before legalize. To make this safe,
VSETCC must define all bits, which is different than it was documented
to before.  Since all targets that implement VSETCC already have this
behavior, and we don't optimize based on this, just change the 
documentation.  We now get nice code for vec_compare.ll

llvm-svn: 74978
2009-07-08 00:31:33 +00:00
Nate Begeman 624690c6b2 Adapt the x86 build_vector dagcombine to the current state of the legalizer.
build vectors with i64 elements will only appear on 32b x86 before legalize.
Since vector widening occurs during legalize, and produces i64 build_vector 
elements, the dag combiner is never run on these before legalize splits them
into 32b elements.

Teach the build_vector dag combine in x86 back end to recognize consecutive 
loads producing the low part of the vector.

Convert the two uses of TLI's consecutive load recognizer to pass LoadSDNodes
since that was required implicitly.

Add a testcase for the transform.

Old:
	subl	$28, %esp
	movl	32(%esp), %eax
	movl	4(%eax), %ecx
	movl	%ecx, 4(%esp)
	movl	(%eax), %eax
	movl	%eax, (%esp)
	movaps	(%esp), %xmm0
	pmovzxwd	%xmm0, %xmm0
	movl	36(%esp), %eax
	movaps	%xmm0, (%eax)
	addl	$28, %esp
	ret

New:
	movl	4(%esp), %eax
	pmovzxwd	(%eax), %xmm0
	movl	8(%esp), %eax
	movaps	%xmm0, (%eax)
	ret

llvm-svn: 72957
2009-06-05 21:37:30 +00:00
Dan Gohman 7b6b5dd954 Don't do the X * 0.0 -> 0.0 transformation in instcombine, because
instcombine doesn't know when it's safe. To partially compensate
for this, introduce new code to do this transformation in
dagcombine, which can use UnsafeFPMath.

llvm-svn: 72872
2009-06-04 17:12:12 +00:00
Dale Johannesen 5234d3795f Revert 72707 and 72709, for the moment.
llvm-svn: 72712
2009-06-02 03:12:52 +00:00
Dale Johannesen 0b8ca79253 Make the implicit inputs and outputs of target-independent
ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to)
instead of MVT::Flag.  Remove CARRY_FALSE in favor of 0; adjust
all target-independent code to use this format.

Most targets will still produce a Flag-setting target-dependent
version when selection is done.  X86 is converted to use i32
instead, which means TableGen needs to produce different code
in xxxGenDAGISel.inc.  This keys off the new supportsHasI1 bit
in xxxInstrInfo, currently set only for X86; in principle this
is temporary and should go away when all other targets have
been converted.  All relevant X86 instruction patterns are
modified to represent setting and using EFLAGS explicitly.  The
same can be done on other targets.

The immediate behavior change is that an ADC/ADD pair are no
longer tightly coupled in the X86 scheduler; they can be
separated by instructions that don't clobber the flags (MOV).
I will soon add some peephole optimizations based on using
other instructions that set the flags to feed into ADC.

llvm-svn: 72707
2009-06-01 23:27:20 +00:00
Evan Cheng 86cdb4b345 Do not try to create a MVT type of width 0.
llvm-svn: 72557
2009-05-28 23:52:18 +00:00
Evan Cheng 6673ff08fe Incorporate patch feedbacks.
llvm-svn: 72533
2009-05-28 18:41:02 +00:00
Evan Cheng a9cda8abf2 Added optimization that narrow load / op / store and the 'op' is a bit twiddling instruction and its second operand is an immediate. If bits that are touched by 'op' can be done with a narrower instruction, reduce the width of the load and store as well. This happens a lot with bitfield manipulation code.
e.g.
orl     $65536, 8(%rax)
=>
orb     $1, 10(%rax)

Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization.

llvm-svn: 72507
2009-05-28 00:35:15 +00:00
Torok Edwin be6a9a151a Fix PR4254.
The DAGCombiner created a negative shiftamount, stored in an
unsigned variable. Later the optimizer eliminated the shift entirely as being
undefined.
Example: (srl (shl X, 56) 48). ShiftAmt is 4294967288.
Fix it by checking that the shiftamount is positive, and storing in a signed
variable.

llvm-svn: 72331
2009-05-23 17:29:48 +00:00
Daniel Dunbar a8c1658619 Silence Release-Asserts warnings.
llvm-svn: 72011
2009-05-18 16:43:04 +00:00
Duncan Sands af9eaa830a Rename PaddedSize to AllocSize, in the hope that this
will make it more obvious what it represents, and stop
it being confused with the StoreSize.

llvm-svn: 71349
2009-05-09 07:06:46 +00:00
Evan Cheng cfc0513080 Do not use register as base ptr of pre- and post- inc/dec load / store nodes.
llvm-svn: 71098
2009-05-06 18:25:01 +00:00
Bill Wendling 026e5d7667 Instead of passing in an unsigned value for the optimization level, use an enum,
which better identifies what the optimization is doing. And is more flexible for
future uses.

llvm-svn: 70440
2009-04-29 23:29:43 +00:00
Nate Begeman 5f829d896d Implement review feedback for vector shuffle work.
llvm-svn: 70372
2009-04-29 05:20:52 +00:00
Bill Wendling 084669a1c9 Second attempt:
Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'll change the JIT with a follow-up patch.

llvm-svn: 70343
2009-04-29 00:15:41 +00:00
Bill Wendling 56f2987a87 r70270 isn't ready yet. Back this out. Sorry for the noise.
llvm-svn: 70275
2009-04-28 01:04:53 +00:00
Bill Wendling d0ae15946c Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'm not 100% sure if it's necessary to change it there...

llvm-svn: 70270
2009-04-28 00:21:31 +00:00
Nate Begeman 8d6d4b9289 2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan.
PR2957

ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

llvm-svn: 70225
2009-04-27 18:41:29 +00:00
Dan Gohman be36f5ccda When transforming sext(trunc(load(x))) into sext(smaller load(x)),
the trunc is directly replaced with the smaller load, so don't
try to create a new sext node. This fixes PR4050.

llvm-svn: 70179
2009-04-27 02:00:55 +00:00
Dan Gohman 4539987920 Add a top-level comment about DAGCombiner's role in the compiler.
llvm-svn: 70052
2009-04-25 17:09:45 +00:00
Rafael Espindola b93db668b3 Revert 69952. Causes testsuite failures on linux x86-64.
llvm-svn: 69967
2009-04-24 12:40:33 +00:00
Nate Begeman bb881d66f4 PR2957
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.

llvm-svn: 69952
2009-04-24 03:42:54 +00:00
Bob Wilson da188ebbbd Revise my previous change 68996 as suggested by Duncan.
llvm-svn: 69607
2009-04-20 17:27:09 +00:00
Duncan Sands e4ff21ba4b Don't try to make BUILD_VECTOR operands have the same
type as the vector element type: allow them to be of
a wider integer type than the element type all the way
through the system, and not just as far as LegalizeDAG.
This should be safe because it used to be this way
(the old type legalizer would produce such nodes), so
backends should be able to handle it.  In fact only
targets which have legal vector types with an illegal
promoted element type will ever see this (eg: <4 x i16>
on ppc).  This fixes a regression with the new type
legalizer (vec_splat.ll).  Also, treat SCALAR_TO_VECTOR
the same as BUILD_VECTOR.  After all, it is just a
special case of BUILD_VECTOR.

llvm-svn: 69467
2009-04-18 20:16:54 +00:00
Bob Wilson 59dbbb2bb4 Change SelectionDAG type legalization to allow BUILD_VECTOR operands to be
promoted to legal types without changing the type of the vector.  This is
following a suggestion from Duncan
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/019923.html).
The transformation that used to be done during type legalization is now
postponed to DAG legalization.  This allows the BUILD_VECTORs to be optimized
and potentially handled specially by target-specific code.

It turns out that this is also consistent with an optimization done by the
DAG combiner: a BUILD_VECTOR and INSERT_VECTOR_ELT may be combined by
replacing one of the BUILD_VECTOR operands with the newly inserted element;
but INSERT_VECTOR_ELT allows its scalar operand to be larger than the
element type, with any extra high bits being implicitly truncated.  The
result is a BUILD_VECTOR where one of the operands has a type larger the
the vector element type.

Any code that operates on BUILD_VECTORs may now need to be aware of the
potential type discrepancy between the vector element type and the
BUILD_VECTOR operands.  This patch updates all of the places that I could
find to handle that case.

llvm-svn: 68996
2009-04-13 22:05:19 +00:00
Dan Gohman 0e8d199f91 Generalize ExtendUsesToFormExtLoad to be usable for ANY_EXTEND,
in addition to ZERO_EXTEND and SIGN_EXTEND. Fix a bug in the
way it checked for live-out values, and simplify the way it
find users by using SDNode::use_iterator's (relatively) new
features. Also, make it slightly more permissive on targets
with free truncates.

In SelectionDAGBuild, avoid creating ANY_EXTEND nodes that are
larger than necessary. If the target's SwitchAmountTy has
enough bits, use it. This exposes the truncate to optimization
early, enabling more optimizations.

llvm-svn: 68670
2009-04-09 03:51:29 +00:00
Dan Gohman ad3e549a53 Implement support for using modeling implicit-zero-extension on x86-64
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.

This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.

Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.

Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.

Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.

Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.

llvm-svn: 68576
2009-04-08 00:15:30 +00:00
Evan Cheng fd81c73cde Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g.
x * 40
=>
shlq    $3, %rdi
leaq    (%rdi,%rdi,4), %rax

This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq    (%rdi,%rdi,2), %rax
leaq    (%rsi,%rax,8), %rax

llvm-svn: 67917
2009-03-28 05:57:29 +00:00
Bill Wendling aa28be652c Pull transform from target-dependent code into target-independent code.
llvm-svn: 67742
2009-03-26 06:14:09 +00:00
Mon P Wang 523c0852c6 Fix a problem with DAGCombine where we were building an illegal build
vector shuffle mask. Forced the mask to be built using i32.  Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.

llvm-svn: 67076
2009-03-17 06:33:10 +00:00
Mon P Wang c86715631c Avoid doing the transformation c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4
if FPConstant is legal because if the FPConstant doesn't need to be stored
in a constant pool, the transformation is unlikely to be profitable.

llvm-svn: 66994
2009-03-14 00:25:19 +00:00
Evan Cheng 1fb8aedd1e Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.


Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.

llvm-svn: 66875
2009-03-13 07:51:59 +00:00
Chris Lattner 4147f08e44 Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))"
related transformations out of target-specific dag combine into the
ARM backend.  These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
with the recently added cp constant select optimization, but is a
very general xform.  For example, we now compile the second example
in const-select.ll to:

_test:
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta    %al
        movzbl  %al, %eax
        movl    4(%esp), %ecx
        movsbl  (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl    4(%esp), %eax
        leal    4(%eax), %ecx
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe  %eax, %ecx
        movsbl  (%ecx), %eax
        ret

This passes multisource and dejagnu.

llvm-svn: 66779
2009-03-12 06:52:53 +00:00
Chris Lattner 43d6377f89 reapply my previous patch (r66358) with a tweak to set the
alignment of the generated constant pool entry to the
desired alignment of a type.  If we don't do this, we end up
trying to do movsd from 4-byte alignment memory.  This fixes
450.soplex and 456.hmmer.

llvm-svn: 66641
2009-03-11 05:08:08 +00:00
Evan Cheng aa887653f4 Revert 66358 for now. It's breaking povray, 450.soplex, and 456.hmmer on x86 / Darwin.
llvm-svn: 66574
2009-03-10 20:47:18 +00:00
Chris Lattner 4249b9a698 Fix PR3763 by using proper APInt methods instead of uint64_t's.
llvm-svn: 66434
2009-03-09 20:22:18 +00:00
Chris Lattner ab5a443144 implement an optimization to codegen c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4.
For 2009-03-07-FPConstSelect.ll we now produce:

_f:
	xorl	%eax, %eax
	testl	%edi, %edi
	movl	$4, %ecx
	cmovne	%rax, %rcx
	leaq	LCPI1_0(%rip), %rax
	movss	(%rcx,%rax), %xmm0
	ret

previously we produced:

_f:
	subl	$4, %esp
	cmpl	$0, 8(%esp)
	movss	LCPI1_0, %xmm0
	je	LBB1_2	## entry
LBB1_1:	## entry
	movss	LCPI1_1, %xmm0
LBB1_2:	## entry
	movss	%xmm0, (%esp)
	flds	(%esp)
	addl	$4, %esp
	ret

on PPC the code also improves to:

_f:
	cntlzw r2, r3
	srwi r2, r2, 5
	li r3, lo16(LCPI1_0)
	slwi r2, r2, 2
	addis r3, r3, ha16(LCPI1_0)
	lfsx f1, r3, r2
	blr 

from:

_f:
	li r2, lo16(LCPI1_1)
	cmplwi cr0, r3, 0
	addis r2, r2, ha16(LCPI1_1)
	beq cr0, LBB1_2	; entry
LBB1_1:	; entry
	li r2, lo16(LCPI1_0)
	addis r2, r2, ha16(LCPI1_0)
LBB1_2:	; entry
	lfs f1, 0(r2)
	blr 

This also improves the existing pic-cpool case from:

foo:
	subl	$12, %esp
	call	.Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
	popl	%eax
	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
	cmpl	$0, 16(%esp)
	movsd	.LCPI1_0@GOTOFF(%eax), %xmm0
	je	.LBB1_2	# entry
.LBB1_1:	# entry
	movsd	.LCPI1_1@GOTOFF(%eax), %xmm0
.LBB1_2:	# entry
	movsd	%xmm0, (%esp)
	fldl	(%esp)
	addl	$12, %esp
	ret

to:

foo:
	call	.Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
	popl	%eax
	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
	xorl	%ecx, %ecx
	cmpl	$0, 4(%esp)
	movl	$8, %edx
	cmovne	%ecx, %edx
	fldl	.LCPI1_0@GOTOFF(%eax,%edx)
	ret

This triggers a few dozen times in spec FP 2000.

llvm-svn: 66358
2009-03-08 01:51:30 +00:00
Nate Begeman a9e981225e Fix a problem with DAGCombine on 64b targets where folding
extracts + build_vector into a shuffle would fail, because the
type of the new build_vector would not be legal.  Try harder to
create a legal build_vector type.  Note: this will be totally 
irrelevant once vector_shuffle no longer takes a build_vector for
shuffle mask.

New:
_foo:
	xorps	%xmm0, %xmm0
	xorps	%xmm1, %xmm1
	subps	%xmm1, %xmm1
	mulps	%xmm0, %xmm1
	addps	%xmm0, %xmm1
	movaps	%xmm1, 0

Old:
_foo:
	xorps	%xmm0, %xmm0
	movss	%xmm0, %xmm1
	xorps	%xmm2, %xmm2
	unpcklps	%xmm1, %xmm2
	pshufd	$80, %xmm1, %xmm1
	unpcklps	%xmm1, %xmm2
	pslldq	$16, %xmm2
	pshufd	$57, %xmm2, %xmm1
	subps	%xmm0, %xmm1
	mulps	%xmm0, %xmm1
	addps	%xmm0, %xmm1
	movaps	%xmm1, 0

llvm-svn: 65791
2009-03-01 23:44:07 +00:00
Evan Cheng a49de9de2e Revert BuildVectorSDNode related patches: 65426, 65427, and 65296.
llvm-svn: 65482
2009-02-25 22:49:59 +00:00
Scott Michel 9d31aca679 Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.

llvm-svn: 65296
2009-02-22 23:36:09 +00:00
Dan Gohman e7fe80fcf9 Fix a bug that David Greene found in the DAGCombiner's logic
that checks whether it's safe to transform a store of a bitcast
value into a store of the original value.

llvm-svn: 65201
2009-02-20 23:29:13 +00:00
Scott Michel cf0da6c597 Remove trailing whitespace to reduce later commit patch noise.
(Note: Eventually, commits like this will be handled via a pre-commit hook that
 does this automagically, as well as expand tabs to spaces and look for 80-col
 violations.)

llvm-svn: 64827
2009-02-17 22:15:04 +00:00
Dale Johannesen 84935759d5 Remove more non-DebugLoc getNode variants. Use
getCALLSEQ_{END,START} to permit passing no DebugLoc
there.  UNDEF doesn't logically have DebugLoc; add
getUNDEF to encapsulate this.

llvm-svn: 63978
2009-02-06 23:05:02 +00:00
Dale Johannesen 400dc2e2e4 Remove more non-DebugLoc versions of getNode.
llvm-svn: 63969
2009-02-06 21:50:26 +00:00
Dale Johannesen f1163e9a4d Propagation in TargetLowering. Includes passing a DL
into SimplifySetCC which gets called elsewhere.

llvm-svn: 63583
2009-02-03 00:47:48 +00:00
Duncan Sands 3ed768868d Fix PR3453 and probably a bunch of other potential
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.

llvm-svn: 63494
2009-02-01 18:06:53 +00:00
Bill Wendling a6c75ffd73 Forgot some more DebugLoc propagations.
llvm-svn: 63493
2009-02-01 11:19:36 +00:00
Duncan Sands 41826036b1 Fix PR3401: when using large integers, the type
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization.  Thanks to Dan for writing the
original patch (which I shamelessly pillaged).

llvm-svn: 63482
2009-01-31 15:50:11 +00:00
Bill Wendling 3b585af0ec Don't use DebugLoc::getUnknownLoc(). Default to something hopefully sensible.
llvm-svn: 63473
2009-01-31 03:12:48 +00:00
Bill Wendling 31b50991cb More DebugLoc propagation.
llvm-svn: 63454
2009-01-30 23:59:18 +00:00
Bill Wendling 27d9dd4b57 More DebugLoc propagation.
llvm-svn: 63452
2009-01-30 23:36:47 +00:00
Bill Wendling 306bfc2213 More DebugLoc propagation in LOAD etc. methods.
llvm-svn: 63451
2009-01-30 23:27:35 +00:00
Bill Wendling 0bd29743e3 More DebugLoc propagation in floating-point methods.
llvm-svn: 63446
2009-01-30 23:15:49 +00:00
Bill Wendling 6fbf5495f8 Standardize comments about folding xforms.
llvm-svn: 63443
2009-01-30 23:10:18 +00:00
Bill Wendling 8fb81f1b3d Get rid of the non-DebugLoc-ified getNOT() method.
llvm-svn: 63442
2009-01-30 23:03:19 +00:00
Bill Wendling 3dc5d2454e Propagate debug loc info for some FP arithmetic methods.
llvm-svn: 63441
2009-01-30 22:57:07 +00:00
Bill Wendling cb9be5d174 Propagate debug loc info for some FP arithmetic methods.
llvm-svn: 63440
2009-01-30 22:53:48 +00:00
Bill Wendling 4e0a61514b Propagate debug loc info for BIT_CONVERT.
llvm-svn: 63439
2009-01-30 22:44:24 +00:00
Bill Wendling 7bfa43b022 Propagate debug loc info for more *_EXTEND methods.
llvm-svn: 63437
2009-01-30 22:33:24 +00:00
Bill Wendling 9b3dc8d848 Propagate debug loc info for ANY_EXTEND.
llvm-svn: 63436
2009-01-30 22:27:33 +00:00
Bill Wendling c409318562 Propagate debug loc info for some of the *_EXTEND functions.
llvm-svn: 63434
2009-01-30 22:23:15 +00:00
Bill Wendling b6b6f46fe4 - Propagate debug loc info for SELECT.
- Added xform for (select X, 1, Y) and (select X, Y, 0), which was commented on,
  but missing.

llvm-svn: 63428
2009-01-30 22:02:18 +00:00
Bill Wendling d51e3ff540 Propagate debug loc info for Shifts.
llvm-svn: 63424
2009-01-30 21:37:17 +00:00
Bill Wendling 35972a9460 Propagate debug loc info for XOR and MatchRotate.
llvm-svn: 63420
2009-01-30 21:14:50 +00:00
Bill Wendling f29b6e1318 Propagate debug loc info for OR. Also clean up some comments.
llvm-svn: 63419
2009-01-30 20:59:34 +00:00
Bill Wendling ff8acd684f Perform obvious constant arithmetic folding.
llvm-svn: 63417
2009-01-30 20:50:00 +00:00
Bill Wendling 8617191302 Propagate debug loc info for AND. Also clean up some comments.
llvm-svn: 63416
2009-01-30 20:43:18 +00:00
Bill Wendling 781db7a1ad Propagate debug loc info in SimplifyBinOpWithSameOpcodeHands.
llvm-svn: 63411
2009-01-30 19:25:47 +00:00
Bill Wendling 9b3407e5bb Propagate debug loc info in SimplifyNodeWithTwoResults.
llvm-svn: 63376
2009-01-30 03:08:40 +00:00
Bill Wendling faed065e5c Propagate debug loc info for MULHS.
llvm-svn: 63375
2009-01-30 03:00:18 +00:00
Bill Wendling d033af09fd Propagate debug loc info for SREM and UREM.
llvm-svn: 63374
2009-01-30 02:57:00 +00:00
Bill Wendling aff3e03765 Propagate debug loc info for UDIV.
llvm-svn: 63373
2009-01-30 02:55:25 +00:00
Bill Wendling 5b663e7b53 Propagate debug loc info for SDIV.
llvm-svn: 63372
2009-01-30 02:52:17 +00:00
Bill Wendling b48dcf67e5 Forgot to propagate debug loc info here.
llvm-svn: 63371
2009-01-30 02:49:26 +00:00
Bill Wendling 091f92f568 Propagate debug loc info for MUL.
llvm-svn: 63369
2009-01-30 02:45:56 +00:00
Bill Wendling 48ff08ef3e Propagate debug loc info in SUB.
llvm-svn: 63368
2009-01-30 02:42:10 +00:00
Bill Wendling 6127757920 Propagate debug loc info in ADDC and ADDE.
llvm-svn: 63367
2009-01-30 02:38:00 +00:00
Bill Wendling c442348dd7 Propagate debug loc info in DAG combine's "ADD".
llvm-svn: 63366
2009-01-30 02:31:17 +00:00
Bill Wendling cdd96133bd - Propagate debug loc info in combineSelectAndUse().
- Modify ReassociateOps so that the resulting SDValue is what the comment claims
  it is.

llvm-svn: 63365
2009-01-30 02:23:43 +00:00
Bill Wendling 9c9a3b6665 Propagate debug location info for the token factor.
llvm-svn: 63355
2009-01-30 01:13:16 +00:00
Bill Wendling f6d0aff0bd Add DebugLoc propagation to some of the methods in DAG combiner.
llvm-svn: 63350
2009-01-30 00:45:56 +00:00
Dan Gohman e58ab79f33 Make x86's BT instruction matching more thorough, and add some
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.

llvm-svn: 63266
2009-01-29 01:59:02 +00:00
Dan Gohman 4aa1846215 Make isOperationLegal do what its name suggests, and introduce a
new isOperationLegalOrCustom, which does what isOperationLegal
previously did.

Update a bunch of callers to use isOperationLegalOrCustom
instead of isOperationLegal. In some case it wasn't obvious
which behavior is desired; when in doubt I changed then to
isOperationLegalOrCustom as that preserves their previous
behavior.

This is for the second half of PR3376.

llvm-svn: 63212
2009-01-28 17:46:25 +00:00
Dan Gohman fb58faf29e Add an assertion to the form of SelectionDAG::getConstant that takes
a uint64_t to verify that the value is in range for the given type,
to help catch accidental overflow. Fix a few places that relied on
getConstant implicitly truncating the value.

llvm-svn: 63128
2009-01-27 20:39:34 +00:00
Dan Gohman 8e4ac9b71a Take the next steps in making SDUse more consistent with LLVM Use, and
tidy up SDUse and related code.
 - Replace the operator= member functions with a set method, like
   LLVM Use has, and variants setInitial and setNode, which take
   care up updating use lists, like LLVM Use's does. This simplifies
   code that calls these functions.
 - getSDValue() is renamed to get(), as in LLVM Use, though most
   places can either use the implicit conversion to SDValue or the
   convenience functions instead.
 - Fix some more node vs. value terminology issues.

Also, eliminate the one remaining use of SDOperandPtr, and
SDOperandPtr itself.

llvm-svn: 62995
2009-01-26 04:35:06 +00:00
Dan Gohman 1275e28ded Fold x-0 to x in unsafe-fp-math mode. This comes up in the
testcase from PR3376, and in fact is sufficient to completely
avoid the problem in that testcase.

There's an underlying problem though; TLI.isOperationLegal
considers Custom to be Legal, which might be ok in some
cases, but that's what DAGCombiner is using in many places
to test if something is legal when LegalOperations is true.
When DAGCombiner is running after legalize, this isn't
sufficient. I'll address this in a separate commit.

llvm-svn: 62860
2009-01-23 19:10:37 +00:00
Bob Wilson c2dc7ee5d0 Fix a minor bug in DAGCombiner's folding of SELECT. Folding "select C, 0, 1"
to "C ^ 1" is only valid when C is known to be either 0 or 1.  Most of the
similar foldings in this function only handle "i1" types, but this one appears
intentionally written to handle larger integer types.  If C has an integer
type larger than "i1", this needs to check if the high bits of a boolean
are known to be zero.  I also changed the comment to describe this folding as
"C ^ 1" instead of "~C", since that is what the code does and since the latter
would only be valid for "i1" types.  The good news is that most LLVM targets
use TargetLowering::ZeroOrOneBooleanContent so this change will not disable
the optimization; the bad news is that I've been unable to come up with a
testcase to demonstrate the problem.

I have also removed a "FIXME" comment for folding "select C, X, 0" to "C & X",
since the code looks correct to me.  It could be made more aggressive by not
limiting the type to "i1", but that would then require checking for
TargetLowering::ZeroOrNegativeOneBooleanContent.  Similar changes could be
done for the other SELECT foldings, but it was decided to be not worth the
trouble and complexity (see e.g., r44663).

llvm-svn: 62790
2009-01-22 22:05:48 +00:00
Dan Gohman 1f3411de47 Don't create ISD::FNEG nodes after legalize if they aren't legal.
Simplify x+0 to x in unsafe-fp-math mode. This avoids a bunch of
redundant work in many cases, because in unsafe-fp-math mode,
ISD::FADD with a constant is considered free to negate, so the
DAGCombiner often negates x+0 to -0-x thinking it's free, when
in reality the end result is -x, which is more expensive than x.

Also, combine x*0 to 0.

This fixes PR3374.

llvm-svn: 62789
2009-01-22 21:58:43 +00:00
Bob Wilson c58900504b Add SelectionDAG::getNOT method to construct bitwise NOT operations,
corresponding to the "not" and "vnot" PatFrags.  Use the new method
in some places where it seems appropriate.

llvm-svn: 62768
2009-01-22 17:39:32 +00:00
Dan Gohman 7e6b932f18 Simplify ReduceLoadWidth's logic: it doesn't need several different
special cases after producing the new reduced-width load, because the
new load already has the needed adjustments built into it. This fixes
several bugs due to the special cases, including PR3317.

llvm-svn: 62692
2009-01-21 15:17:51 +00:00
Dan Gohman 161b7b66ac Fix a dagcombine to not generate loads of non-round integer types,
as its comment says, even in the case where it will be generating
extending loads. This fixes PR3216.

llvm-svn: 62557
2009-01-20 01:06:45 +00:00
Dan Gohman cd0b1bf0a0 Fix SelectionDAG::ReplaceAllUsesWith to behave correctly when
uses are added to the From node while it is processing From's
use list, because of automatic local CSE. The fix is to avoid
visiting any new uses.

Fix a few places in the DAGCombiner that assumed that after
a RAUW call, the From node has no users and may be deleted.

This fixes PR3018.

llvm-svn: 62533
2009-01-19 21:44:21 +00:00
Mon P Wang e9e7abb6b8 Simplify extract element based on comments from Duncan Sands.
llvm-svn: 62459
2009-01-18 06:43:40 +00:00
Mon P Wang ca6d6dea0b Simplify extract element of a scalar to vector.
llvm-svn: 62383
2009-01-17 00:07:25 +00:00
Dan Gohman 38978ba972 Use the getNode() accessor instead of accessing the Node
member directly, which is private as of r55504.

llvm-svn: 62364
2009-01-16 21:47:21 +00:00
Chris Lattner 41828cdb0a new nodes should be added to the worklist, not old nodes.
llvm-svn: 62359
2009-01-16 21:15:56 +00:00
Dan Gohman 619ef48a52 Move a few containers out of ScheduleDAGInstrs::BuildSchedGraph
and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.

To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.

llvm-svn: 62275
2009-01-15 19:20:50 +00:00
Dan Gohman b9fa1d24f8 Fix a DAGCombiner abort on an invalid shift count constant. This fixes PR3250.
llvm-svn: 61613
2009-01-03 19:22:06 +00:00
Duncan Sands 8feb694e8f Fix PR3274: when promoting the condition of a BRCOND node,
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType.  In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).

llvm-svn: 61542
2009-01-01 15:52:00 +00:00
Dale Johannesen ee573fcefc Change comments so everybody can understand them, hopefully.
llvm-svn: 61405
2008-12-23 23:47:22 +00:00
Dale Johannesen acc84e5aa0 Add another permutation where we should get rid of a-a.
llvm-svn: 61401
2008-12-23 23:01:27 +00:00
Dale Johannesen d2a4685860 One more permutation of subtracting off a base value.
llvm-svn: 61361
2008-12-23 01:59:54 +00:00
Dale Johannesen f51dcef803 A new dag combine; several permutations of this
are there under ADD, this one was missing.

llvm-svn: 61107
2008-12-16 22:13:49 +00:00
Bill Wendling 1a317678bc Redo the arithmetic with overflow architecture. I was changing the semantics of
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.

Similar for SUB and MUL instructions.

llvm-svn: 60915
2008-12-12 00:56:36 +00:00
Bill Wendling 40d2476adc Clarify FIXME.
llvm-svn: 60867
2008-12-11 01:26:44 +00:00
Mon P Wang b5eb7205ea Make fix for r60829 less conservative to allow the proper optimization for
vec_extract-sse4.ll.

llvm-svn: 60865
2008-12-11 00:26:16 +00:00
Bill Wendling 0864a75ebf If ADD, SUB, or MUL have an overflow bit that's used, don't do transformation on
them. The DAG combiner expects that nodes that are transformed have one value
result.

llvm-svn: 60857
2008-12-10 22:36:00 +00:00
Mon P Wang 4637c3c698 Fixed a bug when trying to optimize a extract vector element of a
bit convert that changes the number of elements of a shuffle.

llvm-svn: 60829
2008-12-10 03:59:02 +00:00
Dale Johannesen 54bdec238a One more transformation.
llvm-svn: 60432
2008-12-02 18:40:40 +00:00
Dale Johannesen 8c76670b5a Add a few more transformations.
llvm-svn: 60391
2008-12-02 01:30:54 +00:00
Dale Johannesen 73bc0ba4c9 Add a missing case in visitADD.
llvm-svn: 60137
2008-11-27 00:43:21 +00:00
Duncan Sands dc2dac181a If the type legalizer actually legalized anything
(this doesn't happen that often, since most code
does not use illegal types) then follow it by a
DAG combiner run that is allowed to generate
illegal operations but not illegal types.  I didn't
modify the target combiner code to distinguish like
this between illegal operations and illegal types,
so it will not produce illegal operations as well
as not producing illegal types.

llvm-svn: 59960
2008-11-24 14:53:14 +00:00
Duncan Sands 8d6e2e13d5 Rename SetCCResultContents to BooleanContents. In
practice these booleans are mostly produced by SetCC,
however the concept is more general.

llvm-svn: 59911
2008-11-23 15:47:28 +00:00
Bill Wendling be8e7f851c - Move conversion of [SU]ADDO from DAG combiner into legalizer.
- Add "promote integer type" stuff to the legalizer for these nodes.

llvm-svn: 59847
2008-11-22 00:22:52 +00:00
Bill Wendling 0b5be6c5e0 Default to converting UADDO to the generic form that SADDO is converted to.
llvm-svn: 59801
2008-11-21 07:44:30 +00:00
Bill Wendling 8badb674eb Remove chains. Unnecessary.
llvm-svn: 59783
2008-11-21 02:22:59 +00:00
Bill Wendling 77538cc510 Rename "ADDO" to "SADDO" and "UADDO". The "UADDO" isn't equivalent to "ADDC"
because the boolean it returns to indicate an overflow may not be treated like
as a flag. It could be stored to memory, for instance.

llvm-svn: 59780
2008-11-21 02:12:42 +00:00
Bill Wendling 74296c60ff Implement the sadd_with_overflow intrinsic. This is converted into
"ISD::ADDO". ISD::ADDO is lowered into a target-independent form that does the
addition and then checks if the result is less than one of the operands. (If it
is, then there was an overflow.)

llvm-svn: 59779
2008-11-21 02:03:52 +00:00
Bill Wendling 49a5ce863e Fix for PR3040:
The CC was changed, but wasn't checked to see if it was legal if the DAG
combiner was being run after legalization. Threw in a couple of checks just to
make sure that it's okay. As far as the PR is concerned, no back-end target
actually exhibited this problem, so there isn't an associated testcase.

llvm-svn: 59035
2008-11-11 08:25:46 +00:00
Mon P Wang 25f0106fd9 Added support for the following definition of shufflevector
<result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> 

llvm-svn: 58964
2008-11-10 04:46:22 +00:00
Evan Cheng c7b04a12bb Type of shuffle mask has changed.
llvm-svn: 58751
2008-11-05 06:04:18 +00:00
Chris Lattner 5fa1040130 Don't produce invalid comparisons after legalize.
llvm-svn: 58320
2008-10-28 07:11:07 +00:00
Duncan Sands c6d12bd665 Use a legal integer type for vector shuffle mask
elements.  Otherwise LegalizeTypes will, reasonably
enough, legalize the mask, which may result in it
no longer being a BUILD_VECTOR node (LegalizeDAG
simply ignores the legality or not of vector masks).

llvm-svn: 57782
2008-10-19 14:58:05 +00:00
Dan Gohman 2fe6bee5b6 Teach DAGCombine to fold constant offsets into GlobalAddress nodes,
and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)

This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.

This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.

Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.

The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.

llvm-svn: 57748
2008-10-18 02:06:02 +00:00
Dan Gohman a39b0a1f05 Define patterns for shld and shrd that match immediate
shift counts, and patterns that match dynamic shift counts
when the subtract is obscured by a truncate node.

Add DAGCombiner support for recognizing rotate patterns
when the shift counts are defined by truncate nodes.

Fix and simplify the code for commuting shld and shrd
instructions to work even when the given instruction doesn't
have a parent, and when the caller needs a new instruction.

These changes allow LLVM to use the shld, shrd, rol, and ror
instructions on x86 to replace equivalent code using two
shifts and an or in many more cases.

llvm-svn: 57662
2008-10-17 01:23:35 +00:00
Evan Cheng 07d53b1d33 Rename LoadX to LoadExt.
llvm-svn: 57526
2008-10-14 21:26:46 +00:00
Dale Johannesen 54306fe499 Rename APFloat::convertToAPInt to bitcastToAPInt to
make it clearer what the function does.  No functional
change.

llvm-svn: 57325
2008-10-09 18:53:47 +00:00
Dan Gohman 6e0548336a Rename ConstantSDNode's getSignExtended to getSExtValue, for
consistancy with ConstantInt, and re-implement it in terms
of ConstantInt's getSExtValue.

llvm-svn: 56700
2008-09-26 21:54:37 +00:00
Bill Wendling dea91308ae Reapplying r56550
llvm-svn: 56553
2008-09-24 10:25:02 +00:00
Eric Christopher 4e26a81371 Temporarily revert r56550 until missing commit can be added.
llvm-svn: 56551
2008-09-24 08:30:44 +00:00
Bill Wendling 7c31464a0b Refactor the constant folding code into it's own function. And call it from both
the SelectionDAG and DAGCombiner code. The only functionality change is that now
the DAG combiner is performing the constant folding for these operations instead
of being a no-op.

This is *not* in response to a bug, so there isn't a testcase.

llvm-svn: 56550
2008-09-24 07:11:26 +00:00
Evan Cheng 13beeeb128 Per review feedback: Only perform
(srl x, (trunc (and y, c))) -> (srl x, (and (trunc y), c))
etc. when both "trunc" and "and" have single uses.

llvm-svn: 56452
2008-09-22 18:19:24 +00:00
Dan Gohman ec270fb640 Change ConstantSDNode and ConstantFPSDNode to use ConstantInt* and
ConstantFP* instead of APInt and APFloat directly.

This reduces the amount of time to create ConstantSDNode
and ConstantFPSDNode nodes when ConstantInt* and ConstantFP*
respectively are already available, as is the case in
SelectionDAGBuild.cpp. Also, it reduces the amount of time
to legalize constants into constant pools, and the amount of
time to add ConstantFP operands to MachineInstrs, due to
eliminating ConstantInt::get and ConstantFP::get calls.

It increases the amount of work needed to create new constants
in cases where the client doesn't already have a ConstantInt*
or ConstantFP*, such as legalize expanding 64-bit integer constants
to 32-bit constants. And it adds a layer of indirection for the
accessor methods. But these appear to be outweight by the benefits
in most cases.

It will also make it easier to make ConstantSDNode and
ConstantFPNode more consistent with ConstantInt and ConstantFP.

llvm-svn: 56162
2008-09-12 18:08:03 +00:00
Dan Gohman effb894453 Rename ConstantSDNode::getValue to getZExtValue, for consistency
with ConstantInt. This led to fixing a bug in TargetLowering.cpp
using getValue instead of getAPIntValue.

llvm-svn: 56159
2008-09-12 16:56:44 +00:00
Dan Gohman 1df80f6b1c In visitUREM, arrange for the temporary UDIV node to be
revisited, consistent with the code in visitSREM.

llvm-svn: 55923
2008-09-08 16:59:01 +00:00
Bill Wendling 5f7371d7b1 Revert my previous change -- the subtraction of two constants was a no-op
before. This is taken care of in the selection DAG pass. In my opinion, this
should be in one place or the other. I.e., it should probably be removed from
the DAG combiner (along with the other arithmetic transformations on constants
that are essentially no-ops).

llvm-svn: 55889
2008-09-08 01:56:32 +00:00
Bill Wendling df81749886 Convert
// fold (sub c1, c2) -> c1-c2

from a no-op into an actual transformation.

llvm-svn: 55886
2008-09-07 11:34:47 +00:00
Dan Gohman 921ddd69ba Fix a search+replace-o.
llvm-svn: 55824
2008-09-05 01:58:21 +00:00
Dan Gohman 634412fe35 Clean up uses of TargetLowering::getTargetMachine.
llvm-svn: 55769
2008-09-04 15:39:15 +00:00
Bill Wendling 11284ea499 Another situation where ROTR is cheaper than ROTL.
llvm-svn: 55577
2008-08-31 01:13:31 +00:00
Bill Wendling 4822a7ac8a For this pattern, ROTR is the cheaper option.
llvm-svn: 55576
2008-08-31 01:04:56 +00:00
Bill Wendling fc72416447 - Fix comment so that it describes how the code really works:
// fold (or (shl x, (*ext y)), (srl x, (*ext (sub 32, y)))) ->
   //   (rotl x, y)
   // fold (or (shl x, (*ext y)), (srl x, (*ext (sub 32, y)))) ->
   //   (rotr x, (sub 32, y))

Example: (x == 0xDEADBEEF and y == 4)

    (x << 4) | (x >> 28)
 => 0xEADBEEF0 | 0x0000000D
 => 0xEADBEEFD

    (rotl x, 4)
 => 0xEADBEEFD

    (rotr x, 28)
 => 0xEADBEEFD

- Fix comment and code for second version. It wasn't using the rot* propertly.

   // fold (or (shl x, (*ext (sub 32, y))), (srl x, (*ext r))) -> 
   //   (rotr x, y)
   // fold (or (shl x, (*ext (sub 32, y))), (srl x, (*ext r))) ->
   //   (rotl x, (sub 32, y))

    (x << 28) | (x >> 4)
 => 0xD0000000 | 0x0DEADBEE
 => 0xDDEADBEE

    (rotl x, 4)
 => 0xEADBEEFD

    (rotr x, 28)
 => (0xEADBEEFD)

llvm-svn: 55575
2008-08-31 00:37:27 +00:00
Gabor Greif e12264bf41 fix some 80-col violations
llvm-svn: 55571
2008-08-30 19:29:20 +00:00
Evan Cheng cfb7f3abdf Transform (x << (y&31)) -> (x << y). This takes advantage of the fact x86 shift instructions 2nd operand (shift count) is limited to 0 to 31 (or 63 in the x86-64 case).
llvm-svn: 55558
2008-08-30 02:03:58 +00:00
Evan Cheng 894be333f1 Fix 80 col. violations.
llvm-svn: 55551
2008-08-29 23:20:46 +00:00
Evan Cheng 5e7658c2e4 Back out 55498. It broken Apple style bootstrapping.
llvm-svn: 55549
2008-08-29 22:21:44 +00:00
Gabor Greif f304a7aa4d erect abstraction boundaries for accessing SDValue members, rename Val -> Node to reflect semantics
llvm-svn: 55504
2008-08-28 21:40:38 +00:00
Dan Gohman f27e33baa7 Optimize DAGCombiner's worklist processing. Previously it started
its work by putting all nodes in the worklist, requiring a big
dynamic allocation. Now, DAGCombiner just iterates over the AllNodes
list and maintains a worklist for nodes that are newly created or
need to be revisited. This allows the worklist to stay small in most
cases, so it can be a SmallVector.

This has the side effect of making DAGCombine not miss a folding
opportunity in alloca-align-rounding.ll.

llvm-svn: 55498
2008-08-28 21:01:56 +00:00
Gabor Greif abfdf928d8 disallow direct access to SDValue::ResNo, provide a getter instead
llvm-svn: 55394
2008-08-26 22:36:50 +00:00
Dan Gohman 837c13a029 Disable DAGCombine's alignment inference in "fast" codegen mode.
llvm-svn: 55059
2008-08-20 16:30:28 +00:00
Dan Gohman 550c9af91f Improve support for vector casts in LLVM IR and CodeGen.
llvm-svn: 54784
2008-08-14 20:04:46 +00:00
Dan Gohman 127bb03b8c Take the FrameOffset into account when computing the alignment
of stack objects. This fixes PR2656.

llvm-svn: 54646
2008-08-11 18:27:03 +00:00
Dan Gohman 345d63ccf2 Improve dagcombining for sext-loads and sext-in-reg nodes.
llvm-svn: 54239
2008-07-31 00:50:31 +00:00
Dan Gohman 2ce6f2ad5e Rename SDOperand to SDValue.
llvm-svn: 54128
2008-07-27 21:46:04 +00:00
Dan Gohman 91e5dcb680 Tidy SDNode::use_iterator, and complete the transition to have it
parallel its analogue, Value::value_use_iterator. The operator* method
now returns the user, rather than the use.

llvm-svn: 54127
2008-07-27 20:43:25 +00:00
Evan Cheng b8ff223f26 Fix pr2566: incorrect assumption about bit_convert. It doesn't not have to output a vector value. Patch by Nicolas Capens!
llvm-svn: 53932
2008-07-22 20:42:56 +00:00
Dan Gohman 581cc87f57 Add titles to the various SelectionDAG viewGraph calls
that include useful information like the name of the
block being viewed and the current phase of compilation.

llvm-svn: 53872
2008-07-21 20:00:07 +00:00
Duncan Sands b0e3938651 Add VerifyNode, a place to put sanity checks on
generic SDNode's (nodes with their own constructors
should do sanity checking in the constructor).  Add
sanity checks for BUILD_VECTOR and fix all the places
that were producing bogus BUILD_VECTORs, as found by
"make check".  My favorite is the BUILD_VECTOR with
only two operands that was being used to build a
vector with four elements!

llvm-svn: 53850
2008-07-21 10:20:31 +00:00
Duncan Sands 32e387c461 Revert 53729, after waking up in the middle of
the night realising that it was wrong :)  I
think the reason the same type was being used
for the shufflevec of indices as for the actual
indices is so that if one of them needs splitting
then so does the other.  After my patch it might
be that the indices need splitting but not the
rest, yet there is no good way of handling that.
I think the right solution is to not have the
shufflevec be an operand at all: just have it
be the list of numbers it actually is, stored
as extra info in the node.

llvm-svn: 53768
2008-07-18 20:12:05 +00:00
Duncan Sands 656b256a1a Use a legal type for elements of the vector_shuffle
mask.  These are just indices into the shuffled vector
so their type is unrelated to the type of the
shuffled elements (which is what was being used before).
This fixes vec_shuffle-11.ll when using LegalizeTypes.
What seems to have happened is that Dan's recent change
r53687, which corrected the result type of the shuffle,
somehow caused LegalizeTypes to notice that the mask
operand was a BUILD_VECTOR with a legal type but elements
of an illegal type (i64).  LegalizeTypes legalized this
by introducing a new BUILD_VECTOR of i32 and bitcasting
it to the old type.  But the mask operand is not supposed
to be a bitcast but a straight BUILD_VECTOR of constants,
causing a crash.

llvm-svn: 53729
2008-07-17 19:28:41 +00:00
Dan Gohman 2714059079 Fix the result type of a VECTOR_SHUFFLE+BIT_CONVERT dagcombine. This
was turned up by some new SelectionDAG assertion checks that I'm
working on.

llvm-svn: 53687
2008-07-16 16:13:58 +00:00
Dan Gohman a76e60a77a Use reserve.
SelectionDAG::allnodes_size is linear, but that doesn't appear to
outweigh the benefit of reducing heap traffic. If it does become a
problem, we should teach SelectionDAG to keep a count of how many
nodes are live, because there are several other places where that
information would be useful as well.

llvm-svn: 52926
2008-06-30 21:04:06 +00:00
Dan Gohman 6f7b5a6392 When folding a bitcast into a load or store, preserve the alignment
information of the original load or store, which is checked to be
at least as good, and possibly better.

llvm-svn: 52849
2008-06-28 00:45:22 +00:00
Chris Lattner df1cbdd645 duncan points out that isOperationLegal includes a check for
type legality.  Thanks Duncan!

llvm-svn: 52786
2008-06-26 17:16:00 +00:00
Chris Lattner b1e66ce3bb when we know the signbit of an input to uint_to_fp is zero,
change it to sint_to_fp on targets where that is cheaper (and
visaversa of course).  This allows us to compile uint_to_fp to:

_test:
	movl	4(%esp), %eax
	shrl	$23, %eax
	cvtsi2ss	%eax, %xmm0
	movl	8(%esp), %eax
	movss	%xmm0, (%eax)
	ret

instead of:

	.align	3
LCPI1_0:					##  double
	.long	0	## double least significant word 4.5036e+15
	.long	1127219200	## double most significant word 4.5036e+15
	.text
	.align	4,0x90
	.globl	_test
_test:
	subl	$12, %esp
	movl	16(%esp), %eax
	shrl	$23, %eax
	movl	%eax, (%esp)
	movl	$1127219200, 4(%esp)
	movsd	(%esp), %xmm0
	subsd	LCPI1_0, %xmm0
	cvtsd2ss	%xmm0, %xmm0
	movl	20(%esp), %eax
	movss	%xmm0, (%eax)
	addl	$12, %esp
	ret

llvm-svn: 52747
2008-06-26 00:16:49 +00:00
Dan Gohman b4e2637e9b Duncan pointed out this code could be tidied.
llvm-svn: 52624
2008-06-23 15:29:14 +00:00
Dan Gohman 546505e7e1 Simplify some getNode calls.
llvm-svn: 52604
2008-06-21 22:06:07 +00:00
Duncan Sands 37c1f5267b Allow these transforms for types like i256 while
still excluding types like i1 (not byte sized)
and i120 (loading an i120 requires loading an i64,
an i32, an i16 and an i8, which is expensive). 

llvm-svn: 52310
2008-06-16 08:14:38 +00:00
Duncan Sands 075293ff46 The transforms in visitEXTRACT_VECTOR_ELT are
not valid if the load is volatile.  Hopefully
all wrong DAG combiner transforms of volatile
loads and stores have now been caught.

llvm-svn: 52293
2008-06-15 20:12:31 +00:00
Duncan Sands b1bfff53fe Remove a redundant AfterLegalize check. Turn
on some code when !AfterLegalize - but since
this whole code section is turned off by an
"if (0)" it's not really turning anything on.

llvm-svn: 52276
2008-06-14 17:48:34 +00:00
Duncan Sands 8651e9c584 Disable some DAG combiner optimizations that may be
wrong for volatile loads and stores.  In fact this
is almost all of them!  There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access.  These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used.  Consider
loading an i32 but only using the lower 8 bits.  It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes.  It
is also unwise to make a load/store wider.  For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects.  (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware.  (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several.  For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores).  In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic.  My policy here is
to say that the number of processor operations for
an illegal operation is undefined.  So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok.  It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal!  That is because
operations are marked legal by default, regardless of
whether the type is legal or not.  In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation.  However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal.  So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before.  This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.

llvm-svn: 52254
2008-06-13 19:07:40 +00:00
Duncan Sands bf17080ec2 Sometimes (rarely) nodes held in LegalizeTypes
maps can be deleted.  This happens when RAUW
replaces a node N with another equivalent node
E, deleting the first node.  Solve this by
adding (N, E) to ReplacedNodes, which is already
used to remap nodes to replacements.  This means
that deleted nodes are being allowed in maps,
which can be delicate: the memory may be reused
for a new node which might get confused with the
old deleted node pointer hanging around in the
maps, so detect this and flush out maps if it
occurs (ExpungeNode).  The expunging operation
is expensive, however it never occurs during
a llvm-gcc bootstrap or anywhere in the nightly
testsuite.  It occurs three times in "make check":
Alpha/illegal-element-type.ll,
PowerPC/illegal-element-type.ll and
X86/mmx-shift.ll.  If expunging proves to be too
expensive then there are other more complicated
ways of solving the problem.
In the normal case this patch adds the overhead
of a few more map lookups, which is hopefully
negligable.

llvm-svn: 52214
2008-06-11 11:42:12 +00:00
Duncan Sands 67d0f332d5 Various tweaks related to apint codegen. No functionality
change for non-funky-sized integers.

llvm-svn: 52151
2008-06-09 15:48:25 +00:00
Duncan Sands 93b6609ae2 Remove some DAG combiner assumptions about sizes
of integer types.  Fix the isMask APInt method to
actually work (hopefully) rather than crashing
because it adds apints of different bitwidths.
It looks like isShiftedMask is also broken, but
I'm leaving that one to the APInt people (it is
not used anywhere).

llvm-svn: 52142
2008-06-09 11:32:28 +00:00
Duncan Sands 11dd424539 Remove comparison methods for MVT. The main cause
of apint codegen failure is the DAG combiner doing
the wrong thing because it was comparing MVT's using
< rather than comparing the number of bits.  Removing
the < method makes this mistake impossible to commit.
Instead, add helper methods for comparing bits and use
them.

llvm-svn: 52098
2008-06-08 20:54:56 +00:00
Duncan Sands 13237ac3b9 Wrap MVT::ValueType in a struct to get type safety
and better control the abstraction.  Rename the type
to MVT.  To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits().  Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).

llvm-svn: 52044
2008-06-06 12:08:01 +00:00
Dan Gohman 643b3a0581 Add #includes to make some dependencies explicit.
llvm-svn: 51496
2008-05-23 20:40:06 +00:00
Dan Gohman c1a4e212a3 Code simplification.
llvm-svn: 51345
2008-05-20 20:56:33 +00:00
Evan Cheng 1120279ae6 Instead of a vector load, shuffle and then extract an element. Load the element from address with an offset.
pshufd $1, (%rdi), %xmm0
        movd %xmm0, %eax
=>
        movl 4(%rdi), %eax

llvm-svn: 51026
2008-05-13 08:35:03 +00:00
Evan Cheng b980f6fb3d Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
llvm-svn: 51008
2008-05-12 23:04:07 +00:00
Dan Gohman c968c1f592 Evan pointed out that folding sext to zext may not be correct
if the zext is not legal.

llvm-svn: 50368
2008-04-28 18:47:17 +00:00
Dan Gohman 3eb10f758e Teach DAGCombine to convert (sext x) to (zext x) when the
sign-bit of x is known to be zero.

llvm-svn: 50357
2008-04-28 16:58:24 +00:00
Roman Levenstein a3ee1a38a3 Ongoing work on improving the instruction selection infrastructure:
Rename SDOperandImpl back to SDOperand.
Introduce the SDUse class that represents a use of the SDNode referred by
an SDOperand. Now it is more similar to Use/Value classes.

Patch is approved by Dan Gohman.

llvm-svn: 49795
2008-04-16 16:15:27 +00:00
Roman Levenstein 51f532f92d Re-commit of the r48822, where the infinite looping problem discovered
by Dan Gohman is fixed.

llvm-svn: 49330
2008-04-07 10:06:32 +00:00
Evan Cheng 025cea1126 Backing out 48222 temporarily.
llvm-svn: 49124
2008-04-03 03:13:16 +00:00
Dan Gohman f549b26254 Fix a DAGCombiner optimization to respect volatile qualification.
llvm-svn: 48994
2008-03-31 20:32:52 +00:00
Roman Levenstein 358e04a185 Use a linked data structure for the uses lists of an SDNode, just like
LLVM Value/Use does and MachineRegisterInfo/MachineOperand does.
This allows constant time for all uses list maintenance operations.

The idea was suggested by Chris. Reviewed by Evan and Dan.
Patch is tested and approved by Dan.

On normal use-cases compilation speed is not affected. On very big basic
blocks there are compilation speedups in the range of 15-20% or even better. 

llvm-svn: 48822
2008-03-26 12:39:26 +00:00
Evan Cheng df1690dc7c Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it.
llvm-svn: 48792
2008-03-25 20:08:07 +00:00
Evan Cheng fe7610f37f Remove an unneeded test.
llvm-svn: 48755
2008-03-24 23:55:16 +00:00
Evan Cheng 31604a62f6 Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE.
llvm-svn: 48673
2008-03-22 01:55:50 +00:00
Christopher Lamb 3e9f49716e Check even more carefully before applying this DAGCombine transform.
llvm-svn: 48580
2008-03-20 04:31:39 +00:00
Evan Cheng 7a3e750fd2 Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m)))
llvm-svn: 48578
2008-03-20 02:18:41 +00:00
Christopher Lamb 8fe9109469 Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491.
llvm-svn: 48542
2008-03-19 08:30:06 +00:00
Bill Wendling efb4d9ef80 Temporarily revert r48491. It's breaking test/CodeGen/X86/xorl.ll.
llvm-svn: 48510
2008-03-18 22:29:51 +00:00
Christopher Lamb 3e408d4d82 Target independent DAG transform to use truncate for field extraction + sign extend on targets where this is profitable. Passes nightly on x86-64.
llvm-svn: 48491
2008-03-18 16:46:39 +00:00
Dan Gohman b72127ac4c More APInt-ification.
llvm-svn: 48344
2008-03-13 22:13:53 +00:00
Evan Cheng 99ee78ef63 Clean up my own mess.
X86 lowering normalize vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases.

llvm-svn: 48279
2008-03-12 07:02:50 +00:00
Evan Cheng 0903aef2ff Total brain cramp.
llvm-svn: 48274
2008-03-12 02:05:05 +00:00
Evan Cheng b9e4280e94 Somewhat better solution.
llvm-svn: 48170
2008-03-10 19:58:22 +00:00
Scott Michel a6729e8666 Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's
return ValueType can depend its operands' ValueType.

This is a cosmetic change, no functionality impacted.

llvm-svn: 48145
2008-03-10 15:42:14 +00:00
Evan Cheng 831ae49599 Doh
llvm-svn: 48140
2008-03-10 07:59:01 +00:00
Evan Cheng b5d11980d9 Avoid creating BUILD_VECTOR of all zero elements of "non-normalized" type (e.g. v8i16 on x86) after legalizer. Instruction selection does not expect to see them. In all likelihood this can only be an issue in a bugpoint reduced test case.
llvm-svn: 48136
2008-03-10 07:19:13 +00:00
Evan Cheng 567d2e5b57 Rename isOperand() to isOperandOf() (and other similar methods). It always confuses me.
llvm-svn: 47872
2008-03-04 00:41:45 +00:00
Dan Gohman e1c4f99549 Misc. APInt-ification in the DAGCombiner.
llvm-svn: 47869
2008-03-03 23:51:38 +00:00
Dan Gohman ae2b6fbb8e Convert SimplifyDemandedMask and ShrinkDemandedConstant to use APInt.
Change several cases in SimplifyDemandedMask that don't ever do any
simplifying to reuse the logic in ComputeMaskedBits instead of
duplicating it.

llvm-svn: 47648
2008-02-27 00:25:32 +00:00
Chris Lattner 07c83cc86e Fix PR2096, a regression introduced with my patch last night. This
also fixes cfrac, flops, and 175.vpr

llvm-svn: 47605
2008-02-26 17:09:59 +00:00
Chris Lattner e7c14013f5 Fix isNegatibleForFree to not return true for ConstantFP nodes
after legalize.  Just because a constant is legal (e.g. 0.0 in SSE) 
doesn't mean that its negated value is legal (-0.0).  We could make
this stronger by checking to see if the negated constant is actually
legal post negation, but it doesn't seem like a big deal.

llvm-svn: 47591
2008-02-26 07:04:54 +00:00
Dan Gohman 1f372edd97 Convert MaskedValueIsZero and all its users to use APInt. Also add
a SignBitIsZero function to simplify a common use case.

llvm-svn: 47561
2008-02-25 21:11:39 +00:00
Dan Gohman 360c86aed5 Add explicit keywords.
llvm-svn: 47382
2008-02-20 16:44:09 +00:00
Dan Gohman d0ff91dac5 Convert DAGCombiner to use the APInt form of ComputeMaskedBits.
llvm-svn: 47381
2008-02-20 16:33:30 +00:00
Anton Korobeynikov 035eaacd1f Update gcc 4.3 warnings fix patch with recent head changes
llvm-svn: 47368
2008-02-20 11:10:28 +00:00
Evan Cheng 6200c225e0 - When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
- X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.

llvm-svn: 47290
2008-02-18 23:04:32 +00:00
Chris Lattner ee322b44a4 teach dag combiner how to eliminate MERGE_VALUES nodes.
llvm-svn: 47052
2008-02-13 07:25:05 +00:00
Duncan Sands 7377f5fbe3 Add a isBigEndian method to complement isLittleEndian.
llvm-svn: 46954
2008-02-11 10:37:04 +00:00
Bill Wendling 9c2ce9a32d Return "(c1 + c2)" instead of yet another ADD node (which made this a
no-op).

llvm-svn: 46922
2008-02-10 08:10:24 +00:00
Chris Lattner d2d166ea9f the world doesn't need my debugging code.
llvm-svn: 46678
2008-02-03 07:01:05 +00:00
Chris Lattner b2b9d6f0fb Change the 'global modification' APIs in SelectionDAG to take a new
DAGUpdateListener object pointer instead of just returning a vector 
of deleted nodes.  This makes the interfaces more efficient (no more
allocating a vector [at least a malloc], filling it in, then walking
it) and more clean.  This also allows the client to be notified of
nodes that are *changed* but not deleted.

llvm-svn: 46677
2008-02-03 06:49:24 +00:00
Dan Gohman 47a7d6fafe Factor the addressing mode and the load/store VT out of LoadSDNode
and StoreSDNode into their common base class LSBaseSDNode. Member
functions getLoadedVT and getStoredVT are replaced with the common
getMemoryVT to simplify code that will handle both loads and stores.

llvm-svn: 46538
2008-01-30 00:15:11 +00:00
Dan Gohman 70de4cb1cd Use empty() instead of comparing size() with zero.
llvm-svn: 46514
2008-01-29 13:02:09 +00:00
Chris Lattner 2ee91f4300 Fix PowerPC/./2007-10-18-PtrArithmetic.ll
llvm-svn: 46424
2008-01-27 23:32:17 +00:00
Chris Lattner d0496d0433 fix a crash on CodeGen/X86/vector-rem.ll
llvm-svn: 46422
2008-01-27 23:21:58 +00:00
Chris Lattner 888560d62c Implement some dag combines that allow doing fneg/fabs/fcopysign in integer
registers if used by a bitconvert or using a bitconvert.  This allows us to
avoid constant pool loads and use cheaper integer instructions when the
values come from or end up in integer regs anyway.  For example, we now 
compile CodeGen/X86/fp-in-intregs.ll to:

_test1:
	movl	$2147483648, %eax
	xorl	4(%esp), %eax
	ret
_test2:
	movl	$1065353216, %eax
	orl	4(%esp), %eax
	andl	$3212836864, %eax
	ret

Instead of:
_test1:
	movss	4(%esp), %xmm0
	xorps	LCPI2_0, %xmm0
	movd	%xmm0, %eax
	ret
_test2:
	movss	4(%esp), %xmm0
	andps	LCPI3_0, %xmm0
	movss	LCPI3_1, %xmm1
	andps	LCPI3_2, %xmm1
	orps	%xmm0, %xmm1
	movd	%xmm1, %eax
	ret

bitconverts can happen due to various calling conventions that require
fp values to passed in integer regs in some cases, e.g. when returning
a complex.

llvm-svn: 46414
2008-01-27 17:42:27 +00:00
Chris Lattner e30e33af4f Infer alignment of loads and increase their alignment when we can tell they are
from the stack.  This allows us to compile stack-align.ll to:

_test:
	movsd	LCPI1_0, %xmm0
	movapd	%xmm0, %xmm1
***	andpd	4(%esp), %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

instead of:

_test:
	movsd	LCPI1_0, %xmm0
**	movsd	4(%esp), %xmm1
**	andpd	%xmm0, %xmm1
	andpd	_G, %xmm0
	addsd	%xmm1, %xmm0
	movl	20(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

llvm-svn: 46401
2008-01-26 19:45:50 +00:00
Chris Lattner 31e9edce1c Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to
delete a node even if it was not dead in some cases.  Instead, just add it to
the worklist.  Also, make sure to use the CombineTo methods, as it was doing
things that were unsafe: the top level combine loop could touch dangling memory.

This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll

llvm-svn: 46384
2008-01-26 01:09:19 +00:00
Chris Lattner cb3cf546c3 reduce indentation
llvm-svn: 46377
2008-01-25 23:34:24 +00:00
Chris Lattner 2d7a830ff3 Add skeletal code to increase the alignment of loads and stores when
we can infer it.  This will eventually help stuff, though it doesn't
do much right now because all fixed FI's have an alignment of 1.

llvm-svn: 46349
2008-01-25 07:20:16 +00:00
Chris Lattner 34ed27c46d clarify a comment, thanks Duncan.
llvm-svn: 46313
2008-01-24 17:10:01 +00:00
Chris Lattner e97fa8cdf0 Fix this buggy transformation. Two observations:
1. we already know the value is dead, so don't bother replacing 
   it with undef.
2. The very case the comment describes actually makes the load
   live which asserts in deletenode.  If we do the replacement
   and the node becomes live, just treat it as new.  This fixes
   a failure on X86/2008-01-16-InvalidDAGCombineXform.ll with
   some local changes in my tree.

llvm-svn: 46306
2008-01-24 07:57:06 +00:00
Chris Lattner d66eac62fd The dag combiner is missing revisiting nodes that it really should, and thus leaving
dead stuff around.  This gets fed into the isel pass and causes certain foldings from
happening because nodes have extraneous uses floating around.  For example, if we turned
foo(bar(x)) -> baz(x), we sometimes left bar(x) around.

llvm-svn: 46305
2008-01-24 07:18:21 +00:00
Chris Lattner 0feb1b0f84 fold fp_round(fp_round(x)) -> fp_round(x).
llvm-svn: 46304
2008-01-24 06:45:35 +00:00
Chris Lattner 1ea55cf816 This commit changes:
1. Legalize now always promotes truncstore of i1 to i8. 
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
   X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
   safe.

The later allows us to compile CodeGen/X86/storetrunc-fp.ll to:

_foo:
	fldt	20(%esp)
	fldt	4(%esp)
	faddp	%st(1)
	movl	36(%esp), %eax
	fstps	(%eax)
	ret

instead of:

_foo:
	subl	$4, %esp
	fldt	24(%esp)
	fldt	8(%esp)
	faddp	%st(1)
	fstps	(%esp)
	movl	40(%esp), %eax
	movss	(%esp), %xmm0
	movss	%xmm0, (%eax)
	addl	$4, %esp
	ret

llvm-svn: 46140
2008-01-17 19:59:44 +00:00
Chris Lattner 7eabed3521 code cleanups, no functionality change.
llvm-svn: 46126
2008-01-17 07:20:38 +00:00
Chris Lattner 72733e573b * Introduce a new SelectionDAG::getIntPtrConstant method
and switch various codegen pieces and the X86 backend over
  to using it.

* Add some comments to SelectionDAGNodes.h

* Introduce a second argument to FP_ROUND, which indicates
  whether the FP_ROUND changes the value of its input. If
  not it is safe to xform things like fp_extend(fp_round(x)) -> x.

llvm-svn: 46125
2008-01-17 07:00:52 +00:00
Evan Cheng 7be1528004 Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0.
It's not safe to use the two value CombineTo variant to combine away a dead load.
e.g. 
v1, chain2 = load chain1, loc
v2, chain3 = load chain2, loc
v3         = add v2, c 
Now we replace use of v1 with undef, use of chain2 with chain1.
ReplaceAllUsesWith() will iterate through uses of the first load and update operands:
v1, chain2 = load chain1, loc
v2, chain3 = load chain1, loc
v3         = add v2, c 
Now the second load is the same as the first load, SelectionDAG cse will ensure
the use of second load is replaced with the first load.
v1, chain2 = load chain1, loc
v3         = add v1, c
Then v1 is replaced with undef and bad things happen.

llvm-svn: 46099
2008-01-16 23:11:54 +00:00
Chris Lattner 2e50a6f90f Factor the ReachesChainWithoutSideEffects out of dag combiner into
a public SDOperand::reachesChainWithoutSideEffects method.  No 
functionality change.

llvm-svn: 46050
2008-01-16 05:49:24 +00:00
Chris Lattner 51b01bf8a5 Make load->store deletion a bit smarter. This allows us to compile this:
void test(long long *P) { *P ^= 1; }

into just:

_test:
	movl	4(%esp), %eax
	xorl	$1, (%eax)
	ret

instead of code like this:

_test:
	movl	4(%esp), %ecx
        xorl    $1, (%ecx)
	movl	4(%ecx), %edx
	movl	%edx, 4(%ecx)
	ret

llvm-svn: 45762
2008-01-08 23:08:06 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Chris Lattner 2de9b85297 make sure not to zap volatile stores, thanks a lot to Dale for noticing this!
llvm-svn: 45402
2007-12-29 07:15:45 +00:00
Chris Lattner 5919b48fe9 don't fold fp_round(fp_extend(load)) -> fp_round(extload)
llvm-svn: 45400
2007-12-29 06:55:23 +00:00
Chris Lattner 3f9c6a7260 Delete a store whose input is a load from the same pointer:
x = load p
  store x -> p

llvm-svn: 45398
2007-12-29 06:26:16 +00:00
Chris Lattner efd1cddb5a Tell TargetLoweringOpt whether it is running before
or after legalize.

llvm-svn: 45321
2007-12-22 20:56:36 +00:00
Evan Cheng 9f06e5e2df Don't leave newly created nodes around if it turns out they are not needed.
llvm-svn: 45186
2007-12-19 01:34:38 +00:00
Dale Johannesen 5eff4de9c8 Redo previous patch so optimization only done for i1.
Simpler and safer.

llvm-svn: 44663
2007-12-06 17:53:31 +00:00
Chris Lattner eedaf92fcf third time around: instead of disabling this completely,
only disable it if we don't know it will be obviously profitable.
Still fixme, but less so. :)

llvm-svn: 44658
2007-12-06 07:47:55 +00:00
Chris Lattner b5fdfb9612 Actually, disable this code for now. More analysis and improvements to
the X86 backend are needed before this should be enabled by default.

llvm-svn: 44657
2007-12-06 07:44:31 +00:00
Chris Lattner 7c709a5d08 implement a readme entry, compiling the code into:
_foo:
	movl	$12, %eax
	andl	4(%esp), %eax
	movl	_array(%eax), %eax
	ret

instead of:

_foo:
	movl	4(%esp), %eax
	shrl	$2, %eax
	andl	$3, %eax
	movl	_array(,%eax,4), %eax
	ret

As it turns out, this triggers all the time, in a wide variety of
situations, for example, I see diffs like this in various programs:

-       movl    8(%eax), %eax
-       shll    $2, %eax
-       andl    $1020, %eax
-       movl    (%esi,%eax), %eax
+       movzbl  8(%eax), %eax
+       movl    (%esi,%eax,4), %eax


-       shll    $2, %edx
-       andl    $1020, %edx
-       movl    (%edi,%edx), %edx
+       andl    $255, %edx
+       movl    (%edi,%edx,4), %edx

Unfortunately, I also see stuff like this, which can be fixed in the
X86 backend:

-       andl    $85, %ebx
-       addl    _bit_count(,%ebx,4), %ebp
+       shll    $2, %ebx
+       andl    $340, %ebx
+       addl    _bit_count(%ebx), %ebp

llvm-svn: 44656
2007-12-06 07:33:36 +00:00
Dale Johannesen 05bbbda78a Fix PR1842.
llvm-svn: 44649
2007-12-06 01:43:46 +00:00
Dan Gohman 9a69341725 Don't lower srem/urem X%C to X-X/C*C unless the division is actually
optimized. This avoids creating illegal divisions when the combiner is
running after legalize; this fixes PR1815. Also, it produces better
code in the included testcase by avoiding the subtract and multiply
when the division isn't optimized.

llvm-svn: 44341
2007-11-26 23:46:11 +00:00
Duncan Sands e795efea5b Move MinAlign to MathExtras.h.
llvm-svn: 43944
2007-11-09 13:41:39 +00:00
Duncan Sands e7a9ac929f Fix some load/store logic that would be wrong for
apints on big-endian machines if the bitwidth is
not a multiple of 8.  Introduce a new helper,
MVT::getStoreSizeInBits, and use it.

llvm-svn: 43934
2007-11-09 08:57:19 +00:00
Evan Cheng ece4c68b82 If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try simplify it.
llvm-svn: 43888
2007-11-08 09:25:29 +00:00
Evan Cheng 0747bc1df6 Typo.
llvm-svn: 43511
2007-10-30 20:11:21 +00:00
Dan Gohman ae95d72a52 Fix a DAGCombiner abort on a bitcast from a scalar to a vector.
llvm-svn: 43470
2007-10-29 20:44:42 +00:00
Evan Cheng e106e2f142 Enable more fold (sext (load x)) -> (sext (truncate (sextload x)))
transformation. Previously, it's restricted by ensuring the number of load uses
is one. Now the restriction is loosened up by allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).

llvm-svn: 43465
2007-10-29 19:58:20 +00:00
Duncan Sands 1826deda68 The guaranteed alignment of ptr+offset is only the minimum of
of offset and the alignment of ptr if these are both powers of
2.  While the ptr alignment is guaranteed to be a power of 2,
there is no reason to think that offset is.  For example, if
offset is 12 (the size of a long double on x86-32 linux) and
the alignment of ptr is 8, then the alignment of ptr+offset
will in general be 4, not 8.  Introduce a function MinAlign,
lifted from gcc, for computing the minimum guaranteed alignment.
I've tried to fix up everywhere under lib/CodeGen/SelectionDAG/.
I also changed some places that weren't wrong (because both values
were a power of 2), as a defensive change against people copying
and pasting the code.
Hopefully someone who cares about alignment will review the rest
of LLVM and fix up the remaining places.  Since I'm on x86 I'm
not very motivated to do this myself...

llvm-svn: 43421
2007-10-28 12:59:45 +00:00
Dale Johannesen 6802d0c96f Redo "last ppc long double fix" as Chris wants.
llvm-svn: 43189
2007-10-19 20:29:00 +00:00
Dale Johannesen 10432e5a67 More ppcf128 issues (maybe the last)?
llvm-svn: 43160
2007-10-19 00:59:18 +00:00
Dale Johannesen e5facd51cb Disable attempts to constant fold PPC f128.
Remove the assumption that this will happen from
various places.

llvm-svn: 43053
2007-10-16 23:38:29 +00:00
Chris Lattner 3cfb56d489 One mundane change: Change ReplaceAllUsesOfValueWith to *optionally*
take a deleted nodes vector, instead of requiring it.

One more significant change:  Implement the start of a legalizer that
just works on types.  This legalizer is designed to run before the 
operation legalizer and ensure just that the input dag is transformed
into an output dag whose operand and result types are all legal, even
if the operations on those types are not.

This design/impl has the following advantages:

1. When finished, this will *significantly* reduce the amount of code in
   LegalizeDAG.cpp.  It will remove all the code related to promotion and
   expansion as well as splitting and scalarizing vectors.
2. The new code is very simple, idiomatic, and modular: unlike 
   LegalizeDAG.cpp, it has no 3000 line long functions. :)
3. The implementation is completely iterative instead of recursive, good
   for hacking on large dags without blowing out your stack.
4. The implementation updates nodes in place when possible instead of 
   deallocating and reallocating the entire graph that points to some 
   mutated node.
5. The code nicely separates out handling of operations with invalid 
   results from operations with invalid operands, making some cases
   simpler and easier to understand.
6. The new -debug-only=legalize-types option is very very handy :), 
   allowing you to easily understand what legalize types is doing.

This is not yet done.  Until the ifdef added to SelectionDAGISel.cpp is
enabled, this does nothing.  However, this code is sufficient to legalize
all of the code in 186.crafty, olden and freebench on an x86 machine.  The
biggest issues are:

1. Vectors aren't implemented at all yet
2. SoftFP is a mess, I need to talk to Evan about it.
3. No lowering to libcalls is implemented yet.
4. Various operations are missing etc.
5. There are FIXME's for stuff I hax0r'd out, like softfp.

Hey, at least it is a step in the right direction :).  If you'd like to help,
just enable the #ifdef in SelectionDAGISel.cpp and compile code with it.  If
this explodes it will tell you what needs to be implemented.  Help is 
certainly appreciated.

Once this goes in, we can do three things:

1. Add a new pass of dag combine between the "type legalizer" and "operation
   legalizer" passes.  This will let us catch some long-standing isel issues
   that we miss because operation legalization often obfuscates the dag with
   target-specific nodes.
2. We can rip out all of the type legalization code from LegalizeDAG.cpp,
   making it much smaller and simpler.  When that happens we can then 
   reimplement the core functionality left in it in a much more efficient and
   non-recursive way.
3. Once the whole legalizer is non-recursive, we can implement whole-function
   selectiondags maybe...

llvm-svn: 42981
2007-10-15 06:10:22 +00:00
Chris Lattner f47e30627a Enhance the truncstore optimization code to handle shifted
values and propagate demanded bits through them in simple cases.

This allows this code:
void foo(char *P) {
   strcpy(P, "abc");
}
to compile to:

_foo:
        ldrb r3, [r1]
        ldrb r2, [r1, #+1]
        ldrb r12, [r1, #+2]!
        ldrb r1, [r1, #+1]
        strb r1, [r0, #+3]
        strb r2, [r0, #+1]
        strb r12, [r0, #+2]
        strb r3, [r0]
        bx lr

instead of:

_foo:
        ldrb r3, [r1, #+3]
        ldrb r2, [r1, #+2]
        orr r3, r2, r3, lsl #8
        ldrb r2, [r1, #+1]
        ldrb r1, [r1]
        orr r2, r1, r2, lsl #8
        orr r3, r2, r3, lsl #16
        strb r3, [r0]
        mov r2, r3, lsr #24
        strb r2, [r0, #+3]
        mov r2, r3, lsr #16
        strb r2, [r0, #+2]
        mov r3, r3, lsr #8
        strb r3, [r0, #+1]
        bx lr

testcase here: test/CodeGen/ARM/truncstore-dag-combine.ll

This also helps occasionally for X86 and other cases not involving 
unaligned load/stores.

llvm-svn: 42954
2007-10-13 06:58:48 +00:00
Chris Lattner 5e6fe054a2 Add a simple optimization to simplify the input to
truncate and truncstore instructions, based on the 
knowledge that they don't demand the top bits.

llvm-svn: 42952
2007-10-13 06:35:54 +00:00
Duncan Sands 56ab90d3ad Correct swapped arguments to getConstant.
llvm-svn: 42824
2007-10-10 09:54:50 +00:00
Dan Gohman 5c6d0c3b99 DAGCombiner support for UDIVREM/SDIVREM and UMUL_LOHI/SMUL_LOHI.
Check if one of the two results unneeded so see if a simpler operator
could bs used. Also check to see if each of the two computations could be
simplified if they were split into separate operators. Factor out the code
that calls visit() so that it can be used for this purpose.

llvm-svn: 42759
2007-10-08 17:57:15 +00:00
Evan Cheng 0de312dd7d Reapply 42677.
llvm-svn: 42692
2007-10-06 08:19:55 +00:00
Chris Lattner 82217bd155 revert evan's patch until the header is committed
llvm-svn: 42686
2007-10-06 06:08:17 +00:00
Evan Cheng f4b5d491df Added DAG xforms. e.g.
(vextract (v4f32 s2v (f32 load $addr)), 0) -> (f32 load $addr) 
(vextract (v4i32 bc (v4f32 s2v (f32 load $addr))), 0) -> (i32 load $addr)
Remove x86 specific patterns.

llvm-svn: 42677
2007-10-06 02:46:29 +00:00
Evan Cheng e2e8f2d96b Fix a bogus splat xform:
shuffle <undef, undef, x, undef>, <undef, undef, undef, undef>, <2, 2, 2, 2>
!=
<undef, undef, x, undef>

llvm-svn: 42111
2007-09-18 21:54:37 +00:00
Dale Johannesen af12b57405 Prevent crash on long double.
llvm-svn: 42103
2007-09-18 18:36:59 +00:00
Dale Johannesen 028084efe5 Revise previous patch per review comments.
Next round of x87 long double stuff.
Getting close now, basically works.

llvm-svn: 41875
2007-09-12 03:30:33 +00:00
Dale Johannesen 245dceb06d Add APInt interfaces to APFloat (allows directly
access to bits).  Use them in place of float and
double interfaces where appropriate.
First bits of x86 long double constants handling 
(untested, probably does not work).

llvm-svn: 41858
2007-09-11 18:32:33 +00:00
Chris Lattner 58c227bd09 Emit:
cmpl    %eax, %ecx
        setae   %al
        movzbl  %al, %eax

instead of:

        cmpl    %eax, %ecx
        setb    %al
        xorb    $1, %al
        movzbl  %al, %eax

when using logical not of a C comparison.

llvm-svn: 41807
2007-09-10 21:39:07 +00:00
Dale Johannesen 446b900192 Add mod, copysign, abs operations to APFloat.
Implement some constant folding in SelectionDAG and
DAGCombiner using APFloat.  Remove double versions
of constructor and getValue from ConstantFPSDNode.

llvm-svn: 41664
2007-08-31 23:34:27 +00:00
Dan Gohman 9625d812c9 Make DAGCombiner's global alias analysis query more precise in the case
where both pointers have non-zero offsets.

llvm-svn: 41491
2007-08-27 16:32:11 +00:00
Dale Johannesen b6d2bec418 Revise per review comments.
llvm-svn: 41409
2007-08-26 01:18:27 +00:00
Dale Johannesen 2cfcf70f82 Add APFloat interface to ConstantFPSDNode. Change
over uses in DAGCombiner.  Fix interfaces to work
with APFloats.

llvm-svn: 41407
2007-08-25 22:10:57 +00:00
Evan Cheng f5a23abf37 Fold C ? 0 : 1 to ~C or zext(~C) or trunc(~C) depending the types.
llvm-svn: 41163
2007-08-18 05:57:05 +00:00
Dan Gohman 30f060be80 Fix the alias analysis query in DAGCombiner to not add in two
offsets. The SrcValueOffset values are the real offsets from the
SrcValue base pointers.

llvm-svn: 40534
2007-07-26 16:14:06 +00:00
Dan Gohman 80f9f077e3 Don't call SimplifyVBinOp for non-vector operations, following earlier review
feedback. This theoretically makes the common (scalar) case more efficient.

llvm-svn: 39823
2007-07-13 20:03:40 +00:00
Dan Gohman adb3d37c07 Fix a bug in the folding of binary operators to undef.
Thanks to Lauro for spotting this!

llvm-svn: 38491
2007-07-10 15:19:29 +00:00
Dan Gohman fa91282dbf Fix the folding of undef in several binary operators to recognize
undef in either the left or right operand.

llvm-svn: 38489
2007-07-10 14:20:37 +00:00
Dan Gohman 2af3063337 Preserve volatililty and alignment information when lowering or
simplifying loads and stores.

llvm-svn: 38473
2007-07-09 22:18:38 +00:00
Chris Lattner 6caf8fdd04 Fix this warning:
DAGCombiner.cpp: In member function 'llvm::SDOperand<unnamed>::DAGCombiner::visitOR(llvm::SDNode*)':
DAGCombiner.cpp:1608: warning: passing negative value '-0x00000000000000001' for argument 1 to 'llvm::SDOperand llvm::SelectionDAG::getConstant(uint64_t, llvm::MVT::ValueType, bool)'

oiy.

llvm-svn: 38458
2007-07-09 16:16:34 +00:00
Dan Gohman 06563a8702 Fix several over-aggressive folds for undef nodes in dagcombine, to
follow the rules for undef used in instcombine.

llvm-svn: 37851
2007-07-03 14:03:57 +00:00
Dan Gohman 9a70823375 Teach GetNegatedExpression to negate 0-B to B in UnsafeFPMath mode, and
visitFSUB to fold 0-B to -B in UnsafeFPMath mode. Also change visitFNEG
to use isNegatibleForFree/GetNegatedExpression instead of doing a subset
of the same thing manually.

This fixes test/CodeGen/X86/negative-sin.ll.

llvm-svn: 37842
2007-07-02 15:48:56 +00:00
Dan Gohman a866514528 Generalize MVT::ValueType and associated functions to be able to represent
extended vector types. Remove the special SDNode opcodes used for pre-legalize
vector operations, and the special MVT::Vector type used with them. Adjust
lowering and legalize to work with the normal SDNode kinds instead, and to
use the normal MVT functions to work with vector types instead of using the
two special operands that the pre-legalize nodes held.

This allows pre-legalize and post-legalize DAGs, and the code that operates
on them, to be more consistent. Pre-legalize vector operators can be handled
more consistently with scalar operators. And, -view-dag-combine1-dags and
-view-legalize-dags now look prettier for vector code.

llvm-svn: 37719
2007-06-25 16:23:39 +00:00
Dan Gohman 309d3d51b3 Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from
TargetLowering to SelectionDAG so that they have more convenient
access to the current DAG, in preparation for the ValueType routines
being changed from standalone functions to members of SelectionDAG for
the pre-legalize vector type changes.

llvm-svn: 37704
2007-06-22 14:59:07 +00:00
Evan Cheng aa5f5d960d Xforms:
(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))
(sub x, (select cc, 0, c)) -> (select cc, x, (sub, x, c))

llvm-svn: 37685
2007-06-21 07:39:16 +00:00
Dan Gohman a7644dd9b9 Pass a SelectionDAG into SDNode::dump everywhere it's used, in prepration
for needing the DAG node to print pre-legalize extended value types, and
to get better debug messages with target-specific nodes.

llvm-svn: 37656
2007-06-19 14:13:56 +00:00
Dan Gohman 5c4413120f Rename MVT::getVectorBaseType to MVT::getVectorElementType.
llvm-svn: 37579
2007-06-14 22:58:02 +00:00
Chris Lattner 4698083b96 tighten up recursion depth again
llvm-svn: 37330
2007-05-25 02:19:06 +00:00
Evan Cheng a4d187b8ce Fix a typo that caused combiner to create mal-formed pre-indexed store where value store is the same as the base pointer.
llvm-svn: 37318
2007-05-24 02:35:39 +00:00
Chris Lattner 6509c0673f prevent exponential recursion in isNegatibleForFree
llvm-svn: 37310
2007-05-23 07:35:22 +00:00
Dan Gohman b539df3389 Qualify calls to getTypeForValueType with MVT:: too.
llvm-svn: 37233
2007-05-18 18:41:29 +00:00
Dale Johannesen 7a6c175e7a Don't fold bitconvert(load) for preinc/postdec loads. Likewise stores.
llvm-svn: 37130
2007-05-16 22:45:30 +00:00
Chris Lattner 48fb92f75d Use a ptr set instead of a linear search to unique TokenFactor operands.
This fixes PR1423

llvm-svn: 37102
2007-05-16 06:37:59 +00:00
Evan Cheng 288f133c71 Bug fix: should check ABI alignment, not pref. alignment.
llvm-svn: 37094
2007-05-16 02:04:50 +00:00
Lauro Ramos Venancio 3f142cbca2 Fix an infinite recursion in GetNegatedExpression.
llvm-svn: 37086
2007-05-15 17:05:43 +00:00
Chris Lattner e49c974a7c implement a simple fneg optimization/propagation thing. This compiles:
CodeGen/PowerPC/fneg.ll into:

_t4:
        fmul f0, f3, f4
        fmadd f1, f1, f2, f0
        blr

instead of:

_t4:
        fneg f0, f3
        fmul f0, f0, f4
        fmsub f1, f1, f2, f0
        blr

llvm-svn: 37054
2007-05-14 22:04:50 +00:00
Evan Cheng f325c2a65e Can't fold the bit_convert is the store is a truncating store.
llvm-svn: 36962
2007-05-09 21:49:47 +00:00
Evan Cheng 562e45692e Forgot a check.
llvm-svn: 36910
2007-05-07 21:36:06 +00:00
Evan Cheng a4cf58a103 Enable a couple of xforms:
- (store (bitconvert v)) -> (store v) if resultant store does not require
higher alignment
- (bitconvert (load v)) -> (load (bitconvert*)v) if resultant load does not
require higher alignment

llvm-svn: 36908
2007-05-07 21:27:48 +00:00
Evan Cheng 044a0a8cfb Don't create indexed load / store with zero offset!
llvm-svn: 36716
2007-05-03 23:52:19 +00:00
Evan Cheng b68343cdd8 Forgot about chain result; also UNDEF cannot have multiple values.
llvm-svn: 36622
2007-05-01 08:53:39 +00:00
Evan Cheng a684cd23a5 * Only turn a load to UNDEF if all of its outputs have no uses (indexed loads
produce two results.)
* Do not touch volatile loads.

llvm-svn: 36604
2007-05-01 00:38:21 +00:00
Christopher Lamb 8af6d5896f PR400 phase 2. Propagate attributed load/store information through DAGs.
llvm-svn: 36356
2007-04-22 23:15:30 +00:00
Reid Spencer 0c1349e6bc Revert Christopher Lamb's load/store alignment changes.
llvm-svn: 36309
2007-04-21 18:36:27 +00:00
Christopher Lamb bff50208c8 add support for alignment attributes on load/store instructions
llvm-svn: 36301
2007-04-21 08:16:25 +00:00
Chris Lattner f03c90bee6 allow SRL to simplify its operands, as it doesn't demand all bits as input.
llvm-svn: 36245
2007-04-18 03:06:49 +00:00
Chris Lattner bf14f20632 When replacing a node in SimplifyDemandedBits, if the old node used any
single-use nodes, they will be dead soon.  Make sure to remove them before
processing other nodes.  This implements CodeGen/X86/shl_elim.ll

llvm-svn: 36244
2007-04-18 03:05:22 +00:00
Chris Lattner 9ad5915559 SIGN_EXTEND_INREG does not demand its top bits. Give SimplifyDemandedBits
a chance to hack on it.  This compiles:

int baz(long long a) { return (short)(((int)(a >>24)) >> 9); }

into:
_baz:
        slwi r2, r3, 8
        srwi r2, r2, 9
        extsh r3, r2
        blr

instead of:

_baz:
        srwi r2, r4, 24
        rlwimi r2, r3, 8, 0, 23
        srwi r2, r2, 9
        extsh r3, r2
        blr

This implements CodeGen/PowerPC/sign_ext_inreg1.ll

llvm-svn: 36212
2007-04-17 19:03:21 +00:00
Chris Lattner 18e4ac4107 fix an infinite loop compiling ldecod, notice by JeffC.
llvm-svn: 35910
2007-04-11 16:51:53 +00:00
Chris Lattner a083ffcad7 Fix this harder.
llvm-svn: 35888
2007-04-11 06:50:51 +00:00
Chris Lattner c5f85d3738 don't create shifts by zero, fix some problems with my previous patch
llvm-svn: 35887
2007-04-11 06:43:25 +00:00
Chris Lattner 65786b078c Teach the codegen to turn [aez]ext (setcc) -> selectcc of 1/0, which often
allows other simplifications.  For example, this compiles:
int isnegative(unsigned int X) {
   return !(X < 2147483648U);
}

Into this code:

x86:
        movl 4(%esp), %eax
        shrl $31, %eax
        ret
arm:
        mov r0, r0, lsr #31
        bx lr
thumb:
        lsr r0, r0, #31
        bx lr

instead of:

x86:
        cmpl $0, 4(%esp)
        sets %al
        movzbl %al, %eax
        ret

arm:
        mov r3, #0
        cmp r0, #0
        movlt r3, #1
        mov r0, r3
        bx lr

thumb:
        mov r2, #1
        mov r1, #0
        cmp r0, #0
        blt LBB1_2      @entry
LBB1_1: @entry
        cpy r2, r1
LBB1_2: @entry
        cpy r0, r2
        bx lr

Testcase here: test/CodeGen/Generic/ispositive.ll

llvm-svn: 35883
2007-04-11 05:32:27 +00:00
Chris Lattner 41189c63cc Codegen integer abs more efficiently using the trick from the PPC CWG. This
improves codegen on many architectures.  Tests committed as CodeGen/*/iabs.ll

X86 Old:			X86 New:
_test:				_test:
   movl 4(%esp), %ecx		   movl 4(%esp), %eax
   movl %ecx, %eax		   movl %eax, %ecx
   negl %eax			   sarl $31, %ecx
   testl %ecx, %ecx		   addl %ecx, %eax
   cmovns %ecx, %eax		   xorl %ecx, %eax
   ret				   ret

PPC Old:			PPC New:
_test:				_test:
   cmpwi cr0, r3, -1		   srawi r2, r3, 31
   neg r2, r3			   add r3, r3, r2
   bgt cr0, LBB1_2 ;		   xor r3, r3, r2
LBB1_1: ;			   blr
   mr r3, r2
LBB1_2: ;
   blr

ARM Old:			ARM New:
_test:				_test:
   rsb r3, r0, #0		   add r3, r0, r0, asr #31
   cmp r0, #0			   eor r0, r3, r0, asr #31
   movge r3, r0			   bx lr
   mov r0, r3
   bx lr

Thumb Old:			Thumb New:
_test:				_test:
   neg r2, r0			   asr r2, r0, #31
   cmp r0, #0			   add r0, r0, r2
   bge LBB1_2			   eor r0, r2
LBB1_1: @			   bx lr
   cpy r0, r2
LBB1_2: @
   bx lr


Sparc Old:			Sparc New:
test:				test:
   save -96, %o6, %o6		   save -96, %o6, %o6
   sethi 0, %l0			   sra %i0, 31, %l0
   sub %l0, %i0, %l0		   add %i0, %l0, %l1
   subcc %i0, -1, %l1		   xor %l1, %l0, %i0
   bg .BB1_2			   restore %g0, %g0, %g0
   nop				   retl
.BB1_1:				   nop
   or %g0, %l0, %i0
.BB1_2:
   restore %g0, %g0, %g0
   retl
   nop

It also helps alpha/ia64 :)

llvm-svn: 35881
2007-04-11 05:11:38 +00:00
Scott Michel 16627a542f 1. Insert custom lowering hooks for ISD::ROTR and ISD::ROTL.
2. Help DAGCombiner recognize zero/sign/any-extended versions of ROTR and ROTL
patterns. This was motivated by the X86/rotate.ll testcase, which should now
generate code for other platforms (and soon-to-come platforms.) Rewrote code
slightly to make it easier to read.

llvm-svn: 35605
2007-04-02 21:36:32 +00:00
Dale Johannesen 4bbd2eefba Fix incorrect combination of different loads. Reenable zext-over-truncate
combination.

llvm-svn: 35517
2007-03-30 21:38:07 +00:00
Evan Cheng ccee35fd0d Disable load width reduction xform of variant (zext (truncate load x)) for
big endian targets until llvm-gcc build issue has been resolved.

llvm-svn: 35449
2007-03-29 07:56:46 +00:00
Evan Cheng 8275f0e0af SIGN_EXTEND_INREG requires one extra operand, a ValueType node.
llvm-svn: 35350
2007-03-26 07:12:51 +00:00
Evan Cheng b7051f596a Adjust offset to compensate for big endian machines.
llvm-svn: 35293
2007-03-24 00:02:43 +00:00
Evan Cheng a883b58caf Make sure SEXTLOAD of the specific type is supported on the target.
llvm-svn: 35289
2007-03-23 22:13:36 +00:00
Evan Cheng e2f5f24e8e Also replace uses of SRL if that's also folded during ReduceLoadWidth().
llvm-svn: 35286
2007-03-23 20:55:21 +00:00
Evan Cheng a824e79f06 A couple of bug fixes for reducing load width xform:
1. Address offset is in bytes.
2. Make sure truncate node uses are replaced with new load.

llvm-svn: 35274
2007-03-23 02:16:52 +00:00
Evan Cheng 464dc9b74c More opportunities to reduce load size.
llvm-svn: 35254
2007-03-22 01:54:19 +00:00
Evan Cheng d63baead9b fold (truncate (srl (load x), c)) -> (smaller load (x+c/vt bits))
llvm-svn: 35239
2007-03-21 20:14:05 +00:00
Evan Cheng 8a1d09d079 Avoid combining indexed load further.
llvm-svn: 35005
2007-03-07 08:07:03 +00:00
Chris Lattner 47206667c0 fold away addc nodes when we know there cannot be a carry-out.
llvm-svn: 34913
2007-03-04 20:40:38 +00:00
Chris Lattner 2dcc6e7f58 generalize
llvm-svn: 34910
2007-03-04 20:08:45 +00:00
Chris Lattner e2e13caeb2 canonicalize constants to the RHS of addc/adde. If nothing uses the carry out of
addc, turn it into add.

This allows us to compile:

long long test(long long A, unsigned B) {
  return (A + ((long long)B << 32)) & 123;
}

into:

_test:
        movl $123, %eax
        andl 4(%esp), %eax
        xorl %edx, %edx
        ret

instead of:
_test:
        xorl %edx, %edx
        movl %edx, %eax
        addl 4(%esp), %eax   ;; add of zero
        andl $123, %eax
        ret

llvm-svn: 34909
2007-03-04 20:03:15 +00:00
Chris Lattner fce448f856 Fold (sext (truncate x)) more aggressively, by avoiding creation of a
sextinreg if not needed.   This is useful in two cases: before legalize,
it avoids creating a sextinreg that will be trivially removed.  After legalize
if the target doesn't support sextinreg, the trunc/sext would not have been
removed before.

llvm-svn: 34621
2007-02-26 03:13:59 +00:00
Evan Cheng 92658d5648 Move SimplifySetCC to TargetLowering and allow it to be shared with legalizer.
llvm-svn: 34065
2007-02-08 22:13:59 +00:00
Evan Cheng 00a640dbe0 Fix for PR1108: type of insert_vector_elt index operand is PtrVT, not MVT::i32.
llvm-svn: 33398
2007-01-20 10:10:26 +00:00
Evan Cheng 9201100b29 Remove this xform:
(shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)
Replace it with:
(add (shl (add x, c1), c2), ) -> (add (add (shl x, c2), c1<<c2), )

This fixes test/CodeGen/ARM/smul.ll

llvm-svn: 33361
2007-01-19 17:51:44 +00:00
Chris Lattner 4dc4489286 Fix PR1114 and CodeGen/Generic/2007-01-15-LoadSelectCycle.ll by being
careful when folding "c ? load p : load q" that C doesn't reach either load.
If so, folding this into load (c ? p : q) will induce a cycle in the graph.

llvm-svn: 33251
2007-01-16 05:59:59 +00:00
Chris Lattner f70c5cd5db add options to view the dags before the first or second pass of dag combine.
llvm-svn: 33249
2007-01-16 04:55:25 +00:00
Chris Lattner 0199fd6d59 Implement some trivial FP foldings when -enable-unsafe-fp-math is specified.
This implements CodeGen/PowerPC/unsafe-math.ll

llvm-svn: 33024
2007-01-08 23:04:05 +00:00
Chris Lattner aee775a6b7 Eliminate static ctors from Statistics
llvm-svn: 32698
2006-12-19 22:41:21 +00:00
Evan Cheng 28cf4277bb Cannot combine an indexed load / store any further.
llvm-svn: 32629
2006-12-16 06:25:23 +00:00
Jim Laskey 26df19ace6 This code was usurping the sextload expand in teh legalizer. Just make
sure the right conditions are checked.

llvm-svn: 32611
2006-12-15 21:38:30 +00:00
Chris Lattner b7524b6d0e make this code more aggressive about turning store fpimm into store int imm.
This is not sufficient to fix X86/store-fp-constant.ll

llvm-svn: 32465
2006-12-12 04:16:14 +00:00
Evan Cheng 218369881f Don't convert store double C, Ptr to store long C, Ptr if i64 is not a legal type.
llvm-svn: 32434
2006-12-11 17:25:19 +00:00
Nate Begeman 8e20c760fa Move something that should be in the dag combiner from the legalizer to the
dag combiner.

llvm-svn: 32431
2006-12-11 02:23:46 +00:00
Chris Lattner d9f04e4875 Fix CodeGen/PowerPC/2006-12-07-SelectCrash.ll on PPC64
llvm-svn: 32336
2006-12-07 22:36:47 +00:00
Bill Wendling 22e978a736 Removing even more <iostream> includes.
llvm-svn: 32320
2006-12-07 20:04:42 +00:00
Chris Lattner 700b873130 Detemplatize the Statistic class. The only type it is instantiated with
is 'unsigned'.

llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Chris Lattner 3da631f29a For better or worse, load from i1 is assumed to be zero extended. Do not
form a load from i1 from larger loads that may not be zext'd.

llvm-svn: 31933
2006-11-27 04:40:53 +00:00
Chris Lattner 3676a994ca Fix PR1011 and CodeGen/Generic/2006-11-20-DAGCombineCrash.ll
llvm-svn: 31878
2006-11-20 18:05:46 +00:00
Evan Cheng f64da389f8 Fix an incorrectly inverted condition.
llvm-svn: 31773
2006-11-16 00:08:20 +00:00
Chris Lattner a0a8003f59 disallow preinc of a frameindex. This is not profitable and causes 2-addr
pass to explode.  This fixes a bunch of llc-beta failures on ppc last night.

llvm-svn: 31661
2006-11-11 01:00:15 +00:00
Chris Lattner eabc15c1d8 reduce indentation by using early exits. No functionality change.
llvm-svn: 31660
2006-11-11 00:56:29 +00:00
Chris Lattner ffad2166e1 move big chunks of code out-of-line, no functionality change.
llvm-svn: 31658
2006-11-11 00:39:41 +00:00
Chris Lattner 4eac5f59e6 Fix a dag combiner bug exposed by my recent instcombine patch. This fixes
CodeGen/Generic/2006-11-10-DAGCombineMiscompile.ll and PPC gsm/toast

llvm-svn: 31644
2006-11-10 21:37:15 +00:00
Evan Cheng 13440b025c When forming a pre-indexed store, make sure ptr isn't the same or is a pred of value being stored. It would cause a cycle.
llvm-svn: 31631
2006-11-10 08:28:11 +00:00
Evan Cheng 6878378390 Don't attempt expensive pre-/post- indexed dag combine if target does not support them.
llvm-svn: 31598
2006-11-09 19:10:46 +00:00
Evan Cheng b15000736c Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
llvm-svn: 31595
2006-11-09 17:55:04 +00:00
Evan Cheng b58e06bc9e getPostIndexedAddressParts change: passes in load/store instead of its loaded / stored VT.
llvm-svn: 31584
2006-11-09 04:29:46 +00:00
Evan Cheng 85e54223cd Match more post-indexed ops.
llvm-svn: 31569
2006-11-08 20:27:27 +00:00
Jim Laskey 61feeb90f9 Remove redundant <cmath>.
llvm-svn: 31561
2006-11-08 19:16:44 +00:00
Evan Cheng 0303cb9b33 - When performing pre-/post- indexed load/store transformation, do not worry
about whether the new base ptr would be live below the load/store. Let two
  address pass split it back to non-indexed ops.
- Minor tweaks / fixes.

llvm-svn: 31544
2006-11-08 08:30:28 +00:00
Evan Cheng 6072435756 Fixed a minor bug preventing some pre-indexed load / store transformation.
llvm-svn: 31543
2006-11-08 06:56:05 +00:00
Evan Cheng d48f7dd250 Fix a obscure post-indexed load / store dag combine bug.
llvm-svn: 31537
2006-11-08 02:38:55 +00:00
Evan Cheng 60c6846d21 Add post-indexed load / store transformations.
llvm-svn: 31498
2006-11-07 09:03:05 +00:00
Evan Cheng eb99bd736a Add comment.
llvm-svn: 31473
2006-11-06 08:14:30 +00:00
Jeff Cohen 7d6f3db3e2 Unbreak VC++ build.
llvm-svn: 31464
2006-11-05 19:31:28 +00:00
Evan Cheng 33157700d9 Added pre-indexed store support.
llvm-svn: 31459
2006-11-05 09:31:14 +00:00
Evan Cheng 1dfd26a151 Rename
llvm-svn: 31413
2006-11-03 07:21:16 +00:00
Reid Spencer 52f958741a Remove dead variable. Fix 80 column violations.
llvm-svn: 31412
2006-11-03 03:30:34 +00:00
Evan Cheng 357017f4a9 Added DAG combiner transformation to generate pre-indexed loads.
llvm-svn: 31410
2006-11-03 03:06:21 +00:00
Reid Spencer de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Jim Laskey 55e4dcad36 Add option for controlling inclusion of global AA.
llvm-svn: 31040
2006-10-18 19:08:31 +00:00
Jim Laskey a15b0ebb5e Use global info for alias analysis.
llvm-svn: 31035
2006-10-18 12:29:57 +00:00
Chris Lattner 327b88b102 Fix CodeGen/PowerPC/2006-10-17-brcc-miscompile.ll
llvm-svn: 31019
2006-10-17 21:24:15 +00:00
Jim Laskey e7d2c24a7d Make it simplier to dump DAGs while in DAGCombiner. Remove a nasty optimization.
llvm-svn: 31009
2006-10-17 19:33:52 +00:00
Evan Cheng 1e3a39cd08 Make sure operand does have size and element type operands.
llvm-svn: 30999
2006-10-17 17:06:35 +00:00
Evan Cheng f3ae00a64a Be careful when looking through a vbit_convert. Optimizing this:
(vector_shuffle
  (vbitconvert (vbuildvector (copyfromreg v4f32), 1, v4f32), 4, f32),
  (undef, undef, undef, undef), (0, 0, 0, 0), 4, f32)
to the
  vbitconvert
is a very bad idea.

llvm-svn: 30989
2006-10-16 22:49:37 +00:00
Jim Laskey dcb2b83886 Pass AliasAnalysis thru to DAGCombiner.
llvm-svn: 30984
2006-10-16 20:52:31 +00:00
Jim Laskey 3bf4f3bd60 Tidy up after truncstore changes.
llvm-svn: 30961
2006-10-14 12:14:27 +00:00
Chris Lattner 6a1b2de8c4 Make sure that the node returned by SimplifySetCC is added to the worklist
so that it can be deleted if unused.

llvm-svn: 30955
2006-10-14 03:52:46 +00:00
Chris Lattner 0626bd2fbc fold setcc of a setcc.
llvm-svn: 30953
2006-10-14 01:02:29 +00:00
Chris Lattner bd9acad805 When SimplifySetCC was moved to the DAGCombiner, it was never removed from
SelectionDAG and it has since bitrotted.  Remove the copy from SelectionDAG.
Next, remove the constant folding piece of DAGCombiner::SimplifySetCC into
a new FoldSetCC method which can be used by getNode() and SimplifySetCC.

This fixes obscure bugs.

llvm-svn: 30952
2006-10-14 00:41:01 +00:00
Jim Laskey dcf983ce41 Reduce the workload by not adding chain users to work list.
llvm-svn: 30948
2006-10-13 23:32:28 +00:00
Evan Cheng ab51cf2e78 Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Chris Lattner d0620d2773 Lower X%C into X/C+stuff. This allows the 'division by a constant' logic to
apply to rems as well as divs.  This fixes PR945 and speeds up ReedSolomon
from 14.57s to 10.90s (which is now faster than gcc).

It compiles CodeGen/X86/rem.ll into:

_test1:
        subl $4, %esp
        movl %esi, (%esp)
        movl $2155905153, %ecx
        movl 8(%esp), %esi
        movl %esi, %eax
        imull %ecx
        addl %esi, %edx
        movl %edx, %eax
        shrl $31, %eax
        sarl $7, %edx
        addl %eax, %edx
        imull $255, %edx, %eax
        subl %eax, %esi
        movl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret
_test2:
        movl 4(%esp), %eax
        movl %eax, %ecx
        sarl $31, %ecx
        shrl $24, %ecx
        addl %eax, %ecx
        andl $4294967040, %ecx
        subl %ecx, %eax
        ret
_test3:
        subl $4, %esp
        movl %esi, (%esp)
        movl $2155905153, %ecx
        movl 8(%esp), %esi
        movl %esi, %eax
        mull %ecx
        shrl $7, %edx
        imull $255, %edx, %eax
        subl %eax, %esi
        movl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret

instead of div/idiv instructions.

llvm-svn: 30920
2006-10-12 20:58:32 +00:00
Chris Lattner 2e33fb453b add a minor dag combine noticed when looking at PR945
llvm-svn: 30915
2006-10-12 20:23:19 +00:00
Jim Laskey df2ccc395e D'oh - need to use the rigth kind of store.
llvm-svn: 30903
2006-10-12 15:22:24 +00:00
Jim Laskey a13b9c7aa4 Alias analysis of TRUNCSTORE.
llvm-svn: 30889
2006-10-11 18:55:16 +00:00
Jim Laskey 0f7c328ae7 Handle aliasing of loadext.
llvm-svn: 30883
2006-10-11 17:47:52 +00:00
Jim Laskey 08edf332ed Fix regression in combiner alias analysis.
llvm-svn: 30880
2006-10-11 13:47:09 +00:00
Evan Cheng d35734bd1f Naming consistency.
llvm-svn: 30878
2006-10-11 07:10:22 +00:00
Evan Cheng e71fe34d75 Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner 5ab6d8b3fc Eliminate more token factors by taking advantage of transitivity:
if TF depends on A and B, and A depends on B, TF just needs to depend on
A.  With Jim's alias-analysis stuff enabled, this compiles the testcase in
PR892 into:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %edx, 28(%esp)
        movl %eax, 32(%esp)
        movl %eax, 24(%esp)
        movl %edx, 36(%esp)
        movl 52(%esp), %ecx
        movl %ecx, 4(%esp)
        movl %eax, 8(%esp)
        movl %edx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

instead of:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %eax, 24(%esp)
        movl %edx, 28(%esp)
        movl 24(%esp), %eax
        movl %eax, 32(%esp)
        movl 28(%esp), %eax
        movl %eax, 36(%esp)
        movl 32(%esp), %eax
        movl 36(%esp), %ecx
        movl 52(%esp), %edx
        movl %edx, 4(%esp)
        movl %eax, 8(%esp)
        movl %ecx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

llvm-svn: 30821
2006-10-08 22:57:01 +00:00
Jim Laskey 0463e08005 Combiner alias analysis passes Multisource (release-asserts.)
llvm-svn: 30818
2006-10-07 23:37:56 +00:00
Evan Cheng df9ac47e5e Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Jim Laskey 6549d22ef9 Alias analysis code clean ups.
llvm-svn: 30753
2006-10-05 15:07:25 +00:00
Jim Laskey 708d0db2d8 More extensive alias analysis.
llvm-svn: 30721
2006-10-04 16:53:27 +00:00
Evan Cheng 5d9fd977d3 Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
extra operand to LOADX to specify the exact value extension type.

llvm-svn: 30714
2006-10-04 00:56:09 +00:00
Jim Laskey 60832693a7 Load chain check is not needed
llvm-svn: 30613
2006-09-26 17:44:58 +00:00
Jim Laskey dde51671e5 Chain can be any operand
llvm-svn: 30611
2006-09-26 09:32:41 +00:00
Jim Laskey 5f3e0af9d0 Wrong size for load
llvm-svn: 30610
2006-09-26 08:14:06 +00:00
Jim Laskey b4a864d533 Can't move a load node if it's chain is not used.
llvm-svn: 30609
2006-09-26 07:37:42 +00:00
Jim Laskey 7aa0638aa9 Accidental enable of bad code
llvm-svn: 30601
2006-09-25 21:11:32 +00:00
Jim Laskey b5534e5c28 Fix chain dropping in load and drop unused stores in ret blocks.
llvm-svn: 30600
2006-09-25 19:32:58 +00:00
Jim Laskey d07be232ba Core antialiasing for load and store.
llvm-svn: 30597
2006-09-25 16:29:54 +00:00
Evan Cheng 449a0c7e33 Make it work for DAG combine of multi-value nodes.
llvm-svn: 30573
2006-09-21 19:04:05 +00:00
Jim Laskey 35f7eebb49 core corrections
llvm-svn: 30570
2006-09-21 17:35:47 +00:00
Jim Laskey 5d19d59017 Basic "in frame" alias analysis.
llvm-svn: 30568
2006-09-21 16:28:59 +00:00
Chris Lattner 082db3f9aa fold (aext (and (trunc x), cst)) -> (and x, cst).
llvm-svn: 30561
2006-09-21 06:40:43 +00:00
Chris Lattner fa9f92cf65 Check the right value type. This fixes 186.crafty on x86
llvm-svn: 30560
2006-09-21 06:17:39 +00:00
Chris Lattner 8d8a3bf9c9 Compile:
int %test(ulong *%tmp) {
        %tmp = load ulong* %tmp         ; <ulong> [#uses=1]
        %tmp.mask = shr ulong %tmp, ubyte 50            ; <ulong> [#uses=1]
        %tmp.mask = cast ulong %tmp.mask to ubyte
        %tmp2 = and ubyte %tmp.mask, 3          ; <ubyte> [#uses=1]
        %tmp2 = cast ubyte %tmp2 to int         ; <int> [#uses=1]
        ret int %tmp2
}

to:

_test:
        movl 4(%esp), %eax
        movl 4(%eax), %eax
        shrl $18, %eax
        andl $3, %eax
        ret

instead of:

_test:
        movl 4(%esp), %eax
        movl 4(%eax), %eax
        shrl $18, %eax
        # TRUNCATE movb %al, %al
        andb $3, %al
        movzbl %al, %eax
        ret

llvm-svn: 30558
2006-09-21 06:14:31 +00:00
Chris Lattner a31f0a622b Generalize (zext (truncate x)) and (sext (truncate x)) folding to work when
the src/dst are not the same size.  This catches things like "truncate
32-bit X to 8 bits, then zext to 16", which happens a bit on X86.

llvm-svn: 30557
2006-09-21 06:00:20 +00:00
Chris Lattner c8cd62d381 Compile:
int test3(int a, int b) { return (a < 0) ? a : 0; }

to:

_test3:
        srawi r2, r3, 31
        and r3, r2, r3
        blr

instead of:

_test3:
        cmpwi cr0, r3, 1
        li r2, 0
        blt cr0, LBB2_2 ;entry
LBB2_1: ;entry
        mr r3, r2
LBB2_2: ;entry
        blr


This implements: PowerPC/select_lt0.ll:seli32_a_a

llvm-svn: 30517
2006-09-20 06:41:35 +00:00
Chris Lattner 8746e2cd57 Fold the full generality of (any_extend (truncate x))
llvm-svn: 30514
2006-09-20 06:29:17 +00:00
Chris Lattner 8b68decb27 Two things:
1. teach SimplifySetCC that '(srl (ctlz x), 5) == 0' is really x != 0.
2. Teach visitSELECT_CC to use SimplifySetCC instead of calling it and
   ignoring the result.  This allows us to compile:

bool %test(ulong %x) {
  %tmp = setlt ulong %x, 4294967296
  ret bool %tmp
}

to:

_test:
        cntlzw r2, r3
        cmplwi cr0, r3, 1
        srwi r2, r2, 5
        li r3, 0
        beq cr0, LBB1_2 ;
LBB1_1: ;
        mr r3, r2
LBB1_2: ;
        blr

instead of:

_test:
        addi r2, r3, -1
        cntlzw r2, r2
        cntlzw r3, r3
        srwi r2, r2, 5
        cmplwi cr0, r2, 0
        srwi r2, r3, 5
        li r3, 0
        bne cr0, LBB1_2 ;
LBB1_1: ;
        mr r3, r2
LBB1_2: ;
        blr

This isn't wonderful, but it's an improvement.

llvm-svn: 30513
2006-09-20 06:19:26 +00:00
Chris Lattner 46d710e6ea Fold (X & C1) | (Y & C2) -> (X|Y) & C3 when possible.
This implements CodeGen/X86/and-or-fold.ll

llvm-svn: 30379
2006-09-14 21:11:37 +00:00
Chris Lattner 97614c86ce Split rotate matching code out to its own function. Make it stronger, by
matching things like ((x >> c1) & c2) | ((x << c3) & c4) to (rot x, c5) & c6

llvm-svn: 30376
2006-09-14 20:50:57 +00:00
Evan Cheng 31305c45da DAG combiner fix for rotates. Previously the outer-most condition checks
for ROTL availability. This prevents it from forming ROTR for targets that
has ROTR only.

llvm-svn: 29997
2006-08-31 07:41:12 +00:00
Evan Cheng e5570a4c3f Move isCommutativeBinOp from SelectionDAG.cpp and DAGCombiner.cpp out. Make it a static method of SelectionDAG.
llvm-svn: 29951
2006-08-29 06:42:35 +00:00
Chris Lattner 3d27be1333 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner 6f22ebd8be change internal impl of dag combiner so that calls to CombineTo never have to
make a temporary vector.

llvm-svn: 29618
2006-08-11 17:56:38 +00:00
Chris Lattner a2f4086828 Change one ReplaceAllUsesWith method to take an array of operands to replace
instead of a vector of operands.

llvm-svn: 29616
2006-08-11 17:46:28 +00:00
Chris Lattner c24a1d3093 Start eliminating temporary vectors used to create DAG nodes. Instead, pass
in the start of an array and a count of operands where applicable.  In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap.  In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.

I updated a lot of code calling getNode that takes a vector, but ran out of
time.  The rest of the code should be updated, and these methods should be
removed.

We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.

It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.

llvm-svn: 29566
2006-08-08 02:23:42 +00:00
Reid Spencer 658b9476f0 Initialize some variables the compiler warns about.
llvm-svn: 29277
2006-07-25 20:44:41 +00:00
Evan Cheng 7c970b98d0 If a shuffle is a splat, check if the argument is a build_vector with all elements being the same. If so, return the argument.
llvm-svn: 29242
2006-07-21 08:25:53 +00:00
Evan Cheng 8472e0c4af If a shuffle is unary, i.e. one of the vector argument is not needed, turn the
operand into a undef and adjust mask accordingly.

llvm-svn: 29232
2006-07-20 22:44:41 +00:00
Andrew Lenharth ec104a2b41 80 cols
llvm-svn: 29221
2006-07-20 17:43:27 +00:00
Andrew Lenharth c496b418b5 Reduce number of exported symbols
llvm-svn: 29220
2006-07-20 17:28:38 +00:00
Chris Lattner 54a34cd20b Mark these two classes as hidden, shrinking libllbmgcc.dylib by 25K
llvm-svn: 28970
2006-06-28 21:58:30 +00:00
Andrew Lenharth 0e57b2cb92 Start on my todo list
llvm-svn: 28752
2006-06-12 16:07:18 +00:00
Evan Cheng 64d2846017 visitVBinOp: Can't fold divide by zero!
llvm-svn: 28584
2006-05-31 06:08:35 +00:00
Chris Lattner 8f872d2091 Fix a nasty dag combiner bug that caused nondeterminstic crashes (MY FAVORITE!):
SimplifySelectOps would eliminate a Select, delete it, then return true.

The clients would see that it did something and return null.

The top level would see a null return, and decide that nothing happened,
proceeding to process the node in other ways: boom.

The fix is simple: clients of SimplifySelectOps should return the select
node itself.

In order to catch really obnoxious boogs like this in the future, add an
assert that nodes are not deleted.  We do this by checking for a sentry node
type that the SDNode dtor sets when a node is destroyed.

llvm-svn: 28514
2006-05-27 00:43:02 +00:00
Andrew Lenharth 1dc9ec5874 Move this code to a common place
llvm-svn: 28329
2006-05-16 17:42:15 +00:00
Chris Lattner afe72481f6 Comment out dead variables
llvm-svn: 28252
2006-05-12 17:57:54 +00:00
Chris Lattner 66adee93aa Two simplifications for token factor nodes: simplify tf(x,x) -> x.
simplify tf(x,y,y,z) -> tf(x,y,z).

llvm-svn: 28233
2006-05-12 05:01:37 +00:00
Evan Cheng 2c74848af1 Debugging info
llvm-svn: 28200
2006-05-09 06:55:15 +00:00
Chris Lattner 446e1ef26a Make the case I just checked in stronger. Now we compile this:
short test2(short X, short x) {
  int Y = (short)(X+x);
  return Y >> 1;
}

to:

_test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r3, r2, 1
        blr

instead of:

_test2:
        add r2, r3, r4
        extsh r2, r2
        srwi r2, r2, 1
        extsh r3, r2
        blr

llvm-svn: 28175
2006-05-08 21:18:59 +00:00
Chris Lattner 29062da0ac Implement and_sext.ll:test3, generating:
_test4:
        srawi r3, r3, 16
        blr

instead of:

_test4:
        srwi r2, r3, 16
        extsh r3, r2
        blr

for:

short test4(unsigned X) {
  return (X >> 16);
}

llvm-svn: 28174
2006-05-08 20:59:41 +00:00
Chris Lattner 2935d8190c Compile this:
short test4(unsigned X) {
  return (X >> 16);
}

to:

_test4:
        movl 4(%esp), %eax
        sarl $16, %eax
        ret

instead of:

_test4:
        movl $-65536, %eax
        andl 4(%esp), %eax
        sarl $16, %eax
        ret

llvm-svn: 28171
2006-05-08 20:51:54 +00:00
Nate Begeman e5ce5bb6da Fix PR772
llvm-svn: 28161
2006-05-08 01:35:01 +00:00
Chris Lattner 7e7bcf3a54 Simplify some code, add a couple minor missed folds
llvm-svn: 28152
2006-05-06 23:06:26 +00:00
Chris Lattner 2a4d7b845b remove cases handled elsewhere
llvm-svn: 28150
2006-05-06 22:43:44 +00:00
Chris Lattner 1ecb2a2dac Use the new TargetLowering::ComputeNumSignBits method to eliminate
sign_extend_inreg operations.  Though ComputeNumSignBits is still rudimentary,
this is enough to compile this:

short test(short X, short x) {
  int Y = X+x;
  return (Y >> 1);
}
short test2(short X, short x) {
  int Y = (short)(X+x);
  return Y >> 1;
}

into:

_test:
        add r2, r3, r4
        srawi r3, r2, 1
        blr
_test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r3, r2, 1
        blr

instead of:

_test:
        add r2, r3, r4
        srawi r2, r2, 1
        extsh r3, r2
        blr
_test2:
        add r2, r3, r4
        extsh r2, r2
        srawi r2, r2, 1
        extsh r3, r2
        blr

llvm-svn: 28146
2006-05-06 09:30:03 +00:00
Chris Lattner 907e392dba Fold trunc(any_ext). This gives stuff like:
27,28c27
<       movzwl %di, %edi
<       movl %edi, %ebx
---
>       movw %di, %bx

llvm-svn: 28137
2006-05-05 22:56:26 +00:00
Chris Lattner 57f8c5a387 Shrink shifts when possible.
llvm-svn: 28136
2006-05-05 22:53:17 +00:00
Chris Lattner 3d26577396 Fold (fpext (load x)) -> (extload x)
llvm-svn: 28130
2006-05-05 21:34:35 +00:00
Chris Lattner 25a5283a86 Fold some common code.
llvm-svn: 28124
2006-05-05 06:32:04 +00:00
Chris Lattner 002ee91457 Implement:
// fold (and (sext x), (sext y)) -> (sext (and x, y))
  // fold (or  (sext x), (sext y)) -> (sext (or  x, y))
  // fold (xor (sext x), (sext y)) -> (sext (xor x, y))
  // fold (and (aext x), (aext y)) -> (aext (and x, y))
  // fold (or  (aext x), (aext y)) -> (aext (or  x, y))
  // fold (xor (aext x), (aext y)) -> (aext (xor x, y))

llvm-svn: 28123
2006-05-05 06:31:05 +00:00
Chris Lattner 5ac4293606 Pull and through and/or/xor. This compiles some bitfield code to:
mov EAX, DWORD PTR [ESP + 4]
        mov ECX, DWORD PTR [EAX]
        mov EDX, ECX
        add EDX, EDX
        or EDX, ECX
        and EDX, -2147483648
        and ECX, 2147483647
        or EDX, ECX
        mov DWORD PTR [EAX], EDX
        ret

instead of:

        sub ESP, 4
        mov DWORD PTR [ESP], ESI
        mov EAX, DWORD PTR [ESP + 8]
        mov ECX, DWORD PTR [EAX]
        mov EDX, ECX
        add EDX, EDX
        mov ESI, ECX
        and ESI, -2147483648
        and EDX, -2147483648
        or EDX, ESI
        and ECX, 2147483647
        or EDX, ECX
        mov DWORD PTR [EAX], EDX
        mov ESI, DWORD PTR [ESP]
        add ESP, 4
        ret

llvm-svn: 28122
2006-05-05 06:10:43 +00:00
Chris Lattner 812646aa0c Implement a variety of simplifications for ANY_EXTEND.
llvm-svn: 28121
2006-05-05 05:58:59 +00:00
Chris Lattner 8d6fc20181 Factor some code, add these transformations:
// fold (and (trunc x), (trunc y)) -> (trunc (and x, y))
  // fold (or  (trunc x), (trunc y)) -> (trunc (or  x, y))
  // fold (xor (trunc x), (trunc y)) -> (trunc (xor x, y))

llvm-svn: 28120
2006-05-05 05:51:50 +00:00
Chris Lattner 2b48a94413 Remove a bogus transformation. This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c
with some changes I have to the new CFE.

llvm-svn: 28022
2006-04-28 23:33:20 +00:00
Chris Lattner 662e940f73 Fix a couple more memory issues
llvm-svn: 27930
2006-04-21 15:32:26 +00:00
Chris Lattner cc47ab3305 Fix a really subtle and obnoxious memory bug that caused issues with an
llvm-gcc4 boostrap.  Whenever a node is deleted by the dag combiner, it
*must* be returned by the visit function, or the dag combiner will not
know that the node has been processed (and will, e.g., send it to the
target dag combine xforms).

llvm-svn: 27922
2006-04-20 23:55:59 +00:00
Evan Cheng a320abc494 Turn a VAND into a VECTOR_SHUFFLE is applicable.
DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. vector clear elements,
into a vector shuffle with a zero vector. It only does so when TLI tells it
the xform is profitable.

llvm-svn: 27874
2006-04-20 08:56:16 +00:00
Chris Lattner e1401e3610 Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable patterns
to match again :)

llvm-svn: 27533
2006-04-08 05:34:25 +00:00
Chris Lattner 098c01e94e Codegen shufflevector as VVECTOR_SHUFFLE
llvm-svn: 27529
2006-04-08 04:15:24 +00:00
Evan Cheng 613996c55e 1. If both vector operands of a vector_shuffle are undef, turn it into an undef.
2. A shuffle mask element can also be an undef.

llvm-svn: 27472
2006-04-06 23:20:43 +00:00
Chris Lattner 4ea52cac01 Do not create ZEXTLOAD's unless we are before legalize or the operation is
legal.

llvm-svn: 27402
2006-04-04 17:39:18 +00:00
Chris Lattner e1e3adf802 Add a missing check, this fixes UnitTests/Vector/sumarray.c
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner 04c00fc844 Add a missing check, which broke a bunch of vector tests.
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Andrew Lenharth 94f012f606 back this out
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth 015eaf5f33 This should be a win of every arch
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Chris Lattner 4993249a04 Add a little dag combine to compile this:
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
        %tmp1 = load <4 x float>* %in           ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 )           ; <int> [#uses=1]
        %tmp = seteq int %tmp, 0                ; <bool> [#uses=1]
        %tmp3 = cast bool %tmp to int           ; <int> [#uses=1]
        ret int %tmp3
}

into this:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        mtspr 256, r2
        blr

instead of this:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        lvx v0, 0, r3
        lvx v1, r5, r4
        vcmpgefp. v0, v1, v0
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner 0442a18758 Constant fold all of the vector binops. This allows us to compile this:
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"

aka:

void %test2(<16 x sbyte>* %P) {
  store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
  ret void
}

into this:

_test2:
        mfspr r2, 256
        oris r4, r2, 32768
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        lvx v0, r5, r4
        stvx v0, 0, r3
        mtspr 256, r2
        blr

instead of this:

_test2:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI2_0)
        lis r5, ha16(LCPI2_0)
        vspltisb v0, 8
        lvx v1, r5, r4
        vxor v0, v1, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html

llvm-svn: 27343
2006-04-02 03:25:57 +00:00
Chris Lattner e4e64b6b85 Implement constant folding of bit_convert of arbitrary constant vbuild_vector nodes.
llvm-svn: 27341
2006-04-02 02:53:43 +00:00
Chris Lattner 39dcf1a9e2 Delete identity shuffles, implementing CodeGen/Generic/vector-identity-shuffle.ll
llvm-svn: 27317
2006-03-31 22:16:43 +00:00
Chris Lattner 7e30af3887 Remove dead *extloads. This allows us to codegen vector.ll:test_extract_elt
to:

test_extract_elt:
        alloc r3 = ar.pfs,0,1,0,0
        adds r8 = 12, r32
        ;;
        ldfs f8 = [r8]
        mov ar.pfs = r3
        br.ret.sptk.many rp

instead of:

test_extract_elt:
        alloc r3 = ar.pfs,0,1,0,0
        adds r8 = 28, r32
        adds r9 = 24, r32
        adds r10 = 20, r32
        adds r11 = 16, r32
        ;;
        ldfs f6 = [r8]
        ;;
        ldfs f6 = [r9]
        adds r8 = 12, r32
        adds r9 = 8, r32
        adds r14 = 4, r32
        ;;
        ldfs f6 = [r10]
        ;;
        ldfs f6 = [r11]
        ldfs f8 = [r8]
        ;;
        ldfs f6 = [r9]
        ;;
        ldfs f6 = [r14]
        ;;
        ldfs f6 = [r32]
        mov ar.pfs = r3
        br.ret.sptk.many rp

llvm-svn: 27297
2006-03-31 18:10:41 +00:00
Chris Lattner 2d8551c85b Delete dead loads in the dag. This allows us to compile
vector.ll:test_extract_elt2 into:

_test_extract_elt2:
        lfd f1, 32(r3)
        blr

instead of:

_test_extract_elt2:
        lfd f0, 56(r3)
        lfd f0, 48(r3)
        lfd f0, 40(r3)
        lfd f1, 32(r3)
        lfd f0, 24(r3)
        lfd f0, 16(r3)
        lfd f0, 8(r3)
        lfd f0, 0(r3)
        blr

llvm-svn: 27296
2006-03-31 18:06:18 +00:00
Chris Lattner 20e619fba3 When building a VVECTOR_SHUFFLE node from extract_element operations, make
sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask).

The later is not canonical form, and prevents the PPC splat pattern from
matching.  For a particular splat, we go from generating this:

	li r10, lo16(LCPI1_0)
	lis r11, ha16(LCPI1_0)
	lvx v3, r11, r10
	vperm v3, v2, v2, v3

to generating:

	vspltw v3, v2, 3

llvm-svn: 27236
2006-03-28 22:19:47 +00:00
Chris Lattner a46dfe80c8 Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y')
llvm-svn: 27235
2006-03-28 22:11:53 +00:00
Chris Lattner c9992548fc Turn a series of extract_element's feeding a build_vector into a
vector_shuffle node.  For this:

void test(__m128 *res, __m128 *A, __m128 *B) {
  *res = _mm_unpacklo_ps(*A, *B);
}

we now produce this code:

_test:
        movl 8(%esp), %eax
        movaps (%eax), %xmm0
        movl 12(%esp), %eax
        unpcklps (%eax), %xmm0
        movl 4(%esp), %eax
        movaps %xmm0, (%eax)
        ret

instead of this:

_test:
        subl $76, %esp
        movl 88(%esp), %eax
        movaps (%eax), %xmm0
        movaps %xmm0, (%esp)
        movaps %xmm0, 32(%esp)
        movss 4(%esp), %xmm0
        movss 32(%esp), %xmm1
        unpcklps %xmm0, %xmm1
        movl 84(%esp), %eax
        movaps (%eax), %xmm0
        movaps %xmm0, 16(%esp)
        movaps %xmm0, 48(%esp)
        movss 20(%esp), %xmm0
        movss 48(%esp), %xmm2
        unpcklps %xmm0, %xmm2
        unpcklps %xmm1, %xmm2
        movl 80(%esp), %eax
        movaps %xmm2, (%eax)
        addl $76, %esp
        ret

GCC produces this (with -fomit-frame-pointer):

_test:
        subl    $12, %esp
        movl    20(%esp), %eax
        movaps  (%eax), %xmm0
        movl    24(%esp), %eax
        unpcklps        (%eax), %xmm0
        movl    16(%esp), %eax
        movaps  %xmm0, (%eax)
        addl    $12, %esp
        ret

llvm-svn: 27233
2006-03-28 20:28:38 +00:00
Chris Lattner b7163598f9 Don't crash on X^X if X is a vector. Instead, produce a vector of zeros.
llvm-svn: 27229
2006-03-28 19:11:05 +00:00
Chris Lattner dc1eab5886 Don't call SimplifyDemandedBits on vectors
llvm-svn: 27128
2006-03-25 22:19:00 +00:00
Chris Lattner 5336a59e4b fold insertelement(buildvector) -> buildvector if the inserted element # is
a constant.  This implements test_constant_insert in CodeGen/Generic/vector.ll

llvm-svn: 26851
2006-03-19 01:27:56 +00:00
Nate Begeman bb01d4f272 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner 68ac09d5cb make sure dead token factor nodes are removed by the dag combiner.
llvm-svn: 26731
2006-03-13 18:37:30 +00:00
Chris Lattner d8c2a48d58 Fold X+Y -> X|Y when safe. This implements:
Regression/CodeGen/PowerPC/and_add.ll

a case that occurs with dynamic allocas of constant size.

llvm-svn: 26727
2006-03-13 06:51:27 +00:00
Chris Lattner 8bb6cb7d7b add a couple of missing folds
llvm-svn: 26724
2006-03-13 06:26:26 +00:00
Chris Lattner bdaf4f38b5 Reinstate this now that the offending opposite xform has been removed.
llvm-svn: 26548
2006-03-05 19:53:55 +00:00
Evan Cheng d428e22c07 Back out fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2) for now.
It's causing an infinite loop compiling ldecod on x86 / Darwin.

llvm-svn: 26544
2006-03-05 07:30:16 +00:00
Chris Lattner 3bc4050217 Add some simple copysign folds
llvm-svn: 26543
2006-03-05 05:30:57 +00:00
Chris Lattner f29f5204cc fold (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2)
fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)

This allows us to compile CodeGen/PowerPC/addi-reassoc.ll into:

_test1:
        slwi r2, r4, 4
        add r2, r2, r3
        lwz r3, 36(r2)
        blr
_test2:
        mulli r2, r4, 5
        add r2, r2, r3
        lbz r2, 11(r2)
        extsb r3, r2
        blr

instead of:

_test1:
        addi r2, r4, 2
        slwi r2, r2, 4
        add r2, r3, r2
        lwz r3, 4(r2)
        blr
_test2:
        addi r2, r4, 2
        mulli r2, r2, 5
        add r2, r3, r2
        lbz r2, 1(r2)
        extsb r3, r2
        blr

llvm-svn: 26535
2006-03-04 23:33:26 +00:00
Chris Lattner 0db2f2c689 Fix CodeGen/Generic/2006-03-01-dagcombineinfloop.ll, an infinite loop
in the dag combiner on 176.gcc on x86.

llvm-svn: 26459
2006-03-01 21:47:21 +00:00
Chris Lattner 232024edb8 Fix a typo evan noticed
llvm-svn: 26454
2006-03-01 19:55:35 +00:00
Chris Lattner bc1c85beea Add support for target-specific dag combines
llvm-svn: 26443
2006-03-01 04:53:38 +00:00
Chris Lattner fbcd62d3bb Add a new AddToWorkList method, start using it
llvm-svn: 26441
2006-03-01 04:03:14 +00:00
Chris Lattner 324871ef1a Pull shifts by a constant through multiplies (a form of reassociation),
implementing Regression/CodeGen/X86/mul-shift-reassoc.ll

llvm-svn: 26440
2006-03-01 03:44:24 +00:00
Evan Cheng b97aab4371 Vector ops lowering.
llvm-svn: 26436
2006-03-01 01:09:54 +00:00
Chris Lattner f0032b350c Compile:
unsigned foo4(unsigned short *P) { return *P & 255; }
unsigned foo5(short *P) { return *P & 255; }

to:

_foo4:
        lbz r3,1(r3)
        blr
_foo5:
        lbz r3,1(r3)
        blr

not:

_foo4:
        lhz r2, 0(r3)
        rlwinm r3, r2, 0, 24, 31
        blr
_foo5:
        lhz r2, 0(r3)
        rlwinm r3, r2, 0, 24, 31
        blr

llvm-svn: 26419
2006-02-28 06:49:37 +00:00
Chris Lattner bdbc4476d9 Fold "and (LOAD P), 255" -> zextload. This allows us to compile:
unsigned foo3(unsigned *P) { return *P & 255; }
as:
_foo3:
        lbz r3, 3(r3)
        blr

instead of:

_foo3:
        lwz r2, 0(r3)
        rlwinm r3, r2, 0, 24, 31
        blr

and:

unsigned short foo2(float a) { return a; }

as:
_foo2:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lhz r3, -2(r1)
        blr

instead of:

_foo2:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

llvm-svn: 26417
2006-02-28 06:35:35 +00:00
Chris Lattner 0f8a727c49 fold (sra (sra x, c1), c2) -> (sra x, c1+c2)
llvm-svn: 26416
2006-02-28 06:23:04 +00:00
Chris Lattner 47ee42829d remove some completed notes
llvm-svn: 26390
2006-02-27 00:39:31 +00:00
Chris Lattner 301f45cf6f Fix a problem Nate and Duraid reported where simplifying nodes can cause
them to get ressurected, in which case, deleting the undead nodes is
unfriendly.

llvm-svn: 26291
2006-02-20 06:51:04 +00:00
Nate Begeman abac61603f Add checks to make sure we don't create bogus extend nodes, and fix a bug
where we were doing exactly that which was causing failures on x86 and
alpha.

llvm-svn: 26284
2006-02-18 02:40:58 +00:00
Chris Lattner 375e1a71cc Fix a tricky issue in the SimplifyDemandedBits code where CombineTo wasn't
exactly the API we wanted to call into.  This fixes the crash on crafty last
night.

llvm-svn: 26269
2006-02-17 21:58:01 +00:00
Nate Begeman fb5dbadf15 Clean up DemandedBitsAreZero interface
Make more use of the new mask helpers in valuetypes.h
Combine (sra (srl x, c1), c1) -> sext_inreg if legal

llvm-svn: 26263
2006-02-17 19:54:08 +00:00
Nate Begeman 57b3567552 Don't expand sdiv by power of two before legalize, since it will likely
generate illegal nodes.

llvm-svn: 26261
2006-02-17 07:26:20 +00:00
Nate Begeman 5965bd19f8 kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.

llvm-svn: 26255
2006-02-17 05:43:56 +00:00
Nate Begeman 8a77efe4f7 Rework the SelectionDAG-based implementations of SimplifyDemandedBits
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.

llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Chris Lattner 471627c49d Lowering of sdiv X, pow2 was broken, this fixes it. This patch is written
by Nate, I'm just committing it for him.

llvm-svn: 26230
2006-02-16 08:02:36 +00:00
Jim Laskey 2eea436192 Should not combine ISD::LOCATIONs until we have scheme to remove from
MachineDebugInfo tables.

llvm-svn: 26216
2006-02-15 19:34:44 +00:00
Chris Lattner a10e23c19f Compile this:
xori r6, r2, 1
        rlwinm r6, r6, 0, 31, 31
        cmpwi cr0, r6, 0
        bne cr0, LBB1_3 ; endif

to this:

        rlwinm r6, r2, 0, 31, 31
        cmpwi cr0, r6, 0
        beq cr0, LBB1_3 ; endif

llvm-svn: 26047
2006-02-08 02:13:15 +00:00
Nate Begeman 8c9cd461df Back out previous commit, it isn't safe.
llvm-svn: 26006
2006-02-05 08:23:00 +00:00
Nate Begeman 3dc8b89493 fold c1 << (x + c2) into (c1 << c2) << x. fix a warning.
llvm-svn: 26005
2006-02-05 08:07:24 +00:00
Nate Begeman c89fdf1eb3 Handle urem by shifted powers of 2.
llvm-svn: 26001
2006-02-05 07:36:48 +00:00
Nate Begeman 25d178bece handle combining A / (B << N) into A >>u (log2(B)+N) when B is a power of 2
llvm-svn: 26000
2006-02-05 07:20:23 +00:00
Nate Begeman dc7bba9ffe Add a framework for eliminating instructions that produces undemanded bits.
llvm-svn: 25945
2006-02-03 22:24:05 +00:00
Nate Begeman 22e251abf1 Add common code for reassociating ops in the dag combiner
llvm-svn: 25934
2006-02-03 06:46:56 +00:00
Chris Lattner 49beaf40fc Turn any_extend nodes into zero_extend nodes when it allows us to remove an
and instruction.  This allows us to compile stuff like this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

to this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        ret

instead of this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

This occurs quite a bit with the X86 backend.  For example, 25 times in
lambda, 30 times in 177.mesa, 14 times in galgel,  70 times in fma3d,
25 times in vpr, several hundred times in gcc, ~45 times in crafty,
~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap,
16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K
programs.

llvm-svn: 25901
2006-02-02 07:17:31 +00:00
Chris Lattner 49ce35542f add two dag combines:
(C1-X) == C2 --> X == C1-C2
(X+C1) == C2 --> X == C2-C1

This allows us to compile this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

into this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

not this:

_X:
        movl $14, %eax
        addl 4(%esp), %eax
        cmpl $12345, %eax
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

Testcase here: Regression/CodeGen/X86/compare-add.ll

nukage of the and coming up next.

llvm-svn: 25898
2006-02-02 06:36:13 +00:00
Nate Begeman 7e7f439f85 Fix some of the stuff in the PPC README file, and clean up legalization
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.

llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Chris Lattner f0b24d2dc0 Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface,making isMaskedValueZeroForTargetNode simpler, and useable from other partsof the compiler.
llvm-svn: 25803
2006-01-30 04:09:27 +00:00
Chris Lattner 3b40e64aa3 pass the address of MaskedValueIsZero into isMaskedValueZeroForTargetNode,
to permit recursion

llvm-svn: 25799
2006-01-30 03:49:37 +00:00
Chris Lattner 678da98835 eliminate uses of SelectionDAG::getBR2Way_CC
llvm-svn: 25767
2006-01-29 06:00:45 +00:00
Nate Begeman af397cec0b Add a missing case to the dag combiner.
llvm-svn: 25723
2006-01-28 01:06:30 +00:00
Chris Lattner de02d7727f Add explicit #includes of <iostream>
llvm-svn: 25515
2006-01-22 23:41:00 +00:00
Nate Begeman 569c439567 Get rid of code in the DAGCombiner that is duplicated in SelectionDAG.cpp
Now all constant folding in the code generator is in one place.

llvm-svn: 25426
2006-01-18 22:35:16 +00:00
Chris Lattner 5fee908be5 Fix a backwards conditional that caused an inf loop in some cases. This
fixes: test/Regression/CodeGen/Generic/2005-01-18-SetUO-InfLoop.ll

llvm-svn: 25419
2006-01-18 19:13:41 +00:00
Chris Lattner fcdb420baf Disable two transformations that contribute to bus errors on SparcV8.
llvm-svn: 25339
2006-01-15 18:58:59 +00:00
Chris Lattner 3470b5dee6 Add a simple missing fold to produce this:
subfic r3, r2, 33

instead of this:

        subfic r2, r2, 32
        addi r3, r2, 1

llvm-svn: 25255
2006-01-12 20:22:43 +00:00
Chris Lattner b1ee616de9 Don't create rotate instructions in unsupported types, because we don't have
promote/expand code yet.  This fixes the 177.mesa failure on PPC.

llvm-svn: 25250
2006-01-12 18:57:33 +00:00
Nate Begeman 1b8121b227 Add bswap, rotl, and rotr nodes
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl

Targets should add rotl/rotr patterns if they have them

llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Evan Cheng 85c973cda9 Revert the previous check-in. Leave shl x, 1 along for target to deal with.
llvm-svn: 25121
2006-01-06 01:56:02 +00:00
Evan Cheng b03f9b32d2 fold (shl x, 1) -> (add x, x)
llvm-svn: 25120
2006-01-06 01:06:31 +00:00
Jim Laskey 762e9ec06c Added initial support for DEBUG_LABEL allowing debug specific labels to be
inserted in the code.

llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Jim Laskey 0da76a676a Add unique id to debug location for debug label use (work in progress.)
llvm-svn: 25096
2006-01-04 15:04:11 +00:00
Jim Laskey bdba3e2a46 Remove redundant debug locations.
llvm-svn: 24995
2005-12-23 20:08:28 +00:00
Chris Lattner 26943b9691 Simplify store(bitconv(x)) to store(x). This allows us to compile this:
void bar(double Y, double *X) {
  *X = Y;
}

to this:

bar:
        save -96, %o6, %o6
        st %i1, [%i2+4]
        st %i0, [%i2]
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

bar:
        save -104, %o6, %o6
        st %i1, [%i6+-4]
        st %i0, [%i6+-8]
        ldd [%i6+-8], %f0
        std  %f0, [%i2]
        restore %g0, %g0, %g0
        retl
        nop

on sparcv8.

llvm-svn: 24983
2005-12-23 05:48:07 +00:00
Chris Lattner 54560f6887 fold (conv (load x)) -> (load (conv*)x).
This allows us to compile this:
void foo(double);
void bar(double *X) { foo(*X); }

To this:

bar:
        save -96, %o6, %o6
        ld [%i0+4], %o1
        ld [%i0], %o0
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

bar:
        save -104, %o6, %o6
        ldd [%i0], %f0
        std %f0, [%i6+-8]
        ld [%i6+-4], %o1
        ld [%i6+-8], %o0
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

on SparcV8.

llvm-svn: 24982
2005-12-23 05:44:41 +00:00
Chris Lattner efbbedbf4a Fold bitconv(bitconv(x)) -> x. We now compile this:
void foo(double);
void bar(double X) { foo(X); }

to this:

bar:
        save -96, %o6, %o6
        or %g0, %i0, %o0
        or %g0, %i1, %o1
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

bar:
        save -112, %o6, %o6
        st %i1, [%i6+-4]
        st %i0, [%i6+-8]
        ldd [%i6+-8], %f0
        std %f0, [%i6+-16]
        ld [%i6+-12], %o1
        ld [%i6+-16], %o0
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

on V8.

llvm-svn: 24981
2005-12-23 05:37:50 +00:00
Chris Lattner a187460552 constant fold bits_convert in getNode and in the dag combiner for fp<->int
conversions.  This allows V8 to compiles this:

void %test() {
        call float %test2( float 1.000000e+00, float 2.000000e+00, double 3.000000e+00, double* null )
        ret void
}

into:

test:
        save -96, %o6, %o6
        sethi 0, %o3
        sethi 1049088, %o2
        sethi 1048576, %o1
        sethi 1040384, %o0
        or %g0, %o3, %o4
        call test2
        nop
        restore %g0, %g0, %g0
        retl
        nop

instead of:

test:
        save -112, %o6, %o6
        sethi 0, %o4
        sethi 1049088, %l0
        st %o4, [%i6+-12]
        st %l0, [%i6+-16]
        ld [%i6+-12], %o3
        ld [%i6+-16], %o2
        sethi 1048576, %o1
        sethi 1040384, %o0
        call test2
        nop
        restore %g0, %g0, %g0
        retl
        nop

llvm-svn: 24980
2005-12-23 05:30:37 +00:00
Evan Cheng 9cdc16c6d3 * Fix a GlobalAddress lowering bug.
* Teach DAG combiner about X86ISD::SETCC by adding a TargetLowering hook.

llvm-svn: 24921
2005-12-21 23:05:39 +00:00
Chris Lattner 83e4407379 Don't create SEXTLOAD/ZEXTLOAD instructions that the target doesn't support
if after legalize.  This fixes IA64 failures.

llvm-svn: 24725
2005-12-15 19:02:38 +00:00
Chris Lattner d39c60fcc8 When folding loads into ops, immediately replace uses of the op with the
load.  This reduces number of worklist iterations and avoid missing optimizations
depending on folding of things into sext_inreg nodes (which aren't supported by
all targets).
Tested by Regression/CodeGen/X86/extend.ll:test2

llvm-svn: 24712
2005-12-14 19:25:30 +00:00
Chris Lattner 7dac1083da Fix the (zext (zextload)) case to trigger, similarly for sign extends.
Allow (zext (truncate)) to apply after legalize if the target supports
AND (which all do).

This compiles
short %foo() {
        %tmp.0 = load ubyte* %X         ; <ubyte> [#uses=1]
        %tmp.3 = cast ubyte %tmp.0 to short             ; <short> [#uses=1]
        ret short %tmp.3
}

to:
_foo:
        movzbl _X, %eax
        ret

instead of:

_foo:
        movzbl _X, %eax
        movzbl %al, %eax
        ret

thanks to Evan for pointing this out.

llvm-svn: 24709
2005-12-14 19:05:06 +00:00
Chris Lattner f753d1a574 Fix a miscompilation in crafty due to a recent patch
llvm-svn: 24706
2005-12-14 07:58:38 +00:00
Evan Cheng bce7c47306 Fold (zext (load x) to (zextload x).
llvm-svn: 24702
2005-12-14 02:19:23 +00:00
Chris Lattner 57c882edf8 Only transform (sext (truncate x)) -> (sextinreg x) if before legalize or
if the target supports the resultant sextinreg

llvm-svn: 24632
2005-12-07 18:02:05 +00:00
Chris Lattner cbd3d01a43 Teach the dag combiner to turn a truncate/sign_extend pair into a sextinreg
when the types match up.  This allows the X86 backend to compile:

sbyte %toggle_value(sbyte* %tmp.1) {
        %tmp.2 = load sbyte* %tmp.1
        ret sbyte %tmp.2
}

to this:

_toggle_value:
        mov %EAX, DWORD PTR [%ESP + 4]
        movsx %EAX, BYTE PTR [%EAX]
        ret

instead of this:

_toggle_value:
        mov %EAX, DWORD PTR [%ESP + 4]
        movsx %EAX, BYTE PTR [%EAX]
        movsx %EAX, %AL
        ret

noticed in Shootout/objinst.

-Chris

llvm-svn: 24630
2005-12-07 07:11:03 +00:00
Jeff Cohen cf1f782a2f Fix operator precedence bug caught by VC++.
llvm-svn: 24318
2005-11-12 00:59:01 +00:00
Chris Lattner bf4f233214 Switch the allnodes list from a vector of pointers to an ilist of nodes.This eliminates the vector, allows constant time removal of a node froma graph, and makes iteration over the all nodes list stable when adding
nodes to the graph.

llvm-svn: 24263
2005-11-09 23:47:37 +00:00
Nate Begeman ee065281e8 Fix a crash that Andrew noticed, and add a pair of braces to unfconfuse
XCode's indenting.

llvm-svn: 24159
2005-11-02 18:42:59 +00:00
Chris Lattner 17df608719 Fix a source of undefined behavior when dealing with 64-bit types. This
may fix PR652.  Thanks to Andrew for tracking down the problem.

llvm-svn: 24145
2005-11-02 01:47:04 +00:00
Chris Lattner a70878d4fb Codegen mul by negative power of two with a shift and negate.
This implements test/Regression/CodeGen/PowerPC/mul-neg-power-2.ll,
producing:

_foo:
        slwi r2, r3, 1
        subfic r3, r2, 63
        blr

instead of:

_foo:
        mulli r2, r3, -2
        addi r3, r2, 63
        blr

llvm-svn: 24106
2005-10-30 06:41:49 +00:00
Chris Lattner 4b6d583d7a Fix DSE to not nuke dead stores unless they redundant store is the same
VT as the killing one.  Fix fixes PR491

llvm-svn: 24034
2005-10-27 07:10:34 +00:00
Chris Lattner d8c5c066a1 Add a simple xform that is useful for bitfield operations.
llvm-svn: 24029
2005-10-27 05:06:38 +00:00
Chris Lattner 3b409a85eb Clear a bit in this file that was causing a miscompilation of 178.galgel.
llvm-svn: 23980
2005-10-25 18:57:30 +00:00
Chris Lattner 9faa5b7a9a BuildSDIV and BuildUDIV only work for i32/i64, but they don't check that
the input is that type, this caused a failure on gs on X86 last night.
Move the hard checks into Build[US]Div since that is where decisions like
this should be made.

llvm-svn: 23881
2005-10-22 18:50:15 +00:00
Chris Lattner 75ea5b10bf add a case missing from the dag combiner that exposed the failure on
2005-10-21-longlonggtu.ll.

llvm-svn: 23875
2005-10-21 21:23:25 +00:00
Nate Begeman 8f62cd32ad Fix a typo in the dag combiner, so that this can work on i64 targets
llvm-svn: 23856
2005-10-21 01:51:45 +00:00
Nate Begeman 4dd383120f Invert the TargetLowering flag that controls divide by consant expansion.
Add a new flag to TargetLowering indicating if the target has really cheap
  signed division by powers of two, make ppc use it.  This will probably go
  away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.

llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Nate Begeman 7efe53d90b Fix a couple bugs in the const div stuff where we'd generate MULHS/MULHU
for types that aren't legal, and fail a divisor is less than zero
comparison, which would cause us to drop a subtract.

llvm-svn: 23846
2005-10-20 17:45:03 +00:00
Chris Lattner a6efeb01f9 don't use llabs with apparently VC++ doesn't have
llvm-svn: 23845
2005-10-20 17:01:00 +00:00
Nate Begeman c6f067a8c4 Move the target constant divide optimization up into the dag combiner, so
that the nodes can be folded with other nodes, and we can not duplicate
code in every backend.  Alpha will probably want this too.

llvm-svn: 23835
2005-10-20 02:15:44 +00:00
Chris Lattner 6c14c35bd7 Fold (select C, load A, load B) -> load (select C, A, B). This happens quite
a lot throughout many programs.  In particular, specfp triggers it a bunch for
constant FP nodes when you have code like  cond ? 1.0 : -1.0.

If the PPC ISel exposed the loads implicit in pic references to external globals,
we would be able to eliminate a load in cases like this as well:

%X = external global int
%Y = external global int
int* %test4(bool %C) {
        %G = select bool %C, int* %X, int* %Y
        ret int* %G
}

Note that this breaks things that use SrcValue's (see the fixme), but since nothing
uses them yet, this is ok.

Also, simplify some code to use hasOneUse() on an SDOperand instead of hasNUsesOfValue directly.

llvm-svn: 23781
2005-10-18 06:04:22 +00:00
Nate Begeman 418c6e4045 Implement some feedback from Chris re: constant canonicalization
llvm-svn: 23777
2005-10-18 00:28:13 +00:00
Nate Begeman ec48a1bfbd fold fmul X, +2.0 -> fadd X, X;
llvm-svn: 23774
2005-10-17 20:40:11 +00:00
Chris Lattner eeb2bda2fa add a trivial fold
llvm-svn: 23764
2005-10-17 01:07:11 +00:00
Chris Lattner e540800d5a Fix this logic.
llvm-svn: 23756
2005-10-15 22:35:40 +00:00
Chris Lattner 17cc9edd33 Add a case we were missing that was causing us to fail CodeGen/PowerPC/rlwinm.ll:test3
llvm-svn: 23755
2005-10-15 22:18:08 +00:00
Nate Begeman 6e673b24d3 fold sext_in_reg, sext_in_reg where both have the same VT. This was
popping up in Fourinarow.

llvm-svn: 23722
2005-10-14 01:29:07 +00:00
Nate Begeman d59e5a7abb Relax the checking on zextload generation a bit, since as sabre pointed out
you could be AND'ing with the result of a shift that shifts out all the
bits you care about, in addition to a constant.

Also, move over an add/sub_parts fold from legalize to the dag combiner,
where it works for things other than constants.  Woot!

llvm-svn: 23720
2005-10-14 01:12:21 +00:00
Chris Lattner b8282987f4 Fix the trunc(load) case, finally allowing crafty and povray to pass
llvm-svn: 23718
2005-10-13 22:10:05 +00:00
Chris Lattner dbc5ae3109 Fix some bugs in (sext (load x))
llvm-svn: 23717
2005-10-13 21:52:31 +00:00
Nate Begeman 8e022b3d89 Fix the remaining DAGCombiner issues pointed out by sabre. This should fix
the remainder of the failures introduced by my patch last night.

llvm-svn: 23714
2005-10-13 18:34:58 +00:00
Chris Lattner a80f1f6e72 Fix a minor bug in the dag combiner that broke pcompress2 and some other
tests.

llvm-svn: 23713
2005-10-13 18:16:34 +00:00
Nate Begeman 02b23c6065 Move some Legalize functionality over to the DAGCombiner where it belongs.
Kill some dead code.

llvm-svn: 23706
2005-10-13 03:11:28 +00:00
Nate Begeman 70d28c5e32 Fix a potential bug with two combine-to's back to back that chris pointed
out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.

Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.

llvm-svn: 23704
2005-10-12 23:18:53 +00:00
Nate Begeman 8caf81d617 More cool stuff for the dag combiner. We can now finally handle things
like turning:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

into
_foo:
        fctiwz f0,f1
        stfd f0,-8(r1)
        lhz r3,-2(r1)
        blr

Also removed an unncessary constraint from sra -> srl conversion, which
should take care of hte only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.

llvm-svn: 23703
2005-10-12 20:40:40 +00:00
Chris Lattner 514f058be1 Fix a powerpc crash on CodeGen/Generic/llvm-ct-intrinsics.ll
llvm-svn: 23694
2005-10-11 17:56:34 +00:00
Chris Lattner c38fb8e2a1 Add a canonicalization that got lost, fixing PowerPC/fold-li.ll:SUB
llvm-svn: 23693
2005-10-11 06:07:15 +00:00
Chris Lattner cc6e53e6ee clean up some corner cases
llvm-svn: 23692
2005-10-10 23:00:08 +00:00
Chris Lattner 04c737091f Implement trivial DSE. If two stores are neighbors and store to the same
location, replace them with a new store of the last value.  This occurs
in the same neighborhood in 197.parser, speeding it up about 1.5%

llvm-svn: 23691
2005-10-10 22:31:19 +00:00
Chris Lattner e260ed8628 Add support for CombineTo, allowing the dag combiner to replace nodes with
multiple results.

Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll.  Though this is the most simple case and
can be extended in the future, it is still useful.  For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:

        stw r6, lo16(l5_end_of_array)(r2)
        addi r2, r5, -4
        stwx r5, r4, r2
-       lwzx r5, r4, r2
-       rlwinm r5, r5, 0, 0, 30
        stwx r5, r4, r2
        lwz r2, -4(r4)
        ori r2, r2, 1

llvm-svn: 23690
2005-10-10 22:04:48 +00:00
Nate Begeman 6828ed9bfd Teach the DAGCombiner several new tricks, teaching it how to turn
sext_inreg into zext_inreg based on the signbit (fires a lot), srem into
urem, etc.

llvm-svn: 23688
2005-10-10 21:26:48 +00:00
Chris Lattner 7730924067 Fix comment
llvm-svn: 23686
2005-10-10 16:52:03 +00:00
Chris Lattner 3d1d4a3d12 Add ISD::ADD to MaskedValueIsZero
llvm-svn: 23685
2005-10-10 16:51:40 +00:00
Chris Lattner 6a49b7cabb add a todo for something I noticed
llvm-svn: 23679
2005-10-09 22:59:08 +00:00
Chris Lattner 1d3dc00674 (X & Y) & C == 0 if either X&C or Y&C are zero
llvm-svn: 23678
2005-10-09 22:12:36 +00:00
Nate Begeman 2042aa5b92 Lo and behold, the last bits of SelectionDAG.cpp have been moved over.
llvm-svn: 23665
2005-10-08 00:29:44 +00:00
Chris Lattner fb12624a3f implement CodeGen/PowerPC/div-2.ll:test2-4 by propagating zero bits through
C-X's

llvm-svn: 23662
2005-10-07 15:30:32 +00:00
Chris Lattner 5bcd0dd811 Turn sdivs into udivs when we can prove the sign bits are clear. This
implements CodeGen/PowerPC/div-2.ll

llvm-svn: 23659
2005-10-07 06:10:46 +00:00
Nate Begeman bd7df030d2 Check in some more DAGCombiner pieces
llvm-svn: 23639
2005-10-05 21:43:42 +00:00
Chris Lattner a49e16fefa implement visitBR_CC so that PowerPC/inverted-bool-compares.ll passes
with the dag combiner.  This speeds up espresso by 8%, reaching performance
parity with the dag-combiner-disabled llc.

llvm-svn: 23636
2005-10-05 06:47:48 +00:00
Chris Lattner 06f1d0f73a Add a new HandleNode class, which is used to handle (haha) cases in the
dead node elim and dag combiner passes where the root is potentially updated.
This fixes a fixme in the dag combiner.

llvm-svn: 23634
2005-10-05 06:35:28 +00:00
Chris Lattner a6895d180e Implement the code for PowerPC/inverted-bool-compares.ll, even though it
that testcase still does not pass with the dag combiner.  This is because
not all forms of br* are folded yet.

Also, when we combine a node into another one, delete the node immediately
instead of waiting for the node to potentially come up in the future.

llvm-svn: 23632
2005-10-05 06:11:08 +00:00
Chris Lattner 746d50a01a Fix a crash compiling Olden/tsp
llvm-svn: 23630
2005-10-05 04:45:43 +00:00
Chris Lattner 6f3b577ee6 Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Nate Begeman c760f80fed Stub out the rest of the DAG Combiner. Just need to fill in the
select_cc bits and then wrap it in a convenience function for  use with
regular select.

llvm-svn: 23389
2005-09-19 22:34:01 +00:00
Nate Begeman 24a7eca282 More DAG combining. Still need the branch instructions, and select_cc
llvm-svn: 23371
2005-09-16 00:54:12 +00:00
Chris Lattner bd39c1a4c6 Add a missing #include, patch courtesy of Baptiste Lepilleur.
llvm-svn: 23302
2005-09-09 23:53:39 +00:00
Nate Begeman 049b748c76 Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such
as setcc and select next.

llvm-svn: 23295
2005-09-09 19:49:52 +00:00
Nate Begeman 85c1cc4523 Move yet more folds over to the dag combiner from sd.cpp
llvm-svn: 23278
2005-09-08 20:18:10 +00:00
Nate Begeman 2cc2c9a79c Another round of dag combiner changes. This fixes some missing XOR folds
as well as fixing how we replace old values with new values.

llvm-svn: 23260
2005-09-07 23:25:52 +00:00
Nate Begeman 6791d63e55 Implement a common missing fold, (add (add x, c1), c2) -> (add x, c1+c2).
This restores all of stanford to being identical with and without the dag
combiner with the add folding turned off in sd.cpp.

llvm-svn: 23258
2005-09-07 16:09:19 +00:00
Nate Begeman 007c650699 Add an option to the DAG Combiner to enable it for beta runs, and turn on
that option for PowerPC's beta.

llvm-svn: 23253
2005-09-07 00:15:36 +00:00
Nate Begeman d23739d020 Next round of DAGCombiner changes. This version now passes all the tests
I have run so far when run before Legalize.  It still needs to pick up the
SetCC folds, and nodes that use SetCC.

llvm-svn: 23243
2005-09-06 04:43:02 +00:00
Nate Begeman 7cea6ef16e Next round of DAG Combiner changes. Just need to support multiple return
values, and then we should be able to hook it up.

llvm-svn: 23231
2005-09-02 21:18:40 +00:00
Nate Begeman 2504fe2613 Implement first round of feedback from chris (there's still a couple things
left to do).

llvm-svn: 23195
2005-09-01 23:24:04 +00:00
Nate Begeman e8f78d1aab Add the rest of the currently implemented visit routines to the switch
statement in visit().

llvm-svn: 23185
2005-09-01 00:33:32 +00:00
Nate Begeman 21158fc485 First pass at the DAG Combiner. It isn't used anywhere yet, but it should
be mostly functional.  It currently has all folds from SelectionDAG.cpp
that do not involve a condition code.

llvm-svn: 23184
2005-09-01 00:19:25 +00:00