Commit Graph

7706 Commits

Author SHA1 Message Date
Craig Topper d9a6cc031d Revert r191940 to see if it fixes the build bots.
llvm-svn: 191941
2013-10-04 05:52:17 +00:00
Craig Topper a2efe9ebc6 Add OPC_CheckChildSame0-3 to the DAG isel matcher. This replaces sequences of MoveChild, CheckSame, MoveParent. Saves 846 bytes from the X86 DAG isel matcher, ~300 from ARM, ~840 from Hexagon.
llvm-svn: 191940
2013-10-04 05:22:20 +00:00
Jin-Gu Kang 0bf8241d4b Added checking code whehter target supports specific dag combining about rotate
or not. The corresponding dag patterns are as following:

"DAGCombier::MatchRotate" function in DAGCombiner.cpp
Pattern1
// fold (or (shl (*ext x), (*ext y)),
//          (srl (*ext x), (*ext (sub 32, y)))) ->
//   (*ext (rotl x, y))
// fold (or (shl (*ext x), (*ext y)),
//          (srl (*ext x), (*ext (sub 32, y)))) ->
//   (*ext (rotr x, (sub 32, y)))

pattern2
// fold (or (shl (*ext x), (*ext (sub 32, y))),
//          (srl (*ext x), (*ext y))) ->
//   (*ext (rotl x, y))
// fold (or (shl (*ext x), (*ext (sub 32, y))),
//          (srl (*ext x), (*ext y))) ->
//   (*ext (rotr x, (sub 32, y)))

llvm-svn: 191905
2013-10-03 15:58:48 +00:00
Rafael Espindola 44fee4e0eb Remove several unused variables.
Patch by Alp Toker.

llvm-svn: 191757
2013-10-01 13:32:03 +00:00
Tom Stellard 6aada32dc4 SelectionDAG: Clarify comments from r191600
llvm-svn: 191724
2013-10-01 02:09:00 +00:00
Benjamin Kramer c3c807b3bf Allocate AtomicSDNode operands in SelectionDAG's allocator to stop leakage.
SDNode destructors are never called. As an optimization use AtomicSDNode's
internal storage if we have a small number of operands.

llvm-svn: 191636
2013-09-29 11:18:56 +00:00
Robert Wilhelm f0cfb83bb4 Fix spelling intruction -> instruction.
llvm-svn: 191610
2013-09-28 11:46:15 +00:00
Tom Stellard 45015d9796 SelectionDAG: Silence unused variable warning on release builds
llvm-svn: 191604
2013-09-28 03:10:17 +00:00
Tom Stellard 5694d3090a SelectionDAG: Improve legalization of SELECT_CC with illegal condition codes
SelectionDAG will now attempt to inverse an illegal conditon in order to
find a legal one and if that doesn't work, it will attempt to swap the
operands using the inverted condition.

There are no new test cases for this, but a nubmer of the existing R600
tests hit this path.

llvm-svn: 191602
2013-09-28 02:50:43 +00:00
Tom Stellard cd42818d86 SelectionDAG: Try to expand all condition codes using getCCSwappedOperands()
This is useful for targets like R600, which only support GT, GE, NE, and EQ
condition codes as it removes the need to handle unsupported condition
codes in target specific code.

There are no tests with this commit, but R600 has been updated to take
advantage of this new feature, so its existing selectcc tests are now
testing the swapped operands path.

llvm-svn: 191601
2013-09-28 02:50:38 +00:00
Tom Stellard 08690a146f SelectionDAG: Clean up LegalizeSetCCCondCode() function
Interpreting the results of this function is not very intuitive, so I
cleaned it up to make it more clear whether or not a SETCC op was
legalized and how it was legalized (either by swapping LHS and RHS or
replacing with AND/OR).

This patch does change functionality in the LHS and RHS swapping case,
but unfortunately there are no in-tree tests for this.  However, this
patch is a prerequisite for R600 to take advantage of the LHS and RHS
swapping, so tests will be added in subsequent commits.

llvm-svn: 191600
2013-09-28 02:50:32 +00:00
Andrea Di Biagio 56ce9c4e78 Re-apply the change from r191393 with fix for pr17380.
This change fixes the problem reported in pr17380 and re-add the dagcombine 
transformation ensuring that the value types are always legal if the 
transformation is triggered after Legalization took place.

Added the test case from pr17380.

llvm-svn: 191509
2013-09-27 11:37:05 +00:00
Andrea Di Biagio 549d6605a0 Revert r191393 since it caused pr17380.
llvm-svn: 191438
2013-09-26 16:54:01 +00:00
Amara Emerson b4ad2f396a [ARM] Use the load-acquire/store-release instructions optimally in AArch32.
Patch by Artyom Skrobov.

llvm-svn: 191428
2013-09-26 12:22:36 +00:00
Andrew Trick 71e8bb6d1d Added temp flag -misched-bench for staging in default changes.
llvm-svn: 191423
2013-09-26 05:53:35 +00:00
Andrew Trick 6f5aad7a24 whitespace
llvm-svn: 191422
2013-09-26 05:53:31 +00:00
Andrea Di Biagio 9f3313109f Teach DAGCombiner how to canonicalize dags according to the rule
(shl (zext (shr A, X)), X) => (zext (shl (shr A, X), X)).

The rule only triggers when there are no other uses of the
zext to avoid materializing more instructions.

This helps the DAGCombiner understand that the shl/shr
sequence can then be converted into an and instruction.

llvm-svn: 191393
2013-09-25 19:01:01 +00:00
Eli Friedman a961d694e2 Add missing check to SETCC optimization.
PR17338.

llvm-svn: 191337
2013-09-24 22:50:14 +00:00
Benjamin Kramer 64bdb29a83 DAGCombiner: Unify rotate matching for extended and unextended amounts.
No functionality change, lots of indentation changes.

llvm-svn: 191303
2013-09-24 14:21:28 +00:00
Jiangning Liu 63dc840fc5 Initial support for Neon scalar instructions.
Patch by Ana Pazos.

1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.

llvm-svn: 191263
2013-09-24 02:47:27 +00:00
Michael Gottesman 5e3600c1ce [stackprotector] Allow for copies from vreg -> vreg to be in a terminator sequence.
Sometimes a copy from a vreg -> vreg sneaks into the middle of a terminator
sequence. It is safe to slice this into the stack protector success bb.

This fixes PR16979.

llvm-svn: 191260
2013-09-24 01:50:26 +00:00
Kay Tiong Khoo 9195a5b081 fix typo: than -> then
llvm-svn: 191214
2013-09-23 18:43:51 +00:00
Tim Northover 31d093c705 ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.

Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.

This should fix PR15840.

llvm-svn: 191165
2013-09-22 08:21:56 +00:00
Juergen Ributzka f043a65327 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
This reverts commit r191130.

llvm-svn: 191138
2013-09-21 15:09:46 +00:00
Juergen Ributzka e9a80fc912 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask has usually
te same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

llvm-svn: 191130
2013-09-21 04:55:18 +00:00
David Blaikie 9d117ab7ef Add braces to suppress Clang's dangling-else warning.
These violations were introduced in r191049

llvm-svn: 191059
2013-09-20 00:33:11 +00:00
Kai Nacke d09bb4614b PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner in the way that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in this cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191049
2013-09-19 23:00:28 +00:00
Kai Nacke 2d967b2751 Revert PR16726: extend rol/ror matching
There is a buildbot failure. Need to investigate this.

llvm-svn: 191048
2013-09-19 22:53:36 +00:00
Kai Nacke 4eaf6444fa PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner in the way that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in this cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191045
2013-09-19 22:36:39 +00:00
Benjamin Kramer d443e4a080 DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width.
PR17283.

llvm-svn: 191000
2013-09-19 13:28:20 +00:00
Adrian Prantl 262bcf4584 Debug info: Get rid of the VLA indirection hack in FastISel.
Use the DIVariable::isIndirect() flag set by the frontend instead of
guessing whether to set the machine location's indirection bit.
Paired commit with CFE.

llvm-svn: 190961
2013-09-18 22:08:59 +00:00
Serge Pavlov 8ec39992c1 Added documentation to getMemsetStores.
llvm-svn: 190866
2013-09-17 16:24:42 +00:00
Quentin Colombet d30a9585b8 [SelectionDAG] Teach the vector scalarizer about TRUNCATE.
When a truncate node defines a legal vector type but uses an illegal
vector type, the legalization process was splitting the vector until
<1 x vector> type, but then it was failing to scalarize the node because
it did not know how to handle TRUNCATE.

<rdar://problem/14989896>

llvm-svn: 190830
2013-09-17 00:26:56 +00:00
Adrian Prantl db3e26d193 Debug info: Fix PR16736 and rdar://problem/14990587.
A DBG_VALUE is register-indirect iff the first operand is a register
_and_ the second operand is an immediate.

llvm-svn: 190821
2013-09-16 23:29:03 +00:00
Hal Finkel 31658834e6 Prevent assert in CombinerGlobalAA with null values
DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we
can't use AA in this case (if we try, then the casting code in AA will assert).

llvm-svn: 190763
2013-09-15 02:19:49 +00:00
Matt Arsenault bc08ddba58 Remove pointless assertion after r190376
llvm-svn: 190565
2013-09-12 01:07:49 +00:00
Benjamin Kramer 079b96e6f7 Revert "Give internal classes hidden visibility."
It works with clang, but GCC has different rules so we can't make all of those
hidden. This reverts commit r190534.

llvm-svn: 190536
2013-09-11 18:05:11 +00:00
Benjamin Kramer 6a44af3629 Give internal classes hidden visibility.
Worth 100k on a linux/x86_64 Release+Asserts clang.

llvm-svn: 190534
2013-09-11 17:42:27 +00:00
Eli Friedman 8f06d55697 Rename variables for consistency.
No functional change.

llvm-svn: 190466
2013-09-11 00:41:02 +00:00
Eli Friedman 78bffa5767 Fix unused variables.
llvm-svn: 190448
2013-09-10 23:18:14 +00:00
Matt Arsenault d232222f34 Don't use getSetCCResultType for creating a vselect
The vselect mask isn't a setcc.

This breaks in the case when the result of getSetCCResultType
is larger than the vector operands

e.g. %tmp = select i1 %cmp <2 x i8> %a, <2 x i8> %b
when getSetCCResultType returns <2 x i32>, the assertion
that the (MaskTy.getSizeInBits() == Op1.getValueType().getSizeInBits())
is hit.

No test since I don't think I can hit this with any of the current
targets. The R600/SI implementation would break, since it returns a
vector of i1 for this, but it doesn't reach ExpandSELECT for other
reasons.

llvm-svn: 190376
2013-09-10 00:41:56 +00:00
Jack Carter 170a5f2983 white spaces and long lines
llvm-svn: 190358
2013-09-09 22:02:08 +00:00
Bob Wilson e407736a06 Revert patches to add case-range support for PR1255.
The work on this project was left in an unfinished and inconsistent state.
Hopefully someone will eventually get a chance to implement this feature, but
in the meantime, it is better to put things back the way the were.  I have
left support in the bitcode reader to handle the case-range bitcode format,
so that we do not lose bitcode compatibility with the llvm 3.3 release.

This reverts the following commits: 155464, 156374, 156377, 156613, 156704,
156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575,
157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884,
157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100,
159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659,
159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736

llvm-svn: 190328
2013-09-09 19:14:35 +00:00
Tim Northover 950fcc0577 SelectionDAG: create correct BooleanContent constants
Occasionally DAGCombiner can spot that a SETCC operation is completely
redundant and reduce it to "all true" or "all false". If this happens to a
vector, the value produced has to take account of what a normal comparison
would have produced, which may be an all-1s bitmask.

The fix in SelectionDAG.cpp is tested, however, as far as I can see the code in
TargetLowering.cpp is possibly unreachable and almost certainly irrelevant when
triggered so there are no tests. However, I believe it's still clearly the
right change and may save someone else some hassle if it suddenly becomes
reachable. So I'm doing it anyway.

llvm-svn: 190147
2013-09-06 12:38:12 +00:00
Hal Finkel 5ef4dccdce Use TargetSubtargetInfo::useAA() in DAGCombine
This uses the TargetSubtargetInfo::useAA() function to control the defaults of
the -combiner-alias-analysis and -combiner-global-alias-analysis options.

llvm-svn: 189564
2013-08-29 03:29:55 +00:00
Juergen Ributzka 11c52c601a Fix a typo and coding style of a previous commit. No functional change.
llvm-svn: 189526
2013-08-28 22:33:58 +00:00
Tim Northover 819bfb5a25 DAGCombiner: make sure or/shl/srl really has zero high bits before forming bswap
We want to convert code like (or (srl N, 8), (shl N, 8)) into (srl (bswap N),
const), but this is only valid if the bits above 16 on the source pattern are
0, the checks we were doing on this were slightly wrong before.

llvm-svn: 189348
2013-08-27 13:46:45 +00:00
Owen Anderson a0260f848d Remove an over-zealous assertion. A pointer type could be illegal if the target is prepared to custom-legalize pointer operands. This assertion was evaluated before the target would have a chance to do so, making it impossible.
llvm-svn: 189299
2013-08-27 00:28:23 +00:00
Tom Stellard 838e2344ec SelectionDAG: Remove unnecessary uses of TargetLowering::getPointerTy()
If we have a binary operation like ISD:ADD, we can set the result type
equal to the result type of one of its operands rather than using
TargetLowering::getPointerTy().

Also, any use of DAG.getIntPtrConstant(C) as an operand for a binary
operation can be replaced with:
DAG.getConstant(C, OtherOperand.getValueType());

llvm-svn: 189227
2013-08-26 15:06:10 +00:00
Tom Stellard 7da047c9fb SelectionDAG: Use correct pointer size when splitting vector stores
llvm-svn: 189224
2013-08-26 15:05:55 +00:00
Tom Stellard fd155828ed SelectionDAG: Use correct pointer size when lowering function arguments v2
This adds minimal support to the SelectionDAG for handling address spaces
with different pointer sizes.  The SelectionDAG should now correctly
lower pointer function arguments to the correct size as well as generate
the correct code when lowering getelementptr.

This patch also updates the R600 DataLayout to use 32-bit pointers for
the local address space.

v2:
  - Add more helper functions to TargetLoweringBase
  - Use CHECK-LABEL for tests

llvm-svn: 189221
2013-08-26 15:05:36 +00:00
Benjamin Kramer b12cf01908 Add a function object to compare the first or second component of a std::pair.
Replace instances of this scattered around the code base.

llvm-svn: 189169
2013-08-24 12:54:27 +00:00
Michael Gottesman 20f25eb958 [stack protector] Work around an issue with the BMOVPCB_CALL instruction on ARM by disabling does not return on __stack_chk_fail.
This is to fix the bots while I look to see if there is something I can do here.

rdar://14811848

llvm-svn: 189076
2013-08-22 23:45:24 +00:00
Michael Gottesman 1adac3582d [stackprotector] When finding the split point to splice off the end of a parentmbb into a successmbb, include any DBG_VALUE MI.
Fix for PR16954.

llvm-svn: 188987
2013-08-22 05:40:50 +00:00
Tom Stellard 1b2c2d8414 SelectionDAG: Make sure stores are always added to the LegalizedNodes list
When truncated vector stores were being custom lowered in
VectorLegalizer::LegalizeOp(), the old (illegal) and new (legal) node pair
was not being added to LegalizedNodes list.  Instead of the legalized
result being passed to VectorLegalizer::TranslateLegalizeResult(),
the result was being passed back into VectorLegalizer::LegalizeOp(),
which ended up adding a (new, new) pair to the list instead.

This was causing an assertion failure when a custom lowered truncated
vector store was the last instruction a basic block and the VectorLegalizer
was unable to find it in the LegalizedNodes list when updating the
DAG root.

llvm-svn: 188953
2013-08-21 22:42:58 +00:00
Juergen Ributzka 3db39dc1ae Teach BaseIndexOffset::match to identify base pointers in loops.
The small utility function that pattern matches Base + Index +
Offset patterns for loads and stores fails to recognize the base
pointer for loads/stores from/into an array at offset 0 inside a
loop. As a result DAGCombiner::MergeConsecutiveStores was not able
to merge all stores.

This commit fixes the issue by adding an additional pattern match
and also a test case.

Reviewer: Nadav
llvm-svn: 188936
2013-08-21 21:53:38 +00:00
Richard Sandiford 6f6d55161b [SystemZ] Use SRST to optimize memchr
SystemZTargetLowering::emitStringWrapper() previously loaded the character
into R0 before the loop and made R0 live on entry.  I'd forgotten that
allocatable registers weren't allowed to be live across blocks at this stage,
and it confused LiveVariables enough to cause a miscompilation of f3 in
memchr-02.ll.

This patch instead loads R0 in the loop and leaves LICM to hoist it
after RA.  This is actually what I'd tried originally, but I went for
the manual optimisation after noticing that R0 often wasn't being hoisted.
This bug forced me to go back and look at why, now fixed as r188774.

We should also try to optimize null checks so that they test the CC result
of the SRST directly.  The select between null and the SRST GPR result could
then usually be deleted as dead.

llvm-svn: 188779
2013-08-20 09:38:48 +00:00
Michael Gottesman f7e1203d95 Remove unused variables that crept in.
llvm-svn: 188761
2013-08-20 07:17:27 +00:00
Michael Gottesman b27f0f1f6b Teach selectiondag how to handle the stackprotectorcheck intrinsic.
Previously, generation of stack protectors was done exclusively in the
pre-SelectionDAG Codegen LLVM IR Pass "Stack Protector". This necessitated
splitting basic blocks at the IR level to create the success/failure basic
blocks in the tail of the basic block in question. As a result of this,
calls that would have qualified for the sibling call optimization were no
longer eligible for optimization since said calls were no longer right in
the "tail position" (i.e. the immediate predecessor of a ReturnInst
instruction).

Then it was noticed that since the sibling call optimization causes the
callee to reuse the caller's stack, if we could delay the generation of
the stack protector check until later in CodeGen after the sibling call
decision was made, we get both the tail call optimization and the stack
protector check!

A few goals in solving this problem were:

  1. Preserve the architecture independence of stack protector generation.

  2. Preserve the normal IR level stack protector check for platforms like
     OpenBSD for which we support platform specific stack protector
     generation.

The main problem that guided the present solution is that one can not
solve this problem in an architecture independent manner at the IR level
only. This is because:

  1. The decision on whether or not to perform a sibling call on certain
     platforms (for instance i386) requires lower level information
     related to available registers that can not be known at the IR level.

  2. Even if the previous point were not true, the decision on whether to
     perform a tail call is done in LowerCallTo in SelectionDAG which
     occurs after the Stack Protector Pass. As a result, one would need to
     put the relevant callinst into the stack protector check success
     basic block (where the return inst is placed) and then move it back
     later at SelectionDAG/MI time before the stack protector check if the
     tail call optimization failed. The MI level option was nixed
     immediately since it would require platform specific pattern
     matching. The SelectionDAG level option was nixed because
     SelectionDAG only processes one IR level basic block at a time
     implying one could not create a DAG Combine to move the callinst.

To get around this problem a few things were realized:

  1. While one can not handle multiple IR level basic blocks at the
     SelectionDAG Level, one can generate multiple machine basic blocks
     for one IR level basic block. This is how we handle bit tests and
     switches.

  2. At the MI level, tail calls are represented via a special return
     MIInst called "tcreturn". Thus if we know the basic block in which we
     wish to insert the stack protector check, we get the correct behavior
     by always inserting the stack protector check right before the return
     statement. This is a "magical transformation" since no matter where
     the stack protector check intrinsic is, we always insert the stack
     protector check code at the end of the BB.

Given the aforementioned constraints, the following solution was devised:

  1. On platforms that do not support SelectionDAG stack protector check
     generation, allow for the normal IR level stack protector check
     generation to continue.

  2. On platforms that do support SelectionDAG stack protector check
     generation:

    a. Use the IR level stack protector pass to decide if a stack
       protector is required/which BB we insert the stack protector check
       in by reusing the logic already therein. If we wish to generate a
       stack protector check in a basic block, we place a special IR
       intrinsic called llvm.stackprotectorcheck right before the BB's
       returninst or if there is a callinst that could potentially be
       sibling call optimized, before the call inst.

    b. Then when a BB with said intrinsic is processed, we codegen the BB
       normally via SelectBasicBlock. In said process, when we visit the
       stack protector check, we do not actually emit anything into the
       BB. Instead, we just initialize the stack protector descriptor
       class (which involves stashing information/creating the success
       mbbb and the failure mbb if we have not created one for this
       function yet) and export the guard variable that we are going to
       compare.

    c. After we finish selecting the basic block, in FinishBasicBlock if
       the StackProtectorDescriptor attached to the SelectionDAGBuilder is
       initialized, we first find a splice point in the parent basic block
       before the terminator and then splice the terminator of said basic
       block into the success basic block. Then we code-gen a new tail for
       the parent basic block consisting of the two loads, the comparison,
       and finally two branches to the success/failure basic blocks. We
       conclude by code-gening the failure basic block if we have not
       code-gened it already (all stack protector checks we generate in
       the same function, use the same failure basic block).

llvm-svn: 188755
2013-08-20 07:00:16 +00:00
Hal Finkel 0c5c01aa4a Add a llvm.copysign intrinsic
This adds a llvm.copysign intrinsic; We already have Libfunc recognition for
copysign (which is turned into the FCOPYSIGN SDAG node). In order to
autovectorize calls to copysign in the loop vectorizer, we need a corresponding
intrinsic as well.

In addition to the expected changes to the language reference, the loop
vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into
an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a
few lists in LegalizeVector{Ops,Types} so that vector copysigns can be
expanded.

In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN
be Expand for vector types. This seems correct for all in-tree targets, and I
think is the right thing to do because, previously, there was no way to generate
vector-values FCOPYSIGN nodes (and most targets don't specify an action for
vector-typed FCOPYSIGN).

llvm-svn: 188728
2013-08-19 23:35:46 +00:00
Paul Redmond 62f840f46a Improve the widening of integral binary vector operations
- split WidenVecRes_Binary into WidenVecRes_Binary and WidenVecRes_BinaryCanTrap
  - WidenVecRes_BinaryCanTrap preserves the original behaviour for operations
    that can trap
  - WidenVecRes_Binary simply widens the operation and improves codegen for
    3-element vectors by allowing widening and promotion on x86 (matches the
    behaviour of unary and ternary operation widening)
- use WidenVecRes_Binary for operations on integers.

Reviewed by: nrotem

llvm-svn: 188699
2013-08-19 20:01:35 +00:00
Hal Finkel e4eb78188c Add ExpandFloatOp_FCOPYSIGN to handle ppcf128-related expansions
We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node
because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the
sum of two doubles, and the first double must have the larger magnitude, we
can take the sign from the first double. As a result, in addition to fixing the
crash, this is also an optimization.

llvm-svn: 188655
2013-08-19 06:55:37 +00:00
Jim Grosbach 06c2a68125 ARM: Fix more fast-isel verifier failures.
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.

rdar://12594152

llvm-svn: 188593
2013-08-16 23:37:31 +00:00
Richard Sandiford 0dec06a28c [SystemZ] Use SRST to implement strlen and strnlen
It would also make sense to use it for memchr; I'm working on that now.

llvm-svn: 188547
2013-08-16 11:41:43 +00:00
Richard Sandiford bb83a50f57 [SystemZ] Use MVST to implement strcpy and stpcpy
llvm-svn: 188546
2013-08-16 11:29:37 +00:00
Richard Sandiford ca23271010 [SystemZ] Use CLST to implement strcmp
llvm-svn: 188544
2013-08-16 11:21:54 +00:00
Richard Sandiford e3827751e2 [SystemZ] Fix handling of 64-bit memcmp results
Generalize r188163 to cope with return types other than MVT::i32, just
as the existing visitMemCmpCall code did.  I've split this out into a
subroutine so that it can be used for other upcoming patches.

I also noticed that I'd used the wrong API to record the out chain.
It's a load that uses DAG.getRoot() rather than getRoot(), so the out
chain should go on PendingLoads.  I don't have a testcase for that because
we don't do any interesting scheduling on z yet.

llvm-svn: 188540
2013-08-16 10:55:47 +00:00
Craig Topper d9c2783d8f Replace getValueType().getSimpleVT() with getSimpleValueType().
llvm-svn: 188442
2013-08-15 02:44:19 +00:00
Jim Grosbach 327ccc787e DAG: Combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2)
A common idiom is to use zero and all-ones as sentinal values and to
check for both in a single conditional ("x != 0 && x != (unsigned)-1").
That generates code, for i32, like:
  testl %edi, %edi
  setne %al
  cmpl  $-1, %edi
  setne %cl
  andb  %al, %cl

With this transform, we generate the simpler:
  incl  %edi
  cmpl  $1, %edi
  seta  %al

Similar improvements for other integer sizes and on other platforms. In
general, combining the two setcc instructions into one is better.

rdar://14689217

llvm-svn: 188315
2013-08-13 21:30:58 +00:00
Michael Gottesman 7a8017290a Update makeLibCall to return both the call and the chain associated with the libcall instead of just the call. This allows us to specify libcalls that return void.
LowerCallTo returns a pair with the return value of the call as the first
element and the chain associated with the return value as the second element. If
we lower a call that has a void return value, LowerCallTo returns an SDValue
with a NULL SDNode and the chain for the call. Thus makeLibCall by just
returning the first value makes it impossible for you to set up the chain so
that the call is not eliminated as dead code.

I also updated all references to makeLibCall to reflect the new return type.

llvm-svn: 188300
2013-08-13 17:54:56 +00:00
Michael Gottesman 3923bec37b Fixed SelectionDAGBuilder.h C++ filetype declaration to use the canonical C++ instead of c++.
llvm-svn: 188203
2013-08-12 21:02:02 +00:00
Richard Sandiford 564681c88d [SystemZ] Use CLC and IPM to implement memcmp
For now this is restricted to fixed-length comparisons with a length
in the range [1, 256], as for memcpy() and MVC.

llvm-svn: 188163
2013-08-12 10:28:10 +00:00
Craig Topper 0ecb26a79e Change asserts at the top of getVectorShuffle to check that LHS and RHS have the same type as the result.
Previously the asserts were only checking that RHS and LHS were the same type and had the same element type as the result. All downstream code for ISD::VECTOR_SHUFFLE requires the types to be the same.

Also removed one unnecessary check of matched element counts that was present in the code.

llvm-svn: 188051
2013-08-09 04:37:24 +00:00
Craig Topper 9a39b07a60 Remove AllUndef check from one of the loops in getVectorShuffle. It was already handled by the 'AllLHS && AllRHS' check after the previous loop.
llvm-svn: 187965
2013-08-08 08:03:12 +00:00
Craig Topper 309dfefb6f Optimize mask generation for one of the DAG combiner shufflevector cases.
llvm-svn: 187961
2013-08-08 07:38:55 +00:00
Hal Finkel 171817ee8a Add ISD::FROUND for libm round()
All libm floating-point rounding functions, except for round(), had their own
ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm
adding ISD::FROUND so that round() can be custom lowered as well.

For the most part, this is straightforward. I've added an intrinsic
and a matching ISD node just like those for nearbyint() and friends. The
SelectionDAG pattern I've named frnd (because ISD::FP_ROUND has already claimed
fround).

This will be used by the PowerPC backend in a follow-up commit.

llvm-svn: 187926
2013-08-07 22:49:12 +00:00
Tom Stellard d42c594960 TargetLowering: Add getVectorIdxTy() function v2
This virtual function can be implemented by targets to specify the type
to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT,
INSERT_SUBVECTOR, EXTRACT_SUBVECTOR.  The default implementation returns
the result from TargetLowering::getPointerTy()

The previous code was using TargetLowering::getPointerTy() for vector
indices, because this is guaranteed to be legal on all targets.  However,
using TargetLowering::getPointerTy() can be a problem for targets with
pointer sizes that differ across address spaces.  On such targets,
when vectors need to be loaded or stored to an address space other than the
default 'zero' address space (which is the address space assumed by
TargetLowering::getPointerTy()), having an index that
is a different size than the pointer can lead to inefficient
pointer calculations, (e.g. 64-bit adds for a 32-bit address space).

There is no intended functionality change with this patch.

llvm-svn: 187748
2013-08-05 22:22:01 +00:00
Eric Christopher e6656ac870 Fix crashing on invalid inline asm with matching constraints.
For a testcase like the following:

 typedef unsigned long uint64_t;

 typedef struct {
   uint64_t lo;
   uint64_t hi;
 } blob128_t;

 void add_128_to_128(const blob128_t *in, blob128_t *res) {
   asm ("PAND %1, %0" : "+Q"(*res) : "Q"(*in));
 }

where we'll fail to allocate the register for the output constraint,
our matching input constraint will not find a register to match,
and could try to search past the end of the current operands array.

On the idea that we'd like to attempt to keep compilation going
to find more errors in the module, change the error cases when
we're visiting inline asm IR to return immediately and avoid
trying to create a node in the DAG. This leaves us with only
a single error message per inline asm instruction, but allows us
to safely keep going in the general case.

llvm-svn: 187470
2013-07-31 01:26:24 +00:00
Eric Christopher 029af15086 Reflow this to be easier to read.
llvm-svn: 187459
2013-07-30 22:50:44 +00:00
Quentin Colombet 6bf4baa408 [DAGCombiner] insert_vector_elt: Avoid building a vector twice.
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN 

The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
  vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
  optimizations.

llvm-svn: 187396
2013-07-30 00:24:09 +00:00
Nick Lewycky 0b68245ec8 Reimplement isPotentiallyReachable to make nocapture deduction much stronger.
Adds unit tests for it too.

Split BasicBlockUtils into an analysis-half and a transforms-half, and put the
analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable
into llvm::isPotentiallyReachable and move it into Analysis/CFG.

llvm-svn: 187283
2013-07-27 01:24:00 +00:00
Justin Holewinski d3f2035a3c Add a target legalize hook for SplitVectorOperand (again)
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.

This also adds a test case for NVPTX that depends on this custom
legalization.

Differential Revision: http://llvm-reviews.chandlerc.com/D1195

Attempt to fix the buildbots by making the X86 test I just added platform independent

llvm-svn: 187202
2013-07-26 13:28:29 +00:00
Rafael Espindola 1d812728cc Revert "Add a target legalize hook for SplitVectorOperand"
This reverts commit 187198. It broke the bots.

The soft float test probably needs a -triple because of name differences.
On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of
"vroundss $1, %xmm0, %xmm0, %xmm0".

llvm-svn: 187201
2013-07-26 13:18:16 +00:00
Justin Holewinski f848a24e50 Add a target legalize hook for SplitVectorOperand
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.

This also adds a test case for NVPTX that depends on this custom
legalization.

Differential Revision: http://llvm-reviews.chandlerc.com/D1195

llvm-svn: 187198
2013-07-26 12:46:39 +00:00
Tom Stellard c54731aa9d DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.

llvm-svn: 187007
2013-07-23 23:55:03 +00:00
Michael Gottesman f87a6ae65f Add -*- C++ -*- to InstrEmitter.h.
llvm-svn: 186527
2013-07-17 18:53:29 +00:00
Hal Finkel 2f5e8e3d95 Remove invalid assert in DAGTypeLegalizer::RemapValue
There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks
which, in part, says:

  // Note that these invariants may not hold momentarily when processing a node:
  // the node being processed may be put in a map before being marked Processed.

Unfortunately, this assert would be valid only if the above-mentioned invariant
held unconditionally. This was causing llc to assert when, in fact,
everything was fine.

Thanks to Richard Sandiford for investigating this issue!

Fixes PR16562.

llvm-svn: 186338
2013-07-15 18:57:05 +00:00
Craig Topper 06b3b6651e Add 'const' qualifier to some arrays.
llvm-svn: 186312
2013-07-15 08:02:13 +00:00
Craig Topper b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Craig Topper e0b711864c Pass SmallVector by const reference instead of by value.
llvm-svn: 186243
2013-07-13 07:43:40 +00:00
Stephen Lin 10947502e5 Remove trailing whitespac
llvm-svn: 186032
2013-07-10 20:47:39 +00:00
Adrian Prantl a1ffd1a450 Un-break the buildbot by tweaking the indirection flag.
Pulled in a testcase from the debuginfo-test suite.

llvm-svn: 185993
2013-07-10 01:53:37 +00:00
Adrian Prantl facc9f4e3e Document a known limitation of the status quo.
llvm-svn: 185992
2013-07-10 01:53:30 +00:00
Adrian Prantl 418d1d1ea9 Reapply an improved version of r180816/180817.
Change the informal convention of DBG_VALUE machine instructions so that
we can express a register-indirect address with an offset of 0.
The old convention was that a DBG_VALUE is a register-indirect value if
the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE
is register-indirect if the first operand is a register and the second
operand is an immediate. For plain register values the combination reg,
reg is used. MachineInstrBuilder::BuildMI knows how to build the new
DBG_VALUES.

rdar://problem/13658587

llvm-svn: 185966
2013-07-09 20:28:37 +00:00
Hal Finkel e4dd5c29f0 WidenVecRes_BUILD_VECTOR must use the first operand's type
Because integer BUILD_VECTOR operands may have a larger type than the result's
vector element type, and all operands must have the same type, when widening a
BUILD_VECTOR node by adding UNDEFs, we cannot use the vector element type, but
rather must use the type of the existing operands.

Another bug found by llvm-stress.

llvm-svn: 185960
2013-07-09 18:55:10 +00:00
Stephen Lin 73de7bf5de AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all
in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in
order to resolve the following issues with fmuladd (i.e. optional FMA)
intrinsics:

1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd
intrinsics even if the subtarget does not support FMA instructions, leading
to laughably bad code generation in some situations.

2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128,
resulting in a call to a software fp128 FMA implementation.

3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types
like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize,
etc. to types that support hardware FMAs.

The function has also been slightly renamed for consistency and to force a
merge/build conflict for any out-of-tree target implementing it. To resolve,
see comments and fixed in-tree examples.

llvm-svn: 185956
2013-07-09 18:16:56 +00:00
Hal Finkel 6c29bd9088 DAGCombine tryFoldToZero cannot create illegal types after type legalization
When folding sub x, x (and other similar constructs), where x is a vector, the
result is a vector of zeros. After type legalization, make sure that the input
zero elements have a legal type. This type may be larger than the result's
vector element type.

This was another bug found by llvm-stress.

llvm-svn: 185949
2013-07-09 17:02:45 +00:00
Stephen Lin 8e8424eb17 Style fixes: remove unnecessary braces for one-statement if blocks, no else after return, etc. No funcionality change.
llvm-svn: 185893
2013-07-09 00:44:49 +00:00
Hal Finkel 12493bb7d5 Improve the comment from r185794 (re: PromoteIntRes_BUILD_VECTOR)
In response to Duncan's review, I believe that the original comment was not as
clear as it could be. Hopefully, this is better.

llvm-svn: 185824
2013-07-08 14:40:04 +00:00
Hal Finkel 8cb9a0e1d3 Fix PromoteIntRes_BUILD_VECTOR crash with i1 vectors
This fixes a bug (found by llvm-stress) in
DAGTypeLegalizer::PromoteIntRes_BUILD_VECTOR where it assumed that the result
type would always be larger than the original operands. This is not always
true, however, with boolean vectors. For example, promoting a node of type v8i1
(where the operands will be of type i32, the type to which i1 is promoted) will
yield a node with a result vector element type of i16 (and operands of type
i32). As a result, we cannot blindly assume that we can ANY_EXTEND the operands
to the result type.

llvm-svn: 185794
2013-07-08 06:16:58 +00:00
Stephen Lin cfe7f352c7 Remove trailing whitespace from SelectionDAG/*.cpp
llvm-svn: 185780
2013-07-08 00:37:03 +00:00
Stephen Lin 6d715e8699 SelectionDAGBuilder: style fixes (add space between end parentheses and open brace)
llvm-svn: 185768
2013-07-06 21:44:25 +00:00
Benjamin Kramer c7332b2796 DAGCombiner: Don't drop extension behavior when shrinking a load when unsafe.
ReduceLoadWidth unconditionally drops extensions from loads. Limit it to the
case when all of the bits the extension would otherwise produce are dropped by
the shrink. It would be possible to shrink the load in more cases by merging
the extensions, but this isn't trivial and a very rare case. I left a TODO for
that case.

Fixes PR16551.

llvm-svn: 185755
2013-07-06 14:05:09 +00:00
Tim Northover dab4db5372 Stop putting operations after a tail call.
This prevents the emission of DAG-generated vreg definitions after a
tail call be dropping them entirely (on the grounds that nothing could
use them anyway, and they interfere with O0 CodeGen).

llvm-svn: 185754
2013-07-06 12:58:45 +00:00
Jakob Stoklund Olesen db429d9483 Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185625
2013-07-04 13:54:20 +00:00
Jakob Stoklund Olesen 6a7d68349f Typo.
llvm-svn: 185618
2013-07-04 04:53:49 +00:00
Jakob Stoklund Olesen fee2a20209 Simplify landing pad lowering.
Stop using the ISD::EXCEPTIONADDR and ISD::EHSELECTION when lowering
landing pad arguments. These nodes were previously legalized into
CopyFromReg nodes, but that never worked properly because the
CopyFromReg node weren't guaranteed to be  scheduled at the top of the
basic block.

This meant the exception pointer and selector registers could be
clobbered before being copied to a virtual register.

This patch copies the two physical registers to virtual registers at
the beginning of the basic block, and lowers the landingpad instruction
directly to two CopyFromReg nodes reading the *virtual* registers. This
is safe because virtual registers don't get clobbered.

A future patch will remove the ISD::EXCEPTIONADDR and ISD::EHSELECTION
nodes.

llvm-svn: 185617
2013-07-04 04:53:45 +00:00
Jakob Stoklund Olesen 3d8560c382 FastISel can only apend to basic blocks.
Compute the insertion point from the end of the basic block instead of
skipping labels from the front.

This caused failures in landing pads when live-in copies where inserted
before instruction selection.

llvm-svn: 185616
2013-07-04 04:32:39 +00:00
Jakob Stoklund Olesen a1f5b901a5 Revert r185595-185596 which broke buildbots.
Revert "Simplify landing pad lowering."
Revert "Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes."

llvm-svn: 185600
2013-07-04 00:26:30 +00:00
Jakob Stoklund Olesen f33ec531fa Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185596
2013-07-03 23:56:31 +00:00
Jakob Stoklund Olesen fa6a7b9b02 Simplify landing pad lowering.
Stop using the ISD::EXCEPTIONADDR and ISD::EHSELECTION when lowering
landing pad arguments. These nodes were previously legalized into
CopyFromReg nodes, but that never worked properly because the
CopyFromReg node weren't guaranteed to be  scheduled at the top of the
basic block.

This meant the exception pointer and selector registers could be
clobbered before being copied to a virtual register.

This patch copies the two physical registers to virtual registers at
the beginning of the basic block, and lowers the landingpad instruction
directly to two CopyFromReg nodes reading the *virtual* registers. This
is safe because virtual registers don't get clobbered.

A future patch will remove the ISD::EXCEPTIONADDR and ISD::EHSELECTION
nodes.

llvm-svn: 185595
2013-07-03 23:56:24 +00:00
Craig Topper af0ad9e20f Use SmallVectorImpl::const_iterator instead of SmallVector to avoid specifying the vector size.
llvm-svn: 185514
2013-07-03 05:18:47 +00:00
Craig Topper e1c1d363a5 Use SmallVectorImpl instead of SmallVector for iterators and references to avoid specifying the vector size unnecessarily.
llvm-svn: 185512
2013-07-03 05:11:49 +00:00
Tim Northover 6823900e55 DAGCombiner: fix use-counting issue when forming zextload
DAGCombiner was counting all uses of a load node  when considering whether it's
worth combining into a zextload. Really, it wants to ignore the chain and just
count real uses.

rdar://problem/13896307

llvm-svn: 185419
2013-07-02 09:58:53 +00:00
Michael Gottesman fd62bb9d3e Added c++ mode selector to head of SelectionDAGBuilder.h so editors open it in c++ mode instead of c mode.
llvm-svn: 185348
2013-07-01 16:53:41 +00:00
Lang Hames c22e39d83d Add missing case to switch statement - DAGTypeLegalizer::ExpandIntegerResult
should expand ATOMIC_CMP_SWAP nodes the same way that it does for ATOMIC_SWAP.

Since ATOMIC_LOADs on some targets (e.g. older ARM variants) get legalized to
ATOMIC_CMP_SWAPs, the missing case had been causing i64 atomic loads to crash
during isel.

<rdar://problem/14074644>

llvm-svn: 185186
2013-06-28 18:36:42 +00:00
Manman Ren 983a16c08a Debug Info: clean up usage of Verify.
No functionality change.
It should suffice to check the type of a debug info metadata, instead of
calling Verify. For cases where we know the type of a DI metadata, use
assert.

Also update testing cases to make them conform to the format of DI classes.

llvm-svn: 185135
2013-06-28 05:43:10 +00:00
Elena Demikhovsky fed077be03 Fixed a comment.
llvm-svn: 184933
2013-06-26 12:15:53 +00:00
Elena Demikhovsky 6769c50d9e Optimized integer vector multiplication operation by replacing it with shift/xor/sub when it is possible. Fixed a bug in SDIV, where the const operand is not a splat constant vector.
llvm-svn: 184931
2013-06-26 10:55:03 +00:00
Chad Rosier 295bd43adb The getRegForInlineAsmConstraint function should only accept MVT value types.
llvm-svn: 184642
2013-06-22 18:37:38 +00:00
David Blaikie 97c6c5bd98 DebugInfo: Don't lose unreferenced non-trivial by-value parameters
A FastISel optimization was causing us to emit no information for such
parameters & when they go missing we end up emitting a different
function type. By avoiding that shortcut we not only get types correct
(very important) but also location information (handy) - even if it's
only live at the start of a function & may be clobbered later.

Reviewed/discussion by Evan Cheng & Dan Gohman.

llvm-svn: 184604
2013-06-21 22:56:30 +00:00
Michael Liao 62ebfd8786 Fix PR16360
When (srl (anyextend x), c) is folded into (anyextend (srl x, c)), the
high bits are not cleared. Add 'and' to clear off them.

llvm-svn: 184575
2013-06-21 18:45:27 +00:00
Bill Wendling a3cd350249 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184360
2013-06-19 21:36:55 +00:00
Bill Wendling 0ccf31007f Don't cache the TLI object since we have access to it through TargetMachine already.
llvm-svn: 184346
2013-06-19 20:32:16 +00:00
Quentin Colombet b51a68681a During SelectionDAG building explicitly set a node to constant zero when the
value is zero.
This allows optmizations to kick in more easily.
Fix some test cases so that they remain meaningful (i.e., not completely dead
coded) when optimizations apply.

<rdar://problem/14096009> superfluous multiply by high part of zero-extended
value.

llvm-svn: 184222
2013-06-18 20:14:39 +00:00
David Blaikie 0252265be0 Debug Info: Simplify Frame Index handling in DBG_VALUE Machine Instructions
Rather than using the full power of target-specific addressing modes in
DBG_VALUEs with Frame Indicies, simply use Frame Index + Offset. This
reduces the complexity of debug info handling down to two
representations of values (reg+offset and frame index+offset) rather
than three or four.

Ideally we could ensure that frame indicies had been eliminated by the
time we reached an assembly or dwarf generation, but I haven't spent the
time to figure out where the FIs are leaking through into that & whether
there's a good place to convert them. Some FI+offset=>reg+offset
conversion is done (see PrologEpilogInserter, for example) which is
necessary for some SelectionDAG assumptions about registers, I believe,
but it might be possible to make this a more thorough conversion &
ensure there are no remaining FIs no matter how instruction selection
is performed.

llvm-svn: 184066
2013-06-16 20:34:15 +00:00
Stephen Lin 605207fe75 SelectionDAG: slightly refactor DAGCombiner::visitSELECT_CC to avoid redudant checks...
This doesn't really effect performance due to all the relevant calls being transparent but is clearer. 

llvm-svn: 184027
2013-06-15 04:03:33 +00:00
Matt Arsenault d2f0332a29 Introduce getSelect usage and use more getSelectCC
llvm-svn: 184012
2013-06-14 22:04:37 +00:00
Stephen Lin 4e69d01b67 SelectionDAG: minor fix to order of operands in comments to match the code
llvm-svn: 184008
2013-06-14 21:33:58 +00:00
Stephen Lin e31f2d2d54 SelectionDAG: Fix incorrect condition checks in some cases of folding FADD/FMUL combinations; also improve accuracy of comments
llvm-svn: 183993
2013-06-14 18:17:35 +00:00
David Majnemer 0fc8670cb0 TargetLowering: Clean up method description comments
llvm-svn: 183623
2013-06-08 23:51:45 +00:00
Bill Wendling f77190855d Cache the TargetLowering info object as a pointer.
Caching it as a pointer allows us to reset it if the TargetMachine object
changes.

llvm-svn: 183361
2013-06-06 00:43:09 +00:00
Bill Wendling 8db01cb262 Don't cache the TargetLoweringInfo object inside of the FunctionLowering object.
The TargetLoweringInfo object is owned by the TargetMachine. In the future, the
TargetMachine object may change, which may also change the TargetLoweringInfo
object.

llvm-svn: 183356
2013-06-06 00:11:39 +00:00
Andrew Trick ad6d08ac6f Order CALLSEQ_START and CALLSEQ_END nodes.
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.

Patch by Xiaoyi Guo!

This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.

llvm-svn: 182885
2013-05-29 22:03:55 +00:00
Benjamin Kramer 262b154247 Simplify code. No functionality change.
llvm-svn: 182779
2013-05-28 16:39:36 +00:00
Benjamin Kramer 351d53c225 Remove double semicolons.
llvm-svn: 182778
2013-05-28 16:31:26 +00:00
Preston Gurd 048f99de11 Convert sqrt functions into sqrt instructions when -ffast-math is in effect.
When -ffast-math is in effect (on Linux, at least), clang defines
__FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the
preprocessor to include <bits/math-finite.h>, which renames the sqrt functions.
For instance, "sqrt" is renamed as "__sqrt_finite". 

This patch adds the 3 new names in such a way that they will be treated
as equivalent to their respective original names.

llvm-svn: 182739
2013-05-27 15:44:35 +00:00
Andrew Trick c66d26adf0 Fix PR16143: Insert DEBUG_VALUE before terminator.
llvm-svn: 182717
2013-05-26 08:58:50 +00:00
Andrew Trick e2431c64bc Track IR ordering of SelectionDAG nodes 3/4.
Remove the old IR ordering mechanism and switch to new one.  Fix unit
test failures.

llvm-svn: 182704
2013-05-25 03:08:10 +00:00
Andrew Trick ef9de2a739 Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Andrew Trick 175143bf88 Track IR ordering of SelectionDAG nodes 1/4.
Use a field in the SelectionDAGNode object to track its IR ordering.
This adds fields and utility classes without changing existing
interfaces or functionality.

llvm-svn: 182701
2013-05-25 02:20:36 +00:00
Michael J. Spencer df1ecbd734 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Adrian Prantl 0d1e5592a6 Unify formatting of debug output.
llvm-svn: 182495
2013-05-22 18:02:19 +00:00
Justin Holewinski fff1f5f5e2 Drop @llvm.annotation and @llvm.ptr.annotation intrinsics during codegen.
The intrinsic calls are dropped, but the annotated value is propagated.

Fixes PR 15253

Original patch by Zeng Bin!

llvm-svn: 182387
2013-05-21 14:37:16 +00:00
Benjamin Kramer 8aaf197990 DAGCombine: Avoid an edge case where it tried to create an i0 type for (x & 0) == 0.
Fixes PR16083.

llvm-svn: 182357
2013-05-21 08:51:09 +00:00
Matt Arsenault 75865923c9 Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
Matt Arsenault 04126234e5 Replace redundant code
Use EVT::changeExtendedVectorElementTypeToInteger instead of doing the
same thing that it does

llvm-svn: 182165
2013-05-17 21:43:43 +00:00
Matt Arsenault 52ddb7bcdd Add missing -*- C++ -*- to headers
llvm-svn: 182164
2013-05-17 21:43:39 +00:00
Adrian Prantl 9c93059aa4 Generate debug info for by-value struct args even if they are not used.
radar://problem/13865940

llvm-svn: 182062
2013-05-16 23:44:12 +00:00
Benjamin Kramer fc88c3761f DAGCombine: Also shrink eq compares where the constant is exactly as large as the smaller type.
if ((x & 255) == 255)

before: movzbl  %al, %eax
        cmpl  $255, %eax

after:  cmpb  $-1, %al
llvm-svn: 182038
2013-05-16 18:47:58 +00:00
Hal Finkel 1f6a7f53d8 Fix legalization of SETCC with promoted integer intrinsics
If the input operands to SETCC are promoted, we need to make sure that we
either use the promoted form of both operands (or neither); a mixture is not
allowed. This can happen, for example, if a target has a custom promoted
i1-returning intrinsic (where i1 is not a legal type). In this case, we need to
use the promoted form of both operands.

This change only augments the behavior of the existing logic in the case where
the input types (which may or may not have already been legalized) disagree,
and should not affect existing target code because this case would otherwise
cause an assert in the SETCC operand promotion code.

This will be covered by (essentially all of the) tests for the new PPCCTRLoops
infrastructure.

llvm-svn: 181926
2013-05-15 21:37:27 +00:00
Bob Wilson c5c0823724 Remove redundant variable introduced by r181682.
llvm-svn: 181721
2013-05-13 19:02:31 +00:00
Hao Liu bc60196951 Fix PR15950 A bug in DAG Combiner about undef mask
llvm-svn: 181682
2013-05-13 02:07:05 +00:00
Benjamin Kramer a5d59333b3 DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)).
PR15948.

llvm-svn: 181597
2013-05-10 14:09:52 +00:00
Owen Anderson 32baf99b1d Teach SelectionDAG to constant fold all-constant FMA nodes the same way that it constant folds FADD, FMUL, etc.
llvm-svn: 181555
2013-05-09 22:27:13 +00:00
David Majnemer 386ab7f872 DAGCombiner: Simplify inverted bit tests
Fold (xor (and x, y), y) -> (and (not x), y)

This removes an opportunity for a constant to appear twice.

llvm-svn: 181395
2013-05-08 06:44:42 +00:00
Matt Arsenault a5733dc97e Fix vselect when getSetCCResultType returns a different type from the operands
llvm-svn: 181348
2013-05-07 20:24:18 +00:00
Michael Kuperstein ac868757d0 Fix slightly too aggressive conact_vector optimization.
(Would sometimes optimize away conacts used to extend a vector with undef values)

llvm-svn: 181186
2013-05-06 08:06:13 +00:00
Dmitri Gribenko 3238fb7595 Add ArrayRef constructor from None, and do the cleanups that this constructor enables
Patch by Robert Wilhelm.

llvm-svn: 181138
2013-05-05 00:40:33 +00:00
Chad Rosier 8e4824f350 [inline asm] Return an undef SDValue of the expected value type, rather than
report a fatal error.  This allows us to continue processing the translation
unit.  Test case to come on the clang side because we need an inline asm
diagnostics handler in place.
rdar://13446483

llvm-svn: 180873
2013-05-01 19:49:26 +00:00
Nadav Rotem e5a2dda372 Optimize away nop CONCAT_VECTOR nodes.
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.

rdar://13402653
PR15866

llvm-svn: 180871
2013-05-01 19:18:51 +00:00
Stephen Lin 699808ceb2 Only pass 'returned' to target-specific lowering code when the value of entire register is guaranteed to be preserved.
llvm-svn: 180825
2013-04-30 22:49:28 +00:00
Adrian Prantl a2888e71eb Temporarily revert "Change the informal convention of DBG_VALUE so that we can express a"
because it breaks some buildbots.

This reverts commit 180816.

llvm-svn: 180819
2013-04-30 22:35:14 +00:00
Adrian Prantl 9a576644e4 Change the informal convention of DBG_VALUE so that we can express a
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.

rdar://problem/13658587

llvm-svn: 180816
2013-04-30 22:16:46 +00:00
Silviu Baranga af7e8c367f Re-write the address propagation code for pre-indexed loads/stores to take into account some previously misssed cases (PRE_DEC addressing mode, the offset and base address are swapped, etc). This should fix PR15581.
llvm-svn: 180609
2013-04-26 15:52:24 +00:00
Benjamin Kramer d56ffc709d DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars.
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).

llvm-svn: 180597
2013-04-26 09:19:19 +00:00
Silviu Baranga 4ad2bc5963 Fix constant folding for one lane vector types. Constant folding one lane vector types not returns a vector instead of a scalar.
llvm-svn: 180254
2013-04-25 09:32:33 +00:00
Chad Rosier 108d5a61b7 [inline asm] Fix a crasher for an invalid value type/register class.
rdar://13731657

llvm-svn: 180226
2013-04-24 22:53:10 +00:00
Owen Anderson 2d4cca35c3 DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC without first checking the target's vector boolean contents.
This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents.  The included change
fixes the PowerPC tests, and was OK'd by Hal.

llvm-svn: 180129
2013-04-23 18:09:28 +00:00
Jim Grosbach 563983c8a3 Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

llvm-svn: 179989
2013-04-21 23:47:41 +00:00
Jim Grosbach d4db72db61 Tidy up comment grammar.
llvm-svn: 179986
2013-04-21 21:23:01 +00:00
Tim Northover a2b533906a Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.
llvm-svn: 179939
2013-04-20 12:32:17 +00:00
Stephen Lin b8bd232a3d Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).
llvm-svn: 179925
2013-04-20 05:14:40 +00:00
Eli Bendersky e80691dc0a Simplify the code in FastISel::tryToFoldLoad, add an assertion and fix a comment.
llvm-svn: 179908
2013-04-19 23:26:18 +00:00
Eli Bendersky 90dd3e7dfd Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'm
trying to move as much FastISel logic as possible out of the main path in
SelectionDAGISel - intermixing them just adds confusion.

llvm-svn: 179902
2013-04-19 22:29:18 +00:00
Michael Liao b53d8963ce ArrayRefize getMachineNode(). No functionality change.
llvm-svn: 179901
2013-04-19 22:22:57 +00:00
Eli Bendersky dbeefaa86a Use dbgs() consistently for -debug printouts
llvm-svn: 179894
2013-04-19 21:37:07 +00:00
Eli Bendersky 6084f45f38 Add some more stats for fast isel vs. SelectionDAG, w.r.t lowering function
arguments in entry BBs.

llvm-svn: 179824
2013-04-19 01:04:40 +00:00
Benjamin Kramer bbae991db6 DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible.
This pattern occurs in SROA output due to the way vector arguments are lowered
on ARM.

The testcase from PR15525 now compiles into this, which is better than the code
we got with the old scalarrepl:
_Store:
	ldr.w	r9, [sp]
	vmov	d17, r3, r9
	vmov	d16, r1, r2
	vst1.8	{d16, d17}, [r0]
	bx	lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647

llvm-svn: 179106
2013-04-09 17:41:43 +00:00
Eli Bendersky fc186358f2 Formatting
llvm-svn: 178771
2013-04-04 18:03:41 +00:00
Bill Schmidt 92e26646bc Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
2013-04-03 13:05:44 +00:00
Arnold Schwaighofer d6c6e868b2 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546
2013-04-02 15:58:51 +00:00
Arnold Schwaighofer 6752366ed7 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

llvm-svn: 178483
2013-04-01 18:12:58 +00:00
Benjamin Kramer 9335443236 DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

llvm-svn: 178429
2013-03-30 21:28:18 +00:00
Chad Rosier dbac025d84 [fast-isel] Add a preemptive fix for the case where we fail to materialize an
immediate in a register.  I don't believe this should ever fail, but I see no
harm in trying to make this code bullet proof.

I've added an assert to ensure my assumtion is correct.  If the assertion fires
something is wrong and we should fix it, rather then just silently fall back to
SelectionDAG isel.

llvm-svn: 178305
2013-03-28 23:04:47 +00:00
Michael Liao bb05a1d7b5 Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx)
- Handle the case where the result of 'insert_subvect' is bitcasted
  before 'extract_subvec'. This removes the redundant insertf128/extractf128
  pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.

llvm-svn: 177945
2013-03-25 23:47:35 +00:00
Shuxin Yang 93b1f12ac1 Disable some unsafe-fp-math DAG-combine transformation after legalization.
For instance, following transformation will be disabled:
    x + x + x => 3.0f * x;

The problem of these transformations is that it introduces a FP constant, which
following Instruction-Selection pass cannot handle.

Reviewed by Nadav, thanks a lot!

rdar://13445387

llvm-svn: 177933
2013-03-25 22:52:29 +00:00
Owen Anderson c81616b0a9 Remove the type legality check from the SelectionDAGBuilder when it lowers @llvm.fmuladd to ISD::FMA nodes.
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that *did* support FMA.
For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.

NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook.  They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.
llvm-svn: 177820
2013-03-23 08:26:53 +00:00
Justin Holewinski 7478f3d776 Make variable name more explicit and eliminate redundant lookup in SDNodeOrdering
llvm-svn: 177600
2013-03-20 23:10:59 +00:00
Nadav Rotem 4536d582fd When computing the demanded bits of Load SDNodes, make sure that we are looking at the loaded-value operand and not the ptr result (in case of pre-inc loads).
rdar://13348420

llvm-svn: 177596
2013-03-20 22:53:44 +00:00
Christian Konig ed34d0ef1a Revert "pre-RA-sched: fix TargetOpcode usage"
This reverts commit 06091513c283c863296f01cc7c2e86b56bb50d02.

The code is obviously wrong, but the trivial fix causes
inefficient code generation on X86. Somebody with more
knowledge of the code needs to take a look here.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 177529
2013-03-20 15:43:00 +00:00
Justin Holewinski c2d2c8939c Move SDNode order propagation to SDNodeOrdering, which also fixes a missed
case of order propagation during isel.

Thanks Owen for the suggestion!

llvm-svn: 177525
2013-03-20 14:51:01 +00:00
Christian Konig 9ce2d5b862 pre-RA-sched: fix TargetOpcode usage
TargetOpcodes need to be treaded as Machine- and not ISD-Opcodes.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 177518
2013-03-20 13:49:22 +00:00
Justin Holewinski d068943809 Propagate DAG node ordering during type legalization and instruction selection
A node's ordering is only propagated during legalization if (a) the new node does
not have an ordering (is not a CSE'd node), or (b) the new node has an ordering
that is higher than the node being legalized.

llvm-svn: 177465
2013-03-20 00:10:32 +00:00
Bill Wendling 965bd58902 Reset some of the target options which affect code generation.
This doesn't reset all of the target options within the TargetOptions
object. This is because some of those are ABI-specific and must be determined if
it's okay to change those on the fly.

llvm-svn: 176986
2013-03-13 22:26:59 +00:00
Richard Relph 61046a9727 Avoid generating ISD::SELECT for vector operands to SIGN_EXTEND
llvm-svn: 176881
2013-03-12 18:17:18 +00:00
Nick Lewycky 48beb21185 Fix a crasher newly introduced in r176659/r176649, where fast-isel tries to
lower an expect intrinsic that is a constant expression.

llvm-svn: 176830
2013-03-11 21:44:37 +00:00
Jan Wen Voung 7857a64909 Disable statistics on Release builds and move tests that depend on -stats.
Summary:
Statistics are still available in Release+Asserts (any +Asserts builds),
and stats can also be turned on with LLVM_ENABLE_STATS.

Move some of the FastISel stats that were moved under DEBUG()
back out of DEBUG(), since stats are disabled across the board now.

Many tests depend on grepping "-stats" output.  Move those into
a orig_dir/Stats/. so that they can be marked as unsupported
when building without statistics.

Differential Revision: http://llvm-reviews.chandlerc.com/D486

llvm-svn: 176733
2013-03-08 22:56:31 +00:00
Benjamin Kramer 0f12a5c2a4 Remove default from fully covered switch.
llvm-svn: 176703
2013-03-08 17:03:19 +00:00
Tom Stellard d93ef7afaa LegalizeDAG: Respect the result of TLI.getBooleanContents() when expanding SETCC
llvm-svn: 176695
2013-03-08 15:37:02 +00:00
Tom Stellard b1588fc057 DAGCombiner: Use correct value type for checking legality of BR_CC v3
LegalizeDAG.cpp uses the value of the comparison operands when checking
the legality of BR_CC, so DAGCombiner should do the same.

v2:
  - Expand more BR_CC value types for NVPTX

v3:
  - Expand correct BR_CC value types for Hexagon, Mips, and XCore.

llvm-svn: 176694
2013-03-08 15:36:57 +00:00
Bill Wendling 2d915e2c15 Revert r176154 in favor of a better approach.
Code generation makes some basic assumptions about the IR it's been given. In
particular, if there is only one 'invoke' in the function, then that invoke
won't be going away. However, with the advent of the `llvm.donothing' intrinsic,
those invokes may go away. If all of them go away, the landing pad no longer has
any users. This confuses the back-end, which asserts.

This happens with SjLj exceptions, because that's the model that modifies the IR
based on there being invokes, etc. in the function.

Remove any invokes of `llvm.donothing' during SjLj EH preparation. This will
give us a CFG that the back-end won't be confused about. If all of the invokes
in a function are removed, then the SjLj EH prepare pass won't insert the bogus
code the relies upon the invokes being there.
<rdar://problem/13228754&13316637>

llvm-svn: 176677
2013-03-08 02:21:08 +00:00
Chad Rosier 3a200e1faf [fast-isel] Seriously, add support for the expect intrinsic.
rdar://13370942

llvm-svn: 176659
2013-03-07 21:38:33 +00:00
Chad Rosier 9c1796f877 [fast-isel] Add support for the expect intrinsic.
rdar://13370942

llvm-svn: 176649
2013-03-07 20:42:17 +00:00
Benjamin Kramer fdf362bd69 ArrayRefize some code. No functionality change.
llvm-svn: 176648
2013-03-07 20:33:29 +00:00
Andrew Trick 0f23b763a9 pre-RA-sched debug-only fix
llvm-svn: 176638
2013-03-07 19:21:08 +00:00
Andrew Trick b2ab8a732c pre-RA-sched assertion fix. This bug was exposed by r176037.
rdar:13370002 [pre-RA-sched] assertion: released too many times

I tracked this down to an earlier hack that is no longer applicable
and interfered with normal scheduler logic. With the changes in
r176037, it was causing an instruction to be scheduled multiple times.

I have an external test case that I tried hard to reduce and
failed. I can't even reproduce with llc.

llvm-svn: 176636
2013-03-07 19:07:57 +00:00
Nadav Rotem 40cda80af8 No need to go through int64 and APInt when generating a new constant.
llvm-svn: 176615
2013-03-07 06:34:49 +00:00
Jim Grosbach 48a91abc10 SDAG: Handle scalarizing an extend of a <1 x iN> vector.
Just scalarize the element and rebuild a vector of the result type
from that.

rdar://13281568

llvm-svn: 176614
2013-03-07 05:47:54 +00:00
Eli Bendersky b1caf3c30e Remove duplicate line and move another closer to its actual use
llvm-svn: 176391
2013-03-01 23:32:40 +00:00
Akira Hatanaka 3d055580a9 Set properties for f128 type.
llvm-svn: 176378
2013-03-01 21:11:44 +00:00
Chad Rosier b3864609cf Generate an error message instead of asserting or segfaulting when we can't
handle indirect register inputs.
rdar://13322011

llvm-svn: 176367
2013-03-01 19:12:05 +00:00
Michael Liao 6af16fc3b7 Fix PR10475
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
  but TLI.getShiftAmountTy() so far only return scalar type. As a
  result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
  TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
  return target-specificed scalar type or the same vector type as the
  1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
  type.

llvm-svn: 176364
2013-03-01 18:40:30 +00:00
Eli Bendersky 33ebf836bc A small refactoring + adding comments.
SelectionDAGIsel::LowerArguments needs a function, not a basic block. So it
makes sense to pass it the function instead of extracting a basic-block from
the function and then tossing it. This is also more self-documenting (functions
have arguments, BBs don't).

In addition, added comments to a couple of Select* methods.

llvm-svn: 176305
2013-02-28 23:09:18 +00:00
Eli Bendersky d0c6e7b038 Put some per-instruction statistics of fast isel under NDEBUG, together with
other per-instruction statistics.

llvm-svn: 176273
2013-02-28 18:05:12 +00:00
Eric Christopher 10d35e9065 Remove unnecessary cast to void.
llvm-svn: 176222
2013-02-27 23:49:45 +00:00
Nadav Rotem c29095fb50 Silence the unused variable warning.
llvm-svn: 176218
2013-02-27 22:52:54 +00:00
Nadav Rotem 00b75dd3c4 The FastISEL should be fast. But when we record statistics we use atomic operations to increment the counters.
This patch disables the counters on non-debug builds. This reduces the runtime of SelectionDAGISel::SelectCodeCommon by ~5%.

llvm-svn: 176214
2013-02-27 21:59:43 +00:00
Michael Ilseman ba8446c80e Reverted: r176136 - Have a way for a target to opt-out of target-independent fast isel
llvm-svn: 176204
2013-02-27 19:54:00 +00:00
Manman Ren 683f59b36c SelectionDAG: If llvm.donothing has a landingpad, we should clear
CurrentCallSite to avoid an assertion failure:
assert(MMI.getCurrentCallSite() == 0 && "Overlapping call sites!");

rdar://problem/13228754

llvm-svn: 176154
2013-02-27 02:11:57 +00:00
Michael Ilseman 846c6f0a32 Have a way for a target to opt-out of target-independent fast isel
llvm-svn: 176136
2013-02-26 23:15:23 +00:00
Chad Rosier 0587597fb8 Fix wording.
llvm-svn: 176055
2013-02-25 22:20:00 +00:00
Chad Rosier a92ef4ba5b [fast-isel] Add X86FastIsel::FastLowerArguments to handle functions with 6 or
fewer scalar integer (i32 or i64) arguments. It completely eliminates the need
for SDISel for trivial functions.

Also, add the new llc -fast-isel-abort-args option, which is similar to
-fast-isel-abort option, but for formal argument lowering.

llvm-svn: 176052
2013-02-25 21:59:35 +00:00
Andrew Trick 7cf4361912 pre-RA-sched fix: only reevaluate physreg interferences when necessary.
Fixes rdar:13279013: scheduler was blowing up on select instructions.

llvm-svn: 176037
2013-02-25 19:11:48 +00:00
Matt Beaumont-Gay 0e760da5fc 'Hexadecimal' has two 'a's and only one 'i'.
llvm-svn: 176031
2013-02-25 18:11:18 +00:00
Chandler Carruth 121dbf8846 Fix spelling noticed by Duncan.
llvm-svn: 176023
2013-02-25 14:29:38 +00:00
Chandler Carruth 05920b1847 Fix the root cause of PR15348 by correctly handling alignment 0 on
memory intrinsics in the SDAG builder.

When alignment is zero, the lang ref says that *no* alignment
assumptions can be made. This is the exact opposite of the internal API
contracts of the DAG where alignment 0 indicates that the alignment can
be made to be anything desired.

There is another, more explicit alignment that is better suited for the
role of "no alignment at all": an alignment of 1. Map the intrinsic
alignment to this early so that we don't end up generating aligned DAGs.

It is really terrifying that we've never seen this before, but we
suddenly started generating a large number of alignment 0 memcpys due to
the new code to do memcpy-based copying of POD class members. That patch
contains a bug that rounds bitfield alignments down when they are the
first field. This can in turn produce zero alignments.

This fixes weird crashes I've seen in library users of LLVM on 32-bit
hosts, etc.

llvm-svn: 176022
2013-02-25 14:20:21 +00:00
Nadav Rotem b7f90bd97b SelectionDAG compile time improvement.
One of the phases of SelectionDAG is LegalizeVectors. We don't need to sort the DAG and copy nodes around if there are no vector ops.

Speeds up the compilation time of SelectionDAG on a big scalar workload by ~8%.

llvm-svn: 175929
2013-02-22 23:33:30 +00:00
Pete Cooper 047f81a5df Fix isa<> check which could never be true.
It was incorrectly checking a Function* being an IntrinsicInst* which
isn't possible.  It should always have been checking the CallInst* instead.

Added test case for x86 which ensures we only get one constant load.
It was 2 before this change.

rdar://problem/13267920

llvm-svn: 175853
2013-02-22 01:50:38 +00:00
Benjamin Kramer 3238dc0c61 DAGCombiner: Make the post-legalize vector op optimization more aggressive.
A legal BUILD_VECTOR goes in and gets constant folded into another legal
BUILD_VECTOR so we don't lose any legality here. The problematic PPC
optimization that made this check necessary was fixed recently.

llvm-svn: 175759
2013-02-21 15:24:35 +00:00
Arnold Schwaighofer 3f9568e921 DAGCombiner: Fold pointless truncate, bitcast, buildvector series
(2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y)))
can be folded into a (2xi32) (buildvector i32 a, i32 b).

Such a DAG would cause uneccessary vdup instructions followed by vmovn
instructions.

We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in
the vectorized version of the code below.

double A[N];
double B[N];

void test_double_compare_to_double() {
  int i;
  for(i=0;i<N;i++)
    A[i] = (double)(A[i] < B[i]);
}

radar://13191881

Fixes bug 15283.

llvm-svn: 175670
2013-02-20 21:33:32 +00:00
Michael Liao 7fb39669ef Fix PR15267
- When extloading from a vector with non-byte-addressable element, e.g.
  <4 x i1>, the current logic breaks. Extend the current logic to
  fix the case where the element type is not byte-addressable by loading
  all bytes, bit-extracting/packing each element.

llvm-svn: 175642
2013-02-20 18:04:21 +00:00
Benjamin Kramer 5c3e21ba55 Move the SplatByte helper to APInt and generalize it a bit.
llvm-svn: 175621
2013-02-20 13:00:06 +00:00
Jakub Staszak 8bc7af1a93 Fix #includes, so we include only what we really need.
llvm-svn: 175581
2013-02-20 00:26:25 +00:00
Chad Rosier 3489bcc9ab [ms-inline asm] Remove a redundant call to the setHasMSInlineAsm function.
llvm-svn: 175456
2013-02-18 20:13:59 +00:00
NAKAMURA Takumi 68426c79db [ms-inline asm] Fix undefined behavior to reset hasMSInlineAsm in advance of SelectAllBasicBlocks().
llvm-svn: 175422
2013-02-18 07:06:48 +00:00
Jakub Staszak 5c262f505e LegalizeDAG.cpp doesn't need DenseMap.
llvm-svn: 175365
2013-02-16 16:15:42 +00:00
Chad Rosier 925c9b499e [ms-inline asm] Do not omit the frame pointer if we have ms-inline assembly.
If the frame pointer is omitted, and any stack changes occur in the inline
assembly, e.g.: "pusha", then any C local variable or C argument references
will be incorrect.  

I pass no judgement on anyone who would do such a thing. ;)
rdar://13218191

llvm-svn: 175334
2013-02-16 01:25:28 +00:00
Bill Wendling aef9c37c65 Use the 'target-features' and 'target-cpu' attributes to reset the subtarget features.
If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-feature' attribute.

llvm-svn: 175314
2013-02-15 22:31:27 +00:00
Paul Redmond f29ddfe93f enable SDISel sincos optimization for GNU environments
- add sincos to runtime library if target triple environment is GNU
- added canCombineSinCosLibcall() which checks that sincos is in the RTL and
  if the environment is GNU then unsafe fpmath is enabled (required to
  preserve errno)
- extended sincos-opt lit test

Reviewed by: Hal Finkel

llvm-svn: 175283
2013-02-15 18:45:18 +00:00
Nadav Rotem 495b1a43c1 Dont merge consecutive loads/stores into vectors when noimplicitfloat is used.
llvm-svn: 175190
2013-02-14 18:28:52 +00:00
Owen Anderson cc068993ee Add some legality checks for SETCC before introducing it in the DAG combiner post-operand legalization.
llvm-svn: 175149
2013-02-14 09:07:33 +00:00
Guy Benyei 83c74e9fad Add static cast to unsigned char whenever a character classification function is called with a signed char argument, in order to avoid assertions in Windows Debug configuration.
llvm-svn: 175006
2013-02-12 21:21:59 +00:00
Paul Redmond 288604ed0c PR14562 - Truncation of left shift became undef
DAGCombiner::ReduceLoadWidth was converting (trunc i32 (shl i64 v, 32))
into (shl i32 v, 32) into undef. To prevent this, check the shift count
against the final result size.

Patch by: Kevin Schoedel
Reviewed by: Nadav Rotem

llvm-svn: 174972
2013-02-12 15:21:21 +00:00
Pete Cooper 10a3ae7039 Check type for legality before forming a select from loads.
Sorry for the lack of a test case.  I tried writing one for i386 as i know selects are illegal on this target, but they are actually considered legal by isel and expanded later.

I can't see any targets to trigger this, but checking for the legality of a node before forming it is general goodness.

llvm-svn: 174934
2013-02-12 03:14:50 +00:00
Evan Cheng 615620c9e8 Currently, codegen may spent some time in SDISel passes even if an entire
function is successfully handled by fast-isel. That's because function
arguments are *always* handled by SDISel. Introduce FastLowerArguments to
allow each target to provide hook to handle formal argument lowering.

As a proof-of-concept, add ARMFastIsel::FastLowerArguments to handle
functions with 4 or fewer scalar integer (i8, i16, or i32) arguments. It
completely eliminates the need for SDISel for trivial functions.

rdar://13163905

llvm-svn: 174855
2013-02-11 01:27:15 +00:00
Evan Cheng d1c6404250 Remove unnecessary code.
llvm-svn: 174854
2013-02-11 01:18:26 +00:00
Hal Finkel 2581905f81 DAGCombiner: Constant folding around pre-increment loads/stores
Previously, even when a pre-increment load or store was generated,
we often needed to keep a copy of the original base register for use
with other offsets. If all of these offsets are constants (including
the offset which was combined into the addressing mode), then this is
clearly unnecessary. This change adjusts these other offsets to use the
new incremented address.

llvm-svn: 174746
2013-02-08 21:35:47 +00:00
Bob Wilson 67bbf3aa0c Revert 172027 and 174336. Remove diagnostics about over-aligned stack objects.
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good.  We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it.  We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead.  <rdar://problem/13127907>

llvm-svn: 174741
2013-02-08 20:35:15 +00:00
Evan Cheng a72b9709d7 Tweak check to avoid integer overflow (for insanely large alignments)
llvm-svn: 174482
2013-02-06 02:06:33 +00:00
Owen Anderson de89ecf1fc Reapply r174343, with a fix for a scary DAG combine bug where it failed to differentiate between the alignment of the
base point of a load, and the overall alignment of the load.  This caused infinite loops in DAG combine with the
original application of this patch.

ORIGINAL COMMIT LOG:
When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment.  However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.

This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.

llvm-svn: 174431
2013-02-05 19:24:39 +00:00
NAKAMURA Takumi 3753b28cd2 Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load,"
It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts.

llvm-svn: 174374
2013-02-05 14:44:16 +00:00
Owen Anderson a47fdbb032 When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment.  However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.

This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.

llvm-svn: 174343
2013-02-05 06:25:30 +00:00
Benjamin Kramer 548ffa274a SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors.
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp

The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.

Fix some trivially foldable X86 tests too.

llvm-svn: 174325
2013-02-04 15:19:18 +00:00
Shuxin Yang cadd8a068e rdar://13126763
Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression
"x + x*x" into "x * 3.0".

llvm-svn: 174239
2013-02-02 00:22:03 +00:00
Nadav Rotem f04cbeb357 Fix errant fallthrough in the generation of the lifetime markers.
Found by Alexander Kornienko.

llvm-svn: 174207
2013-02-01 19:25:23 +00:00
Lang Hames dd47804394 When lowering memcpys to loads and stores, make sure we don't promote alignments
past the natural stack alignment.

llvm-svn: 174085
2013-01-31 20:23:43 +00:00
Weiming Zhao 4a0b4fb9a5 Add a special handling case for untyped CopyFromReg node in GetCostForDef() of ScheduleDAGRRList
llvm-svn: 173833
2013-01-29 21:18:43 +00:00
Evan Cheng 0e88c7d897 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Benjamin Kramer cf9dae17b7 Legalizer: Reword comment again, per Duncan's suggestion.
llvm-svn: 173625
2013-01-27 21:02:52 +00:00
Benjamin Kramer 084e675e17 Legalizer: Add an assert and tweak a comment to clarify the assumptions this code makes.
llvm-svn: 173620
2013-01-27 15:04:43 +00:00
Benjamin Kramer 05cc93964a When the legalizer is splitting vector shifts, the result may not have the right shift amount type.
Fix that by adding a cast to the shift expander. This came up with vector shifts
on sse-less X86 CPUs.

   <2 x i64>       = shl <2 x i64> <2 x i64>
-> i64,i64         = shl i64 i64; shl i64 i64
-> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64

Now we cast the last two i64s to the right type. Fixes the crash in PR14668.

llvm-svn: 173615
2013-01-27 11:19:11 +00:00
Preston Gurd 0959bb707d This patch aims to reduce compile time in LegalizeTypes by using SmallDenseMap,
with an initial number of elements,  instead of DenseMap, which has
zero initial elements, in order to avoid the copying of elements
when the size changes and to avoid allocating space every time
LegalizeTypes is run. This patch will not affect the memory footprint,
because DenseMap will increase the element size to 64
when the first element is added.

Patch by Wan Xiaofei.

llvm-svn: 173448
2013-01-25 15:18:54 +00:00
Tim Northover 29178a348a Make APFloat constructor require explicit semantics.
Previously we tried to infer it from the bit width size, with an added
IsIEEE argument for the PPC/IEEE 128-bit case, which had a default
value. This default value allowed bugs to creep in, where it was
inappropriate.

llvm-svn: 173138
2013-01-22 09:46:31 +00:00
Nadav Rotem 9450fcfff1 Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.

llvm-svn: 172968
2013-01-20 08:35:56 +00:00
Bill Wendling 658d24d211 Use AttributeSet accessor methods instead of Attribute accessor methods.
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.

llvm-svn: 172853
2013-01-18 21:53:16 +00:00
Bill Wendling 4f972ea2d8 Remove unused parameter. Also use the AttributeSet query methods instead of the Attribute query methods.
llvm-svn: 172852
2013-01-18 21:50:24 +00:00
Elena Demikhovsky f6a30e05d5 Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

llvm-svn: 172708
2013-01-17 09:59:53 +00:00
Bill Schmidt d006c6938b This patch addresses an incorrect transformation in the DAG combiner.
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form.  The combiner first sees a 64-bit load that can be converted into a
pre-increment form.  The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword.  This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.

However, this transformation is not valid, as the replacement load is not
a pre-increment load.  The pre-increment load produces an extra result,
which feeds a subsequent add instruction.  The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add.  Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.

So the patch simply disables this transformation for any load with more than
two result values.

llvm-svn: 172480
2013-01-14 22:04:38 +00:00
Benjamin Kramer 5ea0349ef5 When lowering an inreg sext first shift left, then right arithmetically.
Shifting right two times will only yield zero. Should fix
SingleSource/UnitTests/SignlessTypes/factor.

llvm-svn: 172322
2013-01-12 19:06:44 +00:00
Nadav Rotem dbe5c72d03 PPC: Implement efficient lowering of sign_extend_inreg.
llvm-svn: 172269
2013-01-11 22:57:48 +00:00
Benjamin Kramer fb3c009b52 Remove some accidentaly duplicated code. This needs urgent cleanup :(
llvm-svn: 172248
2013-01-11 20:11:33 +00:00
Benjamin Kramer 56b31bd9d7 Split TargetLowering into a CodeGen and a SelectionDAG part.
This fixes some of the cycles between libCodeGen and libSelectionDAG. It's still
a complete mess but as long as the edges consist of virtual call it doesn't
cause breakage. BasicTTI did static calls and thus broke some build
configurations.

llvm-svn: 172246
2013-01-11 20:05:37 +00:00
Eric Christopher 0cb6fd930e For inline asm:
- recognize string "{memory}" in the MI generation
- mark as mayload/maystore when there's a memory clobber constraint.

PR14859.

Patch by Krzysztof Parzyszek

llvm-svn: 172228
2013-01-11 18:12:39 +00:00
Evan Cheng c8444b159a PR14896: Handle memcpy from constant string where the memcpy size is larger than the string size.
llvm-svn: 172124
2013-01-10 22:13:27 +00:00
Jakub Staszak a9286e931c Remove unneeded includes from FunctionLoweringInfo.h.
llvm-svn: 172123
2013-01-10 22:13:13 +00:00
Manman Ren 207bcbacca Stack Alignment: throw error if we can't satisfy the minimal alignment
requirement when creating stack objects in MachineFrameInfo.

Add CreateStackObjectWithMinAlign to throw error when the minimal alignment
can't be achieved and to clamp the alignment when the preferred alignment
can't be achieved. Same is true for CreateVariableSizedObject.
Will not emit error in CreateSpillStackObject or CreateStackObject.

As long as callers of CreateStackObject do not assume the object will be
aligned at the requested alignment, we should not have miscompile since
later optimizations which look at the object's alignment will have the correct
information.

rdar://12713765

llvm-svn: 172027
2013-01-10 01:10:10 +00:00
Evan Cheng 5652a8df32 Fix a DAG combine bug visitBRCOND() is transforming br(xor(x, y)) to br(x != y).
It cahced XOR's operands before calling visitXOR() but failed to update the
operands when visitXOR changed the XOR node.

rdar://12968664

llvm-svn: 171999
2013-01-09 20:56:40 +00:00
Tim Northover f1450d8d7c Refactor to expose RTLIB calls to targets.
fp128 is almost but not quite completely illegal as a type on AArch64. As a
result it needs to have a register class (for argument passing mainly), but all
operations need to be lowered to runtime calls. Currently there's no way for
targets to do this (without duplicating code), as the relevant functions are
hidden in SelectionDAG. This patch changes that.

llvm-svn: 171971
2013-01-09 13:18:15 +00:00
Tim Northover 4bf47bc9d7 Add fp128 rtlib function names to LLVM
llvm-svn: 171867
2013-01-08 17:09:59 +00:00
Chandler Carruth a7c44e6ec6 Sink a function that refers to the SelectionDAG into that library in the
one file where it is called as a static function. Nuke the declaration
and the definition in lib/CodeGen, along with the include of
SelectionDAG.h from this file.

There is no dependency edge from lib/CodeGen to
lib/CodeGen/SelectionDAG, so it isn't valid for a routine in lib/CodeGen
to reference the DAG. There is a dependency from
lib/CodeGen/SelectionDAG on lib/CodeGen. This breaks one violation of
this layering.

llvm-svn: 171842
2013-01-08 05:11:57 +00:00
Chandler Carruth 95f83e0155 Sink AddrMode back into TargetLowering, removing one of the most
peculiar headers under include/llvm.

This struct still doesn't make a lot of sense, but it makes more sense
down in TargetLowering than it did before.

llvm-svn: 171739
2013-01-07 15:14:13 +00:00
Chandler Carruth d3e73556d6 Move TargetTransformInfo to live under the Analysis library. This no
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
2013-01-07 03:08:10 +00:00
Chandler Carruth 664e354de7 Switch TargetTransformInfo from an immutable analysis pass that requires
a TargetMachine to construct (and thus isn't always available), to an
analysis group that supports layered implementations much like
AliasAnalysis does. This is a pretty massive change, with a few parts
that I was unable to easily separate (sorry), so I'll walk through it.

The first step of this conversion was to make TargetTransformInfo an
analysis group, and to sink the nonce implementations in
ScalarTargetTransformInfo and VectorTargetTranformInfo into
a NoTargetTransformInfo pass. This allows other passes to add a hard
requirement on TTI, and assume they will always get at least on
implementation.

The TargetTransformInfo analysis group leverages the delegation chaining
trick that AliasAnalysis uses, where the base class for the analysis
group delegates to the previous analysis *pass*, allowing all but tho
NoFoo analysis passes to only implement the parts of the interfaces they
support. It also introduces a new trick where each pass in the group
retains a pointer to the top-most pass that has been initialized. This
allows passes to implement one API in terms of another API and benefit
when some other pass above them in the stack has more precise results
for the second API.

The second step of this conversion is to create a pass that implements
the TargetTransformInfo analysis using the target-independent
abstractions in the code generator. This replaces the
ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
lib/Target with a single pass in lib/CodeGen called
BasicTargetTransformInfo. This class actually provides most of the TTI
functionality, basing it upon the TargetLowering abstraction and other
information in the target independent code generator.

The third step of the conversion adds support to all TargetMachines to
register custom analysis passes. This allows building those passes with
access to TargetLowering or other target-specific classes, and it also
allows each target to customize the set of analysis passes desired in
the pass manager. The baseline LLVMTargetMachine implements this
interface to add the BasicTTI pass to the pass manager, and all of the
tools that want to support target-aware TTI passes call this routine on
whatever target machine they end up with to add the appropriate passes.

The fourth step of the conversion created target-specific TTI analysis
passes for the X86 and ARM backends. These passes contain the custom
logic that was previously in their extensions of the
ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
I separated them into their own file, as now all of the interface bits
are private and they just expose a function to create the pass itself.
Then I extended these target machines to set up a custom set of analysis
passes, first adding BasicTTI as a fallback, and then adding their
customized TTI implementations.

The fourth step required logic that was shared between the target
independent layer and the specific targets to move to a different
interface, as they no longer derive from each other. As a consequence,
a helper functions were added to TargetLowering representing the common
logic needed both in the target implementation and the codegen
implementation of the TTI pass. While technically this is the only
change that could have been committed separately, it would have been
a nightmare to extract.

The final step of the conversion was just to delete all the old
boilerplate. This got rid of the ScalarTargetTransformInfo and
VectorTargetTransformInfo classes, all of the support in all of the
targets for producing instances of them, and all of the support in the
tools for manually constructing a pass based around them.

Now that TTI is a relatively normal analysis group, two things become
straightforward. First, we can sink it into lib/Analysis which is a more
natural layer for it to live. Second, clients of this interface can
depend on it *always* being available which will simplify their code and
behavior. These (and other) simplifications will follow in subsequent
commits, this one is clearly big enough.

Finally, I'm very aware that much of the comments and documentation
needs to be updated. As soon as I had this working, and plausibly well
commented, I wanted to get it committed and in front of the build bots.
I'll be doing a few passes over documentation later if it sticks.

Commits to update DragonEgg and Clang will be made presently.

llvm-svn: 171681
2013-01-07 01:37:14 +00:00
Chandler Carruth 42e9611f15 Funnel the actual TargetTransformInfo pass from the SelectionDAGISel
pass into the SelectionDAG itself rather than snooping on the
implementation of that pass as exposed by the TargetMachine. This
removes the last direct client of the ScalarTargetTransformInfo class
outside of the TTI pass implementation.

llvm-svn: 171625
2013-01-05 12:32:17 +00:00
Tom Stellard 567f886eb0 DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:

1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.

2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.

llvm-svn: 171420
2013-01-02 22:13:01 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth be81023d74 Resort the #include lines in include/... and lib/... with the
utils/sort_includes.py script.

Most of these are updating the new R600 target and fixing up a few
regressions that have creeped in since the last time I sorted the
includes.

llvm-svn: 171362
2013-01-02 10:22:59 +00:00
Hal Finkel 6dbdd4307b Support ppcf128 in SelectionDAG::getConstantFP
Fixes pr14751.

Patch by Kai; Thanks!

llvm-svn: 171261
2012-12-30 19:03:32 +00:00
Bill Wendling 74dba875e2 Remove the Function::getRetAttributes method in favor of using the AttributeSet accessor method.
llvm-svn: 171256
2012-12-30 13:01:51 +00:00
Bill Wendling 94dcaf8e2b Remove Function::getParamAttributes and use the AttributeSet accessor methods instead.
llvm-svn: 171255
2012-12-30 12:45:13 +00:00
Bill Wendling 698e84fc4f Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
2012-12-30 10:32:01 +00:00
Nadav Rotem b1dd52450e Refactor DAGCombinerInfo. Change the different booleans that indicate if we are before or after different runs of DAGCo, with the CombineLevel enum.
Also, added a new API for checking if we are running before or after the LegalizeVectorOps phase. 

llvm-svn: 171142
2012-12-27 06:47:41 +00:00
Jakob Stoklund Olesen 2705333253 Use MachineInstrBuilder for PHI nodes in SelectionDAGISel.
llvm-svn: 170716
2012-12-20 18:46:29 +00:00
Jakob Stoklund Olesen b109a7b430 Use MachineInstrBuilder in InstrEmitter.
This is supposed to be a mechanical change with no functional effects.

InstrEmitter can generate all types of MachineOperands which revealed
that MachineInstrBuilder was missing a few methods, added by this patch.

Besides providing a context pointer to MI::addOperand(),
MachineInstrBuilder seems like a better fit for this code.

llvm-svn: 170712
2012-12-20 18:08:09 +00:00
Bob Wilson 3365b80290 Do not introduce vector operations in functions marked with noimplicitfloat.
<rdar://problem/12879313>

llvm-svn: 170630
2012-12-20 01:36:20 +00:00
Patrik Hagglund f9934613e8 Change AsmOperandInfo::ConstraintVT to MVT, instead of EVT.
Accordingly, add MVT::getVT.

llvm-svn: 170550
2012-12-19 15:19:11 +00:00
Patrik Hagglund 00e7a11904 Split the usage of 'EVT PartVT' into 'MVT PartVT' and 'EVT PartEVT'.
llvm-svn: 170540
2012-12-19 12:33:30 +00:00
Patrik Hagglund 4e0f828686 Change RegVT in BitTestBlock and RegsForValue, to contain MVTs,
instead of EVTs.

llvm-svn: 170538
2012-12-19 12:23:01 +00:00
Patrik Hagglund e09cac9a67 Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.

llvm-svn: 170537
2012-12-19 12:02:25 +00:00
Patrik Hagglund 3f1905199b Change a parameter of TargetLowering::getVectorTypeBreakdown to MVT,
from EVT.

llvm-svn: 170536
2012-12-19 11:53:21 +00:00
Patrik Hagglund bad545ccba Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.

llvm-svn: 170535
2012-12-19 11:48:16 +00:00
Patrik Hagglund 93060569ba Change TargetLowering::TransformToType to contain MVTs, instead of
EVTs.

llvm-svn: 170534
2012-12-19 11:42:00 +00:00
Patrik Hagglund f9eb168ef4 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.

llvm-svn: 170532
2012-12-19 11:30:36 +00:00
Patrik Hagglund fd41b5b969 Change TargetLowering::getTypeToPromoteTo to take and return MVTs,
instead of EVTs.

llvm-svn: 170529
2012-12-19 11:21:04 +00:00
Patrik Hagglund ffd057a3e1 Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT.
llvm-svn: 170524
2012-12-19 10:19:55 +00:00
Patrik Hagglund deee9003ed Change TargetLowering::getCondCodeAction to take an MVT, instead of
EVT.

llvm-svn: 170522
2012-12-19 10:09:26 +00:00
Patrik Hagglund d7cdcf8cb5 Change TargetLowering::getTruncStoreAction to take MVTs, instead of EVTs.
llvm-svn: 170510
2012-12-19 08:28:51 +00:00
Elena Demikhovsky 14a4af0e66 Optimized load + SIGN_EXTEND patterns in the X86 backend.
llvm-svn: 170506
2012-12-19 07:50:20 +00:00
Nadav Rotem 33360d8ae9 After reducing the size of an operation in the DAG we zero-extend the reduced
bitwidth op back to the original size. If we reduce ANDs then this can cause
an endless loop. This patch changes the ZEXT to ANY_EXTEND if the demanded bits
are equal or smaller than the size of the reduced operation.

llvm-svn: 170505
2012-12-19 07:39:08 +00:00
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Craig Topper 3f194c8f4f Remove more of 'else's after 'returns'. No functional change.
llvm-svn: 170497
2012-12-19 06:43:58 +00:00
Craig Topper 5dd8291cbe Remove a bunch of 'else's after 'returns'
llvm-svn: 170496
2012-12-19 06:39:17 +00:00
Craig Topper 63f5921776 Teach SimplifySetCC that comparing AssertZext i1 against a constant 1 can be rewritten as a compare against a constant 0 with the opposite condition.
llvm-svn: 170495
2012-12-19 06:12:28 +00:00
Hal Finkel 943f76d1b3 Check multiple register classes for inline asm tied registers
A register can be associated with several distinct register classes.
For example, on PPC, the floating point registers are each associated with
both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code
would fail when provided with a floating point register and an f64 operand
because it would happen to find the register in the F4RC class first and
return that. From the F4RC class, SDAG would extract f32 as the register
type and then assert because of the invalid implied conversion between
the f64 value and the f32 register.

Instead, search all register classes. If a register class containing the
the requested register has the requested type, then return that register
class. Otherwise, as before, return the first register class found that
contains the requested register.

llvm-svn: 170436
2012-12-18 17:50:58 +00:00
Patrik Hagglund c494d24a68 Revert/correct some FastISel changes in r170104 (EVT->MVT for
TargetLowering::getRegClassFor).

Some isSimple() guards were missing, or getSimpleVT() were hoisted too
far, resulting in asserts on valid LLVM assembly input.

llvm-svn: 170336
2012-12-17 14:30:06 +00:00
Patrik Hagglund 55d6f47a37 Change TargetLowering::getLoadExtAction to take an MVT, instead of
EVT.

llvm-svn: 170183
2012-12-14 09:05:13 +00:00
Patrik Hagglund 13abe5ec3c Change TargetLowering::setTypeAction to take an MVT, instead fo EVT.
llvm-svn: 170148
2012-12-13 20:42:43 +00:00
Patrik Hagglund 05394352c0 Change TargetLowering::getRepRegClassFor to take an MVT, instead of
EVT.

Accordingly, change RegDefIter to contain MVTs instead of EVTs.

llvm-svn: 170140
2012-12-13 18:45:35 +00:00
Patrik Hagglund 5e6c361bc0 Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.

This is the second attempt. In the first attempt (r169837), a few
getSimpleVT() were hoisted too far, detected by bootstrap failures.

llvm-svn: 170104
2012-12-13 06:34:11 +00:00
Evan Cheng bf0baa9de7 Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039
llvm-svn: 170078
2012-12-13 01:34:32 +00:00
Evan Cheng b7d3d03bf9 Fix a logic bug in inline expansion of memcpy / memset with an overlapping
load / store pair. It's not legal to use a wider load than the size of
the remaining bytes if it's the first pair of load / store.

llvm-svn: 170018
2012-12-12 20:43:23 +00:00
Evan Cheng 962711ee71 Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I
mention the inline memcpy / memset expansion code is a mess?

This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.

llvm-svn: 169959
2012-12-12 02:34:41 +00:00
Evan Cheng c3d1aca657 - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.

llvm-svn: 169954
2012-12-12 01:32:07 +00:00
Manman Ren 82751a105c DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion
rdar://12838504

llvm-svn: 169951
2012-12-12 01:13:50 +00:00
Evan Cheng 04e5518783 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.

llvm-svn: 169944
2012-12-12 00:42:09 +00:00
Evan Cheng eb54240dc2 Replace TargetLowering::isIntImmLegal() with
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.

llvm-svn: 169929
2012-12-11 23:26:14 +00:00
Patrik Hagglund e98b7a0389 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
llvm-svn: 169854
2012-12-11 11:14:33 +00:00
Patrik Hagglund b31465b09b Change RegVT in BitTestBlock and RegsForValue, to contain MVTs,
instead of EVTs.

llvm-svn: 169851
2012-12-11 10:24:48 +00:00
Patrik Hagglund ad432a8e70 Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.

Accordingly, add bitsLT (and similar) to MVT.

llvm-svn: 169850
2012-12-11 10:20:51 +00:00
Patrik Hagglund d34337495e Change a parameter of TargetLowering::getVectorTypeBreakdown to MVT,
from EVT.

llvm-svn: 169849
2012-12-11 10:16:19 +00:00
Patrik Hagglund 03e9628cfa Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.

llvm-svn: 169848
2012-12-11 10:09:23 +00:00
Patrik Hagglund c50489e203 Change TargetLowering::TransformToType to contain MVTs, instead of
EVTs.

llvm-svn: 169847
2012-12-11 10:05:04 +00:00
Patrik Hagglund 8d2e7cf561 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.

llvm-svn: 169845
2012-12-11 09:57:18 +00:00
Patrik Hagglund ffb60f7c08 Change TargetLowering::getTypeToPromoteTo to take and return MVTs,
instead of EVTs.

llvm-svn: 169844
2012-12-11 09:54:23 +00:00
Patrik Hagglund a970281106 Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT.
llvm-svn: 169843
2012-12-11 09:51:27 +00:00
Patrik Hagglund e3bec6365a Change TargetLowering::getCondCodeAction to take an MVT, instead of
EVT.

llvm-svn: 169842
2012-12-11 09:48:14 +00:00
Patrik Hagglund 7ffcd226dd Change TargetLowering::getTruncStoreAction to take MVTs, instead of EVTs.
llvm-svn: 169841
2012-12-11 09:42:24 +00:00
Patrik Hagglund cbc9d4d0f9 Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT.
llvm-svn: 169840
2012-12-11 09:39:09 +00:00
Patrik Hagglund 40e1afe970 Change TargetLowering::setTypeAction to take an MVT, instead fo EVT.
llvm-svn: 169839
2012-12-11 09:32:56 +00:00
Patrik Hagglund 57b1694df1 Change TargetLowering::getRepRegClassFor to take an MVT, instead of
EVT.

Accordingly, change RegDefIter to contain MVTs instead of EVTs.

llvm-svn: 169838
2012-12-11 09:31:43 +00:00
Patrik Hagglund 3708e548f8 Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.

llvm-svn: 169837
2012-12-11 09:10:33 +00:00
Chandler Carruth b27041c50b Fix a miscompile in the DAG combiner. Previously, we would incorrectly
try to reduce the width of this load, and would end up transforming:

  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)

We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:

  movswl 6(...),%eax

Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:

  movl 6(...), %eax

llvm-svn: 169802
2012-12-11 00:36:57 +00:00
Chad Rosier df42cf39ab Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

llvm-svn: 169796
2012-12-11 00:18:02 +00:00
Evan Cheng 79e2ca90bc Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078

llvm-svn: 169791
2012-12-10 23:21:26 +00:00
Eric Christopher 200dd760fa Fix a coding style nit.
llvm-svn: 169776
2012-12-10 22:00:20 +00:00
Tom Stellard 30e2aa5015 LegalizeDAG: Allow type promotion of scalar loads
llvm-svn: 169773
2012-12-10 21:41:58 +00:00
Tom Stellard b785bd776c LegalizeDAG: Allow type promotion for scalar stores
llvm-svn: 169772
2012-12-10 21:41:54 +00:00
Craig Topper d8005db486 Teach DAG combine to handle vector add/sub with vectors of all 0s.
llvm-svn: 169727
2012-12-10 08:12:29 +00:00
Craig Topper 5ea3bdd75b Remove extra blank line.
llvm-svn: 169692
2012-12-09 08:20:52 +00:00
Craig Topper a183ddb0fe Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined.
llvm-svn: 169684
2012-12-08 22:49:19 +00:00
Evan Cheng 9ec512d768 Replace r169459 with something safer. Rather than having computeMaskedBits to
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).

rdar://12771555

llvm-svn: 169536
2012-12-06 19:13:27 +00:00
Nadav Rotem ac450eb59e Fix a bug in the code that merges consecutive stores. Previously we did not
check if loads that happen in between stores alias with the first store in the
chain, only with the second store onwards.

llvm-svn: 169516
2012-12-06 17:34:13 +00:00
Evan Cheng 5213139f48 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555

llvm-svn: 169459
2012-12-06 01:28:01 +00:00
Chandler Carruth 802d755533 Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

llvm-svn: 169224
2012-12-04 07:12:27 +00:00
Jakub Staszak ae551a853d Simplify code. No functionality change.
llvm-svn: 169198
2012-12-04 01:00:52 +00:00
Jakub Staszak bac8ae6506 Use dyn_cast instead of isa and cast. No functionality change.
llvm-svn: 169196
2012-12-04 00:50:06 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Nadav Rotem 1157e1410c Allow merging multiple store sequences on the same chain.
llvm-svn: 169111
2012-12-02 17:14:09 +00:00
Justin Holewinski edec332437 Cleanup recent addition of DAGTypeLegalizer::SplitVecOp_VSELECT
llvm-svn: 168932
2012-11-29 19:42:09 +00:00
Justin Holewinski 0ac49bf846 Teach the legalizer how to handle operands for VSELECT nodes
If we need to split the operand of a VSELECT, it must be the mask operand. We
split the entire VSELECT operand with EXTRACT_SUBVECTOR.

llvm-svn: 168883
2012-11-29 14:26:28 +00:00
Justin Holewinski bc45119b44 Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors
For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>.

llvm-svn: 168882
2012-11-29 14:26:24 +00:00
Nadav Rotem 307d767177 When combining consecutive stores allow loads in between the stores, if the loads do not alias.
llvm-svn: 168832
2012-11-29 00:00:08 +00:00
Craig Topper 79bd205d8c Refactor to make helper method static.
llvm-svn: 168557
2012-11-25 08:08:58 +00:00
Craig Topper 268b62288e Remove duplicate check of LimitFloatPrecision. It was already checked earlier before IsExp10 could be set to true.
llvm-svn: 168553
2012-11-25 00:48:58 +00:00
Craig Topper 8571944cf1 Factor common code out of individual if blocks into common tail.
llvm-svn: 168551
2012-11-25 00:15:07 +00:00
Craig Topper d374694b07 Remove redundant calls to getCurDebugLoc in visitIntrinsicCall. It's already called at the start of the function and captured in a local variable.
llvm-svn: 168548
2012-11-24 23:05:23 +00:00
Craig Topper d2638c1894 Refactor a bit to make some helper methods static.
llvm-svn: 168546
2012-11-24 18:52:06 +00:00
Craig Topper 4a98175800 Factor some common code out of individual if blocks.
llvm-svn: 168538
2012-11-24 08:22:37 +00:00
Craig Topper bef254ab16 Refactor a bit to make some helper functions static.
llvm-svn: 168524
2012-11-23 18:38:31 +00:00
Patrik Hägglund f77cc055cd Cleanup: Simplify loop end logic in computeRegisterProperties().
llvm-svn: 168507
2012-11-23 08:35:04 +00:00
Lang Hames e9541c820a llvm.fmuladd.* lowering should be checking isOperationLegalOrCustom, rather than
isOperationLegal. Thanks to Craig Topper for pointing this out.

llvm-svn: 168485
2012-11-22 03:31:45 +00:00
Eli Friedman 30834940ec Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus.
llvm-svn: 168240
2012-11-17 01:52:46 +00:00
Craig Topper ed756c5fc8 Remove conditions from 'else if' that were guaranteed by preceding 'if'.
llvm-svn: 168191
2012-11-16 20:01:39 +00:00
Craig Topper 3669de4c97 Factor out the final FADD that's common to multiple code paths in the visitLog* functions.
llvm-svn: 168183
2012-11-16 19:08:44 +00:00
Craig Topper ae89426f07 Factor some common code to reduce compile size.
llvm-svn: 168143
2012-11-16 07:48:23 +00:00
Eli Friedman e6385e61b5 Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing
case to vector legalization so this actually works.

Patch by Pete Couperus.  Fixes PR12540.

llvm-svn: 168107
2012-11-15 22:44:27 +00:00
Craig Topper 61d045781a Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics.
llvm-svn: 168025
2012-11-15 06:51:10 +00:00
Rafael Espindola c79532d101 Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333.
llvm-svn: 167912
2012-11-14 05:08:56 +00:00
Duncan Sands b8d3caf65a Codegen support for arbitrary vector getelementptrs.
llvm-svn: 167830
2012-11-13 13:01:58 +00:00
Andrew Trick 108c88c5b7 misched: Allow subtargets to enable misched and dependent options.
This allows me to begin enabling (or backing out) misched by default
for one subtarget at a time. To run misched we typically want to:
- Disable SelectionDAG scheduling (use the source order scheduler)
- Enable more aggressive coalescing (until we decide to always run the coalescer this way)
- Enable MachineScheduler pass itself.

Disabling PostRA sched may follow for some subtargets.

llvm-svn: 167826
2012-11-13 08:47:29 +00:00
Andrew Trick f1ff84c64e misched: Infrastructure for weak DAG edges.
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.

llvm-svn: 167738
2012-11-12 19:28:57 +00:00
Andrew Trick baeaabb2d0 ScheduleDAG interface. Added OrderKind to distinguish nonregister dependencies.
This is in preparation for adding "weak" DAG edges, but generally
simplifies the design.

llvm-svn: 167435
2012-11-06 03:13:46 +00:00
Owen Anderson 15fd6ac4ba Be careful not to optimize a SELECT_CC into a SETCC post-legalization if the SETCC node would be illegal.
llvm-svn: 167344
2012-11-03 00:17:26 +00:00
Manman Ren 3d5af279b1 OutputArg: added an index of the original argument to match the change to
InputArg in r165616.

This will enable us to get the actual type for both InputArg and OutputArg.

rdar://9932559

llvm-svn: 167265
2012-11-01 23:49:58 +00:00
Chandler Carruth 5da3f0512e Revert the majority of the next patch in the address space series:
r165941: Resubmit the changes to llvm core to update the functions to
         support different pointer sizes on a per address space basis.

Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.

However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.

In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.

In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.

This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.

llvm-svn: 167222
2012-11-01 09:14:31 +00:00
Chandler Carruth 7ec5085e01 Revert the series of commits starting with r166578 which introduced the
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.

These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.

Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)

After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.

Summary of reverted revisions:

r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221
2012-11-01 08:07:29 +00:00
Owen Anderson b351c8d692 Add a few more simple fast-math constant propagations and cancellations.
llvm-svn: 167200
2012-11-01 02:00:53 +00:00
Chad Rosier 909f6a035f [inline asm] Get the mayLoad/mayStore directly from the MIOp_ExtraInfo operand.
llvm-svn: 167050
2012-10-30 20:39:19 +00:00
Chad Rosier 86f6050c54 Add a comment for r167040.
llvm-svn: 167046
2012-10-30 20:01:12 +00:00
Chad Rosier 9e1274fb48 [inline asm] Implement mayLoad and mayStore for inline assembly. In general,
the MachineInstr MayLoad/MayLoad flags are based on the tablegen implementation.
For inline assembly, however, we need to compute these based on the constraints.

Revert r166929 as this is no longer needed, but leave the test case in place. 
rdar://12033048 and PR13504

llvm-svn: 167040
2012-10-30 19:11:54 +00:00
Ulrich Weigand 3abb34389d In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.

llvm-svn: 166958
2012-10-29 18:35:49 +00:00
Micah Villmow 51e7246cb4 Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!
llvm-svn: 166596
2012-10-24 17:25:11 +00:00
Micah Villmow 6a8f3f9e20 Delete a directory that wasn't supposed to be checked in yet.
llvm-svn: 166591
2012-10-24 17:20:04 +00:00
Micah Villmow 12d9127833 Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Michael Liao 5922979e49 Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.

llvm-svn: 166546
2012-10-24 04:14:18 +00:00
Jakub Staszak a6addc2741 Keep coding standard. Don't evaluate getNumOperands() every time.
llvm-svn: 166531
2012-10-24 00:38:25 +00:00
Michael Liao 6d106b7bfd Clean up code and put transformation on (build_vec (ext x)) into a helper func
llvm-svn: 166519
2012-10-23 23:06:52 +00:00
Nadav Rotem 33e034a4b3 Make the indirect branch optimization deterministic. No functionality change.
Patch by Daniel Reynaud.

llvm-svn: 166501
2012-10-23 21:05:33 +00:00
Benjamin Kramer a74129adad Symbol hygiene: Make sure declarations and definitions match, make helper functions static.
llvm-svn: 166376
2012-10-20 12:53:26 +00:00
Shuxin Yang 1479fcdef1 1. Remove noreturn attribute from __builtin_debugtrap().
(The change at Clang side was committed in r166345)

2. Cosmetic change in order to conform to coding standards. 

llvm-svn: 166350
2012-10-19 23:00:20 +00:00
Shuxin Yang cdde059a34 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 

llvm-svn: 166300
2012-10-19 20:11:16 +00:00
Michael Liao 2c2358036d Simplify condition checking as CONCAT assume all inputs of the same type.
llvm-svn: 166260
2012-10-19 03:17:00 +00:00
Nadav Rotem d5f8859672 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091

llvm-svn: 166196
2012-10-18 18:06:48 +00:00
Michael Liao 3ac8201ea4 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.

llvm-svn: 166155
2012-10-17 23:45:54 +00:00
Michael Liao c87d98dbc8 Revert r166049
- In general, it's unsafe for this transformation.

llvm-svn: 166135
2012-10-17 22:41:15 +00:00
Michael Liao 7a442c8031 Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.

llvm-svn: 166125
2012-10-17 20:48:33 +00:00
Evan Cheng 839fb650b2 Add a really faster pre-RA scheduler (-pre-RA-sched=linearize). It doesn't use
any scheduling heuristics nor does it build up any scheduling data structure
that other heuristics use. It essentially linearize by doing a DFA walk but
it does handle glues correctly.

IMPORTANT: it probably can't handle all the physical register dependencies so
it's not suitable for x86. It also doesn't deal with dbg_value nodes right now
so it's definitely is still WIP.

rdar://12474515

llvm-svn: 166122
2012-10-17 19:39:36 +00:00
Michael Liao 19006206a1 Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
llvm-svn: 166049
2012-10-16 19:38:35 +00:00
Jakob Stoklund Olesen 57e310613c Freeze the reserved registers as soon as isel is complete.
Also provide an MRI::getReservedRegs() function to access the frozen
register set, and isReserved() and isAllocatable() methods to test
individual registers.

The various implementations of TRI::getReservedRegs() are quite
complicated, and many passes need to look at the reserved register set.
This patch makes it possible for these passes to use the cached copy in
MRI, avoiding a lot of malloc traffic and repeated calculations.

llvm-svn: 165982
2012-10-15 21:33:06 +00:00
Micah Villmow 4bb926d91d Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
llvm-svn: 165941
2012-10-15 16:24:29 +00:00
Ulrich Weigand 9aa51d1a2c Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each.  The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).

However, we also need to take endianness into account when forming each
of those separate pairs:  on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair.  The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.

This patch fixes this by swapping the vector elements as they are
paired up as appropriate.

llvm-svn: 165802
2012-10-12 15:42:58 +00:00
Evan Cheng 21c4adcdd8 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395

llvm-svn: 165778
2012-10-12 01:15:47 +00:00
Micah Villmow 0c61134d8d Revert 165732 for further review.
llvm-svn: 165747
2012-10-11 21:27:41 +00:00
Micah Villmow 083189730e Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.
llvm-svn: 165726
2012-10-11 17:21:41 +00:00
Michael Liao 6b49c2f69c Follow the same routine to add target float expansion hook
llvm-svn: 165707
2012-10-11 07:22:01 +00:00
Micah Villmow 0242b9b543 Add in support for expansion of all of the comparison operations to the absolute minimum required set. This allows a backend to expand any arbitrary set of comparisons as long as a minimum set is supported.
The minimum set of required instructions is ISD::AND, ISD::OR, ISD::SETO(or ISD::SETOEQ) and ISD::SETUO(or ISD::SETUNE). Everything is expanded into one of two patterns:
Pattern 1: (LHS CC1 RHS) Opc (LHS CC2 RHS)
Pattern 2: (LHS CC1 LHS) Opc (RHS CC2 RHS)

llvm-svn: 165655
2012-10-10 20:50:51 +00:00
Michael Liao effae0c8e1 Add alternative support for FP_ROUND from v2f32 to v2f64
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
  rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
  to convert it into a target-specific X86ISD::VFPEXT to work around this
  constraints. This patch also reverts a previous attempt to fix this issue by
  recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
  reduces the overhead of supporting non-power-2 vector FP extend.

llvm-svn: 165625
2012-10-10 16:32:15 +00:00
Stepan Dyatkovskiy f13dbb8e24 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 

llvm-svn: 165616
2012-10-10 11:37:36 +00:00
Bill Wendling 8ccd6ca199 Use the attribute enums to query if a parameter has an attribute.
llvm-svn: 165550
2012-10-09 21:38:14 +00:00
Micah Villmow 89021e4740 Add in the first step of the multiple pointer support. This adds in support to the data layout for specifying a per address space pointer size.
The next step is to update the optimizers to allow them to optimize the different address spaces with this information.

llvm-svn: 165505
2012-10-09 16:06:12 +00:00
Bill Wendling c9b22d735a Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488
2012-10-09 07:45:08 +00:00
Nadav Rotem 35315fea70 Refactor the AddrMode class out of TLI to its own header file.
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.

llvm-svn: 165455
2012-10-08 23:06:34 +00:00
Andrew Trick 09650df562 misched: remove forceUnitLatencies. Defaults are handled by the default SchedModel
llvm-svn: 165417
2012-10-08 18:53:57 +00:00
Micah Villmow cdfe20b97f Move TargetData to DataLayout.
llvm-svn: 165402
2012-10-08 16:38:25 +00:00
Benjamin Kramer db5fb3bfe8 Remove unused but set variable flagged by GCC.
llvm-svn: 165331
2012-10-05 20:08:45 +00:00
Benjamin Kramer 62f7fb977c Simplify code, don't or a bool with an uint64_t.
No functionality change.

llvm-svn: 165321
2012-10-05 18:19:44 +00:00
Nadav Rotem b27777ff02 When merging connsecutive stores, use vectors to store the constant zero.
llvm-svn: 165267
2012-10-04 22:35:15 +00:00
Bill Wendling 71ad78b24b Update to use the predicate methods to query if an attribute exists.
llvm-svn: 165163
2012-10-03 21:17:09 +00:00
Nadav Rotem ac92066b0c Fix a cycle in the DAG. In this code we replace multiple loads with a single load and
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.

llvm-svn: 165148
2012-10-03 19:30:31 +00:00
Nadav Rotem 7cbc12a41d A DAGCombine optimization for mergeing consecutive stores to memory. The optimization
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:

1. Store of multiple consecutive constants:
  q->a = 3;
  q->4 = 5;
In this case we store a single legal wide integer.

2. Store of multiple consecutive loads:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.

llvm-svn: 165125
2012-10-03 16:11:15 +00:00
Eric Christopher f4fba5cf7a Revert 165051-165049 while looking into the foreach.m failure in
more detail.

llvm-svn: 165099
2012-10-03 08:10:01 +00:00
Eric Christopher d40ce7a43d Remove the SavePoint infrastructure from fast isel, replace
with just an insert point from the MachineBasicBlock and let
the location be updated as we access it.

llvm-svn: 165049
2012-10-02 21:16:50 +00:00
Duncan Sands f97cb15aee Fix PR13991: legalizing an overflowing multiplication operation is harder than
the add/sub case since in the case of multiplication you also have to check that
the operation in the larger type did not overflow.

llvm-svn: 165017
2012-10-02 15:03:49 +00:00
Jakub Staszak ec5a2f248f Use dyn_cast instead of isa and cast.
No functionality change.

llvm-svn: 164924
2012-09-30 21:24:57 +00:00
Nadav Rotem abbe665154 Revert r164910 because it causes failures to several phase2 builds.
llvm-svn: 164911
2012-09-30 07:17:56 +00:00
Nadav Rotem 45715b25f7 A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
 int a = p->a;
 int b = p->b;
 q->a = a;
 q->b = b;

2. Consecutive stores where the values are constants. Foe example:
 q->a = 4;
 q->b = 5;

llvm-svn: 164910
2012-09-30 06:24:14 +00:00
Duncan Sands fb9d30dd64 Speculatively revert commit 164885 (nadav) in the hope of ressurecting a pile of
buildbots.  Original commit message:

A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;

2. Consecutive stores where the values are constants. Foe example:
  q->a = 4;
  q->b = 5;

llvm-svn: 164890
2012-09-29 10:25:35 +00:00
Craig Topper 5f9791fd2f Tidy up to match coding standards. Remove 'else' after 'return' and moving operators to end of preceding line. No functional change intended.
llvm-svn: 164887
2012-09-29 07:18:53 +00:00
Craig Topper 65161fa493 Replace a couple if/elses around similar calls with conditional operators on the varying arguments. No functional change.
llvm-svn: 164886
2012-09-29 06:54:22 +00:00
Nadav Rotem a2e7ea2f18 A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;

2. Consecutive stores where the values are constants. Foe example:
  q->a = 4;
  q->b = 5;

llvm-svn: 164885
2012-09-29 06:33:25 +00:00
Sylvestre Ledru 91ce36c986 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768
2012-09-27 10:14:43 +00:00
Sylvestre Ledru 721cffd53a Fix a typo 'iff' => 'if'
llvm-svn: 164767
2012-09-27 09:59:43 +00:00
Bill Wendling 863bab689a Remove the `hasFnAttr' method from Function.
The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.

llvm-svn: 164725
2012-09-26 21:48:26 +00:00
Bill Wendling 5def891396 Generate an error message instead of asserting or segfaulting when we have a
scalar-to-vector conversion that we cannot handle. For instance, when an invalid
constraint is used in an inline asm statement.
<rdar://problem/12284092>

llvm-svn: 164662
2012-09-26 06:16:18 +00:00
Bill Wendling 81406f692f Generate an error message instead of asserting or segfaulting when we have a
scalar-to-vector conversion that we cannot handle. For instance, when an invalid
constraint is used in an inline asm statement.
<rdar://problem/12284092>

llvm-svn: 164657
2012-09-26 04:04:19 +00:00
Sebastian Pop edb31faf92 TargetLowering interface to set/get minimum block entries for jump tables.
Provide interface in TargetLowering to set or get the minimum number of basic
blocks whereby jump tables are generated for switch statements rather than an
if sequence.

    getMinimumJumpTableEntries() defaults to 4.
    setMinimumJumpTableEntries() allows target configuration.

    This patch changes the default for the Hexagon architecture to 5
    as it improves performance on some benchmarks.

llvm-svn: 164628
2012-09-25 20:35:36 +00:00
Nadav Rotem 841c9a84d0 Fix 80-col violations.
llvm-svn: 164297
2012-09-20 08:53:31 +00:00
Bill Wendling d6b2688130 Add predicates for queries on whether an attribute exists.
llvm-svn: 164264
2012-09-19 23:35:21 +00:00
Craig Topper b1d83e8c72 Mark unimplemented copy constructors and copy assignment operators as LLVM_DELETED_FUNCTION.
llvm-svn: 164090
2012-09-18 02:01:41 +00:00
Evan Cheng c573599137 Fix some funky indentation.
llvm-svn: 164087
2012-09-18 01:34:40 +00:00
Michael Liao b503b323f3 Fix PR13859
- Preserve the original NOutVT during casting from vector to integer by
  extracting vector elements.

llvm-svn: 164042
2012-09-17 18:05:20 +00:00
Craig Topper 04b4e83cf7 Fix bad comment. No functional change.
llvm-svn: 164000
2012-09-16 16:48:25 +00:00
Eric Christopher b83dba2b84 Fix both the test for zero and what we do if we have a zero for
umulo legalization.

Fixes PR13839

llvm-svn: 163856
2012-09-13 23:24:02 +00:00
Eric Christopher 3bc248176c Reformat, remove a couple unused variables and move some variables
closer to where they're needed.

llvm-svn: 163855
2012-09-13 23:23:58 +00:00
Michael Liao 460fc46e0f Enhance type legalization on bitcast from vector to integer
- Find a legal vector type before casting and extracting element from it.
- As the new vector type may have more than 2 elements, build the final
  hi/lo pair by BFS pairing them from bottom to top.

llvm-svn: 163830
2012-09-13 19:58:21 +00:00
Nadav Rotem 24a822a5cb Fix a dagcombine optimization. The optimization attempts to optimize a bitcast of fneg to integers
by xoring the high-bit. This fails if the source operand is a vector because we need to negate
each of the elements in the vector.

Fix rdar://12281066 PR13813.

llvm-svn: 163802
2012-09-13 14:54:28 +00:00
Michael Liao abb87d4857 Fix PR11985
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.

llvm-svn: 163743
2012-09-12 21:43:09 +00:00
Owen Anderson 6f9dace01c Remove an overly-aggressive assertion. The code following this assertion already knows how to handle the case where DstRC was NULL, so it's not actually protecting us from anything, and this pattern can come up when using unknown_class operands in the SelectionDAG.
llvm-svn: 163736
2012-09-12 20:09:19 +00:00
Kristof Beyls e6b876f4e5 Fix constant folding through bitcasts by no longer relying on undefined behaviour (converting NaN values between float and double).
SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget);
should not be used when Val is not a simple constant (as the comment in
SelectionDAG.h indicates). This patch avoids using this function
when folding an unknown constant through a bitcast, where it cannot be
guaranteed that Val will be a simple constant.

llvm-svn: 163703
2012-09-12 11:25:02 +00:00
Manman Ren 19f49ac624 Release build: guard dump functions with
"#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"

No functional change. Update r163339.

llvm-svn: 163653
2012-09-11 22:23:19 +00:00
Craig Topper 8238461211 Teach DAG combiner to constant fold FABS of a BUILD_VECTOR of ConstantFPs. Factor similar code out of FNEG DAG combiner.
llvm-svn: 163587
2012-09-11 01:45:21 +00:00
Michael Ilseman 0666f0580c Fold multiply by 0 or 1 when in UnsafeFPMath mode in SelectionDAG::getNode().
This folding happens as early as possible for performance reasons, and to make sure it isn't foiled by other transforms (e.g. forming FMAs).

llvm-svn: 163519
2012-09-10 17:00:37 +00:00
Michael Ilseman d5f91515f3 whitespace
llvm-svn: 163518
2012-09-10 16:56:31 +00:00
James Molloy 1e5c611815 Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types.
llvm-svn: 163511
2012-09-10 14:01:21 +00:00
Nadav Rotem d753a952ca Teach the DAGBuilder about lifetime markers which are generated from PHINodes.
llvm-svn: 163494
2012-09-10 08:43:23 +00:00
Craig Topper 03f39773e0 Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants.
llvm-svn: 163483
2012-09-09 22:58:45 +00:00
Michael Liao b7cd341901 Stop emitting lifetime region info when stack coloring is not enabled in O0
- this should fix PR13780

llvm-svn: 163370
2012-09-07 05:13:00 +00:00
Manman Ren 742534c4dc Release build: guard dump functions with "ifndef NDEBUG"
No functional change.

llvm-svn: 163339
2012-09-06 19:06:06 +00:00
Nadav Rotem a8e15b0892 Fix a few old-GCC warnings. No functional change.
llvm-svn: 163309
2012-09-06 11:13:55 +00:00
Nadav Rotem 7c277da364 Add a new optimization pass: Stack Coloring, that merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).

llvm-svn: 163299
2012-09-06 09:17:37 +00:00
Chad Rosier e53314f7e3 Cleanup a few magic numbers.
llvm-svn: 163263
2012-09-05 22:40:13 +00:00
Roman Divacky ad06cee239 Stop casting away const qualifier needlessly.
llvm-svn: 163258
2012-09-05 22:26:57 +00:00
Chad Rosier cbd2a1983f [ms-inline asm] We only need one bit to represent the AsmDialect in the
MachineInstr.

llvm-svn: 163257
2012-09-05 22:17:43 +00:00
Roman Divacky 9338344acb Constify this properly. Found by gcc48 -Wcast-qual.
llvm-svn: 163256
2012-09-05 22:15:49 +00:00
Roman Divacky 665260222f Constify SDNodeIterator an stop its only non-const user being cast stripped
of its constness. Found by gcc48 -Wcast-qual.

llvm-svn: 163254
2012-09-05 22:03:34 +00:00
Chad Rosier 994f4040f5 [ms-inline asm] Propagate the asm dialect into the MachineInstr representation.
llvm-svn: 163243
2012-09-05 21:00:58 +00:00
Silviu Baranga 3f40d87207 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
llvm-svn: 163203
2012-09-05 08:57:21 +00:00
Craig Topper 2db2353b21 Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
llvm-svn: 163192
2012-09-05 05:48:09 +00:00
Preston Gurd cdf540d5d6 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
Nadav Rotem 10f6b8802b Fix a typo.
llvm-svn: 163094
2012-09-02 12:21:50 +00:00
Nadav Rotem 500d691d4a Generate better select code by allowing the target to use scalar select, and not sign-extend.
llvm-svn: 163086
2012-09-02 08:20:07 +00:00
Pete Cooper 2455e9c4a5 Only legalise a VSELECT in to bitwise operations if the vector mask bool is zeros or all ones. A vector bool with just ones isn't suitable for masking with.
No test case unfortunately as i couldn't find a target which fit all
the conditions needed to hit this code.

llvm-svn: 163075
2012-09-01 22:27:48 +00:00
Pete Cooper 2117ac40c9 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb92847e947f9edab170f9b4e52b908f.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

llvm-svn: 163064
2012-09-01 17:37:55 +00:00
Owen Anderson 90e0eaffa8 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
llvm-svn: 163051
2012-09-01 06:04:27 +00:00
Michael Liao ec385012ae Fix typo
llvm-svn: 163049
2012-09-01 04:09:16 +00:00
Jakob Stoklund Olesen 5c8eda0ebc Add MachineInstr::tieOperands, remove setIsTied().
Manage tied operands entirely internally to MachineInstr. This makes it
possible to change the representation of tied operands, as I will do
shortly.

The constraint that tied uses and defs must be in the same order was too
restrictive.

llvm-svn: 163021
2012-08-31 20:50:53 +00:00
Jakob Stoklund Olesen 96f87069c4 Don't enforce ordered inline asm operands.
I was too optimistic, inline asm can have tied operands that don't
follow the def order.

Fixes PR13742.

llvm-svn: 162998
2012-08-31 15:34:59 +00:00
Pete Cooper e969340fea Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060
llvm-svn: 162960
2012-08-30 23:58:52 +00:00
Owen Anderson cc61f87cf7 Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956
2012-08-30 23:35:16 +00:00
Nadav Rotem ea973bda26 Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387

llvm-svn: 162926
2012-08-30 19:17:29 +00:00
Craig Topper 2da13f9ef8 Add FMA to switch statement in VectorLegalizer::LegalizeOp so that it can be expanded when it isn't legal.
llvm-svn: 162894
2012-08-30 07:34:22 +00:00
Craig Topper c8f5d77e75 Add support for FMA to WidenVectorResult.
llvm-svn: 162893
2012-08-30 07:13:41 +00:00
Jakob Stoklund Olesen ffba07b927 Verify the order of tied operands in inline asm.
When there are multiple tied use-def pairs on an inline asm instruction,
the tied uses must appear in the same order as the defs.

It is possible to write an LLVM IR inline asm instruction that breaks
this constraint, but there is no reason for a front end to emit the
operands out of order.

The gnu inline asm syntax specifies tied operands as a single read/write
constraint "+r", so ouf of order operands are not possible.

llvm-svn: 162878
2012-08-29 23:52:52 +00:00
Jakob Stoklund Olesen b2bef482fd Set the isTied flags when building INLINEASM MachineInstrs.
For normal instructions, isTied() is set automatically by addOperand(),
based on MCInstrDesc, but inline asm has tied operands outside the
descriptor.

llvm-svn: 162869
2012-08-29 22:02:00 +00:00
Jakob Stoklund Olesen 87cb471e52 Remove extra MayLoad/MayStore flags from atomic_load/store.
These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.

The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.

This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.

llvm-svn: 162733
2012-08-28 03:11:32 +00:00
Akira Hatanaka adb14f56c7 Fix bug 13532.
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.

Patch by Stefan Kristiansson. 

llvm-svn: 162728
2012-08-28 02:12:42 +00:00
Richard Smith 228e6d4cf3 Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.

llvm-svn: 162623
2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen 10cdd09318 Avoid including explicit uses when counting SDNode imp-uses.
It is legal to have a register node as an explicit operand, it shouldn't
be counted as an implicit use.

llvm-svn: 162591
2012-08-24 20:52:42 +00:00
Manman Ren cf10446ffa BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle
the case of multiple edges from one block to another.

A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.

Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.

llvm-svn: 162572
2012-08-24 18:14:27 +00:00
Stepan Dyatkovskiy 99120e04be Rejected 169195. As Duncan commented, bitcasting to proper type is wrong approach. We need to insert some valid TRANCATE node here.
llvm-svn: 162354
2012-08-22 09:33:55 +00:00
Craig Topper a538d831e6 Add a getName function to MachineFunction. Use it in places that previously did getFunction()->getName(). Remove includes of Function.h that are no longer needed.
llvm-svn: 162347
2012-08-22 06:07:19 +00:00
Richard Smith 3fb2047f82 Initialize SelectionDAGBuilder's Context in 'init', not in its constructor. The
SelectionDAG's 'init' has not been called when the SelectionDAGBuilder is
constructed (in SelectionDAGISel's constructor), so this was previously always
initialized with 0.

llvm-svn: 162333
2012-08-22 00:42:39 +00:00
Jakob Stoklund Olesen 7d33c5739f Don't add CFG edges for redundant conditional branches.
IR that hasn't been through SimplifyCFG can look like this:

  br i1 %b, label %r, label %r

Make sure we don't create duplicate Machine CFG edges in this case.

Fix the machine code verifier to accept conditional branches with a
single CFG edge.

llvm-svn: 162230
2012-08-20 21:39:52 +00:00
Stepan Dyatkovskiy 6a638ec521 Fixed DAGCombiner bug (found and localized by James Malloy):
The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.

However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:

Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...

It will attempt to:

Node0:    v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...

Where this is actually invalid because the element width is completely
different. This causes an assertion failure on DAG legalization stage.

Fix:
If output item type of BUILD_VECTOR differs from input item type.
Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to:
Node0:    v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1

llvm-svn: 162195
2012-08-20 07:57:06 +00:00
Eli Friedman 79a6b30d8a Make atomic load and store of pointers work. Tighten verification of atomic operations
so other unexpected operations don't slip through.  Based on patch by Logan Chien.
PR11786/PR13186.

llvm-svn: 162146
2012-08-17 23:24:29 +00:00
Benjamin Kramer ca7ca4f6c6 TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
llvm-svn: 162101
2012-08-17 15:54:21 +00:00
Owen Anderson a40319b7f1 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
2012-08-13 23:32:49 +00:00
Nadav Rotem e0f84d31c8 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 

llvm-svn: 161565
2012-08-09 01:56:44 +00:00
Jakob Stoklund Olesen 505715d816 Add SelectionDAG::getTargetIndex.
This adds support for TargetIndex operands during isel. The meaning of
these (index, offset, flags) operands is entirely defined by the target.

llvm-svn: 161453
2012-08-07 22:37:05 +00:00
Bob Wilson 874886cd66 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Bob Wilson 871701c606 Try to reduce the compile time impact of r161232.
The previous change caused fast isel to not attempt handling any calls to
builtin functions.  That included things like "printf" and caused some
noticable regressions in compile time.  I wanted to avoid having fast isel
keep a separate list of functions that had to be kept in sync with what the
code in SelectionDAGBuilder.cpp was handling.  I've resolved that here by
moving the list into TargetLibraryInfo.  This is somewhat redundant in
SelectionDAGBuilder but it will ensure that we keep things consistent.

llvm-svn: 161263
2012-08-03 21:26:24 +00:00
Bob Wilson fa59485b94 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Micah Villmow b67d7a3a33 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00
Micah Villmow 6b12f596ef Don't generate ordered or unordered comparison operations if it is not legal to do so.
llvm-svn: 161053
2012-07-31 16:48:03 +00:00
Pete Cooper 91244268d7 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Dan Gohman 0b3d782933 Add a floor intrinsic.
llvm-svn: 160791
2012-07-26 17:43:27 +00:00
Craig Topper 17300940ae Change llvm_unreachable in SplitVectorOperand to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type.
llvm-svn: 160661
2012-07-24 04:11:21 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Nadav Rotem 9056076cab Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Craig Topper 2694c05e86 Tidy up. Fix indentation and remove trailing whitespace.
llvm-svn: 160617
2012-07-23 05:38:07 +00:00
Craig Topper b49546a3b3 Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.
llvm-svn: 160616
2012-07-23 04:34:49 +00:00
Benjamin Kramer f364a63c3e Replace some explicit compare loops with std::equal.
No functionality change.

llvm-svn: 160501
2012-07-19 10:46:05 +00:00
Galina Kistanova aaf9735951 Fixed few warnings.
llvm-svn: 160493
2012-07-19 04:50:12 +00:00
Bill Wendling d163405df8 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Nuno Lopes 2151497dca ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BB
llvm-svn: 160411
2012-07-18 00:07:17 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Benjamin Kramer 7c1598caaa Remove unused variable.
llvm-svn: 160372
2012-07-17 17:00:11 +00:00
Nadav Rotem 277a40bc0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng 47d7be9578 Make sure constant bitwidth is <= 64 bit before calling getSExtValue().
llvm-svn: 160350
2012-07-17 07:47:50 +00:00
Evan Cheng f579beca6d This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Nadav Rotem 60f7904db7 Minor cleanup and docs.
llvm-svn: 160311
2012-07-16 18:56:39 +00:00
Nadav Rotem 839a06e9d7 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Nadav Rotem 3050e07108 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem a62368c965 Refactor the code that checks that all operands of a node are UNDEFs.
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229
2012-07-15 08:38:23 +00:00
Nadav Rotem 018921002e Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Jim Grosbach 1af8c8060c Provide function name in 'Cannot select' fatal error.
When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.

llvm-svn: 160152
2012-07-13 00:29:09 +00:00
Duncan Sands 671cc2575d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Evan Cheng b17122859b InstrEmitter::EmitSubregNode() optimize extract_subreg in this case:
r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4

to a copy:
r1026 = copy r1024

This is correct. However it uses TII->isCoalescableExtInstr() which can return
true for instructions which essentially does a sext_in_reg so this can end up
with an illegal copy where the source and destination register classes do not
match. Add a check to avoid it. Sorry, no test case possible at this time.

rdar://11849816

llvm-svn: 160059
2012-07-11 18:55:07 +00:00
Nadav Rotem 2a148668b6 Rename many of the Tmp1, Tmp2, Tmp3 variables to names such as Chain, Value, Ptr, etc.
No functionality change.

llvm-svn: 160042
2012-07-11 11:02:16 +00:00
Benjamin Kramer 9488100d46 Remove unused variable.
llvm-svn: 160040
2012-07-11 09:39:04 +00:00
Nadav Rotem de6fd282ef Refactor the DAG Legalizer by extracting the legalization of
Load and Store nodes into their own functions.
No functional change.

llvm-svn: 160037
2012-07-11 08:52:09 +00:00
Owen Anderson b8844d6744 Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036
2012-07-11 06:38:55 +00:00
Nadav Rotem d908ddc186 Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Owen Anderson d4b841f8f9 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Andrew Trick 87255e340e I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraties for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex contraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Chad Rosier 879c34f45a Whitespace.
llvm-svn: 159839
2012-07-06 17:44:22 +00:00
Chad Rosier 88d53eae56 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Duncan Sands 71dacd09fe All cases are covered, no need for a default. This deals with the
corresponding clang warning.

llvm-svn: 159742
2012-07-05 10:14:33 +00:00
Duncan Sands 0552a2cad2 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen c300ef0e50 Allow trailing physreg RegisterSDNode operands on non-variadic instructions.
Also allow trailing register mask operands on non-variadic both
MachineSDNodes and MachineInstrs.

The extra physreg RegisterSDNode operands are added to the MI as
<imp-use> operands. This makes it possible to have non-variadic call
instructions.

Call and return instructions really are non-variadic, the argument
registers should only be used implicitly - they are not part of the
encoding.

llvm-svn: 159727
2012-07-04 23:53:23 +00:00
Stepan Dyatkovskiy 7ff588f986 Reverted r156659, due to probable performance regressions, DenseMap should be used here:
IntegersSubsetMapping
  - Replaced type of Items field from std::list with std::map. In neares future I'll test it with DenseMap and do the correspond replacement
    if possible.

llvm-svn: 159703
2012-07-04 05:53:05 +00:00
Stepan Dyatkovskiy 8b0c97e0dd Part of r159527. Splitted into series of patches and gone with fixed PR13256:
IntegersSubsetMapping
  - Replaced type of Items field from std::list with std::map. In neares future I'll test it with DenseMap and do the correspond replacement
    if possible.

llvm-svn: 159659
2012-07-03 13:46:45 +00:00
Eric Christopher b65acc61a5 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Evan Cheng 39e90029a2 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Eric Christopher dd8638fb3e Turn an assert into an error to make it a bit more friendly.
Part of rdar://6880388 and rdar://11766377

llvm-svn: 159590
2012-07-02 21:16:43 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ragnes compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Jim Grosbach e0c10d8b86 'Promote' vector [su]int_to_fp should widen elements.
Teach vector legalization how to honor Promote for int to float
conversions. The code checking whether to promote the operation knew
to look at the operand, but the actual promotion code didn't. This
fixes that. The operand is promoted up via [zs]ext.

rdar://11762659

llvm-svn: 159378
2012-06-28 21:03:44 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Evan Cheng 4c6f917d34 Make sure type is not extended or untyped before create a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Pete Cooper fe212e762f DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Lang Hames b8650f106a Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Pete Cooper 5b61422d80 Fix potential crash if DAGCombine on stores sees a half type
llvm-svn: 158927
2012-06-21 18:00:39 +00:00
Evan Cheng 8c2ad81238 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607

llvm-svn: 158900
2012-06-21 05:56:05 +00:00
Pete Cooper fe5b84b404 Add users of a MERGE_VALUE node to the worklist to process again when the node is removed. Sorry, no test case. Foudn it by inspection of the code
llvm-svn: 158839
2012-06-20 19:35:43 +00:00
Hal Finkel 8a31138521 Fix DAGCombine to deal with ext-conversion of pre/post_inc loads.
The test case for this will come with the PPC indexed preinc loads commit.

llvm-svn: 158822
2012-06-20 15:42:48 +00:00
Lang Hames 39fb1d08dc Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Lang Hames a33db65bd9 Make comment slightly more helpful.
llvm-svn: 158467
2012-06-14 20:37:15 +00:00
Andrew Trick 4544606c71 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Lang Hames a59100cc08 Add a new intrinsic: llvm.fmuladd. This intrinsic represents a multiply-add
expression (a * b + c) that can be implemented as a fused multiply-add (fma)
if the target determines that this will be more efficient. This intrinsic
will be used to implement FP_CONTRACT support and an aggressive FMA formation
mode.

If your target has a fast FMA instruction you should override the
isFMAFasterThanMulAndAdd method in TargetLowering to return true.

llvm-svn: 158014
2012-06-05 19:07:46 +00:00
Andrew Trick 73d7736b17 misched: Added MultiIssueItineraries.
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.

llvm-svn: 157979
2012-06-05 03:44:40 +00:00
Andrew Trick a88d46e818 sdsched: Use the right heuristics when -mcpu is not provided and we have no itinerary.
Use ILP heuristics for long latency instrs if no scoreboard exists.

llvm-svn: 157978
2012-06-05 03:44:34 +00:00
Nadav Rotem b7bb72e4f3 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy 0e46d8a08c PR1255: case ranges.
IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()"

llvm-svn: 157884
2012-06-02 09:42:43 +00:00
Stepan Dyatkovskiy 9549f5894b PR1255: case ranges.
IntegersSubsetGeneric, IntegersSubsetMapping: added IntTy template parameter, that allows use either APInt or IntItem. This change allows to write unittest for these classes.

llvm-svn: 157880
2012-06-02 07:26:00 +00:00
Akira Hatanaka 6f3b2a670f Fix a bug in the code which custom-lowers truncating stores in LegalizeDAG.
Check that the SDValue TargetLowering::LowerOperation returns is not null
before replacing the original node with the returned node.

llvm-svn: 157873
2012-06-02 01:10:34 +00:00
Jakob Stoklund Olesen 54038d796c Switch all register list clients to the new MC*Iterator interface.
No functional change intended.

Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.

This makes it possible to do so without changing all clients (again).

llvm-svn: 157854
2012-06-01 23:28:30 +00:00
Jakob Stoklund Olesen 9b09cf0c11 Simplify some more getAliasSet callers.
MCRegAliasIterator can include Reg itself in the list.

llvm-svn: 157848
2012-06-01 22:38:17 +00:00
Manman Ren e873552091 ARM: properly handle alignment for struct byval.
Factor out the expansion code into a function.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 157830
2012-06-01 19:33:18 +00:00
Stepan Dyatkovskiy 66305749f1 PR1255: case ranges.
IntegersSubset devided into IntegersSubsetGeneric and into IntegersSubset itself. The first has no references to ConstantInt and works with IntItem only.
IntegersSubsetMapping also made generic. Here added second template parameter "IntegersSubsetTy" that allows to use on of two IntegersSubset types described below.

llvm-svn: 157815
2012-06-01 16:17:57 +00:00
Owen Anderson 0eda3e1de6 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
llvm-svn: 157708
2012-05-30 18:54:50 +00:00
Owen Anderson c7aaf523e1 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
llvm-svn: 157707
2012-05-30 18:50:39 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Peter Collingbourne 913869be45 Add llvm.fabs intrinsic.
llvm-svn: 157594
2012-05-28 21:48:37 +00:00
Stepan Dyatkovskiy e3e19cbb13 PR1255: Case Ranges
Implemented IntItem - the wrapper around APInt. Why not to use APInt item directly right now?
1. It will very difficult to implement case ranges as series of small patches. We got several large and heavy patches. Each patch will about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst you will need to changes at the same time all Readers,Writers and absolutely all passes that uses SwitchInst.
2. We can implement APInt pool inside and save memory space. E.g. we use several switches that works with 256 bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
3. IntItem can be easyly easily replaced with APInt.
4. Currenly we can interpret IntItem both as ConstantInt and as APInt. It allows to provide SwitchInst methods that works with ConstantInt for non-updated passes.

Why I need it right now? Currently I need to update SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so peaces of code
ConstantInt *V = ...;
if (V->getValue().ugt(AnotherV->getValue()) {
  ...
}
will look awful. Much more better this way:
IntItem V = ConstantIntVal->getValue();
if (AnotherV < V) {
}

Of course any reviews are welcome.

P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows to map individual subsets of integers to the BasicBlocks).
Since in future these classes will founded on APInt, it will possible to use them in more generic ways.

llvm-svn: 157576
2012-05-28 12:39:09 +00:00
Benjamin Kramer abb3fa69b4 Missed parens.
llvm-svn: 157527
2012-05-27 10:56:55 +00:00
Benjamin Kramer 4b8f8e75e6 r157525 didn't work, just disable iterator checking.
This is obviosly right but I don't see how to do this with proper vector
iterators without building a horrible mess of workarounds.

llvm-svn: 157526
2012-05-27 10:24:52 +00:00
Benjamin Kramer 48ff2751c1 SDAGBuilder: Avoid iterator invalidation harder.
vector.begin()-1 is invalid too.

llvm-svn: 157525
2012-05-27 09:44:52 +00:00
Benjamin Kramer 5aad872f8c SDAGBuilder: Don't create an invalid iterator when there is only one switch case.
Found by libstdc++'s debug mode.

llvm-svn: 157522
2012-05-26 21:19:12 +00:00
Benjamin Kramer f2beccf6b4 SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

llvm-svn: 157521
2012-05-26 20:01:32 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Craig Topper 9520719b9b Mark some static arrays as const.
llvm-svn: 157377
2012-05-24 06:35:32 +00:00
Owen Anderson f2118ea826 Fix use of an unitialized value in the LegalizeOps expansion for ISD::SUB. No in-tree targets exercise this path.
Patch by Micah Villmow.

llvm-svn: 157215
2012-05-21 22:39:20 +00:00
Chad Rosier 5d1f5d2be3 Typo.
llvm-svn: 157195
2012-05-21 17:13:41 +00:00
Peter Collingbourne 8eb05fd093 When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

llvm-svn: 157162
2012-05-20 18:36:15 +00:00
Jakob Stoklund Olesen 1f1c6add10 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

llvm-svn: 157150
2012-05-20 06:38:37 +00:00
Stepan Dyatkovskiy b638ee0ed3 Recommited reworked r156804:
SelectionDAGBuilder::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.

llvm-svn: 157046
2012-05-18 08:32:28 +00:00
Stepan Dyatkovskiy 96d0c925e9 SelectionDAGBuilder: CaseBlock, CaseRanges and CaseCmp changed representation of Low and High from signed to unsigned. Since unsigned ints usually simpler, faster and allows to reduce some extra signed bit checks needed before <,>,<=,>= comparisons.
llvm-svn: 156985
2012-05-17 08:56:30 +00:00
Duncan Sands 49080cd9a1 Fix a thinko in DisintegrateMERGE_VALUES. Patch by Xiaoyi Guo.
llvm-svn: 156909
2012-05-16 07:57:18 +00:00
Stepan Dyatkovskiy e01e9863c5 Rejected r156804 due to buildbots failures.
llvm-svn: 156808
2012-05-15 06:50:18 +00:00
Stepan Dyatkovskiy d450d3fa12 SelectionDAGBuilder::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.
llvm-svn: 156804
2012-05-15 05:09:41 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Chad Rosier a33015d4e0 Revert 156658.
llvm-svn: 156662
2012-05-11 23:21:01 +00:00
Chad Rosier e40f5d3ee0 [fast-isel] Fast-isel doesn't use the expect intrinsic.
llvm-svn: 156658
2012-05-11 23:10:58 +00:00
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00
Jim Grosbach 92f6adc8be DAGCombiner should not change the type of an extract_vector index.
When a combine twiddles an extract_vector, care should be take to preserve
the type of the index operand. No luck extracting a reasonable testcase,
unfortunately.

rdar://11391009

llvm-svn: 156419
2012-05-08 20:56:07 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Owen Anderson ab63d84252 Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
llvm-svn: 156324
2012-05-07 20:51:25 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Jakob Stoklund Olesen e326ed33a8 Make sure findRepresentativeClass picks the widest super-register.
We want the representative register class to contain the largest
super-registers available. This makes the function less sensitive to the
register class numbering.

llvm-svn: 156220
2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen 75fbe90839 Use SuperRegClassIterator for findRepresentativeClass().
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.

llvm-svn: 156147
2012-05-04 02:19:22 +00:00
Andrew Trick 32aea358e1 Added TargetRegisterInfo::getAllocatableClass.
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.

This ensures that targets define register classes as intended.

llvm-svn: 156046
2012-05-03 01:14:37 +00:00
Owen Anderson 41b0665b5b Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
llvm-svn: 156029
2012-05-02 22:17:40 +00:00
Owen Anderson b5f167c660 Teach DAG combine that multiplication by 1.0 can always be constant folded.
llvm-svn: 156023
2012-05-02 21:32:35 +00:00
Jakub Staszak cd2353402d Use dyn_cast instead of checking opcode and cast.
llvm-svn: 155957
2012-05-01 23:06:00 +00:00
Bill Wendling b6b50c6638 Strip the pointer casts off of allocas so that the selection DAG can find them.
PR10799

llvm-svn: 155954
2012-05-01 22:50:45 +00:00
Jakub Staszak cec09b2594 Add some constantness. No functionality change.
llvm-svn: 155859
2012-04-30 23:41:30 +00:00
Andrew Trick 833f04962a Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.
This time, also fix the caller of AddGlue to properly handle
incomplete chains. AddGlue had failure modes, but shamefully hid them
from its caller. It's luck ran out.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155749
2012-04-28 01:03:23 +00:00
Andrew Trick 7a773ec053 Temporarily revert r155668: Fix the SD scheduler to avoid gluing.
This definitely caused regression with ARM -mno-thumb.

llvm-svn: 155743
2012-04-27 22:55:59 +00:00
Andrew Trick 03fa574af5 Fix the SD scheduler to avoid gluing the same node twice.
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
beter fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155668
2012-04-26 21:48:25 +00:00
Elena Demikhovsky 8d7e56c409 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Nadav Rotem 31caa27bf5 Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors.
llvm-svn: 155296
2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen d114da6004 Fix PR12599.
The X86 target is editing the selection DAG while isel is selecting
nodes following a topological ordering. When the DAG hacking triggers
CSE, nodes can be deleted and bad things happen.

llvm-svn: 155257
2012-04-20 23:36:09 +00:00
Jakob Stoklund Olesen e3a891cf08 Make ISelPosition a local variable.
Now that multiple DAGUpdateListeners can be active at the same time,
ISelPosition can become a local variable in DoInstructionSelection.

We simply register an ISelUpdater with CurDAG while ISelPosition exists.

llvm-svn: 155249
2012-04-20 22:08:50 +00:00
Jakob Stoklund Olesen beb9469d5c Register DAGUpdateListeners with SelectionDAG.
Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.

This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.

DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.

llvm-svn: 155248
2012-04-20 22:08:46 +00:00
Joel Jones 828531f798 Fixes a problem in instruction selection with testing whether or not the
transformation:

(X op C1) ^ C2 --> (X op C1) & ~C2 iff (C1&C2) == C2

should be done.  

This change has been tested:
 Using a debug+asserts build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt
   using this clang to build a release version of clang
 Using the release+asserts clang-with-clang build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt

Checking in because Evan wants it checked in.  Test case forthcoming after
scrubbing.

llvm-svn: 154955
2012-04-17 22:23:10 +00:00
Hal Finkel e0cf6397fd Remove dead SD nodes after the combining pass. Fixes PR12201.
llvm-svn: 154786
2012-04-16 03:33:22 +00:00
Nadav Rotem 02ef0c3524 When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type.
llvm-svn: 154764
2012-04-15 15:08:09 +00:00
Nadav Rotem 9d376b6578 Reapply 154397. Original message:
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154490
2012-04-11 08:26:11 +00:00
Craig Topper 692d584910 Fix an overly indented line. Remove an 'else' after an 'if' that returns.
llvm-svn: 154479
2012-04-11 04:55:51 +00:00
Craig Topper bc680061e8 Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to an nodetype.
llvm-svn: 154478
2012-04-11 04:34:11 +00:00
Craig Topper 3ef01cdb2e Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit.
llvm-svn: 154473
2012-04-11 03:06:35 +00:00
Owen Anderson 6f1ee1634d Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point.
Zap a testcase that this allows us to completely fold away.

llvm-svn: 154447
2012-04-10 22:46:53 +00:00
Duncan Sands 4f53074cca Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.

llvm-svn: 154431
2012-04-10 20:35:27 +00:00
Eric Christopher e9abba71fe To ensure that we have more accurate line information for a block
don't elide the branch instruction if it's the only one in the block,
otherwise it's ok.

PR9796 and rdar://11215207

llvm-svn: 154417
2012-04-10 18:18:10 +00:00
Owen Anderson 3efc8f22bd Revert r154397, which was causing make check failures on the buildbots.
llvm-svn: 154414
2012-04-10 18:02:12 +00:00
Nadav Rotem 065564d85a Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154397
2012-04-10 14:58:31 +00:00
Anton Korobeynikov 4d1220de34 Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (workarounded)

llvm-svn: 154394
2012-04-10 13:22:49 +00:00
Evan Cheng 136861d994 Make the code slightly more palatable.
llvm-svn: 154378
2012-04-10 03:15:18 +00:00
Evan Cheng f8bad08001 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370
2012-04-10 01:51:00 +00:00
Rafael Espindola 1d9672bdce Don't try to zExt just to check if an integer constant is zero, it might
not fit in a i64.

llvm-svn: 154364
2012-04-10 00:16:22 +00:00
Akira Hatanaka 8483a6c47d Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.

llvm-svn: 154341
2012-04-09 20:32:12 +00:00
Rafael Espindola 8f62b3248e Pattern match a setcc of boolean value with 0 as a truncate.
llvm-svn: 154322
2012-04-09 16:06:03 +00:00
Craig Topper 9c3da316ec Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.
llvm-svn: 154309
2012-04-09 07:19:09 +00:00
Craig Topper e5893f64e8 Remove unnecessary 'else' on an 'if' that always returns
llvm-svn: 154308
2012-04-09 05:59:53 +00:00
Craig Topper e3ad4834ae Optimize code slightly. No functionality change.
llvm-svn: 154307
2012-04-09 05:55:33 +00:00
Craig Topper 5894fe430a Replace some explicit checks with asserts for conditions that should never happen.
llvm-svn: 154305
2012-04-09 05:16:56 +00:00
Craig Topper 6148fe65e8 Optimize code a bit. No functional change intended.
llvm-svn: 154299
2012-04-08 23:15:04 +00:00
Benjamin Kramer bb6ff08766 Silence sign-compare warning.
llvm-svn: 154297
2012-04-08 19:04:45 +00:00
Duncan Sands 2f1dc3814b Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.

llvm-svn: 154296
2012-04-08 18:08:12 +00:00
Craig Topper c8e2d91a58 Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently.
llvm-svn: 154295
2012-04-08 17:53:33 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Craig Topper d024cef233 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper e09d1c5c48 Remove 'else' after 'if' that ends in return.
llvm-svn: 154267
2012-04-07 21:23:41 +00:00
Nadav Rotem 71d07ae5cb 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.

llvm-svn: 154266
2012-04-07 21:19:08 +00:00
Duncan Sands 5f8397a934 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.

llvm-svn: 154265
2012-04-07 20:04:00 +00:00
Jakob Stoklund Olesen 37492eac8c Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

llvm-svn: 154119
2012-04-05 20:30:20 +00:00
Owen Anderson a6eebf6013 Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection.
llvm-svn: 154113
2012-04-05 18:50:32 +00:00
Pete Cooper 8a3dc0ed8c f16 FREM can now be legalized by promoting to f32
llvm-svn: 154039
2012-04-04 19:36:31 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Craig Topper 4c7d995029 Remove default case from switch that was already covering all cases.
llvm-svn: 153996
2012-04-04 04:42:42 +00:00
Pete Cooper e7bff68a5e Removed useless switch for default case when switch was covering all the enum values
llvm-svn: 153984
2012-04-04 00:53:04 +00:00
Pete Cooper 9511ec86f9 Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
llvm-svn: 153976
2012-04-03 22:57:55 +00:00
Chad Rosier 2a02fe1bb2 Fix an issue in SimplifySetCC() specific to vector comparisons.
When folding X == X we need to check getBooleanContents() to determine if the
result is a vector of ones or a vector of negative ones. 

I tried creating a test case, but the problem seems to only be exposed on a
much older version of clang (around r144500).
rdar://10923049

llvm-svn: 153966
2012-04-03 20:11:24 +00:00
Owen Anderson 98f2c0c384 Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
2012-04-02 22:10:29 +00:00
Nadav Rotem 702f080767 Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Nadav Rotem b078350872 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Rafael Espindola 80c540e656 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

llvm-svn: 153817
2012-03-31 18:14:00 +00:00
Bill Wendling 9f829f1cc4 If we have a VLA that has a "use" in a metadata node that's then used
here but it has no other uses, then we have a problem. E.g.,

  int foo (const int *x) {
    char a[*x];
    return 0;
  }

If we assign 'a' a vreg and fast isel later on has to use the selection
DAG isel, it will want to copy the value to the vreg. However, there are
no uses, which goes counter to what selection DAG isel expects.
<rdar://problem/11134152>

llvm-svn: 153705
2012-03-30 00:02:55 +00:00
Eric Christopher 24a6298512 More debug output.
llvm-svn: 153571
2012-03-28 07:34:36 +00:00
Chris Lattner 1cc25e8a40 fix what looks like a real logic bug, found by PVS-Studio (part of PR12357)
llvm-svn: 153513
2012-03-27 16:27:21 +00:00
Eric Christopher c1e2dcdb8a Add a debug statement.
llvm-svn: 153428
2012-03-26 06:10:32 +00:00
Hal Finkel 71c2ba3d2e Add the ability to promote legal integer VAARGs. This is required for the PPC64 SVR4 ABI.
llvm-svn: 153372
2012-03-24 03:53:52 +00:00
Evan Cheng 8ab58a21a5 Source order scheduler should not preschedule nodes with multiple uses. rdar://11096639
llvm-svn: 153270
2012-03-22 19:31:17 +00:00
Evan Cheng 79f03e915d Assign node orders to target intrinsics which do not produce results. rdar://11096639
llvm-svn: 153269
2012-03-22 19:29:09 +00:00
Chad Rosier 6a63a74113 [fast-isel] Fold "urem x, pow2" -> "and x, pow2-1". This should fix the 271%
execution-time regression for nsieve-bits on the ARMv7 -O0 -g nightly tester.
This may also improve compile-time on architectures that would otherwise 
generate a libcall for urem (e.g., ARM) or fall back to the DAG selector.
rdar://10810716

llvm-svn: 153230
2012-03-22 00:21:17 +00:00
Jim Grosbach e13adc38d0 Checking a build_vector for an all-ones value.
Type legalization can zero-extend the elements of the build_vector node, so,
for example, we may have an <8 x i8> with i32 elements of value 255. That
should return 'true' for the vector being all ones.

llvm-svn: 153203
2012-03-21 17:48:04 +00:00
Craig Topper aaeae98936 When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend.
llvm-svn: 153078
2012-03-20 05:28:39 +00:00
Eric Christopher 60e01c560a Do everything up to generating code to try to get a register for
a variable. The previous code would break the debug info changing
code invariant. This will regress debug info for arguments where
we elide the alloca created.

Fixes rdar://11066468

llvm-svn: 153074
2012-03-20 01:07:58 +00:00
Eric Christopher 997aaa9237 Untabify.
llvm-svn: 153073
2012-03-20 01:07:56 +00:00
Eric Christopher e5e54c87fa Add another debugging statement here.
llvm-svn: 153072
2012-03-20 01:07:53 +00:00
Eric Christopher 1a06cc9ae6 Use lookUpRegForValue here instead of duplicating the code.
llvm-svn: 153071
2012-03-20 01:07:47 +00:00
Pete Cooper e69be6df4f f16 FDIV can now be legalized by promoting to f32
llvm-svn: 153064
2012-03-19 23:38:12 +00:00
Duncan Sands 3fb2fc6edb Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala.
llvm-svn: 153035
2012-03-19 15:35:44 +00:00
NAKAMURA Takumi a7e57ace28 Revert r152613 (and r152614), "Inline the d'tor and add an anchor instead." for workaround of g++-4.4's miscompilation.
It caused MSP430DAGToDAGISel::SelectIndexedBinOp() to be miscompiled.
When two ReplaceUses()'s are expanded as inline, vtable in base class is stored to latter (ISelUpdater)ISU.

llvm-svn: 152877
2012-03-16 00:01:55 +00:00
Eric Christopher 3390a6e5e3 We actually handle AllocaInst via getRegForValue below just fine.
Part of rdar://8905263

llvm-svn: 152845
2012-03-15 21:33:47 +00:00
Eric Christopher 142820ba8d Add some debugging output into fast isel as well.
llvm-svn: 152844
2012-03-15 21:33:44 +00:00
Eric Christopher be7a1016fc Add another debug statement.
llvm-svn: 152843
2012-03-15 21:33:41 +00:00
Nadav Rotem 6fd1d32c63 When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations.
llvm-svn: 152784
2012-03-15 08:49:06 +00:00
Bill Wendling df170db2f6 Add a xform to the DAG combiner.
Transform:

        (fsub x, (fadd x, y)) -> (fneg y) and
        (fsub x, (fadd y, x)) -> (fneg y)

if 'unsafe math' is specified.
<rdar://problem/7540295>

llvm-svn: 152777
2012-03-15 05:12:00 +00:00
Bill Wendling 618d57310a Insert the debugging instructions in one fell-swoop so that it doesn't call the
expensive "getFirstTerminator" call. This reduces the time of compilation in
PR12258 from >10 minutes to < 10 seconds.

llvm-svn: 152704
2012-03-14 07:14:25 +00:00
Evan Cheng d5f8e5766c Fortify r152675 a bit. Although I'm not able to come up with a test case that would trigger the truncation case.
llvm-svn: 152678
2012-03-13 22:16:11 +00:00
Evan Cheng 7bf83096df DAG combine incorrectly optimize (i32 vextract (v4i16 load $addr), c) to
(i16 load $addr+c*sizeof(i16)) and replace uses of (i32 vextract) with the
i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)).

rdar://11035895

llvm-svn: 152675
2012-03-13 22:00:52 +00:00
Bill Wendling ac499ab244 Add a return type.
llvm-svn: 152614
2012-03-13 05:52:28 +00:00
Bill Wendling 8adb10c8a9 Inline the d'tor and add an anchor instead.
llvm-svn: 152613
2012-03-13 05:51:56 +00:00
Bill Wendling 508a3e5185 Refactor the SelectionDAG's 'dump' methods into their own .cpp file.
No functionality change.

llvm-svn: 152611
2012-03-13 05:47:27 +00:00
Stepan Dyatkovskiy 97b02fc1b3 llvm::SwitchInst
Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default.
Added some notes relative to case iterators.

llvm-svn: 152532
2012-03-11 06:09:17 +00:00
Benjamin Kramer e1e549d617 Give dagcombiner's worklist some inline capacity.
llvm-svn: 152454
2012-03-10 00:23:58 +00:00
Craig Topper 5a4bcc749a Use uint16_t to store instruction implicit uses and defs. Reduces static data.
llvm-svn: 152301
2012-03-08 08:22:45 +00:00
Stepan Dyatkovskiy 5b648afb4d Taken into account Duncan's comments for r149481 dated by 2nd Feb 2012:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html

Implemented CaseIterator and it solves almost all described issues: we don't need to mix operand/case/successor indexing anymore. Base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".

ConstCaseIt is just a read-only iterator.
CaseIt is read-write iterator; it allows to change case successor and case value.

Usage of iterator allows totally remove resolveXXXX methods. All indexing convertions done automatically inside the iterator's getters.

Main way of iterator usage looks like this:
SwitchInst *SI = ... // intialize it somehow

for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
  BasicBlock *BB = i.getCaseSuccessor();
  ConstantInt *V = i.getCaseValue();
  // Do something.
}

If you want to convert case number to TerminatorInst successor index, just use getSuccessorIndex iterator's method.
If you want initialize iterator from TerminatorInst successor index, use CaseIt::fromSuccessorIndex(...) method.

There are also related changes in llvm-clients: klee and clang.

llvm-svn: 152297
2012-03-08 07:06:20 +00:00
Andrew Trick 52226d409b misched preparation: rename core scheduler methods for consistency.
We had half the API with one convention, half with another. Now was a
good time to clean it up.

llvm-svn: 152255
2012-03-07 23:00:49 +00:00
Andrew Trick 60cf03e772 misched preparation: clarify ScheduleDAG and ScheduleDAGInstrs roles.
ScheduleDAG is responsible for the DAG: SUnits and SDeps. It provides target hooks for latency computation.

ScheduleDAGInstrs extends ScheduleDAG and defines the current scheduling region in terms of MachineInstr iterators. It has access to the target's scheduling itinerary data. ScheduleDAGInstrs provides the logic for building the ScheduleDAG for the sequence of MachineInstrs in the current region. Target's can implement highly custom schedulers by extending this class.

ScheduleDAGPostRATDList provides the driver and diagnostics for current postRA scheduling. It maintains a current Sequence of scheduled machine instructions and logic for splicing them into the block. During scheduling, it uses the ScheduleHazardRecognizer provided by the target.

Specific changes:
- Removed driver code from ScheduleDAG. clearDAG is the only interface needed.

- Added enterRegion/exitRegion hooks to ScheduleDAGInstrs to delimit the scope of each scheduling region and associated DAG. They should be used to setup and cleanup any region-specific state in addition to the DAG itself. This is necessary because we reuse the same ScheduleDAG object for the entire function. The target may extend these hooks to do things at regions boundaries, like bundle terminators. The hooks are called even if we decide not to schedule the region. So all instructions in a block are "covered" by these calls.

- Added ScheduleDAGInstrs::begin()/end() public API.

- Moved Sequence into the driver layer, which is specific to the scheduling algorithm.

llvm-svn: 152208
2012-03-07 05:21:52 +00:00
Andrew Trick e932bb77b5 misched preparation: modularize schedule emission.
ScheduleDAG has nothing to do with how the instructions are scheduled.

llvm-svn: 152206
2012-03-07 05:21:44 +00:00
Andrew Trick edee68ce1b misched preparation: modularize schedule printing.
ScheduleDAG will not refer to the scheduled instruction sequence.

llvm-svn: 152205
2012-03-07 05:21:40 +00:00
Andrew Trick 46a58664f7 misched preparation: modularize schedule verification.
ScheduleDAG will not refer to the scheduled instruction sequence.

llvm-svn: 152204
2012-03-07 05:21:36 +00:00
Andrew Trick 7c6c41a56a whitespace
llvm-svn: 152203
2012-03-07 05:21:32 +00:00
Andrew Trick 1b2324d0e8 Cleanup in preparation for misched: Move DAG visualization logic.
Soon, ScheduleDAG will not refer to the BB.

llvm-svn: 152177
2012-03-07 00:18:22 +00:00
Andrew Trick 5297d8df99 whitespace
llvm-svn: 152175
2012-03-07 00:18:15 +00:00
Andrew Trick 0c84efe8dd Cleanup: DAG building is specific to either SD or MI scheduling. Not part of the target interface.
llvm-svn: 152174
2012-03-07 00:18:12 +00:00
Evan Cheng 80893ce5f5 Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation).
llvm-svn: 152162
2012-03-06 23:33:32 +00:00
Owen Anderson 2ee7c4dfc5 Make it possible for a target to mark FSUB as Expand. This requires providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal.
llvm-svn: 152079
2012-03-06 00:29:31 +00:00
Bill Wendling 7cf6db7e3c Fix warnings about adding a bool to a string.
Patch by Sean Silva!

llvm-svn: 152042
2012-03-05 19:29:36 +00:00
Craig Topper 1d32658877 Use uint16_t to store register overlaps to reduce static data.
llvm-svn: 152001
2012-03-04 10:43:23 +00:00
James Molloy f6298e9281 Fix a codegen fault in which log2 or exp2 could be dead-code eliminated even though they could have sideeffects.
Only allow log2/exp2 to be converted to an intrinsic if they are declared "readnone".

llvm-svn: 151807
2012-03-01 14:32:18 +00:00
Benjamin Kramer d05a0c6c42 LegalizeIntegerTypes: Reorder operations in the "big shift by small amount" optimization, making the lives of later passes easier.
llvm-svn: 151722
2012-02-29 13:27:00 +00:00
Evan Cheng 65f9d19c4f Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call.
llvm-svn: 151645
2012-02-28 18:51:51 +00:00
Benjamin Kramer f2e160c665 Fix off-by one in comment.
llvm-svn: 151644
2012-02-28 18:37:06 +00:00
Benjamin Kramer 0c281a7deb LegalizeIntegerTypes: Reenable the large shift with small amount optimization.
To avoid problems with zero shifts when getting the bits that move between words
we use a trick: first shift the by amount-1, then do another shift by one. When
amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32.

Also fix a latent bug that emitted the low and high words in the wrong order
when shifting right.

Fixes PR12113.

llvm-svn: 151637
2012-02-28 17:58:00 +00:00
Daniel Dunbar ee7b899343 Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.
llvm-svn: 151630
2012-02-28 15:36:07 +00:00
Nadav Rotem 1d666099be Code cleanup following CR by Duncan.
llvm-svn: 151627
2012-02-28 14:13:19 +00:00
Nadav Rotem 875e463b19 Fix a bug in the code that builds SDNodes from vector GEPs.
When the GEP index is a vector of pointers, the code that calculated the size
of the element started from the vector type, and not the contained pointer type.
As a result, instead of looking at the data element pointed by the vector, this
code used the size of the vector. This works for 32bit members (on 32bit
systems), but not for other types. Added code to peel the vector type and
added a test.

llvm-svn: 151626
2012-02-28 11:54:05 +00:00
Evan Cheng 87c7b09d8d Some ARM implementaions, e.g. A-series, does return stack prediction. That is,
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.

Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.

rdar://8979299

llvm-svn: 151623
2012-02-28 06:42:03 +00:00
Hal Finkel b9a3d61894 Don't crash when a glue node contains an internal CopyToReg
This is necessary to support the existing ppc lowering code for indirect calls.
Fixes PR12071.

llvm-svn: 151373
2012-02-24 17:53:59 +00:00
Benjamin Kramer 6fe3e3d335 SDAGBuilder: Remove register sets that were never read and prune dead code surrounding it.
llvm-svn: 151364
2012-02-24 14:01:17 +00:00
Pete Cooper 682c76b7d4 Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Eric Christopher da97054114 If the Address of a variable is an argument then treat the entire
variable declaration as an argument because we want that address
anyhow for our debug information.

This seems to fix rdar://9965111, at least we have more debug
information than before and from reading the assembly it appears
to be the correct location.

llvm-svn: 151335
2012-02-24 01:59:08 +00:00
Eric Christopher 219d51d649 Tabs, formatting and long lines oh my!
llvm-svn: 151334
2012-02-24 01:59:01 +00:00
Bill Wendling 38b31619f6 Allow an integer to be converted into an MMX type when it's used in an inline
asm.
<rdar://problem/10106006>

llvm-svn: 151303
2012-02-23 23:25:25 +00:00
Eric Christopher 18c6be7132 More newline cleanups.
llvm-svn: 151235
2012-02-23 03:39:43 +00:00
Eric Christopher 5c45205b79 Add some handy-dandy newlines.
llvm-svn: 151234
2012-02-23 03:39:39 +00:00
Michael J. Spencer 8b98bf2d6b Properly emit _fltused with FastISel. Refactor to share code with SDAG.
Patch by Joe Groff!

llvm-svn: 151183
2012-02-22 19:06:13 +00:00
Craig Topper 760b134ffa Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
llvm-svn: 151134
2012-02-22 05:59:10 +00:00
James Molloy 862fe49c55 Teach the DAGCombiner that certain loadext nodes followed by ANDs can be converted to zeroexts.
llvm-svn: 150957
2012-02-20 12:02:38 +00:00
Eric Christopher 81e2bf2b77 Ignore the lifetime intrinsics in fast-isel.
llvm-svn: 150848
2012-02-17 23:03:39 +00:00
James Molloy 920ae8c642 Remove extraneous #include and spelling mistake introduced in r150669.
llvm-svn: 150670
2012-02-16 09:48:07 +00:00
James Molloy 67b6b11b52 Modify the algorithm when traversing the DAGCombiner's worklist to be O(log N) for all operations. This fixes a horrible worst case with lots of nodes where 99% of the time was being spent in std::remove.
llvm-svn: 150669
2012-02-16 09:17:04 +00:00
Pete Cooper 4dd0963d56 Added hook to let targets custom lower splitting of illegal vectors
llvm-svn: 150550
2012-02-15 00:55:31 +00:00
Nadav Rotem 29984ba033 Fix PR12000. Some vector operations may use scalar operands with types
that are greater than the vector element type. For example BUILD_VECTOR
of type <1 x i1> with a constant i8 operand.
This patch fixes the assertion.

llvm-svn: 150477
2012-02-14 13:06:32 +00:00
Lang Hames 29d6ed6416 Rename getExceptionAddressRegister() to getExceptionPointerRegister() for consistency with setExceptionPointerRegister(...).
llvm-svn: 150460
2012-02-14 04:45:49 +00:00
Bill Wendling 05d6f2ff1e Don't reserve the R0 and R1 registers here. We don't use these registers, and
marking them as "live-in" into a BB ruins some invariants that the back-end
tries to maintain.

llvm-svn: 150437
2012-02-13 23:47:16 +00:00
Jakob Stoklund Olesen 2ceea93dd3 Add register mask support to ScheduleDAGRRList.
The scheduler will sometimes check the implicit-def list on instructions
to properly handle pre-colored DAG edges.

Also check any register mask operands for physreg clobbers.

llvm-svn: 150428
2012-02-13 23:25:24 +00:00
Nadav Rotem 0c65064dbe Fix a bug in DAGCombine for the optimization of BUILD_VECTOR. We cant generate a shuffle node from two vectors of different types.
llvm-svn: 150383
2012-02-13 12:42:26 +00:00
Nadav Rotem 34ca89afa8 This patch addresses the problem of poor code generation for the zext
v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes.
The DAGCombiner has two optimizations that can mitigate the problem. First,
if all of the operands of a BUILD_VECTOR node are extracted from an ZEXT/ANYEXT
nodes, then it is possible to create a new simplified BUILD_VECTOR which uses
UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes.
Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle
vector instruction.

In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be
shuffled into a wide YMM register.

This patch modifes the second optimization and allows the creation of
shuffle vectors even when the newly generated vector and the original vector
from which we extract the values are of different types.

llvm-svn: 150340
2012-02-12 15:05:31 +00:00
Benjamin Kramer bf152d57a4 Put instruction names into an indexed string table on the side, removing a pointer from MCInstrDesc.
Make them accessible through MCInstrInfo. They are only used for debugging purposes so this doesn't
have an impact on performance. X86MCTargetDesc.o goes from 630K to 461K on x86_64.

llvm-svn: 150245
2012-02-10 13:18:44 +00:00
Bill Wendling 0aef16afd5 [unwind removal] Remove all of the code for the dead 'unwind' instruction. There
were no 'unwind' instructions being generated before this, so this is in effect
a no-op.

llvm-svn: 149906
2012-02-06 21:44:22 +00:00
Nadav Rotem 4f4546b73a Add additional documentation to the extract-and-trunc dagcombine optimization.
llvm-svn: 149823
2012-02-05 11:39:23 +00:00
Craig Topper ee4dab5f1f Convert assert(0) to llvm_unreachable
llvm-svn: 149816
2012-02-05 08:31:47 +00:00
Chris Lattner cf9e8f6968 reapply the patches reverted in r149470 that reenable ConstantDataArray,
but with a critical fix to the SelectionDAG code that optimizes copies
from strings into immediate stores: the previous code was stopping reading
string data at the first nul.  Address this by adding a new argument to
llvm::getConstantStringInfo, preserving the behavior before the patch.

llvm-svn: 149800
2012-02-05 02:29:43 +00:00
Chad Rosier 6d68c7cf79 [fast-isel] HandlePHINodesInSuccessorBlocks() can promite i8 and i16 types too.
llvm-svn: 149730
2012-02-04 00:39:19 +00:00
Jakob Stoklund Olesen f650732cab Handle all live physreg defs in the same place.
SelectionDAG has 4 different ways of passing physreg defs to users.
Collect all of the uses at the same time, and pass all of them to
MI->setPhysRegsDeadExcept() to mark the remaining defs dead.

The setPhysRegsDeadExcept() function will soon add the required
implicit-defs to instructions with register mask operands.

llvm-svn: 149708
2012-02-03 20:43:35 +00:00
Nadav Rotem 5399f4d6bf The type-legalizer often scalarizes code. One of the common patterns is extract-and-truncate.
In this patch we optimize this pattern and convert the sequence into extract op of a narrow type.
This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases.

llvm-svn: 149692
2012-02-03 13:18:25 +00:00
Andrew Trick 3441597f84 fix cmake
llvm-svn: 149553
2012-02-01 22:28:29 +00:00
Andrew Trick d06df96a7c VLIW specific scheduler framework that utilizes deterministic finite automaton (DFA).
This new scheduler plugs into the existing selection DAG scheduling framework. It is a top-down critical path scheduler that tracks register pressure and uses a DFA for pipeline modeling.

Patch by Sergei Larin!

llvm-svn: 149547
2012-02-01 22:13:57 +00:00
Stepan Dyatkovskiy 513aaa5691 SwitchInst refactoring.
The purpose of refactoring is to hide operand roles from SwitchInst user (programmer). If you want to play with operands directly, probably you will need lower level methods than SwitchInst ones (TerminatorInst or may be User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed semantics of index inside the getCaseValue method:
getCaseValue(0) means "get first case", not a condition. Use getCondition() if you want to resolve the condition. I propose don't mix SwitchInst case indexing with low level indexing (TI successors indexing, User's operands indexing), since it may be dangerous.
2. By the same reason findCaseValue(ConstantInt*) returns actual number of case value. 0 means first case, not default. If there is no case with given value, ErrorIndex will returned.
3. Added getCaseSuccessor method. I propose to avoid usage of TerminatorInst::getSuccessor if you want to resolve case successor BB. Use getCaseSuccessor instead, since internal SwitchInst organization of operands/successors is hidden and may be changed in any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" was created if you need to level down from SwitchInst to TerminatorInst. It returns TerminatorInst's successor index for given case successor.
4.2 "resolveCaseIndex" converts low level successors index to case index that curresponds to the given successor.

Note: There are also related compatability fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
llvm-svn: 149481
2012-02-01 07:49:51 +00:00
Argyrios Kyrtzidis 17c981a45b Revert Chris' commits up to r149348 that started causing VMCoreTests unit test to fail.
These are:

r149348
r149351
r149352
r149354
r149356
r149357
r149361
r149362
r149364
r149365

llvm-svn: 149470
2012-02-01 04:51:17 +00:00
Chris Lattner 997348e9fe remove the last vestiges of llvm::GetConstantStringInfo, in CodeGen.
llvm-svn: 149356
2012-01-31 05:09:17 +00:00
Chris Lattner 983005f51b rework this logic to not depend on the last argument to GetConstantStringInfo,
which is going away.

llvm-svn: 149348
2012-01-31 04:39:22 +00:00
Bill Wendling 8d9d1a0022 Remove the now-dead llvm.eh.exception and llvm.eh.selector intrinsics.
llvm-svn: 149331
2012-01-31 01:58:48 +00:00
Bill Wendling a4237652d2 Remove the eh.exception and eh.selector intrinsics. Also remove a hack to copy
over the catch information. The catch information is now tacked to the invoke
instruction.

llvm-svn: 149326
2012-01-31 01:46:13 +00:00
Eli Friedman 18a4c31525 Use the correct ShiftAmtTy for creating shifts after legalization. PR11881. Not committing a testcase because I think it will be too fragile.
llvm-svn: 149315
2012-01-31 01:08:03 +00:00
Chris Lattner 0256be96f2 continue making the world safe for ConstantDataVector. At this point,
we should (theoretically optimize and codegen ConstantDataVector as well
as ConstantVector.

llvm-svn: 149116
2012-01-27 03:08:05 +00:00
Chris Lattner cf12970bd0 eliminate the Constant::getVectorElements method. There are better (and
more robust) ways to do what it was doing now.  Also, add static methods
for decoding a ShuffleVector mask.

llvm-svn: 149028
2012-01-26 02:51:13 +00:00
Chris Lattner 47a86bdbe2 use ConstantVector::getSplat in a few places.
llvm-svn: 148929
2012-01-25 06:02:56 +00:00
Chris Lattner 9be59599b3 Use the right method to get the # elements in a CDS.
llvm-svn: 148897
2012-01-25 01:27:20 +00:00
Chris Lattner 00245f420a add more support for ConstantDataSequential
llvm-svn: 148802
2012-01-24 13:41:11 +00:00
David Blaikie 46a9f016c5 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Jakob Stoklund Olesen 9349351d72 Add a RegisterMaskSDNode class.
This SelectionDAG node will be attached to call nodes by LowerCall(),
and eventually becomes a MO_RegisterMask MachineOperand on the
MachineInstr representing the call instruction.

LowerCall() will attach a register mask that depends on the calling
convention.

llvm-svn: 148436
2012-01-18 23:52:12 +00:00
Nadav Rotem 3b8f0cc9fa Fix a bug in the type-legalization of vector integers. When we bitcast one vector type to another, we must not bitcast the result if one type is widened while the other is promoted.
llvm-svn: 148383
2012-01-18 08:33:18 +00:00
Pete Cooper c52eeed310 Fix ISD::REG_SEQUENCE to accept physical registers and change TwoAddressInstructionPass to insert copies for any physical reg operands of the REG_SEQUENCE
llvm-svn: 148377
2012-01-18 04:16:16 +00:00
Nadav Rotem fb6ddee0e9 Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT.
llvm-svn: 148337
2012-01-17 21:44:01 +00:00
Craig Topper 02cb0fb136 Teach DAG combiner to turn a BUILD_VECTOR of UNDEFs into an UNDEF of vector type.
llvm-svn: 148297
2012-01-17 09:09:48 +00:00
Pete Cooper e3d305a206 Changed flag operand of ISD::FP_ROUND to TargetConstant as it should not get checked for legalisation
llvm-svn: 148275
2012-01-17 01:54:07 +00:00
David Blaikie 5d8e42755c Refactor variables unused under non-assert builds (& remove two entirely unused variables).
llvm-svn: 148230
2012-01-16 05:17:39 +00:00
Pete Cooper e85b95d754 Changed intrinsic ID operand to a target constant as its not used in any arithmetic so should not be checked in legalisation
llvm-svn: 148228
2012-01-16 04:08:12 +00:00
Nadav Rotem 57935243bd [AVX] Optimize x86 VSELECT instructions using SimplifyDemandedBits.
We know that the blend instructions only use the MSB, so if the mask is
sign-extended then we can convert it into a SHL instruction. This is a
common pattern because the type-legalizer sign-extends the i1 type which
is used by the LLVM-IR for the condition.

Added a new optimization in SimplifyDemandedBits for SIGN_EXTEND_INREG -> SHL.

llvm-svn: 148225
2012-01-15 19:27:55 +00:00
Benjamin Kramer 339ced4e34 Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen.
llvm-svn: 148218
2012-01-15 13:16:05 +00:00
Benjamin Kramer 5a377e28da DAGCombiner: Deduplicate code.
llvm-svn: 148217
2012-01-15 11:50:43 +00:00
Craig Topper 201c1a3505 Truncate of undef is just undef of smaller size.
llvm-svn: 148205
2012-01-15 01:05:11 +00:00
Evan Cheng fa8326334b DAGCombine's logic for forming pre- and post- indexed loads / stores were being
overly conservative. It was concerned about cases where it would prohibit
folding simple [r, c] addressing modes. e.g.
  ldr r0, [r2]
  ldr r1, [r2, #4]
=>
  ldr r0, [r2], #4
  ldr r1, [r2]
Change the logic to look for such cases which allows it to form indexed memory
ops more aggressively.

rdar://10674430

llvm-svn: 148086
2012-01-13 01:37:24 +00:00
Pete Cooper 99415fea87 Added FPOW, FEXP, FLOG to PromoteNode so that custom actions can be set to Promote for those operations.
Sorry, no test case yet

llvm-svn: 148050
2012-01-12 21:46:18 +00:00
Evan Cheng 09cc429cb1 Allow targets to select source order pre-RA scheduler.
llvm-svn: 148033
2012-01-12 18:27:52 +00:00
Nadav Rotem b5ce6ee835 On AVX, we can load v8i32 at a time. The bug happens when two uneven loads are used.
When we load the v12i32 type, the GenWidenVectorLoads method generates two loads: v8i32 and v4i32 
and attempts to use CONCAT_VECTORS to join them. In this fix I concat undef values to widen 
the smaller value. The test "widen_load-2.ll" also exposes this bug on AVX.

llvm-svn: 147964
2012-01-11 20:19:17 +00:00
Chandler Carruth 55b2cdee26 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
Chandler Carruth f3e8502cc1 Add 'llvm_unreachable' to passify GCC's understanding of the constraints
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.

llvm-svn: 147861
2012-01-10 18:08:01 +00:00
David Blaikie edbb58c577 Remove unnecessary default cases in switches that cover all enum values.
llvm-svn: 147855
2012-01-10 16:47:17 +00:00
Nadav Rotem 61bdf79035 Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.
llvm-svn: 147851
2012-01-10 14:28:46 +00:00
Craig Topper 0515cd41e4 Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X)
llvm-svn: 147733
2012-01-07 18:31:09 +00:00
Craig Topper 43a1bd6ac7 Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc.
llvm-svn: 147728
2012-01-07 09:06:39 +00:00
Chad Rosier 73a3fab480 Add comment.
llvm-svn: 147696
2012-01-06 23:45:47 +00:00
Chandler Carruth e041a30bb9 Prevent a DAGCombine from firing where there are two uses of
a combined-away node and the result of the combine isn't substantially
smaller than the input, it's just canonicalized. This is the first part
of a significant (7%) performance gain for Snappy's hot decompression
loop.

llvm-svn: 147604
2012-01-05 11:05:55 +00:00
Craig Topper f726e15f44 Allow vector shuffle normalizing to use concat vector even if the sources are commuted in the shuffle mask.
llvm-svn: 147527
2012-01-04 09:23:09 +00:00
Craig Topper 279c77b677 Implement VECTOR_SHUFFLE canonicalizations during DAG combine.
llvm-svn: 147525
2012-01-04 08:07:43 +00:00
Chris Lattner 6b77a07f75 Turn a few more inline asm errors into "emitErrors" instead of fatal errors.
Before we'd get:

$ clang t.c 
fatal error: error in backend: Invalid operand for inline asm constraint 'i'!

Now we get:

$ clang t.c
t.c:16:5: error: invalid operand for inline asm constraint 'i'!
    "movq         (%4), %%mm0\n"
    ^

Which at least gets us the inline asm that is the problem.

llvm-svn: 147502
2012-01-03 23:51:01 +00:00
Nadav Rotem 1e7dda13c8 Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted.
llvm-svn: 147484
2012-01-03 22:12:28 +00:00
Owen Anderson fcc041eabf Remove the restriction that target intrinsics can only involve legal types. Targets can perfects well support intrinsics on illegal types, as long as they are prepared to perform custom expansion during type legalization. For example, a target where i64 is illegal might still support the i64 intrinsic operation using pairs of i32's. ARM already does some expansions like this for non-intrinsic operations.
llvm-svn: 147472
2012-01-03 20:09:02 +00:00
Elena Demikhovsky 8ec21a2801 Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147445
2012-01-03 11:59:04 +00:00
Rafael Espindola d3df940169 Revert 147399. It broke CodeGen/ARM/vext.ll.
llvm-svn: 147400
2012-01-01 17:36:23 +00:00
Elena Demikhovsky 67f80c3432 Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147399
2012-01-01 16:22:47 +00:00
Nadav Rotem 3c3dd6e588 PR11662.
Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.

llvm-svn: 147309
2011-12-28 13:08:20 +00:00
Eli Friedman e96286cdf2 Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
llvm-svn: 147283
2011-12-26 22:49:32 +00:00
Nadav Rotem c1faeac410 Fix a typo in the widening of vectors in PromoteIntRes. Patch by Shemer Anat.
llvm-svn: 147272
2011-12-25 20:01:38 +00:00
Dylan Noblesmith 9e5b178ecc drop unneeded config.h includes
llvm-svn: 147197
2011-12-22 23:04:07 +00:00
Jakub Staszak 96f8c551e3 Add some constantness to BranchProbabilityInfo and BlockFrequnencyInfo.
llvm-svn: 146986
2011-12-20 20:03:10 +00:00
David Blaikie a379b18173 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Dan Gohman 94580ab375 Add basic generic CodeGen support for half.
llvm-svn: 146927
2011-12-20 00:02:33 +00:00
Joerg Sonnenberger d6cb7649d8 Allow inlining of functions with returns_twice calls, if they have the
attribute themselve.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Devang Patel 7bbc1e56f5 Update DebugLoc while merging nodes at -O0.
Patch by Kyriakos Georgiou!

llvm-svn: 146670
2011-12-15 18:21:18 +00:00
Eli Friedman 2ec824966d Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570.
llvm-svn: 146630
2011-12-15 02:07:20 +00:00
Owen Anderson e7f329fa7a Enable synthesis of FLOG2 and FEXP2 SelectionDAG nodes from libm calls. These are already marked as illegal by default.
llvm-svn: 146623
2011-12-15 00:54:12 +00:00
Eli Friedman 6512cd4366 Add missing cases to SDNode::getOperationName(). Patch by Micah Villmow.
llvm-svn: 146548
2011-12-14 02:28:54 +00:00
Chad Rosier b941674aa4 [fast-isel] Remove SelectInsertValue() as fast-isel wasn't designed to handle
instructions that define aggregate types.

llvm-svn: 146492
2011-12-13 17:45:06 +00:00
Chandler Carruth 637cc6a8aa Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.

Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.

Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.

llvm-svn: 146466
2011-12-13 01:56:10 +00:00
Chad Rosier 2f8347e0b6 [fast-isel] Guard "exhastive" fast-isel output with -fast-isel-verbose2.
llvm-svn: 146453
2011-12-13 00:05:11 +00:00
Daniel Dunbar 27a7489a03 LLVMBuild: Remove trailing newline, which irked me.
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Chad Rosier 3168cabef1 [fast-isel] SelectInsertValue seems to be causing miscompiles for ARM. Disable while I investigate.
llvm-svn: 146331
2011-12-10 21:27:40 +00:00
Chad Rosier f70174b869 Typo.
llvm-svn: 146327
2011-12-10 19:48:51 +00:00
Chad Rosier dd998ff4df [fast-isel] Add support for selecting insertvalue.
rdar://10530851

llvm-svn: 146276
2011-12-09 20:09:54 +00:00
Eli Friedman 053a724483 Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514.
llvm-svn: 146219
2011-12-09 01:16:26 +00:00
Owen Anderson bb15fec2b8 Enhance both TargetLibraryInfo and SelectionDAGBuilder so that the latter can use the former to prevent the formation of libm SDNode's when -fno-builtin is passed.
llvm-svn: 146193
2011-12-08 22:15:21 +00:00
Chad Rosier 0464869922 Add rather verbose stats for fast-isel failures.
llvm-svn: 146186
2011-12-08 21:37:10 +00:00
Owen Anderson 0b9b9da6c8 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Nadav Rotem 26edb291ac Fix a bug in the integer-promotion of bitcast operations on vector types.
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.

llvm-svn: 146150
2011-12-08 13:10:01 +00:00
Eli Friedman d5c173fad0 Make sure we correctly set LiveRegGens when a call is unscheduled. <rdar://problem/10460321>. No testcase because this is very sensitive to scheduling.
llvm-svn: 146087
2011-12-07 22:24:28 +00:00
Eli Friedman 0bdc083e21 Fix an assertion in the scheduler. PR11386. No testcase included because it's rather delicate.
llvm-svn: 146083
2011-12-07 22:06:02 +00:00
Nick Lewycky d63851eb93 These global variables aren't thread-safe, STATISTIC is. Andy Trick tells me
that he isn't using these any more, so just delete them.

llvm-svn: 146076
2011-12-07 21:35:59 +00:00
Evan Cheng 7f8e563a69 Add bundle aware API for querying instruction properties and switch the code
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.

For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.

llvm-svn: 146026
2011-12-07 07:15:52 +00:00
Eli Friedman f9081a8afe Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target.
llvm-svn: 146015
2011-12-07 03:55:52 +00:00
Eli Friedman 0e58cba286 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
llvm-svn: 145996
2011-12-07 00:11:56 +00:00
Evan Cheng 2a81dd4a3c First chunk of MachineInstr bundle support.
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs

llvm-svn: 145975
2011-12-06 22:12:01 +00:00
Nadav Rotem 3924cb0267 Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Chad Rosier 46addb9e07 If fast-isel fails, remove dead instructions generated during the failed
attempt.  

llvm-svn: 145425
2011-11-29 19:40:47 +00:00
Daniel Dunbar 539d0a8a09 build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Eli Friedman e7ab1a2f0f Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Evan Cheng 4a5b2040e2 Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively returns zero when the GV does not specify an alignment nor is it
initialized. Previously it returns ABI alignment for type of the GV. However, if
the type is a "packed" type, then the under-specified alignments is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
Evan Cheng a4b6404cf0 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Chad Rosier 61e8d1026f 80-column.
llvm-svn: 145267
2011-11-28 19:59:09 +00:00
Bill Wendling 5ebc95ff4c Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.
llvm-svn: 145263
2011-11-28 19:23:13 +00:00
Chandler Carruth e2530dc889 Fix an obvious omission in the SelectionDAGBuilder where we were
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060
2011-11-22 11:37:46 +00:00
Chad Rosier f83ab704e4 When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
Eli Friedman ff1eaa7578 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Chad Rosier ff40b1e164 Add fast-isel stats to determine who's doing all the work, the
target-independent selector or the target-specific selector.

llvm-svn: 144833
2011-11-16 21:05:28 +00:00
Chad Rosier cfd0d10e72 Fix the stats collection for fast-isel. The failed count was only accounting
for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., over
states misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832
2011-11-16 21:02:08 +00:00
Eli Friedman 87f92512c3 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Eli Friedman d257a464d1 Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.
llvm-svn: 144767
2011-11-16 02:43:15 +00:00
Owen Anderson ca2f78a95b Rename MVT::untyped to MVT::Untyped to match similar nomenclature.
llvm-svn: 144747
2011-11-16 01:02:57 +00:00
Chad Rosier 291ce47db7 GEPs with all zero indices are trivially coalesced by fast-isel. For example,
%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730
2011-11-15 23:34:05 +00:00
Pete Cooper 7c7ba1baa1 Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Benjamin Kramer 1f97a5a671 Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer 4c93d15f09 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Jay Foad 70679df664 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Eli Friedman 9d448e4a42 Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029.
llvm-svn: 144438
2011-11-12 00:35:34 +00:00
Eli Friedman 1347715644 Some cleanup and bulletproofing for node replacement in LegalizeDAG. To maintain LegalizeDAG invariants, whenever we a node is replaced, we must attempt to delete it, and if it still
has uses after it is replaced (which can happen in rare cases due to CSE), we must revisit it.

llvm-svn: 144432
2011-11-11 23:58:27 +00:00
Evan Cheng d33b2d6b7a Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
instruction lower optimization" in the pre-RA scheduler.

The optimization, rather the hack, was done before MI use-list was available.
Now we should be able to implement it in a better way, perhaps in the
two-address pass until a MI scheduler is available.

Now that the scheduler has to backtrack to handle call sequences. Adding
artificial scheduling constraints is just not safe. Furthermore, the hack
is not taking all the other scheduling decisions into consideration so it's just
as likely to pessimize code. So I view disabling this optimization goodness
regardless of PR11314.

llvm-svn: 144267
2011-11-10 07:43:16 +00:00
Eli Friedman 53218b6fcc Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319.
llvm-svn: 144216
2011-11-09 22:25:12 +00:00
Duncan Sands 635e4efca0 Speculatively revert commit 144124 (djg) in the hope that the 32 bit
dragonegg self-host buildbot will recover (it is complaining about object
files differing between different build stages).  Original commit message:

Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144188
2011-11-09 14:20:48 +00:00
Dan Gohman a4bc6171a5 Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144124
2011-11-08 21:29:06 +00:00
Lang Hames b85fcd07df Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported.
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.

llvm-svn: 144102
2011-11-08 18:56:23 +00:00
Pete Cooper 82cd9e81fc Added invariant field to the DAG.getLoad method and changed all calls.
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses

llvm-svn: 144100
2011-11-08 18:42:53 +00:00
Eli Friedman f2a9bd4b1e Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up.

llvm-svn: 144055
2011-11-08 01:25:24 +00:00
Eli Friedman a35a5295e0 Revert r144034 while I try to track down a crash.
llvm-svn: 144044
2011-11-07 23:53:20 +00:00
Eli Friedman 55a86d32d3 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
llvm-svn: 144034
2011-11-07 22:51:10 +00:00
Richard Osborne 561fac4d4e Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV()
and TargetLowering::BuildUDIV(). Fixes PR11283

llvm-svn: 143964
2011-11-07 17:09:05 +00:00
Dan Gohman 198b7ffc11 Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660
2011-11-03 21:49:52 +00:00
Daniel Dunbar bf9bba47a1 build: Add initial cut at LLVMBuild.txt files.
llvm-svn: 143634
2011-11-03 18:53:17 +00:00
Bill Wendling 645eadac67 An array of chars of length 8 will also cause the stack protector to be inserted
into the function. Reflect that here so that the array will be placed next to
the SP.
<rdar://problem/10128329>

llvm-svn: 143590
2011-11-02 23:20:58 +00:00
Nadav Rotem f310361a7d Cleanup. Document. Make sure that this build_vector optimization only runs before the op legalizer and that the used type is legal.
llvm-svn: 143358
2011-10-31 20:08:25 +00:00
Benjamin Kramer a4eba41b7a Silence compiler warning.
llvm-svn: 143308
2011-10-30 08:39:55 +00:00
Nadav Rotem bf6568b5d6 Add a new DAGCombine optimization for BUILD_VECTOR.
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.

llvm-svn: 143297
2011-10-29 21:23:04 +00:00
Dan Gohman 9b9c970148 Revert r143206, as there are still some failing tests.
llvm-svn: 143262
2011-10-29 00:41:52 +00:00
Dan Gohman 73057ad24f Reapply r143177 and r143179 (reverting r143188), with scheduler
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.

llvm-svn: 143206
2011-10-28 17:55:38 +00:00
Duncan Sands 225a7037d6 Speculatively disable Dan's commits 143177 and 143179 to see if
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.

llvm-svn: 143188
2011-10-28 09:55:57 +00:00
Dan Gohman 0e8d1454b1 Delete #if 0 code accidentally left in.
llvm-svn: 143179
2011-10-28 01:41:21 +00:00
Dan Gohman 4db3f7dd83 Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

llvm-svn: 143177
2011-10-28 01:29:32 +00:00
Eli Friedman e9e356ad6b Don't crash on 128-bit sdiv by constant. Found by inspection.
llvm-svn: 143095
2011-10-27 02:06:39 +00:00
Lang Hames 58dba012b6 Rename NonScalarIntSafe to something more appropriate.
llvm-svn: 143080
2011-10-26 23:50:43 +00:00
Duncan Sands dce448c642 Simplify SplitVecRes_UnaryOp by removing all the code that is
trying to legalize the operand types when only the result type
is required to be legalized - the type legalization machinery
will get round to the operands later if they need legalizing.
There can be a point to legalizing operands in parallel with
the result: when this saves compile time or results in better
code.  There was only one case in which this was true: when
the operand is also split, so keep the logic for that bit.
As a result of this change, additional operand legalization
methods may need to be introduced to handle nodes where the
result and operand types can differ, like SIGN_EXTEND, but
the testsuite doesn't contain any tests where this is the case.
In any case, it seems better to require such methods (and die
with an assert if they doesn't exist) than to quietly produce
wrong code if we forgot to special case the node in
SplitVecRes_UnaryOp.

llvm-svn: 143026
2011-10-26 14:11:18 +00:00
Jakob Stoklund Olesen e8261a22f1 Don't use floating point to do an integer's job.
This code makes different decisions when compiled into x87 instructions
because of different rounding behavior.  That caused phase 2/3
miscompares on 32-bit Linux when the phase 1 compiler was built with gcc
(using x87), and the phase 2 compiler was built with clang (using SSE).

This fixes PR11200.

llvm-svn: 143006
2011-10-26 01:47:48 +00:00
Eli Friedman 3e9ef907e0 Remove a couple redundant checks.
llvm-svn: 142959
2011-10-25 20:34:22 +00:00
Douglas Gregor 0cc574eee7 Really unbreak CMake build
llvm-svn: 142822
2011-10-24 18:10:52 +00:00
Douglas Gregor d0800fda95 Unbreak CMake build
llvm-svn: 142821
2011-10-24 18:09:23 +00:00
Dan Gohman e2ff95e327 Delete the top-down "Latency" scheduler. Top-down scheduling doesn't handle
physreg dependencies, and upcoming codegen changes will require proper
physreg dependence handling.

llvm-svn: 142816
2011-10-24 18:01:06 +00:00
Dan Gohman d78fc160cc Delete the Latency scheduling preference.
llvm-svn: 142815
2011-10-24 17:56:48 +00:00
Dan Gohman 4ed1afa51d Change this overloaded use of Sched::Latency to be an overloaded
use of Sched::ILP instead, as Sched::Latency is going away.

llvm-svn: 142813
2011-10-24 17:55:11 +00:00
Dan Gohman c32af340fc Change the default scheduler from Latency to ILP, since Latency
is going away.

llvm-svn: 142810
2011-10-24 17:45:02 +00:00
Nadav Rotem 5e00bb5feb Fix pr11194. When promoting and splitting integers we need to use
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize.

SetCC return type needs to be legalized via PromoteTargetBoolean.

llvm-svn: 142660
2011-10-21 17:35:19 +00:00
Nadav Rotem d315157f12 1. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC type.
2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1.

llvm-svn: 142648
2011-10-21 11:42:07 +00:00
Chandler Carruth 001153784a Remove a now dead function, fixing -Wunused-function warnings from
Clang.

llvm-svn: 142631
2011-10-21 01:23:41 +00:00
Dan Gohman 90fb55237b Delete the list-tdrr scheduler. Top-down schedulers are going away
because they don't support physical register dependencies.

llvm-svn: 142620
2011-10-20 21:44:34 +00:00
Chad Rosier 4236a63c3c Revert r142579, "Fix a type in the legalization of CONCAT_VECTORS". This is
causing one of the unit tests to infinitely loop, which resulted in the 
buildbots stalling.

llvm-svn: 142604
2011-10-20 19:19:10 +00:00
Nadav Rotem fe3969293d Fix a type in the legalization of CONCAT_VECTORS.
llvm-svn: 142579
2011-10-20 13:38:16 +00:00
Nadav Rotem 8824472a25 Improve code generation for vselect on SSE2:
When checking the availability of instructions using the TLI, a 'promoted'
instruction IS available. It means that the value is bitcasted to another type
for which there is an operation. The correct check for the availablity of an
instruction is to check if it should be expanded.

llvm-svn: 142542
2011-10-19 20:43:16 +00:00
Nadav Rotem 6652e22bad Add support for the vector-widening of vselect and vector-setcc
llvm-svn: 142488
2011-10-19 09:45:11 +00:00
Nadav Rotem 75c2229f41 Fix a bug in the legalization of vector anyext-load and trunc-store. Mem Index starts with zero.
llvm-svn: 142434
2011-10-18 22:32:43 +00:00
Bob Wilson 681561901d Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS.
svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands
with illegal types, even before type legalization.  For this testcase, that led
to one BUILD_VECTOR with i16 operands and another with promoted i32 operands,
which triggered the assertion.

llvm-svn: 142370
2011-10-18 17:34:47 +00:00
Duncan Sands d278d35b13 Fix a bunch of unused variable warnings when doing a release
build with gcc-4.6.

llvm-svn: 142350
2011-10-18 12:44:00 +00:00
Hal Finkel bab66789d5 Fix comment to refer to correct instruction
llvm-svn: 142334
2011-10-18 03:51:57 +00:00
Bill Wendling 63a4ea1859 Correct over-zealous removal of hack.
Some code want to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.

llvm-svn: 142221
2011-10-17 18:43:40 +00:00
Bill Wendling 2a83a71c2a Now that we have the ReturnsTwice function attribute, this method is
obsolete. Check the attribute instead.
<rdar://problem/8031714>

llvm-svn: 142212
2011-10-17 18:22:52 +00:00
Chad Rosier c17257c4cb Removed set, but unused variable.
Patch by Joe Abbey <jabbey@arxan.com>.

llvm-svn: 142206
2011-10-17 18:01:59 +00:00
Nadav Rotem 486ff59a9f Enable element promotion type legalization by deafault.
Changed tests which assumed that vectors are legalized by widening them.

llvm-svn: 142152
2011-10-16 20:31:33 +00:00
Benjamin Kramer cc863b2bb6 Let printf do the formatting instead aligning strings ourselves.
While at it, merge some format strings.

llvm-svn: 142140
2011-10-16 16:30:34 +00:00
Nadav Rotem ebe13bc3f1 Move the legalization of vector loads and stores into LegalizeVectorOps. In some
cases we need the second type-legalization pass in order to support all cases.

llvm-svn: 142060
2011-10-15 07:41:10 +00:00
Bill Wendling 2730a0099a Clear out the landing pad to call site map for each function.
This isn't put into the 'clear()' method because the information needs to stick
around (at least for a little bit) after the selection DAG is built.

llvm-svn: 142032
2011-10-15 01:00:26 +00:00
Jim Grosbach 400907cc41 Fix typo. "__sync_fetch_and-xor_4" should be "__sync_fetch_and_xor_4".
Pointed out by George Russell.

llvm-svn: 141956
2011-10-14 15:53:48 +00:00
Jakob Stoklund Olesen 24abd9d9b6 Encode register class constreaints in inline asm instructions.
The inline asm operand constraint is initially encoded in the virtual
register for the operand, but that register class may change during
coalescing, and the original constraint is lost.

Encode the original register class as part of the flag word for each
inline asm operand.  This makes it possible to recover the actual
constraint required by inline asm, just like we can for normal
instructions.

llvm-svn: 141833
2011-10-12 23:37:29 +00:00
Eli Friedman 979009ea61 Use a utility from MathExtras to clarify a check and avoid undefined behavior. Based on patch by Ahmed Charles.
llvm-svn: 141829
2011-10-12 22:46:45 +00:00
Dan Gohman de239d2647 Fix a thinko that Nick noticed. The previous code actually worked as
intended, but only by accident.

llvm-svn: 141779
2011-10-12 15:56:56 +00:00
Jakob Stoklund Olesen 35163e21dc Use an existing function.
llvm-svn: 141763
2011-10-12 01:24:51 +00:00
Eric Christopher 57d1692750 Formatting.
llvm-svn: 141728
2011-10-11 22:59:04 +00:00
Nadav Rotem 3283793c9a Add support for legalization of vector SHL/SRA/SRL instructions
llvm-svn: 141667
2011-10-11 14:36:35 +00:00
Nadav Rotem 198fe81571 Add support for legalization of vector trunc-store where the saved scalar type is illegal (for example, v2i16 on systems where the smallest store size is i32)
llvm-svn: 141661
2011-10-11 11:25:16 +00:00
Nadav Rotem b521b6037b Cleanup the trunc-store legalization code and add asserts.
llvm-svn: 141659
2011-10-11 10:04:25 +00:00
Bill Wendling 7ecfbd90ef Thread the chain through the eh.sjlj.setjmp intrinsic, like it's documented to
do. This will be useful later on with the new SJLJ stuff.

llvm-svn: 141416
2011-10-07 21:25:38 +00:00
Eli Friedman 1456cd20b4 Remove the old atomic instrinsics. autoupgrade functionality is included with this patch.
llvm-svn: 141333
2011-10-06 23:20:49 +00:00
Bill Wendling 267f323d28 Modify the mapping from landing pad to call sites to accept more than one call
site.

llvm-svn: 141226
2011-10-05 22:24:35 +00:00
Bill Wendling e61c62533e Small refactoring. Cache the FunctionInfo->MBB into a local variable.
llvm-svn: 141221
2011-10-05 22:16:11 +00:00
Jakob Stoklund Olesen f7957a9819 Simplify EXTRACT_SUBREG emission.
EXTRACT_SUBREG is emitted as %dst = COPY %src:sub, so there is no need to
constrain the %dst register class.  RegisterCoalescer will apply the
necessary constraints if it decides to eliminate the COPY.

The %src register class does need to be constrained to something with
the right sub-registers, though.  This is currently done manually with
COPY_TO_REGCLASS nodes.  They can possibly be removed after this patch.

llvm-svn: 141207
2011-10-05 20:26:40 +00:00
Jakob Stoklund Olesen 8ff52c4135 Simplify INSERT_SUBREG emission.
The register class created by INSERT_SUBREG and SUBREG_TO_REG must be
legal and support the SubIdx sub-registers.

The new getSubClassWithSubReg() hook can compute that.

This may create INSERT_SUBREG instructions defining a larger register
class than the sub-register being inserted.  That is OK,
RegisterCoalescer will constrain the register class as needed when it
eliminates the INSERT_SUBREG instructions.

llvm-svn: 141198
2011-10-05 18:31:00 +00:00
Bill Wendling 3d11aa7e75 Create a mapping between the landing pad basic block and the call site index for later use.
llvm-svn: 141125
2011-10-04 22:00:35 +00:00
Nadav Rotem 52e8ed9214 Moved type construction out of the loop and added an assert on the legality of the type. Formatted lines to the 80 char limit.
llvm-svn: 140952
2011-10-01 18:39:28 +00:00
Bill Wendling 9925f197cc When inferring the pointer alignment, if the global doesn't have an initializer
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.

For instance, in file A.c:

   struct S s;

In file B.c:
   struct {
     // something long
   };
   extern S s;

   void foo() {
     struct S p = s;
     // ...
   }

this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.

llvm-svn: 140902
2011-09-30 23:19:55 +00:00
Nick Lewycky f40df1d46c Promote comment to doxycomment. Adjust whitespace. No functionality change.
llvm-svn: 140899
2011-09-30 22:19:53 +00:00
Jakob Stoklund Olesen 1352be2bd3 Move getCommonSubClass() into TRI.
It will soon need the context.

llvm-svn: 140896
2011-09-30 22:18:51 +00:00
Eli Friedman 95031ed837 Clean up uses of switch instructions so they are not dependent on the operand ordering. Patch by Stepan Dyatkovskiy.
llvm-svn: 140803
2011-09-29 20:21:17 +00:00
Eric Christopher d299dccf91 Use the local we already set up.
llvm-svn: 140745
2011-09-29 00:50:59 +00:00
Bill Wendling baf3941fde Strip off pointer casts when looking at the eh.sjlj.functioncontext's argument.
llvm-svn: 140678
2011-09-28 03:52:41 +00:00
Bill Wendling 66b110f571 Create and use an llvm.eh.sjlj.functioncontext intrinsic.
This intrinsic is used to pass the index of the function context to the back-end
for further processing. The back-end is in charge of filling in the rest of the
entries.

llvm-svn: 140676
2011-09-28 03:36:43 +00:00
Jim Grosbach af136f71ec Rename AddSelectionDAGCSEId() to addSelectionDAGCSEId().
Naming conventions consistency. No functional change.

llvm-svn: 140636
2011-09-27 20:59:33 +00:00
Nadav Rotem 38b3b83362 Cleanup PromoteIntOp_EXTRACT_VECTOR_ELT and PromoteIntRes_SETCC.
Add a new method: getAnyExtOrTrunc and use it to replace the manual check.

llvm-svn: 140603
2011-09-27 11:16:47 +00:00
Nadav Rotem 1b857d2762 Revert r140463; The patch assumes that <4 x i1> is saved to memory as 4 x i8,
while the decision is to bit-pack small values.

llvm-svn: 140601
2011-09-27 10:48:29 +00:00
Nadav Rotem 2279949129 [vector-select] Address one of the issues in pr10902. EXTRACT_VECTOR_ELEMENT
SDNodes may return values which are wider than the incoming element types. In
this patch we fix the integer promotion of these nodes.

Fixes spill-q.ll when running -promote-elements.

llvm-svn: 140471
2011-09-25 18:59:42 +00:00
Nadav Rotem c2deabd202 Implement Duncan's suggestion to use the result of getSetCCResultType if it is legal
(this is always the case for scalars), otherwise use the promoted result type.

Fix test/CodeGen/X86/vsplit-and.ll when promote-elements is enabled.

llvm-svn: 140464
2011-09-24 19:48:19 +00:00
Nadav Rotem 77426a754b [Vector-Select] Address one of the problems in 10902.
When generating the trunc-store of i1's, we need to use the vector type and not
the scalar type.

This patch fixes the assertion in CodeGen/Generic/bool-vector.ll when
running with -promote-elements.

llvm-svn: 140463
2011-09-24 18:32:19 +00:00
Duncan Sands b461176cfb Tweak the handling of MERGE_VALUES nodes: remove the need for
DecomposeMERGE_VALUES to "know" that results are legalized in
a particular order, by passing it the number of the result
being legalized (the type legalization core provides this, it
just needs to be passed on).

llvm-svn: 140373
2011-09-23 13:59:22 +00:00
Nadav Rotem 57e30726ad Vector-Select: Address one of the problems in pr10902. Add handling for the
integer-promotion of CONCAT_VECTORS.

Test: test/CodeGen/X86/widen_shuffle-1.ll

This patch fixes the above tests (when running in with -promote-elements).

llvm-svn: 140372
2011-09-23 09:33:24 +00:00
Dan Gohman e83e1b2d2c Fix SimplifySelectCC to add newly created nodes to the DAGCombiner
worklist, as it may be possible to perform further optimization on them.

llvm-svn: 140349
2011-09-22 23:01:29 +00:00
Jakob Stoklund Olesen e92e5ee81f Constrain register classes instead of emitting copies.
Sometimes register class constraints are trivial, like GR32->GR32_NOSP,
or GPR->rGPR.  Teach InstrEmitter to simply constrain the virtual
register instead of emitting a copy in these cases.

Normally, these copies are handled by the coalescer.  This saves some
coalescer work.

llvm-svn: 140340
2011-09-22 21:39:34 +00:00
Nadav Rotem bc9ba30158 [VECTOR-SELECT] Address one of the bugs in pr10902.
Vector SetCC result types need to be type-legalized.
This code worked before because scalar result types are known to be legal.

llvm-svn: 140249
2011-09-21 14:34:38 +00:00
Andrew Trick 924123acb3 Lower ARM adds/subs to add/sub after adding optional CPSR operand.
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.

llvm-svn: 140228
2011-09-21 02:20:46 +00:00
Bruno Cardoso Lopes 6cb23f6e7f Add a DAGCombine for subvector extracts to remove useless chains of
subvector inserts and extracts. Initial patch by Rackover, Zvi with
some tweak done by me.

llvm-svn: 140204
2011-09-20 23:19:33 +00:00
Andrew Trick 52363bdbeb Restore hasPostISelHook tblgen flag.
No functionality change. The hook makes it explicit which patterns
require "special" handling. i.e. it self-documents tblgen
deficiencies. I plan to add verification in ExpandISelPseudos and
Thumb2SizeReduce to catch any missing hasPostISelHooks. Otherwise it's
too fragile.

llvm-svn: 140160
2011-09-20 18:22:31 +00:00
Andrew Trick 8586e62d91 ARM isel bug fix for adds/subs operands.
Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the
full gamut of CPSR defs/uses including instructins whose "optional"
cc_out operand is not really optional. This allowed removal of the
hasPostISelHook to simplify the .td files and make the implementation
more robust.
Fixes rdar://10137436: sqlite3 miscompile

llvm-svn: 140134
2011-09-20 03:17:40 +00:00
Andrew Trick 53df4b6dfa whitespace
llvm-svn: 140133
2011-09-20 03:06:13 +00:00
Nadav Rotem 7aaa0aa7a7 white space cleanups
llvm-svn: 139994
2011-09-18 10:29:29 +00:00
Eli Friedman ee8f14a799 Some legalization fixes for atomic load and store.
llvm-svn: 139851
2011-09-15 21:20:49 +00:00
Nadav Rotem d748dbacb0 Add integer promotion support for vselect
llvm-svn: 139692
2011-09-14 14:42:15 +00:00
Eli Friedman f78c6a83ee Fix check for unaligned load/store so it doesn't catch over-aligned load/store.
llvm-svn: 139649
2011-09-13 22:19:59 +00:00
Eli Friedman f1518216fd Error out on CodeGen of unaligned load/store. Fix test so it isn't accidentally testing that case.
llvm-svn: 139641
2011-09-13 20:50:54 +00:00
Nadav Rotem 66dc9ae08d Fix the assertion which checks the size of the input operand.
llvm-svn: 139633
2011-09-13 20:03:38 +00:00
Nadav Rotem 52202fbf2d Add vselect target support for targets that do not support blend but do support
xor/and/or (For example SSE2).

llvm-svn: 139623
2011-09-13 19:17:42 +00:00
Chris Lattner e74e0c8020 tidy up a bit
llvm-svn: 139419
2011-09-09 22:06:59 +00:00
Eli Friedman b7910b79f5 Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897.
llvm-svn: 139407
2011-09-09 21:04:06 +00:00
Devang Patel 9d904e1a97 Directly point debug info to the stack slot of the arugment, instead of trying to keep track of vreg in which it the arugment is copied. The LiveDebugVariable can keep track of variable's ranges.
llvm-svn: 139330
2011-09-08 22:59:09 +00:00
Eli Friedman e978d2f644 Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures for atomic laod/store on ARM.
(The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.)

llvm-svn: 139221
2011-09-07 02:23:42 +00:00
Duncan Sands f2641e1bc1 Add codegen support for vector select (in the IR this means a select
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159
2011-09-06 19:07:46 +00:00
Duncan Sands a098436b32 Split the init.trampoline intrinsic, which currently combines GCC's
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC.  While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function.  To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function.  Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!).  Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC.  Patch mostly by Sanjoy Das.

llvm-svn: 139140
2011-09-06 13:37:06 +00:00
Owen Anderson 40d756eacc Fix a truly heinous bug in DAGCombine related to AssertZext.
If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert.  The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users.  No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem.
Fixes <rdar://problem/10063365>.

llvm-svn: 139059
2011-09-03 00:26:49 +00:00
Dan Gohman 3767be9aee Revert r131152, r129796, r129761. This code is currently considered
to be unreliable on platforms which require memcpy calls, and it is
complicating broader legalize cleanups. It is hoped that these cleanups
will make memcpy byval easier to implement in the future.

llvm-svn: 138977
2011-09-01 23:07:08 +00:00
Andrew Trick 832a6a1909 PreRA scheduler should avoid cloning compares.
Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarilly cloned.
Fixes rdar://problem/5875261

llvm-svn: 138924
2011-09-01 00:54:31 +00:00
Eli Friedman ae1acddb95 Misc cleanup; addresses Duncan's comments on r138877.
llvm-svn: 138887
2011-08-31 20:13:26 +00:00
Eli Friedman e839ecb70b Fill in type legalization for MERGE_VALUES in all the various cases. Patch by Micah Villmow. (No testcase because the issue only showed up in an out-of-tree backend.)
llvm-svn: 138877
2011-08-31 18:36:04 +00:00
Eli Friedman 7c3bdede25 Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM.
llvm-svn: 138872
2011-08-31 18:26:09 +00:00
Evan Cheng e6fba77971 Follow up to r138791.
Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to
call a target hook to adjust the instruction. For ARM, this is used to
adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC
instructions have implicit def of CPSR (required since it now uses CPSR physical
register dependency rather than "glue"). If the carry flag is used, then the
target hook will *fill in* the optional operand with CPSR. Otherwise, the hook
will remove the CPSR implicit def from the MachineInstr.

llvm-svn: 138810
2011-08-30 19:09:48 +00:00
Eli Friedman 452aae6202 Atomic load/store on ARM/Thumb.
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.

I plan on making other targets use the same legalization
ARM-without-memory-barriers is using... it's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.

llvm-svn: 138621
2011-08-26 02:59:24 +00:00
Eli Friedman 342e8df0e0 Basic x86 code generation for atomic load and store instructions.
llvm-svn: 138478
2011-08-24 20:50:09 +00:00
Bill Wendling 4eb0433672 A landingpad instruction is neither folded nor dead.
llvm-svn: 138387
2011-08-23 21:33:05 +00:00
Evan Cheng 6b477b985b Fix 80 col violations.
llvm-svn: 138356
2011-08-23 19:17:21 +00:00
Nick Lewycky 97f73cb449 Be less redundant.
llvm-svn: 138252
2011-08-22 18:26:12 +00:00
Benjamin Kramer 68ed46ce9a Roll back the rest of r126557. It's a hack that will break in some obscure cases.
llvm-svn: 138130
2011-08-19 22:39:31 +00:00
Nick Lewycky c1348074ec Eli points out that this is what report_fatal_error() is for.
llvm-svn: 138091
2011-08-19 21:45:19 +00:00
Nick Lewycky 3f73184d90 This is not actually unreachable, so don't use llvm_unreachable for it. Since
the intent seems to be to terminate even in Release builds, just use abort()
directly.

If program flow ever reaches a __builtin_unreachable (which llvm_unreachable is
#define'd to on newer GCCs) then the program is undefined.

llvm-svn: 138068
2011-08-19 20:14:27 +00:00
Ivan Krasin d7cbd4c518 FastISel: avoid function calls between the materialization of the constant and its use.
llvm-svn: 137993
2011-08-18 22:06:10 +00:00
Bill Wendling 247fd3bf59 Add the support in code-gen for the landingpad instruction lowering.
The landingpad instruction is lowered into the EXCEPTIONADDR and EHSELECTION
SDNodes. The information from the landingpad instruction is harvested by the
'AddLandingPadInfo' function. The new EH uses the current EH scheme in the
back-end. This will change once we switch over to the new scheme. (Reviewed by
Jakob!)

llvm-svn: 137880
2011-08-17 21:56:44 +00:00
Bill Wendling a408e5bf31 Revert patch. Forgot a dependent commit.
llvm-svn: 137875
2011-08-17 21:28:05 +00:00
Bill Wendling 2a521948f0 Add the body of 'visitLandingPad'.
This generates the SDNodes for the new exception handling scheme. It takes the
two values coming from the landingpad instruction and assigns them to the
EXCEPTIONADDR and EHSELECTION nodes.

llvm-svn: 137873
2011-08-17 21:25:14 +00:00
Nadav Rotem b66b866f46 Revert r137562 because it caused PR10674
llvm-svn: 137719
2011-08-16 14:34:29 +00:00
Nadav Rotem 6858b344ed Fix PR 10635. When generating integer constants, the constant element type may
be illegal, even if the requested vector type is legal. Testcase is one of the
disabled ARM tests in the vector-select patch.

llvm-svn: 137562
2011-08-13 20:31:45 +00:00
Bill Wendling fae1475823 Initial commit of the 'landingpad' instruction.
This implements the 'landingpad' instruction. It's used to indicate that a basic
block is a landing pad. There are several restrictions on its use (see
LangRef.html for more detail). These restrictions allow the exception handling
code to gather the information it needs in a much more sane way.

This patch has the definition, implementation, C interface, parsing, and bitcode
support in it.

llvm-svn: 137501
2011-08-12 20:24:12 +00:00
Nadav Rotem 62da15a330 Revert r137310 because it does not optimize any code on ToT
llvm-svn: 137466
2011-08-12 17:15:04 +00:00
Duncan Sands a41634e307 Silence a bunch (but not all) "variable written but not read" warnings
when building with assertions disabled.

llvm-svn: 137460
2011-08-12 14:54:45 +00:00
Nadav Rotem 61140e1028 [AVX] When joining two XMM registers into a YMM register, make sure that the
lower XMM register gets in first. This will allow the SUBREG pattern to
elliminate the first vector insertion. 

llvm-svn: 137310
2011-08-11 16:49:36 +00:00
Chris Lattner 96710b4308 fix PR10605 / rdar://9930964 by adding a pretty scary missed check.
It's somewhat surprising anything works without this.  Before we would
compile the testcase into:

test:                                   # @test
	movl	$4, 8(%rdi)
	movl	8(%rdi), %eax
	orl	%esi, %eax
	cmpl	$32, %edx
	movl	%eax, -4(%rsp)          # 4-byte Spill
	je	.LBB0_2

now we produce:

test:                                   # @test
	movl	8(%rdi), %eax
	movl	$4, 8(%rdi)
	orl	%esi, %eax
	cmpl	$32, %edx
	movl	%eax, -4(%rsp)          # 4-byte Spill
	je	.LBB0_2
llvm-svn: 137303
2011-08-11 06:26:54 +00:00
Devang Patel aab841cf63 Do not drop undef debug values. These are used as range termination marker by live debug variable pass.
llvm-svn: 136834
2011-08-03 23:13:55 +00:00
Eli Friedman 30a49e93e3 New approach to r136737: insert the necessary fences for atomic ops in platform-independent code, since a bunch of platforms (ARM, Mips, PPC, Alpha are the relevant targets here) need to do essentially the same thing.
I think this completes the basic CodeGen for atomicrmw and cmpxchg.

llvm-svn: 136813
2011-08-03 21:06:02 +00:00
Eli Friedman 04c5025cd5 Don't create a ridiculous EXTRACT_ELEMENT. PR10563.
The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this.

llvm-svn: 136711
2011-08-02 18:38:35 +00:00
Bill Wendling f891bf8b30 Add the 'resume' instruction for the new EH rewrite.
This adds the 'resume' instruction class, IR parsing, and bitcode reading and
writing. The 'resume' instruction resumes propagation of an existing (in-flight)
exception whose unwinding was interrupted with a 'landingpad' instruction (to be
added later).

llvm-svn: 136589
2011-07-31 06:30:59 +00:00
Bill Wendling ad088e6724 Revert r136253, r136263, r136269, r136313, r136325, r136326, r136329, r136338,
r136339, r136341, r136369, r136387, r136392, r136396, r136429, r136430, r136444,
r136445, r136446, r136253 pending review.

llvm-svn: 136556
2011-07-30 05:42:50 +00:00
Jakub Staszak 0480a8fbbb Do not lose branch weights when lowering SwitchInst.
llvm-svn: 136529
2011-07-29 22:25:21 +00:00
Jakub Staszak 539db98987 Remove unneeded const_cast.
llvm-svn: 136506
2011-07-29 20:05:36 +00:00
Eli Friedman adec587d5c Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be
working on x86 (at least for trivial testcases); other architectures will
need more work so that they actually emit the appropriate instructions for
orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC,
Mips, and Alpha backends need such changes.)

llvm-svn: 136457
2011-07-29 03:05:32 +00:00
Bill Wendling 7eadbeaf62 Use the pointer type size.
With this, we can now compile a simple EH program.

llvm-svn: 136446
2011-07-29 01:15:29 +00:00
Bill Wendling 6a8cac735a And now something that compiles...
llvm-svn: 136445
2011-07-29 01:11:33 +00:00
Bill Wendling 4b0a365beb Make sure to sext or trunc the result from the register.
llvm-svn: 136444
2011-07-29 01:11:14 +00:00
Chandler Carruth 9d7feab3e0 Rewrite the CMake build to use explicit dependencies between libraries,
specified in the same file that the library itself is created. This is
more idiomatic for CMake builds, and also allows us to correctly specify
dependencies that are missed due to bugs in the GenLibDeps perl script,
or change from compiler to compiler. On Linux, this returns CMake to
a place where it can relably rebuild several targets of LLVM.

I have tried not to change the dependencies from the ones in the current
auto-generated file. The only places I've really diverged are in places
where I was seeing link failures, and added a dependency. The goal of
this patch is not to start changing the dependencies, merely to move
them into the correct location, and an explicit form that we can control
and change when necessary.

This also removes a serialization point in the build because we don't
have to scan all the libraries before we begin building various tools.
We no longer have a step of the build that regenerates a file inside the
source tree. A few other associated cleanups fall out of this.

This isn't really finished yet though. After talking to dgregor he urged
switching to a single CMake macro to construct libraries with both
sources and dependencies in the arguments. Migrating from the two macros
to that style will be a follow-up patch.

Also, llvm-config is still generated with GenLibDeps.pl, which means it
still has slightly buggy dependencies. The internal CMake
'llvm-config-like' macro uses the correct explicitly specified
dependencies however. A future patch will switch llvm-config generation
(when using CMake) to be based on these deps as well.

This may well break Windows. I'm getting a machine set up now to dig
into any failures there. If anyone can chime in with problems they see
or ideas of how to solve them for Windows, much appreciated.

llvm-svn: 136433
2011-07-29 00:14:25 +00:00
Bill Wendling 3cc87682e1 Visit the landingpad instruction.
This generates the correct SDNodes for the landingpad instruction. It makes an
assumption that the result of the landingpad instruction has at least two
values. And that the first value is a pointer to the exception object and the
second value is the "selector."

llvm-svn: 136430
2011-07-28 23:44:58 +00:00
Bill Wendling 7fa7fe6b58 Add the AddLandingPadInfo function.
AddLandingPadInfo takes a landingpad instruction and grabs all of the
information from it that it needs for EH table generation.

llvm-svn: 136429
2011-07-28 23:42:57 +00:00
Eli Friedman c9a551ebed LangRef and basic memory-representation/reading/writing for 'cmpxchg' and
'atomicrmw' instructions, which allow representing all the current atomic
rmw intrinsics.

The allowed operands for these instructions are heavily restricted at the
moment; we can probably loosen it a bit, but supporting general
first-class types (where it makes sense) might get a bit complicated,
given how SelectionDAG works.

As an initial cut, these operations do not support specifying an alignment,
but it would be possible to add if we think it's useful. Specifying an
alignment lower than the natural alignment would be essentially
impossible to support on anything other than x86, but specifying a greater
alignment would be possible.  I can't think of any useful optimizations which
would use that information, but maybe someone else has ideas.

Optimizer/codegen support coming soon.

llvm-svn: 136404
2011-07-28 21:48:00 +00:00
Bill Wendling 4f027233d2 The personality function should be a Function* and not just a Value*.
llvm-svn: 136392
2011-07-28 21:14:13 +00:00
Nadav Rotem 9708aef2dc CR fix: The ANY_EXTEND can be removed because the input and putput type must be
identical.

llvm-svn: 136355
2011-07-28 14:38:46 +00:00
Eli Friedman 26a484852e Code generation for 'fence' instruction.
llvm-svn: 136283
2011-07-27 22:21:52 +00:00
Bill Wendling 6c923bb8d9 Merge the contents from exception-handling-rewrite to the mainline.
This adds the new instructions 'landingpad' and 'resume'.

llvm-svn: 136253
2011-07-27 20:18:04 +00:00
Jeffrey Yasskin 6381c0100b Explicitly cast narrowing conversions inside {}s that will become errors in
C++0x.

llvm-svn: 136211
2011-07-27 06:22:51 +00:00
Dan Gohman 456b1edd0d Revert r136156, which broke several buildbots.
llvm-svn: 136206
2011-07-27 01:10:27 +00:00
Dan Gohman 9eb62cd159 Delete unnecessarily cautious LastCALLSEQ code.
llvm-svn: 136156
2011-07-26 22:00:59 +00:00
Eli Friedman 06b8b571b2 Add obvious missing case to switch. PR10497.
llvm-svn: 136130
2011-07-26 20:38:49 +00:00
Eli Friedman fee02c6c13 Initial implementation of 'fence' instruction, the new C++0x-style replacement for llvm.memory.barrier.
This is just a LangRef entry and reading/writing/memory representation; optimizer+codegen support coming soon.

llvm-svn: 136009
2011-07-25 23:16:38 +00:00
Eli Friedman cbd3ba91b7 Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476.
llvm-svn: 135993
2011-07-25 22:25:42 +00:00
Eli Friedman 6ed783228d PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS.
llvm-svn: 135595
2011-07-20 18:14:33 +00:00
Devang Patel 9ab3cac694 Revert r135423.
llvm-svn: 135454
2011-07-19 00:28:24 +00:00
Jeffrey Yasskin 7a16288157 Add APInt(numBits, ArrayRef<uint64_t> bigVal) constructor to prevent future ambiguity
errors like the one corrected by r135261.  Migrate all LLVM callers of the old
constructor to the new one.

llvm-svn: 135431
2011-07-18 21:45:40 +00:00
Devang Patel 4dc76f2438 During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
[take 2]

llvm-svn: 135423
2011-07-18 20:55:23 +00:00
Chris Lattner 229907cd11 land David Blaikie's patch to de-constify Type, with a few tweaks.
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Nadav Rotem 76d51c6c89 Minor code cleanups
llvm-svn: 135362
2011-07-17 19:05:00 +00:00
Dan Gohman 945864d6dc LegalizeDAG doesn't need its own copy of this enum.
llvm-svn: 135320
2011-07-15 22:51:43 +00:00
Dan Gohman e49e74261a Delete LegalizeDAG's own version of isTypeLegal and getTypeAction
and just use the ones from TargetLowering directly.

llvm-svn: 135318
2011-07-15 22:39:09 +00:00
Dan Gohman 8c5ca645ce Delete an unused variable and a redundant assert.
llvm-svn: 135311
2011-07-15 22:19:02 +00:00
Dan Gohman ad94608b1f Modernize comments.
llvm-svn: 135305
2011-07-15 21:42:20 +00:00
Eric Christopher 92464be28c Check register class matching instead of width of type matching
when determining validity of matching constraint. Allow i1
types access to the GR8 reg class for x86.

Fixes PR10352 and rdar://9777108

llvm-svn: 135180
2011-07-14 20:13:52 +00:00
Nadav Rotem 771f29677f [VECTOR-SELECT]
During type legalization we often use the SIGN_EXTEND_INREG SDNode.
When this SDNode is legalized during the LegalizeVector phase, it is
scalarized because non-simple types are automatically marked to be expanded.
In this patch we add support for lowering SIGN_EXTEND_INREG manually.
This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements'
flag.

llvm-svn: 135144
2011-07-14 11:11:14 +00:00
Nadav Rotem db213c0400 Add assertion for the chain value type
llvm-svn: 135143
2011-07-14 10:37:54 +00:00
Benjamin Kramer 15cd5a3f12 Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient.
llvm-svn: 135126
2011-07-14 01:38:42 +00:00
Eric Christopher d6300d2956 Add a dag combine pattern for folding C2-(A+C1) -> (C2-C1)-A
Fixes rdar://9761830

llvm-svn: 135123
2011-07-14 01:12:15 +00:00
Jay Foad 57aa636794 Convert InsertValueInst and ExtractValueInst APIs to use ArrayRef.
llvm-svn: 135040
2011-07-13 10:26:04 +00:00
Cameron Zwarich f03fa189ca Add an intrinsic and codegen support for fused multiply-accumulate. The intent
is to use this for architectures that have a native FMA instruction.

llvm-svn: 134742
2011-07-08 21:39:21 +00:00
Benjamin Kramer 2bb8b26aa8 Apparently we can't expect a BinaryOperator here.
Should fix llvm-gcc selfhost.

llvm-svn: 134699
2011-07-08 12:08:24 +00:00
Benjamin Kramer 9960a25006 Emit a more efficient magic number multiplication for exact sdivs.
We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building.

  struct foo { char x[24]; };
  long bar(struct foo *a, struct foo *b) { return a-b; }
is now compiled into
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  sarl	$3, %eax
  imull	$-1431655765, %eax, %eax
instead of
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  movl	$715827883, %ecx
  imull	%ecx
  movl	%edx, %eax
  shrl	$31, %eax
  sarl	$2, %edx
  addl	%eax, %edx
  movl	%edx, %eax

llvm-svn: 134695
2011-07-08 10:31:30 +00:00
Eric Christopher 6a6d8fc7fd Remove a FIXME. All of the standard ones are in the list.
llvm-svn: 134647
2011-07-07 22:29:03 +00:00
Lang Hames 5a00499e87 Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. The
hasPredecessorHelper function allows predecessors to be cached to speed up
repeated invocations. This fixes PR10186.

X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X)

Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with
empty Visited and Worklist sets (i.e. no caching over invocations).

Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited
and Worklist to speed up repeated calls. The Visited set is searched for X
before going to the worklist to further search the DAG if necessary.

llvm-svn: 134592
2011-07-07 04:31:51 +00:00
Eric Christopher ea336c797c Grammar and 80-col.
llvm-svn: 134555
2011-07-06 22:41:18 +00:00
Jakub Staszak 3f158fdf6e Introduce "expect" intrinsic instructions.
llvm-svn: 134516
2011-07-06 18:22:43 +00:00
Evan Cheng 0d639a28aa Rename TargetSubtarget to TargetSubtargetInfo for consistency.
llvm-svn: 134259
2011-07-01 21:01:15 +00:00
Eric Christopher f81292ba3b Remove getRegClassForInlineAsmConstraint and all dependencies.
Fixes rdar://9643582

llvm-svn: 134123
2011-06-30 01:20:03 +00:00
Devang Patel 0eada03216 Revert r133953 for now.
llvm-svn: 134116
2011-06-29 23:50:13 +00:00
Benjamin Kramer 8665f8d916 Revert a part of r126557 which could create unschedulable DAGs.
llvm-svn: 134067
2011-06-29 13:47:25 +00:00
Evan Cheng 8264e272a9 Sink SubtargetFeature and TargetInstrItineraries (renamed MCInstrItineraries) into MC.
llvm-svn: 134049
2011-06-29 01:14:12 +00:00
Evan Cheng 6cc775f905 - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.

llvm-svn: 134021
2011-06-28 19:10:37 +00:00
Devang Patel 4dc034df1d During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
llvm-svn: 133953
2011-06-27 22:32:04 +00:00
Evan Cheng 8d71a75777 More refactoring. Move getRegClass from TargetOperandInfo to TargetInstrInfo.
llvm-svn: 133944
2011-06-27 21:26:13 +00:00
Owen Anderson b0a5a1ee29 The index stored in the RegDefIter is one after the current index. When getting the index, decrement it so that it points to the current element. Fixes an off-by-one bug encountered when trying to make use of MVT::untyped.
llvm-svn: 133923
2011-06-27 18:34:12 +00:00
Andrew Trick 31f25bc66f pre-RA-sched: Cleanup register pressure tracking.
Removed the check that peeks past EXTRA_SUBREG, which I don't think
makes sense any more. Intead treat it as a normal register def. No
significant affect on x86 or ARM benchmarks.

llvm-svn: 133917
2011-06-27 18:01:20 +00:00
Jakob Stoklund Olesen 537a302d1a Distinguish early clobber output operands from clobbered registers.
Both become <earlyclobber> defs on the INLINEASM MachineInstr, but we
now use two different asm operand kinds.

The new Kind_Clobber is treated identically to the old
Kind_RegDefEarlyClobber for now, but x87 floating point stack inline
assembly does care about the difference.

This will pop a register off the stack:

  asm("fstp %st" : : "t"(x) : "st");

While this will pop the input and push an output:

  asm("fst %st" : "=&t"(r) : "t"(x));

We need to know if ST0 was a clobber or an output operand, and we can't
depend on <dead> flags for that.

llvm-svn: 133902
2011-06-27 04:08:33 +00:00
Owen Anderson 99adfec0b1 The scheduler needs to be aware on the existence of untyped nodes when it performs type propagation for EXTRACT_SUBREG.
llvm-svn: 133838
2011-06-24 23:02:22 +00:00
Devang Patel f071d72c44 Handle debug info for i128 constants.
llvm-svn: 133821
2011-06-24 20:46:11 +00:00
Jay Foad 83be361b8a Replace the existing forms of ConstantArray::get() with a single form
that takes an ArrayRef.

llvm-svn: 133615
2011-06-22 09:24:39 +00:00
Owen Anderson d1955e78b4 Fix some trailing issues from my introduction of MVT::untyped and its use for REGISTER_SEQUENCE.
llvm-svn: 133567
2011-06-21 22:54:23 +00:00
Evan Cheng 4c0bd9629d Teach dag combine to match halfword byteswap patterns.
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
   => (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
   => (rotl (bswap x) 16)

This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.

rdar://9609108

llvm-svn: 133503
2011-06-21 06:01:08 +00:00
Nadav Rotem d34ce4344b Fix PromoteIntRes_TRUNCATE: Add support for cases where the
source vector type is to be split while the target vector is to be promoted.
(eg: <4 x i64> -> <4 x i8> )

llvm-svn: 133424
2011-06-20 07:15:58 +00:00
Nadav Rotem 94d67a02e0 Code cleanups: Remove duplicated logic in PromotInteRes_BITCAST, reserve vector space, reuse types.
llvm-svn: 133389
2011-06-19 10:49:57 +00:00
Nadav Rotem 35d600d9f4 Calls to AssertZext and getZeroExtendInReg must be made using scalar types.
llvm-svn: 133388
2011-06-19 10:22:39 +00:00
Nadav Rotem 36896bfd0c When promoting the vector elements in CopyToParts, use vector trunc
instead of scalarizing, and doing an element-by-element truncat.

llvm-svn: 133382
2011-06-19 08:49:38 +00:00
Benjamin Kramer e1fc29b6ac Don't allocate empty read-only SmallVectors during SelectionDAG deallocation.
llvm-svn: 133348
2011-06-18 13:13:44 +00:00
Benjamin Kramer 25e17b0f89 Remove unused but set variables.
llvm-svn: 133347
2011-06-18 11:09:41 +00:00
Eric Christopher e4a1266a9a Fix UMULO support for 2x register width to allow the full
range without a libcall to a new mulo<mode> libcall
that we'd have to create.

Finishes the rest of rdar://9090077 and rdar://9210061

llvm-svn: 133318
2011-06-18 00:09:57 +00:00
Eric Christopher 232431c389 Fix comment.
llvm-svn: 133307
2011-06-17 22:35:59 +00:00
Eric Christopher 5bbb2bdb46 Lower multiply with overflow checking to __mulo<mode>
calls if we haven't been able to lower them any
other way.

Fixes rdar://9090077 and rdar://9210061

llvm-svn: 133288
2011-06-17 20:41:29 +00:00
Jakob Stoklund Olesen c826df9506 Don't use register classes larger than TLI->getRegClassFor(VT).
In Thumb mode we cannot handle GPR virtual registers, even though some
instructions can. When isel is lowering a CopyFromReg, it should limit
itself to subclasses of getRegClassFor(VT).

<rdar://problem/9624323>

llvm-svn: 133210
2011-06-16 22:50:38 +00:00
Jakub Staszak 12a43bdde5 Introduce MachineBranchProbabilityInfo class, which has similar API to
BranchProbabilityInfo (expect setEdgeWeight which is not available here).
Branch Weights are kept in MachineBasicBlocks. To turn off this analysis
set -use-mbpi=false.

llvm-svn: 133184
2011-06-16 20:22:37 +00:00
Owen Anderson 5fc8b77f83 Change the REG_SEQUENCE SDNode to take an explict register class ID as its first operand. This operand is lowered away by the time we reach MachineInstrs, so the actual register-allocation handling of them doesn't need to change.
This is intended to support using REG_SEQUENCE SDNode's with type MVT::untyped, and is part of the long road to eliminating some of the hacks we currently use to support register pairs and other strange constraints, particularly on ARM NEON.

llvm-svn: 133178
2011-06-16 18:17:13 +00:00
Jakob Stoklund Olesen 1f641d577e Add TargetRegisterInfo::getRawAllocationOrder().
This virtual function will replace allocation_order_begin/end as the one
to override when implementing custom allocation orders. It is simpler to
have one function return an ArrayRef than having two virtual functions
computing different ends of the same array.

Use getRawAllocationOrder() in place of allocation_order_begin() where
it makes sense, but leave some clients that look like they really want
the filtered allocation orders from RegisterClassInfo.

llvm-svn: 133170
2011-06-16 17:42:25 +00:00
Nick Lewycky 6d677cfdd8 Add a DAGCombine for (ext (binop (load x), cst)).
llvm-svn: 133124
2011-06-16 01:15:49 +00:00
Owen Anderson 96adc4a540 Add a new MVT::untyped. This will be used in future work for modelling ISA features like register pairs and lists with "interesting" constraints (such as ARM NEON contiguous register lists or even-odd paired registers). We need to be able to generate these instructions (often from intrinsics), but don't want to have to assign a legal type to them. Instead, we'll use an "untyped" edge to bypass the type-checking and simply ensure that the register classes match.
llvm-svn: 133106
2011-06-15 23:35:18 +00:00
Andrew Trick 3013b6ae4a Added -stress-sched flag in the Asserts build.
Added a test case for handling physreg aliases during pre-RA-sched.

llvm-svn: 133063
2011-06-15 17:16:12 +00:00
Nadav Rotem 13cb7736a7 getZeroExtendInReg needs to get a scalar type
llvm-svn: 133057
2011-06-15 14:37:18 +00:00
Nadav Rotem d2d9bdb2b0 Enable the simplification of truncating-store after fixing the usage of
GetDemandBits (which must operate on the vector element type).

Fix the a usage of getZeroExtendInReg which must also be done on scalar types.

llvm-svn: 133052
2011-06-15 11:19:12 +00:00
Chad Rosier 818e116723 When pattern matching during instruction selection make sure shl x,1 is not
converted to add x,x if x is a undef.  add undef, undef does not guarantee
that the resulting low order bit is zero.
Fixes <rdar://problem/9453156> and <rdar://problem/9487392>.

llvm-svn: 133022
2011-06-14 22:29:10 +00:00
Nadav Rotem 10193c830b Add a testcase for checking the integer-promotion of many different vector
types (with power of two types such as 8,16,32 .. 512).

Fix a bug in the integer promotion of bitcast nodes. Enable integer expanding
only if the target of the conversion is an integer (when the type action is
scalarize).

Add handling to the legalization of vector load/store in cases where the saved
vector is integer-promoted.

llvm-svn: 132985
2011-06-14 08:11:52 +00:00
Nadav Rotem 571ae19af7 Disable trunc-store simplification on vectors.
llvm-svn: 132984
2011-06-14 07:18:26 +00:00
Bruno Cardoso Lopes dc9ff3a4b1 Add one more argument to the prefetch intrinsic to indicate whether it's a data
or instruction cache access. Update the targets to match it and also teach
autoupgrade.

llvm-svn: 132976
2011-06-14 04:58:37 +00:00
Nadav Rotem 573ee374a2 Fix a bug in FindMemType. When widening vector loads, use a wider memory type
only if the number of packed elements is a power of two.
Bug found in Duncan's testcase.

llvm-svn: 132923
2011-06-13 18:13:24 +00:00
Nadav Rotem 504cf0cde2 Fix a bug in the calculation of the vectorTypeBreakdown into registers. Odd
types such as i33 were rounded to i32. Originated from Duncan's testcase.

llvm-svn: 132893
2011-06-12 14:56:55 +00:00
Nadav Rotem 083837e729 Improve the generated code by getCopyFromPartsVector for promoted integer types.
Instead of scalarizing, and doing an element-by-element truncat, use vector
truncate.
Add support for scalarization of vectors:  i8 -> <1 x i1> (from Duncan's
testcase).

llvm-svn: 132892
2011-06-12 14:49:38 +00:00
Chad Rosier 79044dbebf Revert r132871.
llvm-svn: 132872
2011-06-11 02:27:46 +00:00
Chad Rosier 5793b53027 Typo.
llvm-svn: 132871
2011-06-11 02:16:36 +00:00
Eric Christopher eb964516c3 80-col cleanups.
llvm-svn: 132863
2011-06-10 23:05:08 +00:00
Eli Friedman 1877ac9937 Change this DAGCombine to build AND of SHR instead of SHR of AND; this matches the ordering we prefer in instcombine. Part of rdar://9562809.
The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now.

llvm-svn: 132809
2011-06-09 22:14:44 +00:00
Eric Christopher 0713a9d8fc Add a parameter to CCState so that it can access the MachineFunction.
No functional change.

Part of PR6965

llvm-svn: 132763
2011-06-08 23:55:35 +00:00
Andrew Trick 6ed0c63559 Remove a temporary test case probe in CheckForLiveRegDef.
llvm-svn: 132751
2011-06-08 15:19:49 +00:00
Andrew Trick 0af2e47310 Fix a merge bug in preRAsched for handling physreg aliases.
I've been sitting on this long enough trying to find a test case. I
think the fix should go in now, but I'll keep working on the test case.

llvm-svn: 132701
2011-06-07 00:38:12 +00:00
Nadav Rotem c807fa5687 Add methods to support the integer-promotion of vector types. Methods to
legalize SDNodes such as BUILD_VECTOR, EXTRACT_VECTOR_ELT, etc.

llvm-svn: 132689
2011-06-06 20:55:56 +00:00
Stuart Hastings bee6fcc5aa Avoid FGETSIGN of 80-bit types. Fixes PR10085.
llvm-svn: 132681
2011-06-06 16:44:31 +00:00
Eli Friedman bd375f1a3f PR10077: fix fast-isel of extractvalue of aggregate constants.
llvm-svn: 132676
2011-06-06 05:46:34 +00:00
Nadav Rotem 06bd6d304e TypeLegalizer: Add support for passing of vector-promoted types in registers (copyFromParts/copyToParts).
llvm-svn: 132649
2011-06-04 20:58:08 +00:00
Nadav Rotem 78d19bebe6 TypeLegalizer: Fix a bug in the promotion of elements of integer vectors.
(only happens when using the -promote-elements option).

The correct legalization order is to first try to promote element. Next, we try
to widen vectors.

llvm-svn: 132648
2011-06-04 20:32:01 +00:00
Eric Christopher fbff0e4f26 Add a TODO about memory operands.
llvm-svn: 132559
2011-06-03 17:21:23 +00:00
Eric Christopher de9399bf76 Have LowerOperandForConstraint handle multiple character constraints.
Part of rdar://9119939

llvm-svn: 132510
2011-06-02 23:16:42 +00:00
Rafael Espindola aa318ae495 Revert 132424 to fix PR10068.
llvm-svn: 132479
2011-06-02 19:57:47 +00:00
Jakob Stoklund Olesen aff1060207 Use TRI::has{Sub,Super}ClassEq() where possible.
No functional change.

llvm-svn: 132455
2011-06-02 05:43:46 +00:00
Stuart Hastings 7adc95f69e Recommit 132404 with fixes. rdar://problem/5993888
llvm-svn: 132424
2011-06-01 21:33:14 +00:00
Eric Christopher 690030c116 Allow bitcasts between valid types of the same size and vector
types if the vector type is legal.

Fixes rdar://9306086

llvm-svn: 132420
2011-06-01 19:55:10 +00:00
Nadav Rotem 22ad9bb7d9 Refactor LegalizeTypes: Erase LegalizeAction and make the type legalizer use
the TargetLowering enum.

llvm-svn: 132418
2011-06-01 19:47:10 +00:00
Stuart Hastings 3ae49c03a4 Fix double FGETSIGN to work on x86_32; followup to 132396.
rdar://problem/5660695

llvm-svn: 132411
2011-06-01 18:32:25 +00:00
Stuart Hastings fd5ecd0cec Turn on FGETSIGN for x86. Followup to 132388. rdar://problem/5660695
llvm-svn: 132396
2011-06-01 14:04:17 +00:00
Nadav Rotem 8b24a731f2 This patch is another step in the direction of adding vector select. In this
patch we add a flag to enable a new type legalization decision - to promote
integer elements in vectors. Currently, the rest of the codegen does not support
this kind of legalization.  This flag will be removed when the transition is
complete.

llvm-svn: 132394
2011-06-01 12:51:46 +00:00
Nadav Rotem d86c1c41fb Refactor the type legalizer. Switch TargetLowering to a new enum - LegalizeTypeAction.
This patch does not change the behavior of the type legalizer. The codegen
produces the same code.
This infrastructural change is needed in order to enable complex decisions
for vector types (needed by the vector-select patch).

llvm-svn: 132263
2011-05-28 17:57:14 +00:00
Nadav Rotem a9effb13dd Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place. Re-apply 131534 and fix the multi-step promotion of integers.

llvm-svn: 132217
2011-05-27 21:03:13 +00:00
Eli Friedman c70355195c Rewrite fast-isel integer cast handling to handle more cases, and to be simpler and more consistent.
The practical effects here are that x86-64 fast-isel can now handle trunc from i8 to i1, and ARM fast-isel can handle many more constructs involving integers narrower than 32 bits (including loads, stores, and many integer casts).

rdar://9437928 .

llvm-svn: 132099
2011-05-25 23:49:02 +00:00
Devang Patel 84b64a3e92 Remove unused statistical counter.
llvm-svn: 132087
2011-05-25 21:55:40 +00:00
Devang Patel 5de2375db8 Remove dead code.
llvm-svn: 131974
2011-05-24 18:27:52 +00:00
Evan Cheng 88f9137fd7 - Teach SelectionDAG::isKnownNeverZero to return true (op x, c) when c is
non-zero.
- Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension
  when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero.

rdar://9490949

llvm-svn: 131948
2011-05-24 01:48:22 +00:00
Devang Patel efec7715ec Revert 121907 (it causes llc crash) and apply original patch from PR9817.
llvm-svn: 131926
2011-05-23 22:04:42 +00:00
Devang Patel 7992883811 Preserve debug info during iSel by keeping DanglingDebugInfoMap live until end of function.
Patch by Micah Villmow

llvm-svn: 131908
2011-05-23 17:44:13 +00:00
Devang Patel c4d9a84159 While replacing all uses of a SDValue with another value, do not forget to transfer SDDbgValue.
llvm-svn: 131907
2011-05-23 17:35:08 +00:00
Chris Lattner 68254fcbca Eliminate some temporary variables, and don't call getByValTypeAlignment
when we're just going to throw the result away.  No functionality change.

llvm-svn: 131880
2011-05-22 23:23:02 +00:00
Benjamin Kramer 2fd48f2730 Implement mulo x, 2 -> addo x, x in DAGCombiner.
llvm-svn: 131800
2011-05-21 18:31:55 +00:00
Cameron Zwarich 2af60abad8 Fix PR9955 by only attaching load memory operands to load instructions and
similarly for stores. Now "make check" passes with the MachineVerifier forced
on with the VerifyCoalescing option!

llvm-svn: 131705
2011-05-19 23:44:34 +00:00
Stuart Hastings 4a4e5a2b55 Update some currently-disabled code, preparing for eventual use.
llvm-svn: 131663
2011-05-19 18:48:20 +00:00
Duncan Sands 3d9407f4eb Revert commit 131534 since it seems to have broken several buildbots.
Original log entry:
Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place.

llvm-svn: 131536
2011-05-18 14:57:56 +00:00
Nadav Rotem c5c27ede55 Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place.

llvm-svn: 131534
2011-05-18 12:26:38 +00:00
Eli Friedman e9692808b7 Make fast-isel miss counting in -stats and -fast-isel-verbose take terminators into account; since there are many fewer isel misses with recent changes, misses caused by terminators are more significant.
llvm-svn: 131502
2011-05-17 23:02:10 +00:00
Dan Gohman abffc991dc Misc. code cleanups.
llvm-svn: 131497
2011-05-17 22:22:52 +00:00
Dan Gohman 4298df6d86 Misc. code cleanups.
llvm-svn: 131495
2011-05-17 22:20:36 +00:00
Dan Gohman d282f46c6b Delete unused variables.
llvm-svn: 131430
2011-05-16 22:19:54 +00:00
Dan Gohman d4d12d14b5 Trim #includes.
llvm-svn: 131429
2011-05-16 22:14:50 +00:00
Dan Gohman ae9b1685a8 Fix whitespace and 80-column violations.
llvm-svn: 131428
2011-05-16 22:09:53 +00:00
Jim Grosbach e85c0dde7a Track how many insns fast-isel successfully selects as well as how many it
misses.

llvm-svn: 131426
2011-05-16 21:51:07 +00:00
Devang Patel 8e60ff11db Preserve debug info for unused zero extended boolean argument.
Radar 9422775.

llvm-svn: 131422
2011-05-16 21:24:05 +00:00
Eli Friedman a4d4a0162d Make fast-isel work correctly s/uadd.with.overflow intrinsics.
llvm-svn: 131420
2011-05-16 21:06:17 +00:00
Eli Friedman 4c08bb450a Fix silly typo.
llvm-svn: 131419
2011-05-16 20:34:53 +00:00
Eli Friedman 9ac944774f Basic fast-isel of extractvalue. Not too helpful on its own, given the IR clang generates for cases like this, but it should become more useful soon.
llvm-svn: 131417
2011-05-16 20:27:46 +00:00
Rafael Espindola 2050af838d Don't do tail calls in a function that call setjmp. The stack might be
corrupted when setjmp returns again.

llvm-svn: 131399
2011-05-16 03:05:33 +00:00
Eli Friedman 8f1e11cde9 Fix a FIXME by moving the fast-isel implementation of the objectsize intrinsic from the x86 code to the generic code.
llvm-svn: 131332
2011-05-14 00:47:51 +00:00
Rafael Espindola e53b7d1a11 Make codegen able to handle values of empty types. This is one way
to fix PR9900. I will keep it open until sable is able to comment on it.

llvm-svn: 131294
2011-05-13 15:18:06 +00:00
Stuart Hastings aa02c0847d Since I can't reproduce the failures from 131261, re-trying with a
simplified version.  <rdar://problem/9298790>

llvm-svn: 131274
2011-05-13 00:51:54 +00:00
Stuart Hastings 8d57d8ea64 Revert 131266 and 131261 due to buildbot complaints.
rdar://problem/9298790

llvm-svn: 131269
2011-05-13 00:15:17 +00:00
Stuart Hastings 89f1b47e3a Non-fast-isel followup to 129634; correctly handle branches controlled
by non-CMP expressions.  The executable test case (129821) would test
this as well, if we had an "-O0 -disable-arm-fast-isel" LLVM-GCC
tester.  Alas, the ARM assembly would be very difficult to check with
FileCheck.

The thumb2-cbnz.ll test is affected; it generates larger code (tst.w
vs. cmp #0), but I believe the new version is correct.
rdar://problem/9298790

llvm-svn: 131261
2011-05-12 23:36:41 +00:00
Nadav Rotem 8a7beb80f0 Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain).
If there is a store after the load node, then there is a chain, which means
that there is another user. Thus, asking hasOneUser would fail. Instead we
ask hasNUsesOfValue on the 'data' value.

llvm-svn: 131183
2011-05-11 14:40:50 +00:00
Bill Wendling 50117f8186 Give the 'eh.sjlj.dispatchsetup' intrinsic call the value coming from the setjmp
intrinsic call. This prevents it from being reordered so that it appears
*before* the setjmp intrinsic (thus making it completely useless).
<rdar://problem/9409683>

llvm-svn: 131174
2011-05-11 01:11:55 +00:00
Eli Friedman 768de0a0f8 Disable my little CopyToReg argument hack with fast-isel. rdar://problem/9413587 .
llvm-svn: 131156
2011-05-10 21:50:58 +00:00
Stuart Hastings 999fa3bf1f Correctly walk through nested and adjacent CALLSEQ_START nodes. No
test case; I've only seen this on a release branch, and I can't get it
to reproduce on trunk.  rdar://problem/7662569

llvm-svn: 131152
2011-05-10 21:20:03 +00:00
Eric Christopher 4480428474 Look through struct wrapped types for inline asm statments.
Patch by Evan Cheng.

llvm-svn: 131093
2011-05-09 20:04:43 +00:00
Duncan Sands 6be291a2cd Indent properly, no functionality change.
llvm-svn: 131082
2011-05-09 08:03:33 +00:00
Evan Cheng d26fc5e013 80 col violations.
llvm-svn: 131015
2011-05-06 20:52:23 +00:00
Eli Friedman 2518f8376d Make the logic for determining function alignment more explicit. No functionality change.
llvm-svn: 131012
2011-05-06 20:34:06 +00:00
Eli Friedman 7a78f66145 Use array_lengthof. No functional change.
llvm-svn: 131008
2011-05-06 19:50:10 +00:00
Owen Anderson 68b6b0efb0 Allow FastISel of three-register-operand instructions.
llvm-svn: 130934
2011-05-05 17:59:04 +00:00
Eli Friedman 441a01a2b8 Avoid extra vreg copies for arguments passed in registers. Specifically, this can make MachineCSE more effective in some cases (especially in small functions). PR8361 / part of rdar://problem/8259436 .
llvm-svn: 130928
2011-05-05 16:53:34 +00:00
Eli Friedman fd8c6adffb Small syntax cleanup; we don't need to #define constants in C++. No functionality change intended.
llvm-svn: 130926
2011-05-05 16:25:23 +00:00
Owen Anderson 66fd073974 Other parts of the SelectionDAG framework assume that targets use their pointer type for vector indices. Make the vector unrolling code respect that.
llvm-svn: 130733
2011-05-02 22:25:45 +00:00
Eli Friedman 4105ed1523 Make FastEmit_ri_ try a bit harder to succeed for supported operations; FastEmit_i can fail for non-Thumb2 ARM. Makes ARMSimplifyAddress work correctly, and reduces the number of fast-isel bailouts on non-Thumb ARM.
llvm-svn: 130560
2011-04-29 23:34:52 +00:00
Eli Friedman 33c133919a Fix a silly mistake in r130338.
llvm-svn: 130360
2011-04-28 00:42:03 +00:00
Eli Friedman 406c471b69 Make the fast-isel code for literal 0.0 a bit shorter/faster, since 0.0 is common. rdar://problem/9303592 .
llvm-svn: 130338
2011-04-27 22:41:55 +00:00
Eli Friedman 121d27e9e4 Remove unused function.
llvm-svn: 130337
2011-04-27 22:21:02 +00:00
Evan Cheng 1355bbdd11 Be careful about scheduling nodes above previous calls. It increase usages of
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessen register pressure.

Call operands also has added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.

rdar://9329627

llvm-svn: 130245
2011-04-26 21:31:35 +00:00
Dan Gohman 7da91aee83 Fast-isel support for simple inline asms.
llvm-svn: 130205
2011-04-26 17:18:34 +00:00
Evan Cheng 2f64754031 Fix typo
llvm-svn: 130190
2011-04-26 04:57:37 +00:00
Devang Patel 734f2218ac A dbg.declare may not be in entry block, even if it is referring to an incoming argument. However, It is appropriate to emit DBG_VALUE referring to this incoming argument in entry block in MachineFunction.
llvm-svn: 130129
2011-04-25 16:33:52 +00:00
Jay Foad 1a180156b6 Remove unused STL header includes.
llvm-svn: 130068
2011-04-23 19:53:52 +00:00
Owen Anderson dd450b86cf Teach FastISel to deal with instructions that have two immediate operands.
llvm-svn: 130033
2011-04-22 23:38:06 +00:00
Chris Lattner 6d277517d1 Recommit the fix for rdar://9289512 with a couple tweaks to
fix bugs exposed by the gcc dejagnu testsuite:
1. The load may actually be used by a dead instruction, which
   would cause an assert.
2. The load may not be used by the current chain of instructions,
   and we could move it past a side-effecting instruction. Change
   how we process uses to define the problem away.

llvm-svn: 130018
2011-04-22 21:59:37 +00:00
Benjamin Kramer 341c11da3b DAGCombine: fold "(zext x) == C" into "x == (trunc C)" if the trunc is lossless.
On x86 this allows to fold a load into the cmp, greatly reducing register pressure.
  movzbl	(%rdi), %eax
  cmpl	$47, %eax
->
  cmpb	$47, (%rdi)

This shaves 8k off gcc.o on i386. I'll leave applying the patch in README.txt to Chris :)

llvm-svn: 130005
2011-04-22 18:47:44 +00:00
Daniel Dunbar 6309828206 Revert r1296656, "Fix rdar://9289512 - not folding load into compare at -O0...",
which broke a couple GCC test suite tests at -O0.

llvm-svn: 129914
2011-04-21 16:14:46 +00:00
Eric Christopher bcaedb5ce0 Rewrite the expander for umulo/smulo to remember to sign extend the input
manually and pass all (now) 4 arguments to the mul libcall. Add a new
ExpandLibCall for just this (copied gratuitously from type legalization).

Fixes rdar://9292577

llvm-svn: 129842
2011-04-20 01:19:45 +00:00
Stuart Hastings 468086d5e1 Delete unnecessary variable. <rdar://problem/7662569>
llvm-svn: 129796
2011-04-19 20:09:38 +00:00
Eli Friedman bcd09b3a7f SelectBasicBlock is rather slow even when it doesn't do anything; skip the
unnecessary work where possible.

llvm-svn: 129763
2011-04-19 17:01:08 +00:00
Stuart Hastings 0b68c1219f Support nested CALLSEQ_BEGIN/END; necessary for ARM byval support. <rdar://problem/7662569>
llvm-svn: 129761
2011-04-19 16:16:58 +00:00
Chris Lattner 91328b317b Implement support for x86 fastisel of small fixed-sized memcpys, which are generated
en-mass for C++ PODs.  On my c++ test file, this cuts the fast isel rejects by 10x 
and shrinks the generated .s file by 5%

llvm-svn: 129755
2011-04-19 05:52:03 +00:00
Chris Lattner 48f75ad678 while we're at it, handle 'sdiv exact' of a power of 2 also,
this fixes a few rejects on c++ iterator loops.

llvm-svn: 129694
2011-04-18 07:00:40 +00:00
Chris Lattner 562d6e82bd fix rdar://9297011 - udiv by power of two causing fast-isel rejects
llvm-svn: 129693
2011-04-18 06:55:51 +00:00
Chris Lattner b53ccb8e36 1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.ll
2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts
3. teach tblgen to handle shift immediates that are different sizes than the 
   shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function
   instead of FastEmit_ri to simplify code.

llvm-svn: 129666
2011-04-17 20:23:29 +00:00
Chris Lattner 4832660b4d fix an oversight which caused us to compile the testcase (and other
less trivial things) into a dummy lea.  Before we generated:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	ret

now we produce:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	ret

This is part of rdar://9289558

llvm-svn: 129662
2011-04-17 17:12:08 +00:00
Chris Lattner 045c43855c Fix rdar://9289512 - not folding load into compare at -O0
The basic issue here is that bottom-up isel is matching the branch
and compare, and was failing to fold the load into the branch/compare
combo.  Fixing this (by allowing folding into any instruction of a
sequence that is selected) allows us to produce things like:


cmpb    $0, 52(%rax)
je      LBB4_2

instead of:

movb    52(%rax), %cl
cmpb    $0, %cl
je      LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up
compile time by putting less pressure on the register allocator and 
generating less code.

This was one of the biggest classes of missing load folding.  Implementing
this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm)
line count.

llvm-svn: 129656
2011-04-17 06:35:44 +00:00
Chris Lattner d70ff0d807 split a complex predicate out to a helper function. Simplify two for loops,
which don't need to check for falling off the end of a block *and* end of phi
nodes, since terminators are never phis.

llvm-svn: 129655
2011-04-17 06:03:19 +00:00
Chris Lattner fba7ca63cc fix rdar://9289583 - fast isel should handle non-canonical commutative binops
allowing us to fold the immediate into the 'and' in this case:

int test1(int i) {
  return 8&i;
}

llvm-svn: 129653
2011-04-17 01:16:47 +00:00
Eli Friedman 55b0acd624 PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext.
Returning a new node makes the code try to replace the old node, which
in the included testcase is killed by CSE.

llvm-svn: 129650
2011-04-16 23:25:34 +00:00
Evan Cheng b14ce09fca Fix divmod libcall lowering. Convert to {S|U}DIVREM first and then expand the node to a libcall. rdar://9280991
llvm-svn: 129633
2011-04-16 03:08:26 +00:00
Chris Lattner 0ab5e2cded Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!

llvm-svn: 129558
2011-04-15 05:18:47 +00:00
Owen Anderson a519284fec Fix another instance of the DAG combiner not using the correct type for the RHS of a shift.
llvm-svn: 129522
2011-04-14 17:30:49 +00:00
Andrew Trick bfbd972b1f In the pre-RA scheduler, maintain cmp+br proximity.
This is done by pushing physical register definitions close to their
use, which happens to handle flag definitions if they're not glued to
the branch. This seems to be generally a good thing though, so I
didn't need to add a target hook yet.

The primary motivation is to generate code closer to what people
expect and rule out missed opportunity from enabling macro-op
fusion. As a side benefit, we get several 2-5% gains on x86
benchmarks. There is one regression:
SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is
an independent scheduler bug that will be tracked separately.
See rdar://problem/9283108.

Incidentally, pre-RA scheduling is only half the solution. Fixing the
later passes is tracked by:
<rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump

Fixes:
<rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion

llvm-svn: 129508
2011-04-14 05:15:06 +00:00
Chris Lattner 493b3e72f2 sink a call into its only use.
llvm-svn: 129503
2011-04-14 04:12:47 +00:00
Owen Anderson 9c12834eed During post-legalization DAG combining, be careful to only create shifts where the RHS is of the legal type for the new operation.
llvm-svn: 129484
2011-04-13 23:22:23 +00:00
Andrew Trick b53a00d2cb Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.

Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.

llvm-svn: 129421
2011-04-13 00:38:32 +00:00
Andrew Trick 1b60ad6644 Revert 129383. It causes some targets to hit a scheduler assert.
llvm-svn: 129385
2011-04-12 20:14:07 +00:00
Andrew Trick c5dd24a542 PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make these heuristic adjustments to node latency work,
I also needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.

llvm-svn: 129383
2011-04-12 19:54:36 +00:00
Jay Foad 7c14a558fe Don't include Operator.h from InstrTypes.h.
llvm-svn: 129271
2011-04-11 09:35:34 +00:00
Chris Lattner cfe5aa65d2 Avoid excess precision issues that lead to generating host-compiler-specific code.
Switch lowering probably shouldn't be using FP for this.  This resolves PR9581.

llvm-svn: 129199
2011-04-09 06:57:13 +00:00
Chris Lattner 41c80e89f3 have dag combine zap "store undef", which can be formed during call lowering
with undef arguments.

llvm-svn: 129185
2011-04-09 02:32:02 +00:00
Evan Cheng 74d92c1924 Change -arm-trap-func= into a non-arm specific option. Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time.
llvm-svn: 129152
2011-04-08 21:37:21 +00:00
Andrew Trick 2ad0b37318 Added a check in the preRA scheduler for potential interference on a
induction variable. The preRA scheduler is unaware of induction vars,
so we look for potential "virtual register cycles" instead.

Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing

llvm-svn: 129100
2011-04-07 19:54:57 +00:00
Bill Wendling dd4dcd549b Revamp the SjLj "dispatch setup" intrinsic.
It needed to be moved closer to the setjmp statement, because the code directly
after the setjmp needs to know about values that are on the stack. Also, the
'bitcast' of the function context was causing a dead load. This wouldn't be too
horrible, except that at -O0 it wasn't optimized out, and because it wasn't
using the correct base pointer (if there is a VLA), it would try to access a
value from a garbage address.
<rdar://problem/9130540>

llvm-svn: 128873
2011-04-05 01:37:43 +00:00
Stuart Hastings ad68c93a2d Revert 123704; it broke threaded LLVM.
llvm-svn: 128868
2011-04-05 00:37:28 +00:00
Cameron Zwarich 8c7bbc09e2 Add a RemoveFromWorklist method to DCI. This is needed to do some complicated
transformations in target-specific DAG combines without causing DAGCombiner to
delete the same node twice. If you know of a better way to avoid this (see my
next patch for an example), please let me know.

llvm-svn: 128758
2011-04-02 02:40:26 +00:00
Evan Cheng 8b1bca1998 Add comments.
llvm-svn: 128730
2011-04-01 19:57:01 +00:00
Evan Cheng 8d68ebd42a Assign node order numbers to results of call instruction lowering. This should improve src line debug info when sdisel is used. rdar://9199118
llvm-svn: 128728
2011-04-01 19:42:22 +00:00
Evan Cheng bd76679700 Issue libcalls __udivmod*i4 / __divmod*i4 for div / rem pairs.
rdar://8911343

llvm-svn: 128696
2011-04-01 00:42:02 +00:00
Benjamin Kramer 355ce07425 Turn SelectionDAGBuilder::GetRegistersForValue into a local function.
It couldn't be used outside of the file because SDISelAsmOperandInfo
is local to SelectionDAGBuilder.cpp. Making it a static function avoids
a weird linkage dance.

llvm-svn: 128342
2011-03-26 16:35:10 +00:00
Andrew Trick 3bd8b7a388 Fix for -pre-RA-sched=source.
Yet another case of unchecked NULL node (for physreg copy).
May fix PR9509.

llvm-svn: 128266
2011-03-25 06:40:55 +00:00
Eli Friedman 4c192305bf PR9535: add support for splitting and scalarizing vector ISD::FP_ROUND.
Also cleaning up some duplicated code while I'm here.

llvm-svn: 128176
2011-03-23 22:18:48 +00:00
Andrew Trick 13acae040c Ensure that def-side physreg copies are scheduled above any other uses
so the scheduler can't create new interferences on the copies
themselves. Prior to this fix the scheduler could get stuck in a loop
creating copies.
Fixes PR9509.

llvm-svn: 128164
2011-03-23 20:42:39 +00:00
Andrew Trick a8846e0540 whitespace
llvm-svn: 128163
2011-03-23 20:40:18 +00:00
Andrew Trick b1fd328581 Added block number and name to isel debug output.
I'm tired of doing this manually for each checkout.
If anyone knows a better way debug isel for non-trivial tests feel
free to revert and let me know how to do it.

llvm-svn: 128132
2011-03-23 01:38:28 +00:00
Eric Christopher 1b4b1e559a Grammar-o.
llvm-svn: 128004
2011-03-21 18:06:21 +00:00
Nadav Rotem e7a101ccab Add support for legalizing UINT_TO_FP of vectors on platforms which do
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.

llvm-svn: 127951
2011-03-19 13:09:10 +00:00
Benjamin Kramer cfcea12fe2 BuildUDIV: If the divisor is even we can simplify the fixup of the multiplied value by introducing an early shift.
This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into
	shrl	$2, %edi
	imulq	$613566757, %rdi, %rax
	shrq	$32, %rax
	ret

instead of
	movl    %edi, %eax
	imulq   $613566757, %rax, %rcx
	shrq    $32, %rcx
	subl    %ecx, %eax
	shrl    %eax
	addl    %ecx, %eax
	shrl    $4, %eax

on x86_64

llvm-svn: 127829
2011-03-17 20:39:14 +00:00
Cameron Zwarich 2ef0c69df1 Move more logic into getTypeForExtArgOrReturn.
llvm-svn: 127809
2011-03-17 14:53:37 +00:00
Cameron Zwarich 34e7b3f77e Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn().
llvm-svn: 127807
2011-03-17 14:21:56 +00:00
Cameron Zwarich ac106273d4 The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

llvm-svn: 127766
2011-03-16 22:20:18 +00:00
Cameron Zwarich d1ad9bc277 Don't recompute something that we already have in a local variable.
llvm-svn: 127764
2011-03-16 22:20:07 +00:00
Evan Cheng c5c2cfa381 sext(undef) = 0, because the top bits will all be the same.
zext(undef) = 0, because the top bits will be zero.

llvm-svn: 127649
2011-03-15 02:22:10 +00:00
Evan Cheng 37139edc8c BIT_CONVERT has been renamed to BITCAST.
llvm-svn: 127600
2011-03-14 18:19:52 +00:00
Evan Cheng d2f3b01797 Minor optimization. sign-ext/anyext of undef is still undef.
llvm-svn: 127598
2011-03-14 18:15:55 +00:00
Owen Anderson 66443c034d Teach FastISel to support register-immediate-immediate instructions.
llvm-svn: 127496
2011-03-11 21:33:55 +00:00
Andrew Trick 710d5da306 Replace -dag-chain-limit flag with constant. It has survived a release cycle without being touched, so no longer needs to pollute the hidden-help text.
llvm-svn: 127468
2011-03-11 17:46:59 +00:00
Evan Cheng adb9c03e41 Avoid replacing the value of a directly stored load with the stored value if the load is indexed. rdar://9117613.
llvm-svn: 127440
2011-03-11 00:48:56 +00:00
Evan Cheng b4c6a34415 Re-commit 127368 and 127371. They are exonerated.
llvm-svn: 127380
2011-03-10 00:16:32 +00:00
Evan Cheng d4b3f8e009 Revert 127368 and 127371 for now.
llvm-svn: 127376
2011-03-09 23:53:17 +00:00
Evan Cheng ca9a936332 Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more
flexible.

If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.

llvm-svn: 127368
2011-03-09 22:47:38 +00:00
Andrew Trick 072ed2ee0d Improve pre-RA-sched register pressure tracking for duplicate operands.
This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks

llvm-svn: 127347
2011-03-09 19:12:43 +00:00
Benjamin Kramer b2e4d84305 Fix typo, make helper static.
llvm-svn: 127335
2011-03-09 16:19:12 +00:00
Eric Christopher 7238cba180 Fix some latent bugs if the nodes are unschedulable. We'd gotten away
with this before since none of the register tracking or nightly tests
had unschedulable nodes.

This should probably be refixed with a special default Node that just
returns some "don't touch me" values.

Fixes PR9427

llvm-svn: 127263
2011-03-08 19:35:47 +00:00
Andrew Trick 52b3e38a1f Further improvements to pre-RA-sched=list-ilp.
This change uses the MaxReorderWindow for both height and depth, which
tends to limit the negative effects of high register pressure.

llvm-svn: 127203
2011-03-08 01:51:56 +00:00
Cameron Zwarich df61694417 Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo.
llvm-svn: 127175
2011-03-07 21:56:36 +00:00
Owen Anderson cd526fa15e Use the correct LHS type when determining the legalization of a shift's RHS type.
llvm-svn: 127163
2011-03-07 18:29:47 +00:00
Eric Christopher 9cb33deebf Typo.
llvm-svn: 127131
2011-03-06 21:13:45 +00:00
Andrew Trick dd01732e63 Disable a couple of experimental heuristics to get the best results from the current implementation of -pre-RA-sched=list-ilp.
llvm-svn: 127113
2011-03-06 00:03:32 +00:00
Andrew Trick 25cedf3fe4 Be explicit with abs(). Visual Studio workaround.
llvm-svn: 127075
2011-03-05 10:29:25 +00:00
Andrew Trick d7f4c21684 Fix for -sched-high-latency-cycles in sched=list-ilp mode.
llvm-svn: 127071
2011-03-05 09:18:16 +00:00
Andrew Trick b8390b7a25 Missing comment.
llvm-svn: 127068
2011-03-05 08:04:11 +00:00
Andrew Trick 641e2d4f8c Increased the register pressure limit on x86_64 from 8 to 12
regs. This is the only change in this checkin that may affects the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It is currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.

llvm-svn: 127067
2011-03-05 08:00:22 +00:00
Duncan Sands 6bd1044222 Revert commit 126684 "Use the correct shift amount type". It is only the correct
type after type legalization has completed.  Before then it may simply not be big
enough to hold the shift amount, particularly on x86 which uses a very small type
for shifts (this issue broke stuff in the past which is why LegalizeTypes carefully
uses a large type for shift amounts).

llvm-svn: 127000
2011-03-04 14:28:59 +00:00
Andrew Trick c88b7ecb88 Minor pre-RA-sched fixes and cleanup.
Fix the PendingQueue, then disable it because it's not required for
the current schedulers' heuristics.
Fix the logic for the unused list-ilp scheduler.

llvm-svn: 126981
2011-03-04 02:03:45 +00:00
Bill Wendling f3658f3872 There are times when the landing pad won't have a call to 'eh.selector' in
it. It's been assumed up til now that it would be in its immediate
successor. However, this isn't necessarily the case. It could be in one of its
successor's successors.

Modify the code to more thoroughly check for an 'eh.selector' call in
successors. It only looks at a successor if we get there as a result of an
unconditional branch.

Testcase ObjC/exceptions-4.m in r126968.

llvm-svn: 126969
2011-03-03 23:14:05 +00:00
Eli Friedman d8a555bb3b Revert r123908; the code in question is completely untested and wrong.
llvm-svn: 126964
2011-03-03 22:33:23 +00:00
Bob Wilson 24b3ba5990 Avoid exponential blow-up when printing DAGs.
David Greene changed CannotYetSelect() to print the full DAG including multiple
copies of operands reached through different paths in the DAG.  Unfortunately
this blows up exponentially in some cases.  The depth limit of 100 is way too
high to prevent this -- I'm seeing a message string of 150MB with a depth of
only 40 in one particularly bad case, even though the DAG has less than 200
nodes.  Part of the problem is that the printing code is following chain
operands, so if you fail to select an operation with a chain, the printer will
follow all the chained operations back to the entry node.

llvm-svn: 126899
2011-03-02 23:38:06 +00:00
Stuart Hastings 6b4007dec6 Can't introduce floating-point immediate constants after legalization.
Radar 9056407.

llvm-svn: 126864
2011-03-02 19:36:30 +00:00
Duncan Sands cb95eeecc6 Add a few missed unary cases when legalizing vector results. Put some cases
in alphabetical order.

llvm-svn: 126745
2011-03-01 15:15:43 +00:00
Jim Grosbach 621818ab1a trailing whitespace.
llvm-svn: 126733
2011-03-01 01:39:05 +00:00
Jim Grosbach 1d479dbc55 Generalize the register matching code in DAGISel a bit.
llvm-svn: 126731
2011-03-01 01:37:19 +00:00
Owen Anderson 0dc63104c6 Use the correct shift amount type.
llvm-svn: 126684
2011-02-28 21:10:10 +00:00
Owen Anderson 4f4df81861 Clean whitespace.
llvm-svn: 126683
2011-02-28 20:57:56 +00:00
Duncan Sands f571290d1e Legalize support for fpextend of vector. PR9309.
llvm-svn: 126574
2011-02-27 14:41:27 +00:00
Nadav Rotem b00913028f Fix typos in the comments.
llvm-svn: 126565
2011-02-27 07:40:43 +00:00
Tobias Grosser 3ac8689fa3 Pass the graph to the DOTGraphTraits.getEdgeAttributes().
This follows the interface of getNodeAttributes.

llvm-svn: 126562
2011-02-27 04:11:03 +00:00
Benjamin Kramer 26691d9660 Add some DAGCombines for (adde 0, 0, glue), which are useful to optimize legalized code for large integer arithmetic.
1. Inform users of ADDEs with two 0 operands that it never sets carry
2. Fold other ADDs or ADDCs into the ADDE if possible

It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code.

llvm-svn: 126557
2011-02-26 22:48:07 +00:00
Owen Anderson b2c80da4ae Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS.
llvm-svn: 126518
2011-02-25 21:41:48 +00:00
Jim Grosbach 14a07365cb Fix formatting of debug helper string.
llvm-svn: 126471
2011-02-25 03:59:03 +00:00
Cameron Zwarich 4c82cd21ed Set NumSignBits to 1 if KnownZero/KnownOne are being zero extended. In theory it
is possible to do better if the high bit is set in either KnownZero/KnownOne, but
in practice NumSignBits is always 1 when we are zero extending because nothing
is known about that register.

llvm-svn: 126465
2011-02-25 01:11:01 +00:00
Cameron Zwarich d2f3041c7f We only want to zero extend the existing information if the bit width is
actually larger.

llvm-svn: 126464
2011-02-25 01:10:55 +00:00
Nadav Rotem 502f1b943f Enable support for vector sext and trunc:
Limit the folding of any_ext and sext  into the load operation to scalars.
Limit the active-bits trunc optimization to scalars.
Document vector trunc and vector sext in LangRef.

Similar to commit 126080 (for enabling zext).

llvm-svn: 126424
2011-02-24 21:01:34 +00:00
Cameron Zwarich a62fc89a04 Merge information about the number of zero, one, and sign bits of live-out
registers at phis. This enables us to eliminate a lot of pointless zexts during
the DAGCombine phase. This fixes <rdar://problem/8760114>.

llvm-svn: 126380
2011-02-24 10:00:25 +00:00
Cameron Zwarich 3cf9280214 Add a getNumSignBits() method to APInt.
llvm-svn: 126379
2011-02-24 10:00:20 +00:00
Cameron Zwarich 97eb52da7b Add a mechanism for invalidating the LiveOutInfo of a PHI, and use it whenever
a block is visited before all of its predecessors.

llvm-svn: 126378
2011-02-24 10:00:16 +00:00
Cameron Zwarich 988faf91bd Track blocks visited in reverse postorder.
llvm-svn: 126377
2011-02-24 10:00:13 +00:00
Cameron Zwarich 6470647383 Refactor the LiveOutInfo interface into a few methods on FunctionLoweringInfo
and make the actual map private.

llvm-svn: 126376
2011-02-24 10:00:08 +00:00
Cameron Zwarich b670d512e9 Have isel visit blocks in reverse postorder rather than an undefined order. This
allows for the information propagated across basic blocks to be merged at phis.

llvm-svn: 126375
2011-02-24 10:00:04 +00:00
Cameron Zwarich f8b22b3483 Roll out r126169 and r126170 in an attempt to fix the selfhost bot.
llvm-svn: 126185
2011-02-22 03:24:52 +00:00
Cameron Zwarich 800f85baf9 Merge information about the number of zero, one, and sign bits of live-out registers
at phis. This enables us to eliminate a lot of pointless zexts during the DAGCombine
phase. This fixes <rdar://problem/8760114>.

llvm-svn: 126170
2011-02-22 00:46:27 +00:00
Cameron Zwarich f248f945c8 Have isel visit blocks in reverse postorder rather than an undefined order. This
allows for the information propagated across basic blocks to be merged at phis.

llvm-svn: 126169
2011-02-22 00:46:22 +00:00
Devang Patel f3292b2196 Revert r124611 - "Keep track of incoming argument's location while emitting LiveIns."
In other words, do not keep track of argument's location.  The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body.
This requires some coordination with debugger to get this working. 
 - The debugger needs to be aware of prolog_end attribute attached with line table entries.
 - The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+)

llvm-svn: 126155
2011-02-21 23:21:26 +00:00
Nadav Rotem 25f2ac948b Fix 9267; Add vector zext support.
The DAGCombiner folds the zext into complex load instructions. This patch
prevents this optimization on vectors since none of the supported targets
knows how to perform load+vector_zext in one instruction.

llvm-svn: 126080
2011-02-20 12:37:50 +00:00
Devang Patel b7ae3ccb84 Do not lose debug info of an inlined function argument even if the argument is only used through GEPs.
This time with a fix that avoids using invalidated DenseMap iterator.

llvm-svn: 125984
2011-02-18 22:43:42 +00:00
Cameron Zwarich 0a1a36dc46 Roll out r125794 to help diagnose the llvm-gcc-i386-linux-selfhost failure.
llvm-svn: 125830
2011-02-18 04:58:10 +00:00
Devang Patel f922a431ee Do not lose debug info of an inlined function argument even if the argument is only used through GEPs.
llvm-svn: 125794
2011-02-17 23:33:27 +00:00
Duncan Sands c6196aa481 Fix wrong logic in promotion of signed mul-with-overflow (I pointed this out at
the time but presumably my email got lost).  Examples where the previous logic
got it wrong: (1) a signed i8 multiply of 64 by 2 overflows, but the high part is
zero; (2) a signed i8 multiple of -128 by 2 overflows, but the high part is all
ones. 

llvm-svn: 125748
2011-02-17 12:42:48 +00:00
Stuart Hastings 81c4306005 Swap VT and DebugLoc operands of getExtLoad() for consistency with
other getNode() methods.  Radar 9002173.

llvm-svn: 125665
2011-02-16 16:23:55 +00:00
Eric Christopher e5ca1e0506 Refactor zero folding slightly. Clean up todo.
llvm-svn: 125651
2011-02-16 04:50:12 +00:00
Eric Christopher ef72141a75 The change for PR9190 wasn't quite right. We need to avoid making the
transformation if we can't legally create a build vector of the correct
type. Check that we can make the transformation first, and add a TODO to
refactor this code with similar cases.

Fixes: PR9223 and rdar://9000350
llvm-svn: 125631
2011-02-16 01:10:03 +00:00
Chris Lattner 69229316aa convert ConstantVector::get to use ArrayRef.
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Chris Lattner 34442e6ebf revert my ConstantVector patch, it seems to have made the llvm-gcc
builders unhappy.

llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner d9f5b88548 Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.

llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner eff248ca7f fix PR9210 by implementing some type legalization logic for
vector fp conversions.

llvm-svn: 125482
2011-02-14 06:30:45 +00:00
Chris Lattner eaa8341d3b fix two comment thinkos
llvm-svn: 125481
2011-02-14 06:14:42 +00:00
Chris Lattner 46c01a30f4 Enhance ComputeMaskedBits to know that aligned frameindexes
have their low bits set to zero.  This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.

Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively.  Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST).  The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.

llvm-svn: 125470
2011-02-13 22:25:43 +00:00
Chris Lattner e95d195014 Revisit my fix for PR9028: the issue is that DAGCombine was
generating i8 shift amounts for things like i1024 types.  Add
an assert in getNode to prevent this from occuring in the future,
fix the buggy transformation, revert my previous patch, and
document this gotcha in ISDOpcodes.h

llvm-svn: 125465
2011-02-13 19:09:16 +00:00
Chris Lattner d5f0b1148a when legalizing extremely wide shifts, make sure that
the shift amounts are in a suitably wide type so that
we don't generate out of range constant shift amounts.

This fixes PR9028.

llvm-svn: 125458
2011-02-13 09:10:56 +00:00
Chris Lattner 2a720d933a fix visitShift to properly zero extend the shift amount if the provided operand
is narrower than the shift register.  Doing an anyext provides undefined bits in
the top part of the register.

llvm-svn: 125457
2011-02-13 09:02:52 +00:00
Nadav Rotem db2f54811d A fix for 9165.
The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.

llvm-svn: 125435
2011-02-12 14:40:33 +00:00
Nadav Rotem a49a02a04f SimplifySelectOps can only handle selects with a scalar condition. Add a check
that the condition is not a vector.

llvm-svn: 125398
2011-02-11 19:57:47 +00:00
Nadav Rotem 18f6a33457 Fix #9190
The bug happens when the DAGCombiner attempts to optimize one of the patterns
of the SUB opcode. It tries to create a zero of type v2i64. This type is legal
on 32bit machines, but the initializer of this vector (i64) is target dependent.
Currently, the initializer attempts to create an i64 zero constant, which fails.
Added a flag to tell the DAGCombiner to create a legal zero, if we require that
the pass would generate legal types.

llvm-svn: 125391
2011-02-11 19:20:37 +00:00
Devang Patel 639dd997eb Remove comment about an argument that was removed couple of years ago.
llvm-svn: 125054
2011-02-07 21:58:52 +00:00
Andrew Trick d0548ae750 Introducing a new method of tracking register pressure. We can't
precisely track pressure on a selection DAG, but we can at least keep
it balanced. This design accounts for various interesting aspects of
selection DAGS: register and subregister copies, glued nodes, dead
nodes, unused registers, etc.

Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter.

Note: I disabled PrescheduleNodesWithMultipleUses when register
pressure is enabled, based on no evidence other than I don't think it
makes sense to have both enabled.

llvm-svn: 124853
2011-02-04 03:18:17 +00:00
Andrew Trick 3f924e4e87 whitespace
llvm-svn: 124827
2011-02-03 23:00:17 +00:00
Evan Cheng d42641c6b5 Given a pair of floating point load and store, if there are no other uses of
the load, then it may be legal to transform the load and store to integer
load and store of the same width.

This is done if the target specified the transformation as profitable. e.g.
On arm, this can transform:
vldr.32 s0, []
vstr.32 s0, []

to

ldr r12, []
str r12, []

rdar://8944252

llvm-svn: 124708
2011-02-02 01:06:55 +00:00
Matt Beaumont-Gay 29c8c8fe92 Take Bill Wendling's suggestion for structuring a couple of asserts.
llvm-svn: 124688
2011-02-01 22:12:50 +00:00
Devang Patel 56cc5fdf09 Keep track of incoming argument's location while emitting LiveIns.
llvm-svn: 124611
2011-01-31 21:38:14 +00:00
Richard Osborne 272e084bca Fix bug where ReduceLoadWidth was creating illegal ZEXTLOAD instructions.
llvm-svn: 124587
2011-01-31 17:41:44 +00:00
Benjamin Kramer 946e1522b6 Teach DAGCombine to fold fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2) when c1 equals the amount of bits that are truncated off.
This happens all the time when a smul is promoted to a larger type.

On x86-64 we now compile "int test(int x) { return x/10; }" into
  movslq  %edi, %rax
  imulq $1717986919, %rax, %rax
  movq  %rax, %rcx
  shrq  $63, %rcx
  sarq  $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax"
  addl  %ecx, %eax

This fires 96 times in gcc.c on x86-64.

llvm-svn: 124559
2011-01-30 16:38:43 +00:00
Benjamin Kramer 65bb14d368 Add the missing sub identity "A-(A-B) -> B" to DAGCombine.
This happens e.g. for code like "X - X%10" where we lower the modulo operation
to a series of multiplies and shifts that are then subtracted from X, leading to
this missed optimization.

llvm-svn: 124532
2011-01-29 12:34:05 +00:00
Nick Lewycky 0af77fd45b Fix build with stdcxx by using llvm::next. Patch by Joerg Sonnenberger!
llvm-svn: 124472
2011-01-28 04:00:15 +00:00
Andrew Trick c0ca67601a Remove a temporary workaround for a lencod miscompile. Depends on the fix in r124442.
llvm-svn: 124443
2011-01-27 21:28:51 +00:00
Devang Patel 1cec755494 Speculatively revert r124380.
llvm-svn: 124397
2011-01-27 19:15:01 +00:00
Devang Patel 3b266a2780 While legalizing SDValues do not drop SDDbgValues, trasfer them to new legal nodes.
Take 2. This includes fix for dragonegg crash.

llvm-svn: 124380
2011-01-27 17:43:53 +00:00
Matt Beaumont-Gay a148c59231 Try harder to not have unused variables.
llvm-svn: 124350
2011-01-27 02:39:27 +00:00
Matt Beaumont-Gay 0cddbf2bdf Opt-mode -Wunused-variable cleanup
llvm-svn: 124346
2011-01-27 01:47:50 +00:00
Devang Patel 92b7077f9e Reapply 124301
llvm-svn: 124339
2011-01-27 00:13:27 +00:00
Bill Wendling fb4ee9bbde Initialize variable to get rid of clang warning.
llvm-svn: 124331
2011-01-26 22:21:35 +00:00
Devang Patel b370bf329a Revert 124301.
llvm-svn: 124327
2011-01-26 21:41:22 +00:00
Devang Patel 084e0628e0 Revert r124302
llvm-svn: 124320
2011-01-26 21:12:32 +00:00
David Greene bab5e6ed0e [AVX] Add INSERT_SUBVECTOR and support it on x86. This provides a
default implementation for x86, going through the stack in a similr
fashion to how the codegen implements BUILD_VECTOR.  Eventually this
will get matched to VINSERTF128 if AVX is available.

llvm-svn: 124307
2011-01-26 19:13:22 +00:00
Devang Patel a11210b1b8 While legalizing SDValues do not drop SDDbgValues, trasfer them to new legal nodes.
llvm-svn: 124302
2011-01-26 18:55:05 +00:00
Devang Patel 9d4eb2f480 Process valid SDDbgValues even if the node does not have any order assigned.
llvm-svn: 124301
2011-01-26 18:42:32 +00:00
Devang Patel 1448e7c8b6 Refactor.
llvm-svn: 124300
2011-01-26 18:20:04 +00:00
David Greene b6f1611928 [AVX] Support EXTRACT_SUBVECTOR on x86. This provides a default
implementation of EXTRACT_SUBVECTOR for x86, going through the stack
in a similr fashion to how the codegen implements BUILD_VECTOR.
Eventually this will get matched to VEXTRACTF128 if AVX is available.

llvm-svn: 124292
2011-01-26 15:38:49 +00:00
Devang Patel efc6b16e4b Provide an interface to transfer SDDbgValue from one SDNode to another.
llvm-svn: 124245
2011-01-25 23:27:42 +00:00
Devang Patel 70f8e5962a Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic.
llvm-svn: 124203
2011-01-25 18:09:58 +00:00
Devang Patel 04b649d48a This assertion is too restrictive, it does not apply for dangling dbg value nodes (nodes where dbg.value intrinsic preceds use of the value).
llvm-svn: 124202
2011-01-25 18:09:33 +00:00
Devang Patel 533479544b Speculatively revert r124138.
llvm-svn: 124142
2011-01-24 20:04:37 +00:00
Devang Patel 8cc5355c90 Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic.
llvm-svn: 124138
2011-01-24 19:24:37 +00:00
Andrew Trick a293c49f0d Temporarily workaround JM/lencod miscompile (SIGSEGV).
rdar://problem/8893967

llvm-svn: 124137
2011-01-24 19:08:15 +00:00
Ted Kremenek 3c4408ceb6 Null initialize a few variables flagged by
clang's -Wuninitialized-experimental warning.
While these don't look like real bugs, clang's
-Wuninitialized-experimental analysis is stricter
than GCC's, and these fixes have the benefit
of being general nice cleanups.

llvm-svn: 124073
2011-01-23 17:05:06 +00:00
Andrew Trick bd428ec50f Enable support for precise scheduling of the instruction selection
DAG. Disable using "-disable-sched-cycles".

For ARM, this enables a framework for modeling the cpu pipeline and
counting stalls. It also activates several heuristics to drive
scheduling based on the model. Scheduling is inherently imprecise at
this stage, and until spilling is improved it may defeat attempts to
schedule. However, this framework provides greater control over
tuning codegen.

Although the flag is not target-specific, it should have very little
affect on the default scheduler used by x86. The only two changes that
affect x86 are:
- scheduling a high-latency operation bumps the current cycle so independent
  operations can have their latency covered. i.e. two independent 4
  cycle operations can produce results in 4 cycles, not 8 cycles.
- Two operations with equal register pressure impact and no
  latency-based stalls on their uses will be prioritized by depth before height
  (height is irrelevant if no stalls occur in the schedule below this point).

llvm-svn: 123971
2011-01-21 06:19:05 +00:00
Andrew Trick 47ff14b091 Convert -enable-sched-cycles and -enable-sched-hazard to -disable
flags. They are still not enable in this revision.

Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.

Generalized unit tests to work with sched-cycles.

llvm-svn: 123969
2011-01-21 05:51:33 +00:00
Eric Christopher 37c4a8be72 My editor's indent went crazy. Fix.
llvm-svn: 123909
2011-01-20 08:56:34 +00:00
Eric Christopher 785db078b4 Expand invalid return values for umulo and smulo. Handle these similarly
to add/sub by doing the normal operation and then checking for overflow
afterwards. This generally relies on the DAG handling the later invalid
operations as well.

Fixes the 64-bit part of rdar://8622122 and rdar://8774702.

llvm-svn: 123908
2011-01-20 08:54:28 +00:00
Andrew Trick 2cd1f0beb6 Selection DAG scheduler register pressure heuristic fixes.
Added a check for already live regs before claiming HighRegPressure.
Fixed a few cases of checking the wrong number of successors.
Added some tracing until these heuristics are better understood.

llvm-svn: 123892
2011-01-20 06:21:59 +00:00
Eric Christopher b2139f655b Use only one API at a time.
llvm-svn: 123866
2011-01-20 01:29:23 +00:00
Eric Christopher bb14f65672 If we can, lower the multiply part of a umulo/smulo call to a libcall
with an invalid type then split the result and perform the overflow check
normally.

Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.

llvm-svn: 123864
2011-01-20 00:29:24 +00:00
Jeffrey Yasskin 249fcd4499 Remove unused variables found by gcc-4.6's -Wunused-but-set-variable.
llvm-svn: 123707
2011-01-18 00:51:23 +00:00
Stuart Hastings 4fa832aab0 Remove checking that prevented overlapping CALLSEQ_START/CALLSEQ_END
ranges, add legalizer support for nested calls.  Necessary for ARM
byval support.  Radar 7662569.

llvm-svn: 123704
2011-01-18 00:09:27 +00:00
Benjamin Kramer 45d183ccf0 Fix an off-by-one error in ctpop combining.
llvm-svn: 123664
2011-01-17 18:00:28 +00:00
Benjamin Kramer 24c5184dca Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0.
This shaves off 4 popcounts from the hacked 186.crafty source.

This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.

llvm-svn: 123621
2011-01-17 12:04:57 +00:00
Chris Lattner 2d186574a6 reapply my fix for PR8961 with a tweak to properly handle
multi-instruction sequences like calls.  Many thanks to Jakob for
finding a testcase.

llvm-svn: 123559
2011-01-16 02:27:38 +00:00
Benjamin Kramer bec03ea725 Add an assert so we don't silently miscompile ctpop for bit widths > 128.
llvm-svn: 123549
2011-01-15 21:19:37 +00:00
Benjamin Kramer fff2517edc Reimplement CTPOP legalization with the "best" algorithm from
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old
code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter,
especially when counting 64 bit population on a 32 bit target.

I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.

llvm-svn: 123547
2011-01-15 20:30:30 +00:00
Dan Gohman abac063b7a Delete an assignment to ThisBB which isn't needed, and tidy up some
comments.

llvm-svn: 123479
2011-01-14 22:26:16 +00:00
Andrew Trick 9ccce77893 Support for precise scheduling of the instruction selection DAG,
disabled in this checkin. Sorry for the large diffs due to
refactoring. New functionality is all guarded by EnableSchedCycles.

Scheduling the isel DAG is inherently imprecise, but we give it a best
effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only
  if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to
  latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics
  for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to
  prioritize nodes by their depth rather than height. As long as it
  doesn't stall, height is irrelevant. Depth represents the critical
  path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
  adding them to the available queue.

Related Cleanup: most of the register reduction routines do not need
to be templates.

llvm-svn: 123468
2011-01-14 21:11:41 +00:00
Chris Lattner 3be81e9bd7 Set the insertion point correctly for instructions generated by load folding:
they should go *before* the new instruction not after it. 

llvm-svn: 123420
2011-01-14 01:33:40 +00:00
Dan Gohman 958620dd6d Fix r123346 to handle scalar types too.
llvm-svn: 123352
2011-01-13 01:06:51 +00:00
Dan Gohman 6e017a1134 Apply the patch from PR8958, which allows llc to get slightly
further on the associated testcase before aborting.

llvm-svn: 123346
2011-01-12 23:56:26 +00:00
Eric Christopher 1bb2c00f65 Move ExpandAtomic into the integer expansion routines - it's only used there.
llvm-svn: 123202
2011-01-11 00:36:08 +00:00
Dale Johannesen d2b48119b0 Fix PR 8916 (qv for analysis), at least the immediate problem.
There's an inherent tension in DAGCombine between assuming
that things will be put in canonical form, and the Depth
mechanism that disables transformations when recursion gets
too deep.  It would not surprise me if there's a lot of little
bugs like this one waiting to be discovered.  The mechanism
seems fragile and I'd suggest looking at it from a design viewpoint.

llvm-svn: 123191
2011-01-10 21:53:07 +00:00
Anton Korobeynikov 2f93128109 Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Jakob Stoklund Olesen 2fb5b31578 Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic.
These functions not longer assert when passed 0, but simply return false instead.

No functional change intended.

llvm-svn: 123155
2011-01-10 02:58:51 +00:00
Jakob Stoklund Olesen 1331a15b0c Replace TargetRegisterInfo::printReg with a PrintReg class that also works without a TRI instance.
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.

llvm-svn: 123107
2011-01-09 03:05:53 +00:00
Jakob Stoklund Olesen 793d7b7626 Use an IndexedMap for LiveOutRegInfo to hide its dependence on TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123096
2011-01-08 23:10:50 +00:00
Evan Cheng 6eb516dbea Do not model all INLINEASM instructions as having unmodelled side effects.
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.

This allows memory instructions to be moved around INLINEASM instructions.

llvm-svn: 123044
2011-01-07 23:50:32 +00:00
Bob Wilson 8265d56638 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995
2011-01-07 04:59:04 +00:00
Bob Wilson f291cb268f Change EXTRACT_SUBVECTOR to require a constant index.
We were never generating any of these nodes with variable indices, and there
was one legalizer function asserting on a non-constant index.  If we ever have
a need to support variable indices, we can add this back again.

llvm-svn: 122993
2011-01-07 04:58:56 +00:00
Duncan Sands 61c5708b51 Fix the other problem reported in PR8582. Testcase and patch by
Nadav Rotem.

llvm-svn: 122983
2011-01-06 23:45:22 +00:00
Eric Christopher e516af753b Add some fairly duplicated code to let type legalization split illegal
typed atomics. This will lower exclusively to libcalls at the moment.

llvm-svn: 122979
2011-01-06 22:28:56 +00:00
Evan Cheng 3ae2b79aa3 Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy
etc. takes an option OptSize. If OptSize is true, it would return
the inline limit for functions with attribute OptSize.

llvm-svn: 122952
2011-01-06 06:52:41 +00:00
Evan Cheng c052ba7ff3 Revert r122936. I'll re-implement the change.
llvm-svn: 122949
2011-01-06 06:17:53 +00:00
Evan Cheng 06536e7158 r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing
up freebsd bootloader. However, this doesn't make much sense for Darwin, whose
-Os is meant to optimize for size only if it doesn't hurt performance.
rdar://8821501

llvm-svn: 122936
2011-01-06 01:04:47 +00:00
Evan Cheng ac730dd2d1 Avoid zero extend bit test operands to pointer type if all the masks fit in
the original type of the switch statement key.
rdar://8781238

llvm-svn: 122935
2011-01-06 01:02:44 +00:00
Evan Cheng 260acf32ee Optimize:
r1025 = s/zext r1024, 4
  r1026 = extract_subreg r1025, 4
to:
  r1026 = copy r1024

llvm-svn: 122925
2011-01-05 23:06:49 +00:00
Eric Christopher c673b21a87 80-cols.
llvm-svn: 122909
2011-01-05 21:45:56 +00:00
Eric Christopher 988518109d Remove TODO, these appear to be implemented.
llvm-svn: 122849
2011-01-04 22:31:50 +00:00
Benjamin Kramer 25e6e06e42 Try to reuse the value when lowering memset.
This allows us to compile:
  void test(char *s, int a) {
    __builtin_memset(s, a, 15);
  }
into 1 mul + 3 stores instead of 3 muls + 3 stores.

llvm-svn: 122710
2011-01-02 19:57:05 +00:00
Benjamin Kramer 2fdea4c8f1 Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors.
We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that doesn't support the multiply or it is slow (p4) if someone cares
enough.

Example code:
  void test(char *s, int a) {
      __builtin_memset(s, a, 4);
  }
before:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    movl  %eax, %ecx
    shll  $8, %ecx
    orl %eax, %ecx
    movl  %ecx, %eax
    shll  $16, %eax
    orl %ecx, %eax
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret
after:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    imull $16843009, %eax, %eax   ## imm = 0x1010101
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret

llvm-svn: 122707
2011-01-02 19:44:58 +00:00
Andrew Trick 5ce945ca3a Minor cleanup related to my latest scheduler changes.
llvm-svn: 122545
2010-12-24 07:10:19 +00:00
Andrew Trick c94056692a Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck.
llvm-svn: 122544
2010-12-24 06:46:50 +00:00
Andrew Trick 10ffc2b6c2 Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.

llvm-svn: 122541
2010-12-24 05:03:26 +00:00
Andrew Trick c416ba612b whitespace
llvm-svn: 122539
2010-12-24 04:28:06 +00:00
Chris Lattner 11a33811b6 flags -> glue for selectiondag
llvm-svn: 122509
2010-12-23 17:24:32 +00:00
Chris Lattner f647e95b9a sdisel flag -> glue.
llvm-svn: 122507
2010-12-23 17:13:18 +00:00
Andrew Trick 528fad91d2 Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue.
llvm-svn: 122491
2010-12-23 05:42:20 +00:00
Andrew Trick a52f325c35 Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle.
llvm-svn: 122474
2010-12-23 04:16:14 +00:00
Andrew Trick 12acde11cb In CheckForLiveRegDef use TRI->getOverlaps.
llvm-svn: 122473
2010-12-23 03:43:21 +00:00
Andrew Trick 033efdf4d7 Fixes PR8823: add-with-overflow-128.ll
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472
2010-12-23 03:15:51 +00:00
Jeffrey Yasskin 9b43f33620 Change all self assignments X=X to (void)X, so that we can turn on a
new gcc warning that complains on self-assignments and
self-initializations.

llvm-svn: 122458
2010-12-23 00:58:24 +00:00
Benjamin Kramer 1f4dfbbcb0 DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code.
example code:
unsigned foo(unsigned x, unsigned y) {
  if (x != 0) y--;
  return y;
}

before:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
    notl  %eax                    ## encoding: [0xf7,0xd0]
    addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
    ret                           ## encoding: [0xc3]

after:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
    adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
    ret                           ## encoding: [0xc3]

llvm-svn: 122455
2010-12-22 23:17:45 +00:00
Chris Lattner cafc1e60bb Fix a bug in ReduceLoadWidth that wasn't handling extending
loads properly.  We miscompiled the testcase into:

_test:                                  ## @test
	movl	$128, (%rdi)
	movzbl	1(%rdi), %eax
	ret

Now we get a proper:

_test:                                  ## @test
	movl	$128, (%rdi)
	movsbl	(%rdi), %eax
	movzbl	%ah, %eax
	ret

This fixes PR8757.

llvm-svn: 122392
2010-12-22 08:02:57 +00:00
Chris Lattner 9a499e96eb more cleanups, move a check for "roundedness" earlier to reject
unhanded cases faster and simplify code.

llvm-svn: 122391
2010-12-22 08:01:44 +00:00
Chris Lattner 222374d886 reduce indentation and improve comments, no functionality change.
llvm-svn: 122389
2010-12-22 07:36:50 +00:00
Andrew Trick fbb3ed8774 In DelayForLiveRegsBottomUp, handle instructions that read and write
the same physical register. Simplifies the fix from the previous
checkin r122211.

llvm-svn: 122370
2010-12-21 22:27:44 +00:00
Andrew Trick 2085a96513 whitespace
llvm-svn: 122368
2010-12-21 22:25:04 +00:00
Dale Johannesen a94e36bbee Reapply 122353-122355 with fixes. 122354 was wrong;
the shift type was needed one place, the shift count
type another.  The transform in 123555 had the same
problem.

llvm-svn: 122366
2010-12-21 21:55:50 +00:00
Dale Johannesen 87c47499c6 Revert 122353-122355 for the moment, they broke stuff.
llvm-svn: 122360
2010-12-21 21:22:27 +00:00
Dale Johannesen caf42aa6a4 Add a new transform to DAGCombiner.
llvm-svn: 122355
2010-12-21 20:10:51 +00:00
Dale Johannesen fa5dc82fda Get the type of a shift from the shift, not from its shift
count operand.  These should be the same but apparently are
not always, and this is cleaner anyway.  This improves the
code in an existing test.

llvm-svn: 122354
2010-12-21 20:06:19 +00:00
Dale Johannesen d64931df77 Shift by the word size is invalid IR; don't create it.
llvm-svn: 122353
2010-12-21 20:00:06 +00:00
Chris Lattner 2a7ff99979 fix some typos
llvm-svn: 122349
2010-12-21 18:05:22 +00:00
Stuart Hastings 83cce8e7ab Fix indentation, add comment.
llvm-svn: 122345
2010-12-21 17:16:58 +00:00
Stuart Hastings 8c5bfcaa29 Missing logic for nested CALLSEQ_START/END.
llvm-svn: 122342
2010-12-21 17:07:24 +00:00
Chris Lattner 3e5fbd74ed rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Chris Lattner 17f906be96 improve "cannot yet select" errors a trivial amount: now
they are just as useless, but at least a bit more gramatical

llvm-svn: 122305
2010-12-21 02:07:03 +00:00
Dale Johannesen 0a291a36f2 Cosmetic changes.
llvm-svn: 122259
2010-12-20 20:10:50 +00:00
Chris Lattner 0b3ca50ebb implement type legalization promotion support for SMULO and UMULO, giving
ARM (and other 32-bit-only) targets support for i8 and i16 overflow 
multiplies.  The generated code isn't great, but this at least fixes
CodeGen/Generic/overflow.ll when running on ARM hosts.

llvm-svn: 122221
2010-12-20 02:05:39 +00:00
Chris Lattner 981afd206b Fix a bug in the scheduler's handling of "unspillable" vregs.
Imagine we see:

EFLAGS = inst1
EFLAGS = inst2 FLAGS
gpr = inst3 EFLAGS

Previously, we would refuse to schedule inst2 because it clobbers
the EFLAGS of the predecessor.  However, it also uses the EFLAGS
of the predecessor, so it is safe to emit.  SDep edges ensure that
the right order happens already anyway.

This fixes 2 testsuite crashes with the X86 patch I'm going to
commit next.

llvm-svn: 122211
2010-12-20 00:55:43 +00:00
Chris Lattner 0cfe884874 the result of CheckForLiveRegDef is dead, remove it.
llvm-svn: 122209
2010-12-20 00:51:56 +00:00
Chris Lattner ed69c6e4b9 reduce indentation, no functionality change.
llvm-svn: 122208
2010-12-20 00:50:16 +00:00
Nick Lewycky 0de20af7ba Add missing standard headers. Patch by Joerg Sonnenberger!
llvm-svn: 122193
2010-12-19 20:43:38 +00:00
Chris Lattner 440b2804ff teach MaskedValueIsZero how to analyze ADDE. This is
enough to teach it that ADDE(0,0) is known 0 except the 
low bit, for example.

llvm-svn: 122191
2010-12-19 20:38:28 +00:00
Chris Lattner 77a8a71414 fix PR8642: if a critical edge has a PHI value that can trap,
isel is *required* to split the edge.  PHI values get evaluated
on the edge, not in their predecessor block.

llvm-svn: 122170
2010-12-19 04:58:57 +00:00
Bob Wilson 5408144add Fix a DAGCombiner crash when folding binary vector operations with constant
BUILD_VECTOR operands where the element type is not legal.  I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.

llvm-svn: 122102
2010-12-17 23:06:49 +00:00
Dale Johannesen cd538afa52 Add a transform to DAG Combiner. This improves the
code for the case where 32-bit divide by constant is
turned into 64-bit multiply by constant.  8771012.

llvm-svn: 122090
2010-12-17 21:45:49 +00:00
Bob Wilson bfc6904fc6 Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation.
Radar 8776599

llvm-svn: 122018
2010-12-17 01:21:12 +00:00
Chris Lattner 15090e1eb0 take care of some todos, transforming [us]mul_lohi into
a wider mul if the wider mul is legal.

llvm-svn: 121848
2010-12-15 06:04:19 +00:00
Chris Lattner b86dceea1b when transforming a MULHS into a wider MUL, there is no need to SRA the
result, the top bits are truncated off anyway, just use SRL.

llvm-svn: 121846
2010-12-15 05:51:39 +00:00
Chris Lattner 10bd29f1d4 Add a couple dag combines to transform mulhi/mullo into a wider multiply
when the wider type is legal.  This allows us to compile:

define zeroext i16 @test1(i16 zeroext %x) nounwind {
entry:
	%div = udiv i16 %x, 33
	ret i16 %div
}

into:

test1:                                  # @test1
	movzwl	4(%esp), %eax
	imull	$63551, %eax, %eax      # imm = 0xF83F
	shrl	$21, %eax
	ret

instead of:

test1:                                  # @test1
        movw    $-1985, %ax             # imm = 0xFFFFFFFFFFFFF83F
        mulw    4(%esp)
        andl    $65504, %edx            # imm = 0xFFE0
        movl    %edx, %eax
        shrl    $5, %eax
        ret

Implementing rdar://8760399 and example #4 from:
http://blog.regehr.org/archives/320

We should implement the same thing for [su]mul_hilo, but I don't
have immediate plans to do this.

llvm-svn: 121696
2010-12-13 08:39:01 +00:00
Chris Lattner cb404360ca reduce indentation by using continue, no functionality change.
llvm-svn: 121662
2010-12-13 01:11:17 +00:00
Duncan Sands d2e70b5442 Catch attempts to remove a deleted node from the CSE maps. Better to
catch this here rather than later after accessing uninitialized memory
etc.  Fires when compiling the testcase in PR8237.

llvm-svn: 121635
2010-12-12 13:22:50 +00:00
Stuart Hastings d2ea97cbef Initial support for nested CALLSEQ_START/CALLSEQ_END constructs in LegalizeDAG.
Necessary for byval support on ARM.  Radar 7662569.

llvm-svn: 121412
2010-12-09 21:25:20 +00:00
Eric Christopher d9e8eac235 80-col fixups.
llvm-svn: 121356
2010-12-09 04:48:06 +00:00
Eric Christopher 1b93e7b4ed Reword comment slightly.
llvm-svn: 121293
2010-12-08 22:21:42 +00:00
Jay Foad 583abbc4df PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and
zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.

llvm-svn: 121120
2010-12-07 08:25:19 +00:00
Devang Patel c24048a718 If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0.
llvm-svn: 121059
2010-12-06 22:39:26 +00:00
Benjamin Kramer 31920b0a2a Remove unneeded zero arrays.
llvm-svn: 120910
2010-12-04 15:28:22 +00:00
Jay Foad 25a5e4ca1f PR5207: Rename overloaded APInt methods set(), clear(), flip() to
setAllBits(), setBit(unsigned), etc.

llvm-svn: 120564
2010-12-01 08:53:58 +00:00
Evan Cheng d4b0873c06 Enable sibling call optimization of libcalls which are expanded during
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777

llvm-svn: 120501
2010-11-30 23:55:39 +00:00
Chris Lattner ea41dfe385 add TLI support indicating that jumps are more expensive than logical operations
and use this to disable a specific optimization.  Patch by Micah Villmow!

llvm-svn: 120435
2010-11-30 18:12:52 +00:00
Jay Foad 15084f085d PR5207: Make APInt::set(), APInt::clear() and APInt::flip() return void.
llvm-svn: 120413
2010-11-30 09:02:01 +00:00
Michael J. Spencer 447762da85 Merge System into Support.
llvm-svn: 120298
2010-11-29 18:16:10 +00:00
Bob Wilson f9b96c474f Fix a comment typo.
llvm-svn: 120235
2010-11-28 06:51:19 +00:00
Wesley Peck 527da1b6e2 Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
llvm-svn: 119990
2010-11-23 03:31:01 +00:00
Benjamin Kramer 24656c9583 Implement the "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" optimization.
This currently only catches the most basic case, a two-case switch, but can be
extended later.

llvm-svn: 119964
2010-11-22 09:45:38 +00:00
Benjamin Kramer f6fb58a216 Silence Release build warnings about unused functions.
llvm-svn: 119903
2010-11-20 15:53:24 +00:00
Duncan Sands 7c601ded34 On X86, MEMBARRIER, MFENCE, SFENCE, LFENCE are not target memory intrinsics,
so don't claim they are.  They are allocated using DAG.getNode, so attempts
to access MemSDNode fields results in reading off the end of the allocated
memory.  This fixes crashes with "llc -debug" due to debug code trying to
print MemSDNode fields for these barrier nodes (since the crashes are not
deterministic, use valgrind to see this).  Add some nasty checking to try
to catch this kind of thing in the future.

llvm-svn: 119901
2010-11-20 11:25:00 +00:00
Andrew Trick cf7fefb25c Removing the useless test that I added recently. It was meant as an example, but not complicated enough to merit another test.
llvm-svn: 119898
2010-11-20 07:26:51 +00:00
Bill Wendling 54df187f25 Check for _setjmp too, because it's also used.
llvm-svn: 119875
2010-11-20 00:03:09 +00:00
Mon P Wang 88ff56caa3 Make isScalarToVector to return false if the node is a scalar. This will prevent
DAGCombine from making an illegal transformation of bitcast of a scalar to a
vector into a scalar_to_vector.

llvm-svn: 119819
2010-11-19 19:08:12 +00:00
Duncan Sands c92331b984 Fix thinko: we must turn select(anyext, sext) into sext(select)
not anyext(select).  Spotted by Frits van Bommel.

llvm-svn: 119739
2010-11-18 21:16:28 +00:00
Duncan Sands 12f3b3b44f The DAGCombiner was threading select over pairs of extending loads even
if the extension types were not the same.  The result was that if you
fed a select with sext and zext loads, as in the testcase, then it
would get turned into a zext (or sext) of the select, which is wrong
in the cases when it should have been an sext (resp. zext).  Reported
and diagnosed by Sebastien Deldon.

llvm-svn: 119728
2010-11-18 20:05:18 +00:00
Dale Johannesen ed0d840838 Do not throw away alignment when generating the DAG for
memset; we may need it to decide between MOVAPS and MOVUPS
later.  Adjust a test that was looking for wrong code.
PR 3866 / 8675131.

llvm-svn: 119605
2010-11-18 01:35:23 +00:00
John Thompson ddc7ce548c Bug 8621 fix - pointer cast stripped from inline asm constraint argument.
llvm-svn: 119590
2010-11-17 23:58:47 +00:00
Dan Gohman 8b67c720f2 Split pseudo-instruction expansion into a separate pass, to make it
easier to debug, and to avoid complications when the CFG changes
in the middle of the instruction selection process.

llvm-svn: 119382
2010-11-16 21:02:37 +00:00
Andrew Trick 6cbf6c1db5 typo (4th checkin for one fix)
llvm-svn: 118913
2010-11-12 18:36:03 +00:00
Andrew Trick 116efac780 Fixes PR8287: SD scheduling time. The fix is a failsafe that prevents
catastrophic compilation time in the event of unreasonable LLVM
IR. Code quality is a separate issue--someone upstream needs to do a
better job of reducing to llvm.memcpy. If the situation can be reproduced with
any supported frontend, then it will be a separate bug.

llvm-svn: 118904
2010-11-12 17:50:46 +00:00
Chris Lattner 64634c36dd tidy up.
llvm-svn: 118896
2010-11-12 17:24:29 +00:00
Dan Gohman 6cf9bb45ad Remove the memmove->memcpy optimization from CodeGen. MemCpyOpt does this.
llvm-svn: 118789
2010-11-11 16:24:49 +00:00
Dan Gohman 5db8921422 Fix DAGCombiner to avoid folding a sext-in-reg or similar through a shl
in order to fold it into a load.

llvm-svn: 118471
2010-11-09 01:54:35 +00:00
Dale Johannesen f11ea9ce61 Fix an inline asm pasto from 117667; was preventing
{i64, i64} from matching i128.

llvm-svn: 118465
2010-11-09 01:15:07 +00:00
Duncan Sands 6c25ca4f2b When passing a parameter using the 'byval' mechanism, inline code needs to be used
to perform the copy, which may be of lots of memory [*].  It would be good if the
fall-back code generated something reasonable, i.e. did the copy in a loop, rather
than vast numbers of loads and stores.  Add a note about this.  Currently target
specific code seems to always kick in so this is more of a theoretical issue rather
than a practical one now that X86 has been fixed.
[*] It's amazing how often people pass mega-byte long arrays by copy...

llvm-svn: 118275
2010-11-05 15:20:29 +00:00
Eric Christopher c6418b105a Just return undef for invalid masks or elts, and since we're doing that,
just do it earlier too.

llvm-svn: 118195
2010-11-03 20:44:42 +00:00
Duncan Sands 1462777017 Simplify uses of MVT and EVT. An MVT can be compared directly
with a SimpleValueType, while an EVT supports equality and
inequality comparisons with SimpleValueType.

llvm-svn: 118169
2010-11-03 12:17:33 +00:00
Duncan Sands f5dda01f33 Inside the calling convention logic LocVT is always a simple
value type, so there is no point in passing it around using
an EVT.  Use the simpler MVT everywhere.  Rather than trying
to propagate this information maximally in all the code that
using the calling convention stuff, I chose to do a mainly
low impact change instead.

llvm-svn: 118167
2010-11-03 11:35:31 +00:00
Eric Christopher fcc9e6848a If we have an undef mask our Elt will be -1 for our access, handle
this by using an undef as a pointer.

Fixes rdar://8625016

llvm-svn: 118164
2010-11-03 09:36:40 +00:00
Dan Gohman 68fb004616 Fix DAGCombiner to avoid going into an infinite loop when it
encounters (and:i64 (shl:i64 (load:i64), 1), 0xffffffff).
This fixes rdar://8606584.

llvm-svn: 118143
2010-11-03 01:47:46 +00:00
Evan Cheng debf9c502a Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427

llvm-svn: 118135
2010-11-03 00:45:17 +00:00
Devang Patel bc741405a7 If value map does not have register for an argument then try to find frame index before giving up.
llvm-svn: 118022
2010-11-02 17:19:03 +00:00
Devang Patel 94f2a2578c Use frameindex, if available, as a last resort to emit debug info for a parameter.
llvm-svn: 118020
2010-11-02 17:01:30 +00:00
Bob Wilson 08882be86c Remove DAG combiner patch to fold vector splats. Instcombiner does it now.
llvm-svn: 117720
2010-10-29 22:03:02 +00:00
Evan Cheng 6c1414f9c2 Avoiding overly aggressive latency scheduling. If the two nodes share an
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.

llvm-svn: 117675
2010-10-29 18:09:28 +00:00
John Thompson e8360b7182 Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
llvm-svn: 117667
2010-10-29 17:29:13 +00:00
Bob Wilson f63da12be9 Teach the DAG combiner to fold a splat of a splat. Radar 8597790.
Also do some minor refactoring to reduce indentation.

llvm-svn: 117558
2010-10-28 17:06:14 +00:00
Evan Cheng ff310737e5 Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.
llvm-svn: 117531
2010-10-28 06:47:08 +00:00
Evan Cheng e2c211c1b9 Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh.
llvm-svn: 117520
2010-10-28 02:00:25 +00:00
Evan Cheng 523fa3a2e8 Fix a major bug in operand latency computation. The use index must be adjusted
by the number of defs first for it to match the instruction itinerary.

llvm-svn: 117518
2010-10-28 01:46:29 +00:00
Dale Johannesen e660f4d072 Use a MemIntrinsicSDNode for ISD::PREFETCH, which touches
memory, so a MachineMemOperand is useful (not propagated
into the MachineInstr yet).  No functional change except
for dump output.

llvm-svn: 117413
2010-10-26 23:11:10 +00:00
Devang Patel 05561e8b7b Assign source ordering to nodes created for StoreInst.
llvm-svn: 117404
2010-10-26 22:14:52 +00:00
Nick Lewycky 90b2ac2696 For statistics that are only used in functions declared in !NDEBUG, wrap the
declarations in !NDEBUG to avoid -Wunused-variable warnings. Patch by
Matt Beaumont-Gay!

llvm-svn: 117345
2010-10-26 00:51:57 +00:00
Devang Patel 43c3f4b63c Simplify.
Do not count use of sdisel for single call instruction.

llvm-svn: 117316
2010-10-25 21:31:46 +00:00
Devang Patel 3bc6d198fb Add counters to count basic blocks and machine basic blocks with out of order line number info.
Add counters to count how many basic blocks are entirely selected by fastisel.

llvm-svn: 117310
2010-10-25 20:55:43 +00:00
Chandler Carruth 82058c05f8 Move the remaining attribute macros to systematic names based on the attribute
name and prefixed with 'LLVM_'.

llvm-svn: 117203
2010-10-23 08:40:19 +00:00
Michael J. Spencer 0e36e0340a X86: Base _fltused on the FunctionType of the called value instead of the potentially null "CalledFunction". Thanks Duncan!
This is needed for indirect calls.

llvm-svn: 117061
2010-10-21 20:49:23 +00:00
Michael J. Spencer 83ce5f181f CodeGen-Windows: Only emit _fltused if a VarArg function is called with floating point args.
This should be the minimum set of functions that could possibly need it.

llvm-svn: 116978
2010-10-21 00:08:21 +00:00
Dale Johannesen 320a553319 Remove Synthesizable from the Type system; as MMX vector
types are no longer Legal on X86, we don't need it.
No functional change.  8499854.

llvm-svn: 116947
2010-10-20 21:32:10 +00:00
Dan Gohman a94cc6dfe8 Make CodeGen TBAA-aware.
llvm-svn: 116890
2010-10-20 00:31:05 +00:00
Jim Grosbach bbdc5d2ef9 Add a pre-dispatch SjLj EH hook on the unwind edge for targets to do any
setup they require. Use this for ARM/Darwin to rematerialize the base
pointer from the frame pointer when required. rdar://8564268

llvm-svn: 116879
2010-10-19 23:27:08 +00:00
Owen Anderson 6c18d1aac0 Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which
must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
the pass's dependencies.

Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.

I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
registration/creation, please send the testcase to me directly.

llvm-svn: 116820
2010-10-19 17:21:58 +00:00
Michael J. Spencer 5e683250ee X86-Windows: Emit an undefined global __fltused symbol when targeting Windows
if any floating point arguments are passed to an external function.

llvm-svn: 116665
2010-10-16 08:25:41 +00:00
Michael J. Spencer d3ea25e66e Whitespace!
llvm-svn: 116664
2010-10-16 08:25:21 +00:00
Chris Lattner eb313a46fc fix the default va_arg expansion (in the realignment case) to not implicitly
truncate the stack pointer to 32-bits on a 64-bit machine.

llvm-svn: 116169
2010-10-10 18:36:26 +00:00
Dan Gohman aadc5596f1 ComputeLinearIndex doesn't need its TLI argument.
llvm-svn: 115792
2010-10-06 16:18:29 +00:00
Evan Cheng 49d4c0bd18 - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This
allow target to correctly compute latency for cases where static scheduling
  itineraries isn't sufficient. e.g. variable_ops instructions such as
  ARM::ldm.
  This also allows target without scheduling itineraries to compute operand
  latencies. e.g. X86 can return (approximated) latencies for high latency
  instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
  e.g. ldm and those used by store multiple instructions, e.g. stm.

llvm-svn: 115755
2010-10-06 06:27:31 +00:00
Owen Anderson d8d1dcc09a Use a more efficient lowering of uint64_t --> float that can take advantage of hardware signed integer conversion without
having to do a double cast (uint64_t --> double --> float).  This is based on the algorithm from compiler_rt's __floatundisf
for X86-64.

llvm-svn: 115634
2010-10-05 17:24:05 +00:00
Evan Cheng c8d6cfd730 This DAG combine BRCOND transformation can look pass truncate of the operand:
//   %a = ...                                                                                                                                                                                  
    //   %b = and i32 %a, 2                                                                                                                                                                        
    //   %c = srl i32 %b, 1                                                                                                                                                                        
    //   brcond i32 %c ...                                                                                                                                                                         
    //                                                                                                                                                                                             
    // into                                                                                                                                                                                        
    //                                                                                                                                                                                             
    //   %a = ...                                                                                                                                                                                  
    //   %b = and i32 %a, 2                                                                                                                                                                        
    //   %c = setcc eq %b, 0                                                                                                                                                                       
    //   brcond %c ...

Make sure it restores local variable N1, which corresponds to the condition operand if it fails to match.

This apparently breaks TCE but since that backend isn't in the tree I don't have a test for it.

llvm-svn: 115571
2010-10-04 22:41:01 +00:00
Devang Patel d3fe5fa5d1 Fix code gen crash reported in PR 8235. We still lose debug info for the unused argument here. This is a known limitation recorded debuginfo-tests/trunk/dbg-declare2.ll function 'f6' test case.
llvm-svn: 115323
2010-10-01 19:00:44 +00:00
Gabor Greif 47a3b8c30b typo
llvm-svn: 115310
2010-10-01 10:32:19 +00:00
Chris Lattner f08bfdc29f fix typo
llvm-svn: 115300
2010-10-01 06:54:02 +00:00
Chris Lattner a205055857 fix rdar://8494845 + PR8244 - a miscompile exposed by my patch in r101350
llvm-svn: 115294
2010-10-01 05:36:09 +00:00
Dale Johannesen dd224d2333 Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.

The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.

llvm-svn: 115243
2010-09-30 23:57:10 +00:00
Jakob Stoklund Olesen 665aa6efcc When isel is emitting instructions for an x86 target without CMOV, the CFG is
edited during emission.

If the basic block ends in a switch that gets lowered to a jump table, any
phis at the default edge were getting updated wrong. The jump table data
structure keeps a pointer to the header blocks that wasn't getting updated
after the MBB is split.

This bug was exposed on 32-bit Linux when disabling critical edge splitting in
codegen prepare.

The fix is to uipdate stale MBB pointers whenever a block is split during
emission.

llvm-svn: 115191
2010-09-30 19:44:31 +00:00
Evan Cheng 4a010fd1ea Model Cortex-a9 load to SUB, RSB, ADD, ADC, SBC, RSC, CMN, MVN, or CMP
pipeline forwarding path.

llvm-svn: 115098
2010-09-29 22:42:35 +00:00
Oscar Fuentes b4b12535e8 Removed a bunch of unnecessary target_link_libraries.
llvm-svn: 114999
2010-09-28 22:39:14 +00:00
Dale Johannesen 117f7708c4 Don't try to make a vector of x86mmx; this won't work,
and asserts.

llvm-svn: 114843
2010-09-27 17:29:14 +00:00
John Thompson 8118ef8d3d Fix for test/CodeGen/PowerPC/2008-10-17-AsmMatchingOperands.ll crash.
llvm-svn: 114767
2010-09-24 22:24:05 +00:00
Michael J. Spencer ded5f66813 Get rid of pop_macro warnings on MSVC.
llvm-svn: 114750
2010-09-24 19:48:47 +00:00
Evan Cheng 6b8b2b7312 Revert 114634 for now since buildbot claim it broke Clang self-hosting. I doubt it but it's possible it's exposing another bug somewhere.
llvm-svn: 114681
2010-09-23 18:32:19 +00:00
Oscar Fuentes 57214f533a Fix VS 2010 build.
Patch by Nathan Jeffords!

llvm-svn: 114661
2010-09-23 16:59:36 +00:00
Evan Cheng b6d175a39d Follow up to r114630. Do not optimize away unconditional branch following a conditional one.
llvm-svn: 114634
2010-09-23 07:18:35 +00:00
Evan Cheng 79687dda9a SDISel should not optimize a unconditional branch following a conditional branch
when the unconditional branch destination is the fallthrough block. The
canonicalization makes it easier to allow optimizations on DAGs to invert
conditional branches. The branch folding pass (and AnalyzeBranch) will clean up
the unnecessary unconditional branches later.

This is one of the patches leading up to disabling codegen prepare critical edge
splitting.

llvm-svn: 114630
2010-09-23 06:51:55 +00:00
Owen Anderson 3231d13ddd A select between a constant and zero, when fed by a bit test, can be efficiently
lowered using a series of shifts.
Fixes <rdar://problem/8285015>.

llvm-svn: 114599
2010-09-22 22:58:22 +00:00
John Thompson c467aa2fa4 Fixed pr20314-2.c failure, added E, F, p constraint letters.
llvm-svn: 114490
2010-09-21 22:04:54 +00:00
Chris Lattner a9e57e0eff Rework passing parent pointers into complexpatterns, I forgot
that complex patterns are matched after the entire pattern has
a structural match, therefore the NodeStack isn't in a useful
state when the actual call to the matcher happens.

llvm-svn: 114489
2010-09-21 22:00:25 +00:00
Devang Patel 99ff76212a If only user of a vreg is an copy instruction to export copy of vreg out of current basic block then insert DBG_VALUE so that debug value of the variable is also transfered to new vreg.
Testcase is in r114476.
This fixes radar 8412415.

llvm-svn: 114478
2010-09-21 20:56:33 +00:00
Chris Lattner 0bb8b19865 correct this logic.
llvm-svn: 114474
2010-09-21 20:46:40 +00:00
Owen Anderson 5e65dfbb97 Reimplement r114460 in target-independent DAGCombine rather than target-dependent, by using
the predicate to discover the number of sign bits.  Enhance X86's target lowering to provide
a useful response to this query.

llvm-svn: 114473
2010-09-21 20:42:50 +00:00
Chris Lattner dd83548fea just like they can opt into getting the root of the pattern being
matched, allow ComplexPatterns to opt into getting the parent node
of the operand being matched.

llvm-svn: 114472
2010-09-21 20:37:12 +00:00
Chris Lattner a4f199720d finish pushing MachinePointerInfo through selectiondags. At this point,
I think I've audited all uses, so it should be dependable for address spaces,
and the pointer+offset info should also be accurate when there.

llvm-svn: 114464
2010-09-21 18:58:22 +00:00
Chris Lattner 676c61db0e update a bunch of code to use the MachinePointerInfo version of getStore.
llvm-svn: 114461
2010-09-21 18:41:36 +00:00
Bob Wilson 5549d496dd Define the TargetLowering::getTgtMemIntrinsic hook for ARM so that NEON load
and store intrinsics are represented with MemIntrinsicSDNodes.

llvm-svn: 114454
2010-09-21 17:56:22 +00:00
Chris Lattner 6963c1f789 eliminate an old SelectionDAG::getTruncStore method, propagating
MachinePointerInfo around more.

llvm-svn: 114452
2010-09-21 17:42:31 +00:00
Chris Lattner 5e39ffd02f eliminate last SelectionDAG::getLoad old entrypoint, on to stores.
llvm-svn: 114450
2010-09-21 17:28:52 +00:00
Chris Lattner ea952f05a5 fix the code that infers SV info to be correct when dealing
with an indexed load/store that has an offset in the index.

llvm-svn: 114449
2010-09-21 17:24:05 +00:00
Chris Lattner 3d178ed4d4 propagate MachinePointerInfo through various uses of the old
SelectionDAG::getExtLoad overload, and eliminate it.

llvm-svn: 114446
2010-09-21 17:04:51 +00:00
Chris Lattner 1ffcf527c7 continue MachinePointerInfo'izing, eliminating use of one of the old
getLoad overloads.

llvm-svn: 114443
2010-09-21 16:36:31 +00:00
Chris Lattner f72c3c08a4 convert dagcombine off the old form of getLoad. This fixes several bugs
with SVOffset computation.

llvm-svn: 114442
2010-09-21 16:08:50 +00:00
Chris Lattner e32675253f simplify DAGCombiner::SimplifySelectOps step #2/2.
llvm-svn: 114437
2010-09-21 15:58:55 +00:00
Chris Lattner 254c445e63 substantially reduce indentation and simplify DAGCombiner::SimplifySelectOps.
no functionality change (step #1)

llvm-svn: 114436
2010-09-21 15:46:59 +00:00
Chris Lattner a35499e2af a few more trivial updates. This fixes PerformInsertVectorEltInMemory to not
pass a completely incorrect SrcValue, which would result in a miscompile with
combiner-aa.

llvm-svn: 114411
2010-09-21 07:32:19 +00:00
Chris Lattner 2510de2bea reimplement memcpy/memmove/memset lowering to use MachinePointerInfo
instead of srcvalue/offset pairs.  This corrects SV info for mem 
operations whose size is > 32-bits.

llvm-svn: 114401
2010-09-21 05:40:29 +00:00
Chris Lattner bc419ba98f add overloads for SelectionDAG::getLoad, getStore, getTruncStore that take a
MachinePointerInfo.  Among other virtues, this doesn't silently  truncate the
svoffset to 32-bits.

llvm-svn: 114399
2010-09-21 05:10:45 +00:00
Chris Lattner d2d58ada70 simplify interface to SelectionDAG::getMemIntrinsicNode, making it take a MachinePointerInfo
llvm-svn: 114397
2010-09-21 04:57:15 +00:00
Chris Lattner 15d84c460a chagne interface to SelectionDAG::getAtomic to take a MachinePointerInfo,
eliminating some weird "infer a frame address" logic which was dead.

llvm-svn: 114396
2010-09-21 04:53:42 +00:00
Chris Lattner 3b5dc0cdad don't implicitly drop the offset of a machinememoperand when legalizing atomics.
llvm-svn: 114395
2010-09-21 04:51:11 +00:00
Chris Lattner b5f4920979 force clients of MachineFunction::getMachineMemOperand to provide a
MachinePointerInfo, propagating the type out a level of API.  Remove
the old MachineFunction::getMachineMemOperand impl.

llvm-svn: 114393
2010-09-21 04:46:39 +00:00
Owen Anderson 272ff94916 When TCO is turned on, it is possible to end up with aliasing FrameIndex's. Therefore,
CombinerAA cannot assume that different FrameIndex's never alias, but can instead use
MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing.

This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll
when CombinerAA is enabled, modulo a different register allocation sequence.

llvm-svn: 114348
2010-09-20 20:39:59 +00:00
Owen Anderson 7b8d2ae912 Revert r114312 while I sort out some issues.
llvm-svn: 114313
2010-09-19 21:01:26 +00:00
Owen Anderson ff82f8a35b Tentatively enabled DAGCombiner Alias Analysis by default. As far as I know,
r114268 fixed the last of the blockers to enabling it.  I will be monitoring
for failures.

llvm-svn: 114312
2010-09-19 19:51:55 +00:00
Owen Anderson b92b13d8a0 Invert the logic of reachesChainWithoutSideEffects(). What we want to check is that there is
NO path to the destination containing side effects, not that SOME path contains no side effects.
In  practice, this only manifests with CombinerAA enabled, because otherwise the chain has little
to no branching, so "any" is effectively equivalent to "all".

llvm-svn: 114268
2010-09-18 04:45:14 +00:00
Devang Patel 46b96c4ba0 Check bb to ensure that alloca is in separate basic block.
This fixes funcargs.exp regression reported by gdb testsuite.

llvm-svn: 113992
2010-09-15 18:13:55 +00:00
Devang Patel da25de8096 If dbg.declare from non-entry block is using alloca from entry block then use offset available in StaticAllocaMap to emit DBG_VALUE. Right now, this has no material impact because varible info also collected using offset table maintained in machine module info.
llvm-svn: 113967
2010-09-15 14:48:53 +00:00
Devang Patel e4682fa8e2 Use frame index, if available for byval argument while lowering dbg_declare. Otherwise let getRegForValue() find register for this argument.
llvm-svn: 113843
2010-09-14 20:29:31 +00:00
Michael J. Spencer 93c9b2ea93 Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally."
This reverts commit r113632

Conflicts:

	cmake/modules/AddLLVM.cmake

llvm-svn: 113819
2010-09-13 23:59:48 +00:00
Eric Christopher 79127ab3f5 Silence more warnings. Two more unused variables.
llvm-svn: 113771
2010-09-13 18:30:57 +00:00
John Thompson 1094c80281 Added skeleton for inline asm multiple alternative constraint support.
llvm-svn: 113766
2010-09-13 18:15:37 +00:00
Michael J. Spencer dc38d36ccb CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally.
llvm-svn: 113632
2010-09-10 21:14:25 +00:00
Devang Patel 6095d818e5 Add DEBUG message.
llvm-svn: 113614
2010-09-10 20:32:09 +00:00
Evan Cheng bf4070756f Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.

llvm-svn: 113570
2010-09-10 01:29:16 +00:00
Chris Lattner eeba0c73e5 implement rdar://6653118 - fastisel should fold loads where possible.
Since mem2reg isn't run at -O0, we get a ton of reloads from the stack,
for example, before, this code:

int foo(int x, int y, int z) {
  return x+y+z;
}

used to compile into:

_foo:                                   ## @foo
	subq	$12, %rsp
	movl	%edi, 8(%rsp)
	movl	%esi, 4(%rsp)
	movl	%edx, (%rsp)
	movl	8(%rsp), %edx
	movl	4(%rsp), %esi
	addl	%edx, %esi
	movl	(%rsp), %edx
	addl	%esi, %edx
	movl	%edx, %eax
	addq	$12, %rsp
	ret

Now we produce:

_foo:                                   ## @foo
	subq	$12, %rsp
	movl	%edi, 8(%rsp)
	movl	%esi, 4(%rsp)
	movl	%edx, (%rsp)
	movl	8(%rsp), %edx
	addl	4(%rsp), %edx    ## Folded load
	addl	(%rsp), %edx     ## Folded load
	movl	%edx, %eax
	addq	$12, %rsp
	ret

Fewer instructions and less register use = faster compiles.

llvm-svn: 113102
2010-09-05 02:18:34 +00:00
Bob Wilson 3626a8c136 Add a missing check when legalizing a vector extending load. This doesn't
solve the root problem, but it corrects the bug in the code I added to
support legalizing in the case where the non-extended type is also legal.

llvm-svn: 112997
2010-09-03 19:20:37 +00:00
Devang Patel 3bffd52d78 Detect undef value early and save unnecessary NodeMap query.
llvm-svn: 112864
2010-09-02 21:29:42 +00:00
Dan Gohman 3c9b5f394b Don't narrow the load and store in a load+twiddle+store sequence unless
there are clearly no stores between the load and the store. This fixes
this miscompile reported as PR7833.

This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is
safe, but awkward to prove safe. Move it to X86's README.txt.

llvm-svn: 112861
2010-09-02 21:18:42 +00:00
Devang Patel 98d3edfe2a Tidy up.
llvm-svn: 112858
2010-09-02 21:02:27 +00:00
Devang Patel 86ec8b3a3f Reapply r112623. Included additional check for unused byval argument.
llvm-svn: 112659
2010-08-31 22:22:42 +00:00
Devang Patel 529f248eb4 Revert r112623. It is causing self host build failures.
llvm-svn: 112631
2010-08-31 19:41:03 +00:00
Devang Patel 8559932d36 Remember byval argument's frame index during argument lowering and use this info to emit debug info.
Fixes Radar 8367011.

llvm-svn: 112623
2010-08-31 18:50:09 +00:00
Devang Patel 417d72823a Offset is not always unsigned number.
llvm-svn: 112584
2010-08-31 06:12:08 +00:00
Bruno Cardoso Lopes d9ef4a1a24 zap unused method. x86 is the only user and already has a more powerfull version
llvm-svn: 112571
2010-08-31 02:36:20 +00:00
Bill Wendling f824489a1d Revert r112461. It was failing on PPC...
llvm-svn: 112463
2010-08-30 04:36:50 +00:00
Bill Wendling 938f299fa9 When adding a register, we should mark it as "def" if it can optionally define
said (physical) register.

llvm-svn: 112461
2010-08-30 01:36:05 +00:00
Chris Lattner 13ee795c42 remove unions from LLVM IR. They are severely buggy and not
being actively maintained, improved, or extended.

llvm-svn: 112356
2010-08-28 04:09:24 +00:00
Dan Gohman e06905d1f0 Completely disable tail calls when fast-isel is enabled, as fast-isel
doesn't currently support dealing with this.

llvm-svn: 112341
2010-08-28 00:51:03 +00:00
Dan Gohman 1e06dbf881 Trim a #include.
llvm-svn: 112340
2010-08-28 00:49:13 +00:00
Devang Patel f2855b147f Simplify.
llvm-svn: 112305
2010-08-27 22:25:51 +00:00
Devang Patel b12ff5999e Revert r112213. It is not needed.
llvm-svn: 112242
2010-08-26 23:35:15 +00:00
Devang Patel ea134f56b1 If node is not available then use FuncInfo.ValueMap to emit debug info for byval parameter.
llvm-svn: 112238
2010-08-26 22:53:27 +00:00
Devang Patel 42b4ac7ed3 Speculatively revert r112207.
llvm-svn: 112216
2010-08-26 20:33:42 +00:00
Devang Patel 977057f481 80 col.
llvm-svn: 112215
2010-08-26 20:32:32 +00:00
Devang Patel 384fa91deb Update DanglingDebugInfo so that it can be used to track llvm.dbg.declare also.
llvm-svn: 112213
2010-08-26 20:06:46 +00:00
Devang Patel ab596a637c Donot forget to resolve dangling debug info in a case where virtual register, used for a value, is initialized after a dbg intrinsic is seen.
llvm-svn: 112207
2010-08-26 18:36:14 +00:00
Chris Lattner af23e9a798 Add a hackaround for PR7993 which is causing failures on x86 builders that lack sse2.
llvm-svn: 112175
2010-08-26 06:57:07 +00:00
Chris Lattner eb2cc0ce0e implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Chris Lattner f6418b804e zap dead code.
llvm-svn: 112155
2010-08-26 02:57:35 +00:00
Chris Lattner 8df99b523e remove some llvmcontext arguments that are now dead post-refactoring.
llvm-svn: 112104
2010-08-25 23:00:45 +00:00
Chris Lattner 75ff053497 Change handling of illegal vector types to widen when possible instead of
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats.  This
affects two places in the code: handling cross block values and handling
function return and arguments.  Since vectors are already widened by 
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.

For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
 %B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
 %C = fadd <2 x float> %B, %B
  br label %BB
BB:
 %D = fadd <2 x float> %C, %C
 %E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
 ret <4 x float> %E
}

Now compiles into:

_test2:                                 ## @test2
## BB#0:
 addps %xmm0, %xmm0
 addps %xmm0, %xmm0
 ret

previously it compiled into:

_test2:                                 ## @test2
## BB#0:
 addps %xmm0, %xmm0
 pshufd $1, %xmm0, %xmm1
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
 insertps $0, %xmm0, %xmm0
 insertps $16, %xmm1, %xmm0
 addps %xmm0, %xmm0
 ret

This implements rdar://8230384

llvm-svn: 112101
2010-08-25 22:49:25 +00:00
Devang Patel 32a72ab072 Fix comment.
llvm-svn: 112086
2010-08-25 20:41:24 +00:00
Devang Patel 3f53d6e56a Remove dead argument.
llvm-svn: 112085
2010-08-25 20:39:26 +00:00
Chris Lattner 05bcb488b5 split the vector case of getCopyFromParts out to its own function,
no functionality change.

llvm-svn: 111994
2010-08-24 23:20:40 +00:00
Chris Lattner 96a77ebd7c split the vector case out of getCopyToParts into its own function. No
functionality change.

llvm-svn: 111990
2010-08-24 23:10:06 +00:00
Chris Lattner 5b8967f8a2 tidy up, reduce indentation
llvm-svn: 111982
2010-08-24 22:43:11 +00:00
Chandler Carruth 191c4f73b2 Fix some GCC warnings by providing a virtual destructor in the base of a class
hierarchy with virtual methods and using llvm_unreachable to properly indicate
unreachable states which would otherwise leave variables uninitialized.

llvm-svn: 111803
2010-08-23 08:25:07 +00:00
Bob Wilson c56fef4eac If the target says that an extending load is not legal, regardless of whether
it involves specific floating-point types, legalize should expand an
extending load to a non-extending load followed by a separate extend operation.
For example, we currently expand SEXTLOAD to EXTLOAD+SIGN_EXTEND_INREG (and
assert that EXTLOAD should always be supported).  Now we can expand that to
LOAD+SIGN_EXTEND.  This is needed to allow vector SIGN_EXTEND and ZERO_EXTEND
to be used for NEON.

llvm-svn: 111586
2010-08-19 23:52:39 +00:00
Dale Johannesen 16f96445c3 Make fast scheduler handle asm clobbers correctly.
PR 7882.  Follows suggestion by Amaury Pouly, thanks.

llvm-svn: 111306
2010-08-17 22:17:24 +00:00
Eric Christopher 541f8012d9 Fix typo.
llvm-svn: 111223
2010-08-17 01:30:33 +00:00
Evan Cheng 23ef829096 Add missing null check reported by Amaury Pouly.
llvm-svn: 110649
2010-08-10 02:39:45 +00:00
Owen Anderson a7aed18624 Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Owen Anderson bda59bd247 Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Owen Anderson 755aceb5d0 Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Dan Gohman 5cae103392 Eliminate unnecessary empty string literals.
llvm-svn: 110183
2010-08-04 01:39:08 +00:00
Oscar Fuentes 40b31ad3ee Prefix `next' iterator operation with `llvm::'.
Fixes potential ambiguity problems on VS 2010.

Patch by nobled!

llvm-svn: 110029
2010-08-02 06:00:15 +00:00
Eli Friedman 460ad41d6d PR7586: Make sure we don't claim that unknown bits are actually known in the
ISD::AND case of TargetLowering::SimplifyDemandedBits.

llvm-svn: 110019
2010-08-02 04:42:25 +00:00
Eli Friedman ffe64c06ef Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.

llvm-svn: 109854
2010-07-30 06:44:31 +00:00
Nate Begeman 317b969ac5 Fix a crash in the dag combiner caused by ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself
recursively and returning a SCALAR_TO_VECTOR node, but assuming the input was always a BUILD_VECTOR.

llvm-svn: 109519
2010-07-27 18:02:18 +00:00
Bill Wendling 0ff1ef650b It's better to have the arrays, which would trigger the creation of stack
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.

llvm-svn: 109481
2010-07-27 01:55:19 +00:00
Evan Cheng e6d6c5dd11 The "excess register pressure" returned by HighRegPressure() is not accurate enough to factor into scheduling priority. Eliminate it and add early exits to speed up scheduling.
llvm-svn: 109449
2010-07-26 21:49:07 +00:00
Dan Gohman 2810bacafb Handle Values with no value in getCopyFromRegs.
llvm-svn: 109415
2010-07-26 18:15:41 +00:00
Duncan Sands 136a6f0dbb Pacify gcc-4.5 which wrongly thinks that RExcess (passed as the Excess parameter)
may be used uninitialized in the callers of HighRegPressure.

llvm-svn: 109393
2010-07-26 07:54:17 +00:00
Evan Cheng 8ae3ecad2b Add comments.
llvm-svn: 109383
2010-07-25 18:59:43 +00:00
Bob Wilson 280ce9984e Fix crashes when scheduling a CopyToReg node -- getMachineOpcode asserts on
those.  Radar 8231572.

llvm-svn: 109367
2010-07-25 05:34:27 +00:00
Evan Cheng 37b740c4bf Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.

llvm-svn: 109300
2010-07-24 00:39:05 +00:00
Evan Cheng df907f4594 - Allow target to specify when is register pressure "too high". In most cases,
it's too late to start backing off aggressive latency scheduling when most
  of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
  For ARM, this is almost always a win on # of instructions. It's runtime
  neutral for most of the tests. But for some kernels with high register
  pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
  54 and sped up by 20%.

llvm-svn: 109279
2010-07-23 22:39:59 +00:00
Dan Gohman 55e244698a Use the proper type for shift counts. This fixes a bootstrap error.
llvm-svn: 109265
2010-07-23 21:08:12 +00:00
Dan Gohman 0818684a70 DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits
are not demanded. This often allows the anyext to be folded away.

llvm-svn: 109242
2010-07-23 18:03:30 +00:00
Dan Gohman 2e00e3b12d Make SDNode::dump() print a newline at the end.
llvm-svn: 109234
2010-07-23 16:37:47 +00:00
Eric Christopher faf5c76114 80-col.
llvm-svn: 109205
2010-07-23 01:05:59 +00:00
Gabor Greif 59f9970ba5 keep in 80 cols
llvm-svn: 109122
2010-07-22 17:18:03 +00:00
Gabor Greif dde79d8f1a mass elimination of reliance on automatic iterator dereferencing
llvm-svn: 109103
2010-07-22 13:36:47 +00:00
Evan Cheng bf32e54bac Re-apply r109079 with fix.
llvm-svn: 109083
2010-07-22 06:24:48 +00:00
Owen Anderson 6c55cccf87 Revert r109079, which broke a lot of CodeGen tests.
llvm-svn: 109082
2010-07-22 06:01:28 +00:00
Evan Cheng bd81bff672 Initialize RegLimit only when register pressure is being tracked.
llvm-svn: 109079
2010-07-22 05:18:41 +00:00
Evan Cheng 285903853f More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Evan Cheng a77f3d3b37 Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
llvm-svn: 108991
2010-07-21 06:09:07 +00:00
Dan Gohman b5e918dc05 After a custom inserter, in a block which has constant instructions,
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.

llvm-svn: 108765
2010-07-19 22:48:56 +00:00
Evan Cheng 10f99a3490 ARM has to provide its own TargetLowering::findRepresentativeClass because its scalar floating point registers alias its vector registers.
llvm-svn: 108761
2010-07-19 22:15:08 +00:00
Evan Cheng 7a135510e3 Teach computeRegisterProperties() to compute "representative" register class for legal value types. A "representative" register class is the largest legal super-reg register class for a value type. e.g. On i386, GR32 is the rep register class for i8 / i16 / i32; on x86_64 it would be GR64.
This property will be used by the register pressure tracking instruction scheduler.

llvm-svn: 108735
2010-07-19 18:47:01 +00:00
Owen Anderson 9c271e2835 Remove r108639 now that it is handled by InstCombine instead.
llvm-svn: 108688
2010-07-19 08:10:24 +00:00
Owen Anderson f7f9c8a2f7 Add a DAGCombine xform to fold away redundant float->double->float conversions around sqrt instructions.
I am assured by people more knowledgeable than me that there are no rounding issues in eliminating this.

This fixed <rdar://problem/8197504>.

llvm-svn: 108639
2010-07-18 08:47:54 +00:00
Eric Christopher 0baaa9bcc1 Propagate alloca alignment information via variable size object frame
information.

No functional change yet.

llvm-svn: 108583
2010-07-17 00:28:22 +00:00
Dan Gohman 1e936277c3 Revert r108369, sorting llvm.dbg.declare information by source position,
since it doesn't work for front-ends which don't emit column information
(which includes llvm-gcc in its present configuration), and doesn't
work for clang for K&R style variables where the variables are declared
in a different order from the parameter list.

Instead, make a separate pass through the instructions to collect the
llvm.dbg.declare instructions in order. This ensures that the debug
information for variables is emitted in this order.

llvm-svn: 108538
2010-07-16 17:54:27 +00:00
Dan Gohman 103c4ebea5 Use the source-order scheduler instead of the "fast" scheduler at -O0,
because it's more likely to keep debug line information in its original
order.

llvm-svn: 108496
2010-07-16 02:01:19 +00:00
Dale Johannesen bfd4fd7bb7 The SelectionDAGBuilder's handling of debug info, on rare
occasions, caused code to be generated in a different order.
All cases I've seen involved float softening in the type
legalizer, and this could be perhaps be fixed there, but
it's better not to generate things differently in the first
place.  7797940 (6/29/2010..7/15/2010).

llvm-svn: 108484
2010-07-16 00:02:08 +00:00
Bill Wendling 4bda1c8e68 Revert. This isn't the correct way to go.
llvm-svn: 108478
2010-07-15 23:42:21 +00:00
Bill Wendling 973dc3b1d8 Handle code gen for the unreachable instruction if it's the only instruction in
the function. We'll just turn it into a "trap" instruction instead.

The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:

$ cat t.ll
define void @foo() {
entry:
  unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
        .section        __TEXT,__text,regular,pure_instructions
        .globl  _foo
        .align  4, 0x90
_foo:                                   ## @foo
Leh_func_begin0:
## BB#0:                                ## %entry
        pushq   %rbp
Ltmp0:
        movq    %rsp, %rbp
Ltmp1:
Leh_func_end0:
...

The unwind tables then have bad data in them causing all sorts of problems.

Fixes <rdar://problem/8096481>.

llvm-svn: 108473
2010-07-15 23:32:40 +00:00
Evan Cheng 55f0c6b9fc Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Devang Patel df09db62e2 Fix crash reported in PR7653.
llvm-svn: 108441
2010-07-15 18:45:27 +00:00
Eric Christopher 474e56a2bf 80-col.
llvm-svn: 108381
2010-07-14 23:41:32 +00:00
Dan Gohman c12a6731c5 Properly restore DebugLoc after leaving the local constant area.
llvm-svn: 108364
2010-07-14 22:01:31 +00:00
Dan Gohman 042523340b Delete fast-isel's trivial load optimization; it breaks debugging because
it can look past points where a debugger might modify user variables.

llvm-svn: 108336
2010-07-14 17:25:37 +00:00
Dan Gohman 1f471435f8 Don't propagate debug locations to instructions for materializing
constants, since they may not be emited near the other instructions
which get the same line, and this confuses debug info.

llvm-svn: 108302
2010-07-14 01:07:44 +00:00
Dale Johannesen caca5488dc In inline asm treat indirect 'X' constraint as 'm'.
This may not be right in all cases, but it's better
than asserting which it was doing before.  PR 7528.

llvm-svn: 108268
2010-07-13 20:17:05 +00:00
Rafael Espindola a18c5a0e5e Fix a typo and fit in 80 columns. Found by Bob Wilson.
llvm-svn: 108164
2010-07-12 18:11:17 +00:00
Duncan Sands 41b4a6b36a Convert some tab stops into spaces.
llvm-svn: 108130
2010-07-12 08:16:59 +00:00
Jakob Stoklund Olesen 51642aea77 Use COPY for fast-isel bitconvert, but don't create cross-class copies.
This doesn't change the behavior of SelectBitcast for X86.

llvm-svn: 108073
2010-07-11 05:16:54 +00:00
Rafael Espindola a76eccf815 Fix va_arg for doubles. With this patch VAARG nodes always contain the
correct alignment information, which simplifies ExpandRes_VAARG a bit.

The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:

* The 's' in target data: If this is set to the minimal alignment of any
  argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
  example.
* The getTransientStackAlignment method. It is possible for an architecture to
  have argument less aligned than what we maintain the stack pointer.

llvm-svn: 108072
2010-07-11 04:01:49 +00:00
Jakob Stoklund Olesen 7147ab9e78 Use COPY for extracting ImplicitDef'ed values from fast-isel instructions.
This assumes that the registers can be copied which is probably a safe
assumption.

llvm-svn: 108070
2010-07-11 03:31:05 +00:00
Jakob Stoklund Olesen 3bb1267431 Use COPY in FastISel everywhere it is safe and trivial.
The remaining copyRegToReg calls actually check the return value (shock!), so we
cannot trivially replace them with COPY instructions.

llvm-svn: 108069
2010-07-11 03:31:00 +00:00
Dan Gohman a64a323564 Fix a bug in the code which re-inserts DBG_VALUE nodes after scheduling;
if a block is split (by a custom inserter), the insert point may be in a
different block than it was originally. This fixes 32-bit llvm-gcc
bootstrap builds, and I haven't been able to reproduce it otherwise.

llvm-svn: 108060
2010-07-10 22:42:31 +00:00
Jakob Stoklund Olesen e50d30d586 Emit COPY instructions instead of using copyRegToReg in InstrEmitter,
ScheduleDAGEmit, TwoAddressLowering, and PHIElimination.

This switches the bulk of register copies to using COPY, but many less used
copyRegToReg calls remain.

llvm-svn: 108050
2010-07-10 19:08:25 +00:00
Dan Gohman fbdba81550 Insert IMPLICIT_DEF instructions at the current insert position, not
at the end of the block.

llvm-svn: 108045
2010-07-10 13:55:45 +00:00
Dan Gohman d7b5ce3312 Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Bill Wendling f831d86311 Clarify what mysterious check means.
llvm-svn: 108005
2010-07-09 19:44:12 +00:00
Bob Wilson 6586e9b203 --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 107987
2010-07-09 16:37:18 +00:00
Gabor Greif 52617fc462 cache result of operator*
llvm-svn: 107980
2010-07-09 16:08:33 +00:00
Dan Gohman 0b5aa1cdd3 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.

llvm-svn: 107943
2010-07-09 00:39:23 +00:00
Bob Wilson 21eed476e8 Reenable DAG combining for vector shuffles. It looks like it was temporarily
disabled and then never turned back on again.  Adjust some tests, one because
this change avoids an unnecessary instruction, and the other to make it
continue testing what it was intended to test.

llvm-svn: 107941
2010-07-09 00:38:12 +00:00
Bill Wendling a992445ff2 Extension of r107506. Make sure that we don't mark a function as having a call
if the inline ASM doesn't need a stack frame.

llvm-svn: 107922
2010-07-08 22:38:02 +00:00
Jakob Stoklund Olesen 00264624a9 Convert EXTRACT_SUBREG to COPY when emitting machine instrs.
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.

Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.

llvm-svn: 107879
2010-07-08 16:40:22 +00:00
Benjamin Kramer 0ae3f08c0d Merge the duplicated iabs optimization in DAGCombiner and let it detected a few more idioms.
llvm-svn: 107868
2010-07-08 12:09:56 +00:00
Dan Gohman e75704369d Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.

llvm-svn: 107850
2010-07-08 01:00:56 +00:00
Dan Gohman eb9164dc50 Don't forward-declare registers for static allocas, which we'll
prefer to materialize as local constants. This fixes the clang
bootstrap abort.

llvm-svn: 107840
2010-07-07 23:52:58 +00:00
Dan Gohman 1adc499dda Fix -fast-isel-abort to check the right instruction.
llvm-svn: 107839
2010-07-07 23:47:25 +00:00
Evan Cheng 1c349f18f8 Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake.
llvm-svn: 107820
2010-07-07 22:15:37 +00:00
Dan Gohman 25d5c1b4f8 Not all custom inserters create new basic blocks. If the inserter
didn't create a new block, don't reset the insert position.

llvm-svn: 107813
2010-07-07 21:18:22 +00:00
Dan Gohman e7ccc51cc1 Implement bottom-up fast-isel. This has the advantage of not requiring
a separate DCE pass over MachineInstrs.

llvm-svn: 107804
2010-07-07 19:20:32 +00:00
Dan Gohman 2d4d01d0de Add X86FastISel support for return statements. This entails refactoring
a bunch of stuff, to allow the target-independent calling convention
logic to be employed.

llvm-svn: 107800
2010-07-07 18:32:53 +00:00
Dan Gohman b792f844ad Update the insert position after scheduling, which may change the
position when emitting multiple blocks when executing a custom
inserter.

llvm-svn: 107797
2010-07-07 18:22:13 +00:00
Devang Patel 637ee5f149 Update comment.
llvm-svn: 107796
2010-07-07 18:18:18 +00:00
Dan Gohman ffe64b1ee5 Give FunctionLoweringInfo an MBB member, avoiding the need to pass it
around everywhere, and also give it an InsertPt member, to enable isel
to operate at an arbitrary position within a block, rather than just
appending to a block.

llvm-svn: 107791
2010-07-07 16:47:08 +00:00
Dan Gohman 87fb4e8fcd Simplify FastISel's constructor by giving it a FunctionLoweringInfo
instance, rather than pointers to all of FunctionLoweringInfo's
members.

This eliminates an NDEBUG ABI sensitivity.

llvm-svn: 107789
2010-07-07 16:29:44 +00:00
Dan Gohman e784616fbb Move FunctionLoweringInfo.h out into include/llvm/CodeGen. This will
allow target-specific fast-isel code to make use of it directly.

llvm-svn: 107787
2010-07-07 16:01:37 +00:00
Dan Gohman fe7532a308 Split the SDValue out of OutputArg so that SelectionDAG-independent
code can do calling-convention queries. This obviates OutputArgReg.

llvm-svn: 107786
2010-07-07 15:54:55 +00:00
Dan Gohman 498e5f899d Move CallingConvLower.cpp out of the SelectionDAG directory.
llvm-svn: 107781
2010-07-07 15:15:27 +00:00
Jim Grosbach dc0a0659be By default, the eh.sjlj.setjmp/longjmp intrinsics should just do nothing rather
than assuming a target will custom lower them. Targets which do so should
exlicitly mark them as having custom lowerings. PR7454.

llvm-svn: 107734
2010-07-06 23:44:52 +00:00
Dan Gohman ee0cb70381 CanLowerReturn doesn't need a SelectionDAG; it just needs an LLVMContext.
SelectBasicBlock doesn't needs its BasicBlock argument.

llvm-svn: 107712
2010-07-06 22:19:37 +00:00
Devang Patel a3ca21b228 Propagate debug loc.
llvm-svn: 107710
2010-07-06 22:08:15 +00:00
Dan Gohman 3439629239 Reapply r107655 with fixes; insert the pseudo instruction into
the block before calling the expansion hook. And don't
put EFLAGS in a mbb's live-in list twice.

llvm-svn: 107691
2010-07-06 20:24:04 +00:00
Dan Gohman 4e49b59dad Add versions of OutputArgReg, AnalyzeReturn, and AnalyzeCallOperands
which do not depend on SelectionDAG.

llvm-svn: 107666
2010-07-06 15:39:54 +00:00
Chris Lattner c4a7073db3 more tidying.
llvm-svn: 107615
2010-07-05 05:53:14 +00:00
Chris Lattner 2c0315a0f3 random tidying
llvm-svn: 107612
2010-07-05 05:36:21 +00:00
Evan Cheng f3aeb2c22c Infer alignments of fixed frame objects when they are constructed. This ensures remat'ed loads from fixed slots have the right alignments.
llvm-svn: 107591
2010-07-04 18:52:05 +00:00
Bill Wendling f844642350 Proper indentation.
llvm-svn: 107581
2010-07-04 08:58:43 +00:00
Dale Johannesen 4d887f7ca7 Propagate the AlignStack bit in InlineAsm's to the
PrologEpilog code, and use it to determine whether
the asm forces stack alignment or not.  gcc consistently
does not do this for GCC-style asms; Apple gcc inconsistently
sometimes does it for asm blocks.  There is no
convenient place to put a bit in either the SDNode or
the MachineInstr form, so I've added an extra operand
to each; unlovely, but it does allow for expansion for
more bits, should we need it.  PR 5125.  Some
existing testcases are affected.
The operand lists of the SDNode and MachineInstr forms
are indexed with awesome mnemonics, like "2"; I may
fix this someday, but not now.  I'm not making it any
worse.  If anyone is inspired I think you can find all
the right places from this patch.

llvm-svn: 107506
2010-07-02 20:16:09 +00:00
Jim Grosbach 9b7755fbc6 80-column and trailing whitespace cleanup.
llvm-svn: 107490
2010-07-02 17:41:59 +00:00
Jim Grosbach 64a4f3f062 grammar tweaks
llvm-svn: 107489
2010-07-02 17:38:34 +00:00
Dan Gohman 93f5920914 Rename CreateReg to CreateRegs, and MakeReg to CreateReg.
llvm-svn: 107451
2010-07-02 00:10:16 +00:00
Dan Gohman d2965c10a1 Temporarily disable on-demand fast-isel.
llvm-svn: 107393
2010-07-01 12:15:30 +00:00
Dan Gohman 42b7ee15f5 Use FuncInfo's isExportedInst accessor method instead of
doing the work manually.

llvm-svn: 107384
2010-07-01 03:57:05 +00:00
Dan Gohman 85e02e9340 Rename CreateRegForValue to CreateReg, and change its argument
from a Value to a Type, because it doesn't actually care about
the Value.

llvm-svn: 107383
2010-07-01 03:55:39 +00:00
Dan Gohman aef3d140b7 Teach fast-isel to avoid loading a value from memory when it's already
available in a register. This is pretty primitive, but it reduces the
number of instructions in common testcases by 4%.

llvm-svn: 107380
2010-07-01 03:49:38 +00:00
Dan Gohman 722f5fc567 Enable on-demand fast-isel.
llvm-svn: 107377
2010-07-01 02:58:57 +00:00
Dan Gohman d432223163 Reapply r106422, splitting the code for materializing a value out of
SelectionDAGBuilder::getValue into a helper function, with fixes to
use DenseMaps safely.

llvm-svn: 107371
2010-07-01 01:59:43 +00:00
Dan Gohman 9576645a84 Don't use operator[] here, because it's not desirable to insert a default
value if the search fails.

llvm-svn: 107368
2010-07-01 01:33:21 +00:00
Jim Grosbach caf9b3ab7d grammar tweak in comment.
llvm-svn: 107321
2010-06-30 21:27:56 +00:00
Duncan Sands 945a347478 Remove an unused variable. The call to getRoot has side-effects, so
this could break something (but doesn't seem to).

llvm-svn: 107295
2010-06-30 17:22:28 +00:00
Gabor Greif 647d9c9797 use ArgOperand API
llvm-svn: 107282
2010-06-30 13:45:50 +00:00
Gabor Greif f69acfe133 use ArgOperand API
llvm-svn: 107279
2010-06-30 12:55:46 +00:00
Rafael Espindola 38a7d7cbc3 Add a VT argument to getMinimalPhysRegClass and replace the copy related uses
of getPhysicalRegisterRegClass with it.

If we want to make a copy (or estimate its cost), it is better to use the
smallest class as more efficient operations might be possible.

llvm-svn: 107140
2010-06-29 14:02:34 +00:00
Duncan Sands 6d28e73acc Remove initialized but otherwise unused variables.
llvm-svn: 107127
2010-06-29 11:22:26 +00:00
Bob Wilson 269a89fd3a Unlike other targets, ARM now uses BUILD_VECTORs post-legalization so they
can't be changed arbitrarily by the DAGCombiner without checking if it is
running after legalization.

llvm-svn: 107097
2010-06-28 23:40:25 +00:00
Dale Johannesen 17feb07c53 In asm's, output operands with matching input constraints
have to be registers, per gcc documentation.  This affects
the logic for determining what "g" should lower to.  PR 7393.
A couple of existing testcases are affected.

llvm-svn: 107079
2010-06-28 22:09:45 +00:00
Rafael Espindola 2041abd958 When splitting a VAARG, remember its alignment.
This produces terrible but correct code.

llvm-svn: 106952
2010-06-26 18:22:20 +00:00
Evan Cheng 02b184de5b Change if-conversion block size limit checks to add some flexibility.
llvm-svn: 106901
2010-06-25 22:42:03 +00:00
Dale Johannesen ce97d55ad9 The hasMemory argument is irrelevant to how the argument
for an "i" constraint should get lowered; PR 6309.  While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.

llvm-svn: 106893
2010-06-25 21:55:36 +00:00
Duncan Sands 2dc70bea54 Remove variables which are assigned to but for which the value
is not used.  Spotted by gcc-4.6.

llvm-svn: 106854
2010-06-25 14:48:39 +00:00
Gabor Greif eba0be7dc9 use ArgOperand API
llvm-svn: 106836
2010-06-25 09:38:13 +00:00
Gabor Greif e4eed709d4 use ArgOperand API
llvm-svn: 106828
2010-06-25 08:24:59 +00:00
Gabor Greif f6207e0a80 prune an include
llvm-svn: 106827
2010-06-25 08:16:50 +00:00
Bill Wendling 2d3c490026 It's possible that a flag is added to the SDNode that points back to the
original SDNode. This is badness. Also, this function allows one SDNode to point
multiple flags to another SDNode. Badness as well.

llvm-svn: 106793
2010-06-24 22:00:37 +00:00
Dan Gohman 8a84cd57ae Simplify this code; switch lowering shouldn't produce cases
which trivially fold away.

llvm-svn: 106765
2010-06-24 17:08:31 +00:00
Dan Gohman 463f26b4be Eliminate the other half of the BRCOND optimization, and update
as many tests as possible.

llvm-svn: 106749
2010-06-24 15:24:03 +00:00
Dan Gohman df6b33e778 Eliminate the first have of the optimization which eliminates BRCOND
when the condition is constant. This optimization shouldn't be
necessary, because codegen shouldn't be able to find dead control
paths that the IR-level optimizer can't find. And it's undesirable,
because it encourages bugpoint to leave "br i1 false" branches
in its output. And it wasn't updating the CFG.

I updated all the tests I could, but some tests are too reduced
and I wasn't able to meaningfully preserve them.

llvm-svn: 106748
2010-06-24 15:04:11 +00:00
Dan Gohman 600f62b3ba Reapply r106634, now that the bug it exposed is fixed.
llvm-svn: 106746
2010-06-24 14:30:44 +00:00
Dan Gohman 0695e09b09 Optimize the "bit test" code path for switch lowering in the
case where the bit mask has exactly one bit.

llvm-svn: 106716
2010-06-24 02:06:24 +00:00
Bill Wendling a136521a17 MorphNodeTo doesn't preserve the memory operands. Because we're morphing a node
into the same node, but with different non-memory operands, we need to replace
the memory operands after it's finished morphing.

llvm-svn: 106643
2010-06-23 18:16:24 +00:00
Daniel Dunbar 4df321b7ad Revert r106263, "Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,"... it was causing both 'file' (with clang) and 176.gcc (with llvm-gcc) to be miscompiled.
llvm-svn: 106634
2010-06-23 17:09:26 +00:00
Jim Grosbach b58c08b0ba Some targets don't require the fencing MEMBARRIER instructions surrounding
atomic intrinsics, either because the use locking instructions for the
atomics, or because they perform the locking directly. Add support in the
DAG combiner to fold away the fences.

llvm-svn: 106630
2010-06-23 16:07:42 +00:00
Dan Gohman dd41bba517 Use A.append(...) instead of A.insert(A.end(), ...) when A is a
SmallVector, and other SmallVector simplifications.

llvm-svn: 106452
2010-06-21 19:47:52 +00:00
Dan Gohman bbc29ea821 Revert r106422, which is breaking the non-fast-isel path.
llvm-svn: 106423
2010-06-21 16:02:28 +00:00
Dan Gohman f64fdd69d0 More changes for non-top-down fast-isel.
Split the code for materializing a value out of
SelectionDAGBuilder::getValue into a helper function, so that it can
be used in other ways. Add a new getNonRegisterValue function which
uses it, for use in code which doesn't want a CopyFromReg even
when FuncMap.ValueMap already has an entry for it.

llvm-svn: 106422
2010-06-21 15:13:54 +00:00
Dan Gohman f91aff5f13 Do one lookup instead of two.
llvm-svn: 106415
2010-06-21 14:21:47 +00:00
Dan Gohman 7c58cf75fa Generalize this to look in the regular ValueMap in addition to
the LocalValueMap, to make it more flexible when fast-isel isn't
proceding straight top-down.

llvm-svn: 106414
2010-06-21 14:17:46 +00:00
Dan Gohman 8693650422 Teach regular and fast isel to set dead flags on unused implicit defs
on calls and similar instructions.

llvm-svn: 106353
2010-06-18 23:28:01 +00:00
Jim Grosbach a57c2885cf back-end libcall handling for ATOMIC_SWAP (__sync_lock_test_and_set)
llvm-svn: 106342
2010-06-18 23:03:10 +00:00
Evan Cheng f5d62535a5 Fix cross initialization compilation error.
llvm-svn: 106324
2010-06-18 22:01:37 +00:00
Jim Grosbach d64dfc1568 Add Expand-to-libcall support for additional atomics. This covers the usual
entries used by llvm-gcc. *_[U]MIN and such can be added later if needed.

This enables the front ends to simplify handling of the atomic intrinsics by
removing the target-specific decision about which targets can handle the
intrinsics.

llvm-svn: 106321
2010-06-18 21:43:38 +00:00
Dan Gohman 7edb39cc6b Minor code simplifications.
llvm-svn: 106286
2010-06-18 16:00:29 +00:00
Dan Gohman 6e681a5fbe Give NamedRegionTimer an Enabled flag, allowing all its clients to
switch from this:

  if (TimePassesIsEnabled) {
    NamedRegionTimer T(Name, GroupName);
    do_something();
  } else {
    do_something(); // duplicate the code, this time without a timer!
  }

to this:

  {
    NamedRegionTimer T(Name, GroupName, TimePassesIsEnabled);
    do_something();
  }

llvm-svn: 106285
2010-06-18 15:56:31 +00:00
Dan Gohman 96ca25eba5 Don't replace the old Ordering object with a new one; just clear()
the old one.

llvm-svn: 106284
2010-06-18 15:40:58 +00:00
Dan Gohman a4f46b3ef8 Don't call clear() on DbgInfo when it's going to be deleted anyway.
Don't replace the old DbgInfo with a new one when clear() on the
old one is sufficient.

llvm-svn: 106283
2010-06-18 15:36:18 +00:00
Dan Gohman 92c11acdb8 Change UpdateNodeOperands' operand and return value from SDValue to
SDNode *, since it doesn't care about the ResNo value.

llvm-svn: 106282
2010-06-18 15:30:29 +00:00
Dan Gohman f1d8304fe3 Eliminate unnecessary uses of getZExtValue().
llvm-svn: 106279
2010-06-18 14:22:04 +00:00
Dan Gohman 35b6f9a929 isValueValidForType can be a static member function.
llvm-svn: 106278
2010-06-18 14:01:07 +00:00
Dan Gohman b92156d5e4 Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,
which is faster, simpler, and less surprising.

llvm-svn: 106263
2010-06-18 01:05:21 +00:00
Dan Gohman 0883789ec4 Handle ext(ext(x)) -> ext(x) immediately, since it's simple.
llvm-svn: 106256
2010-06-18 00:08:30 +00:00
Stuart Hastings 0125b6410a Add a DebugLoc parameter to TargetInstrInfo::InsertBranch(). This
addresses a longstanding deficiency noted in many FIXMEs scattered
across all the targets.

This effectively moves the problem up one level, replacing eleven
FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path
through FastISel where we actually supply a DebugLoc, fixing Radar
7421831.

llvm-svn: 106243
2010-06-17 22:43:56 +00:00
Jim Grosbach 0ed5b460dc add missing break. inconsequential as the code shouldn't be reached, but
for correctness' sake, it should be there.

llvm-svn: 106229
2010-06-17 17:58:54 +00:00
Jim Grosbach 3aeae8aeeb Add entries for Expanding atomic intrinsics to libcalls. Just a placeholder
for the moment. The implementation of the libcall will follow.

Currently, the llvm-gcc knows when the intrinsics can be correctly handled by
the back end and only generates them in those cases, issuing libcalls directly
otherwise. That's too much coupling. The intrinsics should always be
generated and the back end decide how to handle them, be it with a libcall,
inline code, or whatever. This patch is a step in that direction.

rdar://8097623

llvm-svn: 106227
2010-06-17 17:50:54 +00:00
Jim Grosbach ba451e80dc ISD::MEMBARRIER should lower to a libcall (__sync_synchronize) if the target
sets the legalize action to Expand.

llvm-svn: 106203
2010-06-17 02:00:53 +00:00
Mon P Wang 7a84689cc5 Fixed vector widening of binary instructions that can trap. Patch by Visa Putkinen!
llvm-svn: 106038
2010-06-15 20:29:05 +00:00
Evan Cheng 38f6560461 Code refactoring, no functionality changes.
llvm-svn: 105775
2010-06-10 02:09:31 +00:00
Jakob Stoklund Olesen 8bc5eca331 Mark physregs defined by inline asm as implicit.
This is a bit of a hack to make inline asm look more like call instructions.
It would be better to produce correct dead flags during isel.

llvm-svn: 105749
2010-06-09 20:05:00 +00:00
Jakob Stoklund Olesen a13b1c29b0 Add argument name comments.
llvm-svn: 105665
2010-06-09 00:40:31 +00:00
Mon P Wang 622cdd2297 Fixed a bug during widening where we would avoid legalizing a node. When we
replace an OpA with a widened OpB, it is possible to get new uses of OpA due to CSE
when recursively updating nodes.  Since OpA has been processed, the new uses are
not examined again.  The patch checks if this occurred and it it did, updates the
new uses of OpA to use OpB.

llvm-svn: 105453
2010-06-04 01:20:10 +00:00
Dan Gohman d83e3e7750 Fix SimplifyDemandedBits' AssertZext logic to demand all the bits. It
needs to demand the high bits because it's asserting that they're zero.

llvm-svn: 105406
2010-06-03 20:21:33 +00:00
Eli Friedman dbbbf73c96 Implement expansion in type legalization for add/sub with overflow. The
expansion is the same as that used by LegalizeDAG.

The resulting code sucks in terms of performance/codesize on x86-32 for a
64-bit operation; I haven't looked into whether different expansions might be
better in general.

llvm-svn: 105378
2010-06-03 03:49:50 +00:00
Devang Patel b0c76394a3 Keep track of incoming debug value of unused argument.
Radar 7927666.

llvm-svn: 105285
2010-06-01 19:59:01 +00:00
Dan Gohman b782caa393 Fill in missing support for ISD::FEXP, ISD::FPOWI, and friends.
llvm-svn: 105283
2010-06-01 18:35:14 +00:00
Chris Lattner 14c46517b5 fix PR6623: when optimizing for size, don't inline memcpy/memsets
that are too large.  This causes the freebsd bootloader to be too
large apparently.

It's unclear if this should be an -Os or -Oz thing.  Thoughts welcome.

llvm-svn: 105228
2010-05-31 17:30:14 +00:00
Chris Lattner b4a773b452 the 'limit' argument to FindOptimalMemOpLowering is unsigned, not uint64_t.
llvm-svn: 105226
2010-05-31 17:12:23 +00:00
Oscar Fuentes a97311f152 Use `llvm::next' instead of `next' to make VC++ 2010 happy.
llvm-svn: 105168
2010-05-30 13:14:21 +00:00