Commit Graph

5822 Commits

Author SHA1 Message Date
Hal Finkel e0cf6397fd Remove dead SD nodes after the combining pass. Fixes PR12201.
llvm-svn: 154786
2012-04-16 03:33:22 +00:00
Nadav Rotem 02ef0c3524 When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type.
llvm-svn: 154764
2012-04-15 15:08:09 +00:00
Nadav Rotem 9d376b6578 Reapply 154397. Original message:
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154490
2012-04-11 08:26:11 +00:00
Craig Topper 692d584910 Fix an overly indented line. Remove an 'else' after an 'if' that returns.
llvm-svn: 154479
2012-04-11 04:55:51 +00:00
Craig Topper bc680061e8 Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to an nodetype.
llvm-svn: 154478
2012-04-11 04:34:11 +00:00
Craig Topper 3ef01cdb2e Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit.
llvm-svn: 154473
2012-04-11 03:06:35 +00:00
Owen Anderson 6f1ee1634d Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point.
Zap a testcase that this allows us to completely fold away.

llvm-svn: 154447
2012-04-10 22:46:53 +00:00
Duncan Sands 4f53074cca Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.

llvm-svn: 154431
2012-04-10 20:35:27 +00:00
Eric Christopher e9abba71fe To ensure that we have more accurate line information for a block
don't elide the branch instruction if it's the only one in the block,
otherwise it's ok.

PR9796 and rdar://11215207

llvm-svn: 154417
2012-04-10 18:18:10 +00:00
Owen Anderson 3efc8f22bd Revert r154397, which was causing make check failures on the buildbots.
llvm-svn: 154414
2012-04-10 18:02:12 +00:00
Nadav Rotem 065564d85a Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154397
2012-04-10 14:58:31 +00:00
Anton Korobeynikov 4d1220de34 Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (workarounded)

llvm-svn: 154394
2012-04-10 13:22:49 +00:00
Evan Cheng 136861d994 Make the code slightly more palatable.
llvm-svn: 154378
2012-04-10 03:15:18 +00:00
Evan Cheng f8bad08001 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370
2012-04-10 01:51:00 +00:00
Rafael Espindola 1d9672bdce Don't try to zExt just to check if an integer constant is zero, it might
not fit in a i64.

llvm-svn: 154364
2012-04-10 00:16:22 +00:00
Akira Hatanaka 8483a6c47d Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.

llvm-svn: 154341
2012-04-09 20:32:12 +00:00
Rafael Espindola 8f62b3248e Pattern match a setcc of boolean value with 0 as a truncate.
llvm-svn: 154322
2012-04-09 16:06:03 +00:00
Craig Topper 9c3da316ec Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.
llvm-svn: 154309
2012-04-09 07:19:09 +00:00
Craig Topper e5893f64e8 Remove unnecessary 'else' on an 'if' that always returns
llvm-svn: 154308
2012-04-09 05:59:53 +00:00
Craig Topper e3ad4834ae Optimize code slightly. No functionality change.
llvm-svn: 154307
2012-04-09 05:55:33 +00:00
Craig Topper 5894fe430a Replace some explicit checks with asserts for conditions that should never happen.
llvm-svn: 154305
2012-04-09 05:16:56 +00:00
Craig Topper 6148fe65e8 Optimize code a bit. No functional change intended.
llvm-svn: 154299
2012-04-08 23:15:04 +00:00
Benjamin Kramer bb6ff08766 Silence sign-compare warning.
llvm-svn: 154297
2012-04-08 19:04:45 +00:00
Duncan Sands 2f1dc3814b Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.

llvm-svn: 154296
2012-04-08 18:08:12 +00:00
Craig Topper c8e2d91a58 Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently.
llvm-svn: 154295
2012-04-08 17:53:33 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Craig Topper d024cef233 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper e09d1c5c48 Remove 'else' after 'if' that ends in return.
llvm-svn: 154267
2012-04-07 21:23:41 +00:00
Nadav Rotem 71d07ae5cb 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.

llvm-svn: 154266
2012-04-07 21:19:08 +00:00
Duncan Sands 5f8397a934 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.

llvm-svn: 154265
2012-04-07 20:04:00 +00:00
Jakob Stoklund Olesen 37492eac8c Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

llvm-svn: 154119
2012-04-05 20:30:20 +00:00
Owen Anderson a6eebf6013 Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection.
llvm-svn: 154113
2012-04-05 18:50:32 +00:00
Pete Cooper 8a3dc0ed8c f16 FREM can now be legalized by promoting to f32
llvm-svn: 154039
2012-04-04 19:36:31 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Craig Topper 4c7d995029 Remove default case from switch that was already covering all cases.
llvm-svn: 153996
2012-04-04 04:42:42 +00:00
Pete Cooper e7bff68a5e Removed useless switch for default case when switch was covering all the enum values
llvm-svn: 153984
2012-04-04 00:53:04 +00:00
Pete Cooper 9511ec86f9 Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
llvm-svn: 153976
2012-04-03 22:57:55 +00:00
Chad Rosier 2a02fe1bb2 Fix an issue in SimplifySetCC() specific to vector comparisons.
When folding X == X we need to check getBooleanContents() to determine if the
result is a vector of ones or a vector of negative ones. 

I tried creating a test case, but the problem seems to only be exposed on a
much older version of clang (around r144500).
rdar://10923049

llvm-svn: 153966
2012-04-03 20:11:24 +00:00
Owen Anderson 98f2c0c384 Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
2012-04-02 22:10:29 +00:00
Nadav Rotem 702f080767 Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Nadav Rotem b078350872 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Rafael Espindola 80c540e656 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

llvm-svn: 153817
2012-03-31 18:14:00 +00:00
Bill Wendling 9f829f1cc4 If we have a VLA that has a "use" in a metadata node that's then used
here but it has no other uses, then we have a problem. E.g.,

  int foo (const int *x) {
    char a[*x];
    return 0;
  }

If we assign 'a' a vreg and fast isel later on has to use the selection
DAG isel, it will want to copy the value to the vreg. However, there are
no uses, which goes counter to what selection DAG isel expects.
<rdar://problem/11134152>

llvm-svn: 153705
2012-03-30 00:02:55 +00:00
Eric Christopher 24a6298512 More debug output.
llvm-svn: 153571
2012-03-28 07:34:36 +00:00
Chris Lattner 1cc25e8a40 fix what looks like a real logic bug, found by PVS-Studio (part of PR12357)
llvm-svn: 153513
2012-03-27 16:27:21 +00:00
Eric Christopher c1e2dcdb8a Add a debug statement.
llvm-svn: 153428
2012-03-26 06:10:32 +00:00
Hal Finkel 71c2ba3d2e Add the ability to promote legal integer VAARGs. This is required for the PPC64 SVR4 ABI.
llvm-svn: 153372
2012-03-24 03:53:52 +00:00
Evan Cheng 8ab58a21a5 Source order scheduler should not preschedule nodes with multiple uses. rdar://11096639
llvm-svn: 153270
2012-03-22 19:31:17 +00:00
Evan Cheng 79f03e915d Assign node orders to target intrinsics which do not produce results. rdar://11096639
llvm-svn: 153269
2012-03-22 19:29:09 +00:00
Chad Rosier 6a63a74113 [fast-isel] Fold "urem x, pow2" -> "and x, pow2-1". This should fix the 271%
execution-time regression for nsieve-bits on the ARMv7 -O0 -g nightly tester.
This may also improve compile-time on architectures that would otherwise 
generate a libcall for urem (e.g., ARM) or fall back to the DAG selector.
rdar://10810716

llvm-svn: 153230
2012-03-22 00:21:17 +00:00
Jim Grosbach e13adc38d0 Checking a build_vector for an all-ones value.
Type legalization can zero-extend the elements of the build_vector node, so,
for example, we may have an <8 x i8> with i32 elements of value 255. That
should return 'true' for the vector being all ones.

llvm-svn: 153203
2012-03-21 17:48:04 +00:00
Craig Topper aaeae98936 When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend.
llvm-svn: 153078
2012-03-20 05:28:39 +00:00
Eric Christopher 60e01c560a Do everything up to generating code to try to get a register for
a variable. The previous code would break the debug info changing
code invariant. This will regress debug info for arguments where
we elide the alloca created.

Fixes rdar://11066468

llvm-svn: 153074
2012-03-20 01:07:58 +00:00
Eric Christopher 997aaa9237 Untabify.
llvm-svn: 153073
2012-03-20 01:07:56 +00:00
Eric Christopher e5e54c87fa Add another debugging statement here.
llvm-svn: 153072
2012-03-20 01:07:53 +00:00
Eric Christopher 1a06cc9ae6 Use lookUpRegForValue here instead of duplicating the code.
llvm-svn: 153071
2012-03-20 01:07:47 +00:00
Pete Cooper e69be6df4f f16 FDIV can now be legalized by promoting to f32
llvm-svn: 153064
2012-03-19 23:38:12 +00:00
Duncan Sands 3fb2fc6edb Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala.
llvm-svn: 153035
2012-03-19 15:35:44 +00:00
NAKAMURA Takumi a7e57ace28 Revert r152613 (and r152614), "Inline the d'tor and add an anchor instead." for workaround of g++-4.4's miscompilation.
It caused MSP430DAGToDAGISel::SelectIndexedBinOp() to be miscompiled.
When two ReplaceUses()'s are expanded as inline, vtable in base class is stored to latter (ISelUpdater)ISU.

llvm-svn: 152877
2012-03-16 00:01:55 +00:00
Eric Christopher 3390a6e5e3 We actually handle AllocaInst via getRegForValue below just fine.
Part of rdar://8905263

llvm-svn: 152845
2012-03-15 21:33:47 +00:00
Eric Christopher 142820ba8d Add some debugging output into fast isel as well.
llvm-svn: 152844
2012-03-15 21:33:44 +00:00
Eric Christopher be7a1016fc Add another debug statement.
llvm-svn: 152843
2012-03-15 21:33:41 +00:00
Nadav Rotem 6fd1d32c63 When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations.
llvm-svn: 152784
2012-03-15 08:49:06 +00:00
Bill Wendling df170db2f6 Add a xform to the DAG combiner.
Transform:

        (fsub x, (fadd x, y)) -> (fneg y) and
        (fsub x, (fadd y, x)) -> (fneg y)

if 'unsafe math' is specified.
<rdar://problem/7540295>

llvm-svn: 152777
2012-03-15 05:12:00 +00:00
Bill Wendling 618d57310a Insert the debugging instructions in one fell-swoop so that it doesn't call the
expensive "getFirstTerminator" call. This reduces the time of compilation in
PR12258 from >10 minutes to < 10 seconds.

llvm-svn: 152704
2012-03-14 07:14:25 +00:00
Evan Cheng d5f8e5766c Fortify r152675 a bit. Although I'm not able to come up with a test case that would trigger the truncation case.
llvm-svn: 152678
2012-03-13 22:16:11 +00:00
Evan Cheng 7bf83096df DAG combine incorrectly optimize (i32 vextract (v4i16 load $addr), c) to
(i16 load $addr+c*sizeof(i16)) and replace uses of (i32 vextract) with the
i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)).

rdar://11035895

llvm-svn: 152675
2012-03-13 22:00:52 +00:00
Bill Wendling ac499ab244 Add a return type.
llvm-svn: 152614
2012-03-13 05:52:28 +00:00
Bill Wendling 8adb10c8a9 Inline the d'tor and add an anchor instead.
llvm-svn: 152613
2012-03-13 05:51:56 +00:00
Bill Wendling 508a3e5185 Refactor the SelectionDAG's 'dump' methods into their own .cpp file.
No functionality change.

llvm-svn: 152611
2012-03-13 05:47:27 +00:00
Stepan Dyatkovskiy 97b02fc1b3 llvm::SwitchInst
Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default.
Added some notes relative to case iterators.

llvm-svn: 152532
2012-03-11 06:09:17 +00:00
Benjamin Kramer e1e549d617 Give dagcombiner's worklist some inline capacity.
llvm-svn: 152454
2012-03-10 00:23:58 +00:00
Craig Topper 5a4bcc749a Use uint16_t to store instruction implicit uses and defs. Reduces static data.
llvm-svn: 152301
2012-03-08 08:22:45 +00:00
Stepan Dyatkovskiy 5b648afb4d Taken into account Duncan's comments for r149481 dated by 2nd Feb 2012:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html

Implemented CaseIterator and it solves almost all described issues: we don't need to mix operand/case/successor indexing anymore. Base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".

ConstCaseIt is just a read-only iterator.
CaseIt is read-write iterator; it allows to change case successor and case value.

Usage of iterator allows totally remove resolveXXXX methods. All indexing convertions done automatically inside the iterator's getters.

Main way of iterator usage looks like this:
SwitchInst *SI = ... // intialize it somehow

for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
  BasicBlock *BB = i.getCaseSuccessor();
  ConstantInt *V = i.getCaseValue();
  // Do something.
}

If you want to convert case number to TerminatorInst successor index, just use getSuccessorIndex iterator's method.
If you want initialize iterator from TerminatorInst successor index, use CaseIt::fromSuccessorIndex(...) method.

There are also related changes in llvm-clients: klee and clang.

llvm-svn: 152297
2012-03-08 07:06:20 +00:00
Andrew Trick 52226d409b misched preparation: rename core scheduler methods for consistency.
We had half the API with one convention, half with another. Now was a
good time to clean it up.

llvm-svn: 152255
2012-03-07 23:00:49 +00:00
Andrew Trick 60cf03e772 misched preparation: clarify ScheduleDAG and ScheduleDAGInstrs roles.
ScheduleDAG is responsible for the DAG: SUnits and SDeps. It provides target hooks for latency computation.

ScheduleDAGInstrs extends ScheduleDAG and defines the current scheduling region in terms of MachineInstr iterators. It has access to the target's scheduling itinerary data. ScheduleDAGInstrs provides the logic for building the ScheduleDAG for the sequence of MachineInstrs in the current region. Target's can implement highly custom schedulers by extending this class.

ScheduleDAGPostRATDList provides the driver and diagnostics for current postRA scheduling. It maintains a current Sequence of scheduled machine instructions and logic for splicing them into the block. During scheduling, it uses the ScheduleHazardRecognizer provided by the target.

Specific changes:
- Removed driver code from ScheduleDAG. clearDAG is the only interface needed.

- Added enterRegion/exitRegion hooks to ScheduleDAGInstrs to delimit the scope of each scheduling region and associated DAG. They should be used to setup and cleanup any region-specific state in addition to the DAG itself. This is necessary because we reuse the same ScheduleDAG object for the entire function. The target may extend these hooks to do things at regions boundaries, like bundle terminators. The hooks are called even if we decide not to schedule the region. So all instructions in a block are "covered" by these calls.

- Added ScheduleDAGInstrs::begin()/end() public API.

- Moved Sequence into the driver layer, which is specific to the scheduling algorithm.

llvm-svn: 152208
2012-03-07 05:21:52 +00:00
Andrew Trick e932bb77b5 misched preparation: modularize schedule emission.
ScheduleDAG has nothing to do with how the instructions are scheduled.

llvm-svn: 152206
2012-03-07 05:21:44 +00:00
Andrew Trick edee68ce1b misched preparation: modularize schedule printing.
ScheduleDAG will not refer to the scheduled instruction sequence.

llvm-svn: 152205
2012-03-07 05:21:40 +00:00
Andrew Trick 46a58664f7 misched preparation: modularize schedule verification.
ScheduleDAG will not refer to the scheduled instruction sequence.

llvm-svn: 152204
2012-03-07 05:21:36 +00:00
Andrew Trick 7c6c41a56a whitespace
llvm-svn: 152203
2012-03-07 05:21:32 +00:00
Andrew Trick 1b2324d0e8 Cleanup in preparation for misched: Move DAG visualization logic.
Soon, ScheduleDAG will not refer to the BB.

llvm-svn: 152177
2012-03-07 00:18:22 +00:00
Andrew Trick 5297d8df99 whitespace
llvm-svn: 152175
2012-03-07 00:18:15 +00:00
Andrew Trick 0c84efe8dd Cleanup: DAG building is specific to either SD or MI scheduling. Not part of the target interface.
llvm-svn: 152174
2012-03-07 00:18:12 +00:00
Evan Cheng 80893ce5f5 Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation).
llvm-svn: 152162
2012-03-06 23:33:32 +00:00
Owen Anderson 2ee7c4dfc5 Make it possible for a target to mark FSUB as Expand. This requires providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal.
llvm-svn: 152079
2012-03-06 00:29:31 +00:00
Bill Wendling 7cf6db7e3c Fix warnings about adding a bool to a string.
Patch by Sean Silva!

llvm-svn: 152042
2012-03-05 19:29:36 +00:00
Craig Topper 1d32658877 Use uint16_t to store register overlaps to reduce static data.
llvm-svn: 152001
2012-03-04 10:43:23 +00:00
James Molloy f6298e9281 Fix a codegen fault in which log2 or exp2 could be dead-code eliminated even though they could have sideeffects.
Only allow log2/exp2 to be converted to an intrinsic if they are declared "readnone".

llvm-svn: 151807
2012-03-01 14:32:18 +00:00
Benjamin Kramer d05a0c6c42 LegalizeIntegerTypes: Reorder operations in the "big shift by small amount" optimization, making the lives of later passes easier.
llvm-svn: 151722
2012-02-29 13:27:00 +00:00
Evan Cheng 65f9d19c4f Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call.
llvm-svn: 151645
2012-02-28 18:51:51 +00:00
Benjamin Kramer f2e160c665 Fix off-by one in comment.
llvm-svn: 151644
2012-02-28 18:37:06 +00:00
Benjamin Kramer 0c281a7deb LegalizeIntegerTypes: Reenable the large shift with small amount optimization.
To avoid problems with zero shifts when getting the bits that move between words
we use a trick: first shift the by amount-1, then do another shift by one. When
amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32.

Also fix a latent bug that emitted the low and high words in the wrong order
when shifting right.

Fixes PR12113.

llvm-svn: 151637
2012-02-28 17:58:00 +00:00
Daniel Dunbar ee7b899343 Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.
llvm-svn: 151630
2012-02-28 15:36:07 +00:00
Nadav Rotem 1d666099be Code cleanup following CR by Duncan.
llvm-svn: 151627
2012-02-28 14:13:19 +00:00
Nadav Rotem 875e463b19 Fix a bug in the code that builds SDNodes from vector GEPs.
When the GEP index is a vector of pointers, the code that calculated the size
of the element started from the vector type, and not the contained pointer type.
As a result, instead of looking at the data element pointed by the vector, this
code used the size of the vector. This works for 32bit members (on 32bit
systems), but not for other types. Added code to peel the vector type and
added a test.

llvm-svn: 151626
2012-02-28 11:54:05 +00:00
Evan Cheng 87c7b09d8d Some ARM implementaions, e.g. A-series, does return stack prediction. That is,
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.

Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.

rdar://8979299

llvm-svn: 151623
2012-02-28 06:42:03 +00:00
Hal Finkel b9a3d61894 Don't crash when a glue node contains an internal CopyToReg
This is necessary to support the existing ppc lowering code for indirect calls.
Fixes PR12071.

llvm-svn: 151373
2012-02-24 17:53:59 +00:00
Benjamin Kramer 6fe3e3d335 SDAGBuilder: Remove register sets that were never read and prune dead code surrounding it.
llvm-svn: 151364
2012-02-24 14:01:17 +00:00
Pete Cooper 682c76b7d4 Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Eric Christopher da97054114 If the Address of a variable is an argument then treat the entire
variable declaration as an argument because we want that address
anyhow for our debug information.

This seems to fix rdar://9965111, at least we have more debug
information than before and from reading the assembly it appears
to be the correct location.

llvm-svn: 151335
2012-02-24 01:59:08 +00:00
Eric Christopher 219d51d649 Tabs, formatting and long lines oh my!
llvm-svn: 151334
2012-02-24 01:59:01 +00:00
Bill Wendling 38b31619f6 Allow an integer to be converted into an MMX type when it's used in an inline
asm.
<rdar://problem/10106006>

llvm-svn: 151303
2012-02-23 23:25:25 +00:00
Eric Christopher 18c6be7132 More newline cleanups.
llvm-svn: 151235
2012-02-23 03:39:43 +00:00
Eric Christopher 5c45205b79 Add some handy-dandy newlines.
llvm-svn: 151234
2012-02-23 03:39:39 +00:00
Michael J. Spencer 8b98bf2d6b Properly emit _fltused with FastISel. Refactor to share code with SDAG.
Patch by Joe Groff!

llvm-svn: 151183
2012-02-22 19:06:13 +00:00
Craig Topper 760b134ffa Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
llvm-svn: 151134
2012-02-22 05:59:10 +00:00
James Molloy 862fe49c55 Teach the DAGCombiner that certain loadext nodes followed by ANDs can be converted to zeroexts.
llvm-svn: 150957
2012-02-20 12:02:38 +00:00
Eric Christopher 81e2bf2b77 Ignore the lifetime intrinsics in fast-isel.
llvm-svn: 150848
2012-02-17 23:03:39 +00:00
James Molloy 920ae8c642 Remove extraneous #include and spelling mistake introduced in r150669.
llvm-svn: 150670
2012-02-16 09:48:07 +00:00
James Molloy 67b6b11b52 Modify the algorithm when traversing the DAGCombiner's worklist to be O(log N) for all operations. This fixes a horrible worst case with lots of nodes where 99% of the time was being spent in std::remove.
llvm-svn: 150669
2012-02-16 09:17:04 +00:00
Pete Cooper 4dd0963d56 Added hook to let targets custom lower splitting of illegal vectors
llvm-svn: 150550
2012-02-15 00:55:31 +00:00
Nadav Rotem 29984ba033 Fix PR12000. Some vector operations may use scalar operands with types
that are greater than the vector element type. For example BUILD_VECTOR
of type <1 x i1> with a constant i8 operand.
This patch fixes the assertion.

llvm-svn: 150477
2012-02-14 13:06:32 +00:00
Lang Hames 29d6ed6416 Rename getExceptionAddressRegister() to getExceptionPointerRegister() for consistency with setExceptionPointerRegister(...).
llvm-svn: 150460
2012-02-14 04:45:49 +00:00
Bill Wendling 05d6f2ff1e Don't reserve the R0 and R1 registers here. We don't use these registers, and
marking them as "live-in" into a BB ruins some invariants that the back-end
tries to maintain.

llvm-svn: 150437
2012-02-13 23:47:16 +00:00
Jakob Stoklund Olesen 2ceea93dd3 Add register mask support to ScheduleDAGRRList.
The scheduler will sometimes check the implicit-def list on instructions
to properly handle pre-colored DAG edges.

Also check any register mask operands for physreg clobbers.

llvm-svn: 150428
2012-02-13 23:25:24 +00:00
Nadav Rotem 0c65064dbe Fix a bug in DAGCombine for the optimization of BUILD_VECTOR. We cant generate a shuffle node from two vectors of different types.
llvm-svn: 150383
2012-02-13 12:42:26 +00:00
Nadav Rotem 34ca89afa8 This patch addresses the problem of poor code generation for the zext
v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes.
The DAGCombiner has two optimizations that can mitigate the problem. First,
if all of the operands of a BUILD_VECTOR node are extracted from an ZEXT/ANYEXT
nodes, then it is possible to create a new simplified BUILD_VECTOR which uses
UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes.
Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle
vector instruction.

In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be
shuffled into a wide YMM register.

This patch modifes the second optimization and allows the creation of
shuffle vectors even when the newly generated vector and the original vector
from which we extract the values are of different types.

llvm-svn: 150340
2012-02-12 15:05:31 +00:00
Benjamin Kramer bf152d57a4 Put instruction names into an indexed string table on the side, removing a pointer from MCInstrDesc.
Make them accessible through MCInstrInfo. They are only used for debugging purposes so this doesn't
have an impact on performance. X86MCTargetDesc.o goes from 630K to 461K on x86_64.

llvm-svn: 150245
2012-02-10 13:18:44 +00:00
Bill Wendling 0aef16afd5 [unwind removal] Remove all of the code for the dead 'unwind' instruction. There
were no 'unwind' instructions being generated before this, so this is in effect
a no-op.

llvm-svn: 149906
2012-02-06 21:44:22 +00:00
Nadav Rotem 4f4546b73a Add additional documentation to the extract-and-trunc dagcombine optimization.
llvm-svn: 149823
2012-02-05 11:39:23 +00:00
Craig Topper ee4dab5f1f Convert assert(0) to llvm_unreachable
llvm-svn: 149816
2012-02-05 08:31:47 +00:00
Chris Lattner cf9e8f6968 reapply the patches reverted in r149470 that reenable ConstantDataArray,
but with a critical fix to the SelectionDAG code that optimizes copies
from strings into immediate stores: the previous code was stopping reading
string data at the first nul.  Address this by adding a new argument to
llvm::getConstantStringInfo, preserving the behavior before the patch.

llvm-svn: 149800
2012-02-05 02:29:43 +00:00
Chad Rosier 6d68c7cf79 [fast-isel] HandlePHINodesInSuccessorBlocks() can promite i8 and i16 types too.
llvm-svn: 149730
2012-02-04 00:39:19 +00:00
Jakob Stoklund Olesen f650732cab Handle all live physreg defs in the same place.
SelectionDAG has 4 different ways of passing physreg defs to users.
Collect all of the uses at the same time, and pass all of them to
MI->setPhysRegsDeadExcept() to mark the remaining defs dead.

The setPhysRegsDeadExcept() function will soon add the required
implicit-defs to instructions with register mask operands.

llvm-svn: 149708
2012-02-03 20:43:35 +00:00
Nadav Rotem 5399f4d6bf The type-legalizer often scalarizes code. One of the common patterns is extract-and-truncate.
In this patch we optimize this pattern and convert the sequence into extract op of a narrow type.
This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases.

llvm-svn: 149692
2012-02-03 13:18:25 +00:00
Andrew Trick 3441597f84 fix cmake
llvm-svn: 149553
2012-02-01 22:28:29 +00:00
Andrew Trick d06df96a7c VLIW specific scheduler framework that utilizes deterministic finite automaton (DFA).
This new scheduler plugs into the existing selection DAG scheduling framework. It is a top-down critical path scheduler that tracks register pressure and uses a DFA for pipeline modeling.

Patch by Sergei Larin!

llvm-svn: 149547
2012-02-01 22:13:57 +00:00
Stepan Dyatkovskiy 513aaa5691 SwitchInst refactoring.
The purpose of refactoring is to hide operand roles from SwitchInst user (programmer). If you want to play with operands directly, probably you will need lower level methods than SwitchInst ones (TerminatorInst or may be User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed semantics of index inside the getCaseValue method:
getCaseValue(0) means "get first case", not a condition. Use getCondition() if you want to resolve the condition. I propose don't mix SwitchInst case indexing with low level indexing (TI successors indexing, User's operands indexing), since it may be dangerous.
2. By the same reason findCaseValue(ConstantInt*) returns actual number of case value. 0 means first case, not default. If there is no case with given value, ErrorIndex will returned.
3. Added getCaseSuccessor method. I propose to avoid usage of TerminatorInst::getSuccessor if you want to resolve case successor BB. Use getCaseSuccessor instead, since internal SwitchInst organization of operands/successors is hidden and may be changed in any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" was created if you need to level down from SwitchInst to TerminatorInst. It returns TerminatorInst's successor index for given case successor.
4.2 "resolveCaseIndex" converts low level successors index to case index that curresponds to the given successor.

Note: There are also related compatability fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
llvm-svn: 149481
2012-02-01 07:49:51 +00:00
Argyrios Kyrtzidis 17c981a45b Revert Chris' commits up to r149348 that started causing VMCoreTests unit test to fail.
These are:

r149348
r149351
r149352
r149354
r149356
r149357
r149361
r149362
r149364
r149365

llvm-svn: 149470
2012-02-01 04:51:17 +00:00
Chris Lattner 997348e9fe remove the last vestiges of llvm::GetConstantStringInfo, in CodeGen.
llvm-svn: 149356
2012-01-31 05:09:17 +00:00
Chris Lattner 983005f51b rework this logic to not depend on the last argument to GetConstantStringInfo,
which is going away.

llvm-svn: 149348
2012-01-31 04:39:22 +00:00
Bill Wendling 8d9d1a0022 Remove the now-dead llvm.eh.exception and llvm.eh.selector intrinsics.
llvm-svn: 149331
2012-01-31 01:58:48 +00:00
Bill Wendling a4237652d2 Remove the eh.exception and eh.selector intrinsics. Also remove a hack to copy
over the catch information. The catch information is now tacked to the invoke
instruction.

llvm-svn: 149326
2012-01-31 01:46:13 +00:00
Eli Friedman 18a4c31525 Use the correct ShiftAmtTy for creating shifts after legalization. PR11881. Not committing a testcase because I think it will be too fragile.
llvm-svn: 149315
2012-01-31 01:08:03 +00:00
Chris Lattner 0256be96f2 continue making the world safe for ConstantDataVector. At this point,
we should (theoretically optimize and codegen ConstantDataVector as well
as ConstantVector.

llvm-svn: 149116
2012-01-27 03:08:05 +00:00
Chris Lattner cf12970bd0 eliminate the Constant::getVectorElements method. There are better (and
more robust) ways to do what it was doing now.  Also, add static methods
for decoding a ShuffleVector mask.

llvm-svn: 149028
2012-01-26 02:51:13 +00:00
Chris Lattner 47a86bdbe2 use ConstantVector::getSplat in a few places.
llvm-svn: 148929
2012-01-25 06:02:56 +00:00
Chris Lattner 9be59599b3 Use the right method to get the # elements in a CDS.
llvm-svn: 148897
2012-01-25 01:27:20 +00:00
Chris Lattner 00245f420a add more support for ConstantDataSequential
llvm-svn: 148802
2012-01-24 13:41:11 +00:00
David Blaikie 46a9f016c5 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Jakob Stoklund Olesen 9349351d72 Add a RegisterMaskSDNode class.
This SelectionDAG node will be attached to call nodes by LowerCall(),
and eventually becomes a MO_RegisterMask MachineOperand on the
MachineInstr representing the call instruction.

LowerCall() will attach a register mask that depends on the calling
convention.

llvm-svn: 148436
2012-01-18 23:52:12 +00:00
Nadav Rotem 3b8f0cc9fa Fix a bug in the type-legalization of vector integers. When we bitcast one vector type to another, we must not bitcast the result if one type is widened while the other is promoted.
llvm-svn: 148383
2012-01-18 08:33:18 +00:00
Pete Cooper c52eeed310 Fix ISD::REG_SEQUENCE to accept physical registers and change TwoAddressInstructionPass to insert copies for any physical reg operands of the REG_SEQUENCE
llvm-svn: 148377
2012-01-18 04:16:16 +00:00
Nadav Rotem fb6ddee0e9 Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT.
llvm-svn: 148337
2012-01-17 21:44:01 +00:00
Craig Topper 02cb0fb136 Teach DAG combiner to turn a BUILD_VECTOR of UNDEFs into an UNDEF of vector type.
llvm-svn: 148297
2012-01-17 09:09:48 +00:00
Pete Cooper e3d305a206 Changed flag operand of ISD::FP_ROUND to TargetConstant as it should not get checked for legalisation
llvm-svn: 148275
2012-01-17 01:54:07 +00:00
David Blaikie 5d8e42755c Refactor variables unused under non-assert builds (& remove two entirely unused variables).
llvm-svn: 148230
2012-01-16 05:17:39 +00:00
Pete Cooper e85b95d754 Changed intrinsic ID operand to a target constant as its not used in any arithmetic so should not be checked in legalisation
llvm-svn: 148228
2012-01-16 04:08:12 +00:00
Nadav Rotem 57935243bd [AVX] Optimize x86 VSELECT instructions using SimplifyDemandedBits.
We know that the blend instructions only use the MSB, so if the mask is
sign-extended then we can convert it into a SHL instruction. This is a
common pattern because the type-legalizer sign-extends the i1 type which
is used by the LLVM-IR for the condition.

Added a new optimization in SimplifyDemandedBits for SIGN_EXTEND_INREG -> SHL.

llvm-svn: 148225
2012-01-15 19:27:55 +00:00
Benjamin Kramer 339ced4e34 Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen.
llvm-svn: 148218
2012-01-15 13:16:05 +00:00
Benjamin Kramer 5a377e28da DAGCombiner: Deduplicate code.
llvm-svn: 148217
2012-01-15 11:50:43 +00:00
Craig Topper 201c1a3505 Truncate of undef is just undef of smaller size.
llvm-svn: 148205
2012-01-15 01:05:11 +00:00
Evan Cheng fa8326334b DAGCombine's logic for forming pre- and post- indexed loads / stores were being
overly conservative. It was concerned about cases where it would prohibit
folding simple [r, c] addressing modes. e.g.
  ldr r0, [r2]
  ldr r1, [r2, #4]
=>
  ldr r0, [r2], #4
  ldr r1, [r2]
Change the logic to look for such cases which allows it to form indexed memory
ops more aggressively.

rdar://10674430

llvm-svn: 148086
2012-01-13 01:37:24 +00:00
Pete Cooper 99415fea87 Added FPOW, FEXP, FLOG to PromoteNode so that custom actions can be set to Promote for those operations.
Sorry, no test case yet

llvm-svn: 148050
2012-01-12 21:46:18 +00:00
Evan Cheng 09cc429cb1 Allow targets to select source order pre-RA scheduler.
llvm-svn: 148033
2012-01-12 18:27:52 +00:00
Nadav Rotem b5ce6ee835 On AVX, we can load v8i32 at a time. The bug happens when two uneven loads are used.
When we load the v12i32 type, the GenWidenVectorLoads method generates two loads: v8i32 and v4i32 
and attempts to use CONCAT_VECTORS to join them. In this fix I concat undef values to widen 
the smaller value. The test "widen_load-2.ll" also exposes this bug on AVX.

llvm-svn: 147964
2012-01-11 20:19:17 +00:00
Chandler Carruth 55b2cdee26 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
Chandler Carruth f3e8502cc1 Add 'llvm_unreachable' to passify GCC's understanding of the constraints
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.

llvm-svn: 147861
2012-01-10 18:08:01 +00:00
David Blaikie edbb58c577 Remove unnecessary default cases in switches that cover all enum values.
llvm-svn: 147855
2012-01-10 16:47:17 +00:00
Nadav Rotem 61bdf79035 Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.
llvm-svn: 147851
2012-01-10 14:28:46 +00:00
Craig Topper 0515cd41e4 Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X)
llvm-svn: 147733
2012-01-07 18:31:09 +00:00
Craig Topper 43a1bd6ac7 Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc.
llvm-svn: 147728
2012-01-07 09:06:39 +00:00
Chad Rosier 73a3fab480 Add comment.
llvm-svn: 147696
2012-01-06 23:45:47 +00:00
Chandler Carruth e041a30bb9 Prevent a DAGCombine from firing where there are two uses of
a combined-away node and the result of the combine isn't substantially
smaller than the input, it's just canonicalized. This is the first part
of a significant (7%) performance gain for Snappy's hot decompression
loop.

llvm-svn: 147604
2012-01-05 11:05:55 +00:00
Craig Topper f726e15f44 Allow vector shuffle normalizing to use concat vector even if the sources are commuted in the shuffle mask.
llvm-svn: 147527
2012-01-04 09:23:09 +00:00
Craig Topper 279c77b677 Implement VECTOR_SHUFFLE canonicalizations during DAG combine.
llvm-svn: 147525
2012-01-04 08:07:43 +00:00
Chris Lattner 6b77a07f75 Turn a few more inline asm errors into "emitErrors" instead of fatal errors.
Before we'd get:

$ clang t.c 
fatal error: error in backend: Invalid operand for inline asm constraint 'i'!

Now we get:

$ clang t.c
t.c:16:5: error: invalid operand for inline asm constraint 'i'!
    "movq         (%4), %%mm0\n"
    ^

Which at least gets us the inline asm that is the problem.

llvm-svn: 147502
2012-01-03 23:51:01 +00:00
Nadav Rotem 1e7dda13c8 Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted.
llvm-svn: 147484
2012-01-03 22:12:28 +00:00
Owen Anderson fcc041eabf Remove the restriction that target intrinsics can only involve legal types. Targets can perfects well support intrinsics on illegal types, as long as they are prepared to perform custom expansion during type legalization. For example, a target where i64 is illegal might still support the i64 intrinsic operation using pairs of i32's. ARM already does some expansions like this for non-intrinsic operations.
llvm-svn: 147472
2012-01-03 20:09:02 +00:00
Elena Demikhovsky 8ec21a2801 Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147445
2012-01-03 11:59:04 +00:00
Rafael Espindola d3df940169 Revert 147399. It broke CodeGen/ARM/vext.ll.
llvm-svn: 147400
2012-01-01 17:36:23 +00:00
Elena Demikhovsky 67f80c3432 Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147399
2012-01-01 16:22:47 +00:00
Nadav Rotem 3c3dd6e588 PR11662.
Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.

llvm-svn: 147309
2011-12-28 13:08:20 +00:00
Eli Friedman e96286cdf2 Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
llvm-svn: 147283
2011-12-26 22:49:32 +00:00
Nadav Rotem c1faeac410 Fix a typo in the widening of vectors in PromoteIntRes. Patch by Shemer Anat.
llvm-svn: 147272
2011-12-25 20:01:38 +00:00
Dylan Noblesmith 9e5b178ecc drop unneeded config.h includes
llvm-svn: 147197
2011-12-22 23:04:07 +00:00
Jakub Staszak 96f8c551e3 Add some constantness to BranchProbabilityInfo and BlockFrequnencyInfo.
llvm-svn: 146986
2011-12-20 20:03:10 +00:00
David Blaikie a379b18173 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Dan Gohman 94580ab375 Add basic generic CodeGen support for half.
llvm-svn: 146927
2011-12-20 00:02:33 +00:00
Joerg Sonnenberger d6cb7649d8 Allow inlining of functions with returns_twice calls, if they have the
attribute themselve.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Devang Patel 7bbc1e56f5 Update DebugLoc while merging nodes at -O0.
Patch by Kyriakos Georgiou!

llvm-svn: 146670
2011-12-15 18:21:18 +00:00
Eli Friedman 2ec824966d Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570.
llvm-svn: 146630
2011-12-15 02:07:20 +00:00
Owen Anderson e7f329fa7a Enable synthesis of FLOG2 and FEXP2 SelectionDAG nodes from libm calls. These are already marked as illegal by default.
llvm-svn: 146623
2011-12-15 00:54:12 +00:00
Eli Friedman 6512cd4366 Add missing cases to SDNode::getOperationName(). Patch by Micah Villmow.
llvm-svn: 146548
2011-12-14 02:28:54 +00:00
Chad Rosier b941674aa4 [fast-isel] Remove SelectInsertValue() as fast-isel wasn't designed to handle
instructions that define aggregate types.

llvm-svn: 146492
2011-12-13 17:45:06 +00:00
Chandler Carruth 637cc6a8aa Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.

Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.

Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.

llvm-svn: 146466
2011-12-13 01:56:10 +00:00
Chad Rosier 2f8347e0b6 [fast-isel] Guard "exhastive" fast-isel output with -fast-isel-verbose2.
llvm-svn: 146453
2011-12-13 00:05:11 +00:00
Daniel Dunbar 27a7489a03 LLVMBuild: Remove trailing newline, which irked me.
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Chad Rosier 3168cabef1 [fast-isel] SelectInsertValue seems to be causing miscompiles for ARM. Disable while I investigate.
llvm-svn: 146331
2011-12-10 21:27:40 +00:00
Chad Rosier f70174b869 Typo.
llvm-svn: 146327
2011-12-10 19:48:51 +00:00
Chad Rosier dd998ff4df [fast-isel] Add support for selecting insertvalue.
rdar://10530851

llvm-svn: 146276
2011-12-09 20:09:54 +00:00
Eli Friedman 053a724483 Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514.
llvm-svn: 146219
2011-12-09 01:16:26 +00:00
Owen Anderson bb15fec2b8 Enhance both TargetLibraryInfo and SelectionDAGBuilder so that the latter can use the former to prevent the formation of libm SDNode's when -fno-builtin is passed.
llvm-svn: 146193
2011-12-08 22:15:21 +00:00
Chad Rosier 0464869922 Add rather verbose stats for fast-isel failures.
llvm-svn: 146186
2011-12-08 21:37:10 +00:00
Owen Anderson 0b9b9da6c8 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Nadav Rotem 26edb291ac Fix a bug in the integer-promotion of bitcast operations on vector types.
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.

llvm-svn: 146150
2011-12-08 13:10:01 +00:00
Eli Friedman d5c173fad0 Make sure we correctly set LiveRegGens when a call is unscheduled. <rdar://problem/10460321>. No testcase because this is very sensitive to scheduling.
llvm-svn: 146087
2011-12-07 22:24:28 +00:00
Eli Friedman 0bdc083e21 Fix an assertion in the scheduler. PR11386. No testcase included because it's rather delicate.
llvm-svn: 146083
2011-12-07 22:06:02 +00:00
Nick Lewycky d63851eb93 These global variables aren't thread-safe, STATISTIC is. Andy Trick tells me
that he isn't using these any more, so just delete them.

llvm-svn: 146076
2011-12-07 21:35:59 +00:00
Evan Cheng 7f8e563a69 Add bundle aware API for querying instruction properties and switch the code
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.

For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.

llvm-svn: 146026
2011-12-07 07:15:52 +00:00
Eli Friedman f9081a8afe Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target.
llvm-svn: 146015
2011-12-07 03:55:52 +00:00
Eli Friedman 0e58cba286 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
llvm-svn: 145996
2011-12-07 00:11:56 +00:00
Evan Cheng 2a81dd4a3c First chunk of MachineInstr bundle support.
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs

llvm-svn: 145975
2011-12-06 22:12:01 +00:00
Nadav Rotem 3924cb0267 Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Chad Rosier 46addb9e07 If fast-isel fails, remove dead instructions generated during the failed
attempt.  

llvm-svn: 145425
2011-11-29 19:40:47 +00:00
Daniel Dunbar 539d0a8a09 build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Eli Friedman e7ab1a2f0f Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Evan Cheng 4a5b2040e2 Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively returns zero when the GV does not specify an alignment nor is it
initialized. Previously it returns ABI alignment for type of the GV. However, if
the type is a "packed" type, then the under-specified alignments is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
Evan Cheng a4b6404cf0 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Chad Rosier 61e8d1026f 80-column.
llvm-svn: 145267
2011-11-28 19:59:09 +00:00
Bill Wendling 5ebc95ff4c Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.
llvm-svn: 145263
2011-11-28 19:23:13 +00:00
Chandler Carruth e2530dc889 Fix an obvious omission in the SelectionDAGBuilder where we were
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060
2011-11-22 11:37:46 +00:00
Chad Rosier f83ab704e4 When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
Eli Friedman ff1eaa7578 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Chad Rosier ff40b1e164 Add fast-isel stats to determine who's doing all the work, the
target-independent selector or the target-specific selector.

llvm-svn: 144833
2011-11-16 21:05:28 +00:00
Chad Rosier cfd0d10e72 Fix the stats collection for fast-isel. The failed count was only accounting
for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., over
states misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832
2011-11-16 21:02:08 +00:00
Eli Friedman 87f92512c3 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Eli Friedman d257a464d1 Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.
llvm-svn: 144767
2011-11-16 02:43:15 +00:00
Owen Anderson ca2f78a95b Rename MVT::untyped to MVT::Untyped to match similar nomenclature.
llvm-svn: 144747
2011-11-16 01:02:57 +00:00
Chad Rosier 291ce47db7 GEPs with all zero indices are trivially coalesced by fast-isel. For example,
%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730
2011-11-15 23:34:05 +00:00
Pete Cooper 7c7ba1baa1 Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Benjamin Kramer 1f97a5a671 Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer 4c93d15f09 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Jay Foad 70679df664 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Eli Friedman 9d448e4a42 Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029.
llvm-svn: 144438
2011-11-12 00:35:34 +00:00
Eli Friedman 1347715644 Some cleanup and bulletproofing for node replacement in LegalizeDAG. To maintain LegalizeDAG invariants, whenever we a node is replaced, we must attempt to delete it, and if it still
has uses after it is replaced (which can happen in rare cases due to CSE), we must revisit it.

llvm-svn: 144432
2011-11-11 23:58:27 +00:00
Evan Cheng d33b2d6b7a Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
instruction lower optimization" in the pre-RA scheduler.

The optimization, rather the hack, was done before MI use-list was available.
Now we should be able to implement it in a better way, perhaps in the
two-address pass until a MI scheduler is available.

Now that the scheduler has to backtrack to handle call sequences. Adding
artificial scheduling constraints is just not safe. Furthermore, the hack
is not taking all the other scheduling decisions into consideration so it's just
as likely to pessimize code. So I view disabling this optimization goodness
regardless of PR11314.

llvm-svn: 144267
2011-11-10 07:43:16 +00:00
Eli Friedman 53218b6fcc Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319.
llvm-svn: 144216
2011-11-09 22:25:12 +00:00
Duncan Sands 635e4efca0 Speculatively revert commit 144124 (djg) in the hope that the 32 bit
dragonegg self-host buildbot will recover (it is complaining about object
files differing between different build stages).  Original commit message:

Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144188
2011-11-09 14:20:48 +00:00
Dan Gohman a4bc6171a5 Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144124
2011-11-08 21:29:06 +00:00
Lang Hames b85fcd07df Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported.
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.

llvm-svn: 144102
2011-11-08 18:56:23 +00:00
Pete Cooper 82cd9e81fc Added invariant field to the DAG.getLoad method and changed all calls.
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses

llvm-svn: 144100
2011-11-08 18:42:53 +00:00
Eli Friedman f2a9bd4b1e Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up.

llvm-svn: 144055
2011-11-08 01:25:24 +00:00
Eli Friedman a35a5295e0 Revert r144034 while I try to track down a crash.
llvm-svn: 144044
2011-11-07 23:53:20 +00:00
Eli Friedman 55a86d32d3 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
llvm-svn: 144034
2011-11-07 22:51:10 +00:00
Richard Osborne 561fac4d4e Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV()
and TargetLowering::BuildUDIV(). Fixes PR11283

llvm-svn: 143964
2011-11-07 17:09:05 +00:00
Dan Gohman 198b7ffc11 Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660
2011-11-03 21:49:52 +00:00
Daniel Dunbar bf9bba47a1 build: Add initial cut at LLVMBuild.txt files.
llvm-svn: 143634
2011-11-03 18:53:17 +00:00
Bill Wendling 645eadac67 An array of chars of length 8 will also cause the stack protector to be inserted
into the function. Reflect that here so that the array will be placed next to
the SP.
<rdar://problem/10128329>

llvm-svn: 143590
2011-11-02 23:20:58 +00:00
Nadav Rotem f310361a7d Cleanup. Document. Make sure that this build_vector optimization only runs before the op legalizer and that the used type is legal.
llvm-svn: 143358
2011-10-31 20:08:25 +00:00
Benjamin Kramer a4eba41b7a Silence compiler warning.
llvm-svn: 143308
2011-10-30 08:39:55 +00:00
Nadav Rotem bf6568b5d6 Add a new DAGCombine optimization for BUILD_VECTOR.
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.

llvm-svn: 143297
2011-10-29 21:23:04 +00:00
Dan Gohman 9b9c970148 Revert r143206, as there are still some failing tests.
llvm-svn: 143262
2011-10-29 00:41:52 +00:00
Dan Gohman 73057ad24f Reapply r143177 and r143179 (reverting r143188), with scheduler
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.

llvm-svn: 143206
2011-10-28 17:55:38 +00:00
Duncan Sands 225a7037d6 Speculatively disable Dan's commits 143177 and 143179 to see if
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.

llvm-svn: 143188
2011-10-28 09:55:57 +00:00
Dan Gohman 0e8d1454b1 Delete #if 0 code accidentally left in.
llvm-svn: 143179
2011-10-28 01:41:21 +00:00
Dan Gohman 4db3f7dd83 Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

llvm-svn: 143177
2011-10-28 01:29:32 +00:00
Eli Friedman e9e356ad6b Don't crash on 128-bit sdiv by constant. Found by inspection.
llvm-svn: 143095
2011-10-27 02:06:39 +00:00
Lang Hames 58dba012b6 Rename NonScalarIntSafe to something more appropriate.
llvm-svn: 143080
2011-10-26 23:50:43 +00:00
Duncan Sands dce448c642 Simplify SplitVecRes_UnaryOp by removing all the code that is
trying to legalize the operand types when only the result type
is required to be legalized - the type legalization machinery
will get round to the operands later if they need legalizing.
There can be a point to legalizing operands in parallel with
the result: when this saves compile time or results in better
code.  There was only one case in which this was true: when
the operand is also split, so keep the logic for that bit.
As a result of this change, additional operand legalization
methods may need to be introduced to handle nodes where the
result and operand types can differ, like SIGN_EXTEND, but
the testsuite doesn't contain any tests where this is the case.
In any case, it seems better to require such methods (and die
with an assert if they doesn't exist) than to quietly produce
wrong code if we forgot to special case the node in
SplitVecRes_UnaryOp.

llvm-svn: 143026
2011-10-26 14:11:18 +00:00
Jakob Stoklund Olesen e8261a22f1 Don't use floating point to do an integer's job.
This code makes different decisions when compiled into x87 instructions
because of different rounding behavior.  That caused phase 2/3
miscompares on 32-bit Linux when the phase 1 compiler was built with gcc
(using x87), and the phase 2 compiler was built with clang (using SSE).

This fixes PR11200.

llvm-svn: 143006
2011-10-26 01:47:48 +00:00
Eli Friedman 3e9ef907e0 Remove a couple redundant checks.
llvm-svn: 142959
2011-10-25 20:34:22 +00:00
Douglas Gregor 0cc574eee7 Really unbreak CMake build
llvm-svn: 142822
2011-10-24 18:10:52 +00:00
Douglas Gregor d0800fda95 Unbreak CMake build
llvm-svn: 142821
2011-10-24 18:09:23 +00:00
Dan Gohman e2ff95e327 Delete the top-down "Latency" scheduler. Top-down scheduling doesn't handle
physreg dependencies, and upcoming codegen changes will require proper
physreg dependence handling.

llvm-svn: 142816
2011-10-24 18:01:06 +00:00
Dan Gohman d78fc160cc Delete the Latency scheduling preference.
llvm-svn: 142815
2011-10-24 17:56:48 +00:00
Dan Gohman 4ed1afa51d Change this overloaded use of Sched::Latency to be an overloaded
use of Sched::ILP instead, as Sched::Latency is going away.

llvm-svn: 142813
2011-10-24 17:55:11 +00:00
Dan Gohman c32af340fc Change the default scheduler from Latency to ILP, since Latency
is going away.

llvm-svn: 142810
2011-10-24 17:45:02 +00:00
Nadav Rotem 5e00bb5feb Fix pr11194. When promoting and splitting integers we need to use
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize.

SetCC return type needs to be legalized via PromoteTargetBoolean.

llvm-svn: 142660
2011-10-21 17:35:19 +00:00
Nadav Rotem d315157f12 1. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC type.
2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1.

llvm-svn: 142648
2011-10-21 11:42:07 +00:00
Chandler Carruth 001153784a Remove a now dead function, fixing -Wunused-function warnings from
Clang.

llvm-svn: 142631
2011-10-21 01:23:41 +00:00
Dan Gohman 90fb55237b Delete the list-tdrr scheduler. Top-down schedulers are going away
because they don't support physical register dependencies.

llvm-svn: 142620
2011-10-20 21:44:34 +00:00
Chad Rosier 4236a63c3c Revert r142579, "Fix a type in the legalization of CONCAT_VECTORS". This is
causing one of the unit tests to infinitely loop, which resulted in the 
buildbots stalling.

llvm-svn: 142604
2011-10-20 19:19:10 +00:00
Nadav Rotem fe3969293d Fix a type in the legalization of CONCAT_VECTORS.
llvm-svn: 142579
2011-10-20 13:38:16 +00:00
Nadav Rotem 8824472a25 Improve code generation for vselect on SSE2:
When checking the availability of instructions using the TLI, a 'promoted'
instruction IS available. It means that the value is bitcasted to another type
for which there is an operation. The correct check for the availablity of an
instruction is to check if it should be expanded.

llvm-svn: 142542
2011-10-19 20:43:16 +00:00
Nadav Rotem 6652e22bad Add support for the vector-widening of vselect and vector-setcc
llvm-svn: 142488
2011-10-19 09:45:11 +00:00
Nadav Rotem 75c2229f41 Fix a bug in the legalization of vector anyext-load and trunc-store. Mem Index starts with zero.
llvm-svn: 142434
2011-10-18 22:32:43 +00:00
Bob Wilson 681561901d Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS.
svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands
with illegal types, even before type legalization.  For this testcase, that led
to one BUILD_VECTOR with i16 operands and another with promoted i32 operands,
which triggered the assertion.

llvm-svn: 142370
2011-10-18 17:34:47 +00:00
Duncan Sands d278d35b13 Fix a bunch of unused variable warnings when doing a release
build with gcc-4.6.

llvm-svn: 142350
2011-10-18 12:44:00 +00:00
Hal Finkel bab66789d5 Fix comment to refer to correct instruction
llvm-svn: 142334
2011-10-18 03:51:57 +00:00
Bill Wendling 63a4ea1859 Correct over-zealous removal of hack.
Some code want to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.

llvm-svn: 142221
2011-10-17 18:43:40 +00:00
Bill Wendling 2a83a71c2a Now that we have the ReturnsTwice function attribute, this method is
obsolete. Check the attribute instead.
<rdar://problem/8031714>

llvm-svn: 142212
2011-10-17 18:22:52 +00:00
Chad Rosier c17257c4cb Removed set, but unused variable.
Patch by Joe Abbey <jabbey@arxan.com>.

llvm-svn: 142206
2011-10-17 18:01:59 +00:00
Nadav Rotem 486ff59a9f Enable element promotion type legalization by deafault.
Changed tests which assumed that vectors are legalized by widening them.

llvm-svn: 142152
2011-10-16 20:31:33 +00:00
Benjamin Kramer cc863b2bb6 Let printf do the formatting instead aligning strings ourselves.
While at it, merge some format strings.

llvm-svn: 142140
2011-10-16 16:30:34 +00:00
Nadav Rotem ebe13bc3f1 Move the legalization of vector loads and stores into LegalizeVectorOps. In some
cases we need the second type-legalization pass in order to support all cases.

llvm-svn: 142060
2011-10-15 07:41:10 +00:00
Bill Wendling 2730a0099a Clear out the landing pad to call site map for each function.
This isn't put into the 'clear()' method because the information needs to stick
around (at least for a little bit) after the selection DAG is built.

llvm-svn: 142032
2011-10-15 01:00:26 +00:00
Jim Grosbach 400907cc41 Fix typo. "__sync_fetch_and-xor_4" should be "__sync_fetch_and_xor_4".
Pointed out by George Russell.

llvm-svn: 141956
2011-10-14 15:53:48 +00:00
Jakob Stoklund Olesen 24abd9d9b6 Encode register class constreaints in inline asm instructions.
The inline asm operand constraint is initially encoded in the virtual
register for the operand, but that register class may change during
coalescing, and the original constraint is lost.

Encode the original register class as part of the flag word for each
inline asm operand.  This makes it possible to recover the actual
constraint required by inline asm, just like we can for normal
instructions.

llvm-svn: 141833
2011-10-12 23:37:29 +00:00
Eli Friedman 979009ea61 Use a utility from MathExtras to clarify a check and avoid undefined behavior. Based on patch by Ahmed Charles.
llvm-svn: 141829
2011-10-12 22:46:45 +00:00
Dan Gohman de239d2647 Fix a thinko that Nick noticed. The previous code actually worked as
intended, but only by accident.

llvm-svn: 141779
2011-10-12 15:56:56 +00:00
Jakob Stoklund Olesen 35163e21dc Use an existing function.
llvm-svn: 141763
2011-10-12 01:24:51 +00:00
Eric Christopher 57d1692750 Formatting.
llvm-svn: 141728
2011-10-11 22:59:04 +00:00
Nadav Rotem 3283793c9a Add support for legalization of vector SHL/SRA/SRL instructions
llvm-svn: 141667
2011-10-11 14:36:35 +00:00
Nadav Rotem 198fe81571 Add support for legalization of vector trunc-store where the saved scalar type is illegal (for example, v2i16 on systems where the smallest store size is i32)
llvm-svn: 141661
2011-10-11 11:25:16 +00:00
Nadav Rotem b521b6037b Cleanup the trunc-store legalization code and add asserts.
llvm-svn: 141659
2011-10-11 10:04:25 +00:00
Bill Wendling 7ecfbd90ef Thread the chain through the eh.sjlj.setjmp intrinsic, like it's documented to
do. This will be useful later on with the new SJLJ stuff.

llvm-svn: 141416
2011-10-07 21:25:38 +00:00
Eli Friedman 1456cd20b4 Remove the old atomic instrinsics. autoupgrade functionality is included with this patch.
llvm-svn: 141333
2011-10-06 23:20:49 +00:00
Bill Wendling 267f323d28 Modify the mapping from landing pad to call sites to accept more than one call
site.

llvm-svn: 141226
2011-10-05 22:24:35 +00:00
Bill Wendling e61c62533e Small refactoring. Cache the FunctionInfo->MBB into a local variable.
llvm-svn: 141221
2011-10-05 22:16:11 +00:00
Jakob Stoklund Olesen f7957a9819 Simplify EXTRACT_SUBREG emission.
EXTRACT_SUBREG is emitted as %dst = COPY %src:sub, so there is no need to
constrain the %dst register class.  RegisterCoalescer will apply the
necessary constraints if it decides to eliminate the COPY.

The %src register class does need to be constrained to something with
the right sub-registers, though.  This is currently done manually with
COPY_TO_REGCLASS nodes.  They can possibly be removed after this patch.

llvm-svn: 141207
2011-10-05 20:26:40 +00:00
Jakob Stoklund Olesen 8ff52c4135 Simplify INSERT_SUBREG emission.
The register class created by INSERT_SUBREG and SUBREG_TO_REG must be
legal and support the SubIdx sub-registers.

The new getSubClassWithSubReg() hook can compute that.

This may create INSERT_SUBREG instructions defining a larger register
class than the sub-register being inserted.  That is OK,
RegisterCoalescer will constrain the register class as needed when it
eliminates the INSERT_SUBREG instructions.

llvm-svn: 141198
2011-10-05 18:31:00 +00:00
Bill Wendling 3d11aa7e75 Create a mapping between the landing pad basic block and the call site index for later use.
llvm-svn: 141125
2011-10-04 22:00:35 +00:00
Nadav Rotem 52e8ed9214 Moved type construction out of the loop and added an assert on the legality of the type. Formatted lines to the 80 char limit.
llvm-svn: 140952
2011-10-01 18:39:28 +00:00
Bill Wendling 9925f197cc When inferring the pointer alignment, if the global doesn't have an initializer
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.

For instance, in file A.c:

   struct S s;

In file B.c:
   struct {
     // something long
   };
   extern S s;

   void foo() {
     struct S p = s;
     // ...
   }

this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.

llvm-svn: 140902
2011-09-30 23:19:55 +00:00
Nick Lewycky f40df1d46c Promote comment to doxycomment. Adjust whitespace. No functionality change.
llvm-svn: 140899
2011-09-30 22:19:53 +00:00
Jakob Stoklund Olesen 1352be2bd3 Move getCommonSubClass() into TRI.
It will soon need the context.

llvm-svn: 140896
2011-09-30 22:18:51 +00:00
Eli Friedman 95031ed837 Clean up uses of switch instructions so they are not dependent on the operand ordering. Patch by Stepan Dyatkovskiy.
llvm-svn: 140803
2011-09-29 20:21:17 +00:00
Eric Christopher d299dccf91 Use the local we already set up.
llvm-svn: 140745
2011-09-29 00:50:59 +00:00
Bill Wendling baf3941fde Strip off pointer casts when looking at the eh.sjlj.functioncontext's argument.
llvm-svn: 140678
2011-09-28 03:52:41 +00:00
Bill Wendling 66b110f571 Create and use an llvm.eh.sjlj.functioncontext intrinsic.
This intrinsic is used to pass the index of the function context to the back-end
for further processing. The back-end is in charge of filling in the rest of the
entries.

llvm-svn: 140676
2011-09-28 03:36:43 +00:00
Jim Grosbach af136f71ec Rename AddSelectionDAGCSEId() to addSelectionDAGCSEId().
Naming conventions consistency. No functional change.

llvm-svn: 140636
2011-09-27 20:59:33 +00:00
Nadav Rotem 38b3b83362 Cleanup PromoteIntOp_EXTRACT_VECTOR_ELT and PromoteIntRes_SETCC.
Add a new method: getAnyExtOrTrunc and use it to replace the manual check.

llvm-svn: 140603
2011-09-27 11:16:47 +00:00
Nadav Rotem 1b857d2762 Revert r140463; The patch assumes that <4 x i1> is saved to memory as 4 x i8,
while the decision is to bit-pack small values.

llvm-svn: 140601
2011-09-27 10:48:29 +00:00
Nadav Rotem 2279949129 [vector-select] Address one of the issues in pr10902. EXTRACT_VECTOR_ELEMENT
SDNodes may return values which are wider than the incoming element types. In
this patch we fix the integer promotion of these nodes.

Fixes spill-q.ll when running -promote-elements.

llvm-svn: 140471
2011-09-25 18:59:42 +00:00
Nadav Rotem c2deabd202 Implement Duncan's suggestion to use the result of getSetCCResultType if it is legal
(this is always the case for scalars), otherwise use the promoted result type.

Fix test/CodeGen/X86/vsplit-and.ll when promote-elements is enabled.

llvm-svn: 140464
2011-09-24 19:48:19 +00:00
Nadav Rotem 77426a754b [Vector-Select] Address one of the problems in 10902.
When generating the trunc-store of i1's, we need to use the vector type and not
the scalar type.

This patch fixes the assertion in CodeGen/Generic/bool-vector.ll when
running with -promote-elements.

llvm-svn: 140463
2011-09-24 18:32:19 +00:00
Duncan Sands b461176cfb Tweak the handling of MERGE_VALUES nodes: remove the need for
DecomposeMERGE_VALUES to "know" that results are legalized in
a particular order, by passing it the number of the result
being legalized (the type legalization core provides this, it
just needs to be passed on).

llvm-svn: 140373
2011-09-23 13:59:22 +00:00
Nadav Rotem 57e30726ad Vector-Select: Address one of the problems in pr10902. Add handling for the
integer-promotion of CONCAT_VECTORS.

Test: test/CodeGen/X86/widen_shuffle-1.ll

This patch fixes the above tests (when running in with -promote-elements).

llvm-svn: 140372
2011-09-23 09:33:24 +00:00
Dan Gohman e83e1b2d2c Fix SimplifySelectCC to add newly created nodes to the DAGCombiner
worklist, as it may be possible to perform further optimization on them.

llvm-svn: 140349
2011-09-22 23:01:29 +00:00
Jakob Stoklund Olesen e92e5ee81f Constrain register classes instead of emitting copies.
Sometimes register class constraints are trivial, like GR32->GR32_NOSP,
or GPR->rGPR.  Teach InstrEmitter to simply constrain the virtual
register instead of emitting a copy in these cases.

Normally, these copies are handled by the coalescer.  This saves some
coalescer work.

llvm-svn: 140340
2011-09-22 21:39:34 +00:00
Nadav Rotem bc9ba30158 [VECTOR-SELECT] Address one of the bugs in pr10902.
Vector SetCC result types need to be type-legalized.
This code worked before because scalar result types are known to be legal.

llvm-svn: 140249
2011-09-21 14:34:38 +00:00
Andrew Trick 924123acb3 Lower ARM adds/subs to add/sub after adding optional CPSR operand.
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.

llvm-svn: 140228
2011-09-21 02:20:46 +00:00
Bruno Cardoso Lopes 6cb23f6e7f Add a DAGCombine for subvector extracts to remove useless chains of
subvector inserts and extracts. Initial patch by Rackover, Zvi with
some tweak done by me.

llvm-svn: 140204
2011-09-20 23:19:33 +00:00
Andrew Trick 52363bdbeb Restore hasPostISelHook tblgen flag.
No functionality change. The hook makes it explicit which patterns
require "special" handling. i.e. it self-documents tblgen
deficiencies. I plan to add verification in ExpandISelPseudos and
Thumb2SizeReduce to catch any missing hasPostISelHooks. Otherwise it's
too fragile.

llvm-svn: 140160
2011-09-20 18:22:31 +00:00
Andrew Trick 8586e62d91 ARM isel bug fix for adds/subs operands.
Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the
full gamut of CPSR defs/uses including instructins whose "optional"
cc_out operand is not really optional. This allowed removal of the
hasPostISelHook to simplify the .td files and make the implementation
more robust.
Fixes rdar://10137436: sqlite3 miscompile

llvm-svn: 140134
2011-09-20 03:17:40 +00:00
Andrew Trick 53df4b6dfa whitespace
llvm-svn: 140133
2011-09-20 03:06:13 +00:00
Nadav Rotem 7aaa0aa7a7 white space cleanups
llvm-svn: 139994
2011-09-18 10:29:29 +00:00
Eli Friedman ee8f14a799 Some legalization fixes for atomic load and store.
llvm-svn: 139851
2011-09-15 21:20:49 +00:00
Nadav Rotem d748dbacb0 Add integer promotion support for vselect
llvm-svn: 139692
2011-09-14 14:42:15 +00:00
Eli Friedman f78c6a83ee Fix check for unaligned load/store so it doesn't catch over-aligned load/store.
llvm-svn: 139649
2011-09-13 22:19:59 +00:00
Eli Friedman f1518216fd Error out on CodeGen of unaligned load/store. Fix test so it isn't accidentally testing that case.
llvm-svn: 139641
2011-09-13 20:50:54 +00:00
Nadav Rotem 66dc9ae08d Fix the assertion which checks the size of the input operand.
llvm-svn: 139633
2011-09-13 20:03:38 +00:00
Nadav Rotem 52202fbf2d Add vselect target support for targets that do not support blend but do support
xor/and/or (For example SSE2).

llvm-svn: 139623
2011-09-13 19:17:42 +00:00
Chris Lattner e74e0c8020 tidy up a bit
llvm-svn: 139419
2011-09-09 22:06:59 +00:00
Eli Friedman b7910b79f5 Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897.
llvm-svn: 139407
2011-09-09 21:04:06 +00:00
Devang Patel 9d904e1a97 Directly point debug info to the stack slot of the arugment, instead of trying to keep track of vreg in which it the arugment is copied. The LiveDebugVariable can keep track of variable's ranges.
llvm-svn: 139330
2011-09-08 22:59:09 +00:00
Eli Friedman e978d2f644 Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures for atomic laod/store on ARM.
(The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.)

llvm-svn: 139221
2011-09-07 02:23:42 +00:00
Duncan Sands f2641e1bc1 Add codegen support for vector select (in the IR this means a select
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159
2011-09-06 19:07:46 +00:00
Duncan Sands a098436b32 Split the init.trampoline intrinsic, which currently combines GCC's
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC.  While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function.  To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function.  Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!).  Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC.  Patch mostly by Sanjoy Das.

llvm-svn: 139140
2011-09-06 13:37:06 +00:00
Owen Anderson 40d756eacc Fix a truly heinous bug in DAGCombine related to AssertZext.
If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert.  The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users.  No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem.
Fixes <rdar://problem/10063365>.

llvm-svn: 139059
2011-09-03 00:26:49 +00:00
Dan Gohman 3767be9aee Revert r131152, r129796, r129761. This code is currently considered
to be unreliable on platforms which require memcpy calls, and it is
complicating broader legalize cleanups. It is hoped that these cleanups
will make memcpy byval easier to implement in the future.

llvm-svn: 138977
2011-09-01 23:07:08 +00:00
Andrew Trick 832a6a1909 PreRA scheduler should avoid cloning compares.
Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarilly cloned.
Fixes rdar://problem/5875261

llvm-svn: 138924
2011-09-01 00:54:31 +00:00
Eli Friedman ae1acddb95 Misc cleanup; addresses Duncan's comments on r138877.
llvm-svn: 138887
2011-08-31 20:13:26 +00:00
Eli Friedman e839ecb70b Fill in type legalization for MERGE_VALUES in all the various cases. Patch by Micah Villmow. (No testcase because the issue only showed up in an out-of-tree backend.)
llvm-svn: 138877
2011-08-31 18:36:04 +00:00
Eli Friedman 7c3bdede25 Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM.
llvm-svn: 138872
2011-08-31 18:26:09 +00:00
Evan Cheng e6fba77971 Follow up to r138791.
Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to
call a target hook to adjust the instruction. For ARM, this is used to
adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC
instructions have implicit def of CPSR (required since it now uses CPSR physical
register dependency rather than "glue"). If the carry flag is used, then the
target hook will *fill in* the optional operand with CPSR. Otherwise, the hook
will remove the CPSR implicit def from the MachineInstr.

llvm-svn: 138810
2011-08-30 19:09:48 +00:00
Eli Friedman 452aae6202 Atomic load/store on ARM/Thumb.
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.

I plan on making other targets use the same legalization
ARM-without-memory-barriers is using... it's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.

llvm-svn: 138621
2011-08-26 02:59:24 +00:00
Eli Friedman 342e8df0e0 Basic x86 code generation for atomic load and store instructions.
llvm-svn: 138478
2011-08-24 20:50:09 +00:00
Bill Wendling 4eb0433672 A landingpad instruction is neither folded nor dead.
llvm-svn: 138387
2011-08-23 21:33:05 +00:00
Evan Cheng 6b477b985b Fix 80 col violations.
llvm-svn: 138356
2011-08-23 19:17:21 +00:00
Nick Lewycky 97f73cb449 Be less redundant.
llvm-svn: 138252
2011-08-22 18:26:12 +00:00
Benjamin Kramer 68ed46ce9a Roll back the rest of r126557. It's a hack that will break in some obscure cases.
llvm-svn: 138130
2011-08-19 22:39:31 +00:00
Nick Lewycky c1348074ec Eli points out that this is what report_fatal_error() is for.
llvm-svn: 138091
2011-08-19 21:45:19 +00:00
Nick Lewycky 3f73184d90 This is not actually unreachable, so don't use llvm_unreachable for it. Since
the intent seems to be to terminate even in Release builds, just use abort()
directly.

If program flow ever reaches a __builtin_unreachable (which llvm_unreachable is
#define'd to on newer GCCs) then the program is undefined.

llvm-svn: 138068
2011-08-19 20:14:27 +00:00
Ivan Krasin d7cbd4c518 FastISel: avoid function calls between the materialization of the constant and its use.
llvm-svn: 137993
2011-08-18 22:06:10 +00:00
Bill Wendling 247fd3bf59 Add the support in code-gen for the landingpad instruction lowering.
The landingpad instruction is lowered into the EXCEPTIONADDR and EHSELECTION
SDNodes. The information from the landingpad instruction is harvested by the
'AddLandingPadInfo' function. The new EH uses the current EH scheme in the
back-end. This will change once we switch over to the new scheme. (Reviewed by
Jakob!)

llvm-svn: 137880
2011-08-17 21:56:44 +00:00
Bill Wendling a408e5bf31 Revert patch. Forgot a dependent commit.
llvm-svn: 137875
2011-08-17 21:28:05 +00:00
Bill Wendling 2a521948f0 Add the body of 'visitLandingPad'.
This generates the SDNodes for the new exception handling scheme. It takes the
two values coming from the landingpad instruction and assigns them to the
EXCEPTIONADDR and EHSELECTION nodes.

llvm-svn: 137873
2011-08-17 21:25:14 +00:00
Nadav Rotem b66b866f46 Revert r137562 because it caused PR10674
llvm-svn: 137719
2011-08-16 14:34:29 +00:00
Nadav Rotem 6858b344ed Fix PR 10635. When generating integer constants, the constant element type may
be illegal, even if the requested vector type is legal. Testcase is one of the
disabled ARM tests in the vector-select patch.

llvm-svn: 137562
2011-08-13 20:31:45 +00:00
Bill Wendling fae1475823 Initial commit of the 'landingpad' instruction.
This implements the 'landingpad' instruction. It's used to indicate that a basic
block is a landing pad. There are several restrictions on its use (see
LangRef.html for more detail). These restrictions allow the exception handling
code to gather the information it needs in a much more sane way.

This patch has the definition, implementation, C interface, parsing, and bitcode
support in it.

llvm-svn: 137501
2011-08-12 20:24:12 +00:00
Nadav Rotem 62da15a330 Revert r137310 because it does not optimize any code on ToT
llvm-svn: 137466
2011-08-12 17:15:04 +00:00
Duncan Sands a41634e307 Silence a bunch (but not all) "variable written but not read" warnings
when building with assertions disabled.

llvm-svn: 137460
2011-08-12 14:54:45 +00:00
Nadav Rotem 61140e1028 [AVX] When joining two XMM registers into a YMM register, make sure that the
lower XMM register gets in first. This will allow the SUBREG pattern to
elliminate the first vector insertion. 

llvm-svn: 137310
2011-08-11 16:49:36 +00:00
Chris Lattner 96710b4308 fix PR10605 / rdar://9930964 by adding a pretty scary missed check.
It's somewhat surprising anything works without this.  Before we would
compile the testcase into:

test:                                   # @test
	movl	$4, 8(%rdi)
	movl	8(%rdi), %eax
	orl	%esi, %eax
	cmpl	$32, %edx
	movl	%eax, -4(%rsp)          # 4-byte Spill
	je	.LBB0_2

now we produce:

test:                                   # @test
	movl	8(%rdi), %eax
	movl	$4, 8(%rdi)
	orl	%esi, %eax
	cmpl	$32, %edx
	movl	%eax, -4(%rsp)          # 4-byte Spill
	je	.LBB0_2
llvm-svn: 137303
2011-08-11 06:26:54 +00:00
Devang Patel aab841cf63 Do not drop undef debug values. These are used as range termination marker by live debug variable pass.
llvm-svn: 136834
2011-08-03 23:13:55 +00:00
Eli Friedman 30a49e93e3 New approach to r136737: insert the necessary fences for atomic ops in platform-independent code, since a bunch of platforms (ARM, Mips, PPC, Alpha are the relevant targets here) need to do essentially the same thing.
I think this completes the basic CodeGen for atomicrmw and cmpxchg.

llvm-svn: 136813
2011-08-03 21:06:02 +00:00
Eli Friedman 04c5025cd5 Don't create a ridiculous EXTRACT_ELEMENT. PR10563.
The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this.

llvm-svn: 136711
2011-08-02 18:38:35 +00:00
Bill Wendling f891bf8b30 Add the 'resume' instruction for the new EH rewrite.
This adds the 'resume' instruction class, IR parsing, and bitcode reading and
writing. The 'resume' instruction resumes propagation of an existing (in-flight)
exception whose unwinding was interrupted with a 'landingpad' instruction (to be
added later).

llvm-svn: 136589
2011-07-31 06:30:59 +00:00
Bill Wendling ad088e6724 Revert r136253, r136263, r136269, r136313, r136325, r136326, r136329, r136338,
r136339, r136341, r136369, r136387, r136392, r136396, r136429, r136430, r136444,
r136445, r136446, r136253 pending review.

llvm-svn: 136556
2011-07-30 05:42:50 +00:00
Jakub Staszak 0480a8fbbb Do not lose branch weights when lowering SwitchInst.
llvm-svn: 136529
2011-07-29 22:25:21 +00:00
Jakub Staszak 539db98987 Remove unneeded const_cast.
llvm-svn: 136506
2011-07-29 20:05:36 +00:00
Eli Friedman adec587d5c Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be
working on x86 (at least for trivial testcases); other architectures will
need more work so that they actually emit the appropriate instructions for
orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC,
Mips, and Alpha backends need such changes.)

llvm-svn: 136457
2011-07-29 03:05:32 +00:00
Bill Wendling 7eadbeaf62 Use the pointer type size.
With this, we can now compile a simple EH program.

llvm-svn: 136446
2011-07-29 01:15:29 +00:00
Bill Wendling 6a8cac735a And now something that compiles...
llvm-svn: 136445
2011-07-29 01:11:33 +00:00
Bill Wendling 4b0a365beb Make sure to sext or trunc the result from the register.
llvm-svn: 136444
2011-07-29 01:11:14 +00:00
Chandler Carruth 9d7feab3e0 Rewrite the CMake build to use explicit dependencies between libraries,
specified in the same file that the library itself is created. This is
more idiomatic for CMake builds, and also allows us to correctly specify
dependencies that are missed due to bugs in the GenLibDeps perl script,
or change from compiler to compiler. On Linux, this returns CMake to
a place where it can relably rebuild several targets of LLVM.

I have tried not to change the dependencies from the ones in the current
auto-generated file. The only places I've really diverged are in places
where I was seeing link failures, and added a dependency. The goal of
this patch is not to start changing the dependencies, merely to move
them into the correct location, and an explicit form that we can control
and change when necessary.

This also removes a serialization point in the build because we don't
have to scan all the libraries before we begin building various tools.
We no longer have a step of the build that regenerates a file inside the
source tree. A few other associated cleanups fall out of this.

This isn't really finished yet though. After talking to dgregor he urged
switching to a single CMake macro to construct libraries with both
sources and dependencies in the arguments. Migrating from the two macros
to that style will be a follow-up patch.

Also, llvm-config is still generated with GenLibDeps.pl, which means it
still has slightly buggy dependencies. The internal CMake
'llvm-config-like' macro uses the correct explicitly specified
dependencies however. A future patch will switch llvm-config generation
(when using CMake) to be based on these deps as well.

This may well break Windows. I'm getting a machine set up now to dig
into any failures there. If anyone can chime in with problems they see
or ideas of how to solve them for Windows, much appreciated.

llvm-svn: 136433
2011-07-29 00:14:25 +00:00
Bill Wendling 3cc87682e1 Visit the landingpad instruction.
This generates the correct SDNodes for the landingpad instruction. It makes an
assumption that the result of the landingpad instruction has at least two
values. And that the first value is a pointer to the exception object and the
second value is the "selector."

llvm-svn: 136430
2011-07-28 23:44:58 +00:00
Bill Wendling 7fa7fe6b58 Add the AddLandingPadInfo function.
AddLandingPadInfo takes a landingpad instruction and grabs all of the
information from it that it needs for EH table generation.

llvm-svn: 136429
2011-07-28 23:42:57 +00:00
Eli Friedman c9a551ebed LangRef and basic memory-representation/reading/writing for 'cmpxchg' and
'atomicrmw' instructions, which allow representing all the current atomic
rmw intrinsics.

The allowed operands for these instructions are heavily restricted at the
moment; we can probably loosen it a bit, but supporting general
first-class types (where it makes sense) might get a bit complicated,
given how SelectionDAG works.

As an initial cut, these operations do not support specifying an alignment,
but it would be possible to add if we think it's useful. Specifying an
alignment lower than the natural alignment would be essentially
impossible to support on anything other than x86, but specifying a greater
alignment would be possible.  I can't think of any useful optimizations which
would use that information, but maybe someone else has ideas.

Optimizer/codegen support coming soon.

llvm-svn: 136404
2011-07-28 21:48:00 +00:00
Bill Wendling 4f027233d2 The personality function should be a Function* and not just a Value*.
llvm-svn: 136392
2011-07-28 21:14:13 +00:00
Nadav Rotem 9708aef2dc CR fix: The ANY_EXTEND can be removed because the input and putput type must be
identical.

llvm-svn: 136355
2011-07-28 14:38:46 +00:00
Eli Friedman 26a484852e Code generation for 'fence' instruction.
llvm-svn: 136283
2011-07-27 22:21:52 +00:00
Bill Wendling 6c923bb8d9 Merge the contents from exception-handling-rewrite to the mainline.
This adds the new instructions 'landingpad' and 'resume'.

llvm-svn: 136253
2011-07-27 20:18:04 +00:00
Jeffrey Yasskin 6381c0100b Explicitly cast narrowing conversions inside {}s that will become errors in
C++0x.

llvm-svn: 136211
2011-07-27 06:22:51 +00:00
Dan Gohman 456b1edd0d Revert r136156, which broke several buildbots.
llvm-svn: 136206
2011-07-27 01:10:27 +00:00
Dan Gohman 9eb62cd159 Delete unnecessarily cautious LastCALLSEQ code.
llvm-svn: 136156
2011-07-26 22:00:59 +00:00
Eli Friedman 06b8b571b2 Add obvious missing case to switch. PR10497.
llvm-svn: 136130
2011-07-26 20:38:49 +00:00
Eli Friedman fee02c6c13 Initial implementation of 'fence' instruction, the new C++0x-style replacement for llvm.memory.barrier.
This is just a LangRef entry and reading/writing/memory representation; optimizer+codegen support coming soon.

llvm-svn: 136009
2011-07-25 23:16:38 +00:00
Eli Friedman cbd3ba91b7 Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476.
llvm-svn: 135993
2011-07-25 22:25:42 +00:00
Eli Friedman 6ed783228d PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS.
llvm-svn: 135595
2011-07-20 18:14:33 +00:00
Devang Patel 9ab3cac694 Revert r135423.
llvm-svn: 135454
2011-07-19 00:28:24 +00:00
Jeffrey Yasskin 7a16288157 Add APInt(numBits, ArrayRef<uint64_t> bigVal) constructor to prevent future ambiguity
errors like the one corrected by r135261.  Migrate all LLVM callers of the old
constructor to the new one.

llvm-svn: 135431
2011-07-18 21:45:40 +00:00
Devang Patel 4dc76f2438 During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
[take 2]

llvm-svn: 135423
2011-07-18 20:55:23 +00:00
Chris Lattner 229907cd11 land David Blaikie's patch to de-constify Type, with a few tweaks.
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Nadav Rotem 76d51c6c89 Minor code cleanups
llvm-svn: 135362
2011-07-17 19:05:00 +00:00
Dan Gohman 945864d6dc LegalizeDAG doesn't need its own copy of this enum.
llvm-svn: 135320
2011-07-15 22:51:43 +00:00
Dan Gohman e49e74261a Delete LegalizeDAG's own version of isTypeLegal and getTypeAction
and just use the ones from TargetLowering directly.

llvm-svn: 135318
2011-07-15 22:39:09 +00:00
Dan Gohman 8c5ca645ce Delete an unused variable and a redundant assert.
llvm-svn: 135311
2011-07-15 22:19:02 +00:00
Dan Gohman ad94608b1f Modernize comments.
llvm-svn: 135305
2011-07-15 21:42:20 +00:00
Eric Christopher 92464be28c Check register class matching instead of width of type matching
when determining validity of matching constraint. Allow i1
types access to the GR8 reg class for x86.

Fixes PR10352 and rdar://9777108

llvm-svn: 135180
2011-07-14 20:13:52 +00:00
Nadav Rotem 771f29677f [VECTOR-SELECT]
During type legalization we often use the SIGN_EXTEND_INREG SDNode.
When this SDNode is legalized during the LegalizeVector phase, it is
scalarized because non-simple types are automatically marked to be expanded.
In this patch we add support for lowering SIGN_EXTEND_INREG manually.
This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements'
flag.

llvm-svn: 135144
2011-07-14 11:11:14 +00:00
Nadav Rotem db213c0400 Add assertion for the chain value type
llvm-svn: 135143
2011-07-14 10:37:54 +00:00
Benjamin Kramer 15cd5a3f12 Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient.
llvm-svn: 135126
2011-07-14 01:38:42 +00:00
Eric Christopher d6300d2956 Add a dag combine pattern for folding C2-(A+C1) -> (C2-C1)-A
Fixes rdar://9761830

llvm-svn: 135123
2011-07-14 01:12:15 +00:00
Jay Foad 57aa636794 Convert InsertValueInst and ExtractValueInst APIs to use ArrayRef.
llvm-svn: 135040
2011-07-13 10:26:04 +00:00
Cameron Zwarich f03fa189ca Add an intrinsic and codegen support for fused multiply-accumulate. The intent
is to use this for architectures that have a native FMA instruction.

llvm-svn: 134742
2011-07-08 21:39:21 +00:00
Benjamin Kramer 2bb8b26aa8 Apparently we can't expect a BinaryOperator here.
Should fix llvm-gcc selfhost.

llvm-svn: 134699
2011-07-08 12:08:24 +00:00
Benjamin Kramer 9960a25006 Emit a more efficient magic number multiplication for exact sdivs.
We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building.

  struct foo { char x[24]; };
  long bar(struct foo *a, struct foo *b) { return a-b; }
is now compiled into
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  sarl	$3, %eax
  imull	$-1431655765, %eax, %eax
instead of
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  movl	$715827883, %ecx
  imull	%ecx
  movl	%edx, %eax
  shrl	$31, %eax
  sarl	$2, %edx
  addl	%eax, %edx
  movl	%edx, %eax

llvm-svn: 134695
2011-07-08 10:31:30 +00:00
Eric Christopher 6a6d8fc7fd Remove a FIXME. All of the standard ones are in the list.
llvm-svn: 134647
2011-07-07 22:29:03 +00:00
Lang Hames 5a00499e87 Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. The
hasPredecessorHelper function allows predecessors to be cached to speed up
repeated invocations. This fixes PR10186.

X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X)

Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with
empty Visited and Worklist sets (i.e. no caching over invocations).

Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited
and Worklist to speed up repeated calls. The Visited set is searched for X
before going to the worklist to further search the DAG if necessary.

llvm-svn: 134592
2011-07-07 04:31:51 +00:00
Eric Christopher ea336c797c Grammar and 80-col.
llvm-svn: 134555
2011-07-06 22:41:18 +00:00
Jakub Staszak 3f158fdf6e Introduce "expect" intrinsic instructions.
llvm-svn: 134516
2011-07-06 18:22:43 +00:00
Evan Cheng 0d639a28aa Rename TargetSubtarget to TargetSubtargetInfo for consistency.
llvm-svn: 134259
2011-07-01 21:01:15 +00:00
Eric Christopher f81292ba3b Remove getRegClassForInlineAsmConstraint and all dependencies.
Fixes rdar://9643582

llvm-svn: 134123
2011-06-30 01:20:03 +00:00
Devang Patel 0eada03216 Revert r133953 for now.
llvm-svn: 134116
2011-06-29 23:50:13 +00:00
Benjamin Kramer 8665f8d916 Revert a part of r126557 which could create unschedulable DAGs.
llvm-svn: 134067
2011-06-29 13:47:25 +00:00
Evan Cheng 8264e272a9 Sink SubtargetFeature and TargetInstrItineraries (renamed MCInstrItineraries) into MC.
llvm-svn: 134049
2011-06-29 01:14:12 +00:00
Evan Cheng 6cc775f905 - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.

llvm-svn: 134021
2011-06-28 19:10:37 +00:00
Devang Patel 4dc034df1d During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
llvm-svn: 133953
2011-06-27 22:32:04 +00:00
Evan Cheng 8d71a75777 More refactoring. Move getRegClass from TargetOperandInfo to TargetInstrInfo.
llvm-svn: 133944
2011-06-27 21:26:13 +00:00
Owen Anderson b0a5a1ee29 The index stored in the RegDefIter is one after the current index. When getting the index, decrement it so that it points to the current element. Fixes an off-by-one bug encountered when trying to make use of MVT::untyped.
llvm-svn: 133923
2011-06-27 18:34:12 +00:00
Andrew Trick 31f25bc66f pre-RA-sched: Cleanup register pressure tracking.
Removed the check that peeks past EXTRA_SUBREG, which I don't think
makes sense any more. Intead treat it as a normal register def. No
significant affect on x86 or ARM benchmarks.

llvm-svn: 133917
2011-06-27 18:01:20 +00:00
Jakob Stoklund Olesen 537a302d1a Distinguish early clobber output operands from clobbered registers.
Both become <earlyclobber> defs on the INLINEASM MachineInstr, but we
now use two different asm operand kinds.

The new Kind_Clobber is treated identically to the old
Kind_RegDefEarlyClobber for now, but x87 floating point stack inline
assembly does care about the difference.

This will pop a register off the stack:

  asm("fstp %st" : : "t"(x) : "st");

While this will pop the input and push an output:

  asm("fst %st" : "=&t"(r) : "t"(x));

We need to know if ST0 was a clobber or an output operand, and we can't
depend on <dead> flags for that.

llvm-svn: 133902
2011-06-27 04:08:33 +00:00
Owen Anderson 99adfec0b1 The scheduler needs to be aware on the existence of untyped nodes when it performs type propagation for EXTRACT_SUBREG.
llvm-svn: 133838
2011-06-24 23:02:22 +00:00
Devang Patel f071d72c44 Handle debug info for i128 constants.
llvm-svn: 133821
2011-06-24 20:46:11 +00:00
Jay Foad 83be361b8a Replace the existing forms of ConstantArray::get() with a single form
that takes an ArrayRef.

llvm-svn: 133615
2011-06-22 09:24:39 +00:00
Owen Anderson d1955e78b4 Fix some trailing issues from my introduction of MVT::untyped and its use for REGISTER_SEQUENCE.
llvm-svn: 133567
2011-06-21 22:54:23 +00:00
Evan Cheng 4c0bd9629d Teach dag combine to match halfword byteswap patterns.
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
   => (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
   => (rotl (bswap x) 16)

This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.

rdar://9609108

llvm-svn: 133503
2011-06-21 06:01:08 +00:00
Nadav Rotem d34ce4344b Fix PromoteIntRes_TRUNCATE: Add support for cases where the
source vector type is to be split while the target vector is to be promoted.
(eg: <4 x i64> -> <4 x i8> )

llvm-svn: 133424
2011-06-20 07:15:58 +00:00
Nadav Rotem 94d67a02e0 Code cleanups: Remove duplicated logic in PromotInteRes_BITCAST, reserve vector space, reuse types.
llvm-svn: 133389
2011-06-19 10:49:57 +00:00
Nadav Rotem 35d600d9f4 Calls to AssertZext and getZeroExtendInReg must be made using scalar types.
llvm-svn: 133388
2011-06-19 10:22:39 +00:00
Nadav Rotem 36896bfd0c When promoting the vector elements in CopyToParts, use vector trunc
instead of scalarizing, and doing an element-by-element truncat.

llvm-svn: 133382
2011-06-19 08:49:38 +00:00
Benjamin Kramer e1fc29b6ac Don't allocate empty read-only SmallVectors during SelectionDAG deallocation.
llvm-svn: 133348
2011-06-18 13:13:44 +00:00
Benjamin Kramer 25e17b0f89 Remove unused but set variables.
llvm-svn: 133347
2011-06-18 11:09:41 +00:00
Eric Christopher e4a1266a9a Fix UMULO support for 2x register width to allow the full
range without a libcall to a new mulo<mode> libcall
that we'd have to create.

Finishes the rest of rdar://9090077 and rdar://9210061

llvm-svn: 133318
2011-06-18 00:09:57 +00:00
Eric Christopher 232431c389 Fix comment.
llvm-svn: 133307
2011-06-17 22:35:59 +00:00
Eric Christopher 5bbb2bdb46 Lower multiply with overflow checking to __mulo<mode>
calls if we haven't been able to lower them any
other way.

Fixes rdar://9090077 and rdar://9210061

llvm-svn: 133288
2011-06-17 20:41:29 +00:00
Jakob Stoklund Olesen c826df9506 Don't use register classes larger than TLI->getRegClassFor(VT).
In Thumb mode we cannot handle GPR virtual registers, even though some
instructions can. When isel is lowering a CopyFromReg, it should limit
itself to subclasses of getRegClassFor(VT).

<rdar://problem/9624323>

llvm-svn: 133210
2011-06-16 22:50:38 +00:00
Jakub Staszak 12a43bdde5 Introduce MachineBranchProbabilityInfo class, which has similar API to
BranchProbabilityInfo (expect setEdgeWeight which is not available here).
Branch Weights are kept in MachineBasicBlocks. To turn off this analysis
set -use-mbpi=false.

llvm-svn: 133184
2011-06-16 20:22:37 +00:00
Owen Anderson 5fc8b77f83 Change the REG_SEQUENCE SDNode to take an explict register class ID as its first operand. This operand is lowered away by the time we reach MachineInstrs, so the actual register-allocation handling of them doesn't need to change.
This is intended to support using REG_SEQUENCE SDNode's with type MVT::untyped, and is part of the long road to eliminating some of the hacks we currently use to support register pairs and other strange constraints, particularly on ARM NEON.

llvm-svn: 133178
2011-06-16 18:17:13 +00:00
Jakob Stoklund Olesen 1f641d577e Add TargetRegisterInfo::getRawAllocationOrder().
This virtual function will replace allocation_order_begin/end as the one
to override when implementing custom allocation orders. It is simpler to
have one function return an ArrayRef than having two virtual functions
computing different ends of the same array.

Use getRawAllocationOrder() in place of allocation_order_begin() where
it makes sense, but leave some clients that look like they really want
the filtered allocation orders from RegisterClassInfo.

llvm-svn: 133170
2011-06-16 17:42:25 +00:00
Nick Lewycky 6d677cfdd8 Add a DAGCombine for (ext (binop (load x), cst)).
llvm-svn: 133124
2011-06-16 01:15:49 +00:00
Owen Anderson 96adc4a540 Add a new MVT::untyped. This will be used in future work for modelling ISA features like register pairs and lists with "interesting" constraints (such as ARM NEON contiguous register lists or even-odd paired registers). We need to be able to generate these instructions (often from intrinsics), but don't want to have to assign a legal type to them. Instead, we'll use an "untyped" edge to bypass the type-checking and simply ensure that the register classes match.
llvm-svn: 133106
2011-06-15 23:35:18 +00:00
Andrew Trick 3013b6ae4a Added -stress-sched flag in the Asserts build.
Added a test case for handling physreg aliases during pre-RA-sched.

llvm-svn: 133063
2011-06-15 17:16:12 +00:00
Nadav Rotem 13cb7736a7 getZeroExtendInReg needs to get a scalar type
llvm-svn: 133057
2011-06-15 14:37:18 +00:00
Nadav Rotem d2d9bdb2b0 Enable the simplification of truncating-store after fixing the usage of
GetDemandBits (which must operate on the vector element type).

Fix the a usage of getZeroExtendInReg which must also be done on scalar types.

llvm-svn: 133052
2011-06-15 11:19:12 +00:00
Chad Rosier 818e116723 When pattern matching during instruction selection make sure shl x,1 is not
converted to add x,x if x is a undef.  add undef, undef does not guarantee
that the resulting low order bit is zero.
Fixes <rdar://problem/9453156> and <rdar://problem/9487392>.

llvm-svn: 133022
2011-06-14 22:29:10 +00:00
Nadav Rotem 10193c830b Add a testcase for checking the integer-promotion of many different vector
types (with power of two types such as 8,16,32 .. 512).

Fix a bug in the integer promotion of bitcast nodes. Enable integer expanding
only if the target of the conversion is an integer (when the type action is
scalarize).

Add handling to the legalization of vector load/store in cases where the saved
vector is integer-promoted.

llvm-svn: 132985
2011-06-14 08:11:52 +00:00
Nadav Rotem 571ae19af7 Disable trunc-store simplification on vectors.
llvm-svn: 132984
2011-06-14 07:18:26 +00:00
Bruno Cardoso Lopes dc9ff3a4b1 Add one more argument to the prefetch intrinsic to indicate whether it's a data
or instruction cache access. Update the targets to match it and also teach
autoupgrade.

llvm-svn: 132976
2011-06-14 04:58:37 +00:00
Nadav Rotem 573ee374a2 Fix a bug in FindMemType. When widening vector loads, use a wider memory type
only if the number of packed elements is a power of two.
Bug found in Duncan's testcase.

llvm-svn: 132923
2011-06-13 18:13:24 +00:00
Nadav Rotem 504cf0cde2 Fix a bug in the calculation of the vectorTypeBreakdown into registers. Odd
types such as i33 were rounded to i32. Originated from Duncan's testcase.

llvm-svn: 132893
2011-06-12 14:56:55 +00:00
Nadav Rotem 083837e729 Improve the generated code by getCopyFromPartsVector for promoted integer types.
Instead of scalarizing, and doing an element-by-element truncat, use vector
truncate.
Add support for scalarization of vectors:  i8 -> <1 x i1> (from Duncan's
testcase).

llvm-svn: 132892
2011-06-12 14:49:38 +00:00
Chad Rosier 79044dbebf Revert r132871.
llvm-svn: 132872
2011-06-11 02:27:46 +00:00
Chad Rosier 5793b53027 Typo.
llvm-svn: 132871
2011-06-11 02:16:36 +00:00
Eric Christopher eb964516c3 80-col cleanups.
llvm-svn: 132863
2011-06-10 23:05:08 +00:00
Eli Friedman 1877ac9937 Change this DAGCombine to build AND of SHR instead of SHR of AND; this matches the ordering we prefer in instcombine. Part of rdar://9562809.
The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now.

llvm-svn: 132809
2011-06-09 22:14:44 +00:00