Commit Graph

423 Commits

Author SHA1 Message Date
Benjamin Kramer ca7ca4f6c6 TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
llvm-svn: 162101
2012-08-17 15:54:21 +00:00
Micah Villmow b67d7a3a33 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00
Micah Villmow 6b12f596ef Don't generate ordered or unordered comparison operations if it is not legal to do so.
llvm-svn: 161053
2012-07-31 16:48:03 +00:00
Bill Wendling d163405df8 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng 47d7be9578 Make sure constant bitwidth is <= 64 bit before calling getSExtValue().
llvm-svn: 160350
2012-07-17 07:47:50 +00:00
Evan Cheng f579beca6d This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Duncan Sands 71dacd09fe All cases are covered, no need for a default. This deals with the
corresponding clang warning.

llvm-svn: 159742
2012-07-05 10:14:33 +00:00
Duncan Sands 0552a2cad2 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Evan Cheng 39e90029a2 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Nadav Rotem b7bb72e4f3 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Jakob Stoklund Olesen e326ed33a8 Make sure findRepresentativeClass picks the widest super-register.
We want the representative register class to contain the largest
super-registers available. This makes the function less sensitive to the
register class numbering.

llvm-svn: 156220
2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen 75fbe90839 Use SuperRegClassIterator for findRepresentativeClass().
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.

llvm-svn: 156147
2012-05-04 02:19:22 +00:00
Nadav Rotem 31caa27bf5 Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors.
llvm-svn: 155296
2012-04-21 20:08:32 +00:00
Joel Jones 828531f798 Fixes a problem in instruction selection with testing whether or not the
transformation:

(X op C1) ^ C2 --> (X op C1) & ~C2 iff (C1&C2) == C2

should be done.  

This change has been tested:
 Using a debug+asserts build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt
   using this clang to build a release version of clang
 Using the release+asserts clang-with-clang build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt

Checking in because Evan wants it checked in.  Test case forthcoming after
scrubbing.

llvm-svn: 154955
2012-04-17 22:23:10 +00:00
Akira Hatanaka 8483a6c47d Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.

llvm-svn: 154341
2012-04-09 20:32:12 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Jakob Stoklund Olesen 37492eac8c Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

llvm-svn: 154119
2012-04-05 20:30:20 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Craig Topper 4c7d995029 Remove default case from switch that was already covering all cases.
llvm-svn: 153996
2012-04-04 04:42:42 +00:00
Chad Rosier 2a02fe1bb2 Fix an issue in SimplifySetCC() specific to vector comparisons.
When folding X == X we need to check getBooleanContents() to determine if the
result is a vector of ones or a vector of negative ones. 

I tried creating a test case, but the problem seems to only be exposed on a
much older version of clang (around r144500).
rdar://10923049

llvm-svn: 153966
2012-04-03 20:11:24 +00:00
Eli Friedman 18a4c31525 Use the correct ShiftAmtTy for creating shifts after legalization. PR11881. Not committing a testcase because I think it will be too fragile.
llvm-svn: 149315
2012-01-31 01:08:03 +00:00
David Blaikie 5d8e42755c Refactor variables unused under non-assert builds (& remove two entirely unused variables).
llvm-svn: 148230
2012-01-16 05:17:39 +00:00
Nadav Rotem 57935243bd [AVX] Optimize x86 VSELECT instructions using SimplifyDemandedBits.
We know that the blend instructions only use the MSB, so if the mask is
sign-extended then we can convert it into a SHL instruction. This is a
common pattern because the type-legalizer sign-extends the i1 type which
is used by the LLVM-IR for the condition.

Added a new optimization in SimplifyDemandedBits for SIGN_EXTEND_INREG -> SHL.

llvm-svn: 148225
2012-01-15 19:27:55 +00:00
Chandler Carruth f3e8502cc1 Add 'llvm_unreachable' to passify GCC's understanding of the constraints
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.

llvm-svn: 147861
2012-01-10 18:08:01 +00:00
David Blaikie edbb58c577 Remove unnecessary default cases in switches that cover all enum values.
llvm-svn: 147855
2012-01-10 16:47:17 +00:00
Dan Gohman 94580ab375 Add basic generic CodeGen support for half.
llvm-svn: 146927
2011-12-20 00:02:33 +00:00
Eli Friedman 2ec824966d Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570.
llvm-svn: 146630
2011-12-15 02:07:20 +00:00
Eli Friedman 053a724483 Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514.
llvm-svn: 146219
2011-12-09 01:16:26 +00:00
Owen Anderson 0b9b9da6c8 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Eli Friedman 53218b6fcc Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319.
llvm-svn: 144216
2011-11-09 22:25:12 +00:00
Pete Cooper 82cd9e81fc Added invariant field to the DAG.getLoad method and changed all calls.
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses

llvm-svn: 144100
2011-11-08 18:42:53 +00:00
Richard Osborne 561fac4d4e Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV()
and TargetLowering::BuildUDIV(). Fixes PR11283

llvm-svn: 143964
2011-11-07 17:09:05 +00:00
Dan Gohman c32af340fc Change the default scheduler from Latency to ILP, since Latency
is going away.

llvm-svn: 142810
2011-10-24 17:45:02 +00:00
Nadav Rotem 486ff59a9f Enable element promotion type legalization by deafault.
Changed tests which assumed that vectors are legalized by widening them.

llvm-svn: 142152
2011-10-16 20:31:33 +00:00
Jim Grosbach 400907cc41 Fix typo. "__sync_fetch_and-xor_4" should be "__sync_fetch_and_xor_4".
Pointed out by George Russell.

llvm-svn: 141956
2011-10-14 15:53:48 +00:00
Jakob Stoklund Olesen 35163e21dc Use an existing function.
llvm-svn: 141763
2011-10-12 01:24:51 +00:00
Duncan Sands f2641e1bc1 Add codegen support for vector select (in the IR this means a select
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159
2011-09-06 19:07:46 +00:00
Owen Anderson 40d756eacc Fix a truly heinous bug in DAGCombine related to AssertZext.
If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert.  The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users.  No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem.
Fixes <rdar://problem/10063365>.

llvm-svn: 139059
2011-09-03 00:26:49 +00:00
Eli Friedman 30a49e93e3 New approach to r136737: insert the necessary fences for atomic ops in platform-independent code, since a bunch of platforms (ARM, Mips, PPC, Alpha are the relevant targets here) need to do essentially the same thing.
I think this completes the basic CodeGen for atomicrmw and cmpxchg.

llvm-svn: 136813
2011-08-03 21:06:02 +00:00
Chris Lattner 229907cd11 land David Blaikie's patch to de-constify Type, with a few tweaks.
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Eric Christopher 92464be28c Check register class matching instead of width of type matching
when determining validity of matching constraint. Allow i1
types access to the GR8 reg class for x86.

Fixes PR10352 and rdar://9777108

llvm-svn: 135180
2011-07-14 20:13:52 +00:00
Cameron Zwarich f03fa189ca Add an intrinsic and codegen support for fused multiply-accumulate. The intent
is to use this for architectures that have a native FMA instruction.

llvm-svn: 134742
2011-07-08 21:39:21 +00:00
Benjamin Kramer 9960a25006 Emit a more efficient magic number multiplication for exact sdivs.
We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building.

  struct foo { char x[24]; };
  long bar(struct foo *a, struct foo *b) { return a-b; }
is now compiled into
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  sarl	$3, %eax
  imull	$-1431655765, %eax, %eax
instead of
  movl	4(%esp), %eax
  subl	8(%esp), %eax
  movl	$715827883, %ecx
  imull	%ecx
  movl	%edx, %eax
  shrl	$31, %eax
  sarl	$2, %edx
  addl	%eax, %edx
  movl	%edx, %eax

llvm-svn: 134695
2011-07-08 10:31:30 +00:00
Eric Christopher 6a6d8fc7fd Remove a FIXME. All of the standard ones are in the list.
llvm-svn: 134647
2011-07-07 22:29:03 +00:00
Eric Christopher f81292ba3b Remove getRegClassForInlineAsmConstraint and all dependencies.
Fixes rdar://9643582

llvm-svn: 134123
2011-06-30 01:20:03 +00:00
Eric Christopher 5bbb2bdb46 Lower multiply with overflow checking to __mulo<mode>
calls if we haven't been able to lower them any
other way.

Fixes rdar://9090077 and rdar://9210061

llvm-svn: 133288
2011-06-17 20:41:29 +00:00
Nadav Rotem 504cf0cde2 Fix a bug in the calculation of the vectorTypeBreakdown into registers. Odd
types such as i33 were rounded to i32. Originated from Duncan's testcase.

llvm-svn: 132893
2011-06-12 14:56:55 +00:00
Chad Rosier 79044dbebf Revert r132871.
llvm-svn: 132872
2011-06-11 02:27:46 +00:00
Chad Rosier 5793b53027 Typo.
llvm-svn: 132871
2011-06-11 02:16:36 +00:00
Stuart Hastings bee6fcc5aa Avoid FGETSIGN of 80-bit types. Fixes PR10085.
llvm-svn: 132681
2011-06-06 16:44:31 +00:00
Nadav Rotem 78d19bebe6 TypeLegalizer: Fix a bug in the promotion of elements of integer vectors.
(only happens when using the -promote-elements option).

The correct legalization order is to first try to promote element. Next, we try
to widen vectors.

llvm-svn: 132648
2011-06-04 20:32:01 +00:00
Eric Christopher de9399bf76 Have LowerOperandForConstraint handle multiple character constraints.
Part of rdar://9119939

llvm-svn: 132510
2011-06-02 23:16:42 +00:00
Rafael Espindola aa318ae495 Revert 132424 to fix PR10068.
llvm-svn: 132479
2011-06-02 19:57:47 +00:00
Stuart Hastings 7adc95f69e Recommit 132404 with fixes. rdar://problem/5993888
llvm-svn: 132424
2011-06-01 21:33:14 +00:00
Stuart Hastings 3ae49c03a4 Fix double FGETSIGN to work on x86_32; followup to 132396.
rdar://problem/5660695

llvm-svn: 132411
2011-06-01 18:32:25 +00:00
Stuart Hastings fd5ecd0cec Turn on FGETSIGN for x86. Followup to 132388. rdar://problem/5660695
llvm-svn: 132396
2011-06-01 14:04:17 +00:00
Nadav Rotem 8b24a731f2 This patch is another step in the direction of adding vector select. In this
patch we add a flag to enable a new type legalization decision - to promote
integer elements in vectors. Currently, the rest of the codegen does not support
this kind of legalization.  This flag will be removed when the transition is
complete.

llvm-svn: 132394
2011-06-01 12:51:46 +00:00
Nadav Rotem d86c1c41fb Refactor the type legalizer. Switch TargetLowering to a new enum - LegalizeTypeAction.
This patch does not change the behavior of the type legalizer. The codegen
produces the same code.
This infrastructural change is needed in order to enable complex decisions
for vector types (needed by the vector-select patch).

llvm-svn: 132263
2011-05-28 17:57:14 +00:00
Nadav Rotem a9effb13dd Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place. Re-apply 131534 and fix the multi-step promotion of integers.

llvm-svn: 132217
2011-05-27 21:03:13 +00:00
Stuart Hastings 4a4e5a2b55 Update some currently-disabled code, preparing for eventual use.
llvm-svn: 131663
2011-05-19 18:48:20 +00:00
Duncan Sands 3d9407f4eb Revert commit 131534 since it seems to have broken several buildbots.
Original log entry:
Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place.

llvm-svn: 131536
2011-05-18 14:57:56 +00:00
Nadav Rotem c5c27ede55 Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'
code in one place.

llvm-svn: 131534
2011-05-18 12:26:38 +00:00
Eric Christopher 4480428474 Look through struct wrapped types for inline asm statments.
Patch by Evan Cheng.

llvm-svn: 131093
2011-05-09 20:04:43 +00:00
Eli Friedman 2518f8376d Make the logic for determining function alignment more explicit. No functionality change.
llvm-svn: 131012
2011-05-06 20:34:06 +00:00
Benjamin Kramer 341c11da3b DAGCombine: fold "(zext x) == C" into "x == (trunc C)" if the trunc is lossless.
On x86 this allows to fold a load into the cmp, greatly reducing register pressure.
  movzbl	(%rdi), %eax
  cmpl	$47, %eax
->
  cmpb	$47, (%rdi)

This shaves 8k off gcc.o on i386. I'll leave applying the patch in README.txt to Chris :)

llvm-svn: 130005
2011-04-22 18:47:44 +00:00
Chris Lattner 0ab5e2cded Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!

llvm-svn: 129558
2011-04-15 05:18:47 +00:00
Chris Lattner 493b3e72f2 sink a call into its only use.
llvm-svn: 129503
2011-04-14 04:12:47 +00:00
Owen Anderson 9c12834eed During post-legalization DAG combining, be careful to only create shifts where the RHS is of the legal type for the new operation.
llvm-svn: 129484
2011-04-13 23:22:23 +00:00
Evan Cheng bd76679700 Issue libcalls __udivmod*i4 / __divmod*i4 for div / rem pairs.
rdar://8911343

llvm-svn: 128696
2011-04-01 00:42:02 +00:00
Benjamin Kramer cfcea12fe2 BuildUDIV: If the divisor is even we can simplify the fixup of the multiplied value by introducing an early shift.
This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into
	shrl	$2, %edi
	imulq	$613566757, %rdi, %rax
	shrq	$32, %rax
	ret

instead of
	movl    %edi, %eax
	imulq   $613566757, %rax, %rcx
	shrq    $32, %rcx
	subl    %ecx, %eax
	shrl    %eax
	addl    %ecx, %eax
	shrl    $4, %eax

on x86_64

llvm-svn: 127829
2011-03-17 20:39:14 +00:00
Owen Anderson b2c80da4ae Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS.
llvm-svn: 126518
2011-02-25 21:41:48 +00:00
Chris Lattner 46c01a30f4 Enhance ComputeMaskedBits to know that aligned frameindexes
have their low bits set to zero.  This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.

Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively.  Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST).  The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.

llvm-svn: 125470
2011-02-13 22:25:43 +00:00
Benjamin Kramer 45d183ccf0 Fix an off-by-one error in ctpop combining.
llvm-svn: 123664
2011-01-17 18:00:28 +00:00
Benjamin Kramer 24c5184dca Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0.
This shaves off 4 popcounts from the hacked 186.crafty source.

This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.

llvm-svn: 123621
2011-01-17 12:04:57 +00:00
Dale Johannesen d2b48119b0 Fix PR 8916 (qv for analysis), at least the immediate problem.
There's an inherent tension in DAGCombine between assuming
that things will be put in canonical form, and the Depth
mechanism that disables transformations when recursion gets
too deep.  It would not surprise me if there's a lot of little
bugs like this one waiting to be discovered.  The mechanism
seems fragile and I'd suggest looking at it from a design viewpoint.

llvm-svn: 123191
2011-01-10 21:53:07 +00:00
Evan Cheng 3ae2b79aa3 Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy
etc. takes an option OptSize. If OptSize is true, it would return
the inline limit for functions with attribute OptSize.

llvm-svn: 122952
2011-01-06 06:52:41 +00:00
Nick Lewycky 0de20af7ba Add missing standard headers. Patch by Joerg Sonnenberger!
llvm-svn: 122193
2010-12-19 20:43:38 +00:00
Jay Foad 583abbc4df PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and
zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.

llvm-svn: 121120
2010-12-07 08:25:19 +00:00
Chris Lattner ea41dfe385 add TLI support indicating that jumps are more expensive than logical operations
and use this to disable a specific optimization.  Patch by Micah Villmow!

llvm-svn: 120435
2010-11-30 18:12:52 +00:00
Wesley Peck 527da1b6e2 Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
llvm-svn: 119990
2010-11-23 03:31:01 +00:00
Dale Johannesen f11ea9ce61 Fix an inline asm pasto from 117667; was preventing
{i64, i64} from matching i128.

llvm-svn: 118465
2010-11-09 01:15:07 +00:00
John Thompson e8360b7182 Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
llvm-svn: 117667
2010-10-29 17:29:13 +00:00
Dale Johannesen 320a553319 Remove Synthesizable from the Type system; as MMX vector
types are no longer Legal on X86, we don't need it.
No functional change.  8499854.

llvm-svn: 116947
2010-10-20 21:32:10 +00:00
John Thompson c467aa2fa4 Fixed pr20314-2.c failure, added E, F, p constraint letters.
llvm-svn: 114490
2010-09-21 22:04:54 +00:00
Chris Lattner 1ffcf527c7 continue MachinePointerInfo'izing, eliminating use of one of the old
getLoad overloads.

llvm-svn: 114443
2010-09-21 16:36:31 +00:00
Eric Christopher 79127ab3f5 Silence more warnings. Two more unused variables.
llvm-svn: 113771
2010-09-13 18:30:57 +00:00
John Thompson 1094c80281 Added skeleton for inline asm multiple alternative constraint support.
llvm-svn: 113766
2010-09-13 18:15:37 +00:00
Chris Lattner 8df99b523e remove some llvmcontext arguments that are now dead post-refactoring.
llvm-svn: 112104
2010-08-25 23:00:45 +00:00
Chris Lattner 75ff053497 Change handling of illegal vector types to widen when possible instead of
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats.  This
affects two places in the code: handling cross block values and handling
function return and arguments.  Since vectors are already widened by 
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.

For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
 %B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
 %C = fadd <2 x float> %B, %B
  br label %BB
BB:
 %D = fadd <2 x float> %C, %C
 %E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
 ret <4 x float> %E
}

Now compiles into:

_test2:                                 ## @test2
## BB#0:
 addps %xmm0, %xmm0
 addps %xmm0, %xmm0
 ret

previously it compiled into:

_test2:                                 ## @test2
## BB#0:
 addps %xmm0, %xmm0
 pshufd $1, %xmm0, %xmm1
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
 insertps $0, %xmm0, %xmm0
 insertps $16, %xmm1, %xmm0
 addps %xmm0, %xmm0
 ret

This implements rdar://8230384

llvm-svn: 112101
2010-08-25 22:49:25 +00:00
Eli Friedman 460ad41d6d PR7586: Make sure we don't claim that unknown bits are actually known in the
ISD::AND case of TargetLowering::SimplifyDemandedBits.

llvm-svn: 110019
2010-08-02 04:42:25 +00:00
Eli Friedman ffe64c06ef Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.

llvm-svn: 109854
2010-07-30 06:44:31 +00:00
Dan Gohman 55e244698a Use the proper type for shift counts. This fixes a bootstrap error.
llvm-svn: 109265
2010-07-23 21:08:12 +00:00
Dan Gohman 0818684a70 DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits
are not demanded. This often allows the anyext to be folded away.

llvm-svn: 109242
2010-07-23 18:03:30 +00:00
Evan Cheng a77f3d3b37 Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
llvm-svn: 108991
2010-07-21 06:09:07 +00:00
Evan Cheng 10f99a3490 ARM has to provide its own TargetLowering::findRepresentativeClass because its scalar floating point registers alias its vector registers.
llvm-svn: 108761
2010-07-19 22:15:08 +00:00
Evan Cheng 7a135510e3 Teach computeRegisterProperties() to compute "representative" register class for legal value types. A "representative" register class is the largest legal super-reg register class for a value type. e.g. On i386, GR32 is the rep register class for i8 / i16 / i32; on x86_64 it would be GR64.
This property will be used by the register pressure tracking instruction scheduler.

llvm-svn: 108735
2010-07-19 18:47:01 +00:00