Commit Graph

5932 Commits

Author SHA1 Message Date
Eli Bendersky 90dd3e7dfd Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'm
trying to move as much FastISel logic as possible out of the main path in
SelectionDAGISel - intermixing them just adds confusion.

llvm-svn: 179902
2013-04-19 22:29:18 +00:00
Michael Liao b53d8963ce ArrayRefize getMachineNode(). No functionality change.
llvm-svn: 179901
2013-04-19 22:22:57 +00:00
Eli Bendersky dbeefaa86a Use dbgs() consistently for -debug printouts
llvm-svn: 179894
2013-04-19 21:37:07 +00:00
Eli Bendersky 6084f45f38 Add some more stats for fast isel vs. SelectionDAG, w.r.t lowering function
arguments in entry BBs.

llvm-svn: 179824
2013-04-19 01:04:40 +00:00
Benjamin Kramer bbae991db6 DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible.
This pattern occurs in SROA output due to the way vector arguments are lowered
on ARM.

The testcase from PR15525 now compiles into this, which is better than the code
we got with the old scalarrepl:
_Store:
	ldr.w	r9, [sp]
	vmov	d17, r3, r9
	vmov	d16, r1, r2
	vst1.8	{d16, d17}, [r0]
	bx	lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647

llvm-svn: 179106
2013-04-09 17:41:43 +00:00
Eli Bendersky fc186358f2 Formatting
llvm-svn: 178771
2013-04-04 18:03:41 +00:00
Bill Schmidt 92e26646bc Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
2013-04-03 13:05:44 +00:00
Arnold Schwaighofer d6c6e868b2 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546
2013-04-02 15:58:51 +00:00
Arnold Schwaighofer 6752366ed7 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

llvm-svn: 178483
2013-04-01 18:12:58 +00:00
Benjamin Kramer 9335443236 DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

llvm-svn: 178429
2013-03-30 21:28:18 +00:00
Chad Rosier dbac025d84 [fast-isel] Add a preemptive fix for the case where we fail to materialize an
immediate in a register.  I don't believe this should ever fail, but I see no
harm in trying to make this code bullet proof.

I've added an assert to ensure my assumtion is correct.  If the assertion fires
something is wrong and we should fix it, rather then just silently fall back to
SelectionDAG isel.

llvm-svn: 178305
2013-03-28 23:04:47 +00:00
Michael Liao bb05a1d7b5 Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx)
- Handle the case where the result of 'insert_subvect' is bitcasted
  before 'extract_subvec'. This removes the redundant insertf128/extractf128
  pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.

llvm-svn: 177945
2013-03-25 23:47:35 +00:00
Shuxin Yang 93b1f12ac1 Disable some unsafe-fp-math DAG-combine transformation after legalization.
For instance, following transformation will be disabled:
    x + x + x => 3.0f * x;

The problem of these transformations is that it introduces a FP constant, which
following Instruction-Selection pass cannot handle.

Reviewed by Nadav, thanks a lot!

rdar://13445387

llvm-svn: 177933
2013-03-25 22:52:29 +00:00
Owen Anderson c81616b0a9 Remove the type legality check from the SelectionDAGBuilder when it lowers @llvm.fmuladd to ISD::FMA nodes.
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that *did* support FMA.
For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.

NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook.  They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.
llvm-svn: 177820
2013-03-23 08:26:53 +00:00
Justin Holewinski 7478f3d776 Make variable name more explicit and eliminate redundant lookup in SDNodeOrdering
llvm-svn: 177600
2013-03-20 23:10:59 +00:00
Nadav Rotem 4536d582fd When computing the demanded bits of Load SDNodes, make sure that we are looking at the loaded-value operand and not the ptr result (in case of pre-inc loads).
rdar://13348420

llvm-svn: 177596
2013-03-20 22:53:44 +00:00
Christian Konig ed34d0ef1a Revert "pre-RA-sched: fix TargetOpcode usage"
This reverts commit 06091513c283c863296f01cc7c2e86b56bb50d02.

The code is obviously wrong, but the trivial fix causes
inefficient code generation on X86. Somebody with more
knowledge of the code needs to take a look here.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 177529
2013-03-20 15:43:00 +00:00
Justin Holewinski c2d2c8939c Move SDNode order propagation to SDNodeOrdering, which also fixes a missed
case of order propagation during isel.

Thanks Owen for the suggestion!

llvm-svn: 177525
2013-03-20 14:51:01 +00:00
Christian Konig 9ce2d5b862 pre-RA-sched: fix TargetOpcode usage
TargetOpcodes need to be treaded as Machine- and not ISD-Opcodes.

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 177518
2013-03-20 13:49:22 +00:00
Justin Holewinski d068943809 Propagate DAG node ordering during type legalization and instruction selection
A node's ordering is only propagated during legalization if (a) the new node does
not have an ordering (is not a CSE'd node), or (b) the new node has an ordering
that is higher than the node being legalized.

llvm-svn: 177465
2013-03-20 00:10:32 +00:00
Bill Wendling 965bd58902 Reset some of the target options which affect code generation.
This doesn't reset all of the target options within the TargetOptions
object. This is because some of those are ABI-specific and must be determined if
it's okay to change those on the fly.

llvm-svn: 176986
2013-03-13 22:26:59 +00:00
Richard Relph 61046a9727 Avoid generating ISD::SELECT for vector operands to SIGN_EXTEND
llvm-svn: 176881
2013-03-12 18:17:18 +00:00
Nick Lewycky 48beb21185 Fix a crasher newly introduced in r176659/r176649, where fast-isel tries to
lower an expect intrinsic that is a constant expression.

llvm-svn: 176830
2013-03-11 21:44:37 +00:00
Jan Wen Voung 7857a64909 Disable statistics on Release builds and move tests that depend on -stats.
Summary:
Statistics are still available in Release+Asserts (any +Asserts builds),
and stats can also be turned on with LLVM_ENABLE_STATS.

Move some of the FastISel stats that were moved under DEBUG()
back out of DEBUG(), since stats are disabled across the board now.

Many tests depend on grepping "-stats" output.  Move those into
a orig_dir/Stats/. so that they can be marked as unsupported
when building without statistics.

Differential Revision: http://llvm-reviews.chandlerc.com/D486

llvm-svn: 176733
2013-03-08 22:56:31 +00:00
Benjamin Kramer 0f12a5c2a4 Remove default from fully covered switch.
llvm-svn: 176703
2013-03-08 17:03:19 +00:00
Tom Stellard d93ef7afaa LegalizeDAG: Respect the result of TLI.getBooleanContents() when expanding SETCC
llvm-svn: 176695
2013-03-08 15:37:02 +00:00
Tom Stellard b1588fc057 DAGCombiner: Use correct value type for checking legality of BR_CC v3
LegalizeDAG.cpp uses the value of the comparison operands when checking
the legality of BR_CC, so DAGCombiner should do the same.

v2:
  - Expand more BR_CC value types for NVPTX

v3:
  - Expand correct BR_CC value types for Hexagon, Mips, and XCore.

llvm-svn: 176694
2013-03-08 15:36:57 +00:00
Bill Wendling 2d915e2c15 Revert r176154 in favor of a better approach.
Code generation makes some basic assumptions about the IR it's been given. In
particular, if there is only one 'invoke' in the function, then that invoke
won't be going away. However, with the advent of the `llvm.donothing' intrinsic,
those invokes may go away. If all of them go away, the landing pad no longer has
any users. This confuses the back-end, which asserts.

This happens with SjLj exceptions, because that's the model that modifies the IR
based on there being invokes, etc. in the function.

Remove any invokes of `llvm.donothing' during SjLj EH preparation. This will
give us a CFG that the back-end won't be confused about. If all of the invokes
in a function are removed, then the SjLj EH prepare pass won't insert the bogus
code the relies upon the invokes being there.
<rdar://problem/13228754&13316637>

llvm-svn: 176677
2013-03-08 02:21:08 +00:00
Chad Rosier 3a200e1faf [fast-isel] Seriously, add support for the expect intrinsic.
rdar://13370942

llvm-svn: 176659
2013-03-07 21:38:33 +00:00
Chad Rosier 9c1796f877 [fast-isel] Add support for the expect intrinsic.
rdar://13370942

llvm-svn: 176649
2013-03-07 20:42:17 +00:00
Benjamin Kramer fdf362bd69 ArrayRefize some code. No functionality change.
llvm-svn: 176648
2013-03-07 20:33:29 +00:00
Andrew Trick 0f23b763a9 pre-RA-sched debug-only fix
llvm-svn: 176638
2013-03-07 19:21:08 +00:00
Andrew Trick b2ab8a732c pre-RA-sched assertion fix. This bug was exposed by r176037.
rdar:13370002 [pre-RA-sched] assertion: released too many times

I tracked this down to an earlier hack that is no longer applicable
and interfered with normal scheduler logic. With the changes in
r176037, it was causing an instruction to be scheduled multiple times.

I have an external test case that I tried hard to reduce and
failed. I can't even reproduce with llc.

llvm-svn: 176636
2013-03-07 19:07:57 +00:00
Nadav Rotem 40cda80af8 No need to go through int64 and APInt when generating a new constant.
llvm-svn: 176615
2013-03-07 06:34:49 +00:00
Jim Grosbach 48a91abc10 SDAG: Handle scalarizing an extend of a <1 x iN> vector.
Just scalarize the element and rebuild a vector of the result type
from that.

rdar://13281568

llvm-svn: 176614
2013-03-07 05:47:54 +00:00
Eli Bendersky b1caf3c30e Remove duplicate line and move another closer to its actual use
llvm-svn: 176391
2013-03-01 23:32:40 +00:00
Akira Hatanaka 3d055580a9 Set properties for f128 type.
llvm-svn: 176378
2013-03-01 21:11:44 +00:00
Chad Rosier b3864609cf Generate an error message instead of asserting or segfaulting when we can't
handle indirect register inputs.
rdar://13322011

llvm-svn: 176367
2013-03-01 19:12:05 +00:00
Michael Liao 6af16fc3b7 Fix PR10475
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
  but TLI.getShiftAmountTy() so far only return scalar type. As a
  result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
  TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
  return target-specificed scalar type or the same vector type as the
  1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
  type.

llvm-svn: 176364
2013-03-01 18:40:30 +00:00
Eli Bendersky 33ebf836bc A small refactoring + adding comments.
SelectionDAGIsel::LowerArguments needs a function, not a basic block. So it
makes sense to pass it the function instead of extracting a basic-block from
the function and then tossing it. This is also more self-documenting (functions
have arguments, BBs don't).

In addition, added comments to a couple of Select* methods.

llvm-svn: 176305
2013-02-28 23:09:18 +00:00
Eli Bendersky d0c6e7b038 Put some per-instruction statistics of fast isel under NDEBUG, together with
other per-instruction statistics.

llvm-svn: 176273
2013-02-28 18:05:12 +00:00
Eric Christopher 10d35e9065 Remove unnecessary cast to void.
llvm-svn: 176222
2013-02-27 23:49:45 +00:00
Nadav Rotem c29095fb50 Silence the unused variable warning.
llvm-svn: 176218
2013-02-27 22:52:54 +00:00
Nadav Rotem 00b75dd3c4 The FastISEL should be fast. But when we record statistics we use atomic operations to increment the counters.
This patch disables the counters on non-debug builds. This reduces the runtime of SelectionDAGISel::SelectCodeCommon by ~5%.

llvm-svn: 176214
2013-02-27 21:59:43 +00:00
Michael Ilseman ba8446c80e Reverted: r176136 - Have a way for a target to opt-out of target-independent fast isel
llvm-svn: 176204
2013-02-27 19:54:00 +00:00
Manman Ren 683f59b36c SelectionDAG: If llvm.donothing has a landingpad, we should clear
CurrentCallSite to avoid an assertion failure:
assert(MMI.getCurrentCallSite() == 0 && "Overlapping call sites!");

rdar://problem/13228754

llvm-svn: 176154
2013-02-27 02:11:57 +00:00
Michael Ilseman 846c6f0a32 Have a way for a target to opt-out of target-independent fast isel
llvm-svn: 176136
2013-02-26 23:15:23 +00:00
Chad Rosier 0587597fb8 Fix wording.
llvm-svn: 176055
2013-02-25 22:20:00 +00:00
Chad Rosier a92ef4ba5b [fast-isel] Add X86FastIsel::FastLowerArguments to handle functions with 6 or
fewer scalar integer (i32 or i64) arguments. It completely eliminates the need
for SDISel for trivial functions.

Also, add the new llc -fast-isel-abort-args option, which is similar to
-fast-isel-abort option, but for formal argument lowering.

llvm-svn: 176052
2013-02-25 21:59:35 +00:00
Andrew Trick 7cf4361912 pre-RA-sched fix: only reevaluate physreg interferences when necessary.
Fixes rdar:13279013: scheduler was blowing up on select instructions.

llvm-svn: 176037
2013-02-25 19:11:48 +00:00
Matt Beaumont-Gay 0e760da5fc 'Hexadecimal' has two 'a's and only one 'i'.
llvm-svn: 176031
2013-02-25 18:11:18 +00:00
Chandler Carruth 121dbf8846 Fix spelling noticed by Duncan.
llvm-svn: 176023
2013-02-25 14:29:38 +00:00
Chandler Carruth 05920b1847 Fix the root cause of PR15348 by correctly handling alignment 0 on
memory intrinsics in the SDAG builder.

When alignment is zero, the lang ref says that *no* alignment
assumptions can be made. This is the exact opposite of the internal API
contracts of the DAG where alignment 0 indicates that the alignment can
be made to be anything desired.

There is another, more explicit alignment that is better suited for the
role of "no alignment at all": an alignment of 1. Map the intrinsic
alignment to this early so that we don't end up generating aligned DAGs.

It is really terrifying that we've never seen this before, but we
suddenly started generating a large number of alignment 0 memcpys due to
the new code to do memcpy-based copying of POD class members. That patch
contains a bug that rounds bitfield alignments down when they are the
first field. This can in turn produce zero alignments.

This fixes weird crashes I've seen in library users of LLVM on 32-bit
hosts, etc.

llvm-svn: 176022
2013-02-25 14:20:21 +00:00
Nadav Rotem b7f90bd97b SelectionDAG compile time improvement.
One of the phases of SelectionDAG is LegalizeVectors. We don't need to sort the DAG and copy nodes around if there are no vector ops.

Speeds up the compilation time of SelectionDAG on a big scalar workload by ~8%.

llvm-svn: 175929
2013-02-22 23:33:30 +00:00
Pete Cooper 047f81a5df Fix isa<> check which could never be true.
It was incorrectly checking a Function* being an IntrinsicInst* which
isn't possible.  It should always have been checking the CallInst* instead.

Added test case for x86 which ensures we only get one constant load.
It was 2 before this change.

rdar://problem/13267920

llvm-svn: 175853
2013-02-22 01:50:38 +00:00
Benjamin Kramer 3238dc0c61 DAGCombiner: Make the post-legalize vector op optimization more aggressive.
A legal BUILD_VECTOR goes in and gets constant folded into another legal
BUILD_VECTOR so we don't lose any legality here. The problematic PPC
optimization that made this check necessary was fixed recently.

llvm-svn: 175759
2013-02-21 15:24:35 +00:00
Arnold Schwaighofer 3f9568e921 DAGCombiner: Fold pointless truncate, bitcast, buildvector series
(2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y)))
can be folded into a (2xi32) (buildvector i32 a, i32 b).

Such a DAG would cause uneccessary vdup instructions followed by vmovn
instructions.

We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in
the vectorized version of the code below.

double A[N];
double B[N];

void test_double_compare_to_double() {
  int i;
  for(i=0;i<N;i++)
    A[i] = (double)(A[i] < B[i]);
}

radar://13191881

Fixes bug 15283.

llvm-svn: 175670
2013-02-20 21:33:32 +00:00
Michael Liao 7fb39669ef Fix PR15267
- When extloading from a vector with non-byte-addressable element, e.g.
  <4 x i1>, the current logic breaks. Extend the current logic to
  fix the case where the element type is not byte-addressable by loading
  all bytes, bit-extracting/packing each element.

llvm-svn: 175642
2013-02-20 18:04:21 +00:00
Benjamin Kramer 5c3e21ba55 Move the SplatByte helper to APInt and generalize it a bit.
llvm-svn: 175621
2013-02-20 13:00:06 +00:00
Jakub Staszak 8bc7af1a93 Fix #includes, so we include only what we really need.
llvm-svn: 175581
2013-02-20 00:26:25 +00:00
Chad Rosier 3489bcc9ab [ms-inline asm] Remove a redundant call to the setHasMSInlineAsm function.
llvm-svn: 175456
2013-02-18 20:13:59 +00:00
NAKAMURA Takumi 68426c79db [ms-inline asm] Fix undefined behavior to reset hasMSInlineAsm in advance of SelectAllBasicBlocks().
llvm-svn: 175422
2013-02-18 07:06:48 +00:00
Jakub Staszak 5c262f505e LegalizeDAG.cpp doesn't need DenseMap.
llvm-svn: 175365
2013-02-16 16:15:42 +00:00
Chad Rosier 925c9b499e [ms-inline asm] Do not omit the frame pointer if we have ms-inline assembly.
If the frame pointer is omitted, and any stack changes occur in the inline
assembly, e.g.: "pusha", then any C local variable or C argument references
will be incorrect.  

I pass no judgement on anyone who would do such a thing. ;)
rdar://13218191

llvm-svn: 175334
2013-02-16 01:25:28 +00:00
Bill Wendling aef9c37c65 Use the 'target-features' and 'target-cpu' attributes to reset the subtarget features.
If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-feature' attribute.

llvm-svn: 175314
2013-02-15 22:31:27 +00:00
Paul Redmond f29ddfe93f enable SDISel sincos optimization for GNU environments
- add sincos to runtime library if target triple environment is GNU
- added canCombineSinCosLibcall() which checks that sincos is in the RTL and
  if the environment is GNU then unsafe fpmath is enabled (required to
  preserve errno)
- extended sincos-opt lit test

Reviewed by: Hal Finkel

llvm-svn: 175283
2013-02-15 18:45:18 +00:00
Nadav Rotem 495b1a43c1 Dont merge consecutive loads/stores into vectors when noimplicitfloat is used.
llvm-svn: 175190
2013-02-14 18:28:52 +00:00
Owen Anderson cc068993ee Add some legality checks for SETCC before introducing it in the DAG combiner post-operand legalization.
llvm-svn: 175149
2013-02-14 09:07:33 +00:00
Guy Benyei 83c74e9fad Add static cast to unsigned char whenever a character classification function is called with a signed char argument, in order to avoid assertions in Windows Debug configuration.
llvm-svn: 175006
2013-02-12 21:21:59 +00:00
Paul Redmond 288604ed0c PR14562 - Truncation of left shift became undef
DAGCombiner::ReduceLoadWidth was converting (trunc i32 (shl i64 v, 32))
into (shl i32 v, 32) into undef. To prevent this, check the shift count
against the final result size.

Patch by: Kevin Schoedel
Reviewed by: Nadav Rotem

llvm-svn: 174972
2013-02-12 15:21:21 +00:00
Pete Cooper 10a3ae7039 Check type for legality before forming a select from loads.
Sorry for the lack of a test case.  I tried writing one for i386 as i know selects are illegal on this target, but they are actually considered legal by isel and expanded later.

I can't see any targets to trigger this, but checking for the legality of a node before forming it is general goodness.

llvm-svn: 174934
2013-02-12 03:14:50 +00:00
Evan Cheng 615620c9e8 Currently, codegen may spent some time in SDISel passes even if an entire
function is successfully handled by fast-isel. That's because function
arguments are *always* handled by SDISel. Introduce FastLowerArguments to
allow each target to provide hook to handle formal argument lowering.

As a proof-of-concept, add ARMFastIsel::FastLowerArguments to handle
functions with 4 or fewer scalar integer (i8, i16, or i32) arguments. It
completely eliminates the need for SDISel for trivial functions.

rdar://13163905

llvm-svn: 174855
2013-02-11 01:27:15 +00:00
Evan Cheng d1c6404250 Remove unnecessary code.
llvm-svn: 174854
2013-02-11 01:18:26 +00:00
Hal Finkel 2581905f81 DAGCombiner: Constant folding around pre-increment loads/stores
Previously, even when a pre-increment load or store was generated,
we often needed to keep a copy of the original base register for use
with other offsets. If all of these offsets are constants (including
the offset which was combined into the addressing mode), then this is
clearly unnecessary. This change adjusts these other offsets to use the
new incremented address.

llvm-svn: 174746
2013-02-08 21:35:47 +00:00
Bob Wilson 67bbf3aa0c Revert 172027 and 174336. Remove diagnostics about over-aligned stack objects.
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good.  We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it.  We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead.  <rdar://problem/13127907>

llvm-svn: 174741
2013-02-08 20:35:15 +00:00
Evan Cheng a72b9709d7 Tweak check to avoid integer overflow (for insanely large alignments)
llvm-svn: 174482
2013-02-06 02:06:33 +00:00
Owen Anderson de89ecf1fc Reapply r174343, with a fix for a scary DAG combine bug where it failed to differentiate between the alignment of the
base point of a load, and the overall alignment of the load.  This caused infinite loops in DAG combine with the
original application of this patch.

ORIGINAL COMMIT LOG:
When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment.  However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.

This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.

llvm-svn: 174431
2013-02-05 19:24:39 +00:00
NAKAMURA Takumi 3753b28cd2 Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load,"
It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts.

llvm-svn: 174374
2013-02-05 14:44:16 +00:00
Owen Anderson a47fdbb032 When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment.  However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.

This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.

llvm-svn: 174343
2013-02-05 06:25:30 +00:00
Benjamin Kramer 548ffa274a SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors.
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp

The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.

Fix some trivially foldable X86 tests too.

llvm-svn: 174325
2013-02-04 15:19:18 +00:00
Shuxin Yang cadd8a068e rdar://13126763
Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression
"x + x*x" into "x * 3.0".

llvm-svn: 174239
2013-02-02 00:22:03 +00:00
Nadav Rotem f04cbeb357 Fix errant fallthrough in the generation of the lifetime markers.
Found by Alexander Kornienko.

llvm-svn: 174207
2013-02-01 19:25:23 +00:00
Lang Hames dd47804394 When lowering memcpys to loads and stores, make sure we don't promote alignments
past the natural stack alignment.

llvm-svn: 174085
2013-01-31 20:23:43 +00:00
Weiming Zhao 4a0b4fb9a5 Add a special handling case for untyped CopyFromReg node in GetCostForDef() of ScheduleDAGRRList
llvm-svn: 173833
2013-01-29 21:18:43 +00:00
Evan Cheng 0e88c7d897 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Benjamin Kramer cf9dae17b7 Legalizer: Reword comment again, per Duncan's suggestion.
llvm-svn: 173625
2013-01-27 21:02:52 +00:00
Benjamin Kramer 084e675e17 Legalizer: Add an assert and tweak a comment to clarify the assumptions this code makes.
llvm-svn: 173620
2013-01-27 15:04:43 +00:00
Benjamin Kramer 05cc93964a When the legalizer is splitting vector shifts, the result may not have the right shift amount type.
Fix that by adding a cast to the shift expander. This came up with vector shifts
on sse-less X86 CPUs.

   <2 x i64>       = shl <2 x i64> <2 x i64>
-> i64,i64         = shl i64 i64; shl i64 i64
-> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64

Now we cast the last two i64s to the right type. Fixes the crash in PR14668.

llvm-svn: 173615
2013-01-27 11:19:11 +00:00
Preston Gurd 0959bb707d This patch aims to reduce compile time in LegalizeTypes by using SmallDenseMap,
with an initial number of elements,  instead of DenseMap, which has
zero initial elements, in order to avoid the copying of elements
when the size changes and to avoid allocating space every time
LegalizeTypes is run. This patch will not affect the memory footprint,
because DenseMap will increase the element size to 64
when the first element is added.

Patch by Wan Xiaofei.

llvm-svn: 173448
2013-01-25 15:18:54 +00:00
Tim Northover 29178a348a Make APFloat constructor require explicit semantics.
Previously we tried to infer it from the bit width size, with an added
IsIEEE argument for the PPC/IEEE 128-bit case, which had a default
value. This default value allowed bugs to creep in, where it was
inappropriate.

llvm-svn: 173138
2013-01-22 09:46:31 +00:00
Nadav Rotem 9450fcfff1 Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.

llvm-svn: 172968
2013-01-20 08:35:56 +00:00
Bill Wendling 658d24d211 Use AttributeSet accessor methods instead of Attribute accessor methods.
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.

llvm-svn: 172853
2013-01-18 21:53:16 +00:00
Bill Wendling 4f972ea2d8 Remove unused parameter. Also use the AttributeSet query methods instead of the Attribute query methods.
llvm-svn: 172852
2013-01-18 21:50:24 +00:00
Elena Demikhovsky f6a30e05d5 Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

llvm-svn: 172708
2013-01-17 09:59:53 +00:00
Bill Schmidt d006c6938b This patch addresses an incorrect transformation in the DAG combiner.
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form.  The combiner first sees a 64-bit load that can be converted into a
pre-increment form.  The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword.  This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.

However, this transformation is not valid, as the replacement load is not
a pre-increment load.  The pre-increment load produces an extra result,
which feeds a subsequent add instruction.  The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add.  Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.

So the patch simply disables this transformation for any load with more than
two result values.

llvm-svn: 172480
2013-01-14 22:04:38 +00:00
Benjamin Kramer 5ea0349ef5 When lowering an inreg sext first shift left, then right arithmetically.
Shifting right two times will only yield zero. Should fix
SingleSource/UnitTests/SignlessTypes/factor.

llvm-svn: 172322
2013-01-12 19:06:44 +00:00
Nadav Rotem dbe5c72d03 PPC: Implement efficient lowering of sign_extend_inreg.
llvm-svn: 172269
2013-01-11 22:57:48 +00:00
Benjamin Kramer fb3c009b52 Remove some accidentaly duplicated code. This needs urgent cleanup :(
llvm-svn: 172248
2013-01-11 20:11:33 +00:00
Benjamin Kramer 56b31bd9d7 Split TargetLowering into a CodeGen and a SelectionDAG part.
This fixes some of the cycles between libCodeGen and libSelectionDAG. It's still
a complete mess but as long as the edges consist of virtual call it doesn't
cause breakage. BasicTTI did static calls and thus broke some build
configurations.

llvm-svn: 172246
2013-01-11 20:05:37 +00:00
Eric Christopher 0cb6fd930e For inline asm:
- recognize string "{memory}" in the MI generation
- mark as mayload/maystore when there's a memory clobber constraint.

PR14859.

Patch by Krzysztof Parzyszek

llvm-svn: 172228
2013-01-11 18:12:39 +00:00