Commit Graph

6012 Commits

Author SHA1 Message Date
Hal Finkel 5ef4dccdce Use TargetSubtargetInfo::useAA() in DAGCombine
This uses the TargetSubtargetInfo::useAA() function to control the defaults of
the -combiner-alias-analysis and -combiner-global-alias-analysis options.

llvm-svn: 189564
2013-08-29 03:29:55 +00:00
Juergen Ributzka 11c52c601a Fix a typo and coding style of a previous commit. No functional change.
llvm-svn: 189526
2013-08-28 22:33:58 +00:00
Tim Northover 819bfb5a25 DAGCombiner: make sure or/shl/srl really has zero high bits before forming bswap
We want to convert code like (or (srl N, 8), (shl N, 8)) into (srl (bswap N),
const), but this is only valid if the bits above 16 on the source pattern are
0, the checks we were doing on this were slightly wrong before.

llvm-svn: 189348
2013-08-27 13:46:45 +00:00
Owen Anderson a0260f848d Remove an over-zealous assertion. A pointer type could be illegal if the target is prepared to custom-legalize pointer operands. This assertion was evaluated before the target would have a chance to do so, making it impossible.
llvm-svn: 189299
2013-08-27 00:28:23 +00:00
Tom Stellard 838e2344ec SelectionDAG: Remove unnecessary uses of TargetLowering::getPointerTy()
If we have a binary operation like ISD:ADD, we can set the result type
equal to the result type of one of its operands rather than using
TargetLowering::getPointerTy().

Also, any use of DAG.getIntPtrConstant(C) as an operand for a binary
operation can be replaced with:
DAG.getConstant(C, OtherOperand.getValueType());

llvm-svn: 189227
2013-08-26 15:06:10 +00:00
Tom Stellard 7da047c9fb SelectionDAG: Use correct pointer size when splitting vector stores
llvm-svn: 189224
2013-08-26 15:05:55 +00:00
Tom Stellard fd155828ed SelectionDAG: Use correct pointer size when lowering function arguments v2
This adds minimal support to the SelectionDAG for handling address spaces
with different pointer sizes.  The SelectionDAG should now correctly
lower pointer function arguments to the correct size as well as generate
the correct code when lowering getelementptr.

This patch also updates the R600 DataLayout to use 32-bit pointers for
the local address space.

v2:
  - Add more helper functions to TargetLoweringBase
  - Use CHECK-LABEL for tests

llvm-svn: 189221
2013-08-26 15:05:36 +00:00
Benjamin Kramer b12cf01908 Add a function object to compare the first or second component of a std::pair.
Replace instances of this scattered around the code base.

llvm-svn: 189169
2013-08-24 12:54:27 +00:00
Michael Gottesman 20f25eb958 [stack protector] Work around an issue with the BMOVPCB_CALL instruction on ARM by disabling does not return on __stack_chk_fail.
This is to fix the bots while I look to see if there is something I can do here.

rdar://14811848

llvm-svn: 189076
2013-08-22 23:45:24 +00:00
Michael Gottesman 1adac3582d [stackprotector] When finding the split point to splice off the end of a parentmbb into a successmbb, include any DBG_VALUE MI.
Fix for PR16954.

llvm-svn: 188987
2013-08-22 05:40:50 +00:00
Tom Stellard 1b2c2d8414 SelectionDAG: Make sure stores are always added to the LegalizedNodes list
When truncated vector stores were being custom lowered in
VectorLegalizer::LegalizeOp(), the old (illegal) and new (legal) node pair
was not being added to LegalizedNodes list.  Instead of the legalized
result being passed to VectorLegalizer::TranslateLegalizeResult(),
the result was being passed back into VectorLegalizer::LegalizeOp(),
which ended up adding a (new, new) pair to the list instead.

This was causing an assertion failure when a custom lowered truncated
vector store was the last instruction a basic block and the VectorLegalizer
was unable to find it in the LegalizedNodes list when updating the
DAG root.

llvm-svn: 188953
2013-08-21 22:42:58 +00:00
Juergen Ributzka 3db39dc1ae Teach BaseIndexOffset::match to identify base pointers in loops.
The small utility function that pattern matches Base + Index +
Offset patterns for loads and stores fails to recognize the base
pointer for loads/stores from/into an array at offset 0 inside a
loop. As a result DAGCombiner::MergeConsecutiveStores was not able
to merge all stores.

This commit fixes the issue by adding an additional pattern match
and also a test case.

Reviewer: Nadav
llvm-svn: 188936
2013-08-21 21:53:38 +00:00
Richard Sandiford 6f6d55161b [SystemZ] Use SRST to optimize memchr
SystemZTargetLowering::emitStringWrapper() previously loaded the character
into R0 before the loop and made R0 live on entry.  I'd forgotten that
allocatable registers weren't allowed to be live across blocks at this stage,
and it confused LiveVariables enough to cause a miscompilation of f3 in
memchr-02.ll.

This patch instead loads R0 in the loop and leaves LICM to hoist it
after RA.  This is actually what I'd tried originally, but I went for
the manual optimisation after noticing that R0 often wasn't being hoisted.
This bug forced me to go back and look at why, now fixed as r188774.

We should also try to optimize null checks so that they test the CC result
of the SRST directly.  The select between null and the SRST GPR result could
then usually be deleted as dead.

llvm-svn: 188779
2013-08-20 09:38:48 +00:00
Michael Gottesman f7e1203d95 Remove unused variables that crept in.
llvm-svn: 188761
2013-08-20 07:17:27 +00:00
Michael Gottesman b27f0f1f6b Teach selectiondag how to handle the stackprotectorcheck intrinsic.
Previously, generation of stack protectors was done exclusively in the
pre-SelectionDAG Codegen LLVM IR Pass "Stack Protector". This necessitated
splitting basic blocks at the IR level to create the success/failure basic
blocks in the tail of the basic block in question. As a result of this,
calls that would have qualified for the sibling call optimization were no
longer eligible for optimization since said calls were no longer right in
the "tail position" (i.e. the immediate predecessor of a ReturnInst
instruction).

Then it was noticed that since the sibling call optimization causes the
callee to reuse the caller's stack, if we could delay the generation of
the stack protector check until later in CodeGen after the sibling call
decision was made, we get both the tail call optimization and the stack
protector check!

A few goals in solving this problem were:

  1. Preserve the architecture independence of stack protector generation.

  2. Preserve the normal IR level stack protector check for platforms like
     OpenBSD for which we support platform specific stack protector
     generation.

The main problem that guided the present solution is that one can not
solve this problem in an architecture independent manner at the IR level
only. This is because:

  1. The decision on whether or not to perform a sibling call on certain
     platforms (for instance i386) requires lower level information
     related to available registers that can not be known at the IR level.

  2. Even if the previous point were not true, the decision on whether to
     perform a tail call is done in LowerCallTo in SelectionDAG which
     occurs after the Stack Protector Pass. As a result, one would need to
     put the relevant callinst into the stack protector check success
     basic block (where the return inst is placed) and then move it back
     later at SelectionDAG/MI time before the stack protector check if the
     tail call optimization failed. The MI level option was nixed
     immediately since it would require platform specific pattern
     matching. The SelectionDAG level option was nixed because
     SelectionDAG only processes one IR level basic block at a time
     implying one could not create a DAG Combine to move the callinst.

To get around this problem a few things were realized:

  1. While one can not handle multiple IR level basic blocks at the
     SelectionDAG Level, one can generate multiple machine basic blocks
     for one IR level basic block. This is how we handle bit tests and
     switches.

  2. At the MI level, tail calls are represented via a special return
     MIInst called "tcreturn". Thus if we know the basic block in which we
     wish to insert the stack protector check, we get the correct behavior
     by always inserting the stack protector check right before the return
     statement. This is a "magical transformation" since no matter where
     the stack protector check intrinsic is, we always insert the stack
     protector check code at the end of the BB.

Given the aforementioned constraints, the following solution was devised:

  1. On platforms that do not support SelectionDAG stack protector check
     generation, allow for the normal IR level stack protector check
     generation to continue.

  2. On platforms that do support SelectionDAG stack protector check
     generation:

    a. Use the IR level stack protector pass to decide if a stack
       protector is required/which BB we insert the stack protector check
       in by reusing the logic already therein. If we wish to generate a
       stack protector check in a basic block, we place a special IR
       intrinsic called llvm.stackprotectorcheck right before the BB's
       returninst or if there is a callinst that could potentially be
       sibling call optimized, before the call inst.

    b. Then when a BB with said intrinsic is processed, we codegen the BB
       normally via SelectBasicBlock. In said process, when we visit the
       stack protector check, we do not actually emit anything into the
       BB. Instead, we just initialize the stack protector descriptor
       class (which involves stashing information/creating the success
       mbbb and the failure mbb if we have not created one for this
       function yet) and export the guard variable that we are going to
       compare.

    c. After we finish selecting the basic block, in FinishBasicBlock if
       the StackProtectorDescriptor attached to the SelectionDAGBuilder is
       initialized, we first find a splice point in the parent basic block
       before the terminator and then splice the terminator of said basic
       block into the success basic block. Then we code-gen a new tail for
       the parent basic block consisting of the two loads, the comparison,
       and finally two branches to the success/failure basic blocks. We
       conclude by code-gening the failure basic block if we have not
       code-gened it already (all stack protector checks we generate in
       the same function, use the same failure basic block).

llvm-svn: 188755
2013-08-20 07:00:16 +00:00
Hal Finkel 0c5c01aa4a Add a llvm.copysign intrinsic
This adds a llvm.copysign intrinsic; We already have Libfunc recognition for
copysign (which is turned into the FCOPYSIGN SDAG node). In order to
autovectorize calls to copysign in the loop vectorizer, we need a corresponding
intrinsic as well.

In addition to the expected changes to the language reference, the loop
vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into
an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a
few lists in LegalizeVector{Ops,Types} so that vector copysigns can be
expanded.

In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN
be Expand for vector types. This seems correct for all in-tree targets, and I
think is the right thing to do because, previously, there was no way to generate
vector-values FCOPYSIGN nodes (and most targets don't specify an action for
vector-typed FCOPYSIGN).

llvm-svn: 188728
2013-08-19 23:35:46 +00:00
Paul Redmond 62f840f46a Improve the widening of integral binary vector operations
- split WidenVecRes_Binary into WidenVecRes_Binary and WidenVecRes_BinaryCanTrap
  - WidenVecRes_BinaryCanTrap preserves the original behaviour for operations
    that can trap
  - WidenVecRes_Binary simply widens the operation and improves codegen for
    3-element vectors by allowing widening and promotion on x86 (matches the
    behaviour of unary and ternary operation widening)
- use WidenVecRes_Binary for operations on integers.

Reviewed by: nrotem

llvm-svn: 188699
2013-08-19 20:01:35 +00:00
Hal Finkel e4eb78188c Add ExpandFloatOp_FCOPYSIGN to handle ppcf128-related expansions
We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node
because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the
sum of two doubles, and the first double must have the larger magnitude, we
can take the sign from the first double. As a result, in addition to fixing the
crash, this is also an optimization.

llvm-svn: 188655
2013-08-19 06:55:37 +00:00
Jim Grosbach 06c2a68125 ARM: Fix more fast-isel verifier failures.
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.

rdar://12594152

llvm-svn: 188593
2013-08-16 23:37:31 +00:00
Richard Sandiford 0dec06a28c [SystemZ] Use SRST to implement strlen and strnlen
It would also make sense to use it for memchr; I'm working on that now.

llvm-svn: 188547
2013-08-16 11:41:43 +00:00
Richard Sandiford bb83a50f57 [SystemZ] Use MVST to implement strcpy and stpcpy
llvm-svn: 188546
2013-08-16 11:29:37 +00:00
Richard Sandiford ca23271010 [SystemZ] Use CLST to implement strcmp
llvm-svn: 188544
2013-08-16 11:21:54 +00:00
Richard Sandiford e3827751e2 [SystemZ] Fix handling of 64-bit memcmp results
Generalize r188163 to cope with return types other than MVT::i32, just
as the existing visitMemCmpCall code did.  I've split this out into a
subroutine so that it can be used for other upcoming patches.

I also noticed that I'd used the wrong API to record the out chain.
It's a load that uses DAG.getRoot() rather than getRoot(), so the out
chain should go on PendingLoads.  I don't have a testcase for that because
we don't do any interesting scheduling on z yet.

llvm-svn: 188540
2013-08-16 10:55:47 +00:00
Craig Topper d9c2783d8f Replace getValueType().getSimpleVT() with getSimpleValueType().
llvm-svn: 188442
2013-08-15 02:44:19 +00:00
Jim Grosbach 327ccc787e DAG: Combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2)
A common idiom is to use zero and all-ones as sentinal values and to
check for both in a single conditional ("x != 0 && x != (unsigned)-1").
That generates code, for i32, like:
  testl %edi, %edi
  setne %al
  cmpl  $-1, %edi
  setne %cl
  andb  %al, %cl

With this transform, we generate the simpler:
  incl  %edi
  cmpl  $1, %edi
  seta  %al

Similar improvements for other integer sizes and on other platforms. In
general, combining the two setcc instructions into one is better.

rdar://14689217

llvm-svn: 188315
2013-08-13 21:30:58 +00:00
Michael Gottesman 7a8017290a Update makeLibCall to return both the call and the chain associated with the libcall instead of just the call. This allows us to specify libcalls that return void.
LowerCallTo returns a pair with the return value of the call as the first
element and the chain associated with the return value as the second element. If
we lower a call that has a void return value, LowerCallTo returns an SDValue
with a NULL SDNode and the chain for the call. Thus makeLibCall by just
returning the first value makes it impossible for you to set up the chain so
that the call is not eliminated as dead code.

I also updated all references to makeLibCall to reflect the new return type.

llvm-svn: 188300
2013-08-13 17:54:56 +00:00
Michael Gottesman 3923bec37b Fixed SelectionDAGBuilder.h C++ filetype declaration to use the canonical C++ instead of c++.
llvm-svn: 188203
2013-08-12 21:02:02 +00:00
Richard Sandiford 564681c88d [SystemZ] Use CLC and IPM to implement memcmp
For now this is restricted to fixed-length comparisons with a length
in the range [1, 256], as for memcpy() and MVC.

llvm-svn: 188163
2013-08-12 10:28:10 +00:00
Craig Topper 0ecb26a79e Change asserts at the top of getVectorShuffle to check that LHS and RHS have the same type as the result.
Previously the asserts were only checking that RHS and LHS were the same type and had the same element type as the result. All downstream code for ISD::VECTOR_SHUFFLE requires the types to be the same.

Also removed one unnecessary check of matched element counts that was present in the code.

llvm-svn: 188051
2013-08-09 04:37:24 +00:00
Craig Topper 9a39b07a60 Remove AllUndef check from one of the loops in getVectorShuffle. It was already handled by the 'AllLHS && AllRHS' check after the previous loop.
llvm-svn: 187965
2013-08-08 08:03:12 +00:00
Craig Topper 309dfefb6f Optimize mask generation for one of the DAG combiner shufflevector cases.
llvm-svn: 187961
2013-08-08 07:38:55 +00:00
Hal Finkel 171817ee8a Add ISD::FROUND for libm round()
All libm floating-point rounding functions, except for round(), had their own
ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm
adding ISD::FROUND so that round() can be custom lowered as well.

For the most part, this is straightforward. I've added an intrinsic
and a matching ISD node just like those for nearbyint() and friends. The
SelectionDAG pattern I've named frnd (because ISD::FP_ROUND has already claimed
fround).

This will be used by the PowerPC backend in a follow-up commit.

llvm-svn: 187926
2013-08-07 22:49:12 +00:00
Tom Stellard d42c594960 TargetLowering: Add getVectorIdxTy() function v2
This virtual function can be implemented by targets to specify the type
to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT,
INSERT_SUBVECTOR, EXTRACT_SUBVECTOR.  The default implementation returns
the result from TargetLowering::getPointerTy()

The previous code was using TargetLowering::getPointerTy() for vector
indices, because this is guaranteed to be legal on all targets.  However,
using TargetLowering::getPointerTy() can be a problem for targets with
pointer sizes that differ across address spaces.  On such targets,
when vectors need to be loaded or stored to an address space other than the
default 'zero' address space (which is the address space assumed by
TargetLowering::getPointerTy()), having an index that
is a different size than the pointer can lead to inefficient
pointer calculations, (e.g. 64-bit adds for a 32-bit address space).

There is no intended functionality change with this patch.

llvm-svn: 187748
2013-08-05 22:22:01 +00:00
Eric Christopher e6656ac870 Fix crashing on invalid inline asm with matching constraints.
For a testcase like the following:

 typedef unsigned long uint64_t;

 typedef struct {
   uint64_t lo;
   uint64_t hi;
 } blob128_t;

 void add_128_to_128(const blob128_t *in, blob128_t *res) {
   asm ("PAND %1, %0" : "+Q"(*res) : "Q"(*in));
 }

where we'll fail to allocate the register for the output constraint,
our matching input constraint will not find a register to match,
and could try to search past the end of the current operands array.

On the idea that we'd like to attempt to keep compilation going
to find more errors in the module, change the error cases when
we're visiting inline asm IR to return immediately and avoid
trying to create a node in the DAG. This leaves us with only
a single error message per inline asm instruction, but allows us
to safely keep going in the general case.

llvm-svn: 187470
2013-07-31 01:26:24 +00:00
Eric Christopher 029af15086 Reflow this to be easier to read.
llvm-svn: 187459
2013-07-30 22:50:44 +00:00
Quentin Colombet 6bf4baa408 [DAGCombiner] insert_vector_elt: Avoid building a vector twice.
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN 

The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
  vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
  optimizations.

llvm-svn: 187396
2013-07-30 00:24:09 +00:00
Nick Lewycky 0b68245ec8 Reimplement isPotentiallyReachable to make nocapture deduction much stronger.
Adds unit tests for it too.

Split BasicBlockUtils into an analysis-half and a transforms-half, and put the
analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable
into llvm::isPotentiallyReachable and move it into Analysis/CFG.

llvm-svn: 187283
2013-07-27 01:24:00 +00:00
Justin Holewinski d3f2035a3c Add a target legalize hook for SplitVectorOperand (again)
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.

This also adds a test case for NVPTX that depends on this custom
legalization.

Differential Revision: http://llvm-reviews.chandlerc.com/D1195

Attempt to fix the buildbots by making the X86 test I just added platform independent

llvm-svn: 187202
2013-07-26 13:28:29 +00:00
Rafael Espindola 1d812728cc Revert "Add a target legalize hook for SplitVectorOperand"
This reverts commit 187198. It broke the bots.

The soft float test probably needs a -triple because of name differences.
On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of
"vroundss $1, %xmm0, %xmm0, %xmm0".

llvm-svn: 187201
2013-07-26 13:18:16 +00:00
Justin Holewinski f848a24e50 Add a target legalize hook for SplitVectorOperand
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.

This also adds a test case for NVPTX that depends on this custom
legalization.

Differential Revision: http://llvm-reviews.chandlerc.com/D1195

llvm-svn: 187198
2013-07-26 12:46:39 +00:00
Tom Stellard c54731aa9d DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.

llvm-svn: 187007
2013-07-23 23:55:03 +00:00
Michael Gottesman f87a6ae65f Add -*- C++ -*- to InstrEmitter.h.
llvm-svn: 186527
2013-07-17 18:53:29 +00:00
Hal Finkel 2f5e8e3d95 Remove invalid assert in DAGTypeLegalizer::RemapValue
There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks
which, in part, says:

  // Note that these invariants may not hold momentarily when processing a node:
  // the node being processed may be put in a map before being marked Processed.

Unfortunately, this assert would be valid only if the above-mentioned invariant
held unconditionally. This was causing llc to assert when, in fact,
everything was fine.

Thanks to Richard Sandiford for investigating this issue!

Fixes PR16562.

llvm-svn: 186338
2013-07-15 18:57:05 +00:00
Craig Topper 06b3b6651e Add 'const' qualifier to some arrays.
llvm-svn: 186312
2013-07-15 08:02:13 +00:00
Craig Topper b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Craig Topper e0b711864c Pass SmallVector by const reference instead of by value.
llvm-svn: 186243
2013-07-13 07:43:40 +00:00
Stephen Lin 10947502e5 Remove trailing whitespac
llvm-svn: 186032
2013-07-10 20:47:39 +00:00
Adrian Prantl a1ffd1a450 Un-break the buildbot by tweaking the indirection flag.
Pulled in a testcase from the debuginfo-test suite.

llvm-svn: 185993
2013-07-10 01:53:37 +00:00
Adrian Prantl facc9f4e3e Document a known limitation of the status quo.
llvm-svn: 185992
2013-07-10 01:53:30 +00:00
Adrian Prantl 418d1d1ea9 Reapply an improved version of r180816/180817.
Change the informal convention of DBG_VALUE machine instructions so that
we can express a register-indirect address with an offset of 0.
The old convention was that a DBG_VALUE is a register-indirect value if
the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE
is register-indirect if the first operand is a register and the second
operand is an immediate. For plain register values the combination reg,
reg is used. MachineInstrBuilder::BuildMI knows how to build the new
DBG_VALUES.

rdar://problem/13658587

llvm-svn: 185966
2013-07-09 20:28:37 +00:00