Commit Graph

8347 Commits

Author SHA1 Message Date
Craig Topper 9fe357971c [ValueTracking] Remove const_casts on several calls to computeKnownBits and ComputeSignBit. NFC
llvm-svn: 302991
2017-05-13 17:22:16 +00:00
Tim Shen 10c64e6aea [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC.
Summary:
Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc.,
because those are not defined for b > sizeof(a) * 8, even after some of
the combiners run.

However, PPCISD::SHL defines that behavior (as the instructions themselves).
Move the combination to the backend.

The tests in shift_mask.ll still pass.

Reviewers: echristo, hfinkel, efriedma, iteratee

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33076

llvm-svn: 302937
2017-05-12 19:25:37 +00:00
Craig Topper 8df66c602a [KnownBits] Add bit counting methods to KnownBits struct and use them where possible
This patch adds min/max population count, leading/trailing zero/one bit counting methods.

The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.

Differential Revision: https://reviews.llvm.org/D32931

llvm-svn: 302925
2017-05-12 17:20:30 +00:00
Simon Pilgrim eabf6fc4b5 [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
llvm-svn: 302907
2017-05-12 15:26:50 +00:00
Simon Pilgrim f01c301f72 [DAGCombine] Use SelectionDAG::getZExtOrTrunc helper. NFCI.
llvm-svn: 302897
2017-05-12 13:22:12 +00:00
Simon Pilgrim a6ed1b2f12 Use SDValue::getOperand() helper. NFCI.
llvm-svn: 302896
2017-05-12 13:20:24 +00:00
Vadzim Dambrouski 38e30197c3 [MSP430] Generate EABI-compliant libcalls
Updates the MSP430 target to generate EABI-compatible libcall names.
As a byproduct, adjusts the hardware multiplier options available in
the MSP430 target, adds support for promotion of the ISD::MUL operation
for 8-bit integers, and correctly marks R11 as used by call instructions.

Patch by Andrew Wygle.

Differential Revision: https://reviews.llvm.org/D32676

llvm-svn: 302820
2017-05-11 19:56:14 +00:00
Simon Pilgrim 6faddcbd07 [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
llvm-svn: 302808
2017-05-11 16:40:44 +00:00
Simon Pilgrim a4a13a0da0 Strip trailing whitespace. NFCI.
llvm-svn: 302784
2017-05-11 10:03:05 +00:00
David L. Jones bbd97d273b Revert "[SDAG] Relax conditions under stores of loaded values can be merged"
This reverts r302712.

The change fails with ASAN enabled:

ERROR: AddressSanitizer: use-after-poison on address ... at ...
READ of size 2 at ... thread T0
  #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42
  #1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3
  #2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17
  #3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7

Reviewers: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33081

llvm-svn: 302746
2017-05-10 23:56:21 +00:00
Nirav Dave a38c049fc5 [SDAG] Relax conditions under stores of loaded values can be merged
Summary:

Allow consecutive stores whose values come from consecutive loads to
merged in the presense of other uses of the loads. Previously this was
disallowed as in general the merged load cannot be shared with the
other uses. Merging N stores into 1 may cause as many as N redundant
loads. However in the context of caching this should have neglible
affect on memory pressure and reduce instruction count making it
almost always a win.

Fixes PR32086.

Reviewers: spatel, jyknight, andreadb, hfinkel, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30471

llvm-svn: 302712
2017-05-10 19:53:41 +00:00
Amaury Sechet 197685c6d8 Small refactoring in DAGCombine. NFC
llvm-svn: 302699
2017-05-10 17:58:28 +00:00
Simon Pilgrim cd4d913336 [DAGCombiner] Dropped explicit (sra 0, x) -> 0 and (sra -1, x) -> 0 folds.
These are both handled (and tested) by the earlier ComputeNumSignBits == EltSizeInBits fold.

llvm-svn: 302651
2017-05-10 13:06:26 +00:00
Simon Pilgrim c29af824bf [DAGCombiner] Add vector support to fold (shl/srl 0, x) -> 0
llvm-svn: 302641
2017-05-10 12:34:27 +00:00
Ahmed Bougacha 604526fe87 [CodeGen] Don't require AA in SDAGISel at -O0.
Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.

Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.

But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.

Don't require AA at CodeGenOpt::None, and only use it otherwise.

This does have a functional impact (and one testcase is pessimized
because we can't reuse a load).  But I think that's desirable no matter
what.

Note that this alone doesn't result in less DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG.  That will be
fixed separately.

Differential Revision: https://reviews.llvm.org/D32766

llvm-svn: 302611
2017-05-10 00:39:30 +00:00
Zvi Rackover b483e28c77 DAGCombine: Combine shuffles of splat-shuffles
Summary: Reapply r299047, but this time handle correctly splat-masks with undef elements.

Reviewers: spatel, RKSimon, eli.friedman, andreadb

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31961

llvm-svn: 302583
2017-05-09 20:25:38 +00:00
Reid Kleckner 3a363fff7e Re-land "Use the frame index side table for byval and inalloca arguments"
This re-lands r302483. It was not the cause of PR32977.

llvm-svn: 302544
2017-05-09 16:02:20 +00:00
Reid Kleckner 84075fddff Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This re-lands commit r302461. It was not the cause of PR32977.

llvm-svn: 302543
2017-05-09 16:01:47 +00:00
Serge Pavlov d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Amara Emerson cf9daa33a7 Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.

Differential Revision: https://reviews.llvm.org/D30086

llvm-svn: 302514
2017-05-09 10:43:25 +00:00
Reid Kleckner 41bb94233b Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This reverts commit r302461.

It appears to be causing failures compiling gtest with debug info on the
Linux sanitizer bot. I was unable to reproduce the failure locally,
however.

llvm-svn: 302504
2017-05-09 01:57:44 +00:00
Reid Kleckner 9f29914d40 Revert "Use the frame index side table for byval and inalloca arguments"
This reverts r302483 and it's follow up fix.

llvm-svn: 302493
2017-05-09 01:14:39 +00:00
Reid Kleckner 45efcf0c96 Use the frame index side table for byval and inalloca arguments
Summary:
For inalloca functions, this is a very common code pattern:

  %argpack = type <{ i32, i32, i32 }>
  define void @f(%argpack* inalloca %args) {
  entry:
    %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
    %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
    %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
    tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
    tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
    tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")

Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.

This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32980

llvm-svn: 302483
2017-05-08 23:20:27 +00:00
Reid Kleckner bf828eedb4 Don't add DBG_VALUE instructions for static allocas in dbg.declare
Summary:
An llvm.dbg.declare of a static alloca is always added to the
MachineFunction dbg variable map, so these values are entirely
redundant. They survive all the way through codegen to be ignored by
DWARF emission.

Effectively revert r113967

Two bugpoint-reduced test cases from 2012 broke as a result of this
change. Despite my best efforts, I haven't been able to rewrite the test
case using dbg.value. I'm not too concerned about the lost coverage
because these were reduced from the test-suite, which we still run.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32920

llvm-svn: 302461
2017-05-08 19:58:15 +00:00
Dean Michael Berris 9bcaed867a [XRay] Custom event logging intrinsic
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

llvm-svn: 302405
2017-05-08 05:45:21 +00:00
Simon Pilgrim 2c15447f99 [DAGCombiner] If ISD::ABS is legal/custom, use it directly instead of canonicalizing first.
Remove an extra canonicalization step if ISD::ABS is going to be used anyway.

Updated x86 abs combine to check that we are lowering from both canonicalizations.

llvm-svn: 302337
2017-05-06 13:44:42 +00:00
Reid Kleckner ac1a97b32f Simplify dbg.value handling in SDISel with early returns
No functional change other than improving dbgs logging accuracy on
constant dbg values. Previously we would add things like "i32 42" as
debug values, and then log that we were dropping the debug info, which
is silly.

Delete some dead code that was checking for static allocas. This
remained after r207165, but served no purpose. Currently, static alloca
dbg.values are always sent through the DanglingDebugInfoMap, and are
usually made valid the first time the alloca is used.

llvm-svn: 302267
2017-05-05 18:30:34 +00:00
Craig Topper f0aeee01c3 [KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits.
This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.

Differential Revision: https://reviews.llvm.org/D32637

llvm-svn: 302262
2017-05-05 17:36:09 +00:00
Chad Rosier 84a238dd62 [DAGCombine] Transform (fadd A, (fmul B, -2.0)) -> (fsub A, (fadd B, B)).
Differential Revision: http://reviews.llvm.org/D32596

llvm-svn: 302153
2017-05-04 14:14:44 +00:00
Krzysztof Parzyszek 41b6e14dc5 Refactoring with range-based for, NFC
Patch by Wei-Ren Chen.

Differential Revision: https://reviews.llvm.org/D32682

llvm-svn: 302148
2017-05-04 13:35:17 +00:00
Craig Topper d4d09fd73d [SelectionDAG] Improve known bits support for CTPOP.
This is based on the same concept from ValueTracking's version of computeKnownBits.

llvm-svn: 302110
2017-05-04 04:33:27 +00:00
Craig Topper d938fd1397 [KnownBits] Add zext, sext, and trunc methods to KnownBits
This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible.

Differential Revision: https://reviews.llvm.org/D32784

llvm-svn: 302088
2017-05-03 22:07:25 +00:00
Sanjay Patel e1cf61c69f [TargetLowering] use isSubsetOf in SimplifyDemandedBits; NFCI
This is the DAG equivalent of https://reviews.llvm.org/D32255 , 
which will hopefully be committed again. The functionality
(preferring a 'not' op) is already here in the DAG, so this is
just intended to be a clean-up and performance improvement.

llvm-svn: 302087
2017-05-03 21:55:34 +00:00
Amaury Sechet 666c705953 [DAGCombine] (addcarry (add|uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry)
Summary: Do the transform when the carry isn't used. It's a pattern exposed when legalizing large integers.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32755

llvm-svn: 302047
2017-05-03 16:28:10 +00:00
Tim Shen e59d06fe78 [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction
Summary:
This is the corresponding llvm change to D28037 to ensure no performance
regression.

Reviewers: bogner, kbarton, hfinkel, iteratee, echristo

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28329

llvm-svn: 301990
2017-05-03 00:07:02 +00:00
Amaury Sechet 106a7eab84 [DAGCombine] (uaddo X, (addcarry Y, 0, Carry)) -> (addcarry X, Y, Carry)
Summary: This is a common pattern that arise when legalizing large integers operations. Only do it when Y + 1 cannot overflow as this would change the carry behavior of uaddo .

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32687

llvm-svn: 301922
2017-05-02 14:15:48 +00:00
Amaury Sechet 153911f71d [DAGCombine] (add X, (addcarry Y, 0, Carry)) -> (addcarry X, Y, Carry)
Summary: Common pattern when legalizing large integers operations. Similar to D32687, when the carry isn't used.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Differential Revision: https://reviews.llvm.org/D32738

llvm-svn: 301919
2017-05-02 13:34:25 +00:00
Simon Pilgrim 89ad89cc73 [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088)
PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.

This patch adds support for extension/truncation for both integer and float types.

Differential Revision: https://reviews.llvm.org/D32391

llvm-svn: 301910
2017-05-02 10:33:08 +00:00
Simon Pilgrim 8deb87a6c0 [DAGCombiner] Improve MatchBswapHword logic (PR31357)
The existing code only looks at half of the tree when matching bswap + rol patterns ending in an OR tree (as opposed to a cascade).

Patch originally introduced by Jim Lewis.

Submitted on the behalf of Dinar Temirbulatov.

Differential Revision: https://reviews.llvm.org/D32039

llvm-svn: 301907
2017-05-02 10:16:19 +00:00
Craig Topper 6b1b630a98 [SelectionDAG] Use known ones to provide a better bound for the known zeros for CTTZ/CTLZ operations.
This is the SelectionDAG version of D32521. If know where at least one 1 is located in the input to these intrinsics we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output.

I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets. Usually because of turning ADD into OR based on this new information.

I'll fix CTPOP in a future patch.

Differential Revision: https://reviews.llvm.org/D32692

llvm-svn: 301806
2017-05-01 16:08:06 +00:00
Amara Emerson d28f0cd448 Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.
This removes BinaryWithFlagsSDNode, and flags are now all passed by value.

Differential Revision: https://reviews.llvm.org/D32527

llvm-svn: 301803
2017-05-01 15:17:51 +00:00
Sanjay Patel ad13826aea [DAGCombiner] shrink/widen a vselect to match its condition operand size (PR14657)
We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that
patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check
legality of the select that we want to produce.

A few things to note:

1. We can't wait until after legalization and do this generically because (at least in the x86
   tests from PR14657), we'll have PACKSS and bitcasts in the pattern.
2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but
   that requires a closer look to make sure we don't end up worse.
3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases. 
   That should be fixed next.
4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple 
   legal vector sizes, but if there are other targets like that, we should add more tests.
5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests):
   despite IR that is terrible for the target, this patch allows us to generate the optimal loop
   code because something post-ISEL is hoisting the splat extends above the vector loops.

Differential Revision: https://reviews.llvm.org/D32620

llvm-svn: 301781
2017-04-30 22:44:51 +00:00
Amaury Sechet 8ac81f3924 Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry
Summary: As per discution on how to get better codegen an large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allow for more flexibility.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29872

llvm-svn: 301775
2017-04-30 19:24:09 +00:00
Craig Topper 778f57b4f1 [APInt] Replace calls to setBits with more specific calls to setBitsFrom and setLowBits where possible.
llvm-svn: 301768
2017-04-30 07:44:58 +00:00
Craig Topper ca48af3c87 [KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state
Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit.

Reviewers: RKSimon, spatel, davide

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32651

llvm-svn: 301747
2017-04-29 16:43:11 +00:00
Reid Kleckner 859f8b544a Make getParamAlignment use argument numbers
The method is called "get *Param* Alignment", and is only used for
return values exactly once, so it should take argument indices, not
attribute indices.

Avoids confusing code like:
  IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
  Alignment  = CS->getParamAlignment(ArgIdx + 1);

Add getRetAlignment to handle the one case in Value.cpp that wants the
return value alignment.

This is a potentially breaking change for out-of-tree backends that do
their own call lowering.

llvm-svn: 301682
2017-04-28 20:34:27 +00:00
Matthias Braun 744c215e29 TargetLowering: Add finalizeLowering() function; NFC
Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.

This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.

Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus
GlobalISel no longer missed the hasCopyImplyingStackAdjustment()
handling with this.

Differential Revision: https://reviews.llvm.org/D32621

llvm-svn: 301679
2017-04-28 20:25:05 +00:00
Reid Kleckner 6652a52e2b Use Argument::hasAttribute and AttributeList::ReturnIndex more
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.

NFC

llvm-svn: 301666
2017-04-28 18:37:16 +00:00
Craig Topper 24db6b800f [APInt] Add clearSignBit method. Use it and setSignBit in a few places. NFCI
llvm-svn: 301656
2017-04-28 16:58:05 +00:00
Jun Bum Lim 919f9e8d65 [InlineCost] Improve the cost heuristic for Switch
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp             w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp             w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp             w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add             w8, w19, w8
        cmp             w8, #9          // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and             w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add             w8, w19, w8
        cmp             w8, #2          // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp             w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
  -inline-generic-switch-cost=false

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

llvm-svn: 301649
2017-04-28 16:04:03 +00:00
Simon Pilgrim 7ae9419dc0 [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT (reapplied)
Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element.

llvm-svn: 301644
2017-04-28 13:21:18 +00:00
Craig Topper f42b23f7d8 [ValueTracking] Convert computeKnownBitsFromRangeMetadata to use KnownBits struct.
llvm-svn: 301626
2017-04-28 06:28:56 +00:00
Craig Topper d0af7e8ab8 [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Craig Topper 0e03e74e95 [SelectionDAG] Use various APInt methods to reduce temporary APInt creation
This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version.

llvm-svn: 301618
2017-04-28 04:57:59 +00:00
Craig Topper 24e71017aa [APInt] Use inplace shift methods where possible. NFCI
llvm-svn: 301612
2017-04-28 03:36:24 +00:00
Sanjoy Das 40c32dd9a0 Use a pointer type for target frame indices during statepoint lowering
Summary:
The type of the target frame index is intptr, not the type of the value we're
going to store into it.  Without this change we crash in the attached test case
when trying to type-legalize a TargetFrameIndex.

Patchpoint lowering types the target frame index as intptr as well.

Reviewers: reames, bogner, arsenm

Subscribers: arsenm, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D32256

llvm-svn: 301566
2017-04-27 17:17:16 +00:00
Sanjay Patel a0547c3d9f [DAGCombiner] add (sext i1 X), 1 --> zext (not i1 X)
Besides better codegen, the motivation is to be able to canonicalize this pattern 
in IR (currently we don't) knowing that the backend is prepared for that.

This may also allow removing code for special constant cases in 
DAGCombiner::foldSelectOfConstants() that was added in D30180.

Differential Revision: https://reviews.llvm.org/D31944

llvm-svn: 301457
2017-04-26 20:26:46 +00:00
Craig Topper b45eabcf82 [ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.

Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.

I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.

Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.

Differential Revision: https://reviews.llvm.org/D32376

llvm-svn: 301432
2017-04-26 16:39:58 +00:00
Sanjay Patel e2ec05a62a [TargetLowering] fix isConstTrueVal to account for build vector truncation
Build vectors have magical truncation powers, so we have things like this:

v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>

If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find 
truth when ZeroOrNegativeOneBooleanContent is the rule.

Differential Revision: https://reviews.llvm.org/D32505

llvm-svn: 301408
2017-04-26 14:05:42 +00:00
Ranjeet Singh acbd4e141f Fix signed multiplication with overflow fallback.
For targets that don't have ISD::MULHS or ISD::SMUL_LOHI for the type
and the double width type is illegal, then the two operands are
sign extended to twice their size then multiplied to check for overflow.
The extended upper halves were mismatched causing an incorrect result.
This fixes the mismatch.

A test was added for ARM V6-M where the bug was detected.

Patch by James Duley.

Differential Revision: https://reviews.llvm.org/D31807

llvm-svn: 301404
2017-04-26 13:41:43 +00:00
Sanjay Patel a4b4e9388c [DAG] add FIXME comments for splat detection; NFC
llvm-svn: 301403
2017-04-26 13:27:57 +00:00
Sanjay Patel 7a8317c09a [DAG] fix formatting of isConstantSplat(); NFC
llvm-svn: 301366
2017-04-25 23:33:28 +00:00
Simon Pilgrim 8264ed7075 [DAGCombiner] Refactor to make it easy to add support for vectors in a future patch. NFCI.
llvm-svn: 301320
2017-04-25 16:16:03 +00:00
Simon Pilgrim 37ef04ad1f [SelectionDAG] Use getBuildVector helper where possible. NFCI
llvm-svn: 301314
2017-04-25 15:10:47 +00:00
Simon Pilgrim 986d73cc1d [SelectionDAG] Pull out repeated getValueType calls. NFCI.
Noticed in D32391.

llvm-svn: 301308
2017-04-25 13:39:07 +00:00
Simon Pilgrim 7d65b66962 [DAGCombiner] Add vector support for (srl (trunc (srl x, c1)), c2) combine.
llvm-svn: 301305
2017-04-25 12:40:45 +00:00
Simon Pilgrim ab0446332e [SelectionDAG] Recognise splat vector isKnownToBeAPowerOfTwo one/sign bit shift cases.
llvm-svn: 301303
2017-04-25 12:29:07 +00:00
Simon Pilgrim 96611aa30c [DAGCombiner] Use SDValue::getConstantOperandVal helper where possible. NFCI.
llvm-svn: 301300
2017-04-25 10:47:35 +00:00
Simon Pilgrim 93da6660a2 [DAGCombiner] Use APInt::intersects to avoid tmp variable. NFCI.
llvm-svn: 301258
2017-04-24 21:43:21 +00:00
Krzysztof Parzyszek c8e8e2a046 Move value type list from TargetRegisterClass to TargetRegisterInfo
Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301234
2017-04-24 19:51:12 +00:00
Krzysztof Parzyszek 98ab4c64c4 Revert r301231: Accidentally committed stale files
I forgot to commit local changes before commit.

llvm-svn: 301232
2017-04-24 19:48:51 +00:00
Krzysztof Parzyszek c0197066d7 Move value type list from TargetRegisterClass to TargetRegisterInfo
Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301231
2017-04-24 19:43:45 +00:00
Yaxun Liu fd23a0c095 CodeGen: Add a hook for getFenceOperandTy
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.

This patch has no effect on targets other than amdgcn.

Differential Revision: https://reviews.llvm.org/D32186

llvm-svn: 301215
2017-04-24 18:26:27 +00:00
Simon Pilgrim f60f57e6e8 [DAGCombiner] Updated bswap byte offset variable names to be more descriptive. NFC
As discussed on D32039, use MaskByteOffset to describe the variable and also pull out repeated getOpcode() calls.

llvm-svn: 301193
2017-04-24 17:05:14 +00:00
Nirav Dave c799f3a809 [SDAG] Teach Chain Analysis about BaseIndexOffset addressing.
While we use BaseIndexOffset in FindBetterNeighborChains to
appropriately realize they're almost the same address and should be
improved concurrently we do not use it in isAlias using the non-index
understanding FindBaseOffset instead. Adding a BaseIndexOffset check
in isAlias like should allow indexed stores to be merged.

FindBaseOffset to be excised in subsequent patch.

Reviewers: jyknight, aditya_nandakumar, bogner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31987

llvm-svn: 301187
2017-04-24 15:37:20 +00:00
Renato Golin 4abfb3d741 Revert "[APInt] Fix a few places that use APInt::getRawData to operate within the normal API."
This reverts commit r301105, 4, 3 and 1, as a follow up of the previous
revert, which broke even more bots.

For reference:
Revert "[APInt] Use operator<<= where possible. NFC"
Revert "[APInt] Use operator<<= instead of shl where possible. NFC"
Revert "[APInt] Use ashInPlace where possible."

PR32754.

llvm-svn: 301111
2017-04-23 12:15:30 +00:00
Artyom Skrobov 53cf1897cc [ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs
Summary:
D30400 has enabled tADC and tSBC instructions to be unglued, thereby allowing CPSR to remain live between Thumb1 scheduling units.

Most Thumb1 instructions have an OptionalDef for CPSR; but the scheduler ignored the OptionalDefs, and could unwittingly insert a flag-setting instruction in between an ADDS and the corresponding ADC.

Reviewers: javed.absar, atrick, MatzeB, t.p.northover, jmolloy, rengolin

Reviewed By: javed.absar

Subscribers: rogfer01, efriedma, aemerson, rengolin, llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D31081

llvm-svn: 301106
2017-04-23 06:58:08 +00:00
Craig Topper 474e5de72d [APInt] Fix a few places that use APInt::getRawData to operate within the normal API.
getRawData exposes the internal type of the APInt class directly to its users. Ideally we wouldn't expose such an implementation detail.

This patch fixes a few of the easy cases by using truncate, extract, or a rotate.

llvm-svn: 301105
2017-04-23 06:41:11 +00:00
Craig Topper cdd5ae6676 [APInt] Use operator<<= where possible. NFC
llvm-svn: 301104
2017-04-23 05:43:02 +00:00
Craig Topper 5f68af0806 [APInt] Use operator<<= instead of shl where possible. NFC
llvm-svn: 301103
2017-04-23 05:18:31 +00:00
Craig Topper ae9672c96d [APInt] Use ashInPlace where possible.
llvm-svn: 301101
2017-04-23 03:45:59 +00:00
Akira Hatanaka 22e839f4b2 [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

This recommits r300932 and r300930, which was causing dag-combine to
loop forever. The problem was that optimizeLogicalImm was returning
true even when there was no change to the immediate node (which happened
when the immediate was all zeros or ones), which caused dag-combine to
push and pop the same node to the work list over and over again without
making any progress.

This commit fixes the bug by returning false early in optimizeLogicalImm
if the immediate is all zeros or ones. Also, it changes the code to
compare the immediate with 0 or Mask rather than calling
countPopulation.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 301019
2017-04-21 18:53:12 +00:00
Akira Hatanaka 78ccba6a20 Revert r300932 and r300930.
It seems that r300930 was creating an infinite loop in dag-combine when
compling the following file:

MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c

llvm-svn: 300940
2017-04-21 01:31:50 +00:00
Akira Hatanaka 19077aaee0 [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

This recommits r300913, which broke bots because I didn't fix a call to
ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of
TargetLoweringOpt and TargetLowering.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300930
2017-04-21 00:05:16 +00:00
Akira Hatanaka 7b06cebe73 Revert "[AArch64] Improve code generation for logical instructions taking"
This reverts r300913.

This broke bots.

llvm-svn: 300916
2017-04-20 23:03:30 +00:00
Akira Hatanaka e327f09832 [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300913
2017-04-20 22:47:56 +00:00
Benjamin Kramer 997fd5eeb4 [Recycler] Add asan/msan annotations.
This enables use after free and uninit memory checking for memory
returned by a recycler. SelectionDAG currently relies on the opcode of a
free'd node being ISD::DELETED_NODE, so poke a hole in the asan poison
for SDNode opcodes. This means that we won't find some issues, but only
in SDag.

llvm-svn: 300868
2017-04-20 18:29:37 +00:00
Yaxun Liu 5d977f8ed4 CodeGen: Let frame index value type match alloca addr space
Recently alloca address space has been added to data layout. Due to this
change, pointer returned by alloca may have different size as pointer in
address space 0.

However, currently the value type of frame index is assumed to be of the
same size as pointer in address space 0.

This patch fixes that.

Most targets assume alloca returning pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.

AMDGCN target with amdgiz environment requires this change since it
assumes alloca returning pointer to addr space 5 and its size is 32,
which is different from the size of pointer in addr space 0 which is 64.

Differential Revision: https://reviews.llvm.org/D32021

llvm-svn: 300864
2017-04-20 18:15:34 +00:00
Sanjay Patel 13985cd111 [DAGCombiner] use more local variables in isAlias(); NFCI
llvm-svn: 300860
2017-04-20 18:02:27 +00:00
Craig Topper bcfd2d1789 [APInt] Rename getSignBit to getSignMask
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.

Differential Revision: https://reviews.llvm.org/D32108

llvm-svn: 300856
2017-04-20 16:56:25 +00:00
Sanjay Patel 2d0e88fb9b [DAGCombiner] fix variable names in isAlias(); NFCI
We started with zero-based params and switched to one-based locals...
Also, variables start with a capital and functions do not.

llvm-svn: 300854
2017-04-20 16:36:37 +00:00
Sanjay Patel b7701bc9af [DAGCombiner] give names to repeated calcs in isAlias(); NFCI
llvm-svn: 300850
2017-04-20 16:15:08 +00:00
Amara Emerson 23e79ec2b3 [MVT][SVE] Scalable vector MVTs (3/3)
Adds MVT::ElementCount to represent the length of a
vector which may be scalable, then adds helper functions
that work with it.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32019

llvm-svn: 300842
2017-04-20 13:54:09 +00:00
Amara Emerson 5054782052 [MVT][SVE] Scalable vector MVTs (1/3)
This patch adds a few helper functions to obtain new vector
value types based on existing ones without needing to care
about whether they are scalable or not.

I've confined their use to a few common locations right now,
and targets that don't have scalable vectors should never
need to care about these.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32017

llvm-svn: 300838
2017-04-20 13:08:17 +00:00
Craig Topper 9ce5ef9475 [SelectionDAG] Fix another place that was passing a large value to APInt::lshrInPlace.
llvm-svn: 300821
2017-04-20 04:55:01 +00:00
Craig Topper d3884b8402 [SelectionDAG] Use getActiveBits() and countTrailingZeros() to avoid creating temporary APInts with lshr and trunc. NFCI
llvm-svn: 300819
2017-04-20 04:23:43 +00:00
Craig Topper 4db0c69373 Recommit "[APInt] Add back the asserts that check that the APInt shift methods aren't called with values larger than BitWidth."
This includes a fix to clamp a right shift of larger than BitWidth in DAG combining.

llvm-svn: 300816
2017-04-20 03:49:18 +00:00
Galina Kistanova 2cc97d92ce Temporarily revert r299221 to fix nondeterminism in ThinLTO builder.
llvm-svn: 300783
2017-04-19 23:16:14 +00:00
Sanjay Patel 0658a95a35 [DAG] add splat vector support for 'or' in SimplifyDemandedBits
I've changed one of the tests to not fold away, but we didn't and still don't do the transform
that the comment claims we do (and I don't know why we'd want to do that).

Follow-up to:
https://reviews.llvm.org/rL300725
https://reviews.llvm.org/rL300763

llvm-svn: 300772
2017-04-19 22:00:00 +00:00
Sanjay Patel ae382bb6af [DAG] add splat vector support for 'xor' in SimplifyDemandedBits
This allows forming more 'not' ops, so we get improvements for ISAs that have and-not.

Follow-up to:
https://reviews.llvm.org/rL300725

llvm-svn: 300763
2017-04-19 21:23:09 +00:00
Craig Topper 9b71a402c2 [APInt] Cast calls to add/sub/mul overflow methods to void if only their overflow bool out param is used.
This is preparation for a clang change to improve the [[nodiscard]] warning to not be ignored on methods that return a class marked [[nodiscard]] that are defined in the class itself. See D32207.

We should consider adding wrapper methods to APInt that return the overflow flag directly and discard the APInt result. This would eliminate the void casts and the need to create a bool before the call to pass to the out param.

llvm-svn: 300758
2017-04-19 21:09:45 +00:00
Sanjay Patel ded7d59f0e [DAG] add splat vector support for 'and' in SimplifyDemandedBits
The patch itself is simple: stop discriminating against vectors in visitAnd() and again in 
SimplifyDemandedBits().

Some notes for reference:

1. We're not consistent about calls to SimplifyDemandedBits in the various visitXXX functions. 
   Sometimes, we check if the RHS is a constant first. Other times (like here), we just dive in.
2. I'd like to break the vector shackles in steps for the sake of risk minimization, but we could
    make similar simultaneous changes in other places if we think that would be better.
3. I don't know what the intent of the changed tests in this patch was supposed to be, but since 
   they wiggled in a positive way, I'm just going with that. :)
4. In the rotate tests, note that we can see through non-splat constants. This is a result of D24253.
5. My motivation for being here now is to make D31944 look better, so this is step 1 of N towards 
   improving the vector codegen in that patch without writing any actual new code.

Differential Revision: https://reviews.llvm.org/D32230

llvm-svn: 300725
2017-04-19 18:05:06 +00:00
Nirav Dave 8563fc4664 [DAG] Loop over remaining candidates on successful merge of stores of
extracted vectors types. NFCI.

llvm-svn: 300688
2017-04-19 13:52:38 +00:00
Chih-Hung Hsieh 877923a87f [X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64.
Android x86_64 target uses f128 type and stores f128 values in %xmm* registers.
SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value
from f128 to i128.

Differential Revision: http://reviews.llvm.org/D32102

llvm-svn: 300583
2017-04-18 20:15:18 +00:00
Craig Topper fc947bcfba [APInt] Use lshrInPlace to replace lshr where possible
This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.

This adds an lshrInPlace(const APInt &) version as well.

Differential Revision: https://reviews.llvm.org/D32155

llvm-svn: 300566
2017-04-18 17:14:21 +00:00
Nirav Dave 855ef45602 [DAG] Improve store merge candidate pruning.
Remove non-consecutive stores from store merge candidate search as
they cannot be merged and will prevent us from finding subsequent
mergeable store cases.

Reviewers: jyknight, bogner, javed.absar, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32086

llvm-svn: 300561
2017-04-18 15:36:34 +00:00
Adrian Prantl 6825fb64e9 PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
  - describe a variable in a register
  - consist of only a DW_OP_reg
2. Memory location descriptions
  - describe the address of a variable
3. Implicit location descriptions
  - describe the value of a variable
  - end with DW_OP_stack_value & friends

The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident.  This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.

This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.

There are two major changes in this patch:

I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.

When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
  DBG_VALUE, [RBP-8]                        --> DW_OP_fbreg -8
  DBG_VALUE, RAX                            --> DW_OP_reg0 +0
  DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0

All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.

https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>

Differential Revision: https://reviews.llvm.org/D31439

llvm-svn: 300522
2017-04-18 01:21:53 +00:00
Reid Kleckner fb502d2f5e [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

llvm-svn: 300367
2017-04-14 20:19:02 +00:00
Nirav Dave 642ed1ef7e Reorder StoreMergeCandidates to run faster. NFCI.
llvm-svn: 300321
2017-04-14 13:34:30 +00:00
Andrew V. Tischenko 4e7bcd5216 Fix for PR#30562: Selection DAG error: Detected cycle in SelectionDAG.
Patch by Dinar Temirbulatov

llvm-svn: 300314
2017-04-14 09:17:09 +00:00
Nirav Dave 9acd2fd9d9 [DAG] Fold away temporary vector in store candidate merge NFC.
llvm-svn: 300241
2017-04-13 20:00:27 +00:00
Craig Topper 8b459c24f3 [SelectionDAG] Use APInt move assignment to avoid 2 memory allocations and copies when bit width is larger than 64-bits.
llvm-svn: 300091
2017-04-12 18:39:27 +00:00
Serge Pavlov 2757afdb85 Remove redundant type casts
llvm-svn: 300063
2017-04-12 14:13:00 +00:00
Nirav Dave a55dad3c33 [SDAG] Factor CandidateMatch check into lambda. NFC.
llvm-svn: 299939
2017-04-11 13:41:19 +00:00
Nirav Dave 83defd1902 [SDAG] Factor ChainMerge into helper function NFCI.
llvm-svn: 299938
2017-04-11 13:41:17 +00:00
Nirav Dave 233eb7a636 [SDAG] Reorder expensive StoreMerge Check after cheaper one. NFC
llvm-svn: 299937
2017-04-11 13:41:16 +00:00
Sam Parker 4fc5f3c02e [SelectionDAG] Check CALLSEQ_BEGIN nodes in DelayForLiveRegs
A fix for the bug reported in PR30911.

The issue arises when multiple CALLSEQ_BEGIN nodes are unscheduled as
the last node to be unscheduled will gain access to the CallResource
register. But when a node is being picked, only CALLSEQ_END nodes are
checked against the CallResource and have their chains evaluated.
This then means that other CALLSEQ_BEGIN nodes can be scheduled
before the existing call sequence has been finalised. This patch adds
a check against the FrameSetup nodes in DelayForLiveRegs to prevent
this from happening.

Differential Revision: https://reviews.llvm.org/D31536

llvm-svn: 299926
2017-04-11 08:43:32 +00:00
Craig Topper 3606e732dd [SelectionDAG] TargetLowering::SimplifyDemandedBits how to properly calculate KnownZero bits for ISD::SETCC and ISD::AssertZExt
Summary:
For SETCC we aren't calculating the KnownZero bits at all. I've copied the code from computeKnownZero over for this.

For AssertZExt we were only setting KnownZero for bits that were demanded. But the upper bits are zero whether they were demanded or not.

I'm interested in fixing this because my belief is the first part of the ISD::AND handling code in SimplifyDemandedBits largely exists because of these two bugs. In that code we go to computeKnownBits for the LHS and optimize a RHS constant. Because computeKnownBits handles SETCC and AssertZExt correctly we get better information sometimes than when we call SimplifyDemandedBits on the LHS later. With these two issues fixed in SimplifyDemandedBits I was able to remove that computeKnownBits call and still pass all X86 tests. I'll submit that change in a separate patch.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31715

llvm-svn: 299839
2017-04-10 07:06:44 +00:00
Simon Dardis f7e4388e3b Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
This reverts commit r299766. This change appears to have broken the MIPS
buildbots. Reverting while I investigate.

Revert "[mips] Remove usage of debug only variable (NFC)"

This reverts commit r299769. Follow up commit.

llvm-svn: 299788
2017-04-07 17:25:05 +00:00
Simon Dardis 6470ff0b24 [SelectionDAG] Enable target specific vector scalarization of calls and returns
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 299766
2017-04-07 13:03:52 +00:00
Nirav Dave 974f7c23ae [SDAG] Fix visitAND optimization to deal with vector extract case again.
Summary:
Fix case elided by rL298920.

Fixes PR32545.

Reviewers: eli.friedman, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31759

llvm-svn: 299688
2017-04-06 19:05:41 +00:00
Jonas Paulsson 45c936ef86 [SelectionDAG] NFC patch removing a redundant check.
Since the BUILD_VECTOR has already been checked by
isBuildVectorOfConstantSDNodes() in SelectionDAG::getNode() for a
SIGN_EXTEND_INREG, it can be assumed that Op is always either undef or a
ConstantSDNode, and Ops.size() will always equal VT.getVectorNumElements().

llvm-svn: 299647
2017-04-06 13:00:37 +00:00
Craig Topper 2ca72f4971 Revert accidental commit of r299619.
llvm-svn: 299622
2017-04-06 04:04:10 +00:00
Craig Topper 6b15606051 Revert accidental commit of r299618
llvm-svn: 299621
2017-04-06 04:03:34 +00:00
Craig Topper 5d7ece8895 bar
llvm-svn: 299619
2017-04-06 04:02:31 +00:00
Craig Topper faf5a8553c foo
llvm-svn: 299618
2017-04-06 04:02:28 +00:00
Adam Nemet d5ffdd3605 [DAGCombine] Support FMF contract in fused multiple-and-sub too
This is a follow-on to r299096 which added support for fmadd.

Subtract does not have the case where with two multiply operands we commute in
order to fuse with the multiply with the fewer uses.

llvm-svn: 299572
2017-04-05 17:58:48 +00:00
Adam Nemet 99e347fc35 [DAGCombine] Remove commented-out code from r299096
llvm-svn: 299571
2017-04-05 17:58:44 +00:00
Sanjay Patel b2f1621bb1 [DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401)
This is a generic combine enabled via target hook to reduce icmp logic as discussed in:
https://bugs.llvm.org/show_bug.cgi?id=32401

It's likely that other targets will want to enable this hook for scalar transforms, 
and there are probably other patterns that can use bitwise logic to reduce comparisons.

Note that we are missing an IR canonicalization for these patterns, and we will probably
prefer the pair-of-compares form in IR (shorter, more likely to fold).

Differential Revision: https://reviews.llvm.org/D31483

llvm-svn: 299542
2017-04-05 14:09:39 +00:00
Jonas Paulsson 38a2da92bc [DAGCombiner] Don't make a BUILD_VECTOR with operands of illegal type.
When DAGCombiner visits a SIGN_EXTEND_INREG of a BUILD_VECTOR with
constant operands, a new BUILD_VECTOR node will be created transformed
constants.

Llvm-stress found a case where the new BUILD_VECTOR had constant operands
of an illegal type, because the (legal) element type is in fact not a legal
scalar type.

This patch changes this so that the new BUILD_VECTOR has the same operand
type as the old one.

Review: Eli Friedman, Nirav Dave
https://bugs.llvm.org//show_bug.cgi?id=32422

llvm-svn: 299540
2017-04-05 13:45:37 +00:00
Matt Arsenault c82768290d DAG: Fix missing legalization for any_extend_vector_inreg operands
llvm-svn: 299389
2017-04-03 21:28:13 +00:00
Craig Topper 3882613956 [DAGCombine][InstCombine] Fix inverted if condition in equivalent comments in DAGCombine and InstCombine. NFC
llvm-svn: 299378
2017-04-03 19:18:48 +00:00
Zvi Rackover d76a4d0ac6 Revert "[DAGCombine] A shuffle of a splat is always the splat itself"
This reverts commit r299047 which is incorrect because the
simplification may result in incorrect propogation of undefs to users of
the folded shuffle.

Thanks to Andrea Di Biagio for pointing this out.

llvm-svn: 299368
2017-04-03 17:41:19 +00:00
Craig Topper d33ee1b960 [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.

Differential Revision: https://reviews.llvm.org/D31565

llvm-svn: 299362
2017-04-03 16:34:59 +00:00
Simon Pilgrim 9daf9c047d [DAGCombiner] Check limits before accessing array element (PR32502)
llvm-svn: 299361
2017-04-03 15:27:49 +00:00
Sanjay Patel 665021e7ee [DAGCombiner] enable vector transforms for any/all {sign} bits set/clear
The code already allowed vector types in via "isInteger" (which might want
a more specific name), so use splat-friendly constant predicates to match
those types.

llvm-svn: 299304
2017-04-01 15:05:54 +00:00
Craig Topper 73250168e7 [DAGCombiner] Fix fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask) to explicitly ensure that only one of the inputs of each shuffle is a zero vector.
This can only happen when we have a mix of zero and undef elements and the two vectors have a different arrangement of zeros/undefs. The shuffle should eventually be constant folded to all zeros.

Fixes PR32484.

llvm-svn: 299291
2017-04-01 04:26:20 +00:00
Quentin Colombet 35a47010b1 Revert "Instrument SDISel C++ patterns"
This reverts commit r299284.

Didn't intend to commit this :(

llvm-svn: 299286
2017-04-01 01:26:17 +00:00
Quentin Colombet b43da15602 Instrument SDISel C++ patterns
llvm-svn: 299284
2017-04-01 01:21:32 +00:00
Sanjay Patel 16d458ea0d [DAGCombiner] refactor and/or-of-setcc to get rid of duplicated code; NFCI
llvm-svn: 299266
2017-03-31 21:30:50 +00:00
Sanjay Patel 34da36e74f [DAGCombiner] add fold for 'All sign bits set?'
(and (setlt X,  0), (setlt Y,  0)) --> (setlt (and X, Y),  0)

We have 7 similar folds, but this one got away. The fact that the
x86 test with a branch didn't change is probably a separate bug. We
may also be missing this and the related folds in instcombine.

llvm-svn: 299252
2017-03-31 20:28:06 +00:00
Sanjay Patel 61d3409535 [DAGCombiner] remove redundant code and add comments; NFCI
llvm-svn: 299241
2017-03-31 18:18:58 +00:00
Simon Pilgrim 1cdbfe44b1 [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT
Followup to D31311

llvm-svn: 299221
2017-03-31 14:21:50 +00:00
Simon Pilgrim 3c81c34d8d [DAGCombiner] Add vector demanded elements support to ComputeNumSignBits
Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements.

This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.

I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course.

Followup to D25691.

Differential Revision: https://reviews.llvm.org/D31311

llvm-svn: 299219
2017-03-31 13:54:09 +00:00
Simon Pilgrim 37b536e4b3 [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.

Differential Revision: https://reviews.llvm.org/D31249

llvm-svn: 299201
2017-03-31 11:24:16 +00:00
Adam Nemet edaec6de73 [DAGCombiner] Initial support for the fast-math flag contract
Now alternatively to the TargetOption.AllowFPOpFusion global flag, FMUL->FADD
can also use the per operation FMF to allow fusion.

The idea here is not to port everything to the new scheme (e.g. fused
multiply-and-sub will be ported later) but that this work all the way from
clang.

The transformation is conditionalized on *both* the FADD and the FMUL having
the FMF contract flag.

Differential Revision: https://reviews.llvm.org/D31169

llvm-svn: 299096
2017-03-30 18:53:04 +00:00
Ahmed Bougacha 6dd6082472 [CodeGen] Pass SDAG an ORE, and replace FastISel stats with remarks.
In the long-term, we want to replace statistics with something
finer-grained that lets us gather per-function data.
Remarks are that replacement.

Create an ORE instance in SelectionDAGISel, and pass it to
SelectionDAG.

SelectionDAG was used so that we can emit remarks from all
SelectionDAG-related code, including TargetLowering and DAGCombiner.
This isn't used in the current patch but Adam tells me he's interested
for the fp-contract combines.

Use the ORE instance to emit FastISel failures as remarks (instead of
the mix of dbgs() dumps and statistics that we currently have).

Eventually, we want to have an API that tells us whether remarks are
enabled (http://llvm.org/PR32352) so that we don't emit expensive
remarks (in this case, dumping IR) when it's not needed.  For now, use
'isEnabled' as a crude replacement.

This does mean that the replacement for '-fast-isel-verbose' is now
'-pass-remarks-missed=isel'.  Additionally, clang users also need to
enable remark diagnostics, using '-Rpass-missed=isel'.

This also removes '-fast-isel-verbose2': there are no static statistics
that we want to only enable in asserts builds, so we can always use
the remarks regardless of the build type.

Differential Revision: https://reviews.llvm.org/D31405

llvm-svn: 299093
2017-03-30 17:49:58 +00:00
Sanjay Patel 6d5ba061f8 [DAGCombiner] add helper function for visitORLike; NFCI
This combines all of the equivalent clean-ups for foldAndOfSetCCs:
https://reviews.llvm.org/rL298938
https://reviews.llvm.org/rL298940
https://reviews.llvm.org/rL298944
https://reviews.llvm.org/rL298949
https://reviews.llvm.org/rL298950
https://reviews.llvm.org/rL299002
https://reviews.llvm.org/rL299013

The sins of code duplication are on full display here:
each function is missing a fold that wasn't copied over from its logical sibling. 

llvm-svn: 299091
2017-03-30 17:32:42 +00:00
Craig Topper eafcbe2d10 [APInt] Remove references to integerPartWidth outside of APFloat implentation.
Turns out integerPartWidth only explicitly defines the width of the tc functions in the APInt class. Functions that aren't used by APInt implementation itself. Many places in the code base already assume APInt is made up of 64-bit pieces. Explicitly assuming 64-bit here doesn't make that situation much worse. A full audit would need to be done if it ever changes.

llvm-svn: 299059
2017-03-30 05:49:03 +00:00
Zvi Rackover 7569436f81 [DAGCombine] A shuffle of a splat is always the splat itself
Summary:
Add a simplification:
shuffle (splat-shuffle), undef, M --> splat-shuffle

Fixes pr32449

Patch by Sanjay Patel

Reviewers: eli.friedman, RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31426

llvm-svn: 299047
2017-03-30 01:42:57 +00:00