Commit Graph

16194 Commits

Author SHA1 Message Date
David Majnemer 0be7155350 [InstCombine] Handle failures from ConstantFoldConstantExpression
ConstantFoldConstantExpression returns null when folding fails.

This fixes PR28745.

llvm-svn: 276952
2016-07-28 02:29:06 +00:00
Wei Mi 315bb33f27 Fix the assertion error in collectLoopUniforms caused by empty Worklist before expanding.
Contributed-by: David Callahan

Differential Revision: https://reviews.llvm.org/D22886

llvm-svn: 276943
2016-07-27 23:53:58 +00:00
Michael Zolotukhin ff5ce639de Add verifyAnalysis for LCSSA.
Summary:
LCSSAWrapperPass currently doesn't override verifyAnalysis method, so pass
manager doesn't verify LCSSA. This patch adds the method so that we start
verifying LCSSA between loop passes.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22888

llvm-svn: 276941
2016-07-27 23:35:53 +00:00
Justin Lebar 37f4e0e096 [LSV] Use Instruction*s rather than Value*s where possible.
Summary:
Given the crash in D22878, this patch converts the load/store vectorizer
to use explicit Instruction*s wherever possible.  This is an overall
simplification and should be an improvement in safety, as we have fewer
naked cast<>s, and now where we use Value*, we really mean something
different from Instruction*.

This patch also gets rid of some cast<>s around Value*s returned by
Builder.  Given that Builder constant-folds everything, we can't assume
much about what we get out of it.

One downside of this patch is that we have to copy our chain before
calling propagateMetadata.  But I don't think this is a big deal, as our
chains are very small (usually 2 or 4 elems).

Reviewers: asbirlea

Subscribers: mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22887

llvm-svn: 276938
2016-07-27 23:06:00 +00:00
Justin Lebar 23a9686011 [LSV] Don't assume that bitcast ops are Instructions.
Summary:
When we ask the builder to create a bitcast on a constant, we get back a
constant, not an instruction.

Reviewers: asbirlea

Subscribers: jholewinski, mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22878

llvm-svn: 276922
2016-07-27 21:45:48 +00:00
Jun Bum Lim a033139cd4 [DSE] Fix bug in updating MadeChange flag
Summary: The MadeChange flag should be ORed to keep the previous result.

Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22873

llvm-svn: 276894
2016-07-27 17:25:20 +00:00
Sean Silva 285e0974f0 Refactor - CodeExtractor : Move check for valid block to static utility
This lets you actually check to see if a block is valid before trying to
extract.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22699

llvm-svn: 276846
2016-07-27 08:02:46 +00:00
George Burgess IV 9cf05464aa [GVNHoist] Fix typo in assert.
This fixes PR28730.

llvm-svn: 276844
2016-07-27 06:34:53 +00:00
Sebastian Pop 55c3007b88 GVN-hoist: improve code generation for recursive GEPs
When loading or storing in a field of a struct like "a.b.c", GVN is able to
detect the equivalent expressions, and GVN-hoist would fail in the code
generation.  This is because the GEPs are not hoisted as scalar operations to
avoid moving the GEPs too far from their ld/st instruction when the ld/st is not
movable.  So we end up having to generate code for the GEP of a ld/st when we
move the ld/st.  In the case of a GEP referring to another GEP as in "a.b.c" we
need to code generate all the GEPs necessary to make all the operands available
at the new location for the ld/st.  With this patch we recursively walk through
the GEP operands checking whether all operands are available, and in the case of
a GEP operand, it recursively makes all its operands available. Code generation
happens from the inner GEPs out until reaching the GEP that appears as an
operand of the ld/st.

Differential Revision: https://reviews.llvm.org/D22599

llvm-svn: 276841
2016-07-27 05:48:12 +00:00
Sebastian Pop 586d3eaeb5 GVN-hoist: use DFS numbers instead of walking the instruction stream
The patch replaces a function that walks the IR with a call to firstInBB() that
uses the DFS numbering.  NFC.

Differential Revision: https://reviews.llvm.org/D22809

llvm-svn: 276840
2016-07-27 05:13:52 +00:00
Andrew Kaylor f990fa5f7b Reverting r276771 due to MSan failures.
llvm-svn: 276824
2016-07-27 01:19:24 +00:00
Adam Nemet 2f2bd8caf4 [LoopUtils] Sort headers
llvm-svn: 276776
2016-07-26 17:52:02 +00:00
Andrew Kaylor 3104a6bad0 Re-committing r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 276771
2016-07-26 17:23:13 +00:00
Sebastian Pop 91d4a30159 GVN-hoist: use a DFS numbering of instructions (PR28670)
Instead of DFS numbering basic blocks we now DFS number instructions that avoids
the costly operation of which instruction comes first in a basic block.

Patch mostly written by Daniel Berlin.

Differential Revision: https://reviews.llvm.org/D22777

llvm-svn: 276714
2016-07-26 00:15:10 +00:00
Sebastian Pop 38422b1356 GVN-hoist: limit hoisting depth (PR28670)
This patch adds an option to specify the maximum depth in a BB at which to
consider hoisting instructions.  Hoisting instructions from a deeper level is
not profitable as it increases register pressure and compilation time.

Differential Revision: https://reviews.llvm.org/D22772

llvm-svn: 276713
2016-07-26 00:15:08 +00:00
Michael Kuperstein 39feb6290c [PM] Port SymbolRewriter to the new PM
Differential Revision: https://reviews.llvm.org/D22703

llvm-svn: 276687
2016-07-25 20:52:00 +00:00
Matt Arsenault 7cddfed7e8 Scalarizer: Support scalarizing intrinsics
llvm-svn: 276681
2016-07-25 20:02:54 +00:00
Rong Xu 705f7775bb [PGO] Fix profile mismatch in COMDAT function with pre-inliner
Pre-instrumentation inline (pre-inliner) greatly improves the IR
instrumentation code performance, among other benefits. One issue of the
pre-inliner is it can introduce CFG-mismatch for COMDAT functions. This
is due to the fact that the same COMDAT function may have different early
inline decisions across different modules -- that means different copies
of COMDAT functions will have different CFG checksum.

In this patch, we propose a partially renaming the COMDAT group and its
member function/variable so we have different profile counter for each
version. We will post-fix the COMDAT function and the group name with its
FunctionHash.

Differential Revision: http://reviews.llvm.org/D22600

llvm-svn: 276673
2016-07-25 18:45:37 +00:00
Michael Kuperstein 9a89b15aa2 Attempt to pacify windows bots.
llvm-svn: 276672
2016-07-25 18:39:08 +00:00
Daniel Berlin 40765a62ad Revert NewGVN N^2 behavior patch
llvm-svn: 276670
2016-07-25 18:19:49 +00:00
Michael Kuperstein 8f8e1d1bf6 Don't use iplist in SymbolRewriter. NFC.
There didn't appear to be a good reason to use iplist in this case, a regular
list of unique_ptr works just as well.
Change made in preparation to a new PM port (since iplist is not moveable).

llvm-svn: 276668
2016-07-25 18:10:54 +00:00
Daniel Berlin 14c000936e NFC: Make a few asserts in GVNHoist do the same thing, but cheaper.
llvm-svn: 276662
2016-07-25 17:36:14 +00:00
Daniel Berlin f107f3292f Fix N^2 instruction ordering comparisons in GVNHoist.
This fixes GVNHoist's portion of PR28670.

llvm-svn: 276658
2016-07-25 17:24:27 +00:00
Daniel Berlin 65af45de03 NFC: Refactor GVNHoist class so not everything is public
llvm-svn: 276657
2016-07-25 17:24:22 +00:00
Sean Silva 519323db58 Cleanup : Reformat PartialInliner.cpp to have current LLVM style conventions
Modify the variable names and code style to be that of modern LLVM.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22743

llvm-svn: 276610
2016-07-25 05:57:59 +00:00
Sean Silva fe5abd5e0c Fix : Partial Inliner requires AssumptionCacheTracker
The public InlineFunction utility assumes that the passed in
InlineFunctionInfo has a valid AssumptionCacheTracker.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22706

llvm-svn: 276609
2016-07-25 05:00:00 +00:00
David Majnemer 68623a0e9f [GVNHoist] Merge metadata on hoisted instructions less conservatively
We can combine metadata from multiple instructions intelligently for
certain metadata nodes.

llvm-svn: 276602
2016-07-25 02:21:25 +00:00
David Majnemer 4728569d0a [GVNHoist] Properly merge alignments when hoisting
If we two loads of two different alignments, we must use the minimum of
the two alignments when hoisting.  Same deal for stores.

For allocas, use the maximum of the two allocas.

llvm-svn: 276601
2016-07-25 02:21:23 +00:00
David Majnemer 6f014d37d5 [Utils] Simplify combineMetadata
Use a range-based for loop, no functional change is intended.

llvm-svn: 276600
2016-07-25 02:21:19 +00:00
Elena Demikhovsky 376a18bd92 [Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this:
float *A;
float x = init;
for (int i=0; i < N; ++i) {
  A[i] = x;
  x -= fp_inc;
}

The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.

Differential Revision: https://reviews.llvm.org/D21330

llvm-svn: 276554
2016-07-24 07:24:54 +00:00
George Burgess IV 93ea19b9a6 [MSSA] Make EXPENSIVE_CHECKS check more.
checkClobberSanity will now be run for all results of `ClobberWalk`,
instead of just the crazy phi-optimized ones. This can help us catch
cases where our cache is being wonky.

llvm-svn: 276553
2016-07-24 07:03:49 +00:00
George Burgess IV f23eb70e03 [MSSA] Remove useless assert. NFC.
liveOnEntry is always a MemoryDef; asserting that a MemoryPhi isn't
liveOnEntry, while correct, isn't very helpful. :)

llvm-svn: 276542
2016-07-24 01:50:07 +00:00
Sanjay Patel 1271bf9178 [InstCombine] allow icmp (bit-manipulation-intrinsic(), C) folds for vectors
llvm-svn: 276523
2016-07-23 13:06:49 +00:00
Xinliang David Li 9239245401 [Profile] Use explicit flag to enable IR PGO
Patch by Jake VanAdrighem

Differential Revision: http://reviews.llvm.org/D22607

llvm-svn: 276516
2016-07-23 04:28:52 +00:00
Sean Silva ab6a683765 Avoid using a raw AssumptionCacheTracker in various inliner functions.
This unblocks the new PM part of River's patch in
https://reviews.llvm.org/D22706

Conveniently, this same change was needed for D21921 and so these
changes are just spun out from there.

llvm-svn: 276515
2016-07-23 04:22:50 +00:00
Sanjay Patel 6ebd5857c8 [InstCombine] move udiv+cmp fold over with other BinOp+cmp folds; NFCI
llvm-svn: 276502
2016-07-23 00:28:39 +00:00
Adam Nemet eea7c267b9 [LoopDataPrefetch] Fix unused variable in release build
llvm-svn: 276491
2016-07-22 23:08:10 +00:00
Adam Nemet 9e6e63fba2 [LoopDataPrefetch] Include hotness of region in opt remark
llvm-svn: 276488
2016-07-22 22:53:17 +00:00
Adam Nemet 885f1de490 [LoopDataPrefetch] Sort headers
llvm-svn: 276487
2016-07-22 22:53:12 +00:00
Vitaly Buka e3a032a740 Unpoison stack before resume instruction
Summary:
Clang inserts cleanup code before resume similar way as before return instruction.
This makes asan poison local variables causing false use-after-scope reports.

__asan_handle_no_return does not help here as it was executed before
llvm.lifetime.end inserted into resume block.

To avoid false report we need to unpoison stack for resume same way as for return.

PR27453

Reviewers: kcc, eugenis

Differential Revision: https://reviews.llvm.org/D22661

llvm-svn: 276480
2016-07-22 22:04:38 +00:00
Alina Sbirlea ba21ffebff Add flag to PassManagerBuilder to disable GVN Hoist Pass.
Summary:
Adding a flag to diable GVN Hoisting by default.
Note: The GVN Hoist Pass causes some Halide tests to hang. Halide will disable the pass while investigating.

Reviewers: llvm-commits, chandlerc, spop, dberlin

Subscribers: mehdi_amini

Differential Revision: https://reviews.llvm.org/D22639

llvm-svn: 276479
2016-07-22 22:02:19 +00:00
Michael Kuperstein 38e7298093 [SLPVectorizer] Vectorize reverse-order loads in horizontal reductions
When vectorizing a tree rooted at a store bundle, we currently try to sort the
stores before building the tree, so that the stores can be vectorized. For other
trees, the order of the root bundle - which determines the order of all other
bundles - is arbitrary. That is bad, since if a leaf bundle of consecutive loads
happens to appear in the wrong order, we will not vectorize it.

This is partially mitigated when the root is a binary operator, by trying to
build a "reversed" tree when that's considered profitable. This patch extends the
workaround we have for binops to trees rooted in a horizontal reduction.

This fixes PR28474.

Differential Revision: https://reviews.llvm.org/D22554

llvm-svn: 276477
2016-07-22 21:28:48 +00:00
Jun Bum Lim 6a7dc5c430 Recommit - [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Recommiting r275571 after fixing crash reported in PR28270.
Now we erase elements of IOL in deleteDeadInstruction().

Original Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

llvm-svn: 276452
2016-07-22 18:27:24 +00:00
Wei Mi e04d0eff29 [PM] Port BreakCriticalEdges to the new PM.
Differential Revision: https://reviews.llvm.org/D22688

llvm-svn: 276449
2016-07-22 18:04:25 +00:00
David Majnemer 522a91181a Don't remove side effecting instructions due to ConstantFoldInstruction
Just because we can constant fold the result of an instruction does not
imply that we can delete the instruction.  It may have side effects.

This fixes PR28655.

llvm-svn: 276389
2016-07-22 04:54:44 +00:00
Xinliang David Li d382e9d82b Sync up InstrProfData.inc with compiler-rt with fixes to references
llvm-svn: 276388
2016-07-22 04:46:56 +00:00
Vitaly Buka 53054a7024 Fix detection of stack-use-after scope for char arrays.
Summary:
Clang inserts GetElementPtrInst so findAllocaForValue was not
able to find allocas.

PR27453

Reviewers: kcc, eugenis

Differential Revision: https://reviews.llvm.org/D22657

llvm-svn: 276374
2016-07-22 00:56:17 +00:00
Sanjoy Das bb969791b4 [IRCE] Add an option to skip profitability checks
If `-irce-skip-profitability-checks` is passed in, IRCE will kick in in
all cases where it is legal for it to kick in.  This flag is intended to
help diagnose and analyse performance issues.

llvm-svn: 276372
2016-07-22 00:40:56 +00:00
Sebastian Pop 0e2cec075c GVN-hoist: move check before mutating the IR
llvm-svn: 276368
2016-07-22 00:07:01 +00:00
Sebastian Pop c107a4875e GVN-hoist: add missing check for all GEP operands available
llvm-svn: 276364
2016-07-21 23:32:39 +00:00
Sanjay Patel 18fa9d3ca1 [InstCombine] break up foldICmpEqualityWithConstant(); NFCI
Almost all of these folds require changes to allow vector types. 
Splitting up the logic should make that easier to do incrementally.

llvm-svn: 276360
2016-07-21 23:27:36 +00:00
Sebastian Pop 31fd506623 GVH-hoist: only clone GEPs (PR28606)
Do not clone stored values unless they are GEPs that are special cased to avoid
hoisting them without hoisting their associated ld/st.

Differential revision: https://reviews.llvm.org/D22652

llvm-svn: 276358
2016-07-21 23:22:10 +00:00
Xinliang David Li 6f8c504f10 [Profile] deprecate __llvm_profile_override_default_filename
This eliminates unncessary calls and init functions.

Differential Revision: http://reviews.llvm.org/D22613

llvm-svn: 276354
2016-07-21 23:19:10 +00:00
Wei Mi 1cf58f8996 [PM] Port NaryReassociate to the new PM
Differential Revision: https://reviews.llvm.org/D22648

llvm-svn: 276349
2016-07-21 22:28:52 +00:00
Adam Nemet 84a6425d61 [OptDiag,LDist] Convert remaining opt remarks to use the new API
llvm-svn: 276340
2016-07-21 21:21:34 +00:00
Matthew Simpson 102729cf1b [LV] Move vector int induction update to end of latch
This patch moves the update instruction for vectorized integer induction phi
nodes to the end of the latch block. This ensures consistent placement of all
induction updates across all the kinds of int inductions we create (scalar,
splat vector, or vector phi).

Differential Revision: https://reviews.llvm.org/D22416

llvm-svn: 276339
2016-07-21 21:20:15 +00:00
Rong Xu 97b68c5ebe [PGO] Make needsComdatForCounter() available (NFC)
Move needsComdatForCounter() to lib/ProfileData/InstrProf.cpp from
lib/Transforms/Instrumentation/InstrProfiling.cpp to make is available for
other files.

Differential Revision: https://reviews.llvm.org/D22643

llvm-svn: 276330
2016-07-21 20:50:02 +00:00
Sanjoy Das ff9eea2278 [IndVars] Reflow oddly formatted condition; NFC
llvm-svn: 276319
2016-07-21 18:58:01 +00:00
Sanjay Patel 43395060a1 make InstCombine compare helper functions private; NFC
Also, rename some of them for consistency and to follow current conventions.

llvm-svn: 276312
2016-07-21 18:07:40 +00:00
Vedant Kumar cd32eba67b Avoid a string copy, NFC
llvm-svn: 276310
2016-07-21 17:50:07 +00:00
Sanjay Patel 1710e7cfa7 [InstCombine] break up visitICmpInstWithInstAndIntCst(); NFCI
Making smaller pieces out of some of these ~1000 line functions should make
it easier to incrementally upgrade them to handle vector types.

llvm-svn: 276304
2016-07-21 17:15:49 +00:00
Benjamin Kramer eab3d36753 Rename StringMap::emplace_second to try_emplace.
Coincidentally this function maps to the C++17 try_emplace. Rename it
for consistentcy with C++17 std::map. NFC.

llvm-svn: 276276
2016-07-21 13:37:48 +00:00
Benjamin Kramer 2a185a2547 [GCOV] Remove a layer of indirection.
StringMap is designed to hold large values. No functionality change
intended.

llvm-svn: 276265
2016-07-21 12:06:31 +00:00
David Majnemer 825e4ab9e3 [GVNHoist] Preserve optimization hints which agree
If we have optimization hints with agree with each other along different
paths, preserve them.

llvm-svn: 276248
2016-07-21 07:16:26 +00:00
David Majnemer 4808f26422 [GVNHoist] Don't wrongly preserve TBAA
We hoisted loads/stores without taking into account which can cause
miscompiles.

llvm-svn: 276240
2016-07-21 05:59:53 +00:00
David Majnemer 15cf7b83d1 [MergedLoadStoreMotion] Remove out of date comment
llvm-svn: 276239
2016-07-21 05:59:51 +00:00
Adam Nemet 7cfd5971ab [OptDiag,LV] Add hotness attribute to applied-optimization remarks
Test coverage is provided by modifying the function in the FP-math
testcase that we are allowed to vectorize.

llvm-svn: 276223
2016-07-21 01:07:13 +00:00
Sanjay Patel 0753c06d9c [InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476)
The benefits of this change include:
1. Remove DeMorgan-matching code that was added specifically to work-around 
   the missing transform in http://reviews.llvm.org/rL248634.
2. Makes the DeMorgan transform work for vectors too.
3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476

Extending this transform to other casts and other associative operators may
be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for
doing that though.

Differential Revision: https://reviews.llvm.org/D22271

llvm-svn: 276221
2016-07-21 00:24:18 +00:00
Adam Nemet 0e0e2d5d26 [OptDiag,LV] Add hotness attribute to the derived analysis remarks
This includes FPCompute and Aliasing.

Testcase is based on no_fpmath.ll.

llvm-svn: 276211
2016-07-20 23:50:32 +00:00
Sanjay Patel 5f3c70307d [InstSimplify][InstCombine] don't crash when folding vector selects of icmp
Differential Revision: https://reviews.llvm.org/D22602

llvm-svn: 276209
2016-07-20 23:40:01 +00:00
George Burgess IV f84bb6fa67 Make help text more consistent. NFC.
llvm-svn: 276205
2016-07-20 23:14:29 +00:00
Adam Nemet 5b3a5cf6b0 [OptDiag,LV] Add hotness attribute to analysis remarks
The earlier change added hotness attribute to missed-optimization
remarks.  This follows up with the analysis remarks (the ones explaining
the reason for the missed optimization).

llvm-svn: 276192
2016-07-20 21:44:26 +00:00
David Majnemer bd21012c6c [GVNHoist] Don't hoist PHI nodes
We hoisted PHIs without respecting their special insertion point in the
block, leading to verfier errors.

This fixes PR28626.

llvm-svn: 276181
2016-07-20 21:05:01 +00:00
Davide Italiano 15ff2d6d0c [SCCP] Zap multiple return values.
We can replace the return values with undef if we replaced all
the call uses with a constant/undef.

Differential Revision:  https://reviews.llvm.org/D22336

llvm-svn: 276174
2016-07-20 20:17:13 +00:00
Justin Lebar a272c12b73 [LSV] Don't move stores across may-load instrs, and loosen restrictions on moving loads.
Summary:
Previously we wouldn't move loads/stores across instructions that had
side-effects, where that was defined as may-write or may-throw.  But
this is not sufficiently restrictive: Stores can't safely be moved
across instructions that may load.

This patch also adds a DEBUG check that all instructions in our chain
are either loads or stores.

Reviewers: asbirlea

Subscribers: llvm-commits, jholewinski, arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22547

llvm-svn: 276171
2016-07-20 20:07:37 +00:00
Justin Lebar 62b03e344e [LSV] Vectorize up to side-effecting instructions.
Summary:
Previously if we had a chain that contained a side-effecting
instruction, we wouldn't vectorize it at all.  Now we'll vectorize
everything that comes before the side-effecting instruction.

Reviewers: asbirlea

Subscribers: arsenm, jholewinski, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22536

llvm-svn: 276170
2016-07-20 20:07:34 +00:00
George Burgess IV 400ae40348 [MSSA] Add an overload for getClobberingMemoryAccess.
A seemingly common use for the walker's getClobberingMemoryAccess
function is:

```
MemoryAccess *getClobber(MemorySSAWalker *W, MemoryUseOrDef *MUD) {
  const Instruction *I = MUD->getMemoryInst();
  return W->getClobberingMemoryAccess(I);
}
```

Which is kind of redundant, since walkers will ultimately query MSSA to
find out which MemoryAccess `I` maps to (...which is always `MUD`).

So, this patch adds an overload of getClobberingMemoryAccess that
accepts MemoryAccesses directly. As a result, the Instruction overload
of getClobberingMemoryAccess becomes a lightweight wrapper around our
new overload.

Additionally, this patch un`virtual`izes the Instruction overload of
getClobberingMemoryAccess, since there doesn't seem to be a walker that
benefits from that being virtual, and I can't think of how else one
would implement it. Happy to make it virtual again if we would benefit
from doing so.

llvm-svn: 276169
2016-07-20 19:51:34 +00:00
Sanjay Patel 683170bf56 move decomposeBitTestICmp() to Transforms/Utils; NFC
As noted in https://reviews.llvm.org/D22537 , we can use this functionality in 
visitSelectInstWithICmp() and InstSimplify, but currently we have duplicated
code.

llvm-svn: 276140
2016-07-20 17:18:45 +00:00
Sanjay Patel be53c65fab fix documentation comments; NFC
llvm-svn: 276135
2016-07-20 16:30:55 +00:00
Benjamin Kramer b4d64cf27d Revert "[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))"
Makes InstCombine infloop when compiling v8.

This reverts commit r275989 and r276105.

llvm-svn: 276106
2016-07-20 11:40:16 +00:00
Adam Nemet 67c8929a2c [LV] Add hotness attribute to missed-optimization remarks
The new OptimizationRemarkEmitter analysis pass is hooked up to both new
and old PM passes.

llvm-svn: 276080
2016-07-20 04:03:43 +00:00
Michael Zolotukhin 6bc56d552a Revert "Revert r275883 and r275891. They seem to cause PR28608."
This reverts commit r276064, and thus reapplies r275891 and r275883 with
a fix for PR28608.

llvm-svn: 276077
2016-07-20 01:55:27 +00:00
Justin Lebar 6114b37838 [LSV] Don't assume that loads/stores appear in address order in the BB.
Summary:
getVectorizablePrefix previously didn't work properly in the face of
aliasing loads/stores.  It unwittingly assumed that the loads/stores
appeared in the BB in address order.  If they didn't, it would do the
wrong thing.

Reviewers: asbirlea, tstellarAMD

Subscribers: arsenm, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22535

llvm-svn: 276072
2016-07-20 00:55:12 +00:00
Sean Silva 554efb28d2 Revert r275883 and r275891. They seem to cause PR28608.
Revert "[LoopSimplify] Update LCSSA after separating nested loops."

This reverts commit r275891.

Revert "[LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form."

This reverts commit r275883.

llvm-svn: 276064
2016-07-19 23:54:29 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Justin Lebar 8778c62629 [LSV] Insert stores at the right point.
Summary:
Previously, the insertion point for stores was the last instruction in
Chain *before calling getVectorizablePrefixEndIdx*.  Thus if
getVectorizablePrefixEndIdx didn't return Chain.size(), we still would
insert at the last instruction in Chain.

This patch changes our internal API a bit in an attempt to make it less
prone to this sort of error.  As a result, we end up recalculating the
Chain's boundary instructions, but I think worrying about the speed hit
of this is a premature optimization right now.

Reviewers: asbirlea, tstellarAMD

Subscribers: mzolotukhin, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D22534

llvm-svn: 276056
2016-07-19 23:19:20 +00:00
Justin Lebar 2cf2c22870 [LSV] Use make_range, and reformat a DEBUG message. NFC
Summary:
The DEBUG message was hard to read because two Values were being printed
on the same line with only the delimiter "aliases".  This change makes
us print each Value on its own line.

Reviewers: asbirlea

Subscribers: llvm-commits, arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22533

llvm-svn: 276055
2016-07-19 23:19:18 +00:00
Justin Lebar 4ee8a2d024 [LSV] Nix two global (ish) variables in the LoadStoreVectorizer. NFC
Reviewers: asbirlea

Subscribers: mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D22532

llvm-svn: 276054
2016-07-19 23:19:16 +00:00
Daniel Berlin 1986030b62 Fix unused variable
llvm-svn: 276050
2016-07-19 23:08:08 +00:00
Paul Robinson 2d23c029f7 Make GVN Hoisting obey optnone/bisect.
Differential Revision: http://reviews.llvm.org/D22545

llvm-svn: 276048
2016-07-19 22:57:14 +00:00
Daniel Berlin 5c46b943db Make MemorySSA::dominates/locallydominates constant time
Summary: Make MemorySSA::dominates/locallydominates constant time

Reviewers: george.burgess.iv, gberry

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22527

llvm-svn: 276046
2016-07-19 22:49:43 +00:00
Sanjay Patel 2d477e59e8 [InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type
The pattern may look more obviously like a sext if written as:

  define i32 @g(i16 %x) {
    %zext = zext i16 %x to i32
    %xor = xor i32 %zext, 32768
    %add = add i32 %xor, -32768
    ret i32 %add
  }

We already have that fold in visitAdd().

Differential Revision: https://reviews.llvm.org/D22477

llvm-svn: 276035
2016-07-19 22:09:34 +00:00
Vedant Kumar 57faf2d208 [tsan] Don't instrument __llvm_gcov_global_state_pred or __llvm_gcda*
r274801 did not go far enough to allow gcov+tsan to cooperate. With this
commit it's possible to run the following code without false positives:

  std::thread T1(fib), T2(fib);
  T1.join(); T2.join();

llvm-svn: 276015
2016-07-19 20:16:08 +00:00
David Majnemer 5246e0b2c2 [FunctionAttrs] Correct the safety analysis for inference of 'returned'
We skipped over ReturnInsts which didn't return an argument which would
lead us to incorrectly conclude that an argument returned by another
ReturnInst was 'returned'.

This reverts commit r275756.

This fixes PR28610.

llvm-svn: 276008
2016-07-19 18:50:26 +00:00
Davide Italiano 63266b6be5 [SCCP] Improve assert messages. NFCI.
I've been hitting those already while working on SCCP and I think
it's be useful to provide a more explanatory diagnostic.

llvm-svn: 276007
2016-07-19 18:31:07 +00:00
Chad Rosier 8b5fa7a2f2 [DSE] Add additional debug output. NFC.
llvm-svn: 276005
2016-07-19 18:11:11 +00:00
Chad Rosier 667b1ca0e6 [DSE] Add additional debug output. NFC.
llvm-svn: 275991
2016-07-19 16:50:57 +00:00
Tobias Grosser 1c38262279 [InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:

> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.

Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:

`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`

This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation

Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).

As an example, consider the following IR snippet

```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```

which would now be transformed to

```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```

This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.

Reviewers: grosser, vtjnash, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22511

Contributed-by: Matthias Reisinger
llvm-svn: 275989
2016-07-19 16:39:17 +00:00
Tobias Grosser 8ef834c712 [InstCombine] Minor cleanup of cast simplification code [NFC]
Summary:
This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and according rationale are summarized in the following:

- Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter.

- Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there.

- Simplify interface of `shouldOptimizeCast()`.

- Minor code style adaptions in `shouldOptimizeCast()`.

- Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant.

- Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format.

- Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability.

- Simplify the interface of `isEliminableCastPair()`.

- Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration.

- Renamed parameter `DoXform` of `transformZExtIcmp()` to `DoTransform` to make its intention clearer.

- Moved documentation of `transformZExtIcmp()` from its definition to its declaration and made it doxygen-compliant.

Reviewers: vtjnash, grosser

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22449

Contributed-by: Matthias Reisinger
llvm-svn: 275964
2016-07-19 09:06:08 +00:00
George Burgess IV 5f30897b7b [MemorySSA] Update to the new shiny walker.
This patch updates MemorySSA's use-optimizing walker to be more
accurate and, in some cases, faster.

Essentially, this changed our core walking algorithm from a
cache-as-you-go DFS to an iteratively expanded DFS, with all of the
caching happening at the end. Said expansion happens when we hit a Phi,
P; we'll try to do the smallest amount of work possible to see if
optimizing above that Phi is legal in the first place. If so, we'll
expand the search to see if we can optimize to the next phi, etc.

An iteratively expanded DFS lets us potentially quit earlier (because we
don't assume that we can optimize above all phis) than our old walker.
Additionally, because we don't cache as we go, we can now optimize above
loops.

As an added bonus, this patch adds a ton of verification (if
EXPENSIVE_CHECKS are enabled), so finding bugs is easier.

Differential Revision: https://reviews.llvm.org/D21777

llvm-svn: 275940
2016-07-19 01:29:15 +00:00
Wei Mi 79997a24d7 Recommit the patch "Use uniforms set to populate VecValuesToIgnore".
For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.

The change will make the vector RegUsages estimation less conservative.

Differential Revision: https://reviews.llvm.org/D20474

The recommit fixed the testcase global_alias.ll.

llvm-svn: 275936
2016-07-19 00:50:43 +00:00
Sanjoy Das ab73c9d88e [LoopReroll] Reroll loops with unordered atomic memory accesses
Reviewers: hfinkel, jfb, reames

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22385

llvm-svn: 275932
2016-07-19 00:23:54 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Teresa Johnson 2124157102 [PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.

Reviewers: mehdi_amini, davide

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D22475

llvm-svn: 275916
2016-07-18 21:22:24 +00:00
Wei Mi f9afff71a2 Revert rL275912.
llvm-svn: 275915
2016-07-18 21:14:43 +00:00
Wei Mi 1fd25726af Use uniforms set to populate VecValuesToIgnore.
For instructions in uniform set, they will not have vector versions so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.

The change will make the vector RegUsages estimation less conservative.

Differential Revision: https://reviews.llvm.org/D20474

llvm-svn: 275912
2016-07-18 20:59:53 +00:00
Michael Zolotukhin ea5b72825b [LoopSimplify] Update LCSSA after separating nested loops.
Summary:
Usually LCSSA survives this transformation, but in some cases (see
attached test) it doesn't: values from the original loop after
separating might be used from the outer loop. Before the transformation
it was the same loop, so LCSSA phis were not required.

This fixes PR28272.

Reviewers: sanjoy, hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21665

llvm-svn: 275891
2016-07-18 19:44:19 +00:00
David Majnemer 04854ab1e5 [GVNHoist] Remove a home-grown version of replaceUsesOfWith
replaceUsesOfWith will, on average, consider fewer values when trying
to do the replacement.

No functional change is intended.

llvm-svn: 275884
2016-07-18 19:14:14 +00:00
Michael Zolotukhin 7a3040dc83 [LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form.
Summary:
SSAUpdate might insert PHI-nodes inside loops, which can break LCSSA
form unless we fix it up.

This fixes PR28424.

Reviewers: sanjoy, chandlerc, hfinkel

Subscribers: uabelho, llvm-commits

Differential Revision: http://reviews.llvm.org/D21997

llvm-svn: 275883
2016-07-18 19:05:08 +00:00
Reid Kleckner 3498ad11eb Fix -Wmicrosoft-enum-value in GVNHoist.cpp
llvm-svn: 275879
2016-07-18 18:53:50 +00:00
Adam Nemet b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Adam Nemet 79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Alexander Kornienko 63dd36faa5 Revert "r275571 [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals"
Causes https://llvm.org/bugs/show_bug.cgi?id=28588

llvm-svn: 275801
2016-07-18 15:51:31 +00:00
David Majnemer 04c7c225a1 [GVNHoist] Change the key for VNtoInsns to a pair
While debugging GVNHoist, I found it confusing that the entries in a
VNtoInsns were not always value numbers.  They _usually_ were except for
StoreInst in which case they were a hash of two different value numbers.

This leads to two observations:
- It is more difficult to debug things when the semantic contents of
  VNtoInsns changes over time.
- Using a single value number is not much cheaper, the value of
  VNtoInsns is a SmallVector.
- It is not immediately clear what the algorithm would do if there were
  hash collisions in the StoreInst case.

Using a DenseMap of std::pair sidesteps all of this.

N.B.  The changes in the test were due their sensitivity to the
iteration order of VNtoInsns which has changed.

llvm-svn: 275761
2016-07-18 06:11:37 +00:00
NAKAMURA Takumi 966bde50c3 Revert r275678, "Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute""
This reverts also r275029, "Update Clang tests after adding inference for the returned argument attribute"

It broke LTO build. Seems miscompilation.

llvm-svn: 275756
2016-07-18 03:23:25 +00:00
David Majnemer aa2417835e [GVNHoist] Sink HoistedCtr into GVNHoist
HoistedCtr cannot be a mutated global variable, that will open us up to
races between threads compiling code in parallel.

llvm-svn: 275744
2016-07-18 00:35:01 +00:00
David Majnemer 4c66a714c3 [GVNHoist] Some small cleanups
No functional change is intended, just trying to clean things up a
little.

llvm-svn: 275743
2016-07-18 00:34:58 +00:00
Teresa Johnson cd21a646f6 [ThinLTO] Perform profile-guided indirect call promotion
Summary:
To enable profile-guided indirect call promotion in ThinLTO mode, we
simply add call graph edges for each profitable target from the profile
to the summaries, then the summary-guided importing will consider the
callee for importing as usual.

Also we need to enable the indirect call promotion pass creation in the
PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO
backend), so that the newly imported functions are considered for
promotion in the backends.

The IC promotion profiles refer to callees by GUID, which required
adding GUIDs to the per-module VST in bitcode (and assigning them
valueIds similar to how they are assigned valueIds in the combined
index).

Reviewers: mehdi_amini, xur

Subscribers: mehdi_amini, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21932

llvm-svn: 275707
2016-07-17 14:47:01 +00:00
Teresa Johnson ce7de9b6fb Address review comments.
llvm-svn: 275706
2016-07-17 14:46:58 +00:00
Teresa Johnson 3f42198652 Refactor indirect call promotion profitability analysis (NFC)
Summary:
Refactored the profitability analysis out of the IC promotion pass and
into lib/Analysis so that it can be accessed by the summary index
builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).

Reviewers: davidxl, xur

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22182

llvm-svn: 275705
2016-07-17 14:46:54 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Sanjay Patel 79acd2a96b [InstCombine] allow X + signbit --> X ^ signbit for vector splats
llvm-svn: 275691
2016-07-16 18:29:26 +00:00
Sanjay Patel f9d2b20daf [InstCombine] reassociate logic ops with constants separated by a zext
This is a partial implementation of a general fold for associative+commutative operators:
(op (cast (op X, C2)), C1) --> (cast (op X, op (C1, C2)))
(op (cast (op X, C2)), C1) --> (op (cast X), op (C1, C2))

There are 7 associative operators and 13 cast types, so this could potentially go a lot further.

Differential Revision: https://reviews.llvm.org/D22421

llvm-svn: 275684
2016-07-16 15:20:19 +00:00
Hal Finkel 660096b260 Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute"
This reverts commit r275042; the initial commit triggered self-hosting failures
on ARM/AArch64. James Molloy identified the problematic backend code, which has
been disabled in r275677. Trying again...

Original commit message:

Let FuncAttrs infer the 'returned' argument attribute

A function can have one argument with the 'returned' attribute, indicating that
the associated argument is always the return value of the function. Add
FuncAttrs inference logic.

llvm-svn: 275678
2016-07-16 07:21:28 +00:00
Matt Arsenault 93be6e8c0a StructurizeCFG: Fix inverting constantexpr conditions
llvm-svn: 275626
2016-07-15 22:13:16 +00:00
Michael Zolotukhin a78937afb2 Make processInstruction from LCSSA.cpp externally available.
Summary:
When a pass tries to keep LCSSA form it's often convenient to be able to update
LCSSA for a set of instructions rather than for the entire loop. This patch makes the
processInstruction from LCSSA externally available under a name
formLCSSAForInstruction.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22378

llvm-svn: 275613
2016-07-15 21:08:41 +00:00
Davide Italiano 094dadd5b4 [SCCP] Merge two conditions into one. NFCI.
llvm-svn: 275593
2016-07-15 18:33:16 +00:00
Rong Xu 96a19d35ae [PGO] IRPGO pre-cleanup pass changes
This patch adds a selected set of cleanup passes including a pre-inline pass
before LLVM IR PGO instrumentation. The inline is only intended to apply those
obvious/trivial ones before instrumentation so that much less instrumentation
is needed to get better profiling information. This will drastically improve
the instrumented code performance for large C++ applications. Another benefit
is the context sensitive counts that can potentially improve the PGO
optimization.

Differential Revision: http://reviews.llvm.org/D21405

llvm-svn: 275588
2016-07-15 18:10:49 +00:00
Adam Nemet aad816083e [OptRemark,LDist] RFC: Add hotness attribute
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334

This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution.  My goal is to shake out the design issues before scaling
it up to other types and passes.

Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count.  It's only printed in opt
currently since clang prints the diagnostic fields directly.  E.g.:

  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)

A new API added is similar to emitOptimizationRemarkMissed.  The
difference is that it additionally takes a code region that the
diagnostic corresponds to.  From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI.  (Thanks to Hal for the analysis pass idea.)

This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context.  If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.

A new command-line option is added to turn this on in opt.

My plan is to switch all user of emitOptimizationRemark* to use this
module instead.

Reviewers: hfinkel

Subscribers: rcox2, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21771

llvm-svn: 275583
2016-07-15 17:23:20 +00:00
David Majnemer a940f360cb [AliasAnalysis] Give back AA results for fence instructions
Calling getModRefInfo with a fence resulted in crashes because fences
don't have a memory location.  Add a new predicate to Instruction
called isFenceLike which indicates that the instruction mutates memory
but not any single memory location in particular. In practice, it is a
proxy for the set of instructions which "mayWriteToMemory" but cannot be
used with MemoryLocation::get.

This fixes PR28570.

llvm-svn: 275581
2016-07-15 17:19:24 +00:00
Dehao Chen dcafd5ebfd [PM] Convert LoopInstSimplify Pass to new PM
Summary: Convert LoopInstSimplify to new PM. Unfortunately there is no exisiting unittest for this pass.

Reviewers: davidxl, silvas

Subscribers: silvas, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22280

llvm-svn: 275576
2016-07-15 16:42:11 +00:00
Jun Bum Lim a5737d8eac [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

Reviewers: hfinkel, eeckstein, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D21909

llvm-svn: 275571
2016-07-15 16:14:34 +00:00
Matthew Simpson f855346f0b [LV] Swap A and B in interleaved access analysis (NFC)
This patch swaps A and B in the interleaved access analysis and clarifies
related comments. The algorithm is more intuitive if we let access A precede
access B in program order rather than the reverse. This change was requested in
the review of D19984.

llvm-svn: 275567
2016-07-15 15:22:43 +00:00
Sebastian Pop 4177480aad code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275561
2016-07-15 13:45:20 +00:00
Adam Nemet 74730d9ab0 [LoopDist] Fix typo in diagnostic
llvm-svn: 275495
2016-07-14 22:33:46 +00:00
Ekaterina Romanova 7aea5906c0 [GVN] Fold constant expression in GVN.
Fix for PR 28418.

opt never finishes compiling a test when -gvn option is passed.
The problem is caused by the fact that GVN fails to fold a constant expression.

Differential Revision: https://reviews.llvm.org/D22185

llvm-svn: 275483
2016-07-14 22:02:25 +00:00
Matthew Simpson 96e881deb5 [LV] Rename StrideAccesses to AccessStrideInfo (NFC)
We now collect all accesses with a constant stride, not just the ones with a
stride greater than one. This change was requested in the review of D19984.

llvm-svn: 275473
2016-07-14 21:05:08 +00:00
Matthew Simpson 65ca32b83c [LV] Allow interleaved accesses in loops with predicated blocks
This patch allows the formation of interleaved access groups in loops
containing predicated blocks. However, the predicated accesses are prevented
from forming groups.

Differential Revision: https://reviews.llvm.org/D19694

llvm-svn: 275471
2016-07-14 20:59:47 +00:00
Sanjay Patel bbbb3ce787 don't repeat function names in comments; NFC
llvm-svn: 275470
2016-07-14 20:54:43 +00:00
Davide Italiano 6f73588fb9 [SCCP] Pass the Solver by reference, copies are expensive ...
.. enough to cause LTO compile time to regress insanely.
Thanks *a lot* to Rafael for reporting the problem and testing
the fix!

llvm-svn: 275468
2016-07-14 20:25:54 +00:00
Sanjoy Das 13623ad009 [JumpThreading] PRE unordered loads
Summary: Extend JumpThreading's PRE to unordered atomic loads.

Reviewers: hfinkel, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D22326

llvm-svn: 275456
2016-07-14 19:21:15 +00:00
Jun Bum Lim c837af306e [PM] Port Dead Loop Deletion Pass to the new PM
Summary: Port Dead Loop Deletion Pass to the new pass manager.

Reviewers: silvas, davide

Subscribers: llvm-commits, sanjoy, mcrosier

Differential Revision: https://reviews.llvm.org/D21483

llvm-svn: 275453
2016-07-14 18:28:29 +00:00
Kostya Serebryany dd5c7f9313 [sanitizer-coverage] make sure that calls to __sanitizer_cov_trace_pc are not merged (otherwise different calls get the same PC and confuse fuzzers)
llvm-svn: 275449
2016-07-14 17:59:01 +00:00
Nico Weber 755cd760cd Revert r275401, it caused PR28551.
llvm-svn: 275420
2016-07-14 14:41:25 +00:00
Matthew Simpson 3c3b4a257b [LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions
This patch prevents increases in the number of instructions, pre-instcombine,
due to induction variable scalarization. An increase in instructions can lead
to an increase in the compile-time required to simplify the induction
variables. We now maintain a new map for scalarized induction variables to
prevent us from converting between the scalar and vector forms.

This patch should resolve compile-time regressions seen after r274627.

llvm-svn: 275419
2016-07-14 14:36:06 +00:00
Sjoerd Meijer 716abbb2f5 This converts a signed remainder instruction to unsigned remainder, which
enables the code size optimisation to fold a rem and div into a single
aeabi_uidivmod call. This was not happening before because sdiv was converted
but srem not, and instructions with different signedness are not combined.

Differential Revision: http://reviews.llvm.org/D22214

llvm-svn: 275403
2016-07-14 12:23:48 +00:00
Sebastian Pop 63847d04e7 code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275401
2016-07-14 12:18:53 +00:00
Sjoerd Meijer 38c2cd0c14 This implements a more optimal algorithm for selecting a base constant in
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.

Differential Revision: http://reviews.llvm.org/D21183

llvm-svn: 275382
2016-07-14 07:44:20 +00:00
David Majnemer 666aa945a5 [InstCombine] Masked loads with undef masks can fold to normal loads
We were able to fold masked loads with an all-ones mask to a normal
load.  However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.

llvm-svn: 275380
2016-07-14 06:58:42 +00:00
Davide Italiano ed4d5ea82a [SCCP] Pass a Value * instead of templating this function. NFC.
Thanks to Eli for the suggestion!

llvm-svn: 275366
2016-07-14 03:02:34 +00:00
Davide Italiano 7dac027ed7 [IPSCCP] Constant fold struct argument/instructions when all the lattice values are constant.
This now should also work with the interprocedural variant of the pass.
Slightly easier now that the yak is shaved.

Differential Revision:   http://reviews.llvm.org/D22329

llvm-svn: 275363
2016-07-14 02:51:41 +00:00
Mehdi Amini 8484f92f7f [Scalarizer] PR28108: Skip over nullptr rather than crashing on it.
Summary:
In Scalarizer::gather we see if we already have a scattered form of Op,
and in that case use the new form.

In the particular case of PR28108, the found ValueVector SV has size 2,
where the first Value is nullptr, and the second is indeed a proper Value.
The nullptr then caused an assert to blow when we tried to do
cast<Instruction>(SV[I]).

With this patch we check SV[I] before doing the cast, and if it's nullptr
we just skip over it.

I don't know the Scalarizer well enough to know if this is the best fix
or if something should be done else where to prevent the nullptr from
being in the ValueVector at all, but at least this avoids the crash
and looking at the test case output it looks reasonable.

Reviewers: hfinkel, frasercrmck, wala, mehdi_amini

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21518

llvm-svn: 275359
2016-07-14 01:31:25 +00:00
Davide Italiano 6ed6d77950 [SCCP] Generalize tryToReplaceInstWithConstant to work also with arguments.
llvm-svn: 275357
2016-07-14 01:27:29 +00:00
Sanjoy Das 931df67ae6 [JumpThreading] Delete commented out debug code; NFC
llvm-svn: 275346
2016-07-13 23:33:20 +00:00
David Majnemer d77a3b61eb Move a transform from InstCombine to InstSimplify.
This transform doesn't require any new instructions, it can safely live
in InstSimplify.

llvm-svn: 275344
2016-07-13 23:32:53 +00:00
Davide Italiano 296e9785ba [SCCP] Have the logic for replacing insts with constant in a single place.
The code was pretty much copy-pasted between SCCP and IPSCCP. The situation
became clearly worse after I introduced the support for folding structs in
SCCP.  This commit is NFC as we currently (still) skip the replacement
step in IPSCCP, but I'll change this soon.

llvm-svn: 275339
2016-07-13 23:20:04 +00:00
Alina Sbirlea 640a61cd8b Extended LoadStoreVectorizer to vectorize subchains.
Summary:
LSV used to abort vectorizing a chain for interleaved load/store accesses that alias.
Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain.

Reviewers: llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22119

llvm-svn: 275317
2016-07-13 21:20:01 +00:00
Davide Italiano 390b7ea533 [SCCP] Factor out common code.
llvm-svn: 275308
2016-07-13 19:33:25 +00:00
Davide Italiano 2185001551 [SCCP] Use early return. NFCI.
llvm-svn: 275307
2016-07-13 19:23:30 +00:00
Andrew Kaylor 346dd7f1bd Reverting r275284 due to platform-specific test failures
llvm-svn: 275304
2016-07-13 19:09:16 +00:00
Sanjay Patel c00e48a3db [InstCombine] extend vector select matching for non-splat constants
In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean
way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the
direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could
move that helper function to a higher level.

There is an open question as to which is of these forms should be considered the canonical IR:
  %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
  %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>

Differential Revision: http://reviews.llvm.org/D22114

llvm-svn: 275289
2016-07-13 18:07:02 +00:00
Andrew Kaylor 12cccdd731 Fix for Bug 26903, adds support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 275284
2016-07-13 17:25:11 +00:00
David Majnemer 81d877b392 [LoopVectorize] Further cleanups
No functional change is intended, just a minor cleanup.

llvm-svn: 275243
2016-07-13 03:24:38 +00:00
Michael Kuperstein 51078b81ca [LV] Do not invalidate use-lists we're iterating over.
Should make sanitizers happier.

llvm-svn: 275230
2016-07-12 23:11:34 +00:00
Dehao Chen 9cba1f4e7e New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275222
2016-07-12 22:37:48 +00:00
Teresa Johnson 8950ad12ad Remove unused variable to fix bot failure from r275216
Remove unused variable added in r275216. Should fix bot failure:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/24665

llvm-svn: 275219
2016-07-12 21:29:05 +00:00
Michael Kuperstein a99c46cc73 [LV] Remove wrong assumption about LCSSA
The LCSSA pass itself will not generate several redundant PHI nodes in a single
exit block. However, such redundant PHI nodes don't violate LCSSA form, and may
be introduced by passes that preserve LCSSA, and/or preserved by the LCSSA pass
itself. So, assuming a single PHI node per exit block is not safe.

llvm-svn: 275217
2016-07-12 21:24:06 +00:00
Teresa Johnson 1e44b5d3ab Refactor indirect call promotion profitability analysis (NFC)
Summary:
Refactored the profitability analysis out of the IC promotion pass and
into lib/Analysis so that it can be accessed by the summary index
builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).

Reviewers: davidxl, xur

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22182

llvm-svn: 275216
2016-07-12 21:13:44 +00:00
Davide Italiano 0080269342 [SCCP] Constant fold structs if all the lattice value are constant.
Differential Revision:   http://reviews.llvm.org/D22269

llvm-svn: 275208
2016-07-12 19:54:19 +00:00
David Majnemer 9330b78431 [LoopVectorize] Assorted cleanups
Use range-based for loops instead of doing everything manually.
Use auto when appropriate.

No functional change is intended.

llvm-svn: 275205
2016-07-12 19:35:15 +00:00
Dehao Chen b9f8e29290 [PM] Port LoopIdiomRecognize Pass to new PM
Summary: Port LoopIdiomRecognize Pass to new PM

Reviewers: davidxl

Subscribers: davide, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D22250

llvm-svn: 275202
2016-07-12 18:45:51 +00:00
Vitaly Buka 204dc533c5 Revert "New pass manager for LICM."
Summary: This reverts commit r275118.

Subscribers: sanjoy, mehdi_amini

Differential Revision: http://reviews.llvm.org/D22259

llvm-svn: 275156
2016-07-12 06:25:32 +00:00
Ivan Krasin 5474645dc8 Print remarks from WholeProgramDevirt pass for each call site.
Summary:
It's useful to have some visibility about which call sites are devirtualized,
especially for debug purposes. Another use case is a regression test on the
application side (like, Chromium).

Reviewers: pcc

Differential Revision: http://reviews.llvm.org/D22252

llvm-svn: 275145
2016-07-12 02:38:37 +00:00
Dehao Chen 7ef5820fa3 New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275118
2016-07-11 22:45:24 +00:00
Alina Sbirlea cbc6ac2afd Correct ordering of loads/stores.
Summary:
Aiming to correct the ordering of loads/stores. This patch changes the
insert point for loads to the position of the first load.
It updates the ordering method for loads to insert before, rather than after.

Before this patch the following sequence:
"load a[1], store a[1], store a[0], load a[2]"
Would incorrectly vectorize to "store a[0,1], load a[1,2]".
The correctness check was assuming the insertion point for loads is at
the position of the first load, when in practice it was at the last
load. An alternative fix would have been to invert the correctness check.
The current fix changes insert position but also requires reordering of
instructions before the vectorized load.

Updated testcases to reflect the changes.

Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22071

llvm-svn: 275117
2016-07-11 22:34:29 +00:00
Alina Sbirlea 327955e057 Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.

Reviewers: llvm-commits, jlebar

Subscribers: arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21935

llvm-svn: 275100
2016-07-11 20:46:17 +00:00
Davide Italiano 63c4ce8e1b [SCCP] Try to follow the DRY principle, use `OpSt`.
Thanks to Eli Friedman for pointing out in his post-commit review!

llvm-svn: 275084
2016-07-11 18:21:29 +00:00
Jingyue Wu 641cfee976 [SLSR] Call getPointerSizeInBits with the correct address space.
llvm-svn: 275083
2016-07-11 18:13:28 +00:00
Davide Italiano e8ae0b5eb4 [PM/IPO] Port LowerTypeTests to the new PassManager.
There's a little bit of churn in this patch because the initialization
mechanism is now shared between the old and the new PM. Other than
that, it's just a pretty mechanical translation.

llvm-svn: 275082
2016-07-11 18:10:06 +00:00
Davide Italiano 12a115683b [LowerTypeTests] Don't rely on doInitialization().
In preparation for porting this pass to the new PM (which has no
doInitialization()).

Differential Revision:  http://reviews.llvm.org/D22223

llvm-svn: 275074
2016-07-11 17:00:31 +00:00
Dehao Chen 9232f98279 Implement callsite-hotness based inline cost for Sample-based PGO
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.

E.g.

if (A1 && A2 && A3 && ..... && A10) {
  for (i=0; i < 100000000; i++) {
    callsite();
  }
}

Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.

In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.

Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.

Reviewers: davidxl, eraman, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22118

llvm-svn: 275073
2016-07-11 16:48:54 +00:00
Dehao Chen 29d2641f52 Tune the weight propagation algorithm for sample profile.
Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D22180

llvm-svn: 275072
2016-07-11 16:40:17 +00:00
Nicolai Haehnle 889a20cf40 [Sink] Don't move calls to readonly functions across stores
Summary:

Reviewers: hfinkel, majnemer, tstellarAMD, sunfish

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17279

llvm-svn: 275066
2016-07-11 14:11:51 +00:00
Hal Finkel 02012bcfee Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute
Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot.

llvm-svn: 275042
2016-07-11 04:51:23 +00:00
Hal Finkel ce881a41f9 Don't use a SmallSet for returned attribute inference
Suggested post-commit by David Majnemer on IRC (following-up on a pre-commit
review comment).

llvm-svn: 275033
2016-07-11 01:14:21 +00:00
Hal Finkel d66a7b05db Let FuncAttrs infer the 'returned' argument attribute
A function can have one argument with the 'returned' attribute, indicating that
the associated argument is always the return value of the function. Add
FuncAttrs inference logic.

Differential Revision: http://reviews.llvm.org/D22202

llvm-svn: 275027
2016-07-10 22:02:55 +00:00
Benjamin Kramer 4d09892e9a Give helper classes/functions internal linkage. NFC.
llvm-svn: 275014
2016-07-10 11:28:51 +00:00
Davide Italiano 0f03ce0c88 [SCCP] Rename undefined -> unknown.
In the solver, isUndefined() does really mean "we don't know the
value yet" rather than "this is an UndefinedValue". Discussed with
Eli Friedman.

Differential Revision:  http://reviews.llvm.org/D22192

llvm-svn: 275004
2016-07-10 00:35:15 +00:00
Sean Silva db90d4d9c1 [PM] Port LoopVectorize to the new PM.
llvm-svn: 275000
2016-07-09 22:56:50 +00:00
Davide Italiano c4890705ef [SCCP] Remove wrong and misleading vector handling code.
This code was already commented out and it made some weird assumptions,
e.g. using isUndefined() as "this value is UndefValue" instead of
"we haven't computed this value is yet". Thanks to Eli Friedman for
pointing out where I was wrong (and where this code was wrong).

llvm-svn: 274995
2016-07-09 22:49:35 +00:00
Jingyue Wu debce55ac3 [SLSR] Fix crash on handling 128-bit integers.
ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call
getSExtValue only on narrow integers.

As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions.

llvm-svn: 274982
2016-07-09 19:13:18 +00:00
Benjamin Kramer 5f7edcf953 [ArgPromote] Use function_ref and for-range loops.
No functionality change intended.

llvm-svn: 274973
2016-07-09 10:36:36 +00:00
Davide Italiano 081fd139b3 [LoopSimplify] Remove a comment which is unlikely to age well.
Chandler pointed out in his review but I forgot to remove before
committing, my bad.

llvm-svn: 274963
2016-07-09 03:27:24 +00:00
Davide Italiano 92b933a55c [PM] Port CrossDSOCFI to the new pass manager.
llvm-svn: 274962
2016-07-09 03:25:35 +00:00
Sean Silva 0dacbd8f31 [PM] Fix a think-o. mv {Scalar,Vectorize}/SLPVectorize.h
llvm-svn: 274960
2016-07-09 03:11:29 +00:00
Davide Italiano cd96cfd8df [PM] Port LoopSimplify to the new pass manager.
While here move simplifyLoop() function to the new header, as
suggested by Chandler in the review.

Differential Revision:  http://reviews.llvm.org/D21404

llvm-svn: 274959
2016-07-09 03:03:01 +00:00
Piotr Padlewski 3b77612839 Add 'thinlto_src_module' md with asserts or -enable-import-metadata
Summary:
This way the metadata will be only generated when asserts enabled,
or when -enable-import-metadata specified

FIXED missing colon on requires.

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D22167

llvm-svn: 274947
2016-07-08 23:01:49 +00:00
Piotr Padlewski d4b792346c Revert "Add 'thinlto_src_module' md with asserts or -enable-import-metadata"
Reverting because of 17463
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17463

This reverts commit d20cb431bba2ba43b4c65a8556cff445bfefbb7c.

llvm-svn: 274946
2016-07-08 22:55:48 +00:00
Anna Thomas 9ad45adfd7 Revert "InstCombine rule to fold truncs whose value is available"
This reverts commit r274853.
Caused failure in ppcBE build

llvm-svn: 274943
2016-07-08 22:15:08 +00:00
Jingyue Wu 15f3e82d42 [TTI] Expose TTI::getGEPCost and use it in SLSR and NaryReassociate.
NFC.

llvm-svn: 274940
2016-07-08 21:48:05 +00:00
Piotr Padlewski d6efefa2b8 Add 'thinlto_src_module' md with asserts or -enable-import-metadata
Summary:
This way the metadata will be only generated when asserts enabled,
or when -enable-import-metadata specified

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D22167

llvm-svn: 274938
2016-07-08 21:25:39 +00:00
Sanjay Patel 664514f7fe [InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use
This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), 
but we'll try to err on the conservative side by going with the case that has
less IR instructions.

Note: This question came up in http://reviews.llvm.org/D22114 , but this part is
independent of that patch proposal, so I'm making this small change ahead of that
one. 

See also:
http://reviews.llvm.org/rL274926

llvm-svn: 274932
2016-07-08 21:17:51 +00:00
Xinliang David Li 7853c1dd73 Rename LoopAccessAnalysis to LoopAccessLegacyAnalysis /NFC
llvm-svn: 274927
2016-07-08 20:55:26 +00:00
Sanjay Patel f4a08ede03 [InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops
llvm-svn: 274926
2016-07-08 20:53:29 +00:00
Xinliang David Li 8c3554fa69 Remove duplicate inclusion /NFC
llvm-svn: 274921
2016-07-08 20:21:32 +00:00
Dehao Chen 429f5c735f Remove inline hints computation from SampleProfile.cpp
Summary: As we will move to use uniformed hotness check in inliner, we do not need inline hints in SampleProfile pass any more.

Reviewers: dnovillo, davidxl

Subscribers: eraman, llvm-commits

Differential Revision: http://reviews.llvm.org/D19287

llvm-svn: 274918
2016-07-08 20:12:44 +00:00
Davide Italiano b4b9db81f2 [CrossDSOCFI] Change the pass so that it doesn't require doInitialization()
Differential Revision:  http://reviews.llvm.org/D21357

llvm-svn: 274910
2016-07-08 19:30:06 +00:00
Davide Italiano d555bde59f [SCCP] Fold constants as we build them whne visiting cast instructions.
This should be slightly more efficient and could avoid spurious overdefined
markings, as Eli pointed out.

Differential Revision:  http://reviews.llvm.org/D22122

llvm-svn: 274905
2016-07-08 19:13:40 +00:00
Sanjay Patel 1b6b824548 [InstCombine] check for one-use before turning simple logic op into a select
llvm-svn: 274891
2016-07-08 17:26:47 +00:00
Sanjay Patel cbfca9e8ef [InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors
llvm-svn: 274883
2016-07-08 17:01:15 +00:00
Chad Rosier 89c32a9531 [DSE] Minor refactor based on D21007. NFC.
llvm-svn: 274877
2016-07-08 16:48:40 +00:00
Anna Thomas 3124f6273a InstCombine rule to fold truncs whose value is available
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.

This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.

Differential Revision: http://reviews.llvm.org/D21791

llvm-svn: 274853
2016-07-08 15:18:56 +00:00
Vedant Kumar 0fdffd3709 [tsan] Try harder to not instrument gcov counters
GCOVProfiler::emitProfileArcs() can create many variables with names
starting with "__llvm_gcov_ctr", so llvm appends a numeric suffix to
most of them. Teach tsan about this.

llvm-svn: 274801
2016-07-07 22:45:28 +00:00
Davide Italiano 16284df8ec [PM] Port InstSimplify to the new pass manager.
llvm-svn: 274796
2016-07-07 21:14:36 +00:00
Anna Thomas 6a78c78a03 [DSE] Remove dead stores in end blocks containing fence
We can remove dead stores in the presence of fence instructions. Fence
does not change an otherwise thread local store to visible.

reviewers: reames, dexonsmith, jfb
Differential Revision: http://reviews.llvm.org/D22001

llvm-svn: 274795
2016-07-07 20:51:42 +00:00
Rui Ueyama a7e11a5d34 Add a missing semicolon.
llvm-svn: 274794
2016-07-07 20:21:50 +00:00
Alina Sbirlea 598f8aad98 Clang-format LoadStoreVectorizer
Reviewers: llvm-commits, jlebar, arsenm

Subscribers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D22107

llvm-svn: 274792
2016-07-07 20:10:35 +00:00
Davide Italiano 709d41819b [LoopStrengthReduce] Fix -Wmisleading-indentation. Reported by GCC6.
llvm-svn: 274773
2016-07-07 17:44:38 +00:00
Sanjay Patel 25600f39eb save type in local var; NFCI
llvm-svn: 274760
2016-07-07 15:28:17 +00:00
Sjoerd Meijer 7435a910b5 Addressing post-commit comments for not rewriting fputs:
moved the optimise for size check inside function optimizeFPuts.

llvm-svn: 274758
2016-07-07 14:31:19 +00:00
Sjoerd Meijer 17c08dc701 Code size optimisation: don't rewrite fputs to fwrite when optimising for size
because fwrite requires more arguments and thus extra MOVs are required.

llvm-svn: 274753
2016-07-07 13:56:23 +00:00
Elena Demikhovsky fc1e969dfc Fixed a bug in vectorizing GEP before gather/scatter intrinsic.
Vectorizing GEP was incorrect and broke SSA in some cases.
 
The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997.

Differential revision: http://reviews.llvm.org/D22035

llvm-svn: 274735
2016-07-07 06:06:46 +00:00
Qin Zhao c35b2cba6f [esan:cfrag] Add option -esan-aux-field-info
Summary:
Adds option -esan-aux-field-info to control generating binary with
auxiliary struct field information.

Extracts code for creating auxiliary information from
createCacheFragInfoGV into createCacheFragAuxGV.

Adds test struct_field_small.ll for -esan-aux-field-info test.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D22019

llvm-svn: 274726
2016-07-07 03:20:16 +00:00
Sean Silva 59fe82f4ce [PM] Port TailCallElim
llvm-svn: 274708
2016-07-06 23:48:41 +00:00
Sean Silva b025d375a1 [PM] Port CorrelatedValuePropagation
llvm-svn: 274705
2016-07-06 23:26:29 +00:00
Sanjay Patel 65a51c25c1 [InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.

Differential Revision: http://reviews.llvm.org/D21899

llvm-svn: 274696
2016-07-06 22:23:01 +00:00
Chad Rosier 232e29ebea [MemorySSA] Reinstate the legacy printer and verifier.
Differential Revision: http://reviews.llvm.org/D22058

llvm-svn: 274679
2016-07-06 21:20:47 +00:00
Haicheng Wu a95cd1267f [LIR] Fix mis-compilation with unwinding.
To fix PR27859, bail out if there is an instruction may throw.

Differential Revision: http://reviews.llvm.org/D20638

llvm-svn: 274673
2016-07-06 21:05:40 +00:00
Sanjay Patel ea23436638 [InstCombine] use more specific pattern matchers; NFCI
Follow-up from r274465: we don't need to capture the value in these cases, 
so just match the constant that we're looking for. m_One/m_Zero work with
vector splats as well as scalars.

llvm-svn: 274670
2016-07-06 21:01:26 +00:00
Piotr Padlewski 6deaa6afae Add 'thinlto_src_module' metadata to imported function
Added metadata to be able to make statistics on how many functions
that have been imported have been removed. Also module name might
be helpfull when debugging.

Reviewers: tejohnson, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D21943

llvm-svn: 274668
2016-07-06 20:26:25 +00:00
Derek Bruening d712a3c10e [esan|wset] Fix incorrect memory size assert
Summary:
Fixes an incorrect assert that fails on 128-bit-sized loads or stores.
Augments the wset tests to include this case.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D22062

llvm-svn: 274666
2016-07-06 20:13:53 +00:00
Chad Rosier dcfce2d0ec [DSE] Avoid iterator invalidation bugs.
The dse_with_dbg_value.ll test committed with r273141 is removed because this
we no longer performs any type of back tracking, which is what was causing the
codegen differences with and without debug information.

Differential Revision: http://reviews.llvm.org/D21613

llvm-svn: 274660
2016-07-06 19:48:52 +00:00
Sean Silva f50d4b6cdc Work around PR28400 a bit harder.
We were still crashing in the "no change" case because LVI was not
getting invalidated.

See the thread "Should analyses be able to hold AssertingVH to IR?
(related to PR28400)" for more discussion.

llvm-svn: 274656
2016-07-06 19:05:41 +00:00
Piotr Padlewski 1f685e0186 NFC changed names in FunctionImport
llvm-svn: 274649
2016-07-06 18:12:23 +00:00
Matthew Simpson 433cb1dfe3 [LV] Don't widen trivial induction variables
We currently always vectorize induction variables. However, if an induction
variable is only used for counting loop iterations or computing addresses with
getelementptr instructions, we don't need to do this. Vectorizing these trivial
induction variables can create vector code that is difficult to simplify later
on. This is especially true when the unroll factor is greater than one, and we
create vector arithmetic when computing step vectors. With this patch, we check
if an induction variable is only used for counting iterations or computing
addresses, and if so, scalarize the arithmetic when computing step vectors
instead. This allows for greater simplification.

This patch addresses the suboptimal pointer arithmetic sequence seen in
PR27881.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27881
Differential Revision: http://reviews.llvm.org/D21620

llvm-svn: 274627
2016-07-06 14:26:59 +00:00
Daniel Berlin fc7e651bfd Fix handling of forward unreachable but reverse-reachable blocks in MemorySSA construction
llvm-svn: 274606
2016-07-06 05:32:05 +00:00
George Burgess IV a362b09a81 [MSSA] Fix typo. NFC.
llvm-svn: 274590
2016-07-06 00:28:43 +00:00
George Burgess IV bfa401e5ad [CFLAA] Split into Anders+Steens analysis.
StratifiedSets (as implemented) is very fast, but its accuracy is also
limited. If we take a more aggressive andersens-like approach, we can be
way more accurate, but we'll also end up being slower.

So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.

Long-term, we want to end up in a place where CFLSteens is queried
first; if it can provide an answer, great (since queries are basically
map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.

This patch splits everything out so we can try to do something like
that when we get a reasonable CFLAnders implementation.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21910

llvm-svn: 274589
2016-07-06 00:26:41 +00:00
Ryan Govostes e51401bdab [asan] Add a hidden option for Mach-O global metadata liveness tracking
llvm-svn: 274578
2016-07-05 21:53:08 +00:00
Matthew Simpson 89188729c3 [LV] Refactor integer induction widening (NFC)
This patch also removes the SCEV variants of getStepVector() since they have no
uses after the refactoring.

Differential Revision: http://reviews.llvm.org/D21903

llvm-svn: 274558
2016-07-05 15:41:28 +00:00
Sanjay Patel cbaac41856 [InstCombine] enable vector select of bools -> logic folds
llvm-svn: 274465
2016-07-03 14:34:39 +00:00
Sanjay Patel a1a4e100be fix formatting; NFC
llvm-svn: 274463
2016-07-03 14:08:19 +00:00
Sean Silva fa6db90164 PR28400: Partly undo r274440 to bring test-suite back to life with the new PM
PR28400 seems to be not an isolated issue, but a general problem related
to caching analyses. We will need to discuss on llvm-dev.

A test case is in the PR.

llvm-svn: 274457
2016-07-03 03:35:06 +00:00
Sean Silva 997cbea05b [PM] Some preparatory refactoring to minimize the diff of D21921
llvm-svn: 274456
2016-07-03 03:35:03 +00:00
Sean Silva 45835e731d Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

llvm-svn: 274455
2016-07-02 23:47:27 +00:00
Sean Silva 0fb7774f91 [PM] Some preparatory refactoring to minimize the diff of D21921
The main change here is just moving stuff to static functions.

llvm-svn: 274446
2016-07-02 19:12:56 +00:00
Sean Silva e2133e7c32 [PM] Preparatory cleanups to ArgumentPromotion.
This pulls some obvious changes out of http://reviews.llvm.org/D21921 to
minimize the diff.

llvm-svn: 274445
2016-07-02 18:59:51 +00:00
Sean Silva f2db01c626 [PM] Fix a small typo from when I ported JumpThreading
llvm-svn: 274440
2016-07-02 16:16:44 +00:00
Benjamin Kramer 3bc1edf95b Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended.

llvm-svn: 274431
2016-07-02 11:41:39 +00:00
Qin Zhao b463c23c10 [esan|cfrag] Add counters for struct array accesses
Summary:
Adds one counter to the struct counter array for counting struct
array accesses.

Adds instrumentation to insert counter update for struct array
accesses.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21594

llvm-svn: 274420
2016-07-02 03:25:37 +00:00
Michael Kuperstein 071d8306b0 [PM] Port ConstantHoisting to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D21945

llvm-svn: 274411
2016-07-02 00:16:47 +00:00
Matt Arsenault 3add3a40a4 LoadStoreVectorizer: Fix warning about extra semicolon
llvm-svn: 274406
2016-07-01 23:26:54 +00:00
Evgeniy Stepanov b736335dc3 [msan] Fix __msan_maybe_ for non-standard type sizes.
Fix incorrect calculation of the type size for __msan_maybe_warning_N
call that resulted in an invalid (narrowing) zext instruction and
"Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed."

Only happens in very large functions (with more than 3500 MSan
checks) operating on integer types that are not power-of-two.

llvm-svn: 274395
2016-07-01 22:49:59 +00:00
Alina Sbirlea 8d8aa5dd6c Address two correctness issues in LoadStoreVectorizer
Summary:
GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundry set is not being tested by isVectorizable().
Partially solve reordering of instructions. More extensive solution to follow.

Reviewers: tstellarAMD, llvm-commits, jlebar

Subscribers: escha, arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21934

llvm-svn: 274389
2016-07-01 21:44:12 +00:00
Sanjay Patel 887aa6d6ef fix documentation comments; NFC
llvm-svn: 274362
2016-07-01 16:41:59 +00:00
Xinliang David Li 94734eef33 [PM] refactor LoopAccessInfo code part-2
Differential Revision: http://reviews.llvm.org/D21636

llvm-svn: 274334
2016-07-01 05:59:55 +00:00
Matt Arsenault a8576706e3 LoadStoreVectorizer: improvements: better pointer analysis
If OpB has an ADD NSW/NUW, we can use that to prove that adding 1
to OpA won't wrap if OpA + 1 == OpB.

Patch by Fiona Glaser

llvm-svn: 274324
2016-07-01 02:16:24 +00:00
Matt Arsenault 0101ecade0 LoadStoreVectorizer: Don't increase alignment with no align set
If no alignment was set on the load/stores, it would vectorize
to the new type even though this increases the default alignment.

llvm-svn: 274323
2016-07-01 02:09:38 +00:00
Matt Arsenault 370e8226c7 LoadStoreVectorizer: Check TTI for vec reg bit width
llvm-svn: 274322
2016-07-01 02:07:22 +00:00
Matt Arsenault 42ad17059a LoadStoreVectorizer: Fix assert when merging pointer ops
This needs to use inttoptr/ptrtoint if combining an int and pointer
load. If a pointer is used always do an integer load.

llvm-svn: 274321
2016-07-01 01:55:52 +00:00
Duncan P. N. Exon Smith 9d1f156418 Revert "code hoisting pass based on GVN"
This reverts commit r274305, since it breaks self-hosting:
  http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232

Note that the blamelist on lab.llvm.org:8011 is incorrect.  The previous
build was r274299, but somehow r274305 wasn't included in the blamelist:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules

llvm-svn: 274320
2016-07-01 01:51:40 +00:00
Matt Arsenault 241f34cde8 LoadStoreVectorizer: Use AA metadata
This was not passing the full instruction with metadata
to the alias query.

llvm-svn: 274318
2016-07-01 01:47:46 +00:00
Matt Arsenault d7e8898bdd LoadStoreVectorizer: if one element of a vector is integer, default to
integer.

Fixes issues on some architectures where we use arithmetic ops to build
vectors, which can cause bad things to happen for loads/stores of mixed
types.

Patch by Fiona Glaser

llvm-svn: 274307
2016-07-01 00:37:01 +00:00
Matt Arsenault 8a4ab5e19f LoadStoreVectorizer: Fix crashes on sub-byte types
llvm-svn: 274306
2016-07-01 00:36:54 +00:00
Sebastian Pop 5c5798c57c code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 274305
2016-07-01 00:24:31 +00:00
Matt Arsenault 079d0f19a2 LoadStoreVectorizer: Check skipFunction first.
Also add test I forgot to add to r274296.

llvm-svn: 274299
2016-06-30 23:50:18 +00:00
Matt Arsenault 2cbe52b990 LoadStoreVectorizer: Skip optnone functions
llvm-svn: 274296
2016-06-30 23:30:29 +00:00
Matt Arsenault 08debb0244 Add LoadStoreVectorizer pass
This was contributed by Apple, and I've been working on
minimal cleanups and generalizing it.

llvm-svn: 274293
2016-06-30 23:11:38 +00:00
Matt Arsenault 2ec640a62f Don't use unchecked dyn_cast
llvm-svn: 274282
2016-06-30 21:18:06 +00:00
Matt Arsenault 727e279ac4 SLPVectorizer: Move propagateMetadata to VectorUtils
This will be re-used by the LoadStoreVectorizer.

Fix handling of range metadata and testcase by Justin Lebar.

llvm-svn: 274281
2016-06-30 21:17:59 +00:00
Wei Mi 95685faeee Refine the set of UniformAfterVectorization instructions.
Except the seed uniform instructions (conditional branch and consecutive ptr
instructions), dependencies to be added into uniform set should only be used
by existing uniform instructions or intructions outside of current loop.

Differential Revision: http://reviews.llvm.org/D21755

llvm-svn: 274262
2016-06-30 18:42:56 +00:00
Sanjay Patel 7521e1b880 fix formatting, add TODO; NFC
llvm-svn: 274238
2016-06-30 15:32:45 +00:00
Jun Bum Lim 596a3bd9ec [DSE] Fix bug in partial overwrite tracking
Summary:
Found cases where DSE incorrectly add partially-overwritten intervals.
Please see the test case for details.

Reviewers: mcrosier, eeckstein, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D21859

llvm-svn: 274237
2016-06-30 15:32:20 +00:00
Sanjay Patel 7c6eab5777 [InstCombine] shrink switch conditions better (PR24766)
https://llvm.org/bugs/show_bug.cgi?id=24766#c2

This removes a hack that was added for the benefit of x86 codegen. 
It prevented shrinking the switch condition even to smaller legal (DataLayout) types.
We have a safety mechanism in CGP after:
http://reviews.llvm.org/rL251857
...so we're free to use the optimal (smallest) IR type now.

Differential Revision: http://reviews.llvm.org/D12965

llvm-svn: 274233
2016-06-30 14:51:21 +00:00
Sanjay Patel 4520d9a1f5 [InstCombine] use ConstantExpr::getBitCast() instead of creating useless instruction
llvm-svn: 274229
2016-06-30 14:27:41 +00:00
Sanjay Patel 7ad98babfa [InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types
If the incoming types are i1, then we don't have to pattern match any sext ops.

Differential Revision: http://reviews.llvm.org/D21740

llvm-svn: 274228
2016-06-30 14:18:18 +00:00
Adam Nemet e1af3c635c [LV] Improve accuracy and formatting of function comment
llvm-svn: 274182
2016-06-29 22:04:10 +00:00
Tim Shen aec68b263d [InstCombine] Simplify and correct folding fcmps with the same children
Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp *, x, y) | (fcmp *, x, y) and (fcmp *, x, y) & (fcmp *, x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants.

Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that.

Reviewers: spatel

Subscribers: llvm-commits, iteratee, echristo

Differential Revision: http://reviews.llvm.org/D21775

llvm-svn: 274156
2016-06-29 20:10:17 +00:00
Tim Shen 860a67eb4c [InstCombine, NFC] Change the generated variable names by creating new instructions
This removes some noise for D21775's test changes.

llvm-svn: 274155
2016-06-29 20:10:13 +00:00
Elena Demikhovsky 5e21c94f25 Reverted patch 273864
llvm-svn: 274115
2016-06-29 10:01:06 +00:00
Adam Nemet ad437fff53 [Diag] Add getter shouldAlwaysPrint. NFC
For the new hotness attribute, the API will take the pass rather than
the pass name so we can no longer play the trick of AlwaysPrint being a
special pass name. This adds a getter to help the transition.

There is also a corresponding clang patch.

llvm-svn: 274100
2016-06-29 04:55:19 +00:00
Eric Christopher 0c58837b1f Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions"
Revert "[InstCombine] Combine A->B->A BitCast"

as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847

This reverts commits r270135 and r263734.

llvm-svn: 274094
2016-06-29 03:05:58 +00:00
Adam Nemet bd861acf29 [LLE] Don't hoist conditionally executed loads
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional.  Thus we would
access a memory location that the original loop did not access.

llvm-svn: 273991
2016-06-28 04:02:47 +00:00
Michael Kuperstein 835facd863 [PM] Normalize FIXMEs for missing PreserveCFG to have the same wording.
llvm-svn: 273974
2016-06-28 00:54:12 +00:00
Sanjay Patel 59ed2ffca3 [InstCombine] shrink type of sdiv if dividend is sexted and constant divisor is small enough (PR28153)
This should fix PR28153:
https://llvm.org/bugs/show_bug.cgi?id=28153

Differential Revision: http://reviews.llvm.org/D21769

llvm-svn: 273951
2016-06-27 22:27:11 +00:00
Elena Demikhovsky 6f2ec8104a Fixed crash of SLP Vectorizer on KNL
The bug is connected to vector GEPs.
https://llvm.org/bugs/show_bug.cgi?id=28313

llvm-svn: 273919
2016-06-27 20:07:00 +00:00
Sanjay Patel bedd1f9d3d [InstCombine] refactor sdiv by APInt transforms (NFC)
There's at least one more fold to do here:
https://llvm.org/bugs/show_bug.cgi?id=28153

llvm-svn: 273904
2016-06-27 18:38:40 +00:00
Daniel Berlin 16ed57c86b Factor out buildMemorySSA from getWalker.
NFC.

llvm-svn: 273901
2016-06-27 18:22:27 +00:00
Sanjay Patel c6ada53be5 [InstCombine] use m_APInt for div --> ashr fold
The APInt matcher works with splat vectors, so we get this fold for vectors too.

llvm-svn: 273897
2016-06-27 17:25:57 +00:00
Easwaran Raman 1832bf6aee [PM] Port PartialInlining to the new PM
Differential revision: http://reviews.llvm.org/D21699

llvm-svn: 273894
2016-06-27 16:50:18 +00:00
Kuba Brecka 7d03ce480a [asan] fix false dynamic-stack-buffer-overflow report with constantly-sized dynamic allocas, LLVM part
See the bug report at https://github.com/google/sanitizers/issues/691. When a dynamic alloca has a constant size, ASan instrumentation will treat it as a regular dynamic alloca (insert calls to poison and unpoison), but the backend will turn it into a regular stack variable. The poisoning/unpoisoning is then broken. This patch will treat such allocas as static.

Differential Revision: http://reviews.llvm.org/D21509

llvm-svn: 273888
2016-06-27 15:57:08 +00:00
Benjamin Kramer 4c137dbe25 [msan] Tighten up type in StoreList. NFC.
llvm-svn: 273866
2016-06-27 12:25:23 +00:00
Elena Demikhovsky 4c58b2761a Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)

  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Re-commit rL273257 - revision: http://reviews.llvm.org/D20789

llvm-svn: 273864
2016-06-27 11:19:23 +00:00
Benjamin Kramer 706e48839d [CodeExtractor] Merge DEBUG statements in an attempt to fix the msvc
build.

There's a known bug in msvc 2013 that fails to compile do-while loops
inside of ranged for loops.

llvm-svn: 273811
2016-06-26 13:39:33 +00:00
Benjamin Kramer 135f735af1 Apply clang-tidy's modernize-loop-convert to most of lib/Transforms.
Only minor manual fixes. No functionality change intended.

llvm-svn: 273808
2016-06-26 12:28:59 +00:00
Sanjoy Das 9d08642c64 [RSForGC] Appease MSVC
llvm-svn: 273805
2016-06-26 05:42:52 +00:00
Sanjoy Das a37bb4a65d [LoopUnswitch] Unswitch on conditions feeding into guards
Summary:
This is a straightforward extension of what LoopUnswitch does to
branches to guards.  That is, we unswitch

```
for (;;) {
  ...
  guard(loop_invariant_cond);
  ...
}
```

into

```
if (loop_invariant_cond) {
  for (;;) {
    ...
    // There is no need to emit guard(true)
    ...
  }
} else {
  for (;;) {
    ...
    guard(false);
    // SimplifyCFG will clean this up by adding an
    // unreachable after the guard(false)
    ...
  }
}
```

Reviewers: majnemer

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21725

llvm-svn: 273801
2016-06-26 05:10:45 +00:00
Sanjoy Das 7dda0edb5f [RSForGC] Bring the BDVState struct up to code; NFC
llvm-svn: 273800
2016-06-26 04:55:35 +00:00
Sanjoy Das 61c76e3b89 [RSForGC] Bring computeLiveInValues up to code; NFC
llvm-svn: 273799
2016-06-26 04:55:32 +00:00
Sanjoy Das 83186b067d [RSForGC] Bring computeLiveOutSeed up to code; NFC
llvm-svn: 273798
2016-06-26 04:55:30 +00:00
Sanjoy Das b2df57af65 [RSForGC] Bring computeLiveInValues up to code; NFC
llvm-svn: 273797
2016-06-26 04:55:26 +00:00
Sanjoy Das 255532f629 [RSForGC] Bring recomputeLiveInValues up to code; NFC
llvm-svn: 273796
2016-06-26 04:55:23 +00:00
Sanjoy Das 73c7f26035 [RSForGC] Bring containsGCPtrType, isGCPointerType up to code; NFC
llvm-svn: 273795
2016-06-26 04:55:19 +00:00
Sanjoy Das 1e7eeb4bf0 [RSForGC] Bring analyzeParsePointLiveness up to code; NFC
llvm-svn: 273794
2016-06-26 04:55:17 +00:00
Sanjoy Das 6cf88091b3 [RSForGC] Bring meetBDVStateImpl up to code; NFC
llvm-svn: 273793
2016-06-26 04:55:13 +00:00
Sanjoy Das bd43d0e2d0 [RSForGC] Get rid of the unnecessary MeetBDVStates struct; NFC
All of its implementation is in just one function.

llvm-svn: 273792
2016-06-26 04:55:10 +00:00
Sanjoy Das 90547f1d20 [RSForGC] Bring findBasePointer up to code; NFC
Name-casing and minor style changes to bring the function up to the LLVM
coding style.

llvm-svn: 273791
2016-06-26 04:55:05 +00:00
David Majnemer 9f506259c8 Just a small cleanup
No functional change is intended

llvm-svn: 273780
2016-06-25 08:34:38 +00:00
David Majnemer e14e7bc4b8 Revert "[SimplifyCFG] Stop inserting calls to llvm.trap for UB"
This reverts commit r273778, it seems to break UBSan :/

llvm-svn: 273779
2016-06-25 08:19:55 +00:00
David Majnemer d346a37737 [SimplifyCFG] Stop inserting calls to llvm.trap for UB
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.

While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.

There are much better tools than llvm.trap: UBSan and ASan.

N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...

llvm-svn: 273778
2016-06-25 08:04:19 +00:00
David Majnemer 1fea77c6fc [SimplifyCFG] Replace calls to null/undef with unreachable
Calling null is undefined behavior, a call to undef can be trivially
treated as a call to null.

llvm-svn: 273776
2016-06-25 07:37:27 +00:00
Sanjoy Das d850068282 [LoopUnswitch] Avoid exponential behavior
Summary: (No semantic change intended).

Reviewers: majnemer, bogner, mzolotukhin

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21707

llvm-svn: 273763
2016-06-25 01:14:19 +00:00
Michael Kuperstein f4c56e97df It isn't meaningful for a transform to preserve another transform. NFC.
llvm-svn: 273761
2016-06-25 00:47:21 +00:00
Peter Collingbourne 0312f614b1 IR: Introduce llvm.type.checked.load intrinsic.
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.

This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.

Differential Revision: http://reviews.llvm.org/D21121

llvm-svn: 273756
2016-06-25 00:23:04 +00:00
David Majnemer b8da3a2bb2 Reinstate r273711
r273711 was reverted by r273743.  The inliner needs to know about any
call sites in the inlined function.  These were obscured if we replaced
a call to undef with an undef but kept the call around.

This fixes PR28298.

llvm-svn: 273753
2016-06-25 00:04:10 +00:00
David Majnemer 580e754348 Silence a -Wsign-compare warning
llvm-svn: 273752
2016-06-25 00:04:06 +00:00
Michael Kuperstein 83b753d430 [PM] Port float2int to the new pass manager
Differential Revision: http://reviews.llvm.org/D21704

llvm-svn: 273747
2016-06-24 23:32:02 +00:00
Dehao Chen c66a06ad0e Hookup ProfileSummary with SampleProfilerLoader
Summary: Set ProfileSummary in SampleProfilerLoader.

Reviewers: davidxl, eraman

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21702

llvm-svn: 273745
2016-06-24 22:57:06 +00:00
Nico Weber ae2ef4ccd4 Revert r273711, it caused PR28298.
llvm-svn: 273743
2016-06-24 22:52:39 +00:00
Peter Collingbourne 995d6cc8f9 Fix unused variable warning in -asserts builds.
llvm-svn: 273737
2016-06-24 21:37:11 +00:00
Sanjoy Das 91e6ba6399 [IndVarSimplify] Run clang-format over some oddly formatted bits
NFC (whitespace only change)

llvm-svn: 273732
2016-06-24 21:23:32 +00:00
Peter Collingbourne 7efd750607 IR: New representation for CFI and virtual call optimization pass metadata.
The bitset metadata currently used in LLVM has a few problems:

1. It has the wrong name. The name "bitset" refers to an implementation
   detail of one use of the metadata (i.e. its original use case, CFI).
   This makes it harder to understand, as the name makes no sense in the
   context of virtual call optimization.

2. It is represented using a global named metadata node, rather than
   being directly associated with a global. This makes it harder to
   manipulate the metadata when rebuilding global variables, summarise it
   as part of ThinLTO and drop unused metadata when associated globals are
   dropped. For this reason, CFI does not currently work correctly when
   both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
   globals, and fails to associate metadata with the rebuilt globals. As I
   understand it, the same problem could also affect ASan, which rebuilds
   globals with a red zone.

This patch solves both of those problems in the following way:

1. Rename the metadata to "type metadata". This new name reflects how
   the metadata is currently being used (i.e. to represent type information
   for CFI and vtable opt). The new name is reflected in the name for the
   associated intrinsic (llvm.type.test) and pass (LowerTypeTests).

2. Attach metadata directly to the globals that it pertains to, rather
   than using the "llvm.bitsets" global metadata node as we are doing now.
   This is done using the newly introduced capability to attach
   metadata to global variables (r271348 and r271358).

See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html

Differential Revision: http://reviews.llvm.org/D21053

llvm-svn: 273729
2016-06-24 21:21:32 +00:00
George Burgess IV fd1f2f8561 [MemorySSA] Move code around a bit. NFC.
This patch moves MSSA's caching walker into MemorySSA, and moves the
actual definition of MSSA's caching walker out of MemorySSA.h. This is
done in preparation for the new walker, which should be out for review
soonish.

Also, this patch removes a field from UpwardsMemoryQuery and has a few
lines of diff from clang-format'ing MemorySSA.cpp.

llvm-svn: 273723
2016-06-24 21:02:12 +00:00
Sanjay Patel 2cbe679774 [InstCombine] use m_APInt; NFCI
llvm-svn: 273715
2016-06-24 20:36:34 +00:00
David Majnemer 3b3e954ea2 SimplifyInstruction does not imply DCE
We cannot remove an instruction with no uses just because
SimplifyInstruction succeeds.  It may have side effects.

llvm-svn: 273711
2016-06-24 19:34:46 +00:00
Sanjay Patel 4e8ebce196 [InstCombine] refactor optional bitcasting in matchSelectFromAndOr() into one code path (NFCI)
Tests to verify that the commuted variants are all exercised were added with:
http://reviews.llvm.org/rL273702

llvm-svn: 273706
2016-06-24 18:55:27 +00:00
Reid Kleckner fbd5eef691 Revert "InstCombine rule to fold trunc when value available"
This reverts commit r273608.

Broke building code with sanitizers, where apparently these kinds of
loads, casts, and truncations are common:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502
http://crbug.com/623099

llvm-svn: 273703
2016-06-24 18:42:58 +00:00
Sanjay Patel f8b08f7179 [InstCombine] consolidate commutation variants of matchSelectFromAndOr() in one place; NFCI
By putting all the possible commutations together, we simplify the code.
Note that this is NFCI, but I'm adding tests that actually exercise each
commutation pattern because we don't have this anywhere else.

llvm-svn: 273702
2016-06-24 18:26:02 +00:00
Matthew Simpson e794678404 [LV] Preserve order of dependences in interleaved accesses analysis
The interleaved access analysis currently assumes that the inserted run-time
pointer aliasing checks ensure the absence of dependences that would prevent
its instruction reordering. However, this is not the case.

Issues can arise from how code generation is performed for interleaved groups.
For a load group, all loads in the group are essentially moved to the location
of the first load in program order, and for a store group, all stores in the
group are moved to the location of the last store. For groups having members
involved in a dependence relation with any other instruction in the loop, this
reordering can violate the dependence.

This patch teaches the interleaved access analysis how to avoid breaking such
dependences, and should fix PR27626.

An assumption of the original analysis was that the accesses had been collected
in "program order". The analysis was then simplified by visiting the accesses
bottom-up. However, this ordering was never guaranteed for anything other than
single basic block loops. Thus, this patch also enforces the desired ordering.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27626
Differential Revision: http://reviews.llvm.org/D19984

llvm-svn: 273687
2016-06-24 15:33:25 +00:00
Anna Thomas 671513553c [LICM] Avoid repeating expensive call while promoting loads. NFC
Summary:
We can avoid repeating the check `isGuaranteedToExecute` when it's already called once while checking if the alignment can be widened for the load/store being hoisted.

The function is invariant for the same instruction `UI` in `isGuaranteedToExecute(*UI, DT, CurLoop, SafetyInfo);`

Reviewers: hfinkel, eli.friedman

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21672

llvm-svn: 273671
2016-06-24 12:38:45 +00:00
David Majnemer d770877328 Switch more loops to be range-based
This makes the code a little more concise, no functional change is
intended.

llvm-svn: 273644
2016-06-24 04:05:21 +00:00
Chuang-Yu Cheng 68f7f1cf00 Teaching SimplifyCFG to recognize the Or-Mask trick that InstCombine uses to
reduce the number of comparisons.

Specifically, InstCombine can turn:
  (i == 5334 || i == 5335)
into:
  ((i | 1) == 5335)

SimplifyCFG was already able to detect the pattern:
  (i == 5334 || i == 5335)
to:
  ((i & -2) == 5334)

This patch supersedes D21315 and resolves PR27555
(https://llvm.org/bugs/show_bug.cgi?id=27555).

Thanks to David and Chandler for the suggestions!

Author: Thomas Jablin (tjablin)
Reviewers: majnemer chandlerc halfdan cycheng

http://reviews.llvm.org/D21397

llvm-svn: 273639
2016-06-24 01:59:00 +00:00
Anna Thomas 31a0b2088f InstCombine rule to fold trunc when value available
Summary:
This instcombine rule folds away trunc operations that have value available from a prior load or store.
This kind of code can be generated as a result of GVN widening the load or from source code as well.

Reviewers: reames, majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21246

llvm-svn: 273608
2016-06-23 20:22:22 +00:00
Sanjoy Das 81c00fe022 [IRCE] Use getTerminator instead of rbegin; NFC
llvm-svn: 273586
2016-06-23 18:03:26 +00:00
Hal Finkel a1271036c5 Allow DeadStoreElimination to track combinations of partial later wrties
DeadStoreElimination can currently remove a small store rendered unnecessary by
a later larger one, but could not remove a larger store rendered unnecessary by
a series of later smaller ones. This adds that capability.

It works by keeping a map, which is used as an effective interval map, for each
store later overwritten only partially, and filling in that interval map as
more such stores are discovered. No additional walking or aliasing queries are
used. In the map forms an interval covering the the entire earlier store, then
it is dead and can be removed. The map is used as an interval map by storing a
mapping between the ending offset and the beginning offset of each interval.

I discovered this problem when investigating a performance issue with code like
this on PowerPC:

  #include <complex>
  using namespace std;

  complex<float> bar(complex<float> C);
  complex<float> foo(complex<float> C) {
    return bar(C)*C;
  }

which produces this:

  define void @_Z4testSt7complexIfE(%"struct.std::complex"* noalias nocapture sret %agg.result, i64 %c.coerce) {
  entry:
    %ref.tmp = alloca i64, align 8
    %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"*
    %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32
    %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32
    %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float
    %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32
    %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float
    call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce)
    %2 = bitcast %"struct.std::complex"* %agg.result to i64*
    %3 = load i64, i64* %ref.tmp, align 8
    store i64 %3, i64* %2, align 4 ; <--- ***** THIS SHOULD NOT BE HERE ****
    %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0
    %4 = lshr i64 %3, 32
    %5 = trunc i64 %4 to i32
    %6 = bitcast i32 %5 to float
    %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1
    %7 = trunc i64 %3 to i32
    %8 = bitcast i32 %7 to float
    %mul_ad.i.i = fmul fast float %6, %1
    %mul_bc.i.i = fmul fast float %8, %0
    %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i
    %mul_ac.i.i = fmul fast float %6, %0
    %mul_bd.i.i = fmul fast float %8, %1
    %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i
    store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4
    store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4
    ret void
  }

the problem here is not just that the i64 store is unnecessary, but also that
it blocks further backend optimizations of the other uses of that i64 value in
the backend.

In the future, we might want to add a special case for handling smaller
accesses (e.g. using a bit vector) if the map mechanism turns out to be
noticeably inefficient. A sorted vector is also a possible replacement for the
map for small numbers of tracked intervals.

Differential Revision: http://reviews.llvm.org/D18586

llvm-svn: 273559
2016-06-23 13:46:39 +00:00
Eric Christopher d3d9cbf127 Fix unused variable warning by folding the temporary into the debug statement.
llvm-svn: 273523
2016-06-23 00:42:00 +00:00
David Majnemer d1fbf48566 [SCCP] Don't assume all Constants are ConstantInt
This fixes PR28269.

llvm-svn: 273521
2016-06-23 00:14:29 +00:00
Sanjoy Das 5dae789a16 [RS4GC] Use StringRef; NFC
Spotted during random inspection.

llvm-svn: 273512
2016-06-22 23:32:46 +00:00
Peter Collingbourne 6d88fde3af IR: Introduce Module::global_objects().
This is a convenience iterator that allows clients to enumerate the
GlobalObjects within a Module.

Also start using it in a few places where it is obviously the right thing
to use.

Differential Revision: http://reviews.llvm.org/D21580

llvm-svn: 273470
2016-06-22 20:29:42 +00:00
Vedant Kumar f5ac6d49e4 [asan] Do not instrument accesses to profiling globals
It's only useful to asan-itize profiling globals while debugging llvm's
profiling instrumentation passes. Enabling asan along with instrprof or
gcov instrumentation shouldn't incur extra overhead.

This patch is in the same spirit as r264805 and r273202, which disabled
tsan instrumentation of instrprof/gcov globals.

Differential Revision: http://reviews.llvm.org/D21541

llvm-svn: 273444
2016-06-22 17:30:58 +00:00
Rafael Espindola 2b7fef681f Delete more dead code.
Found by gcc 6.

llvm-svn: 273402
2016-06-22 12:44:16 +00:00
Anna Zaks 644d9d3a44 [asan] Do not instrument pointers with address space attributes
Do not instrument pointers with address space attributes since we cannot track
them anyway. Instrumenting them results in false positives in ASan and a
compiler crash in TSan. (The compiler should not crash in any case, but that's
a different problem.)

llvm-svn: 273339
2016-06-22 00:15:52 +00:00
Rafael Espindola 48975881ab Delete some dead code.
Found by gcc 6.

llvm-svn: 273303
2016-06-21 19:48:12 +00:00
Easwaran Raman 8bceb9d210 Fix PR28219: Use profile summary from reader and not compute it
Differentiaal revision: http://reviews.llvm.org/D21546

llvm-svn: 273301
2016-06-21 19:29:49 +00:00
Daniel Berlin 1430026142 Add MemoryAccess creation and PHI creation APIs to MemorySSA
Reviewers: george.burgess.iv, gberry, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21463

llvm-svn: 273295
2016-06-21 18:39:20 +00:00
Etienne Bergeron 70684f9422 This is part of the effort for asan to support Windows 64 bit.
The large offset is being tested on Windows 10 (which has larger usable
virtual address space than Windows 8 or earlier)

Patch by:  Wei Wang
Differential Revision: http://reviews.llvm.org/D21523

llvm-svn: 273269
2016-06-21 15:07:29 +00:00
Elena Demikhovsky a266cf0518 reverted the prev commit due to assertion failure
llvm-svn: 273258
2016-06-21 12:10:11 +00:00
Elena Demikhovsky 9823c995bc Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)
  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Differential revision: http://reviews.llvm.org/D20789

llvm-svn: 273257
2016-06-21 11:32:01 +00:00
David Majnemer e61e4bfd87 Replace silly uses of 'signed' with 'int'
llvm-svn: 273244
2016-06-21 05:10:24 +00:00
Xinliang David Li 69a00f06b0 clang format change /NFC
llvm-svn: 273233
2016-06-21 02:39:08 +00:00
Vedant Kumar 0222adbcd2 [tsan] Do not instrument accesses to the gcov counters array
There is a known intended race here. This is a follow-up to r264805,
which disabled tsan instrumentation for updates to instrprof counters.
For more background on this please see the discussion in D18164.

llvm-svn: 273202
2016-06-20 21:24:26 +00:00
Sanjay Patel 9ad8fb68f7 [InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869)
By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question
raised by PR27869:
https://llvm.org/bugs/show_bug.cgi?id=27869
...where InstCombine turns an icmp+zext into a shift causing us to miss the fold.

Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp.

Differential Revision: http://reviews.llvm.org/D21512

llvm-svn: 273200
2016-06-20 20:59:59 +00:00
Dehao Chen 071bb9d7af Pass AssumptionCacheTracker from SampleProfileLoader to Inliner
Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader

Reviewers: davidxl

Subscribers: eraman, vsk, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D21205

llvm-svn: 273199
2016-06-20 20:53:40 +00:00
Daniel Berlin ada263dcd0 Rename to be consistent with other type names. NFC
llvm-svn: 273194
2016-06-20 20:21:33 +00:00
Matt Arsenault 802ebcb4bb InstCombine: Don't strip convergent from intrinsic callsites
Specific instances of intrinsic calls may want to be convergent, such
as certain register reads but the intrinsic declaration is not.

llvm-svn: 273188
2016-06-20 19:04:44 +00:00
David Majnemer 41ff4fdcd4 Forgot to update callers of deleteDeadInstruction
llvm-svn: 273163
2016-06-20 16:07:38 +00:00
David Majnemer c5601df9fd Reapply "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273160, reapplying r273132.
RecursivelyDeleteTriviallyDeadInstructions cannot be called on a
parentless Instruction.

llvm-svn: 273162
2016-06-20 16:03:25 +00:00
Cong Liu 1c28b6d733 Revert "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273132.
Breaks multiple test under /llvm/test:Transforms (e.g.
llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan.

llvm-svn: 273160
2016-06-20 15:22:15 +00:00
Patrik Hagglund 4e0bd84b35 Fix formatting of r273144. NFC.
llvm-svn: 273149
2016-06-20 11:19:58 +00:00
Patrik Hagglund a83706e354 Avoid output indeterminism between GCC and Clang builds.
Remove dependency of the evalution order of function arguments, which
is unspecified.

The following test previously failed when built with GCC (but succeded
when built with Clang):

  ; RUN: opt -sroa -S < %s | FileCheck %s

  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  %A = type {i16}

  @a = global %A* null
  @b = global i16 0

  ; CHECK-LABEL: @f1(
  ; CHECK: alloca %A
  ; CHECK-NEXT: extractvalue %A
  ; CHECK-NEXT: getelementptr inbounds %A

  define void @f1 (%A %a) {
    %1 = alloca %A
    store %A %a, %A* %1
    %2 = load i16, i16* @b
    %3 = icmp ne i16 %2, 0
    br i1 %3, label %bb1, label %bb2
  bb1:
    store %A* %1, %A** @a
    br label %bb2
  bb2:
    ret void
  }

Patch by David Stenberg.

Differential Revision: http://reviews.llvm.org/D21226

llvm-svn: 273144
2016-06-20 10:19:00 +00:00
Patrik Hagglund 7205215591 Fix for PR27940
After a store has been eliminated, when making sure that the
instruction iterator points to a valid instruction, dbg intrinsics are
now ignored as a new instruction.

Patch by Henric Karlsson.

Reviewed by Daniel Berlin.

Differential Revision: http://reviews.llvm.org/D21076

llvm-svn: 273141
2016-06-20 09:10:10 +00:00
David Majnemer a705843f23 [LoopIdiom] Don't remove dead operands manually
Removing dead instructions requires remembering which operands have
already been removed.  RecursivelyDeleteTriviallyDeadInstructions has
this logic, don't partially reimplement it in LoopIdiomRecognize.

This fixes PR28196.

llvm-svn: 273132
2016-06-20 02:33:29 +00:00
David Majnemer 3ffe2dd4d2 Address Eli's post-commit comments
Use an APInt to handle pointers of arbitrary width, let
accumulateConstantOffset handle overflow issues.

llvm-svn: 273126
2016-06-19 21:36:35 +00:00
Sanjay Patel f8ee0e0218 fix formatting, typo; NFC
llvm-svn: 273118
2016-06-19 17:20:27 +00:00
David Majnemer 3119599475 [LoadCombine] Combine Loads formed from GEPS with negative indexes
Change the underlying offset and comparisons to use int64_t instead of
uint64_t.

Patch by River Riddle!

Differential Revision: http://reviews.llvm.org/D21499

llvm-svn: 273105
2016-06-19 06:14:56 +00:00
Marcin Koscielnicki 3feda222c6 [sanitizers] Disable target-specific lowering of string functions.
CodeGen has hooks that allow targets to emit specialized code instead
of calls to memcmp, memchr, strcpy, stpcpy, strcmp, strlen, strnlen.
When ASan/MSan/TSan/ESan is in use, this sidesteps its interceptors, resulting
in uninstrumented memory accesses.  To avoid that, make these sanitizers
mark the calls as nobuiltin.

Differential Revision: http://reviews.llvm.org/D19781

llvm-svn: 273083
2016-06-18 10:10:37 +00:00
Matt Arsenault 8fd5978811 Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."""
This seems to be causing an infinite loop / crash in instcombine
on some bots.

llvm-svn: 273069
2016-06-17 23:36:38 +00:00
Adam Nemet a9f09c6245 [LAA] Enable symbolic stride speculation for all LAA clients
This is a functional change for LLE and LDist.  The other clients (LV,
LVerLICM) already had this explicitly enabled.

The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides.  This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter.  This makes the
interface more friendly to the new Pass Manager.

The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.

llvm-svn: 273064
2016-06-17 22:35:41 +00:00
Benjamin Kramer 1afc1de406 Apply another batch of fixes from clang-tidy's performance-unnecessary-value-param.
Contains some manual fixes. No functionality change intended.

llvm-svn: 273047
2016-06-17 20:41:14 +00:00
Matt Arsenault d76efc14b9 Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""
Reapply r272987. Condition should be in terms of the destination type,
and the flags should not be copied.

llvm-svn: 273045
2016-06-17 20:33:53 +00:00
Davide Italiano b49aa5c0c4 [PM] Port MergedLoadStoreMotion to the new pass manager, take two.
This is indeed a much cleaner approach (thanks to Daniel Berlin
for pointing out), and also David/Sean for review.

Differential Revision:  http://reviews.llvm.org/D21454

llvm-svn: 273032
2016-06-17 19:10:09 +00:00
Benjamin Kramer 4dea8f542b Avoid duplicated map lookups. No functionality change intended.
llvm-svn: 273030
2016-06-17 18:59:41 +00:00
Justin Bogner 78eebe7756 LoopSimplifyCFG: Prefer `const auto &` to `auto &`, for clarity. NFC
llvm-svn: 273023
2016-06-17 17:59:48 +00:00
Sanjay Patel 216d8cf720 [InstCombine] allow more than one use for vector bitcast folding with selects
The motivating example for this transform is similar to D20774 where bitcasts interfere
with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to 
produce min and max ops:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %bc1 = bitcast <4 x float> %a to <4 x i32>
  %bc2 = bitcast <4 x float> %b to <4 x i32>
  %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
  %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
  %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
  store <4 x i32> %sel1, <4 x i32>* %bc3
  %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
  store <4 x i32> %sel2, <4 x i32>* %bc4
  ret void
}

With this patch, we move the selects up to use the input args which allows getting rid of
all of the bitcasts:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
  %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
  store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
  store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
  ret void
}

The asm for x86 SSE then improves from:

movaps  %xmm0, %xmm2
cmpltps %xmm1, %xmm2
movaps  %xmm2, %xmm3
andnps  %xmm1, %xmm3
movaps  %xmm2, %xmm4
andnps  %xmm0, %xmm4
andps %xmm2, %xmm0
orps  %xmm3, %xmm0
andps %xmm1, %xmm2
orps  %xmm4, %xmm2
movaps  %xmm0, (%rdi)
movaps  %xmm2, (%rsi)

To:

movaps  %xmm0, %xmm2
minps %xmm1, %xmm2
maxps %xmm0, %xmm1
movaps  %xmm2, (%rdi)
movaps  %xmm1, (%rsi)

The TODO comments show that we're limiting this transform only to vectors and only to bitcasts
because we need to improve other transforms or risk creating worse codegen.

Differential Revision: http://reviews.llvm.org/D21190

llvm-svn: 273011
2016-06-17 16:46:50 +00:00
Matt Arsenault ce56f7bbaa Revert "InstCombine: Reduce trunc (shl x, K) width."
This reverts commit r272987.

This might be causing crashes on some bots.

llvm-svn: 272990
2016-06-17 06:28:53 +00:00
Qin Zhao bb4496f8c8 [esan|cfrag] Add the struct field size array in StructInfo
Summary:
Adds the struct field size array in struct StructInfo.

Updates test struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits

Differential Revision: http://reviews.llvm.org/D21341

llvm-svn: 272989
2016-06-17 04:50:20 +00:00
Matt Arsenault 028fd50642 InstCombine: Reduce trunc (shl x, K) width.
llvm-svn: 272987
2016-06-17 04:43:22 +00:00
Sanjoy Das a324487493 [RS4GC] Pass CallSite by value instead of const ref; NFC
That's the idiomatic LLVM pattern.

llvm-svn: 272981
2016-06-17 00:45:00 +00:00
Chandler Carruth 164a2aa6f4 [PM] Remove support for omitting the AnalysisManager argument to new
pass manager passes' `run` methods.

This removes a bunch of SFINAE goop from the pass manager and just
requires pass authors to accept `AnalysisManager<IRUnitT> &` as a dead
argument. This is a small price to pay for the simplicity of the system
as a whole, despite the noise that changing it causes at this stage.

This will also helpfull allow us to make the signature of the run
methods much more flexible for different kinds af passes to support
things like intelligently updating the pass's progression over IR units.

While this touches many, many, files, the changes are really boring.
Mostly made with the help of my trusty perl one liners.

Thanks to Sean and Hal for bouncing ideas for this with me in IRC.

llvm-svn: 272978
2016-06-17 00:11:01 +00:00
Chuang-Yu Cheng 5078f94690 Use m_APInt in SimplifyCFG
Switch from m_Constant to m_APInt per David's request. NFC.

Author: Thomas Jablin (tjablin)
Reviewers: majnemer cycheng

http://reviews.llvm.org/D21440

llvm-svn: 272977
2016-06-17 00:04:39 +00:00
Adam Nemet c953bb9953 [LV] Move management of symbolic strides to LAA. NFCI
This is still NFCI, so the list of clients that allow symbolic stride
speculation does not change (yes: LV and LoopVersioningLICM, no: LLE,
LDist).  However since the symbolic strides are now managed by LAA
rather than passed by client a new bool parameter is used to enable
symbolic stride speculation.

The existing test Transforms/LoopVectorize/version-mem-access.ll checks
that stride speculation is performed for LV.

The previously added test Transforms/LoopLoadElim/symbolic-stride.ll
ensures that no speculation is performed for LLE.

The next patch will change the functionality and turn on symbolic stride
speculation in all of LAA's clients and remove the bool parameter.

llvm-svn: 272970
2016-06-16 22:57:55 +00:00
Evgeniy Stepanov 72d961a1da [safestack] Fixup llvm.dbg.value when rewriting unsafe allocas.
When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are
updated to refer to the new location.

This change does the same to dbg.value intrinsics.

llvm-svn: 272968
2016-06-16 22:34:00 +00:00
Adam Nemet 886e0617a2 [LV] Make getSymbolicStrides return a pointer rather than a reference. NFC
Turns out SymbolicStrides is actually used in canVectorizeWithIfConvert
before it gets set up in canVectorizeMemory.

This works fine as long as SymbolicStrides resides in LV since we just
have an empty map.  Based on this the conclusion is made that there are
no symbolic strides which is conservatively correct.

However once SymbolicStrides becomes part of LAI, LAI is nullptr at this
point so we need to differentiate the uninitialized state by returning a
nullptr for SymbolicStrides.

llvm-svn: 272966
2016-06-16 21:55:10 +00:00
Sanjoy Das 1ab2fad363 [EarlyCSE] Minor cosmetic NFC changes
- Avoid implicit conversion from pointer to bool
 - Add a comment when passing in a boolean value

llvm-svn: 272955
2016-06-16 21:00:57 +00:00
Sanjoy Das 07c6521aed [EarlyCSE] Fold invariant loads
Redundant invariant loads can be CSE'ed with very little extra effort
over what early-cse already tracks, so it looks reasonable to make
early-cse handle this case.

llvm-svn: 272954
2016-06-16 20:47:57 +00:00
Davide Italiano 41315f7873 [PM] Revert the port of MergeLoadStoreMotion to the new pass manager.
Daniel Berlin expressed some real concerns about the port and proposed
and alternative approach. I'll revert this for now while working on a
new patch, which I hope to put up for review shortly. Sorry for the churn.

llvm-svn: 272925
2016-06-16 17:40:53 +00:00
Chad Rosier 624fee55bc [DSE] Minor style cleanup. NFC.
llvm-svn: 272922
2016-06-16 17:06:04 +00:00
Igor Laevsky 87f0d0e185 Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo"
It was causing failures in Profile-i386 and Profile-x86_64 tests.

llvm-svn: 272912
2016-06-16 16:25:53 +00:00
Igor Laevsky c9179fd2c2 [JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo
We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise 
we will get dangling pointer inside BranchProbabilityInfo cache.

Differential Revision: http://reviews.llvm.org/D20957

llvm-svn: 272891
2016-06-16 13:28:25 +00:00
Patrik Hagglund 0acaefaf9d PR27938: Don't remove valid DebugLoc in Scalarizer
Added checks to make sure the Scalarizer::transferMetadata() don't
remove valid debug locations from instructions. This is important as
the verifier pass require that e.g. inlinable callsites have a valid
debug location.

https://llvm.org/bugs/show_bug.cgi?id=27938

Patch by Karl-Johan Karlsson

Reviewers: dblaikie

Differential Revision: http://reviews.llvm.org/D20807

llvm-svn: 272884
2016-06-16 10:48:54 +00:00
Adam Nemet bdbc5227ce [LAA] Default getInfo to not speculate symbolic strides. NFC
Soon we won't be passing Strides to getInfo and then we'll have fewer
call sites to update.

llvm-svn: 272878
2016-06-16 08:26:56 +00:00
Sean Silva a4cfb620df Attempt to define friend function more portably.
Patch written by Reid. I verified it locally with clang.

llvm-svn: 272875
2016-06-16 07:00:19 +00:00
Chuang-Yu Cheng dbe00d51b4 SimplifyCFG is able to detect the pattern:
(i == 5334 || i == 5335)
to:
    ((i & -2) == 5334)

This transformation has some incorrect side conditions. Specifically, the
transformation is only applied when the right-hand side constant (5334 in
the example) is a power of two not equal and not equal to the negated mask.
These side conditions were added in r258904 to fix PR26323. The correct side
condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334].

It's a little bit hard to see why these transformations are correct and what
the side conditions ought to be. Here is a CVC3 program to verify them for
64-bit values:
    ONE  : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63);
    x    : BITVECTOR(64);
    y    : BITVECTOR(64);
    z    : BITVECTOR(64);
    mask : BITVECTOR(64) = BVSHL(ONE, z);
    QUERY( (y & ~mask = y) =>
           ((x & ~mask = y) <=> (x = y OR x = (y |  mask)))
    );

Please note that each pattern must be a dual implication (<--> or iff). One
directional implication can create spurious matches. If the implication is
only one-way, an unsatisfiable condition on the left side can imply a
satisfiable condition on the right side. Dual implication ensures that
satisfiable conditions are transformed to other satisfiable conditions and
unsatisfiable conditions are transformed to other unsatisfiable conditions.

Here is a concrete example of a unsatisfiable condition on the left
implying a satisfiable condition on the right:
    mask = (1 << z)
    (x & ~mask) == y --> (x == y || x == (y | mask))

Substituting y = 3, z = 0 yields:
    (x & -2) == 3 --> (x == 3 || x == 2)

The version of this code before r258904 had no side-conditions and
incorrectly justified itself in comments through one-directional
implication.

Thanks to Chandler for the suggestion!

Author: Thomas Jablin (tjablin)
Reviewers: chandlerc majnemer hfinkel cycheng

http://reviews.llvm.org/D21417

llvm-svn: 272873
2016-06-16 04:44:25 +00:00
Eli Friedman bd254a6f45 [InstCombine] Don't widen metadata on store-to-load forwarding
The original check for load CSE or store-to-load forwarding is wrong
when the forwarded stored value happened to be a load.

Ref https://github.com/JuliaLang/julia/issues/16894

Differential Revision: http://reviews.llvm.org/D21271

Patch by Yichao Yu!

llvm-svn: 272868
2016-06-16 02:33:42 +00:00
Justin Lebar c05f3c9942 [IR] [DAE] Copy comdats during DAE, and don't copy comdats in GlobalObject::copyAttributesFrom.
Summary: This reverts the changes to Globals.cpp and IRMover.cpp in
"[IR] Copy comdats in GlobalObject::copyAttributesFrom" (D20631,
rL270743).

The DeadArgElim test is left unchanged, and we change DAE to explicitly
copy comdats.

The reverted change breaks copyAttributesFrom when the destination lives
in a different module from the source.  The decision in D21255 was to
revert this patch and handle comdat copying separately from
copyAttributesFrom.

Reviewers: majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21403

llvm-svn: 272855
2016-06-15 23:20:15 +00:00
Adam Nemet 76a41d3a25 [LV] Make the new getter return a const reference. NFC
LoopVectorizationLegality holds a constant reference to LAI, so this
will have to be const as well.

Also added missed function comment.

llvm-svn: 272851
2016-06-15 22:58:27 +00:00
Xinliang David Li 1e16d61f1f Address review feedbacks of AddDiscriminator change
llvm-svn: 272850
2016-06-15 22:20:56 +00:00
Chad Rosier 72a793c5b1 [DSE] Hoist a redundant check to simplify logic. NFC.
llvm-svn: 272849
2016-06-15 22:17:38 +00:00
Xinliang David Li 1eaecefaf9 [PM] Port Add discriminator pass to new PM
llvm-svn: 272847
2016-06-15 21:51:30 +00:00
Chad Rosier 844e2df94b Typo. NFC.
llvm-svn: 272846
2016-06-15 21:41:22 +00:00
Davide Italiano 63af1aa0c2 [PM] Remove unneded doFinalization() override from LoopVersioningLICM.
llvm-svn: 272842
2016-06-15 21:23:54 +00:00
Davide Italiano 9d305d707e [LoopSimplify] Analyses do not need to be member variables.
In preparation for porting this pass to the new PM.

llvm-svn: 272818
2016-06-15 18:51:25 +00:00
Adam Nemet 82b9d2a72c [LV] Add getter function for LoopVectorizationLegality::Strides. NFC
This should help moving Strides to LAA later.

llvm-svn: 272796
2016-06-15 15:49:46 +00:00
Adam Nemet 927b54e48a [LV] Remove more unused functions. NFC
LoopVectorizationLegality::strides_begin/end are also unused.

llvm-svn: 272781
2016-06-15 12:26:15 +00:00
Adam Nemet b1973be8e2 [LV] Remove unused function. NFC
LoopVectorizationLegality::mustCheckStrides is unused.

llvm-svn: 272780
2016-06-15 12:26:11 +00:00
Sean Silva 7eeda20c72 Work around MSVC "friend" semantics.
The error on clang-x86-win2008-selfhost is:

C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(955) : error C2248: 'llvm::slpvectorizer::BoUpSLP::ScheduleData' : cannot access private struct declared in class 'llvm::slpvectorizer::BoUpSLP'
        C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(608) : see declaration of 'llvm::slpvectorizer::BoUpSLP::ScheduleData'
        C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(337) : see declaration of 'llvm::slpvectorizer::BoUpSLP'

I reproduced this locally with both MSVC 2013 and MSVC 2015.

llvm-svn: 272772
2016-06-15 10:51:40 +00:00
Sean Silva ec3ed2097b Speculative buildbot fix.
This wasn't failing for me with clang as the compiler. I think GCC may
disagree with clang about whether a friend declaration introduces a
declaration in the enclosing namespace (or something).

Example error:

/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:950:77: error: ‘llvm::raw_ostream& llvm::slpvectorizer::operator<<(llvm::raw_ostream&, const llvm::slpvectorizer::BoUpSLP::ScheduleData&)’ should have been declared inside ‘llvm::slpvectorizer’
                                              const BoUpSLP::ScheduleData &SD) {
                                                                             ^

llvm-svn: 272767
2016-06-15 09:00:33 +00:00
Sean Silva e0a9e66040 [PM] Port SLPVectorizer to the new PM
This uses the "runImpl" approach to share code with the old PM.

Porting to the new PM meant abandoning the anonymous namespace enclosing
most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal
compared to having to pull the pass class into a header which the new PM
requires since it calls the constructor directly).

llvm-svn: 272766
2016-06-15 08:43:40 +00:00
Sean Silva a4c2d150d0 [PM] Port AlignmentFromAssumptions to the new PM.
This uses the "runImpl" pattern to share code between the old and new PM.

llvm-svn: 272757
2016-06-15 06:18:01 +00:00
Michael Kuperstein 3277a05fcf Recommit [LV] Enable vectorization of loops where the IV has an external use
r272715 broke libcxx because it did not correctly handle cases where the
last iteration of one IV is the second-to-last iteration of another.

Original commit message:
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.

llvm-svn: 272742
2016-06-15 00:35:26 +00:00
David Majnemer 4a697c312f [LoopUnroll] Don't crash trying to unroll loop with EH pad exit
We do not support splitting cleanuppad or catchswitches.  This is
problematic for passes which assume that a loop is in loop simplify
form (the loop would have a dedicated exit block instead of sharing it).

While it isn't great that we don't support this for cleanups, we still
cannot make loop-simplify form an assertable precondition because
indirectbr will also disable these sorts of CFG cleanups.

This fixes PR28132.

llvm-svn: 272739
2016-06-15 00:19:56 +00:00
David Majnemer cbf614a93b Remove the ScalarReplAggregates pass
Nearly all the changes to this pass have been done while maintaining and
updating other parts of LLVM.  LLVM has had another pass, SROA, which
has superseded ScalarReplAggregates for quite some time.

Differential Revision: http://reviews.llvm.org/D21316

llvm-svn: 272737
2016-06-15 00:19:09 +00:00
Michael Kuperstein d4bd3ab5fe Reverting r272715 since it broke libcxx.
llvm-svn: 272730
2016-06-14 22:30:41 +00:00
Davide Italiano d737dd2ec6 [PM] Port WholeProgramDevirt to the new pass manager.
llvm-svn: 272721
2016-06-14 21:44:19 +00:00
Michael Kuperstein 23b6d6adc9 [LV] Enable vectorization of loops where the IV has an external use
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.

Differential Revision: http://reviews.llvm.org/D21048

llvm-svn: 272715
2016-06-14 21:27:27 +00:00
Geoff Berry efb0dd176a [MemorySSA] Set CFGOnly correctly for MemorySSAWrapperPass
Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D21344

llvm-svn: 272712
2016-06-14 21:19:40 +00:00
Peter Collingbourne 96efdd6107 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

llvm-svn: 272709
2016-06-14 21:01:22 +00:00
Sebastian Pop dfb66a1191 LoopRotate: restructure code to simplify functions
We move the loop rotate functions in a separate class to avoid passing multiple
parameters to each function.  This cleanup will help with further development of
loop rotation.  NFC.

Patch written by Aditya Kumar and Sebastian Pop.

Differential Revision: http://reviews.llvm.org/D21311

llvm-svn: 272672
2016-06-14 14:44:05 +00:00
Chad Rosier 66a9d07a86 [MergedLoadStoreMotion] Before quering AA verify the loads are the same.
Basicaa stats show the number of queries in Spec2k6 are reduced by 4540
or ~.67% overall.

llvm-svn: 272661
2016-06-14 12:47:18 +00:00
Adam Nemet 57fb8989a5 [LoopVer] Remove an assert that's redundant now. NFC
Ensuring that the PHI are all single-operand is not performed in the
second pass added by the previous pass.  This removes the assert from
the first pass.

llvm-svn: 272650
2016-06-14 09:39:01 +00:00
Adam Nemet 73a26957fc [LoopVer] Update all existing PHIs in the exit block
We only used to add the edge from the cloned loop to PHIs that
corresponded to values defined by the loop.  We need to do this for all
PHIs obviously since we need a PHI operand for each incoming edge.

This includes things like PHIs with a constant value or with values
defined before the original loop (see the testcases).

After the patch the PHIs are added to the exit block in two passes.

In the first pass we ensure there is a single-operand (LCSSA) PHI for
each value defined by the loop.

In the second pass we loop through each (single-operand) PHI and add the
value for the edge from the cloned loop.  If the value is defined in the
loop we'll use the cloned instruction from the cloned loop.

Fixes PR28037

llvm-svn: 272649
2016-06-14 09:38:54 +00:00
Davide Italiano cccf4f01ad [PM] Port Mem2Reg to the new pass manager.
llvm-svn: 272630
2016-06-14 03:22:22 +00:00
Sean Silva 6347df0f81 [PM] Port MemCpyOpt to the new PM.
The need for all these Lookup* functions is just because of calls to
getAnalysis inside methods (i.e. not at the top level) of the
runOnFunction method. They should be straightforward to clean up when
the old PM is gone.

llvm-svn: 272615
2016-06-14 02:44:55 +00:00
Davide Italiano 3ab1b588b5 [PM/MergedLoadStoreMotion] Preserve analyses more aggressively.
llvm-svn: 272611
2016-06-14 01:23:31 +00:00
Sean Silva 46590d556a Bring back "[PM] Port JumpThreading to the new PM" with a fix
This reverts commit r272603 and adds a fix.

Big thanks to Davide for pointing me at r216244 which gives some insight
into how to fix this VS2013 issue. VS2013 can't synthesize a move
constructor. So the fix here is to add one explicitly to the
JumpThreadingPass class.

llvm-svn: 272607
2016-06-14 00:51:09 +00:00
Davide Italiano 89ab89d6cd [PM] Port MergedLoadStoreMotion to the new pass manager.
llvm-svn: 272606
2016-06-14 00:49:23 +00:00
Sean Silva 7d5a57cbfc Revert "[PM] Port JumpThreading to the new PM"
This reverts commit r272597.

Will investigate issue with VS2013 compilation and then recommit.

llvm-svn: 272603
2016-06-14 00:26:31 +00:00
Davide Italiano 86c1f953f5 [PM/MergedLoadStoreMotion] Remove unneeded pass dependency.
llvm-svn: 272598
2016-06-13 23:28:35 +00:00
Sean Silva f81328d0b4 [PM] Port JumpThreading to the new PM
This follows the approach in r263208 (for GVN) pretty closely:
- move the bulk of the body of the function to the new PM class.
- expose a runImpl method on the new-PM class that takes the IRUnitT and
  pointers/references to any analyses and use that to implement the
  old-PM class.
- use a private namespace in the header for stuff that used to be file
  scope

llvm-svn: 272597
2016-06-13 22:52:52 +00:00
Davide Italiano 44faf7f407 [PM/MergeLoadStoreMotion] Convert the logic to static functions.
Pass AliasAnalyis and MemoryDepResult around. This is in preparation
for porting this pass to the new PM.

llvm-svn: 272595
2016-06-13 22:27:30 +00:00
Sean Silva 687019facb [PM] Port LVI to the new PM.
This is a bit gnarly since LVI is maintaining its own cache.
I think this port could be somewhat cleaner, but I'd rather not spend
too much time on it while we still have the old pass hanging around and
limiting how much we can clean things up.
Once the old pass is gone it will be easier (less time spent) to clean
it up anyway.

This is the last dependency needed for porting JumpThreading which I'll
do in a follow-up commit (there's no printer pass for LVI or anything to
test it, so porting a pass that depends on it seems best).

I've been mostly following:
r269370 / D18834 which ported Dependence Analysis
r268601 / D19839 which ported BPI

llvm-svn: 272593
2016-06-13 22:01:25 +00:00
Vikram TV 299abc10e7 Fix a typo in loop versioning.
Reviewers: ashutosh.nema

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21281

llvm-svn: 272545
2016-06-13 10:49:28 +00:00
Benjamin Kramer 4ca41fd09e Run clang-tidy's performance-unnecessary-copy-initialization over LLVM.
No functionality change intended.

llvm-svn: 272516
2016-06-12 17:30:47 +00:00
Benjamin Kramer d3f4c05aea Move instances of std::function.
Or replace with llvm::function_ref if it's never stored. NFC intended.

llvm-svn: 272513
2016-06-12 16:13:55 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Sean Silva e3bb457423 [PM] Port DeadArgumentElimination to the new PM
The approach taken here follows r267631.

deadarghaX0r should be easy to port when the time comes to add new-PM
support to bugpoint.

llvm-svn: 272507
2016-06-12 09:16:39 +00:00
Sean Silva f5080194fd [PM] Port ReversePostOrderFunctionAttrs to the new PM
Below are my super rough notes when porting. They can probably serve as
a basic guide for porting other passes to the new PM. As I port more
passes I'll expand and generalize this and make a proper
docs/HowToPortToNewPassManager.rst document. There is also missing
documentation for general concepts and API's in the new PM which will
require some documentation.
Once there is proper documentation in place we can put up a list of
passes that have to be ported and game-ify/crowdsource the rest of the
porting (at least of the middle end; the backend is still unclear).

I will however be taking personal responsibility for ensuring that the
LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes
to be ported are (do something like
`git grep "<the string in the bullet point below>"` to find the pass):

General Scalar:
[ ] Simplify the CFG
[ ] Jump Threading
[ ] MemCpy Optimization
[ ] Promote Memory to Register
[ ] MergedLoadStoreMotion
[ ] Lazy Value Information Analysis

General IPO:
[ ] Dead Argument Elimination
[ ] Deduce function attributes in RPO

Loop stuff / vectorization stuff:
[ ] Alignment from assumptions
[ ] Canonicalize natural loops
[ ] Delete dead loops
[ ] Loop Access Analysis
[ ] Loop Invariant Code Motion
[ ] Loop Vectorization
[ ] SLP Vectorizer
[ ] Unroll loops

Devirtualization / CFI:
[ ] Cross-DSO CFI
[ ] Whole program devirtualization
[ ] Lower bitset metadata

CGSCC passes:
[ ] Function Integration/Inlining
[ ] Remove unused exception handling info
[ ] Promote 'by reference' arguments to scalars

Please let me know if you are interested in working on any of the passes
in the above list (e.g. reply to the post-commit thread for this patch).
I'll probably be tackling "General Scalar" and "General IPO" first FWIW.

Steps as I port "Deduce function attributes in RPO"
---------------------------------------------------

(note: if you are doing any work based on these notes, please leave a
note in the post-commit review thread for this commit with any
improvements / suggestions / incompleteness you ran into!)

Note: "Deduce function attributes in RPO" is a module pass.

1. Do preparatory refactoring.

Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503).
(TODO: give more advice here e.g. if pass holds state or something)

2. Rename the old pass class.

llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass
in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM.
(edit: actually wait what? The new class name will be
ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is
sort of useless churn).

llvm/include/llvm/InitializePasses.h
llvm/lib/LTO/LTOCodeGenerator.cpp
llvm/lib/Transforms/IPO/IPO.cpp
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass
(note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`)
Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too.

Note that createReversePostOrderFunctionAttrsPass does not need to be
renamed since its name is not generated from the class name.

3. Add the new PM pass class.

In the new PM all passes need to have their
declaration in a header somewhere, so you will often need to add a header.
In this case
llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because
PostOrderFunctionAttrsPass was already ported.
The file-level comment from the .cpp file can be used as the file-level
comment for the new header. You may want to tweak the wording slightly
from "this file implements" to "this file provides" or similar.

Add declaration for the new PM pass in this header:

    class ReversePostOrderFunctionAttrsPass
        : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> {
    public:
      PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM);
    };

Its name should end with `Pass` for consistency (note that this doesn't
collide with the names of most old PM passes). E.g. call it
`<name of the old PM pass>Pass`.

Also, move the doxygen comment from the old PM pass to the declaration of
this class in the header.
Also, include the declaration for the new PM class
`llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case,
it was already done when the other pass in this file was ported).

Now define the `run` method for the new class.
The main things here are:
a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()`

b) If the old PM pass would have returned "false" (i.e. `Changed ==
false`), then you should return PreservedAnalyses::all();

c) In the old PM getAnalysisUsage method, observe the calls
   `AU.addPreserved<...>();`.

   In the case `Changed == true`, for each preserved analysis you should do
   call `PA.preserve<...>()` on a PreservedAnalyses object and return it.
   E.g.:

       PreservedAnalyses PA;
       PA.preserve<CallGraphAnalysis>();
       return PA;

Note that calls to skipModule/skipFunction are not supported in the new PM
currently, so optnone and optimization bisect support do not work. You can
just drop those calls for now.

4. Add the pass to the new PM pass registry to make it available in opt.

In llvm/lib/Passes/PassBuilder.cpp add a #include for your header.
`#include "llvm/Transforms/IPO/FunctionAttrs.h"`
In this case there is already an include (from when
PostOrderFunctionAttrsPass was ported).

Add your pass to llvm/lib/Passes/PassRegistry.def
In this case, I added
`MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())`
The string is from the `INITIALIZE_PASS*` macros used in the old pass
manager.

Then choose a test that uses the pass and use the new PM `-passes=...` to
run it.
E.g. in this case there is a test that does:
; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s
I have added the line:
; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s
The `-aa-pipeline=basic-aa` and
`require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run
functionattrs in the new PM (note that in the new PM "functionattrs"
becomes "function-attrs" for some reason). This is just pulled from
`readattrs.ll` which contains the change from when functionattrs was ported
to the new PM.
Adding rpo-functionattrs causes the pass that was just ported to run.

llvm-svn: 272505
2016-06-12 07:48:51 +00:00
Sean Silva adc7939525 Factor out a helper. NFC
Prep for porting to new PM.

llvm-svn: 272503
2016-06-12 05:44:51 +00:00
Eli Friedman 9f8031c2da [MergedLoadStoreMotion] Use correct helper for load hoist safety.
It isn't legal to hoist a load past a call which might not return;
even if it doesn't throw, it could, for example, call exit().

Fixes http://llvm.org/PR27953.

llvm-svn: 272495
2016-06-12 02:11:20 +00:00
Craig Topper 99d1eab327 [IR] Require ArrayRef of 'uint32_t' instead of 'int' for the mask argument for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it for the -1 as undef behavior.
llvm-svn: 272491
2016-06-12 00:41:19 +00:00
Eli Friedman f1da33e4d3 [LICM] Make isGuaranteedToExecute more accurate.
Summary:
Make isGuaranteedToExecute use the
isGuaranteedToTransferExecutionToSuccessor helper, and make that helper
a bit more accurate.

There's a potential performance impact here from assuming that arbitrary
calls might not return. This probably has little impact on loads and
stores to a pointer because most things alias analysis can reason about
are dereferenceable anyway. The other impacts, like less aggressive
hoisting of sdiv by a variable and less aggressive hoisting around
volatile memory operations, are unlikely to matter for real code.

This also impacts SCEV, which uses the same helper.  It's a minor
improvement there because we can tell that, for example, memcpy always
returns normally. Strictly speaking, it's also introducing
a bug, but it's not any worse than everywhere else we assume readonly
functions terminate.

Fixes http://llvm.org/PR27857.

Reviewers: hfinkel, reames, chandlerc, sanjoy

Subscribers: broune, llvm-commits

Differential Revision: http://reviews.llvm.org/D21167

llvm-svn: 272489
2016-06-11 21:48:25 +00:00
Vikram TV c702b8b3d7 Delay dominator updation while cloning loop.
Summary:
Dominator updation fails for a loop inserted with a new basicblock.

A block required by DT to set the IDom might not have been cloned yet. This is because there is no predefined ordering of loop blocks (except for the header block which should be the first block in the list).

The patch first creates DT nodes for the cloned blocks and then separately updates the DT in a follow-on loop.

Reviewers: anemet, dberlin

Subscribers: dberlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D20899

llvm-svn: 272479
2016-06-11 16:41:10 +00:00
Qin Zhao bc8fbeacf3 [esan|cfrag] Handle complex GEP instr in the cfrag tool
Summary:
Iterates all (except the first and the last) operands within each GEP
instruction for instrumentation.

Adds test struct_field_gep.ll.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits

Differential Revision: http://reviews.llvm.org/D21242

llvm-svn: 272442
2016-06-10 22:28:55 +00:00
Michael Zolotukhin b98294d006 Don't try to rotate a loop more than once - we never do this anyway.
Summary:
I can't find a case where we can rotate a loop more than once, and it looks
like we never do this. To rotate a loop following conditions should be met:
1) its header should be exiting
2) its latch shouldn't be exiting

But after the first rotation the header becomes the new latch, so this
condition can never be true any longer.

Tested on with an assert on LNT testsuite and make check.

Reviewers: hfinkel, sanjoy

Subscribers: sebpop, sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20181

llvm-svn: 272439
2016-06-10 22:03:56 +00:00
Sebastian Pop e1f60b1fb3 MemorySSA: fix memory access local dominance function for live on entry
A memory access defined on function entry cannot be locally dominated by another memory access.
The patch was split from http://reviews.llvm.org/D19338 which exposes the problem.

Differential Revision: http://reviews.llvm.org/D21039

llvm-svn: 272436
2016-06-10 21:36:41 +00:00
Nico Weber 2cf5e89e1d Remove a few gendered pronouns.
llvm-svn: 272422
2016-06-10 20:06:03 +00:00
Evgeniy Stepanov eaea297df4 Disable MSan-hostile loop unswitching.
Loop unswitching may cause MSan false positive when the unswitch
condition is not guaranteed to execute.

This is very similar to ASan and TSan special case in
llvm::isSafeToSpeculativelyExecute (they don't like speculative loads
and stores), but for branch instructions.

This is a workaround for PR28054.

llvm-svn: 272421
2016-06-10 20:03:20 +00:00
Evgeniy Stepanov 122f984a33 Move isGuaranteedToExecute out of LICM.
Also rename LICMSafetyInfo to LoopSafetyInfo.
Both will be used in LoopUnswitch in a separate change.

llvm-svn: 272420
2016-06-10 20:03:17 +00:00
Chad Rosier 840b3efeae Add a period. NFC.
llvm-svn: 272410
2016-06-10 17:59:22 +00:00
Chad Rosier a8bc512be5 Fix whitespace. NFC.
llvm-svn: 272409
2016-06-10 17:58:01 +00:00
Qin Zhao 0b96aa7190 [esan|cfrag] Add the struct field offset array in StructInfo
Summary:
Adds the struct field offset array in struct StructInfo.

Updates test struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21192

llvm-svn: 272362
2016-06-10 02:10:06 +00:00
Qin Zhao d677d88867 [esan|cfrag] Disable load/store instrumentation for cfrag
Summary:
Adds ClInstrumentFastpath option to control fastpath instrumentation.

Avoids the load/store instrumentation for the cache fragmentation tool.

Renames cache_frag_basic.ll to working_set_slow.ll for slowpath
instrumentation test.

Adds the __esan_init check in struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21079

llvm-svn: 272355
2016-06-10 00:48:53 +00:00
Vitaly Buka b451f1bdf6 Make sure that not interesting allocas are not instrumented.
Summary:
We failed to unpoison uninteresting allocas on return as unpoisoning is part of
main instrumentation which skips such allocas.

Added check -asan-instrument-allocas for dynamic allocas. If instrumentation of
dynamic allocas is disabled it will not will not be unpoisoned.

PR27453

Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21207

llvm-svn: 272341
2016-06-09 23:31:59 +00:00
Vitaly Buka 79b75d3d11 Unpoison stack memory in use-after-return + use-after-scope mode
Summary:
We still want to unpoison full stack even in use-after-return as it can be disabled at runtime.

PR27453

Reviewers: eugenis, kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21202

llvm-svn: 272334
2016-06-09 23:05:35 +00:00
Easwaran Raman 71069cf67d Use ProfileSummaryInfo in inline cost analysis.
Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.

Differential revision: http://reviews.llvm.org/D21045

llvm-svn: 272321
2016-06-09 22:23:21 +00:00
Easwaran Raman e12c487b8c [PM] Port LCSSA to the new PM.
Differential Revision: http://reviews.llvm.org/D21090

llvm-svn: 272294
2016-06-09 19:44:46 +00:00
Michael Kuperstein c5edcdeb0e [LV] Use vector phis for some secondary induction variables
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offseting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.

This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.

Differential Revision: http://reviews.llvm.org/D20932

llvm-svn: 272283
2016-06-09 18:03:15 +00:00
Xinliang David Li ecde1c7f3d Revert r272194 No need for it if loop Analysis Manager is used
llvm-svn: 272243
2016-06-09 03:22:39 +00:00
Teresa Johnson 7ab1f69272 [ThinLTO/gold] Enable summary-based internalization
Summary: Enable existing summary-based importing support in the gold-plugin.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21080

llvm-svn: 272239
2016-06-09 01:14:13 +00:00
Michael Zolotukhin 8e7e76729d [LoopSimplify] Preserve LCSSA when merging exit blocks.
Summary:
This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify,
that looks correct to me and allows to write a test for the issue.

Reviewers: chandlerc, bogner, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21112

llvm-svn: 272224
2016-06-08 23:13:21 +00:00
Michael Zolotukhin aa547616d2 [LoopUnroll] Check that DT is available before trying to verify it.
llvm-svn: 272221
2016-06-08 22:49:59 +00:00
Michael Zolotukhin 987ab631fa [SLPVectorizer] Handle GEP with differing constant index types
Summary:
This fixes PR27617.

Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64.

The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert.

Reviewers: nadav, mzolotukhin

Subscribers: JesperAntonsson, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20685

Patch by Jesper Antonsson!

llvm-svn: 272206
2016-06-08 21:55:16 +00:00
Davide Italiano 02861d8695 [PM] Add missing caching of GlobalsAA to EarlyCSE.
llvm-svn: 272204
2016-06-08 21:31:55 +00:00
Sanjay Patel 3929313811 [InstCombine] move fold of select of add/sub to helper function; NFCI
llvm-svn: 272199
2016-06-08 21:10:01 +00:00
Sanjay Patel 384d0f219d [InstCombine] fix outdated comment, simplify logic; NFCI
llvm-svn: 272196
2016-06-08 20:31:52 +00:00
Evgeny Stupachenko 3e2f389a7e The patch set unroll disable pragma when unroll
with user specified count has been applied.

Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it set the pragma in all cases. This helps to prevent multiple
unroll when -unroll-count=N is given.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D20765

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 272195
2016-06-08 20:21:24 +00:00
Xinliang David Li 572135f717 [PM] Refector LoopAccessInfo analysis code
This is the preparation patch to port the analysis to new PM

Differential Revision: http://reviews.llvm.org/D20560

llvm-svn: 272194
2016-06-08 20:15:37 +00:00
Sanjay Patel 10a2c38d83 [InstCombine] reduce indent; NFC
llvm-svn: 272193
2016-06-08 20:09:04 +00:00
Tim Shen 7aa0ad65ce [MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy
Reviewers: iteratee

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21087

llvm-svn: 272192
2016-06-08 19:42:32 +00:00
Sanjay Patel 916f8a0cdb [InstCombine] use copyIRFlags() ; NFCI
llvm-svn: 272191
2016-06-08 19:33:52 +00:00
Benjamin Kramer c321e53402 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
Davide Italiano 2d5ab0a56a [PM] LoopSimplify. Remove unneeded pass dependencies. NFCI.
llvm-svn: 272140
2016-06-08 13:56:59 +00:00
Davide Italiano d8d83f4773 [PM/SimplifyCFG] Preserve GlobalsAA even if the IR is mutated.
llvm-svn: 272139
2016-06-08 13:32:23 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Davide Italiano 16e96d4b16 [PM] Preserve GlobalsAA for SROA.
Differential Revision:  http://reviews.llvm.org/D21040

llvm-svn: 272009
2016-06-07 13:21:17 +00:00
Simon Pilgrim db9893fb90 [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).

If the shift amount is constant we can sometimes convert these instructions to native shifts:

1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.

In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.

Differential Revision: http://reviews.llvm.org/D19675

llvm-svn: 271996
2016-06-07 10:27:15 +00:00
Simon Pilgrim 91e3ac8293 [InstCombine][SSE] Add MOVMSK constant folding (PR27982)
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.

The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.

Differential Revision: http://reviews.llvm.org/D20998

llvm-svn: 271990
2016-06-07 08:18:35 +00:00
Michael Kuperstein a0c6ae02a5 [InstCombine] scalarizePHI should not assume the code it sees has been CSE'd
scalarizePHI only looked for phis that have exactly two uses - the "latch"
use, and an extract. Unfortunately, we can not assume all equivalent extracts
are CSE'd, since InstCombine itself may create an extract which is a duplicate
of an existing one. This extends it to handle several distinct extracts from
the same index.

This should fix at least some of the  performance regressions from PR27988.

Differential Revision: http://reviews.llvm.org/D20983

llvm-svn: 271961
2016-06-06 23:38:33 +00:00
Davide Italiano fea0a4c5b2 [PM] Preserve the correct set of analyses for GVN.
llvm-svn: 271934
2016-06-06 20:01:50 +00:00
Davide Italiano 82c447823b [GVN] Switch dump() definition over to LLVM_DUMP_METHOD.
llvm-svn: 271932
2016-06-06 19:24:27 +00:00
Geoff Berry 43e5160d0e Reapply [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Originally reviewed in http://reviews.llvm.org/D18001

Reviewers: atrick

Subscribers: llvm-commits, mzolotukhin, mcrosier

Differential Revision: http://reviews.llvm.org/D18480

llvm-svn: 271929
2016-06-06 19:10:46 +00:00
Sanjay Patel 6a333c3ed9 [InstCombine] limit icmp transform to ConstantInt (PR28011)
In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check
above this to work for any Constant rather than ConstantInt. AFAICT, 
that part makes sense if we can determine that the shrunken/extended 
constant remained equal. But it doesn't make sense for this later 
transform where we assume that the constant DID change. 

This could assert for a ConstantExpr:
https://llvm.org/bugs/show_bug.cgi?id=28011

And it could be wrong for a vector as shown in the added regression test.

llvm-svn: 271908
2016-06-06 16:56:57 +00:00
Eli Friedman ee89505799 LICM: Don't sink stores out of loops that may throw.
Summary:
This hasn't been caught before because it requires noalias or similarly
strong alias analysis to actually reproduce.

Fixes http://llvm.org/PR27952 .

Reviewers: hfinkel, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20944

llvm-svn: 271858
2016-06-05 22:13:52 +00:00
Sanjoy Das b7e861a488 Add safety check to InstCombiner::commonIRemTransforms
Since FoldOpIntoPhi speculates the binary operation to potentially each
of the predecessors of the PHI node (pulling it out of arbitrary control
dependence in the process), we can FoldOpIntoPhi only if we know the
operation doesn't have UB.

This also brings up an interesting profitability question -- the way it
is written today, commonIRemTransforms will hoist out work from
dynamically dead code into code that will execute at runtime.  Perhaps
that isn't the best canonicalization?

Fixes PR27968.

llvm-svn: 271857
2016-06-05 21:17:04 +00:00
Sanjoy Das 4d4339d1e8 [PM] Port IndVarSimplify to the new pass manager
Summary:
There are some rough corners, since the new pass manager doesn't have
(as far as I can tell) LoopSimplify and LCSSA, so I've updated the
tests to run them separately in the old pass manager in the lit tests.
We also don't have an equivalent for AU.setPreservesCFG() in the new
pass manager, so I've left a FIXME.

Reviewers: bogner, chandlerc, davide

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20783

llvm-svn: 271846
2016-06-05 18:01:19 +00:00
Sanjoy Das f90e28d6fd [IndVars] Remove -liv-reduce
It is an off-by-default option that no one seems to use[0], and given
that SCEV directly understands the overflow instrinsics there is no real
need for it anymore.

[0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html

llvm-svn: 271845
2016-06-05 18:01:12 +00:00
Sanjay Patel a6fbc82392 [InstCombine] allow vector icmp bool transforms
llvm-svn: 271843
2016-06-05 17:49:45 +00:00
Sanjay Patel 5f0217f42e fix documentation comments and other clean-ups; NFC
llvm-svn: 271839
2016-06-05 16:46:18 +00:00
Xinliang David Li 64dbb295b6 [PM] Port GCOVProfiler pass to the new pass manager
llvm-svn: 271823
2016-06-05 05:12:23 +00:00
Xinliang David Li fb3137c3b3 [PM] code refactoring /NFC
llvm-svn: 271822
2016-06-05 03:40:03 +00:00
Sanjay Patel 6f8f47b358 [InstCombine] less 'CI' confusion; NFC
Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C',
so that it's hopefully clearer that 'CI' refers to CastInst in this context.

While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'.

llvm-svn: 271817
2016-06-05 00:12:32 +00:00
David Majnemer 2482e1c017 [SimplifyCFG] Don't kill empty cleanuppads with multiple uses
A basic block could contain:
  %cp = cleanuppad []
  cleanupret from %cp unwind to caller

This basic block is empty and is thus a candidate for removal.  However,
there can be other uses of %cp outside of this basic block.  This is
only possible in unreachable blocks.

Make our transform more correct by checking that the pad has a single
user before removing the BB.

This fixes PR28005.

llvm-svn: 271816
2016-06-04 23:50:03 +00:00
Sanjay Patel ea8a211169 [InstCombine] allow vector constants for cast+icmp fold
This is step 1 of unknown towards fixing PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001

llvm-svn: 271810
2016-06-04 22:04:05 +00:00
Sanjay Patel c774f8c265 clean-up; NFC
llvm-svn: 271807
2016-06-04 21:20:44 +00:00
Sanjay Patel 4c204230fc fix formatting, punctuation; NFC
llvm-svn: 271804
2016-06-04 20:39:22 +00:00
Simon Pilgrim fda22d66fc [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614

Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.

llvm-svn: 271789
2016-06-04 13:42:46 +00:00
Xinliang David Li 6c44e9e33d [pgo] extend r271532 to darwin platform
llvm-svn: 271746
2016-06-03 23:02:28 +00:00
Derek Bruening 9ef5772154 [esan|wset] Optionally assume intra-cache-line accesses
Summary:
Adds an option -esan-assume-intra-cache-line which causes esan to assume
that a single memory access touches just one cache line, even if it is not
aligned, for better performance at a potential accuracy cost.  Experiments
show that the performance difference can be 2x or more, and accuracy loss
is typically negligible, so we turn this on by default.  This currently
applies just to the working set tool.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20978

llvm-svn: 271743
2016-06-03 22:29:52 +00:00
Derek Bruening 4252a16c35 [esan] Specify which tool via a global variable
Summary:
Adds a global variable to specify the tool, to support handling early
interceptors that invoke instrumented code and require shadow memory to be
initialized prior to __esan_init() being invoked.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20973

llvm-svn: 271715
2016-06-03 19:40:37 +00:00
Sanjay Patel 6cf18af1c5 [InstCombine] look through bitcasts to find selects
There was concern that creating bitcasts for the simpler potential select pattern:

define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
  %a2 = add <2 x i64> %a, %a
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %bc = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %a2, %bc
  ret <2 x i64> %and
}

might lead to worse code for some targets, so this patch is matching the larger
patterns seen in the test cases.

The motivating example for this patch is this IR produced via SSE intrinsics in C:

define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
  %t0 = bitcast <2 x i64> %a to <4 x i32>
  %t1 = bitcast <2 x i64> %b to <4 x i32>
  %cmp = icmp sgt <4 x i32> %t0, %t1
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %t2 = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %t2, %a
  %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
  %neg2 = bitcast <4 x i32> %neg to <2 x i64>
  %and2 = and <2 x i64> %neg2, %b
  %or = or <2 x i64> %and, %and2
  ret <2 x i64> %or
}

For an AVX target, this is currently:

vpcmpgtd  %xmm1, %xmm0, %xmm2
vpand     %xmm0, %xmm2, %xmm0
vpandn    %xmm1, %xmm2, %xmm1
vpor      %xmm1, %xmm0, %xmm0
retq

With this patch, it becomes:

vpmaxsd   %xmm1, %xmm0, %xmm0

Differential Revision: http://reviews.llvm.org/D20774

llvm-svn: 271676
2016-06-03 14:42:07 +00:00