Commit Graph

8335 Commits

Author SHA1 Message Date
Michael Zolotukhin e0b2d97b52 [LoopInfo] Add verification by recomputation.
Summary:
Current implementation of LI verifier isn't ideal and fails to detect
some cases when LI is incorrect. For instance, it checks that all
recorded loops are in a correct form, but it has no way to check if
there are no more other (unrecorded in LI) loops in the function. This
patch adds a way to detect such bugs.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits, silvas, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23437

llvm-svn: 280280
2016-08-31 19:26:19 +00:00
Geoff Berry 8d84605f25 [EarlyCSE] Optionally use MemorySSA. NFC.
Summary:
Use MemorySSA, if requested, to do less conservative memory dependency
checking.

This change doesn't enable the MemorySSA enhanced EarlyCSE in the
default pipelines, so should be NFC.

Reviewers: dberlin, sanjoy, reames, majnemer

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19821

llvm-svn: 280279
2016-08-31 19:24:10 +00:00
Geoff Berry 64f5ed172a [EarlyCSE] Allow forwarding a non-invariant load into an invariant load.
Reviewers: sanjoy

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23935

llvm-svn: 280265
2016-08-31 17:45:31 +00:00
Philip Reames 2b1084ac93 [statepoints][experimental] Add support for live-in semantics of values in deopt bundles
This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics so I've chosen to start there. For those curious, my use cases is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes.

The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't along some code path, we could end up with more vregs than physical registers and fail to allocate.

Today, we *only* fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well, more investigation needed.)

My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics. Note: I don't have any *known* miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus, the "experimental" tag and the fact it's off by default.

Differential Revision: https://reviews.llvm.org/D24000

llvm-svn: 280250
2016-08-31 15:12:17 +00:00
Chad Rosier 27ac0d8670 [Reassociate] Add additional debug output. NFC.
llvm-svn: 280090
2016-08-30 13:58:35 +00:00
Anna Thomas 1aea6da564 [RewriteStatepointsForGC] Update comment for same PHI node check. NFC
llvm-svn: 280052
2016-08-30 02:36:48 +00:00
Duncan P. N. Exon Smith 5c001c367f ADT: Give ilist<T>::reverse_iterator a handle to the current node
Reverse iterators to doubly-linked lists can be simpler (and cheaper)
than std::reverse_iterator.  Make it so.

In particular, change ilist<T>::reverse_iterator so that it is *never*
invalidated unless the node it references is deleted.  This matches the
guarantees of ilist<T>::iterator.

(Note: MachineBasicBlock::iterator is *not* an ilist iterator, but a
MachineInstrBundleIterator<MachineInstr>.  This commit does not change
MachineBasicBlock::reverse_iterator, but it does update
MachineBasicBlock::reverse_instr_iterator.  See note at end of commit
message for details on bundle iterators.)

Given the list (with the Sentinel showing twice for simplicity):

     [Sentinel] <-> A <-> B <-> [Sentinel]

the following is now true:
 1. begin() represents A.
 2. begin() holds the pointer for A.
 3. end() represents [Sentinel].
 4. end() holds the poitner for [Sentinel].
 5. rbegin() represents B.
 6. rbegin() holds the pointer for B.
 7. rend() represents [Sentinel].
 8. rend() holds the pointer for [Sentinel].

The changes are #6 and #8.  Here are some properties from the old
scheme (which used std::reverse_iterator):
- rbegin() held the pointer for [Sentinel] and rend() held the pointer
  for A;
- operator*() cost two dereferences instead of one;
- converting from a valid iterator to its valid reverse_iterator
  involved a confusing increment; and
- "RI++->erase()" left RI invalid.  The unintuitive replacement was
  "RI->erase(), RE = end()".

With vector-like data structures these properties are hard to avoid
(since past-the-beginning is not a valid pointer), and don't impose a
real cost (since there's still only one dereference, and all iterators
are invalidated on erase).  But with lists, this was a poor design.

Specifically, the following code (which obviously works with normal
iterators) now works with ilist::reverse_iterator as well:

    for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;)
      fooThatMightRemoveArgFromList(*RI++);

Converting between iterator and reverse_iterator for the same node uses
the getReverse() function.

    reverse_iterator iterator::getReverse();
    iterator reverse_iterator::getReverse();

Why doesn't iterator <=> reverse_iterator conversion use constructors?

In order to catch and update old code, reverse_iterator does not even
have an explicit conversion from iterator.  It wouldn't be safe because
there would be no reasonable way to catch all the bugs from the changed
semantic (see the changes at call sites that are part of this patch).

Old code used this API:

    std::reverse_iterator::reverse_iterator(iterator);
    iterator std::reverse_iterator::base();

Here's how to update from old code to new (that incorporates the
semantic change), assuming I is an ilist<>::iterator and RI is an
ilist<>::reverse_iterator:

            [Old]         ==>          [New]
    reverse_iterator(I)       (--I).getReverse()
    reverse_iterator(I)         ++I.getReverse()
  --reverse_iterator(I)           I.getReverse()
    reverse_iterator(++I)         I.getReverse()
          RI.base()          (--RI).getReverse()
          RI.base()            ++RI.getReverse()
        --RI.base()              RI.getReverse()
      (++RI).base()              RI.getReverse()
  delete &*RI, RE = end()         delete &*RI++
  RI->erase(), RE = end()         RI++->erase()

=======================================
Note: bundle iterators are out of scope
=======================================

MachineBasicBlock::iterator, also known as
MachineInstrBundleIterator<MachineInstr>, is a wrapper to represent
MachineInstr bundles.  The idea is that each operator++ takes you to the
beginning of the next bundle.  Implementing a sane reverse iterator for
this is harder than ilist.  Here are the options:
- Use std::reverse_iterator<MBB::i>.  Store a handle to the beginning of
  the next bundle.  A call to operator*() runs a loop (usually
  operator--() will be called 1 time, for unbundled instructions).
  Increment/decrement just works.  This is the status quo.
- Store a handle to the final node in the bundle.  A call to operator*()
  still runs a loop, but it iterates one time fewer (usually
  operator--() will be called 0 times, for unbundled instructions).
  Increment/decrement just works.
- Make the ilist_sentinel<MachineInstr> *always* store that it's the
  sentinel (instead of just in asserts mode).  Then the bundle iterator
  can sniff the sentinel bit in operator++().

I initially tried implementing the end() option as part of this commit,
but updating iterator/reverse_iterator conversion call sites was
error-prone.  I have a WIP series of patches that implements the final
option.

llvm-svn: 280032
2016-08-30 00:13:12 +00:00
Anna Thomas 2bc129c5fd [StatepointsForGC] Rematerialize in the presence of PHIs
Summary:
While walking the use chain for identifying rematerializable values in RS4GC,
add the case where the current value and base value are the same PHI nodes.

This will aid rematerialization of geps and casts instead of relocating.

Reviewers: sanjoy, reames, igor

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23920

llvm-svn: 279975
2016-08-29 15:41:59 +00:00
Sebastian Pop 4660199a33 GVN-hoist: invalidate MD cache (PR29144)
Without invalidating the entries in the MD cache we would try to access instructions
that were removed in previous iterations of hoisting.

Differential Revision: https://reviews.llvm.org/D23927

llvm-svn: 279907
2016-08-27 02:48:41 +00:00
Bob Haarman 3db176410a limit the number of instructions per block examined by dead store elimination
Summary: Dead store elimination gets very expensive when large numbers of instructions need to be analyzed. This patch limits the number of instructions analyzed per store to the value of the memdep-block-scan-limit parameter (which defaults to 100). This resulted in no observed difference in performance of the generated code, and no change in the statistics for the dead store elimination pass, but improved compilation time on some files by more than an order of magnitude.

Reviewers: dexonsmith, bruno, george.burgess.iv, dberlin, reames, davidxl

Subscribers: davide, chandlerc, dberlin, davidxl, eraman, tejohnson, mbodart, llvm-commits

Differential Revision: https://reviews.llvm.org/D15537

llvm-svn: 279833
2016-08-26 16:34:27 +00:00
Bob Haarman 244ed8b574 test commit
llvm-svn: 279830
2016-08-26 16:00:04 +00:00
Adam Nemet 4f155b6e91 [LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass
We can't mark ORE (a function pass) preserved as required by the loop
passes because that is how we ensure that the required passes like
LazyBFI are all available any time ORE is used.  See the new comments in
the patch.

Instead we use it directly just like the inliner does in D22694.

As expected there is some additional overhead after removing the caching
provided by analysis passes.  The worst case, I measured was
LNT/CINT2006_ref/401.bzip2 which regresses by 12%.  As before, this only
affects -Rpass-with-hotness and not default compilation.

llvm-svn: 279829
2016-08-26 15:58:34 +00:00
Tim Shen 3ad8b43cc2 [MemCpy] Add comments for r279769
Differential Revision: https://reviews.llvm.org/D23846

llvm-svn: 279778
2016-08-25 21:03:46 +00:00
Tim Shen a3dbead2d6 [MemCpy] Check for alias in performMemCpyToMemSetOptzn, instead of the identity of two operands
Summary:
This fixes pr29105. The reason is that lifetime marks creates new
aliasing pointers the original ones, but before this patch aliases
were not checked in performMemCpyToMemSetOptzn.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23846

llvm-svn: 279769
2016-08-25 19:27:26 +00:00
Sebastian Pop 5f0d0e60d1 GVN-hoist: fix hoistingFromAllPaths for loops (PR29034)
It is invalid to hoist stores or loads if they are not executed on all paths
from the hoisting point to the exit of the function. In the testcase, there are
paths in the loop that do not execute the stores or the loads, and so hoisting
them within the loop is unsafe.

The problem is that the current implementation of hoistingFromAllPaths is
incomplete: it walks all blocks dominated by the hoisting point, and does not
return false when the loop contains a path on which the hoisted ld/st is
not executed.

Differential Revision: https://reviews.llvm.org/D23843

llvm-svn: 279732
2016-08-25 11:55:47 +00:00
Sanjoy Das ff855b6020 [SCCP] Don't delete side-effecting instructions
I'm not sure if the `!isa<CallInst>(Inst) &&
!isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the
case we know is broken.

llvm-svn: 279647
2016-08-24 18:10:21 +00:00
David Callahan 012d1c0766 [ADCE] Add control dependence computation
Summary:
This is part of a serious of patches to evolve ADCE.cpp to support
removing of unnecessary control flow.

This patch adds the ability to compute control dependences using
the iterated dominance frontier. We extend the liveness propagation
to alternate between data and control dependences until convergences.

Modify the pass manager intergation to compute the post-dominator tree
needed for iterator dominance frontier.

We still force all terminators live for now until we add code to
handlinge removing control flow in a later patch.

No changes to effective behavior with this patch

Previous patches:

D23225 [ADCE] Modify data structures to support removing control flow
D23065 [ADCE] Refactor anticipating new functionality (NFC)
D23102 [ADCE] Refactoring for new functionality (NFC)

Reviewers: nadav, majnemer, mehdi_amini

Subscribers: twoh, freik, llvm-commits

Differential Revision: https://reviews.llvm.org/D23559

llvm-svn: 279594
2016-08-24 00:10:06 +00:00
Michael Zolotukhin bd63d436c1 [LoopUnroll] By default disable unrolling when optimizing for size.
Summary:
In clang commit r268509 we started to invoke loop-unroll pass from the
driver even under -Os. However, we happen to not initialize optsize
thresholds properly, which si fixed with this change.

r268509 led to some big compile time regressions, because we started to
unroll some loops that we didn't unroll before. With this change I hope
to recover most of the regressions. We still are slightly slower than
before, because we do some checks here and there in loop-unrolling
before we bail out, but at least the slowdown is not that huge now.

Reviewers: hfinkel, chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23388

llvm-svn: 279585
2016-08-23 23:13:15 +00:00
Xinliang David Li 812288a5b4 Possible fix of test failures on win bots
llvm-svn: 279542
2016-08-23 18:00:41 +00:00
Xinliang David Li dc49140b44 [Profile] refactor meta data copying/swapping code
Differential Revision: http://reviews.llvm.org/D23619

llvm-svn: 279523
2016-08-23 15:39:03 +00:00
Daniel Berlin ea02eee18f GVNHoist: Use the pass version of MemorySSA and preserve it.
Summary: GVNHoist: Use the pass version of MemorySSA and preserve it.

Reviewers: sebpop, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23782

llvm-svn: 279504
2016-08-23 05:42:41 +00:00
James Molloy 0fee97f8ba [SROA] Remove incorrect assertion
Confirmed with aprantl, this assertion is incorrect - code can get here (for example 80-bit FP types) and if it does it's benign. This is exposed by a completely unrelated patch of mine, so stop the compiler falling over.

Original differential: http://reviews.llvm.org/D16187
aprantl's advice to remove assertion: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382129.html

llvm-svn: 279454
2016-08-22 18:49:42 +00:00
Artur Pilipenko b78ad9d41f Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative
This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks.

See comments on https://reviews.llvm.org/D18777 for details.

llvm-svn: 279432
2016-08-22 13:12:07 +00:00
Daniel Berlin 11da66fc10 Partially revert 279331, as we modify this instruction in the loop
llvm-svn: 279335
2016-08-19 22:18:38 +00:00
Daniel Berlin a36f46363f Convert some depth first traversals to depth_first
llvm-svn: 279331
2016-08-19 22:06:23 +00:00
Tim Shen b5e0f5ac95 [GraphTraits] Make nodes_iterator dereference to NodeType*/NodeRef
Currently nodes_iterator may dereference to a NodeType* or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later.

Differential Revision: https://reviews.llvm.org/D23704
Differential Revision: https://reviews.llvm.org/D23705

llvm-svn: 279326
2016-08-19 21:20:13 +00:00
Artur Pilipenko 615b820af6 CVP. Turn marking adds as no wrap (introduced by r278107) off by default
It causes a regression on our internal benchmark. Introduce cvp-dont-process flag and set it off by default while investigating the regression.

llvm-svn: 279082
2016-08-18 16:08:35 +00:00
Davide Italiano d1279df752 [IRCE] Switch over to LLVM_DUMP_METHOD. NFCI.
llvm-svn: 279079
2016-08-18 15:55:49 +00:00
Haicheng Wu e787763275 [LoopUnroll] Move a simple check earlier. NFC.
Move the check of CallInst earlier to skip expensive recursive operations.

Differential Revision: https://reviews.llvm.org/D23611

llvm-svn: 278998
2016-08-17 22:42:58 +00:00
Chad Rosier ea7e4647db Revert "Reassociate: Reprocess RedoInsts after each inst".
This reverts commit r258830, which introduced a bug described in PR28367.

PR28367

llvm-svn: 278938
2016-08-17 15:54:39 +00:00
Chad Rosier a6822f64f3 Revert "[Reassociate] Avoid iterator invalidation when negating value."
This reverts commit r278928 due to lit test failures.

llvm-svn: 278929
2016-08-17 14:31:34 +00:00
Chad Rosier cf3e8121a6 [Reassociate] Avoid iterator invalidation when negating value.
Differential Revision: https://reviews.llvm.org/D23464
PR28367

llvm-svn: 278928
2016-08-17 14:16:45 +00:00
Jonas Paulsson 7a79422536 [LoopStrenghtReduce] Refactoring and addition of a new target cost function.
Refactored so that a LSRUse owns its fixups, as oppsed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.

New target hook isFoldableMemAccessOffset(), which is used during formula
rating.

For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.

Updated tests:
test/CodeGen/SystemZ/loop-01.ll

Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152

llvm-svn: 278927
2016-08-17 13:24:19 +00:00
Justin Bogner b03fd12cef Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Duncan P. N. Exon Smith 362d120488 Scalar: Avoid dereferencing end() in IndVarSimplify
IndVarSimplify::sinkUnusedInvariants calls
BasicBlock::getFirstInsertionPt on the ExitBlock and moves instructions
before it.  This can return end(), so it's not safe to dereference.  Add
an iterator-based overload to Instruction::moveBefore to avoid the UB.

llvm-svn: 278886
2016-08-17 01:54:41 +00:00
Duncan P. N. Exon Smith 3bcaa81204 Scalar: Avoid dereferencing end() in InductiveRangeCheckElimination
BasicBlock::Create isn't designed to take iterators (which might be
end()), but pointers (which might be nullptr).  Fix the UB that was
converting end() to a BasicBlock* by calling BasicBlock::getNextNode()
in the first place.

llvm-svn: 278883
2016-08-17 01:16:17 +00:00
David Majnemer 744a8753db Preserve the assumption cache more often
We were clearing it out in LoopUnswitch and InlineFunction instead of
attempting to preserve it.

llvm-svn: 278860
2016-08-16 22:07:32 +00:00
David Callahan 947be0fa66 [ADCE] Modify data structures to support removing control flow
Summary:
This is part of a serious of patches to evolve ADCE.cpp to support
removing of unnecessary control flow.

This patch changes the data structures to hold liveness information to
support the additional information we will eventually need. In
particular we now have a notion of basic blocks being live because
they contain a live operations. This will eventually feed into control
dependence analysis of which branches are live. We cater to getting
from instructions to associated block information and from blocks to
information about their terminators.

This patch also changes the structure of the main loop of the
algorithm so that it alternates propagating liveness between
instructions and usign control dependence information to mark branches
live.

We force all terminators live for now until we add code to handlinge
removing control flow in a later patch.

No changes to effective behavior with this patch

Previous patches:

D23065 [ADCE] Refactor anticipating new functionality (NFC)
D23102 [ADCE] Refactoring for new functionality (NFC)

Reviewers: nadav, majnemer, mehdi_amini

Subscribers: freik, twoh, llvm-commits

Differential Revision: https://reviews.llvm.org/D23225

llvm-svn: 278807
2016-08-16 14:31:51 +00:00
James Molloy 196ad0823e [LSR] Don't try and create post-inc expressions on non-rotated loops
If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs!

Motivating testcase:

    void f(float *a, float *b, float *c, int n) {
      while (n-- > 0)
        *c++ = *a++ + *b++;
    }

It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage.

llvm-svn: 278658
2016-08-15 07:53:03 +00:00
Sanjoy Das 35459f0e34 [IRCE] Change variable grouping; NFC
llvm-svn: 278619
2016-08-14 01:04:50 +00:00
Sanjoy Das 2143447c73 [IRCE] Create llvm::Loop instances for cloned out loops
llvm-svn: 278618
2016-08-14 01:04:46 +00:00
Sanjoy Das 7a18a238c6 [IRCE] Don't iterate on loops that were cloned out
IRCE has the ability to further version pre-loops and post-loops that it
created, but this isn't useful at all.  This change teaches IRCE to
leave behind some metadata in the loops it creates (by cloning the main
loop) so that these new loops are not re-processed by IRCE.

Today this bug is hidden by another bug -- IRCE does not update LoopInfo
properly so the loop pass manager does not re-invoke IRCE on the loops
it split out.  However, once the latter is fixed the bug addressed in
this change causes IRCE to infinite-loop in some cases (e.g. it splits
out a pre-loop, a pre-pre-loop from that, a pre-pre-pre-loop from that
and so on).

llvm-svn: 278617
2016-08-14 01:04:36 +00:00
Sanjoy Das 43fdc54303 [IRCE] Add better DEBUG diagnostic; NFC
NFC meaning IRCE should not _do_ anything different, but
-debug-only=irce will be a little friendlier.

llvm-svn: 278616
2016-08-14 01:04:31 +00:00
Sanjoy Das 2a2f14d7ab [IRCE] Be resilient in the face of non-simplified loops
Loops containing `indirectbr` may not be in simplified form, even after
running LoopSimplify.  Reject then gracefully, instead of tripping an
assert.

llvm-svn: 278611
2016-08-13 23:36:35 +00:00
Sanjoy Das f2b7bafae4 [IRCE] Use dyn_cast instead of explicit isa/cast; NFC
llvm-svn: 278607
2016-08-13 22:00:12 +00:00
Sanjoy Das d1d62a1354 [IRCE] Use range-for; NFC
llvm-svn: 278606
2016-08-13 22:00:09 +00:00
Aditya Kumar f24939b1f4 Test commit
llvm-svn: 278598
2016-08-13 11:56:50 +00:00
Teresa Johnson 1eca6bc6a7 [PM] Port LoopDataPrefetch to new pass manager
Summary:
Refactor the existing support into a LoopDataPrefetch implementation
class and a LoopDataPrefetchLegacyPass class that invokes it.
Add a new LoopDataPrefetchPass for the new pass manager that utilizes
the LoopDataPrefetch implementation class.

Reviewers: mehdi_amini

Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23483

llvm-svn: 278591
2016-08-13 04:11:27 +00:00
Sanjoy Das 3502511548 [IndVars] Ignore (s|z)exts that don't extend the induction variable
`IVVisitor::visitCast` used to have the invariant that if the
instruction it was passed was a sext or zext instruction, the result of
the instruction would be wider than the induction variable.  This is no
longer true after rL275037, so this change teaches `IndVarSimplify` s
implementation of `IVVisitor::visitCast` to work with the relaxed
invariant.

A corresponding change to SimplifyIndVar to preserve the said invariant
after rL275037 would also work, but given how `IVVisitor::visitCast` is
spelled (no indication of said invariant), I figured the current fix is
cleaner.

Fixes PR28935.

llvm-svn: 278584
2016-08-13 00:58:31 +00:00
David Majnemer 2d006e7673 Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278476
2016-08-12 04:32:42 +00:00
David Majnemer c700490f48 Use the range variant of remove_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278475
2016-08-12 04:32:37 +00:00
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
Eli Friedman a6707f56b5 [DSE] Don't remove stores made live by a call which unwinds.
Issue exposed by noalias or more aggressive alias analysis.

Fixes http://llvm.org/PR25422.

Differential revision: https://reviews.llvm.org/D21007

llvm-svn: 278451
2016-08-12 01:09:53 +00:00
Xinliang David Li cbb5e02f4a Fix typos /NFC
llvm-svn: 278436
2016-08-11 22:34:00 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
Ehsan Amiri dbcfea9811 Extend trip count instead of truncating IV in LFTR, when legal
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because

(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).

I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.

To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence.

This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on).

https://reviews.llvm.org/D23075

llvm-svn: 278421
2016-08-11 21:31:40 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Geoff Berry d01828096f [SCEV] Update interface to handle SCEVExpander insert point motion.
Summary:
This is an extension of the fix in r271424.  That fix dealt with builder
insert points being moved by SCEV expansion, but only for the lifetime
of the expand call.  This change modifies the interface so that LSR can
safely call expand multiple times at the same insert point and do the
right thing if one of the expansions decides to move the original insert
point.

This is a fix for PR28719.

Reviewers: sanjoy

Subscribers: llvm-commits, mcrosier, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23342

llvm-svn: 278413
2016-08-11 21:05:17 +00:00
Daniel Berlin f75fd1b58b Fix PR 28933
Summary:
This fixes PR 28933 by making sure GVNHoist does not try to recreate memory
accesses when it has not actually moved them.

Reviewers: sebpop

Subscribers: llvm-commits, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D23411

llvm-svn: 278401
2016-08-11 20:32:43 +00:00
Andrew Kaylor 7cdf01ef58 Target independent codesize heuristics for Loop Idiom Recognition
Patch by Sunita Marathe

Differential Revision: https://reviews.llvm.org/D21449

llvm-svn: 278378
2016-08-11 18:28:33 +00:00
Ehsan Amiri 3818f1b38a revert 278334
llvm-svn: 278337
2016-08-11 14:51:14 +00:00
Ehsan Amiri b9fcc2b171 Extend trip count instead of truncating IV in LFTR, when legal
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because

(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).

I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.

To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence.

https://reviews.llvm.org/D23075

llvm-svn: 278334
2016-08-11 13:51:20 +00:00
Andrew Kaylor 498d3113c3 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18867

llvm-svn: 278269
2016-08-10 18:56:35 +00:00
Artur Pilipenko e171ea8a33 Teach CorrelatedValuePropagation to mark adds as no wrap
This is a resubmission of previously reverted r277592. It was hitting overly strong assertion in getConstantRange which was relaxed in r278217.

Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23059

llvm-svn: 278220
2016-08-10 13:08:34 +00:00
Wei Mi 575435012c Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.

Differential Revision: https://reviews.llvm.org/D22942

llvm-svn: 278161
2016-08-09 20:40:03 +00:00
Anna Thomas b2d12b81c3 [EarlyCSE] Teach about CSE'ing over invariant.start intrinsics
Summary:
Teach EarlyCSE about invariant.start intrinsic. Specifically, we can perform
store-load, load-load forwarding over this call.

Reviewers: majnemer, reames, dberlin, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23268

llvm-svn: 278153
2016-08-09 20:00:47 +00:00
Sean Silva 0746f3bfa4 Consistently use LoopAnalysisManager
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278079
2016-08-09 00:28:52 +00:00
Sean Silva fd03ac6a0c Consistently use ModuleAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278078
2016-08-09 00:28:38 +00:00
Sean Silva 36e0d01e13 Consistently use FunctionAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278077
2016-08-09 00:28:15 +00:00
Sean Silva 0873e7d218 Add some comments linking back to PR28400.
Thanks to Mehdi for the suggestion!

llvm-svn: 277984
2016-08-08 07:03:49 +00:00
Sean Silva 7f21f4b264 [PM] More workaround for PR28400
llvm-svn: 277982
2016-08-08 05:38:06 +00:00
Eli Friedman 02419a9849 [JumpThreading] Fix handling of aliasing metadata.
Summary:
The correctness fix here is that when we CSE a load with another load,
we need to combine the metadata on the two loads. This matches the
behavior of other passes, like instcombine and GVN.

There's also a minor optimization improvement here: for load PRE, the
aliasing metadata on the inserted load should be the same as the
metadata on the original load. Not sure why the old code was throwing
it away.

Issue found by inspection.

Differential Revision: http://reviews.llvm.org/D21460

llvm-svn: 277977
2016-08-08 04:10:22 +00:00
Eli Friedman 2a65dd1ba6 [SROA] Fix crash with lifetime intrinsic partially covering alloca.
Summary:
PromoteMemToReg looks specifically for the pattern
bitcast+lifetime.start (or a bitcast-equivalent GEP); any offset
will lead to an assertion failure.

Fixes https://llvm.org/bugs/show_bug.cgi?id=27999 .

Differential Revision: https://reviews.llvm.org/D22737

llvm-svn: 277969
2016-08-08 01:30:53 +00:00
Benjamin Kramer a3d4def878 [LoadCombine] Simplify code with a brace init. NFC.
llvm-svn: 277921
2016-08-06 12:11:11 +00:00
Sanjoy Das b8c2ebea08 [IRCE] Remove unused headers; NFC
llvm-svn: 277892
2016-08-06 00:02:01 +00:00
Sanjoy Das cf181867a6 [IRCE] Preserve loop-simplify form
Fixes PR28764.  Right now there is no way to test this, but (as
mentioned on the PR) with Michael Zolotukhin's yet to be checked in
LoopSimplify verfier, 8 of the llvm-lit tests for IRCE crash.

llvm-svn: 277891
2016-08-06 00:01:56 +00:00
David Callahan 45e442ebaa [ADCE] Refactoring for new functionality (NFC)
Summary:
This is another refactoring to break up the one function into three logical components functions.
Another non-functional change before we start added in features.

Reviewers: nadav, mehdi_amini, majnemer

Subscribers: twoh, freik, llvm-commits

Differential Revision: https://reviews.llvm.org/D23102

llvm-svn: 277855
2016-08-05 19:38:11 +00:00
Sebastian Pop 429740a6c2 GVN-hoist: fix early exit logic
The patch splits a complex && if condition into easier to read and understand
logic.  That wrong early exit condition was letting some instructions with not
all operands available pass through when HoistingGeps was true.

Differential Revision: https://reviews.llvm.org/D23174

llvm-svn: 277785
2016-08-04 23:49:05 +00:00
Matt Arsenault 6ad97732aa GVNHoist: Don't hoist convergent calls
llvm-svn: 277767
2016-08-04 20:52:57 +00:00
Sebastian Pop 2aadad7243 GVN-hoist: limit the length of dependent instructions
Limit the number of times the while(1) loop is executed. With this restriction
the number of hoisted instructions does not change in a significant way on the
test-suite.

Differential Revision: https://reviews.llvm.org/D23028

llvm-svn: 277651
2016-08-03 20:54:38 +00:00
Sebastian Pop 4ba7c88cc7 GVN-hoist: compute DFS numbers once
With this patch we compute the DFS numbers of instructions only once and update
them during the code generation when an instruction gets hoisted.

Differential Revision: https://reviews.llvm.org/D23021

llvm-svn: 277650
2016-08-03 20:54:36 +00:00
Sebastian Pop 5d3822fc12 GVN-hoist: compute MSSA once per function (PR28670)
With this patch we compute the MemorySSA once and update it in the code generator.

Differential Revision: https://reviews.llvm.org/D22966

llvm-svn: 277649
2016-08-03 20:54:33 +00:00
Renato Golin f583097434 Revert "Teach CorrelatedValuePropagation to mark adds as no wrap"
This reverts commit r277592, trying to fix the AArch64 42VMA buildbot.

llvm-svn: 277607
2016-08-03 16:20:48 +00:00
Artur Pilipenko 68cb947cc9 Teach CorrelatedValuePropagation to mark adds as no wrap
Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23059

llvm-svn: 277592
2016-08-03 13:11:39 +00:00
David Callahan cc5cd4dc65 [ADCE] Refactor anticipating new functionality (NFC)
Summary:
This is the first refactoring before adding new functionality.
Add a class wrapper for the functions and container for
state associated with the transformation.

No functional change

Reviewers: majnemer, nadav, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23065

llvm-svn: 277565
2016-08-03 04:28:39 +00:00
Sanjoy Das 83a72850c7 [IRCE] Rename variable; NFC
There is nothing "Original" about "OriginalLoopInfo".

llvm-svn: 277506
2016-08-02 19:32:01 +00:00
Sanjoy Das f45e03e201 [IRCE] Preserve DomTree and LCSSA
This changes IRCE to "preserve" LCSSA and DomTree by recomputing them.
It still does not preserve LoopSimplify.

llvm-svn: 277505
2016-08-02 19:31:54 +00:00
Michael Kuperstein c40618610f [PM] Port SpeculativeExecution to the new PM
Differential Revision: https://reviews.llvm.org/D23033

llvm-svn: 277393
2016-08-01 21:48:33 +00:00
Adam Nemet 12937c361f [LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.

The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.

This is how the patch affects the O3 pipeline:

         Dominator Tree Construction
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Rotate Loops
           Loop Invariant Code Motion
           Unswitch loops
         Simplify the CFG
         Dominator Tree Construction
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Combine redundant instructions
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Induction Variable Simplification
           Recognize loop idioms
           Delete dead loops
           Unroll loops
...

llvm-svn: 277203
2016-07-29 19:29:47 +00:00
David Majnemer 130b9f99d6 [EarlyCSE] Correctly handle simplified, but live, instructions
Some instructions may have their uses replaced with a symbolic constant.
However, the instruction may still have side effects which percludes it
from being removed from the function.  EarlyCSE treated such an
instruction as if it were removed, resulting in PR28763.

llvm-svn: 277114
2016-07-29 05:39:21 +00:00
David Majnemer d536f2328e [ConstnatFolding] Teach the folder how to fold ConstantVector
A ConstantVector can have ConstantExpr operands and vice versa.
However, the folder had no ability to fold ConstantVectors which, in
some cases, was an optimization barrier.

Instead, rephrase the folder in terms of Constants instead of
ConstantExprs and teach callers how to deal with failure.

llvm-svn: 277099
2016-07-29 03:27:26 +00:00
Michael Kuperstein e45d4d9b35 [PM] Port LowerGuardIntrinsic to the new PM.
llvm-svn: 277057
2016-07-28 22:08:41 +00:00
Jun Bum Lim a033139cd4 [DSE] Fix bug in updating MadeChange flag
Summary: The MadeChange flag should be ORed to keep the previous result.

Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22873

llvm-svn: 276894
2016-07-27 17:25:20 +00:00
George Burgess IV 9cf05464aa [GVNHoist] Fix typo in assert.
This fixes PR28730.

llvm-svn: 276844
2016-07-27 06:34:53 +00:00
Sebastian Pop 55c3007b88 GVN-hoist: improve code generation for recursive GEPs
When loading or storing in a field of a struct like "a.b.c", GVN is able to
detect the equivalent expressions, and GVN-hoist would fail in the code
generation.  This is because the GEPs are not hoisted as scalar operations to
avoid moving the GEPs too far from their ld/st instruction when the ld/st is not
movable.  So we end up having to generate code for the GEP of a ld/st when we
move the ld/st.  In the case of a GEP referring to another GEP as in "a.b.c" we
need to code generate all the GEPs necessary to make all the operands available
at the new location for the ld/st.  With this patch we recursively walk through
the GEP operands checking whether all operands are available, and in the case of
a GEP operand, it recursively makes all its operands available. Code generation
happens from the inner GEPs out until reaching the GEP that appears as an
operand of the ld/st.

Differential Revision: https://reviews.llvm.org/D22599

llvm-svn: 276841
2016-07-27 05:48:12 +00:00
Sebastian Pop 586d3eaeb5 GVN-hoist: use DFS numbers instead of walking the instruction stream
The patch replaces a function that walks the IR with a call to firstInBB() that
uses the DFS numbering.  NFC.

Differential Revision: https://reviews.llvm.org/D22809

llvm-svn: 276840
2016-07-27 05:13:52 +00:00
Sebastian Pop 91d4a30159 GVN-hoist: use a DFS numbering of instructions (PR28670)
Instead of DFS numbering basic blocks we now DFS number instructions that avoids
the costly operation of which instruction comes first in a basic block.

Patch mostly written by Daniel Berlin.

Differential Revision: https://reviews.llvm.org/D22777

llvm-svn: 276714
2016-07-26 00:15:10 +00:00
Sebastian Pop 38422b1356 GVN-hoist: limit hoisting depth (PR28670)
This patch adds an option to specify the maximum depth in a BB at which to
consider hoisting instructions.  Hoisting instructions from a deeper level is
not profitable as it increases register pressure and compilation time.

Differential Revision: https://reviews.llvm.org/D22772

llvm-svn: 276713
2016-07-26 00:15:08 +00:00
Matt Arsenault 7cddfed7e8 Scalarizer: Support scalarizing intrinsics
llvm-svn: 276681
2016-07-25 20:02:54 +00:00
Daniel Berlin 40765a62ad Revert NewGVN N^2 behavior patch
llvm-svn: 276670
2016-07-25 18:19:49 +00:00
Daniel Berlin 14c000936e NFC: Make a few asserts in GVNHoist do the same thing, but cheaper.
llvm-svn: 276662
2016-07-25 17:36:14 +00:00
Daniel Berlin f107f3292f Fix N^2 instruction ordering comparisons in GVNHoist.
This fixes GVNHoist's portion of PR28670.

llvm-svn: 276658
2016-07-25 17:24:27 +00:00
Daniel Berlin 65af45de03 NFC: Refactor GVNHoist class so not everything is public
llvm-svn: 276657
2016-07-25 17:24:22 +00:00
David Majnemer 68623a0e9f [GVNHoist] Merge metadata on hoisted instructions less conservatively
We can combine metadata from multiple instructions intelligently for
certain metadata nodes.

llvm-svn: 276602
2016-07-25 02:21:25 +00:00
David Majnemer 4728569d0a [GVNHoist] Properly merge alignments when hoisting
If we two loads of two different alignments, we must use the minimum of
the two alignments when hoisting.  Same deal for stores.

For allocas, use the maximum of the two allocas.

llvm-svn: 276601
2016-07-25 02:21:23 +00:00
Elena Demikhovsky 376a18bd92 [Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this:
float *A;
float x = init;
for (int i=0; i < N; ++i) {
  A[i] = x;
  x -= fp_inc;
}

The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.

Differential Revision: https://reviews.llvm.org/D21330

llvm-svn: 276554
2016-07-24 07:24:54 +00:00
Adam Nemet eea7c267b9 [LoopDataPrefetch] Fix unused variable in release build
llvm-svn: 276491
2016-07-22 23:08:10 +00:00
Adam Nemet 9e6e63fba2 [LoopDataPrefetch] Include hotness of region in opt remark
llvm-svn: 276488
2016-07-22 22:53:17 +00:00
Adam Nemet 885f1de490 [LoopDataPrefetch] Sort headers
llvm-svn: 276487
2016-07-22 22:53:12 +00:00
Jun Bum Lim 6a7dc5c430 Recommit - [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Recommiting r275571 after fixing crash reported in PR28270.
Now we erase elements of IOL in deleteDeadInstruction().

Original Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

llvm-svn: 276452
2016-07-22 18:27:24 +00:00
David Majnemer 522a91181a Don't remove side effecting instructions due to ConstantFoldInstruction
Just because we can constant fold the result of an instruction does not
imply that we can delete the instruction.  It may have side effects.

This fixes PR28655.

llvm-svn: 276389
2016-07-22 04:54:44 +00:00
Sanjoy Das bb969791b4 [IRCE] Add an option to skip profitability checks
If `-irce-skip-profitability-checks` is passed in, IRCE will kick in in
all cases where it is legal for it to kick in.  This flag is intended to
help diagnose and analyse performance issues.

llvm-svn: 276372
2016-07-22 00:40:56 +00:00
Sebastian Pop 0e2cec075c GVN-hoist: move check before mutating the IR
llvm-svn: 276368
2016-07-22 00:07:01 +00:00
Sebastian Pop c107a4875e GVN-hoist: add missing check for all GEP operands available
llvm-svn: 276364
2016-07-21 23:32:39 +00:00
Sebastian Pop 31fd506623 GVH-hoist: only clone GEPs (PR28606)
Do not clone stored values unless they are GEPs that are special cased to avoid
hoisting them without hoisting their associated ld/st.

Differential revision: https://reviews.llvm.org/D22652

llvm-svn: 276358
2016-07-21 23:22:10 +00:00
Wei Mi 1cf58f8996 [PM] Port NaryReassociate to the new PM
Differential Revision: https://reviews.llvm.org/D22648

llvm-svn: 276349
2016-07-21 22:28:52 +00:00
Adam Nemet 84a6425d61 [OptDiag,LDist] Convert remaining opt remarks to use the new API
llvm-svn: 276340
2016-07-21 21:21:34 +00:00
Sanjoy Das ff9eea2278 [IndVars] Reflow oddly formatted condition; NFC
llvm-svn: 276319
2016-07-21 18:58:01 +00:00
David Majnemer 825e4ab9e3 [GVNHoist] Preserve optimization hints which agree
If we have optimization hints with agree with each other along different
paths, preserve them.

llvm-svn: 276248
2016-07-21 07:16:26 +00:00
David Majnemer 4808f26422 [GVNHoist] Don't wrongly preserve TBAA
We hoisted loads/stores without taking into account which can cause
miscompiles.

llvm-svn: 276240
2016-07-21 05:59:53 +00:00
David Majnemer 15cf7b83d1 [MergedLoadStoreMotion] Remove out of date comment
llvm-svn: 276239
2016-07-21 05:59:51 +00:00
David Majnemer bd21012c6c [GVNHoist] Don't hoist PHI nodes
We hoisted PHIs without respecting their special insertion point in the
block, leading to verfier errors.

This fixes PR28626.

llvm-svn: 276181
2016-07-20 21:05:01 +00:00
Davide Italiano 15ff2d6d0c [SCCP] Zap multiple return values.
We can replace the return values with undef if we replaced all
the call uses with a constant/undef.

Differential Revision:  https://reviews.llvm.org/D22336

llvm-svn: 276174
2016-07-20 20:17:13 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Paul Robinson 2d23c029f7 Make GVN Hoisting obey optnone/bisect.
Differential Revision: http://reviews.llvm.org/D22545

llvm-svn: 276048
2016-07-19 22:57:14 +00:00
Davide Italiano 63266b6be5 [SCCP] Improve assert messages. NFCI.
I've been hitting those already while working on SCCP and I think
it's be useful to provide a more explanatory diagnostic.

llvm-svn: 276007
2016-07-19 18:31:07 +00:00
Chad Rosier 8b5fa7a2f2 [DSE] Add additional debug output. NFC.
llvm-svn: 276005
2016-07-19 18:11:11 +00:00
Chad Rosier 667b1ca0e6 [DSE] Add additional debug output. NFC.
llvm-svn: 275991
2016-07-19 16:50:57 +00:00
Sanjoy Das ab73c9d88e [LoopReroll] Reroll loops with unordered atomic memory accesses
Reviewers: hfinkel, jfb, reames

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22385

llvm-svn: 275932
2016-07-19 00:23:54 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
David Majnemer 04854ab1e5 [GVNHoist] Remove a home-grown version of replaceUsesOfWith
replaceUsesOfWith will, on average, consider fewer values when trying
to do the replacement.

No functional change is intended.

llvm-svn: 275884
2016-07-18 19:14:14 +00:00
Reid Kleckner 3498ad11eb Fix -Wmicrosoft-enum-value in GVNHoist.cpp
llvm-svn: 275879
2016-07-18 18:53:50 +00:00
Adam Nemet b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Adam Nemet 79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Alexander Kornienko 63dd36faa5 Revert "r275571 [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals"
Causes https://llvm.org/bugs/show_bug.cgi?id=28588

llvm-svn: 275801
2016-07-18 15:51:31 +00:00
David Majnemer 04c7c225a1 [GVNHoist] Change the key for VNtoInsns to a pair
While debugging GVNHoist, I found it confusing that the entries in a
VNtoInsns were not always value numbers.  They _usually_ were except for
StoreInst in which case they were a hash of two different value numbers.

This leads to two observations:
- It is more difficult to debug things when the semantic contents of
  VNtoInsns changes over time.
- Using a single value number is not much cheaper, the value of
  VNtoInsns is a SmallVector.
- It is not immediately clear what the algorithm would do if there were
  hash collisions in the StoreInst case.

Using a DenseMap of std::pair sidesteps all of this.

N.B.  The changes in the test were due their sensitivity to the
iteration order of VNtoInsns which has changed.

llvm-svn: 275761
2016-07-18 06:11:37 +00:00
David Majnemer aa2417835e [GVNHoist] Sink HoistedCtr into GVNHoist
HoistedCtr cannot be a mutated global variable, that will open us up to
races between threads compiling code in parallel.

llvm-svn: 275744
2016-07-18 00:35:01 +00:00
David Majnemer 4c66a714c3 [GVNHoist] Some small cleanups
No functional change is intended, just trying to clean things up a
little.

llvm-svn: 275743
2016-07-18 00:34:58 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Matt Arsenault 93be6e8c0a StructurizeCFG: Fix inverting constantexpr conditions
llvm-svn: 275626
2016-07-15 22:13:16 +00:00
Davide Italiano 094dadd5b4 [SCCP] Merge two conditions into one. NFCI.
llvm-svn: 275593
2016-07-15 18:33:16 +00:00
Adam Nemet aad816083e [OptRemark,LDist] RFC: Add hotness attribute
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334

This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution.  My goal is to shake out the design issues before scaling
it up to other types and passes.

Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count.  It's only printed in opt
currently since clang prints the diagnostic fields directly.  E.g.:

  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)

A new API added is similar to emitOptimizationRemarkMissed.  The
difference is that it additionally takes a code region that the
diagnostic corresponds to.  From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI.  (Thanks to Hal for the analysis pass idea.)

This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context.  If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.

A new command-line option is added to turn this on in opt.

My plan is to switch all user of emitOptimizationRemark* to use this
module instead.

Reviewers: hfinkel

Subscribers: rcox2, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21771

llvm-svn: 275583
2016-07-15 17:23:20 +00:00
Dehao Chen dcafd5ebfd [PM] Convert LoopInstSimplify Pass to new PM
Summary: Convert LoopInstSimplify to new PM. Unfortunately there is no exisiting unittest for this pass.

Reviewers: davidxl, silvas

Subscribers: silvas, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22280

llvm-svn: 275576
2016-07-15 16:42:11 +00:00
Jun Bum Lim a5737d8eac [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

Reviewers: hfinkel, eeckstein, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D21909

llvm-svn: 275571
2016-07-15 16:14:34 +00:00
Sebastian Pop 4177480aad code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275561
2016-07-15 13:45:20 +00:00
Adam Nemet 74730d9ab0 [LoopDist] Fix typo in diagnostic
llvm-svn: 275495
2016-07-14 22:33:46 +00:00
Ekaterina Romanova 7aea5906c0 [GVN] Fold constant expression in GVN.
Fix for PR 28418.

opt never finishes compiling a test when -gvn option is passed.
The problem is caused by the fact that GVN fails to fold a constant expression.

Differential Revision: https://reviews.llvm.org/D22185

llvm-svn: 275483
2016-07-14 22:02:25 +00:00
Davide Italiano 6f73588fb9 [SCCP] Pass the Solver by reference, copies are expensive ...
.. enough to cause LTO compile time to regress insanely.
Thanks *a lot* to Rafael for reporting the problem and testing
the fix!

llvm-svn: 275468
2016-07-14 20:25:54 +00:00
Sanjoy Das 13623ad009 [JumpThreading] PRE unordered loads
Summary: Extend JumpThreading's PRE to unordered atomic loads.

Reviewers: hfinkel, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D22326

llvm-svn: 275456
2016-07-14 19:21:15 +00:00
Jun Bum Lim c837af306e [PM] Port Dead Loop Deletion Pass to the new PM
Summary: Port Dead Loop Deletion Pass to the new pass manager.

Reviewers: silvas, davide

Subscribers: llvm-commits, sanjoy, mcrosier

Differential Revision: https://reviews.llvm.org/D21483

llvm-svn: 275453
2016-07-14 18:28:29 +00:00
Nico Weber 755cd760cd Revert r275401, it caused PR28551.
llvm-svn: 275420
2016-07-14 14:41:25 +00:00
Sjoerd Meijer 716abbb2f5 This converts a signed remainder instruction to unsigned remainder, which
enables the code size optimisation to fold a rem and div into a single
aeabi_uidivmod call. This was not happening before because sdiv was converted
but srem not, and instructions with different signedness are not combined.

Differential Revision: http://reviews.llvm.org/D22214

llvm-svn: 275403
2016-07-14 12:23:48 +00:00
Sebastian Pop 63847d04e7 code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 275401
2016-07-14 12:18:53 +00:00
Sjoerd Meijer 38c2cd0c14 This implements a more optimal algorithm for selecting a base constant in
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.

Differential Revision: http://reviews.llvm.org/D21183

llvm-svn: 275382
2016-07-14 07:44:20 +00:00
Davide Italiano ed4d5ea82a [SCCP] Pass a Value * instead of templating this function. NFC.
Thanks to Eli for the suggestion!

llvm-svn: 275366
2016-07-14 03:02:34 +00:00
Davide Italiano 7dac027ed7 [IPSCCP] Constant fold struct argument/instructions when all the lattice values are constant.
This now should also work with the interprocedural variant of the pass.
Slightly easier now that the yak is shaved.

Differential Revision:   http://reviews.llvm.org/D22329

llvm-svn: 275363
2016-07-14 02:51:41 +00:00
Mehdi Amini 8484f92f7f [Scalarizer] PR28108: Skip over nullptr rather than crashing on it.
Summary:
In Scalarizer::gather we see if we already have a scattered form of Op,
and in that case use the new form.

In the particular case of PR28108, the found ValueVector SV has size 2,
where the first Value is nullptr, and the second is indeed a proper Value.
The nullptr then caused an assert to blow when we tried to do
cast<Instruction>(SV[I]).

With this patch we check SV[I] before doing the cast, and if it's nullptr
we just skip over it.

I don't know the Scalarizer well enough to know if this is the best fix
or if something should be done else where to prevent the nullptr from
being in the ValueVector at all, but at least this avoids the crash
and looking at the test case output it looks reasonable.

Reviewers: hfinkel, frasercrmck, wala, mehdi_amini

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21518

llvm-svn: 275359
2016-07-14 01:31:25 +00:00
Davide Italiano 6ed6d77950 [SCCP] Generalize tryToReplaceInstWithConstant to work also with arguments.
llvm-svn: 275357
2016-07-14 01:27:29 +00:00
Sanjoy Das 931df67ae6 [JumpThreading] Delete commented out debug code; NFC
llvm-svn: 275346
2016-07-13 23:33:20 +00:00
Davide Italiano 296e9785ba [SCCP] Have the logic for replacing insts with constant in a single place.
The code was pretty much copy-pasted between SCCP and IPSCCP. The situation
became clearly worse after I introduced the support for folding structs in
SCCP.  This commit is NFC as we currently (still) skip the replacement
step in IPSCCP, but I'll change this soon.

llvm-svn: 275339
2016-07-13 23:20:04 +00:00
Davide Italiano 390b7ea533 [SCCP] Factor out common code.
llvm-svn: 275308
2016-07-13 19:33:25 +00:00
Davide Italiano 2185001551 [SCCP] Use early return. NFCI.
llvm-svn: 275307
2016-07-13 19:23:30 +00:00
Dehao Chen 9cba1f4e7e New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275222
2016-07-12 22:37:48 +00:00
Davide Italiano 0080269342 [SCCP] Constant fold structs if all the lattice value are constant.
Differential Revision:   http://reviews.llvm.org/D22269

llvm-svn: 275208
2016-07-12 19:54:19 +00:00
Dehao Chen b9f8e29290 [PM] Port LoopIdiomRecognize Pass to new PM
Summary: Port LoopIdiomRecognize Pass to new PM

Reviewers: davidxl

Subscribers: davide, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D22250

llvm-svn: 275202
2016-07-12 18:45:51 +00:00
Vitaly Buka 204dc533c5 Revert "New pass manager for LICM."
Summary: This reverts commit r275118.

Subscribers: sanjoy, mehdi_amini

Differential Revision: http://reviews.llvm.org/D22259

llvm-svn: 275156
2016-07-12 06:25:32 +00:00
Dehao Chen 7ef5820fa3 New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275118
2016-07-11 22:45:24 +00:00
Davide Italiano 63c4ce8e1b [SCCP] Try to follow the DRY principle, use `OpSt`.
Thanks to Eli Friedman for pointing out in his post-commit review!

llvm-svn: 275084
2016-07-11 18:21:29 +00:00
Jingyue Wu 641cfee976 [SLSR] Call getPointerSizeInBits with the correct address space.
llvm-svn: 275083
2016-07-11 18:13:28 +00:00
Nicolai Haehnle 889a20cf40 [Sink] Don't move calls to readonly functions across stores
Summary:

Reviewers: hfinkel, majnemer, tstellarAMD, sunfish

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17279

llvm-svn: 275066
2016-07-11 14:11:51 +00:00
Benjamin Kramer 4d09892e9a Give helper classes/functions internal linkage. NFC.
llvm-svn: 275014
2016-07-10 11:28:51 +00:00
Davide Italiano 0f03ce0c88 [SCCP] Rename undefined -> unknown.
In the solver, isUndefined() does really mean "we don't know the
value yet" rather than "this is an UndefinedValue". Discussed with
Eli Friedman.

Differential Revision:  http://reviews.llvm.org/D22192

llvm-svn: 275004
2016-07-10 00:35:15 +00:00
Davide Italiano c4890705ef [SCCP] Remove wrong and misleading vector handling code.
This code was already commented out and it made some weird assumptions,
e.g. using isUndefined() as "this value is UndefValue" instead of
"we haven't computed this value is yet". Thanks to Eli Friedman for
pointing out where I was wrong (and where this code was wrong).

llvm-svn: 274995
2016-07-09 22:49:35 +00:00
Jingyue Wu debce55ac3 [SLSR] Fix crash on handling 128-bit integers.
ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call
getSExtValue only on narrow integers.

As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions.

llvm-svn: 274982
2016-07-09 19:13:18 +00:00
Jingyue Wu 15f3e82d42 [TTI] Expose TTI::getGEPCost and use it in SLSR and NaryReassociate.
NFC.

llvm-svn: 274940
2016-07-08 21:48:05 +00:00
Xinliang David Li 7853c1dd73 Rename LoopAccessAnalysis to LoopAccessLegacyAnalysis /NFC
llvm-svn: 274927
2016-07-08 20:55:26 +00:00
Davide Italiano d555bde59f [SCCP] Fold constants as we build them whne visiting cast instructions.
This should be slightly more efficient and could avoid spurious overdefined
markings, as Eli pointed out.

Differential Revision:  http://reviews.llvm.org/D22122

llvm-svn: 274905
2016-07-08 19:13:40 +00:00
Chad Rosier 89c32a9531 [DSE] Minor refactor based on D21007. NFC.
llvm-svn: 274877
2016-07-08 16:48:40 +00:00
Anna Thomas 6a78c78a03 [DSE] Remove dead stores in end blocks containing fence
We can remove dead stores in the presence of fence instructions. Fence
does not change an otherwise thread local store to visible.

reviewers: reames, dexonsmith, jfb
Differential Revision: http://reviews.llvm.org/D22001

llvm-svn: 274795
2016-07-07 20:51:42 +00:00
Davide Italiano 709d41819b [LoopStrengthReduce] Fix -Wmisleading-indentation. Reported by GCC6.
llvm-svn: 274773
2016-07-07 17:44:38 +00:00
Sean Silva 59fe82f4ce [PM] Port TailCallElim
llvm-svn: 274708
2016-07-06 23:48:41 +00:00
Sean Silva b025d375a1 [PM] Port CorrelatedValuePropagation
llvm-svn: 274705
2016-07-06 23:26:29 +00:00
Haicheng Wu a95cd1267f [LIR] Fix mis-compilation with unwinding.
To fix PR27859, bail out if there is an instruction may throw.

Differential Revision: http://reviews.llvm.org/D20638

llvm-svn: 274673
2016-07-06 21:05:40 +00:00
Chad Rosier dcfce2d0ec [DSE] Avoid iterator invalidation bugs.
The dse_with_dbg_value.ll test committed with r273141 is removed because this
we no longer performs any type of back tracking, which is what was causing the
codegen differences with and without debug information.

Differential Revision: http://reviews.llvm.org/D21613

llvm-svn: 274660
2016-07-06 19:48:52 +00:00
Sean Silva f50d4b6cdc Work around PR28400 a bit harder.
We were still crashing in the "no change" case because LVI was not
getting invalidated.

See the thread "Should analyses be able to hold AssertingVH to IR?
(related to PR28400)" for more discussion.

llvm-svn: 274656
2016-07-06 19:05:41 +00:00
Sean Silva fa6db90164 PR28400: Partly undo r274440 to bring test-suite back to life with the new PM
PR28400 seems to be not an isolated issue, but a general problem related
to caching analyses. We will need to discuss on llvm-dev.

A test case is in the PR.

llvm-svn: 274457
2016-07-03 03:35:06 +00:00
Sean Silva 45835e731d Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

llvm-svn: 274455
2016-07-02 23:47:27 +00:00
Sean Silva f2db01c626 [PM] Fix a small typo from when I ported JumpThreading
llvm-svn: 274440
2016-07-02 16:16:44 +00:00
Michael Kuperstein 071d8306b0 [PM] Port ConstantHoisting to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D21945

llvm-svn: 274411
2016-07-02 00:16:47 +00:00
Sanjay Patel 887aa6d6ef fix documentation comments; NFC
llvm-svn: 274362
2016-07-01 16:41:59 +00:00
Xinliang David Li 94734eef33 [PM] refactor LoopAccessInfo code part-2
Differential Revision: http://reviews.llvm.org/D21636

llvm-svn: 274334
2016-07-01 05:59:55 +00:00
Duncan P. N. Exon Smith 9d1f156418 Revert "code hoisting pass based on GVN"
This reverts commit r274305, since it breaks self-hosting:
  http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232

Note that the blamelist on lab.llvm.org:8011 is incorrect.  The previous
build was r274299, but somehow r274305 wasn't included in the blamelist:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules

llvm-svn: 274320
2016-07-01 01:51:40 +00:00
Sebastian Pop 5c5798c57c code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 274305
2016-07-01 00:24:31 +00:00
Jun Bum Lim 596a3bd9ec [DSE] Fix bug in partial overwrite tracking
Summary:
Found cases where DSE incorrectly add partially-overwritten intervals.
Please see the test case for details.

Reviewers: mcrosier, eeckstein, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D21859

llvm-svn: 274237
2016-06-30 15:32:20 +00:00
Adam Nemet ad437fff53 [Diag] Add getter shouldAlwaysPrint. NFC
For the new hotness attribute, the API will take the pass rather than
the pass name so we can no longer play the trick of AlwaysPrint being a
special pass name. This adds a getter to help the transition.

There is also a corresponding clang patch.

llvm-svn: 274100
2016-06-29 04:55:19 +00:00
Adam Nemet bd861acf29 [LLE] Don't hoist conditionally executed loads
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional.  Thus we would
access a memory location that the original loop did not access.

llvm-svn: 273991
2016-06-28 04:02:47 +00:00
Michael Kuperstein 835facd863 [PM] Normalize FIXMEs for missing PreserveCFG to have the same wording.
llvm-svn: 273974
2016-06-28 00:54:12 +00:00
Benjamin Kramer 135f735af1 Apply clang-tidy's modernize-loop-convert to most of lib/Transforms.
Only minor manual fixes. No functionality change intended.

llvm-svn: 273808
2016-06-26 12:28:59 +00:00
Sanjoy Das 9d08642c64 [RSForGC] Appease MSVC
llvm-svn: 273805
2016-06-26 05:42:52 +00:00
Sanjoy Das a37bb4a65d [LoopUnswitch] Unswitch on conditions feeding into guards
Summary:
This is a straightforward extension of what LoopUnswitch does to
branches to guards.  That is, we unswitch

```
for (;;) {
  ...
  guard(loop_invariant_cond);
  ...
}
```

into

```
if (loop_invariant_cond) {
  for (;;) {
    ...
    // There is no need to emit guard(true)
    ...
  }
} else {
  for (;;) {
    ...
    guard(false);
    // SimplifyCFG will clean this up by adding an
    // unreachable after the guard(false)
    ...
  }
}
```

Reviewers: majnemer

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21725

llvm-svn: 273801
2016-06-26 05:10:45 +00:00
Sanjoy Das 7dda0edb5f [RSForGC] Bring the BDVState struct up to code; NFC
llvm-svn: 273800
2016-06-26 04:55:35 +00:00
Sanjoy Das 61c76e3b89 [RSForGC] Bring computeLiveInValues up to code; NFC
llvm-svn: 273799
2016-06-26 04:55:32 +00:00
Sanjoy Das 83186b067d [RSForGC] Bring computeLiveOutSeed up to code; NFC
llvm-svn: 273798
2016-06-26 04:55:30 +00:00
Sanjoy Das b2df57af65 [RSForGC] Bring computeLiveInValues up to code; NFC
llvm-svn: 273797
2016-06-26 04:55:26 +00:00
Sanjoy Das 255532f629 [RSForGC] Bring recomputeLiveInValues up to code; NFC
llvm-svn: 273796
2016-06-26 04:55:23 +00:00
Sanjoy Das 73c7f26035 [RSForGC] Bring containsGCPtrType, isGCPointerType up to code; NFC
llvm-svn: 273795
2016-06-26 04:55:19 +00:00
Sanjoy Das 1e7eeb4bf0 [RSForGC] Bring analyzeParsePointLiveness up to code; NFC
llvm-svn: 273794
2016-06-26 04:55:17 +00:00
Sanjoy Das 6cf88091b3 [RSForGC] Bring meetBDVStateImpl up to code; NFC
llvm-svn: 273793
2016-06-26 04:55:13 +00:00
Sanjoy Das bd43d0e2d0 [RSForGC] Get rid of the unnecessary MeetBDVStates struct; NFC
All of its implementation is in just one function.

llvm-svn: 273792
2016-06-26 04:55:10 +00:00
Sanjoy Das 90547f1d20 [RSForGC] Bring findBasePointer up to code; NFC
Name-casing and minor style changes to bring the function up to the LLVM
coding style.

llvm-svn: 273791
2016-06-26 04:55:05 +00:00
David Majnemer e14e7bc4b8 Revert "[SimplifyCFG] Stop inserting calls to llvm.trap for UB"
This reverts commit r273778, it seems to break UBSan :/

llvm-svn: 273779
2016-06-25 08:19:55 +00:00
David Majnemer d346a37737 [SimplifyCFG] Stop inserting calls to llvm.trap for UB
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.

While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.

There are much better tools than llvm.trap: UBSan and ASan.

N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...

llvm-svn: 273778
2016-06-25 08:04:19 +00:00
Sanjoy Das d850068282 [LoopUnswitch] Avoid exponential behavior
Summary: (No semantic change intended).

Reviewers: majnemer, bogner, mzolotukhin

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21707

llvm-svn: 273763
2016-06-25 01:14:19 +00:00
David Majnemer b8da3a2bb2 Reinstate r273711
r273711 was reverted by r273743.  The inliner needs to know about any
call sites in the inlined function.  These were obscured if we replaced
a call to undef with an undef but kept the call around.

This fixes PR28298.

llvm-svn: 273753
2016-06-25 00:04:10 +00:00
Michael Kuperstein 83b753d430 [PM] Port float2int to the new pass manager
Differential Revision: http://reviews.llvm.org/D21704

llvm-svn: 273747
2016-06-24 23:32:02 +00:00
Nico Weber ae2ef4ccd4 Revert r273711, it caused PR28298.
llvm-svn: 273743
2016-06-24 22:52:39 +00:00
Sanjoy Das 91e6ba6399 [IndVarSimplify] Run clang-format over some oddly formatted bits
NFC (whitespace only change)

llvm-svn: 273732
2016-06-24 21:23:32 +00:00
David Majnemer 3b3e954ea2 SimplifyInstruction does not imply DCE
We cannot remove an instruction with no uses just because
SimplifyInstruction succeeds.  It may have side effects.

llvm-svn: 273711
2016-06-24 19:34:46 +00:00
Anna Thomas 671513553c [LICM] Avoid repeating expensive call while promoting loads. NFC
Summary:
We can avoid repeating the check `isGuaranteedToExecute` when it's already called once while checking if the alignment can be widened for the load/store being hoisted.

The function is invariant for the same instruction `UI` in `isGuaranteedToExecute(*UI, DT, CurLoop, SafetyInfo);`

Reviewers: hfinkel, eli.friedman

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21672

llvm-svn: 273671
2016-06-24 12:38:45 +00:00
David Majnemer d770877328 Switch more loops to be range-based
This makes the code a little more concise, no functional change is
intended.

llvm-svn: 273644
2016-06-24 04:05:21 +00:00
Sanjoy Das 81c00fe022 [IRCE] Use getTerminator instead of rbegin; NFC
llvm-svn: 273586
2016-06-23 18:03:26 +00:00
Hal Finkel a1271036c5 Allow DeadStoreElimination to track combinations of partial later wrties
DeadStoreElimination can currently remove a small store rendered unnecessary by
a later larger one, but could not remove a larger store rendered unnecessary by
a series of later smaller ones. This adds that capability.

It works by keeping a map, which is used as an effective interval map, for each
store later overwritten only partially, and filling in that interval map as
more such stores are discovered. No additional walking or aliasing queries are
used. In the map forms an interval covering the the entire earlier store, then
it is dead and can be removed. The map is used as an interval map by storing a
mapping between the ending offset and the beginning offset of each interval.

I discovered this problem when investigating a performance issue with code like
this on PowerPC:

  #include <complex>
  using namespace std;

  complex<float> bar(complex<float> C);
  complex<float> foo(complex<float> C) {
    return bar(C)*C;
  }

which produces this:

  define void @_Z4testSt7complexIfE(%"struct.std::complex"* noalias nocapture sret %agg.result, i64 %c.coerce) {
  entry:
    %ref.tmp = alloca i64, align 8
    %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"*
    %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32
    %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32
    %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float
    %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32
    %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float
    call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce)
    %2 = bitcast %"struct.std::complex"* %agg.result to i64*
    %3 = load i64, i64* %ref.tmp, align 8
    store i64 %3, i64* %2, align 4 ; <--- ***** THIS SHOULD NOT BE HERE ****
    %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0
    %4 = lshr i64 %3, 32
    %5 = trunc i64 %4 to i32
    %6 = bitcast i32 %5 to float
    %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1
    %7 = trunc i64 %3 to i32
    %8 = bitcast i32 %7 to float
    %mul_ad.i.i = fmul fast float %6, %1
    %mul_bc.i.i = fmul fast float %8, %0
    %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i
    %mul_ac.i.i = fmul fast float %6, %0
    %mul_bd.i.i = fmul fast float %8, %1
    %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i
    store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4
    store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4
    ret void
  }

the problem here is not just that the i64 store is unnecessary, but also that
it blocks further backend optimizations of the other uses of that i64 value in
the backend.

In the future, we might want to add a special case for handling smaller
accesses (e.g. using a bit vector) if the map mechanism turns out to be
noticeably inefficient. A sorted vector is also a possible replacement for the
map for small numbers of tracked intervals.

Differential Revision: http://reviews.llvm.org/D18586

llvm-svn: 273559
2016-06-23 13:46:39 +00:00
Eric Christopher d3d9cbf127 Fix unused variable warning by folding the temporary into the debug statement.
llvm-svn: 273523
2016-06-23 00:42:00 +00:00
David Majnemer d1fbf48566 [SCCP] Don't assume all Constants are ConstantInt
This fixes PR28269.

llvm-svn: 273521
2016-06-23 00:14:29 +00:00
Sanjoy Das 5dae789a16 [RS4GC] Use StringRef; NFC
Spotted during random inspection.

llvm-svn: 273512
2016-06-22 23:32:46 +00:00
Rafael Espindola 2b7fef681f Delete more dead code.
Found by gcc 6.

llvm-svn: 273402
2016-06-22 12:44:16 +00:00
Rafael Espindola 48975881ab Delete some dead code.
Found by gcc 6.

llvm-svn: 273303
2016-06-21 19:48:12 +00:00
David Majnemer 41ff4fdcd4 Forgot to update callers of deleteDeadInstruction
llvm-svn: 273163
2016-06-20 16:07:38 +00:00
David Majnemer c5601df9fd Reapply "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273160, reapplying r273132.
RecursivelyDeleteTriviallyDeadInstructions cannot be called on a
parentless Instruction.

llvm-svn: 273162
2016-06-20 16:03:25 +00:00
Cong Liu 1c28b6d733 Revert "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273132.
Breaks multiple test under /llvm/test:Transforms (e.g.
llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan.

llvm-svn: 273160
2016-06-20 15:22:15 +00:00
Patrik Hagglund 4e0bd84b35 Fix formatting of r273144. NFC.
llvm-svn: 273149
2016-06-20 11:19:58 +00:00
Patrik Hagglund a83706e354 Avoid output indeterminism between GCC and Clang builds.
Remove dependency of the evalution order of function arguments, which
is unspecified.

The following test previously failed when built with GCC (but succeded
when built with Clang):

  ; RUN: opt -sroa -S < %s | FileCheck %s

  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  %A = type {i16}

  @a = global %A* null
  @b = global i16 0

  ; CHECK-LABEL: @f1(
  ; CHECK: alloca %A
  ; CHECK-NEXT: extractvalue %A
  ; CHECK-NEXT: getelementptr inbounds %A

  define void @f1 (%A %a) {
    %1 = alloca %A
    store %A %a, %A* %1
    %2 = load i16, i16* @b
    %3 = icmp ne i16 %2, 0
    br i1 %3, label %bb1, label %bb2
  bb1:
    store %A* %1, %A** @a
    br label %bb2
  bb2:
    ret void
  }

Patch by David Stenberg.

Differential Revision: http://reviews.llvm.org/D21226

llvm-svn: 273144
2016-06-20 10:19:00 +00:00
Patrik Hagglund 7205215591 Fix for PR27940
After a store has been eliminated, when making sure that the
instruction iterator points to a valid instruction, dbg intrinsics are
now ignored as a new instruction.

Patch by Henric Karlsson.

Reviewed by Daniel Berlin.

Differential Revision: http://reviews.llvm.org/D21076

llvm-svn: 273141
2016-06-20 09:10:10 +00:00
David Majnemer a705843f23 [LoopIdiom] Don't remove dead operands manually
Removing dead instructions requires remembering which operands have
already been removed.  RecursivelyDeleteTriviallyDeadInstructions has
this logic, don't partially reimplement it in LoopIdiomRecognize.

This fixes PR28196.

llvm-svn: 273132
2016-06-20 02:33:29 +00:00
David Majnemer 3ffe2dd4d2 Address Eli's post-commit comments
Use an APInt to handle pointers of arbitrary width, let
accumulateConstantOffset handle overflow issues.

llvm-svn: 273126
2016-06-19 21:36:35 +00:00
Sanjay Patel f8ee0e0218 fix formatting, typo; NFC
llvm-svn: 273118
2016-06-19 17:20:27 +00:00
David Majnemer 3119599475 [LoadCombine] Combine Loads formed from GEPS with negative indexes
Change the underlying offset and comparisons to use int64_t instead of
uint64_t.

Patch by River Riddle!

Differential Revision: http://reviews.llvm.org/D21499

llvm-svn: 273105
2016-06-19 06:14:56 +00:00
Adam Nemet a9f09c6245 [LAA] Enable symbolic stride speculation for all LAA clients
This is a functional change for LLE and LDist.  The other clients (LV,
LVerLICM) already had this explicitly enabled.

The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides.  This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter.  This makes the
interface more friendly to the new Pass Manager.

The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.

llvm-svn: 273064
2016-06-17 22:35:41 +00:00
Benjamin Kramer 1afc1de406 Apply another batch of fixes from clang-tidy's performance-unnecessary-value-param.
Contains some manual fixes. No functionality change intended.

llvm-svn: 273047
2016-06-17 20:41:14 +00:00
Davide Italiano b49aa5c0c4 [PM] Port MergedLoadStoreMotion to the new pass manager, take two.
This is indeed a much cleaner approach (thanks to Daniel Berlin
for pointing out), and also David/Sean for review.

Differential Revision:  http://reviews.llvm.org/D21454

llvm-svn: 273032
2016-06-17 19:10:09 +00:00
Benjamin Kramer 4dea8f542b Avoid duplicated map lookups. No functionality change intended.
llvm-svn: 273030
2016-06-17 18:59:41 +00:00
Justin Bogner 78eebe7756 LoopSimplifyCFG: Prefer `const auto &` to `auto &`, for clarity. NFC
llvm-svn: 273023
2016-06-17 17:59:48 +00:00
Sanjoy Das a324487493 [RS4GC] Pass CallSite by value instead of const ref; NFC
That's the idiomatic LLVM pattern.

llvm-svn: 272981
2016-06-17 00:45:00 +00:00
Chandler Carruth 164a2aa6f4 [PM] Remove support for omitting the AnalysisManager argument to new
pass manager passes' `run` methods.

This removes a bunch of SFINAE goop from the pass manager and just
requires pass authors to accept `AnalysisManager<IRUnitT> &` as a dead
argument. This is a small price to pay for the simplicity of the system
as a whole, despite the noise that changing it causes at this stage.

This will also helpfull allow us to make the signature of the run
methods much more flexible for different kinds af passes to support
things like intelligently updating the pass's progression over IR units.

While this touches many, many, files, the changes are really boring.
Mostly made with the help of my trusty perl one liners.

Thanks to Sean and Hal for bouncing ideas for this with me in IRC.

llvm-svn: 272978
2016-06-17 00:11:01 +00:00
Adam Nemet c953bb9953 [LV] Move management of symbolic strides to LAA. NFCI
This is still NFCI, so the list of clients that allow symbolic stride
speculation does not change (yes: LV and LoopVersioningLICM, no: LLE,
LDist).  However since the symbolic strides are now managed by LAA
rather than passed by client a new bool parameter is used to enable
symbolic stride speculation.

The existing test Transforms/LoopVectorize/version-mem-access.ll checks
that stride speculation is performed for LV.

The previously added test Transforms/LoopLoadElim/symbolic-stride.ll
ensures that no speculation is performed for LLE.

The next patch will change the functionality and turn on symbolic stride
speculation in all of LAA's clients and remove the bool parameter.

llvm-svn: 272970
2016-06-16 22:57:55 +00:00
Sanjoy Das 1ab2fad363 [EarlyCSE] Minor cosmetic NFC changes
- Avoid implicit conversion from pointer to bool
 - Add a comment when passing in a boolean value

llvm-svn: 272955
2016-06-16 21:00:57 +00:00
Sanjoy Das 07c6521aed [EarlyCSE] Fold invariant loads
Redundant invariant loads can be CSE'ed with very little extra effort
over what early-cse already tracks, so it looks reasonable to make
early-cse handle this case.

llvm-svn: 272954
2016-06-16 20:47:57 +00:00
Davide Italiano 41315f7873 [PM] Revert the port of MergeLoadStoreMotion to the new pass manager.
Daniel Berlin expressed some real concerns about the port and proposed
and alternative approach. I'll revert this for now while working on a
new patch, which I hope to put up for review shortly. Sorry for the churn.

llvm-svn: 272925
2016-06-16 17:40:53 +00:00
Chad Rosier 624fee55bc [DSE] Minor style cleanup. NFC.
llvm-svn: 272922
2016-06-16 17:06:04 +00:00
Igor Laevsky 87f0d0e185 Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo"
It was causing failures in Profile-i386 and Profile-x86_64 tests.

llvm-svn: 272912
2016-06-16 16:25:53 +00:00
Igor Laevsky c9179fd2c2 [JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo
We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise 
we will get dangling pointer inside BranchProbabilityInfo cache.

Differential Revision: http://reviews.llvm.org/D20957

llvm-svn: 272891
2016-06-16 13:28:25 +00:00
Patrik Hagglund 0acaefaf9d PR27938: Don't remove valid DebugLoc in Scalarizer
Added checks to make sure the Scalarizer::transferMetadata() don't
remove valid debug locations from instructions. This is important as
the verifier pass require that e.g. inlinable callsites have a valid
debug location.

https://llvm.org/bugs/show_bug.cgi?id=27938

Patch by Karl-Johan Karlsson

Reviewers: dblaikie

Differential Revision: http://reviews.llvm.org/D20807

llvm-svn: 272884
2016-06-16 10:48:54 +00:00
Adam Nemet bdbc5227ce [LAA] Default getInfo to not speculate symbolic strides. NFC
Soon we won't be passing Strides to getInfo and then we'll have fewer
call sites to update.

llvm-svn: 272878
2016-06-16 08:26:56 +00:00
Chad Rosier 72a793c5b1 [DSE] Hoist a redundant check to simplify logic. NFC.
llvm-svn: 272849
2016-06-15 22:17:38 +00:00
Chad Rosier 844e2df94b Typo. NFC.
llvm-svn: 272846
2016-06-15 21:41:22 +00:00
Davide Italiano 63af1aa0c2 [PM] Remove unneded doFinalization() override from LoopVersioningLICM.
llvm-svn: 272842
2016-06-15 21:23:54 +00:00
Sean Silva a4c2d150d0 [PM] Port AlignmentFromAssumptions to the new PM.
This uses the "runImpl" pattern to share code between the old and new PM.

llvm-svn: 272757
2016-06-15 06:18:01 +00:00
David Majnemer 4a697c312f [LoopUnroll] Don't crash trying to unroll loop with EH pad exit
We do not support splitting cleanuppad or catchswitches.  This is
problematic for passes which assume that a loop is in loop simplify
form (the loop would have a dedicated exit block instead of sharing it).

While it isn't great that we don't support this for cleanups, we still
cannot make loop-simplify form an assertable precondition because
indirectbr will also disable these sorts of CFG cleanups.

This fixes PR28132.

llvm-svn: 272739
2016-06-15 00:19:56 +00:00
David Majnemer cbf614a93b Remove the ScalarReplAggregates pass
Nearly all the changes to this pass have been done while maintaining and
updating other parts of LLVM.  LLVM has had another pass, SROA, which
has superseded ScalarReplAggregates for quite some time.

Differential Revision: http://reviews.llvm.org/D21316

llvm-svn: 272737
2016-06-15 00:19:09 +00:00
Peter Collingbourne 96efdd6107 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

llvm-svn: 272709
2016-06-14 21:01:22 +00:00
Sebastian Pop dfb66a1191 LoopRotate: restructure code to simplify functions
We move the loop rotate functions in a separate class to avoid passing multiple
parameters to each function.  This cleanup will help with further development of
loop rotation.  NFC.

Patch written by Aditya Kumar and Sebastian Pop.

Differential Revision: http://reviews.llvm.org/D21311

llvm-svn: 272672
2016-06-14 14:44:05 +00:00
Chad Rosier 66a9d07a86 [MergedLoadStoreMotion] Before quering AA verify the loads are the same.
Basicaa stats show the number of queries in Spec2k6 are reduced by 4540
or ~.67% overall.

llvm-svn: 272661
2016-06-14 12:47:18 +00:00
Sean Silva 6347df0f81 [PM] Port MemCpyOpt to the new PM.
The need for all these Lookup* functions is just because of calls to
getAnalysis inside methods (i.e. not at the top level) of the
runOnFunction method. They should be straightforward to clean up when
the old PM is gone.

llvm-svn: 272615
2016-06-14 02:44:55 +00:00
Davide Italiano 3ab1b588b5 [PM/MergedLoadStoreMotion] Preserve analyses more aggressively.
llvm-svn: 272611
2016-06-14 01:23:31 +00:00
Sean Silva 46590d556a Bring back "[PM] Port JumpThreading to the new PM" with a fix
This reverts commit r272603 and adds a fix.

Big thanks to Davide for pointing me at r216244 which gives some insight
into how to fix this VS2013 issue. VS2013 can't synthesize a move
constructor. So the fix here is to add one explicitly to the
JumpThreadingPass class.

llvm-svn: 272607
2016-06-14 00:51:09 +00:00
Davide Italiano 89ab89d6cd [PM] Port MergedLoadStoreMotion to the new pass manager.
llvm-svn: 272606
2016-06-14 00:49:23 +00:00
Sean Silva 7d5a57cbfc Revert "[PM] Port JumpThreading to the new PM"
This reverts commit r272597.

Will investigate issue with VS2013 compilation and then recommit.

llvm-svn: 272603
2016-06-14 00:26:31 +00:00
Davide Italiano 86c1f953f5 [PM/MergedLoadStoreMotion] Remove unneeded pass dependency.
llvm-svn: 272598
2016-06-13 23:28:35 +00:00
Sean Silva f81328d0b4 [PM] Port JumpThreading to the new PM
This follows the approach in r263208 (for GVN) pretty closely:
- move the bulk of the body of the function to the new PM class.
- expose a runImpl method on the new-PM class that takes the IRUnitT and
  pointers/references to any analyses and use that to implement the
  old-PM class.
- use a private namespace in the header for stuff that used to be file
  scope

llvm-svn: 272597
2016-06-13 22:52:52 +00:00
Davide Italiano 44faf7f407 [PM/MergeLoadStoreMotion] Convert the logic to static functions.
Pass AliasAnalyis and MemoryDepResult around. This is in preparation
for porting this pass to the new PM.

llvm-svn: 272595
2016-06-13 22:27:30 +00:00
Sean Silva 687019facb [PM] Port LVI to the new PM.
This is a bit gnarly since LVI is maintaining its own cache.
I think this port could be somewhat cleaner, but I'd rather not spend
too much time on it while we still have the old pass hanging around and
limiting how much we can clean things up.
Once the old pass is gone it will be easier (less time spent) to clean
it up anyway.

This is the last dependency needed for porting JumpThreading which I'll
do in a follow-up commit (there's no printer pass for LVI or anything to
test it, so porting a pass that depends on it seems best).

I've been mostly following:
r269370 / D18834 which ported Dependence Analysis
r268601 / D19839 which ported BPI

llvm-svn: 272593
2016-06-13 22:01:25 +00:00
Benjamin Kramer d3f4c05aea Move instances of std::function.
Or replace with llvm::function_ref if it's never stored. NFC intended.

llvm-svn: 272513
2016-06-12 16:13:55 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Eli Friedman 9f8031c2da [MergedLoadStoreMotion] Use correct helper for load hoist safety.
It isn't legal to hoist a load past a call which might not return;
even if it doesn't throw, it could, for example, call exit().

Fixes http://llvm.org/PR27953.

llvm-svn: 272495
2016-06-12 02:11:20 +00:00
Eli Friedman f1da33e4d3 [LICM] Make isGuaranteedToExecute more accurate.
Summary:
Make isGuaranteedToExecute use the
isGuaranteedToTransferExecutionToSuccessor helper, and make that helper
a bit more accurate.

There's a potential performance impact here from assuming that arbitrary
calls might not return. This probably has little impact on loads and
stores to a pointer because most things alias analysis can reason about
are dereferenceable anyway. The other impacts, like less aggressive
hoisting of sdiv by a variable and less aggressive hoisting around
volatile memory operations, are unlikely to matter for real code.

This also impacts SCEV, which uses the same helper.  It's a minor
improvement there because we can tell that, for example, memcpy always
returns normally. Strictly speaking, it's also introducing
a bug, but it's not any worse than everywhere else we assume readonly
functions terminate.

Fixes http://llvm.org/PR27857.

Reviewers: hfinkel, reames, chandlerc, sanjoy

Subscribers: broune, llvm-commits

Differential Revision: http://reviews.llvm.org/D21167

llvm-svn: 272489
2016-06-11 21:48:25 +00:00
Michael Zolotukhin b98294d006 Don't try to rotate a loop more than once - we never do this anyway.
Summary:
I can't find a case where we can rotate a loop more than once, and it looks
like we never do this. To rotate a loop following conditions should be met:
1) its header should be exiting
2) its latch shouldn't be exiting

But after the first rotation the header becomes the new latch, so this
condition can never be true any longer.

Tested on with an assert on LNT testsuite and make check.

Reviewers: hfinkel, sanjoy

Subscribers: sebpop, sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20181

llvm-svn: 272439
2016-06-10 22:03:56 +00:00
Nico Weber 2cf5e89e1d Remove a few gendered pronouns.
llvm-svn: 272422
2016-06-10 20:06:03 +00:00
Evgeniy Stepanov eaea297df4 Disable MSan-hostile loop unswitching.
Loop unswitching may cause MSan false positive when the unswitch
condition is not guaranteed to execute.

This is very similar to ASan and TSan special case in
llvm::isSafeToSpeculativelyExecute (they don't like speculative loads
and stores), but for branch instructions.

This is a workaround for PR28054.

llvm-svn: 272421
2016-06-10 20:03:20 +00:00
Evgeniy Stepanov 122f984a33 Move isGuaranteedToExecute out of LICM.
Also rename LICMSafetyInfo to LoopSafetyInfo.
Both will be used in LoopUnswitch in a separate change.

llvm-svn: 272420
2016-06-10 20:03:17 +00:00
Chad Rosier 840b3efeae Add a period. NFC.
llvm-svn: 272410
2016-06-10 17:59:22 +00:00
Chad Rosier a8bc512be5 Fix whitespace. NFC.
llvm-svn: 272409
2016-06-10 17:58:01 +00:00
Easwaran Raman e12c487b8c [PM] Port LCSSA to the new PM.
Differential Revision: http://reviews.llvm.org/D21090

llvm-svn: 272294
2016-06-09 19:44:46 +00:00
Xinliang David Li ecde1c7f3d Revert r272194 No need for it if loop Analysis Manager is used
llvm-svn: 272243
2016-06-09 03:22:39 +00:00
Davide Italiano 02861d8695 [PM] Add missing caching of GlobalsAA to EarlyCSE.
llvm-svn: 272204
2016-06-08 21:31:55 +00:00
Evgeny Stupachenko 3e2f389a7e The patch set unroll disable pragma when unroll
with user specified count has been applied.

Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it set the pragma in all cases. This helps to prevent multiple
unroll when -unroll-count=N is given.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D20765

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 272195
2016-06-08 20:21:24 +00:00
Xinliang David Li 572135f717 [PM] Refector LoopAccessInfo analysis code
This is the preparation patch to port the analysis to new PM

Differential Revision: http://reviews.llvm.org/D20560

llvm-svn: 272194
2016-06-08 20:15:37 +00:00
Tim Shen 7aa0ad65ce [MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy
Reviewers: iteratee

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21087

llvm-svn: 272192
2016-06-08 19:42:32 +00:00
Benjamin Kramer c321e53402 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
Davide Italiano d8d83f4773 [PM/SimplifyCFG] Preserve GlobalsAA even if the IR is mutated.
llvm-svn: 272139
2016-06-08 13:32:23 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Davide Italiano 16e96d4b16 [PM] Preserve GlobalsAA for SROA.
Differential Revision:  http://reviews.llvm.org/D21040

llvm-svn: 272009
2016-06-07 13:21:17 +00:00
Davide Italiano fea0a4c5b2 [PM] Preserve the correct set of analyses for GVN.
llvm-svn: 271934
2016-06-06 20:01:50 +00:00
Davide Italiano 82c447823b [GVN] Switch dump() definition over to LLVM_DUMP_METHOD.
llvm-svn: 271932
2016-06-06 19:24:27 +00:00
Geoff Berry 43e5160d0e Reapply [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Originally reviewed in http://reviews.llvm.org/D18001

Reviewers: atrick

Subscribers: llvm-commits, mzolotukhin, mcrosier

Differential Revision: http://reviews.llvm.org/D18480

llvm-svn: 271929
2016-06-06 19:10:46 +00:00
Eli Friedman ee89505799 LICM: Don't sink stores out of loops that may throw.
Summary:
This hasn't been caught before because it requires noalias or similarly
strong alias analysis to actually reproduce.

Fixes http://llvm.org/PR27952 .

Reviewers: hfinkel, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20944

llvm-svn: 271858
2016-06-05 22:13:52 +00:00
Sanjoy Das 4d4339d1e8 [PM] Port IndVarSimplify to the new pass manager
Summary:
There are some rough corners, since the new pass manager doesn't have
(as far as I can tell) LoopSimplify and LCSSA, so I've updated the
tests to run them separately in the old pass manager in the lit tests.
We also don't have an equivalent for AU.setPreservesCFG() in the new
pass manager, so I've left a FIXME.

Reviewers: bogner, chandlerc, davide

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20783

llvm-svn: 271846
2016-06-05 18:01:19 +00:00
Sanjoy Das f90e28d6fd [IndVars] Remove -liv-reduce
It is an off-by-default option that no one seems to use[0], and given
that SCEV directly understands the overflow instrinsics there is no real
need for it anymore.

[0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html

llvm-svn: 271845
2016-06-05 18:01:12 +00:00
Michael Zolotukhin 585649895f [LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic.
In r270478, where I enabled the new heuristic I posted testing results,
which I got when explicitly passed the thresholds values via CL options.
However, setting the CL options init-values is not enough to change the
default values of thresholds, so I'm changing them in another place now.

llvm-svn: 271615
2016-06-03 00:16:46 +00:00
Davide Italiano 8738363339 [TailRecursionElimination] Refactor/cleanup.
In preparation for porting to the new PM.
Patch by Jake VanAdrighem! (review mainly by me/Justin)

Differential Revision:  http://reviews.llvm.org/D20610

llvm-svn: 271607
2016-06-02 23:02:44 +00:00
Davide Italiano 6dfdbf1f46 [PM] LoadCombine preserves GlobalsAA, doesn't depend on it.
llvm-svn: 271601
2016-06-02 22:05:59 +00:00
Davide Italiano 84e1414522 [PM/LoadCombine] Inline getAnalysisUsage(). NFCI.
llvm-svn: 271600
2016-06-02 22:04:43 +00:00
Davide Italiano bdc2971434 [PM] BDCE: Fix caching of analyses.
Another chapter in the story. GlobalsAA should be preserved, as
 well as the CFG.

llvm-svn: 271307
2016-05-31 17:53:22 +00:00
Davide Italiano 688616ff74 [PM] ADCE: Fix caching of analyses.
When this pass was originally ported, AA wasn't available for the
new PM. Now it is, so we can cache properly.

llvm-svn: 271303
2016-05-31 17:39:39 +00:00
Craig Topper 8287fd8abd [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Sanjoy Das 3e5ce2b737 [IndVars] Assert that the incoming IR is in LCSSA
Since we already assert that the outgoing IR is in LCSSA, it is easy to
get misled into thinking that -indvars broke LCSSA if the incoming IR is
non-LCSSA.  Checking this pre-condition will make such cases break in
more obvious ways.

Inspired by (but does _not_ fix) PR26682.

llvm-svn: 271196
2016-05-30 01:37:39 +00:00
Sanjoy Das 496f274257 [IndVarSimplify] Extract the logic of `-indvars` out into a class; NFC
This will be used later to port IndVarSimplify to the new pass manager.

llvm-svn: 271190
2016-05-29 21:42:00 +00:00
Davide Italiano 39893bd41c [PM] Reassociate: cache analyses more aggressively.
While here, add a FIXME for setPreserveCFG().

llvm-svn: 271159
2016-05-29 00:41:17 +00:00
Davide Italiano 484b5ab39d [PM] SCCP should preserve GlobalsAA even if the IR is mutated.
llvm-svn: 271149
2016-05-29 00:31:15 +00:00
Evgeny Stupachenko b787522d28 The patch fixes r271071
Summary:
unused variables in Release mode:
  BasicBlock *Header
  unsigned OrigCount
put under DEBUG

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271076
2016-05-28 00:14:58 +00:00
Evgeny Stupachenko ea2aef4a1d The patch refactors unroll pass.
Summary:
Unroll factor (Count) calculations moved to a new function.
Early exits on pragma and "-unroll-count" defined factor added.
New type of unrolling "Force" introduced (previously used implicitly).
New unroll preference "AllowRemainder" introduced and set "true" by default.
(should be set to false for architectures that suffers from it).

Reviewers: hfinkel, mzolotukhin, zzheng

Differential Revision: http://reviews.llvm.org/D19553

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271071
2016-05-27 23:15:06 +00:00
Sanjoy Das 6fff9dc932 [GVN] Preserve !range metadata when PRE'ing loads
Reviewers: dberlin, reames, george.burgess.iv

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20743

llvm-svn: 271034
2016-05-27 19:03:10 +00:00
Benjamin Kramer 82de7d323d Apply clang-tidy's misc-move-constructor-init throughout LLVM.
No functionality change intended, maybe a tiny performance improvement.

llvm-svn: 270997
2016-05-27 14:27:24 +00:00
Igor Laevsky df9db45c94 [RewriteStatepointsForGC] All constant should have null base pointer
Currently we consider that each constant has itself as a base value. I.e "base(const) = const". 
This introduces couple of problems when we are trying to avoid reporting constants in statepoint live sets:

1. When querying "base( phi(const1, const2) )" we will get "phi(const1, const2)" as a base pointer. Since 
   it's not a constant we will record it in a stack map. However on practice we don't want this to happen
   (constant are never relocated).
2. base( phi(const, gc ptr) ) = phi( const, base(gc ptr) ). This particular case imposes challenge on our 
   runtime - we don't expect to see constant base pointers other than null. This problems can be avoided 
   by treating all constant as if they were derived from null pointer base. I.e in a first case we will 
   not include constant pointer in a stack map at all. In a second case we will get "phi(null, base(gc ptr))" 
   as a base pointer which is a lot more convenient.

Differential Revision: http://reviews.llvm.org/D20584

llvm-svn: 270993
2016-05-27 13:13:59 +00:00
Michael Zolotukhin 1ecdedad8d [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.
Condition might be simplified to a Constant, but it doesn't have to be
ConstantInt, so we should dyn_cast, instead of cast.

This fixes PR27886.

llvm-svn: 270924
2016-05-26 21:42:51 +00:00
David Majnemer d99068d26d [MemCpyOpt] Don't perform callslot optimization across may-throw calls
An exception could prevent a store from occurring but MemCpyOpt's
callslot optimization would fire anyway, causing the store to occur.

This fixes PR27849.

llvm-svn: 270892
2016-05-26 19:24:24 +00:00
David Majnemer 474512576e [MergedLoadStoreMotion] Don't transform across may-throw calls
It is unsafe to hoist a load before a function call which may throw, the
throw might prevent a pointer dereference.

Likewise, it is unsafe to sink a store after a call which may throw.
The caller might be able to observe the difference.

This fixes PR27858.

llvm-svn: 270828
2016-05-26 07:11:09 +00:00
David Majnemer 8cce333abd [MergedLoadStoreMotion] Small cleanup
No functional change is intended.

llvm-svn: 270824
2016-05-26 05:43:12 +00:00
Craig Topper a423aa4642 [X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this. Just seemed like we should be consistent.

llvm-svn: 270819
2016-05-26 04:28:45 +00:00
Sanjoy Das ee77a4828e [IRCE] Use C++11 style initializers; NFC
llvm-svn: 270815
2016-05-26 01:50:18 +00:00
Sanjoy Das a099268e85 [IRCE] Optimize conjunctions of range checks
After this change, we do the expected thing for cases like

```
Check0Passed = /* range check IRCE can optimize */
Check1Passed = /* range check IRCE can optimize */
if (!(Check0Passed && Check1Passed))
  throw_Exception();
```

llvm-svn: 270804
2016-05-26 00:09:02 +00:00
Sanjoy Das 8fe8892c2d [IRCE] Refactor out a parseRangeCheckFromCond; NFC
This will later hold more general logic to parse conjunctions of range
checks.

llvm-svn: 270802
2016-05-26 00:08:24 +00:00
Davide Italiano 1021c68e92 [PM] Port PartiallyInlineLibCalls to the new pass manager.
llvm-svn: 270798
2016-05-25 23:38:53 +00:00
Davide Italiano d85ac997b8 [PM] CorrelatedValuePropagation: pass state to function. NFCI.
While here, convert the logic of the pass to use static function(s).
This is in preparation for porting this pass to the new PM.

llvm-svn: 270734
2016-05-25 17:39:54 +00:00
Craig Topper 12e322a8cf [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time.
llvm-svn: 270677
2016-05-25 06:56:32 +00:00
Davide Italiano 655a145e83 [PM] Port BDCE to the new pass manager.
llvm-svn: 270647
2016-05-25 01:57:04 +00:00
Michael Zolotukhin 8f7a242c7b Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time.
This reverts commit r270577.

llvm-svn: 270630
2016-05-24 23:00:05 +00:00
Sanjoy Das be99153aca [GuardWidening] Tighten the interface of the RangeCheck struct; NFC
Make `GuardWideningImpl::RangeCheck` into a class and add accessors.

llvm-svn: 270611
2016-05-24 20:54:45 +00:00
Sanjoy Das 5fd7ac452e [IRCE] Return a Value*, not SCEV* from parseRangeCheck; NFC
This is better layering, since the caller needs to check if the index
was an add-rec anyway.

llvm-svn: 270582
2016-05-24 17:19:56 +00:00
Hans Wennborg b64e4390a3 Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default.
Chromium builds are still hitting the assert in PR27874.

llvm-svn: 270577
2016-05-24 16:10:12 +00:00
Michael Zolotukhin 96c150d154 Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default.""
This reverts commit r270512 and reapplies r270478. Originally it caused
PR27847, but it was fixed in r270517.

llvm-svn: 270518
2016-05-24 01:22:20 +00:00
Hans Wennborg 6951028b61 Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."
This caused PR27847.

llvm-svn: 270512
2016-05-23 23:42:35 +00:00
Sanjoy Das aa83c47bab [IRCE] Optimize "uses" not branches; NFCI
This changes IRCE to optimize uses, and not branches.  This change is
NFCI since the uses we do inspect are in practice only ever going to be
the condition use in conditional branches; but this flexibility will
later allow us to analyze more complex expressions than just a direct
branch on a range check.

llvm-svn: 270500
2016-05-23 22:16:45 +00:00
Michael Zolotukhin be080fc51d [LoopUnroll] Enable advanced unrolling analysis by default.
Summary:
This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but it might require being smarter in
which loops we're optimizing. E.g. currently the biggest issue is that
with more agressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).

Test results for compile time(using 4 samples to reduce noise):
```
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect  4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow  3.39%
MultiSource/Applications/JM/lencod/lencod 1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06%
```

I didn't see any performance changes in the testsuite, but it improves
some internal tests.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20482

llvm-svn: 270478
2016-05-23 19:10:19 +00:00
Sanjoy Das c5b1169de2 [IRCE] Don't use an allocator for range checks; NFC
The InductiveRangeCheck struct is only five words long; so passing these
around value is fine.  The allocator makes the code look more complex
than it is.

llvm-svn: 270309
2016-05-21 02:52:13 +00:00
Sanjoy Das 59776734a3 [IRCE] Don't pass IRBuilder<> where unnecessary; NFC
llvm-svn: 270308
2016-05-21 02:31:51 +00:00
Sanjoy Das be6c7a12cb [GuardWidening] Fix incorrect use of remove_if
I had used `std::remove_if` under the assumption that it moves the
predicate matching elements to the end, but actaully the elements
remaining towards the end (after the iterator returned by
`std::remove_if`) are indeterminate.  Fix the bug (and make the code
more straightforward) by using a temporary SmallVector, and add a test
case demonstrating the issue.

llvm-svn: 270306
2016-05-21 02:24:44 +00:00
Davide Italiano f7211fd44d [PM/PartiallyInlineLibCalls] Fix pass dependencies.
Inline getAnalysisUsage() while I'm here.

llvm-svn: 270231
2016-05-20 16:23:14 +00:00
Davide Italiano 8749dfd1bf [PartiallyInlineLibCalls] Remove dead includes. NFC.
llvm-svn: 270228
2016-05-20 15:52:23 +00:00
Davide Italiano 08713bd1ed [PM/PartiallyInlineLibCalls] Convert to static function in preparation for porting this pass to the new PM.
llvm-svn: 270225
2016-05-20 15:43:39 +00:00
Sanjoy Das 2351975860 Add const qualifiers to appease bots; NFC
llvm-svn: 270155
2016-05-19 23:15:59 +00:00
Sanjoy Das f5f0331a3b [GuardWidening] Introduce range check merging
Sequences of range checks expressed using guards, like

  guard((I - 2) u< L)
  guard((I - 1) u< L)
  guard((I + 0) u< L)
  guard((I + 1) u< L)
  guard((I + 2) u< L)

can sometimes be combined into a smaller sequence:

  guard((I - 2) u< L AND (I + 2) u< L)

if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks
expressed in the previous sequence.

This change teaches GuardWidening to do this kind of merging when
feasible.

llvm-svn: 270151
2016-05-19 22:55:46 +00:00
Davide Italiano 46f249b4cd [SCCP] Prefer class to struct.
llvm-svn: 270074
2016-05-19 15:58:02 +00:00
Sanjoy Das b784ed36c0 [GuardWidening] Use getEquivalentICmp to fold constant compares
`ConstantRange::getEquivalentICmp` is more general, and better
factored.

llvm-svn: 270019
2016-05-19 03:53:17 +00:00
Sanjoy Das 52bbde2bbc [LowerGuards] Rename variable; NFC
PredicatePassProbability is a better name for what LikelyBranchWeight
was trying to express.

llvm-svn: 269999
2016-05-18 23:16:27 +00:00
Sanjoy Das 083f38939b New pass: guard widening
Summary:
Implement guard widening in LLVM. Description from GuardWidening.cpp:

The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM
transform it so that it fails more often that it did before the
transform.  This optimization is called "widening" and can be used hoist
and common runtime checks in situations like these:

```
%cmp0 = 7 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
%cmp1 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
...
```

to

```
%cmp0 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
...
```

If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back
to a generic implementation of the same function, which will have the
correct semantics from that point onward.  It is always _legal_ to
deoptimize (so replacing `%cmp0` with false is "correct"), though it may
not always be profitable to do so.

NB! This pass is a work in progress.  It hasn't been tuned to be
"production ready" yet.  It is known to have quadriatic running time and
will not scale to large numbers of guards

Reviewers: reames, atrick, bogner, apilipenko, nlewycky

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20143

llvm-svn: 269997
2016-05-18 22:55:34 +00:00
Michael Zolotukhin d2268a73bc [LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands.
Previously, we didn't add their and their operands cost, which could've
resulted in unrolling loops for no actual benefit.

llvm-svn: 269985
2016-05-18 21:20:12 +00:00
Davide Italiano 98f7e0e790 [PM] Port per-function SCCP to the new pass manager.
llvm-svn: 269937
2016-05-18 15:18:25 +00:00
Justin Bogner 594e07bd78 [PM] Port DSE to the new pass manager
Patch by JakeVanAdrighem. Thanks!

llvm-svn: 269847
2016-05-17 21:38:13 +00:00
Sanjoy Das fd67038c8b [Guards] Add branch metadata when lowering
Guards are expected to basically never fail.  Reflect this in the branch
probabilities in their lowered form.

llvm-svn: 269791
2016-05-17 17:51:19 +00:00
Igor Laevsky 953f2d2a54 [RewriteStatepointsForGC] Remove obsolete assertion
This is assertion is no longer necessary since we never record
constants in the live set anyway. (They are never recorded in 
the initial live set, and constant bases are removed near line 2119)

Differential Revision: http://reviews.llvm.org/D20293

llvm-svn: 269764
2016-05-17 13:54:10 +00:00
Davide Italiano 6f852eedbf [PM] RewriterStatepointForGC: add missing dependency.
llvm-svn: 269624
2016-05-16 02:29:53 +00:00
Benjamin Kramer a65b610bd2 Move helper classes into anonymous namespaces. NFC.
llvm-svn: 269591
2016-05-15 15:18:11 +00:00
Davide Italiano e62c54375d [PM/SCCP] Fix pass dependencies.
TargetLibraryInfoWrapperPass is a dependency of
SCCP but it's not listed as such. Chandler pointed
out this is an easy mistake to make which only
surfaces in weird crashes with some flag combinations.
This code will go away anyway at some point in the
future, but as long as it's (still) exercised, try
to make it correct.

llvm-svn: 269589
2016-05-15 08:04:28 +00:00
Davide Italiano e7c56c5c4f [SCCP] Use range-based for loops. NFC.
llvm-svn: 269578
2016-05-14 20:59:09 +00:00
Davide Italiano 9922344178 [PM] Port LowerAtomic to the new pass manager.
llvm-svn: 269511
2016-05-13 22:52:35 +00:00
Michael Zolotukhin 963a6d9c69 Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...""
This reverts commit r269395.

Try to reapply with a fix from chapuni.

llvm-svn: 269486
2016-05-13 21:23:25 +00:00
Jun Bum Lim be11bdc4b0 Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes  fixed in r269433.

Reviewers: joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20248

llvm-svn: 269456
2016-05-13 18:38:35 +00:00
Geoff Berry 2f64c20284 [EarlyCSE] Change key type of AvailableCalls to Instruction*. NFCI.
llvm-svn: 269445
2016-05-13 17:54:58 +00:00
Jun Bum Lim f28beac419 [MemCpyOpt] Use MaxIntSize in byte instead of bit
Summary: This change fix the bug in isProfitableToUseMemset() where MaxIntSize shoule be in byte, not bit.

Reviewers: arsenm, joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20176

llvm-svn: 269433
2016-05-13 16:52:24 +00:00
Michael Zolotukhin 9be3b8b9bb Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."
This reverts commit r269388.

It caused some bots to fail, I'm reverting it until I investigate the
issue.

llvm-svn: 269395
2016-05-13 06:32:25 +00:00
Adam Nemet eff76646f5 [LoopDist] Only run LAA for loops with the pragma
This should fix some compile-time regressions after r267672.  Thanks to
Chris Matthews for bisecting it.

llvm-svn: 269392
2016-05-13 04:20:31 +00:00
Michael Zolotukhin b7b8052982 [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
avoid this from becoming an excessively slow analysis.

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up, I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look at.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrances around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when if we had continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly hard to
track down random changes to behavior.

We could use a techinque similar (but much simpler) within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

llvm-svn: 269388
2016-05-13 01:42:39 +00:00
Chandler Carruth 49c22190d0 [PM] Port of the DepndenceAnalysis to the new PM.
Ported DA to the new PM by splitting the former DependenceAnalysis Pass
into a DependenceInfo result type and DependenceAnalysisWrapperPass type
and adding a new PM-style DependenceAnalysis analysis pass returning the
DependenceInfo.

Patch by Philip Pfaffe, most of the review by Justin.

Differential Revision: http://reviews.llvm.org/D18834

llvm-svn: 269370
2016-05-12 22:19:39 +00:00
Davide Italiano 851f879f32 [PM] Make LowerAtomic a FunctionPass.
Differential Revision: http://reviews.llvm.org/D20025

llvm-svn: 269322
2016-05-12 18:49:32 +00:00
David Majnemer 96f0d383a7 [SCCP] Resolve shifts beyond the bitwidth to undef
Shifts beyond the bitwidth are undef but SCCP resolved them to zero.
Instead, DTRT and resolve them to undef.

This reimplements the transform which caused PR27712.

llvm-svn: 269269
2016-05-12 03:07:40 +00:00
Davide Italiano cd7c84bd8b Revert "[SCCP] Partially propagate informations when the input is not fully defined."
This reverts commit r269105 as it caused PR27712.

llvm-svn: 269252
2016-05-11 23:06:10 +00:00
Tim Northover 3961735f03 Revert "MemCpyOpt: combine local load/store sequences into memcpy."
This reverts commit r269125. It was in my tree when I ran "git svn dcommit".
It's really still under review.

llvm-svn: 269127
2016-05-10 21:49:40 +00:00
Tim Northover 6c65c71639 MemCpyOpt: combine local load/store sequences into memcpy.
Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block
that really implements a memcpy operation, backends can benefit from seeing
this.

llvm-svn: 269125
2016-05-10 21:48:11 +00:00
Hans Wennborg 719b26ba54 Loop unroller: set thresholds for optsize and minsize functions to zero
Before r268509, Clang would disable the loop unroll pass when optimizing
for size. That commit enabled it to be able to support unroll pragmas
in -Os builds. However, this regressed binary size in one of Chromium's
DLLs with ~100 KB.

This restores the original behaviour of no unrolling at -Os, but doing it
in LLVM instead of Clang makes more sense, and also allows the pragmas to
keep working.

Differential revision: http://reviews.llvm.org/D20115

llvm-svn: 269124
2016-05-10 21:45:55 +00:00
Lawrence Hu e58a814c07 Enable loopreroll for sext of loop control only IV
This patch extend loopreroll to allow the instruction chain
        of loop control only IV has sext.

        Differential Revision: http://reviews.llvm.org/D19820

llvm-svn: 269121
2016-05-10 21:16:49 +00:00
Lawrence Hu fe7c87beac Revert r26084: Enable loopreroll for sext of loop control only IV
llvm-svn: 269119
2016-05-10 21:11:09 +00:00
Davide Italiano 7860c9bbf4 [SCCP] Partially propagate informations when the input is not fully defined.
With this patch:
%r1 = lshr i64 -1, 4294967296 -> undef

Before this patch:
%r1 = lshr i64 -1, 4294967296 -> 0

llvm-svn: 269105
2016-05-10 19:49:47 +00:00
Lawrence Hu 8cc3b37d2c Enable loopreroll for sext of loop control only IV
This patch extend loopreroll to allow the instruction chain
    of loop control only IV has sext.

llvm-svn: 269084
2016-05-10 17:42:27 +00:00
Chuang-Yu Cheng 175741d5a7 Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation
Loop rotation clones instruction from the old header into the preheader. If
there were uses of values produced by these instructions that were outside
the loop, we have to insert PHI nodes to merge the two values. If the values
are used by DbgIntrinsics they will be used as a MetadataAsValue of a
ValueAsMetadata of the original values, and iterating all of the uses of the
original value will not update the DbgIntrinsics. The new code checks if the
values are used by DbgIntrinsics and if so, updates them using essentially
the same logic as the original code.

The attached testcase demonstrates the issue. Without the fix, the
DbgIntrinic outside the loop uses values computed inside the loop, even
though these values do not dominate the DbgIntrinsic.

Author: Thomas Jablin (tjablin)
Reviewers: dblaikie aprantl kbarton hfinkel cycheng

http://reviews.llvm.org/D19564

llvm-svn: 269034
2016-05-10 09:45:44 +00:00
Denis Zobnin 15d1e64b2b [LAA] Rename "isStridedPtr" with "getPtrStride". NFC.
Changing misleading function name was approved in http://reviews.llvm.org/D17268.
Patch by Roman Shirokiy.

llvm-svn: 269021
2016-05-10 05:55:16 +00:00
Philip Reames 4a3c3b66d7 [GVN] PRE of unordered loads
Again, fairly simple.  Only change is ensuring that we actually copy the property of the load correctly.  The aliasing legality constraints were already handled by the FRE patches.  There's nothing special about unorder atomics from the perspective of the PRE algorithm itself.

llvm-svn: 268804
2016-05-06 21:43:51 +00:00
Sanjoy Das 091fcfa3a7 [RS4GC] Fix typo in comment
llvm-svn: 268790
2016-05-06 20:39:33 +00:00
Philip Reames 1fdce639d2 [GVN] Handle unordered atomics in cross block FRE
You'll note there are essentially no code changes here.  Cross block FRE heavily reuses code from the block local FRE.  All of the tricky parts were done as part of the previous patch and the refactoring that removed the original code duplication.  

llvm-svn: 268775
2016-05-06 18:46:45 +00:00
Philip Reames ae8997f496 [GVN] Do local FRE for unordered atomic loads
This patch is the first in a small series teaching GVN to optimize unordered loads aggressively. This change just handles block local FRE because that's the simplest thing which lets me test MDA, and the AvailableValue pieces. Somewhat suprisingly, MDA appears fine and only a couple of small changes are needed in GVN.

Once this is in, I'll tackle non-local FRE and PRE. The former looks like a natural extension of this, the later will require a couple of minor changes.

Differential Revision: http://reviews.llvm.org/D19440

llvm-svn: 268770
2016-05-06 18:17:13 +00:00
Philip Reames 32b55181fa [EarlyCSE] Rename a variable for clarity [NFC]
llvm-svn: 268701
2016-05-06 01:13:58 +00:00
Davide Italiano f54f2f0893 [PM] Port Interprocedural SCCP to the new pass manager.
llvm-svn: 268684
2016-05-05 21:05:36 +00:00
Dehao Chen f50c67ce7c Revert http://reviews.llvm.org/D19926 as it breaks tests.
llvm-svn: 268681
2016-05-05 20:47:53 +00:00
Dehao Chen e48b4ee98c Simplify CFG before assigning discriminator.
Summary: We need to clean up CFG before assigning discriminator to minimize the impact of optimization on debug info.

Reviewers: davidxl, dblaikie, dnovillo

Subscribers: dnovillo, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D19926

llvm-svn: 268675
2016-05-05 20:18:49 +00:00
Chad Rosier 799e4c6fc3 Remove dead include. NFC.
llvm-svn: 268654
2016-05-05 17:53:43 +00:00
Dehao Chen d55bc4c7ab clang-format some files in preparation of coming patch reviews.
llvm-svn: 268583
2016-05-05 00:54:54 +00:00
Adam Nemet 3c5eabfcbc [LoopDataPrefetch] Add optimization remark
With -Rpass=loop-data-prefetch, show the memory access that got
prefetched.

llvm-svn: 268578
2016-05-05 00:08:15 +00:00
Davide Italiano a7f5e88932 Revert "[SCCP] Throw away dead code. NFC."
This reverts commit r268568, as it broke the bots.

llvm-svn: 268570
2016-05-04 23:27:13 +00:00
Davide Italiano fc1214fee2 [SCCP] Throw away dead code. NFC.
llvm-svn: 268568
2016-05-04 23:05:59 +00:00
Chad Rosier 7ab9a7b203 Use a uniform name for the load combine pass. NFC.
llvm-svn: 268507
2016-05-04 15:19:02 +00:00
Igor Laevsky fb1811d3a0 [RS4GC] Use SetVector/MapVector instead of DenseSet/DenseMap to guarantee stable ordering
Goal of this change is to guarantee stable ordering of the statepoint arguments and other 
newly inserted values such as gc.relocates. Previously we had explicit sorting in a couple
of places. However for unnamed values ordering was partial and overall we didn't have any 
strong invariant regarding it. This change switches all data structures to use SetVector's
and MapVector's which provide possibility for deterministic iteration over them.
Explicit sorting is now redundant and was removed.

Differential Revision: http://reviews.llvm.org/D19669

llvm-svn: 268502
2016-05-04 14:55:36 +00:00
Andrew Kaylor 50271f787e Add opt-bisect support to additional passes that can be skipped
Differential Revision: http://reviews.llvm.org/D19882

llvm-svn: 268457
2016-05-03 22:32:30 +00:00
Justin Bogner d0d2341f30 PM: Port LoopRotation to the new loop pass manager
llvm-svn: 268452
2016-05-03 22:02:31 +00:00
Justin Bogner ab6a513b4e PM: Port LoopSimplifyCFG to the new pass manager
llvm-svn: 268446
2016-05-03 21:47:32 +00:00
Jack Liu f101c0f7a1 [SROA] Function canConvertValue needs to check whether both NewTy and OldTy pointers are
pointing to the same addr space. This can prevent SROA from creating a bitcast
between pointers with different addr spaces.

Differential Revision: http://reviews.llvm.org/D19697

llvm-svn: 268424
2016-05-03 19:30:48 +00:00
Jack Liu 430e2c2140 Revert 268409 due to missing comment.
llvm-svn: 268421
2016-05-03 19:15:02 +00:00
Jack Liu 1ff4a0b7ee (no commit message)
llvm-svn: 268409
2016-05-03 18:01:43 +00:00
Sanjoy Das 4ae3920c5b [LICM] Kill SCEV loop dispositions if needed
SCEV caches whether SCEV expressions are loop invariant, variant or
computable.  LICM breaks this cache, almost by definition; so clear the
SCEV disposition cache if LICM changed anything.

llvm-svn: 268408
2016-05-03 17:50:11 +00:00
Sanjoy Das 7e7a5a050a Use all_of instead of a raw loop; NFC
Added some tests despite being NFC, since it looks like nothing was
exercising the "all incoming values to exit PHIs are same" logic.

llvm-svn: 268407
2016-05-03 17:50:06 +00:00
Sanjoy Das 905fc27ebf [LoopDeletion] Clear SCEV loop dispositions
`Loop::makeLoopInvariant` can hoist instructions out of loops, so loop
dispositions for the loop it operated on may need to be cleared.  We can
be smarter here (especially around how `forgetLoopDispositions` is
implemented), but let's be correct first.

Fixes PR27570.

llvm-svn: 268406
2016-05-03 17:50:02 +00:00
Kristof Beyls c08f70588d Mark that SpeculativeExecution preserves Globals Alias Analysis.
A few benchmarks with lots of accesses to global variables in the hot
loops regressed a lot since r266399, which added the
SpeculativeExecution pass to the default pipeline. The problem is that
this pass doesn't mark Globals Alias Analysis as preserved. Globals
Alias Analysis is computed in a module pass, whereas
SpeculativeExecution is a function pass, and a lot of passes dependent
on the Globals Alias Analysis to optimize these benchmarks are also
function passes. As such, the Globals Alias Analysis information cannot
be recomputed between SpeculativeExecution and the following function
passes needing that information.

SpeculativeExecution doesn't invalidate Globals Alias Analysis, so mark
it as such to fix those performance regressions.

Differential Revision: http://reviews.llvm.org/D19806

llvm-svn: 268370
2016-05-03 08:33:26 +00:00
David Majnemer 3d90bb79c4 [LoopUnroll] Unroll loops which have exit blocks to EH pads
We were overly cautious in our analysis of loops which have invokes
which unwind to EH pads.  The loop unroll transform is safe because it
only clones blocks in the loop body, it does not try to split critical
edges involving EH pads.  Instead, move the necessary safety check to
LoopUnswitch.

N.B. The safety check for loop unswitch is covered by an existing test
which fails without it.

llvm-svn: 268357
2016-05-03 03:57:40 +00:00
Chad Rosier fcb2210812 Typo. NFC.
llvm-svn: 268280
2016-05-02 19:06:04 +00:00
Chad Rosier 4466ff50eb Use false rather than 0 for a boolean value. NFC.
llvm-svn: 268279
2016-05-02 19:06:02 +00:00
Adam Nemet d02872c7b4 [LLE] Fix typo from r263058
This was meant to check unit stride for both the load and the store.

Thanks to Roman Shirokiy for noticing this.

llvm-svn: 268251
2016-05-02 16:52:00 +00:00
Sanjoy Das 47cf2affbd [LowerGuardIntrinsics] Keep track of !make.implicit metadata
If a guard call being lowered by LowerGuardIntrinsics has the
`!make.implicit` metadata attached, then reattach the metadata to the
branch in the resulting expanded form of the intrinsic.  This allows us
to implement null checks as guards and still get the benefit of implicit
null checks.

llvm-svn: 268148
2016-04-30 00:55:59 +00:00
Lawrence Hu 1befea2bdc Reroll loops with multiple IV and negative step part 3
support multiple induction variables

    This patch enable loop reroll for the following case:
        for(int i=0;  i<N; i += 2) {
           S += *a++;
           S += *a++;
        };

Differential Revision: http://reviews.llvm.org/D16550

llvm-svn: 268147
2016-04-30 00:51:22 +00:00
Sanjoy Das 52c68bb0f5 [LowerGuardIntrinsics] Preserve calling conv when lowering
llvm-svn: 268142
2016-04-30 00:17:47 +00:00
Sanjoy Das 107aefc2fc Mark guards on true as "trivially dead"
This moves some logic added to EarlyCSE in rL268120 into
`llvm::isInstructionTriviallyDead`.  Adds a test case for DCE to
demonstrate that passes other than EarlyCSE can now pick up on the new
information.

llvm-svn: 268126
2016-04-29 22:23:16 +00:00
Sanjoy Das ee81b23fe7 [EarlyCSE] Simplify guard intrinsics
Summary:
This change teaches EarlyCSE some basic properties of guard intrinsics:

 - Guard intrinsics read all memory, but don't write to any memory
 - After a guard has executed, the condition it was guarding on can be
   assumed to be true
 - Guard intrinsics on a constant `true` are no-ops

Reviewers: reames, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19578

llvm-svn: 268120
2016-04-29 21:52:58 +00:00
Filipe Cabecinhas 0da9937517 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
Adam Nemet 88ec491830 [LoopDist] Also emit optimization remark on success (-Rpass=)
The option -Rpass=loop-distribute now reports the loops that were
distributed.

llvm-svn: 268006
2016-04-29 07:10:46 +00:00
Adam Nemet 4338d6769e [LoopDist] Pass 'Function' to main class. NFC
Next patch will add another use for 'Function' inside the class.

llvm-svn: 268005
2016-04-29 07:10:39 +00:00
Adam Nemet 0ba164bbcb [LoopDist] Emit optimization remarks (-Rpass*)
I closely followed the precedents set by the vectorizer:

* With -Rpass-missed, the loop is reported with further details pointing
to -Rpass--analysis.

* -Rpass-analysis reports the details why distribution has failed.

* Regardless of -Rpass*, when distribution fails for a loop where
distribution was forced with the pragma, a warning is produced according
to -Wpass-failed.  In this case the analysis info is also printed even
without -Rpass-analysis.

llvm-svn: 267952
2016-04-28 23:08:32 +00:00
Adam Nemet adeccf7658 [LoopDist] Improve debug messages
The next patch will start using these for -Rpass-analysis so they won't
be internal-only anymore.

Move the 'Skipping; ' prefix that some of the message are using into the
'fail' function.  We don't want to include this prefix in
the -Rpass-analysis report.

llvm-svn: 267951
2016-04-28 23:08:30 +00:00
Adam Nemet 7f38e1199a [LoopDist] Add helper to print debug message when distribution fails. NFC
This will form the basis to emit optimization remarks (-Rpass*).

llvm-svn: 267950
2016-04-28 23:08:27 +00:00
Chad Rosier 712b7d7630 [GVN] Minor code cleanup. NFC.
Differential Revision: http://reviews.llvm.org/D18828
Patch by Aditya Kumar!

llvm-svn: 267898
2016-04-28 16:00:15 +00:00
Geoff Berry 5ae272c2c1 [EarlyCSE] Change LoadValue field Value *Data to Instruction *Inst. NFC.
Made in preparation for adding MemorySSA support to EarlyCSE.

llvm-svn: 267893
2016-04-28 15:22:37 +00:00
Geoff Berry 354fac2a69 [EarlyCSE] Sort includes. NFC.
Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19617

llvm-svn: 267890
2016-04-28 14:59:27 +00:00
Ahmed Bougacha ace97c1f7d [LIR] Set attributes on memset_pattern16.
"inferattrs" will deduce the attribute, but it will be too late for
many optimizations. Set it ourselves when creating the call.

Differential Revision: http://reviews.llvm.org/D17598

llvm-svn: 267762
2016-04-27 19:04:50 +00:00
Ahmed Bougacha 7f97193dd7 [LIR] Reuse variable. NFCI.
llvm-svn: 267761
2016-04-27 19:04:46 +00:00
Artur Pilipenko 9bb6beabf4 isSafeToLoadUnconditionally support queries without a context
This is required to use this function from isSafeToSpeculativelyExecute

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16231

llvm-svn: 267692
2016-04-27 11:00:48 +00:00
Adam Nemet d2fa414718 [LoopDist] Add llvm.loop.distribute.enable loop metadata
Summary:
D19403 adds a new pragma for loop distribution.  This change adds
support for the corresponding metadata that the pragma is translated to
by the FE.

As part of this I had to rethink the flag -enable-loop-distribute.  My
goal was to be backward compatible with the existing behavior:

  A1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute is specified

  A2. pass is on when invoked directly from opt (e.g. for unit-testing)

The new pragma/metadata overrides these defaults so the new behavior is:

  B1. A1 + enable distribution for individual loop with the pragma/metadata

  B2. A2 + disable distribution for individual loop with the pragma/metadata

The default value whether the pass is on or off comes from the initiator
of the pass.  From the PassManagerBuilder the default is off, from opt
it's on.

I moved -enable-loop-distribute under the pass.  If the flag is
specified it overrides the default from above.

Then the pragma/metadata can further modifies this per loop.

As a side-effect, we can now also use -enable-loop-distribute=0 from opt
to emulate the default from the optimization pipeline.  So to be precise
this is the new behavior:

  C1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute or the pragma/metadata enables it

  C2. pass is on when invoked directly from opt
  unless -enable-loop-distribute=0 or the pragma/metadata disables it

Reviewers: hfinkel

Subscribers: joker.eph, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19431

llvm-svn: 267672
2016-04-27 05:28:18 +00:00
Sanjoy Das 5253a089ba Fix typo in comment; NFC
llvm-svn: 267653
2016-04-27 01:44:31 +00:00
Matt Arsenault ba437c67d2 SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic.
In the case where isLegalAddressingMode is used for cases
not related to addressing modes, such as pure adds and muls,
it should not be using address space 0. LSR already passes -1
as the address space in these cases.

llvm-svn: 267645
2016-04-27 00:32:09 +00:00
Adam Nemet 61399ac424 [LoopDist] Split main class. NFC
This splits out the per-loop functionality from the Pass class.

With this the fact whether the loop is forced-distribute with the new
metadata/pragma can be cached in the per-loop class rather than passed
around.

llvm-svn: 267643
2016-04-27 00:31:03 +00:00
Justin Bogner c2bf63d29d PM: Port Reassociate to the new pass manager
llvm-svn: 267631
2016-04-26 23:39:29 +00:00
Justin Bogner cb8a21c88e Reassociate: Convert another functor into a lambda. NFC
Also move the explanatory comment with it.

llvm-svn: 267628
2016-04-26 23:32:00 +00:00
Sanjay Patel d2d2aa52cd [LowerExpectIntrinsic] make default likely/unlikely ratio bigger
We need the default ratio to be sufficiently large that it triggers transforms 
based on block frequency info (BFI) and plays well with the recently introduced
BranchProbability used by CGP.

Differential Revision: http://reviews.llvm.org/D19435

llvm-svn: 267615
2016-04-26 22:23:38 +00:00
Justin Bogner 90744d215b Reassociate: Simplify using lambdas. NFC
llvm-svn: 267614
2016-04-26 22:22:18 +00:00
David Majnemer 30ffc4ce45 [SROA] Don't falsely report that changes have occured
We would report that the function changed despite creating no new
allocas or performing any promotion.

This fixes PR27316.

llvm-svn: 267507
2016-04-26 01:05:00 +00:00
Chad Rosier e2cbd13e56 [ValueTracking] Improve isImpliedCondition when the dominating cond is false.
llvm-svn: 267430
2016-04-25 17:23:36 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Justin Bogner b93949089e PM: Port SinkingPass to the new pass manager
llvm-svn: 267199
2016-04-22 19:54:10 +00:00
Justin Bogner 82077c4ab0 PM: Reorder the functions used for SinkingPass. NFC
This will make the port to the new PM easier to follow.

llvm-svn: 267198
2016-04-22 19:54:04 +00:00
Jun Bum Lim d29a24e4fd [DeadStoreElimination] Shorten beginning of memset overwritten by later stores
Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.

Reviewers: hfinkel, eeckstein, dberlin, mcrosier

Subscribers: mgrang, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18906

llvm-svn: 267197
2016-04-22 19:51:29 +00:00
Justin Bogner 395c2127ed PM: Port DCE to the new pass manager
Also add a very basic test, since apparently there aren't any tests
for DCE whatsoever to add the new pass version to.

llvm-svn: 267196
2016-04-22 19:40:41 +00:00
Chad Rosier 1a4bc110f5 [EarlyCSE/CVP] Add stats for CVPs and make sure to account for any Changes.
llvm-svn: 267187
2016-04-22 18:47:21 +00:00
David Majnemer bfd695d591 [EarlyCSE] Don't add the overflow flags to the hash
We take the intersection of overflow flags while CSE'ing.
This permits us to consider two instructions with different overflow
behavior to be replaceable.

llvm-svn: 267153
2016-04-22 14:12:50 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
David Majnemer d0ce8f1485 [GVN] Respect fast-math-flags on fcmps
We assumed that flags were only present on binary operators.  This is
not true, they may also be present on calls and fcmps.

llvm-svn: 267113
2016-04-22 06:37:51 +00:00
David Majnemer 9554c1339c [EarlyCSE] Take the intersection of flags on instructions
EarlyCSE had inconsistent behavior with regards to flag'd instructions:
- In some cases, it would pessimize if the available instruction had
  different flags by not performing CSE.
- In other cases, it would miscompile if it replaced an instruction
  which had no flags with an instruction which has flags.

Fix this by being more consistent with our flag handling by utilizing
andIRFlags.

llvm-svn: 267111
2016-04-22 06:37:45 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Adam Nemet 963341c872 [LoopUtils] Move def of findStringMetadataForLoop to LoopUtils.cpp. NFC
The decl is in LoopUtils.h.  I think that this was added to
LoopVersioningLICM.cpp by mistake.

llvm-svn: 267014
2016-04-21 17:33:17 +00:00
Adam Nemet f787826b46 [LoopUtils] Rename {check->find}StringMetadata{Into->For}Loop. NFC
"Into" was misleading.  I am also planning to use this helper to look
for loop metadata and return the argument, so find seems like a better
name.

llvm-svn: 267013
2016-04-21 17:33:12 +00:00
Chad Rosier b346dcbc25 Typo.
llvm-svn: 266905
2016-04-20 19:16:23 +00:00
Chad Rosier 41dd31f0b0 [ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC.
Phabricator Revision: http://reviews.llvm.org/D19277

llvm-svn: 266904
2016-04-20 19:15:26 +00:00
Chad Rosier b7dfbb40a3 [ValueTracking] Improve isImpliedCondition for conditions with matching operands.
This patch improves SimplifyCFG to catch cases like:

  if (a < b) {
    if (a > b) <- known to be false
      unreachable;
  }

Phabricator Revision: http://reviews.llvm.org/D18905

llvm-svn: 266767
2016-04-19 17:19:14 +00:00
Michael Kuperstein de16b44f74 Port DemandedBits to the new pass manager.
Differential Revision: http://reviews.llvm.org/D18679

llvm-svn: 266699
2016-04-18 23:55:01 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Duncan P. N. Exon Smith a71301befa Transforms: Fix bootstrap after r266565
Apparently there isn't test coverage for all of these.  I'd appreciate
if someone with could reproduce and send me something to reduce, but for
now I've just looked for users of RemapInstruction and MapValue and
ensured they don't accidentally insert nullptr.  Here is one of the
bootstraps that caught:

  http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11494

llvm-svn: 266567
2016-04-17 19:26:49 +00:00
Justin Lebar cad81cf6b3 [Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true.
Summary:
This lets us add this pass to the IR pass manager unconditionally; it
will simply not do anything on targets without branch divergence.

Reviewers: tra

Subscribers: llvm-commits, jingyue, rnk, chandlerc

Differential Revision: http://reviews.llvm.org/D18625

llvm-svn: 266398
2016-04-15 00:32:09 +00:00
Nicolai Haehnle 05b127da06 [StructurizeCFG] Annotate branches that were treated as uniform
Summary:
This fully solves the problem where the StructurizeCFG pass does not
consider the same branches as uniform as the SIAnnotateControlFlow pass.
The patch in D19013 helps with this problem, but is not sufficient
(and, interestingly, causes a "regression" with one of the existing
test cases).

No tests included here, because tests in D19013 already cover this.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19018

llvm-svn: 266346
2016-04-14 17:42:35 +00:00
Tim Northover 5c02f9ad28 ARM: override cost function to re-enable ConstantHoisting (& fix it).
At some point, ARM stopped getting any benefit from ConstantHoisting because
the pass called a different variant of getIntImmCost. Reimplementing the
correct variant revealed some problems, however:

  + ConstantHoisting was modifying switch statements. This is simply invalid,
    the cases must remain integer constants no matter the notional cost.
  + ConstantHoisting was mangling alloca instructions in the entry block. These
    should be handled by FrameLowering, so constants actually have a cost of 0.
    Worse, the resulting bitcasts meant they became dynamic allocas.

rdar://25707382

llvm-svn: 266260
2016-04-13 23:08:27 +00:00
Betul Buyukkurt bf8554c279 [PGO] Remove redundant VP instrumentation
LLVM optimization passes may reduce a profiled target expression
to a constant. Removing runtime calls at such instrumentation points
would help speedup the runtime of the instrumented program.

llvm-svn: 266229
2016-04-13 18:52:19 +00:00
Sanjoy Das 5ce3272833 Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.

If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".

Motivation:

I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard.  So transforming:

```
void f(unsigned x) {
  unsigned t = 5 / x;
  (void)t;
}
```

to

```
void f(unsigned x) { }
```

is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).

Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM.  For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).

Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have.  This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.

For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it; since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store.  As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal.  The optimizer is allowed to refine
this function by first CSE'ing the two loads, and the folding the
comparision to always report that the two values are equal.  Such a
refined variant will look like it is `readonly`.  However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.

Note: this is not just a problem with atomics or with linking
differently optimized object files.  See PR26774 for more realistic
examples that involved neither.

This patch:

This change introduces a new set of linkage types, predicated as
`GlobalValue::mayBeDerefined` that returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time.  It then changes a set of IPO passes to bail out if they see
such a function.

Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18634

llvm-svn: 265762
2016-04-08 00:48:30 +00:00
Ulrich Weigand fc23907673 [GVN] Address review comments for D18662
As suggested by Chandler in his review comments for D18662, this
follow-on patch renames some variables in GetLoadValueForLoad and
CoerceAvailableValueToLoadType to hopefully make it more obvious
which variables hold value sizes and which hold load/store sizes.

No functional change intended.

llvm-svn: 265687
2016-04-07 15:55:11 +00:00
Ulrich Weigand 6e6966460a [GVN] Fix handling of sub-byte types in big-endian mode
When GVN wants to re-interpret an already available value in a smaller
type, it needs to right-shift the value on big-endian systems to ensure
the correct bytes are accessed.  The shift value is the difference of
the sizes of the two types.

This is correct as long as both types occupy multiples of full bytes.
However, when one of them is a sub-byte type like i1, this no longer
holds true: we still need to shift, but only to access the correct
*byte*.  Accessing bits within the byte requires no shift in either
endianness; e.g. an i1 resides in the least-significant bit of its
containing byte on both big- and little-endian systems.

Therefore, the appropriate shift value to be used is the difference of
the *storage* sizes of the two types.  This is already handled correctly
in one place where such a shift takes place (GetStoreValueForLoad), but
is incorrect in two other places: GetLoadValueForLoad and
CoerceAvailableValueToLoadType.

This patch changes both places to use the storage size as well.

Differential Revision: http://reviews.llvm.org/D18662

llvm-svn: 265684
2016-04-07 15:45:02 +00:00
Duncan P. N. Exon Smith da68cbc4ad IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC
Clarify what this RemapFlag actually means.

  - Change the flag name to match its intended behaviour.
  - Clearly document that it's not supposed to affect globals.
  - Add a host of FIXMEs to indicate how to fix the behaviour to match
    the intent of the flag.

RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock.  Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).

When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.

This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.

llvm-svn: 265628
2016-04-07 00:26:43 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Fiona Glaser 045afc4f66 Loop Unroll: add options and tweak to make Partial unrolling more useful
1. Add FullUnrollMaxCount option that works like MaxCount, but also limits
   the unroll count for fully unrolled loops. So if a loop has an iteration
   count over this, it won't fully unroll.
2. Add CLI options for MaxCount and the new option, so they can be tested
   (plus a test).
3. Make partial unrolling obey MaxCount.

An example use-case (the out of tree one this is originally designed for) is
a target’s TTI can analyze a loop and decide on a max unroll count separate
from the size threshold, e.g. based on register pressure, then constrain
LoopUnroll to not exceed that, regardless of the size of the unrolled loop.

llvm-svn: 265562
2016-04-06 16:57:25 +00:00
Fiona Glaser 16332ba861 LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true
Patch by Evgeny Stupachenko <evstupac@gmail.com>.

llvm-svn: 265558
2016-04-06 16:43:45 +00:00
Chad Rosier 074ce836f0 Simplify logic. NFC.
llvm-svn: 265537
2016-04-06 13:27:13 +00:00
Richard Trieu f35d4b0928 Add parentheses to silence warning.
llvm-svn: 265516
2016-04-06 04:22:00 +00:00
Sanjoy Das 99abb2728b [RS4GC] Add a comment
llvm-svn: 265503
2016-04-06 01:33:54 +00:00
Sanjoy Das 8d89a2b296 [RS4GC] NFC cleanup of the DeferredReplacement class
Instead of constructors use clearly named factory methods.

llvm-svn: 265486
2016-04-05 23:18:53 +00:00
Sanjoy Das 49e974b33b [RS4GC] Better codegen for deoptimize calls
Don't emit a gc.result for a statepoint lowered from
@llvm.experimental.deoptimize since the call into __llvm_deoptimize is
effectively noreturn.  Instead follow the corresponding gc.statepoint
with an "unreachable".

llvm-svn: 265485
2016-04-05 23:18:35 +00:00
Sanjay Patel e77c7de459 use range loop; NFCI
llvm-svn: 265360
2016-04-04 23:05:06 +00:00
Zia Ansari a82a58a4e5 Enable unroll for constant bound loops when TripCount is not modulo of unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit.
Commit for Evgeny Stupachenko (evstupac@gmail.com)

Differential Revision: http://reviews.llvm.org/D18290

llvm-svn: 265337
2016-04-04 19:24:46 +00:00
Sanjoy Das 021de058df Introduce a @llvm.experimental.guard intrinsic
Summary:
As discussed on llvm-dev[1].

This change adds the basic boilerplate code around having this intrinsic
in LLVM:

 - Changes in Intrinsics.td, and the IR Verifier
 - A lowering pass to lower @llvm.experimental.guard to normal
   control flow
 - Inliner support

[1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html

Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18527

llvm-svn: 264976
2016-03-31 00:18:46 +00:00
David Majnemer 5d518386b6 [IndVarSimplify] Don't insert after a catchswitch
Widening a PHI requires us to insert a trunc.
The logical place for this trunc is in the same BB as the PHI.
This is not possible if the BB is terminated by a catchswitch.

This fixes PR27133.

llvm-svn: 264926
2016-03-30 21:12:06 +00:00
Adam Nemet 1428d41f9a [LoopDataPrefetch] Centralize the tuning cl::opts under the pass
This is effectively NFC, minus the renaming of the options
(-cyclone-prefetch-distance -> -prefetch-distance).

The change was requested by Tim in D17943.

llvm-svn: 264806
2016-03-29 23:45:52 +00:00
Duncan P. N. Exon Smith e8eb94a9a5 ADCE: Remove debug info intrinsics in dead scopes
During ADCE, track which debug info scopes still have live references
from the code, and delete debug info intrinsics for the dead ones.

These intrinsics describe the locations of variables (in registers or
stack slots).  If there's no code left corresponding to a variable's
scope, then there's no way to reference the variable in the debugger and
it doesn't matter what its value is.

I add a DEBUG printout when the described location in an SSA register,
in case it helps some trying to track down why locations get lost.
However, we still delete these; the scope itself isn't attached to any
real code, so the ship has already sailed.

llvm-svn: 264800
2016-03-29 22:57:12 +00:00
Adam Nemet 85fba39390 [LoopDataPrefetch] Make more member functions private, NFC.
llvm-svn: 264798
2016-03-29 22:40:02 +00:00
Hyojin Sung 4673f10568 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
    
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.

llvm-svn: 264697
2016-03-29 04:08:57 +00:00
Reid Kleckner ba85781f58 Revert "[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops"
This reverts commit r264596.

It does not compile.

llvm-svn: 264604
2016-03-28 18:07:40 +00:00
Hyojin Sung 0ada5b0d14 [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being 
applied (e.g., transforming nested loops into simplified forms and loop vectorization).

The patch augments the existing PHI node-based check by adding a pre-test if the BB actually 
belongs to a set of loop headers and not eliminating it if yes. 

llvm-svn: 264596
2016-03-28 17:22:25 +00:00
Hal Finkel 5c83a090bc [SROA] Fix typo in comment
llvm-svn: 264573
2016-03-28 11:23:21 +00:00
Hal Finkel 29f5131daf C++11 is required, remove some preprocessor checks for it
We require C++11 to build, so remove a few remaining preprocessor checks for
'__cplusplus >= 201103L'. This should always be true.

llvm-svn: 264572
2016-03-28 11:13:03 +00:00
Sanjoy Das d4c783335b [RS4GC] Lower calls to @llvm.experimental.deoptimize
This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize``
to gc.statepoints wrapping ``__llvm_deoptimize``, and changes
``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize``
as a non GC leaf function.

I've had to hard code the ``"__llvm_deoptimize"`` name in
RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only
during codegen.  This isn't without precedent in the codebase, so I'm
not overtly concerned.

llvm-svn: 264456
2016-03-25 20:12:13 +00:00
David L Kreitzer 8d441eb936 Enable non-power-of-2 #pragma unroll counts.
Patch by Evgeny Stupachenko.

Differential Revision: http://reviews.llvm.org/D18202

llvm-svn: 264407
2016-03-25 14:24:52 +00:00
David Majnemer e09d035dad [LoopStrengthReduce] Don't hoist into a catchswitch
We try to hoist the insertion point as high as possible to encourage
sharing.  However, we must be careful not to hoist into a catchswitch as
it is both an EHPad and a terminator.

llvm-svn: 264344
2016-03-24 21:40:22 +00:00
Adam Nemet 7aba60c853 [LLE] Check for mismatching types between the store and the load earlier
isDependenceDistanceOfOne asserts that the store and the load access
through the same type.  This function is also used by
removeDependencesFromMultipleStores so we need to make sure we filter
out mismatching types before reaching this point.

Now we do this when the initial candidates are gathered.

This is a refinement of the fix made in r262267.

Fixes PR27048.

llvm-svn: 264313
2016-03-24 17:59:26 +00:00
Zinovy Nis 07ac2bd4d0 [PATCH] Force LoopReroll to reset the loop trip count value after reroll.
It's a bug fix. 
For rerolled loops SE trip count remains unchanged. It leads to incorrect work of the next passes.
My patch just resets SE info for rerolled loop forcing SE to re-evaluate it next time it requested.
I also added a verifier call in the exisitng test to be sure no invalid SE data remain. Without my fix this test would fail with -verify-scev.

Differential Revision: http://reviews.llvm.org/D18316

llvm-svn: 264051
2016-03-22 13:50:57 +00:00
Adam Nemet 709e3046ee [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead
Summary:
It can hurt performance to prefetch ahead too much.  Be conservative for
now and don't prefetch ahead more than 3 iterations on Cyclone.

Reviewers: hfinkel

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17949

llvm-svn: 263772
2016-03-18 00:27:43 +00:00
Adam Nemet 6d8beeca53 [LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses
Summary:
And use this TTI for Cyclone.  As it was explained in the original RFC
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW
prefetcher work up to 2KB strides.

I am also adding tests for this and the previous change (D17943):

* Cyclone prefetching accesses with a large stride
* Cyclone not prefetching accesses with a small stride
* Generic Aarch64 subtarget not prefetching either

Reviewers: hfinkel

Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17945

llvm-svn: 263771
2016-03-18 00:27:38 +00:00
Adam Nemet 5eccf07df3 [LoopVersioning] Annotate versioned loop with noalias metadata
Summary:
If we decide to version a loop to benefit a transformation, it makes
sense to record the now non-aliasing accesses in the newly versioned
loop.  This allows non-aliasing information to be used by subsequent
passes.

One example is 456.hmmer in SPECint2006 where after loop distribution,
we vectorize one of the newly distributed loops.  To vectorize we
version this loop to fully disambiguate may-aliasing accesses.  If we
add the noalias markers, we can use the same information in a later DSE
pass to eliminate some dead stores which amounts to ~25% of the
instructions of this hot memory-pipeline-bound loop.  The overall
performance improves by 18% on our ARM64.

The scoped noalias annotation is added in LoopVersioning.  The patch
then enables this for loop distribution.  A follow-on patch will enable
it for the vectorizer.  Eventually this should be run by default when
versioning the loop but first I'd like to get some feedback whether my
understanding and application of scoped noalias metadata is correct.

Essentially my approach was to have a separate alias domain for each
versioning of the loop.  For example, if we first version in loop
distribution and then in vectorization of the distributed loops, we have
a different set of memchecks for each versioning.  By keeping the scopes
in different domains they can conveniently be defined independently
since different alias domains don't affect each other.

As written, I also have a separate domain for each loop.  This is not
necessary and we could save some metadata here by using the same domain
across the different loops.  I don't think it's a big deal either way.

Probably the best is to review the tests first to see if I mapped this
problem correctly to scoped noalias markers.  I have plenty of comments
in the tests.

Note that the interface is prepared for the vectorizer which needs the
annotateInstWithNoAlias API.  The vectorizer does not use LoopVersioning
so we need a way to pass in the versioned instructions.  This is also
why the maps have to become part of the object state.

Also currently, we only have an AA-aware DSE after the vectorizer if we
also run the LTO pipeline.  Depending how widely this triggers we may
want to schedule a DSE toward the end of the regular pass pipeline.

Reviewers: hfinkel, nadav, ashutosh.nema

Subscribers: mssimpso, aemerson, llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16712

llvm-svn: 263743
2016-03-17 20:32:32 +00:00
Sanjoy Das c9058ca9e0 [Statepoints] Export a magic constant into a header; NFC
llvm-svn: 263733
2016-03-17 18:42:17 +00:00
Sanjoy Das 312038872d [Statepoints] Separate out logic for statepoint directives; NFC
This splits out the logic that maps the `"statepoint-id"` attribute into
the actual statepoint ID, and the `"statepoint-num-patch-bytes"`
attribute into the number of patchable bytes the statpeoint is lowered
into.  The new home of this logic is in IR/Statepoint.cpp, and this
refactoring will support similar functionality when lowering calls with
deopt operand bundles in the future.

llvm-svn: 263685
2016-03-17 01:56:10 +00:00
Geoff Berry 56fabf9b55 Revert "[LSR] Create fewer redundant instructions."
This reverts commit r263644.  Investigating bootstrap failures.

llvm-svn: 263655
2016-03-16 19:21:47 +00:00
Geoff Berry 459b750871 [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Reviewers: atrick

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18001

llvm-svn: 263644
2016-03-16 17:29:49 +00:00
Haicheng Wu 7873857a88 [JumpThreading] See through Cast Instructions
To capture more jump-thread opportunity.

llvm-svn: 263618
2016-03-16 04:52:52 +00:00
Haicheng Wu 64d9d7c3f7 Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()"
Not sure it handles undef properly.

llvm-svn: 263605
2016-03-15 23:38:47 +00:00
Justin Lebar 6827de19b2 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Amaury Sechet bdb261b4c0 Imporove load to store => memcpy
Summary: This now try to reorder instructions in order to help create the optimizable pattern.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Differential Revision: http://reviews.llvm.org/D16523

llvm-svn: 263503
2016-03-14 22:52:27 +00:00
Chad Rosier 00bd82cade [CVP] Replace nonnegative with positive, per Philip's request. NFC.
llvm-svn: 263430
2016-03-14 13:48:00 +00:00
Haicheng Wu d60ae33d29 [CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative
The motivating example is this

for (j = n; j > 1; j = i) {
   i = j / 2;
}

The signed division is safely to be changed to an unsigned division (j is known
to be larger than 1 from the loop guard) and later turned into a single shift
without considering the sign bit.

llvm-svn: 263406
2016-03-14 03:24:28 +00:00
Mehdi Amini ba9fba81d6 Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Eric Christopher 35abd051c0 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
Mehdi Amini 99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Mehdi Amini 1e9c925182 Do not specialize IRBuilder to strip names in SROA
Summary:
Following r263086, we are replacing this by a runtime check.
More cleanup will follow on the IRBuilder itself, but I submitted
this patch separately as SROA has a fancy "prefixInserter" class
that needs extra-love.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18022

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263256
2016-03-11 17:15:34 +00:00
Chandler Carruth ace8c8f765 [PM] Sink the "Expression" type for GVN into the class as a private
member type.

Because of how this type is used by the ValueTable, it cannot actually
have hidden visibility. GCC actually nicely warns about this but Clang
just silently ... I don't even know. =/ We should do a better job either
way though.

This should resolve a bunch of the GCC warnings about visibility that
the port of GVN triggered and make the visibility story a bit more
correct.

llvm-svn: 263250
2016-03-11 16:25:19 +00:00
Chandler Carruth 3bc9c7fb45 [PM] The order of evaluation of these analyses is actually significant,
much to my horror, so use variables to fix it in place.

This terrifies me. Both basic-aa and memdep will provide more precise
information when the domtree and/or the loop info is available. Because
of this, if your pass (like GVN) requires domtree, and then queries
memdep or basic-aa, it will get more precise results. If it does this in
the other order, it gets less precise results.

All of the ideas I have for fixing this are, essentially, terrible. Here
I've just caused us to stop having unspecified behavior as different
implementations evaluate the order of these arguments differently. I'm
actually rather glad that they do, or the fragility of memdep and
basic-aa would have gone on unnoticed. I've left comments so we don't
immediately break this again. This should fix bots whose host compilers
evaluate the order of arguments differently from Clang.

llvm-svn: 263231
2016-03-11 13:26:47 +00:00
Chandler Carruth b47f8010a9 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Chandler Carruth 89c45a162f [PM] Port GVN to the new pass manager, wire it up, and teach a couple of
tests to run GVN in both modes.

This is mostly the boring refactoring just like SROA and other complex
transformation passes. There is some trickiness in that GVN's
ValueNumber class requires hand holding to get to compile cleanly. I'm
open to suggestions about a better pattern there, but I tried several
before settling on this. I was trying to balance my desire to sink as
much implementation detail into the source file as possible without
introducing overly many layers of abstraction.

Much like with SROA, the design of this system is made somewhat more
cumbersome by the need to support both pass managers without duplicating
the significant state and logic of the pass. The same compromise is
struck here.

I've also left a FIXME in a doxygen comment as the GVN pass seems to
have pretty woeful documentation within it. I'd like to submit this with
the FIXME and let those more deeply familiar backfill the information
here now that we have a nice place in an interface to put that kind of
documentaiton.

Differential Revision: http://reviews.llvm.org/D18019

llvm-svn: 263208
2016-03-11 08:50:55 +00:00
Adam Nemet efb234135c [LLE] Add missed LoopSimplify dependence
The code assumed that we always had a preheader without making the pass
dependent on LoopSimplify.

Thanks to Mattias Eriksson V for reporting this.

llvm-svn: 263173
2016-03-10 23:54:39 +00:00
Chandler Carruth 37f1f12226 [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out
of, and I misdiagnosed for months and months.

Andrea has had a patch for this forever, but I just couldn't see how
it was fixing the root cause of the problem. It didn't make sense to me,
even though the patch was perfectly good and the analysis of the actual
failure event was *fantastic*.

Well, I came back to it today because the patch has sat for *far* too
long and needs attention and decided I wouldn't let it go until I really
understood what was going on. After quite some time in the debugger,
I finally realized that in fact I had just missed an important case with
my previous attempt to fix PR22093 in r225149. Not only do we need to
handle loads that won't be split, but stores-of-loads that we won't
split. We *do* actually have enough logic in the presplitting to form
new slices for split stores.... *unless* we decided not to split them!

I'm so sorry that it took me this long to come to the realization that
this is the issue. It seems so obvious in hind sight (of course).
Anyways, the fix becomes *much* smaller and more focused. The fact that
we're left doing integer smashing is related to the FIXME in my original
commit: fundamentally, we're not aggressive about pre-splitting for
loads and stores to the same alloca. If we want to get aggressive about
this, it'll need both what Andrea had put into the proposed fix, but
also a *lot* more logic to essentially iteratively pre-split the alloca
until we can't do any more. As I said in that commit log, its really
unclear that this is the right call. Instead, the integer blending and
letting targets lower this to narrower stores seems slightly better. But
we definitely shouldn't really go down that path just to fix this bug.

Again, tons of thanks are owed to Andrea and others at Sony for working
on this bug. I really should have seen what was going on here and
re-directed them sooner. =////

llvm-svn: 263121
2016-03-10 15:31:17 +00:00
Chandler Carruth d94a5962cc [SROA] Clean up some really weird code, no functionality changed.
We already have the instruction extracted into 'I', just cast that to
a store the way we do for loads. Also, we don't enter the if unless SI
is non-null, so don't test it again for null.

I'm pretty sure the entire test there can be nuked, but this is just the
trivial cleanup.

llvm-svn: 263112
2016-03-10 14:16:18 +00:00
Chandler Carruth 7776377e62 [gvn] Fix more indenting and formatting in regions of code that will
need to be changed for porting to the new pass manager.

Also sink the comment on the ValueTable class back to that class instead
of it dangling on an anonymous namespace.

No functionality changed.

llvm-svn: 263084
2016-03-10 00:58:20 +00:00
Chandler Carruth 169c84f1cc [gvn] Reformat a chunk of the GVN code that is strangely indented prior
to restructuring it for porting to the new pass manager.

No functionality changed.

llvm-svn: 263083
2016-03-10 00:58:18 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Philip Reames e0a5454df4 Fix the build
I screwed up rebasing 263072.  This change fixes the build and passes all make check.

llvm-svn: 263073
2016-03-09 23:07:53 +00:00
Philip Reames b54c8e6eea [LICM] Store promotion when memory is thread local
This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provable thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us perform store promotion even in cases where the store is not guaranteed to execute in the loop.

Two key assumption worth drawing out is that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped.

In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem.

Differential Revision: http://reviews.llvm.org/D16783

llvm-svn: 263072
2016-03-09 22:59:30 +00:00
Adam Nemet 660748ca8c [LLE] Add missing check for unit stride
I somehow missed this.  The case in GCC (global_alloc) was similar to
the new testcase except it had an array of structs rather than a two
dimensional array.

Fixes RP26885.

llvm-svn: 263058
2016-03-09 20:47:55 +00:00
Adam Nemet 34785ecff1 [LoopDataPrefetch] Add stats and debug output
llvm-svn: 262998
2016-03-09 05:33:21 +00:00
Sanjoy Das 2eac48de9e Return StringRef instead of a naked char*; NFC
llvm-svn: 262989
2016-03-09 02:34:19 +00:00
Sanjoy Das f13900f8ac [IRCE] Reflow comments; NFC
llvm-svn: 262988
2016-03-09 02:34:15 +00:00
Sanjay Patel f831fdb56a fix variable name; NFC
llvm-svn: 262953
2016-03-08 19:07:42 +00:00
Sanjay Patel 5c96723622 use range-based loop; NFCI
llvm-svn: 262952
2016-03-08 19:06:12 +00:00
Adam Nemet bb3680bd85 [LoopDataPrefetch] If prefetch distance is not set, skip pass
This lets select sub-targets enable this pass.  The patch implements the
idea from the recent llvm-dev thread:
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925

The goal is to enable the LoopDataPrefetch pass for the Cyclone
sub-target only within Aarch64.

Positive and negative tests will be included in an upcoming patch that
enables selective prefetching of large-strided accesses on Cyclone.

llvm-svn: 262844
2016-03-07 18:35:42 +00:00
Adam Nemet efc091f457 [LLE] Fix a comment
llvm-svn: 262270
2016-02-29 23:21:12 +00:00
Adam Nemet 83be06e529 [LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with Polly
We can actually have dependences between accesses with different
underlying types.  Bail in this case.

A test will follow shortly.

llvm-svn: 262267
2016-02-29 22:53:59 +00:00
Chandler Carruth ad8cb382fa [LICM] Teach LICM how to handle cases where the alias set tracker was
merged into a loop that was subsequently unrolled (or otherwise nuked).

In this case it can't merge in the ASTs for any remaining nested loops,
it needs to re-add their instructions dircetly.

The fix is very isolated, but I've pulled the code for merging blocks
into the AST into a single place in the process. The only behavior
change is in the case which would have crashed before.

This fixes a crash reported by Mikael Holmen on the list after r261316
restored much of the loop pass pipelining and allowed us to actually do
this kind of nested transformation sequenc. I've taken that test case
and further reduced it into the somewhat twisty maze of loops in the
included test case. This does in fact trigger the bug even in this
reduced form.

llvm-svn: 262108
2016-02-27 04:34:07 +00:00
Haicheng Wu 5539f852ae [JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()
This change tries to find more opportunities to thread over basic blocks.

llvm-svn: 261981
2016-02-26 06:06:04 +00:00
Michael Zolotukhin 9f520ebc54 [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating.
Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect.

Reviewers: chandlerc, hfinkel

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17632

llvm-svn: 261958
2016-02-26 02:57:05 +00:00
Adam Nemet fb31d580ea [LoopDataPrefetch] Make it testable with opt
Summary:
Since this is an IR pass it's nice to be able to write tests without
llc.  This is the counterpart of the llc test under
CodeGen/PowerPC/loop-data-prefetch.ll.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17464

llvm-svn: 261578
2016-02-22 21:41:22 +00:00
Philip Reames ce38c2ddf6 [RS4GC] "Constant fold" the rs4gc-split-vector-values flag
This flag was part of a migration to a new means of handling vectors-of-points which was described in the llvm-dev thread "FYI: Relocating vector of pointers".  The old code path has been off by default for a while without complaints, so time to cleanup.

llvm-svn: 261569
2016-02-22 21:01:28 +00:00
Philip Reames 79fa9b75c0 [RS4GC] Revert optimization attempt due to memory corruption
This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575.

As pointed out in pr25846, this code suffers from a memory corruption bug.  Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer.

llvm-svn: 261565
2016-02-22 20:45:56 +00:00
Elena Demikhovsky 9914dbd11b Allow setting MaxRerollIterations above 16
By Ayal Zaks.

Differential Revision http://reviews.llvm.org/D17258

llvm-svn: 261517
2016-02-22 09:38:28 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Sanjoy Das 979a11d5b2 [LoopDeletion] Add an assert that verifies LCSSA
This is inspired by PR24804 -- had this assert been there before,
isolating the root cause for PR24804 would have been far easier.

llvm-svn: 261481
2016-02-21 17:11:59 +00:00
Chandler Carruth 31088a9d58 [LPM] Factor all of the loop analysis usage updates into a common helper
routine.

We were getting this wrong in small ways and generally being very
inconsistent about it across loop passes. Instead, let's have a common
place where we do this. One minor downside is that this will require
some analyses like SCEV in more places than they are strictly needed.
However, this seems benign as these analyses are complete no-ops, and
without this consistency we can in many cases end up with the legacy
pass manager scheduling deciding to split up a loop pass pipeline in
order to run the function analysis half-way through. It is very, very
annoying to fix these without just being very pedantic across the board.

The only loop passes I've not updated here are ones that use
AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer.
They seemed less relevant.

With this patch, almost all of the problems in PR24804 around loop pass
pipelines are fixed. The one remaining issue is that we run simplify-cfg
and instcombine in the middle of the loop pass pipeline. We've recently
added some loop variants of these passes that would seem substantially
cleaner to use, but this at least gets us much closer to the previous
state. Notably, the seven loop pass managers is down to three.

I've not updated the loop passes using LoopAccessAnalysis because that
analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't
clear that those transforms want to support those forms anyways. They
all run late anyways, so this is harmless. Similarly, LSR is left alone
because it already carefully manages its forms and doesn't need to get
fused into a single loop pass manager with a bunch of other loop passes.

LoopReroll didn't use loop simplified form previously, and I've updated
the test case to match the trivially different output.

Finally, I've also factored all the pass initialization for the passes
that use this technique as well, so that should be done regularly and
reliably.

Thanks to James for the help reviewing and thinking about this stuff,
and Ben for help thinking about it as well!

Differential Revision: http://reviews.llvm.org/D17435

llvm-svn: 261316
2016-02-19 10:45:18 +00:00
Lawrence Hu 84e6f1dd70 Bug fix: use dyn_cast_or_null instead of dyn_cast
Differential Revision: http://reviews.llvm.org/D17154

llvm-svn: 261299
2016-02-19 02:17:07 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Adam Nemet 9d9cb274ea [PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFC
This patch is part of the work to make PPCLoopDataPrefetch
target-independent
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758).

Obviously the pass still only used from PPC at this point.  Subsequent
patches will start driving this from ARM64 as well.

Due to the previous patch most lines should show up as moved lines.

llvm-svn: 261265
2016-02-18 21:38:19 +00:00
Junmo Park 80440eb804 Minor code cleanup. NFC.
llvm-svn: 261200
2016-02-18 10:09:20 +00:00
Haicheng Wu 57e1a3e6ee [LIR] Avoid turning non-temporal stores into memset
This is to fix PR26645.

llvm-svn: 261149
2016-02-17 21:00:06 +00:00
Roman Gareev 036c08874a Tweak the LICM code to reuse the first sub-loop instead of creating a new one
LICM starts with an *empty* AST, and then merges in each sub-loop. While the
add code is appropriate for sub-loop 2 and up, it's utterly unnecessary for
sub-loop 1. If the AST starts off empty, we can just clone/move the contents
of the subloop into the containing AST.

Reviewed-by: Philip Reames <listmail@philipreames.com>

Differential Revision: http://reviews.llvm.org/D16753

llvm-svn: 260892
2016-02-15 14:48:50 +00:00
Chad Rosier 81362a8599 [LIR] Allow merging of memsets in negatively strided loops.
Last part of PR25166.

llvm-svn: 260732
2016-02-12 21:03:23 +00:00
Justin Lebar df04d2a1f1 [LoopRotate] Don't perform loop rotation if the loop header calls a convergent function.
Summary:
Calls to convergent functions can be duplicated, but only if the
duplicates are not control-flow dependent on any additional values.
Loop rotation doesn't meet the bar.

Reviewers: jingyue

Subscribers: mzolotukhin, llvm-commits, arsenm, joker.eph, resistor, tra, hfinkel, broune

Differential Revision: http://reviews.llvm.org/D17127

llvm-svn: 260729
2016-02-12 21:01:33 +00:00
David Majnemer 01674939b2 Remove unused variable
llvm-svn: 260722
2016-02-12 20:33:51 +00:00
Philip Reames 96fccc2d09 [GVN] Common code for local and non-local load availability [NFCI]
The attached patch removes all of the block local code for performing X-load forwarding by reusing the code used in the non-local case.

The motivation here is to remove duplication and in the process increase our test coverage of some fairly tricky code. I have some upcoming changes I'll be proposing in this area and wanted to have the code cleaned up a bit first.

Note: The review for this mostly happened in email which didn't make it to phabricator on the 258882 commit thread.

Differential Revision: http://reviews.llvm.org/D16608

llvm-svn: 260711
2016-02-12 19:24:57 +00:00
Chad Rosier 4acff96646 [LIR] Partially revert r252926(NFC), which introduced a very subtle change.
In short, before r252926 we were comparing an unsigned (StoreSize) against an a
APInt (Stride), which is fine and well.  After we were zero extending the Stride
and then converting to an unsigned, which is not the same thing.  Obviously,
Stides can also be negative.  This commit just restores the original behavior.

AFAICT, it's not possible to write a test case to expose the issue because
the code already has checks to make sure the StoreSize can't overflow an
unsigned (which prevents the Stride from overflowing an unsigned as well).

llvm-svn: 260706
2016-02-12 19:05:27 +00:00
Tamas Berghammer d12f31535a Fix MSVC 2013 build after rL260504
llvm-svn: 260511
2016-02-11 11:27:51 +00:00
Ashutosh Nema 2260a3a046 Fixed typo in comment & coding style for LoopVersioningLICM.
llvm-svn: 260504
2016-02-11 09:23:53 +00:00
Philip Reames d59638e14f Follow up to 260439, Speculative fix to clang builders
It looks like clang has a couple of test cases which caught the fact LVI was not slightly more precise after 260439.  When looking at the failures, it struck me as wasteful to be querying nullness of a constant via LVI, so instead of tweaking the clang tests, let's just stop querying constants from this source.

llvm-svn: 260451
2016-02-10 22:22:41 +00:00
Tom Stellard 6fa4e290d7 StructurizeCFG: Initialize SkipUniformRegions in the default constructor
This should fix some random bot failures caused by r260336.

llvm-svn: 260342
2016-02-10 01:10:09 +00:00
Tom Stellard 755a4e6b57 StructurizeCFG: Add an option for skipping regions with only uniform branches
Summary:
Tests for this will be added once the AMDGPU backend enables this
option.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16602

llvm-svn: 260336
2016-02-10 00:39:37 +00:00
Michael Zolotukhin 1da4afdfc9 Factor out UnrollAnalyzer to Analysis, and add unit tests for it.
Summary:
Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests.
The change itself is supposed to be NFC, except adding a couple of tests.

I plan to add more tests as I add new functionality and find/fix bugs.

Reviewers: chandlerc, hfinkel, sanjoy

Subscribers: zzheng, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D16623

llvm-svn: 260169
2016-02-08 23:03:59 +00:00
Haicheng Wu b35f772b90 [JumpThreading] Change a return of ComputeValueKnownInPredecessors()
Change a return statement of ComputeValueKnownInPredecessors() to be the same as
the rest return statements of the function. Otherwise, it might return true with
an empty Result when the current basic block has no predecessors and trigger the
first assert of JumpThreading::ProcessThreadableEdges().

llvm-svn: 260110
2016-02-08 17:00:39 +00:00
Ashutosh Nema df6763abe8 New Loop Versioning LICM Pass
Summary:
When alias analysis is uncertain about the aliasing between any two accesses,
it will return MayAlias. This uncertainty from alias analysis restricts LICM
from proceeding further. In cases where alias analysis is uncertain we might
use loop versioning as an alternative.

Loop Versioning will create a version of the loop with aggressive aliasing
assumptions in addition to the original with conservative (default) aliasing
assumptions. The version of the loop making aggressive aliasing assumptions
will have all the memory accesses marked as no-alias. These two versions of
loop will be preceded by a memory runtime check. This runtime check consists
of bound checks for all unique memory accessed in loop, and it ensures the
lack of memory aliasing. The result of the runtime check determines which of
the loop versions is executed: If the runtime check detects any memory
aliasing, then the original loop is executed. Otherwise, the version with
aggressive aliasing assumptions is used.

The pass is off by default and can be enabled with command line option 
-enable-loop-versioning-licm.

Reviewers: hfinkel, anemet, chatur01, reames

Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga,
             llvm-commits

Differential Revision: http://reviews.llvm.org/D9151

llvm-svn: 259986
2016-02-06 07:47:48 +00:00
Joseph Tremoulet adc2376375 [RS4GC] Pass DenseMap by reference, NFC
Summary:
Passing the rematerialized values map to insertRematerializationStores by
value looks to be a simple oversight; update it to pass by reference.


Reviewers: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16911

llvm-svn: 259867
2016-02-05 01:42:52 +00:00
Adam Nemet 9455c1d2b1 [LoopLoadElim] Don't allow versioning when optForSize
This was requested in the review of D16300.

llvm-svn: 259861
2016-02-05 01:14:05 +00:00
David Majnemer a53b5bbb18 [LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitches
Bail out if we have a PHI on an EHPad that gets a value from a
CatchSwitchInst.  Because the CatchSwitchInst cannot be split, there is
no good place to stick any instructions.

This fixes PR26373.

llvm-svn: 259702
2016-02-03 21:30:34 +00:00
Adam Nemet d52ed84160 [LoopVersioning] Expose loop versioning as a pass too
Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose as
pass to allow it to be unit-tested.

I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.

(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)

The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.

Reviewers: hfinkel, ashutosh.nema, sbaranga

Subscribers: zzheng, mssimpso, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D16612

llvm-svn: 259610
2016-02-03 00:06:10 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Fiona Glaser 36e8230db0 Fix typo in LoopSimplifyCFG
llvm-svn: 259261
2016-01-29 23:12:52 +00:00
Fiona Glaser b417d464e6 Add LoopSimplifyCFG pass
Loop transformations can sometimes fail because the loop, while in
valid rotated LCSSA form, is not in a canonical CFG form. This is
an extremely simple pass that just merges obviously redundant
blocks, which can be used to fix some known failure cases. In the
future, it may be enhanced with more cases (and have code shared with
SimplifyCFG).

This allows us to run LoopSimplifyCFG -> LoopRotate -> LoopUnroll,
so that SimplifyCFG cleans up the loop before Rotate tries to run.

Not currently used in the pass manager, since this pass doesn't do
anything unless you can hook it up in an LPM with other loop passes.
It'll be added once Chandler cleans up things to allow this.

Tested in a custom pipeline out of tree to confirm it works in
practice (in addition to the included trivial test).

llvm-svn: 259256
2016-01-29 22:35:36 +00:00
David Majnemer 75f492e7f1 Fix the build
llvm-svn: 259215
2016-01-29 17:46:57 +00:00
Sanjoy Das c816f03b70 [RS4GC] Address post-commit review on r259208 from David
NFC

llvm-svn: 259211
2016-01-29 17:20:49 +00:00
Sanjoy Das 565f7866ac [RS4GC] Remove unnecessary const_cast; NFC
GCRelocateInst::getDerivedPtr already returns a non-const llvm::Value
pointer.

llvm-svn: 259209
2016-01-29 16:54:49 +00:00
Sanjoy Das 3794eeb8bb [RS4GC] Minor local cleanup to StabilizeOrder; NFC
- Locally declare struct, and call it BaseDerivedPair
 - Use a lambda to compare, instead of a singleton with uninitialized
   fields
 - Add a constructor to BaseDerivedPair and use SmallVector::emplace_back

llvm-svn: 259208
2016-01-29 16:50:34 +00:00
Philip Reames 10e678d25a [GVN] Add clarifying assert [NFCI]
Just adding an assert which makes invariants between AnalyzeLoadsFromClobberingLoads and GetLoadValueForLoad slightly more clear.

llvm-svn: 259145
2016-01-29 02:23:10 +00:00
Sanjoy Das bcf27523f5 [RS4GC] Minor cleanups enabled by the previous change; NFC
llvm-svn: 259133
2016-01-29 01:03:20 +00:00
Sanjoy Das 4099297856 [RS4GC] Delete code that is dead due to r259129; NFC
llvm-svn: 259132
2016-01-29 01:03:17 +00:00
Sanjoy Das 0407108020 [RS4GC] Clamp UseDeoptBundles to true and update tests
The full diff for the test directory may be hard to read because of the
filename clash; so here's all that happened as far as the tests are
concerned:

```
cd test/Transforms/RewriteStatepointsForGC
git rm *ll
git mv deopt-bundles/* ./
rmdir deopt-bundles
find . -name '*.ll' | xargs gsed -i 's/-rs4gc-use-deopt-bundles //g'
```

llvm-svn: 259129
2016-01-29 00:28:57 +00:00
Sanjoy Das bb04f6e28f [PlaceSafepoints] Use DEBUG() instead of TraceLSP
DEBUG() is the more idiomatic LLVM style.

llvm-svn: 259121
2016-01-28 23:49:27 +00:00
Sanjoy Das cd23fec756 [PlaceSafepoints] Misc. minor cleanups; NFC
These changes are aimed at bringing PlaceSafepoints up to code with the
LLVM coding guidelines:

 - Fix variable naming
 - Use DenseSet instead of std::set
 - Remove dead code
 - Minor local code simplifications

llvm-svn: 259112
2016-01-28 23:03:19 +00:00
Sanjoy Das 360a4e4ee2 [PlaceSafepoints] Remvoe unused headers, and sort #includes; NFC
llvm-svn: 259111
2016-01-28 23:03:17 +00:00
Sanjoy Das 12673765cf [PlaceSafepoints] Eliminate dead code; NFC
Now that NoStatepoints is a constant `true`, we can get rid of a bunch
of dead code.

llvm-svn: 259110
2016-01-28 23:03:14 +00:00
Sanjoy Das f7302c8baf [PlaceSafepoints] Clamp NoStatepoints to true
This change permanently clamps -spp-no-statepoints to true (the code
deletion will come later).  Tests that specifically tested
PlaceSafepoint's ability to wrap calls in gc.statepoint have been moved
to RS4GC's test suite.

llvm-svn: 259096
2016-01-28 21:51:14 +00:00
Sanjoy Das 7a2e2bed67 [LICM] Keep metadata on control equivalent hoists
Summary:
If the instruction we're hoisting out of a loop into its preheader is
guaranteed to have executed in the loop, then the metadata associated
with the instruction (e.g. !range or !dereferenceable) is valid in the
preheader.  This is because once we're in the preheader, we know we're
eventually going to reach the location the metadata was valid at.

This change makes LICM smarter around this, and helps it recognize cases
like these:

```
  do {
    int a = *ptr; !range !0
    ...
  } while (i++ < N);
```

to

```
  int a = *ptr; !range !0
  do {
    ...
  } while (i++ < N);
```

Earlier we'd drop the `!range` metadata after hoisting the load from
`ptr`.

Reviewers: igor-laevsky

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16669

llvm-svn: 259053
2016-01-28 15:51:58 +00:00
Sanjoy Das cddde58f1c [IndVars] Hoist DataLayout load out of loop; NFC
llvm-svn: 258946
2016-01-27 17:05:09 +00:00
Sanjoy Das 2f7a7447c2 [IndVars] Use isSCEVable; NFC
llvm-svn: 258945
2016-01-27 17:05:06 +00:00
Sanjoy Das 8fdf87c338 [IndVars] Use range-for; NFC
llvm-svn: 258944
2016-01-27 17:05:03 +00:00
Chen Li 5cde8389cf [IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This is a revised version of D13974, and the following quoted summary are from D13974

"This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch."

D13974 was committed but failed one lnt test. The bug was that we only checked the condition from loop exit's incoming block was a loop invariant. But there could be another condition from loop header to that incoming block not being a loop invariant. This would produce miscompiled code.

This patch fixes the issue by checking if the incoming block is loop header, and if not, don't perform the rewrite. The could be further improved by recursively checking all conditions leading to loop exit block, but I'd like to check in this simple version first and improve it with future patches.     

Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16570

llvm-svn: 258912
2016-01-27 07:40:41 +00:00
Philip Reames 8e785a4ec0 [GVN] Split AvailableValueInBlock into two parts [NFC]
AvailableValue is the part that represents the potential rematerialization.  AvailableValueInBlock is simply a pair of an AvailableValue and a BB which we might materialize it in.

This is motivated by http://reviews.llvm.org/D16608.  The intent is that we'll have a single function which handles the local case which both local and non-local will use to identify available values.  Once that's done, the local case can rematerialize at the use site and the non-local case can do the SSA construction as it does currently.

llvm-svn: 258882
2016-01-26 23:43:16 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Aditya Nandakumar 3d0c46d489 Reassociate: Reprocess RedoInsts after each inst
Previously the RedoInsts was processed at the end of the block.
However it was possible that it left behind some instructions that
were not canonicalized.
This should guarantee that any previous instruction in the basic
block is canonicalized before we process a new instruction.

llvm-svn: 258830
2016-01-26 18:42:36 +00:00
Haicheng Wu f1c00a22be [LIR] Add support for structs and hand unrolled loops
This is a recommit of r258620 which causes PR26293.

The original message:

Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258777
2016-01-26 02:27:47 +00:00
Philip Reames 273dcb0d82 [GVN] Rearrange code to make local vs non-local cases more obvious [NFCI]
llvm-svn: 258747
2016-01-25 23:37:53 +00:00
Philip Reames 10a50b188e [GVN] Factor out common code [NFCI]
We had the same code duplicated for each type of Def.  We also have the entire block duplicated between the local and non-local case, but let's start with local cleanup.

llvm-svn: 258740
2016-01-25 23:19:12 +00:00
Lawrence Hu d3d51061fb Enable loopreroll to rerool loop with pointer induction variable.
Example:

while (buf !=end ) {
   S += buf[0];
   S += buf[1];
   buf +=2;
};

Differential Revision: http://reviews.llvm.org/D13151

llvm-svn: 258709
2016-01-25 19:43:45 +00:00
Lawrence Hu b917cd9fa6 Undo commit 258700 due to missing commit message
llvm-svn: 258708
2016-01-25 19:36:30 +00:00
Quentin Colombet a392810bea Speculatively revert r258620 as it is the likely culprid of PR26293.
llvm-svn: 258703
2016-01-25 19:12:49 +00:00
Lawrence Hu 84b6195e41 Differential Revision: http://reviews.llvm.org/D13151
llvm-svn: 258700
2016-01-25 18:53:39 +00:00
David Majnemer eec878574e Fix build bot breakage
llvm-svn: 258661
2016-01-24 16:46:53 +00:00
David Majnemer dcd6c79d55 Fix buildbot failures
llvm-svn: 258655
2016-01-24 06:40:37 +00:00
David Majnemer 88542a0a69 [SCCP] Remove duplicate code
SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.

No functionality change intended.

llvm-svn: 258654
2016-01-24 06:26:47 +00:00
David Majnemer 35c46d3e0b [InstCombine, SCCP] Consolidate code used to remove instructions
InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so.  Share the code between the two.

No functionality change is intended.

llvm-svn: 258653
2016-01-24 05:26:18 +00:00
Haicheng Wu dd5e9d2159 [LIR] Add support for structs and hand unrolled loops
Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258620
2016-01-23 06:52:41 +00:00
Sanjoy Das 95639746e5 [PlaceSafepoints] Introduce a -spp-no-statepoints flag
Summary:
This change adds a `-spp-no-statepoints` flag to PlaceSafepoints that
bypasses the code that wraps newly introduced polls and existing calls
in gc.statepoint.  With `-spp-no-statepoints` enabled, PlaceSafepoints
effectively becomes a safpeoint **poll** insertion pass.

The eventual goal is to "constant fold" this option, along with
`-rs4gc-use-deopt-bundles` to `true`, once clients using gc.statepoint
are okay doing so.

Reviewers: pgavlin, reames, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16439

llvm-svn: 258551
2016-01-22 21:02:55 +00:00
Sanjoy Das acc43d197d [RS4GC] Use OB_deopt instead of "deopt"
llvm-svn: 258529
2016-01-22 19:20:40 +00:00
Eduard Burtescu 68e7f49f8e [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16282

llvm-svn: 258478
2016-01-22 03:08:27 +00:00
Eduard Burtescu e2a6917849 [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16422

llvm-svn: 258477
2016-01-22 01:51:51 +00:00
Eduard Burtescu 1423921a24 [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
David L Kreitzer 4d7257dfa1 Fix for two constant propagation problems in GVN with the assume intrinsic
instruction.

Patch by Yuanrui Zhang.

Differential Revision: http://reviews.llvm.org/D16100

llvm-svn: 258435
2016-01-21 21:32:35 +00:00
Sanjoy Das a34ce95b60 Add a "gc-transition" operand bundle
Summary:
This adds a new kind of operand bundle to LLVM denoted by the
`"gc-transition"` tag.  Inputs to `"gc-transition"` operand bundle are
lowered into the "transition args" section of `gc.statepoint` by
`RewriteStatepointsForGC`.

This removes the last bit of functionality that was unsupported in the
deopt bundle based code path in `RewriteStatepointsForGC`.

Reviewers: pgavlin, JosephTremoulet, reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16342

llvm-svn: 258338
2016-01-20 19:50:25 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Philip Reames b336bca07e [GC] Lower vectors-of-pointers directly by default
This commit changes the default on our lowering of vectors-of-pointers from splitting in RS4GC to reporting them in the final stack map.  All of the changes to do so are already in place and tested.  Assuming no problems are unearthed in the next week, we will be deleting the old code entirely next Monday.

llvm-svn: 258111
2016-01-19 04:18:24 +00:00
Eduard Burtescu 6007e0dd02 Revert assert added in rL258028 as the alloca and OtherPtr types may differ in address space.
llvm-svn: 258029
2016-01-18 00:20:34 +00:00
Eduard Burtescu 90c4449128 [opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16272

llvm-svn: 258028
2016-01-18 00:10:01 +00:00
Sanjoy Das de47590589 [IndVars] Fix PR25576
`LCSSASafePhiForRAUW` as computed was incorrect -- in cases like
these (this exact example does not actually trigger the bug):

define i32 @f(i32 %n, i1* %c) {
entry:
  br label %outer.loop

outer.loop:
  br label %inner.loop

inner.loop:
  %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
  %iv.inc = add nuw nsw i32 %iv, 1
  %tc = udiv i32 %n, 13
  %be.cond = icmp ult i32 %iv, %tc
  br i1 %be.cond, label %inner.loop, label %inner.exit

inner.exit:
  %iv.lcssa = phi i32 [ %iv, %inner.loop ]
  %outer.be.cond = load volatile i1, i1* %c
  br i1 %outer.be.cond, label %outer.loop, label %leave

leave:
  %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
  ret i32 %iv.lcssa.lcssa
}

`LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit
value of `%iv` for `%inner.loop` to `%tc` (this can happen due to
`SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.

To fix this, instead of computing `SafePhi` with special logic, decide
the safety of RAUW directly via `replacementPreservesLCSSAForm`.

llvm-svn: 258016
2016-01-17 18:12:52 +00:00
Sanjoy Das 7a8a705c9d [IndVars] Use emplace_back; NFC
llvm-svn: 258015
2016-01-17 18:12:48 +00:00
Artur Pilipenko aba8fdc480 Fix buildbot failure introduced by 258010. Remove local variables became unused.
llvm-svn: 258011
2016-01-17 12:59:40 +00:00
Artur Pilipenko f84dc06e5b Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D16226

llvm-svn: 258010
2016-01-17 12:35:29 +00:00
Manuel Jacob 5f6eaac611 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Justin Bogner fd757648a4 PM: Fix an inverted condition in simplifyFunctionCFG
I mentioned the issue here in code review way back in September and
was sure we'd fixed it, but apparently we forgot:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150921/301850.html

In any case, as soon as you try to use this pass in anything but the
most basic pipeline everything falls apart. Fix the condition.

llvm-svn: 257935
2016-01-15 21:21:39 +00:00
Artur Pilipenko 6dd6969cee Change isSafeToLoadUnconditionally arguments order. Separated from http://reviews.llvm.org/D10920.
llvm-svn: 257894
2016-01-15 15:27:46 +00:00
Keno Fischer d5354fdddb [SROA] Also insert a bit piece expression if only one piece is needed
Summary: If SROA creates only one piece (e.g. because the other is not needed),
it still needs to create a bit_piece expression if that bit piece is smaller
than the original size of the alloca.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16187

llvm-svn: 257795
2016-01-14 20:06:34 +00:00
Sanjay Patel 9913322327 move return variable declarations down to where they are actually used; NFCI
llvm-svn: 257700
2016-01-13 23:01:57 +00:00
Justin Bogner b8d82abb78 LoopUnroll: Move the actual unrolling logic to a standalone function. NFC
This is pure code motion - break the actual work out of runOnLoop into
a reusable standalone function.

llvm-svn: 257445
2016-01-12 05:21:37 +00:00
Justin Bogner 921b04e9a4 LoopUnroll: Make canUnrollCompletely static - it doesn't use any state. NFC
llvm-svn: 257427
2016-01-12 01:06:32 +00:00
Justin Bogner a1dd493159 LoopUnroll: Clean up the maze of initialization for unroll parameters. NFC
The layering of where the various loop unroll parameters are
initialized and overridden here was very confusing, making it pretty
difficult to tell just how the various sources interacted. Instead, we
put all of the initialization logic together in a single function so
that it's obvious what overrides what.

llvm-svn: 257426
2016-01-12 00:55:26 +00:00
Justin Bogner 0fb7ed5726 LoopUnroll: Use the optsize threshold for minsize as well
Currently we're unrolling loops more in minsize than in optsize, which
means -Oz will have a larger code size than -Os. That doesn't make any
sense.

This resolves the FIXME about this in LoopUnrollPass and extends the
optsize test to make sure we use the smaller threshold for minsize as
well.

llvm-svn: 257402
2016-01-11 22:39:43 +00:00
David Majnemer d9833ea579 [JumpThreading] Don't forget to report that the IR changed
JumpThreading's runOnFunction is supposed to return true if it made any
changes.  JumpThreading has a call to removeUnreachableBlocks which may
result in changes to the IR but runOnFunction didn't appropriate account
for this possibility, leading to badness.

While we are here, make sure to call LazyValueInfo::eraseBlock in
removeUnreachableBlocks;  JumpThreading preserves LVI.

This fixes PR26096.

llvm-svn: 257279
2016-01-10 07:13:04 +00:00
Benjamin Kramer 543762da3e [JumpThreading] Use range-based for loops.
No functionality change intended.

llvm-svn: 257262
2016-01-09 18:43:01 +00:00
Benjamin Kramer 530e0db333 [TRE] Simplify code with range-based loops and std::find.
No functional change intended.

llvm-svn: 257261
2016-01-09 17:35:29 +00:00
Manuel Jacob 734e73342d [RS4GC] Update and simplify handling of Constants in findBaseDefiningValueOfVector().
Summary:
This is analogous to r256079, which removed an overly strong assertion, and
r256812, which simplified the code by replacing three conditionals by one.

Reviewers: reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D16019

llvm-svn: 257250
2016-01-09 04:02:16 +00:00
Manuel Jacob 0593cfd336 [RS4GC] Unify two asserts. NFC.
llvm-svn: 257247
2016-01-09 03:08:49 +00:00
Philip Reames 5715f576ea [rs4gc] Optionally directly relocated vector of pointers
This patch teaches rewrite-statepoints-for-gc to relocate vector-of-pointers directly rather than trying to split them. This builds on the recent lowering/IR changes to allow vector typed gc.relocates.

The motivation for this is that we recently found a bug in the vector splitting code where depending on visit order, a vector might not be relocated at some safepoint. Specifically, the bug is that the splitting code wasn't updating the side tables (live vector) of other safepoints. As a result, a vector which was live at two safepoints might not be updated at one of them. However, if you happened to visit safepoints in post order over the dominator tree, everything worked correctly. Weirdly, it turns out that post order is actually an incredibly common order to visit instructions in in practice. Frustratingly, I have not managed to write a test case which actually hits this. I can only reproduce it in large IR files produced by actual applications.

Rather than continue to make this code more complicated, we can remove all of the complexity by just representing the relocation of the entire vector natively in the IR.

At the moment, the new functionality is hidden behind a flag. To use this code, you need to pass "-rs4gc-split-vector-values=0". Once I have a chance to stress test with this option and get feedback from other users, my plan is to flip the default and remove the original splitting code. I would just remove it now, but given the rareness of the bug, I figured it was better to leave it in place until the new approach has been stress tested.

Differential Revision: http://reviews.llvm.org/D15982

llvm-svn: 257244
2016-01-09 01:31:13 +00:00
Sanjay Patel 9f088ab5e2 rangify; NFCI
llvm-svn: 257226
2016-01-08 22:59:42 +00:00
Sanjay Patel 9f49b683e0 variable names start with an upper case letter; NFC
llvm-svn: 257213
2016-01-08 22:05:03 +00:00
Haicheng Wu a6a3279bd3 [JumpThreading] Split select that has constant conditions coming from the PHI node
Look for PHI/Select in the same BB of the form

bb:
  %p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ...
  %s = select p, trueval, falseval

And expand the select into a branch structure. This later enables
jump-threading over bb in this pass.

Using the similar approach of SimplifyCFG::FoldCondBranchOnPHI(), unfold
select if the associated PHI has at least one constant.  If the unfolded
select is not jump-threaded, it will be folded again in the later
optimizations.

llvm-svn: 257198
2016-01-08 19:39:39 +00:00
Justin Bogner e9fb228d59 LoopInfo: Simplify ownership of Loop objects
It's strange that LoopInfo mostly owns the Loop objects, but that it
defers deleting them to the loop pass manager. Instead, change the
oddly named "updateUnloop" to "markAsRemoved" and have it queue the
Loop object for deletion. We can't delete the Loop immediately when we
remove it, since we need its pointer identity still, so we'll mark the
object as "invalid" so that clients can see what's going on.

llvm-svn: 257191
2016-01-08 19:08:53 +00:00
Mehdi Amini 599ebf2767 Remove static global GCNames from Function.cpp and move it to the Context
This remove the need for locking when deleting a function.

Differential Revision: http://reviews.llvm.org/D15988

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 257139
2016-01-08 02:28:20 +00:00
Aditya Nandakumar f94c149f7f Instructions to be redone only if from the same BB
While adding instructions(possible roots) to be redone, make sure they
are from the same basic block.

llvm-svn: 257112
2016-01-07 23:22:55 +00:00
David Majnemer f1a9c9e148 [SCCP] Don't violate the lattice invariants
We marked values which are 'undef' as constant instead of undefined
which violates SCCP's invariants.  If we can figure out that a
computation results in 'undef', leave it in the undefined state.

This fixes PR16052.

llvm-svn: 257102
2016-01-07 21:36:16 +00:00
David Majnemer f3b99dd22e Remove junk accidentally commited with r257087
llvm-svn: 257089
2016-01-07 19:30:13 +00:00
David Majnemer bae945735a [SCCP] Can't go from overdefined to constant
The fix for PR23999 made us mark loads of null as producing the constant
undef which upsets the lattice.  Instead, keep the load as "undefined".
This fixes PR26044.

llvm-svn: 257087
2016-01-07 19:25:39 +00:00
Philip Reames 103d2381d6 [RS4GC] Add an option to suppress vector splitting
At the moment, this is essentially a diangostic option so that I can start collecting failing test cases, but we will eventually migrate to removing the vector splitting code entirely.

llvm-svn: 257015
2016-01-07 02:20:11 +00:00
Mehdi Amini 0535003bef Fix PR26051: Memcpy optimization should introduce a call to memcpy before the store destination position
This is a conservative fix, I expect Amaury to relax this.
Follow-up for r256923

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 256999
2016-01-06 23:50:22 +00:00
Amaury Sechet 3235c08253 Promote aggregate store to memset when possible
Summary: As per title. This will allow the optimizer to pick up on it.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15923

llvm-svn: 256969
2016-01-06 19:47:24 +00:00
Amaury Sechet 5fc9f6999d Remove useless DEBUG
llvm-svn: 256968
2016-01-06 19:45:09 +00:00
Amaury Sechet d3b2c0fd94 Improve load/store to memcpy for aggregate
Summary: It turns out that if we don't try to do it at the store location, we can do it before any operation that alias the load, as long as no operation alias the store.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15903

llvm-svn: 256923
2016-01-06 09:30:39 +00:00
Amaury Sechet a0c242cdfd Implement load to store => memcpy in MemCpyOpt for aggregates
Summary:
Most of the tool chain is able to optimize scalar and memcpy like operation effisciently while it isn't that good with aggregates. In order to improve the support of aggregate, we try to change aggregate manipulation into either scalar or memcpy like ones whenever possible without loosing informations.

This is one such opportunity.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15894

llvm-svn: 256868
2016-01-05 20:17:48 +00:00
Manuel Jacob 75cbfdcf03 [RS4GC] Simplify handling of Constants in findBaseDefiningValue(). NFC.
Summary:
Previously there were three conditionals, checking for global
variables, undef values and everything constant except these two, all three
returning the same value.  This commit replaces them by one conditional.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15818

llvm-svn: 256812
2016-01-05 04:06:21 +00:00
Manuel Jacob 83eefa6d20 [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.
Summary:
This commit renames GCRelocateOperands to GCRelocateInst and makes it an
intrinsic wrapper, similar to e.g. MemCpyInst.  Also, all users of
GCRelocateOperands were changed to use the new intrinsic wrapper instead.

Reviewers: sanjoy, reames

Subscribers: reames, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15762

llvm-svn: 256811
2016-01-05 04:03:00 +00:00
David Majnemer b33f3a239a [LICM] Fix a small oversight introduced in r256763
r256763 had promoteLoopAccessesToScalars check for the existence of a
catchswitch when the exit blocks were populated but
promoteLoopAccessesToScalars may be called with a prepopulated set of
exit blocks which would also need to be checked.

This fixes PR26019.

llvm-svn: 256788
2016-01-04 23:16:22 +00:00
Haicheng Wu 9d6c94006e [LIR] General refactoring to simplify code and the ease future code review
This is a resubmission of r256336 which was reverted in r256361. The issue was the lack of the invariant check of the memset value in processLooMemSet().

The original message:

Move several checks into isLegalStores. Also, delineate between those stores that are memset-able and those that are memcpy-able.

llvm-svn: 256783
2016-01-04 21:43:14 +00:00
Aditya Nandakumar 12d060481a Remove dead instructions before Redoing
Before reevaluating instructions, iterate over all instructions
to be reevaluated and remove trivially dead instructions and if
any of it's operands become trivially dead, mark it for deletion
until all trivially dead instructions have been removed

llvm-svn: 256773
2016-01-04 19:48:14 +00:00
David Majnemer 219055f9df [LICM] Don't insert instructions after a catchswitch when performing loop promotion
Inserting after a catchswitch results in verifier errors, bail out on
promotion if a catchswitch is a loop exit.

llvm-svn: 256763
2016-01-04 17:42:19 +00:00
David Majnemer 42a0730c42 [LICM] Make instruction sinking funclet-aware
We had two bugs here:
- We might try to sink into a catchswitch, causing verifier failures.
- We will succeed in sinking into a cleanuppad but we didn't update the
  funclet operand bundle.

This fixes PR26000.

llvm-svn: 256728
2016-01-04 03:37:39 +00:00
Manuel Jacob 67f1d3ac63 [RS4GC] Use DenseMap::count() instead of DenseMap::find()/DenseMap::end(). NFC.
llvm-svn: 256586
2015-12-29 22:16:41 +00:00
Manuel Jacob e3773d632e [PlaceSafepoints] Assert that the gc.safepoint_poll function is present in the module.
If running the PlaceSafepoints pass on a module which doesn't have the
gc.safepoint_poll function without disabling entry and backedge safepoints,
previously the pass crashed with an obscure error because of a null pointer.
Now it fails the assert instead.

llvm-svn: 256580
2015-12-29 21:57:55 +00:00
Geoff Berry 43dc285915 [JumpThreading] Fix opcode bonus in getJumpThreadDuplicationCost()
The code that was meant to adjust the duplication cost based on the
terminator opcode was not being executed in cases where the initial
threshold was hit inside the loop.

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15536

llvm-svn: 256568
2015-12-29 18:10:16 +00:00
Manuel Jacob 9db5b93ffc [RS4GC] Fix rematerialization of bitcast of bitcast.
Summary:
Previously, only the outer (last) bitcast was rematerialized, resulting in a
use of the unrelocated inner (first) bitcast after the statepoint.  See the
test case for an example.

Reviewers: igor-laevsky, reames

Subscribers: reames, alex, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15789

llvm-svn: 256520
2015-12-28 20:14:05 +00:00
Chen Li d71999ef1b [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types could prevent LLVM to merge different gc.statepoint nodes into PHI nodes and cause further problems with gc relocations. The patch also changes the way on how gc.relocate and gc.result look for their corresponding gc.statepoint on unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint. 

Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob

Subscribers: reames, mjacob, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15662

llvm-svn: 256443
2015-12-26 07:54:32 +00:00
Nico Weber 95cc9d5f14 Revert r256336, it caused PR25939
llvm-svn: 256361
2015-12-24 04:01:06 +00:00
Chad Rosier fba65d2fd3 [LIR] General refactoring to simplify code and the ease future code review.
Move several checks into isLegalStores. Also, delineate between those stores
that are memset-able and those that are memcpy-able.

http://reviews.llvm.org/D15683
Patch by Haicheng Wu <haicheng@codeaurora.org>!

llvm-svn: 256336
2015-12-23 17:29:33 +00:00
David Majnemer 63ad9e0543 [OperandBundles] Have TailCallElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

This fixes PR25928.

llvm-svn: 256328
2015-12-23 09:58:43 +00:00
Philip Reames ee8f055327 [GC] Make GCStrategy::isGCManagedPointer a type predicate not a value predicate [NFC]
Reasons:
1) The existing form was a form of false generality.  None of the implemented GCStrategies use anything other than a type.  Its becoming more and more clear we're going to need some type of strong GC pointer in the type system and we shouldn't pretend otherwise at this point.
2) The API was awkward when applied to vectors-of-pointers.  The old one could have been made to work, but calling isGCManagedPointer(Ty->getScalarType()) is much cleaner than the Value alternatives.  
3) The rewriting implementation effectively assumes the type based predicate as well.  We should be consistent.

llvm-svn: 256312
2015-12-23 01:42:15 +00:00
Manuel Jacob a4efd8ac2e [RS4GC] Fix base pair printing for constants.
Previously, "%" + name of the value was printed for each derived and base
pointer.  This is correct for instructions, but wrong for e.g. globals.

llvm-svn: 256305
2015-12-23 00:19:45 +00:00
Cong Hou 6a2c71af0b [BPI] Fix two potential divide-by-zero operations that are introduced in r256263.
llvm-svn: 256303
2015-12-22 23:45:55 +00:00
Cong Hou e93b8e1539 [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replace
them by probability versions. With this patch, we won't use edge weight
anymore in either IR or MC passes. Edge probabilitiy is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Manuel Jacob 4e4f60ded0 Remove deprecated llvm.experimental.gc.result.{int,float,ptr} intrinsics.
Summary:
These were deprecated 11 months ago when a generic
llvm.experimental.gc.result intrinsic, which works for all types, was added.

Reviewers: sanjoy, reames

Subscribers: sanjoy, chenli, llvm-commits

Differential Revision: http://reviews.llvm.org/D15719

llvm-svn: 256262
2015-12-22 18:44:45 +00:00
Manuel Jacob 990dfa6fe5 [RS4GC] Fix crash in the case that a live variable has a constant base.
Summary:
Previously, RS4GC crashed in CreateGCRelocates() because it assumed
that every base is also in the array of live variables, which isn't true if a
live variable has a constant base.

This change fixes the crash by making sure CreateGCRelocates() won't try to
relocate a live variable with a constant base.  This would be unnecessary
anyway because anything with a constant base won't move.

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15556

llvm-svn: 256252
2015-12-22 16:50:44 +00:00
Chad Rosier 94274fb1ad [LIR] Refactor code to enable future patch. NFC.
llvm-svn: 256159
2015-12-21 14:49:32 +00:00
Manuel Jacob 8050a49737 [RS4GC] Add an assert which fails if there is a (yet unsupported) addrspacecast.
The slightly strange indentation comes from clang-format.

llvm-svn: 256132
2015-12-21 01:26:46 +00:00
Philip Reames 5d54689bca [RS4GC] Remove an overly strong assertion
As shown by the included test case, it's reasonable to end up with constant references during base pointer calculation.  The code actually handled this case just fine, we only had the assert to help isolate problems under the belief that constant references shouldn't be present in IR generated by managed frontends. This turned out to be wrong on two fronts: 1) Manual Jacobs is working on a language with constant references, and b) we found a case where the optimizer does create them in practice.

llvm-svn: 256079
2015-12-19 02:38:22 +00:00
Jingyue Wu ba3ca76ed2 [NaryReassociate] allow candidate to have a different type
Summary:
If Candiadte may have a different type from GEP, we should bitcast or
pointer cast it to GEP's type so that the later RAUW doesn't complain.

Added a test in nary-gep.ll

Reviewers: tra, meheff

Subscribers: mcrosier, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D15618

llvm-svn: 256035
2015-12-18 21:36:30 +00:00
Philip Reames dd0948a1b6 [RS4GC] Use an value handle to help isolate errors quickly
Inspired by the bug reported in 25846.  Whatever we end up doing about that one, the value handle change is a generally good one since it will help catch this type of mistake more quickly.

Patch by: Manuel Jacob

llvm-svn: 255984
2015-12-18 03:53:28 +00:00
Sanjoy Das 0de2feceb1 [SCEV] Add and use SCEVConstant::getAPInt; NFCI
llvm-svn: 255921
2015-12-17 20:28:46 +00:00
Philip Reames 15145fb7b1 [EarlyCSE] DSE of atomic unordered stores
The rules for removing trivially dead stores are a lot less complicated than loads. Since we know the later store post dominates the former and the former dominates the later, unless the former has side effects other than the actual store, we can remove it. One slightly surprising thing is that we can freely remove atomic stores, even if the later one isn't atomic. There's no guarantee the atomic one was every visible.

For the moment, we don't handle DSE of ordered atomic stores. We could extend the same chain of reasoning to them, but the catch is we'd then have to model the ordering effect without a store instruction. Since our fences are a stronger than our operation orderings, simple using a fence isn't an obvious win. This arguable calls for a refinement in our fence specification, but that's (much) later work.

Differential Revision: http://reviews.llvm.org/D15352

llvm-svn: 255914
2015-12-17 18:50:50 +00:00
Eric Christopher bfba572425 Fix funciton->function typo.
llvm-svn: 255841
2015-12-16 23:10:53 +00:00
Justin Bogner 883a3ea67f LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC
As of r255720, the loop pass manager will DTRT when passes update the
loop info for removed loops, so they no longer need to reach into
LPPassManager APIs to do this kind of transformation. This change very
nearly removes the need for the LPPassManager to even be passed into
loop passes - the only remaining pass that uses the LPM argument is
LoopUnswitch.

llvm-svn: 255797
2015-12-16 18:40:20 +00:00
Philip Reames ae1f265bf1 [EarlyCSE] DSE of stores which write back loaded values
Extend EarlyCSE with an additional style of dead store elimination. If we write back a value just read from that memory location, we can eliminate the store under the assumption that the value hasn't changed.

I'm implementing this mostly because I noticed the omission when looking at the code. It seemed strange to have InstCombine have a peephole which was more powerful than EarlyCSE. :)

Differential Revision: http://reviews.llvm.org/D15397

llvm-svn: 255739
2015-12-16 01:01:30 +00:00
Justin Bogner 843fb204b7 LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:

- The APIs have access to pretty well any Pass state they want, so
  it's hard to tell what they may or may not do.

- Other APIs have copied these and pass around a `Pass *` even though
  they don't even use it. Some of these just hand a nullptr to the API
  since the callers don't even have a pass available.

- Passes in the new pass manager don't work like the current ones, so
  the APIs can't be used as is there.

Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.

llvm-svn: 255669
2015-12-15 19:40:57 +00:00
Justin Bogner 6291b587b6 LoopRotate: Convert the methods of LoopRotate to utility functions. NFC
This moves the actual work to do loop rotation into standalone
functions with the analysis results they need passed in as arguments,
leaving the class itself as a relatively simple shim. This will make
the functions easy to reuse when we're ready to port this
transformation to the new pass manager.

llvm-svn: 255574
2015-12-14 23:22:48 +00:00
Justin Bogner a730045156 LoopRotate: Reorder some method implementations. NFC
This just moves some callers after their callees. My next patch will
convert some of these methods to stand alone functions, and that diff
is more obviously NFC if I move these first. That change, in turn,
will make it much easier to port this pass to the new pass manager
once the loop pass manager is in place.

llvm-svn: 255573
2015-12-14 23:22:44 +00:00
Sanjay Patel af674fbfd9 getParent() ^ 3 == getModule() ; NFCI
llvm-svn: 255511
2015-12-14 17:24:23 +00:00
David Majnemer 8a1c45d6e8 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
2015-12-12 05:38:55 +00:00
Chad Rosier d7634fc91d Revert r255247, r255265, and r255286 due to serious compile-time regressions.
Revert "[DSE] Disable non-local DSE to see if the bots go green."
Revert "[DeadStoreElimination] Use range-based loops. NFC."
Revert "[DeadStoreElimination] Add support for non-local DSE."

llvm-svn: 255354
2015-12-11 18:39:41 +00:00
Hal Finkel 494393b740 AlignmentFromAssumptions and SLPVectorizer preserves AA and GlobalsAA
GlobalsAA's assumptions that passes do not escape globals not previously
escaped is not violated by AlignmentFromAssumptions and SLPVectorizer. Marking
them as such allows GlobalsAA to be preserved until GVN in the LTO pipeline.

http://lists.llvm.org/pipermail/llvm-dev/2015-December/092972.html

Patch by Vaivaswatha Nagaraj!

llvm-svn: 255348
2015-12-11 17:46:01 +00:00
Chad Rosier 843c7b4309 [DSE] Disable non-local DSE to see if the bots go green.
I see a few bots timing out, so I'm speculatively disabling r255247.

llvm-svn: 255286
2015-12-10 19:23:02 +00:00
Chad Rosier 02fe4248a2 [DeadStoreElimination] Use range-based loops. NFC.
llvm-svn: 255265
2015-12-10 17:27:18 +00:00
Chad Rosier 533bc3fcac [DeadStoreElimination] Add support for non-local DSE.
We extend the search for redundant stores to predecessor blocks that
unconditionally lead to the block BB with the current store instruction.  That
also includes single-block loops that unconditionally lead to BB, and
if-then-else blocks where then- and else-blocks unconditionally lead to BB.

http://reviews.llvm.org/D13363
Patch by Ivan Baev <ibaev@codeaurora.org>!

llvm-svn: 255247
2015-12-10 13:51:43 +00:00
Silviu Baranga 86de80db37 [LLE] Use the PredicatedScalarEvolution interface to query SCEVs for dependences
Summary:
LAA uses the PredicatedScalarEvolution interface, so it can produce
forward/backward dependences having SCEVs that are AddRecExprs only after being
transformed by PredicatedScalarEvolution.

Use PredicatedScalarEvolution to get the expected expressions.

Reviewers: anemet

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15382

llvm-svn: 255241
2015-12-10 11:07:18 +00:00
Reid Kleckner 54ade23504 [Float2Int] Don't operate on vector instructions
This fixes a crash bug. It's also not clear if we'd want to do this
transform for vectors.

llvm-svn: 255155
2015-12-09 21:08:18 +00:00
Silviu Baranga 9cd9a7e310 Re-commit r255115, with the PredicatedScalarEvolution class moved to
ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform
and Analysis modules:

[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions

Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.

This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
  P1: {a,+,b} has nsw
  P2: b = 1.

Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.

The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.

Reviewers: mzolotukhin, anemet

Subscribers: jmolloy, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14296

llvm-svn: 255122
2015-12-09 16:06:28 +00:00
Silviu Baranga ad1ccb357b Revert r255115 until we figure out how to fix the bot failures.
llvm-svn: 255117
2015-12-09 15:25:28 +00:00
Silviu Baranga 41eb682501 [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.

This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
  P1: {a,+,b} has nsw
  P2: b = 1.

Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.

The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.

Reviewers: mzolotukhin, anemet

Subscribers: jmolloy, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14296

llvm-svn: 255115
2015-12-09 15:03:52 +00:00
JF Bastien 9938425b31 EarlyCSE: fix typo from rL255054.
llvm-svn: 255102
2015-12-09 09:05:42 +00:00
Vikram TV 74b4111483 Test commit access - Fix few missing '.' in comments of LoopInterchange code.
llvm-svn: 255095
2015-12-09 05:16:24 +00:00
Sanjoy Das 42e551b92d [IndVars] Use any_of and foreach instead of explicit for loops; NFC
llvm-svn: 255077
2015-12-08 23:52:58 +00:00
Philip Reames 8fc2cbf933 [EarlyCSE] Value forwarding for unordered atomics
This patch teaches the fully redundant load part of EarlyCSE how to forward from atomic and volatile loads and stores, and how to eliminate unordered atomics (only). This patch does not include dead store elimination support for unordered atomics, that will follow in the near future.

The basic idea is that we allow all loads and stores to be tracked by the AvailableLoad table. We store a bit in the table which tracks whether load/store was atomic, and then only replace atomic loads with ones which were also atomic.

No attempt is made to refine our handling of ordered loads or stores. Those are still treated as full fences. We could pretty easily extend the release fence handling to release stores, but that should be a separate patch.

Differential Revision: http://reviews.llvm.org/D15337

llvm-svn: 255054
2015-12-08 21:45:41 +00:00
Sanjoy Das 683bf070ef [IndVars] Have getInsertPointForUses preserve LCSSA
Summary:
Also add a stricter post-condition for IndVarSimplify.

Fixes PR25578.  Test case by Michael Zolotukhin.

Reviewers: hfinkel, atrick, mzolotukhin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15059

llvm-svn: 254977
2015-12-08 00:13:21 +00:00
Philip Reames 9e5e2d61bf Reapply 254950 w/fix
254950 ended up being not NFC.  The previous code was overriding the flags for whether an instruction read or wrote memory using the target specific flags returned via TTI.  I'd missed this in my refactoring.  Since I mistakenly built only x86 and didn't notice the number of unsupported tests, I didn't catch that before the original checkin.

This raises an interesting issue though.  Given we have function attributes (i.e. readonly, readnone, argmemonly) which describe the aliasing of intrinsics, why does TTI have this information overriding the instruction definition at all?  I see no reason for this, but decided to preserve existing behavior for the moment.  The root issue might be that we don't have a "writeonly" attribute.

Original commit message:
[EarlyCSE] Simplify and invert ParseMemoryInst [NFCI]

Restructure ParseMemoryInst - which was introduced to abstract over target specific load and stores instructions - to just query the underlying instructions. In theory, this could be slightly slower than caching the results, but in practice, it's very unlikely to be measurable.

The simple query scheme makes it far easier to understand, and much easier to extend with new queries. Given I'm about to need to add new query types, doing the cleanup first seemed worthwhile.

Do we still believe the target specific intrinsic handling is worthwhile in EarlyCSE? It adds quite a bit of complexity and makes the code harder to read. Being able to delete the abstraction entirely would be wonderful.

llvm-svn: 254957
2015-12-07 22:41:23 +00:00
Philip Reames 4b5634af44 Revert 254950
It's causing test failures on AArch64.  Due to a bad build config on my part, I apparently wasn't running the tests I thought I was.

llvm-svn: 254954
2015-12-07 21:41:29 +00:00
Philip Reames 998cae653b [EarlyCSE] Simplify and invert ParseMemoryInst [NFCI]
Restructure ParseMemoryInst - which was introduced to abstract over target specific load and stores instructions - to just query the underlying instructions. In theory, this could be slightly slower than caching the results, but in practice, it's very unlikely to be measurable.

The simple query scheme makes it far easier to understand, and much easier to extend with new queries. Given I'm about to need to add new query types, doing the cleanup first seemed worthwhile.

Do we still believe the target specific intrinsic handling is worthwhile in EarlyCSE? It adds quite a bit of complexity and makes the code harder to read. Being able to delete the abstraction entirely would be wonderful.

llvm-svn: 254950
2015-12-07 21:27:15 +00:00
Philip Reames 7c6692de16 [EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC)
When the notion of target specific memory intrinsics was introduced to EarlyCSE, the commit confused the notions of volatile and simple memory access.  Since I'm about to start working on this area, cleanup the naming so that patches aren't horribly confusing.  Note that the actual implementation was always bailing if the load or store wasn't simple.  

Reminder:
- "volatile" - C++ volatile, can't remove any memory operations, but in principal unordered
- "ordered" - imposes ordering constraints on other nearby memory operations
- "atomic" - can't be split or sheared.  In LLVM terms, all "ordered" operations are also atomic so the predicate "isAtomic" is often used.
- "simple" - a load which is none of the above.  These are normal loads and what most of the optimizer works with.

llvm-svn: 254805
2015-12-05 00:18:33 +00:00
Akira Hatanaka 237916b537 [AttributeSet] Overload AttributeSet::addAttribute to reduce compile
time.

The new overloaded function is used when an attribute is added to a
large number of slots of an AttributeSet (for example, to function
parameters). This is much faster than calling AttributeSet::addAttribute
once per slot, because AttributeSet::getImpl (which calls
FoldingSet::FIndNodeOrInsertPos) is called only once per function
instead of once per slot.

With this commit, clang compiles a file which used to take over 22
minutes in just 13 seconds.

rdar://problem/23581000

Differential Revision: http://reviews.llvm.org/D15085

llvm-svn: 254491
2015-12-02 06:58:49 +00:00
Chad Rosier 869962f962 [LIR] Push check into helper function. NFC.
llvm-svn: 254416
2015-12-01 14:26:35 +00:00
Craig Topper d896b03e4c Remove an intermediate lambda. NFC
llvm-svn: 254246
2015-11-29 05:38:08 +00:00
Craig Topper e471cf32a0 Use range-based for loops. NFC
llvm-svn: 254222
2015-11-28 08:23:04 +00:00
Davide Italiano dd04fee8a6 [SCCP] More informative message if we don't know how to handle a terminator.
llvm-svn: 254093
2015-11-25 21:03:36 +00:00
Sanjay Patel 739f2ce93a use convenience function for copying IR flags; NFCI
llvm-svn: 253996
2015-11-24 17:16:33 +00:00
Chad Rosier a15b4b6af2 [LIR] Put includes in correct order. NFC.
llvm-svn: 253915
2015-11-23 21:09:13 +00:00
Andrew Kaylor 0615a0e65d [WinEH] Fix a case where GVN could incorrectly PRE a load into an EH pad.
Differential Revision: http://reviews.llvm.org/D14842

llvm-svn: 253908
2015-11-23 19:51:41 +00:00
Davide Italiano 945d05f6a0 [LoopStrengthReduce] Mark dump() definitions as LLVM_DUMP_METHOD.
llvm-svn: 253841
2015-11-23 02:47:30 +00:00
Craig Topper a5ea5289ff Use modulo operator instead of multiplying result of a divide and subtracting from the original dividend. NFC.
llvm-svn: 253792
2015-11-21 17:44:42 +00:00
Owen Anderson 630077ef55 Fix a pair of issues that caused an infinite loop in reassociate.
Terrifyingly, one of them is a mishandling of floating point vectors
in Constant::isZero().  How exactly this issue survived this long
is beyond me.

llvm-svn: 253655
2015-11-20 08:16:13 +00:00
Craig Topper e325e3806f Use range-based for loops. NFC
llvm-svn: 253652
2015-11-20 07:18:48 +00:00
Chad Rosier 1cd3da15e8 [LIR] Update some comments. NFC.
llvm-svn: 253603
2015-11-19 21:33:07 +00:00
Chad Rosier 3ecc8d8d83 [LIR] Fix 80-column from previous commit.
llvm-svn: 253586
2015-11-19 18:25:11 +00:00
Chad Rosier fddc01f393 [LIR] Sink checks into function to enable future refactoring. NFC.
The purpose of this change is help delineate the memset and memcpy
optimizations with the overall goal of resolving PR25520.

llvm-svn: 253585
2015-11-19 18:22:21 +00:00
Chad Rosier 85c21f0a6e [LIR] Use the more appropriate method. NFC.
llvm-svn: 253578
2015-11-19 17:27:28 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Weiming Zhao b69babd01e Fix bug 25440: GVN assertion after coercing loads
Optimizations like LoadPRE in GVN will insert new instructions.
If the insertion point is in a already processed BB, they should
get a value number explicitly. If the insertion point is after
current instruction, then just leave it. However, current GVN framework
has no support for it.
In this patch, we just bail out if a VN can't be found.

Dfferential Revision: http://reviews.llvm.org/D14670

A    test/Transforms/GVN/pr25440.ll
M    lib/Transforms/Scalar/GVN.cpp

llvm-svn: 253536
2015-11-19 02:45:18 +00:00
Mehdi Amini adb4057a15 Fix returned value for GVN: could return "false" even after modifying the IR
This bug would manifest in some very specific cases where all the following
conditions are fullfilled:

- GVN didn't remove block
- The regular GVN iteration didn't change the IR
- PRE is enabled
- PRE will not split critical edge
- The last instruction processed by PRE didn't change the IR

Because the CallGraph PassManager relies on this returned value to decide
if it needs to recompute a node after the execution of Function passes,
not returning the right value can lead to unexpected results.

Fix for: https://llvm.org/bugs/show_bug.cgi?id=24715

Patch by Wenxiang Qiu <vincentqiuuu@gmail.com>

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253518
2015-11-18 22:49:49 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Mike Aizatsky c7810baaa6 Disable gvn non-local speculative loads under asan.
Summary: Fix for https://llvm.org/bugs/show_bug.cgi?id=25550

Differential Revision: http://reviews.llvm.org/D14763

llvm-svn: 253498
2015-11-18 20:43:00 +00:00
Igor Laevsky 7310c68e85 Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)"
Failing clang test is now fixed by the r253458.

llvm-svn: 253459
2015-11-18 14:50:18 +00:00
Craig Topper 66059c9f4d Replace dyn_cast with isa in places that weren't using the returned value for more than a boolean check. NFC.
llvm-svn: 253441
2015-11-18 07:07:59 +00:00
Philip Reames b6e8fe3dac [PRE] Preserve !invariant.load metadata
Spoted via inspection.  Test case included.

llvm-svn: 253275
2015-11-17 00:15:09 +00:00
Owen Anderson 2de9f545aa Add intermediate subtract instructions to reassociation worklist.
We sometimes create intermediate subtract instructions during
reassociation.  Adding these to the worklist to revisit exposes many
additional reassociation opportunities.

Patch by Aditya Nandakumar.

llvm-svn: 253240
2015-11-16 18:07:30 +00:00
David Majnemer 7378e7a333 [LoopStrengthReduce] Don't increment iterator past the end of the BB
We tried to move the insertion point beyond instructions like landingpad
and cleanuppad.
However, we *also* tried to move past catchpad.  This is problematic
because catchpad is also a terminator.

This fixes PR25541.

llvm-svn: 253238
2015-11-16 17:37:58 +00:00
Keno Fischer 86c95b5642 [Sink] Don't move landingpads
Summary: Moving landingpads into successor basic blocks makes the
verifier sad. Teach Sink that much like PHI nodes and terminator
instructions, landingpads (and cleanuppads, etc.) may not be moved
between basic blocks.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14475

llvm-svn: 253182
2015-11-16 04:47:58 +00:00
Chad Rosier cc299b627d [LIR] Add support for creating memcpys from loops with a negative stride.
This allows us to transform the below loop into a memcpy.

void test(unsigned *__restrict__ a, unsigned *__restrict__ b) {
  for (int i = 2047; i >= 0; --i) {
    a[i] = b[i];
  }
}

This is the memcpy version of r251518, which added support for memset with
negative strided loops.

llvm-svn: 253091
2015-11-13 21:51:02 +00:00
Chad Rosier 2fa50a7a05 Add a comment that should have made my last commit.
llvm-svn: 253063
2015-11-13 19:13:40 +00:00
Chad Rosier ed0c7d1316 [LIR] Factor out the code to compute base ptr for negative strided loops.
This will allow for the code to be reused in the memcpy optimization.

llvm-svn: 253061
2015-11-13 19:11:07 +00:00
Tobias Grosser 8241795d20 Revert "Fix bug 25440: GVN assertion after coercing loads"
This reverts 252919 which broke LNT: MultiSource/Applications/SPASS

llvm-svn: 252936
2015-11-12 20:04:21 +00:00
Chad Rosier a548fe569b [LIR] Minor refactoring. NFCI.
This change prevents uninteresting stores from being inserted into the list of
candidate stores for memset/memcpy conversion.

llvm-svn: 252926
2015-11-12 19:09:16 +00:00
Weiming Zhao eed0145dd2 Fix bug 25440: GVN assertion after coercing loads
Summary:
when coercing loads, it inserts some instructions, which have no GV assigned.

https://llvm.org/bugs/show_bug.cgi?id=25440


Reviewers: hfinkel, dberlin

Subscribers: dberlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14479

llvm-svn: 252919
2015-11-12 18:19:59 +00:00
Chad Rosier cc9030b60a [LIR] General refactor to improve compile-time and simplify code.
First create a list of candidates, then transform.  This simplifies the code in
that you have don't have to worry that you may be using an invalidated
iterator.

Previously, each time we created a memset/memcpy we would reevaluate the entire
loop potentially resulting in lots of redundant work for large basic blocks.

llvm-svn: 252817
2015-11-11 23:00:59 +00:00
Renato Golin 0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Igor Laevsky 01c3692a10 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Chad Rosier 19dc92dc8d Simplify. NFC.
llvm-svn: 252491
2015-11-09 16:56:06 +00:00
Silviu Baranga 2910a4f6b1 Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates
Summary:
LAA currently generates a set of SCEV predicates that must be checked by users.
In the case of Loop Distribute/Loop Load Elimination, no such predicates could have
been emitted, since we don't allow stride versioning. However, in the future there
could be SCEV predicates that will need to be checked.

This change adds support for SCEV predicate versioning in the Loop Distribute, Loop
Load Eliminate and the loop versioning infrastructure.

Reviewers: anemet

Subscribers: mssimpso, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14240

llvm-svn: 252467
2015-11-09 13:26:09 +00:00
David Majnemer b222184223 [LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad preds
We cannot really insert fixup code into a PHI's predecessor.

This fixes PR25445.

llvm-svn: 252416
2015-11-08 05:04:07 +00:00
Duncan P. N. Exon Smith 83c4b68720 ADT: Remove last implicit ilist iterator conversions, NFC
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress.  This removes them.

I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).

llvm-svn: 252371
2015-11-07 00:01:16 +00:00
Akira Hatanaka 5cfcce12eb Add 'notail' marker for call instructions.
This marker prevents optimization passes from adding 'tail' or
'musttail' markers to a call. Is is used to prevent tail call
optimization from being performed on the call.

rdar://problem/22667622

Differential Revision: http://reviews.llvm.org/D12923

llvm-svn: 252368
2015-11-06 23:55:38 +00:00
Sanjoy Das 55ea67cea7 [ValueTracking] Add parameters to isImpliedCondition; NFC
Summary:
This change makes the `isImpliedCondition` interface similar to the rest
of the functions in ValueTracking (in that it takes a DataLayout,
AssumptionCache etc.).  This is an NFC, intended to make a later diff
less noisy.

Depends on D14369

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14391

llvm-svn: 252333
2015-11-06 19:01:08 +00:00
Chad Rosier 43f9b48975 [LIR] Simplify code by making DataLayout globally accessible. NFC.
llvm-svn: 252317
2015-11-06 16:33:57 +00:00
Eugene Zelenko ffec81ca00 Fix some Clang-tidy modernize warnings, other minor fixes.
Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg.

Differential revision: http://reviews.llvm.org/D14312

llvm-svn: 252087
2015-11-04 22:32:32 +00:00
Philip Reames 814fb60130 [CVP] Fold return values if possible
In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being agressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns.

The motivation for this is two fold:
 * The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified.
 * It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of it's block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particular important when combined with http://reviews.llvm.org/D14263.

Differential Revision: http://reviews.llvm.org/D14271

llvm-svn: 252032
2015-11-04 01:43:54 +00:00
Adam Nemet 7c94c9bf07 Fix unused variable warning from r252017
llvm-svn: 252019
2015-11-04 00:10:33 +00:00
Adam Nemet e54a4fa95d LLE 6/6: Add LoopLoadElimination pass
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop.  E.g.:

  for (i)
     A[i + 1] = A[i] + B[i]

  =>

  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T

The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load.  Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.

This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning.  As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here.  Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.

In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads.  I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.

The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop.  Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%).  This gain also transfers over to x86: it's
around 8-10%.

Right now the pass is off by default and can be enabled
with -enable-loop-load-elim.  On the LNT testsuite, there are two
performance changes (negative number -> improvement):

  1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
     critical paths is reduced
  2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
     outside of LNT

The pass is scheduled after the loop vectorizer (which is after loop
distribution).  The rational is to try to reuse LAA state, rather than
recomputing it.  The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.

LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences).  LAA is known to omit loop-independent
dependences in certain situations.  The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.

Reviewers: dberlin, hfinkel

Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13259

llvm-svn: 252017
2015-11-03 23:50:08 +00:00
Adam Nemet a2df750fb3 [LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC
Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13256

llvm-svn: 251985
2015-11-03 21:39:52 +00:00
Tobias Grosser 526d52691a Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"
Commit 251839 triggers miscompiles on some bots:

http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723

(The commit is listed in 13722, but due to an existing failure introduced in
13721 and reverted in 13723 the failure is only visible in 13723)

To verify r251839 is indeed the only change that triggered the buildbot failures
and to ensure the buildbots remain green while investigating I temporarily
revert this commit. At the current state it is unclear if this commit introduced
some miscompile or if it only exposed code to Polly that is subsequently
miscompiled by Polly.

llvm-svn: 251901
2015-11-03 07:14:39 +00:00
Chen Li d715310162 [IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.


Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13974

llvm-svn: 251839
2015-11-02 22:00:15 +00:00
Justin Bogner 19b679963f [PM] Port ADCE to the new pass manager
llvm-svn: 251725
2015-10-30 23:13:18 +00:00
Philip Reames eb3e9dad7f [LVI/CVP] Teach LVI about range metadata
Somewhat shockingly for an analysis pass which is computing constant ranges, LVI did not understand the ranges provided by range metadata.

As part of this change, I included a change to CVP primarily because doing so made it much easier to write small self contained test cases. CVP was previously only handling the non-local operand case, but given that LVI can sometimes figure out information about instructions standalone, I don't see any reason to restrict this.  There could possibly be a compile time impact from this, but I suspect it should be minimal.  If anyone has an example which substaintially regresses, please let me know.  I could restrict the block local handling to ICmps feeding Terminator instructions if needed.  

Note that this patch continues a somewhat bad practice in LVI. In many cases, we know facts about values, and separate context sensitive facts about values. LVI makes no effort to distinguish and will frequently cache the same value fact repeatedly for different contexts. I would like to change this, but that's a large enough change that I want it to go in separately with clear documentation of what's changing. Other examples of this include the non-null handling, and arguments.

As a meta comment: the entire motivation of this change was being able to write smaller (aka reasonable sized) test cases for a future patch teaching LVI about select instructions.

Differential Revision: http://reviews.llvm.org/D13543

llvm-svn: 251606
2015-10-29 03:57:17 +00:00
Sanjoy Das 13e63a2f21 [JumpThreading] Use dominating conditions to prove implications
Summary:
If P branches to Q conditional on C and Q branches to R conditional on
C' and C => C' then the branch conditional on C' can be folded to an
unconditional branch.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13972

llvm-svn: 251557
2015-10-28 21:27:08 +00:00
Chad Rosier 7142da0ed4 Typo.
llvm-svn: 251521
2015-10-28 15:08:33 +00:00
Chad Rosier 7967614b2b Reapply: [LIR] Add support for creating memsets from loops with a negative stride.
The simple fix is to prevent forming memcpy from loops with a negative stride.

llvm-svn: 251518
2015-10-28 14:38:49 +00:00
Chad Rosier 8eb2a18a9f Revert "[LIR] Add support for creating memsets from loops with a negative stride."
This reverts commit r251512.  This is causing LNT/chomp to fail.

llvm-svn: 251513
2015-10-28 13:54:09 +00:00
Chad Rosier d6a6bd5501 [LIR] Add support for creating memsets from loops with a negative stride.
http://reviews.llvm.org/D14125

llvm-svn: 251512
2015-10-28 12:55:34 +00:00
Chen Li 8d23a9bbef Revert r251492 "[IndVarSimplify] Rewrite loop exit values with their
initial values from loop preheader", because it broke some bots.

llvm-svn: 251498
2015-10-28 05:15:51 +00:00
Chen Li 032a5d0cea [IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.


Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13974

llvm-svn: 251492
2015-10-28 04:45:47 +00:00
Igor Laevsky 1ef06559f4 [RS4GC] Strip noalias attribute after statepoint rewrite
We should remove noalias along with dereference and dereference_or_null attributes 
because statepoint could potentially touch the entire heap including noalias objects.

Differential Revision: http://reviews.llvm.org/D14032

llvm-svn: 251333
2015-10-26 19:06:01 +00:00