Commit Graph

6922 Commits

Author SHA1 Message Date
Sanjoy Das 25ec1a3e60 [RS4GC] Use "deopt" operand bundles
Summary:
This is a step towards using operand bundles to carry deopt state till
RewriteStatepointsForGC.  The change adds a flag to
RewriteStatepointsForGC that teaches it to pick up deopt state from a
`"deopt"` operand bundle attached to the `call` or `invoke` it is
wrapping.

The command line flag added, `-rs4gc-use-deopt-bundles`, will only exist
for a short while.  Once we are able to pipe deopt bundle state through
the full optimization pipeline without problems, we will "constant fold"
`-rs4gc-use-deopt-bundles` to `true`.

Reviewers: swaroop.sridhar, reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13372

llvm-svn: 250489
2015-10-16 02:41:00 +00:00
Sanjoy Das 7360f30852 [IndVars] Rename getExtend; NFC
Rename `IndVarSimplify::getExtend` to `IndVarSimplify::createExtendInst`
to make it obvious that it creates `llvm::Instruction` s.

llvm-svn: 250484
2015-10-16 01:00:50 +00:00
Sanjoy Das 37e87c2023 [IndVars] Have `cloneArithmeticIVUser` guess better
Summary:
`cloneArithmeticIVUser` currently trips over expression like `add %iv,
-1` when `%iv` is being zero extended -- it tries to construct the
widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove
equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is
`{1,+,1}`).

This change teaches `IndVars` to try sign extending the non-IV operand
if that makes the newly constructed IV use equivalent to the widened
narrow IV use.

Reviewers: atrick, hfinkel, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13717

llvm-svn: 250483
2015-10-16 01:00:47 +00:00
Sanjoy Das 472840a3d3 [IndVars] Extract out a few local variables; NFC
llvm-svn: 250482
2015-10-16 01:00:44 +00:00
Sanjoy Das 1fd184e5a2 [IndVars] Split `WidenIV::cloneIVUser`; NFC
Summary:
This NFC splitting is intended to make a later diff easier to follow.
It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and
`cloneBitwiseIVUser`.

Reviewers: atrick, hfinkel, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13716

llvm-svn: 250481
2015-10-16 01:00:39 +00:00
Benjamin Kramer 6db3338cb1 [ScalarOpts] Remove dead code.
Does not touch debug dumpers. NFC.

llvm-svn: 250417
2015-10-15 15:08:58 +00:00
Manman Ren 72d44b1b09 Recommit r250345, it was reverted in r250366 to investigate a bot failure.
Our internal bot is still red after r250366.

llvm-svn: 250415
2015-10-15 14:59:40 +00:00
Manman Ren f5499fd9d5 Temporarily revert r250345 to sort out bot failure.
With r250345 and r250343, we start to observe the following failure
when bootstrap clang with lto and pgo:
PHI node entries do not match predecessors!
  %.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895
label %30998
label %_ZNK4llvm3EVTeqES0_.exit19.thread.i
LLVM ERROR: Broken function found, compilation aborted!

I will re-commit this if the bot does not recover.

llvm-svn: 250366
2015-10-15 04:58:24 +00:00
Cong Hou b74d3b3b86 Update the branch weight metadata in JumpThreading pass.
Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB).

This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added.

Differential revision: http://reviews.llvm.org/D10979

llvm-svn: 250345
2015-10-14 23:14:17 +00:00
Chen Li 567aa7ab30 [LoopUnswitch] Correct misleading comments.
Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13738

llvm-svn: 250317
2015-10-14 19:47:43 +00:00
Manman Ren 2c8e16d507 Revert r250204 and r250240 due to bot failure. We failed to build PGO-ed clang.
llvm-svn: 250264
2015-10-14 03:04:03 +00:00
Chad Rosier 7f08d80595 Typo.
llvm-svn: 250224
2015-10-13 20:59:16 +00:00
Duncan P. N. Exon Smith be4d8cba1c Scalar: Remove remaining ilist iterator implicit conversions
Remove remaining `ilist_iterator` implicit conversions from
LLVMScalarOpts.

This change exposed some scary behaviour in
lib/Transforms/Scalar/SCCP.cpp around line 1770.  This patch changes a
call from `Function::begin()` to `&Function::front()`, since the return
was immediately being passed into another function that takes a
`Function*`.  `Function::front()` started to assert, since the function
was empty.  Note that `Function::end()` does not point at a legal
`Function*` -- it points at an `ilist_half_node` -- so the other
function was getting garbage before.  (I added the missing check for
`Function::isDeclaration()`.)

Otherwise, no functionality change intended.

llvm-svn: 250211
2015-10-13 19:26:58 +00:00
Cong Hou 7ab123a5cf Update the branch weight metadata in JumpThreading pass.
Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB).

Differential revision: http://reviews.llvm.org/D10979

llvm-svn: 250204
2015-10-13 18:43:10 +00:00
Duncan P. N. Exon Smith 3a9c9e3dcd Scalar: Remove some implicit ilist iterator conversions, NFC
Remove some of the implicit ilist iterator conversions in
LLVMScalarOpts.  More to go.

llvm-svn: 250197
2015-10-13 18:26:00 +00:00
Sanjoy Das b873cbe5c9 [IndVars] NFC Cleanup.
- Rename methods according to the LLVM Coding Style
 - Merge adjacent anonymous namespace block
 - Use `auto` in two places

llvm-svn: 250152
2015-10-13 07:17:38 +00:00
Manman Ren 9f824dab1d Revert 250089 due to bot failure. It failed when building clang itself with PGO.
llvm-svn: 250145
2015-10-13 03:38:02 +00:00
Cong Hou 3320bcd815 Update the branch weight metadata in JumpThreading pass.
In JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). 

Differential revision: http://reviews.llvm.org/D10979

llvm-svn: 250089
2015-10-12 19:44:08 +00:00
Sanjoy Das cc16ccc1ab [IndVars] Use `auto`; NFC
llvm-svn: 249944
2015-10-10 06:33:33 +00:00
Owen Anderson 97ca0f3f2c Generalize convergent check to handle invokes as well as calls.
llvm-svn: 249892
2015-10-09 20:17:46 +00:00
Owen Anderson 2c9978b12b Teach LoopUnswitch not to perform non-trivial unswitching on loops containing convergent operations.
Doing so could cause the post-unswitching convergent ops to be
control-dependent on the unswitch condition where they were not before.
This check could be refined to allow unswitching where the convergent
operation was already control-dependent on the unswitch condition.

llvm-svn: 249874
2015-10-09 18:40:20 +00:00
Owen Anderson d95b08a0a7 Refine the definition of convergent to only disallow the addition of new control dependencies.
This covers the common case of operations that cannot be sunk.
Operations that cannot be hoisted should already be handled properly via
the safe-to-speculate rules and mechanisms.

llvm-svn: 249865
2015-10-09 18:06:13 +00:00
Andrea Di Biagio 99493df257 [MemCpyOpt] Fix wrong merging adjacent nontemporal stores into memset calls.
Pass MemCpyOpt doesn't check if a store instruction is nontemporal.
As a consequence, adjacent nontemporal stores are always merged into a
memset call.

Example:

;;;
define void @foo(<4 x float>* nocapture %p) {
entry:
  store <4 x float> zeroinitializer, <4 x float>* %p, align 16, !nontemporal !0
  %p1 = getelementptr inbounds <4 x float>, <4 x float>* %dst, i64 1
  store <4 x float> zeroinitializer, <4 x float>* %p1, align 16, !nontemporal !0
  ret void
}

!0 = !{i32 1}
;;;

In this example, the two nontemporal stores are combined to a memset of zero
which does not preserve the nontemporal hint. Later on the backend (tested on a
x86-64 corei7) expands that memset call into a sequence of two normal 16-byte
aligned vector stores.

opt -memcpyopt example.ll -S -o - | llc -mcpu=corei7 -o -

Before:
  xorps  %xmm0, %xmm0
  movaps  %xmm0, 16(%rdi)
  movaps  %xmm0, (%rdi)

With this patch, we no longer merge nontemporal stores into calls to memset.
In this example, llc correctly expands the two stores into two movntps:
  xorps  %xmm0, %xmm0
  movntps %xmm0, 16(%rdi)
  movntps  %xmm0, (%rdi)

In theory, we could extend the usage of !nontemporal metadata to memcpy/memset
calls. However a change like that would only have the effect of forcing the
backend to expand !nontemporal memsets back to sequences of store instructions.
A memset library call would not have exactly the same semantic of a builtin
!nontemporal memset call. So, SelectionDAG will have to conservatively expand
it back to a sequence of !nontemporal stores (effectively undoing the merging).

Differential Revision: http://reviews.llvm.org/D13519

llvm-svn: 249820
2015-10-09 10:53:41 +00:00
Arnaud A. de Grandmaison 859b2ac07d [EarlyCSE] Address post commit review for r249523.
llvm-svn: 249814
2015-10-09 09:23:01 +00:00
Sanjoy Das 3c520a1272 [RS4GC] Refactoring to make a later change easier, NFCI
Summary:
These non-semantic changes will help make a later change adding
support for deopt operand bundles more streamlined.

Reviewers: reames, swaroop.sridhar

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13491

llvm-svn: 249779
2015-10-08 23:18:38 +00:00
Sanjoy Das c21a05a3a4 [PlaceSafeopints] Extract out `callsGCLeafFunction`, NFC
Summary:
This will be used in a later change to RewriteStatepointsForGC.

Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13490

llvm-svn: 249777
2015-10-08 23:18:30 +00:00
Sanjoy Das 1ede5367ba [RS4GC] Don't copy ADT's unneccessarily, NFCI
Summary: Use `const auto &` instead of `auto` in `makeStatepointExplicit`.

Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13454

llvm-svn: 249776
2015-10-08 23:18:22 +00:00
Sanjoy Das 40bdd041db [RS4GC] Use AssertingVH for RematerializedValueMapTy, NFCI
Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13489

llvm-svn: 249620
2015-10-07 21:32:35 +00:00
Arnaud A. de Grandmaison a6178a179d [EarlyCSE] Fix handling of target memory intrinsics for CSE'ing loads.
Summary:
Some target intrinsics can access multiple elements, using the pointer as a
base address (e.g. AArch64 ld4). When trying to CSE such instructions,
it must be checked the available value comes from a compatible instruction
because the pointer is not enough to discriminate whether the value is
correct.

Reviewers: ssijaric

Subscribers: mcrosier, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D13475

llvm-svn: 249523
2015-10-07 07:41:29 +00:00
Sanjoy Das 60bf3db17f [RS4GC] Remove an unnecessary assert & related variables
I don't think this assert adds much value, and removing it and related
variables avoids an "unused variable" warning in release builds.

llvm-svn: 249511
2015-10-07 02:39:27 +00:00
Sanjoy Das b40bd1a93f [RS4GC] Cosmetic cleanup, NFC
Summary:
A series of cosmetic cleanup changes to RewriteStatepointsForGC:

  - Rename variables to LLVM style
  - Remove some redundant asserts
  - Remove an unsued `Pass *` parameter
  - Remove unnecessary variables
  - Use C++11 idioms where applicable
  - Pass CallSite by value, not reference

Reviewers: reames, swaroop.sridhar

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13370

llvm-svn: 249508
2015-10-07 02:39:18 +00:00
Hans Wennborg 083ca9bb32 Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D13321

llvm-svn: 249482
2015-10-06 23:24:35 +00:00
Sanjoy Das 5c8bead46d [IndVars] Don't break dominance in `eliminateIdentitySCEV`
Summary:
After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and
Y are related in the dominator tree, even if X is an operand to Y (I've
included a toy example in comments, and a real example as a test case).

This commit changes `SimplifyIndVar` to require a `DominatorTree`.  I
don't think this is a problem because `ScalarEvolution` requires it
anyway.

Fixes PR25051.

Depends on D13459.

Reviewers: atrick, hfinkel

Subscribers: joker.eph, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13460

llvm-svn: 249471
2015-10-06 21:44:49 +00:00
Arnaud A. de Grandmaison 6fd488b156 [EarlyCSE] Constify ParseMemoryInst methods (NFC).
llvm-svn: 249400
2015-10-06 13:35:30 +00:00
Piotr Padlewski dc9b2cfc50 inariant.group handling in GVN
The most important part required to make clang
devirtualization works ( ͡°͜ʖ ͡°).
The code is able to find non local dependencies, but unfortunatelly
because the caller can only handle local dependencies, I had to add
some restrictions to look for dependencies only in the same BB.

http://reviews.llvm.org/D12992

llvm-svn: 249196
2015-10-02 22:12:22 +00:00
Jingyue Wu df1a1b113b [NaryReassociate] SeenExprs records WeakVH
Summary:
The instructions SeenExprs records may be deleted during rewriting.
FindClosestMatchingDominator should ignore these deleted instructions.

Fixes PR24301.

Reviewers: grosser

Subscribers: grosser, llvm-commits

Differential Revision: http://reviews.llvm.org/D13315

llvm-svn: 248983
2015-10-01 03:51:44 +00:00
Fiona Glaser b0c6d9174e DeadCodeElimination: rewrite to be faster
Same strategy as simplifyInstructionsInBlock. ~1/3 less time
on my test suite. This pass doesn't have many in-tree users,
but getting rid of an O(N^2) worst case and making it cleaner
should at least make it a viable alternative to ADCE, since
it's now consistently somewhat faster.

llvm-svn: 248927
2015-09-30 17:49:49 +00:00
Chen Li 9f27fc0599 [LoopUnswitch] Add block frequency analysis to recognize hot/cold regions
Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default.

Reviewers: broune, silvas, dnovillo, reames

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D11605

llvm-svn: 248777
2015-09-29 05:03:32 +00:00
Weiming Zhao 310770a90f [LoopReroll] Ignore debug intrinsics
Originally, debug intrinsics and annotation intrinsics may prevent
the loop to be rerolled, now they are ignored.

Differential Revision: http://reviews.llvm.org/D13150

llvm-svn: 248718
2015-09-28 17:03:23 +00:00
Justin Bogner 0638b7ba99 ADCE: Fix typo in file comment. NFC
llvm-svn: 248613
2015-09-25 21:03:46 +00:00
Lawrence Hu cac0b89289 Swap loop invariant GEP with loop variant GEP to allow more LICM.
This patch changes the order of GEPs generated by Splitting GEPs
    pass, specially when one of the GEPs has constant and the base is
    loop invariant, then we will generate the GEP with constant first
    when beneficial, to expose more cases for LICM.

    If originally Splitting GEP generate the following:
      do.body.i:
        %idxprom.i = sext i32 %shr.i to i64
        %2 = bitcast %typeD* %s to i8*
        %3 = shl i64 %idxprom.i, 2
        %uglygep = getelementptr i8, i8* %2, i64 %3
        %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032
      ...
    Now it genereates:
      do.body.i:
        %idxprom.i = sext i32 %shr.i to i64
        %2 = bitcast %typeD* %s to i8*
        %3 = shl i64 %idxprom.i, 2
        %uglygep = getelementptr i8, i8* %2, i64 1032
        %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3
      ...

    For no-loop cases, the original way of generating GEPs seems to
    expose more CSE cases, so we don't change the logic for no-loop
    cases, and only limit our change to the specific case we are
    interested in.

llvm-svn: 248420
2015-09-23 19:25:30 +00:00
Igor Laevsky 029bd93c5d [DeadStoreElimination] Remove dead zero store to calloc initialized memory
This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function.

Differential Revision: http://reviews.llvm.org/D13021

llvm-svn: 248374
2015-09-23 11:38:44 +00:00
Sanjoy Das 2aacc0ecca [SCEV] Introduce ScalarEvolution::getOne and getZero.
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.

I've refactored the call sites I could find easily to use getZero /
getOne.

Reviewers: hfinkel, majnemer, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12947

llvm-svn: 248362
2015-09-23 01:59:04 +00:00
Michael Zolotukhin deade19630 [Unroll] Do not crash trying to propagate a value to vector load.
llvm-svn: 248333
2015-09-22 22:27:12 +00:00
Michael Zolotukhin 8bb31dd08a [Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.
Apart from checking that GlobalVariable is a constant, we should check
that it's not a weak constant, in which case we can't propagate its
value.

llvm-svn: 248327
2015-09-22 21:41:29 +00:00
NAKAMURA Takumi 10c80e7996 Prune trailing whitespaces.
llvm-svn: 248265
2015-09-22 11:19:03 +00:00
NAKAMURA Takumi 0a7d0ad95f Untabify.
llvm-svn: 248264
2015-09-22 11:15:07 +00:00
NAKAMURA Takumi a9cb538a74 Reformat blank lines.
llvm-svn: 248263
2015-09-22 11:14:39 +00:00
NAKAMURA Takumi 84965031a7 Reformat comment lines.
llvm-svn: 248262
2015-09-22 11:14:12 +00:00
NAKAMURA Takumi 70ad98aca4 Reformat.
llvm-svn: 248261
2015-09-22 11:13:55 +00:00
Michael Zolotukhin 9f3aea6e1f [LoopUnswitch] Require DominatorTree info.
Summary:
We should either require the DT info to be available, or check if it's
available in every place we use DT (and we already miss such check in
one place, which causes failures in some cases). As other loop passes
preserve DT and it's usually available, it makes sense to just require
it here.

There is no regression test, because the bug only shows up if pass
manager decides to clean DT info right before LoopUnswitch. If
loop-unswitch is run separately, DT is available, so bug isn't exposed.

Reviewers: chandlerc, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13036

llvm-svn: 248230
2015-09-22 00:22:47 +00:00
Philip Reames 5f99423de9 [LICM] Hoist calls to readonly argmemonly functions even with stores in the loop
We know that an argmemonly function can only access memory pointed to by it's pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we can only consider the aliasing of the pointer arguments.

Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change.

FYI, argmemonly disallows accessing memory through non-pointer typed arguments.  

Differential Revision: http://reviews.llvm.org/D12771

llvm-svn: 248220
2015-09-21 22:27:59 +00:00
Mehdi Amini 24e20583d1 Fix UB: can't bind a reference to nullptr (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 248213
2015-09-21 21:29:43 +00:00
Sanjoy Das 7cc2cfecd9 [IndVars] Use C++11 style field initialization; NFCI.
llvm-svn: 248131
2015-09-20 18:42:53 +00:00
Sanjoy Das e1e352d5c5 [IndVars] Don't add a level of indentation for namespace {. NFC.
Whitespace-only change.

llvm-svn: 248130
2015-09-20 18:42:50 +00:00
Sanjoy Das 9119bf4c0b [IndVars] Don't repeat function names in comment; NFC.
Only changes comments.

llvm-svn: 248112
2015-09-20 06:58:03 +00:00
Sanjoy Das 428db150d1 [IndVars] Fix a bug in r248045.
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`).  Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.

This change also adds a test case that demonstrates the problem with
r248045.

llvm-svn: 248107
2015-09-20 01:52:18 +00:00
Sanjoy Das f69d0e3384 [IndVars] Widen more comparisons for non-negative induction vars
Summary:
If an induction variable is provably non-negative, its sign extension is
equal to its zero extension.  This means narrow uses like

  icmp slt iNarrow %indvar, %rhs

can be widened into

  icmp slt iWide zext(%indvar), sext(%rhs)

Reviewers: atrick, mcrosier, hfinkel

Subscribers: hfinkel, reames, llvm-commits

Differential Revision: http://reviews.llvm.org/D12745

llvm-svn: 248045
2015-09-18 21:21:02 +00:00
Larisse Voufo 532bf7153c Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886)
llvm-svn: 248022
2015-09-18 19:14:35 +00:00
Piotr Padlewski a4d43337d4 gvn small fix
http://reviews.llvm.org/D12928

llvm-svn: 247935
2015-09-17 20:34:22 +00:00
David L Kreitzer da700ce581 Test commit: Fixed a few typos in the comments.
llvm-svn: 247793
2015-09-16 13:27:30 +00:00
Michael Zolotukhin fc314be0ec [Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.
We only checked that a global is initialized with constants, which is
incorrect. We should be checking that GlobalVariable *is* a constant,
not just initialized with it.

llvm-svn: 247769
2015-09-16 03:25:09 +00:00
Sanjoy Das 8a5526e8be [IndVars] Fix PR24783.
In `IndVarSimplify::ExpandSCEVIfNeeded`,
`SCEVExpander::findExistingExpansion` may return an `llvm::Value` that
differs in type from the SCEV it was asked to find an expansion for (but
computes the same value).  In such cases, we fall back on
`expandCodeFor`; and rely on LLVM to CSE the two equivalent
expressions (different only by a no-op cast) into a single computation.

I tried a few other approaches to fixing PR24783, all of which turned
out to be more complex than this current version:

 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`.  This got
    problematic because currently we do not pass in the `Loop *` into
    `expandCodeFor`.  Changing the interface to do this is a more
    invasive change, and really does not make much semantic sense unless
    the SCEV being passed in is an add recurrence.

    There is also the problem of `expandCodeFor` being used in places
    other than `indvars` -- there may be performance / correctness
    issues elsewhere if `expandCodeFor` is moved from always generating
    IR from scratch to cache-like model.

 2. Have `findExistingExpansion` only return expression with the correct
    type.  This would make `isHighCostExpansionHelper` and thus
    `isHighCostExpansion` more conservative than necessary.

 3. Insert casts on the value returned by `findExistingExpansion` if
    needed using `InsertNoopCastOfTo`.  This is complicated because
    `InsertNoopCastOfTo` depends on internal state of its
    `SCEVExpander` (specifically `Builder.GetInserPoint()`), and this
    may not be set up when `ExpandSCEVIfNeeded` is called.

 4. Manually insert casts on the value returned by
    `findExistingExpansion` if needed using `InsertNoopCastOfTo` via
    `CastInst::Create`.  This is probably workable, but figuring out the
    location where the cast instruction needs to be inserted has enough
    edge cases (arguments, constants, invokes, LCSSA must be preserved)
    makes me feel what I have right now is simplest solution.

llvm-svn: 247749
2015-09-15 23:45:39 +00:00
Sanjoy Das 0ce51a92a8 [IndVars] Rename variable; NFC.
llvm-svn: 247748
2015-09-15 23:45:35 +00:00
Larisse Voufo 6b867c7254 Revert "Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886)
llvm-svn: 247716
2015-09-15 19:14:05 +00:00
Igor Laevsky bdc1eafe20 [CorrelatedValuePropagation] Infer nonnull attributes
LazuValueInfo can prove that value is nonnull based on the context information. 
Make use of this ability to infer nonnull attributes for the call arguments.

Differential Revision: http://reviews.llvm.org/D12836

llvm-svn: 247707
2015-09-15 17:51:50 +00:00
Marcello Maggioni 454faa84e2 [NaryReassociate] Add support for Mul instructions
This patch extends the current pass by handling
Mul instructions as well.

Patch by: Volkan Keles (vkeles@apple.com)

llvm-svn: 247705
2015-09-15 17:22:52 +00:00
Sanjoy Das f75e15e5ac [PlaceSafepoints] Make the width of a counted loop settable.
Summary:
This change lets a `PlaceSafepoints` client change how wide the trip
count of a loop has to be for the loop to be considerd "counted", via
`CountedLoopTripWidth`.  It also removes the boolean `SkipCounted` flag
and the `upperTripBound` constant -- we can get the old behavior of
`SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^
13 == 8192).

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12789

llvm-svn: 247656
2015-09-15 01:42:48 +00:00
Chandler Carruth 29a18a4663 [PM] Port SROA to the new pass manager.
In some ways this is a very boring port to the new pass manager as there
are no interesting analyses or dependencies or other oddities.

However, this does introduce the first good example of a transformation
pass with non-trivial state porting to the new pass manager. I've tried
to carve out patterns here to replicate elsewhere, and would appreciate
comments on whether folks like these patterns:

- A common need in the new pass manager is to effectively lift the pass
  class and some of its state into a public header file. Prior to this,
  LLVM used anonymous namespaces to provide "module private" types and
  utilities, but that doesn't scale to cases where a public header file
  is needed and the new pass manager will exacerbate that. The pattern
  I've adopted here is to use the namespace-cased-name of the core pass
  (what would be a module if we had them) as a module-private namespace.
  Then utility and other code can be declared and defined in this
  namespace. At some point in the future, we could even have
  (conditionally compiled) code that used modules features when
  available to do the same basic thing.

- I've split the actual pass run method in two in order to expose
  a private method usable by the old pass manager to wrap the new class
  with a minimum of duplicated code. I actually looked at a bunch of
  ways to automate or generate these, but they are all quite terrible
  IMO. The fundamental need is to extract the set of analyses which need
  to cross this interface boundary, and that will end up being too
  unpredictable to effectively encapsulate IMO. This is also
  a relatively small amount of boiler plate that will live a relatively
  short time, so I'm not too worried about the fact that it is boiler
  plate.

The rest of the patch is totally boring but results in a massive diff
(sorry). It just moves code around and removes or adds qualifiers to
reflect the new name and nesting structure.

Differential Revision: http://reviews.llvm.org/D12773

llvm-svn: 247501
2015-09-12 09:09:14 +00:00
Larisse Voufo f57162b6e7 Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan.
llvm-svn: 247497
2015-09-12 01:41:55 +00:00
James Molloy efbba72cb2 Add GlobalsAA as preserved to a bunch of transforms
GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA.

llvm-svn: 247263
2015-09-10 10:22:12 +00:00
Philip Reames 953817b65d [RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC]
llvm-svn: 247223
2015-09-10 00:44:10 +00:00
Philip Reames b4e55f3923 [RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC]
The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point interation.  That's not true.  The tighter form of the assert is useful documentation.

llvm-svn: 247221
2015-09-10 00:32:56 +00:00
Philip Reames c8ded462c4 [RewriteStatepointsForGC] One last bit of naming [NFCI]
llvm-svn: 247220
2015-09-10 00:27:50 +00:00
Philip Reames 34d7a7493d [RewriteStatepointsForGC] Further style/naming fixup [NFCI]
llvm-svn: 247217
2015-09-10 00:22:49 +00:00
Philip Reames 7540e3a45d [RewriteStatepointsForGC] More naming cleanup [NFCI]
llvm-svn: 247213
2015-09-10 00:01:53 +00:00
Philip Reames ece70b8042 [RewriteStatepointsForGC] Code cleanup [NFC]
Factor out common code related to naming values, fix a small style issue.  More to follow in separate changes.

llvm-svn: 247211
2015-09-09 23:57:18 +00:00
Philip Reames 6628713f4f [RewriteStatepointsForGC] Extend base pointer inference to handle insertelement
This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done.

Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time.

Unlike previous extensions of the algorithm to handle more case, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered *all* insert element instructions, so if we had a value directly based on a insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well.

Differential Revision: http://reviews.llvm.org/D12583

llvm-svn: 247210
2015-09-09 23:40:12 +00:00
Philip Reames 15d5563cea [RewriteStatepointsForGC] Make base pointer inference deterministic
Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem.

Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't mater provided it's deterministic for a given input.

(Q: It is safe to rely on a deterministic order of operands right?)

Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :)

Differential Revision: http://reviews.llvm.org/D12640

llvm-svn: 247208
2015-09-09 23:26:08 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Chandler Carruth 1688a772fc Fix a typo I spotted when hacking on SROA. Somewhat alarming that
nothing broke.

llvm-svn: 247127
2015-09-09 09:46:16 +00:00
Sanjoy Das da0d79e0a0 [IRCE] Add INITIALIZE_PASS_DEPENDENCY invocations.
IRCE was just using INITIALIZE_PASS(), which is incorrect.

llvm-svn: 247122
2015-09-09 03:47:18 +00:00
Philip Reames 3ea158950e [RewriteStatepointsForGC] Extract common code, comment, and fix a build warning [NFC]
llvm-svn: 246810
2015-09-03 21:57:40 +00:00
Philip Reames f5b8e47651 [RewriteStatepointsForGC] Strengthen invariants around BDVs
As a first step towards a new implementation of the base pointer inference algorithm, introduce an abstraction for BDVs, strengthen the assertions around them, and rewrite the BDV relation code in terms of the abstraction which includes an explicit notion of whether the BDV is also a base. The later is motivated by the fact we had a bug where insertelement was always assumed to be a base pointer even though the BDV code knew it wasn't. The strengthened assertions in this patch would have caught that bug.

The next step will be to separate the DefiningValueMap into a BDV use list cache (entirely within findBasePointers) and a base pointer cache. Having the former will allow me to use a deterministic visit order when visiting BDVs in the inference algorithm and remove a bunch of ordering related hacks. Before actually doing the last step, I'm likely going to extend the lattice with a 'BaseN' (seen only base inputs) state so that I can kill the post process optimization step.

Phabricator Revision: http://reviews.llvm.org/D12608

llvm-svn: 246809
2015-09-03 21:34:30 +00:00
Philip Reames 246e618e77 [RewriteStatepointsForGC] Workaround a lack of determinism in visit order
The visit order being used in the base pointer inference algorithm is currently non-deterministic.  When working on http://reviews.llvm.org/D12583, I discovered that we were relying on a peephole optimization to get deterministic ordering in one of the test cases.  

This change is intented to let me test and land http://reviews.llvm.org/D12583.  The current code will not be long lived.  I'm starting to investigate a rewrite of the algorithm which will combine the post-process step into the initial algorithm and make the visit order determistic.  Before doing that, I wanted to make sure the existing code was complete and the test were stable.  Hopefully, patches should be up for review for the new algorithm this week or early next.

llvm-svn: 246801
2015-09-03 20:24:29 +00:00
Philip Reames 07a2ee1aff [RewriteStatepointsForGC] Delete stale comment [NFC]
llvm-svn: 246722
2015-09-02 22:35:42 +00:00
Philip Reames b3967cd08e [RewriteStatepointsForGC] Pull a function out of anon namespace [NFC]
Thanks to David Blaikie for noticing in previous commit.

llvm-svn: 246721
2015-09-02 22:30:53 +00:00
Philip Reames 9546f367f7 [RewriteStatepointsForGC] Bugfix for change 246133
Fix a bug in change 246133. I didn't handle the case where we had a cycle in the use graph and could add an instruction we were about to erase back on to the worklist. Oddly, I have not been able to write a small test case for this, even with the AssertingVH added. I have confirmed the basic theory for the fix on a large failing example, but all attempts to reduce that to something appropriate for a test case have failed.

Differential Revision: http://reviews.llvm.org/D12575

llvm-svn: 246718
2015-09-02 22:25:07 +00:00
Philip Reames 6906e92812 Fix release build warning for unused function
llvm-svn: 246717
2015-09-02 21:57:17 +00:00
Philip Reames dab35f317d [RewriteStatepointsForGC] Improve debug output [NFC]
llvm-svn: 246713
2015-09-02 21:11:44 +00:00
Piotr Padlewski 0c7d8fc1f6 assuem(X) handling in GVN bugfix
There was infinite loop because it was trying to change assume(true) into
assume(true)
Also added handling when assume(false) appear

http://reviews.llvm.org/D12516

llvm-svn: 246697
2015-09-02 20:00:03 +00:00
Piotr Padlewski 28ffcbe1cc Constant propagation after hitting assume(cmp) bugfix
Last time code run into assertion `BBE.isSingleEdge()` in
lib/IR/Dominators.cpp:200.

http://reviews.llvm.org/D12170

llvm-svn: 246696
2015-09-02 19:59:59 +00:00
Piotr Padlewski 14e815c22b Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
  is constant we can change all variables with constants in the same BasicBlock

http://reviews.llvm.org/D11918

llvm-svn: 246695
2015-09-02 19:59:53 +00:00
Jingyue Wu e84f671830 [JumpThreading] make jump threading respect convergent annotation.
Summary:
JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example,
  if (cond) {
    ...
  } else {
    ...
  }
  convergent_call();
  if (cond) {
    ...
  } else {
    ...
  }
should not be optimized to
  if (cond) {
    ...
    convergent_call();
    ...
  } else {
    ...
    convergent_call();
    ...
  }

Test Plan: test/Transforms/JumpThreading/basic.ll

Patch by Xuetian Weng. 

Reviewers: resistor, arsenm, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12484

llvm-svn: 246415
2015-08-31 06:10:27 +00:00
Chandler Carruth 4b682f6f24 [SROA] Fix PR24463, a crash I introduced in SROA by allowing it to
handle more allocas with loads past the end of the alloca.

I suspect there are some related crashers with slightly different
patterns, but I'll fix those and add test cases as I find them.

Thanks to David Majnemer for the excellent test case reduction here.
Made this super simple to debug and fix.

llvm-svn: 246289
2015-08-28 09:03:52 +00:00
Steven Wu 61db34d12e Revert r246244 and r246243
These two commits cause clang/llvm bootstrap to hang.

llvm-svn: 246279
2015-08-28 06:52:00 +00:00
Piotr Padlewski 3f81ec1e38 Constant propagation after hitting assume(cmp) bugfix
Last time code run into assertion `BBE.isSingleEdge()` in
lib/IR/Dominators.cpp:200.

http://reviews.llvm.org/D12170

llvm-svn: 246244
2015-08-28 01:02:00 +00:00
Piotr Padlewski 63cc5d4627 Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
  is constant we can change all variables with constants in the same BasicBlock

http://reviews.llvm.org/D11918

llvm-svn: 246243
2015-08-28 01:01:57 +00:00
James Molloy 1bbf15c57c [LoopVectorize] Extract InductionInfo into a helper class...
... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup.

NFC

llvm-svn: 246145
2015-08-27 09:53:00 +00:00
Philip Reames dfd890dd3a Allow value forwarding past release fences in EarlyCSE
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence.

We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't do this if the stores are on opposite sides of the fence.

Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering.  In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences.

Differential Revision: http://reviews.llvm.org/D11434

llvm-svn: 246134
2015-08-27 01:32:33 +00:00