Commit Graph

543 Commits

Author SHA1 Message Date
Saleem Abdulrasool 418d1ea555 PM: silence `-Wpessimizing-move` from GCC 9.2.1 (NFC)
Remove the explicit move enabling NVRO.
2019-10-27 18:33:09 -04:00
Eric Christopher c3649a0871 In the new pass manager use PTO.LoopUnrolling to determine when and how
we will unroll loops. Also comment a few occasions where we need to
know whether or not we're forcing the unwinder or not.

The default before and after this patch is for LoopUnroll to be enabled,
and for it to use a cost model to determine whether to unroll the loop
(`OnlyWhenForced = false`). Before this patch, disabling loop unroll
would not run the LoopUnroll pass. After this patch, the LoopUnroll pass
is being run, but it restricts unrolling to only the loops marked by a
pragma (`OnlyWhenForced = true`).

In addition, this patch disables the UnrollAndJam pass when disabling unrolling.

Testcase is in clang because it's controlling how the loop optimizer
is being set up and there's no other way to trigger the behavior.

llvm-svn: 374838
2019-10-14 22:56:07 +00:00
Joerg Sonnenberger 9681ea9560 Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374784
2019-10-14 16:15:14 +00:00
Dmitri Gribenko 1a21f98ac3 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

llvm-svn: 374768
2019-10-14 12:22:48 +00:00
Joerg Sonnenberger e4300c392d Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374743
2019-10-13 23:00:15 +00:00
Yuanfang Chen cc382cf727 [NewPM] Port MachineModuleInfo to the new pass manager.
Existing clients are converted to use MachineModuleInfoWrapperPass. The
new interface is for defining a new pass manager API in CodeGen.

Reviewers: fedor.sergeev, philip.pfaffe, chandlerc, arsenm

Reviewed By: arsenm, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D64183

llvm-svn: 373240
2019-09-30 17:54:50 +00:00
Thomas Preud'homme 5f738940b5 Regex: Make "match" and "sub" const member functions
Summary:
The Regex "match" and "sub" member functions were previously not "const"
because they wrote to the "error" member variable. This commit removes
those assignments, and instead assumes that the validity of the regex
is already known after the initial compilation of the regular
expression. As a result, these member functions were possible to make
"const". This makes it easier to do things like pre-compile Regexes
up-front, and makes "match" and "sub" thread-safe. The error status is
now returned as an optional output, which also makes the API of "match"
and "sub" more consistent with each other.

Also, some uses of Regex that could be refactored to be const were made const.

Patch by Nicolas Guillemot

Reviewers: jankratochvil, thopre

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67241

llvm-svn: 372764
2019-09-24 14:42:36 +00:00
Serguei Katkov a44768858c [Unroll] Add an option to control complete unrolling
Add an ability to specify the max full unroll count for LoopUnrollPass pass
in pass options.

Reviewers: fhahn, fedor.sergeev
Reviewed By: fedor.sergeev
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D67701

llvm-svn: 372305
2019-09-19 06:57:29 +00:00
Bardia Mahjour db800c267d Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372238
2019-09-18 17:43:45 +00:00
Bardia Mahjour 6476d7cf0b Revert "Data Dependence Graph Basics"
This reverts commit c98ec60993, which broke the sphinx-docs build.

llvm-svn: 372168
2019-09-17 19:22:01 +00:00
Bardia Mahjour c98ec60993 Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372162
2019-09-17 18:55:44 +00:00
Denis Bakhvalov 58f172f05a [MergedLoadStoreMotion] Sink stores to BB with more than 2 predecessors
If we have:

bb5:
  br i1 %arg3, label %bb6, label %bb7

bb6:
  %tmp = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp, align 4
  br label %bb9

bb7:
  %tmp8 = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp8, align 4
  br label %bb9

bb9:  ; preds = %bb4, %bb6, %bb7
  ...

We can't sink stores directly into bb9.
This patch creates new BB that is successor of %bb6 and %bb7
and sinks stores into that block.

SplitFooterBB is the parameter to the pass that controls
that behavior.

Change-Id: I7fdf50a772b84633e4b1b860e905bf7e3e29940f
Differential: https://reviews.llvm.org/D66234
llvm-svn: 371089
2019-09-05 17:00:32 +00:00
Alina Sbirlea 7425179fee [LoopPassManager + MemorySSA] Only enable use of MemorySSA for LPMs known to preserve it.
Summary:
Add a flag to the FunctionToLoopAdaptor that allows enabling MemorySSA only for the loop pass managers that are known to preserve it.

If an LPM is known to have only loop transforms that *all* preserve MemorySSA, then use MemorySSA if `EnableMSSALoopDependency` is set.
If an LPM has loop passes that do not preserve MemorySSA, then the flag passed is `false`, regardless of the value of `EnableMSSALoopDependency`.

When using a custom loop pass pipeline via `passes=...`, use keyword `loop` vs `loop-mssa` to use MemorySSA in that LPM. If a loop that does not preserve MemorySSA is added while using the `loop-mssa` keyword, that's an error.

Add the new `loop-mssa` keyword to a few tests where a difference occurs when enabling MemorySSA.

Reviewers: chandlerc

Subscribers: mehdi_amini, Prazek, george.burgess.iv, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66376

llvm-svn: 369548
2019-08-21 17:00:57 +00:00
Whitney Tsang dd3b6498b0 Title: Loop Cache Analysis
Summary: Implement a new analysis to estimate the number of cache lines
required by a loop nest.
The analysis is largely based on the following paper:

Compiler Optimizations for Improving Data Locality
By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng
http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf
The analysis considers temporal reuse (accesses to the same memory
location) and spatial reuse (accesses to memory locations within a cache
line). For simplicity the analysis considers memory accesses in the
innermost loop in a loop nest, and thus determines the number of cache
lines used when the loop L in loop nest LN is placed in the innermost
position.

The result of the analysis can be used to drive several transformations.
As an example, loop interchange could use it determine which loops in a
perfect loop nest should be interchanged to maximize cache reuse.
Similarly, loop distribution could be enhanced to take into
consideration cache reuse between arrays when distributing a loop to
eliminate vectorization inhibiting dependencies.

The general approach taken to estimate the number of cache lines used by
the memory references in the inner loop of a loop nest is:

Partition memory references that exhibit temporal or spatial reuse into
reference groups.
For each loop L in the a loop nest LN: a. Compute the cost of the
reference group b. Compute the 'cache cost' of the loop nest by summing
up the reference groups costs
For further details of the algorithm please refer to the paper.
Authored By: etiotto
Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet,
fhahn
Reviewed By: Meinersbur
Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595,
venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO,
Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63459

llvm-svn: 368439
2019-08-09 13:56:29 +00:00
Serguei Katkov de67affd00 [Loop Peeling] Introduce an option for profile based peeling disabling.
This patch adds an ability to disable profile based peeling 
causing the peeling of all iterations and as a result prohibits
further unroll/peeling attempts on that loop.

The motivation to get an ability to separate peeling usage in
pipeline where in the first part we peel only separate iterations if needed
and later in pipeline we apply the full peeling which will prohibit further peeling.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64983

llvm-svn: 367668
2019-08-02 09:32:52 +00:00
Rong Xu ca161fa008 [PGO] Add PGO support at -O0 in the experimental new pass manager
Add PGO support at -O0 in the experimental new pass manager to sync the
behavior of the legacy pass manager.

Also change the test of gcc-flag-compatibility.c for more complete test:
(1) change the match string to "profc" and "profd" to ensure the
    instrumentation is happening.
(2) add IR format proftext so that PGO use compilation is tested.

Differential Revision: https://reviews.llvm.org/D64029

llvm-svn: 367628
2019-08-01 22:36:34 +00:00
Leonard Chan 007f674c6a Reland the "[NewPM] Port Sancov" patch from rL365838. No functional
changes were made to the patch since then.

--------

[NewPM] Port Sancov

This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

llvm-svn: 367053
2019-07-25 20:53:15 +00:00
Leonard Chan bb147aabc6 Revert "[NewPM] Port Sancov"
This reverts commit 5652f35817.

llvm-svn: 366153
2019-07-15 23:18:31 +00:00
Leonard Chan 5652f35817 [NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

Differential Revision: https://reviews.llvm.org/D62888

llvm-svn: 365838
2019-07-11 22:35:40 +00:00
Philip Reames f47a313e71 Add a transform pass to make the executable semantics of poison explicit in the IR
Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language.

The target audience for this tool is developers working on or targetting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program.

At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there's a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs involved UB being triggered by integer induction variables with nsw/nuw flags would be reported by the current code.

(See comment in PoisonChecking.cpp for full explanation and context)

Differential Revision: https://reviews.llvm.org/D64215

llvm-svn: 365536
2019-07-09 18:49:29 +00:00
Leonard Chan 97dc622ab3 [clang][NewPM] Do not eliminate available_externally durng `-O2 -flto` runs
This fixes CodeGen/available-externally-suppress.c when the new pass manager is
turned on by default. available_externally was not emitted during -O2 -flto
runs when it should still be retained for link time inlining purposes. This can
be fixed by checking that we aren't LTOPrelinking when adding the
EliminateAvailableExternallyPass.

Differential Revision: https://reviews.llvm.org/D63580

llvm-svn: 363971
2019-06-20 19:44:51 +00:00
Johannes Doerfert aade782a98 [Attributor] Pass infrastructure and fixpoint framework
NOTE: Note that no attributes are derived yet. This patch will not go in
      alone but only with others that derive attributes. The framework is
      split for review purposes.

This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.

In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.

Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.

Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames

Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59918

llvm-svn: 362578
2019-06-05 03:02:24 +00:00
Alina Sbirlea d82ddfa7c3 [NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61612

llvm-svn: 361560
2019-05-23 21:52:59 +00:00
Alina Sbirlea e4b27869c6 [NewPassManager] Add tuning option: LoopUnrolling [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: jlebar, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61618

llvm-svn: 361540
2019-05-23 19:35:40 +00:00
Clement Courbet 43882b16a3 [MergeICmps] Make the pass compatible with the new pass manager.
Reviewers: gchatelet, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62287

llvm-svn: 361490
2019-05-23 12:35:26 +00:00
Joerg Sonnenberger ec6ee797ec Fix typos in comment.
llvm-svn: 360921
2019-05-16 18:01:57 +00:00
Leonard Chan 0cdd3b1d81 [NewPM] Port HWASan and Kernel HWASan
Port hardware assisted address sanitizer to new PM following the same guidelines as msan and tsan.

Changes:
- Separate HWAddressSanitizer into a pass class and a sanitizer class.
- Create new PM wrapper pass for the sanitizer class.
- Use the getOrINsert pattern for some module level initialization declarations.
- Also enable kernel-kwasan in new PM
- Update llvm tests and add clang test.

Differential Revision: https://reviews.llvm.org/D61709

llvm-svn: 360707
2019-05-14 21:17:21 +00:00
Alina Sbirlea 458c7339e1 [NewPassManager] Add tuning option: SLPVectorization [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61616

llvm-svn: 360276
2019-05-08 17:58:35 +00:00
Eric Christopher 86e2f169bb Tidy up a comment, fix a typo, remove a comment that's obsolete.
llvm-svn: 359852
2019-05-03 00:15:23 +00:00
Teresa Johnson 867bc3951b [ThinLTO] Pass down opt level to LTO backend and handle -O0 LTO in new PM
Summary:
The opt level was not being passed down to the ThinLTO backend when
invoked via clang (for distributed ThinLTO).

This exposed an issue where the new PM was asserting if the Thin or
regular LTO backend pipelines were invoked with -O0 (not a new issue,
could be provoked by invoking in-process *LTO backends via linker using
new PM and -O0). Fix this similar to the old PM where -O0 only does the
necessary lowering of type metadata (WPD and LowerTypeTest passes) and
then quits, rather than asserting.

Reviewers: xur

Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61022

llvm-svn: 359025
2019-04-23 18:56:19 +00:00
Serguei Katkov 40a3b96196 [NewPM] Add Option handling for SimpleLoopUnswitch
This patch enables passing options to SimpleLoopUnswitch via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60676

llvm-svn: 358880
2019-04-22 10:35:07 +00:00
Eric Christopher dfebd84eb3 Remove the EnableEarlyCSEMemSSA set of options from the legacy
and new pass managers. They were default to true and not being
used.

Differential Revision: https://reviews.llvm.org/D60747

llvm-svn: 358789
2019-04-19 22:18:53 +00:00
Alina Sbirlea 43709f7233 [LICM & MemorySSA] Make limit flags pass tuning options.
Summary:
Make the flags in LICM + MemorySSA tuning options in the old and new
pass managers.

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60490

llvm-svn: 358772
2019-04-19 17:46:50 +00:00
Alina Sbirlea 0499a2f961 [NewPassManager] Adding pass tuning options: loop vectorize.
Summary:
Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.

Reviewers: chandlerc

Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59723

llvm-svn: 358763
2019-04-19 16:11:59 +00:00
Serguei Katkov ca6c03a22f [NewPM] Add Option handling for LoopVectorize
This patch enables passing options to LoopVectorizePass via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60681

llvm-svn: 358647
2019-04-18 08:46:11 +00:00
Eric Christopher eff3b6fe7f Elaborate why we have an option on by default for enabling chr.
llvm-svn: 358641
2019-04-18 06:17:40 +00:00
Kit Barton 3cdf87940f Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Differential Revision: https://reviews.llvm.org/D55851

llvm-svn: 358607
2019-04-17 18:53:27 +00:00
Eric Christopher e29874eaa0 Revert "Add basic loop fusion pass." Per request.
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358553
2019-04-17 04:55:24 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Kit Barton ab70da0728 Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Phabricator: https://reviews.llvm.org/D55851
llvm-svn: 358543
2019-04-17 01:37:00 +00:00
Hiroshi Yamauchi 09e539fcae [PGO] Profile guided code size optimization.
Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets BFI/PSI is discussed in the RFC thread

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html 

Note it doesn't currently touch loop passes.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59514

llvm-svn: 358422
2019-04-15 16:49:00 +00:00
Serguei Katkov f54328372b [NewPM] Add Option handling for SimplifyCFG
This patch enables passing options to SimplifyCFGPass via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60675

llvm-svn: 358379
2019-04-15 08:57:53 +00:00
Fedor Sergeev a2ed448bf2 SafepointIRVerifier port to new Pass Manager
Straightforward port of StatepointIRVerifier pass to new Pass Manager framework.

Fix By: skatkov
Reviewed By: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D59825

This is a re-land of r357147/r357148 with LLVM_ENABLE_MODULES build fixed.
Adding IR/SafepointIRVerifier.h into its own module.

llvm-svn: 357361
2019-03-31 10:15:39 +00:00
Adrian Prantl 119fdeded8 Temporarily revert "SafepointIRVerifier port to new Pass Manager"
to unbreak the modular bots and its follow-up commit.

This reverts commit https://reviews.llvm.org/D59825
because it introduced a

fatal error: cyclic dependency in module 'LLVM_intrinsic_gen': LLVM_intrinsic_gen -> LLVM_IR -> LLVM_intrinsic_gen

llvm-svn: 357201
2019-03-28 18:34:34 +00:00
Serguei Katkov 93432be304 SafepointIRVerifier port to new Pass Manager
Straightforward port of StatepointIRVerifier pass to new Pass Manager framework.

Reviewers: fedor.sergeev, reames
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D59825

llvm-svn: 357147
2019-03-28 06:00:09 +00:00
Robert Lougher f2158a8ef0 Resubmit r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"
Failing LLD tests have been fixed in r356593.

llvm-svn: 356594
2019-03-20 19:08:18 +00:00
Robert Lougher c67a759c99 Revert r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"
Due to buildbot failures (LLD tests).

llvm-svn: 356516
2019-03-19 20:54:20 +00:00
Robert Lougher de548ccab9 [TailCallElim] Add tailcall elimination pass to LTO pipelines
LTO provides additional opportunities for tailcall elimination due to
link-time inlining and visibility of nocapture attribute. Testing showed
negligible impact on compilation times.

Differential Revision: https://reviews.llvm.org/D58391

llvm-svn: 356511
2019-03-19 20:24:28 +00:00
Rong Xu db29a3a438 [PGO] Context sensitive PGO (part 3)
Part 3 of CSPGO changes (mostly related to PassMananger).

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 355330
2019-03-04 20:21:27 +00:00
Manman Ren 1829512dd3 Add a module pass for order file instrumentation
The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name.

In this pass, we add three global variables:
(1) an order file buffer: a circular buffer at its own llvm section.
(2) a bitmap for each module: one byte for each function to say if the function is already executed.
(3) a global index to the order file buffer.

At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index.

Differential Revision:  https://reviews.llvm.org/D57463

llvm-svn: 355133
2019-02-28 20:13:38 +00:00
Rong Xu 6cdf3d8086 Recommit r354930 "[PGO] Context sensitive PGO (part 1)"
Fixed UBSan failures.

llvm-svn: 355005
2019-02-27 17:24:33 +00:00
Vlad Tsyrklevich c01643087e Revert "[PGO] Context sensitive PGO (part 1)"
This reverts commit r354930, it was causing UBSan failures.

llvm-svn: 354953
2019-02-27 03:45:28 +00:00
Rong Xu 35d2d51369 [PGO] Context sensitive PGO (part 1)
Current PGO profile counts are not context sensitive. The branch probabilities
for the inlined functions are kept the same for all call-sites, and they might
be very different from the actual branch probabilities. These suboptimal
profiles can greatly affect some downstream optimizations, in particular for
the machine basic block placement optimization.

In this patch, we propose to have a post-inline PGO instrumentation/use pass,
which we called Context Sensitive PGO (CSPGO). For the users who want the best
possible performance, they can perform a second round of PGO instrument/use on
the top of the regular PGO. They will have two sets of profile counts. The
first pass profile will be manly for inline, indirect-call promotion, and
CGSCC simplification pass optimizations. The second pass profile is for
post-inline optimizations and code-gen optimizations.

A typical usage:
// Regular PGO instrumentation and generate pass1 profile.
> clang -O2 -fprofile-generate source.c -o gen
> ./gen
> llvm-profdata merge default.*profraw -o pass1.profdata
// CSPGO instrumentation.
> clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2
> ./gen2
// Merge two sets of profiles
> llvm-profdata merge default.*profraw pass1.profdata -o profile.profdata
// Use the combined profile. Pass manager will invoke two PGO use passes.
> clang -O2 -fprofile-use=profile.profdata -o use

This change touches many components in the compiler. The reviewed patch
(D54175) will committed in phrases.

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 354930
2019-02-26 22:37:46 +00:00
Eric Christopher 721eaeff3a Fix a small comment typo.
llvm-svn: 354923
2019-02-26 20:33:22 +00:00
Vedant Kumar 47a0c9b69c [HotColdSplit] Schedule splitting late to fix perf regression
With or without PGO data applied, splitting early in the pipeline
(either before the inliner or shortly after it) regresses performance
across SPEC variants. The cause appears to be that splitting hides
context for subsequent optimizations.

Schedule splitting late again, in effect reversing r352080, which
scheduled the splitting pass early for code size benefits (documented in
https://reviews.llvm.org/D57082).

Differential Revision: https://reviews.llvm.org/D58258

llvm-svn: 354158
2019-02-15 18:46:44 +00:00
Leonard Chan 436fb2bd82 [NewPM] Second attempt at porting ASan
This is the second attempt to port ASan to new PM after D52739. This takes the
initialization requried by ASan from the Module by moving it into a separate
class with it's own analysis that the new PM ASan can use.

Changes:
- Split AddressSanitizer into 2 passes: 1 for the instrumentation on the
  function, and 1 for the pass itself which creates an instance of the first
  during it's run. The same is done for AddressSanitizerModule.
- Add new PM AddressSanitizer and AddressSanitizerModule.
- Add legacy and new PM analyses for reading data needed to initialize ASan with.
- Removed DominatorTree dependency from ASan since it was unused.
- Move GlobalsMetadata and ShadowMapping out of anonymous namespace since the
  new PM analysis holds these 2 classes and will need to expose them.

Differential Revision: https://reviews.llvm.org/D56470

llvm-svn: 353985
2019-02-13 22:22:48 +00:00
Teresa Johnson 716abbeb43 [HotColdSplit] Move splitting after instrumented PGO use
Summary:
Follow up to D57082 which moved splitting earlier in the pipeline, in
order to perform it before inlining. However, it was moved too early,
before the IR is annotated with instrumented PGO data. This caused the
splitting to incorrectly determine cold functions.

Move it to just after PGO annotation (still before inlining), in both
pass managers.

Reviewers: vsk, hiraditya, sebpop

Subscribers: mehdi_amini, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57805

llvm-svn: 353270
2019-02-06 04:29:39 +00:00
Teresa Johnson b0bf530fb5 [SamplePGO] More pipeline changes when flattened profile used in ThinLTO postlink
Summary:
Follow on to D54819/r351476.

We also don't need to perform extra InstCombine pass when we aren't
loading the sample profile in the ThinLTO backend because we have a
flattened sample profile.

Additionally, for consistency and clarity, when we aren't reloading the
sample profile, perform ICP in the same location as non-sample PGO
backends. To this end I have moved the ICP invocation for non-SamplePGO
ThinLTO down into buildModuleSimplificationPipeline (partly addresses
the FIXME where we were previously setting this up).

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57705

llvm-svn: 353135
2019-02-05 04:09:19 +00:00
Philip Pfaffe 0ee6a933ce [NewPM][MSan] Add Options Handling
Summary: This patch enables passing options to msan via the passes pipeline, e.e., -passes=msan<recover;kernel;track-origins=4>.

Reviewers: chandlerc, fedor.sergeev, leonardchan

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57640

llvm-svn: 353090
2019-02-04 21:02:49 +00:00
Max Kazantsev f392bc846f Default lowering for experimental.widenable.condition
Introduces a pass that provides default lowering strategy for the
`experimental.widenable.condition` intrinsic, replacing all its uses with
`i1 true`.

Differential Revision: https://reviews.llvm.org/D56096
Reviewed By: reames

llvm-svn: 352739
2019-01-31 09:10:17 +00:00
Vedant Kumar ef1ebed1c6 [HotColdSplit] Move splitting earlier in the pipeline
Performing splitting early has several advantages:

  - Inhibiting inlining of cold code early improves code size. Compared
    to scheduling splitting at the end of the pipeline, this cuts code
    size growth in half within the iOS shared cache (0.69% to 0.34%).

  - Inhibiting inlining of cold code improves compile time. There's no
    need to inline split cold functions, or to inline as much *within*
    those split functions as they are marked `minsize`.

  - During LTO, extra work is only done in the pre-link step. Less code
    must be inlined during cross-module inlining.

An additional motivation here is that the most common cold regions
identified by the static/conservative splitting heuristic can (a) be
found before inlining and (b) do not grow after inlining. E.g.
__assert_fail, os_log_error.

The disadvantages are:

  - Some opportunities for splitting out cold code may be missed. This
    gap can potentially be narrowed by adding a worklist algorithm to the
    splitting pass.

  - Some opportunities to reduce code size may be lost (e.g. store
    sinking, when one side of the CFG diamond is split). This does not
    outweigh the code size benefits of splitting earlier.

On net, splitting early in the pipeline has substantial code size
benefits, and no major effects on memory locality or performance. We
measured memory locality using ktrace data, and consistently found that
10% fewer pages were needed to capture 95% of text page faults in key
iOS benchmarks. We measured performance on frequency-stabilized iOS
devices using LNT+externals.

This reverses course on the decision made to schedule splitting late in
r344869 (D53437).

Differential Revision: https://reviews.llvm.org/D57082

llvm-svn: 352080
2019-01-24 18:55:49 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Wei Mi 3bcccdfe38 [SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlink
If the sample profile has no inlining hierachy information included, we call
the sample profile is flattened. For flattened profile, in ThinLTO postlink
phase, SampleProfileLoader's hot function inlining and profile annotation will
do nothing, so it is better to save the effort to read in the profile and run
the sample profile loader pass. It is helpful for reducing compile time when
the flattened profile is huge.

Differential Revision: https://reviews.llvm.org/D54819

llvm-svn: 351476
2019-01-17 20:48:34 +00:00
Philip Pfaffe 685c76d7a3 [NewPM][TSan] Reiterate the TSan port
Summary:
Second iteration of D56433 which got reverted in rL350719. The problem
in the previous version was that we dropped the thunk calling the tsan init
function. The new version keeps the thunk which should appease dyld, but is not
actually OK wrt. the current semantics of function passes. Hence, add a
helper to insert the functions only on the first time. The helper
allows hooking into the insertion to be able to append them to the
global ctors list.

Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan

Subscribers: hiraditya, bollu, llvm-commits

Differential Revision: https://reviews.llvm.org/D56538

llvm-svn: 351314
2019-01-16 09:28:01 +00:00
Fedor Sergeev b7871405fa [LoopUnroll] add parsing for unroll parameters in -passes pipeline
Allow to specify loop-unrolling with optional parameters explicitly
spelled out in -passes pipeline specification.
Introducing somewhat generic way of specifying parameters parsing via
FUNCTION_PASS_PARAMETRIZED pass registration.

Syntax of parametrized unroll pass name is as follows:
   'unroll<' parameter-list '>'

Where parameter-list is ';'-separate list of parameter names and optlevel
   optlevel: 'O[0-3]'
   parameter: { 'partial' | 'peeling' | 'runtime' | 'upperbound' }
   negated:  'no-' parameter

Example:
   -passes=loop(unroll<O3;runtime;no-upperbound>)

    this invokes LoopUnrollPass configured with OptLevel=3,
    Runtime, no UpperBound, everything else by default.

llvm-svn: 350808
2019-01-10 10:01:53 +00:00
Florian Hahn 9697d2a764 Revert r350647: "[NewPM] Port tsan"
This patch breaks thread sanitizer on some macOS builders, e.g.
http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/

llvm-svn: 350719
2019-01-09 13:32:16 +00:00
Philip Pfaffe 82f995db75 [NewPM] Port tsan
A straightforward port of tsan to the new PM, following the same path
as D55647.

Differential Revision: https://reviews.llvm.org/D56433

llvm-svn: 350647
2019-01-08 19:21:57 +00:00
Teresa Johnson 853b962416 [ThinLTO] Handle chains of aliases
At -O0, globalopt is not run during the compile step, and we can have a
chain of an alias having an immediate aliasee of another alias. The
summaries are constructed assuming aliases in a canonical form
(flattened chains), and as a result only the base object but no
intermediate aliases were preserved.

Fix by adding a pass that canonicalize aliases, which ensures each
alias is a direct alias of the base object.

Reviewers: pcc, davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54507

llvm-svn: 350423
2019-01-04 19:04:54 +00:00
Philip Pfaffe b39a97c8f6 [NewPM] Port Msan
Summary:
Keeping msan a function pass requires replacing the module level initialization:
That means, don't define a ctor function which calls __msan_init, instead just
declare the init function at the first access, and add that to the global ctors
list.

Changes:
- Pull the actual sanitizer and the wrapper pass apart.
- Add a newpm msan pass. The function pass inserts calls to runtime
  library functions, for which it inserts declarations as necessary.
- Update tests.

Caveats:
- There is one test that I dropped, because it specifically tested the
  definition of the ctor.

Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka

Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji

Differential Revision: https://reviews.llvm.org/D55647

llvm-svn: 350305
2019-01-03 13:42:44 +00:00
Michael Kruse 7244852557 [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288

llvm-svn: 348944
2018-12-12 17:32:52 +00:00
Max Kazantsev b9e65cbddf Introduce llvm.experimental.widenable_condition intrinsic
This patch introduces a new instinsic `@llvm.experimental.widenable_condition`
that allows explicit representation for guards. It is an alternative to using
`@llvm.experimental.guard` intrinsic that does not contain implicit control flow.

We keep finding places where `@llvm.experimental.guard` is not supported or
treated too conservatively, and there are 2 reasons to that:

- `@llvm.experimental.guard` has memory write side effect to model implicit control flow,
  and this sometimes confuses passes and analyzes that work with memory;
- Not all passes and analysis are aware of the semantics of guards. These passes treat them
  as regular throwing call and have no idea that the condition of guard may be used to prove
  something. One well-known place which had caused us troubles in the past is explicit loop
  iteration count calculation in SCEV. Another example is new loop unswitching which is not
  aware of guards. Whenever a new pass appears, we potentially have this problem there.

Rather than go and fix all these places (and commit to keep track of them and add support
in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible.
The only significant difference between guards and regular explicit branches is that guard's condition
can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor,
and it is always legal to go there no matter what the guard condition is. The other successor is
a guarded block, and it is only legal to go there if the condition is true.

This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard`
intrinsic. Now a widenable guard can be represented in the CFG explicitly like this:


    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
    %new_condition = and i1 %cond, %widenable_condition
    br i1 %new_condition, label %guarded, label %deopt

  guarded:
    ; Guarded instructions

  deopt:
    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]

The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an
`undef`, but the intrinsic prevents the optimizer from folding it early. This form
should exploit all optimization boons provided to `br` instuction, and it still can be
widened by replacing the result of `@llvm.experimental.widenable.condition()`
with `and` with any arbitrary boolean value (as long as the branch that is taken when
it is `false` has a deopt and has no side-effects).

For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using
implicit control flow in guards".

This patch introduces this new intrinsic with respective LangRef changes and a pass
that converts old-style guards (expressed as intrinsics) into the new form.

The naming discussion is still ungoing. Merging this to unblock further items. We can
later change the name of this intrinsic.

Reviewed By: reames, fedor.sergeev, sanjoy
Differential Revision: https://reviews.llvm.org/D51207

llvm-svn: 348593
2018-12-07 14:39:46 +00:00
Markus Lavin 4dc4ebd606 [PM] Port LoadStoreVectorizer to the new pass manager.
Differential Revision: https://reviews.llvm.org/D54848

llvm-svn: 348570
2018-12-07 08:23:37 +00:00
Vitaly Buka 4493fe1c1b [stack-safety] Empty local passes for Stack Safety Local Analysis
Reviewers: eugenis, vlad.tsyrklevich

Subscribers: mgorny, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54502

llvm-svn: 347602
2018-11-26 21:57:47 +00:00
Mikael Holmen b6f76002d9 [PM] Port Scalarizer to the new pass manager.
Patch by: markus (Markus Lavin)

Reviewers: chandlerc, fedor.sergeev

Reviewed By: fedor.sergeev

Subscribers: llvm-commits, Ka-Ka, bjope

Differential Revision: https://reviews.llvm.org/D54695

llvm-svn: 347392
2018-11-21 14:00:17 +00:00
Xin Tong 642c8d3575 [LTO] Load sample profile in LTO link step.
Summary:
Load sample profile in LTO link step.
ThinLTO calls populateModulePassManager to load the profile

Reviewers: tejohnson, davidxl, danielcdh

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54564

llvm-svn: 346971
2018-11-15 18:06:42 +00:00
Philip Pfaffe 2d4effb25c Add an OptimizerLast EP
Summary:
It turns out that we need an OptimizerLast PassBuilder extension point
after all. I missed the relevance of this EP the first time. By legacy PM magic,
function passes added at this EP get added to the last _Function_ PM, which is a
feature we lost when dropping this EP for the new PM.

A key difference between this and the legacy PassManager's OptimizerLast
callback is that this extension point is not triggered at O0. Extensions
to the O0 pipeline should append their passes to the end of the overall
pipeline.

Differential Revision: https://reviews.llvm.org/D54374

llvm-svn: 346645
2018-11-12 11:17:07 +00:00
Fedor Sergeev 412ed34744 [LoopUnroll] allow customization for new-pass-manager version of LoopUnroll
Unlike its legacy counterpart new pass manager's LoopUnrollPass does
not provide any means to select which flavors of unroll to run
(runtime, peeling, partial), relying on global defaults.

In some cases having ability to run a restricted LoopUnroll that
does more than LoopFullUnroll is needed.

Introduced LoopUnrollOptions to select optional unroll behaviors.
Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.

Reviewers: chandlerc, tejohnson
Differential Revision: https://reviews.llvm.org/D53440

llvm-svn: 345723
2018-10-31 14:33:14 +00:00
Leonard Chan eebecb3214 Revert "[PassManager/Sanitizer] Enable usage of ported AddressSanitizer passes with -fsanitize=address"
This reverts commit 8d6af840396f2da2e4ed6aab669214ae25443204 and commit
b78d19c287b6e4a9abc9fb0545de9a3106d38d3d which causes slower build times
by initializing the AddressSanitizer on every function run.

The corresponding revisions are https://reviews.llvm.org/D52814 and
https://reviews.llvm.org/D52739.

llvm-svn: 345433
2018-10-26 22:51:51 +00:00
Teresa Johnson d725335bd1 [hot-cold-split] Only perform splitting in ThinLTO backend post-link
Summary:
Fix the new PM to only perform hot cold splitting once during ThinLTO,
by skipping it in the pre-link phase.

This was already fixed in the old PM by the move of the hot cold split
pass later (after the early return when PrepareForThinLTO) by r344869.

Reviewers: vsk, sebpop, hiraditya

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D53611

llvm-svn: 345096
2018-10-23 22:57:40 +00:00
Aditya Kumar d9e2e383a9 Schedule Hot Cold Splitting pass after most optimization passes
Summary:
In the new+old pass manager, hot cold splitting was schedule too early.
Thanks to Vedant for pointing this out.

Reviewers: sebpop, vsk

Reviewed By: sebpop, vsk

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D53437

llvm-svn: 344869
2018-10-21 18:11:56 +00:00
Fedor Sergeev bd6b2138b9 [NewPM] teach -passes= to emit meaningful error messages
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.

TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.

Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246

llvm-svn: 344685
2018-10-17 10:36:23 +00:00
Fedor Sergeev a01be0f217 Revert "[NewPM] teach -passes= to emit meaningful error messages"
This reverts r344519 due to failures in pipeline-parsing test.

llvm-svn: 344524
2018-10-15 15:36:08 +00:00
Fedor Sergeev 4155a77e98 [NewPM] teach -passes= to emit meaningful error messages
Summary:
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.

TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.

Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246

llvm-svn: 344519
2018-10-15 15:00:18 +00:00
Leonard Chan 64e21b5cfd [PassManager/Sanitizer] Port of AddresSanitizer pass from legacy to new PassManager
This patch ports the legacy pass manager to the new one to take advantage of
the benefits of the new PM. This involved moving a lot of the declarations for
`AddressSantizer` to a header so that it can be publicly used via
PassRegistry.def which I believe contains all the passes managed by the new PM.

This patch essentially decouples the instrumentation from the legacy PM such
hat it can be used by both legacy and new PM infrastructure.

Differential Revision: https://reviews.llvm.org/D52739

llvm-svn: 344274
2018-10-11 18:31:51 +00:00
Richard Smith 6c67662816 Add a flag to remap manglings when reading profile data information.
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.

The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.

Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.

Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.

This is the LLVM side of Clang r344199.

Reviewers: davidxl, tejohnson, dlj, erik.pilkington

Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51249

llvm-svn: 344200
2018-10-10 23:13:47 +00:00
Aditya Kumar 9e20ade72a Add support for new pass manager
Modified the testcases to use both pass managers
Use single commandline flag for both pass managers.

Differential Revision: https://reviews.llvm.org/D52708
Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: tejohnson, brzycki

llvm-svn: 343662
2018-10-03 05:55:20 +00:00
Eric Christopher dcf1d97c5c Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522
2018-10-01 18:57:08 +00:00
Alexandros Lamprineas 8a1c374b2e [GVNHoist] Re-enable GVNHoist by default
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.

Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
  compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.

llvm-svn: 342387
2018-09-17 12:24:55 +00:00
Alexandros Lamprineas fe0512d575 Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts rL341954.

The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:

[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill

llvm-svn: 342001
2018-09-11 22:10:57 +00:00
Alexandros Lamprineas db18e972d7 [GVNHoist] Re-enable GVNHoist by default
Rebase rL340922 since https://bugs.llvm.org/show_bug.cgi?id=38807
has been fixed by rL341947.

llvm-svn: 341954
2018-09-11 15:55:45 +00:00
Hiroshi Yamauchi 9775a620b0 [PGO] Control Height Reduction
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.

if (hot_cond1) { // Likely true.
  do_stg_hot1();
}
if (hot_cond2) { // Likely true.
  do_stg_hot2();
}

->

if (hot_cond1 && hot_cond2) { // Hot path.
  do_stg_hot1();
  do_stg_hot2();
} else { // Cold path.
  if (hot_cond1) {
    do_stg_hot1();
  }
  if (hot_cond2) {
    do_stg_hot2();
  }
}

This speeds up some internal benchmarks up to ~30%.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D50591

llvm-svn: 341386
2018-09-04 17:19:13 +00:00
Alexandros Lamprineas f6db5bcd38 Revert r340922 "[GVNHoist] Re-enable GVNHoist by default"
Another sanitizer buildbot failed this time at bootstrap when
compiling SemaTemplateInstantiate.cpp with this assertion:
`dominates(MD, U) && "Memory Def does not dominate it's uses"'.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/15047

llvm-svn: 340925
2018-08-29 13:00:55 +00:00
Alexandros Lamprineas c03b9b8854 [GVNHoist] Re-enable GVNHoist by default
Rebase rL338240 since the excessive memory usage observed when using
GVNHoist with UBSan has been fixed by rL340818.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 340922
2018-08-29 11:58:34 +00:00
Vlad Tsyrklevich 1c7160e85f Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r338240 because it was causing OOMs on the UBSan
buildbot when building clang/lib/Sema/SemaChecking.cpp

llvm-svn: 338297
2018-07-30 20:07:33 +00:00
Alexandros Lamprineas de3ca964c1 [GVNHoist] Re-enable GVNHoist by default
My initial motivation for this came from https://reviews.llvm.org/D48122,
where it was pointed out that my change didn't fit well in SimplifyCFG and
therefore using GVNHoist was a better way to go. GVNHoist has been disabled
for a while as there was a list of bugs related to it.

I have fixed the following bugs:

https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149)
https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674)
https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680)

The next two bugs no longer occur, and it's unclear which commit fixed them:

https://bugs.llvm.org/show_bug.cgi?id=36635
https://bugs.llvm.org/show_bug.cgi?id=37791

I investigated this one and proved to be unrelated to GVNHoist, but a genuine bug in NewGvn:

https://bugs.llvm.org/show_bug.cgi?id=37660

To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM.
Merging this change now in order to make it to the LLVM 7.0.0 branch.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 338240
2018-07-30 10:50:18 +00:00
Teresa Johnson 28023dbed7 [ThinLTO] Enable ThinLTO WholeProgramDevirt and LowerTypeTests in new PM
Summary:
Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass
manager. Add a couple of tests for both PMs based on the clang tests
tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test
through llvm-lto2 and not with distributed ThinLTO.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49429

llvm-svn: 337461
2018-07-19 14:51:32 +00:00
Michael J. Spencer 7bb2767fba Recommit r335794 "Add support for generating a call graph profile from Branch Frequency Info." with fix for removed functions.
llvm-svn: 337140
2018-07-16 00:28:24 +00:00
David Green 963401d2be [UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

  for i..
    ForeBlocks(i)
    for j..
      SubLoopBlocks(i, j)
    AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

  for i... i+=2
    ForeBlocks(i)
    ForeBlocks(i+1)
    for j..
      SubLoopBlocks(i, j)
      SubLoopBlocks(i+1, j)
    AftBlocks(i)
    AftBlocks(i+1)
  Remainder Loop

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 336062
2018-07-01 12:47:30 +00:00
Chandler Carruth 7c557f804d [instsimplify] Move the instsimplify pass to use more obvious file names
and diretory.

Also cleans up all the associated naming to be consistent and removes
the public access to the pass ID which was unused in LLVM.

Also runs clang-format over parts that changed, which generally cleans
up a bunch of formatting.

This is in preparation for doing some internal cleanups to the pass.

Differential Revision: https://reviews.llvm.org/D47352

llvm-svn: 336028
2018-06-29 23:36:03 +00:00
John Brawn bdbbd8381f Add a PhiValuesAnalysis pass to calculate the underlying values of phis
This pass is being added in order to make the information available to BasicAA,
which can't do caching of this information itself, but possibly this information
may be useful for other passes.

Incorporates code based on Daniel Berlin's implementation of Tarjan's algorithm.

Differential Revision: https://reviews.llvm.org/D47893

llvm-svn: 335857
2018-06-28 14:13:06 +00:00
Benjamin Kramer 269eb21e1c Revert "Add support for generating a call graph profile from Branch Frequency Info."
This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost.

llvm-svn: 335851
2018-06-28 13:15:03 +00:00
Michael J. Spencer 5bf1ead377 Add support for generating a call graph profile from Branch Frequency Info.
=== Generating the CG Profile ===

The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions.  For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.

After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}

!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```

Differential Revision: https://reviews.llvm.org/D48105

llvm-svn: 335794
2018-06-27 23:58:08 +00:00
Chandler Carruth 71fd27043e [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule
loop-cleanup passes at the beginning of the loop pass pipeline, and
re-enqueue loops after even trivial unswitching.

This will allow us to much more consistently avoid simplifying code
while doing trivial unswitching. I've also added a test case that
specifically shows effective iteration using this technique.

I've unconditionally updated the new PM as that is always using the
SimpleLoopUnswitch pass, and I've made the pipeline changes for the old
PM conditional on using this new unswitch pass. I added a bunch of
comments to the loop pass pipeline in the old PM to make it more clear
what is going on when reviewing.

Hopefully this will unblock doing *partial* unswitching instead of just
full unswitching.

Differential Revision: https://reviews.llvm.org/D47408

llvm-svn: 333493
2018-05-30 02:46:45 +00:00
David Green aee7ad0cde Revert 333358 as it's failing on some builders.
I'm guessing the tests reply on the ARM backend being built.

llvm-svn: 333359
2018-05-27 12:54:33 +00:00
David Green 3034281b43 [UnrollAndJam] Add a new Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
  ForeBlocks(i)
  for j..
    SubLoopBlocks(i, j)
  AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

for i... i+=2
  ForeBlocks(i)
  ForeBlocks(i+1)
  for j..
    SubLoopBlocks(i, j)
    SubLoopBlocks(i+1, j)
  AftBlocks(i)
  AftBlocks(i+1)
Remainder

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now-jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 333358
2018-05-27 12:11:21 +00:00
Chandler Carruth e6c30fdda7 Restore the LoopInstSimplify pass, reverting r327329 that removed it.
The plan had always been to move towards using this rather than so much
in-pass simplification within the loop pipeline, but we never got around
to it.... until only a couple months after it was removed due to disuse.
=/

This commit is just a pure revert of the removal. I will add tests and
do some basic cleanup in follow-up commits. Then I'll wire it into the
loop pass pipeline.

Differential Revision: https://reviews.llvm.org/D47353

llvm-svn: 333250
2018-05-25 01:32:36 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
David Blaikie 4fe1fe1418 Fix Layering, move instrumentation transform headers into Instrumentation subdirectory
llvm-svn: 328379
2018-03-23 22:11:06 +00:00
David Blaikie 301627f875 Move SampleProfile.h into IPO along with the rest of the IPO pass headers
llvm-svn: 328262
2018-03-22 22:42:44 +00:00
Fedor Sergeev 194a407bda [New PM][IRCE] port of Inductive Range Check Elimination pass to the new pass manager
There are two nontrivial details here:
* Loop structure update interface is quite different with new pass manager,
  so the code to add new loops was factored out

* BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from
  within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization
  sequence (e.g. LoopSimplify) might invalidate BPI results.

  Complete solution for BPI will likely take some time to discuss and figure out,
  so for now this was partially solved by making BPI optional in IRCE
  (skipping a couple of profitability checks if it is absent).

Most of the IRCE tests got their corresponding new-pass-manager variant enabled.
Only two of them depend on BPI, both marked with TODO, to be turned on when BPI
starts being available for loop passes.

Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D43795

llvm-svn: 327619
2018-03-15 11:01:19 +00:00
Vedant Kumar 3a408538f0 Remove the LoopInstSimplify pass (-loop-instsimplify)
LoopInstSimplify is unused and untested. Reading through the commit
history the pass also seems to have a high maintenance burden.

It would be best to retire the pass for now. It should be easy to
recover if we need something similar in the future.

Differential Revision: https://reviews.llvm.org/D44053

llvm-svn: 327329
2018-03-12 20:49:42 +00:00
Amjad Aboud f1f57a3137 Another try to commit 323321 (aggressive instruction combine).
llvm-svn: 323416
2018-01-25 12:06:32 +00:00
Amjad Aboud d53504e379 Reverted 323321.
llvm-svn: 323326
2018-01-24 14:48:49 +00:00
Amjad Aboud e4453233d7 [InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine).
Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.

For example, this pass reduce width of expressions post-dominated by TruncInst
into smaller width when applicable.

It differs from instcombine pass in that it contains pattern optimization that
requires higher complexity than the O(1), thus, it should run fewer times than
instcombine pass.

Differential Revision: https://reviews.llvm.org/D38313

llvm-svn: 323321
2018-01-24 12:42:42 +00:00
Malcolm Parsons 21e545d08d Fix typos of occurred and occurrence
llvm-svn: 323318
2018-01-24 10:33:39 +00:00
David Blaikie 0c64f5a2eb NewPM: Add an extension point for the start of the pipeline.
This applies to most pipelines except the LTO and ThinLTO backend
actions - it is for use at the beginning of the overall pipeline.

This extension point will be used to add the GCOV pass when enabled in
Clang.

llvm-svn: 323166
2018-01-23 01:25:20 +00:00
Easwaran Raman bdf20261d8 Add a pass to generate synthetic function entry counts.
Summary:
This pass synthesizes function entry counts by traversing the callgraph
and using the relative block frequencies of the callsites. The intended
use of these counts is in inlining to determine hot/cold callsites in
the absence of profile information.

The pass is split into two files with the code that propagates the
counts in a callgraph in a Utils file. I plan to add support for
propagation in the thinlto link phase and the propagation code will be
shared and hence this split. I did not add support to the old PM since
hot callsite determination in inlining is not possible in old PM
(although we could use hot callee heuristic with synthetic counts in the
old PM it is not worth the effort tuning it)

Reviewers: davidxl, silvas

Subscribers: mgorny, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D41604

llvm-svn: 322110
2018-01-09 19:39:35 +00:00
Fedor Sergeev 02e7f0247b [PM] pass -debug-pass-manager flag into FunctionToLoopPassAdaptor's canonicalization PM
Summary:
New pass manager driver passes DebugPM (-debug-pass-manager) flag into
individual PassManager constructors in order to enable debug logging.
FunctionToLoopPassAdaptor has its own internal LoopCanonicalizationPM
which never gets its debug logging enabled and that means canonicalization
passes like LoopSimplify are never present in -debug-pass-manager output.

Extending FunctionToLoopPassAdaptor's constructor and
createFunctionToLoopPassAdaptor wrapper with an optional
boolean DebugLogging argument.

Passing debug-logging flags there as appropriate.

Reviewers: chandlerc, davide

Reviewed By: davide

Subscribers: mehdi_amini, eraman, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D41586

llvm-svn: 321548
2017-12-29 08:16:06 +00:00
Fedor Sergeev 4b86d79048 [PM] port Rewrite Statepoints For GC to the new pass manager.
Summary:
The port is nearly straightforward.
The only complication is related to the analyses handling,
since one of the analyses used in this module pass is domtree,
which is a function analysis. That requires asking for the results
of each function and disallows a single interface for run-on-module
pass action.

Decided to copy-paste the main body of this pass.
Most of its code is requesting analyses anyway, so not that much
of a copy-paste.

The rest of the code movement is to transform all the implementation
helper functions like stripNonValidData into non-member statics.

Extended all the related LLVM tests with new-pass-manager use.
No failures.

Reviewers: sanjoy, anna, reames

Reviewed By: anna

Subscribers: skatkov, llvm-commits

Differential Revision: https://reviews.llvm.org/D41162

llvm-svn: 320796
2017-12-15 09:32:11 +00:00
Sanjay Patel 0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Michael Zolotukhin d8920b1c44 Remove redundant includes from various places.
llvm-svn: 320629
2017-12-13 21:31:03 +00:00
Chandler Carruth c34f789e38 Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.
The core idea is to (re-)introduce some redundancies where their cost is
hidden by the cost of materializing immediates for constant operands of
PHI nodes. When the cost of the redundancies is covered by this,
avoiding materializing the immediate has numerous benefits:
1) Less register pressure
2) Potential for further folding / combining
3) Potential for more efficient instructions due to immediate operand

As a motivating example, consider the remarkably different cost on x86
of a SHL instruction with an immediate operand versus a register
operand.

This pattern turns up surprisingly frequently, but is somewhat rarely
obvious as a significant performance problem.

The pass is entirely target independent, but it does rely on the target
cost model in TTI to decide when to speculate things around the PHI
node. I've included x86-focused tests, but any target that sets up its
immediate cost model should benefit from this pass.

There is probably more that can be done in this space, but the pass
as-is is enough to get some important performance on our internal
benchmarks, and should be generally performance neutral, but help with
more extensive benchmarking is always welcome.

One awkward part is that this pass has to be scheduled after
*everything* that can eliminate these kinds of redundancies. This
includes SimplifyCFG, GVN, etc. I'm open to suggestions about better
places to put this. We could in theory make it part of the codegen pass
pipeline, but there doesn't really seem to be a good reason for that --
it isn't "lowering" in any sense and only relies on pretty standard cost
model based TTI queries, so it seems to fit well with the "optimization"
pipeline model. Still, further thoughts on the pipeline position are
welcome.

I've also only implemented this in the new pass manager. If folks are
very interested, I can try to add it to the old PM as well, but I didn't
really see much point (my use case is already switched over to the new
PM).

I've tested this pretty heavily without issue. A wide range of
benchmarks internally show no change outside the noise, and I don't see
any significant changes in SPEC either. However, the size class
computation in tcmalloc is substantially improved by this, which turns
into a 2% to 4% win on the hottest path through tcmalloc for us, so
there are definitely important cases where this is going to make
a substantial difference.

Differential revision: https://reviews.llvm.org/D37467

llvm-svn: 319164
2017-11-28 11:32:31 +00:00
Sanjay Patel 3e29890a7f [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
This is a recommit of r316869 which was speculatively reverted with r317444 and 
subsequently shown to not be the cause of PR35210. That crash should be fixed
after r318237.

Original commit message:

The old PM sets the options of what used to be known as "latesimplifycfg" on the
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 318299
2017-11-15 16:33:11 +00:00
Hans Wennborg e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Chandler Carruth 00a301d568 [PM] Port BoundsChecking to the new PM.
Registers it and everything, updates all the references, etc.

Next patch will add support to Clang's `-fexperimental-new-pass-manager`
path to actually enable BoundsChecking correctly.

Differential Revision: https://reviews.llvm.org/D39084

llvm-svn: 318128
2017-11-14 01:30:04 +00:00
David L. Jones 82b22e0327 [PassManager, SimplifyCFG] Revert r316908 and r316869.
These cause Clang to crash with a segfault. See PR35210 for details.

llvm-svn: 317444
2017-11-06 00:32:01 +00:00
Jun Bum Lim 0c99007db1 Recommit r317351 : Add CallSiteSplitting pass
This recommit r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change add a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
Jun Bum Lim 0eb1c2d63a Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00
Jun Bum Lim 2a58933519 Add CallSiteSplitting pass
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :

1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.

Split from :
      if (!ptr || c)
        callee(ptr);
to :
      if (!ptr)
        callee(null ptr)  // set the known constant value
      else if (c)
        callee(nonnull ptr)  // set non-null attribute in the argument

2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2, label %BB1
      BB1:
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
       call void @bar(i32 %p)
to
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2-split0, label %BB1
      BB1:
       br label %BB2-split1
      BB2-split0:
       call void @bar(i32 0)
       br label %BB2
      BB2-split1:
       call void @bar(i32 1)
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide

Reviewed By: davidxl

Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39137

llvm-svn: 317351
2017-11-03 19:01:57 +00:00
Sanjay Patel adf38911d8 [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
The old PM sets the options of what used to be known as "latesimplifycfg" on the 
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not 
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 316869
2017-10-29 20:49:31 +00:00
Matthew Simpson cb58558c2f Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.

Differential Revision: https://reviews.llvm.org/D37355

llvm-svn: 316576
2017-10-25 13:40:08 +00:00
Rong Xu e1f4245f8d [PM] Add pgo-memop-opt pass to the new pass manager
This pass adds pgo-memop-opt pass to the new pass manager.
It is in the old pass manager but somehow left out in the new pass manager.

Differential Revision: http://reviews.llvm.org/D39145

llvm-svn: 316384
2017-10-23 22:21:29 +00:00
Adam Nemet 0965da2055 Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249
2017-10-09 23:19:02 +00:00
Davide Italiano e070721308 [NewPassManager] Run global dead code elimination after the inliner.
This is the same exact change we did for the current pass manager
in rL314997, but the new pass manager pipeline already happened
to run GlobalOpt after the inliner, so we just insert a run of
GDCE here.

llvm-svn: 315003
2017-10-05 18:36:01 +00:00
Dehao Chen d26dae0d34 Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases.
Summary: In SamplePGO ThinLTO compile phase, we will not invoke ICP as it may introduce confusion to the 2nd annotation. This patch extracted that logic and makes it clearer before profile annotation. In the mean time, we need to make function importing process both inlined callsites as well as not promoted indirect callsites.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D38094

llvm-svn: 314619
2017-10-01 05:24:51 +00:00
Sanjay Patel 6fd4391ddd [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 

llvm-svn: 312862
2017-09-09 13:38:18 +00:00
Chandler Carruth 19913b22c0 [PM] Switch the CGSCC debug messages to use the standard LLVM debug
printing techniques with a DEBUG_TYPE controlling them.

It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).

Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.

llvm-svn: 310695
2017-08-11 05:47:13 +00:00
Dehao Chen 2f4e2e2758 Revert part of r310296 to make it really NFC for instrumentation PGO.
Summary: Part of r310296 will disable PGOIndirectCallPromotion in ThinLTO backend if PGOOpt is None. However, as PGOOpt is not passed down to ThinLTO backend for instrumentation based PGO, that change would actually disable ICP entirely in ThinLTO backend, making it behave differently in instrumentation PGO mode. This change reverts that change, and only disable ICP there when it is SamplePGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36566

llvm-svn: 310550
2017-08-10 05:10:32 +00:00
Dehao Chen 08f8831e57 Move the SampleProfileLoader right after EarlyFPM.
Summary: SampleProfileLoader pass do need to happen after some early cleanup passes so that inlining can happen correctly inside the SampleProfileLoader pass.

Reviewers: chandlerc, davidxl, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36333

llvm-svn: 310296
2017-08-07 20:23:20 +00:00
Teresa Johnson ecd901314d [PM] Split LoopUnrollPass and make partial unroller a function pass
Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.

*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.

Reviewers: chandlerc

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36157

llvm-svn: 309886
2017-08-02 20:35:29 +00:00
Dehao Chen 89d3226019 Update the new PM pipeline to make ICP aware if it is SamplePGO build.
Summary: In ThinLTO backend compile, OPTOptions are not set so that the ICP in ThinLTO backend does not know if it is a SamplePGO build, in which profile count needs to be annotated directly on call instructions. This patch cleaned up the PGOOptions handling logic and passes down PGOOptions to ThinLTO backend.

Reviewers: chandlerc, tejohnson, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36052

llvm-svn: 309780
2017-08-02 01:28:31 +00:00
Dehao Chen 95f003003d Refactor the build{Module|Function}SimplificationPipeline to expose optimization phase.
Summary: This is in preparation of https://reviews.llvm.org/D36052

Reviewers: chandlerc, davidxl, tejohnson

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D36053

llvm-svn: 309500
2017-07-30 04:55:39 +00:00
Dehao Chen ce0842ce9c Refine the PGOOpt and SamplePGOSupport handling.
Summary:
Now that SamplePGOSupport is part of PGOOpt, there are several places that need tweaking:
1. AddDiscriminator pass should *not* be invoked at ThinLTOBackend (as it's already invoked in the PreLink phase)
2. addPGOInstrPasses should only be invoked when either ProfileGenFile or ProfileUseFile is non-empty.
3. SampleProfileLoaderPass should only be invoked when SampleProfileFile is non-empty.
4. PGOIndirectCallPromotion should only be invoked in ProfileUse phase, or in ThinLTOBackend of SamplePGO.

Reviewers: chandlerc, tejohnson, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36040

llvm-svn: 309478
2017-07-29 04:10:24 +00:00
Dehao Chen 641f387cd0 Update the assertion to meet with the changes in r309121. (NFC)
llvm-svn: 309125
2017-07-26 15:47:00 +00:00
Dehao Chen e90d0153ca Make new PM honor -fdebug-info-for-profiling
Summary: The new PM needs to invoke add-discriminator pass when building with -fdebug-info-for-profiling.

Reviewers: chandlerc, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D35744

llvm-svn: 309121
2017-07-26 15:01:20 +00:00
Philip Pfaffe 730f2f9bb6 [PM] Enable registration of out-of-tree passes with PassBuilder
Summary:
This patch adds a callback registration API to the PassBuilder,
enabling registering out-of-tree passes with it.

Through the Callback API, callers may register callbacks with the
various stages at which passes are added into pass managers, including
parsing of a pass pipeline as well as at extension points within the
default -O pipelines.

Registering utilities like `require<>` and `invalidate<>` needs to be
handled manually by the caller, but a helper is provided.

Additionally, adding passes at pipeline extension points is exposed
through the opt tool. This patch adds a `-passes-ep-X` commandline
option for every extension point X, which opt parses into pipelines
inserted into that extension point.

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: lksbhm, grosser, davide, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33464

llvm-svn: 307532
2017-07-10 10:57:55 +00:00
Dehao Chen 3a9861420c Add sample PGO support to ThinLTO new pass manager.
Summary:
For SamplePGO + ThinLTO, because profile annotation is done twice at both PrepareForThinLTO pipeline and backend compiler, the following changes are needed at the PrepareForThinLTO phase to ensure the IR is not changed dramatically. Otherwise the profile annotation will be inaccurate in the backend compiler.

* disable hot-caller heuristic
* disable loop unrolling
* disable indirect call promotion

This will unblock the new PM testing for sample PGO (tools/clang/test/CodeGen/pgo-sample-thinlto-summary.c), which will be covered in another cfe patch.

Reviewers: chandlerc, tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: sanjoy, mehdi_amini, Prazek, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D34895

llvm-svn: 307437
2017-07-07 20:53:10 +00:00
Dehao Chen 2f31d0d86e Hook the sample PGO machinery in the new PM
Summary: This patch hooks up SampleProfileLoaderPass with the new PM.

Reviewers: chandlerc, davidxl, davide, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: tejohnson, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34720

llvm-svn: 306763
2017-06-29 23:33:05 +00:00
Tim Shen 664706916b [ThinkLTO] Invoke build(Thin)?LTOPreLinkDefaultPipeline.
Previously it doesn't actually invoke the designated new PM builder
functions.

This patch moves NameAnonGlobalPass out from PassBuilder, as Chandler
points out that PassBuilder is used for non-O0 builds, and for
optimizations only.

Differential Revision: https://reviews.llvm.org/D34728

llvm-svn: 306756
2017-06-29 23:08:38 +00:00
Easwaran Raman 8249fac52d Create inliner params based on size and opt levels.
Differential revision: https://reviews.llvm.org/D34309

llvm-svn: 306542
2017-06-28 13:33:49 +00:00
Geoff Berry 2573a19fe6 [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE.
llvm-svn: 306477
2017-06-27 22:25:02 +00:00
Xinliang David Li b67530e9b9 [PGO] Implementate profile counter regiser promotion
Differential Revision: http://reviews.llvm.org/D34085

llvm-svn: 306231
2017-06-25 00:26:43 +00:00
Eric Christopher 5a7c2f1700 Remove the LoadCombine pass. It was never enabled and is unsupported.
Based on discussions with the author on mailing lists.

llvm-svn: 306067
2017-06-22 22:58:12 +00:00
Geoff Berry 3cca1da20c [EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default).
Summary:
Use MemorySSA for memory dependency checking in the EarlyCSE pass at the
start of the function simplification portion of the pipeline.  We rely
on the fact that GVNHoist runs just after this pass of EarlyCSE to
amortize the MemorySSA construction cost since GVNHoist uses MemorySSA
and EarlyCSE preserves it.

This is turned off by default.  A follow-up change will turn it on to
allow for easier reversion in case it breaks something.

llvm-svn: 305146
2017-06-10 15:20:03 +00:00
Davide Italiano be1b6a963e [PM] Add GVNSink to the pipeline.
With this, the two pipelines should be in sync again (modulo
LoopUnswitch, but Chandler is actively working on that).

Differential Revision:  https://reviews.llvm.org/D33810

llvm-svn: 304671
2017-06-03 23:18:29 +00:00
Davide Italiano c368831580 Move GVNHoist to the right position in the new pass manager pipeline.
GVNHoist was moved as part of simplification passes for the current
pass manager (but not for the new), so they're out-of-sync.

Differential Revision:  https://reviews.llvm.org/D33806

llvm-svn: 304490
2017-06-01 23:08:14 +00:00
Chandler Carruth 8b3be4e59d [PM/ThinLTO] Port the ThinLTO pipeline (both components) to the new PM.
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.

This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.

I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:

1) I sunk the PGO indirect call promotion to only be run when we have
   PGO enabled (or as part of the special ThinLTO pipeline).

2) I made the extra GlobalOpt run in ThinLTO just happen all the time
   and at a slightly more powerful place (before we remove available
   externaly functions). This seems like general goodness and not a big
   compile time sink, so it didn't make sense to *only* use it in
   ThinLTO. Fewer differences in the pipeline makes everything simpler
   IMO.

3) I hoisted the ThinLTO stop point pre-link above the the RPO function
   attr inference. The RPO inference won't infer anything terribly
   meaningful pre-link (recursiveness?) so it didn't make a lot of
   sense. But if the placement of RPO inference starts to matter, we
   should move it to the canonicalization phase anyways which seems like
   a better place for it (and there is a FIXME to this effect!). But
   that seemed a bridge too far for this patch.

If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.

I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.

Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.

Differential Revision: https://reviews.llvm.org/D33540

llvm-svn: 304407
2017-06-01 11:39:39 +00:00
Chandler Carruth 86248d5632 [PM] Enable the new simple loop unswitch pass in the new pass manager
(where it is the only realistic option).

This passes the LLVM test suite for me, but I'm clearly still hammering
on this.

llvm-svn: 303952
2017-05-26 01:24:11 +00:00
Chandler Carruth f4d62c480c [PM] Teach the PGO instrumentation pasess to run GlobalDCE before
instrumenting code.

This is important in the new pass manager. The old pass manager's
inliner has a small DCE routine embedded within it. The new pass manager
relies on the actual GlobalDCE pass for this.

Without this patch, instrumentation profiling with the new PM results in
massive code bloat in the object files because the instrumentation
itself ends up preventing DCE from working to remove the code.

We should probably change the instrumentation (and/or DCE) so that we
can eliminate dead code even if instrumented, but we shouldn't even
spend the time generating instrumentation for that code so this still
seems like a good patch.

Differential Revision: https://reviews.llvm.org/D33535

llvm-svn: 303845
2017-05-25 07:15:09 +00:00
Davide Italiano 8e7d11ab2b [NewPM] Fix an innocent but silly typo. Reported by Craig Topper.
llvm-svn: 303587
2017-05-22 23:47:11 +00:00
Davide Italiano 8a09b8eba9 [NewPM] Add a temporary cl::opt() to test NewGVN.
llvm-svn: 303586
2017-05-22 23:41:40 +00:00
Xinliang David Li 126157c3b4 [PartialInlining] Add internal options to enable partial inlining in pass pipeline (off by default)
1. Legacy: -mllvm -enable-partial-inlining
2. New:  -mllvm -enable-npm-partial-inlining -fexperimental-new-pass-manager

Differential Revision: http://reviews.llvm.org/D33382

llvm-svn: 303567
2017-05-22 16:41:57 +00:00
Easwaran Raman 5e6f9bd4f8 [PM] Add ProfileSummaryAnalysis as a required pass in the new pipeline.
Differential revision: https://reviews.llvm.org/D32768

llvm-svn: 302170
2017-05-04 16:58:45 +00:00
Chandler Carruth 1353f9a48b [PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass.
Currently, this pass only focuses on *trivial* loop unswitching. At that
reduced problem it remains significantly better than the current loop
unswitch:
- Old pass is worse than cubic complexity. New pass is (I think) linear.
- New pass is much simpler in its design by focusing on full unswitching. (See
  below for details on this).
- New pass doesn't carry state for thresholds between pass iterations.
- New pass doesn't carry state for correctness (both miscompile and
  infloop) between pass iterations.
- New pass produces substantially better code after unswitching.
- New pass can handle more trivial unswitch cases.
- New pass doesn't recompute the dominator tree for the entire function
  and instead incrementally updates it.

I've ported all of the trivial unswitching test cases from the old pass
to the new one to make sure that major functionality isn't lost in the
process. For several of the test cases I've worked to improve the
precision and rigor of the CHECKs, but for many I've just updated them
to handle the new IR produced.

My initial motivation was the fact that the old pass carried state in
very unreliable ways between pass iterations, and these mechansims were
incompatible with the new pass manager. However, I discovered many more
improvements to make along the way.

This pass makes two very significant assumptions that enable most of these
improvements:

1) Focus on *full* unswitching -- that is, completely removing whatever
   control flow construct is being unswitched from the loop. In the case
   of trivial unswitching, this means removing the trivial (exiting)
   edge. In non-trivial unswitching, this means removing the branch or
   switch itself. This is in opposition to *partial* unswitching where
   some part of the unswitched control flow remains in the loop. Partial
   unswitching only really applies to switches and to folded branches.
   These are very similar to full unrolling and partial unrolling. The
   full form is an effective canonicalization, the partial form needs
   a complex cost model, cannot be iterated, isn't canonicalizing, and
   should be a separate pass that runs very late (much like unrolling).

2) Leverage LLVM's Loop machinery to the fullest. The original unswitch
   dates from a time when a great deal of LLVM's loop infrastructure was
   missing, ineffective, and/or unreliable. As a consequence, a lot of
   complexity was added which we no longer need.

With these two overarching principles, I think we can build a fast and
effective unswitcher that fits in well in the new PM and in the
canonicalization pipeline. Some of the remaining functionality around
partial unswitching may not be relevant today (not many test cases or
benchmarks I can find) but if they are I'd like to add support for them
as a separate layer that runs very late in the pipeline.

Purely to make reviewing and introducing this code more manageable, I've
split this into first a trivial-unswitch-only pass and in the next patch
I'll add support for full non-trivial unswitching against a *fixed*
threshold, exactly like full unrolling. I even plan to re-use the
unrolling thresholds, as these are incredibly similar cost tradeoffs:
we're cloning a loop body in order to end up with simplified control
flow. We should only do that when the total growth is reasonably small.

One of the biggest changes with this pass compared to the previous one
is that previously, each individual trivial exiting edge from a switch
was unswitched separately as a branch. Now, we unswitch the entire
switch at once, with cases going to the various destinations. This lets
us unswitch multiple exiting edges in a single operation and also avoids
numerous extremely bad behaviors, where we would introduce 1000s of
branches to test for thousands of possible values, all of which would
take the exact same exit path bypassing the loop. Now we will use
a switch with 1000s of cases that can be efficiently lowered into
a jumptable. This avoids relying on somehow forming a switch out of the
branches or getting horrible code if that fails for any reason.

Another significant change is that this pass actively updates the CFG
based on unswitching. For trivial unswitching, this is actually very
easy because of the definition of loop simplified form. Doing this makes
the code coming out of loop unswitch dramatically more friendly. We
still should run loop-simplifycfg (at the least) after this to clean up,
but it will have to do a lot less work.

Finally, this pass makes much fewer attempts to simplify instructions
based on the unswitch. Something like loop-instsimplify, instcombine, or
GVN can be used to do increasingly powerful simplifications based on the
now dominating predicate. The old simplifications are things that
something like loop-instsimplify should get today or a very, very basic
loop-instcombine could get. Keeping that logic separate is a big
simplifying technique.

Most of the code in this pass that isn't in the old one has to do with
achieving specific goals:
- Updating the dominator tree as we go
- Unswitching all cases in a switch in a single step.

I think it is still shorter than just the trivial unswitching code in
the old pass despite having this functionality.

Differential Revision: https://reviews.llvm.org/D32409

llvm-svn: 301576
2017-04-27 18:45:20 +00:00
Chandler Carruth c246a4c973 Disable GVN Hoist due to still more bugs being found in it. There is
also a discussion about exactly what we should do prior to re-enabling
it.

The current bug is http://llvm.org/PR32821 and the discussion about this
is in the review thread for r300200.

llvm-svn: 301505
2017-04-27 00:28:03 +00:00
Filipe Cabecinhas 92dc348773 Simplify the CFG after loop pass cleanup.
Summary:
Otherwise we might end up with some empty basic blocks or
single-entry-single-exit basic blocks.

This fixes PR32085

Reviewers: chandlerc, danielcdh

Subscribers: mehdi_amini, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D30468

llvm-svn: 301395
2017-04-26 12:02:41 +00:00
Daniel Berlin 554dcd8c89 MemorySSA: Move to Analysis, from Transforms/Utils. It's used as
Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transform/Utils.
NFC.

llvm-svn: 299980
2017-04-11 20:06:36 +00:00
Dehao Chen cc75d2441d Add call branch annotation for ICP promoted direct call in SamplePGO mode.
Summary: SamplePGO uses branch_weight annotation to represent callsite hotness. When ICP promotes an indirect call to direct call, we need to make sure the direct call is annotated with branch_weight in SamplePGO mode, so that downstream function inliner can use hot callsite heuristic.

Reviewers: davidxl, eraman, xur

Reviewed By: davidxl, xur

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D30282

llvm-svn: 296028
2017-02-23 22:15:18 +00:00
Dehao Chen 7d230325ef Increases full-unroll threshold.
Summary:
The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold

This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge):

Performance:

403	0.11%
433	0.51%
445	0.48%
447	3.50%
453	1.49%
464	0.75%

Code size:

403	0.56%
433	0.96%
445	2.16%
447	2.96%
453	0.94%
464	8.02%

The compiler time overhead is similar with code size.

Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc

Reviewed By: hfinkel, chandlerc

Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D28368

llvm-svn: 295538
2017-02-18 03:46:51 +00:00
Davide Italiano 513dfaa0a3 [PM] Hook up the instrumented PGO machinery in the new PM.
Differential Revision:  https://reviews.llvm.org/D29308

llvm-svn: 294955
2017-02-13 15:26:22 +00:00
Chandler Carruth 719ffe1a66 [PM] Add devirtualization-based iteration utility into the new PM's
default pipeline.

A clang with this patch built with ASan and asserts can build all of the
test-suite as well, so it seems to not uncover any latent problems.

Differential Revision: https://reviews.llvm.org/D29853

llvm-svn: 294888
2017-02-12 05:38:04 +00:00
Chandler Carruth e87fc8cb71 [PM] Enable GlobalsAA in the new PM's pipeline by default.
All the invalidation issues and bugs in this seem to be fixed, it has
survived a full build of the test suite plus SPEC with asserts and ASan
enabled on the Clang binary used.

Differential Revision: https://reviews.llvm.org/D29815

llvm-svn: 294887
2017-02-12 05:34:04 +00:00
Chandler Carruth 0ede22e1c0 [PM] Add Argument Promotion to the pass pipeline.
This needs explicit requires of the optimization remark emission before
loop pass pipelines containing LICM as we no longer get it from the
inliner -- Argument Promotion may invalidate it. Technically the inliner
could also have broken this, but it never came up in testing.

Differential Revision: https://reviews.llvm.org/D29595

llvm-svn: 294670
2017-02-09 23:54:57 +00:00
Chandler Carruth addcda483e [PM] Port ArgumentPromotion to the new pass manager.
Now that the call graph supports efficient replacement of a function and
spurious reference edges, we can port ArgumentPromotion to the new pass
manager very easily.

The old PM-specific bits are sunk into callbacks that the new PM simply
doesn't use. Unlike the old PM, the new PM simply does argument
promotion and afterward does the update to LCG reflecting the promoted
function.

Differential Revision: https://reviews.llvm.org/D29580

llvm-svn: 294667
2017-02-09 23:46:27 +00:00
Peter Collingbourne 857aba4410 Rename LowerTypeTestsSummaryAction to PassSummaryAction. NFCI.
I intend to use the same type with the same semantics in the WholeProgramDevirt
pass.

Differential Revision: https://reviews.llvm.org/D29746

llvm-svn: 294629
2017-02-09 21:45:01 +00:00
Daniel Berlin 439042b7ad Add PredicateInfo utility and printing pass
Summary:
This patch adds a utility to build extended SSA (see "ABCD: eliminating
array bounds checks on demand"), and an intrinsic to support it. This
is then used to get functionality equivalent to propagateEquality in
GVN, in NewGVN (without having to replace instructions as we go). It
would work similarly in SCCP or other passes. This has been talked
about a few times, so i built a real implementation and tried to
productionize it.

Copies are inserted for operands used in assumes and conditional
branches that are based on comparisons (see below for more)

Every use affected by the predicate is renamed to the appropriate
intrinsic result.

E.g.
%cmp = icmp eq i32 %x, 50
br i1 %cmp, label %true, label %false
true:
ret i32 %x
false:
ret i32 1

will become

%cmp = icmp eq i32, %x, 50
br i1 %cmp, label %true, label %false
true:
; Has predicate info
; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 }
%x.0 = call @llvm.ssa_copy.i32(i32 %x)
ret i32 %x.0
false:
ret i23 1

(you can use -print-predicateinfo to get an annotated-with-predicateinfo dump)

This enables us to easily determine what operations are affected by a
given predicate, and how operations affected by a chain of
predicates.

Reviewers: davide, sanjoy

Subscribers: mgorny, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D29519

Update for review comments

Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong

Update for review comments

llvm-svn: 294351
2017-02-07 21:10:46 +00:00
Chandler Carruth baabda9317 [PM] Port LoopLoadElimination to the new pass manager and wire it into
the main pipeline.

This is a very straight forward port. Nothing weird or surprising.

This brings the number of missing passes from the new PM's pipeline down
to three.

llvm-svn: 293249
2017-01-27 01:32:26 +00:00
Chandler Carruth a95ff38924 [PM] Flesh out almost all of the late loop passes.
With this the per-module pass pipeline is *extremely* close to the
legacy PM. The missing pieces are:
- PruneEH (or some equivalent)
- ArgumentPromotion
- LoopLoadElimination
- LoopUnswitch

I'm going to work through those in essentially that order but this seems
like a worthwhile incremental step toward the end state.

One difference in what I have here from the legacy PM is that I've
consolidated some of the per-function passes at the very end of the
pipeline into the main optimization function pipeline. The intervening
passes are *really* uninteresting and so this seems very likely to have
any effect other than minor improvement to locality.

Note that there are still some failures in the test suite, but the
compiler doesn't crash or assert.

Differential Revision: https://reviews.llvm.org/D29114

llvm-svn: 293241
2017-01-27 00:50:21 +00:00
Chandler Carruth 79b733bc6b [PM] Enable the main loop pass pipelines with everything but
loop-unswitch in the main pipelines for the new PM.

All of these now work, and Clang built using this pipeline can build the
test suite and SPEC without hitting any asserts of ASan failures.

There are still some bugs hiding though -- 7 tests regress with the new
PM. I'm going to be investigating these, but it seems worthwhile to at
least get the pipelines in place so that others can play with them, and
they aren't completely broken.

Differential Revision: https://reviews.llvm.org/D29113

llvm-svn: 293225
2017-01-26 23:21:17 +00:00
Artur Pilipenko 8fb3d57e67 [Guards] Introduce loop-predication pass
This patch introduces guard based loop predication optimization. The new LoopPredication pass tries to convert loop variant range checks to loop invariant by widening checks across loop iterations. For example, it will convert

  for (i = 0; i < n; i++) {
    guard(i < len);
    ...
  }

to

  for (i = 0; i < n; i++) {
    guard(n - 1 < len);
    ...
  }

After this transformation the condition of the guard is loop invariant, so loop-unswitch can later unswitch the loop by this condition which basically predicates the loop by the widened condition:

  if (n - 1 < len)
    for (i = 0; i < n; i++) {
      ...
    } 
  else
    deoptimize

This patch relies on an NFC change to make ScalarEvolution::isMonotonicPredicate public (revision 293062).

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D29034

llvm-svn: 293064
2017-01-25 16:00:44 +00:00
Davide Italiano 089a912365 [PM] Flesh out the new pass manager LTO pipeline.
Differential Revision:  https://reviews.llvm.org/D28996

llvm-svn: 292863
2017-01-24 00:57:39 +00:00
Chandler Carruth e9b18e3d34 [PM] Port LoopSink to the new pass manager.
Like several other loop passes (the vectorizer, etc) this pass doesn't
really fit the model of a loop pass. The critical distinction is that it
isn't intended to be pipelined together with other loop passes. I plan
to add some documentation to the loop pass manager to make this more
clear on that side.

LoopSink is also different because it doesn't really need a lot of the
infrastructure of our loop passes. For example, if there aren't loop
invariant instructions causing a preheader to exist, there is no need to
form a preheader. It also doesn't need LCSSA because this pass is
only involved in sinking invariant instructions from a preheader into
the loop, not reasoning about live-outs.

This allows some nice simplifications to the pass in the new PM where we
can directly walk the loops once without restructuring them.

Differential Revision: https://reviews.llvm.org/D28921

llvm-svn: 292589
2017-01-20 08:42:19 +00:00
Michael Kuperstein 8ecc38ef85 [PM] Add LoopVectorize to the default module pipeline
LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.

llvm-svn: 292464
2017-01-19 02:21:54 +00:00
Chandler Carruth 3bab7e1a79 [PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

llvm-svn: 291662
2017-01-11 09:43:56 +00:00
Chandler Carruth 410eaeb064 [PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

llvm-svn: 291651
2017-01-11 06:23:21 +00:00
Chandler Carruth 05ca5acc9e [PM] Introduce a devirtualization iteration layer for the new PM.
This is an orthogonal and separated layer instead of being embedded
inside the pass manager. While it adds a small amount of complexity, it
is fairly minimal and the composability and control seems worth the
cost.

The logic for this ends up being nicely isolated and targeted. It should
be easy to experiment with different iteration strategies wrapped around
the CGSCC bottom-up walk using this kind of facility.

The mechanism used to track devirtualization is the simplest one I came
up with. I think it handles most of the cases the existing iteration
machinery handles, but I haven't done a *very* in depth analysis. It
does however match the basic intended semantics, and we can tweak or
tune its exact behavior incrementally as necessary. One thing that we
may want to revisit is freshly building the value handle set on each
iteration. While I don't think this will be a significant cost (it is
strictly fewer value handles but more churn of value handes than the old
call graph), it is conceivable that we'll want a somewhat more clever
tracking mechanism. My hope is to layer that on as a follow up patch
with data supporting any implementation complexity it adds.

This code also provides for a basic count heuristic: if the number of
indirect calls decreases and the number of direct calls increases for
a given function in the SCC, we assume devirtualization is responsible.
This matches the heuristics currently used in the legacy pass manager.

Differential Revision: https://reviews.llvm.org/D23114

llvm-svn: 290665
2016-12-28 11:07:33 +00:00
Chandler Carruth e635289ee2 [PM] Disable the loop vectorizer from the new PM's pipeline as it
currenty relies on the old PM's dependency system forming LCSSA.

The new PM will require a different design for this, and for now this is
causing most of the issues I'm currently seeing in testing. I'd like to
get to a testable baseline and then work on re-enabling things one at
a time.

llvm-svn: 290644
2016-12-28 02:24:55 +00:00
Chandler Carruth 81c8edaf5c [PM] Disable more of the loop passes -- LCSSA and LoopSimplify are also
not really wired into the loop pass manager in a way that will let us
productively use these passes yet.

This lets the new PM get farther in basic testing which is useful for
establishing a good baseline of "doesn't explode". There are still
plenty of crashers in basic testing though, this just gets rid of some
noise that is well understood and not representing a specific or narrow
bug.

llvm-svn: 290601
2016-12-27 10:16:46 +00:00
Chandler Carruth 534d644b86 [PM] Try to improve the comments here to make what's going on more
clear.

Based on post-commit review suggestion from Sean. (Thanks!)

llvm-svn: 290488
2016-12-24 05:11:17 +00:00
Chandler Carruth 060ad61fbe [PM] Add support for building a default AA pipeline to the PassBuilder.
Pretty boring and lame as-is but necessary. This is definitely a place
we'll end up with extension hooks longer term. =]

Differential Revision: https://reviews.llvm.org/D28076

llvm-svn: 290449
2016-12-23 20:38:19 +00:00
Davide Italiano e05e3306a3 [NewGVN] Add the pass to PassRegistry.def.
We need to hook up here to get it working with the new PM.
Add a test while here (and remove a typo).

llvm-svn: 290350
2016-12-22 16:35:02 +00:00
Chandler Carruth e3f5064b72 [PM] Introduce a reasonable port of the main per-module pass pipeline
from the old pass manager in the new one.

I'm not trying to support (initially) the numerous options that are
currently available to customize the pass pipeline. If we end up really
wanting them, we can add them later, but I suspect many are no longer
interesting. The simplicity of omitting them will help a lot as we sort
out what the pipeline should look like in the new PM.

I've also documented to the best of my ability *why* each pass or group
of passes is used so that reading the pipeline is more helpful. In many
cases I think we have some questionable choices of ordering and I've
left FIXME comments in place so we know what to come back and revisit
going forward. But for now, I've left it as similar to the current
pipeline as I could.

Lastly, I've had to comment out several places where passes are not
ported to the new pass manager or where the loop pass infrastructure is
not yet ready. I did at least fix a few bugs in the loop pass
infrastructure uncovered by running the full pipeline, but I didn't want
to go too far in this patch -- I'll come back and re-enable these as the
infrastructure comes online. But I'd like to keep the comments in place
because I don't want to lose track of which passes need to be enabled
and where they go.

One thing that seemed like a significant API improvement was to require
that we don't build pipelines for O0. It seems to have no real benefit.

I've also switched back to returning pass managers by value as at this
API layer it feels much more natural to me for composition. But if
others disagree, I'm happy to go back to an output parameter.

I'm not 100% happy with the testing strategy currently, but it seems at
least OK. I may come back and try to refactor or otherwise improve this
in subsequent patches but I wanted to at least get a good starting point
in place.

Differential Revision: https://reviews.llvm.org/D28042

llvm-svn: 290325
2016-12-22 06:59:15 +00:00
Chandler Carruth 1d96311447 [PM] Provide an initial, minimal port of the inliner to the new pass manager.
This doesn't implement *every* feature of the existing inliner, but
tries to implement the most important ones for building a functional
optimization pipeline and beginning to sort out bugs, regressions, and
other problems.

Notable, but intentional omissions:
- No alloca merging support. Why? Because it isn't clear we want to do
  this at all. Active discussion and investigation is going on to remove
  it, so for simplicity I omitted it.
- No support for trying to iterate on "internally" devirtualized calls.
  Why? Because it adds what I suspect is inappropriate coupling for
  little or no benefit. We will have an outer iteration system that
  tracks devirtualization including that from function passes and
  iterates already. We should improve that rather than approximate it
  here.
- Optimization remarks. Why? Purely to make the patch smaller, no other
  reason at all.

The last one I'll probably work on almost immediately. But I wanted to
skip it in the initial patch to try to focus the change as much as
possible as there is already a lot of code moving around and both of
these *could* be skipped without really disrupting the core logic.

A summary of the different things happening here:

1) Adding the usual new PM class and rigging.

2) Fixing minor underlying assumptions in the inline cost analysis or
   inline logic that don't generally hold in the new PM world.

3) Adding the core pass logic which is in essence a loop over the calls
   in the nodes in the call graph. This is a bit duplicated from the old
   inliner, but only a handful of lines could realistically be shared.
   (I tried at first, and it really didn't help anything.) All told,
   this is only about 100 lines of code, and most of that is the
   mechanics of wiring up analyses from the new PM world.

4) Updating the LazyCallGraph (in the new PM) based on the *newly
   inlined* calls and references. This is very minimal because we cannot
   form cycles.

5) When inlining removes the last use of a function, eagerly nuking the
   body of the function so that any "one use remaining" inline cost
   heuristics are immediately refined, and queuing these functions to be
   completely deleted once inlining is complete and the call graph
   updated to reflect that they have become dead.

6) After all the inlining for a particular function, updating the
   LazyCallGraph and the CGSCC pass manager to reflect the
   function-local simplifications that are done immediately and
   internally by the inline utilties. These are the exact same
   fundamental set of CG updates done by arbitrary function passes.

7) Adding a bunch of test cases to specifically target CGSCC and other
   subtle aspects in the new PM world.

Many thanks to the careful review from Easwaran and Sanjoy and others!

Differential Revision: https://reviews.llvm.org/D24226

llvm-svn: 290161
2016-12-20 03:15:32 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Hal Finkel 3ca4a6bcf1 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Chandler Carruth 6b9816477b [PM] Support invalidation of inner analysis managers from a pass over the outer IR unit.
Summary:
This never really got implemented, and was very hard to test before
a lot of the refactoring changes to make things more robust. But now we
can test it thoroughly and cleanly, especially at the CGSCC level.

The core idea is that when an inner analysis manager proxy receives the
invalidation event for the outer IR unit, it needs to walk the inner IR
units and propagate it to the inner analysis manager for each of those
units. For example, each function in the SCC needs to get an
invalidation event when the SCC gets one.

The function / module interaction is somewhat boring here. This really
becomes interesting in the face of analysis-backed IR units. This patch
effectively handles all of the CGSCC layer's needs -- both invalidating
SCC analysis and invalidating function analysis when an SCC gets
invalidated.

However, this second aspect doesn't really handle the
LoopAnalysisManager well at this point. That one will need some change
of design in order to fully integrate, because unlike the call graph,
the entire function behind a LoopAnalysis's results can vanish out from
under us, and we won't even have a cached API to access. I'd like to try
to separate solving the loop problems into a subsequent patch though in
order to keep this more focused so I've adapted them to the API and
updated the tests that immediately fail, but I've not added the level of
testing and validation at that layer that I have at the CGSCC layer.

An important aspect of this change is that the proxy for the
FunctionAnalysisManager at the SCC pass layer doesn't work like the
other proxies for an inner IR unit as it doesn't directly manage the
FunctionAnalysisManager and invalidation or clearing of it. This would
create an ever worsening problem of dual ownership of this
responsibility, split between the module-level FAM proxy and this
SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy
to work in terms of the module-level proxy and defer to it to handle
much of the updates. It only does SCC-specific invalidation. This will
become more important in subsequent patches that support more complex
invalidaiton scenarios.

Reviewers: jlebar

Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D27197

llvm-svn: 289317
2016-12-10 06:34:44 +00:00
Chandler Carruth dab4eae274 [PM] Change the static object whose address is used to uniquely identify
analyses to have a common type which is enforced rather than using
a char object and a `void *` type when used as an identifier.

This has a number of advantages. First, it at least helps some of the
confusion raised in Justin Lebar's code review of why `void *` was being
used everywhere by having a stronger type that connects to documentation
about this.

However, perhaps more importantly, it addresses a serious issue where
the alignment of these pointer-like identifiers was unknown. This made
it hard to use them in pointer-like data structures. We were already
dodging this in dangerous ways to create the "all analyses" entry. In
a subsequent patch I attempted to use these with TinyPtrVector and
things fell apart in a very bad way.

And it isn't just a compile time or type system issue. Worse than that,
the actual alignment of these pointer-like opaque identifiers wasn't
guaranteed to be a useful alignment as they were just characters.

This change introduces a type to use as the "key" object whose address
forms the opaque identifier. This both forces the objects to have proper
alignment, and provides type checking that we get it right everywhere.
It also makes the types somewhat less mysterious than `void *`.

We could go one step further and introduce a truly opaque pointer-like
type to return from the `ID()` static function rather than returning
`AnalysisKey *`, but that didn't seem to be a clear win so this is just
the initial change to get to a reliably typed and aligned object serving
is a key for all the analyses.

Thanks to Richard Smith and Justin Lebar for helping pick plausible
names and avoid making this refactoring many times. =] And thanks to
Sean for the super fast review!

While here, I've tried to move away from the "PassID" nomenclature
entirely as it wasn't really helping and is overloaded with old pass
manager constructs. Now we have IDs for analyses, and key objects whose
address can be used as IDs. Where possible and clear I've shortened this
to just "ID". In a few places I kept "AnalysisID" to make it clear what
was being identified.

Differential Revision: https://reviews.llvm.org/D27031

llvm-svn: 287783
2016-11-23 17:53:26 +00:00
Davide Italiano 2ae76dd239 [GlobalSplit] Port to the new pass manager.
llvm-svn: 287511
2016-11-21 00:28:23 +00:00
Rong Xu 1c0e9b97d2 Conditionally eliminate library calls where the result value is not used
Summary:
This pass shrink-wraps a condition to some library calls where the call
result is not used. For example:
   sqrt(val);
 is transformed to
   if (val < 0)
     sqrt(val);
Even if the result of library call is not being used, the compiler cannot
safely delete the call because the function can set errno on error
conditions.
Note in many functions, the error condition solely depends on the incoming
parameter. In this optimization, we can generate the condition can lead to
the errno to shrink-wrap the call. Since the chances of hitting the error
condition is low, the runtime call is effectively eliminated.

These partially dead calls are usually results of C++ abstraction penalty
exposed by inlining. This optimization hits 108 times in 19 C/C++ programs
in SPEC2006.

Reviewers: hfinkel, mehdi_amini, davidxl

Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz

Differential Revision: https://reviews.llvm.org/D24414

llvm-svn: 284542
2016-10-18 21:36:27 +00:00
Mehdi Amini 27d2379b4e Rename NameAnonFunctions to NameAnonGlobals to match what it is doing (NFC)
llvm-svn: 281745
2016-09-16 16:56:30 +00:00
Sriraman Tallam 06a67ba57d [PM] Port CFGViewer and CFGPrinter to the new Pass Manager
Differential Revision: https://reviews.llvm.org/D24592

llvm-svn: 281640
2016-09-15 18:35:27 +00:00
Chandler Carruth 8882346842 [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particualr pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the cal graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrsatructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

llvm-svn: 279618
2016-08-24 09:37:14 +00:00
Chandler Carruth 9b35e6d746 [PM] Re-instate r279227 and r279228 with a fix to the way the templating
was done to hopefully appease MSVC.

As an upside, this also implements the suggestion Sanjoy made in code
review, so two for one! =]

I'll be watching the bots to see if there are still issues.

llvm-svn: 279295
2016-08-19 18:36:06 +00:00
Chandler Carruth b8824a5d3f [PM] Revert r279227 and r279228 until I can find someone to help me
solve completely opaque MSVC build errors. It complains about lots of
stuff with this change without givin nearly enough information to even
try to fix.

llvm-svn: 279231
2016-08-19 10:51:55 +00:00
Chandler Carruth db1759ace1 [PM] Make the the new pass manager support fully generic extra arguments
to run methods, both for transform passes and analysis passes.

This also allows the analysis manager to use a different set of extra
arguments from the pass manager where useful. Consider passes over
analysis produced units of IR like SCCs of the call graph or loops.
Passes of this nature will often want to refer to the analysis result
that was used to compute their IR units (the call graph or LoopInfo).
And for transformations, they may want to communicate special update
information to the outer pass manager. With this change, it becomes
possible to have a run method for a loop pass that looks more like:

  PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM,
                        LoopInfo &LI, LoopUpdateRecord &UR);

And to query the analysis manager like:

    AM.getResult<MyLoopAnalysis>(L, LI);

This makes accessing the known-available analyses convenient and clear,
and it makes passing customized data structures around easy.

My initial use case is going to be in updating the pass manager layers
when the analysis units of IR change. But there are more use cases here
such as having a layer that lets inner passes signal whether certain
additional passes should be run because of particular simplifications
made. Two desires for this have come up in the past: triggering
additional optimization after successfully unrolling loops, and
triggering additional inlining after collapsing indirect calls to direct
calls.

Despite adding this layer of generic extensibility, the *only* change to
existing, simple usage are for places where we forward declare the
AnalysisManager template. We really shouldn't be doing this because of
the fragility exposed here, but currently it makes coping with the
legacy PM code easier.

Differential Revision: http://reviews.llvm.org/D21462

llvm-svn: 279227
2016-08-19 09:45:16 +00:00
Chandler Carruth 67fc52f067 [PM] Port the always inliner to the new pass manager in a much more
minimal and boring form than the old pass manager's version.

This pass does the very minimal amount of work necessary to inline
functions declared as always-inline. It doesn't support a wide array of
things that the legacy pass manager did support, but is alse ... about
20 lines of code. So it has that going for it. Notably things this
doesn't support:

- Array alloca merging
  - To support the above, bottom-up inlining with careful history
    tracking and call graph updates
- DCE of the functions that become dead after this inlining.
- Inlining through call instructions with the always_inline attribute.
  Instead, it focuses on inlining functions with that attribute.

The first I've omitted because I'm hoping to just turn it off for the
primary pass manager. If that doesn't pan out, I can add it here but it
will be reasonably expensive to do so.

The second should really be handled by running global-dce after the
inliner. I don't want to re-implement the non-trivial logic necessary to
do comdat-correct DCE of functions. This means the -O0 pipeline will
have to be at least 'always-inline,global-dce', but that seems
reasonable to me. If others are seriously worried about this I'd like to
hear about it and understand why. Again, this is all solveable by
factoring that logic into a utility and calling it here, but I'd like to
wait to do that until there is a clear reason why the existing
pass-based factoring won't work.

The final point is a serious one. I can fairly easily add support for
this, but it seems both costly and a confusing construct for the use
case of the always inliner running at -O0. This attribute can of course
still impact the normal inliner easily (although I find that
a questionable re-use of the same attribute). I've started a discussion
to sort out what semantics we want here and based on that can figure out
if it makes sense ta have this complexity at O0 or not.

One other advantage of this design is that it should be quite a bit
faster due to checking for whether the function is a viable candidate
for inlining exactly once per function instead of doing it for each call
site.

Anyways, hopefully a reasonable starting point for this pass.

Differential Revision: https://reviews.llvm.org/D23299

llvm-svn: 278896
2016-08-17 02:56:20 +00:00
Teresa Johnson 1eca6bc6a7 [PM] Port LoopDataPrefetch to new pass manager
Summary:
Refactor the existing support into a LoopDataPrefetch implementation
class and a LoopDataPrefetchLegacyPass class that invokes it.
Add a new LoopDataPrefetchPass for the new pass manager that utilizes
the LoopDataPrefetch implementation class.

Reviewers: mehdi_amini

Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23483

llvm-svn: 278591
2016-08-13 04:11:27 +00:00
Michael Kuperstein 31b8399beb [PM] Port LowerInvoke to the new pass manager
llvm-svn: 278531
2016-08-12 17:28:27 +00:00
Teresa Johnson 4223dd8559 [PM] Port NameAnonFunction pass to new pass manager
Summary:
Port the NameAnonFunction pass and add a test.

Depends on D23439.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23440

llvm-svn: 278509
2016-08-12 14:03:36 +00:00
Teresa Johnson f93b246f8b [PM] Port ModuleSummaryIndex analysis to new pass manager
Summary:
Port the ModuleSummaryAnalysisWrapperPass to the new pass manager.
Use it in the ported BitcodeWriterPass (similar to how we use the
legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass).

Also, pass the -module-summary opt flag through to the new pass
manager pipeline and through to the bitcode writer pass, and add
a test that uses it.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23439

llvm-svn: 278508
2016-08-12 13:53:02 +00:00
Sean Silva 5f6ec06f17 Consistently use CGSCCAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278080
2016-08-09 00:28:56 +00:00
Sean Silva 0746f3bfa4 Consistently use LoopAnalysisManager
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278079
2016-08-09 00:28:52 +00:00
Sean Silva fd03ac6a0c Consistently use ModuleAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278078
2016-08-09 00:28:38 +00:00
Sean Silva 36e0d01e13 Consistently use FunctionAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278077
2016-08-09 00:28:15 +00:00
Chandler Carruth a053a88df5 [PM] Change the name of the repeating utility to something less
overloaded (and simpler).

Sean rightly pointed out in code review that we've started using
"wrapper pass" as a specific part of the old pass manager, and in fact
it is more applicable there. Here, we really have a pass *template* to
build a repeated pass, so call it that.

llvm-svn: 277689
2016-08-04 03:52:53 +00:00
Chandler Carruth fdc6ba1e45 [PM] Fix a mis-named parameter in parseLoopPass -- the pass manager was
called "FPM" instead of "LPM" in a hold-over from when the code was
modeled on that used to parse function passes.

llvm-svn: 277584
2016-08-03 09:14:03 +00:00
Chandler Carruth 241bf2456f [PM] Add a generic 'repeat N times' pass wrapper to the new pass
manager.

While this has some utility for debugging and testing on its own, it is
primarily intended to demonstrate the technique for adding custom
wrappers that can provide more interesting interation behavior in
a nice, orthogonal, and composable layer.

Being able to write these kinds of very dynamic and customized controls
for running passes was one of the motivating use cases of the new pass
manager design, and this gives a hint at how they might look. The actual
logic is tiny here, and most of this is just wiring in the pipeline
parsing so that this can be widely used.

I'm adding this now to show the wiring without a lot of business logic.
This is a precursor patch for showing how a "iterate up to N times as
long as we devirtualize a call" utility can be added as a separable and
composable component along side the CGSCC pass management.

Differential Revision: https://reviews.llvm.org/D22405

llvm-svn: 277581
2016-08-03 07:44:48 +00:00
Chandler Carruth 4c3e3bf9fb [PM] Remove the NDEBUG condition around isModulePassName.
I forgot to do this initially, and added when I saw this fail in
a no-asserts build, but managed to loose the diff from the actual patch
that got submitted. Very sorry.

llvm-svn: 277562
2016-08-03 03:26:09 +00:00
Chandler Carruth 6cb2ab2c60 [PM] Significantly refactor the pass pipeline parsing to be easier to
reason about and less error prone.

The core idea is to fully parse the text without trying to identify
passes or structure. This is done with a single state machine. There
were various bugs in the logic around this previously that were repeated
and scattered across the code. Having a single routine makes it much
easier to fix and get correct. For example, this routine doesn't suffer
from PR28577.

Then the actual pass construction is handled using *much* easier to read
code and simple loops, with particular pass manager construction sunk to
live with other pass construction. This is especially nice as the pass
managers *are* in fact passes.

Finally, the "implicit" pass manager synthesis is done much more simply
by forming "pre-parsed" structures rather than having to duplicate tons
of logic.

One of the bugs fixed by this was evident in the tests where we accepted
a pipeline that wasn't really well formed. Another bug is PR28577 for
which I have added a test case.

The code is less efficient than the previous code but I'm really hoping
that's not a priority. ;]

Thanks to Sean for the review!

Differential Revision: https://reviews.llvm.org/D22724

llvm-svn: 277561
2016-08-03 03:21:41 +00:00
Michael Kuperstein c40618610f [PM] Port SpeculativeExecution to the new PM
Differential Revision: https://reviews.llvm.org/D23033

llvm-svn: 277393
2016-08-01 21:48:33 +00:00
Michael Kuperstein e45d4d9b35 [PM] Port LowerGuardIntrinsic to the new PM.
llvm-svn: 277057
2016-07-28 22:08:41 +00:00
Michael Kuperstein 39feb6290c [PM] Port SymbolRewriter to the new PM
Differential Revision: https://reviews.llvm.org/D22703

llvm-svn: 276687
2016-07-25 20:52:00 +00:00
Jordan Rose f85a95fdcb StringSwitch cannot be copied (take 2).
This prevents StringSwitch from being used with 'auto', which is
important because the inferred type is StringSwitch rather than the
result type. This is a problem because StringSwitch stores addresses
of temporary values rather than copying or moving the value into its
own storage.

This is a compromise that still allows wrapping StringSwitch in other
temporary structures, which (unlike StringSwitch) may be non-trivial
to set up and therefore want to at least be movable. (For an example,
see QueryParser.cpp in clang-tools-extra.)

Changing this uncovered the bug in PassBuilder, also in this patch.
Clang doesn't seem to have any occurrences of the issue.

Re-commit of r276652.

llvm-svn: 276671
2016-07-25 18:34:51 +00:00
Jordan Rose 9978dec4c2 Revert "StringSwitch cannot be copied or moved."
This reverts commit r276652. The clang-query tool is currently
relying on this behavior. I'll try again later.

llvm-svn: 276661
2016-07-25 17:28:33 +00:00
Jordan Rose 0cdbe7a572 StringSwitch cannot be copied or moved.
...but most importantly, it cannot be used well with 'auto', because
the inferred type is StringSwitch rather than the result type. This
is a problem because StringSwitch stores addresses of temporary
values rather than copying or moving the value into its own storage.

Changing this uncovered the bug in PassBuilder, also in this patch.
Clang doesn't seem to have any occurrences of the issue.

llvm-svn: 276652
2016-07-25 17:08:24 +00:00
Wei Mi e04d0eff29 [PM] Port BreakCriticalEdges to the new PM.
Differential Revision: https://reviews.llvm.org/D22688

llvm-svn: 276449
2016-07-22 18:04:25 +00:00
Wei Mi 1cf58f8996 [PM] Port NaryReassociate to the new PM
Differential Revision: https://reviews.llvm.org/D22648

llvm-svn: 276349
2016-07-21 22:28:52 +00:00
Sean Silva e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Dehao Chen 6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop String Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Teresa Johnson 2124157102 [PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.

Reviewers: mehdi_amini, davide

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D22475

llvm-svn: 275916
2016-07-18 21:22:24 +00:00
Adam Nemet b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Adam Nemet 79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Adam Nemet 3beef41873 Sort include headers
llvm-svn: 275809
2016-07-18 16:29:17 +00:00
Dehao Chen 1a44452b11 [PM] Convert IVUsers analysis to new pass manager.
Summary: Convert IVUsers analysis to new pass manager.

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22434

llvm-svn: 275698
2016-07-16 22:51:33 +00:00
Dehao Chen dcafd5ebfd [PM] Convert LoopInstSimplify Pass to new PM
Summary: Convert LoopInstSimplify to new PM. Unfortunately there is no exisiting unittest for this pass.

Reviewers: davidxl, silvas

Subscribers: silvas, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22280

llvm-svn: 275576
2016-07-15 16:42:11 +00:00
Jun Bum Lim c837af306e [PM] Port Dead Loop Deletion Pass to the new PM
Summary: Port Dead Loop Deletion Pass to the new pass manager.

Reviewers: silvas, davide

Subscribers: llvm-commits, sanjoy, mcrosier

Differential Revision: https://reviews.llvm.org/D21483

llvm-svn: 275453
2016-07-14 18:28:29 +00:00
Dehao Chen f400a099a4 Add missing files for r275222
New pass manager for LICM.

Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275224
2016-07-12 22:42:24 +00:00
Dehao Chen b9f8e29290 [PM] Port LoopIdiomRecognize Pass to new PM
Summary: Port LoopIdiomRecognize Pass to new PM

Reviewers: davidxl

Subscribers: davide, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D22250

llvm-svn: 275202
2016-07-12 18:45:51 +00:00
Vitaly Buka 204dc533c5 Revert "New pass manager for LICM."
Summary: This reverts commit r275118.

Subscribers: sanjoy, mehdi_amini

Differential Revision: http://reviews.llvm.org/D22259

llvm-svn: 275156
2016-07-12 06:25:32 +00:00
Dehao Chen 7ef5820fa3 New pass manager for LICM.
Summary: Port LICM to the new pass manager.

Reviewers: davidxl, silvas

Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21772

llvm-svn: 275118
2016-07-11 22:45:24 +00:00
Davide Italiano e8ae0b5eb4 [PM/IPO] Port LowerTypeTests to the new PassManager.
There's a little bit of churn in this patch because the initialization
mechanism is now shared between the old and the new PM. Other than
that, it's just a pretty mechanical translation.

llvm-svn: 275082
2016-07-11 18:10:06 +00:00
Sean Silva db90d4d9c1 [PM] Port LoopVectorize to the new PM.
llvm-svn: 275000
2016-07-09 22:56:50 +00:00
Davide Italiano 92b933a55c [PM] Port CrossDSOCFI to the new pass manager.
llvm-svn: 274962
2016-07-09 03:25:35 +00:00
Sean Silva 0dacbd8f31 [PM] Fix a think-o. mv {Scalar,Vectorize}/SLPVectorize.h
llvm-svn: 274960
2016-07-09 03:11:29 +00:00
Davide Italiano cd96cfd8df [PM] Port LoopSimplify to the new pass manager.
While here move simplifyLoop() function to the new header, as
suggested by Chandler in the review.

Differential Revision:  http://reviews.llvm.org/D21404

llvm-svn: 274959
2016-07-09 03:03:01 +00:00
Wei Mi 90d195a5fd [PM] Port UnreachableBlockElim to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D22124

llvm-svn: 274824
2016-07-08 03:32:49 +00:00
Davide Italiano 16284df8ec [PM] Port InstSimplify to the new pass manager.
llvm-svn: 274796
2016-07-07 21:14:36 +00:00
Sean Silva 59fe82f4ce [PM] Port TailCallElim
llvm-svn: 274708
2016-07-06 23:48:41 +00:00
Sean Silva b025d375a1 [PM] Port CorrelatedValuePropagation
llvm-svn: 274705
2016-07-06 23:26:29 +00:00