Commit Graph

6830 Commits

Author SHA1 Message Date
Teresa Johnson 6955feebf3 [ThinLTO] Prevent exporting of locals used/defined in module level asm
Summary:
This patch uses the same approach added for inline asm in r285513 to
similarly prevent promotion/renaming of locals used or defined in module
level asm.

All static global values defined in normal IR and used in module level asm
should be included on either the llvm.used or llvm.compiler.used global.
The former were already being flagged as NoRename in the summary, and
I've simply added llvm.compiler.used values to this handling.

Module level asm may also contain defs of values. We need to prevent
export of any refs to local values defined in module level asm (e.g. a
ref in normal IR), since that also requires renaming/promotion of the
local. To do that, the summary index builder looks at all values in the
module level asm string that are not marked Weak or Global, which is
exactly the set of locals that are defined. A summary is created for
each of these local defs and flagged as NoRename.

This required adding handling to the BitcodeWriter to look at GV
declarations to see if they have a summary (rather than skipping them
all).

Finally, added an assert to IRObjectFile::CollectAsmUndefinedRefs to
ensure that an MCAsmParser is available, otherwise the module asm parse
would silently fail. Initialized the asm parser in the opt tool for use
in testing this fix.

Fixes PR30610.

Reviewers: mehdi_amini

Subscribers: johanengelen, krasin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26146

llvm-svn: 286297
2016-11-08 21:53:35 +00:00
Andrew Kaylor 9604f34996 [BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes
Differential Revision: https://reviews.llvm.org/D26382

llvm-svn: 286294
2016-11-08 21:07:42 +00:00
Sanjoy Das 2582e690b7 [TBAA] Drop support for "old style" scalar TBAA tags
Summary:
We've had support for auto upgrading old style scalar TBAA access
metadata tags into the "new" struct path aware TBAA metadata for 3 years
now.  The only way to actually generate old style TBAA was explicitly
through the IRBuilder API.  I think this is a good time for dropping
support for old style scalar TBAA.

I'm not removing support for textual or bitcode upgrade -- if you have
IR with the old style scalar TBAA tags that go through the AsmParser orf
the bitcode parser before LLVM sees them, they will keep working as
usual.

Note:

  %val = load i32, i32* %ptr, !tbaa !N
  !N = < scalar tbaa node >

is equivalent to

  %val = load i32, i32* %ptr, !tbaa !M
  !N = < scalar tbaa node >
  !M = !{!N, !N, 0}

Reviewers: manmanren, chandlerc, sunfish

Subscribers: mcrosier, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D26229

llvm-svn: 286291
2016-11-08 20:46:01 +00:00
Piotr Padlewski 01659cb9fe NFC small changes in MemDep
llvm-svn: 286260
2016-11-08 18:20:51 +00:00
Amara Emerson 0b40201e13 Adds the loop end location to the loop metadata.
This additional information can be used to improve the locations when generating remarks for loops.

Patch by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D25763

llvm-svn: 286227
2016-11-08 11:18:59 +00:00
Adam Nemet b103fc52d3 [OptDiag, opt-viewer] Save callee's location and display as link
With this we get a new field in the YAML record if the value being
streamed out has a debug location.  For examples, please see the changes
to the tests.

This is then used in opt-viewer to display a link for the callee
function in the inlining remarks.

Differential Revision: https://reviews.llvm.org/D26366

llvm-svn: 286169
2016-11-07 22:41:13 +00:00
Chad Rosier 8f348017b0 [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D26252

llvm-svn: 286108
2016-11-07 14:11:45 +00:00
Eli Friedman b6befc3bc4 DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Sanjay Patel 9840cdad4c [ValueTracking] remove TODO comment; NFC
InstCombine should always canonicalize patterns like the one shown in the comment
when visiting 'select' insts in adjustMinMax().

Scalars were already handled there, and vector splats are handled after:
https://reviews.llvm.org/rL285732

llvm-svn: 285744
2016-11-01 20:43:00 +00:00
Sanjoy Das 9c729067e9 [TBAA] Use wrapper objects instead of raw getOperand s; NFC
This is intended to make the semantic intent clearer.

The wrapper objects are now generic to avoid `const_cast` s.  Since
`const` ness is part of the API of `MDNode::getMostGenericTBAA` (and
therefore I can't make things `const` all the way through without some
code churn outside TypeBasedAliasAnalysis.cpp), this seemed like the
cleanest solution.

llvm-svn: 285665
2016-11-01 02:58:30 +00:00
Sanjoy Das a1a77f3a21 [TBAA] Rename accessors to be more idiomatic; NFC
llvm-svn: 285661
2016-11-01 01:21:57 +00:00
Sanjoy Das 1707869db5 [SCEV] Try to order n-ary expressions in CompareValueComplexity
llvm-svn: 285535
2016-10-31 03:32:43 +00:00
Sanjoy Das 299e67291c [SCEV] In CompareValueComplexity, order global values by their name
llvm-svn: 285529
2016-10-30 23:52:56 +00:00
Sanjoy Das b4830a84b9 [SCEV] Use auto for consistency with an upcoming change; NFC
llvm-svn: 285528
2016-10-30 23:52:53 +00:00
Teresa Johnson bf28c8fa45 [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm
Summary:
Instead of using the workaround of suppressing the entire index for
modules that call inline asm that may reference locals, use the
NoRename flag on the summary for any locals in the llvm.used set, and
add a reference edge from any functions containing inline asm.

This avoids issues from having no summaries despite the module defining
global values, which was preventing more aggressive index-based
optimization. It will be followed by a subsequent patch to make a
similar fix for local references in module level asm (to fix PR30610).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26121

llvm-svn: 285513
2016-10-30 05:40:44 +00:00
Sanjay Patel 36eeb6d6f6 [ValueTracking] recognize more variants of smin/smax
Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. 
There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding
smax pattern and commuted variants.

The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We
can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR.

Differential Revision: https://reviews.llvm.org/D26091

llvm-svn: 285499
2016-10-29 16:21:19 +00:00
Tom Stellard 13068995b9 [Loads] Fix crash in is isDereferenceableAndAlignedPointer()
Summary:
We were trying to add APInt values with different bit sizes after
visiting an addrspacecast instruction which changed the bit width
of the pointer.

Reviewers: majnemer, hfinkel

Subscribers: hfinkel, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24774

llvm-svn: 285407
2016-10-28 15:32:28 +00:00
Igor Laevsky c3ccf5d77b [LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.

Differential Revision: https://reviews.llvm.org/D25873

llvm-svn: 285394
2016-10-28 12:57:20 +00:00
Teresa Johnson 02563cd3a6 [ThinLTO] Create AliasSummary when building index
Summary:
Previously we were creating the alias summary on the fly while writing
the summary to bitcode. This moves the creation of these summaries to
the module summary index builder where we build the rest of the summary
index.

This is going to be necessary for setting the NoRename flag for values
possibly used in inline asm or module level asm.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26049

llvm-svn: 285379
2016-10-28 02:39:38 +00:00
Sanjay Patel e372aecb8a [ValueTracking] fix matchSelectPattern to allow vector splat folds of min/max/abs/nabs
llvm-svn: 285303
2016-10-27 15:26:10 +00:00
Sanjoy Das 01969218a4 Simplify `x >=u x >> y` and `x >=u x udiv y`
Summary:
Extends InstSimplify to handle both `x >=u x >> y` and `x >=u x udiv y`.

This is a folloup of rL258422 and
https://github.com/rust-lang/rust/pull/30917 where llvm failed to
optimize away the bounds checking in a binary search.

Patch by Arthur Silva!

Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25941

llvm-svn: 285228
2016-10-26 19:18:43 +00:00
Chad Rosier 4447d7a816 Revert "[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory."
This reverts commit r285191.

LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent
code from being moved outside of the original scope.

llvm-svn: 285227
2016-10-26 19:18:19 +00:00
Chad Rosier 1408628ffa [AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D25969

llvm-svn: 285191
2016-10-26 12:42:11 +00:00
Eli Friedman c5b7262073 Fix regression from my recent GlobalsAA fix.
There are two fixes here: one, AnalyzeUsesOfPointer can't return
false until it has checked all the uses of the pointer. Two, if a
global uses another global, we have to assume the address of the
first global escapes.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 .

Differential Revision: https://reviews.llvm.org/D25798

llvm-svn: 285034
2016-10-24 21:47:44 +00:00
Gerolf Hoflehner 9e2afa8bd7 [BasicAA] Fix - missed alias in GEP expressions
In BasicAA GEP operand values get adjusted ("wrap-around") based on the
pointersize. Otherwise, in non-64b modes, AA could report false negatives.
However, a wrap-around is valid only for a fully evaluated expression.
It had been introduced to fix an alias problem in
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html.
This commit restricts the wrap-around to constant gep operands only where the
value is known at compile-time.

llvm-svn: 284908
2016-10-22 02:41:39 +00:00
Peter Collingbourne ecdd58f1d6 Analysis: Move llvm::getConstantRangeFromMetadata to IR library.
We're about to start using it there.

Differential Revision: https://reviews.llvm.org/D25877

llvm-svn: 284865
2016-10-21 19:59:26 +00:00
Artur Pilipenko 47dc098c06 [LVI] Fix a bug with a guard being the very first instruction in a BB not taken into account
While looking for guards use reverse iterator and scan up to rend() not to begin()

llvm-svn: 284827
2016-10-21 15:02:21 +00:00
John Brawn 84b21835f1 [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818
2016-10-21 11:08:48 +00:00
Li Huang fcfe8cd3ae [SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV
This is to avoid inlining too many multiplication operands into a SCEV, which could 
take exponential time in the worst case.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25794

llvm-svn: 284784
2016-10-20 21:38:39 +00:00
Benjamin Kramer b2505005c7 Retire llvm::alignOf in favor of C++11 alignof.
No functionality change intended.

llvm-svn: 284733
2016-10-20 15:02:18 +00:00
Benjamin Kramer 2a8bef8769 Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

llvm-svn: 284721
2016-10-20 12:20:28 +00:00
Sanjay Patel efd8885772 [InstSimplify] fold negation of sign-bit
0 - X --> X, if X is 0 or the minimum signed value
0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW

I noticed this pattern might be created in the backend after the change from D25485, 
so we'll want to add a similar fold for the DAG.

The use of computeKnownBits in InstSimplify may be something to investigate if the
compile time of InstSimplify is noticeable. We could replace computeKnownBits with 
specific pattern matchers or limit the recursion.

Differential Revision: https://reviews.llvm.org/D25785

llvm-svn: 284649
2016-10-19 21:23:45 +00:00
Chad Rosier 6e3a92ec88 [AliasSetTracker] Add support for memcpy and memmove.
Differential Revision: https://reviews.llvm.org/D25776

llvm-svn: 284630
2016-10-19 19:09:03 +00:00
Chad Rosier 16970a847c [AliasSetTracker] Return void for add() functions. NFC.
Differential Revision: https://reviews.llvm.org/D25748

llvm-svn: 284628
2016-10-19 18:50:32 +00:00
Sanjoy Das 507dd40a4a [SCEV] Make CompareValueComplexity a little bit smarter
This helps canonicalization in some cases.

Thanks to Pankaj Chawla for the investigation and the test case!

llvm-svn: 284501
2016-10-18 17:45:16 +00:00
Sanjoy Das 9cd877a25a [SCEV] Extract out a helper function; NFC
llvm-svn: 284500
2016-10-18 17:45:13 +00:00
John Brawn ecf79300dd [SCEV] More accurate calculation of max backedge count of some less-than loops
In loops that look something like
 i = n;
 do {
  ...
 } while(i++ < n+k);
where k is a constant, the maximum backedge count is k (in fact the backedge
count will be either 0 or k, depending on whether n+k wraps). More generally
for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will
iterate either 0 or that constant number of times.

This allows for more loop unrolling with the recent upper bound loop unrolling
changes, and I'm working on a patch that will let loop unrolling additionally
make use of the loop being executed either 0 or k times (we need to retain the
loop comparison only on the first unrolled iteration).

Differential Revision: https://reviews.llvm.org/D25607

llvm-svn: 284465
2016-10-18 10:10:53 +00:00
Tobias Grosser 2bbec0ee7f [SCEV] Consider delinearization pattern with extension with identity factor
Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor.

Reviewers: sanjoy, grosser

Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser

Differential Revision: https://reviews.llvm.org/D16492

llvm-svn: 284378
2016-10-17 11:56:26 +00:00
Li Huang 755f75f765 Test commit. (NFC)
llvm-svn: 284307
2016-10-15 19:00:04 +00:00
Albert Gutowski 795d7d6381 Create llvm.addressofreturnaddress intrinsic
Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows.

Reviewers: majnemer, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25293

llvm-svn: 284061
2016-10-12 22:13:19 +00:00
Haicheng Wu 1ef17e90b2 Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

llvm-svn: 284053
2016-10-12 21:29:38 +00:00
Haicheng Wu 45e4ef737d Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
This reverts commit r284044.

llvm-svn: 284051
2016-10-12 21:02:22 +00:00
Haicheng Wu 6cac34fd41 [LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop
This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

llvm-svn: 284044
2016-10-12 20:24:32 +00:00
Artur Pilipenko c6eb6bd9cb [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Since this change is known to cause performance degradations in some cases it's commited under a temporary flag which is turned off by default.

Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 284022
2016-10-12 16:18:43 +00:00
Chandler Carruth 5dbc164d15 [LCG] Add the necessary functionality to the LazyCallGraph to support inlining.
The basic inlining operation makes the following changes to the call graph:
1) Add edges that were previously transitive edges. This is always trivial and
   this patch gives the LCG helper methods to make this more convenient.
2) Remove the inlined edge. We had existing support for this, but it contained
   bugs that needed to be fixed. Testing in the same pattern as the inliner
   exposes these bugs very nicely.
3) Delete a function when it becomes dead because it is internal and all calls
   have been inlined. The LCG had no support at all for this operation, so this
   adds that support.

Two unittests have been added that exercise this specific mutation pattern to
the call graph. They were extremely effective in uncovering bugs. Sadly,
a large fraction of the code here is just to implement those unit tests, but
I think they're paying for themselves. =]

This was split out of a patch that actually uses the routines to
implement inlining in the new pass manager in order to isolate (with
unit tests) the logic that was entirely within the LCG.

Many thanks for the careful review from folks! There will be a few minor
follow-up patches based on the comments in the review as well.

Differential Revision: https://reviews.llvm.org/D24225

llvm-svn: 283982
2016-10-12 07:59:56 +00:00
Igor Laevsky 04423cf785 [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm
For each block check that it doesn't have any uses outside of it's innermost loop.

Differential Revision: https://reviews.llvm.org/D25364

llvm-svn: 283877
2016-10-11 13:37:22 +00:00
Dehao Chen c87bc79d90 Tune isHotFunction/isColdFunction
Summary: This patch sets function as hot if function's entry count is hot/cold.

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25048

llvm-svn: 283852
2016-10-11 05:19:00 +00:00
Dehao Chen 84287abf43 Rename isHotFunction/isColdFunction to isFunctionEntryHot/isFunctionEntryCold. (NFC)
This is in preparation for https://reviews.llvm.org/D25048

llvm-svn: 283805
2016-10-10 21:47:28 +00:00
Mehdi Amini 732afdd09a Turn cl::values() (for enum) from a vararg function to using C++ variadic template
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:

 va_start(ValueArgs, Desc);

with Desc being a StringRef.

Differential Revision: https://reviews.llvm.org/D25342

llvm-svn: 283671
2016-10-08 19:41:06 +00:00
Teresa Johnson 897bab9b35 [ThinLTO] Record calls to aliases
Summary:
When there is a call to an alias in the same module, we were not
adding a call edge. So we could incorrectly think that the alias
was dead if it was inlined in that function, despite having a
reference imported elsewhere. This resulted in unsats at link time.

Add a call edge when the call is to an alias.

Reviewers: davide, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25384

llvm-svn: 283664
2016-10-08 16:11:42 +00:00
Tom Stellard 17eb3413cd [ValueTracking] Fix crash in GetPointerBaseWithConstantOffset()
Summary:
While walking defs of pointer operands we were assuming that the pointer
size would remain constant.  This is not true, because addresspacecast
instructions may cast the pointer to an address space with a different
pointer width.

This partial reverts r282612, which was a more conservative solution
to this problem.

Reviewers: reames, sanjoy, apilipenko

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24772

llvm-svn: 283557
2016-10-07 14:23:29 +00:00
Oliver Stannard 4df1cc0b00 [ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.

Differential Revision: https://reviews.llvm.org/D24462

llvm-svn: 283530
2016-10-07 08:48:24 +00:00
Henric Karlsson 54a53bd303 Test commit access (NFC)
llvm-svn: 283439
2016-10-06 10:58:41 +00:00
Bjorn Pettersson 3961603921 [ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement.
Summary:
The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement.

Reviewers: majnemer, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24955

llvm-svn: 283434
2016-10-06 09:56:21 +00:00
Mehdi Amini 149f6eaed9 Re-commit "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283285 and re-commit r283275 with
a fix for format("%s", Str); where Str is a StringRef.

llvm-svn: 283298
2016-10-05 05:59:29 +00:00
Mehdi Amini 2bcac0fac4 Revert "Re-commit "Use StringRef in Support/Darf APIs (NFC)""
One test seems randomly broken: DebugInfo/X86/gnu-public-names.ll

llvm-svn: 283285
2016-10-05 01:04:02 +00:00
Mehdi Amini 32b297a42f Re-commit "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283278 and re-commit r283275 with
the update to fix the build on the LLDB side.

llvm-svn: 283281
2016-10-05 00:37:18 +00:00
Mehdi Amini 78b04ae7ac Revert "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283275, it broke LLDB Android debug server.

llvm-svn: 283278
2016-10-05 00:21:14 +00:00
Mehdi Amini e0327be584 Use StringRef in Support/Darf APIs (NFC)
llvm-svn: 283275
2016-10-04 23:55:40 +00:00
Hal Finkel bdd6735a9e Don't filter diagnostics written as YAML to the output file
The purpose of the YAML diagnostic output file is to collect information on
optimizations performed, or not performed, for later processing by tools that
help users (and compiler developers) understand how code was optimized. As
such, the diagnostics that appear in the file should not be coupled to what a
user might want to see summarized for them as the compiler runs, and in fact,
because the user likely does not know what optimization diagnostics their tools
might want to use, the user cannot provide a useful filter regardless. As such,
we shouldn't filter the diagnostics going to the output file.

Differential Revision: https://reviews.llvm.org/D25224

llvm-svn: 283236
2016-10-04 18:13:45 +00:00
Adam Nemet 0428e93217 Serialize remark argument as a mapping to get proper quotation for the value.
llvm-svn: 283231
2016-10-04 17:05:04 +00:00
Adam Nemet 2780ee0dc1 Allow derived classes of OptimizationRemarkAnalysis in YAML
llvm-svn: 283230
2016-10-04 17:05:01 +00:00
Eli Friedman 74bed9d757 Make GlobalsAA ignore dead constant expressions.
Slightly improves the precision of GlobalsAA in certain situations, and
makes the behavior of optimization passes more predictable.

Differential Revision: https://reviews.llvm.org/D24104

llvm-svn: 283165
2016-10-04 00:03:55 +00:00
Volkan Keles 1c38681ae6 Add new target hooks for LoadStoreVectorizer
Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D24727

llvm-svn: 283099
2016-10-03 10:31:34 +00:00
Sanjoy Das 4aeb0f2c7f [SCEV] Rely on ConstantRange instead of custom logic; NFCI
This was first landed in rL283058 and subsequenlty reverted since a
change this depends on (rL283057) was buggy and had to be reverted.

llvm-svn: 283079
2016-10-02 20:59:10 +00:00
Sanjoy Das f230b0aa43 Revert r283057 and r283058
They've broken the sanitizer-bootstrap bots.  Reverting while I investigate.

Original commit messages:

r283057: "[ConstantRange] Make getEquivalentICmp smarter"

r283058: "[SCEV] Rely on ConstantRange instead of custom logic; NFCI"
llvm-svn: 283062
2016-10-02 02:40:27 +00:00
Sanjoy Das 1f7b813e2b Remove duplicated code; NFC
ICmpInst::makeConstantRange does exactly the same thing as
ConstantRange::makeExactICmpRegion.

llvm-svn: 283059
2016-10-02 00:09:57 +00:00
Sanjoy Das 1b9cefcf03 [SCEV] Rely on ConstantRange instead of custom logic; NFCI
llvm-svn: 283058
2016-10-02 00:09:52 +00:00
Sanjoy Das 54e6a21dca [SCEV] Remove commented out code; NFC
llvm-svn: 283056
2016-10-02 00:09:45 +00:00
Mehdi Amini 9a72cd7b21 Use StringRef in TLI instead of raw pointer (NFC)
llvm-svn: 283005
2016-10-01 03:10:48 +00:00
Mehdi Amini 117296c0a0 Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
2016-10-01 02:56:57 +00:00
Piotr Padlewski f3d122cd02 NFC fix doxygen comments
llvm-svn: 282950
2016-09-30 21:05:49 +00:00
Adam Nemet 877ccee8cc [LAA, LV] Port to new streaming interface for opt remarks. Update LV
(Recommit after making sure IsVerbose gets properly initialized in
DiagnosticInfoOptimizationBase.  See previous commit that takes care of
this.)

OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282813
2016-09-30 00:01:30 +00:00
Adam Nemet 556a06b1ee Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV"
This reverts commit r282758.

There are some clang failures I haven't seen.

llvm-svn: 282759
2016-09-29 20:17:37 +00:00
Adam Nemet c1d21817d1 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282758
2016-09-29 20:12:18 +00:00
Dehao Chen 5461d8bdb5 Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update.
Summary: This refactors the change in r282616

Reviewers: davidxl, eraman, mehdi_amini

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D25041

llvm-svn: 282630
2016-09-28 21:00:58 +00:00
Dehao Chen 80c8ebb4d8 Fix the bug when -compile-twice is specified, the PSI will be invalidated.
Summary:
When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary.
The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24993

llvm-svn: 282616
2016-09-28 18:41:14 +00:00
Artur Pilipenko b6ce6e5dac Don't look through addrspacecast in GetPointerBaseWithConstantOffset
Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value.

This is similar to D13008.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D24729

llvm-svn: 282612
2016-09-28 17:57:16 +00:00
Sanjoy Das f0022125e0 [SCEV] Use a SmallPtrSet as a temporary union predicate; NFC
Summary:
Instead of creating and destroying SCEVUnionPredicate instances (which
internally creates and destroys a DenseMap), use temporary SmallPtrSet
instances of remember the set of predicates that will get reified into a
SCEVUnionPredicate.

Reviewers: silviu.baranga, sbaranga

Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25000

llvm-svn: 282606
2016-09-28 17:14:58 +00:00
Sanjay Patel 220a8730fb [InstSimplify] allow or-of-icmps folds with vector splat constants
llvm-svn: 282592
2016-09-28 14:27:21 +00:00
Sanjay Patel 1b312ad42d [InstSimplify] allow and-of-icmps folds with vector splat constants
llvm-svn: 282590
2016-09-28 13:53:13 +00:00
Adam Nemet 69330e0bc5 [LAA] Rename emitAnalysis to recordAnalys. NFC
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly.  Instead it stores it to be
retrieved by the client which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).

llvm-svn: 282561
2016-09-28 00:58:36 +00:00
Adam Nemet c507ac96f5 [Inliner] Port all opt remarks to new streaming API
llvm-svn: 282559
2016-09-27 23:47:03 +00:00
Adam Nemet 04758ba385 Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC
With the new streaming interface, these class names need to be typed a
lot and it's way too looong.

llvm-svn: 282544
2016-09-27 22:19:23 +00:00
Adam Nemet a62b7e1a28 Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282539
2016-09-27 20:55:07 +00:00
Sanjoy Das 237c84540f [SCEV] Replace a struct with a function; NFC
We can do this now thanks to C++11 lambdas.

llvm-svn: 282515
2016-09-27 18:01:48 +00:00
Sanjoy Das a26021414a [SCEV] Use find instead of find_as; NFC
We don't need the extra generality here.

llvm-svn: 282514
2016-09-27 18:01:46 +00:00
Sanjoy Das c220ac79c4 [SCEV] Reduce the scope of a struct; NFC
llvm-svn: 282513
2016-09-27 18:01:44 +00:00
Sanjoy Das c46bceb632 [SCEV] Remove custom RAII wrapper; NFC
Instead use the pre-existing `scope_exit` class.

llvm-svn: 282512
2016-09-27 18:01:42 +00:00
Sanjoy Das db93375711 [SCEV] Make PendingLoopPredicates more frugal; NFCI
I don't expect `PendingLoopPredicates` to have very many
elements (e.g. when -O3'ing the sqlite3 amalgamation,
`PendingLoopPredicates` has at most 3 elements).  So now we use a
`SmallPtrSet` for it instead of the more heavyweight `DenseSet`.

llvm-svn: 282511
2016-09-27 18:01:38 +00:00
Adam Nemet cc2a3fa8e8 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

llvm-svn: 282503
2016-09-27 16:39:24 +00:00
Adam Nemet 92e928c10a Output optimization remarks in YAML
This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282499
2016-09-27 16:15:16 +00:00
Piotr Padlewski d9830eb79f [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

llvm-svn: 282437
2016-09-26 20:37:32 +00:00
Chandler Carruth 68abda52c2 [SCEV] Fix the order of members in the initializer list.
Noticed due to the warning on this line. Sanjoy is on
a less-than-awesome internet connection, so committing on his behalf.

llvm-svn: 282380
2016-09-26 04:49:58 +00:00
Sanjoy Das 5cb11b6423 [SCEV] Assign LoopPropertiesCache in the move constructor
In a previous change I collapsed two different caches into one.  When
doing that I noticed that ScalarEvolution's move constructor was not
moving those caches.

To keep the previous change simple, I've moved that bugfix into this
separate change.

llvm-svn: 282376
2016-09-26 02:44:10 +00:00
Sanjoy Das 5603fc00a6 [SCEV] Combine two predicates into one; NFC
Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking
the loop and maintaining similar sorts of caches.  This commit changes
SCEV to compute both the predicates via a single walk, and maintain a
single cache instead of two.

llvm-svn: 282375
2016-09-26 02:44:07 +00:00
Sanjoy Das 5c4869b39d [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage
Specifically, it moves SCEVUnionPredicates from its input into its own
storage.  Make this obvious at the type level.

llvm-svn: 282374
2016-09-26 01:10:27 +00:00
Sanjoy Das 6b76cdf0d5 [SCEV] Further isolate incidental data structure; NFC
llvm-svn: 282373
2016-09-26 01:10:25 +00:00
Sanjoy Das 7326861abd [SCEV] Simplify BackedgeTakenInfo::getMax; NFC
llvm-svn: 282372
2016-09-26 01:10:22 +00:00
Sanjoy Das e935c77e20 [SCEV] Reserve space in SmallVector; NFC
llvm-svn: 282368
2016-09-25 23:12:08 +00:00