Commit Graph

2265 Commits

Author SHA1 Message Date
Elena Demikhovsky 3ec9e15ad4 Vector of pointers in function attributes calculation
While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers.
I added vector-of-pointers to the call arguments types that should be checked.

Differential Revision: http://reviews.llvm.org/D14693

llvm-svn: 253363
2015-11-17 19:30:51 +00:00
James Molloy d4d2357f26 [GlobalOpt] Address post-commit review comments on r253168
Address Duncan Exon Smith's comments on D14148, which was added after the patch had been LGTM'd and committed:
  * clang-format one area where whitespace diffs occurred.
  * Add a threshold to limit the store/load dominance checks as they are quadratic.

llvm-svn: 253192
2015-11-16 10:16:22 +00:00
Benjamin Kramer 83709b1c1e Move helper classes into anonymous namespaces. NFC.
llvm-svn: 253189
2015-11-16 09:01:28 +00:00
James Molloy 9c7d4d8855 [GlobalOpt] Demote globals to locals more aggressively
Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized.

For a global to be demoted, it must only be accessed by one function and that function:
  1. Must never recurse directly or indirectly, else the GV would be clobbered.
  2. Must never rely on the value in GV at the start of the function (apart from the initializer).

GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once.

In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason.

The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing.

llvm-svn: 253168
2015-11-15 14:21:37 +00:00
James Molloy 33e7345886 [GlobalOpt] Make sure all debug lines end with '\n'
GlobalVariable::print() used to emit a newline. It hasn't for a while now, but these debug lines weren't updated.

llvm-svn: 253030
2015-11-13 11:05:13 +00:00
James Molloy ea31ad3b27 [GlobalOpt] Coding style - remove function names from doxygen comments
Suggested by Mehdi in the review of D14148.

llvm-svn: 253029
2015-11-13 11:05:07 +00:00
Akira Hatanaka 5af7ace4ee Revert r252990.
Some of the buildbots are still failing.

llvm-svn: 252999
2015-11-13 01:44:32 +00:00
Akira Hatanaka c7dfb76fe7 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252949. I've changed the type of FuncName to be
std::string instead of StringRef in emitFnAttrCompatCheck.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 252990
2015-11-13 01:23:11 +00:00
Akira Hatanaka f3aa82f666 Revert r252949.
It broke some of the bots including clang-x64-ninja-win7.

llvm-svn: 252951
2015-11-12 21:19:18 +00:00
Akira Hatanaka 61b81a563a Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 252949
2015-11-12 20:59:43 +00:00
James Molloy 7e9bdd5d01 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

llvm-svn: 252871
2015-11-12 10:55:20 +00:00
James Molloy 9a32da74f7 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

llvm-svn: 252863
2015-11-12 09:05:43 +00:00
James Molloy b14994e752 [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.

llvm-svn: 252862
2015-11-12 08:53:04 +00:00
David Majnemer f0f224d12d [IR] Add support for empty tokens
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token.  Currently, we have no
mechanism to represent an initial token state.

Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like.  This new constant
is called ConstantTokenNone and is written textually as "token none".

Differential Revision: http://reviews.llvm.org/D14581

llvm-svn: 252811
2015-11-11 21:57:16 +00:00
Oliver Stannard c1103398f2 GlobalOpt should maintain externally_initialized when splitting aggregates
When GlobalOpt splits an internal, global variable with an aggregate type, it
should propagate the externally_initialized flag to the newly created globals.

This makes the pass safe for our downstream use of this flag, while still
allowing some useful optimisations (such as removing dead parts of the split
aggregate) to be performed.

Differential Revision: http://reviews.llvm.org/D13382

llvm-svn: 252490
2015-11-09 16:47:16 +00:00
Sanjoy Das 76dd243f99 Unbreak the build
My code clashed with some ilist iterator changes upstream.  Fix by
adding an explicit "&*" coercion.

llvm-svn: 252392
2015-11-07 02:26:53 +00:00
Sanjoy Das ea1df7fe9f [FunctionAttrs] Add comment and clarify assertion message; NFC
llvm-svn: 252389
2015-11-07 01:56:07 +00:00
Sanjoy Das 71fe81fd25 [FunctionAttrs] Add handling for operand bundles
Summary:
Teach the FunctionAttrs to do the right thing for IR with operand
bundles.

Reviewers: reames, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14408

llvm-svn: 252387
2015-11-07 01:56:00 +00:00
Sanjoy Das 436e2397f8 [FunctionAttrs] Fix an iterator wraparound bug
Summary:
This change fixes an iterator wraparound bug in
`determinePointerReadAttrs`.

Ideally, ++'ing off the `end()` of an iplist should result in a failed
assert, but currently iplist seems to silently wrap to the head of the
list on `end()++`.  This is why the bad behavior is difficult to
demonstrate.

Reviewers: chandlerc, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14350

llvm-svn: 252386
2015-11-07 01:55:53 +00:00
Duncan P. N. Exon Smith 83c4b68720 ADT: Remove last implicit ilist iterator conversions, NFC
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress.  This removes them.

I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).

llvm-svn: 252371
2015-11-07 00:01:16 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Sanjoy Das 98bfe26bf8 [FunctionAttrs] Remove a loop, NFC refactor
Summary:
Remove the loop over the uses of the CallSite in ArgumentUsesTracker.
Since we have the `Use *` for actual argument operand, we can just use
pointer subtraction.

The time complexity remains the same though (except for a vararg
argument) -- `std::advance` is O(UseIndex) for the ArgumentList
iterator.

The real motivation is to make a later change adding support for operand
bundles simpler.

Reviewers: reames, chandlerc, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14363

llvm-svn: 252141
2015-11-05 03:04:40 +00:00
Adam Nemet e54a4fa95d LLE 6/6: Add LoopLoadElimination pass
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop.  E.g.:

  for (i)
     A[i + 1] = A[i] + B[i]

  =>

  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T

The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load.  Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.

This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning.  As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here.  Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.

In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads.  I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.

The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop.  Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%).  This gain also transfers over to x86: it's
around 8-10%.

Right now the pass is off by default and can be enabled
with -enable-loop-load-elim.  On the LNT testsuite, there are two
performance changes (negative number -> improvement):

  1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
     critical paths is reduced
  2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
     outside of LNT

The pass is scheduled after the loop vectorizer (which is after loop
distribution).  The rational is to try to reuse LAA state, rather than
recomputing it.  The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.

LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences).  LAA is known to omit loop-independent
dependences in certain situations.  The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.

Reviewers: dberlin, hfinkel

Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13259

llvm-svn: 252017
2015-11-03 23:50:08 +00:00
Teresa Johnson c7ed52f2ba Restore "Support for ThinLTO function importing and symbol linking."
This restores commit r251837, with the new library dependence added to
llvm-link/Makefile to address bot failures.

llvm-svn: 251866
2015-11-03 00:14:15 +00:00
Teresa Johnson 227a923140 Revert "Support for ThinLTO function importing and symbol linking."
This reverts commit r251837, due to a number of bot failures of the form:

/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef,
llvm::LLVMContext&, llvm::Module const*, bool)'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'

I'm not sure why these are happening - I added Object to the requred
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?

llvm-svn: 251841
2015-11-02 22:17:32 +00:00
Teresa Johnson b1d4a39990 Support for ThinLTO function importing and symbol linking.
Summary:
Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.

Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.

Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.

Reviewers: dexonsmith, joker.eph, davidxl

Subscribers: tobiasvk, tejohnson, llvm-commits

Differential Revision: http://reviews.llvm.org/D13515

llvm-svn: 251837
2015-11-02 21:39:10 +00:00
David Blaikie 2297a9142e StringRef-ify DiagnosticInfoSampleProfile::Filename
llvm-svn: 251823
2015-11-02 20:01:13 +00:00
Teresa Johnson f72278f051 Clang format a few prior patches (NFC)
I had clang formatted my earlier patches using the wrong style.
Reformatted with the LLVM style.

llvm-svn: 251812
2015-11-02 18:02:11 +00:00
Diego Novillo f9ed08e16e SamplePGO - Count sample records in embedded profiles when computing coverage.
The initial coverage checking code for sample records failed to count
records inside inlined profiles. This change fixes the oversight.

llvm-svn: 251752
2015-10-31 21:53:58 +00:00
Chandler Carruth cada2d8d1e [FunctionAttrs] Inline the prototype attribute inference to an existing
loop over the SCC.

The separate function wasn't really adding much, NFC.

llvm-svn: 251728
2015-10-31 00:28:37 +00:00
Justin Bogner 21e153748a [PM] Port StripDeadPrototypes to the new pass manager
This is a really straightforward port. Also adds a test for the pass,
since it only seemed to be tested tangentially before.

llvm-svn: 251726
2015-10-30 23:28:12 +00:00
Justin Bogner 48f1f885e3 Whitespace. NFC
llvm-svn: 251724
2015-10-30 23:02:38 +00:00
Chandler Carruth a8125350ea [FunctionAttrs] Separate another chunk of the logic for functionattrs
from its pass harness by providing a lambda to query for AA results.

This allows the legacy pass to easily provide a lambda that uses the
special helpers to construct function AA results from a legacy CGSCC
pass. With the new pass manager (the next patch) the lambda just
directly wraps the intuitive query API.

llvm-svn: 251715
2015-10-30 16:48:08 +00:00
Chandler Carruth c518ebddf3 [FunctionAttrs] Provide a single SCC node set to all of the
transformations in FunctionAttrs rather than building a new one each
time.

This isn't trivial because there are different heuristics from different
passes for exactly what set they want. The primary difference is whether
an *overridable* function completely disables the synthesis of
attributes. I've modeled this by directly testing for overridable, and
using the common set that excludes external and opt-none functions.

This does cause some changes by disabling more optimizations in the face
of opt-none. Specifically, we were still optimizing *calls* to opt-none
functions based on their attributes, just not the bodies. It seems
better to be conservative on both fronts given the intended semanticas
here (best effort to not assume or disturb anything). I've not tried to
test this change as it seems complex, brittle, and not important to the
implicit contract of opt-none. Instead, it seems more like a choice that
should be dictated by the simplified implementation and the change to be
acceptable differences within the space of opt-none.

A big benefit here is that these transformations no longer rely on the
legacy pass manager's SCC types, they just work on generic sets of
function pointers. This will make it easy to re-use their logic in the
new pass manager.

I've also made the transforms static functions instead of members where
trivial while I was touching the signatures.

llvm-svn: 251640
2015-10-29 18:29:15 +00:00
Daniel Jasper 1de905a667 Fix use-after-free. Thanks ASAN for giving me a detailed report :-).
llvm-svn: 251623
2015-10-29 12:49:37 +00:00
Diego Novillo 748b3ffe3b SamplePGO - Add flag to check sampling coverage.
This adds the flag -mllvm -sample-profile-check-coverage=N to the
SampleProfile pass. N is the percent of input sample records that the
user expects to apply.  If the pass does not use N% (or more) of the
sample records in the input, it emits a warning.

This is useful to detect some forms of stale profiles. If the code has
drifted enough from the original profile, there will be records that do
not match the IR anymore.

This will not detect cases where a sample profile record for line L is
referring to some other instructions that also used to be at line L.

llvm-svn: 251568
2015-10-28 22:30:25 +00:00
Diego Novillo a8a3bd2100 SamplePGO - Clear per-function data after applying a profile.
The pass was keeping around a lot of per-function data (visited blocks,
edges, dominance, etc) that is just taking up memory for no reason. In
fact, from function to function it could potentially confuse the
propagator since some maps are indexed by line offsets which can be
common between functions.

llvm-svn: 251531
2015-10-28 17:40:22 +00:00
James Molloy ef607a2089 [GlobalOpt] Add newlines to DEBUG messages
I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable.

NFC.

llvm-svn: 251517
2015-10-28 14:30:53 +00:00
Diego Novillo aa55507ff9 Tidy a comment. NFC.
llvm-svn: 251434
2015-10-27 18:41:46 +00:00
Diego Novillo c04270d2e4 Fix SamplePGO segfault when debug info is missing.
When emitting a remark for a conditional branch annotation, the remark
uses the line location information of the conditional branch in the
message.  In some cases, that information is unavailable and the
optimization would segfaul. I'm still not sure whether this is a bug or
WAI, but the optimizer should not die because of this.

llvm-svn: 251420
2015-10-27 17:37:00 +00:00
Chandler Carruth 69798fb5ec [function-attrs] Refactor code to handle shorter code with early exits.
No functionality changed here, but the indentation is substantially
reduced and IMO the code is much easier to read. I've also added some
helpful comments.

This is just a clean-up I wrote while studying the code, and that has
been in my backlog for a while.

llvm-svn: 251381
2015-10-27 01:41:43 +00:00
Diego Novillo e822b63681 Remove unused local variable. NFC.
llvm-svn: 251344
2015-10-26 20:50:26 +00:00
Diego Novillo 7963ea1996 SamplePGO - Add optimization reports.
This adds a couple of optimization remarks to the SamplePGO
transformation. When it decides to inline a hot function (to mimic the
inline stack and repeat useful inline decisions in the original build).

It will also report branch destinations. For instance, given the code
fragment:

     6      if (i < 1000)
     7        sum -= i;
     8      else
     9        sum += -i * rand();

If the 'else' branch is taken most of the time, building this code with
-Rpass=sample-profile will produce:

a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile]
      sum += -i * rand();
             ^

llvm-svn: 251330
2015-10-26 18:52:53 +00:00
Dehao Chen 100424124b Tolerate negative offset when matching sample profile.
In some cases (as illustrated in the unittest), lineno can be less than the heade_lineno because the function body are included from some other files. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation.

http://reviews.llvm.org/D13914

llvm-svn: 250873
2015-10-21 01:22:27 +00:00
Diego Novillo 38be33302c Sample Profiles - Adjust integer types. Mostly NFC.
This adjusts all integers in the reader/writer to reflect the types
stored on profile files. They should all be unsigned 32-bit or 64-bit
values. Changed all associated internal types to be uint32_t or
uint64_t.

The only place that needed some adjustments is in the sample profile
transformation. Altough the weight read from the profile are 64-bit
values, the internal API for branch weights only accepts 32-bit values.
The pass now saturates weights that overflow uint32_t.

llvm-svn: 250427
2015-10-15 16:36:21 +00:00
Duncan P. N. Exon Smith 1732340bfa IPO: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250187
2015-10-13 17:51:03 +00:00
James Molloy 4015925606 [GlobalsAA] Turn GlobalsAA on again by default
Now that all the known faults with GlobalsAA have been fixed, flip the big switch on -enable-non-lto-gmr again.

Feel free to pester me with any more bugs found, and don't hesitate to flip the switch back off.

llvm-svn: 250157
2015-10-13 10:43:57 +00:00
Oliver Stannard 939724cd02 GlobalOpt does not treat externally_initialized globals correctly
GlobalOpt currently merges stores into the initialisers of internal,
externally_initialized globals, but should not do so as the value of the global
may change between the initialiser and any code in the module being run.

llvm-svn: 250035
2015-10-12 13:20:52 +00:00
Dehao Chen 41dc5a6e86 Make HeaderLineno a local variable.
http://reviews.llvm.org/D13576

As we are using hierarchical profile, there is no need to keep HeaderLineno a member variable. This is because each level of the inline stack will have its own header lineno. One should use the head lineno of its own inline stack level instead of the actual symbol.

llvm-svn: 249848
2015-10-09 16:50:16 +00:00
Hans Wennborg 083ca9bb32 Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D13321

llvm-svn: 249482
2015-10-06 23:24:35 +00:00