Commit Graph

2214 Commits

Author SHA1 Message Date
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
NAKAMURA Takumi d0e13af22c Reformat partially, where I touched for whitespace changes.
llvm-svn: 220773
2014-10-28 11:54:52 +00:00
NAKAMURA Takumi 335a7bcf1e Untabify and whitespace cleanups.
llvm-svn: 220771
2014-10-28 11:53:30 +00:00
Arnold Schwaighofer eb1a38fa73 Add an option to the LTO code generator to disable vectorization during LTO
We used to always vectorize (slp and loop vectorize) in the LTO pass pipeline.

r220345 changed it so that we used the PassManager's fields 'LoopVectorize' and
'SLPVectorize' out of the desire to be able to disable vectorization using the
cl::opt flags 'vectorize-loops'/'slp-vectorize' which the before mentioned
fields default to.
Unfortunately, this turns off vectorization because those fields
default to false.
This commit adds flags to the LTO library to disable lto vectorization which
reconciles the desire to optionally disable vectorization during LTO and
the desired behavior of defaulting to enabled vectorization.

We really want tools to set PassManager flags directly to enable/disable
vectorization and not go the route via cl::opt flags *in*
PassManagerBuilder.cpp.

llvm-svn: 220652
2014-10-26 21:50:58 +00:00
Nick Lewycky 592d84974c If requested, apply function merging at -O0 too. It's useful there to reduce the time to compile.
llvm-svn: 220537
2014-10-23 23:49:31 +00:00
JF Bastien f42a6ea5ac LTO: respect command-line options that disable vectorization.
Summary: Patches 202051 and 208013 added calls to LTO's PassManager which unconditionally add LoopVectorizePass and SLPVectorizerPass instead of following the logic in PassManagerBuilder::populateModulePassManager and honoring the -vectorize-loops -run-slp-after-loop-vectorization flags.

Reviewers: nadav, aschwaighofer, yijiang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5884

llvm-svn: 220345
2014-10-21 23:18:21 +00:00
Chandler Carruth 7b8297a61e Add some optional passes around the vectorizer to both better prepare
the IR going into it and to clean up the IR produced by the vectorizers.

Note that these are *off by default* right now while folks collect data
on whether the performance tradeoff is reasonable.

In a build of the 'opt' binary, I see about 2% compile time regression
due to this change on average. This is in my mind essentially the worst
expected case: very little of the opt binary is going to *benefit* from
these extra passes.

I've seen several benchmarks improve in performance my small amounts due
to running these passes, and there are certain (rare) cases where these
passes make a huge difference by either enabling the vectorizer at all
or by hoisting runtime checks out of the outer loop. My primary
motivation is to prevent people from seeing runtime check overhead in
benchmarks where the existing passes and optimizers would be able to
eliminate that.

I've chosen the sequence of passes based on the kinds of things that
seem likely to be relevant for the code at each stage: rotaing loops for
the vectorizer, finding correlated values, loop invariants, and
unswitching opportunities from any runtime checks, and cleaning up
commonalities exposed by the SLP vectorizer.

I'll be pinging existing threads where some of these issues have come up
and will start new threads to get folks to benchmark and collect data on
whether this is the right tradeoff or we should do something else.

llvm-svn: 219644
2014-10-14 00:31:29 +00:00
David Majnemer ac07703842 Inliner: Non-local functions in COMDATs shouldn't be dropped
A function with discardable linkage cannot be discarded if its a member
of a COMDAT group without considering all the other COMDAT members as
well.  This sort of thing is already handled by GlobalOpt/GlobalDCE.

This fixes PR21206.

llvm-svn: 219335
2014-10-08 19:32:32 +00:00
David Majnemer 1b3b70e371 GlobalOpt: Don't drop unused memberes of a Comdat
A linkonce_odr member of a COMDAT shouldn't be dropped if we need to
keep the entire COMDAT group.

This fixes PR21191.

llvm-svn: 219283
2014-10-08 07:23:31 +00:00
David Blaikie 17364d4e05 DebugInfo+DeadArgElimination: Ensure llvm::Function*s from debug info are updated even when DAE removes both varargs and non-varargs arguments on the same function.
After some stellar (& inspired) help from Reid Kleckner providing a test
case for some rather unstable undefined behavior showing up as
assertions produced by r214761, I was able to fix this issue in DAE
involving the application of both varargs removal, followed by normal
argument removal.

Indeed I introduced this same bug into ArgumentPromotion (r212128) by
copying the code from DAE, and when I fixed the bug in ArgPromo
(r213805) and commented in that patch that I didn't need to address the
same issue in DAE because it was a single pass. Turns out it's two pass,
one for the varargs and one for the normal arguments, so the same fix is
needed (at least during varargs removal). So here it is.

(the observable/net effect of this bug, even when it didn't result in
assertion failure, is that debug info would describe the DAE'd function
in the abstract, but wouldn't provide high/low_pc, variable locations,
line table, etc (it would appear as though the function had been
entirely optimized away), see the original PR14016 for details of the
general problem)

I'm not recommitting the assertion just yet, as there's been another
regression of it since I last tried. It might just be a few test cases
weren't adequately updated after Adrian or Duncan's recent schema
changes.

llvm-svn: 219210
2014-10-07 15:10:23 +00:00
David Majnemer e025321d36 GlobalDCE: Don't drop any COMDAT members
If we require a single member of a comdat, require all of the other
members as well.

This fixes PR20981.

llvm-svn: 219191
2014-10-07 07:07:19 +00:00
David Blaikie e44ee92a3f range-for some loops in DAE
llvm-svn: 219167
2014-10-06 22:59:29 +00:00
Duncan P. N. Exon Smith 176b691d32 Revert "Revert "DI: Fold constant arguments into a single MDString""
This reverts commit r218918, effectively reapplying r218914 after fixing
an Ocaml bindings test and an Asan crash.  The root cause of the latter
was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a
PR to investigate who requires the loose check (and why).

Original commit message follows.

--

This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 219010
2014-10-03 20:01:09 +00:00
Benjamin Kramer e12a6bac32 Eliminate some deep std::vector copies. NFC.
llvm-svn: 218999
2014-10-03 18:33:16 +00:00
Duncan P. N. Exon Smith 786cd049fc Revert "DI: Fold constant arguments into a single MDString"
This reverts commit r218914 while I investigate some bots.

llvm-svn: 218918
2014-10-02 22:15:31 +00:00
Duncan P. N. Exon Smith 571f97bd90 DI: Fold constant arguments into a single MDString
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 218914
2014-10-02 21:56:57 +00:00
Nick Lewycky 9e6d184803 Add control of function merging to the PMBuilder.
llvm-svn: 217731
2014-09-13 21:46:00 +00:00
Rafael Espindola c435adcde0 Add doInitialization/doFinalization to DataLayoutPass.
With this a DataLayoutPass can be reused for multiple modules.

Once we have doInitialization/doFinalization, it doesn't seem necessary to pass
a Module to the constructor.

Overall this change seems in line with the idea of making DataLayout a required
part of Module. With it the only way of having a DataLayout used is to add it
to the Module.

llvm-svn: 217548
2014-09-10 21:27:43 +00:00
Gerolf Hoflehner 008e5cdcba [PassManager] Adding Hidden attribute to EnableMLSM option
llvm-svn: 217539
2014-09-10 20:24:03 +00:00
Gerolf Hoflehner 24815d9b8f [MergedLoadStoreMotion] Move pass enabling option to PassManagerBuilder
llvm-svn: 217538
2014-09-10 19:55:29 +00:00
Stepan Dyatkovskiy fe134cdfa7 MergeFunctions: FunctionPtr has been renamed to FunctionNode.
It's supposed to store additional pass information for current function here.
That was the reason for name change.

llvm-svn: 217483
2014-09-10 10:08:25 +00:00
Hal Finkel d67e463901 Add an AlignmentFromAssumptions Pass
This adds a ScalarEvolution-powered transformation that updates load, store and
memory intrinsic pointer alignments based on invariant((a+q) & b == 0)
expressions. Many of the simple cases we can get with ValueTracking, but we
still need something like this for the more complicated cases (such as those
with an offset) that require some algebra. Note that gcc's
__builtin_assume_aligned's optional third argument provides exactly for this
kind of 'misalignment' offset for which this kind of logic is necessary.

The primary motivation is to fixup alignments for vector loads/stores after
vectorization (and unrolling). This pass is added to the optimization pipeline
just after the SLP vectorizer runs (which, admittedly, does not preserve SE,
although I imagine it could).  Regardless, I actually don't think that the
preservation matters too much in this case: SE computes lazily, and this pass
won't issue any SE queries unless there are any assume intrinsics, so there
should be no real additional cost in the common case (SLP does preserve DT and
LoopInfo).

llvm-svn: 217344
2014-09-07 20:05:11 +00:00
Hal Finkel 74c2f355d2 Add an Assumption-Tracking Pass
This adds an immutable pass, AssumptionTracker, which keeps a cache of
@llvm.assume call instructions within a module. It uses callback value handles
to keep stale functions and intrinsics out of the map, and it relies on any
code that creates new @llvm.assume calls to notify it of the new instructions.
The benefit is that code needing to find @llvm.assume intrinsics can do so
directly, without scanning the function, thus allowing the cost of @llvm.assume
handling to be negligible when none are present.

The current design is intended to be lightweight. We don't keep track of
anything until we need a list of assumptions in some function. The first time
this happens, we scan the function. After that, we add/remove @llvm.assume
calls from the cache in response to registration calls and ValueHandle
callbacks.

There are no new direct test cases for this pass, but because it calls it
validation function upon module finalization, we'll pick up detectable
inconsistencies from the other tests that touch @llvm.assume calls.

This pass will be used by follow-up commits that make use of @llvm.assume.

llvm-svn: 217334
2014-09-07 12:44:26 +00:00
James Molloy 6b95d8ed36 Enable noalias metadata by default and swap the order of the SLP and Loop vectorizers by default.
After some time maturing, hopefully the flags themselves will be removed.

llvm-svn: 217144
2014-09-04 13:23:08 +00:00
Hal Finkel 445dda5c4a Add pass-manager flags to use CFL AA
Add -use-cfl-aa (and -use-cfl-aa-in-codegen) to add CFL AA in the default pass
managers (for easy testing).

llvm-svn: 216978
2014-09-02 22:12:54 +00:00
Hal Finkel 0c083024f0 Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata
This feeds AA through the IFI structure into the inliner so that
AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which
functions only access their arguments (instead of just hard-coding some
knowledge of memory intrinsics). Most of the information is only available from
BasicAA; this is important for preserving alias scoping information for
target-specific intrinsics when doing the noalias parameter attribute to
metadata conversion.

llvm-svn: 216866
2014-09-01 09:01:39 +00:00
Reid Kleckner febb279c9c Don't promote byval pointer arguments when padding matters
Don't promote byval pointer arguments when when their size in bits is
not equal to their alloc size in bits. This can happen for x86_fp80,
where the size in bits is 80 but the alloca size in bits in 128.
Promoting these types can break passing unions of x86_fp80s and other
types.

Patch by Thomas Jablin!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D5057

llvm-svn: 216693
2014-08-28 22:42:00 +00:00
Craig Topper e1d1294853 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Reid Kleckner 3715461b48 musttail: Don't eliminate varargs packs if there is a forwarding call
Also clean up and beef up this grep test for the feature.

llvm-svn: 216425
2014-08-26 00:59:51 +00:00
Reid Kleckner e6e88f99b3 ArgPromotion: Don't touch variadic functions
Adding, removing, or changing non-pack parameters can change the ABI
classification of pack parameters. Clang and other frontends encode the
classification in the IR of the call site, but the callee side
determines it dynamically based on the number of registers consumed so
far. Changing the prototype affects the number of registers consumed
would break such code.

Dead argument elimination performs a similar task and already has a
similar check to avoid this problem.

Patch by Thomas Jablin!

llvm-svn: 216421
2014-08-25 23:58:48 +00:00
Bruno Cardoso Lopes e2a1fa35df Remove dangling initializers in GlobalDCE
GlobalDCE deletes global vars and updates their initializers to nullptr
while leaving underlying constants to be cleaned up later by its uses.
The clean up may never happen, fix this by forcing it every time it's
safe to destroy constants.

Final patch by Rafael Espindola
http://reviews.llvm.org/D4931

<rdar://problem/17523868>

llvm-svn: 216390
2014-08-25 17:51:14 +00:00
Stepan Dyatkovskiy c90308bf83 MergeFunctions, tiny refactoring:
cmpAPFloat has been renamed to cmpAPFloats (multiple form).

llvm-svn: 216376
2014-08-25 08:22:46 +00:00
Stepan Dyatkovskiy 7f895c1184 MergeFunctions, tiny refactoring:
cmpAPInt has been renamed to cmpAPInts (multiple form).

llvm-svn: 216375
2014-08-25 08:19:50 +00:00
Stepan Dyatkovskiy 0b765dee6e MergeFunctions, tiny refactoring:
cmpType has been renamed to cmpTypes (multiple form).

llvm-svn: 216374
2014-08-25 08:16:39 +00:00
Stepan Dyatkovskiy 016daddc52 MergeFunctions, tiny refactoring:
cmpGEP has been renamed to cmpGEPs (multiple form).

llvm-svn: 216373
2014-08-25 08:12:45 +00:00
Craig Topper 4627679cec Use range based for loops to avoid needing to re-mention SmallPtrSet size.
llvm-svn: 216351
2014-08-24 23:23:06 +00:00
Rafael Espindola 7cebf36a95 Move some logic to populateLTOPassManager.
This will avoid code duplication in the next commit which calls it directly
from the gold plugin.

llvm-svn: 216211
2014-08-21 20:03:44 +00:00
Rafael Espindola 216e0c0617 Respect LibraryInfo in populateLTOPassManager and use it. NFC.
llvm-svn: 216203
2014-08-21 18:49:52 +00:00
Rafael Espindola e07caad9e7 Handle inlining in populateLTOPassManager like in populateModulePassManager.
No functionality change.

llvm-svn: 216178
2014-08-21 13:35:30 +00:00
Rafael Espindola 208bc533cd Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder.
llvm-svn: 216174
2014-08-21 13:13:17 +00:00
Craig Topper 71b7b68b74 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 216158
2014-08-21 05:55:13 +00:00
Craig Topper 97ebe53032 Const-correct and prevent a copy of a SmallPtrSet.
llvm-svn: 215973
2014-08-19 07:44:27 +00:00
Craig Topper 6230691c91 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

llvm-svn: 215870
2014-08-18 00:24:38 +00:00
Craig Topper 5229cfd163 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
llvm-svn: 215868
2014-08-17 23:47:00 +00:00
Chandler Carruth 0fb998110a [optnone] Make the optnone attribute effective at suppressing function
attribute and function argument attribute synthesizing and propagating.

As with the other uses of this attribute, the goal remains a best-effort
(no guarantees) attempt to not optimize the function or assume things
about the function when optimizing. This is particularly useful for
compiler testing, bisecting miscompiles, triaging things, etc. I was
hitting specific issues using optnone to isolate test code from a test
driver for my fuzz testing, and this is one step of fixing that.

llvm-svn: 215538
2014-08-13 10:49:33 +00:00
David Majnemer fe8c7540b0 GlobalOpt: Optimize in the face of insertvalue/extractvalue
GlobalOpt didn't know how to simulate InsertValueInst or
ExtractValueInst.  Optimizing these is pretty straightforward.

N.B. This came up when looking at clang's IRGen for MS ABI member
pointers; they are represented as aggregates.

llvm-svn: 215184
2014-08-08 05:50:43 +00:00
James Molloy 568da0990e Add a new option -run-slp-after-loop-vectorization.
This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around.

llvm-svn: 214963
2014-08-06 12:56:19 +00:00
Rafael Espindola f9e52cf015 Don't internalize all but main by default.
This is mostly a cleanup, but it changes a fairly old behavior.

Every "real" LTO user was already disabling the silly internalize pass
and creating the internalize pass itself. The difference with this
patch is for "opt -std-link-opts" and the C api.

Now to get a usable behavior out of opt one doesn't need the funny
looking command line:

opt -internalize -disable-internalize -internalize-public-api-list=foo,bar -std-link-opts

llvm-svn: 214919
2014-08-05 20:10:38 +00:00
Stepan Dyatkovskiy 87c046189d MergeFunctions, tiny refactoring:
cmpOperation has been renamed to cmpOperations (multiple form).

llvm-svn: 214392
2014-07-31 07:16:59 +00:00
Aaron Ballman 573f3b5313 Fixing a few -Woverloaded-virtual warnings by exposing the hidden virtual function as well. No functional changes intended.
llvm-svn: 214325
2014-07-30 19:23:59 +00:00
Rafael Espindola 3cf4af11d5 Add the missing hasLinkOnceODRLinkage predicate.
llvm-svn: 214312
2014-07-30 15:57:51 +00:00
Duncan P. N. Exon Smith 4b4d8ecde1 Move -verify-use-list-order into llvm-uselistorder
Ugh.  Turns out not even transformation passes link in how to read IR.
I sincerely believe the buildbots will finally agree with my system
after this though.  (I don't really understand why all of this has been
working on my system, but not on all the buildbots.)

Create a new tool called llvm-uselistorder to use for verifying use-list
order.  For now, just dump everything from the (now defunct)
-verify-use-list-order pass into the tool.

This might be a better way to test use-list order anyway.

Part of PR5680.

llvm-svn: 213957
2014-07-25 17:13:03 +00:00
Duncan P. N. Exon Smith 20a005f27a Try to fix a layering violation introduced by r213945
The dragonegg buildbot (and others?) started failing after
r213945/r213946 because `llvm-as` wasn't linking in the bitcode reader.
I think moving the verify functions to the same file as the verify pass
should fix the build.  Adding a command-line option for maintaining
use-list order in assembly as a drive-by to prevent warnings about
unused static functions.

llvm-svn: 213947
2014-07-25 15:41:49 +00:00
Duncan P. N. Exon Smith 6b6fdc992a IPO: Add use-list-order verifier
Add a -verify-use-list-order pass, which shuffles use-list order, writes
to bitcode, reads back, and verifies that the (shuffled) order matches.

  - The utility functions live in lib/IR/UseListOrder.cpp.

  - Moved (and renamed) the command-line option to enable writing
    use-lists, so that this pass can return early if the use-list orders
    aren't being serialized.

It's not clear that this pass is the right direction long-term (perhaps
a separate tool instead?), but short-term it's a great way to test the
use-list order prototype.  I've added an XFAIL-ed testcase that I'm
hoping to get working pretty quickly.

This is part of PR5680.

llvm-svn: 213945
2014-07-25 14:49:26 +00:00
Hal Finkel 9414665a3b Add scoped-noalias metadata
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
  1. To preserve noalias function attribute information when inlining
  2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.

Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.

[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.

llvm-svn: 213864
2014-07-24 14:25:39 +00:00
Hal Finkel cc39b67530 AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

llvm-svn: 213859
2014-07-24 12:16:19 +00:00
David Blaikie 8e9cfa5497 ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

llvm-svn: 213805
2014-07-23 22:09:29 +00:00
Duncan P. N. Exon Smith 6c99015fe2 Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build.  I'll reply on the list in a moment.

llvm-svn: 213562
2014-07-21 17:06:51 +00:00
Manuel Jacob d11beffef4 [C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges.
Summary: This patch introduces two new iterator ranges and updates existing code to use it.  No functional change intended.

Test Plan: All tests (make check-all) still pass.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4481

llvm-svn: 213474
2014-07-20 09:10:11 +00:00
Gerolf Hoflehner f27ae6cdcf MergedLoadStoreMotion pass
Merges equivalent loads on both sides of a hammock/diamond
and hoists into into the header.
Merges equivalent stores on both sides of a hammock/diamond
and sinks it to the footer.
Can enable if conversion and tolerate better load misses
and store operand latencies.

llvm-svn: 213396
2014-07-18 19:13:09 +00:00
Stepan Dyatkovskiy dee612d4f6 MergeFunc patch from Björn Steinbrink.
Phabricator ticket: D4246, Don't merge functions with different range metadata on call/invoke.
Thanks!

llvm-svn: 213060
2014-07-15 10:46:51 +00:00
Hal Finkel 2e42c34d05 Allow isDereferenceablePointer to look through some bitcasts
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a small type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.

In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer though
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).

llvm-svn: 212686
2014-07-10 05:27:53 +00:00
Pete Cooper 91e4ba2f88 Revert "GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to."
This reverts commit 5b55a47e94e28fbb56d0cd5d72c3db9105c15b4c.

A test case was found to crash after this was applied.  I'll file a bug to track fixing this with the test case needed.

llvm-svn: 212550
2014-07-08 17:06:03 +00:00
Benjamin Kramer 3c5b126239 GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to.
This is useful for functions that are not actually available externally but
referenced by a vtable of some kind. Clang emits functions like this for the MS
ABI.

PR20182.

llvm-svn: 212337
2014-07-04 12:36:05 +00:00
Gerolf Hoflehner 65b13324e1 Run interprocedural const prop before global optimizer
Exposes more constant globals that can be removed by
the global optimizer. A specific example is the removal
of the static global block address array in
clang/test/CodeGen/indirect-goto.c. This change impacts only
lower optimization levels. With LTO interprocedural
const prop runs already before global opt.

llvm-svn: 212284
2014-07-03 19:28:15 +00:00
David Blaikie a8c3509ffe Constify the Function pointers in the result of makeSubprogramMap
These don't need to be mutable and callers being added soon in CodeGen
won't have access to non-const Module&.

llvm-svn: 212202
2014-07-02 18:30:05 +00:00
David Blaikie e844cd5305 DebugInfo: Keep track of subprograms who's arguments have been promoted.
Matching behavior with DeadArgumentElimination (and leveraging some
now-common infrastructure), keep track of the function from debug info
metadata if arguments are promoted.

This may produce interesting debug info - since the arguments may be
missing or of different types... but at least backtraces, inlining, etc,
will be correct.

llvm-svn: 212128
2014-07-01 21:13:37 +00:00
David Blaikie 6876b3bcff DebugInfo: Provide a utility for building a mapping from llvm::Function*s to llvm::DISubprograms
Update DeadArgumentElimintation to use this, with the intent of reusing
the functionality for ArgumentPromotion as well.

llvm-svn: 212122
2014-07-01 20:05:26 +00:00
David Majnemer 5c92115972 GlobalOpt: Don't swap private for internal linkage
There were transforms whose *intent* was to downgrade the linkage of
external objects to have internal linkage.

However, it fired on things with private linkage as well.

llvm-svn: 212104
2014-07-01 15:26:50 +00:00
David Majnemer 0e2cc2a519 GlobalOpt: Handle non-zero offsets for aliases
An alias with an aliasee of a non-zero GEP is not trivially replacable
with it's aliasee.

llvm-svn: 212079
2014-07-01 00:30:56 +00:00
David Majnemer dad0a645a7 IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

llvm-svn: 211920
2014-06-27 18:19:56 +00:00
David Blaikie b0cdf530c3 ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted.
llvm-svn: 211872
2014-06-27 05:32:09 +00:00
David Majnemer 6098b2f519 GlobalOpt: Don't optimize thread_local for initializers
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic, there is no relocation that
exists to represent such an access.

llvm-svn: 211762
2014-06-26 03:02:19 +00:00
David Majnemer 23fc9afa4d GlobalOpt: Don't optimize dllimport for initializers
Referencing a dllimport variable requires actually instructions, not
just a relocation.  This fixes PR19955.

Differential Revision: http://reviews.llvm.org/D4249

llvm-svn: 211571
2014-06-24 06:53:45 +00:00
Stepan Dyatkovskiy 38afcd70f8 MergeFunctions Pass, removed DenseMap helpers.
Patch removes rest part of code related to old implementation.

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

This one was the final patch.

llvm-svn: 211457
2014-06-22 01:53:30 +00:00
Stepan Dyatkovskiy 471eab30b9 MergeFunctions Pass, updated header comments.
Added short description for new comparison algorithm, that introduces
total ordering among functions set.

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211456
2014-06-22 00:57:09 +00:00
Stepan Dyatkovskiy f4af855930 MergeFunctions Pass, FnSet has been replaced with FnTree.
Patch activates new implementation.
So from now, merging process should take time O(N*log(N)).
Where N size of module (we are free to measure it in
functions or in instructions). Internally FnTree represents
binary tree. So every lookup operation takes O(log(N)) time.

It is still not the last patch in series, we also have to
clean-up pass from old code, and update pass comments.

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211445
2014-06-21 20:54:36 +00:00
Stepan Dyatkovskiy 71038cadd4 MergeFunctions Pass, removed unused methods from old implementation.
Patch removed next old FunctionComparator methods:
    * enumerate
    * isEquivalentOperation
    * isEquivalentGEP
    * isEquivalentType 
    
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211444
2014-06-21 20:13:24 +00:00
Stepan Dyatkovskiy 0b58801b69 MergeFunctions, doSanityCheck: fixed body comments.
llvm-svn: 211443
2014-06-21 19:07:51 +00:00
Stepan Dyatkovskiy a77f3d8587 MergeFunctions Pass, introduced sanity check, that checks order relation,
introduced among functions set.
    
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211442
2014-06-21 18:58:11 +00:00
Stepan Dyatkovskiy 17ee5ac20d MergeFunctions Pass, introduced total ordering among top-level comparison
methods.
    
Patch changes return type of FunctionComparator::compare() and
FunctionComparator::compare(const BasicBlock*, const BasicBlock*)
methods from bool (equal or not) to {-1, 0, 1} (less, equal, great).
    
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211437
2014-06-21 17:55:51 +00:00
Stepan Dyatkovskiy 6baeb8805c Commited patch from Björn Steinbrink:
Summary:
Different range metadata can lead to different optimizations in later
passes, possibly breaking the semantics of the merged function. So range
metadata must be taken into consideration when comparing Load
instructions.

Thanks!

llvm-svn: 211391
2014-06-20 19:11:56 +00:00
Tim Northover 420a216817 IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.

As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.

At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.

By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.

Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.

Summary for out of tree users:
------------------------------

+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.

llvm-svn: 210903
2014-06-13 14:24:07 +00:00
Craig Topper 66f09ad041 [C++11] Use 'nullptr'.
llvm-svn: 210442
2014-06-08 22:29:17 +00:00
Tom Roeder 544d1c22be Removing spurious dependency of IPO on JumpInstrTables
llvm-svn: 210281
2014-06-05 19:43:57 +00:00
Tom Roeder 44cb65fff1 Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.

This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.

llvm-svn: 210280
2014-06-05 19:29:43 +00:00
Nick Lewycky 59633cb478 When analyzing params/args for readnone/readonly, don't forget to consider that a pointer argument may be passed through a callsite to the return, and that we may need to analyze it. Fixes a bug reported on llvm-dev: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073098.html
llvm-svn: 209870
2014-05-30 02:31:27 +00:00
Michael J. Spencer 289067cc3d Add LoadCombine pass.
This pass is disabled by default. Use -combine-loads to enable in -O[1-3]

Differential revision: http://reviews.llvm.org/D3580

llvm-svn: 209791
2014-05-29 01:55:07 +00:00
Peter Collingbourne 0a4376190f Add an extension point for peephole optimizers.
This extension point allows adding passes that perform peephole optimizations
similar to the instruction combiner. These passes will be inserted after
each instance of the instruction combiner pass.

Differential Revision: http://reviews.llvm.org/D3905

llvm-svn: 209595
2014-05-25 10:27:02 +00:00
Diego Novillo 7f8af8bf91 Add support for missed and analysis optimization remarks.
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.

-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.

-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.

The patch also:

1- Adds support in the inliner for the two new remarks and a
   test case.

2- Moves emitOptimizationRemark* functions to the llvm namespace.

3- Adds an LLVMContext argument instead of making them member functions
   of LLVMContext.

Reviewers: qcolombet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3682

llvm-svn: 209442
2014-05-22 14:19:46 +00:00
Peter Collingbourne 68a889757d Check the alwaysinline attribute on the call as well as on the caller.
Differential Revision: http://reviews.llvm.org/D3815

llvm-svn: 209150
2014-05-19 18:25:54 +00:00
Rafael Espindola f1bedd3747 Use create methods since msvc doesn't handle delegating constructors.
llvm-svn: 209076
2014-05-17 21:29:57 +00:00
Rafael Espindola 8370565820 Reduce abuse of default values in the GlobalAlias constructor.
This is in preparation for adding an optional offset.

llvm-svn: 209073
2014-05-17 19:57:46 +00:00
Rafael Espindola 6b238633b7 Fix most of PR10367.
This patch changes the design of GlobalAlias so that it doesn't take a
ConstantExpr anymore. It now points directly to a GlobalObject, but its type is
independent of the aliasee type.

To avoid changing all alias related tests in this patches, I kept the common
syntax

@foo = alias i32* @bar

to mean the same as now. The cases that used to use cast now use the more
general syntax

@foo = alias i16, i32* @bar.

Note that GlobalAlias now behaves a bit more like GlobalVariable. We
know that its type is always a pointer, so we omit the '*'.

For the bitcode, a nice surprise is that we were writing both identical types
already, so the format change is minimal. Auto upgrade is handled by looking
through the casts and no new fields are needed for now. New bitcode will
simply have different types for Alias and Aliasee.

One last interesting point in the patch is that replaceAllUsesWith becomes
smart enough to avoid putting a ConstantExpr in the aliasee. This seems better
than checking and updating every caller.

A followup patch will delete getAliasedGlobal now that it is redundant. Another
patch will add support for an explicit offset.

llvm-svn: 209007
2014-05-16 19:35:39 +00:00
Rafael Espindola 4fe0094fd1 Change the GlobalAlias constructor to look a bit more like GlobalVariable.
This is part of the fix for pr10367. A GlobalAlias always has a pointer type,
so just have the constructor build the type.

llvm-svn: 208983
2014-05-16 13:34:04 +00:00
Stepan Dyatkovskiy 948366ac0b MergeFunctions Pass, introduced total ordering among GEP operations.
Patch replaces old isEquivalentGEP implementation, and changes type of
comparison result from bool (equal or not) to {-1, 0, 1} (less, equal, greater).

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 208976
2014-05-16 11:55:02 +00:00
Stepan Dyatkovskiy fa6820a035 MergeFunctions Pass, introduced total ordering among operations.
Patch replaces old isEquivalentOperation implementation, and changes type of
comparison result from bool (equal or not) to {-1, 0, 1} (less, equal, greater).

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 208973
2014-05-16 11:02:22 +00:00
Stepan Dyatkovskiy 5c2cc2506d MergeFunctions Pass, introduced total ordering among function attributes.
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 208953
2014-05-16 08:55:34 +00:00
Duncan P. N. Exon Smith e60adfdbd0 GlobalValue: Assert symbols with local linkage have default visibility
The change to ExtractGV.cpp has no functionality change except to avoid
the asserts.  Existing testcases already cover this, so I didn't add a
new one.

llvm-svn: 208264
2014-05-07 23:00:22 +00:00
Stepan Dyatkovskiy cfd641f123 MergeFunctions Pass, introduced total ordering among values.
This is a third patch of patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

This patch description:
Being comparing functions we need to compare values we meet at left and
right sides.
Its easy to sort things out for external values. It just should be
the same value at left and right.
But for local values (those were introduced inside function body)
we have to ensure they were introduced at exactly the same place,
and plays the same role.

In short, patch introduces values serial numbering and comparison routine.
The last one compares two values by their serial numbers.

llvm-svn: 208189
2014-05-07 11:11:39 +00:00
Stepan Dyatkovskiy d103130ee0 Second patch of patch series that improves MergeFunctions performance time from O(N*N) to
O(N*log(N)). The idea is to introduce total ordering among functions set.
It allows to build binary tree and perform function look-up procedure in O(log(N)) time. 

This patch description:
Introduced total ordering among constants implemented in cmpConstants method.
Method performs lexicographical comparison between constants represented as
hypothetical numbers of next format:
<bitcastability-trait><raw-bit-contents>

Please, read cmpConstants declaration comments for more details.

llvm-svn: 208173
2014-05-07 09:05:10 +00:00
Richard Smith c167d656e7 Re-commit r208025, reverted in r208030, with a fix for a conformance issue
which GCC detects and Clang does not!

llvm-svn: 208033
2014-05-06 01:44:26 +00:00
Richard Smith 09bf116939 Revert r208025, which made buildbots unhappy for unknown reasons.
llvm-svn: 208030
2014-05-06 01:26:00 +00:00
Richard Smith 6cf1d744d8 Add llvm::function_ref (and a couple of uses of it), representing a type-erased reference to a callable object.
llvm-svn: 208025
2014-05-06 01:01:29 +00:00
Yi Jiang 79eb0aa8cb Reapply: Add slp vectorization to LTO passes. The bug it exposed has been fixed by r207983. <radar://16641956>
llvm-svn: 208013
2014-05-05 23:14:46 +00:00
Duncan P. N. Exon Smith 1789fb6493 LTO: -internalize sets visibility to default
Visibility is meaningless when the linkage is local.  Change
`-internalize` to reset the visibility to `default`.

<rdar://problem/16141113>

llvm-svn: 207979
2014-05-05 17:40:44 +00:00
Benjamin Kramer 64425fe875 SLPVectorizer: Lazily allocate the map for block numbering.
There is no point in creating it if we're not going to vectorize
anything. Creating the map is expensive as it creates large values.
No functionality change.

llvm-svn: 207916
2014-05-03 15:50:37 +00:00
Nico Weber 4b2acde21a Teach GlobalDCE how to remove empty global_ctor entries.
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.

GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.

Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.

llvm-svn: 207856
2014-05-02 18:35:25 +00:00
Yi Jiang e2d5f29c2f Revert r207571 - Add slp vectorization to LTO passes
llvm-svn: 207693
2014-04-30 19:27:24 +00:00
Carlo Kok 307625c974 [IPO/MergeFunctions] changes so it doesn't try to bitcast a struct return type but instead recreates it with insert/extract value.
llvm-svn: 207679
2014-04-30 17:53:04 +00:00
Benjamin Kramer bf2368d94b Add a <tuple> include to more files that aren't getting it transitively on MSVC.
llvm-svn: 207617
2014-04-30 07:21:01 +00:00
Yi Jiang 4e234aa790 Add slp vectorization to LTO passes
llvm-svn: 207571
2014-04-29 19:35:39 +00:00
Duncan P. N. Exon Smith d2b2facb07 SCC: Change clients to use const, NFC
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actual does it, so I'm going to remove the ability in a
subsequent commit.  First, update the users.

<rdar://problem/14292693>

llvm-svn: 207252
2014-04-25 18:24:50 +00:00
Manman Ren 3c44067a30 [inline cold threshold] Command line argument for inline threshold will
override the default cold threshold.

When we use command line argument to set the inline threshold, the default
cold threshold will not be used. This is in line with how we use
OptSizeThreshold. When we want a higher threshold for all functions, we
do not have to set both inline threshold and cold threshold.

llvm-svn: 207245
2014-04-25 17:34:55 +00:00
Craig Topper f40110f4d8 [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Matt Arsenault fcd7401bbf Don't use default address space arguments in GlobalOpt
llvm-svn: 207019
2014-04-23 20:36:10 +00:00
Chandler Carruth 964daaaf19 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
David Blaikie bc44220eb8 Use unique_ptr to handle GlobalOpt's Evaluator members
llvm-svn: 206790
2014-04-21 20:49:36 +00:00
David Blaikie eb038915ab Simplify expression that was explicitly naming an operator overload in a call.
llvm-svn: 206788
2014-04-21 20:43:51 +00:00
Duncan P. N. Exon Smith 49f3ec80c2 PMBuilder: Expose an option to disable tail calls
Adds API to allow frontends to disable tail calls in PassManagerBuilder.

<rdar://problem/16050591>

llvm-svn: 206542
2014-04-18 01:05:15 +00:00
NAKAMURA Takumi cd1fc4bc1b Inliner::OptimizationRemark: Fix crash in clang/test/Frontend/optimization-remark.c on some hosts, including --vg.
DebugLoc in Callsite would not live after Inliner. It should be copied before Inliner.

llvm-svn: 206459
2014-04-17 12:22:14 +00:00
Duncan P. N. Exon Smith 2b69189c9c LTO: Add more loop simplification passes to LTO
Similar to r202051, add missing loop simplification passes to the LTO
optimization pipeline.

Patch by Rafael Espindola.

llvm-svn: 206306
2014-04-15 17:48:15 +00:00
Diego Novillo a9298b2297 Add support for optimization reports.
Summary:
This patch adds backend support for -Rpass=, which indicates the name
of the optimization pass that should emit remarks stating when it
made a transformation to the code.

Pass names are taken from their DEBUG_NAME definitions.

When emitting an optimization report diagnostic, the lack of debug
information causes the diagnostic to use "<unknown>:0:0" as the
location string.

This is the back end counterpart for

http://llvm-reviews.chandlerc.com/D3226

Reviewers: qcolombet

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3227

llvm-svn: 205774
2014-04-08 16:42:34 +00:00
Duncan P. N. Exon Smith 4680f40d28 Revert "Reapply "LTO: add API to set strategy for -internalize""
This reverts commit r199244.

Conflicts:
	include/llvm-c/lto.h
	include/llvm/LTO/LTOCodeGenerator.h
	lib/LTO/LTOCodeGenerator.cpp

llvm-svn: 205471
2014-04-02 22:05:57 +00:00
Hal Finkel 86b3064f2b Move partial/runtime unrolling late in the pipeline
The generic (concatenation) loop unroller is currently placed early in the
standard optimization pipeline. This is a good place to perform full unrolling,
but not the right place to perform partial/runtime unrolling. However, most
targets don't enable partial/runtime unrolling, so this never mattered.

However, even some x86 cores benefit from partial/runtime unrolling of very
small loops, and follow-up commits will enable this. First, we need to move
partial/runtime unrolling late in the optimization pipeline (importantly, this
is after SLP and loop vectorization, as vectorization can drastically change
the size of a loop), while keeping the full unrolling where it is now. This
change does just that.

llvm-svn: 205264
2014-03-31 23:23:51 +00:00
Rafael Espindola 5e66a7e699 Add a missing break.
Patch by Tobias Güntner.

I tried to write a test, but the only difference is the Changed value that
gets returned. It can be tested with "opt -debug-pass=Executions -functionattrs,
but that doesn't seem worth it.

llvm-svn: 205121
2014-03-30 03:26:17 +00:00
Lang Hames 459b5dc39e Revert r204076 for now - it caused significant regressions in a number of
benchmarks.

<rdar://problem/16368461>

llvm-svn: 204558
2014-03-23 04:22:31 +00:00
Alon Mishne ad312155a6 [C++11] Change DebugInfoFinder to use range-based loops
Also changes the iterators to return actual DI type over MDNode.

llvm-svn: 204130
2014-03-18 09:41:07 +00:00
Dan Gohman 172c5d3451 Use range metadata instead of introducing selects.
When GlobalOpt has determined that a GlobalVariable only ever has two values,
it would convert the GlobalVariable to a boolean, and introduce SelectInsts
at every load, to choose between the two possible values. These SelectInsts
introduce overhead and other unpleasantness.

This patch makes GlobalOpt just add range metadata to loads from such
GlobalVariables instead. This enables the same main optimization (as seen in
test/Transforms/GlobalOpt/integer-bool.ll), without introducing selects.

The main downside is that it doesn't get the memory savings of shrinking such
GlobalVariables, but this is expected to be negligible.

llvm-svn: 204076
2014-03-17 19:57:04 +00:00
Stepan Dyatkovskiy a53cf970a1 MergeFunctions, cmpType: fixed variable names from XXTy1 and XXTy2 to XXTyL and XXTyR.
llvm-svn: 203907
2014-03-14 08:48:52 +00:00
Stepan Dyatkovskiy 90c4436962 MergeFunctions, cmpType: Fixed comments wrapping.
llvm-svn: 203905
2014-03-14 08:17:19 +00:00
Stepan Dyatkovskiy d8eb0bcb5b First patch of patch series that improves MergeFunctions performance time from O(N*N) to
O(N*log(N)). The idea is to introduce total ordering among functions set.
That allows to build binary tree and perform function look-up procedure in O(log(N)) time. 

This patch description:
Introduced total ordering among Type instances. Actually it is improvement for existing
isEquivalentType.
0. Coerce pointer of 0 address space to integer.
1. If left and right types are equal (the same Type* value), return 0 (means equal).
2. If types are of different kind (different type IDs). Return result of type IDs
comparison, treating them as numbers.
3. If types are vectors or integers, return result of its
pointers comparison (casted to numbers).
4. Check whether type ID belongs to the next group: 
* Void 
* Float 
* Double 
* X86_FP80 
* FP128 
* PPC_FP128 
* Label 
* Metadata 
If so, return 0.
5. If left and right are pointers, return result of address space
comparison (numbers comparison).
6. If types are complex.
Then both LEFT and RIGHT will be expanded and their element types will be checked with
the same way. If we get Res != 0 on some stage, return it. Otherwise return 0.
7. For all other cases put llvm_unreachable.

llvm-svn: 203788
2014-03-13 11:54:50 +00:00
Eli Bendersky 95b540f221 Revive SizeOptLevel-explaining comments that were dropped in r203669
llvm-svn: 203675
2014-03-12 16:44:17 +00:00
Eli Bendersky 49f6565267 Move duplicated code into a helper function (exposed through overload).
There's a bit of duplicated "magic" code in opt.cpp and Clang's CodeGen that
computes the inliner threshold from opt level and size opt level.

This patch moves the code to a function that lives alongside the inliner itself,
providing a convenient overload to the inliner creation.

A separate patch can be committed to Clang to use this once it's committed to
LLVM. Standalone tools that use the inlining pass can also avoid duplicating
this code and fearing it will go out of sync.

Note: this patch also restructures the conditinal logic of the computation to
be cleaner.

llvm-svn: 203669
2014-03-12 16:12:36 +00:00
Tim Northover e94a518a22 IR: add a second ordering operand to cmpxhg for failure
The syntax for "cmpxchg" should now look something like:

	cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).

rdar://problem/15996804

llvm-svn: 203559
2014-03-11 10:48:52 +00:00
Chandler Carruth cdf4788401 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Benjamin Kramer adf1ea8227 [C++11] Revert uses of lambdas with array_pod_sort.
Looks like GCC implements the lambda->function pointer conversion differently.

llvm-svn: 203294
2014-03-07 21:52:38 +00:00
Benjamin Kramer b0f74b24fa [C++11] Convert sort predicates into lambdas.
No functionality change.

llvm-svn: 203288
2014-03-07 21:35:39 +00:00
Chandler Carruth 9a4c9e597b [Layering] Move DebugInfo.h into the IR library where its implementation
already lives.

llvm-svn: 203046
2014-03-06 00:46:21 +00:00
Chandler Carruth 12664a0b17 [Layering] Move DIBuilder.h into the IR library where its implementation
already lives.

llvm-svn: 203038
2014-03-06 00:22:06 +00:00
Chandler Carruth 64e9aa5c93 [C++11] Make this interface accept const Use pointers and use override
to ensure we don't mess up any of the overrides. Necessary for cleaning
up the Value use iterators and enabling range-based traversing of use
lists.

llvm-svn: 202958
2014-03-05 10:21:48 +00:00
Craig Topper 3e4c697ca1 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202953
2014-03-05 09:10:37 +00:00
Chandler Carruth 1305dc3351 [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth 4220e9c154 [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth 219b89b987 [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth 03eb0de93d [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Chandler Carruth 8394857f43 [Modules] Move InstIterator out of the Support library, where it had no
business.

This header includes Function and BasicBlock and directly uses the
interfaces of both classes. It has to do with the IR, it even has that
in the name. =] Put it in the library it belongs to.

This is one step toward making LLVM's Support library survive a C++
modules bootstrap.

llvm-svn: 202814
2014-03-04 10:30:26 +00:00
Benjamin Kramer b2f034b85e [C++11] Use std::tie to simplify compare operators.
No functionality change.

llvm-svn: 202751
2014-03-03 19:58:30 +00:00
Benjamin Kramer b6d0bd48bd [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Reid Kleckner e6ff5c51e6 Reflow isProfitableToMakeFastCC
llvm-svn: 202555
2014-02-28 22:50:08 +00:00
Reid Kleckner 22869378d9 GlobalOpt: Apply fastcc to internal x86_thiscallcc functions
We should apply fastcc whenever profitable.  We can expand this list,
but there are lots of conventions with performance implications that we
don't want to change.

Differential Revision: http://llvm-reviews.chandlerc.com/D2705

llvm-svn: 202293
2014-02-26 19:57:30 +00:00
Rafael Espindola 935125126c Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

llvm-svn: 202168
2014-02-25 17:30:31 +00:00
Rafael Espindola 43b5a51e7c Make a few more DataLayout variables const.
llvm-svn: 202155
2014-02-25 14:24:11 +00:00
Rafael Espindola aeff8a9c05 Make some DataLayout pointers const.
No functionality change. Just reduces the noise of an upcoming patch.

llvm-svn: 202087
2014-02-24 23:12:18 +00:00
Arnold Schwaighofer 6ccda923e5 LTO: Add the loop vectorizer to the LTO pipeline.
During the LTO phase LICM will move loop invariant global variables out of loops
(informed by GlobalModRef). This makes more loops countable presenting
opportunity for the loop vectorizer.

Adding the loop vectorizer improves some TSVC benchmarks and twolf/ref dataset
(5%) on x86-64.

radar://15970632

llvm-svn: 202051
2014-02-24 18:19:31 +00:00
Rafael Espindola 612886fc8c Rename a few more DataLayout variables.
llvm-svn: 201833
2014-02-21 01:53:35 +00:00
Rafael Espindola 37dc9e19f5 Rename many DataLayout variables from TD to DL.
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.

llvm-svn: 201827
2014-02-21 00:06:31 +00:00
Reid Kleckner 22b19da9fc GlobalOpt: Aliases don't have sections, don't copy them when replacing
As defined in LangRef, aliases do not have sections.  However, LLVM's
GlobalAlias class inherits from GlobalValue, which means we can read and
set its section.  We should probably ban that as a separate change,
since it doesn't make much sense for an alias to have a section that
differs from its aliasee.

Fixes PR18757, where the section was being lost on the global in code
from Clang like:

extern "C" {
__attribute__((used, section("CUSTOM"))) static int in_custom_section;
}

Reviewers: rafael.espindola

Differential Revision: http://llvm-reviews.chandlerc.com/D2758

llvm-svn: 201286
2014-02-13 02:18:36 +00:00
Manman Ren d461244972 Set default of inlinecold-threshold to 225.
225 is the default value of inline-threshold. This change will make sure
we have the same inlining behavior as prior to r200886.

As Chandler points out, even though we don't have code in our testing
suite that uses cold attribute, there are larger applications that do
use cold attribute.

r200886 + this commit intend to keep the same behavior as prior to r200886.
We can later on tune the inlinecold-threshold.

The main purpose of r200886 is to help performance of instrumentation based
PGO before we actually hook up inliner with analysis passes such as BPI and BFI.
For instrumentation based PGO, we try to increase inlining of hot functions and
reduce inlining of cold functions by setting inlinecold-threshold.

Another option suggested by Chandler is to use a boolean flag that controls
if we should use OptSizeThreshold for cold functions. The default value
of the boolean flag should not change the current behavior. But it gives us
less freedom in controlling inlining of cold functions.

llvm-svn: 200898
2014-02-06 01:59:22 +00:00
Paul Robinson af4e64d095 Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.

llvm-svn: 200892
2014-02-06 00:07:05 +00:00
Manman Ren e8781b1a36 Inliner uses a smaller inline threshold for callees with cold attribute.
Added command line option inlinecold-threshold to set threshold for inlining
functions with cold attribute. Listen to the cold attribute when it would
decrease the inline threshold.

llvm-svn: 200886
2014-02-05 22:53:44 +00:00
Duncan P. N. Exon Smith 8e661efc00 cleanup: scc_iterator consumers should use isAtEnd
No functional change.  Updated loops from:

    for (I = scc_begin(), E = scc_end(); I != E; ++I)

to:

    for (I = scc_begin(); !I.isAtEnd(); ++I)

for teh win.

llvm-svn: 200789
2014-02-04 19:19:07 +00:00
Reid Kleckner d47a59a4f8 inalloca: Don't remove dead arguments in the presence of inalloca args
It disturbs the layout of the parameters in memory and registers,
leading to problems in the backend.

The plan for optimizing internal inalloca functions going forward is to
essentially SROA the argument memory and demote any captured arguments
(things that aren't trivially written by a load or store) to an indirect
pointer to a static alloca.

llvm-svn: 200717
2014-02-03 20:42:49 +00:00
Reid Kleckner 26af2cae05 Update optimization passes to handle inalloca arguments
Summary:
I searched Transforms/ and Analysis/ for 'ByVal' and updated those call
sites to check for inalloca if appropriate.

I added tests for any change that would allow an optimization to fire on
inalloca.

Reviewers: nlewycky

Differential Revision: http://llvm-reviews.chandlerc.com/D2449

llvm-svn: 200281
2014-01-28 02:38:36 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Rafael Espindola 2a05ea5c0e Remove tail marker when changing an argument to an alloca.
Argument promotion can replace an argument of a call with an alloca. This
requires clearing the tail marker as it is very likely that the callee is now
using an alloca in the caller.

This fixes pr14710.

llvm-svn: 199909
2014-01-23 17:19:42 +00:00
Matt Arsenault e55a2c2e6b Make nocapture analysis work with addrspacecast
llvm-svn: 199246
2014-01-14 19:11:52 +00:00
Duncan P. N. Exon Smith 93be7c4fb3 Reapply "LTO: add API to set strategy for -internalize"
Reapply r199191, reverted in r199197 because it carelessly broke
Other/link-opts.ll.  The problem was that calling
createInternalizePass("main") would select
createInternalizePass(bool("main")) instead of
createInternalizePass(ArrayRef<const char *>("main")).  This commit
fixes the bug.

The original commit message follows.

Add API to LTOCodeGenerator to specify a strategy for the -internalize
pass.

This is a new attempt at Bill's change in r185882, which he reverted in
r188029 due to problems with the gold linker.  This puts the onus on the
linker to decide whether (and what) to internalize.

In particular, running internalize before outputting an object file may
change a 'weak' symbol into an internal one, even though that symbol
could be needed by an external object file --- e.g., with arclite.

This patch enables three strategies:

- LTO_INTERNALIZE_FULL: the default (and the old behaviour).
- LTO_INTERNALIZE_NONE: skip -internalize.
- LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden
  visibility.

LTO_INTERNALIZE_FULL should be used when linking an executable.

Outputting an object file (e.g., via ld -r) is more complicated, and
depends on whether hidden symbols should be internalized.  E.g., for
ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and
LTO_INTERNALIZE_HIDDEN can be used otherwise.  However,
LTO_INTERNALIZE_FULL is inappropriate, since the output object file will
eventually need to link with others.

lto_codegen_set_internalize_strategy() sets the strategy for subsequent
calls to lto_codegen_write_merged_modules() and lto_codegen_compile*().

<rdar://problem/14334895>

llvm-svn: 199244
2014-01-14 18:52:17 +00:00
Nico Rieck 7157bb765e Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199218
2014-01-14 15:22:47 +00:00
Nico Rieck 9d2e0df049 Revert "Decouple dllexport/dllimport from linkage"
Revert this for now until I fix an issue in Clang with it.

This reverts commit r199204.

llvm-svn: 199207
2014-01-14 12:38:32 +00:00
Nico Rieck e43aaf7967 Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199204
2014-01-14 11:55:03 +00:00
NAKAMURA Takumi 23c0ab53b2 Revert r199191, "LTO: add API to set strategy for -internalize"
Please update also Other/link-opts.ll, in next time.

llvm-svn: 199197
2014-01-14 09:40:18 +00:00
Duncan P. N. Exon Smith 43ea3478bf LTO: add API to set strategy for -internalize
Add API to LTOCodeGenerator to specify a strategy for the -internalize
pass.

This is a new attempt at Bill's change in r185882, which he reverted in
r188029 due to problems with the gold linker.  This puts the onus on the
linker to decide whether (and what) to internalize.

In particular, running internalize before outputting an object file may
change a 'weak' symbol into an internal one, even though that symbol
could be needed by an external object file --- e.g., with arclite.

This patch enables three strategies:

- LTO_INTERNALIZE_FULL: the default (and the old behaviour).
- LTO_INTERNALIZE_NONE: skip -internalize.
- LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden
  visibility.

LTO_INTERNALIZE_FULL should be used when linking an executable.

Outputting an object file (e.g., via ld -r) is more complicated, and
depends on whether hidden symbols should be internalized.  E.g., for
ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and
LTO_INTERNALIZE_HIDDEN can be used otherwise.  However,
LTO_INTERNALIZE_FULL is inappropriate, since the output object file will
eventually need to link with others.

lto_codegen_set_internalize_strategy() sets the strategy for subsequent
calls to lto_codegen_write_merged_modules() and lto_codegen_compile*().

<rdar://problem/14334895>

llvm-svn: 199191
2014-01-14 06:37:26 +00:00
Chandler Carruth 73523021d0 [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

llvm-svn: 199104
2014-01-13 13:07:17 +00:00
Chandler Carruth 5ad5f15cff [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082
2014-01-13 09:26:24 +00:00
Chandler Carruth 8a8cd2bab9 Re-sort all of the includes with ./utils/sort_includes.py so that
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.

Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.

llvm-svn: 198685
2014-01-07 11:48:04 +00:00
Matt Arsenault 461c8e0a8c Delete unread globals through addrspacecast
llvm-svn: 198346
2014-01-02 20:01:43 +00:00
Matt Arsenault da1deabb16 Fix addrspacecast with metadata globals
llvm-svn: 198345
2014-01-02 19:53:49 +00:00
Hal Finkel f59fd7dcb4 Fix a use-after-free error in GlobalOpt CleanupConstantGlobalUsers
GlobalOpt's CleanupConstantGlobalUsers function uses a worklist array to manage
constant users to be visited. The pointers in this array need to be weak
handles because when we delete a constant array, we may also be holding a
pointer to one of its elements (or an element of one of its elements if we're
dealing with an array of arrays) in the worklist.

Fixes PR17347.

llvm-svn: 197178
2013-12-12 20:45:24 +00:00
Hal Finkel 26fc4c29c6 Initialize the barrier pass llvm::initializeIPO
The barrier pass is a temporary hack, and should go away soon. Nevertheless, if
we don't initialize it, then opt will not understand -barrier, and this will
break bugpoint (because when it dumps the passes from the default pass manager
-barrier will be there).

llvm-svn: 197177
2013-12-12 20:45:08 +00:00
NAKAMURA Takumi 8bc9bfaa5a Prune redundant dependencies in LLVMBuild.txt.
llvm-svn: 196988
2013-12-11 00:30:57 +00:00
Renato Golin 729a3ae90a Add #pragma vectorize enable/disable to LLVM
The intended behaviour is to force vectorization on the presence
of the flag (either turn on or off), and to continue the behaviour
as expected in its absence. Tests were added to make sure the all
cases are covered in opt. No tests were added in other tools with
the assumption that they should use the PassManagerBuilder in the
same way.

This patch also removes the outdated -late-vectorize flag, which was
on by default and not helping much.

The pragma metadata is being attached to the same place as other loop
metadata, but nothing forbids one from attaching it to a function
(to enable #pragma optimize) or basic blocks (to hint the basic-block
vectorizers), etc. The logic should be the same all around.

Patches to Clang to produce the metadata will be produced after the
initial implementation is agreed upon and committed. Patches to other
vectorizers (such as SLP and BB) will be added once we're happy with
the pass manager changes.

llvm-svn: 196537
2013-12-05 21:20:02 +00:00
Alp Toker f907b891da Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Yunzhong Gao 9163e8bce6 Teach the internalize pass to skip dllexported symbols because they could be
referenced in a way that even the linker does not see.

Differential Revision: http://llvm-reviews.chandlerc.com/D2280

llvm-svn: 196300
2013-12-03 18:05:14 +00:00
Stepan Dyatkovskiy abb8505dc5 PR17925 bugfix.
Short description.

This issue is about case of treating pointers as integers.
We treat pointers as different if they references different address space.
At the same time, we treat pointers equal to integers (with machine address
width). It was a point of false-positive. Consider next case on 32bit machine:

void foo0(i32 addrespace(1)* %p)
void foo1(i32 addrespace(2)* %p)
void foo2(i32 %p)

foo0 != foo1, while
foo1 == foo2 and foo0 == foo2.

As you can see it breaks transitivity. That means that result depends on order
of how functions are presented in module. Next order causes merging of foo0
and foo1: foo2, foo0, foo1
First foo0 will be merged with foo2, foo0 will be erased. Second foo1 will be
merged with foo2.
Depending on order, things could be merged we don't expect to.

The fix:
Forbid to treat any pointer as integer, except for those, who belong to address space 0.

llvm-svn: 195769
2013-11-26 16:11:03 +00:00
Chandler Carruth 6378cf539f [PM] Split the CallGraph out from the ModulePass which creates the
CallGraph.

This makes the CallGraph a totally generic analysis object that is the
container for the graph data structure and the primary interface for
querying and manipulating it. The pass logic is separated into its own
class. For compatibility reasons, the pass provides wrapper methods for
most of the methods on CallGraph -- they all just forward.

This will allow the new pass manager infrastructure to provide its own
analysis pass that constructs the same CallGraph object and makes it
available. The idea is that in the new pass manager, the analysis pass's
'run' method returns a concrete analysis 'result'. Here, that result is
a 'CallGraph'. The 'run' method will typically do only minimal work,
deferring much of the work into the implementation of the result object
in order to be lazy about computing things, but when (like DomTree)
there is *some* up-front computation, the analysis does it prior to
handing the result back to the querying pass.

I know some of this is fairly ugly. I'm happy to change it around if
folks can suggest a cleaner interim state, but there is going to be some
amount of unavoidable ugliness during the transition period. The good
thing is that this is very limited and will naturally go away when the
old pass infrastructure goes away. It won't hang around to bother us
later.

Next up is the initial new-PM-style call graph analysis. =]

llvm-svn: 195722
2013-11-26 04:19:30 +00:00
Manman Ren cb14bbcc48 Debug Info: move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp.
We can share the implementation between StripSymbols and dropping debug info
for metadata versions that do not match.

Also update the comments to match the implementation. A follow-on patch will
drop the "Debug Info Version" module flag in StripDebugInfo.

llvm-svn: 195505
2013-11-22 22:06:31 +00:00
Rafael Espindola 6597992c69 Add a fixed version of r195470 back.
The fix is simply to use CurI instead of I when handling aliases to
avoid accessing a invalid iterator.

original message:

Convert linkonce* to weak* instead of strong.

Also refactor the logic into a helper function. This is an important improve
on mingw where the linker complains about mixed weak and strong symbols.
Converting to weak ensures that the symbol is not dropped, but keeps in a
comdat, making the linker happy.

llvm-svn: 195477
2013-11-22 17:58:12 +00:00
Rafael Espindola 77aa674cc4 Revert "Convert linkonce* to weak* instead of strong."
This reverts commit r195470.
Debugging failure in some bots.

llvm-svn: 195472
2013-11-22 17:09:34 +00:00
Rafael Espindola 5574032575 Convert linkonce* to weak* instead of strong.
Also refactor the logic into a helper function. This is an important improvement
on mingw where the linker complains about mixed weak and strong symbols.
Converting to weak ensures that the symbol is not dropped, but keeps in a
comdat, making the linker happy.

llvm-svn: 195470
2013-11-22 16:14:30 +00:00
Hal Finkel 29aeb20518 Add a loop rerolling flag to the PassManagerBuilder
This adds a boolean member variable to the PassManagerBuilder to control loop
rerolling (just like we have for unrolling and the various vectorization
options). This is necessary for control by the frontend. Loop rerolling remains
disabled by default at all optimization levels.

llvm-svn: 194966
2013-11-17 16:02:50 +00:00
Hal Finkel bf45efde2d Add a loop rerolling pass
This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The
transformation aims to take loops like this:

for (int i = 0; i < 3200; i += 5) {
  a[i]     += alpha * b[i];
  a[i + 1] += alpha * b[i + 1];
  a[i + 2] += alpha * b[i + 2];
  a[i + 3] += alpha * b[i + 3];
  a[i + 4] += alpha * b[i + 4];
}

and turn them into this:

for (int i = 0; i < 3200; ++i) {
  a[i] += alpha * b[i];
}

and loops like this:

for (int i = 0; i < 500; ++i) {
  x[3*i] = foo(0);
  x[3*i+1] = foo(0);
  x[3*i+2] = foo(0);
}

and turn them into this:

for (int i = 0; i < 1500; ++i) {
  x[i] = foo(0);
}

There are two motivations for this transformation:

  1. Code-size reduction (especially relevant, obviously, when compiling for
code size).

  2. Providing greater choice to the loop vectorizer (and generic unroller) to
choose the unrolling factor (and a better ability to vectorize). The loop
vectorizer can take vector lengths and register pressure into account when
choosing an unrolling factor, for example, and a pre-unrolled loop limits that
choice. This is especially problematic if the manual unrolling was optimized
for a machine different from the current target.

The current implementation is limited to single basic-block loops only. The
rerolling recognition should work regardless of how the loop iterations are
intermixed within the loop body (subject to dependency and side-effect
constraints), but the significant restriction is that the order of the
instructions in each iteration must be identical. This seems sufficient to
capture all current use cases.

This pass is not currently enabled by default at any optimization level.

llvm-svn: 194939
2013-11-16 23:59:05 +00:00
Manman Ren bc37658a7f ArgumentPromotion: correctly transfer TBAA tags and alignments.
We used to use std::map<IndicesVector, LoadInst*> for OriginalLoads, and when we
try to promote two arguments, they will both write to OriginalLoads causing
created loads for the two arguments to have the same original load. And the same
tbaa tag and alignment will be put to the created loads for the two arguments.

The fix is to use std::map<std::pair<Argument*, IndicesVector>, LoadInst*>
for OriginalLoads, so each Argument will write to different parts of the map.

PR17906

llvm-svn: 194846
2013-11-15 20:41:15 +00:00
Rafael Espindola dd8757abbc Corruptly merge constants with explicit and implicit alignments.
Constant merge can merge a constant with implicit alignment with one that has
explicit alignment. Before this change it was assuming that the explicit
alignment was higher than the implicit one, causing the result to be under
aligned in some cases.

Fixes pr17815.

Patch by Chris Smowton!

llvm-svn: 194506
2013-11-12 20:21:43 +00:00
Matt Arsenault 5bcefabcda Teach MergeFunctions about address spaces
llvm-svn: 194342
2013-11-10 01:44:37 +00:00
Shuxin Yang d1382b6c31 Remove dead code
llvm-svn: 194017
2013-11-04 21:44:01 +00:00
David Majnemer 927df85de0 Spell "Actual" correctly
llvm-svn: 193954
2013-11-03 11:09:39 +00:00
Rafael Espindola 282a47037b Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list".
There are two ways one could implement hiding of linkonce_odr symbols in LTO:
* LLVM tells the linker which symbols can be hidden if not used from native
  files.
* The linker tells LLVM which symbols are not used from other object files,
  but will be put in the dso symbol table if present.

GOLD's API is the second option. It was implemented almost 1:1 in llvm by
passing the list down to internalize.

LLVM already had partial support for the first option. It is also very similar
to how ld64 handles hiding these symbols when *not* doing LTO.

This patch then
* removes the APIs for the DSO list.
* marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr
  global values and other linkonce_odr whose address is not used.
* makes the gold plugin responsible for handling the API mismatch.

llvm-svn: 193800
2013-10-31 20:51:58 +00:00
Rafael Espindola 6554e5a94d Merge CallGraph and BasicCallGraph.
llvm-svn: 193734
2013-10-31 03:03:55 +00:00
Shuxin Yang 2e1890e18b Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.
llvm-svn: 193489
2013-10-27 03:08:44 +00:00
Shuxin Yang e4fb375995 Use address-taken to disambiguate global variable and indirect memops.
Major steps include:
 1). introduces a not-addr-taken bit-field in GlobalVariable
 2). GlobalOpt pass sets "not-address-taken" if it proves a global varirable 
    dosen't have its address taken.
 3). AA use this info for disambiguation. 

llvm-svn: 193251
2013-10-23 17:28:19 +00:00
Eric Christopher 874fa0f6c7 Fix spelling, grammar, and match naming convention for test files.
llvm-svn: 193130
2013-10-21 23:14:06 +00:00
Matt Arsenault 404c60a7c3 Use more type helper functions
llvm-svn: 193109
2013-10-21 19:43:56 +00:00
Rafael Espindola 3d7fc25c7c Optimize more linkonce_odr values during LTO.
When a linkonce_odr value that is on the dso list is not unnamed_addr
we can still look to see if anything is actually using its address. If
not, it is safe to hide it.

This patch implements that by moving GlobalStatus to Transforms/Utils
and using it in Internalize.

llvm-svn: 193090
2013-10-21 17:14:55 +00:00
Nadav Rotem 7f27e0b0ce Mark some command line flags as hidden
llvm-svn: 193013
2013-10-18 23:38:13 +00:00
Rafael Espindola 045a78fa7e Rename fields of GlobalStatus to match the coding style.
llvm-svn: 192910
2013-10-17 18:18:52 +00:00
Rafael Espindola 27797baee7 rename SafeToDestroyConstant to isSafeToDestroyConstant and clang-format.
llvm-svn: 192907
2013-10-17 18:06:32 +00:00
Rafael Espindola 026c9cbefe Simplify the interface of AnalyzeGlobal a bit and rename to analyzeGlobal.
No functionality change.

llvm-svn: 192906
2013-10-17 18:00:25 +00:00
Shuxin Yang 1cab418ce2 Fix a bug in Dead Argument Elimination.
If a function seen at compile time is not necessarily the one linked to
the binary being built, it is illegal to change the actual arguments
passing to it. 

  e.g. 
   --------------------------
   void foo(int lol) {
     // foo() has linkage satisifying isWeakForLinker()
     // "lol" is not used at all.
   }

   void bar(int lo2) {
      // xform to foo(undef) is illegal, as compiler dose not know which
      // instance of foo() will be linked to the the binary being built.
      foo(lol2); 
   }
  -----------------------------

  Such functions can be captured by isWeakForLinker(). NOTE that
mayBeOverridden() is insufficient for this purpose as it dosen't include
linkage types like AvailableExternallyLinkage and LinkOnceODRLinkage.
Take link_odr* as an example, it indicates a set of *EQUIVALENT* globals
that can be merged at link-time. However, the semantic of 
*EQUIVALENT*-functions includes parameters. Changing parameters breaks
the assumption.

  Thank John McCall for help, especially for the explanation of subtle
difference between linkage types.

  rdar://11546243

llvm-svn: 192302
2013-10-09 17:21:44 +00:00
Alexey Samsonov a1944e6d26 Revert r191834 until we measure the effect of this benchmarks and maybe find a better way to fix it
llvm-svn: 192121
2013-10-07 19:03:24 +00:00
Rafael Espindola cda2911caa Optimize linkonce_odr unnamed_addr functions during LTO.
Generalize the API so we can distinguish symbols that are needed just for a DSO
symbol table from those that are used from some native .o.

The symbols that are only wanted for the dso symbol table can be dropped if
llvm can prove every other dso has a copy (linkonce_odr) and the address is not
important (unnamed_addr).

llvm-svn: 191922
2013-10-03 18:29:09 +00:00
Alexey Samsonov 31540172d0 Remove "localize global" optimization
Summary:
As discussed in http://llvm-reviews.chandlerc.com/D1754,
this optimization isn't really valid for C, and fires too rarely anyway.

Reviewers: rafael, nicholas

Reviewed By: nicholas

CC: rnk, llvm-commits, nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D1769

llvm-svn: 191834
2013-10-02 15:31:34 +00:00
Matt Arsenault 517d84e268 Don't merge tiny functions.
It's silly to merge functions like these:

define void @foo(i32 %x) {
  ret void
}

define void @bar(i32 %x) {
  ret void
}

to get

define void @bar(i32) {
  tail call void @foo(i32 %0)
  ret void
}

llvm-svn: 191786
2013-10-01 18:05:30 +00:00
Benjamin Kramer 8817cca5ce Provide basic type safety for array_pod_sort comparators.
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.

llvm-svn: 191175
2013-09-22 14:09:50 +00:00
Stepan Dyatkovskiy dc2c4b4462 Bugfix for PR17099:
Wrong cast operation.
MergeFunctions emits Bitcast instead of pointer-to-integer operation.
Patch fixes MergeFunctions::writeThunk function. It replaces
unconditional Bitcast creation with "Value* createCast(...)" method, that
checks operand types and selects proper instruction.
See unit-test as example.

llvm-svn: 190859
2013-09-17 09:36:11 +00:00
Peter Collingbourne 3fa50f9b05 Implement function prefix data as an IR feature.
Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html

Differential Revision: http://llvm-reviews.chandlerc.com/D1191

llvm-svn: 190773
2013-09-16 01:08:15 +00:00
Duncan Sands c9e95ad0db Avoid a compiler warning about Found not being used when assertions are
disabled.

llvm-svn: 190668
2013-09-13 08:16:06 +00:00
Matt Arsenault d3471e9ea8 Use type form of getIntPtrType
This doesn't change anything since malloc always returns
address space 0.

llvm-svn: 190498
2013-09-11 07:29:40 +00:00
Eli Friedman 33d3700716 Don't shrink atomic ops to bool in GlobalOpt.
LLVM IR doesn't currently allow atomic bool load/store operations, and the
transformation is dubious anyway because it isn't profitable on all platforms.

PR17163.

llvm-svn: 190357
2013-09-09 22:00:13 +00:00
Rafael Espindola d21ac19bda Remove unused argument.
llvm-svn: 190090
2013-09-05 19:15:21 +00:00
Nick Lewycky 2c88067a46 Declare missing dependency on AliasAnalysis. Patch by Liu Xin!
llvm-svn: 190035
2013-09-05 08:19:58 +00:00
Rafael Espindola b7c0b4a327 Rename some variables to match the style guide.
I am about to patch this code, and this makes the diff far more readable.

llvm-svn: 189982
2013-09-04 20:08:46 +00:00
Rafael Espindola b832d49822 Small simplification given that insert of an empty range is a nop.
llvm-svn: 189971
2013-09-04 18:53:21 +00:00
Rafael Espindola 49a6c153c9 Refactor duplicated logic to a helper function.
No functionality change.

llvm-svn: 189969
2013-09-04 18:37:36 +00:00
Rafael Espindola 9406516af1 Remove dead code.
llvm-svn: 189967
2013-09-04 18:16:02 +00:00
Rafael Espindola 128c5ea902 Revert "Add r159136 back now that pr13124 has been fixed."
This reverts commit r189886.

I found a corner case where this optimization is not valid:

Say we have a "linkonce_odr unnamed_addr" in two translation units:
* In TU 1 this optimization kicks in and makes it hidden.
* In TU 2 it gets const merged with a constant that is *not* unnamed_addr,
  resulting in a non unnamed_addr constant with default visibility.
* The static linker rules for combining visibility them produce a hidden
  symbol, which is incorrect from the point of view of the non unnamed_addr
  constant.

The one place we can do this is when we know that the symbol is not used from
another TU in the same shared object, i.e., during LTO. I will move it there.

llvm-svn: 189954
2013-09-04 16:09:01 +00:00
Rafael Espindola 5eb7df68bf Add r159136 back now that pr13124 has been fixed.
Original message:
If a constant or a function has linkonce_odr linkage and unnamed_addr, mark
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 189886
2013-09-03 23:34:36 +00:00
Nadav Rotem 5d78dba6d9 Enable late-vectorization by default.
This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is ran.

Perf gains:
SingleSource/Benchmarks/Shootout/matrix -37.33%
MultiSource/Benchmarks/PAQ8p/paq8p  -22.83%
SingleSource/Benchmarks/Linpack/linpack-pc  -16.22%
SingleSource/Benchmarks/Shootout-C++/ary3 -15.16%
MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34%
MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12%

Regressions:
SingleSource/Benchmarks/Misc/lowercase  15.10%
MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18%
SingleSource/Benchmarks/Shootout-C++/matrix 8.27%
SingleSource/Benchmarks/CoyoteBench/lpbench 7.30%

llvm-svn: 189858
2013-09-03 21:33:17 +00:00
Bill Wendling 2865be79f8 Compulsive reformatting.
llvm-svn: 189697
2013-08-30 21:07:33 +00:00
Bill Wendling 4c0d9adecb Random cleanup: No need to use a std::vector here, since createInternalizePass uses an ArrayRef.
llvm-svn: 189632
2013-08-30 00:48:37 +00:00
Nadav Rotem 4c459bcd47 Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:
1. They are a kind of cannonicalization.
2. The performance measurements show that it is better to keep them in.

There should be no functional change if you are not enabling the LateVectorization mode.

llvm-svn: 189539
2013-08-28 23:40:29 +00:00
Hal Finkel 6d09904cc9 Disable unrolling in the loop vectorizer when disabled in the pass manager
When unrolling is disabled in the pass manager, the loop vectorizer should also
not unroll loops. This will allow the -fno-unroll-loops option in Clang to
behave as expected (even for vectorizable loops). The loop vectorizer's
-force-vector-unroll option will (continue to) override the pass-manager
setting (including -force-vector-unroll=0 to force use of the internal
auto-selection logic).

In order to test this, I added a flag to opt (-disable-loop-unrolling) to force
disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also,
this fixes a small bug in opt where the loop vectorizer was enabled only after
the pass manager populated the queue of passes (the global_alias.ll test needed
a slight update to the RUN line as a result of this fix).

llvm-svn: 189499
2013-08-28 18:33:10 +00:00
Michael Gottesman eab9a7fa7c Fixed typo.
Noticed by Stephen Checkoway <s@pahtak.org>.

llvm-svn: 189312
2013-08-27 04:43:03 +00:00
Michael Gottesman 823aaffd37 Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale to the point of not working and more resilient to debug info changes.
The current version of StripDeadDebugInfo became stale and no longer actually
worked since it was expecting an older version of debug info.

This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as
much as possible to make it more redundent to such changes. Additionally, the
only place where that was avoided (the code where we replace the old sets with
the new), I call verify on the DIContextUnit implying that if the format changes
and my live set changes no longer make sense an assert will be hit. In order to
ensure that that occurs I have included a test case.

The actual stripping of the dead debug info follows the same strategy as was
used before in this class: find the live set and replace the old set in the
given compile unit (which may contain dead global variables/functions) with the
new live one.

llvm-svn: 189078
2013-08-23 00:23:24 +00:00
Michael Gottesman 0dc00645a2 Fixed typo.
llvm-svn: 188957
2013-08-21 22:53:54 +00:00
Michael Gottesman 0900993c3c Removed trailing whitespace.
llvm-svn: 188956
2013-08-21 22:53:29 +00:00
Arnold Schwaighofer 124ccf3ad1 Also remove logic in LateVectorize
llvm-svn: 188285
2013-08-13 16:12:04 +00:00
Arnold Schwaighofer c14b59d1a1 Remove logic that decides whether to vectorize or not depending on O-levels
I have moved this logic into clang and opt.

llvm-svn: 188281
2013-08-13 15:51:25 +00:00
Bill Wendling e1eaecd528 Move stack protector names to the same place.
llvm-svn: 188198
2013-08-12 20:09:37 +00:00
Tom Stellard aa664d9b92 Factor FlattenCFG out from SimplifyCFG
Patch by: Mei Ye

llvm-svn: 187764
2013-08-06 02:43:45 +00:00
Nadav Rotem e4e6e9ed47 Move the optlevel check to the frontend.
llvm-svn: 187628
2013-08-01 22:41:58 +00:00
Nadav Rotem 9153b3871d Only enable SLP-vectorization on O3 builds.
llvm-svn: 187595
2013-08-01 18:28:15 +00:00
Tom Stellard 8b1e021e85 SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions
Merge consecutive if-regions if they contain identical statements.
Both transformations reduce number of branches.  The transformation
is guarded by a target-hook, and is currently enabled only for +R600,
but the correctness has been tested on X86 target using a variety of
CPU benchmarks.

Patch by: Mei Ye

llvm-svn: 187278
2013-07-27 00:01:07 +00:00
Rafael Espindola 17600e29fa Respect llvm.used in Internalize.
The language reference says that:

"If a symbol appears in the @llvm.used list, then the compiler,
assembler, and linker are required to treat the symbol as if there is
a reference to the symbol that it cannot see"

Since even the linker cannot see the reference, we must assume that
the reference can be using the symbol table. For example, a user can add
__attribute__((used)) to a debug helper function like dump and use it from
a debugger.

llvm-svn: 187103
2013-07-25 03:23:25 +00:00
Nick Lewycky 5b15037fc9 Check that TD isn't NULL before dereferencing it down this path.
llvm-svn: 187099
2013-07-25 02:55:14 +00:00
Rafael Espindola ec2375fb51 Make these methods const correct.
Thanks to Nick Lewycky for noticing it.

llvm-svn: 187098
2013-07-25 02:50:08 +00:00
Rafael Espindola c2bb73fc8d Don't crash when llvm.compiler.used becomes empty.
GlobalOpt simplifies llvm.compiler.used by removing any members that are also
in the more strict llvm.used. Handle the special case where llvm.compiler.used
becomes empty.

llvm-svn: 186778
2013-07-20 23:33:15 +00:00
Rafael Espindola 9aadcc4c0e s/compiler_used/compiler.used/.
We were incorrectly using compiler_used instead of compiler.used. Unfortunately
the passes using the broken name had tests also using the broken name.

llvm-svn: 186705
2013-07-19 18:44:51 +00:00
Nick Lewycky 03f3d34ffb Clean up some of this code a tiny bit, no functionality change.
llvm-svn: 186622
2013-07-18 22:32:32 +00:00
Hal Finkel ec7cd26968 Fix comparisons of alloca alignment in inliner merging
Duncan pointed out a mistake in my fix in r186425 when only one of the allocas
being compared had the target-default alignment. This is essentially his
suggested solution. Thanks!

llvm-svn: 186510
2013-07-17 14:32:41 +00:00