Commit Graph

3397 Commits

Author SHA1 Message Date
Vedant Kumar e0b5f86b30 [STLExtras] Add distance() for ranges, pred_size(), and succ_size()
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.

Differential Revision: https://reviews.llvm.org/D46668

llvm-svn: 332057
2018-05-10 23:01:54 +00:00
Teresa Johnson 59da890c96 [NewPM] Emit inliner NoDefinition missed optimization remark
Summary: Makes this consistent with the old PM.

Reviewers: eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D46526

llvm-svn: 331709
2018-05-08 01:45:46 +00:00
Dmitry Mikulin 738bac77c1 Remove explicit setting of the CFI jumptable section name, it does not appear
to be needed: jump table sections are created with .cfi.jumptable suffix. With
this change each jump table is placed in a separate section, which allows the
linker to re-order them.

Differential Revision: https://reviews.llvm.org/D46537

llvm-svn: 331680
2018-05-07 21:30:15 +00:00
Peter Collingbourne e04ecc88de LowerTypeTests: Fix non-determinism in code that handles icall branch funnels.
This was exposed by enabling expensive checks, which causes llvm::sort
to sort randomly.

Differential Revision: https://reviews.llvm.org/D45901

llvm-svn: 331573
2018-05-05 00:51:55 +00:00
Adrian Prantl 4dfcc4a788 Remove @brief commands from doxygen comments, too.
This is a follow-up to r331272.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by
  for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done

https://reviews.llvm.org/D46290

llvm-svn: 331275
2018-05-01 16:10:38 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Adrian Prantl 210a29de7b Fix a bug in GlobalOpt's handling of DIExpressions.
This patch adds support for fragment expressions
TryToShrinkGlobalToBoolean() which were previously just dropped.

Thanks to Reid Kleckner for providing me a reproducer!

llvm-svn: 331086
2018-04-27 21:41:36 +00:00
Eli Friedman e06539456c [LowerTypeTests] Mark .cfi.jumptable nounwind.
It doesn't unwind, and the wrong marking leads to the creation of an
.eh_frame section when it isn't necessary.

Differential Revision: https://reviews.llvm.org/D46082

llvm-svn: 331008
2018-04-27 00:32:24 +00:00
Vlad Tsyrklevich b768d235a9 Revert "Enable EliminateAvailableExternally pass for -O1"
This reverts commit r330961 because it breaks a handful of clang tests.

llvm-svn: 330964
2018-04-26 17:54:53 +00:00
Vlad Tsyrklevich 42c5a9c29a Enable EliminateAvailableExternally pass for -O1
Summary:
Follow-up to D43690, the EliminateAvailableExternally pass currently
runs under -O0 and -O2 and up. Under -O1 we would still want to drop
available_externally symbols to reduce space without inlining having
run.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D46093

llvm-svn: 330961
2018-04-26 17:33:24 +00:00
David Blaikie ba47dd16c5 Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h)
llvm-svn: 330726
2018-04-24 15:40:07 +00:00
David Blaikie a27771b62f InstCombine: Fix layering by not including Scalar.h in InstCombine
(notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be
included by InstCombine which doesn't/shouldn't need to depend on
ScalarOpts)

llvm-svn: 330669
2018-04-24 00:48:59 +00:00
Sean Fertile 18f17333dd [PartialInlining] Fix Crash from holding a reference to a destructed ORE.
The callback used to create an ORE for the legacy PI pass caches the allocated
object in a unique_ptr in the runOnModule function, and returns a reference to
that object. Under certian circumstances we can end up holding onto that
reference after the OREs destruction. Rather then allowing the new and legacy
passes to create ORE object in diffrent ways, create the ORE at the point of
use.

Differential Revision: https://reviews.llvm.org/D43219

llvm-svn: 330473
2018-04-20 19:56:26 +00:00
Vlad Tsyrklevich 230b256783 LowerTypeTests: Propagate symver directives
Summary:
This change fixes https://crbug.com/834474, a build failure caused by
LowerTypeTests not preserving .symver symbol versioning directives for
exported functions. Emit symver information to ThinLTO summary data and
then propagate symver directives for exported functions to the merged
module.

Emitting symver information to the summaries increases the size of
intermediate build artifacts for a Chromium build by less than 0.2%.

Reviewers: pcc

Reviewed By: pcc

Subscribers: tejohnson, mehdi_amini, eraman, llvm-commits, eugenis, kcc

Differential Revision: https://reviews.llvm.org/D45798

llvm-svn: 330387
2018-04-20 01:36:48 +00:00
Mandeep Singh Grang 636d94db3b [Transforms] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: kcc, pcc, danielcdh, jmolloy, sanjoy, dberlin, ruiu

Reviewed By: ruiu

Subscribers: ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D45142

llvm-svn: 330059
2018-04-13 19:47:57 +00:00
Eli Friedman e1938cbc87 Don't call skipModule for CFI lowering passes.
opt-bisect shouldn't skip these passes; they lower intrinsics which
no other pass can handle.

llvm-svn: 329961
2018-04-12 22:04:11 +00:00
George Burgess IV 48ee59b6f0 [DeadArgElim] Remove allocsize attributes on callsites
We're already removing allocsize attributes from Functions that we
remove args from, since removing arguments from a function may make the
allocsize attribute incorrect. It appears we forgot to also remove them
from callsites.

Without this, I get verifier errors on `@Test2`.

It probably wouldn't be too hard to make DAE properly update allocsize
attributes instead of dropping them, but I can't think of a scenario
where that'd be useful in practice.

llvm-svn: 329868
2018-04-12 02:06:01 +00:00
Vitaly Buka 9cb59b92cc Fix warning by cl::opt<int> -> cl::opt<unsigned>
llvm-svn: 329461
2018-04-06 21:41:17 +00:00
Vitaly Buka 66f53d71f7 Runtime flag to control branch funnel threshold
Reviewers: pcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45193

llvm-svn: 329459
2018-04-06 21:32:36 +00:00
Benjamin Kramer 1fc0da4849 Make helpers static. NFC.
llvm-svn: 329170
2018-04-04 11:45:11 +00:00
Vlad Tsyrklevich 07cf78cdad Fix bad copy-and-paste in r329108
llvm-svn: 329118
2018-04-03 21:40:27 +00:00
Vlad Tsyrklevich d17f61ea3b Add the ShadowCallStack attribute
Summary:
Introduce the ShadowCallStack function attribute. It's added to
functions compiled with -fsanitize=shadow-call-stack in order to mark
functions to be instrumented by a ShadowCallStack pass to be submitted
in a separate change.

Reviewers: pcc, kcc, kubamracek

Reviewed By: pcc, kcc

Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44800

llvm-svn: 329108
2018-04-03 20:10:40 +00:00
Rong Xu 5a8d4c3357 [DeadArgumentElim] Clone function level metadatas
Some Function level metadatas, such as function entry count, are not cloned in
DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim
after internalization.

This patch clones the metadatas in the original function to the new function.

Differential Revision: https://reviews.llvm.org/D44127

llvm-svn: 328991
2018-04-02 17:27:38 +00:00
Teresa Johnson 974706ebf7 [ThinLTO] Add an import cutoff for debugging/triaging
Summary:
Adds -import-cutoff=N which will stop importing during the thin link
after N imports. Default is -1 (no  limit).

Reviewers: wmi

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D45127

llvm-svn: 328934
2018-04-01 15:54:40 +00:00
David Blaikie a373d18eb7 Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.

llvm-svn: 328717
2018-03-28 17:44:36 +00:00
Benjamin Kramer 8840f644b4 [DeadArgElim] Strip allocsize attributes when deleting an argument.
Since allocsize refers to the argument number it gets invalidated when
an argument is removed and the numbers shift.

llvm-svn: 328481
2018-03-26 09:44:24 +00:00
Fedor Sergeev 6660fd0f95 [PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass
Summary:
This was motivated by absence of PrunEH functionality in new PM.
It was decided that a proper way to do PruneEH is to add NoUnwind inference
into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.

This change generalizes attribute handling implemented for (a removal of)
Convergent attribute, by introducing a generic builder-like class
   AttributeInferer

It registers all the attribute inference requests, storing per-attribute
predicates into a vector, and then goes through an SCC Node, scanning all
the instructions for not breaking attribute assumptions.

The main idea is that as soon all the instructions from all the functions
of SCC Node conform to attribute assumptions then we are free to infer
the attribute as set for all the functions of SCC Node.

It handles two distinct cases of attributes:
   - those that might break due to derefinement of the function code

     for these attributes we are allowed to apply inference only if all the
     functions are "exact definitions". Example - NoUnwind.

   - those that do not care about derefinement

     for these attributes we are allowed to apply inference as soon as we see
     any function definition. Example - removal of Convergent attribute.

Also in this commit:
* Converted all the FunctionAttrs tests to use FileCheck and added new-PM
  invocations to them

* FunctionAttrs/convergent.ll test demonstrates a difference in behavior between
   new and old PM implementations. Marked with FIXME.

* PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg
  combo as intended

* some of "other" tests were updated since function-attrs now infers 'nounwind'
  even for old PM pipeline

* -disable-nounwind-inference hidden option added as a possible workaround for a supposedly
  rare case when nounwind being inferred by default presents a problem

Reviewers: chandlerc, jlebar

Reviewed By: jlebar

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D44415

llvm-svn: 328377
2018-03-23 21:46:16 +00:00
David Blaikie 301627f875 Move SampleProfile.h into IPO along with the rest of the IPO pass headers
llvm-svn: 328262
2018-03-22 22:42:44 +00:00
David Blaikie 376294c23a Finish moving the IPSCCP pass from Scalar to IPO - moving the registration
llvm-svn: 328259
2018-03-22 22:07:53 +00:00
David Blaikie 3bbf5af0ac Fix layering between SCCP and IPO SCCP
Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but
this meant Transforms/Scalar including Transfroms/IPO headers, creating
a circular dependency. (IPO depends on Scalar already) - so move the IPO
SCCP shims out into IPO and the basic library implementation accessible
from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation.

llvm-svn: 328250
2018-03-22 21:41:29 +00:00
David Blaikie 2965a01e98 Move the initialization of the Meta Renamer pass over to IPO along with the rest of it that was moved in r328209
llvm-svn: 328234
2018-03-22 19:36:54 +00:00
Matt Morehouse 236cdaf84c [SimplifyCFG] Create attribute for fuzzing-specific optimizations.
Summary:
When building with libFuzzer, converting control flow to selects or
obscuring the original operands of CMPs reduces the effectiveness of
libFuzzer's heuristics.

This patch provides an attribute to disable or modify certain optimizations
for optimal fuzzing signal.

Provides a less aggressive alternative to https://reviews.llvm.org/D44057.

Reviewers: vitalybuka, davide, arsenm, hfinkel

Reviewed By: vitalybuka

Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44232

llvm-svn: 328214
2018-03-22 17:07:51 +00:00
David Blaikie 0368417595 Move MetaRenamer from Transforms/UTils to Transforms/IPO since it implements part of IPO.h
llvm-svn: 328209
2018-03-22 15:57:47 +00:00
David Blaikie 2be3922807 Fix a couple of layering violations in Transforms
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.

Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.

Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.

llvm-svn: 328165
2018-03-21 22:34:23 +00:00
Oren Ben Simhon fdd72fd522 [X86] Added support for nocf_check attribute for indirect Branch Tracking
X86 Supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET).
IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp.
The `nocf_check` attribute has two roles in the context of X86 IBT technology:
	1. Appertains to a function - do not add ENDBR instruction at the beginning of the function.
	2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction.

This patch implements `nocf_check` context for Indirect Branch Tracking.
It also auto generates `nocf_check` prefixes before indirect branchs to jump tables that are guarded by range checks.

Differential Revision: https://reviews.llvm.org/D41879

llvm-svn: 327767
2018-03-17 13:29:46 +00:00
Vlad Tsyrklevich aab6000684 Reland r327041: [ThinLTO] Keep available_externally symbols live
Summary:
This change fixes PR36483. The bug was originally introduced by a change
that marked non-prevailing symbols dead. This broke LowerTypeTests
handling of available_externally functions, which are non-prevailing.
LowerTypeTests uses liveness information to avoid emitting thunks for
unused functions.

Marking available_externally functions dead is incorrect, the functions
are used though the function definitions are not. This change keeps them
live, and lets the EliminateAvailableExternally/GlobalDCE passes remove
them later instead.

(Reland with a suspected fix for a unit test failure I haven't been able
to reproduce locally)

Reviewers: pcc, tejohnson

Reviewed By: tejohnson

Subscribers: grimar, mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D43690

llvm-svn: 327360
2018-03-13 05:08:48 +00:00
Volkan Keles 4ecdb44a64 BlockExtractor: Don’t delete functions directly
Blocks may have function calls, so don’t erase functions
directly to avoid erasing a function that has a user.

llvm-svn: 327340
2018-03-12 22:28:18 +00:00
Eugene Leviant 19e238746b [ThinLTO] Recommit of import global variables
This wasreverted in r326638 due to link problems and fixed
afterwards

llvm-svn: 327254
2018-03-12 10:30:50 +00:00
Florian Hahn a7dcfa746e [PartialInlining] Use isInlineViable to detect constructs preventing inlining.
Use isInlineViable to prevent inlining of functions with non-inlinable
constructs, in case cost analysis is skipped.

Reviewers: efriedma, sfertile, davide, davidxl

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D42846

llvm-svn: 327207
2018-03-10 14:53:44 +00:00
Peter Collingbourne 2974856ad4 Use branch funnels for virtual calls when retpoline mitigation is enabled.
The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the
branch predictor, and as a result it can lead to a measurable loss of
performance. We can reduce the performance impact of retpolined virtual
calls by replacing them with a special construct known as a branch
funnel, which is an instruction sequence that implements virtual calls
to a set of known targets using a binary tree of direct branches. This
allows the processor to speculately execute valid implementations of the
virtual function without allowing for speculative execution of of calls
to arbitrary addresses.

This patch extends the whole-program devirtualization pass to replace
certain virtual calls with calls to branch funnels, which are
represented using a new llvm.icall.jumptable intrinsic. It also extends
the LowerTypeTests pass to recognize the new intrinsic, generate code
for the branch funnels (x86_64 only for now) and lay out virtual tables
as required for each branch funnel.

The implementation supports full LTO as well as ThinLTO, and extends the
ThinLTO summary format used for whole-program devirtualization to
support branch funnels.

For more details see RFC:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html

Differential Revision: https://reviews.llvm.org/D42453

llvm-svn: 327163
2018-03-09 19:11:44 +00:00
Eric Christopher 3caa0fd050 Revert "[ThinLTO] Keep available_externally symbols live"
This reverts commit r327041 and the followup attempts at fixing the testcase as they're still failing.

llvm-svn: 327094
2018-03-09 01:25:18 +00:00
Vlad Tsyrklevich 7b66ef1036 [ThinLTO] Keep available_externally symbols live
Summary:
This change fixes PR36483. The bug was originally introduced by a change
that marked non-prevailing symbols dead. This broke LowerTypeTests
handling of available_externally functions, which are non-prevailing.
LowerTypeTests uses liveness information to avoid emitting thunks for
unused functions.

Marking available_externally functions dead is incorrect, the functions
are used though the function definitions are not. This change keeps them
live, and lets the EliminateAvailableExternally/GlobalDCE passes remove
them later instead.

I've also enabled EliminateAvailableExternally for all optimization
levels, I believe it being disabled for O1 was an oversight.

Reviewers: pcc, tejohnson

Reviewed By: tejohnson

Subscribers: grimar, mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D43690

llvm-svn: 327041
2018-03-08 18:48:03 +00:00
Chandler Carruth a4619d9944 [ThinLTO] Revert r325320: Import global variables
This caused some links to fail with ThinLTO due to missing symbols as
well as causing some binaries to have failures at runtime. We're working
with the author to get a test case, but want to get the tree green
again.

Further, it appears to introduce a data race. While the test usage of
threads was disabled in r325361 & r325362, that isn't an acceptable fix.
I've reverted both of these as well. This code needs to be thread safe.
Test cases for this are already on the original commit thread.

llvm-svn: 326638
2018-03-02 23:40:08 +00:00
Fedor Indutny 1571b1271e [ArgumentPromotion] don't break musttail invariant PR36543
Summary:
Do not break musttail invariant by promoting arguments of musttail
callee or caller.

Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk

Reviewed By: rnk

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D43926

llvm-svn: 326521
2018-03-02 00:59:27 +00:00
Reid Kleckner cb9611ca67 [DAE] don't remove args of musttail target/caller
`musttail` requires identical signatures of caller and callee. Removing
arguments breaks `musttail` semantics.

PR36441

Patch by Fedor Indutny

Differential Revision: https://reviews.llvm.org/D43708

llvm-svn: 326394
2018-03-01 00:09:35 +00:00
Jonas Devlieghere 9ca064552a [GlobalOpt] don't change CC of musttail calle(e|r)
When the function has musttail call - its cc is fixed to be equal to the
cc of the musttail callee. In such case (and in the case of the musttail
callee), GlobalOpt should not change the cc to fastcc as it will break
the invariant.

This fixes PR36546

Patch by: Fedor Indutny (indutny)

Differential revision: https://reviews.llvm.org/D43859

llvm-svn: 326376
2018-02-28 22:28:44 +00:00
Eric Christopher 675dcf02a8 Update comment for whether or not we can optimize an alias - we're
checking the alias and not the aliasee. If the alias can be interposed
then we shouldn't do anything.

llvm-svn: 325837
2018-02-22 23:12:11 +00:00
Luke Cheeseman 6c1e6bbe0c [FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations passes for naked functions
- Fix for bug 36078.
- Prevent the functionattrs, function-attrs, globalopt and argpromotion passes
  from changing naked functions.
- These passes can perform some alterations to the functions that should not be
  applied. An example is removing parameters that are seemingly not used because
  they are only referenced in the inline assembly. Another example is marking
  the function as fastcc.

llvm-svn: 325788
2018-02-22 14:42:08 +00:00
Mircea Trofin 56950974d4 [SampleProf] NFC. Expose reusable functionality in SampleProfile.
Summary:
Exposing getOffset and findFunctionSamples as members of
SampleProfile. They are intimately tied to design choices of the
sample profile format - using offsets instead of line numbers, and
traversing inlined functions stack, respectively.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43605

llvm-svn: 325747
2018-02-22 06:42:57 +00:00
Charles Saternos b040fcc693 [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Third attempt - moved function from lambda to static function due to build failures.

llvm-svn: 325506
2018-02-19 15:14:50 +00:00
Simon Pilgrim 0efed32577 Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).

Reverted due to buildbot failures

llvm-svn: 325454
2018-02-18 00:01:36 +00:00
Charles Saternos 35878ee7a4 [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).

llvm-svn: 325448
2018-02-17 21:39:24 +00:00
Eugene Leviant 7331a0bf1c [ThinLTO] Import global variables
Differential revision: https://reviews.llvm.org/D43077

llvm-svn: 325320
2018-02-16 08:11:04 +00:00
Rafael Espindola 7186753218 Pass a module reference to CloneModule.
It can never be null and most callers were already using references or
std::unique_ptr.

llvm-svn: 325160
2018-02-14 19:50:40 +00:00
Rafael Espindola 6a86e25d90 Pass a reference to a module to the bitcode writer.
This simplifies most callers as they are already using references or
std::unique_ptr.

llvm-svn: 325155
2018-02-14 19:11:32 +00:00
Volodymyr Sapsai 2ad768bb13 Revert "[ThinLTO] Add GraphTraits for FunctionSummaries"
It caused assertion failure
Assertion failed: (!DD.IsLambda && !MergeDD.IsLambda && "faked up lambda definition?"), function MergeDefinitionData, file /Users/buildslave/jenkins/workspace/clang-stage1-configure-RA/llvm/tools/clang/lib/Serialization/ASTReaderDecl.cpp, line 1675.

on the second stage build bots.

llvm-svn: 324932
2018-02-12 20:43:31 +00:00
Charles Saternos d3e7d19f59 [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

llvm-svn: 324854
2018-02-11 22:06:20 +00:00
Steven Wu 33ba93c2b5 [ThinLTO] Teach ThinLTO about auto hide symbols
Summary:
For symbols that has linkonce_odr linkage and unnamed_addr, it can be
auto hide by linker to avoid weak external symbols. Teach ThinLTO to
perform auto hide so it can safely promote linkonce_odr to weak symbols
without breaking this nice property.

Reviewers: tejohnson, mehdi_amini

Reviewed By: tejohnson

Subscribers: inglorion, eraman, rnk, pcc, llvm-commits

Differential Revision: https://reviews.llvm.org/D43130

llvm-svn: 324757
2018-02-09 18:34:08 +00:00
Dmitry Mikulin 5cf73cea9c [ThinLTO] Skip BlockAddresses while replacing uses in function import.
Differential Revision: https://reviews.llvm.org/D43027

llvm-svn: 324658
2018-02-08 22:14:56 +00:00
George Rimar d3704f67ad Recommit r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."
With fix: reimplemented.

Original commit message:
Recently introduced convertToDeclaration is very similar
to code used in filterModule function.
Patch reuses it to reduce duplication.

Differential revision: https://reviews.llvm.org/D42971

llvm-svn: 324574
2018-02-08 07:23:24 +00:00
George Rimar 545652491d Revert r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."
It broke BB:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/23721

llvm-svn: 324458
2018-02-07 08:46:36 +00:00
George Rimar 5f133dc99b [ThinLTO] - Simplify code in ThinLTOBitcodeWriter.
Recently introduced convertToDeclaration is very similar
to code used in filterModule function.
Patch reuses it to reduce duplication.

Differential revision: https://reviews.llvm.org/D42971

llvm-svn: 324455
2018-02-07 08:32:35 +00:00
Petar Jovanovic 714f241304 [DeadArgumentElim] Set pointer to DISubprogram before calling RAUW. NFC
It is better to update pointer of the DISuprogram before we call RAUW for
still live arguments of the function, because with the change reviewed in
D42541 in RAUW we compare DISubprograms rather than functions itself.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D42794

llvm-svn: 324335
2018-02-06 11:11:28 +00:00
Peter Collingbourne 29c6f4833c ThinLTOBitcodeWriter: Do not include module-level inline asm in the merged module.
If the inline asm provides the definition of a symbol, this can result
in duplicate symbol errors.

Differential Revision: https://reviews.llvm.org/D42944

llvm-svn: 324313
2018-02-06 03:29:18 +00:00
Teresa Johnson 5a95c47730 [ThinLTO] Convert dead alias to declarations
Summary:
This complements the fixes in r323633 and r324075 which drop the
definitions of dead functions and variables, respectively.

Fixes PR36208.

Reviewers: grimar, rafael

Subscribers: mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D42856

llvm-svn: 324242
2018-02-05 15:44:27 +00:00
George Rimar 0410afd1a3 [ThinLTO] - Add comment. NFC.
Was requested during review of D42798.

llvm-svn: 324095
2018-02-02 15:06:09 +00:00
Mikael Holmen b69e5b7393 [GlobalOpt] Include padding in debug fragments
Summary:
When creating the debug fragments for a SRA'd variable, use the types'
allocation sizes. This fixes issues where the pass would emit too small
fragments, placed at the wrong offset, for padded types.

An example of this is long double on x86. The type is represented using
x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes.
The padding is included in the type's DW_AT_byte_size attribute;
therefore, the fragments should also include that. Newer GCC releases
(I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier
releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a
10-byte piece, followed by an empty 2/6-byte piece for the padding.

Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty
pieces results in the value being printed as <optimized out> by GDB.

Patch by: David Stenberg

Reviewers: aprantl, JDevlieghere

Reviewed By: aprantl, JDevlieghere

Subscribers: llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D42807

llvm-svn: 324066
2018-02-02 10:34:13 +00:00
Amara Emerson 93b0ff20c9 [GlobalOpt] Improve common case efficiency of static global initializer evaluation
For very, very large global initializers which can be statically evaluated, the
code would create vectors of temporary Constants, modifying them in place,
before committing the resulting Constant aggregate to the global's initializer
value. This had effectively O(n^2) complexity in the size of the global
initializer and would cause memory and non-termination issues compiling some
workloads.

This change performs the static initializer evaluation and creation in batches,
once for each global in the evaluated IR memory. The existing code is maintained
as a last resort when the initializers are more complex than simple values in a
large aggregate. This should theoretically by NFC, no test as the example case
is massive. The existing test cases pass with this, as well as the llvm test
suite.

To give an example, consider the following C++ code adapted from the clang
regression tests:
struct S {
 int n = 10;
 int m = 2 * n;
 S(int a) : n(a) {}
};

template<typename T>
struct U {
 T *r = &q;
 T q = 42;
 U *p = this;
};

U<S> e;

The global static constructor for 'e' will need to initialize 'r' and 'p' of
the outer struct, while also initializing the inner 'q' structs 'n' and 'm'
members. This batch algorithm will simply use general CommitValueTo() method
to handle the complex nested S struct initialization of 'q', before
processing the outermost members in a single batch. Using CommitValueTo() to
handle member in the outer struct is inefficient when the struct/array is
very large as we end up creating and destroy constant arrays for each
initialization.
For the above case, we expect the following IR to be generated:

%struct.U = type { %struct.S*, %struct.S, %struct.U* }
%struct.S = type { i32, i32 }
@e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e,
                                                 i64 0, i32 1),
                        %struct.S { i32 42, i32 84 }, %struct.U* @e }
The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex
constant expression, while the other two elements of @e are "simple".

Differential Revision: https://reviews.llvm.org/D42612

llvm-svn: 323933
2018-01-31 23:56:07 +00:00
Peter Collingbourne 7873669be5 LTO: Drop comdats when converting definitions to declarations.
Differential Revision: https://reviews.llvm.org/D42715

llvm-svn: 323844
2018-01-31 02:51:03 +00:00
Zaara Syeda 1f59ae311b Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This recommits r322721 reverted due to sanitizer memory leak build bot failures.

Original commit message:
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 323778
2018-01-30 16:17:22 +00:00
George Rimar eaf5172ca6 [ThinLTO] - Stop internalizing and drop non-prevailing symbols.
Implementation marks non-prevailing symbols as not live in the summary.
Then them are dropped in backends.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35938

Differential revision: https://reviews.llvm.org/D42107

llvm-svn: 323633
2018-01-29 08:03:30 +00:00
Easwaran Raman 8410c37465 [SyntheticCounts] Rewrite the code using only graph traits.
Summary:
The intent of this is to allow the code to be used with ThinLTO. In
Thinlink phase, a traditional Callgraph can not be computed even though
all the necessary information (nodes and edges of a call graph) is
available. This is due to the fact that CallGraph class is closely tied
to the IR. This patch first extends GraphTraits to add a CallGraphTraits
graph. This is then used to implement a version of counts propagation
on a generic callgraph.

Reviewers: davidxl

Subscribers: mehdi_amini, tejohnson, llvm-commits

Differential Revision: https://reviews.llvm.org/D42311

llvm-svn: 323475
2018-01-25 22:02:29 +00:00
Easwaran Raman c73cec84c9 Re-land "[ThinLTO] Add call edges' relative block frequency to per-module summary."
It was reverted after buildbot regressions.

Original commit message:

This allows relative block frequency of call edges to be passed
to the thinlink stage where it will be used to compute synthetic
entry counts of functions.

llvm-svn: 323460
2018-01-25 19:27:17 +00:00
Amjad Aboud f1f57a3137 Another try to commit 323321 (aggressive instruction combine).
llvm-svn: 323416
2018-01-25 12:06:32 +00:00
Mikael Holmen 886edf8f8a [GlobalOpt] Emit fragments using field offsets from struct layout
Summary:
When creating the debug fragments for a SRA'd struct, use the fields'
offsets, taken from the struct layout, as the offsets for the resulting
fragments. This fixes an issue where GlobalOpt would emit fragments with
incorrect offsets for padded fields.

This should solve PR36016.

Patch by David Stenberg.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42489

llvm-svn: 323411
2018-01-25 10:09:26 +00:00
Amjad Aboud d53504e379 Reverted 323321.
llvm-svn: 323326
2018-01-24 14:48:49 +00:00
Amjad Aboud e4453233d7 [InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine).
Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.

For example, this pass reduce width of expressions post-dominated by TruncInst
into smaller width when applicable.

It differs from instcombine pass in that it contains pattern optimization that
requires higher complexity than the O(1), thus, it should run fewer times than
instcombine pass.

Differential Revision: https://reviews.llvm.org/D38313

llvm-svn: 323321
2018-01-24 12:42:42 +00:00
Volkan Keles ebf34ea316 BlockExtractor: Remove unused variable. NFC.
llvm-svn: 323271
2018-01-23 22:24:34 +00:00
Volkan Keles dc40be75f8 [llvm-extract] Support extracting basic blocks
Summary:
Currently, there is no way to extract a basic block from a function easily. This patch
extends llvm-extract to extract the specified basic block(s).

Reviewers: loladiro, rafael, bogner

Reviewed By: bogner

Subscribers: hintonda, mgorny, qcolombet, llvm-commits

Differential Revision: https://reviews.llvm.org/D41638

llvm-svn: 323266
2018-01-23 21:51:34 +00:00
Eugene Leviant 28d8a49f42 [ThinLTO] Re-commit of dot dumper after test fix
llvm-svn: 323116
2018-01-22 13:35:40 +00:00
Eugene Leviant 72b9bdb71a Temporarily revert r323062 to investigate buildbot failures
llvm-svn: 323065
2018-01-21 10:22:19 +00:00
Eugene Leviant 453c976a63 [ThinLTO] Implement summary visualizer
Differential revision: https://reviews.llvm.org/D41297

llvm-svn: 323062
2018-01-21 07:27:32 +00:00
Hiroshi Inoue d24ddcd6c4 [NFC] fix trivial typos in comments
"the the" -> "the"

llvm-svn: 322934
2018-01-19 10:55:29 +00:00
Easwaran Raman e5b8de2f1f Add a ProfileCount class to represent entry counts.
Summary:
The class wraps a uint64_t and an enum to represent the type of profile
count (real and synthetic) with some helper methods.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41883

llvm-svn: 322771
2018-01-17 22:24:23 +00:00
Zaara Syeda c9dc7b451b Revert [PowerPC] This reverts commit rL322721
Failing build bots. Revert the commit now.

llvm-svn: 322748
2018-01-17 20:00:15 +00:00
Zaara Syeda 8e951fd2f6 [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 322721
2018-01-17 18:22:55 +00:00
Rafael Espindola e4b0231c63 Make internal/private GVs implicitly dso_local.
While updating clang tests for having clang set dso_local I noticed
that:

- There are *a lot* of tests to update.
- Many of the updates are redundant.

They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.

llvm-svn: 322317
2018-01-11 22:15:05 +00:00
Vlad Tsyrklevich cdec22ef9a LowerTypeTests: Add limited support for aliases
Summary:
LowerTypeTests moves some function definitions from individual object
files to the merged module, leaving a stub to be called in the merged
module's jump table. If an alias was pointing to such a function
definition LowerTypeTests would fail because the alias would be left
without a definition to point to.

This change 1) emits information about aliases to the ThinLTO summary,
2) replaces aliases pointing to function definitions that are moved to
the merged module with function declarations, and 3) re-emits those
aliases in the merged module pointing to the correct function
definitions.

The patch does not correctly fix all possible mis-uses of aliases in
LowerTypeTests. For example, it does not handle aliases with a different
type from the pointed to function.

The addition of alias data increases the size of Chrome build artifacts
by less than 1%.

Reviewers: pcc

Reviewed By: pcc

Subscribers: mehdi_amini, eraman, mgrang, llvm-commits, eugenis, kcc

Differential Revision: https://reviews.llvm.org/D41741

llvm-svn: 322139
2018-01-10 00:00:51 +00:00
Easwaran Raman bdf20261d8 Add a pass to generate synthetic function entry counts.
Summary:
This pass synthesizes function entry counts by traversing the callgraph
and using the relative block frequencies of the callsites. The intended
use of these counts is in inlining to determine hot/cold callsites in
the absence of profile information.

The pass is split into two files with the code that propagates the
counts in a callgraph in a Utils file. I plan to add support for
propagation in the thinlto link phase and the propagation code will be
shared and hence this split. I did not add support to the old PM since
hot callsite determination in inlining is not possible in old PM
(although we could use hot callee heuristic with synthetic counts in the
old PM it is not worth the effort tuning it)

Reviewers: davidxl, silvas

Subscribers: mgorny, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D41604

llvm-svn: 322110
2018-01-09 19:39:35 +00:00
Justin Bogner 6f6846fc9d AlwaysInliner: Alow setting InsertLifetime in the new-style pass
llvm-svn: 322033
2018-01-08 22:07:42 +00:00
Justin Bogner 92fe563b57 ArgPromotion: Allow setting MaxElements in the new-style pass
llvm-svn: 322025
2018-01-08 21:13:35 +00:00
Peter Collingbourne 9110cb456d WholeProgramDevirt: Simplify ORE getter mechanism for old PM. NFCI.
llvm-svn: 321841
2018-01-05 00:27:51 +00:00
Benjamin Kramer 24cb28bb54 Remove superfluous copies in sample profiling.
No functionliaty change intended.

llvm-svn: 321530
2017-12-28 18:10:41 +00:00
Benjamin Kramer 3a13ed60ba Avoid int to string conversion in Twine or raw_ostream contexts.
Some output changes from uppercase hex to lowercase hex, no other functionality change intended.

llvm-svn: 321526
2017-12-28 16:58:54 +00:00
Easwaran Raman a17f220590 Add hasProfileData() to check if a function has profile data. NFC.
Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData
that does the same thing. This refactoring is useful to do before adding
synthetic function entry counts but also a useful cleanup IMO even
otherwise. I have used hasProfileData instead of hasRealProfileData as
David had earlier suggested since I think profile implies "real" and I
use the phrase "synthetic entry count" and not "synthetic profile count"
but I am fine calling it hasRealProfileData if you prefer.

Reviewers: davidxl, silvas

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41461

llvm-svn: 321331
2017-12-22 01:33:52 +00:00
Adrian Prantl 0e6694d111 Silence a bunch of implicit fallthrough warnings
llvm-svn: 321114
2017-12-19 22:05:25 +00:00
Teresa Johnson 915897e21b [PGO] Fix handling of cold entry count for instrumented PGO
Summary:
In r277849, getEntryCount was changed to return None when the entry
count was 0, specifically for SamplePGO where it means no samples were
recorded. However, for instrumentation PGO a 0 entry count should be
returned directly, since it does mean that the function was completely
cold. Otherwise we end up treating these functions conservatively
in isFunctionEntryCold() and isColdBB().

Instead, for SamplePGO use -1 when there are no samples, and change
getEntryCount to return None when the value is -1.

Reviewers: danielcdh, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41307

llvm-svn: 321018
2017-12-18 20:02:43 +00:00
Matt Arsenault d89d0b6494 Removed unused DominanceFrontier
llvm-svn: 321001
2017-12-18 18:01:13 +00:00
Eugene Leviant c95b49603e [ThinLTO] Remove unused code
This is a re-commit of r320464, after patch for gold plugin
was landed.

llvm-svn: 320968
2017-12-18 10:53:45 +00:00
Teresa Johnson 69b2de8466 Fix NDEBUG build problem in r320895
Fix incorrect placement of #endif causing NDEBUG build failures.

llvm-svn: 320897
2017-12-16 00:29:31 +00:00
Teresa Johnson 81bbf74265 [ThinLTO] Enable importing of aliases as copy of aliasee
Summary:
This implements a missing feature to allow importing of aliases, which
was previously disabled because alias cannot be available_externally.
We instead import an alias as a copy of its aliasee.

Some additional work was required in the IndexBitcodeWriter for the
distributed build case, to ensure that the aliasee has a value id
in the distributed index file (i.e. even when it is not being
imported directly).

This is a performance win in codes that have many aliases, e.g. C++
applications that have many constructor and destructor aliases.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D40747

llvm-svn: 320895
2017-12-16 00:18:12 +00:00
Sanjay Patel 0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Michael Zolotukhin 6af4f232b5 Remove redundant includes from lib/Transforms.
llvm-svn: 320628
2017-12-13 21:31:01 +00:00
Eugene Leviant d53f3da772 Revert r320464 as it breaks gold plugin tests
llvm-svn: 320467
2017-12-12 10:12:46 +00:00
Eugene Leviant 3695183395 [ThinLTO] Remove unused code from thinLTOInternalizeModule
Differential revision: https://reviews.llvm.org/D40970

llvm-svn: 320464
2017-12-12 09:12:32 +00:00
Evgeniy Stepanov c667c1f47a Hardware-assisted AddressSanitizer (llvm part).
Summary:
This is LLVM instrumentation for the new HWASan tool. It is basically
a stripped down copy of ASan at this point, w/o stack or global
support. Instrumenation adds a global constructor + runtime callbacks
for every load and store.

HWASan comes with its own IR attribute.

A brief design document can be found in
clang/docs/HardwareAssistedAddressSanitizerDesign.rst (submitted earlier).

Reviewers: kcc, pcc, alekseyshl

Subscribers: srhines, mehdi_amini, mgorny, javed.absar, eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D40932

llvm-svn: 320217
2017-12-09 00:21:41 +00:00
Alina Sbirlea 193429f0c8 [ModRefInfo] Make enum ModRefInfo an enum class [NFC].
Summary:
Make enum ModRefInfo an enum class. Changes to ModRefInfo values should
be done using inline wrappers.
This should prevent future bit-wise opearations from being added, which can be more error-prone.

Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40933

llvm-svn: 320107
2017-12-07 22:41:34 +00:00
Matthew Simpson e363d2cebb [PGO] Make indirect call promotion a utility
This patch factors out the main code transformation utilities in the pgo-driven
indirect call promotion pass and places them in Transforms/Utils. The change is
intended to be a non-functional change, letting non-pgo-driven passes share a
common implementation with the existing pgo-driven pass.

The common utilities are used to conditionally promote indirect call sites to
direct call sites. They perform the underlying transformation, and do not
consider profile information. The pgo-specific details (e.g., the computation
of branch weight metadata) have been left in the indirect call promotion pass.

Differential Revision: https://reviews.llvm.org/D40658

llvm-svn: 319963
2017-12-06 21:22:54 +00:00
Alina Sbirlea 63d2250a42 Modify ModRefInfo values using static inline method abstractions [NFC].
Summary:
The aim is to make ModRefInfo checks and changes more intuitive
and less error prone using inline methods that abstract the bit operations.

Ideally ModRefInfo would become an enum class, but that change will require
a wider set of changes into FunctionModRefBehavior.

Reviewers: sanjoy, george.burgess.iv, dberlin, hfinkel

Subscribers: nlopes, llvm-commits

Differential Revision: https://reviews.llvm.org/D40749

llvm-svn: 319821
2017-12-05 20:12:23 +00:00
Peter Collingbourne 1f03422610 ThinLTOBitcodeWriter: Try harder to discard unused references to the merged module.
If the thin module has no references to an internal global in the
merged module, we need to make sure to preserve that property if the
global is a member of a comdat group, as otherwise promotion can end
up adding global symbols to the comdat, which is not allowed.

This situation can arise if the external global in the thin module
has dead constant users, which would cause use_empty() to return
false and would cause us to try to promote it. To prevent this from
happening, discard the dead constant users before asking whether a
global is empty.

Differential Revision: https://reviews.llvm.org/D40593

llvm-svn: 319494
2017-11-30 23:05:52 +00:00
Graham Yiu 70293fa27a - Removed unused lamba (IsReturnBlock) causing build bots to fail for r319398
- Added lit testcases that were supposed to be part of r319398

llvm-svn: 319399
2017-11-30 03:36:57 +00:00
Graham Yiu 8b1882c186 With PGO information, we can do more aggressive outlining of cold regions in the inline candidate function. This contrasts with the scheme of keeping only the 'early return' portion of the inline candidate and outlining the rest of the function as a single function call.
Support for outlining multiple regions of each function is added, as well as some basic heuristics to determine which regions are good to outline. Outline candidates limited to regions that are single-entry & single-exit. We also avoid outlining regions that produce live-exit variables, which may inhibit some forms of code motion (like commoning).

Fallback to the regular partial inlining scheme is retained when either i) no regions are identified for outlining in the function, or ii) the outlined function could not be inlined in any of its callers.

Differential Revision: https://reviews.llvm.org/D38190

llvm-svn: 319398
2017-11-30 02:41:36 +00:00
Peter Collingbourne 9e3175bb6b LowerTypeTests: Deduplicate code. NFC.
llvm-svn: 319390
2017-11-30 00:27:08 +00:00
Peter Collingbourne 943aca3c27 LowerTypeTests: Remove unnecessary cast. NFC.
llvm-svn: 319387
2017-11-30 00:02:55 +00:00
Hans Wennborg e1ecd61b98 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Florian Hahn 0e9dec672d [PartialInliner] Inline vararg functions that forward varargs.
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.

It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.

The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.

`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.

Reviewers: davide, davidxl, grosser

Reviewed By: davide, davidxl, grosser

Subscribers: gyiu, grosser, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D39607

llvm-svn: 318028
2017-11-13 10:35:52 +00:00
Teresa Johnson 07ec7d59c2 [ThinLTO] Ensure sanitizer passes are run
Summary:
In ThinLTO compilation, we exit populateModulePassManager early and
were not adding PM extension passes meant to run at the end of the
pipeline. This includes sanitizer passes. Add these passes before
the early exit.

A test will be added to projects/compiler-rt.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D39565

llvm-svn: 317714
2017-11-08 19:45:52 +00:00
Adrian Prantl 25a09dd408 Make DIExpression::createFragmentExpression() return an Optional.
We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.

llvm-svn: 317534
2017-11-07 00:45:34 +00:00
Davide Italiano 1a46affb45 [IPO/LowerTypesTest] Skip blockaddress(es) when replacing uses.
Blockaddresses refer to the function itself, therefore replacing them
would cause an assertion in doRAUW.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35201

This was found when trying CFI on a proprietary kernel by Dmitry Mikulin.

Differential Revision:  https://reviews.llvm.org/D39695

llvm-svn: 317527
2017-11-07 00:09:25 +00:00
Dehao Chen 5d2a1a5045 Include already promoted counts when computing SUM for VP.
Summary: When computing the SUM for indirect call promotion, if the callsite is already promoted in the profile, it will be promoted before ICP. In the current implementation, ICP only sees remaining counts in SUM. This may cause extra indirect call targets being promoted. This patch updates the SUM to include the counts already promoted earlier. This way we do not end up promoting too many indirect call targets.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38763

llvm-svn: 317502
2017-11-06 19:52:49 +00:00
Jun Bum Lim 0c99007db1 Recommit r317351 : Add CallSiteSplitting pass
This recommit r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change add a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
Jun Bum Lim 0eb1c2d63a Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00
Jun Bum Lim 2a58933519 Add CallSiteSplitting pass
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :

1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.

Split from :
      if (!ptr || c)
        callee(ptr);
to :
      if (!ptr)
        callee(null ptr)  // set the known constant value
      else if (c)
        callee(nonnull ptr)  // set non-null attribute in the argument

2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2, label %BB1
      BB1:
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
       call void @bar(i32 %p)
to
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2-split0, label %BB1
      BB1:
       br label %BB2-split1
      BB2-split0:
       call void @bar(i32 0)
       br label %BB2
      BB2-split1:
       call void @bar(i32 1)
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide

Reviewed By: davidxl

Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39137

llvm-svn: 317351
2017-11-03 19:01:57 +00:00
Florian Hahn 41e32bfd68 [PartialInliner] Skip call sites where inlining fails.
Summary:
InlineFunction can fail, for example when trying to inline vararg
fuctions. In those cases, we do not want to bump partial inlining
counters or set AnyInlined to true, because this could leave an unused
function hanging around.

Reviewers: davidxl, davide, gyiu

Reviewed By: davide

Subscribers: llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39581

llvm-svn: 317314
2017-11-03 11:29:00 +00:00
Dehao Chen c6c051f2ea Include GUIDs from the same module when computing GUIDs that needs to be imported.
Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to buidl the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to undefined symbol.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D39480

llvm-svn: 317118
2017-11-01 20:26:47 +00:00
Peter Collingbourne 9fb6e1a037 LTO: Apply global DCE to ThinLTO modules at LTO opt level 0.
This is necessary because DCE is applied to full LTO modules. Without
this change, a reference from a dead ThinLTO global to a dead full
LTO global will result in an undefined reference at link time.

This problem is only observable when --gc-sections is disabled, or
when targeting COFF, as the COFF port of lld requires all symbols to
have a definition even if all references are dead (this is consistent
with link.exe).

This change also adds an EliminateAvailableExternally pass at -O0. This
is necessary to handle the situation on Windows where a non-prevailing
copy of a linkonce_odr function has an SEH filter function; any
such filters must be DCE'd because they will contain a call to the
llvm.localrecover intrinsic, passing as an argument the address of the
function that the filter belongs to, and llvm.localrecover requires
this function to be defined locally.

Fixes PR35142.

Differential Revision: https://reviews.llvm.org/D39484

llvm-svn: 317108
2017-11-01 17:58:39 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
Matthew Simpson 99f57933ba Attempt to unbreak the expensive-checks-win bot
llvm-svn: 316625
2017-10-25 22:46:34 +00:00
Matthew Simpson cb58558c2f Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.

Differential Revision: https://reviews.llvm.org/D37355

llvm-svn: 316576
2017-10-25 13:40:08 +00:00
Eugene Zelenko f27d161bf0 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316187
2017-10-19 21:21:30 +00:00
whitequark a99ecf1bbb [MergeFunctions] Don't blindly RAUW a GlobalValue with a ConstantExpr.
MergeFunctions uses (through FunctionComparator) a map of GlobalValues
to identifiers because it needs to compare functions and globals
do not have an inherent total order. Thus, FunctionComparator
(through GlobalNumberState) has a ValueMap<GlobalValue *>.

r315852 added a RAUW on globals that may have been previously
encountered by the FunctionComparator, which would replace
a GlobalValue * key with a ConstantExpr *, which is illegal.

This commit adjusts that code path to remove the function being
replaced from the ValueMap as well.

llvm-svn: 316145
2017-10-19 04:47:48 +00:00
Michael Zolotukhin c4fcc189d2 [GlobalDCE] Use DenseMap instead of unordered_multimap for GVDependencies.
Summary:
std::unordered_multimap happens to be very slow when the number of elements
grows large. On one of our internal applications we observed a 17x compile time
improvement from changing it to DenseMap.

Reviewers: mehdi_amini, serge-sans-paille, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38916

llvm-svn: 316045
2017-10-17 23:47:06 +00:00
whitequark ae12efab20 [MergeFunctions] Merge small functions if possible without a thunk.
This can result in significant code size savings in some cases,
e.g. an interrupt table all filled with the same assembly stub
in a certain Cortex-M BSP results in code blowup by a factor of 2.5.

Differential Revision: https://reviews.llvm.org/D34806

llvm-svn: 315853
2017-10-15 12:29:09 +00:00
whitequark b2ce9ffede [MergeFunctions] Replace all uses of unnamed_addr functions.
This reduces code size for constructs like vtables or interrupt
tables that refer to functions in global initializers.

Differential Revision: https://reviews.llvm.org/D34805

llvm-svn: 315852
2017-10-15 12:29:01 +00:00
Peter Collingbourne 868783e855 LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias.
It is possible for both a base and a derived class to be satisfied
with a unique vtable. If a program contains casts of the same pointer
to both of those types, the CFI checks will be lowered to this
(with ThinLTO):

if (p != &__typeid_base_global_addr)
  trap();
if (p != &__typeid_derived_global_addr)
  trap();

The optimizer may then use the first condition combined
with the assumption that __typeid_base_global_addr and
__typeid_derived_global_addr may not alias to optimize away the second
comparison, resulting in an unconditional trap.

This patch fixes the bug by giving imported globals the type [0 x i8]*,
which prevents the optimizer from assuming that they do not alias.

Differential Revision: https://reviews.llvm.org/D38873

llvm-svn: 315753
2017-10-13 21:02:16 +00:00
Vivek Pandya 9590658fb8 [NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure
parameterized emit() calls

Summary: This is not functional change to adopt new emit() API added in r313691.

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38285

llvm-svn: 315476
2017-10-11 17:12:59 +00:00
Eugene Zelenko e9ea08a097 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315383
2017-10-10 22:49:55 +00:00
Dehao Chen 3f56a05ae5 Use the first instruction's count to estimate the funciton's entry frequency.
Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimiate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D38478

llvm-svn: 315369
2017-10-10 21:13:50 +00:00
Adam Nemet 0965da2055 Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249
2017-10-09 23:19:02 +00:00
Dehao Chen 9bd60429e2 Directly return promoted direct call instead of rely on stripPointerCast.
Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPOintercast, the more reliable solution is to directly use the pointer to the promoted direct call.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38603

llvm-svn: 315077
2017-10-06 17:04:55 +00:00
Davide Italiano c74ea93b8c [PM] Retire disable unit-at-a-time switch.
This is a vestige from the GCC-3 days, which disables IPO passes
when set. I don't think anybody actually uses it as there are
several IPO passes which still run with this flag set and
nobody complained/noticed. This reduces the delta between
current and new pass manager and allows us to easily review
the difference when we decide to flip the switch (or audit
which passes should run, FWIW).

llvm-svn: 315043
2017-10-06 04:39:40 +00:00
Dehao Chen 16f01fb1db Annotate VP prof on indirect call if it is ICPed in the profiled binary.
Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38477

llvm-svn: 315016
2017-10-05 20:15:29 +00:00
Davide Italiano c8708e59e8 [PassManager] Improve the interaction between -O2 and ThinLTO.
Run GDCE slightly later so that we don't have to repeat it
twice when preparing for Thin. Thanks to Mehdi for the suggestion.

llvm-svn: 314999
2017-10-05 18:23:25 +00:00
Davide Italiano ff829cea8b [PassManager] Run global optimizations after the inliner.
The inliner performs some kind of dead code elimination as it goes,
but there are cases that are not really caught by it. We might
at some point consider teaching the inliner about them, but it
is OK for now to run GlobalOpt + GlobalDCE in tandem as their
benefits generally outweight the cost, making the whole pipeline
faster.

This fixes PR34652.

Differential Revision: https://reviews.llvm.org/D38154

llvm-svn: 314997
2017-10-05 18:06:37 +00:00
Dehao Chen ea523ddb1b Revert the change that accidentally went in r314806.
llvm-svn: 314807
2017-10-03 15:50:42 +00:00
Davide Italiano c48d1c8519 [PassManager] Retire cl::opt that have been set for a while. NFCI.
llvm-svn: 314740
2017-10-02 23:39:20 +00:00
Dehao Chen f464627f28 Update getMergedLocation to check the instruction type and merge properly.
Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined.

Reviewers: dblaikie, aprantl

Reviewed By: dblaikie, aprantl

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37877

llvm-svn: 314694
2017-10-02 18:13:14 +00:00
Dehao Chen d26dae0d34 Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases.
Summary: In SamplePGO ThinLTO compile phase, we will not invoke ICP as it may introduce confusion to the 2nd annotation. This patch extracted that logic and makes it clearer before profile annotation. In the mean time, we need to make function importing process both inlined callsites as well as not promoted indirect callsites.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D38094

llvm-svn: 314619
2017-10-01 05:24:51 +00:00
Dehao Chen 4f5d830343 Refactor the SamplePGO profile annotation logic to extract inlineCallInstruction. (NFC)
llvm-svn: 314601
2017-09-30 20:46:15 +00:00
Sanjoy Das def1729dc4 Use a BumpPtrAllocator for Loop objects
Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38201

llvm-svn: 314375
2017-09-28 02:45:42 +00:00
Sanjoy Das 388b012f4e Rename markAsErased to erase, as pointed out in a previous review; NFC
llvm-svn: 313951
2017-09-22 01:47:41 +00:00
Strahinja Petrovic 29202f6dc1 Fixed reverted commit rL312318
This patch contains fix for reverted commit
rL312318 which was causing failure due to use
of unchecked dyn_cast to CIInit.

Patch by: Nikola Prica.

llvm-svn: 313870
2017-09-21 10:04:02 +00:00
Teresa Johnson f625118ec7 [ThinLTO] Fix dead stripping analysis for SamplePGO
Summary:
The fix for dead stripping analysis in the case of SamplePGO indirect
calls to local functions (r313151) introduced the possibility of an
infinite loop.

Make sure we check for the value being already live after we update it
for SamplePGO indirect call handling.

Reviewers: danielcdh

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D38086

llvm-svn: 313766
2017-09-20 17:09:47 +00:00
Sanjoy Das 76ab23234c [LoopInfo] Make LoopBase and Loop destructors non-public
Summary:
See comment for why I think this is a good idea.

This change also:

 - Removes an SCEV test case.  The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop.
 - Updates the loop pass manager test case to deal with a private ~Loop::Loop.
 - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37996

llvm-svn: 313695
2017-09-19 23:19:00 +00:00
Adam Nemet 15fccf0009 Allow ORE.emit to take a closure to delay building the remark object
In the lambda we are now returning the remark by value so we need to preserve
its type in the insertion operator.  This requires making the insertion
operator generic.

I've also converted a few cases to use the new API.  It seems to work pretty
well.  See the LoopUnroller for a slightly more interesting case.

llvm-svn: 313691
2017-09-19 23:00:55 +00:00
Dehao Chen 62b9c33e1e Import all inlined indirect call targets for SamplePGO.
Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36637

llvm-svn: 313678
2017-09-19 21:18:14 +00:00
Dehao Chen b6e60c8b80 Handle profile mismatch correctly for SamplePGO.
Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash.

Reviewers: davidxl, tejohnson, eraman

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38018

llvm-svn: 313658
2017-09-19 18:26:54 +00:00
Dehao Chen 3a81f84d9a Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader.
Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: vitalybuka, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37779

llvm-svn: 313277
2017-09-14 17:29:56 +00:00
Vitaly Buka 48624d327a Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader."
Patch introduced uninitialized value.

This reverts commit r313195.

llvm-svn: 313230
2017-09-14 05:40:33 +00:00
Peter Collingbourne cfbd089237 Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222.
This reland includes a fix for the LowerTypeTests pass so that it
looks past aliases when determining which type identifiers are live.

Differential Revision: https://reviews.llvm.org/D37842

llvm-svn: 313229
2017-09-14 05:02:59 +00:00
Hans Wennborg ae050afeb9 Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping."
This broke Chromium's CFI build; see crbug.com/765004.

> We were previously handling aliases during dead stripping by adding
> the aliased global's "original name" GUID to the worklist. This will
> lead to incorrect behaviour if the global has local linkage because
> the original name GUID will not correspond to the global's GUID in
> the summary.
>
> Because an alias is just another name for the global that it
> references, there is no need to mark the referenced global as used,
> or to follow references from any other copies of the global. So all
> we need to do is to follow references from the aliasee's summary
> instead of the alias.
>
> Differential Revision: https://reviews.llvm.org/D37789

llvm-svn: 313222
2017-09-14 00:40:14 +00:00
Dehao Chen 15c86ef970 Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader.
Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37779

llvm-svn: 313195
2017-09-13 21:22:55 +00:00
Peter Collingbourne d067c8ed59 ThinLTO: Correctly follow aliasee references when dead stripping.
We were previously handling aliases during dead stripping by adding
the aliased global's "original name" GUID to the worklist. This will
lead to incorrect behaviour if the global has local linkage because
the original name GUID will not correspond to the global's GUID in
the summary.

Because an alias is just another name for the global that it
references, there is no need to mark the referenced global as used,
or to follow references from any other copies of the global. So all
we need to do is to follow references from the aliasee's summary
instead of the alias.

Differential Revision: https://reviews.llvm.org/D37789

llvm-svn: 313157
2017-09-13 17:09:20 +00:00
Teresa Johnson 1958083d35 [ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link
Summary:
SamplePGO indirect call profiles record the target as the original GUID
for statics. The importer had special handling to map to the normal GUID
in that case. The dead global analysis needs the same treatment or
inconsistencies arise, resulting in linker unsats due to some dead
symbols being exported and kept, leaving in references to other dead
symbols that are removed.

This can happen when a SamplePGO profile collected by one binary is used
for a different binary, so the indirect call profiles may not accurately
reflect live targets.

Reviewers: danielcdh

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37783

llvm-svn: 313151
2017-09-13 15:16:38 +00:00
Dehao Chen f3ed14d323 Refactor the code to pass down ACT to SampleProfileLoader correctly.
Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37773

llvm-svn: 313080
2017-09-12 21:55:55 +00:00
Peter Collingbourne b9b6025328 LowerTypeTests: Add import/export support for targets without absolute symbol constants.
The rationale is the same as for r312967.

Differential Revision: https://reviews.llvm.org/D37408

llvm-svn: 312968
2017-09-11 22:49:10 +00:00
Peter Collingbourne b15a35e604 WholeProgramDevirt: Add import/export support for targets without absolute symbol constants.
Not all targets support the use of absolute symbols to export
constants. In particular, ARM has a wide variety of constant encodings
that cannot currently be relocated by linkers. So instead of exporting
the constants using symbols, export them directly in the summary.
The values of the constants are left as zeroes on targets that support
symbolic exports.

This may result in more cache misses when targeting those architectures
as a result of arbitrary changes in constant values, but this seems
somewhat unavoidable for now.

Differential Revision: https://reviews.llvm.org/D37407

llvm-svn: 312967
2017-09-11 22:34:42 +00:00
Nuno Lopes 404f106d71 Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null

Differential Revision: https://reviews.llvm.org/D37628

llvm-svn: 312869
2017-09-09 18:23:11 +00:00
Sanjay Patel 6fd4391ddd [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 

llvm-svn: 312862
2017-09-09 13:38:18 +00:00
Peter Collingbourne 88a58cf9e7 WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat.
This is required when targeting COFF, as the comdat name must match
one of the names of the symbols in the comdat.

Differential Revision: https://reviews.llvm.org/D37550

llvm-svn: 312767
2017-09-08 00:10:53 +00:00
Reid Kleckner 0e8c4bb055 Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.

llvm-svn: 312759
2017-09-07 23:27:44 +00:00
Richard Trieu c7828ebea4 Revert r312318, r312325, r312424, r312489
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318

Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490

llvm-svn: 312758
2017-09-07 23:20:35 +00:00
Strahinja Petrovic 676fd0b022 Debug info for variables whose type is shrinked to bool
This patch provides such debug information for integer
variables whose type is shrinked to bool by providing 
dwarf expression which returns either constant initial 
value or other value.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D35994

llvm-svn: 312318
2017-09-01 10:05:27 +00:00
Adrian Prantl 504b82d44b Don't add a fragment expression when GlobalSRA splits up a single-member struct
Fixes PR34390.

https://bugs.llvm.org/show_bug.cgi?id=34390

llvm-svn: 312196
2017-08-31 00:06:18 +00:00
Adrian Prantl b192b545c1 Refactor DIBuilder::createFragmentExpression into a static DIExpression member
NFC

llvm-svn: 312165
2017-08-30 20:04:17 +00:00
Mandeep Singh Grang e3bbb68b0c [cfi] Fixed non-determinism in codegen due to DenseSet iteration order
llvm-svn: 312098
2017-08-30 04:47:21 +00:00
Evgeniy Stepanov 7372b67063 [cfi] Avoid branch veneers in jump tables when possible.
Summary:
When jumptable encoding does not match target code encoding (arm vs
thumb), a veneer is inserted by the linker. We can not avoid this
in all cases, because entries within one jumptable must have the same
encoding, but we can make it less common by selecting the jumptable
encoding to match the majority of its targets.

This change only covers FullLTO, and not ThinLTO.

Reviewers: pcc

Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37171

llvm-svn: 312054
2017-08-29 22:40:19 +00:00
Evgeniy Stepanov 4731ad81c7 [cfi] Build __cfi_check as Thumb when applicable.
Summary:
Cross-DSO CFI needs all __cfi_check exports to use the same encoding
(ARM vs Thumb).

Reviewers: pcc

Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37243

llvm-svn: 312052
2017-08-29 22:29:15 +00:00
Benjamin Kramer 154411e0e7 [FunctionImport] Avoid unused variable warnings in Release builds
Just skip the entire block in NDEBUG. No functionality change intended.

llvm-svn: 312031
2017-08-29 20:24:39 +00:00
Teresa Johnson 2df7fc7991 [ThinLTO] Clean up stale alias import handling
Summary:
Remove some code that was no longer needed. The first FIXME is
stale since we long ago started using the index to drive importing,
rather than doing force importing based on linkage type. And
now with r309278, we no longer import any aliases.

Reviewers: dblaikie

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D37266

llvm-svn: 312019
2017-08-29 18:15:34 +00:00
Davide Italiano a872519dbd [Inliner] Only compute fully inline cost when remarks are enabled.
Prior to this change (and after r311371), we computed it
unconditionally, causin gsevere compile time regressions (in some
cases, 5 to 10x).

llvm-svn: 311804
2017-08-25 22:01:42 +00:00
Chad Rosier f98335e0b0 [PartialInlining] Formatting. NFC.
llvm-svn: 311702
2017-08-24 21:21:09 +00:00
Chad Rosier 4cb2e82774 [PartialInlining] Type. NFC.
llvm-svn: 311699
2017-08-24 20:29:02 +00:00
Peter Collingbourne 001052a067 WholeProgramDevirt: Create bitcast to i8* at each virtual call site.
We can't reuse the llvm.assume instruction's bitcast because it may not
dominate every user of the vtable pointer.

Differential Revision: https://reviews.llvm.org/D36994

llvm-svn: 311491
2017-08-22 21:41:19 +00:00
Haicheng Wu 0812c5bea3 [InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes.
Currently, the inline cost model will bail once the inline cost exceeds the
inline threshold in order to avoid unnecessary compile-time. However, when
debugging it is useful to compute the full cost, so this command line option
is added to override the default behavior.

I took over this work from Chad Rosier (mcrosier@codeaurora.org).

Differential Revision: https://reviews.llvm.org/D35850

llvm-svn: 311371
2017-08-21 20:00:09 +00:00
Sam Elliott e963c89d11 Migrate WholeProgramDevirt to new Optimization Remark API
Summary:
This is an attempt to move WholeProgramDevirt to the new remark API.

https://bugs.llvm.org/show_bug.cgi?id=33793

Reviewers: anemet

Reviewed By: anemet

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36943

llvm-svn: 311352
2017-08-21 16:57:21 +00:00
Sam Elliott e604b563ea Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311349
2017-08-21 16:45:47 +00:00
Sam Elliott 7fe0aaa140 Revert "Emit only A Single Opt Remark When Inlining"
Reverting due to clang build failure

llvm-svn: 311274
2017-08-20 06:55:10 +00:00
Sam Elliott 785dd75369 Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311273
2017-08-20 06:43:34 +00:00
Teresa Johnson 73305f82e9 [ThinLTO] Fix ThinLTO crash
Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

llvm-svn: 311254
2017-08-19 18:04:25 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Eli Friedman 51cf2604b6 [OptDiag] Updating Remarks in SampleProfile
Updating remark API to newer OptimizationDiagnosticInfo API. This
allows remarks to show up in diagnostic yaml file, and enables use
of opt-viewer tool.

Hotness information for remarks (L505 and L751) do not display hotness
information, most likely due to profile information not being
propagated yet. Unsure if this is the desired outcome.

Patch by Tarun Rajendran.

Differential Revision: https://reviews.llvm.org/D36127

llvm-svn: 310763
2017-08-11 21:12:04 +00:00
Xinliang David Li 24524f314c Fix typo /NFC
llvm-svn: 310737
2017-08-11 17:49:20 +00:00
Davide Italiano c163fac184 [GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI.
llvm-svn: 310453
2017-08-09 09:23:29 +00:00
Reid Kleckner da748f1c3d [ArgPromotion] Preserve alignment of byval argument in new alloca
The frontend may have requested a higher alignment for any reason, and
downstream optimizations may already have taken advantage of it.  We
should keep the same alignment when moving the allocation from the
parameter area to the local variable area.

Fixes PR34038

llvm-svn: 310071
2017-08-04 17:09:11 +00:00
Victor Leschuk 56b03d0dd6 Un-revert r310014: false revert, it wasn't the cause of build break
llvm-svn: 310021
2017-08-04 04:51:15 +00:00
Victor Leschuk 21713ebfb1 Revert r310014 as it breaks build lld-x86_64-darwin13
llvm-svn: 310020
2017-08-04 04:43:54 +00:00
Adrian Prantl fd8c8e9fe6 Teach GlobalSRA to update the debug info for split-up globals.
This is similar to what we are doing in "regular" SROA and creates
DW_OP_LLVM_fragment operations to describe the resulting variables.

rdar://problem/33654891

llvm-svn: 310014
2017-08-04 01:19:54 +00:00
Chandler Carruth 95055d8f8b [PM] Fix a bug where through CGSCC iteration we can get
infinite-inlining across multiple runs of the inliner by keeping a tiny
history of internal-to-SCC inlining decisions.

This is still a bit gross, but I don't yet have any fundamentally better
ideas and numerous people are blocked on this to use new PM and ThinLTO
together.

The core of the idea is to detect when we are about to do an inline that
has a chance of re-splitting an SCC which we have split before with
a similar inlining step. That is a critical component in the inlining
forming a cycle and so far detects all of the various cyclic patterns
I can come up with as well as the original real-world test case (which
comes from a ThinLTO build of libunwind).

I've added some tests that I think really demonstrate what is going on
here. They are essentially state machines that march the inliner through
various steps of a cycle and check that we stop when the cycle is closed
and that we actually did do inlining to form that cycle.

A lot of thanks go to Eric Christopher and Sanjoy Das for the help
understanding this issue and improving the test cases.

The biggest "yuck" here is the layering issue -- the CGSCC pass manager
is providing somewhat magical state to the inliner for it to use to make
itself converge. This isn't great, but I don't honestly have a lot of
better ideas yet and at least seems nicely isolated.

I have tested this patch, and it doesn't block *any* inlining on the
entire LLVM test suite and SPEC, so it seems sufficiently narrowly
targeted to the issue at hand.

We have come up with hypothetical issues that this patch doesn't cover,
but so far none of them are practical and we don't have a viable
solution yet that covers the hypothetical stuff, so proceeding here in
the interim. Definitely an area that we will be back and revisiting in
the future.

Differential Revision: https://reviews.llvm.org/D36188

llvm-svn: 309784
2017-08-02 02:09:22 +00:00
Peter Collingbourne bcd204b478 Update phi nodes in LowerTypeTests control flow simplification
D33925 added a control flow simplification for -O2 --lto-O0 builds that
manually splits blocks and reassigns conditional branches but does not
correctly update phi nodes. If the else case being branched to had
incoming phi nodes the control-flow simplification would leave phi nodes
in that BB with an unhandled predecessor.

Patch by Vlad Tsyrklevich!

Differential Revision: https://reviews.llvm.org/D36012

llvm-svn: 309621
2017-07-31 20:43:07 +00:00
Florian Hahn 6b3216aad8 Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
Dehao Chen 8260d66556 Increase the ImportHotMultiplier to 10.0
Summary: The original 3.0 hot mupltiplier is too small, and would prevent hot callsites from being inline. This patch increases the hot multilier to 10.0

Reviewers: davidxl, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D35969

llvm-svn: 309344
2017-07-28 01:02:34 +00:00
whitequark 9e8197ac8f [MergeFunctions] Remove alias support.
The alias support was dead code since 2011. It was last touched
in r124182, where it was reintroduced after being removed
in r110434, and since then it was gated behind a HasGlobalAliases
flag that was permanently stuck as `false`.

It is also broken. I'm not sure if it bitrotted or was just broken
in the first place because it appears to have never been tested,
but the following IR results in a crash:

    define internal i32 @a(i32 %a, i32 %b) unnamed_addr {
      %c = add i32 %a, %b
      %d = xor i32 %a, %c
      ret i32 %c
    }

    define internal i32 @b(i32 %a, i32 %b) unnamed_addr {
      %c = add i32 %a, %b
      %d = xor i32 %a, %c
      ret i32 %c
    }

It seems safe to remove buggy untested code that no one cared about
for seven years.

Differential Revision: https://reviews.llvm.org/D34802

llvm-svn: 309313
2017-07-27 19:36:13 +00:00
Davide Italiano 82c7d3768d [FunctionImport] Prefer isa<> to dyn_cast<> as the value is not used.
This change makes GCC7 happy again.

llvm-svn: 309305
2017-07-27 18:38:09 +00:00
David Blaikie 2f0cc477ab ThinLTO: Don't import aliases of any kind (even linkonce_odr)
Summary:
Until a more advanced version of importing can be implemented for
aliases (one that imports an alias as an available_externally definition
of the aliasee), skip the narrow subset of cases that was possible but
came at a cost: aliases of linkonce_odr functions could be imported
because the linkonce_odr function could be safely duplicated from the
source module. This came/comes at the cost of not being able to 'home'
imported linkonce functions (they had to be emitted linkonce_odr in all
the destination modules (even if they weren't used by an alias) rather
than as available_externally - causing extra object size).

Tangentially, this also was the only reason ThinLTO would emit multiple
CUs in to the resulting DWARF - which happens to be a problem for
Fission (there's a fix for this in GDB but not released yet, etc).
(actually it's not the only reason - but I'm sending a patch to fix the
other reason shortly)

There's no reason to believe this particularly narrow alias importing
was especially/meaningfully important, only that it was /possible/ to
implement in this way. When a more general solution is done, it should
still satisfy the DWARF concerns above, since the import will still be
available_externally, and thus not create extra CUs.

Since now all aliases are treated the same, I removed/simplified some
test cases since they were testing corner cases where there are no
longer any corners.

Reviewers: tejohnson, mehdi_amini

Differential Revision: https://reviews.llvm.org/D35875

llvm-svn: 309278
2017-07-27 15:09:06 +00:00
Haojie Wang 1dec57d5b0 ThinLTO Minimized Bitcode File Size Reduction
Summary: Currently the ThinLTO minimized bitcode file only strip the debug info, but there is still a lot of information in the minimized bit code file that will be not used for thin linker. In this patch, most of the extra information is striped to reduce the minimized bitcode file. Now only ModuleVersion, ModuleInfo, ModuleGlobalValueSummary, ModuleHash, Symtab and Strtab are left. Now the minimized bitcode file size is reduced to 15%-30% of the debug info stripped bitcode file size.

Reviewers: danielcdh, tejohnson, pcc

Reviewed By: pcc

Subscribers: mehdi_amini, aprantl, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D35334

llvm-svn: 308760
2017-07-21 17:25:20 +00:00
Peter Collingbourne 6f6788b99c LowerTypeTests: Drop function type metadata only if we're going to replace it.
Previously we were (mis)handling jump table members with a prevailing
definition in a full LTO module and a non-prevailing definition in a
ThinLTO module by dropping type metadata on those functions entirely,
which would cause type tests involving such functions to fail.

This patch causes us to drop metadata only if we are about to replace
it with metadata from cfi.functions.

We also want to replace metadata for available_externally functions,
which can arise in the opposite scenario (prevailing ThinLTO
definition, non-prevailing full LTO definition). The simplest way
to handle that is to remove the definition; there's little value in
keeping it around at this point (i.e. after most optimization passes
have already run) and later code will try to use the function's linkage
to create an alias, which would result in invalid IR if the function
is available_externally.

Fixes PR33832.

Differential Revision: https://reviews.llvm.org/D35604

llvm-svn: 308642
2017-07-20 18:02:05 +00:00
Peter Collingbourne 93fdaca5ac ThinLTOBitcodeWriter: Do not rewrite intrinsic functions when splitting modules.
Changing the type of an intrinsic may invalidate the IR.

Differential Revision: https://reviews.llvm.org/D35593

llvm-svn: 308500
2017-07-19 17:54:29 +00:00
Chandler Carruth 06a86301a1 [PM/LCG] Follow-up fix to r308088 to handle deletion of library
functions.

In the prior commit, we provide ordering to the LCG between functions
and library function definitions that they might begin to call through
transformations. But we still would delete these library functions from
the call graph if they became dead during inlining.

While this immediately crashed, it also exposed a loss of information.
We shouldn't remove definitions of library functions that can still
usefully participate in the LCG-powered CGSCC optimization process. If
new call edges are formed, we want to have definitions to be called.

We can still remove these functions if truly dead using global-dce, etc,
but removing them during the CGSCC walk is premature.

This fixes a crash in the new PM when optimizing some unusual libraries
that end up with "internal" lib functions such as the code in the "R"
language's libraries.

llvm-svn: 308417
2017-07-19 04:12:25 +00:00
Teresa Johnson f9dc3deaa6 Revert "Restore with fix "[ThinLTO] Ensure we always select the same function copy to import""
This reverts commit r308114 (and follow on fixes to test).

There is a linking failure in a ThinLTO bot:
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rthinlto_build/3663/

(and undefined reference). It seems like it must be a second order
effect of the heuristic change I made, and may take some time to try
to reproduce locally and track down. Therefore, reverting for now.

llvm-svn: 308206
2017-07-17 19:25:38 +00:00
Teresa Johnson a7660b0127 Restore with fix "[ThinLTO] Ensure we always select the same function copy to import"
This restores r308078/r308079 with a fix for bot non-determinisim (make
sure we run llvm-lto in single threaded mode so the debug output doesn't get
interleaved).

llvm-svn: 308114
2017-07-15 22:58:06 +00:00
Chandler Carruth d78a38ed2e Revert r308078 (and subsequent tweak in r308079) which introduces a test
that appears to exhibit non-determinism and is flaking on the bots
pretty consistently.

r308078: [ThinLTO] Ensure we always select the same function copy to import
r308079: Require asserts in new test that uses debug flag
llvm-svn: 308095
2017-07-15 13:50:26 +00:00
Teresa Johnson 82b4fb1afe [ThinLTO] Ensure we always select the same function copy to import
Summary:
Check if the first eligible callee is under the instruction threshold.
Checking this on the first eligible callee ensures that we don't end
up selecting different callees to import when we invoke this routine
with different thresholds due to reaching the callee via paths that
are shallower or hotter (when there are multiple copies, i.e. with
weak or linkonce linkage). We don't want to leave the decision of which
copy to import up to the backend.

Reviewers: mehdi_amini

Subscribers: inglorion, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D35436

llvm-svn: 308078
2017-07-15 04:53:05 +00:00
Jakub Kuderski b292c22c8d [Dominators] Make IsPostDominator a template parameter
Summary:
DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere.

This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction.

Reviewers: dberlin, sanjoy, davide, grosser

Reviewed By: dberlin

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35315

llvm-svn: 308040
2017-07-14 18:26:09 +00:00
Davide Italiano c3dc055780 Reapply [GlobalOpt] Remove unreachable blocks before optimizing a function.
This commit reapplies r307215 now that we found out and fixed
the cause of the cfi test failure (in r307871).

llvm-svn: 307920
2017-07-13 15:40:59 +00:00
Peter Collingbourne cacac6a104 LowerTypeTests: When importing functions skip definitions where the summary contains a decl.
This normally indicates mixed CFI + non-CFI compilation, and will
result in us treating the function in the same way as a function
defined outside of the LTO unit.

Part of PR33752.

Differential Revision: https://reviews.llvm.org/D35281

llvm-svn: 307744
2017-07-12 00:39:12 +00:00
Davide Italiano b8ad3eebca [IPO] Temporarily rollback r307215.
[GlobalOpt] Remove unreachable blocks before optimizing a function.
While the change is presumably correct, it exposes a latent bug
in DI which breaks on of the CFI checks. I'll analyze it further
and try to understand what's going on.

llvm-svn: 307729
2017-07-11 23:10:17 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Chandler Carruth 01f0c8a8c4 [PM/ThinLTO] Fix PR33536, a bug where the ThinLTO bitcode writer was
querying for analysis results on a function declaration rather than
a definition.

The only reason this worked previously is by chance -- because the way
we got alias analysis results with the legacy PM, we happened to not
compute a dominator tree and so we happened to not hit an assert even
though it didn't make any real sense. Now we bail out before trying to
compute alias analysis so that we don't hit these asserts.

llvm-svn: 307625
2017-07-11 05:39:20 +00:00
Mikael Holmen e0ced14449 [ArgumentPromotion] Change use of removed argument in llvm.dbg.value to undef
Summary:
This solves PR33641.

When removing a dead argument we must also handle possibly existing calls
to llvm.dbg.value that use the removed argument. Now we change the use
of the otherwise dead argument to an undef for some other pass to cleanup
later.

If the calls are left untouched, they will later on cause errors:
 "function-local metadata used in wrong function"
since the ArgumentPromotion rewrites the code by creating a new function
with the wanted signature, but the metadata is not recreated so the new
function may then erroneously use metadata from the old function.

Reviewers: mstorsjo, rnk, arsenm

Reviewed By: rnk

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D34874

llvm-svn: 307521
2017-07-10 06:07:24 +00:00
Chandler Carruth bd9c29039e [PM] Finish implementing and fix a chain of bugs uncovered by testing
the invalidation propagation logic from an SCC to a Function.

I wrote the infrastructure to test this but didn't actually use it in
the unit test where it was designed to be used. =[ My bad. Once
I actually added it to the test case I discovered that it also hadn't
been properly implemented, so I've implemented it. The logic in the FAM
proxy for an SCC pass to propagate invalidation follows the same ideas
as the FAM proxy for a Module pass, but the implementation is a bit
different to reflect the fact that it is forwarding just for an SCC.

However, implementing this correctly uncovered a surprising "bug" (it
was conservatively correct but relatively very expensive) in how we
handle invalidation when splitting one SCC into multiple SCCs. We did an
eager invalidation when in reality we should be deferring invaliadtion
for the *current* SCC to the CGSCC pass manager and just invaliating the
newly constructed SCCs. Otherwise we end up invalidating too much too
soon. This was exposed by the inliner test case that I've updated. Now,
we invalidate *just* the split off '(test1_f)' SCC when doing the CG
update, and then the inliner finishes and invalidates the '(test1_g,
test1_h)' SCC's analyses. The first few attempts at fixing this hit
still more bugs, but all of those are covered by existing tests. For
example, the inliner should also preserve the FAM proxy to avoid
unnecesasry invalidation, and this is safe because the CG update
routines it uses handle any necessary adjustments to the FAM proxy.

Finally, the unittests for the CGSCC pass manager needed a bunch of
updates where we weren't correctly preserving the FAM proxy because it
hadn't been fully implemented and failing to preserve it didn't matter.

Note that this doesn't yet fix the current crasher due to MemSSA finding
a stale dominator tree, but without this the fix to that crasher doesn't
really make any sense when testing because it relies on the proxy
behavior.

llvm-svn: 307487
2017-07-09 03:59:31 +00:00
Dehao Chen 64c46574b0 Increase the import-threshold for crtical functions.
Summary: For interative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it in before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that needs to be imported and creates virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces "critical" call edge, and assign much higher importing threshold for those edges.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D35096

llvm-svn: 307439
2017-07-07 21:01:00 +00:00
Davide Italiano f4891d29f8 [lib/LTO] Add a comment to explain where we set the linkage in the summary.
Pointed out by Teresa!

llvm-svn: 307305
2017-07-06 20:04:20 +00:00
Davide Italiano 6a5fbe52fa [LTO] Fix the interaction between linker redefined symbols and ThinLTO
This is the same as r304719 but for ThinLTO.
The substantial difference is that in this case we don't have
whole visibility, just the summary.
In the LTO case, when we got the resolution for the input file we
could just see if the linker told us whether a symbol was linker
redefined (using --wrap or --defsym) and switch the linkage directly
for the GV.

Here, we have the summary. So, we record that the linkage changed
from <whatever it was> to $weakany to prevent IPOs across this symbol
boundaries and actually just switch the linkage at FunctionImport time.

This patch should also fixes the lld bits (as all the scaffolding for
communicating if a symbol is linker redefined should be there & should
be the same), but I'll make sure to add some tests there as well.

Fixes PR33192.

Differential Revision:  https://reviews.llvm.org/D35064

llvm-svn: 307303
2017-07-06 19:58:26 +00:00
Frederich Munch 52dfcd18d1 Avoid constructing GlobalExtensions only to find out it is empty.
Summary:
GlobalExtensions is dereferenced twice, once for iteration and then a check if it is empty.
As a ManagedStatic this dereference forces it's construction which is unnecessary.

Reviewers: efriedma, davide, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: chapuni, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D33381

llvm-svn: 307229
2017-07-06 00:09:09 +00:00
Davide Italiano 7dd0694f96 [GlobalOpt] Remove unreachable blocks before optimizing a function.
LLVM's definition of dominance allows instructions that are cyclic
in unreachable blocks, e.g.:

  %pat = select i1 %condition, @global, i16* %pat

because any instruction dominates an instruction in a block that's
not reachable from entry.
So, remove unreachable blocks from the function, because a) there's
no point in analyzing them and b) GlobalOpt should otherwise grow
some more complicated logic to break these cycles.

Differential Revision:  https://reviews.llvm.org/D35028

llvm-svn: 307215
2017-07-05 22:28:28 +00:00
Chandler Carruth 3545a9e1f9 Remove the BBVectorize pass.
It served us well, helped kick-start much of the vectorization efforts
in LLVM, etc. Its time has come and past. Back in 2014:
http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html

Time to actually let go and move forward. =]

I've updated the release notes both about the removal and the
deprecation of the corresponding C API.

llvm-svn: 306797
2017-06-30 07:09:08 +00:00
Dehao Chen 2f31d0d86e Hook the sample PGO machinery in the new PM
Summary: This patch hooks up SampleProfileLoaderPass with the new PM.

Reviewers: chandlerc, davidxl, davide, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: tejohnson, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34720

llvm-svn: 306763
2017-06-29 23:33:05 +00:00
Peter Collingbourne 92648c25a4 Bitcode: Write the irsymtab to disk.
Differential Revision: https://reviews.llvm.org/D33973

llvm-svn: 306487
2017-06-27 23:50:11 +00:00
Geoff Berry 2573a19fe6 [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE.
llvm-svn: 306477
2017-06-27 22:25:02 +00:00
Dehao Chen 66131665c4 Enable ICP for AutoFDO.
Summary: AutoFDO should have ICP enabled.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D34662

llvm-svn: 306429
2017-06-27 17:23:33 +00:00
Xinliang David Li b67530e9b9 [PGO] Implementate profile counter regiser promotion
Differential Revision: http://reviews.llvm.org/D34085

llvm-svn: 306231
2017-06-25 00:26:43 +00:00
Eric Christopher 5a7c2f1700 Remove the LoadCombine pass. It was never enabled and is unsupported.
Based on discussions with the author on mailing lists.

llvm-svn: 306067
2017-06-22 22:58:12 +00:00
Dehao Chen 50f2aa19e8 Do not inline recursive direct calls in sample loader pass.
Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls.

Reviewers: iteratee, davidxl

Reviewed By: iteratee

Subscribers: sanjoy, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D34456

llvm-svn: 305934
2017-06-21 17:57:43 +00:00
Xinliang David Li c3f8e83253 Fix function name /NFC
llvm-svn: 305564
2017-06-16 16:54:13 +00:00
Evgeniy Stepanov 4d4ee93d25 [cfi] CFI-ICall for ThinLTO.
Implement ControlFlowIntegrity for indirect function calls in ThinLTO.
Design follows the RFC in llvm-dev, see
https://groups.google.com/d/msg/llvm-dev/MgUlaphu4Qc/kywu0AqjAQAJ

llvm-svn: 305533
2017-06-16 00:18:29 +00:00
Xinliang David Li eea0ade2eb [PartialInlining] Code Refactoring
This is a NFC code refactoring and interface cleanup. This paves the
way to enable outlining-only mode for the partial inliner.

llvm-svn: 305530
2017-06-15 23:56:59 +00:00
Frederich Munch 6391c7e2a1 Revert r305313 & r305303, self-hosting build-bot isn’t liking it.
llvm-svn: 305318
2017-06-13 19:05:24 +00:00
Frederich Munch 4c73b40dca Force RegisterStandardPasses to construct std::function in the IPO library.
Summary: Fixes an issue using RegisterStandardPasses from a statically linked object before PassManagerBuilder::addGlobalExtension is called from a dynamic library.

Reviewers: efriedma, theraven

Reviewed By: efriedma

Subscribers: mehdi_amini, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33515

llvm-svn: 305303
2017-06-13 16:48:41 +00:00
David Blaikie 6d0f39476a Inliner: Avoid calling shouldInline until it's absolutely necessary
This restores the order of evaluation (& conditionalized evaluation) of
isTriviallyDeadInstruction, InlineHistoryIncludes, and shouldInline
(with the addition of a shouldInline call after
isTriviallyDeadInstruction) from before r305245.

llvm-svn: 305267
2017-06-13 02:24:09 +00:00
David Blaikie ae8c4af4ac Inliner: Don't remove calls to readnone+nounwind (but not always_inline) functions in the AlwaysInliner
llvm-svn: 305245
2017-06-12 23:01:17 +00:00
Geoff Berry 3cca1da20c [EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default).
Summary:
Use MemorySSA for memory dependency checking in the EarlyCSE pass at the
start of the function simplification portion of the pipeline.  We rely
on the fact that GVNHoist runs just after this pass of EarlyCSE to
amortize the MemorySSA construction cost since GVNHoist uses MemorySSA
and EarlyCSE preserves it.

This is turned off by default.  A follow-up change will turn it on to
allow for easier reversion in case it breaks something.

llvm-svn: 305146
2017-06-10 15:20:03 +00:00
David Blaikie cb9327b02d Inliner: Don't touch indirect calls
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.

llvm-svn: 305052
2017-06-09 03:29:20 +00:00
Evgeniy Stepanov d02dbf6b1c [CFI] Remove LinkerSubsectionsViaSymbols.
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.

It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.

This is the second attempt to land this change after fixing PR33316.

llvm-svn: 305031
2017-06-08 23:38:22 +00:00
Craig Topper 2aa4d39f5e [ExtractGV] Fix the doxygen comment on the constructor and the class to refer to global values instead of functions. While there fix an 80 column violation. NFC
llvm-svn: 305030
2017-06-08 23:38:19 +00:00
Peter Collingbourne e357fbd243 Write summaries for merged modules when splitting modules for ThinLTO.
This is to prepare to allow for dead stripping of globals in the
merged modules.

Differential Revision: https://reviews.llvm.org/D33921

llvm-svn: 305027
2017-06-08 23:01:49 +00:00
Dehao Chen e2a428bad7 Do not early-inline recursive calls in sample profile loader.
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.

Reviewers: davidxl, dnovillo, iteratee

Reviewed By: iteratee

Subscribers: iteratee, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34017

llvm-svn: 305009
2017-06-08 20:11:57 +00:00
Peter Collingbourne aaae7eed5c LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else).
This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.

Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.

Differential Revision: https://reviews.llvm.org/D33925

llvm-svn: 304921
2017-06-07 15:49:14 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Evgeniy Stepanov 704003ea3d Revert "[CFI] Remove LinkerSubsectionsViaSymbols."
This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin.

llvm-svn: 304626
2017-06-03 00:46:27 +00:00