Commit Graph

4233 Commits

Author SHA1 Message Date
Johannes Doerfert 5e06b2514a [Attributor][FIX] Carefully handle/ignore/forget `argmemonly`
When we have an existing `argmemonly` or `inaccessiblememorargmemonly`
we used to "know" that information. However, interprocedural constant
propagation can invalidate these attributes. We now ignore and remove
these attributes for internal functions (which may be affected by IP
constant propagation), if we are deriving new attributes for the
function.
2020-05-10 19:06:11 -05:00
Johannes Doerfert 713ee3aa77 [Attributor] Use "simplify to constant" in genericValueTraversal
As we replace values with constants interprocedurally, we also need to
do this "look-through" step during the generic value traversal or we
would derive properties from replaced values. While this is often not
problematic, it is when we use the "kind" of a value for reasoning,
e.g., accesses to arguments allow `argmemonly`.
2020-05-10 19:06:11 -05:00
Johannes Doerfert 513ac6e9b0 [Attributor] Ignore illegal accesses to `null`
When we categorize a pointer value we bailed at `null` before. If we
know `null` is not a valid memory location we can ignore it as there
won't be an access at all.
2020-05-10 19:06:10 -05:00
Johannes Doerfert 31c03b9223 [Attributor] Use existing helpers to determine IR facts
We now use getPointerDereferenceableBytes to determine `nonnull` and
`dereferenceable` facts from the IR. We also use getPointerAlignment in
AAAlign for the same reason. The latter can interfere with callbacks so
we do restrict it to non-function-pointers for now.
2020-05-10 19:06:10 -05:00
Johannes Doerfert a9ee8b492c [Attributor][NFC] Clang format Attributor*.cpp 2020-05-10 19:06:10 -05:00
Johannes Doerfert edf0391491 [Attributor][FIX] Record dependences for assumed dead abstract attributes
In a recent patch we introduced a problem with abstract attributes that
were assumed dead at some point. Since `Attributor::updateAA` was
introduced in 95e0d28b71, we did not
remember the dependence on the liveness AA when an abstract attribute
was assumed dead and therefore not updated.

Explicit reproducer added in liveness.ll.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 509242 (345483/s)
temporary memory allocations: 98666 (66937/s)
peak heap memory consumption: 18.60MB
peak RSS (including heaptrack overhead): 103.29MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 529332 (355494/s)
temporary memory allocations: 102107 (68574/s)
peak heap memory consumption: 19.40MB
peak RSS (including heaptrack overhead): 102.79MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: 20090 (1339333/s)
temporary memory allocations: 3441 (229400/s)
peak heap memory consumption: 801.45KB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```
2020-05-07 17:00:50 -05:00
Johannes Doerfert 675334daef [Attributor] Mark dependence as optional 2020-05-07 17:00:50 -05:00
Johannes Doerfert f014972446 [Attributor][NFC] Cleanup some AAMemoryLocation code
This is the first step to resolve a TODO in AAMemoryLocation and to fix
a bug we have when handling `byval` arguments of `readnone` call sites.

No functional change intended.
2020-05-05 23:15:33 -05:00
Johannes Doerfert 0cc9c02255 [Attributor][NFC] Minor code cleanups to minimize follow up diffs 2020-05-05 23:14:23 -05:00
Johannes Doerfert 094137a6c6 [Attributor][NFC] Avoid dependences on known information 2020-05-05 23:14:23 -05:00
Kazu Hirata e8984fe65b [Inlining] Teach shouldBeDeferred to take the total cost into account
Summary:
This patch teaches shouldBeDeferred to take into account the total
cost of inlining.

Suppose we have a call hierarchy {A1,A2,A3,...}->B->C.  (Each of A1,
A2, A3, ... calls B, which in turn calls C.)

Without this patch, shouldBeDeferred essentially returns true if

  TotalSecondaryCost < IC.getCost()

where TotalSecondaryCost is the total cost of inlining B into As.
This means that if B is a small wraper function, for example, it would
get inlined into all of As.  In turn, C gets inlined into all of As.
In other words, shouldBeDeferred ignores the cost of inlining C into
each of As.

This patch adds an option, inline-deferral-scale, to replace the
expression above with:

  TotalCost < Allowance

where

- TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and
- Allowance is IC.getCost() * Scale

For now, the new option defaults to -1, disabling the new scheme.

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79138
2020-05-05 11:02:06 -07:00
Arthur Eubanks d056c0c71f Remove unnecessary check for inalloca in IPConstantPropagation
Summary:
This was added in https://reviews.llvm.org/D2449, but I'm not sure it's
necessary since an inalloca value is never a Constant (should be an
AllocaInst).

Reviewers: hans, rnk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79350
2020-05-05 08:26:11 -07:00
Johannes Doerfert 14cb0bdf2b [Attributor][NFC] Replace the nested AAMap with a key pair
No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 512375 (362871/s)
temporary memory allocations: 98746 (69933/s)
peak heap memory consumption: 22.54MB
peak RSS (including heaptrack overhead): 106.78MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 509833 (338534/s)
temporary memory allocations: 98902 (65671/s)
peak heap memory consumption: 18.71MB
peak RSS (including heaptrack overhead): 103.00MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -2542 (-27042/s)
temporary memory allocations: 156 (1659/s)
peak heap memory consumption: -3.83MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```
2020-05-03 22:10:47 -05:00
Johannes Doerfert 95e0d28b71 [Attributor] Remember only necessary dependences
Before we eagerly put dependences into the QueryMap as soon as we
encountered them (via `Attributor::getAAFor<>` or
`Attributor::recordDependence`). Now we will wait to see if the
dependence is useful, that is if the target is not already in a fixpoint
state at the end of the update. If so, there is no need to record the
dependence at all.

Due to the abstraction via `Attributor::updateAA` we will now also treat
the very first update (during attribute creation) as we do subsequent
updates.

Finally this resolves the problematic usage of QueriedNonFixAA.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 554675 (389245/s)
temporary memory allocations: 101574 (71280/s)
peak heap memory consumption: 28.46MB
peak RSS (including heaptrack overhead): 116.26MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 512465 (345559/s)
temporary memory allocations: 98832 (66643/s)
peak heap memory consumption: 22.54MB
peak RSS (including heaptrack overhead): 106.58MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -42210 (-727758/s)
temporary memory allocations: -2742 (-47275/s)
peak heap memory consumption: -5.92MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```
2020-05-03 22:01:51 -05:00
Johannes Doerfert 231026a508 [Attributor] Inititialize "value attributes" w/ must-be-executed-context info
Attributes that only depend on the value (=bit pattern) can be
initialized from uses in the must-be-executed-context (MBEC). We did use
`AAComposeTwoGenericDeduction` and `AAFromMustBeExecutedContext` before
to do this for some positions of these attributes but not for all. This
was fairly complicated and also problematic as we did run it in every
`updateImpl` call even though we only use known information. The new
implementation removes `AAComposeTwoGenericDeduction`* and
`AAFromMustBeExecutedContext` in favor of a simple interface
`AddInformation::fromMBEContext(...)` which we call from the
`initialize` methods of the "value attribute" `Impl` classes, e.g.
`AANonNullImpl:initialize`.

There can be two types of test changes:
  1) Artifacts were we miss some information that was known before a
     global fixpoint was reached and therefore available in an update
     but not at the beginning.
  2) Deduction for values we did not derive via the MBEC before or which
     were not found as the `AAFromMustBeExecutedContext::updateImpl` was
     never invoked.

* An improved version of AAComposeTwoGenericDeduction can be found in
  D78718. Once we find a new use case that implementation will be able
  to handle "generic" AAs better.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 468428 (328952/s)
temporary memory allocations: 77480 (54410/s)
peak heap memory consumption: 32.71MB
peak RSS (including heaptrack overhead): 122.46MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 554720 (351310/s)
temporary memory allocations: 101650 (64376/s)
peak heap memory consumption: 28.46MB
peak RSS (including heaptrack overhead): 116.75MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: 86292 (556722/s)
temporary memory allocations: 24170 (155935/s)
peak heap memory consumption: -4.25MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D78719
2020-05-03 21:41:22 -05:00
Johannes Doerfert 87f1e93945 [Attributor][NFC] Use reference instead of pointer 2020-05-03 21:38:06 -05:00
Johannes Doerfert 2f97b8b891 [Attributor][NFC] Proactively ask for `nocapure` on call site arguments
This minimizes test noise later on and is in line with other attributes
we derive proactively.
2020-05-03 21:38:06 -05:00
Sergey Dmitriev 0f70f73308 [Attributor] Bitcast constant to the returned value type if it has different type
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79277
2020-05-03 11:46:13 -07:00
Mircea Trofin bec4ab95a4 [llvm][NFC] Inliner: factor cost and reporting out of inlining process
Summary:
This factors cost and reporting out of the inlining workflow, thus
making it easier to reuse when driving inlining from the upcoming
InliningAdvisor.

Depends on: D79215

Reviewers: davidxl, echristo

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79275
2020-05-03 10:38:28 -07:00
Johannes Doerfert 8228153f87 [Attributor][NFC] Encode IRPositions in the bits of a single pointer
This reduces memory consumption for IRPositions by eliminating the
vtable pointer and the `KindOrArgNo` integer. Since each abstract
attribute has an associated IRPosition, the 12-16 bytes we save add up
quickly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

Program                                        lhs    rhs    diff
 test-suite...:: CTMark/sqlite3/sqlite3.test    25.07  24.09 -3.9%
 test-suite...Mark/mafft/pairlocalalign.test    14.58  14.14 -3.0%
 test-suite...-typeset/consumer-typeset.test    21.78  21.58 -0.9%
 test-suite :: CTMark/SPASS/SPASS.test          21.95  22.03  0.4%
 test-suite :: CTMark/lencod/lencod.test        25.43  25.50  0.3%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    23.88  23.83 -0.2%
 test-suite...TMark/7zip/7zip-benchmark.test    60.24  60.11 -0.2%
 test-suite :: CTMark/kimwitu++/kc.test         15.69  15.69 -0.0%
 test-suite...:: CTMark/ClamAV/clamscan.test    25.43  25.42 -0.0%
 test-suite :: CTMark/Bullet/bullet.test        37.63  37.62 -0.0%
 Geomean difference                                          -0.8%

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722
2020-05-03 12:15:19 -05:00
Johannes Doerfert 6bf16ee4c5 [Attributor][NFC] Let AbstractAttribute be an IRPosition
Since every AbstractAttribute so far, and for the foreseeable future,
corresponds to a single IRPosition we can simplify the class structure.
We already did this for IRAttribute but there is no reason to stop
there.
2020-05-03 12:13:40 -05:00
Mircea Trofin 667f558c3f [llvm][NFC] Inliner.cpp shouldInline post-commit feedback
Discussion is in https://reviews.llvm.org/D79215
2020-05-03 09:31:31 -07:00
Nikita Popov b7e2358220 Remove getNumUses() comparisons (NFC)
getNumUses() scans the full use list. Don't use it is we only want
to check if there's zero or one uses.
2020-05-02 11:05:19 +02:00
Mircea Trofin 3dbc612cf2 [llvm][NFC] Rename variable as per https://reviews.llvm.org/D79215
Operator error - performed the rename and didn't save.
2020-05-01 16:30:41 -07:00
Mircea Trofin e1c4a7cb16 [llvm][NFC] Inliner: simplify inlining decision logic
Summary:
shouldInline makes a decision based on the InlineCost of a call site, as
well as an evaluation on whether the site should be deferred. This means
it's possible for the decision to be not to inline, even for an
InlineCost that would otherwise allow it.

Both uses of shouldInline performed the exact same logic after calling
it. In addition, the decision on whether to inline or not was
communicated through two values of the Option<InlineCost> return value:
None, or an InlineCost evaluating to false.

Simplified by:
- encapsulating the decision in the return object. The bool it evaluates
to communicates unambiguously the decision. The InlineCost is also
available.
- encapsulated the common post-shouldInline code into shouldInline.

Reviewers: davidxl, echristo, eraman

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79215
2020-05-01 16:18:59 -07:00
Arthur Eubanks a90948fd6e [NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79152
2020-04-30 09:42:13 -07:00
David Spickett 3929429347 [globalopt] Don't emit DWARF fragments for members
of a struct that cover the whole struct

This can happen when the rest of the
members of are zero length. Following
the same pattern applied to the SROA
pass in:
d7f6f1636d

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45335

Differential Revision: https://reviews.llvm.org/D78720
2020-04-30 11:36:55 +01:00
Mircea Trofin 2c7ff270d2 [llvm][NFC] Inliner: rename call site variables.
Summary:
Renamed 'CS' to 'CB', and, in one case, to a more specific name to avoid
naming collision with outer scope (a maintainability/readability reason,
not correctness)

Also updated comments.

Reviewers: davidxl, dblaikie, jdoerfert

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79101
2020-04-29 15:36:29 -07:00
Mircea Trofin 4632b7292a [llvm][NFC] Removed addressed fixme; formatting.
Removed already-addressed fixme, and updated formatting of a few lines
that were triggering Harbormaster.
2020-04-29 09:06:01 -07:00
Mircea Trofin 8a7cf11f92 [llvm][NFC] Refactor APIs operating on CallBase
Summary:
Refactored the parameter and return type where they are too generally
typed as Instruction.

Reviewers: dblaikie, wmi, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79027
2020-04-28 13:23:47 -07:00
David Blaikie 95e570725a OpenMPOpt::RuntimeFunctionInfo::UsesMap: Use unique_ptr for values to simplify memory management 2020-04-28 12:26:53 -07:00
David Blaikie 3c89256d71 Attributor::ArgumentReplacementMap: Use unique_ptr to simplify memory management 2020-04-28 12:26:52 -07:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Sergei Trofimovich 09684b08d3 llvm: IPO: handle IRMover error handling, bug #45636
Summary:
Missing error mangling is noticed in
https://bugs.llvm.org/show_bug.cgi?id=45636
where inconsistent profiling input caused
llvm/lld to crash as:

```
Program aborted due to an unhandled Error:
linking module flags 'ProfileSummary':
  IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o'
```

The change does not change the fact that LLVM crashes
but changes error output to say what was incorrect:

```
LLVM ERROR: Function Import: link error:
  linking module flags 'ProfileSummary':
    IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o'
```

Actual crash has yet to be fixed.

Reviewers: lattner

Reviewed By: lattner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78676
2020-04-25 19:16:01 +01:00
Sergey Dmitriev 67aed1469b [Attributor] Do not set 'returned' attribute for arguments that cannot be bitcasted to function result
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78828
2020-04-25 09:49:40 -07:00
Benjamin Kramer 1d42764df7 Give helpers internal linkage. NFC. 2020-04-25 11:50:52 +02:00
Craig Topper 2c24051bac [CallSite removal] Rename CallSite.h to AbstractCallSite.h. NFC
The CallSite and ImmutableCallSite were removed in a previous
commit. So rename the file to match the remaining class and
the name of the cpp that implements it.
2020-04-24 22:12:25 -07:00
Tyker 42431da895 [AssumeBundles] Use assume bundles in isKnownNonZero
Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero

Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1

Reviewed By: jdoerfert

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76149
2020-04-24 20:41:51 +02:00
Johannes Doerfert 1dfc473177 Revert "[Attributor][NFC] Encode IRPositions in the bits of a single pointer"
A dependent patch has been reverted [0]. Until it goes back in this one
has to stay out.

[0] ebdb893994

This reverts commit d254b50b2b.
2020-04-24 02:53:51 -05:00
Johannes Doerfert d254b50b2b [Attributor][NFC] Encode IRPositions in the bits of a single pointer
This reduces memory consumption for IRPositions by eliminating the
vtable pointer and the `KindOrArgNo` integer. Since each abstract
attribute has an associated IRPosition, the 12-16 bytes we save add up
quickly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

Program                                        lhs    rhs    diff
 test-suite...:: CTMark/sqlite3/sqlite3.test    25.07  24.09 -3.9%
 test-suite...Mark/mafft/pairlocalalign.test    14.58  14.14 -3.0%
 test-suite...-typeset/consumer-typeset.test    21.78  21.58 -0.9%
 test-suite :: CTMark/SPASS/SPASS.test          21.95  22.03  0.4%
 test-suite :: CTMark/lencod/lencod.test        25.43  25.50  0.3%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    23.88  23.83 -0.2%
 test-suite...TMark/7zip/7zip-benchmark.test    60.24  60.11 -0.2%
 test-suite :: CTMark/kimwitu++/kc.test         15.69  15.69 -0.0%
 test-suite...:: CTMark/ClamAV/clamscan.test    25.43  25.42 -0.0%
 test-suite :: CTMark/Bullet/bullet.test        37.63  37.62 -0.0%
 Geomean difference                                          -0.8%

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722
2020-04-24 01:58:47 -05:00
Mircea Trofin cea6f4d5f8 [llvm][NFC][CallSite] Remove CallSite from TypeMetadataUtils & related
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78666
2020-04-23 08:23:16 -07:00
Serguei Katkov c0d2bbb1d4 [CaptureTracking] Replace hardcoded constant to option. NFC.
The motivation is to be able to play with the option and change if it is required.

Reviewers: fedor.sergeev, apilipenko, rnk, jdoerfert
Reviewed By: fedor.sergeev
Subscribers: hiraditya, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D78624
2020-04-23 18:23:35 +07:00
Craig Topper 25807452ac [ArgumentPromotion] Remove unnecessary getScalarType() before casting to PointerType. NFC
I don't believe this pass deals with vectors of pointers. I think
this getScalarType() was added during a mechanical opaque pointer
change of the interface to GetElementPtrInst::getIndexedType.
2020-04-22 22:51:41 -07:00
Craig Topper be04aba6fc [CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC
Differential Revision: https://reviews.llvm.org/D78613
2020-04-22 12:06:58 -07:00
Christopher Tetreault 2dea3f1298 [SVE] Add new VectorType subclasses
Summary:
Introduce new types for fixed width and scalable vectors.

Does not remove getNumElements yet so as to not break code during transition
period.

Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr

Reviewed By: sdesmalen

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm, #lldb

Differential Revision: https://reviews.llvm.org/D77587
2020-04-22 08:59:01 -07:00
Mircea Trofin 1b6b05a250 [llvm][NFC][CallSite] Remove CallSite from a few trivial locations
Summary: Implementation details and internal (to module) APIs.

Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78610
2020-04-22 08:39:21 -07:00
Craig Topper 05a11974ae [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
Johannes Doerfert ca59ff5af9 [Attributor] Replace AccessKind2Accesses map with an "array map"
The number of different access location kinds we track is relatively
small (8 so far). With this patch we replace the DenseMap that mapped
from index (0-7) to the access set pointer with an array of access set
pointers. This reduces memory consumption.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 472499 (215654/s)
temporary memory allocations: 77794 (35506/s)
peak heap memory consumption: 35.28MB
peak RSS (including heaptrack overhead): 125.46MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 472270 (308673/s)
temporary memory allocations: 77578 (50704/s)
peak heap memory consumption: 32.70MB
peak RSS (including heaptrack overhead): 121.78MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: -229 (346/s)
temporary memory allocations: -216 (326/s)
peak heap memory consumption: -2.58MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---
2020-04-22 01:35:27 -05:00
Johannes Doerfert f20ff4b17d [Attributor] Run IRPosition::verify only with EXPENSIVE_CHECKS 2020-04-22 01:35:12 -05:00
Mircea Trofin 9ee02aef62 [llvm][NFC][CallSite] Remove CallSite from FunctionAttrs
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78584
2020-04-21 16:16:00 -07:00
Johannes Doerfert 46b7ed0e6f [Attributor] Remove dependence edges eagerly
If we have a dependence between an abstract attribute A to an abstract
attribute B such hat changes in A should trigger an update of B, we do
not need to keep the dependence around once the update was triggered. If
the dependence is still required the update will reinsert it into the
dependence map, if it is not we avoid triggering B in the future. This
replaces the "recompute interval" mechanism we used before to prune
stale dependences.

Number of required iterations is generally down, compile time for the
module pass (not really the CGSCC pass) is down quite a bit.

There is one test change which looks like an artifact in the undefined
behavior AA that needs to be looked at.
2020-04-21 15:22:10 -05:00
Johannes Doerfert ea439bbcbb [Attributor][NFC] Track the number of created AAs in the statistics 2020-04-21 15:22:10 -05:00
Johannes Doerfert c5794f77eb [Attributor][PM] Introduce `-attributor-enable={none,cgscc,module,all}`
The old command line option `-attributor-disable` was too coarse grained
as we want to measure the effects of the module or cgscc pass without
the other as well.

Since `none` is the default there is no real functional change.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78571
2020-04-21 15:22:10 -05:00
Benjamin Kramer 9a08c30705 Bit-pack some pairs. No functionlity change intended. 2020-04-21 20:40:20 +02:00
Fangrui Song cca545ce46 [CallSite] Fix build breakage after D78538 2020-04-21 11:33:40 -07:00
Mircea Trofin d702325af6 [llvm][NFC][CallSite] Remove CallSite from DeadArgumentElimination
Summary: Also capitalized some induction variables, to match coding style.

Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78538
2020-04-21 10:48:38 -07:00
Johannes Doerfert 177c065e50 [Attributor] Use a pointer value type for the OpcodeInstMap
This reduces memory consumption and the need to copy complex data
structures repeatedly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 490390 (320725/s)
temporary memory allocations: 84601 (55330/s)
peak heap memory consumption: 41.70MB
peak RSS (including heaptrack overhead): 131.18MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 489359 (301144/s)
temporary memory allocations: 82983 (51066/s)
peak heap memory consumption: 36.76MB
peak RSS (including heaptrack overhead): 126.48MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: -1031 (-10739/s)
temporary memory allocations: -1618 (-16854/s)
peak heap memory consumption: -4.94MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

---
2020-04-21 11:20:09 -05:00
Johannes Doerfert 99662c22cd [Attributor] Use a pointer value type for the QueryMap
This reduces memory consumption and the need to copy complex data
structures repeatedly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 596180 (374484/s)
temporary memory allocations: 84979 (53378/s)
peak heap memory consumption: 52.14MB
peak RSS (including heaptrack overhead): 139.79MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 489200 (303285/s)
temporary memory allocations: 83406 (51708/s)
peak heap memory consumption: 41.70MB
peak RSS (including heaptrack overhead): 131.76MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: -106980 (-5094285/s)
temporary memory allocations: -1573 (-74904/s)
peak heap memory consumption: -10.44MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

---
2020-04-21 11:20:04 -05:00
Johannes Doerfert 1f570e019d [Attributor] Use a pointer value type for the access kind -> accesses map
This reduces memory consumption and the need to copy complex data
structures repeatedly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 616219 (381559/s)
temporary memory allocations: 83294 (51575/s)
peak heap memory consumption: 72.15MB
peak RSS (including heaptrack overhead): 160.04MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 595004 (357145/s)
temporary memory allocations: 83840 (50324/s)
peak heap memory consumption: 52.14MB
peak RSS (including heaptrack overhead): 138.32MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: -21215 (-415980/s)
temporary memory allocations: 546 (10705/s)
peak heap memory consumption: -20.01MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

---
2020-04-21 11:20:02 -05:00
Johannes Doerfert 40f3baeb20 [Attributor] Pass the Attributor to the AbstractAttribute constructors
AbstractAttribute::initialize is used to initialize the deduction and
the object we do not always call it. To make sure we have the option to
initialize the object even if initialize is not called we pass the
Attributor to AbstractAttribute constructors now.
2020-04-21 11:20:02 -05:00
Johannes Doerfert 91a6c88349 [Attributor] Use a pointer value type for the AAMap
This reduces memory consumption and the need to copy complex data
structures repeatedly.

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 613353 (376521/s)
temporary memory allocations: 83636 (51341/s)
peak heap memory consumption: 75.64MB
peak RSS (including heaptrack overhead): 162.97MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 616575 (349929/s)
temporary memory allocations: 83650 (47474/s)
peak heap memory consumption: 72.15MB
peak RSS (including heaptrack overhead): 159.81MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: 3222 (24225/s)
temporary memory allocations: 14 (105/s)
peak heap memory consumption: -3.49MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

---
2020-04-21 11:19:58 -05:00
Johannes Doerfert dc3b5b00fe [OpenMPOpt] Make the combination of `ident_t*` deterministic
Before we kept the first applicable `ident_t*` during deduplication of
runtime calls. The problem is that "first" is dependent on the iteration
order of a DenseMap. Since the proper solution, which is to combine the
information from all `ident_t*`, should be deterministic on its own, we
will not try to make the iteration order deterministic. Instead, we will
create a fresh `ident_t*` if there is not a unique existing `ident_t*`
to pick.
2020-04-20 23:27:08 -05:00
Johannes Doerfert 8855fec37e [OpenMPOpt] Use a pointer value type in map
The value type was a set before which can easily lead to excessive
memory usage and copying. We use a pointer to a vector instead now.
2020-04-20 23:27:08 -05:00
Johannes Doerfert ee17263adc [OpenMPOpt] Make the SCC a vector to ensure deterministic results 2020-04-20 23:27:08 -05:00
Mircea Trofin c2d86e1f30 [llvm][NFC][CallSite] Remove CallSite from ArgumentPromotion
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78528
2020-04-20 19:33:42 -07:00
Johannes Doerfert 87aa362985 [Attributor] Use the BumpPtrAllocator in InformationCache as well
We now also use the BumpPtrAllocator from the Attributor in the
InformationCache. The lifetime of objects in either is pretty much the
same and it should result in consistently good performance regardless of
the allocator.

Doing so requires to call more constructors manually but so far that
does not seem to be problematic or messy.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 615359 (368257/s)
temporary memory allocations: 83315 (49859/s)
peak heap memory consumption: 75.64MB
peak RSS (including heaptrack overhead): 163.43MB
total memory leaked: 269.04KB
```

After:
```
calls to allocation functions: 613042 (359555/s)
temporary memory allocations: 83322 (48869/s)
peak heap memory consumption: 75.64MB
peak RSS (including heaptrack overhead): 162.92MB
total memory leaked: 269.04KB
```

Difference:
```
calls to allocation functions: -2317 (-68147/s)
temporary memory allocations: 7 (205/s)
peak heap memory consumption: 2.23KB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

---
2020-04-20 21:12:41 -05:00
Craig Topper 4cf6d4ab48 [CallSite removal][CalledValuePropagation] Use CallBase instead of CallSite. NFC
Differential Revision: https://reviews.llvm.org/D78467
2020-04-19 22:05:40 -07:00
Florian Hahn a7aaadc135 [TTI] Clean up includes (NFC).
Remove some unnecessary includes, replace some with forward
declarations.

This also exposed a few places that were missing some includes.
2020-04-19 20:11:59 +01:00
Craig Topper 5f6d93c7d3 [CallSite removal][Attributor] Replaces use of CallSite with CallBase. NFC
Differential Revision: https://reviews.llvm.org/D78343
2020-04-17 10:44:31 -07:00
Craig Topper 8c94d616e1 Revert "[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC"
There were extra changes that weren't supposed to be in there

This reverts commit b91f78db37.
2020-04-17 10:11:22 -07:00
Craig Topper b91f78db37 [CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC
There are also some adjustments to use MaybeAlign in here due
to CallBase::getParamAlignment() being deprecated. It would
be cleaner if getOrEnforceKnownAlignment was migrated
to Align/MaybeAlign.

Differential Revision: https://reviews.llvm.org/D78345
2020-04-17 10:07:20 -07:00
Craig Topper 5034df8600 [SampleProfile] Use CallBase in function arguments and data structures to reduce the number of explicit casts. NFCI
Removing CallSite left us with a bunch of explicit casts from
Instruction to CallBase. This moves the casts earlier so that
function arguments and data structure types are CallBase so
we don't have to cast when we use them.

Differential Revision: https://reviews.llvm.org/D78246
2020-04-16 22:10:34 -07:00
Craig Topper 798b262c3c [CallSite removal][IPO] Change implementation of AbstractCallSite to store a CallBase* instead of CallSite. NFCI.
CallSite will likely be removed soon, but AbstractCallSite serves a different purpose and won't be going away.

This patch switches it to internally store a CallBase* instead of a
CallSite. The only interface changes are the removal of the getCallSite
method and getCallBackUses now takes a CallBase&. These methods had only
a few callers that were easy enough to update without needing a
compatibility shim.

In the future once the other CallSites are gone, the CallSite.h
header should be renamed to AbstractCallSite.h

Differential Revision: https://reviews.llvm.org/D78322
2020-04-16 16:24:45 -07:00
Bob Haarman cc5c58889e [WPD] Avoid noalias assumptions in unique return value optimization
Summary:
Changes the type of the @__typeid_.*_unique_member imports we generate
for unique return value optimization from i8 to [0 x i8]. This
prevents assuming that these imports do not alias, such as when
two unique return values occur in the same vtable.

Fixes PR45393.

Reviewers: tejohnson, pcc

Reviewed By: pcc

Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77421
2020-04-16 14:49:51 -07:00
Roman Lebedev b1fbf438f6
[OpenMPOpt] deduplicateRuntimeCalls(): avoid traditional map lookup pitfall
Summary:
This roughly halves time spent in that pass,
while unsurprisingly significantly reducing total memory usage.

This makes sense because most functions won't use any openmp functions..

old
```
   0.2329 (  0.5%)   0.0409 (  0.9%)   0.2738 (  0.5%)   0.2736 (  0.5%)  OpenMP specific optimizations
```
```
total runtime: 63.32s.
bytes allocated in total (ignoring deallocations): 8.34GB (131.70MB/s)
calls to allocation functions: 14526259 (229410/s)
temporary memory allocations: 3335760 (52680/s)
peak heap memory consumption: 324.36MB
peak RSS (including heaptrack overhead): 5.39GB
total memory leaked: 289.93MB
```

new
```
   0.1457 (  0.3%)   0.0276 (  0.6%)   0.1732 (  0.3%)   0.1731 (  0.3%)  OpenMP specific optimizations
```
```
total runtime: 55.01s.
bytes allocated in total (ignoring deallocations): 6.70GB (121.89MB/s)
calls to allocation functions: 14268205 (259398/s)
temporary memory allocations: 3225355 (58637/s)
peak heap memory consumption: 324.09MB
peak RSS (including heaptrack overhead): 5.39GB
total memory leaked: 289.87MB
```

diff
```
total runtime: -8.31s.
bytes allocated in total (ignoring deallocations): -1.63GB (196.58MB/s)
calls to allocation functions: -258054 (31034/s)
temporary memory allocations: -110405 (13277/s)
peak heap memory consumption: -262.36KB
peak RSS (including heaptrack overhead): 0B
total memory leaked: -61.45KB
```

Reviewers: jdoerfert, hfinkel

Reviewed By: jdoerfert

Subscribers: yaxunl, hiraditya, guansong, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78299
2020-04-16 19:54:02 +03:00
Johannes Doerfert c4d3188adb [Attributor][NFC] Reduce indention for call site attribute seeding
Also added a TODO to remind us that indirect calls could be optimized as
well.
2020-04-16 02:32:31 -05:00
Johannes Doerfert 0741dec27b [Attributor][FIX] Handle droppable uses when replacing values
Since we use the fact that some uses are droppable in the Attributor we
need to handle them explicitly when we replace uses. As an example, an
assumed dead value can have live droppable users. In those we cannot
replace the value simply by an undef. Instead, we either drop the uses
(via `dropDroppableUses`) or keep them as they are. In this patch we do
both, depending on the situation. For values that are dead but not
necessarily removed we keep droppable uses around because they contain
information we might be able to use later. For values that are removed
we drop droppable uses explicitly to avoid replacement with undef.
2020-04-16 00:56:08 -05:00
Johannes Doerfert 253d6be0f6 [Attributor][FIX] Properly check for accesses to globals
The check if globals were accessed was not always working because two
bits are set for NO_GLOBAL_MEM. The new check works also if only on kind
of globals (internal/external) is accessed.
2020-04-16 00:55:34 -05:00
Johannes Doerfert ad9c284cc3 [Attributor][NFC] Run the verifier only on functions and under EXPENSIVE_CHECKS
Running the verifier is expensive so we want to avoid it even in runs
that enable assertions. As we move closer to enabling the Attributor
this code will be executed by some buildbots but not cause overhead for
most people.
2020-04-16 00:55:33 -05:00
Johannes Doerfert 898bbc252a [Attributor] Lazily collect function information
Before, we eagerly analyzed all the functions to collect information
about them, e.g. what instructions may read/write memory. This had
multiple drawbacks:
  - In CGSCC-mode we can end up looking at a callee which is not in the
    SCC but for which we need an initialized cache.
  - We end up looking at functions that we deem dead and never need to
    analyze in the first place.
  - We have a implicit dependence which is easy to break.

This patch moves the function analysis into the information cache and
makes it lazy. There is no real functional change expected except due to
the first reason above.
2020-04-15 22:26:38 -05:00
Johannes Doerfert 8c4057e3a3 [Attributor] Replace call graph call sites after function replacement
The CallGraphUpdater allows to directly alter call site information and
we should do so. This might appease the windows buildbot that crashes
during the SCC traversal.
2020-04-15 22:24:09 -05:00
Craig Topper 7b6ff8bf1f [CallSite removal][SampleProfile] Use CallBase instead of CallSite. NFC
Differential Revision: https://reviews.llvm.org/D78219
2020-04-15 12:47:17 -07:00
Craig Topper a0d92248ea [CallSite removal][PruneEH] Use CallBase instead of CallSite. NFC
Reviewers: mtrofin, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78182
2020-04-15 10:11:41 -07:00
Teresa Johnson 33ffb62e23 Allow disabling of vectorization using internal options
Summary:
Currently, the internal options -vectorize-loops, -vectorize-slp, and
-interleave-loops do not have much practical effect. This is because
they are used to initialize the corresponding flags in the pass
managers, and those flags are then unconditionally overwritten when
compiling via clang or via LTO from the linkers. The only exception was
-vectorize-loops via opt because of some special hackery there.

While vectorization could still be disabled when compiling via clang,
using -fno-[slp-]vectorize, this meant that there was no way to disable
it when compiling in LTO mode via the linkers. This only affected
ThinLTO, since for regular LTO vectorization is done during the compile
step for scalability reasons. For ThinLTO it is invoked in the LTO
backends. See also the discussion on PR45434.

This patch makes it so the internal options can actually be used to
disable these optimizations. Ultimately, the best long term solution is
to mark the loops with metadata (similar to the approach used to fix
-fno-unroll-loops in D77058), but this enables a shorter term
workaround, and actually makes these internal options useful.

I constant propagated the initial values of these internal flags into
the pass manager flags (for some reasons vectorize-loops and
interleave-loops were initialized to true, while vectorize-slp was
initialized to false). As mentioned above, they are overwritten
unconditionally so this doesn't have any real impact, and these initial
values aren't particularly meaningful.

I then changed the passes to check the internl values and return without
performing the associated optimization when false (I changed the default
of -vectorize-slp to true so the options behave similarly). I was able
to remove the hackery in opt used to get -vectorize-loops=false to work,
as well as a special option there used to disable SLP vectorization.

Finally, I changed thinlto-slp-vectorize-pm.c to:
a) Only test SLP (moved the loop vectorization checking to a new test).
b) Use code that is slp vectorized when it is enabled, and check that
instead of whether the pass is enabled.
c) Test the new behavior of -vectorize-slp.
d) Test both pass managers.

The loop vectorization (and associated interleaving) testing I moved to
a new thinlto-loop-vectorize-pm.c test, with several changes:
a) Changed the flags on the interleaving testing so that it will
actually interleave, and check that.
b) Test the new behavior of -vectorize-loops and -interleave-loops.
c) Test both pass managers.

Reviewers: fhahn, wmi

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77989
2020-04-14 18:09:10 -07:00
Mircea Trofin 447e2c3067 [llvm][NFC][CallSite] Remove Implementation uses of CallSite
Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78142
2020-04-14 14:49:47 -07:00
Benjamin Kramer 7bf166665e [FunctionAttrs] Don't copy all the nodes where a reference is fine. 2020-04-14 17:18:23 +02:00
Georgii Rymar 1647ff6e27 [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers.
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.

Differential revision: https://reviews.llvm.org/D78016
2020-04-14 14:11:02 +03:00
Mircea Trofin 4aae4e3f48 [llvm][NFC] CallSite removal from inliner-related files
Summary: This removes CallSite from inliner files. Some dependencies where thus affected.

Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77991
2020-04-13 21:28:58 -07:00
Mehdi Amini 384ca190ae Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object"
This reverts commit 10df1563d6.

Some buildbots are broken.
2020-04-14 00:27:08 +00:00
Mehdi Amini 10df1563d6 Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.

Differential Revision: https://reviews.llvm.org/D77994
2020-04-13 23:12:11 +00:00
Eli Friedman cfb844265a [GlobalOpt] Explicitly set alignment of bool load/store operations. 2020-04-12 16:03:12 -07:00
Mircea Trofin d2f1cd5d97 [llvm][NFC] Refactor uses of CallSite to CallBase - call promotion
Summary:
Updated CallPromotionUtils and impacted sites. Parameters that are
expected to be non-null, and return values that are guranteed non-null,
were replaced with CallBase references rather than pointers.

Left FIXME in places where more changes are facilitated by CallBase, but
aren't CallSites: Instruction* parameters or return values, for example,
where the contract that they are actually CallBase values.

Reviewers: davidxl, dblaikie, wmi

Reviewed By: dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77930
2020-04-12 08:27:29 -07:00
Benjamin Kramer e590bd6b92 [argpromote] Use formatv to simplify code. NFCI. 2020-04-11 14:54:32 +02:00
Mircea Trofin da9bcdaad9 [llvm][NFC] Inliner.cpp: ensure InlineHistory ID is always initialized;
Summary:
The inline history is associated with a call site. There are two locations
we fetch inline history. In one, we fetch it together with the call
site. In the other, we initialize it under certain conditions, use it
later under same conditions (different if check), and otherwise is
uninitialized. Although currently there is no uninitialized use, the
code is more challenging to maintain correctly, than if the value were
always initialized.

Changed to the upfront initialization pattern already present in this
file.

Reviewers: davidxl, dblaikie

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77877
2020-04-10 15:28:53 -07:00
Mircea Trofin f62335b534 [llvm][NFC] Style fixes in Inliner.cpp
Summary:
Function names: camel case, lower case first letter.
Variable names: start with upper letter. For iterators that were 'i',
renamed with a descriptive name, as 'I' is 'Instruction&'.

Lambda captures simplification.

Opportunistic boolean return simplification.

Reviewers: davidxl, dblaikie

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77837
2020-04-10 08:04:39 -07:00
Mircea Trofin 655aa1ae4a [llvm][NFC] Replace CallSite with CallBase in Inliner
Summary:
*Almost* all uses are replaced. Left FIXMEs for the two sites that
require refactoring outside of Inliner, to scope this patch.

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77817
2020-04-09 15:01:58 -07:00
Johannes Doerfert 0985554b70 [Attributor][NFC] Split AbstractAttributes out of Attributor.cpp
Attributor.cpp became quite big and we need to start provide structure.
The Attributor code is now in Attributor.cpp and the classes derived
from AbstractAttribute are in AttributorAttributes.cpp. Minor changes
were required but no intended functional changes.

We also minimized includes as part of this.

Reviewed By: baziotis

Differential Revision: https://reviews.llvm.org/D76873
2020-04-08 19:02:14 -05:00
Fangrui Song d2ef8c1f2c [ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker()
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.

If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.

The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74751
2020-04-07 15:46:01 -07:00
Eli Friedman 3f13ee8a00 [NFC] Modernize misc. uses of Align/MaybeAlign APIs.
Use the current getAlign() APIs where it makes sense, and use Align
instead of MaybeAlign when we know the value is non-zero.
2020-04-06 17:53:04 -07:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Tarindu Jayatilaka b43b59fcc0 Expose `attributor-disable` to the new and old pass managers
The new and old pass managers (PassManagerBuilder.cpp and
PassBuilder.cpp) are exposed to an `extern` declaration of
`attributor-disable` option which will guard the addition of the
attributor passes to the pass pipelines.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76871
2020-04-05 22:29:34 -05:00
Stefanos Baziotis f3dd3a66d3 [Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions.
Query AAValueSimplify on pointers in memory accessing instructions to take
advantage of the constant propagation (or any other value simplification) of such values.
2020-04-05 02:46:26 +03:00
Luofan Chen eec6d87626 [Attributor] Deduce attributes for non-exact functions
This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amendable.
See also [this github issue](https://github.com/llvm/llvm-project/issues/172).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76404
2020-04-04 11:34:58 -05:00
Hongtao Yu 88da019977 Fix a bug in the inliner that causes subsequent double inlining
Summary:
A recent change in the instruction simplifier enables a call to a function that just returns one of its parameter to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved which in turn may cause compiler ICE when an already-inlined callsite is reused for further inlining.
To put it simply, in the following-like C program, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges.

```
void top()
{
    int t = first();
    second(t);
}

void second(int t)
{
   t = third(t);
   fourth(t);
}

void third(int t)
{
   return t;
}
```
The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up.  We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e, which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of ininlined callsites is a bit different, but in theory the issue could happen there too.

Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification.

Reviewers: wenlei, davidxl, tejohnson

Reviewed By: wenlei, davidxl

Subscribers: eraman, nikic, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76248
2020-04-02 21:08:05 -07:00
Johannes Doerfert bcd8009369 [Attributor] Use the proper context instruction in genericValueTraversal
There was a TODO in genericValueTraversal to provide the context
instruction and due to the lack of it users that wanted one just used
something available. Unfortunately, using a fixed instruction is wrong
in the presence of PHIs so we need to update the context instruction
properly.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D76870
2020-04-01 22:20:47 -05:00
Johannes Doerfert ac96c8fd85 [Attributor][FIX] Do not compute ranges for arguments of declarations
This cannot be triggered right now, as far as I know, but it doesn't
make sense to deduce a constant range on arguments of declarations.
Exposed during testing of AAValueSimplify extensions.
2020-04-01 22:05:30 -05:00
Johannes Doerfert 54d6a608bf [Attributor][NFC] Predetermine the module
It could happen that we delete the first function in the SCC in the
future so we should be careful accessing `Functions` after the manifest
stage.
2020-04-01 21:56:17 -05:00
Johannes Doerfert 9e19693994 [Attributor] Derive better alignment for accessed pointers
Use DL & ABI information for better alignment deduction, e.g., if a type
is accessed and the ABI specifies an alignment requirement for such an
access we can use it. This is based on a patch by @lebedev.ri and
inspired by getBaseAlign in Loads.cpp.

Depends on D76673.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D76674
2020-04-01 21:49:57 -05:00
Johannes Doerfert b1c788d051 [Attributor][FIX] Prevent alignment breakage wrt. must-tail calls
If we have a must-tail call the callee and caller need to have matching
ABIs. Part of that is alignment which we might modify when we deduce
alignment of arguments of either. Since we would need to keep them in
sync, which is not as simple, we simply avoid deducing alignment for
arguments of the must-tail caller or callee.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D76673
2020-04-01 21:40:07 -05:00
Johannes Doerfert 41f2a57d0b [Attributor][NFC] Use a BumpPtrAllocator to allocate `AbstractAttribute`s
We create a lot of AbstractAttributes and they live as long as
the Attributor does. It seems reasonable to allocate them via a
BumpPtrAllocator owned by the Attributor.

Reviewed By: baziotis

Differential Revision: https://reviews.llvm.org/D76589
2020-04-01 20:53:28 -05:00
Uday Bondhugula c4499e3333 [Attributor] Make attributor aware of aligned_alloc for heap to stack conversion
Make the attributor pass aware of aligned_alloc for converting heap
allocations to stack ones.

Depends on D76971.

Differential Revision: https://reviews.llvm.org/D76974
2020-04-01 23:26:50 +05:30
Wei Mi ebad678857 [SampleFDO] Port MD5 name table support to extbinary format.
Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression when writing/reading the profile. The patch adds the support in extbinary format. It is off by default but user can choose to enable it.

Note the feature of using MD5 in name table can bring very small chance of name conflict leading to profile mismatch. Besides, profile using the feature won't have the profile remapping support.

Differential Revision: https://reviews.llvm.org/D76255
2020-03-30 22:07:08 -07:00
Uday Bondhugula 06066c4003 [NFC] Attributor comment updates / cast cleanup
Minor update/fixes to comments for the Attributor pass, and dyn_cast -> cast.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76972
2020-03-28 13:36:43 +05:30
Johannes Doerfert 5699d08b79 [Attributor] Use knowledge retained in llvm.assume (operand bundles)
This patch integrates operand bundle llvm.assumes [0] with the
Attributor. Most IRAttributes will now look at uses of the associated
value and if there are llvm.assume operand bundle uses with the right
tag we will check if they are in the must-be-executed-context (around
the context instruction). Droppable users, which is currently only
llvm::assume, are handled special in some places now as well.

[0] http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D74888
2020-03-24 15:33:40 -05:00
Vedant Kumar b7cd291c15 [GlobalOpt] Treat null-check of loaded value as use of global (PR35760)
PR35760 shows an example program which, when compiled with `clang -O0`
or gcc at any optimization level, prints '0'. However, llvm transforms
the program in a way that causes it to print '1'.

Fix the issue by having `AllUsesOfValueWillTrapIfNull` return false when
analyzing a load from a global which is used by an `icmp`. This special
case was untested [0] so this is just deleting dead code.

An alternative fix might be to change the GlobalStatus analysis for the
global to report "Stored" instead of "StoredOnce". However, "StoredOnce"
is appropriate when only one value other than the initializer is stored
to the global.

[0]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/IPO/GlobalOpt.cpp.html#L662

Differential Revision: https://reviews.llvm.org/D76645
2020-03-23 22:36:09 -07:00
Johannes Doerfert f09f4b2676 [OpenMPOpt] Initialize value to avoid use of uninitialized memory
This should fix the issue reported here:
  https://reviews.llvm.org/D76058#1937554
2020-03-23 19:17:19 -05:00
Stefanos Baziotis a650d555fc [Attributor][NFC] Refactorings and typos in doc
Reviewed By: sstefan1, uenoku

Differential Revision: https://reviews.llvm.org/D76175
2020-03-23 22:44:10 +02:00
Johannes Doerfert 9d38f98dc3 [OpenMPOpt] Validate declaration types against the expected types
Validation of the found runtime library functions declarations types
(return and argument types) with the expected types.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76058
2020-03-23 11:43:36 -05:00
Benjamin Kramer ff2f5097ed [Attributor] Fold single-use variable into assert
Fixes unused variable warning in Release builds.
2020-03-23 17:41:52 +01:00
Johannes Doerfert c57689bef2 [Attributor][NFC] Copy llvm::function_ref, don't use references
On IRC this was called a "code smell" so we get rid of it.
2020-03-23 10:45:24 -05:00
Johannes Doerfert 68fed27067 [Attributor] Handle calls in AAValueConstantRange properly
We did handle calls that were operands of certain instructions but not
standalone calls we visit via indirection, e.g., selects.
2020-03-23 10:45:24 -05:00
Johannes Doerfert 54ec9b54f6 [Attributor] Unify handling of must-tail calls
We special cased must-tail calls all over the place because they cannot
be modified as other calls can be. However, we already centralized the
modification API so we can centralize the handling as well. This
simplifies the code and allows to remove must-tail calls completely.
2020-03-23 10:45:24 -05:00
Johannes Doerfert 0995001ce5 [Attributor][NFC] Predetermine the module before verification
It could happen that we delete the first function in the SCC in the
future so we should be careful accessing `Functions` after the manifest
stage.
2020-03-23 10:45:23 -05:00
Johannes Doerfert f3bf4b05c2 [Attributor][NFC] clang-format Attributor.{h,cpp} 2020-03-23 10:45:23 -05:00
Nikita Popov dc81923659 [InstCombine] Remove ExpensiveCombines option
D75801 removed the last and only user of this option, so we can
drop it now. The original idea behind this was to only run expensive
transforms under -O3, but apart from the one known bits transform,
this has never really taken off. I believe nowadays the recommendation
is to put expensive transforms in AggressiveInstCombine instead,
though that isn't terribly popular either :)

Differential Revision: https://reviews.llvm.org/D76540
2020-03-22 16:56:28 +01:00
Eli Friedman e24e95fe90 Remove CompositeType class.
The existence of the class is more confusing than helpful, I think; the
commonality is mostly just "GEP is legal", which can be queried using
APIs on GetElementPtrInst.

Differential Revision: https://reviews.llvm.org/D75660
2020-03-18 13:53:17 -07:00
omarahmed1111 b285b333dc [Attributor] Detect possibly unbounded cycles in functions
This patch add mayContainUnboundedCycle helper function which checks whether a function has any cycle which we don't know if it is bounded or not.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

Reviewed By: jdoerfert, uenoku, baziotis

Differential Revision: https://reviews.llvm.org/D74691
2020-03-13 11:17:33 -05:00
Pankaj Gode bf990530ae [Attributor] Improve noalias preservation using reachability
Resolution for below fixme:
(ii) Check whether the value is captured in the scope using AANoCapture.
FIXME: This is conservative though, it is better to look at CFG and
             check only uses possibly executed before this callsite.

Propagates caller argument's noalias attribute to callee.

Reviewed by: jdoerfert, uenoku

Reviewers: jdoerfert, sstefan1, uenoku

Subscribers: uenoku, sstefan1, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D71617
2020-03-13 21:09:08 +05:30
Johannes Doerfert a198adb490 [Attributor] IPO across definition boundary of a function marked alwaysinline
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D75590
2020-03-13 01:06:12 -05:00
rathod-sahaab 263c4a3c75 Fix compiler warning when compiling without asserts
This patch aims to prevent warning-as-error failures in release build.
As suggested in this comment
https://reviews.llvm.org/D69930#1910922

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D75970
2020-03-13 00:26:49 -05:00
Hideto Ueno d9bf79f4e9 [Attributor][FIX] Add a missing dependence track in noalias deduction 2020-03-12 15:27:35 +00:00
evgeny 6d2032e259 [WPD] Provide a way to prevent functions from being devirtualized
Differential revision: https://reviews.llvm.org/D75617
2020-03-09 14:05:15 +03:00
Hideto Ueno bdcbdb4848 [Attributor] Deduction based on path exploration
This patch introduces the propagation of known information based on path exploration.
For example,
```
int u(int c, int *p){
  if(c) {
     return *p;
  } else {
     return *p + 1;
  }
}
```
An argument `p` is dereferenced whatever c's value is.

For an instruction `CtxI`, we accumulate branch instructions in the must-be-executed-context of `CtxI` and then, we take the conjunction of the successors' known state.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D65593
2020-03-09 14:29:26 +09:00
Stefanos Baziotis 01c48d7d11 [Attributor] Fold terminators before changing instructions to unreachable
It is possible that an instruction to be changed to unreachable is
in the same block with a terminator that can be constant-folded.
In this case, as of now, the instruction will be changed to
unreachable before the terminator is folded. But, then the
whole BB becomes invalidated and so when we go ahead to fold
the terminator, we trap.

Change the order of these two.

Differential Revision: https://reviews.llvm.org/D75780
2020-03-07 12:38:44 +02:00
Fangrui Song 952ee0df9e ThinLTOBitcodeWriter: drop dso_local when a GlobalVariable is converted to a declaration
If we infer the dso_local flag for -fpic, dso_local should be dropped
when we convert a GlobalVariable a declaration. dso_local causes the
generation of direct access (e.g. R_X86_64_PC32). Such relocations referencing
STB_GLOBAL STV_DEFAULT objects are not allowed in a -shared link.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74749
2020-03-05 18:09:33 -08:00
Nikita Popov 0e890cd4d4 [ConstantFolding] Always return something from ConstantFoldConstant
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.

This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.

Differential Revision: https://reviews.llvm.org/D75543
2020-03-04 18:24:47 +01:00
Sanjay Patel 71a316883d [PassManager] adjust VectorCombine placement
The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs:
https://bugs.llvm.org/show_bug.cgi?id=45015
https://bugs.llvm.org/show_bug.cgi?id=42022

This patch contains a few independent changes:

1. Move the pass up in the pipeline, so it happens just after loop-vectorization.
   This is only to keep vectorization passes together in the pipeline at the moment.
   I don't have evidence of interaction between these yet.
2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was
   partly proposed as far back as rL219644 (which is why it's effectively being moved
   in the old PM code). This is important because the subsequent -instcombine doesn't
   work as well without EarlyCSE. With the CSE, -instcombine is able to squash
   shuffles together in 1 of the tests (because those are simple "select" shuffles).
3. Remove the -vector-combine pass that was running after SLP. We may want to do that
   eventually, but I don't have a test case to support it yet.

Differential Revision: https://reviews.llvm.org/D75145
2020-03-04 11:10:49 -05:00
Teresa Johnson 80bf137fa1 Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP""
This reverts commit 80d0a137a5, and the
follow on fix in 873c0d0786. It is
causing test failures after a multi-stage clang bootstrap. See
discussion on D73242 and D75201.
2020-03-02 14:02:13 -08:00
Teresa Johnson 873c0d0786 [ThinLTO/LowerTypeTests] Handle unpromoted local type ids
Summary:
Fixes an issue that cropped up after the changes in D73242 to delay
the lowering of type tests. LTT couldn't handle any type tests with
non-string type id (which happens for local vtables, which we try to
promote during the compile step but cannot always when there are no
exported symbols).

We can simply treat the same as having an Unknown resolution, which
delays their lowering, still allowing such type tests to be used in
subsequent optimization (e.g. planned usage during ICP). The final
lowering which simply removes these handles them fine.

Beefed up an existing ThinLTO test for such unpromoted type ids so that
the internal vtable isn't removed before lower type tests, which hides
the problem.

Reviewers: evgeny777, pcc

Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, aganea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75201
2020-03-02 09:31:44 -08:00
Hiroshi Yamauchi f16d2bec40 Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration.
This aims to fix a missed inlining case.

If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.

This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.

This is a second commit after a revert
https://reviews.llvm.org/rG4569b3a86f8a4b1b8ad28fe2321f936f9d7ffd43 and a fix
https://reviews.llvm.org/rG41e06ae7ba91.

Differential Revision: https://reviews.llvm.org/D69591
2020-02-28 09:43:32 -08:00
Teresa Johnson f9ca75f19b [Inliner] Inlining should honor nobuiltin attributes
Summary:
Final patch in series to fix inlining between functions with different
nobuiltin attributes/options, which was specifically an issue in LTO.
See discussion on D61634 for background.

The prior patch in this series (D67923) enabled per-Function TLI
construction that identified the nobuiltin attributes.

Here I have allowed inlining to proceed if the callee's nobuiltins are a
subset of the caller's nobuiltins, but not in the reverse case, which
should be conservatively correct. This is controlled by a new option,
-inline-caller-superset-nobuiltin, which is enabled by default.

Reviewers: hfinkel, gchatelet, chandlerc, davidxl

Subscribers: arsenm, jvesely, nhaehnle, mehdi_amini, eraman, hiraditya, haicheng, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74162
2020-02-28 07:34:14 -08:00
Kirill Bobyrev 4569b3a86f
Revert "Devirtualize a call on alloca without waiting for post inline cleanup and next"
This reverts commit 59fb9cde7a.

The patch caused internal miscompilations.
2020-02-27 15:58:39 +01:00
Hiroshi Yamauchi 59fb9cde7a Devirtualize a call on alloca without waiting for post inline cleanup and next
DevirtSCCRepeatedPass iteration.  Needs ReviewPublic

This aims to fix a missed inlining case.

If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.

This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.
2020-02-26 09:51:24 -08:00
Johannes Doerfert 396b725394 [OpenMP][Opt] Combine `struct ident_t*` during deduplication
If we deduplicate OpenMP runtime calls we have multiple `ident_t*` that
represent information like source location. So far, we simply kept the
one used by the replacement call. However, as exposed by PR44893, that
can cause problems if we have stack allocated `ident_t` objects. While
we need to revisit the use of these as well, it is clear that we
eventually want to merge source location information in some way. With
this patch we add the infrastructure to do so but without doing the
actual merge. Instead we pick a global `ident_t` from the replaced
calls, if possible, or create a new one with an unknown location
instead.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74925
2020-02-25 14:07:14 -08:00
Hideto Ueno 2c0edbf19c [Attributor] Use AssumptionCache in AANonNullFloating::initialize 2020-02-25 13:00:03 +09:00
Ayke van Laethem 2a7a989c3e
[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints
This patch adds bindings to C and Go for
addCoroutinePassesToExtensionPoints, which is used to add coroutine
passes to the correct locations in PassManagerBuilder.

Differential Revision: https://reviews.llvm.org/D51642
2020-02-24 20:15:51 +01:00
Johannes Doerfert 9708279c72 [Attributor][FIX] Undo 16188f9 until SCC iterator bug is fixed
The buildbot
  http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
shows some strange SCC iterator bug since 16188f9 which we need to
investigate. This patch should remove the part of 16188f9 that could
have exposed the problem.
2020-02-21 14:20:42 -08:00
Johannes Doerfert d95cb56649 [Attributor] Make sure abstract attributes are properly initialized 2020-02-20 02:46:40 -06:00
Johannes Doerfert 6185fb13d6 [Attributor][NFC] Refactor interface 2020-02-20 02:46:40 -06:00
Johannes Doerfert 8e76fec0ae [Attributor][NFC] Improve the debug output & add a TODO 2020-02-19 23:46:08 -06:00
Johannes Doerfert f8ad735729 [Attributor] Use existing `returned` information better
We can look through calls with `returned` argument attributes when we
collect subsuming positions. This allows us to get existing attributes
from more places.
2020-02-19 23:46:07 -06:00
Johannes Doerfert a801ee869d [Attributor][FIX] Avoid setting wrong load/store alignments 2020-02-19 23:46:07 -06:00
Johannes Doerfert e1eed6c5b9 [Attributor] Generalize `getAssumedConstantInt` interface
We are often interested in an assumed constant and sometimes it has to
be an integer constant. Before we only looked for the latter, now we can
ask for either.
2020-02-19 22:33:51 -06:00
Johannes Doerfert 16188f9d70 [Attributor][FIX] Do not create new calls edge we cannot handle
If we propagate function pointers across function boundaries we can
create new call edges. These need to be represented in the CG if we run
as a CGSCC pass. In the new pass manager that is currently not handled
by the CallGraphUpdater so we need to prevent the situation for now.
2020-02-19 22:33:51 -06:00
Johannes Doerfert 1e99fc9d58 [Attributor] Add initial AAIsDead for arguments
We usually will ask for liveness of an argument anyway so we ended up
lazily creating the attribute anyway. However, that is not always the
case and even if it is we should go the eager route here. Various tests
show how this can improve the outcome. One test exposed a problem with
type mismatches between argument and call site argument, a fix is
included. For liveness various more tests were added as well.
2020-02-19 21:39:45 -06:00
Johannes Doerfert c6ac717aa7 [Attributor] Allow multiple uses of a casted function pointer
If a function pointer is casted into a different type the resulting
expression can be a constant. If so, it can be used multiple times which
cannot be handled by the AbstractCallSite constructor alone. Instead, we
follow the cast expression uses now explicitly during the call site
traversal.
2020-02-19 20:43:38 -06:00
Vedant Kumar c74026daf3 [HotColdSplit] Mark entire function cold when entry block is cold
rdar://58855712
2020-02-17 15:57:50 -08:00
Benjamin Kramer 564a9de28e Hide implementation details. NFC> 2020-02-17 17:55:23 +01:00
Johannes Doerfert 1d5da8cd30 [Attributor][FIX] Use pointer not reference as it can be null 2020-02-15 20:38:49 -06:00
Simon Pilgrim 8a48c4a97c Fix boolean/bitwise operator precedence warnings. NFCI. 2020-02-15 13:53:18 +00:00
Johannes Doerfert ef746aa11f [Attributor] Collect memory accesses with their respective kind and location
In addition to a single bit per memory locations, e.g., globals and
arguments, we now collect more information about the actual accesses,
e.g., what instruction caused it, was it a read/write/read+write, and
what the underlying base pointer was. Follow up patches will make
explicit use of this.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D73527
2020-02-15 02:12:04 -06:00
Fangrui Song fd5665af2c [Attributor] Fix -Wunused-variable for -DLLVM_ENABLE_ASSERTIONS=off builds after b4352e43d8 2020-02-14 21:47:19 -08:00
Johannes Doerfert b70297a39a [Attributor][FIX] Ensure abstract attributes are existing before manifest
While the function return updateImpl did only look at call sites the
manifest method looked at return values. If we don't do this during the
updateImpl we might create new abstract attributes during manifest. This
is a problem when it comes to liveness information.
2020-02-14 21:44:46 -06:00
Johannes Doerfert ad121ea14d [Attributor] Manifest simplified (return) values properly
If we simplify a function return value we have to modify the return
instructions.
2020-02-14 21:44:46 -06:00
Johannes Doerfert b53af0e7f9 [Attributor][FIX] Collapse `undef` to a proper value
If we see an undef we cannot assume it's the same as "no value". For now
we just collapse it to 0.
2020-02-14 21:44:46 -06:00
Johannes Doerfert 137c99a6a5 [Attributor][FIX] Restrict cross-SCC call deletion
If we know a call was not needed we might have ended up deleting it even
if it was in a different SCC. This prevents us from doing so.
2020-02-14 21:44:46 -06:00
Johannes Doerfert 32e98a7089 [Attributor][FIX] Carefully strip casts in AANoAlias
We can strip casts in AANoAlias but that might cause us to end up with a
non-pointer type. We do properly handle that case now.
2020-02-14 21:44:46 -06:00
Johannes Doerfert b4352e43d8 [Attributor][FIX] Do not RAUW void values
This caused an error when passes iterated over cached assumptions in the
tracker and assumed them to be `null` or an instruction. I failed to
create a test case so far.
2020-02-14 21:44:46 -06:00
Johannes Doerfert 282f5d7ad1 [Attributor] Derive memory location attributes (argmemonly, ...)
In addition to memory behavior attributes (readonly/writeonly) we now
derive memory location attributes (argmemonly/inaccessiblememonly/...).
The former is part of AAMemoryBehavior and the latter part of
AAMemoryLocation. While they are similar in nature it got messy when
they were put in a single AA. Location attributes for arguments and
floating values will follow later.

Note that both memory attributes kinds can derive readnone. If there are
no accesses AAMemoryBehavior will derive readnone. If there are accesses
but only to stack (=local) locations AAMemoryLocation will derive
readnone.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D73426
2020-02-14 19:05:51 -06:00
Johannes Doerfert 7cbb107feb [Attributor][FIX] Validate the type for AAValueConstantRange as needed
Due to the genericValueTraversal we might visit values for which we did
not create an AAValueConstantRange object, e.g., as they are behind a
PHI or select or call with `returned` argument. As a consequence we need
to validate the types as we are about to query AAValueConstantRange for
operands.
2020-02-14 17:22:40 -06:00
Johannes Doerfert 23f41f16d4 [Attributor] Use fine-grained liveness in all helpers
We used coarse-grained liveness before, thus we looked if the
instruction was executed, but we did not use fine-grained liveness,
hence if the instruction was needed or could be deleted even if the
surrounding ones are live. This patches introduces this level of
liveness checks together with other liveness queries, e.g., for uses.

For more control we enforce that all liveness queries go through the
Attributor.

Test have been adjusted to reflect the changes or augmented to prevent
deletion of the parts we want to check.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D73313
2020-02-12 17:36:38 -06:00
Johannes Doerfert b2c76002ca [Attributor] Ignore uses if a value is simplified
If we have a replacement for a value, via AAValueSimplify, the original
value will lose all its uses. Thus, as long as a value is simplified we
can skip the uses in checkForAllUses, given that these uses are
transitive uses for the simplified version and will therefore affect the
simplified version as necessary.

Since this allowed us to remove calls without side-effects and a known
return value, we need to make sure not to eliminate `musttail` calls.
Those we keep around, or later remove the entire `musttail` call chain.
2020-02-12 17:36:38 -06:00
Johannes Doerfert 86509e8c3b [Attributor] Use assumed information to determine side-effects
We relied on wouldInstructionBeTriviallyDead before but that functions
does not take assumed information, especially for calls, into account.
The replacement, AAIsDead::isAssumeSideEffectFree, does.

This change makes AAIsDeadCallSiteReturn more complex as we can have
a dead call or only dead users.

The test have been modified to include a side effect where there was
none in order to keep the coverage.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D73311
2020-02-12 17:36:38 -06:00
Ehud Katz d8a2ea9fd5 [LoopExtractor] Fix legacy pass dependencies
Fixes a memory leak of allocating `LoopInfoWrapperPass` and `DominatorTreeWrapperPass`.
2020-02-12 22:39:21 +02:00
Johannes Doerfert 52aec3221f [Attributor][NFC] Clarify the documentation a bit more 2020-02-11 15:11:55 -06:00
Johannes Doerfert 8e62968d45 [Attributor] Identify dead uses in PHIs (almost) based on dead edges
As an approximation to a dead edge we can check if the terminator is
dead. If so, the corresponding operand use in a PHI node is dead even if
the PHI node itself is not.
2020-02-11 15:11:55 -06:00
Teresa Johnson 80d0a137a5 Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 748bb5a0f1, along
with a fix for a Chromium test suite build issue (and a new test for
that case).

Differential Revision: https://reviews.llvm.org/D73242
2020-02-11 10:48:05 -08:00
Johannes Doerfert 185e9b083e [Attributor][NFC] Improve documentation 2020-02-11 11:19:34 -06:00
Johannes Doerfert f95553923f [Attributor] Return uses do not free pointers
If a pointer is returned that does not mean it is freed in the current
(function) scope. We can ignore such uses in AANoFree.
2020-02-11 11:02:59 -06:00
Johannes Doerfert 4c62a35860 [Attributor][FIX] Remove duplicate, half-broken functionality
The changeXXXAfterManifest functions are better suited to deal with
changes so we should prefer them. These functions also recursively
delete dead instructions which is why we see test changes.
2020-02-11 11:02:59 -06:00
Johannes Doerfert 77a9e61c9a [Attributor][NFC] Improve debug message 2020-02-11 11:02:59 -06:00
Bill Wendling c55cf4afa9 Revert "Remove redundant "std::move"s in return statements"
The build failed with

  error: call to deleted constructor of 'llvm::Error'

errors.

This reverts commit 1c2241a793.
2020-02-10 07:07:40 -08:00
Bill Wendling 1c2241a793 Remove redundant "std::move"s in return statements 2020-02-10 06:39:44 -08:00
Mikael Holmen a50c0b0df7 Fix compiler warning when compiling without asserts [NFC] 2020-02-10 13:55:52 +01:00
Johannes Doerfert 87ddf1f4fa [Attributor] Simple casts preserve no-alias property
This is a minimal but important advancement over the existing code. A
cast with an operand that is only used in the cast retains the no-alias
property of the operand.
2020-02-10 01:11:32 -06:00
Johannes Doerfert 8155439331 [Attributor] Allow PHI nodes in AAValueConstantRangeFloating
Traversing PHI nodes is natural with the genericValueTraversal but also
a bit tricky. The problem is similar to the ones we have seen in AAAlign
and AADereferenceable, namely that we continue to increase the range in
each iteration. We use a pessimistic approach here to stop the
iterations. Nevertheless, optimistic information can now be propagated
through a PHI node.
2020-02-10 00:55:10 -06:00
Johannes Doerfert 63adbb9a0e [Attributor][FIX] Remove FIXME that seems outdated
The change is performed as stated by the FIXME and the tests are
adjusted. All changes look fine to me and values can be inferred as
undef without it being an error.
2020-02-10 00:55:10 -06:00
Johannes Doerfert 7e7e6594b3 [Attributor] Allow SelectInst in AAValueConstantRangeFloating
The genericValueTraversal will already handle SelectInst properly and we
just needed to allow them in the initialize method.
2020-02-10 00:55:09 -06:00
Johannes Doerfert ffdbd2a06c [Attributor] Look through (some) casts in AAValueConstantRangeFloating
Casts can be handled natively by the ConstantRange class. We do limit it
to extends for now as we assume an integer type in different locations.
A TODO and a test case with a FIXME was added to remove that restriction
in the future.
2020-02-10 00:38:01 -06:00
Johannes Doerfert 028db8c490 [Attributor][FIX] Call right base method in AAValueConstantRangeFloating
We now call the base class method as we should.
2020-02-10 00:38:01 -06:00
Michael Liao ab3da5dd66 Fix `-Wparentheses` warning. NFC. 2020-02-10 00:45:02 -05:00
Sanjay Patel a17f03bd93 [VectorCombine] new IR transform pass for partial vector ops
We have several bug reports that could be characterized as "reducing scalarization",
and this topic was also raised on llvm-dev recently:
http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html
...so I'm proposing that we deal with these patterns in a new, lightweight IR vector
pass that runs before/after other vectorization passes.

There are 4 alternate options that I can think of to deal with this kind of problem
(and we've seen various attempts at all of these), but they all have flaws:

    InstCombine - can't happen without TTI, but we don't want target-specific
                  folds there.
    SDAG - too late to assist other vectorization passes; TLI is not equipped
           for these kind of cost queries; limited to a single basic block.
    CGP - too late to assist other vectorization passes; would need to re-implement
          basic cleanups like CSE/instcombine.
    SLP - doesn't fit with existing transforms; limited to a single basic block.

This initial patch/transform is based on existing code in AggressiveInstCombine:
we walk backwards through the function looking for a pattern match. But we diverge
from that cost-independent IR canonicalization pass by using TTI to decide if the
vector alternative is profitable.

We probably have at least 10 similar bug reports/patterns (binops, constants,
inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements.
It's possible that we could iterate on a worklist to fix-point like InstCombine does,
but it's safer to start with a most basic case and evolve from there, so I didn't
try to do anything fancy with this initial implementation.

Differential Revision: https://reviews.llvm.org/D73480
2020-02-09 10:04:41 -05:00
Ehud Katz 3b70ee27a5 [LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass
The LoopExtractor created new functions (by definition), which violates
the restrictions of a LoopPass.
The correct implementation of this pass should be as a ModulePass.
Includes reverting rL82990 implications on the LoopExtractor.

Fixes PR3082 and PR8929.

Differential Revision: https://reviews.llvm.org/D69069
2020-02-09 12:25:21 +02:00
Johannes Doerfert b0c77c36d2 [Attributor] Add an Attributor CGSCC pass and run it
In addition to the module pass, this patch introduces a CGSCC pass that
runs the Attributor on a strongly connected component of the call graph
(both old and new PM). The Attributor was always design to be used on a
subset of functions which makes this patch mostly mechanical.

The one change is that we give up `norecurse` deduction in the module
pass in favor of doing it during the CGSCC pass. This makes the
interfaces simpler but can be revisited if needed.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D70767
2020-02-08 21:27:34 -06:00
Johannes Doerfert e565db49c6 [OpenMP][Opt] Delete terminating and read-only parallel regions
Parallel regions known to be read-only, e.g., after we removed all dead
write accesses, and terminating (`willreturn`) can be removed.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69954
2020-02-08 18:52:04 -06:00
Johannes Doerfert e28936f613 [OpenMP][Opt] Annotate known runtime functions and deduplicate more
This adds ~27 more runtime calls to the OpenMPKinds.def file, all with
attributes. We deduplicate 16 of those automatically in function =
thread scope. And we annotate all of them automatically during the
OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to
track annotation coverage is included.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69984
2020-02-08 18:35:39 -06:00
Johannes Doerfert 9548b74a83 [OpenMP] Introduce the OpenMPOpt transformation pass
The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.

The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.

This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69930
2020-02-08 14:47:03 -06:00
Teresa Johnson 25aa2eef99 Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This reverts commit 748bb5a0f1.

Due to Chromium CFI+ThinLTO test crashes reported on patch.
2020-02-05 19:27:32 -08:00
Teresa Johnson 748bb5a0f1 [WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP
Summary:
Currently type test assume sequences inserted for devirtualization are
removed during WPD. This patch delays their removal until later in the
optimization pipeline. This is an enabler for upcoming enhancements to
indirect call promotion, for example streamlined promotion guard
sequences that compare against vtable address instead of the target
function, when there are small number of possible vtables (either
determined via WPD or by in-progress type profiling). We need the type
tests to correlate the callsites with the address point offset needed in
the compare sequence, and optionally to associated type summary info
computed during WPD.

This depends on work in D71913 to enable invocation of LowerTypeTests to
drop type test assume sequences, which will now be invoked following ICP
in the ThinLTO post-LTO link pipelines, and also after the existing
export phase LowerTypeTests invocation in regular LTO (which is already
after ICP). We cannot simply move the existing import phase
LowerTypeTests pass later in the ThinLTO post link pipelines, as the
comment in PassBuilder.cpp notes (it must run early because when
performing CFI other passes may disturb the sequences it looks for).

This necessitated adding a new type test resolution "Unknown" that we
can use on the type test assume sequences previously removed by WPD,
that we now want LTT to ignore.

Depends on D71913.

Reviewers: pcc, evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
2020-02-05 08:59:48 -08:00
Teresa Johnson bed4d9c897 [ThinLTO] More efficient export computation (NFC)
Summary:
A recent change to enable more importing of global variables with
references exposed some efficiency issues with export computation.
See D73724 for more information and detailed analysis.

The first was specific to variable importing. The code was marking every
copy of a referenced value (from possibly thousands of files in the case
of linkonce_odr) as exported, and we only need to mark the copy in the
module containing the variable def being imported as exported. The
reason is that this is tracking what values are newly exported as a
result of importing. Anything that was defined in another module and
simply used in the exporting module is already exported, and would have
been identified by the caller (e.g. the LTO API implementations).

The second issue is that the code was re-adding previously exported
values (along with all references). It is easy to identify when a
variable was already imported into the same module (via the
import list insert call return value), and we already did this for
function importing. However, what we weren't doing for either function
or variable importing was avoiding a re-insertion when it was previously
exported into a different importing module. The reason we couldn't do
this is there was no way of telling from the export list whether it was
previously inserted there because its definition was exported (in which
case we already marked all its references as exported) from when it was
inserted there because it was referenced by another exported value (in
which case we haven't yet inserted its own references).

To address this we can restructure the way the export list is
constructed. This patch only adds the actual imported definitions
(variable or function) to the export list for its module during the
import computation. After import computation is complete, where we were
already post-processing the export list we go ahead and add all
references made by those exported values to the export list.

These changes speed up the thin link not only with constant variable
importing enabled, but also without (due to the efficiency improvement
in function importing).

Some thin link user time measurements for one large application, average
of 5 runs:

With constant variable importing enabled:
- without this patch: 479.5s
- with this patch: 74.6s

Without constant variable importing enabled:
- without this patch: 80.6s
- with this patch: 70.3s

Note I have not re-enabled constant variable importing here, as I would
like to do additional compile time measurements with these fixes first.

Reviewers: evgeny777

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73851
2020-02-03 09:15:33 -08:00
Johannes Doerfert 26d02b0f28 [Attributor] AANoRecurse check all call sites for `norecurse`
If all call sites are in `norecurse` functions we can derive `norecurse`
as the ReversePostOrderFunctionAttrsPass does. This should make
ReversePostOrderFunctionAttrsLegacyPass obsolete once the Attributor is
enabled.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D72017
2020-02-02 23:57:17 -06:00
Johannes Doerfert 368f7ee7a5 [Attributor] Propagate known information from `checkForAllCallSites`
If we know that all call sites have been processed we can derive an
early fixpoint. The use in this patch is likely not to trigger right now
but a follow up patch will make use of it.

Reviewed By: uenoku, baziotis

Differential Revision: https://reviews.llvm.org/D72016
2020-02-02 23:57:17 -06:00
Juneyoung Lee 578d2e2cb1 [llvm-extract] Add -keep-const-init commandline option
Summary:
This adds -keep-const-init option to llvm-extract which preserves initializers of
used global constants.

For example:

```
$ cat a.ll
@g = constant i32 0
define i32 @f() {
  %v = load i32, i32* @g
  ret i32 %v
}

$ llvm-extract --func=f a.ll -S -o -
@g = external constant i32
define i32 @f() { .. }

$ llvm-extract --func=f a.ll -keep-const-init -S -o -
@g = constant i32 0
define i32 @f() { .. }
```

This option is useful in checking whether a function that uses a constant global is optimized correctly.

Reviewers: jsji, MaskRay, david2050

Reviewed By: MaskRay

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73833
2020-02-03 14:30:28 +09:00
Michael Forster 676c29694c Inline debug variable.
Summary:
In a release build this variable becomes unused and may break the build
with `-Werror,-Wunused-variable`.

Reviewers: gribozavr2, jdoerfert, sstefan1

Reviewed By: gribozavr2

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73683
2020-01-30 10:29:24 +01:00
Johannes Doerfert 89c2e733e8 [Attributor] Pointer privatization attribute (argument promotion)
A pointer is privatizeable if it can be replaced by a new, private one.
Privatizing pointer reduces the use count, interaction between unrelated
code parts. This is a first step towards replacing argument promotion.
While we can already handle recursion (unlike argument promotion!) we
are restricted to stack allocations for now because we do not analyze
the uses in the callee.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68852
2020-01-29 21:31:04 -06:00
Johannes Doerfert 791c9f1145 [Attributor] Fix TODO to avoid recomputation of results
The helpers AAReturnedFromReturnedValues and
AACallSiteReturnedFromReturned are useful not only to avoid code
duplication but also to avoid recomputation of results. If we have N
call sites we should not recompute the function return information N
times but once. These are mostly straightforward usages with some minor
improvements on the helpers and addition of a new one
(IRPosition::getAssociatedType) that knows about function return types.
2020-01-29 19:24:34 -06:00
Elia Geretto ab2300bc15 [PassManagerBuilder] Remove global extension when a plugin is unloaded
This commit fixes PR39321.

GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction.

This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet.

Differential Revision: https://reviews.llvm.org/D71959
2020-01-29 16:15:45 +00:00
Johannes Doerfert 76843ba37f [Attributor][Fix] Initialize unused but loaded variable
This hopefully un-breaks:
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38333
2020-01-28 23:52:16 -06:00
Johannes Doerfert ea5fabe60c [Attributor] Reuse existing logic to avoid duplication
There was a TODO in AAValueConstantRangeArgument to reuse
AAArgumentFromCallSiteArguments. We now do this by allowing new States
to be build from the bestState.
2020-01-28 23:45:59 -06:00
Johannes Doerfert 224085409d [Attributor][FIX] Treat invalidated attributes as changed
If we invalidate an attribute we need to inform all dependent ones even
if the fixpoint state is not invalid. Before we only continued
invalidation if the fixpoint state was invalid, now we signal a change
in case the fixpoint state is valid.

The test case was already included in D71620 but the problem was hiding
because it only manifested with the old PM (for that input).
2020-01-28 23:40:41 -06:00
Johannes Doerfert 53992c7bf7 [Attributor] Modularize AANoAliasCallSiteArgument to simplify extensions
This patch modularizes the way we check for no-alias call site arguments
by putting the existing logic into helper functions. The reasoning was
not changed but special cases for readonly/readnone were added.
2020-01-28 23:39:29 -06:00
Johannes Doerfert 24ae77eebf [Attributor] Mark a non-defined `null` pointer as `noalias`
If `null` is not defined we cannot access it, hence the pointer is
`noalias`. While this is not helpful on it's own it simplifies later
deductions that can skip over already known `noalias` pointers in
certain situations.
2020-01-28 23:09:37 -06:00
Johannes Doerfert 6626d1b7c0 [Attributor][NFC] Remove ugly and unneeded cast 2020-01-28 22:54:31 -06:00
Johannes Doerfert 02bd8180fc [Attributor][NFC] Improve debug messages 2020-01-28 22:53:19 -06:00
Johannes Doerfert b6dbd0f71f [Attributor][NFC] Internalize helper function 2020-01-28 22:50:34 -06:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Evgeny Leviant 8973fae195 [WPD] Allow load/save bitcoded index when running opt -wholeprogramdevirt
Differential revision: https://reviews.llvm.org/D73094
2020-01-24 00:31:39 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Johannes Doerfert 7ad17e008b [Attributor] Avoid REQUIRED dependences in favor of OPTIONAL ones
When we use information only to short-cut deduction or improve it, we
can use OPTIONAL dependences instead of REQUIRED ones to avoid cascading
pessimistic fixpoints.

We also need to track dependences only when we use assumed information,
e.g., we act on assumed liveness information.
2020-01-23 18:42:46 -06:00
Johannes Doerfert 214ed3f676 [Attributor] Record dependences only when necessary
If we use assumed information from AAValueSimplify we need to record
an OPTIONAL dependence, otherwise we do not.
2020-01-23 18:42:45 -06:00
Johannes Doerfert 5429c82db2 [Attributor][FIX] Avoid dangling pointers during code deletion
It can happen that we have instructions in the ToBeDeletedInsts set
which are deleted earlier already. To avoid dangling pointers we use
weak tracking handles.
2020-01-23 18:42:45 -06:00
Johannes Doerfert ff6254dc26 [Attributor][FIX] Handle non-pointers when following uses
When we follow uses, e.g., in AAMemoryBehavior or AANoCapture, we need
to make sure the value is a pointer before we ask for abstract
attributes only valid for pointers. This happens because we follow
pointers through calls that do not capture but may return the value.
2020-01-23 18:42:45 -06:00
Johannes Doerfert 9dcf889d15 [Attributor][NFC] Do not (try to) simplify void values
We might accidentally ask AAValueSimplify to simplify a void value. That
can lead to very interesting, and very wrong, results. We now handle
this case gracefully.
2020-01-23 18:42:45 -06:00
Johannes Doerfert 30179d7ecf [Attributor][FIX][Alignment] Do not report a change if there was none
If alignment was manifested but it is actually only as good as the
data-layout provided one we should not report it as a change.

For testing purposes we still manifest the information.
2020-01-23 18:13:52 -06:00
Johannes Doerfert e273ac4d88 [Attributor][NFC] Add an assertion 2020-01-23 18:13:52 -06:00
Johannes Doerfert d07b5a5525 [Attributor][NFC] Fix spelling 2020-01-23 18:13:52 -06:00
Johannes Doerfert 2baf000ecc [Attributor] `byval` arguments are always `noalias`
`byval` introduces a local copy of the argument. That copy cannot alias
anything.
2020-01-23 18:13:52 -06:00
Johannes Doerfert 30ae859c69 [Attributor][FIX] Store alignment only holds for the pointer value
We accidentally used the store alignment for the value operand as well,
which is incorrect and crashed the SPASS application in the test suite.
2020-01-23 18:13:52 -06:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Alina Sbirlea 9e66c4ec12 [Utils] Use WeakTrackingVH in vector used as scratch storage.
The utility method RecursivelyDeleteTriviallyDeadInstructions receives
as input a vector of Instructions, where all inputs are valid
instructions. This same vector is used as a scratch storage (per the
header comment) to recursively delete instructions. If an instruction is
added as an operand of multiple other instructions, it may be added twice,
then deleted once, then the second reference in the vector is invalid.
Switch to using a Vector<WeakTrackingVH>.
This change facilitates a clean-up in LoopStrengthReduction.
2020-01-23 16:04:57 -08:00
Teresa Johnson 458676db6e [WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).

In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.

One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.

Reviewers: pcc, ostannard, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71907
2020-01-23 11:36:01 -08:00
Guillaume Chatelet 59f95222d4 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Mircea Trofin 5466597fee [NFC] Refactor InlineResult for readability
Summary:
InlineResult is used both in APIs assessing whether a call site is
inlinable (e.g. llvm::isInlineViable) as well as in the function
inlining utility (llvm::InlineFunction). It means slightly different
things (can/should inlining happen, vs did it happen), and the
implicit casting may introduce ambiguity (casting from 'false' in
InlineFunction will default a message about hight costs,
which is incorrect here).

The change renames the type to a more generic name, and disables
implicit constructors.

Reviewers: eraman, davidxl

Reviewed By: davidxl

Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72744
2020-01-15 13:34:20 -08:00
Hideto Ueno 188f9a348d [Attributor] AAValueConstantRange: Value range analysis using constant range
Summary:
This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point.
One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument).

The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty).

Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state).

Supported
 - BinaryOperator(add, sub, ...)
 - CmpInst(icmp eq, ...)
 - !range metadata

`AAValueConstantRange` is not intended to extend to polyhedral range value analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71620
2020-01-15 16:34:23 +09:00
Nikita Popov 410331869d [NewPM] Port MergeFunctions pass
This ports the MergeFunctions pass to the NewPM. This was rather
straightforward, as no analyses are used.

Additionally MergeFunctions needs to be conditionally enabled in
the PassBuilder, but I left that part out of this patch.

Differential Revision: https://reviews.llvm.org/D72537
2020-01-14 20:55:41 +01:00
Teresa Johnson 2cefb93951 [ThinLTO/WPD] Remove an overly-aggressive assert
Summary:
An assert added to the index-based WPD was trying to verify that we only
have multiple vtables for a given guid when they are all non-external
linkage. This is too conservative because we may have multiple external
vtable with the same guid when they are in comdat. Remove the assert,
as we don't have comdat information in the index, the linker should
issue an error in this case.

See discussion on D71040 for more information.

Reviewers: evgeny777, aganea

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72648
2020-01-14 10:57:14 -08:00
Dmitri Gribenko 2948ec5ca9 Removed PointerUnion3 and PointerUnion4 aliases in favor of the variadic template 2020-01-14 18:56:29 +01:00
Florian Hahn 192cce10f6 Revert "Recommit "[GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing.""
This reverts commit a03d7b0f24.

As discussed in D68298, this causes a compile-time regression, in case
the DTs requested are not used elsewhere in GlobalOpt. We should only
get the DTs if they are available here, but this seems not possible with
the legacy pass manager from a module pass.
2020-01-14 14:50:07 +00:00
Teresa Johnson 31441a3e00 [ThinLTO/WPD] Fix index-based WPD for alias vtables
Summary:
A recent fix in D69452 fixed index based WPD in the presence of
available_externally vtables. It added a cast of the vtable def
summary to a GlobalVarSummary. However, in some cases one def may be an
alias, in which case we need to get the base object before casting,
otherwise we will crash.

Reviewers: evgeny777, steven_wu, aganea

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71040
2020-01-13 13:38:26 -08:00
Johannes Doerfert a4088c75cc [Attributor][FIX] Carefully change invokes to calls (after manifest)
Before we manually inserted unreachable early but that could lead to
broken PHI nodes. Now we use the existing late modification
functionality.
2020-01-08 19:32:38 -06:00
Johannes Doerfert 1e46eb74be [Attributor][FIX] Avoid dangling value pointers during code modification
When we replace instructions with unreachable we delete instructions. We
now avoid dangling pointers to those deleted instructions in the
`ToBeChangedToUnreachableInsts` set. Other modification collections
might need to be updated in the future as well.
2020-01-08 19:32:37 -06:00
Fangrui Song 6904cd9486 Add Triple::isX86()
Reviewed By: craig.topper, skan

Differential Revision: https://reviews.llvm.org/D72247
2020-01-06 15:51:02 -08:00
James Henderson d68904f957 [NFC] Fix trivial typos in comments
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72143

Patch by Kazuaki Ishizaki.
2020-01-06 10:50:26 +00:00
Alexey Lapshin 831bfcea47 [Transforms][GlobalSRA] huge array causes long compilation time and huge memory usage.
Summary:
For artificial cases (huge array, few usages), Global SRA optimization creates
a lot of redundant data. It creates an instance of GlobalVariable for each array
element. For huge array, that means huge compilation time and huge memory usage.
Following example compiles for 10 minutes and requires 40GB of memory.

namespace {
  char LargeBuffer[64 * 1024 * 1024];
}

int main ( void ) {

    LargeBuffer[0] = 0;

    printf("\n ");

    return LargeBuffer[0] == 0;
}

The fix is to avoid Global SRA for large arrays.

Reviewers: craig.topper, rnk, efriedma, fhahn

Reviewed By: rnk

Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71993
2020-01-04 16:42:38 +03:00
Johannes Doerfert d2d2fb19f7 [Attributor][FIX] Allow dead users of rewritten function
If we replace a function with a new one because we rewrite the
signature, dead users may still refer to the old version. With this
patch we reuse the code that deals with dead functions, which the old
versions are, to avoid problems.
2020-01-03 10:43:40 -06:00
Johannes Doerfert 6b9ee2d6cd [Attributor][NFC] Unify the way we delete dead functions 2020-01-03 10:43:40 -06:00
Johannes Doerfert c90681b681 [Attributor][FIX] Don't crash on ptr2int/int2ptr instructions
An integer isn't allowed in getAlignmentForValue so we need to stop at a
ptr2int instruction during exploration.
2020-01-03 10:43:40 -06:00
Johannes Doerfert 412a0101a9 [Attributor][FIX] Do not derive nonnull and dereferenceable w/o access
An inbounds GEP results in poison if the value is not "inbounds", not in
UB. We accidentally derived nonnull and dereferenceable from these
inbounds GEPs even in the absence of accesses that would make the poison
to UB.
2020-01-03 10:43:40 -06:00