Commit Graph

23691 Commits

Author SHA1 Message Date
Johannes Doerfert 16188f9d70 [Attributor][FIX] Do not create new calls edge we cannot handle
If we propagate function pointers across function boundaries we can
create new call edges. These need to be represented in the CG if we run
as a CGSCC pass. In the new pass manager that is currently not handled
by the CallGraphUpdater so we need to prevent the situation for now.
2020-02-19 22:33:51 -06:00
Johannes Doerfert 1e99fc9d58 [Attributor] Add initial AAIsDead for arguments
We usually will ask for liveness of an argument anyway so we ended up
lazily creating the attribute anyway. However, that is not always the
case and even if it is we should go the eager route here. Various tests
show how this can improve the outcome. One test exposed a problem with
type mismatches between argument and call site argument, a fix is
included. For liveness various more tests were added as well.
2020-02-19 21:39:45 -06:00
Johannes Doerfert c6ac717aa7 [Attributor] Allow multiple uses of a casted function pointer
If a function pointer is casted into a different type the resulting
expression can be a constant. If so, it can be used multiple times which
cannot be handled by the AbstractCallSite constructor alone. Instead, we
follow the cast expression uses now explicitly during the call site
traversal.
2020-02-19 20:43:38 -06:00
Michael Kruse e4d20ec8ad [IndVarSimply] Fix assert/release build difference.
In builds with assertions enabled (!NDEBUG), IndVarSimplify does an
additional query to ScalarEvolution which may change future SCEV queries
since it fills the internal cache differently. The result is actually
only used with the -verify-indvars command line option. We fix the issue
by only calling SE->getBackedgeTakenCount(L) if -verify-indvars is
enabled such that only -verify-indvars shows the behavior, but not debug
builds themselves. Also add a remark to the description of
-verify-indvars about this behavior.

Fixes llvm.org/PR44815

Differential Revision: https://reviews.llvm.org/D74810
2020-02-19 14:36:22 -06:00
Florian Hahn c7fc0e5da6 Revert "[PatternMatch] Match XOR variant of unsigned-add overflow check."
This reverts commit e01a3d49c2.
and commit a6a585b803.

This causes a failure on GreenDragon:
http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/9597
2020-02-19 19:37:08 +01:00
Florian Hahn e01a3d49c2 [PatternMatch] Match XOR variant of unsigned-add overflow check.
Instcombine folds (a + b <u a) to (a ^ -1 <u b) and that does not match
the expected pattern in CodeGenPerpare via UAddWithOverflow.

This causes a regression over Clang 7 on both X86 and AArch64:
https://gcc.godbolt.org/z/juhXYV

This patch extends UAddWithOverflow to also catch the XOR case, if the
XOR is only used in the ICMP. This covers just a single case, but I'd
like to make sure I am not missing anything before tackling the other
cases.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D74228
2020-02-19 15:25:18 +01:00
Brian Gesiak 5a187d8ed1 [Coroutines][4/6] New pass manager: coro-cleanup
Summary:
Depends on https://reviews.llvm.org/D71900.

The fourth in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure. This patch implements
'coro-cleanup'.

No existing regression tests check the behavior of coro-cleanup on its
own, so this patch adds one. (A test named 'coro-cleanup.ll' exists, but
it relies on the entire coroutines pipeline being run. It's updated to
test the new pass manager in the 5th patch of this series.)

Reviewers: GorNishanov, lewissbaker, chandlerc, junparser, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: wenlei, EricWF, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71901
2020-02-19 00:30:27 -05:00
Brian Gesiak 2365238b9d Re-land new pass manager coro-split and coro-elide
This re-applies patches https://reviews.llvm.org/D71899 and
https://reviews.llvm.org/D71900, which were reverted in
https://reviews.llvm.org/rG11053a1cc61 and
https://reviews.llvm.org/rGe999aa38d16. The underlying problem that
caused two buildbots to fail with these patches is explained in
https://reviews.llvm.org/rG26f356350bd -- older compliers disagree with
the order in which the left- and right-hand side of an assignment in
LazyCallGraph ought to be evaluated, which caused an assertion in
SmallVector::operator[] to fire when the test suite was run.
2020-02-19 00:11:23 -05:00
Reid Kleckner 0c2b09a9b6 [IR] Lazily number instructions for local dominance queries
Essentially, fold OrderedBasicBlock into BasicBlock, and make it
auto-invalidate the instruction ordering when new instructions are
added. Notably, we don't need to invalidate it when removing
instructions, which is helpful when a pass mostly delete dead
instructions rather than transforming them.

The downside is that Instruction grows from 56 bytes to 64 bytes.  The
resulting LLVM code is substantially simpler and automatically handles
invalidation, which makes me think that this is the right speed and size
tradeoff.

The important change is in SymbolTableTraitsImpl.h, where the numbering
is invalidated. Everything else should be straightforward.

We probably want to implement a fancier re-numbering scheme so that
local updates don't invalidate the ordering, but I plan for that to be
future work, maybe for someone else.

Reviewed By: lattner, vsk, fhahn, dexonsmith

Differential Revision: https://reviews.llvm.org/D51664
2020-02-18 14:44:24 -08:00
Fangrui Song 13a97305ba [JumpThreading] Skip unconditional PredBB when threading jumps through two basic blocks
Fixes https://bugs.llvm.org/show_bug.cgi?id=44922 (caused by 4698bf145d)

ThreadThroughTwoBasicBlocks assumes PredBBBranch is conditional. The following code can segfault.

  AddPHINodeEntriesForMappedBlock(PredBBBranch->getSuccessor(1), PredBB, NewBB,
                                  ValueMapping);

We can also allow unconditional PredBB, but the produced code is not
better.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D74747
2020-02-18 11:01:46 -08:00
Tyker c9e93c84f6 Add Query API for llvm.assume holding attributes
Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72885
2020-02-18 19:42:07 +01:00
Huihui Zhang 8ee0e1dc02 [NFC] Silence compiler warning [-Wmissing-braces]. 2020-02-18 10:37:12 -08:00
Florian Hahn e32522ca17 [SLPVectorizer] Do not assume extracelement idx is a ConstantInt.
The index of an ExtractElementInst is not guaranteed to be a
ConstantInt. It can be any integer value. Check explicitly for
ConstantInts.

The new test cases illustrate scenarios where we crash without
this patch. I've also added another test case to check the matching
of extractelement vector ops works.

Reviewers: RKSimon, ABataev, dtemirbulatov, vporpo

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D74758
2020-02-18 18:16:06 +01:00
Nikita Popov ec6c623ff9 [SimplifyLibCalls] Accept IRBuilderBase; NFC 2020-02-18 17:59:07 +01:00
Nikita Popov 28ffe38bba [LoopUtils] Accept IRBuilderBase; NFC 2020-02-18 17:58:46 +01:00
Nikita Popov ed6d30b517 [BuildLibCalls] Accept IRBuilderBase; NFC
Accept IRBuilderBase instead of IRBuilder<>. Remove dependency
on IRBuilder from header.
2020-02-18 17:58:16 +01:00
Nikita Popov 1ab37fad61 [InstCombine] Fix worklist management when simplifying demanded bits
When simplifying demanded bits, we currently only report the
instruction on which SimplifyDemandedBits was called as changed.
However, this is a recursive call, and the actually modified
instruction will usually be further up the chain. Additionally,
all the intermediate instructions should also be revisited,
as additional combines may be possible after the demanded bits
simplification. We fix this by explicitly adding them back to the
worklist.

Differential Revision: https://reviews.llvm.org/D72944
2020-02-18 17:55:40 +01:00
Nikita Popov c9540fe59b [InstCombine] Fix multi-use handling in cttz transform
The select-of-cttz transform can currently duplicate cttz intrinsics
and zext/trunc ops. The cause is that it unnecessarily duplicates
the intrinsic and the zext/trunc when setting the "undef_on_zero"
flag to false. However, it's always legal to set the flag from true
to false, so we can make this replacement even if there are extra users.

Differential Revision: https://reviews.llvm.org/D74685
2020-02-18 17:55:00 +01:00
Nikita Popov 9adedd146d [InstCombine] Relax preconditions for ashr+and+icmp fold (PR44754)
Fix for https://bugs.llvm.org/show_bug.cgi?id=44754. We already have
a fold that converts icmp (and (ashr X, C3), C2), C1 into
icmp (and C2'), C1', but it imposed overly strict requirements on the
transform.

Relax this by checking that both C2 and C1 don't shift out bits
(in a signed sense) when forming the new constants.

Alive proofs (https://rise4fun.com/Alive/PTz0):

    Name: ashr_legal
    Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) == C1
    %a = ashr i16 %x, C3
    %b = and i16 %a, C2
    %c = icmp i16 %b, C1
    =>
    %d = and i16 %x, C2 << C3
    %c = icmp i16 %d, C1 << C3

    Name: ashr_shiftout_eq
    Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) != C1
    %a = ashr i16 %x, C3
    %b = and i16 %a, C2
    %c = icmp eq i16 %b, C1
    =>
    %c = false

Note that >> corresponds to ashr here. The case of an equality
comparison has some special handling in this transform, because
it will form to a true/false result if the condition on the comparison
constant it violated.

Differential Revision: https://reviews.llvm.org/D74294
2020-02-18 17:49:46 +01:00
Florian Hahn 9063022573 [InstCombin] Avoid nested Create calls, to guarantee order.
The original code allowed creating the != checks in unpredictable order,
causing http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34014
to fail.
2020-02-18 09:44:11 +01:00
Florian Hahn 6c85e92bcf [InstCombine] Simplify a umul overflow check to a != 0 && b != 0.
This patch adds a simplification if an OR weakens the overflow condition
for umul.with.overflow by treating any non-zero result as overflow. In that
case, we overflow if both umul.with.overflow operands are != 0, as in that
case the result can only be 0, iff the multiplication overflows.

Code like this is generated by code using __builtin_mul_overflow with
negative integer constants, e.g.
   bool test(unsigned long long v, unsigned long long *res) {
     return __builtin_mul_overflow(v, -4775807LL, res);
   }

```
----------------------------------------
Name: D74141
  %res = umul_overflow {i8, i1} %a, %b
  %mul = extractvalue {i8, i1} %res, 0
  %overflow = extractvalue {i8, i1} %res, 1
  %cmp = icmp ne %mul, 0
  %ret = or i1 %overflow, %cmp
  ret i1 %ret
=>
  %t0 = icmp ne i8 %a, 0
  %t1 = icmp ne i8 %b, 0
  %ret = and i1 %t0, %t1
  ret i1 %ret
  %res = umul_overflow {i8, i1} %a, %b
  %mul = extractvalue {i8, i1} %res, 0
  %cmp = icmp ne %mul, 0
  %overflow = extractvalue {i8, i1} %res, 1

Done: 1
Optimization is correct!

```

Reviewers: nikic, lebedev.ri, spatel, Bigcheese, dexonsmith, aemerson

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D74141
2020-02-18 09:11:55 +01:00
Brian Gesiak 11053a1cc6 Revert new pass manager coro-split and coro-elide
This reverts
https://reviews.llvm.org/rG7125d66f9969605d886b5286780101a45b5bed67 and
https://reviews.llvm.org/rG00fec8004aca6588d8d695a2c3827c3754c380a0 due
to buildbot failures:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34004
2020-02-17 23:55:10 -05:00
Brian Gesiak 00fec8004a [Coroutines][3/6] New pass manager: coro-elide
Summary:
Depends on https://reviews.llvm.org/D71899.

The third in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure. This patch implements 'coro-elide'.

The new pass manager infrastructure does not implicitly repeat CGSCC
pass pipelines when a function is devirtualized, and so the tests
for the new pass manager that rely on that behavior now explicitly
specify `repeat<2>`.

Reviewers: GorNishanov, lewissbaker, chandlerc, jdoerfert, junparser, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: wenlei, EricWF, Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71900
2020-02-17 23:41:57 -05:00
Brian Gesiak 7125d66f99 [Coroutines][2/6] New pass manager: coro-split
Summary:
This patch has four dependencies:

1. The first in this series of patches that implement coroutine passes in the
   new pass manager: https://reviews.llvm.org/D71898.
2. A patch that introduces an API for CGSCC passes to add new reference
   edges to a `LazyCallGraph`, `updateCGAndAnalysisManagerForCGSCCPass`:
   https://reviews.llvm.org/D72025.
3. A patch that introduces a `CallGraphUpdater` helper class that is
   capable of mutating internal `LazyCallGraph` state in order to insert
   new function nodes into a specific SCC: https://reviews.llvm.org/D70927.
4. And finally, a small edge case fix for updating `LazyCallGraph` that
   patch 3 above happens to run into: https://reviews.llvm.org/D72226.

This is the second in a series of patches that ports the LLVM coroutines
passes to the new pass manager infrastructure. This patch implements
'coro-split'.

Some notes:
* Using the new CGSCC pass manager resulted in IR being printed in the
  reverse order in some tests. To prevent FileCheck checks from failing due
  to these reversed orders, this patch splits up test files that test
  multiple different coroutine functions: specifically
  coro-alloc-with-param.ll, coro-split-eh.ll, and coro-eh-aware-edge-split.ll.
* CoroSplit.cpp contained 2 overloads of `splitCoroutine`, one of which
  dispatched to the other based on the coroutine ABI being used (C++20
  switch-based versus Swift returned-continuation-based). I found this
  confusing, especially with the additional branching based on `CallGraph`
  vs. `LazyCallGraph`, so I removed the ABI-checking overload of
  `splitCoroutine`.

Reviewers: GorNishanov, lewissbaker, chandlerc, jdoerfert, junparser, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: wenlei, qcolombet, EricWF, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71899
2020-02-17 23:35:27 -05:00
Vedant Kumar c74026daf3 [HotColdSplit] Mark entire function cold when entry block is cold
rdar://58855712
2020-02-17 15:57:50 -08:00
Nicolai Hähnle 58297e4d8f LowerMatrixIntrinsics: Avoid use of deprecated CreateCall methods
Reviewers: t.p.northover

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74675
2020-02-18 00:24:09 +01:00
Tim Northover 464d4cf7e6 Coroutines: avoid use of deprecated CreateLoad and CreateCall methods
Summary: Patch originally by Tim Northover

Reviewers: t.p.northover

Subscribers: EricWF, hiraditya, modocache, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74674
2020-02-18 00:24:09 +01:00
Brian Gesiak e9849d5195 [Coroutines][1/6] New pass manager: coro-early
Summary:
The first in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure. This patch implements
'coro-early'.

NB: All coroutines passes begin by checking that coroutine intrinsics are
declared within the LLVM IR module they're operating on. To do so, they call
`coro::declaresIntrinsics`. The next 3 patches in this series, which add new
pass manager implementations of the 'coro-split', 'coro-elide', and
'coro-cleanup' passes, use a similar pattern as the one used here: a static
function is shared across both old and new passes to detect if relevant
coroutine intrinsics are delcared. To make this pattern easier to read, this
patch adds `const` keywords to the parameters of `coro::declaresIntrinsics`.

Reviewers: GorNishanov, lewissbaker, junparser, chandlerc, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: ychen, wenlei, EricWF, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71898
2020-02-17 13:27:48 -05:00
Nikita Popov 3eaa53e805 Reapply "[IRBuilder] Virtualize IRBuilder"
Relative to the original commit, this fixes some warnings,
and is based on the deletion of the IRBuilder copy constructor
in D74693. The automatic copy constructor would no longer be
safe.

-----

Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-17 19:04:11 +01:00
Nikita Popov 80397d2d12 [IRBuilder] Delete copy constructor
D73835 will make IRBuilder no longer trivially copyable. This patch
deletes the copy constructor in advance, to separate out the breakage.

Currently, the IRBuilder copy constructor is usually used by accident,
not by intention.  In rG7c362b25d7a9 I've fixed a number of cases where
functions accepted IRBuilder rather than IRBuilder &, thus performing
an unnecessary copy. In rG5f7b92b1b4d6 I've fixed cases where an
IRBuilder was copied, while an InsertPointGuard should have been used
instead.

The only non-trivial use of the copy constructor is the
getIRBForDbgInsertion() helper, for which I separated construction and
setting of the insertion point in this patch.

Differential Revision: https://reviews.llvm.org/D74693
2020-02-17 18:14:48 +01:00
Benjamin Kramer 564a9de28e Hide implementation details. NFC> 2020-02-17 17:55:23 +01:00
Benjamin Kramer 5fc5c7db38 Strength reduce vectors into arrays. NFCI. 2020-02-17 15:37:35 +01:00
Nikita Popov 5f7b92b1b4 [IRBuilder] Prefer InsertPointGuard over full copy; NFC
Don't copy the IRBuilder when an InsertPointGuard would also do.
2020-02-16 18:02:29 +01:00
Nikita Popov 7c362b25d7 [IRBuilder] Fix unnecessary IRBuilder copies; NFC
Fix a few cases where an IRBuilder is passed to a helper function
by value, while a by reference pass was intended.
2020-02-16 17:57:18 +01:00
Nikita Popov af480e8c63 Revert "[IRBuilder] Virtualize IRBuilder"
This reverts commit 0765d3824d.
This reverts commit 1b04866a3d.

Relevant looking crashes observed on:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
2020-02-16 17:01:10 +01:00
Sanjay Patel 62dd44d76d [VectorCombine] fix cost calc for extract-cmp
getOperationCost() is not the cost we wanted; that's not the
throughput value that the rest of the calculation uses.

We may want to switch everything in this code to use the
getInstructionThroughput() wrapper to avoid these kinds of
problems, but I'll look at that as a follow-up because that
can create other logical diffs via using optional parameters
(we'd need to speculatively create the vector instruction to
make a fair(er) comparison).
2020-02-16 10:40:28 -05:00
Nikita Popov 893c630fbe [InstCombine] Create new log2 intrinsic; NFCI
Rather than mixing creation of new instructions and in-place
modification here, create a new log2 intrinsic. This should be
NFC apart from worklist order changes.
2020-02-16 15:52:09 +01:00
Nikita Popov 1b04866a3d [IRBuilder] Try to fix warnings
Try to fix -Wnon-virtual-dtor warnings that cause build failure
on clang-pcc64le-rhel.
2020-02-16 15:32:11 +01:00
Nikita Popov 0765d3824d [IRBuilder] Virtualize IRBuilder
Related llvm-dev thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html

This patch moves the IRBuilder from templating over the constant
folder and inserter towards making both of these virtual.
There are a couple of motivations for this:

1. It's not possible to share code between use-sites that use
different IRBuilder folders/inserters (short of templating the code
and moving it into headers).
2. Methods currently defined on IRBuilderBase (which is not templated)
do not use the custom inserter, resulting in subtle bugs (e.g.
incorrect InstCombine worklist management). It would be possible to
move those into the templated IRBuilder, but...
3. The vast majority of the IRBuilder implementation has to live
in the header, because it depends on the template arguments.
4. We have many unnecessary dependencies on IRBuilder.h,
because it is not easy to forward-declare. (Significant parts of
the backend depend on it via TargetLowering.h, for example.)

This patch addresses the issue by making the following changes:

* IRBuilderDefaultInserter::InsertHelper becomes virtual.
  IRBuilderBase accepts a reference to it.
* IRBuilderFolder is introduced as a virtual base class. It is
 implemented by ConstantFolder (default), NoFolder and TargetFolder.
  IRBuilderBase has a reference to this as well.
* All the logic is moved from IRBuilder to IRBuilderBase. This means
  that methods can in the future replace their IRBuilder<> & uses
  (or other specific IRBuilder types) with IRBuilderBase & and thus
  be usable with different IRBuilders.
* The IRBuilder class is now a thin wrapper around IRBuilderBase.
  Essentially it only stores the folder and inserter and takes care
  of constructing the base builder.

What this patch doesn't do, but should be simple followups after this change:

* Fixing use of the inserter for creation methods originally defined
  on IRBuilderBase.
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful.
* Moving code from the IRBuilder header to the source file.

From the user perspective, these changes should be mostly transparent:
The only thing that consumers using a custom inserted may need to do is
inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper
as public.

Differential Revision: https://reviews.llvm.org/D73835
2020-02-16 13:48:55 +01:00
Johannes Doerfert 1d5da8cd30 [Attributor][FIX] Use pointer not reference as it can be null 2020-02-15 20:38:49 -06:00
Florian Hahn f8045b250d Recommit "[SCCP] Remove forcedconstant, go to overdefined instead"
This includes a fix for cases where things get marked as overdefined in
ResolvedUndefsIn, but we later discover a constant. To avoid crashing,
we consistently bail out on overdefined values in the visitors. This is
similar to the previous behavior with forcedconstant.

This reverts the revert commit 02b72f564c.
2020-02-15 18:36:44 +01:00
Simon Pilgrim 8a48c4a97c Fix boolean/bitwise operator precedence warnings. NFCI. 2020-02-15 13:53:18 +00:00
Johannes Doerfert ef746aa11f [Attributor] Collect memory accesses with their respective kind and location
In addition to a single bit per memory locations, e.g., globals and
arguments, we now collect more information about the actual accesses,
e.g., what instruction caused it, was it a read/write/read+write, and
what the underlying base pointer was. Follow up patches will make
explicit use of this.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D73527
2020-02-15 02:12:04 -06:00
Fangrui Song fd5665af2c [Attributor] Fix -Wunused-variable for -DLLVM_ENABLE_ASSERTIONS=off builds after b4352e43d8 2020-02-14 21:47:19 -08:00
Johannes Doerfert b70297a39a [Attributor][FIX] Ensure abstract attributes are existing before manifest
While the function return updateImpl did only look at call sites the
manifest method looked at return values. If we don't do this during the
updateImpl we might create new abstract attributes during manifest. This
is a problem when it comes to liveness information.
2020-02-14 21:44:46 -06:00
Johannes Doerfert ad121ea14d [Attributor] Manifest simplified (return) values properly
If we simplify a function return value we have to modify the return
instructions.
2020-02-14 21:44:46 -06:00
Johannes Doerfert b53af0e7f9 [Attributor][FIX] Collapse `undef` to a proper value
If we see an undef we cannot assume it's the same as "no value". For now
we just collapse it to 0.
2020-02-14 21:44:46 -06:00
Johannes Doerfert 137c99a6a5 [Attributor][FIX] Restrict cross-SCC call deletion
If we know a call was not needed we might have ended up deleting it even
if it was in a different SCC. This prevents us from doing so.
2020-02-14 21:44:46 -06:00
Johannes Doerfert 32e98a7089 [Attributor][FIX] Carefully strip casts in AANoAlias
We can strip casts in AANoAlias but that might cause us to end up with a
non-pointer type. We do properly handle that case now.
2020-02-14 21:44:46 -06:00
Johannes Doerfert b4352e43d8 [Attributor][FIX] Do not RAUW void values
This caused an error when passes iterated over cached assumptions in the
tracker and assumed them to be `null` or an instruction. I failed to
create a test case so far.
2020-02-14 21:44:46 -06:00
Johannes Doerfert 282f5d7ad1 [Attributor] Derive memory location attributes (argmemonly, ...)
In addition to memory behavior attributes (readonly/writeonly) we now
derive memory location attributes (argmemonly/inaccessiblememonly/...).
The former is part of AAMemoryBehavior and the latter part of
AAMemoryLocation. While they are similar in nature it got messy when
they were put in a single AA. Location attributes for arguments and
floating values will follow later.

Note that both memory attributes kinds can derive readnone. If there are
no accesses AAMemoryBehavior will derive readnone. If there are accesses
but only to stack (=local) locations AAMemoryLocation will derive
readnone.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D73426
2020-02-14 19:05:51 -06:00
Johannes Doerfert 7cbb107feb [Attributor][FIX] Validate the type for AAValueConstantRange as needed
Due to the genericValueTraversal we might visit values for which we did
not create an AAValueConstantRange object, e.g., as they are behind a
PHI or select or call with `returned` argument. As a consequence we need
to validate the types as we are about to query AAValueConstantRange for
operands.
2020-02-14 17:22:40 -06:00
Alina Sbirlea 1326a5a4cf [LoopRotate] Get and update MSSA only if available in legacy pass manager.
Summary:
Potential fix for: https://bugs.llvm.org/show_bug.cgi?id=44889 and https://bugs.llvm.org/show_bug.cgi?id=44408

In the legacy pass manager, loop rotate need not compute MemorySSA when not being in the same loop pass manager with other loop passes.
There isn't currently a way to differentiate between the two cases, so this attempts to limit the usage in LoopRotate to only update MemorySSA when the analysis is already available.
The side-effect of this is that it will split the Loop pipeline.

This issue does not apply to the new pass manager, where we have a flag specifying if all loop passes in that loop pass manager preserve MemorySSA.

Reviewers: dmgreen, fedor.sergeev, nikic

Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74574
2020-02-14 10:47:26 -08:00
Kadir Cetinkaya 1674f772b4
[VecotrCombine] Fix unused variable for assertion disabled builds 2020-02-14 09:30:29 +01:00
Vedant Kumar 8e77b33b3c [Local] Do not move around dbg.declares during replaceDbgDeclare
replaceDbgDeclare is used to update the descriptions of stack variables
when they are moved (e.g. by ASan or SafeStack). A side effect of
replaceDbgDeclare is that it moves dbg.declares around in the
instruction stream (typically by hoisting them into the entry block).
This behavior was introduced in llvm/r227544 to fix an assertion failure
(llvm.org/PR22386), but no longer appears to be necessary.

Hoisting a dbg.declare generally does not create problems. Usually,
dbg.declare either describes an argument or an alloca in the entry
block, and backends have special handling to emit locations for these.
In optimized builds, LowerDbgDeclare places dbg.values in the right
spots regardless of where the dbg.declare is. And no one uses
replaceDbgDeclare to handle things like VLAs.

However, there doesn't seem to be a positive case for moving
dbg.declares around anymore, and this reordering can get in the way of
understanding other bugs. I propose getting rid of it.

Testing: stage2 RelWithDebInfo sanitized build, check-llvm

rdar://59397340

Differential Revision: https://reviews.llvm.org/D74517
2020-02-13 14:35:02 -08:00
Sanjay Patel 19b62b79db [VectorCombine] try to form vector binop to eliminate an extract element
binop (extelt X, C), (extelt Y, C) --> extelt (binop X, Y), C

This is a transform that has been considered for canonicalization (instcombine)
in the past because it reduces instruction count. But as shown in the x86 tests,
it's impossible to know if it's profitable without a cost model. There are many
potential target constraints to consider.

We have implemented similar transforms in the backend (DAGCombiner and
target-specific), but I don't think we have this exact fold there either (and if
we did it in SDAG, it wouldn't work across blocks).

Note: this patch was intended to handle the more general case where the extract
indexes do not match, but it got too big, so I scaled it back to this pattern
for now.

Differential Revision: https://reviews.llvm.org/D74495
2020-02-13 17:23:27 -05:00
Vedant Kumar 02b72f564c Revert "Recommit "[SCCP] Remove forcedconstant, go to overdefined instead""
This reverts commit bb310b3f73. This
breaks the stage2 ASan build, see:

https://bugs.llvm.org/show_bug.cgi?id=44898

rdar://59431448
2020-02-13 11:55:18 -08:00
stozer 9bda7ab835 Re-revert: Recover debug intrinsics when killing duplicated/empty blocks
This reverts commit 61b35e4111.

This commit causes a timeout in chromium builds; likely to have a
similar cause to the previous timeout issue caused by this commit (see
6ded69f294 for more details). It is possible that there is no way to
fix this bug that will not cause this issue; further investigations as
to the efficiency of handling large amounts of debug info will be
necessary.
2020-02-13 11:48:19 +00:00
Johannes Doerfert 70cac41a2b Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
Reapply 8a56d64d76 with minor fixes.

The problem was that cancellation can cause new edges to the parallel
region exit block which is not outlined. The CodeExtractor will encode
the information which "exit" was taken as a return value. The fix is to
ensure we do not return any value from the outlined function, to prevent
control to value conversion we ensure a single exit block for the
outlined region.

This reverts commit 3aac953afa.
2020-02-12 22:29:07 -06:00
Johannes Doerfert 3aac953afa Revert "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
This reverts commit 8a56d64d76.

Will be recommitted once the clang test problem is addressed.
2020-02-12 18:50:43 -06:00
Johannes Doerfert 8a56d64d76 [OpenMP][IRBuilder] Perform finalization (incl. outlining) late
In order to fix PR44560 and to prepare for loop transformations we now
finalize a function late, which will also do the outlining late. The
logic is as before but the actual outlining step happens now after the
function was fully constructed. Once we have loop transformations we
can apply them in the finalize step before the outlining.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74372
2020-02-12 17:55:01 -06:00
Johannes Doerfert 23f41f16d4 [Attributor] Use fine-grained liveness in all helpers
We used coarse-grained liveness before, thus we looked if the
instruction was executed, but we did not use fine-grained liveness,
hence if the instruction was needed or could be deleted even if the
surrounding ones are live. This patches introduces this level of
liveness checks together with other liveness queries, e.g., for uses.

For more control we enforce that all liveness queries go through the
Attributor.

Test have been adjusted to reflect the changes or augmented to prevent
deletion of the parts we want to check.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D73313
2020-02-12 17:36:38 -06:00
Johannes Doerfert b2c76002ca [Attributor] Ignore uses if a value is simplified
If we have a replacement for a value, via AAValueSimplify, the original
value will lose all its uses. Thus, as long as a value is simplified we
can skip the uses in checkForAllUses, given that these uses are
transitive uses for the simplified version and will therefore affect the
simplified version as necessary.

Since this allowed us to remove calls without side-effects and a known
return value, we need to make sure not to eliminate `musttail` calls.
Those we keep around, or later remove the entire `musttail` call chain.
2020-02-12 17:36:38 -06:00
Johannes Doerfert 86509e8c3b [Attributor] Use assumed information to determine side-effects
We relied on wouldInstructionBeTriviallyDead before but that functions
does not take assumed information, especially for calls, into account.
The replacement, AAIsDead::isAssumeSideEffectFree, does.

This change makes AAIsDeadCallSiteReturn more complex as we can have
a dead call or only dead users.

The test have been modified to include a side effect where there was
none in order to keep the coverage.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D73311
2020-02-12 17:36:38 -06:00
Ehud Katz d8a2ea9fd5 [LoopExtractor] Fix legacy pass dependencies
Fixes a memory leak of allocating `LoopInfoWrapperPass` and `DominatorTreeWrapperPass`.
2020-02-12 22:39:21 +02:00
Vedant Kumar 34d9f93977 [AddressSanitizer] Ensure only AllocaInst is passed to dbg.declare
Various parts of the LLVM code generator assume that the address
argument of a dbg.declare is not a `ptrtoint`-of-alloca. ASan breaks
this assumption, and this results in local variables sometimes being
unavailable at -O0.

GlobalISel, SelectionDAG, and FastISel all do not appear to expect
dbg.declares to have a `ptrtoint` as an operand. This means that they do
not place entry block allocas in the usual side table reserved for local
variables available in the whole function scope. This isn't always a
problem, as LLVM can try to lower the dbg.declare to a DBG_VALUE, but
those DBG_VALUEs can get dropped for all the usual reasons DBG_VALUEs
get dropped. In the ObjC test case I'm looking at, the cause happens to
be that `replaceDbgDeclare` has hoisted dbg.declares into the entry
block, causing LiveDebugValues to "kill" the DBG_VALUEs because the
lexical dominance check fails.

To address this, I propose:

1) Have ASan (always) pass an alloca to dbg.declares (this patch). This
   is a narrow bugfix for -O0 debugging.

2) Make replaceDbgDeclare not move dbg.declares around. This should be a
   generic improvement for optimized debug info, as it would prevent the
   lexical dominance check in LiveDebugValues from killing as many
   variables.

   This means reverting llvm/r227544, which fixed an assertion failure
   (llvm.org/PR22386) but no longer seems to be necessary. I was able to
   complete a stage2 build with the revert in place.

rdar://54688991

Differential Revision: https://reviews.llvm.org/D74369
2020-02-12 11:24:02 -08:00
Jay Foad 32aac25637 [KnownBits] Introduce anyext instead of passing a flag into zext
Summary:
This was a very odd API, where you had to pass a flag into a zext
function to say whether the extended bits really were zero or not. All
callers passed in a literal true or false.

I think it's much clearer to make the function name reflect the
operation being performed on the value we're tracking (rather than on
the KnownBits Zero and One fields), so zext means the value is being
zero extended and new function anyext means the value is being extended
with unknown bits.

NFC.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74482
2020-02-12 19:06:53 +00:00
Florian Hahn bb310b3f73 Recommit "[SCCP] Remove forcedconstant, go to overdefined instead"
This version includes a fix for a set of crashes caused by marking
values depending on a yet unknown & tracked call as overdefined.

In some cases, we would later discover that the call has a constant
result and try to mark a user of it as constant, although it was already
marked as overdefined. Most instruction handlers bail out early if the
instruction is already overdefined. But that is not necessary for
CastInsts for example. By skipping values that depend on skipped
calls, we resolve the crashes and also improve the precision in some
cases (see resolvedundefsin-tracked-fn.ll).

Note that we may not skip PHI nodes that may depend on a skipped call,
but they can be safely marked as overdefined, as we bail out early if
the PHI node is overdefined.

This reverts the revert commit
a74b31a3e9cd844c7ce2087978568e3f5ec8519.
2020-02-12 18:02:18 +00:00
Anh Tuyen Tran a5b6480d05 [NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files
Summary:
This refactor patch removes some header files which are not needed and also add some to meet IWYU principles.

Reviewers: rnk (Reid Kleckner), Meinersbur (Michael Kruse), dmgreen (Dave Green)

Reviewed By: dmgreen (Dave Green), rnk (Reid Kleckner), Meinersbur (Michael Kruse)

Subscribers: dmgreen (Dave Green), Whitney (Whitney Tsang), hiraditya (Aditya Kumar), zzheng (Z. Zheng), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D73498
2020-02-12 17:57:56 +00:00
Alina Sbirlea 4f33a68973 Compute ORE, BPI, BFI in Loop passes.
Summary:
Passes ORE, BPI, BFI are not being preserved by Loop passes, hence it
is incorrect to retrieve these passes as cached.
This patch makes the loop passes in question compute a new instance.

In some of these cases, however, it may be beneficial to change the Loop pass to
a Function pass instead, similar to the change for LoopUnrollAndJam.

Reviewers: chandlerc, dmgreen, jdoerfert, reames

Subscribers: mehdi_amini, hiraditya, zzheng, steven_wu, dexonsmith, Whitney, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72891
2020-02-12 09:15:18 -08:00
Sven van Haastregt 665dcdacc0 Add missing newlines at EOF; NFC 2020-02-12 15:57:25 +00:00
stozer 61b35e4111 Re-reapply: Recover debug intrinsics when killing duplicated/empty blocks
This reverts commit 636c93ed11.

The original patch caused build failures on TSan buildbots. Commit 6ded69f294
fixes this issue by reducing the rate at which empty debug intrinsics
propagate, reducing the memory footprint and preventing a fatal spike.
2020-02-12 14:36:30 +00:00
Florian Hahn 81dbb6aec6 Recommit "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)."
This includes a fix for the santizier failures.

This reverts the revert commit
42f8b915eb.
2020-02-12 14:17:50 +00:00
Ayman Musa 35f02aa021 Revert "[AggressiveInstCombine] Add support for ICmp instr that feeds a select intsr's condition operand."
This reverts commit cf155150f9.
2020-02-12 15:04:49 +02:00
Ayman Musa cf155150f9 [AggressiveInstCombine] Add support for ICmp instr that feeds a select intsr's condition operand. 2020-02-12 15:01:27 +02:00
stozer ffeb64db35 Reapply "[DebugInfo] Prevent explosion of debug intrinsics during jump threading"
This reverts commit 6ded69f294.
2020-02-12 12:39:54 +00:00
Ayman Musa 3bda9059b8 [AggressiveInstCombine] Add support for select instruction.
Differential Revision: https://reviews.llvm.org/D72837
2020-02-12 13:59:34 +02:00
stozer 6ded69f294 Revert "[DebugInfo] Prevent explosion of debug intrinsics during jump threading"
This reverts commit fe6f6cd6b8.

Found test failure on several buildbots.
2020-02-12 11:48:00 +00:00
Ayman Musa 49a4d85f6d [NFC][AggressiveInstCombine] Remove redundant std::max.
Differential Revision: https://reviews.llvm.org/D74476
2020-02-12 13:47:40 +02:00
stozer fe6f6cd6b8 [DebugInfo] Prevent explosion of debug intrinsics during jump threading
This patch is a fix following the revert of 72ce759
(https://reviews.llvm.org/rG72ce759928e6dfee6a9efa310b966c19722352ba)
and fixes the failure that it caused.

The above patch failed on the Thread Sanitizer buildbot with an out of
memory error. After an investigation, the cause was identified as an
explosion in debug intrinsics while running the Jump Threading pass on
ModuleMap.ll. The above patched prevented debug intrinsics from being
dropped when their Basic Block was deleted due to being "empty". In this
case, one of the functions in ModuleMap.ll had (after many optimization
passes) a very large number of debug intrinsics representing a set of
repeatedly inlined variables. Previously the vast majority of these were
silently dropped during Jump Threading when their blocks were deleted,
but as of the above patch they survived for longer, causing a large
increase in the number of debug intrinsics. These intrinsics were then
repeatedly cloned by the Jump Threading pass as edges were threaded,
multiplying the intrinsic count further. The memory consumed by this
process spiralled out of control, crashing the buildbot that uses TSan
(which has an estimated 5-10x memory overhead compared to non-sanitized
builds).

This patch adds RemoveRedundantDbgInstrs to the Jump Threading pass, in
order to reduce the number of debug intrinsics down to a manageable
amount in cases where many intrinsics for the same variable end up
bunched together contiguously, as in this case.

Differential Revision: https://reviews.llvm.org/D73054
2020-02-12 11:22:54 +00:00
Florian Hahn fa74b31a3e Revert "[SCCP] Remove forcedconstant, go to overdefined instead"
This causes a crash for the reproducer below

 enum { a };
 enum b { c, d };
 e;
 static _Bool g(struct f *h, enum b i) {
   i &&j();
   return a;
 }
 static k(char h, enum b i) {
   _Bool l = g(e, i);
   l;
 }
 m(h) {
   k(h, c);
   g(h, d);
 }

This reverts commit aadb635e04.
2020-02-12 09:41:19 +00:00
Matt Arsenault 86f9117d47 AMDGPU: Don't report 2-byte alignment as fast
This is apparently worse than 1-byte alignment. This does not attempt
to decompose 2-byte aligned wide stores, but will stop trying to
produce them.

Also fix bug in LoadStoreVectorizer which was decreasing the alignment
and vectorizing stack accesses. It was assuming a stack object was an
alloca that could have its base alignment changed, which is not true
if the pointer is derived from a function argument.
2020-02-11 18:35:00 -05:00
Johannes Doerfert 52aec3221f [Attributor][NFC] Clarify the documentation a bit more 2020-02-11 15:11:55 -06:00
Johannes Doerfert 8e62968d45 [Attributor] Identify dead uses in PHIs (almost) based on dead edges
As an approximation to a dead edge we can check if the terminator is
dead. If so, the corresponding operand use in a PHI node is dead even if
the PHI node itself is not.
2020-02-11 15:11:55 -06:00
Teresa Johnson 80d0a137a5 Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 748bb5a0f1, along
with a fix for a Chromium test suite build issue (and a new test for
that case).

Differential Revision: https://reviews.llvm.org/D73242
2020-02-11 10:48:05 -08:00
Johannes Doerfert 185e9b083e [Attributor][NFC] Improve documentation 2020-02-11 11:19:34 -06:00
Johannes Doerfert f95553923f [Attributor] Return uses do not free pointers
If a pointer is returned that does not mean it is freed in the current
(function) scope. We can ignore such uses in AANoFree.
2020-02-11 11:02:59 -06:00
Johannes Doerfert 4c62a35860 [Attributor][FIX] Remove duplicate, half-broken functionality
The changeXXXAfterManifest functions are better suited to deal with
changes so we should prefer them. These functions also recursively
delete dead instructions which is why we see test changes.
2020-02-11 11:02:59 -06:00
Johannes Doerfert 77a9e61c9a [Attributor][NFC] Improve debug message 2020-02-11 11:02:59 -06:00
Nikita Popov 5a8819b216 [InstCombine] Use replaceOperand() in more places
This is a followup to D73803, which uses the replaceOperand()
helper in more places.

This should be NFC apart from changes to worklist order.

Differential Revision: https://reviews.llvm.org/D73919
2020-02-11 17:38:23 +01:00
Florian Hahn aadb635e04 [SCCP] Remove forcedconstant, go to overdefined instead
This patch removes forcedconstant to simplify things for the
move to ValueLattice, which includes constant ranges, but no
forced constants.

This patch removes forcedconstant and changes ResolvedUndefsIn
to mark instructions with unknown operands as overdefined. This
means we do not do simplifications based on undef directly in SCCP
any longer, but this seems to hardly come up in practice (see stats
below), presumably because InstCombine & others take care
of most of the relevant folds already.

It is still beneficial to keep ResolvedUndefIn, as it allows us delaying
going to overdefined until we propagated all known information.

I also built MultiSource, SPEC2000 and SPEC2006 and compared
sccp.IPNumInstRemoved and sccp.NumInstRemoved. It looks like the impact
is quite low:

Tests: 244
Same hash: 238 (filtered out)
Remaining: 6
Metric: sccp.IPNumInstRemoved

Program                                        base     patch    diff
 test-suite...arks/VersaBench/dbms/dbms.test     4.00    3.00  -25.0%
 test-suite...TimberWolfMC/timberwolfmc.test    38.00   34.00  -10.5%
 test-suite...006/453.povray/453.povray.test   158.00  155.00  -1.9%
 test-suite.../CINT2000/176.gcc/176.gcc.test   668.00  668.00   0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test   1209.00 1209.00  0.0%
 test-suite...arks/mafft/pairlocalalign.test    76.00   76.00   0.0%

Tests: 244
Same hash: 238 (filtered out)
Remaining: 6
Metric: sccp.NumInstRemoved

Program                                        base    patch     diff
 test-suite...arks/mafft/pairlocalalign.test   185.00  175.00  -5.4%
 test-suite.../CINT2006/403.gcc/403.gcc.test   2059.00 2056.00 -0.1%
 test-suite.../CINT2000/176.gcc/176.gcc.test   2358.00 2357.00 -0.0%
 test-suite...006/453.povray/453.povray.test   317.00  317.00   0.0%
 test-suite...TimberWolfMC/timberwolfmc.test    12.00   12.00   0.0%

Reviewers: davide, efriedma, mssimpso

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D61314
2020-02-11 15:24:15 +00:00
Kadir Cetinkaya 42f8b915eb
Revert "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)."
This reverts commit d0c4d4fe09.

Revert "[DSE,MSSA] Move more passing test cases from todo to simple.ll."

This reverts commit 02266e64bb.

Revert "[DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE."

This reverts commit 74f03e4ff0.
2020-02-11 15:34:48 +01:00
Sanjay Patel a2a0f9a43a [VectorCombine] remove unused debug counter; NFC
The variable was added to the initial commit via copy/paste of existing
code, but it wasn't actually used in the code. We can add it back with
the proper usage if/when that is needed.
2020-02-11 08:24:07 -05:00
Sanjay Patel b8ebc11f03 [EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083)
As discussed in PR41083:
https://bugs.llvm.org/show_bug.cgi?id=41083
...we can assert/crash in EarlyCSE using the current hashing scheme and
instructions with flags.

ValueTracking's matchSelectPattern() may rely on overflow (nsw, etc) or
other flags when detecting patterns such as min/max/abs composed of
compare+select. But the value numbering / hashing mechanism used by
EarlyCSE intersects those flags to allow more CSE.

Several alternatives to solve this are discussed in the bug report.
This patch avoids the issue by doing simple matching of min/max/abs
patterns that never requires instruction flags. We give up some CSE
power because of that, but that is not expected to result in much
actual performance difference because InstCombine will canonicalize
these patterns when possible. It even has this comment for abs/nabs:

  /// Canonicalize all these variants to 1 pattern.
  /// This makes CSE more likely.

(And this patch adds PhaseOrdering tests to verify that the expected
transforms are still happening in the standard optimization pipelines.

I left this code to use ValueTracking's "flavor" enum values, so we
don't have to change the callers' code. If we decide to go back to
using the ValueTracking call (by changing the hashing algorithm
instead), it should be obvious how to replace this chunk.

Differential Revision: https://reviews.llvm.org/D74285
2020-02-10 17:25:34 -05:00
Hiroshi Yamauchi bb383ae612 [CallPromotionUtils] Add tryPromoteCall.
Summary: It attempts to devirtualize a call on alloca through vtable loads.

Reviewers: davidxl

Subscribers: mgorny, Prazek, hiraditya, llvm-commits

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71308
2020-02-10 13:43:16 -08:00
Sanjay Patel 62ce7e650a [InstCombine] fix use check when canonicalizing abs/nabs
We were checking for extra uses of the negated operand even
if we were not going to create it as part of this canonicalization.

This was showing up as a regression when we limit EarlyCSE as
proposed in D74285.
2020-02-10 14:57:37 -05:00
David Stenberg 982944525c Revert "[InstCombine][DebugInfo] Fold constants wrapped in metadata"
This reverts commit b54a8ec1bc.

The commit triggered debug invariance (different output with/without
-g). The patch seems to have exposed a pre-existing invariance problem
in GlobalOpt, which I'll write a bug report for.
2020-02-10 17:58:33 +01:00
Bill Wendling c55cf4afa9 Revert "Remove redundant "std::move"s in return statements"
The build failed with

  error: call to deleted constructor of 'llvm::Error'

errors.

This reverts commit 1c2241a793.
2020-02-10 07:07:40 -08:00
Bill Wendling 1c2241a793 Remove redundant "std::move"s in return statements 2020-02-10 06:39:44 -08:00
Mikael Holmen a50c0b0df7 Fix compiler warning when compiling without asserts [NFC] 2020-02-10 13:55:52 +01:00
Florian Hahn d0c4d4fe09 [DSE] Add first version of MemorySSA-backed DSE (Bottom up walk).
This patch adds a first version of a MemorySSA based DSE. It is missing
a lot of features, which will get added as follow-ups, to help to keep
the review manageable.

The patch uses the following general approach: given a MemoryDef, walk
upwards to find clobbering MemoryDefs that may be killed by the
starting def. Then check that there are no uses that may read the
location of the original MemoryDef in between both MemoryDefs. A bit
more concretely:

For all MemoryDefs StartDef:
1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards.
2. Check that there no reads between DomAccess and the StartDef by checking
   all uses starting at DomAccess and walking until we see StartDef.
3. For each found DomDef, check that:
  1. There are no barrier instructions between DomDef and StartDef (like
     throws or stores with ordering constraints).
  2. StartDef is executed whenever DomDef is executed.
3. StartDef completely overwrites DomDef.
4. Erase DomDef from the function and MemorySSA.

The patch uses a very simple approach to guarantee that no throwing
instructions are between 2 stores: We only allow accesses to stack
objects, access that are in the same basic block if the block does not
contain any throwing instructions or accesses in functions that do
not contain any throwing instructions. This will get lifted later.

Besides adding support for the missing cases, there is plenty of additional
potential for improvements as follow-up work, e.g. the way we visit stores
(could be just a traversal of the MemorySSA, rather than collecting them
up-front), using the alias information discovered during walking to optimize
the MemorySSA.

This is loosely based on D40480 by Dave Green.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72700
2020-02-10 11:52:11 +00:00
Florian Hahn da52b9c118 [DSE] Add tests for MemorySSA based DSE.
This copies the DSE tests into a MSSA subdirectory to test the MemorySSA
backed DSE implementation, without disturbing the original tests.

Differential Revision: https://reviews.llvm.org/D72145
2020-02-10 10:28:43 +00:00
Johannes Doerfert 87ddf1f4fa [Attributor] Simple casts preserve no-alias property
This is a minimal but important advancement over the existing code. A
cast with an operand that is only used in the cast retains the no-alias
property of the operand.
2020-02-10 01:11:32 -06:00
Johannes Doerfert 8155439331 [Attributor] Allow PHI nodes in AAValueConstantRangeFloating
Traversing PHI nodes is natural with the genericValueTraversal but also
a bit tricky. The problem is similar to the ones we have seen in AAAlign
and AADereferenceable, namely that we continue to increase the range in
each iteration. We use a pessimistic approach here to stop the
iterations. Nevertheless, optimistic information can now be propagated
through a PHI node.
2020-02-10 00:55:10 -06:00
Johannes Doerfert 63adbb9a0e [Attributor][FIX] Remove FIXME that seems outdated
The change is performed as stated by the FIXME and the tests are
adjusted. All changes look fine to me and values can be inferred as
undef without it being an error.
2020-02-10 00:55:10 -06:00
Johannes Doerfert 7e7e6594b3 [Attributor] Allow SelectInst in AAValueConstantRangeFloating
The genericValueTraversal will already handle SelectInst properly and we
just needed to allow them in the initialize method.
2020-02-10 00:55:09 -06:00
Johannes Doerfert ffdbd2a06c [Attributor] Look through (some) casts in AAValueConstantRangeFloating
Casts can be handled natively by the ConstantRange class. We do limit it
to extends for now as we assume an integer type in different locations.
A TODO and a test case with a FIXME was added to remove that restriction
in the future.
2020-02-10 00:38:01 -06:00
Johannes Doerfert 028db8c490 [Attributor][FIX] Call right base method in AAValueConstantRangeFloating
We now call the base class method as we should.
2020-02-10 00:38:01 -06:00
Michael Liao ab3da5dd66 Fix `-Wparentheses` warning. NFC. 2020-02-10 00:45:02 -05:00
Sanjay Patel a17f03bd93 [VectorCombine] new IR transform pass for partial vector ops
We have several bug reports that could be characterized as "reducing scalarization",
and this topic was also raised on llvm-dev recently:
http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html
...so I'm proposing that we deal with these patterns in a new, lightweight IR vector
pass that runs before/after other vectorization passes.

There are 4 alternate options that I can think of to deal with this kind of problem
(and we've seen various attempts at all of these), but they all have flaws:

    InstCombine - can't happen without TTI, but we don't want target-specific
                  folds there.
    SDAG - too late to assist other vectorization passes; TLI is not equipped
           for these kind of cost queries; limited to a single basic block.
    CGP - too late to assist other vectorization passes; would need to re-implement
          basic cleanups like CSE/instcombine.
    SLP - doesn't fit with existing transforms; limited to a single basic block.

This initial patch/transform is based on existing code in AggressiveInstCombine:
we walk backwards through the function looking for a pattern match. But we diverge
from that cost-independent IR canonicalization pass by using TTI to decide if the
vector alternative is profitable.

We probably have at least 10 similar bug reports/patterns (binops, constants,
inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements.
It's possible that we could iterate on a worklist to fix-point like InstCombine does,
but it's safer to start with a most basic case and evolve from there, so I didn't
try to do anything fancy with this initial implementation.

Differential Revision: https://reviews.llvm.org/D73480
2020-02-09 10:04:41 -05:00
Ehud Katz 3b70ee27a5 [LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass
The LoopExtractor created new functions (by definition), which violates
the restrictions of a LoopPass.
The correct implementation of this pass should be as a ModulePass.
Includes reverting rL82990 implications on the LoopExtractor.

Fixes PR3082 and PR8929.

Differential Revision: https://reviews.llvm.org/D69069
2020-02-09 12:25:21 +02:00
Johannes Doerfert b0c77c36d2 [Attributor] Add an Attributor CGSCC pass and run it
In addition to the module pass, this patch introduces a CGSCC pass that
runs the Attributor on a strongly connected component of the call graph
(both old and new PM). The Attributor was always design to be used on a
subset of functions which makes this patch mostly mechanical.

The one change is that we give up `norecurse` deduction in the module
pass in favor of doing it during the CGSCC pass. This makes the
interfaces simpler but can be revisited if needed.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D70767
2020-02-08 21:27:34 -06:00
Johannes Doerfert e565db49c6 [OpenMP][Opt] Delete terminating and read-only parallel regions
Parallel regions known to be read-only, e.g., after we removed all dead
write accesses, and terminating (`willreturn`) can be removed.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69954
2020-02-08 18:52:04 -06:00
Johannes Doerfert e28936f613 [OpenMP][Opt] Annotate known runtime functions and deduplicate more
This adds ~27 more runtime calls to the OpenMPKinds.def file, all with
attributes. We deduplicate 16 of those automatically in function =
thread scope. And we annotate all of them automatically during the
OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to
track annotation coverage is included.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69984
2020-02-08 18:35:39 -06:00
Nikita Popov a05932931c [InstCombine] Refactor foldICmpAndShift(); NFCI
Separate out handling for shl, lshr and ashr. The combined handling
obscured some overly pessimistic requirements for the transform.
2020-02-08 22:27:43 +01:00
Johannes Doerfert 9548b74a83 [OpenMP] Introduce the OpenMPOpt transformation pass
The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.

The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.

This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69930
2020-02-08 14:47:03 -06:00
Johannes Doerfert 72277ecd62 Introduce a CallGraph updater helper class
The CallGraphUpdater is a helper that simplifies the process of updating
the call graph, both old and new style, while running an CGSCC pass.

The uses are contained in different commits, e.g. D70767.

More functionality is added as we need it.

Reviewed By: modocache, hfinkel

Differential Revision: https://reviews.llvm.org/D70927
2020-02-08 14:16:48 -06:00
George Burgess IV f8c9ceb1ce [SimplifyLibCalls] Add __strlen_chk.
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.

Differential Revision: https://reviews.llvm.org/D74079
2020-02-08 11:51:00 -08:00
Nikita Popov a148b9e990 [InstCombine] Fix infinite min/max canonicalization loop (PR44541)
While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541,
it does so in a more roundabout manner and there might be other
loopholes to trigger the same issue. This is a more direct fix,
that prevents the transform if the min/max is based on a
non-canonical sub X, 0 instruction.

Differential Revision: https://reviews.llvm.org/D73849
2020-02-08 20:42:17 +01:00
Nikita Popov 5b2b67be8e [InstCombine] Remove unnecessary worklist push; NFCI
This is no longer needed after d4627b90a0,
should have dropped it there...
2020-02-08 17:09:28 +01:00
Nikita Popov d4627b90a0 [InstCombine] Avoid modifying instructions in-place
As discussed on D73919, this replaces a few cases where we were
modifying multiple operands of instructions in-place with the
creation of a new instruction, which we generally prefer nowadays.

This tends to be more readable and less prone to worklist management
bugs.

Test changes are only superficial (instruction naming and order).
2020-02-08 17:05:56 +01:00
Nikita Popov 9d03b7d0d0 [InstCombine] Use swapValues(); NFC
Less code, and makes it more obvious that these operands do not
need to be added back to the worklist.
2020-02-08 16:57:28 +01:00
Nikita Popov 23db9724d0 [InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform
if it wouldn't actually do anything (apart from removing and reinserting
the same instructions).

Note that the test case doesn't loop on current master anymore, only
on the LLVM 10 release branch. The issue is already mitigated on master
due to worklist order fixes, but we should fix the root cause there as well.

As a side note, we should probably assert in combineLoadToNewType()
that it does not combine to the same type. Not doing this here, because
this assertion would also be triggered in another place right now.

Differential Revision: https://reviews.llvm.org/D74278
2020-02-08 16:55:22 +01:00
Akira Hatanaka 4dcc029edb [ObjC][ARC] Keep track of phis that have been discovered to avoid an
infinite loop

This fixes a bug introduced in 6770fbb314.

rdar://problem/59137105
2020-02-07 20:33:11 -08:00
Akira Hatanaka 6770fbb314 [ObjC][ARC] Delete ARC runtime calls that take inert phi values
This improves on the following patch, which removed ARC runtime calls
taking inert global variables:

https://reviews.llvm.org/D62433

rdar://problem/59137105
2020-02-07 16:31:36 -08:00
Hiroshi Yamauchi 4ed205c816 [PGO][PGSO] Enable profile guided size optimization for non-cold code under instrumentation PGO.
Summary:
This enables it for large working set size cases only.

This does not enable it under sample PGO.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74073
2020-02-06 10:29:01 -08:00
Denis Antrushin 99a6e405ed [IRCE] Use SCEVExpander to modify loop bound
IRCE pass checks that it can calculate loop bounds by checking
SCEV availability at loop entry. However it is possible that loop
bound SCEV is loop invariant, but instruction used to compute it
resides within loop. In such case adjusting loop bound in preheader
using IRBuilder leads to malformed SSA.
Use SCEVExpander instead to generate proper instructions.

Reviewed-by: mkazantsev
Differential Revision: https://reviews.llvm.org/D73496
2020-02-06 12:44:43 +03:00
Teresa Johnson 25aa2eef99 Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This reverts commit 748bb5a0f1.

Due to Chromium CFI+ThinLTO test crashes reported on patch.
2020-02-05 19:27:32 -08:00
Juneyoung Lee 5687acf431 [MemCpyOpt] Simplify find*Alignment 2020-02-06 06:42:07 +09:00
Juneyoung Lee ad9ae6ee2b MemCpyOpt cannot use ABI alignment even if it was not given
Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388 which incorrectly assigns an ABI alignment to memset when there was no explicit alignment given.

Reviewers: gchatelet, lenary, nikic

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74083
2020-02-06 06:21:55 +09:00
Hiroshi Yamauchi b70f23f599 [PGO][PGSO] Tune flags for profile guided size optimization.
Summary:
Tune the profile threshold flag value for instrumentation PGO based on internal
benchmarks.

Also, add flags to allow profile guided size optimizations for non-cold code
to be enabled separately for instrumentation and sample PGSO.

Neither changes the default behavior (yet) as it's disabled for non-cold code.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72937
2020-02-05 09:37:32 -08:00
Kazu Hirata 4698bf145d Resubmit^2: [JumpThreading] Thread jumps through two basic blocks
This reverts commit 41784bed01.

Since the original revision ead815924e,
this revision fixes three issues:

- This revision fixes the Windows build.  My original patch improperly
  copied EH pads on Windows.  This patch disregards jump threading
  opportunities having to do with EH pads.

- This revision fixes jump threading to a wrong destination.
  Specifically, my original patch treated any Constant other than 0 as 1
  while evaluating the branch condition.  This bug led to treating
  constant expressions like:

    icmp ugt i8* null, inttoptr (i64 4 to i8*)

  to "true".  This patch fixes the bug by calling isOneValue.

- This revision fixes the cost calculation of two basic blocks being
  threaded through.  Note that getJumpThreadDuplicationCost returns
  "(unsigned)~0" for those basic blocks that cannot be duplicated.  If
  we sum of two return values from getJumpThreadDuplicationCost, we
  could have an unsigned overflow like:

    (unsigned)~0 + 5 = 4

  and mistakenly determine that it's safe and profitable to proceed
  with the jump threading opportunity.  The patch fixes the bug by
  checking each return value before summing them up.

[JumpThreading] Thread jumps through two basic blocks

Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-02-05 09:23:37 -08:00
Alina Sbirlea 67904db23c [IRCE] Make IRCE a Function pass.
Summary: Make InductiveRangeCheckElimination a FunctionPass.

Reviewers: reames, mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73592
2020-02-05 09:22:41 -08:00
Teresa Johnson 748bb5a0f1 [WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP
Summary:
Currently type test assume sequences inserted for devirtualization are
removed during WPD. This patch delays their removal until later in the
optimization pipeline. This is an enabler for upcoming enhancements to
indirect call promotion, for example streamlined promotion guard
sequences that compare against vtable address instead of the target
function, when there are small number of possible vtables (either
determined via WPD or by in-progress type profiling). We need the type
tests to correlate the callsites with the address point offset needed in
the compare sequence, and optionally to associated type summary info
computed during WPD.

This depends on work in D71913 to enable invocation of LowerTypeTests to
drop type test assume sequences, which will now be invoked following ICP
in the ThinLTO post-LTO link pipelines, and also after the existing
export phase LowerTypeTests invocation in regular LTO (which is already
after ICP). We cannot simply move the existing import phase
LowerTypeTests pass later in the ThinLTO post link pipelines, as the
comment in PassBuilder.cpp notes (it must run early because when
performing CFI other passes may disturb the sequences it looks for).

This necessitated adding a new type test resolution "Unknown" that we
can use on the type test assume sequences previously removed by WPD,
that we now want LTT to ignore.

Depends on D71913.

Reviewers: pcc, evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
2020-02-05 08:59:48 -08:00
Alina Sbirlea 1c03cc5a39 [NFCI] Update according to style.
clang-tidy + clang-format
2020-02-04 17:11:36 -08:00
Sanjay Patel dc42ff6697 [InstCombine] add FIXME comment to shuffle transform; NFC
Existing tests:
rG5d04e008f708
rG2a191cf8500f
...should verify that the underlying analysis doesn't improve
too much without updating this user code.
2020-02-04 13:02:06 -05:00
Matt Arsenault a3c814d234 Separately track input and output denormal mode
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
2020-02-04 12:59:21 -05:00
Sanjay Patel 0cf0be993c [InstCombine] fix operands of shouldChangeType() for casted phi transform
This is a bug noted in the recent D72733 and seen
in the similar transform just above the changed source code.

I added tests with illegal types and zexts to show the bug -
we could transform legal phi ops to illegal, etc. I did not add
tests with trunc because we won't see any diffs on those patterns.
That is because InstCombiner::SliceUpIllegalIntegerPHI() appears to
do those transforms independently of datalayout. It can also create
more casts than are present in existing code.

There are some existing regression tests that do not include a
datalayout that would be altered by this fix. I assumed that the
lack of a datalayout in those regression files is an oversight, so
I added the minimal layout (make i32 legal) necessary to preserve
behavior on those tests.

Differential Revision: https://reviews.llvm.org/D73907
2020-02-04 07:45:48 -05:00
Thomas Raoux e53bbf1213 [GVN] Add GVNOption to control load-pre more fine-grained.
Adds the global (cl::opt) GVNOption enable-load-in-loop-pre in order
to control whether the optimization will be performed if the load
is part of a loop.

Patch by Hendrik Greving!

Differential Revision: https://reviews.llvm.org/D73804
2020-02-03 23:00:58 -08:00
Tyker 15f54d348b [NFC] Factor out function to detect if an attribute has an argument. 2020-02-03 22:27:24 +01:00
Alina Sbirlea 388de9dfcd [LoopUtils] Make duplicate method a utility. [NFCI]
Summary:
Method appendLoopsToWorklist is duplicate in LoopUnroll and in the
LoopPassManager as an internal method. Make it an utility.

Reviewers: dmgreen, chandlerc, fedor.sergeev, yamauchi

Subscribers: mehdi_amini, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73569
2020-02-03 10:24:18 -08:00
Nikita Popov 575a975afd [SimplifyLibCalls] Remove unused IRBuilder argument; NFC
isLocallyOpenedFile() does not use IRBuilder.
2020-02-03 19:12:57 +01:00
Nikita Popov 878cb38a5c [InstCombine] Add replaceOperand() helper
Adds a replaceOperand() helper, which is like Instruction.setOperand()
but adds the old operand to the worklist. This reduces the amount of
missing or incorrect worklist management.

This only applies the helper to a relatively small subset of
setOperand() calls in InstCombine, namely those of the pattern
`I.setOperand(); return &I;`, where it is most obviously applicable.

Differential Revision: https://reviews.llvm.org/D73803
2020-02-03 19:00:17 +01:00
Nikita Popov e6c9ab4fb7 [InstCombine] Rename worklist methods; NFC
This renames Worklist.AddDeferred() to Worklist.add() and
Worklist.Add() to Worklist.push(). The intention here is that
Worklist.add() should be the go-to method for explicit worklist
management, while the raw Worklist.push() is mostly for
InstCombine internals. I will then migrate uses of Worklist.push()
to Worklist.add() in followup changes.

As suggested by spatel on D73411 I'm also changing the remaining
method names to lowercase first character, in line with current
coding standards.

Differential Revision: https://reviews.llvm.org/D73745
2020-02-03 18:56:51 +01:00
Nikita Popov a59954051e [InstCombine] Fix unused variable warning; NFC 2020-02-03 18:47:38 +01:00
Teresa Johnson bed4d9c897 [ThinLTO] More efficient export computation (NFC)
Summary:
A recent change to enable more importing of global variables with
references exposed some efficiency issues with export computation.
See D73724 for more information and detailed analysis.

The first was specific to variable importing. The code was marking every
copy of a referenced value (from possibly thousands of files in the case
of linkonce_odr) as exported, and we only need to mark the copy in the
module containing the variable def being imported as exported. The
reason is that this is tracking what values are newly exported as a
result of importing. Anything that was defined in another module and
simply used in the exporting module is already exported, and would have
been identified by the caller (e.g. the LTO API implementations).

The second issue is that the code was re-adding previously exported
values (along with all references). It is easy to identify when a
variable was already imported into the same module (via the
import list insert call return value), and we already did this for
function importing. However, what we weren't doing for either function
or variable importing was avoiding a re-insertion when it was previously
exported into a different importing module. The reason we couldn't do
this is there was no way of telling from the export list whether it was
previously inserted there because its definition was exported (in which
case we already marked all its references as exported) from when it was
inserted there because it was referenced by another exported value (in
which case we haven't yet inserted its own references).

To address this we can restructure the way the export list is
constructed. This patch only adds the actual imported definitions
(variable or function) to the export list for its module during the
import computation. After import computation is complete, where we were
already post-processing the export list we go ahead and add all
references made by those exported values to the export list.

These changes speed up the thin link not only with constant variable
importing enabled, but also without (due to the efficiency improvement
in function importing).

Some thin link user time measurements for one large application, average
of 5 runs:

With constant variable importing enabled:
- without this patch: 479.5s
- with this patch: 74.6s

Without constant variable importing enabled:
- without this patch: 80.6s
- with this patch: 70.3s

Note I have not re-enabled constant variable importing here, as I would
like to do additional compile time measurements with these fixes first.

Reviewers: evgeny777

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73851
2020-02-03 09:15:33 -08:00
Sanjay Patel e78fb556c5 [InstCombine] reassociate splatted vector ops
bo (splat X), (bo Y, OtherOp) --> bo (splat (bo X, Y)), OtherOp

This patch depends on the splat analysis enhancement in D73549.
See the test with comment:
; Negative test - mismatched splat elements
...as the motivation for that first patch.

The motivating case for reassociating splatted ops is shown in PR42174:
https://bugs.llvm.org/show_bug.cgi?id=42174

In that example, a slight change in order-of-associative math results
in a big difference in IR and codegen. This patch gets all of the
unnecessary shuffles out of the way, but doesn't address the potential
scalarization (see D50992 or D73480 for that).

Differential Revision: https://reviews.llvm.org/D73703
2020-02-03 09:08:36 -05:00
Sam Parker 2663a25fad [JumpThreading] Half the duplicate threshold at Oz
Duplicating instructions can lead to code size increases but using
a threshold of 3 is good for reducing code size.

Differential Revision: https://reviews.llvm.org/D72916
2020-02-03 08:40:20 +00:00
Johannes Doerfert 26d02b0f28 [Attributor] AANoRecurse check all call sites for `norecurse`
If all call sites are in `norecurse` functions we can derive `norecurse`
as the ReversePostOrderFunctionAttrsPass does. This should make
ReversePostOrderFunctionAttrsLegacyPass obsolete once the Attributor is
enabled.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D72017
2020-02-02 23:57:17 -06:00
Johannes Doerfert 368f7ee7a5 [Attributor] Propagate known information from `checkForAllCallSites`
If we know that all call sites have been processed we can derive an
early fixpoint. The use in this patch is likely not to trigger right now
but a follow up patch will make use of it.

Reviewed By: uenoku, baziotis

Differential Revision: https://reviews.llvm.org/D72016
2020-02-02 23:57:17 -06:00
Juneyoung Lee 578d2e2cb1 [llvm-extract] Add -keep-const-init commandline option
Summary:
This adds -keep-const-init option to llvm-extract which preserves initializers of
used global constants.

For example:

```
$ cat a.ll
@g = constant i32 0
define i32 @f() {
  %v = load i32, i32* @g
  ret i32 %v
}

$ llvm-extract --func=f a.ll -S -o -
@g = external constant i32
define i32 @f() { .. }

$ llvm-extract --func=f a.ll -keep-const-init -S -o -
@g = constant i32 0
define i32 @f() { .. }
```

This option is useful in checking whether a function that uses a constant global is optimized correctly.

Reviewers: jsji, MaskRay, david2050

Reviewed By: MaskRay

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73833
2020-02-03 14:30:28 +09:00
Johannes Doerfert 342357c568 [Inliner][NoAlias] Use call site attributes too
If we had `noalias` on an argument the inliner created alias scope
metadata already. However, the call site `noalias` annotation was not
considered. Since the Attributor can derive such call site `noalias`
annotation we should treat them the same as argument annotations.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D73528
2020-02-02 23:21:29 -06:00
Tyker a7bbe45a3e Build assume from call
Fix attempt

this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.
2020-02-02 19:43:36 +01:00
Tyker 7cb5d96fbe Revert "[WIP] Build assume from call"
casued buildbot failure

This reverts commit 8ebe001553.
2020-02-02 18:35:19 +01:00
Tyker 8ebe001553 [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 18:15:50 +01:00
Tyker c2d0336208 Revert "[WIP] Build assume from call"
caused build bot failure

This reverts commit 780d2c532f.
2020-02-02 18:09:06 +01:00
Tyker 780d2c532f [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 17:54:31 +01:00
Tyker ad8ffc5010 Revert "[WIP] Build assume from call"
This reverts commit 355e4bfd78.
2020-02-02 17:49:23 +01:00
Tyker 355e4bfd78 [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 17:17:46 +01:00
Tyker 0adda3df92 Revert "[WIP] Build assume from call"
This reverts commit 2ff5602cb5.
2020-02-02 15:05:33 +01:00
Tyker d431c5d9af Revert "[NFC] Factor out function to detect if an attribute has an argument."
This reverts commit ff1b9add2f.
2020-02-02 15:03:06 +01:00
Tyker ff1b9add2f [NFC] Factor out function to detect if an attribute has an argument.
Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72884
2020-02-02 14:50:31 +01:00
Tyker 2ff5602cb5 [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 14:50:31 +01:00
Fangrui Song ba3a1774a9 [Transforms] Simplify with make_early_inc_range 2020-02-02 00:54:32 -08:00
Artur Pilipenko 34547ac959 NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween
Separated from https://reviews.llvm.org/D68006 review.
2020-01-31 15:22:33 -08:00
Nikita Popov ff17da3f75 [InstCombine] Push negation through multiply (PR44234)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44234 by adding
multiply support to freelyNegateValue(). Only one of the operands
needs to be negatible, so this still fits within the framework.

Differential Revision: https://reviews.llvm.org/D73410
2020-01-31 20:58:55 +01:00
Hiroshi Yamauchi ac8da31a0f [PGO][PGSO] Handle MBFIWrapper
Some code gen passes use MBFIWrapper to keep track of the frequency of new
blocks. This was not taken into account and could lead to incorrect frequencies
as MBFI silently returns zero frequency for unknown/new blocks.

Add a variant for MBFIWrapper in the PGSO query interface.

Depends on D73494.
2020-01-31 09:36:55 -08:00
Sanjay Patel bc1148e7bc [PATCH] D73727: [SLP] drop poison-generating flags for shuffle reduction ops (PR44536)
We may calculate reassociable math ops in arbitrary order when creating a shuffle reduction,
so there's no guarantee that things like 'nsw' hold on those intermediate values. Drop all
poison-generating flags for safety.

This change is limited to shuffle reductions because I don't think we have a problem in the
general case (where we intersect flags of each scalar op that goes into a vector op), but if
there's evidence of other cases being wrong, we can extend this fix to cover those cases.

https://bugs.llvm.org/show_bug.cgi?id=44536

Differential Revision: https://reviews.llvm.org/D73727
2020-01-31 09:54:35 -05:00
Nikita Popov 480391035c [InstCombine] Remove unnecessary worklist add; NFCI
Again, this will already be added by IRBuilder.
2020-01-30 23:24:59 +01:00
Nikita Popov 90b5ed996b [InstCombine] Remove unnecessary worklist add; NFCI
The IRBuilder will automatically add instructions to the worklist.
Adding it manually is unnecessary, but may mess up worklist order.
2020-01-30 23:06:28 +01:00
Nikita Popov cad91074a6 [InstCombine] Create new insts in foldICmpEqIntrinsicWithConstant; NFCI
In line with current conventions, create new instructions rather
than modify two operands in place and performing manual worklist
management.

This should be NFC apart from possible worklist order changes.
2020-01-30 23:03:16 +01:00
Whitney Tsang e44f4a8a54 [LoopFusion] Move instructions from FC1.GuardBlock to FC0.GuardBlock and
from FC0.ExitBlock to FC1.ExitBlock when proven safe.

Summary:
Currently LoopFusion give up when the second loop nest guard
block or the first loop nest exit block is not empty. For example:

if (0 < N) {
  for (int i = 0; i < N; ++i) {}
  x+=1;
}
y+=1;
if (0 < N) {
  for (int i = 0; i < N; ++i) {}
}
The above example should be safe to fuse.
This PR moves instructions in FC1 guard block (e.g. y+=1;) to
FC0 guard block, or instructions in FC0 exit block (e.g. x+=1;) to
FC1 exit block, which then LoopFusion is able to fuse them.
Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73641
2020-01-30 18:02:22 +00:00
David Stenberg b54a8ec1bc [InstCombine][DebugInfo] Fold constants wrapped in metadata
Summary:
When constant folding, constants that are wrapped in metadata were not
folded. This could lead to dbg.values being the only user of a constant
expression, due to the non-dbg uses having been rewritten, resulting in
the constant later on being removed by some other pass. This occurred
with the attached test case, in which the non-rewritten GEP in the
dbg.value intrinsic was later on removed by globalopt.

This patch makes the code look through metadata and fold such constants.

I guess that we in the future may want to allow dbg.values using GEPs and
other constant expressions to be emittable even if there are no non-dbg
uses, but for example SelectionDAG does not support that.

Reviewers: jmorse, aprantl, vsk, davide

Reviewed By: aprantl, vsk, davide

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D73630
2020-01-30 15:50:16 +01:00
Piotr Sobczak dd7148822b [InstCombine][AMDGPU] Trim components of s_buffer_load
Summary:
Add trimming of unused components of s_buffer_load.

For s_buffer_load and unformatted buffer_load also trim unused
components at the beginning of vector and update offset accordingly.

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71785
2020-01-30 10:48:25 +01:00
Michael Forster 676c29694c Inline debug variable.
Summary:
In a release build this variable becomes unused and may break the build
with `-Werror,-Wunused-variable`.

Reviewers: gribozavr2, jdoerfert, sstefan1

Reviewed By: gribozavr2

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73683
2020-01-30 10:29:24 +01:00
Nikita Popov 8058196677 [InstCombine] Process newly inserted instructions in the correct order
InstCombine operates on the basic premise that the operands of the
currently processed instruction have already been simplified. It
achieves this by pushing instructions to the worklist in reverse
program order, so that instructions are popped off in program order.
The worklist management in the main combining loop also makes sure
to uphold this invariant.

However, the same is not true for all the code that is performing
manual worklist management. The largest problem (addressed in this
patch) are instructions inserted by InstCombine's IRBuilder. These
will be pushed onto the worklist in order of insertion (generally
matching program order), which means that a) the users of the
original instruction will be visited first, as they are pushed later
in the main loop and b) the newly inserted instructions will be
visited in reverse program order.

This causes a number of problems: First, folds operate on instructions
that have not had their operands simplified, which may result in
optimizations being missed (ran into this in
https://reviews.llvm.org/D72048#1800424, which was the original
motivation for this patch). Additionally, this increases the amount
of folds InstCombine has to perform, both within one iteration, and
by increasing the number of total iterations.

This patch addresses the issue by adding a Worklist.AddDeferred()
method, which is used for instructions inserted by IRBuilder. These
will only be added to the real worklist after the combine finished,
and in reverse order, so they will end up processed in program order.
I should note that the same should also be done to nearly all other
uses of Worklist.Add(), but I'm starting with just this occurrence,
which has by far the largest test fallout.

Most of the test changes are due to
https://bugs.llvm.org/show_bug.cgi?id=44521 or other cases where
we don't canonicalize something. These are neutral. One regression
has been addressed in D73575 and D73647. The remaining regression
in an shl+sdiv fold can't really be fixed without dropping another
transform, but does not seem particularly problematic in the first
place.

Differential Revision: https://reviews.llvm.org/D73411
2020-01-30 09:40:10 +01:00
Francesco Petrogalli 623cff81fe [llvm][VectorUtils] Tweak VFShape for scalable vector functions.
Summary:
This patch makes sure that the field VFShape.VF is greater than zero
when demangling the vector function name of scalable vector functions
encoded in the "vector-function-abi-variant" attribute.

This change is required to be able to provide instances of VFShape
that can be used to query the VFDatabase for the vectorization passes,
as such passes always require a positive value for the Vectorization Factor (VF)
needed by the vectorization process.

It is not possible to extract the value of VFShape.VF from the mangled
name of scalable vector functions, because it is encoded as
`x`. Therefore, the VFABI demangling function has been modified to
extract such information from the IR declaration of the vector
function, under the assumption that _all_ vectors in the signature of
the vector function have the same number of lanes. Such assumption is
valid because it is also assumed by the Vector Function ABI
specifications supported by the demangling function (x86, AArch64, and
LLVM internal one).

The unit tests that demangle scalable names have been modified by
adding the IR module that carries the declaration of the vector
function name being demangled.

In particular, the demangling function fails in the following cases:

1. When the declaration of the scalable vector function is not
    present in the module.

2. When the value of VFSHape.VF is not greater than 0.

Reviewers: jdoerfert, sdesmalen, andwar

Reviewed By: jdoerfert

Subscribers: mgorny, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73286
2020-01-30 05:53:56 +00:00
Johannes Doerfert 89c2e733e8 [Attributor] Pointer privatization attribute (argument promotion)
A pointer is privatizeable if it can be replaced by a new, private one.
Privatizing pointer reduces the use count, interaction between unrelated
code parts. This is a first step towards replacing argument promotion.
While we can already handle recursion (unlike argument promotion!) we
are restricted to stack allocations for now because we do not analyze
the uses in the callee.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68852
2020-01-29 21:31:04 -06:00
Johannes Doerfert 791c9f1145 [Attributor] Fix TODO to avoid recomputation of results
The helpers AAReturnedFromReturnedValues and
AACallSiteReturnedFromReturned are useful not only to avoid code
duplication but also to avoid recomputation of results. If we have N
call sites we should not recompute the function return information N
times but once. These are mostly straightforward usages with some minor
improvements on the helpers and addition of a new one
(IRPosition::getAssociatedType) that knows about function return types.
2020-01-29 19:24:34 -06:00
Nikita Popov e086e23024 [InstCombine] Support non-splat vectors in icmp eq + add/sub fold
For the

    icmp eq (add X, C1), C2 => icmp eq X, C2-C1
    icmp eq (sub C1, X), C2 => icmp eq X, C1-C2

folds, this allows C1 to be non-splat and contain undefs.
C2 is still splat, due to the structure of the code.

This is to address the remaining part of the regression in D73411,
where demanded element analysis replaces some elements with undef.

Differential Revision: https://reviews.llvm.org/D73647
2020-01-29 20:56:58 +01:00
Elia Geretto ab2300bc15 [PassManagerBuilder] Remove global extension when a plugin is unloaded
This commit fixes PR39321.

GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction.

This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet.

Differential Revision: https://reviews.llvm.org/D71959
2020-01-29 16:15:45 +00:00
Simon Pilgrim 79748add70 Fix MSVC lamdba default capture mode warning. NFCI. 2020-01-29 15:47:04 +00:00
Whitney Tsang da58e68fdf [LoopFusion] Move instructions from FC1.Preheader to FC0.Preheader when
proven safe.

Summary:
Currently LoopFusion give up when the second loop nest preheader is
not empty. For example:

for (int i = 0; i < 100; ++i) {}
x+=1;
for (int i = 0; i < 100; ++i) {}
The above example should be safe to fuse.
This PR moves instructions in FC1 preheader (e.g. x+=1; ) to
FC0 preheader, which then LoopFusion is able to fuse them.
Reviewer: kbarton, Meinersbur, jdoerfert, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71821
2020-01-29 15:06:11 +00:00
Sanjay Patel 87f6314f8c [InstCombine] canonicalize splat shuffle after cmp
cmp (splat V1, M), SplatC --> splat (cmp V1, SplatC'), M

As discussed in PR44588:
https://bugs.llvm.org/show_bug.cgi?id=44588
...we try harder to push shuffles after binops than after compares.

This patch handles the special (but presumably most common case) of
splat shuffles. If both operands are splats, then we can do the
comparison on the non-splat inputs followed by splat of the compare.
That should take care of the regression noted in D73411.

There's another potential fold requested in PR37463 to scalarize the
compare, but that's another patch (and it's not clear if we can do
that without the ability to undo it later):
https://bugs.llvm.org/show_bug.cgi?id=37463

Differential Revision: https://reviews.llvm.org/D73575
2020-01-29 08:34:29 -05:00
Johannes Doerfert 76843ba37f [Attributor][Fix] Initialize unused but loaded variable
This hopefully un-breaks:
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38333
2020-01-28 23:52:16 -06:00
Johannes Doerfert ea5fabe60c [Attributor] Reuse existing logic to avoid duplication
There was a TODO in AAValueConstantRangeArgument to reuse
AAArgumentFromCallSiteArguments. We now do this by allowing new States
to be build from the bestState.
2020-01-28 23:45:59 -06:00
Johannes Doerfert 224085409d [Attributor][FIX] Treat invalidated attributes as changed
If we invalidate an attribute we need to inform all dependent ones even
if the fixpoint state is not invalid. Before we only continued
invalidation if the fixpoint state was invalid, now we signal a change
in case the fixpoint state is valid.

The test case was already included in D71620 but the problem was hiding
because it only manifested with the old PM (for that input).
2020-01-28 23:40:41 -06:00
Johannes Doerfert 53992c7bf7 [Attributor] Modularize AANoAliasCallSiteArgument to simplify extensions
This patch modularizes the way we check for no-alias call site arguments
by putting the existing logic into helper functions. The reasoning was
not changed but special cases for readonly/readnone were added.
2020-01-28 23:39:29 -06:00
Johannes Doerfert 24ae77eebf [Attributor] Mark a non-defined `null` pointer as `noalias`
If `null` is not defined we cannot access it, hence the pointer is
`noalias`. While this is not helpful on it's own it simplifies later
deductions that can skip over already known `noalias` pointers in
certain situations.
2020-01-28 23:09:37 -06:00
Johannes Doerfert 6626d1b7c0 [Attributor][NFC] Remove ugly and unneeded cast 2020-01-28 22:54:31 -06:00
Johannes Doerfert 02bd8180fc [Attributor][NFC] Improve debug messages 2020-01-28 22:53:19 -06:00
Johannes Doerfert b6dbd0f71f [Attributor][NFC] Internalize helper function 2020-01-28 22:50:34 -06:00
Vedant Kumar 8359511c62 [CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the
original function. The setup is:

1) CodeExtractor unregisters assumptions in the blocks that are to be
   extracted.

2) Extraction happens. There are now two functions: f1 and f1.extracted.

3) Leftover assumptions in f1 (/not/ removed as they were not in the set of
   blocks to be extracted) now have affected-value llvm.assume handles in
   f1.extracted.

When assumptions for a value used in f1 are looked up, ValueTracking can assert
as some of the handles are in the wrong function. To fix this, simply erase the
llvm.assume calls in the extracted function.

Alternatives include flushing the assumption cache in the original function, or
walking all values used in the original function to prune stale affected-value
handles. Both seem more expensive.

Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled

rdar://58460728
2020-01-28 17:18:01 -08:00
Eli Friedman 2f6b9edfa8 [AliasAnalysis] Add missing FMRB_* enums.
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).

Fixes regression in polly from recent patch to add writeonly to memset.

While I'm here, also fix a few dubious uses of the FMRB_* enum values.

Differential Revision: https://reviews.llvm.org/D73154
2020-01-28 15:47:08 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Whitney Tsang cd0cff4392 [NFCI][LoopUnrollAndJam] Minor changes.
Summary:
1. Add assertions.
2. Verify more analyses.
These changes are moved out of https://reviews.llvm.org/D73129 to
simplify that review.

Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73204
2020-01-28 20:24:23 +00:00
Petr Hosek 127d3abf25 [Instrumentation] Set hidden visibility for the bias variable
We have to avoid using a GOT relocation to access the bias variable,
setting the hidden visibility achieves that.

Differential Revision: https://reviews.llvm.org/D73529
2020-01-28 12:07:03 -08:00
Sanjay Patel 7a717d82ff [InstCombine] refactor foldVectorCmp(); NFC
We can handle other patterns here as shown in PR44588.
2020-01-28 14:40:48 -05:00
Florian Hahn 5d0ffbeb4d [Matrix] Mark expressions shared between multiple remarks.
This patch adds support for explicitly highlighting sub-expressions
shared by multiple leaf nodes. For example consider the following
code

  %shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10
  %trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10
  tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10
  %load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10
  %mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11
  tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11

We have two leaf nodes (the 2 stores) and the first store stores %trans
which is also used by the matrix multiply %mult. We generate separate
remarks for each leaf (stores). To denote that parts are shared, the
shared expressions are marked as shared (), with a reference to the
other remark that shares it. The operation summary also denotes the
shared operations separately.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72526
2020-01-28 09:27:55 -08:00
Florian Hahn d1f849a284 [LV] Hoist code to mark conditional assumes as dead to caller (NFC).
This is a follow-up suggested in D73423. It is sufficient to just add
the conditional assumes to DeadInstructions once.
2020-01-28 08:50:44 -08:00
Michael Liao 9c54b42338 Fix warning of `-Wcast-qual`. NFC. 2020-01-28 11:36:43 -05:00
Florian Hahn a911fef3dd [LV] Do not try to sink dead instructions.
Dead instructions do not need to be sunk. Currently we try and record
the recipies for them, but there are no recipes emitted for them and
there's nothing to sink. They can be removed from SinkAfter while
marking them for recording.

Fixes PR44634.

Reviewers: rengolin, hsaito, fhahn, Ayal, gilr

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D73423
2020-01-28 08:28:03 -08:00
Whitney Tsang 78dc64989c [CodeMoverUtils] Improve IsControlFlowEquivalent.
Summary:
Currently IsControlFlowEquivalent determine if two blocks are control
flow equivalent by checking if A dominates B and B post dominates A.
There exists blocks that are control flow equivalent even if they don't
satisfy the A dominates B and B post dominates A condition.
For example,

if (cond)
  A
if (cond)
  B
In the PR, we determine if two blocks are control flow equivalent by
also checking if the two sets of conditions A and B depends on are
equivalent.
Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn,
hfinkel, kbarton
Reviewed By: fhahn
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71578
2020-01-28 14:18:00 +00:00
Florian Hahn 62e228f8fd [Matrix] Add info about number of operations to remarks.
This patch updates the remark to also include a summary of the number of
vector operations generated for each matrix expression.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72480
2020-01-27 17:43:39 -08:00
Florian Hahn 949294f396 [Matrix] Add optimization remarks for matrix expression.
Generate remarks for matrix operations in a function. To generate remarks
for matrix expressions, the following approach is used:
1. Collect leafs of matrix expressions (done in
   RemarkGenerator::getExpressionLeafs).  Leafs are lowered matrix
   instructions without other matrix users (like stores).

2. For each leaf, create a remark containing a linearizied version of the
   matrix expression.

The following improvements will be submitted as follow-ups:
* Summarize number of vector instructions generated for each expression.
* Account for shared sub-expressions.
* Propagate matrix remarks up the inlining chain.

The information provided by the matrix remarks helps users to spot cases
where matrix expression got split up, e.g. due to inlining not
happening. The remarks allow users to address those issues, ensuring
best performance.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72453
2020-01-27 16:39:29 -08:00
Sanjay Patel 747242af8d [InstCombine] allow more narrowing of casted select
D47163 created a rule that we should not change the casted
type of a select when we have matching types in its compare condition.
That was intended to help vector codegen, but it also could create
situations where we miss subsequent folds as shown in PR44545:
https://bugs.llvm.org/show_bug.cgi?id=44545

By using shouldChangeType(), we can continue to get the vector folds
(because we always return false for vector types). But we also solve
the motivating bug because it's ok to narrow the scalar select in that
example.

Our canonicalization rules around select are a mess, but AFAICT, this
will not induce any infinite looping from the reverse transform (but
we'll need to watch for that possibility if committed).

Side note: there's a similar use of shouldChangeType() for phi ops
just below this diff, and the source and destination types appear to
be reversed.

Differential Revision: https://reviews.llvm.org/D72733
2020-01-27 16:35:50 -05:00
Sanjay Patel 242fed9d7f [InstCombine] convert fsub nsz with fneg operand to -(X + Y)
This was noted in D72521 - we need to match fneg specifically to
consistently handle that pattern along with (-0.0 - X).
2020-01-27 14:49:15 -05:00
Nikita Popov bcfa0f592f [InstCombine] Move negation handling into freelyNegateValue()
Followup to D72978. This moves existing negation handling in
InstCombine into freelyNegateValue(), which make it composable.
In particular, root negations of div/zext/sext/ashr/lshr/sub can
now always be performed through a shl/trunc as well.

Differential Revision: https://reviews.llvm.org/D73288
2020-01-27 20:46:23 +01:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Whitney Tsang 2b335e9aae [LoopUnroll] Remove remapInstruction().
Summary:
LoopUnroll can reuse the RemapInstruction() in ValueMapper, or
remapInstructionsInBlocks() in CloneFunction, depending on the needs.
There is no need to have its own version in LoopUnroll.

By calling RemapInstruction() without TypeMapper or Materializer and
with Flags (RF_NoModuleLevelChanges | RF_IgnoreMissingLocals), it does
the same as remapInstruction(). remapInstructionsInBlocks() calls
RemapInstruction() exactly as described.

Looking at the history, I cannot find any obvious reason to have its own
version.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto,
foad, aprantl
Reviewed By: jdoerfert
Subscribers: hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73277
2020-01-27 15:42:13 +00:00
Guillaume Chatelet 07c9d53266 [Alignment][NFC] Use Align with CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73449
2020-01-27 10:58:36 +01:00
Guillaume Chatelet d0a7cc7177 [Alignment][NFC] Use Align with CreateMaskedScatter/Gather
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

This patch shows that CreateMaskedScatter/CreateMaskedGather can only take positive non zero alignment values.

Reviewers: courbet

Subscribers: hiraditya, llvm-commits, delena

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73361
2020-01-27 10:17:14 +01:00
Evgenii Stepanov 1df8549b26 [msan] Instrument x86.pclmulqdq* intrinsics.
Summary:
These instructions ignore parts of the input vectors which makes the
default MSan handling too strict and causes false positive reports.

Reviewers: vitalybuka, RKSimon, thakis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73374
2020-01-24 14:31:06 -08:00
Andy Kaylor b35b7da460 [PGO] Attach appropriate funclet operand bundles to value profiling instrumentation calls
Patch by Chris Chrulski

When generating value profiling instrumentation, ensure the call gets the
correct funclet token, otherwise WinEHPrepare will turn the call (and all
subsequent instructions) into unreachable.

Differential Revision: https://reviews.llvm.org/D73221
2020-01-24 11:20:53 -08:00
Simon Pilgrim abd1927d44 Fix some comment typos. NFC. 2020-01-24 18:18:42 +00:00
Alina Sbirlea 0d90d2457c [LoopStrengthReduce] Teach LoopStrengthReduce to preserve MemorySSA is available. 2020-01-24 10:13:52 -08:00
Andy Kaylor a33accde95 [PGO] Early detection regarding whether pgo counter promotion is possible
Patch by Chris Chrulski

This fixes a problem with the current behavior when assertions are enabled.
A loop that exits to a catchswitch instruction is skipped for the counter
promotion, however this check was being done after the PGOCounterPromoter
tried to collect an insertion point for the exit block. A call to
getFirstInsertionPt() on a block that begins with a catchswitch instruction
triggers an assertion. This change performs a check whether the counter
promotion is possible prior to collecting the ExitBlocks and InsertPts.

Differential Revision: https://reviews.llvm.org/D73222
2020-01-24 09:55:41 -08:00
Guillaume Chatelet 805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Evgeny Leviant 8973fae195 [WPD] Allow load/save bitcoded index when running opt -wholeprogramdevirt
Differential revision: https://reviews.llvm.org/D73094
2020-01-24 00:31:39 -08:00
Andy Kaylor c467faf23c [WinEH] Ignore lifetime.end PHI nodes in empty cleanuppads
This fixes a bug where a PHI node that is only referenced by a lifetime.end intrinsic in an otherwise empty cleanuppad can cause SimplyCFG to create an SSA violation while removing the empty cleanuppad. Theoretically the same problem can occur with debug intrinsics.

Differential Revision: https://reviews.llvm.org/D72540
2020-01-23 18:18:50 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Johannes Doerfert 7ad17e008b [Attributor] Avoid REQUIRED dependences in favor of OPTIONAL ones
When we use information only to short-cut deduction or improve it, we
can use OPTIONAL dependences instead of REQUIRED ones to avoid cascading
pessimistic fixpoints.

We also need to track dependences only when we use assumed information,
e.g., we act on assumed liveness information.
2020-01-23 18:42:46 -06:00
Johannes Doerfert 214ed3f676 [Attributor] Record dependences only when necessary
If we use assumed information from AAValueSimplify we need to record
an OPTIONAL dependence, otherwise we do not.
2020-01-23 18:42:45 -06:00
Johannes Doerfert 5429c82db2 [Attributor][FIX] Avoid dangling pointers during code deletion
It can happen that we have instructions in the ToBeDeletedInsts set
which are deleted earlier already. To avoid dangling pointers we use
weak tracking handles.
2020-01-23 18:42:45 -06:00
Johannes Doerfert ff6254dc26 [Attributor][FIX] Handle non-pointers when following uses
When we follow uses, e.g., in AAMemoryBehavior or AANoCapture, we need
to make sure the value is a pointer before we ask for abstract
attributes only valid for pointers. This happens because we follow
pointers through calls that do not capture but may return the value.
2020-01-23 18:42:45 -06:00
Johannes Doerfert 9dcf889d15 [Attributor][NFC] Do not (try to) simplify void values
We might accidentally ask AAValueSimplify to simplify a void value. That
can lead to very interesting, and very wrong, results. We now handle
this case gracefully.
2020-01-23 18:42:45 -06:00
Alina Sbirlea 1d09174290 [LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI]
Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility
method, which sets to nullptr the instructions that are not trivially
dead. Use the new method in LoopStrengthReduce.
Alternative: add a bool to the same method; this option adds a marginal
amount of overhead to the other callers, and the method needs to be
updated to return a bool status when it removes/doesn't remove
instructions.
2020-01-23 16:27:32 -08:00
Johannes Doerfert 30179d7ecf [Attributor][FIX][Alignment] Do not report a change if there was none
If alignment was manifested but it is actually only as good as the
data-layout provided one we should not report it as a change.

For testing purposes we still manifest the information.
2020-01-23 18:13:52 -06:00
Johannes Doerfert e273ac4d88 [Attributor][NFC] Add an assertion 2020-01-23 18:13:52 -06:00
Johannes Doerfert d07b5a5525 [Attributor][NFC] Fix spelling 2020-01-23 18:13:52 -06:00
Johannes Doerfert 2baf000ecc [Attributor] `byval` arguments are always `noalias`
`byval` introduces a local copy of the argument. That copy cannot alias
anything.
2020-01-23 18:13:52 -06:00
Johannes Doerfert 30ae859c69 [Attributor][FIX] Store alignment only holds for the pointer value
We accidentally used the store alignment for the value operand as well,
which is incorrect and crashed the SPASS application in the test suite.
2020-01-23 18:13:52 -06:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Alina Sbirlea 9e66c4ec12 [Utils] Use WeakTrackingVH in vector used as scratch storage.
The utility method RecursivelyDeleteTriviallyDeadInstructions receives
as input a vector of Instructions, where all inputs are valid
instructions. This same vector is used as a scratch storage (per the
header comment) to recursively delete instructions. If an instruction is
added as an operand of multiple other instructions, it may be added twice,
then deleted once, then the second reference in the vector is invalid.
Switch to using a Vector<WeakTrackingVH>.
This change facilitates a clean-up in LoopStrengthReduction.
2020-01-23 16:04:57 -08:00
Florian Hahn 4ed7355e44 [IPSCCP] Use ParamState for arguments at call sites.
We currently use integer ranges to merge concrete function arguments.
We use the ParamState range for those, but we only look up concrete
values in the regular state. For concrete function arguments that are
themselves arguments of the containing function, we can use the param
state directly and improve the precision in some cases.

Besides improving the results in some cases, this is also a small step towards
switching to ValueLatticeElement, by allowing D60582 to be a NFC.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D71836
2020-01-23 13:55:42 -08:00
Teresa Johnson 458676db6e [WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).

In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.

One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.

Reviewers: pcc, ostannard, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71907
2020-01-23 11:36:01 -08:00
Alina Sbirlea 6770de9b8d [LoopIdiomRecognize] Teach LoopIdiomRecognize to preserve MemorySSA. 2020-01-23 11:31:12 -08:00
Alina Sbirlea a0f627d584 [IndVarSimplify] Fix for MemorySSA preserve. 2020-01-23 11:06:16 -08:00
Justin Bogner b81a337be7 [LoopUnroll] Avoid UB when converting from WeakVH to `Value *`
Calling `operator*` on a WeakVH with a null value yields a null
reference, which is UB. Avoid this by implicitly converting the WeakVH
to a `Value *` rather than dereferencing and then taking the address
for the type conversion.

Differential Revision: https://reviews.llvm.org/D73280
2020-01-23 10:36:39 -08:00
Guillaume Chatelet 59f95222d4 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Kazu Hirata 41784bed01 Revert "Resubmit: [JumpThreading] Thread jumps through two basic blocks"
This reverts commit 53b68e676f.

Our internal tests are showing breakage with this patch.
2020-01-23 06:34:03 -08:00
Fedor Sergeev 2f6987ba61 [LoopRotate] add ability to repeat loop rotation until non-deoptimizing exit is found
In case of loops with multiple exit where all-but-one exit are deoptimizing
it might happen that the first rotation will end up with latch having a deoptimizing
exit. This makes the loop unsuitable for trip-count analysis (say, getLoopEstimatedTripCount)
as well as for loop transformations that know how to handle multple deoptimizing exits.

It pretty much means that canonical form in multple-deoptimizing-exits case should be
with non-deoptimizing exit at latch.
Teach loop-rotation to reach this canonical form by repeating rotation.

-loop-rotate-multi option introduced to control this behavior, currently disabled by default.

Reviewers: skatkov, asbirlea, reames, fhahn
Reviewed By: skatkov

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73058
2020-01-23 15:56:24 +03:00
Guillaume Chatelet 279fa8e006 [Alignement][NFC] Deprecate untyped CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73260
2020-01-23 13:34:32 +01:00
Daniil Suchkov 4a8dbc617d [SSAUpdater] Don't call ValueIsRAUWd upon single use replacement
It is incorrect to call ValueHandleBase::ValueIsRAUWd when only one use
is replaced since it simply violates semantics of the callback and leads
to bugs like PR44320.

Previously this call was used specifically to keep LICM's cache of
AliasSetTrackers up to date across passes (as PR36801 showed, even for
that purpose it didn't work properly), but since LICM doesn't have that
cache anymore, we can safely remove this incorrect call with no
repercussions.

This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44320

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed-By: asbirlea

Differential Revision: https://reviews.llvm.org/D73089
2020-01-23 15:53:53 +07:00
Daniil Suchkov 6fc9e60149 NFC. Remove obsolete SimpleAnalysis infrastructure
Apparently cache of AliasSetTrackers held by LICM was the only user of
SimpleAnalysis infrastructure. Now, given that we no longer have that
cache, this infrastructure is obsolete and, taking into account its
nature, we don't want any new solutions to be based on it.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed-By: asbirlea

Differential Revision: https://reviews.llvm.org/D73085
2020-01-23 13:58:30 +07:00
Daniil Suchkov 53a28bd891 [LICM] NFC. Remove AST caching infrastructure
Since LICM doesn't use AST caching any more (see D73081), this
infrastructure is now obsolete and we can remove it.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed-By: asbirlea

Differential Revision: https://reviews.llvm.org/D73084
2020-01-23 12:33:50 +07:00
Florian Hahn f14f2a8568 [LV] Fix predication for branches with matching true and false succs.
Currently due to the edge caching, we create wrong predicates for
branches with matching true and false successors. We will cache the
condition for the edge from the true successor, and then lookup the same
edge (src and dst are the same) for the edge to the false successor.

If both successors match, the condition should always be true. At the
moment, we cannot really create constant VPValues, but we can just
create a true condition as X | !X. Later passes will clean that up.

Fixes PR44488.

Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D73079
2020-01-22 18:34:11 -08:00
Jonas Devlieghere cf2b498d28 [llvm/Transforms] Fix warning: private field 'MSSA' is not used 2020-01-22 18:07:53 -08:00
Alina Sbirlea adc4faf532 [IndVarSimplify] Teach IndVarSimplify to preserve MemorySSA. 2020-01-22 16:33:17 -08:00
Alina Sbirlea b5b6126d97 [IndVarSimplify] Cleanup spaces and reduce variable scope [NFCI]
Minor clean-ups + clang-format.
2020-01-22 15:32:20 -08:00
Alina Sbirlea 6baf31b7c1 [LoopIdiomRecognize] Reduce variable scope. [NFCI] 2020-01-22 15:30:08 -08:00
Nikita Popov 0b83c5a78f [InstCombine] Combine neg of shl of sub (PR44529)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44529. We already have
a combine to sink a negation through a left-shift, but it currently
only works if the shift operand is negatable without creating any
instructions. This patch introduces freelyNegateValue() as a more
powerful extension of dyn_castNegVal(), which allows negating a
value as long as this doesn't end up increasing instruction count.
Specifically, this patch adds support for negating A-B to B-A.

This mechanism could in the future be extended to handle general
negation chains that a) start at a proper 0-X negation and b) only
require one operand to be freely negatable. This would end up as a
weaker form of D68408 aimed at the most obviously profitable subset
that eliminates a negation entirely.

Differential Revision: https://reviews.llvm.org/D72978
2020-01-22 23:03:58 +01:00
Nikita Popov efba7ed05e [PatternMatch] Make m_c_ICmp swap the predicate (PR42801)
This addresses https://bugs.llvm.org/show_bug.cgi?id=42801.
The m_c_ICmp() matcher is changed to provide the swapped predicate
if the operands are swapped.

Existing uses of m_c_ICmp() fall in one of two categories: Working
on equality predicates only, where swapping is irrelevant.
Or performing a manual swap, in which case this patch removes it.

The only exception is the foldICmpWithLowBitMaskedVal() fold, which
does not swap the predicate, and instead reasons about whether
a swap occurred or not for each predicate. Getting the swapped
predicate allows us to merge the logic for pairs of predicates,
instead of duplicating it.

Differential Revision: https://reviews.llvm.org/D72976
2020-01-22 22:56:26 +01:00
Alina Sbirlea efb130fc93 [LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available.
If MemorySSA analysis is analysis, LoopDeletion now preserves it.
2020-01-22 11:38:38 -08:00
Sanjay Patel 0ade2abdb0 [InstCombine] fneg(X + C) --> -C - X
This is 1 of the potential folds uncovered by extending D72521.

We don't seem to do this in the backend either (unless I'm not
seeing some target-specific transform).

icc and gcc (appears to be target-specific) do this transform.

Differential Revision: https://reviews.llvm.org/D73057
2020-01-22 09:48:43 -05:00
Guillaume Chatelet 0957233320 [Alignment][NFC] Use Align with CreateMaskedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73106
2020-01-22 11:04:39 +01:00
Daniil Suchkov 7bdc83f340 [LICM] Don't cache AliasSetTrackers when run under legacy PM
Summary:
This is the first step towards complete removal of AST caching from
LICM. Attempts to keep LICM's AST cache up to date across passes can lead
to miscompiles like this one: https://bugs.llvm.org/show_bug.cgi?id=44320.

LICM has already switched to using MemorySSA to do sinking and hoisting
and only builds an AliasSetTracker on demand for the promoteToScalars
step, without caching it from one LICM instance to the next. Given this,
we don't have compile-time reasons to keep AST caching any more.
The only scenario where the caching would be used currently is when
using the LegacyPassManager and setting -enable-mssa-loop-dependency=false.

This switch should help us to surface any possible issues that may arise
along this way, also it turns subsequent removal of AST caching into NFC.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed By: asbirlea

Subscribers: hiraditya, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73081
2020-01-22 13:16:45 +07:00
Andrei Elovikov e1d6d36852 [SLP] Don't allow Div/Rem as alternate opcodes
Summary:
We don't have control/verify what will be the RHS of the division, so it might
happen to be zero, causing UB.

Reviewers: Vasilis, RKSimon, ABataev

Reviewed By: ABataev

Subscribers: vporpo, ABataev, hiraditya, llvm-commits, vdmitrie

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72740
2020-01-21 15:21:17 -08:00
Florian Hahn f42994f228 [Matrix] Hide and describe matrix-propagate-shape option. 2020-01-21 14:28:47 -08:00
Guillaume Chatelet bc8a1ab26f [Alignment][NFC] Use Align with CreateMaskedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73087
2020-01-21 14:13:22 +01:00
Sanjay Patel 7bee94410c [InstCombine] form copysign from select of FP constants (PR44153)
This should be the last step needed to solve the problem in the
description of PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153

If we're casting an FP value to int, testing its signbit, and then
choosing between a value and its negated value, that's a
complicated way of saying "copysign":

(bitcast X) <  0 ? -TC :  TC --> copysign(TC,  X)

Differential Revision: https://reviews.llvm.org/D72643
2020-01-20 10:51:14 -05:00
Guillaume Chatelet 46b9563cf6 [Alignment][NFC] Use Align with CreateElementUnorderedAtomicMemCpy
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, nicolasvasilache

Subscribers: hiraditya, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73041
2020-01-20 15:39:45 +01:00
Evgeniy Brevnov af7e158872 [LV] Vectorizer should adjust trip count in profile information
Summary: Vectorized loop processes VFxUF number of elements in one iteration thus total number of iterations decreases proportionally. In addition epilog loop may not have more than VFxUF - 1 iterations. This patch updates profile information accordingly.

Reviewers: hsaito, Ayal, fhahn, reames, silvas, dcaballe, SjoerdMeijer, mkuper, DaniilSuchkov

Reviewed By: Ayal, DaniilSuchkov

Subscribers: fedor.sergeev, hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67905
2020-01-20 18:36:28 +07:00
Evgeniy Brevnov cfe97681cd [NFC][LoopUtils] Minor change in comment according to review D71990. 2020-01-20 17:10:10 +07:00
Evgeniy Brevnov 10357e1c89 [LoopUtils] Better accuracy for getLoopEstimatedTripCount.
Summary: Current implementation of getLoopEstimatedTripCount returns 1 iteration less than it should. The reason is that in bottom tested loop first iteration is executed before first back branch is taken. For example for loop with !{!"branch_weights", i32 1 // taken, i32 1 // exit} metadata getLoopEstimatedTripCount gives 1 while actual number of iterations is 2.

Reviewers: Ayal, fhahn

Reviewed By: Ayal

Subscribers: mgorny, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71990
2020-01-20 16:58:07 +07:00
Sjoerd Meijer 93175a5caa [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFCI
This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus
making it a generic loop utility function.  This allows to rewrite loop exit
values by just calling this function without running the whole IndVarSimplify
pass.

We use this in D72714 to rematerialise the iteration count in exit blocks, so
that we can clean-up loop update expressions inside the hardware-loops later.

Differential Revision: https://reviews.llvm.org/D72602
2020-01-20 09:05:00 +00:00
Matt Arsenault a4451d88ee Consolidate internal denormal flushing controls
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.

- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute

AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).

Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.

Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.

-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.

Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.

This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.

AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
2020-01-17 20:09:53 -05:00
Alina Sbirlea 9f6c6ee6b9 [MemDepAnalysis/VNCoercion] Move static method to its only use. [NFCI]
Static method MemoryDependenceResults::getLoadLoadClobberFullWidthSize
does not have or use any info specific to MemoryDependenceResults.
Move it to its only user: VNCoercion.
2020-01-17 15:18:42 -08:00
Petr Hosek d3db13af7e [profile] Support counter relocation at runtime
This is an alternative to the continous mode that was implemented in
D68351. This mode relies on padding and the ability to mmap a file over
the existing mapping which is generally only available on POSIX systems
and isn't suitable for other platforms.

This change instead introduces the ability to relocate counters at
runtime using a level of indirection. On every counter access, we add a
bias to the counter address. This bias is stored in a symbol that's
provided by the profile runtime and is initially set to zero, meaning no
relocation. The runtime can mmap the profile into memory at abitrary
location, and set bias to the offset between the original and the new
counter location, at which point every subsequent counter access will be
to the new location, which allows updating profile directly akin to the
continous mode.

The advantage of this implementation is that doesn't require any special
OS support. The disadvantage is the extra overhead due to additional
instructions required for each counter access (overhead both in terms of
binary size and performance) plus duplication of counters (i.e. one copy
in the binary itself and another copy that's mmapped).

Differential Revision: https://reviews.llvm.org/D69740
2020-01-17 15:02:23 -08:00
Peter Collingbourne cd40bd0a32 hwasan: Move .note.hwasan.globals note to hwasan.module_ctor comdat.
As of D70146 lld GCs comdats as a group and no longer considers notes in
comdats to be GC roots, so we need to move the note to a comdat with a GC root
section (.init_array) in order to prevent lld from discarding the note.

Differential Revision: https://reviews.llvm.org/D72936
2020-01-17 13:40:52 -08:00
Drew Wock 0bcfafc5e7 [SeparateConstOffsetFromGEP] Fix: sext(a) + sext(b) -> sext(a + b) matches add and sub instructions with one another
During the SeparateConstOffsetFromGEP pass, signed extensions are distributed
to the values that feed into them and then later recombined. The recombination
stage is somewhat problematic- it doesn't differ add and sub instructions
from another when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern
in some instances.

An example- the IR contains:
%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extA + extB

The problematic optimization will transform that into:

%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extend subuAuB ; Obviously not semantically equivalent to the IR input.

This patch fixes that.

Patch by Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D65967
2020-01-17 12:22:52 -05:00
Nikita Popov 522c030aa9 [InstCombine] Fix worklist management in DSE (PR44552)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44552. We need to make
sure that the store is reprocessed, because performing DSE may
expose more DSE opportunities.

There is a slight caveat here though: We need to make sure that we
add back the store the worklist first, because that means it will
be processed after the operands of the removed store have been
processed. This is a general bug in InstCombine worklist management
that I hope to address at some point, but for now it means we need
to do this manually rather than just returning the instruction as
changed.

Differential Revision: https://reviews.llvm.org/D72807
2020-01-17 18:10:56 +01:00
Nikita Popov 77befe54f7 [InstCombine] Fix worklist management in return combine
There are two related bugs here: First, we don't add the operand
we're replacing to the worklist, which means it may not get DCEd
(see test change). Second, usually this would just get picked up
in the next iteration, but we also do not report the instruction
as changed. This means that we do not get that extra instcombine
iteration, and more importantly, may break the pass pipeline, as
the function is not marked as changed.

Differential Revision: https://reviews.llvm.org/D72864
2020-01-17 17:59:23 +01:00
Nikita Popov 2ca092f320 [InstCombine] Support disabling expensive combines in opt
Currently, there is no way to disable ExpensiveCombines when doing
a standalone opt -instcombine run, as that's the default, and the
opt option can currently only be used to force enable, not to force
disable. The only way to disable expensive combines is via -O1 or -O2,
but that of course also runs the rest of the kitchen sink...

This patch allows using opt -instcombine -expensive-combines=0 to
run InstCombine without ExpensiveCombines.

Differential Revision: https://reviews.llvm.org/D72861
2020-01-17 17:56:20 +01:00
Matt Arsenault 3ef8cdf666 AMDGPU: Do permlane16 vdst_in discard optimization in InstCombine
There's more potential value to discarding the source value earlier,
since we always know the value of the fi/bc bits.
2020-01-16 17:27:53 -05:00
Kazu Hirata 53b68e676f Resubmit: [JumpThreading] Thread jumps through two basic blocks
This reverts commit 2d258ed931.  This
revision fixes the Windows build and adds a testcase for it, namely
thread-two-bbs3.ll.  My original patch improperly copied EH pads on
Windows.  This patch disregards jump threading opportunities having to
do with EH pads.

[JumpThreading] Thread jumps through two basic blocks

Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-16 12:33:37 -08:00
Arkady Shlykov c87982b467 Revert "[Loop Peeling] Add possibility to enable peeling on loop nests."
This reverts commit 3f3017e because there's a failure on peel-loop-nests.ll
with LLVM_ENABLE_EXPENSIVE_CHECKS on.

Differential Revision: https://reviews.llvm.org/D70304
2020-01-16 10:33:38 -08:00
Fedor Sergeev 3478551bf3 [GVN] introduce GVNOptions to control GVN pass behavior
There are a few global (cl::opt) controls that enable optional
behavior in GVN. Introduce GVNOptions that provide corresponding
per-pass instance controls.

That will allow to use GVN multiple times in pipeline each time
with different settings.

Reviewers: asbirlea, rnk, reames, skatkov, fhahn
Reviewed By: fhahn

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72732
2020-01-16 20:21:08 +03:00
Mircea Trofin 7acfda633f [llvm] Make new pass manager's OptimizationLevel a class
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.

In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.

This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.

The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72547
2020-01-16 09:00:56 -08:00
Francesco Petrogalli 66c120f025 [VectorUtils] Rework the Vector Function Database (VFDatabase).
Summary:
This commits is a rework of the patch in
https://reviews.llvm.org/D67572.

The rework was requested to prevent out-of-tree performance regression
when vectorizing out-of-tree IR intrinsics. The vectorization of such
intrinsics is enquired via the static function `isTLIScalarize`. For
detail see the discussion in https://reviews.llvm.org/D67572.

Reviewers: uabelho, fhahn, sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72734
2020-01-16 15:08:26 +00:00
Simon Pilgrim 23a887b0dd Fix unused variable warning. NFCI. 2020-01-16 13:02:40 +00:00
Florian Hahn 23c113802e [LV] Allow assume calls in predicated blocks.
The assume intrinsic is intentionally marked as may reading/writing
memory, to avoid passes moving them around. When flattening the CFG
for predicated blocks, we have to drop the assume calls, as they
are control-flow dependent.

There are some cases where we can do better (when control flow is
preserved), but that is follow-up work.

Fixes PR43620.

Reviewers: hsaito, rengolin, dcaballe, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D68814
2020-01-16 10:11:35 +00:00
Sameer Sahasrabuddhe ed181efa17 [HIP][AMDGPU] expand printf when compiling HIP to AMDGPU
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
  program for the AMDGPU target.

The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D71365
2020-01-16 15:15:38 +05:30
Vedant Kumar 360abb7ee5 [CodeExtractor] Transfer debug info to extracted function
After extracting, fix up debug info in both the old and new functions by

1) Pointing line locations and debug intrinsics to the new subprogram
   scope, and

2) Deleting intrinsics which point to values outside of the new
   function.

Depends on https://reviews.llvm.org/D72795.

Testing: check-llvm, check-clang, a build of LNT in the `-Os -g` config
with "-mllvm -hot-cold-split=1" set, and end-to-end debugging of a toy
program which undergoes splitting to verify that lldb can find
variables, single step, etc. in extracted code.

rdar://45507940

Differential Revision: https://reviews.llvm.org/D72801
2020-01-15 15:38:36 -08:00
Fedor Sergeev 8a4d12ae5b [BasicBlock] add helper getPostdominatingDeoptimizeCall
It appears to be rather useful when analyzing Loops with multiple
deoptimizing exits, perhaps merged ones.
For now it is used in LoopPredication, will be adding more uses
in other loop passes.

Reviewers: asbirlea, fhahn, skatkov, spatel, reames
Reviewed By: reames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72754
2020-01-16 01:15:57 +03:00
Mircea Trofin 5466597fee [NFC] Refactor InlineResult for readability
Summary:
InlineResult is used both in APIs assessing whether a call site is
inlinable (e.g. llvm::isInlineViable) as well as in the function
inlining utility (llvm::InlineFunction). It means slightly different
things (can/should inlining happen, vs did it happen), and the
implicit casting may introduce ambiguity (casting from 'false' in
InlineFunction will default a message about hight costs,
which is incorrect here).

The change renames the type to a more generic name, and disables
implicit constructors.

Reviewers: eraman, davidxl

Reviewed By: davidxl

Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72744
2020-01-15 13:34:20 -08:00
Zhongduo Lin 34ba96a3d4 [NFC][IndVarSimplify] remove duplicate code in widenWithVariantLoadUseCodegen.
Summary: Duplicate code in widenWithVariantLoadUseCodegen is removed and also use assert to check unknown extension type as it should be filtered out by the pre condition check before calling this function.

Reviewers: az, sanjoy, sebpop, efriedma, javed.absar, sanjoy.google

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72652
2020-01-15 16:27:58 -05:00
Vedant Kumar a2cc80bc95 DebugInfo: Factor out logic to update locations in MD_loop metadata, NFC
Factor out the logic needed to update debug locations contained within
MD_loop metadata.

This refactor is preparation for a future change that also needs to
rewrite MD_loop metadata.

rdar://45507940
2020-01-15 13:02:36 -08:00
Arkady Shlykov 3f3017e162 [Loop Peeling] Add possibility to enable peeling on loop nests.
Summary:
Current peeling implementation bails out in case of loop nests.
The patch introduces a field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.

Reviewers: fhahn, lebedev.ri, xbolva00

Reviewed By: xbolva00

Subscribers: xbolva00, hiraditya, zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D70304
2020-01-15 08:25:21 -08:00
Sanjay Patel 3180af4362 [InstCombine] reassociate fsub+fsub into fsub+fadd
As discussed in the motivating PR44509:
https://bugs.llvm.org/show_bug.cgi?id=44509

...we can end up with worse code using fast-math than without.
This is because the reassociate pass greedily transforms fsub
into fneg/fadd and apparently (based on the regression tests
seen here) expects instcombine to clean that up if it wasn't
profitable. But we were missing this fold:

(X - Y) - Z --> X - (Y + Z)

There's another, more specific case that I think we should
handle as shown in the "fake" fneg test (but missed with a real
fneg), but that's another patch. That may be tricky to get
right without conflicting with existing transforms for fneg.

Differential Revision: https://reviews.llvm.org/D72521
2020-01-15 11:14:13 -05:00
Hideto Ueno 188f9a348d [Attributor] AAValueConstantRange: Value range analysis using constant range
Summary:
This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point.
One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument).

The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty).

Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state).

Supported
 - BinaryOperator(add, sub, ...)
 - CmpInst(icmp eq, ...)
 - !range metadata

`AAValueConstantRange` is not intended to extend to polyhedral range value analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71620
2020-01-15 16:34:23 +09:00
Nikita Popov 04e586151e [InstCombine] Fix worklist management when removing guard intrinsic
When multiple guard intrinsics are merged into one, currently the
result of eraseInstFromFunction() is returned -- however, this
should only be done if the current instruction is being removed.
In this case we're removing a different instruction and should
instead report that the current one has been modified by returning it.

For this test case, this reduces the number of instcombine iterations
from 5 to 2 (the minimum possible).

Differential Revision: https://reviews.llvm.org/D72558
2020-01-14 21:47:48 +01:00
Nikita Popov 410331869d [NewPM] Port MergeFunctions pass
This ports the MergeFunctions pass to the NewPM. This was rather
straightforward, as no analyses are used.

Additionally MergeFunctions needs to be conditionally enabled in
the PassBuilder, but I left that part out of this patch.

Differential Revision: https://reviews.llvm.org/D72537
2020-01-14 20:55:41 +01:00
Nikita Popov 65c0805be5 [InstCombine] Fix infinite loop due to bitcast <-> phi transforms
Fix for https://bugs.llvm.org/show_bug.cgi?id=44245.

The optimizeBitCastFromPhi() and FoldPHIArgOpIntoPHI() end up
fighting against each other, because optimizeBitCastFromPhi()
assumes that bitcasts of loads will get folded. This doesn't
happen here, because a dangling phi node prevents the one-use
fold in https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L620-L628 from triggering.

This patch fixes the issue by explicitly performing the load
combine as part of the bitcast of phi transform. Other attempts
to force the load to be combined first were ultimately too
unreliable.

Differential Revision: https://reviews.llvm.org/D71164
2020-01-14 20:45:13 +01:00
Nikita Popov b4dd928ffb [InstCombine] Make combineLoadToNewType a method; NFC
So it can be reused as part of other combines.
In particular for D71164.
2020-01-14 20:40:03 +01:00
Nikita Popov 652cd7c100 [InstCombine] Fix user iterator invalidation in bitcast of phi transform
This fixes the issue encountered in D71164. Instead of using a
range-based for, manually iterate over the users and advance the
iterator beforehand, so we do not skip any users due to iterator
invalidation.

Differential Revision: https://reviews.llvm.org/D72657
2020-01-14 20:38:10 +01:00
Teresa Johnson 2cefb93951 [ThinLTO/WPD] Remove an overly-aggressive assert
Summary:
An assert added to the index-based WPD was trying to verify that we only
have multiple vtables for a given guid when they are all non-external
linkage. This is too conservative because we may have multiple external
vtable with the same guid when they are in comdat. Remove the assert,
as we don't have comdat information in the index, the linker should
issue an error in this case.

See discussion on D71040 for more information.

Reviewers: evgeny777, aganea

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72648
2020-01-14 10:57:14 -08:00
Juneyoung Lee 3e32b7e127 [InstCombine] Let combineLoadToNewType preserve ABI alignment of the load (PR44543)
Summary:
If aligment on `LoadInst` isn't specified, load is assumed to be ABI-aligned.
And said aligment may be different for different types.
So if we change load type, but don't pay extra attention to the aligment
(i.e. keep it unspecified), we may either overpromise (if the default aligment
of the new type is higher), or underpromise (if the default aligment
of the new type is smaller).

Thus, if no alignment is specified, we need to manually preserve the implied ABI alignment.

This addresses https://bugs.llvm.org/show_bug.cgi?id=44543 by making combineLoadToNewType preserve ABI alignment of the load.

Reviewers: spatel, lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72710
2020-01-15 03:20:53 +09:00
Dmitri Gribenko 2948ec5ca9 Removed PointerUnion3 and PointerUnion4 aliases in favor of the variadic template 2020-01-14 18:56:29 +01:00
Florian Hahn 192cce10f6 Revert "Recommit "[GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing.""
This reverts commit a03d7b0f24.

As discussed in D68298, this causes a compile-time regression, in case
the DTs requested are not used elsewhere in GlobalOpt. We should only
get the DTs if they are available here, but this seems not possible with
the legacy pass manager from a module pass.
2020-01-14 14:50:07 +00:00
Benjamin Kramer df186507e1 Make helper functions static or move them into anonymous namespaces. NFC. 2020-01-14 14:06:37 +01:00
Hiroshi Yamauchi 7b9f8e17d1 [PGO][CHR] Guard against 0-to-0 branch weight and avoid division by zero crash.
Summary: This fixes a crash in internal builds under SamplePGO.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72653
2020-01-13 14:38:58 -08:00
Teresa Johnson 31441a3e00 [ThinLTO/WPD] Fix index-based WPD for alias vtables
Summary:
A recent fix in D69452 fixed index based WPD in the presence of
available_externally vtables. It added a cast of the vtable def
summary to a GlobalVarSummary. However, in some cases one def may be an
alias, in which case we need to get the base object before casting,
otherwise we will crash.

Reviewers: evgeny777, steven_wu, aganea

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71040
2020-01-13 13:38:26 -08:00
Simon Pilgrim 2740b2d5d5 Fix uninitialized value clang static analyzer warning. NFC. 2020-01-11 16:02:22 +00:00
Nuno Lopes 87407fc03c DSE: fix bug where we would only check libcalls for name rather than whole decl 2020-01-11 11:57:29 +00:00
Nikita Popov 0e322c8a1f [InstCombine] Preserve nuw on sub of geps (PR44419)
Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the
nuw on sub of geps. We only do this if the offset has a multiplication
as the final operation, as we can't be sure the operations is nuw
in the other cases without more thorough analysis.

Differential Revision: https://reviews.llvm.org/D72048
2020-01-11 11:01:12 +01:00
Andrew Paverd bdd88b7ed3 Add support for __declspec(guard(nocf))
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.

Reviewers: rnk, dmajor, pcc, hans, aaron.ballman

Reviewed By: aaron.ballman

Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72167
2020-01-10 16:04:12 +00:00
Simon Pilgrim 2e66405d8d Don't use dyn_cast_or_null if we know the pointer is nonnull.
Fix clang static analyzer null dereference warning by using dyn_cast instead.
2020-01-10 10:32:36 +00:00
Benjamin Kramer 498856fca5 [LV] Silence unused variable warning in Release builds. NFC. 2020-01-10 11:21:27 +01:00
Gil Rapaport 8647a72c4a [LV] VPValues for memory operation pointers (NFCI)
Memory instruction widening recipes use the pointer operand of their load/store
ingredient for generating the needed GEPs, making it difficult to feed these
recipes with pointers based on other ingredients or none at all.
This patch modifies these recipes to use a VPValue for the pointer instead, in
order to reduce ingredient def-use usage by ILV as a step towards full
VPlan-based def-use relations. The recipes are constructed with VPValues bound
to these ingredients, maintaining current behavior.

Differential revision: https://reviews.llvm.org/D70865
2020-01-10 09:24:59 +02:00
Whitney Tsang d27a15fed7 [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function
pass.

Summary: This patch changes LoopUnrollAndJamPass to a function pass, and
keeps the loops traversal order same as defined in
FunctionToLoopPassAdaptor LoopPassManager.h.

The next patch will change the loop traversal to outer to inner order,
so more loops can be transform.

Discussion in llvm-dev mailing list:
https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D72230
2020-01-09 16:18:36 +00:00
@raghesh (Raghesh Aloor) 6c04ef472a [InstCombine] Z / (1.0 / Y) => (Y * Z)
This is a special case of Z / (X / Y) => (Y * Z) / X, with X = 1.0.
The m_OneUse check is avoided because even in the case of the
multiple uses for 1.0/Y, the number of instructions remain the same
and a division is replaced by a multiplication.

Differential Revision: https://reviews.llvm.org/D72319
2020-01-09 10:52:39 -05:00
Florian Hahn ccf24225e3 [Matrix] Update shape propagation to iterate until done.
This patch updates the shape propagation to iterate until no new shape
information is discovered.

As initial seed for the forward propagation, we use the matrix intrinsic
instructions. Both propagateShapeForward and propagateShapeBackward
return new work lists, with the instructions to be used for the next
iteration. When propagating forward, we record all instructions we added
new shape information for. When propagating backward, we record all
users of instructions we added new shape information for.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70901
2020-01-09 10:52:52 +00:00
Florian Hahn 7adf6644f5 [Matrix] Propagate and use shape information for loads.
This patch extends to shape propagation to also include load
instructions and implements shape aware lowering for vector loads.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70900
2020-01-09 10:21:20 +00:00
Evgeniy Brevnov f0abe820ee [LoopUtils][NFC] Minor refactoring in getLoopEstimatedTripCount. 2020-01-09 16:49:15 +07:00
Florian Hahn 459ad8e97e [Matrix] Implement back-propagation of shape information.
This patch extends the shape propagation for matrix operations to also
propagate the shape of instructions to their operands.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70899
2020-01-09 09:48:07 +00:00
Sjoerd Meijer 8f1887456a [LV] Still vectorise when tail-folding can't find a primary inducation variable
This addresses a vectorisation regression for tail-folded loops that are
counting down, e.g. loops as simple as this:

  void foo(char *A, char *B, char *C, uint32_t N) {
    while (N > 0) {
      *C++ = *A++ + *B++;
       N--;
    }
  }

These are loops that can be vectorised, but when tail-folding is requested, it
can't find a primary induction variable which we do need for predicating the
loop. As a result, the loop isn't vectorised at all, which it is able to do
when tail-folding is not attempted. So, this adds a check for the primary
induction variable where we decide how to lower the scalar epilogue. I.e., when
there isn't a primary induction variable, a scalar epilogue loop is allowed
(i.e. don't request tail-folding) so that vectorisation could still be
triggered.

Having this check for the primary induction variable make sense anyway, and in
addition, in a follow-up of this I will look into discovering earlier the
primary induction variable for counting down loops, so that this can also be
tail-folded.

Differential revision: https://reviews.llvm.org/D72324
2020-01-09 09:14:00 +00:00
Johannes Doerfert a4088c75cc [Attributor][FIX] Carefully change invokes to calls (after manifest)
Before we manually inserted unreachable early but that could lead to
broken PHI nodes. Now we use the existing late modification
functionality.
2020-01-08 19:32:38 -06:00
Johannes Doerfert 1e46eb74be [Attributor][FIX] Avoid dangling value pointers during code modification
When we replace instructions with unreachable we delete instructions. We
now avoid dangling pointers to those deleted instructions in the
`ToBeChangedToUnreachableInsts` set. Other modification collections
might need to be updated in the future as well.
2020-01-08 19:32:37 -06:00
Kazu Hirata 2d258ed931 Revert "[JumpThreading] Thread jumps through two basic blocks"
It looks like my patch breaks the sanitizer-windows build:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324

This reverts commit ead815924e.
2020-01-08 13:58:39 -08:00
Kazu Hirata ead815924e [JumpThreading] Thread jumps through two basic blocks
Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-08 06:57:36 -08:00
Kadir Cetinkaya b212eb7159
Revert "[InstCombine] fold zext of masked bit set/clear"
This reverts commit a041c4ec6f.

This looks like a non-trivial change and there has been no code
reviews (at least there were no phabricator revisions attached to the
commit description). It is also causing a regression in one of our
downstream integration tests, we haven't been able to come up with a
minimal reproducer yet.
2020-01-08 11:21:21 +01:00
Philip Reames 312a532dc0 [GVN/FP] Considate logic for reasoning about equality vs equivalance for floats
Factor out common logic into some reasonable commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't.

Differential Revision: https://reviews.llvm.org/D67126
2020-01-07 16:05:04 -08:00
Sanjay Patel f8962571f7 [InstCombine] try to pull 'not' of select into compare operands
not (select ?, (cmp TPred, ?, ?), (cmp FPred, ?, ?) -->
     select ?, (cmp TPred', ?, ?), (cmp FPred', ?, ?)

If both sides of the select are cmps, we can remove an instruction.
The case where only side is a cmp is deferred to a possible
follow-on patch.

We have a more general 'isFreeToInvert' analysis, but I'm not seeing
a way to use that more widely without inducing infinite looping
(opposing transforms).
Here, we flip the compare predicates directly, so we should not have
any danger by creating extra intermediate 'not' ops.

Alive proofs:
https://rise4fun.com/Alive/jKa

Name: both select values are compares - invert predicates
  %tcmp = icmp sle i32 %x, %y
  %fcmp = icmp ugt i32 %z, %w
  %sel = select i1 %cond, i1 %tcmp, i1 %fcmp
  %not = xor i1 %sel, true
=>
  %tcmp_not = icmp sgt i32 %x, %y
  %fcmp_not = icmp ule i32 %z, %w
  %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not

Name: false val is compare - invert/not
  %fcmp = icmp ugt i32 %z, %w
  %sel = select i1 %cond, i1 %tcmp, i1 %fcmp
  %not = xor i1 %sel, true
=>
  %tcmp_not = xor i1 %tcmp, -1
  %fcmp_not = icmp ule i32 %z, %w
  %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not

Differential Revision: https://reviews.llvm.org/D72007
2020-01-07 10:44:23 -05:00
Simon Pilgrim bd1dc6a3eb Fix "use of uninitialized variable" static analyzer warnings. NFCI. 2020-01-07 10:55:37 +00:00
Fangrui Song 6904cd9486 Add Triple::isX86()
Reviewed By: craig.topper, skan

Differential Revision: https://reviews.llvm.org/D72247
2020-01-06 15:51:02 -08:00
James Henderson d68904f957 [NFC] Fix trivial typos in comments
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72143

Patch by Kazuaki Ishizaki.
2020-01-06 10:50:26 +00:00
Brian Gesiak 83a9321f60 [Coroutines] Remove corresponding phi values when apply simplifyTerminatorLeadingToRet
Summary:
In addMustTailToCoroResumes, we set musttail on those resume instructions that are followed by a ret instruction. This is done by simplifyTerminatorLeadingToRet which replace a sequence of branches leading to a ret with a clone of the ret.

However it forgets to remove corresponding PHI values that come from basic block of replaced branch, and may cause jumpthreading pass hangs (https://bugs.llvm.org/show_bug.cgi?id=43720)

This patch fix this issue

Test Plan:
cppcoro library with O3+flto
check-llvm

Reviewers: modocache, GorNishanov, lewissbaker

Reviewed By: modocache

Subscribers: mehdi_amini, EricWF, hiraditya, dexonsmith, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71826

Patch by junparser (JunMa)!
2020-01-05 18:26:30 -05:00
Florian Hahn b8a3c34eee Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."
This reverts commit 51ef53f3bd, as it
breaks some bots.
2020-01-04 18:44:38 +00:00
Florian Hahn 51ef53f3bd [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-01-04 18:29:35 +00:00
Florian Hahn 99f74a64a2 [SCEV] Remove unused ScalarEvolutionExpander.h includes (NFC). 2020-01-04 18:29:35 +00:00
Roman Lebedev 6d05bc2e3a
[NFCI][InstCombine] Refactor 'sink negation into select if that folds one hand of select to 0' fold
I would think it's better than having two practically identical folds
next to eachother, but then generalization isn't all that pretty
due to the fact that we need to produce different `sub` each time..

This change is no-functional-changes-intended refactoring.
2020-01-04 17:30:51 +03:00
Roman Lebedev 772ede3d5d
[InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 (PR44426)
This decreases use count of %Op0, makes one hand of select to be 0,
and possibly exposes further folding potential.

Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal)
  %Op0 = %TrueVal
  %o = select i1 %Cond, i8 %Op0, i8 %FalseVal
  %r = sub i8 %Op0, %o
=>
  %n = sub i8 %Op0, %FalseVal
  %r = select i1 %Cond, i8 0, i8 %n

Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0
  %Op0 = %FalseVal
  %o = select i1 %Cond, i8 %TrueVal, i8 %Op0
  %r = sub i8 %Op0, %o
=>
  %n = sub i8 %Op0, %TrueVal
  %r = select i1 %Cond, i8 %n, i8 0

https://rise4fun.com/Alive/aHRt

https://bugs.llvm.org/show_bug.cgi?id=44426
2020-01-04 17:30:51 +03:00
Roman Lebedev 4d8e47ca18
[InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426)
This decreases use count of %Op1, makes one hand of select to be 0,
and possibly exposes further folding potential.

Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1)
  %Op1 = %TrueVal
  %o = select i1 %Cond, i8 %Op1, i8 %FalseVal
  %r = sub i8 %o, %Op1
=>
  %n = sub i8 %FalseVal, %Op1
  %r = select i1 %Cond, i8 0, i8 %n

Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0
  %Op1 = %FalseVal
  %o = select i1 %Cond, i8 %TrueVal, i8 %Op1
  %r = sub i8 %o, %Op1
=>
  %n = sub i8 %TrueVal, %Op1
  %r = select i1 %Cond, i8 %n, i8 0

https://rise4fun.com/Alive/avL

https://bugs.llvm.org/show_bug.cgi?id=44426
2020-01-04 17:30:51 +03:00
Alexey Lapshin 831bfcea47 [Transforms][GlobalSRA] huge array causes long compilation time and huge memory usage.
Summary:
For artificial cases (huge array, few usages), Global SRA optimization creates
a lot of redundant data. It creates an instance of GlobalVariable for each array
element. For huge array, that means huge compilation time and huge memory usage.
Following example compiles for 10 minutes and requires 40GB of memory.

namespace {
  char LargeBuffer[64 * 1024 * 1024];
}

int main ( void ) {

    LargeBuffer[0] = 0;

    printf("\n ");

    return LargeBuffer[0] == 0;
}

The fix is to avoid Global SRA for large arrays.

Reviewers: craig.topper, rnk, efriedma, fhahn

Reviewed By: rnk

Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71993
2020-01-04 16:42:38 +03:00
Roman Lebedev 7973aa05f6
[NFC][InstCombine] '(Op1 & С) - Op1' -> '-(Op1 & ~C)' fold (PR44427)
This decreases use count of Op1, potentially allows
us to further hoist said 'neg' later on,
and results in marginally better X86 codegen.

Name: (Op1 & С) - Op1 -> -(Op1 & ~C)
  %o = and i64 %Op1, C1
  %r = sub i64 %o, %Op1
=>
  %n = and i64 %Op1, ~C1
  %r = sub i64 0, %n

https://rise4fun.com/Alive/rwgA

https://godbolt.org/z/R_RMfM

https://bugs.llvm.org/show_bug.cgi?id=44427
2020-01-03 21:25:48 +03:00
Roman Lebedev cc0216bedb
[NFC][InstCombine] '(X & (- Y)) - X' -> '- (X & (Y - 1))' fold (PR44448)
Name: (X & (- Y)) - X  ->  - (X & (Y - 1))  (PR44448)
  %negy = sub i8 0, %y
  %unbiasedx = and i8 %negy, %x
  %r = sub i8 %unbiasedx, %x
=>
  %ymask = add i8 %y, -1
  %xmasked = and i8 %ymask, %x
  %r = sub i8 0, %xmasked

https://rise4fun.com/Alive/OIpla

This decreases use count of %x, may allow us to
later hoist said negation even further,
and results in marginally nicer X86 codegen.

See
  https://bugs.llvm.org/show_bug.cgi?id=44448
  https://reviews.llvm.org/D71499
2020-01-03 20:27:29 +03:00
Johannes Doerfert d2d2fb19f7 [Attributor][FIX] Allow dead users of rewritten function
If we replace a function with a new one because we rewrite the
signature, dead users may still refer to the old version. With this
patch we reuse the code that deals with dead functions, which the old
versions are, to avoid problems.
2020-01-03 10:43:40 -06:00
Johannes Doerfert 6b9ee2d6cd [Attributor][NFC] Unify the way we delete dead functions 2020-01-03 10:43:40 -06:00
Johannes Doerfert c90681b681 [Attributor][FIX] Don't crash on ptr2int/int2ptr instructions
An integer isn't allowed in getAlignmentForValue so we need to stop at a
ptr2int instruction during exploration.
2020-01-03 10:43:40 -06:00
Johannes Doerfert 412a0101a9 [Attributor][FIX] Do not derive nonnull and dereferenceable w/o access
An inbounds GEP results in poison if the value is not "inbounds", not in
UB. We accidentally derived nonnull and dereferenceable from these
inbounds GEPs even in the absence of accesses that would make the poison
to UB.
2020-01-03 10:43:40 -06:00
Johannes Doerfert a4b3588ba2 [Attributor][FIX] Return CHANGED once a pessimistic fixpoint is reached. 2020-01-03 10:43:40 -06:00
Ankit 369a919514 Fix for a dangling point bug in DeadStoreElimination pass
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.

While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.

In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.

Reviewers: fhahn, bcahoon, efriedma

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65326
2020-01-03 14:28:44 +00:00
Sanjay Patel 1640582743 [InstCombine] replace undef elements in vector constant when doing icmp folds (PR44383)
As shown in P44383:
https://bugs.llvm.org/show_bug.cgi?id=44383
...we can't safely propagate a vector constant through this icmp fold
if that vector constant contains undefined elements.

We know that each defined element of the constant is safe though, so
find the first of those and replicate it into the formerly undef lanes.

Differential Revision: https://reviews.llvm.org/D72101
2020-01-03 09:16:57 -05:00
Hideto Ueno 5fc02dc0a7 Revert "[Attributor] AAValueConstantRange: Value range analysis using constant range"
This reverts commit e996303431.
2020-01-03 11:03:56 +09:00
Sanjay Patel 88fc5fdef6 [InstCombine] remove uses before deleting instructions (PR43723)
This is a less ambitious alternative to previous attempts to fix
this bug with:
rG56b2aee1875a
rGef02831f0a4e
rG56b2aee1875a
...because those all failed bot testing with use-after-free or
other problems.

The original crashing/assert problem is still showing up on
various fuzzers, so I've added a new minimal test based on
another one of those failures.

Instead of trying to manage and coordinate the logic in
isAllocSiteRemovable() with the deletion loops, just loosen
the existing code that handles casts and GEP by replacing
with undef to allow other opcodes. That means that no
instructions with uses should assert on deletion, and there
are hopefully no non-obvious sanitizer bugs induced.
2020-01-02 09:47:36 -05:00
Brian Gesiak 9ce0ff2eef [Coroutines] const-ify internal helpers (NFC)
Several helpers internal to llvm/Transforms/Coroutines do not use
'const' for parameters that are not modified. Add const where possible.
2020-01-01 21:57:49 -05:00
Brian Gesiak 2fcf7691df [Coroutines] Rename "legacy" passes (NFC)
A series of patches beginning with https://reviews.llvm.org/D71898
propose to add an implementation of the coroutine passes to the new pass
manager. As part of these changes, the coroutine passes that implement
the legacy pass manager interface are renamed, to `<PassName>Legacy`.
This mirrors similar changes that have been made to many other passes in
LLVM as they've been transitioned to support both old and new pass
managers.

This commit splits out the renaming portion of that patch and commits it
in advance as an NFC (no functional change intended) commit. It renames:

* `CoroEarly` => `CoroEarlyLegacy`
* `CoroSplit` => `CoroSplitLegacy`
* `CoroElide` => `CoroElideLegacy`
* `CoroCleanup` => `CoroCleanupLegacy`
2020-01-01 21:41:16 -05:00
Nikita Popov 8dd9a13619 [InstCombine] Preserve inbounds when merging with zero-index GEP (PR44423)
This addresses https://bugs.llvm.org/show_bug.cgi?id=44423.
If one of the GEPs is inbounds and the other is zero-index,
we can also preserve inbounds.

Differential Revision: https://reviews.llvm.org/D72060
2020-01-01 23:04:28 +01:00
Nikita Popov 6ba5f8c4ac [InstCombine] Fix incorrect inbounds on GEP of GEP (PR44425)
This fixes https://bugs.llvm.org/show_bug.cgi?id=44425. We need to
drop inbounds if one of the GEPs is not inbounds. This was already
done when creating a new GEP, but not when modifying in place.

Differential Revision: https://reviews.llvm.org/D72059
2020-01-01 22:10:55 +01:00