Commit Graph

9881 Commits

Author SHA1 Message Date
Kazu Hirata 4698bf145d Resubmit^2: [JumpThreading] Thread jumps through two basic blocks
This reverts commit 41784bed01.

Since the original revision ead815924e,
this revision fixes three issues:

- This revision fixes the Windows build.  My original patch improperly
  copied EH pads on Windows.  This patch disregards jump threading
  opportunities having to do with EH pads.

- This revision fixes jump threading to a wrong destination.
  Specifically, my original patch treated any Constant other than 0 as 1
  while evaluating the branch condition.  This bug led to treating
  constant expressions like:

    icmp ugt i8* null, inttoptr (i64 4 to i8*)

  to "true".  This patch fixes the bug by calling isOneValue.

- This revision fixes the cost calculation of two basic blocks being
  threaded through.  Note that getJumpThreadDuplicationCost returns
  "(unsigned)~0" for those basic blocks that cannot be duplicated.  If
  we sum of two return values from getJumpThreadDuplicationCost, we
  could have an unsigned overflow like:

    (unsigned)~0 + 5 = 4

  and mistakenly determine that it's safe and profitable to proceed
  with the jump threading opportunity.  The patch fixes the bug by
  checking each return value before summing them up.

[JumpThreading] Thread jumps through two basic blocks

Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-02-05 09:23:37 -08:00
Alina Sbirlea 67904db23c [IRCE] Make IRCE a Function pass.
Summary: Make InductiveRangeCheckElimination a FunctionPass.

Reviewers: reames, mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73592
2020-02-05 09:22:41 -08:00
Alina Sbirlea 1c03cc5a39 [NFCI] Update according to style.
clang-tidy + clang-format
2020-02-04 17:11:36 -08:00
Thomas Raoux e53bbf1213 [GVN] Add GVNOption to control load-pre more fine-grained.
Adds the global (cl::opt) GVNOption enable-load-in-loop-pre in order
to control whether the optimization will be performed if the load
is part of a loop.

Patch by Hendrik Greving!

Differential Revision: https://reviews.llvm.org/D73804
2020-02-03 23:00:58 -08:00
Alina Sbirlea 388de9dfcd [LoopUtils] Make duplicate method a utility. [NFCI]
Summary:
Method appendLoopsToWorklist is duplicate in LoopUnroll and in the
LoopPassManager as an internal method. Make it an utility.

Reviewers: dmgreen, chandlerc, fedor.sergeev, yamauchi

Subscribers: mehdi_amini, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73569
2020-02-03 10:24:18 -08:00
Sam Parker 2663a25fad [JumpThreading] Half the duplicate threshold at Oz
Duplicating instructions can lead to code size increases but using
a threshold of 3 is good for reducing code size.

Differential Revision: https://reviews.llvm.org/D72916
2020-02-03 08:40:20 +00:00
Fangrui Song ba3a1774a9 [Transforms] Simplify with make_early_inc_range 2020-02-02 00:54:32 -08:00
Artur Pilipenko 34547ac959 NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween
Separated from https://reviews.llvm.org/D68006 review.
2020-01-31 15:22:33 -08:00
Whitney Tsang e44f4a8a54 [LoopFusion] Move instructions from FC1.GuardBlock to FC0.GuardBlock and
from FC0.ExitBlock to FC1.ExitBlock when proven safe.

Summary:
Currently LoopFusion give up when the second loop nest guard
block or the first loop nest exit block is not empty. For example:

if (0 < N) {
  for (int i = 0; i < N; ++i) {}
  x+=1;
}
y+=1;
if (0 < N) {
  for (int i = 0; i < N; ++i) {}
}
The above example should be safe to fuse.
This PR moves instructions in FC1 guard block (e.g. y+=1;) to
FC0 guard block, or instructions in FC0 exit block (e.g. x+=1;) to
FC1 exit block, which then LoopFusion is able to fuse them.
Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73641
2020-01-30 18:02:22 +00:00
Whitney Tsang da58e68fdf [LoopFusion] Move instructions from FC1.Preheader to FC0.Preheader when
proven safe.

Summary:
Currently LoopFusion give up when the second loop nest preheader is
not empty. For example:

for (int i = 0; i < 100; ++i) {}
x+=1;
for (int i = 0; i < 100; ++i) {}
The above example should be safe to fuse.
This PR moves instructions in FC1 preheader (e.g. x+=1; ) to
FC0 preheader, which then LoopFusion is able to fuse them.
Reviewer: kbarton, Meinersbur, jdoerfert, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71821
2020-01-29 15:06:11 +00:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Whitney Tsang cd0cff4392 [NFCI][LoopUnrollAndJam] Minor changes.
Summary:
1. Add assertions.
2. Verify more analyses.
These changes are moved out of https://reviews.llvm.org/D73129 to
simplify that review.

Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73204
2020-01-28 20:24:23 +00:00
Florian Hahn 5d0ffbeb4d [Matrix] Mark expressions shared between multiple remarks.
This patch adds support for explicitly highlighting sub-expressions
shared by multiple leaf nodes. For example consider the following
code

  %shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10
  %trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10
  tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10
  %load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10
  %mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11
  tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11

We have two leaf nodes (the 2 stores) and the first store stores %trans
which is also used by the matrix multiply %mult. We generate separate
remarks for each leaf (stores). To denote that parts are shared, the
shared expressions are marked as shared (), with a reference to the
other remark that shares it. The operation summary also denotes the
shared operations separately.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72526
2020-01-28 09:27:55 -08:00
Whitney Tsang 78dc64989c [CodeMoverUtils] Improve IsControlFlowEquivalent.
Summary:
Currently IsControlFlowEquivalent determine if two blocks are control
flow equivalent by checking if A dominates B and B post dominates A.
There exists blocks that are control flow equivalent even if they don't
satisfy the A dominates B and B post dominates A condition.
For example,

if (cond)
  A
if (cond)
  B
In the PR, we determine if two blocks are control flow equivalent by
also checking if the two sets of conditions A and B depends on are
equivalent.
Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn,
hfinkel, kbarton
Reviewed By: fhahn
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71578
2020-01-28 14:18:00 +00:00
Florian Hahn 62e228f8fd [Matrix] Add info about number of operations to remarks.
This patch updates the remark to also include a summary of the number of
vector operations generated for each matrix expression.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72480
2020-01-27 17:43:39 -08:00
Florian Hahn 949294f396 [Matrix] Add optimization remarks for matrix expression.
Generate remarks for matrix operations in a function. To generate remarks
for matrix expressions, the following approach is used:
1. Collect leafs of matrix expressions (done in
   RemarkGenerator::getExpressionLeafs).  Leafs are lowered matrix
   instructions without other matrix users (like stores).

2. For each leaf, create a remark containing a linearizied version of the
   matrix expression.

The following improvements will be submitted as follow-ups:
* Summarize number of vector instructions generated for each expression.
* Account for shared sub-expressions.
* Propagate matrix remarks up the inlining chain.

The information provided by the matrix remarks helps users to spot cases
where matrix expression got split up, e.g. due to inlining not
happening. The remarks allow users to address those issues, ensuring
best performance.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72453
2020-01-27 16:39:29 -08:00
Guillaume Chatelet 07c9d53266 [Alignment][NFC] Use Align with CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73449
2020-01-27 10:58:36 +01:00
Alina Sbirlea 0d90d2457c [LoopStrengthReduce] Teach LoopStrengthReduce to preserve MemorySSA is available. 2020-01-24 10:13:52 -08:00
Alina Sbirlea 1d09174290 [LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI]
Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility
method, which sets to nullptr the instructions that are not trivially
dead. Use the new method in LoopStrengthReduce.
Alternative: add a bool to the same method; this option adds a marginal
amount of overhead to the other callers, and the method needs to be
updated to return a bool status when it removes/doesn't remove
instructions.
2020-01-23 16:27:32 -08:00
Alina Sbirlea 9e66c4ec12 [Utils] Use WeakTrackingVH in vector used as scratch storage.
The utility method RecursivelyDeleteTriviallyDeadInstructions receives
as input a vector of Instructions, where all inputs are valid
instructions. This same vector is used as a scratch storage (per the
header comment) to recursively delete instructions. If an instruction is
added as an operand of multiple other instructions, it may be added twice,
then deleted once, then the second reference in the vector is invalid.
Switch to using a Vector<WeakTrackingVH>.
This change facilitates a clean-up in LoopStrengthReduction.
2020-01-23 16:04:57 -08:00
Florian Hahn 4ed7355e44 [IPSCCP] Use ParamState for arguments at call sites.
We currently use integer ranges to merge concrete function arguments.
We use the ParamState range for those, but we only look up concrete
values in the regular state. For concrete function arguments that are
themselves arguments of the containing function, we can use the param
state directly and improve the precision in some cases.

Besides improving the results in some cases, this is also a small step towards
switching to ValueLatticeElement, by allowing D60582 to be a NFC.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D71836
2020-01-23 13:55:42 -08:00
Alina Sbirlea 6770de9b8d [LoopIdiomRecognize] Teach LoopIdiomRecognize to preserve MemorySSA. 2020-01-23 11:31:12 -08:00
Alina Sbirlea a0f627d584 [IndVarSimplify] Fix for MemorySSA preserve. 2020-01-23 11:06:16 -08:00
Guillaume Chatelet 59f95222d4 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Kazu Hirata 41784bed01 Revert "Resubmit: [JumpThreading] Thread jumps through two basic blocks"
This reverts commit 53b68e676f.

Our internal tests are showing breakage with this patch.
2020-01-23 06:34:03 -08:00
Guillaume Chatelet 279fa8e006 [Alignement][NFC] Deprecate untyped CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73260
2020-01-23 13:34:32 +01:00
Daniil Suchkov 6fc9e60149 NFC. Remove obsolete SimpleAnalysis infrastructure
Apparently cache of AliasSetTrackers held by LICM was the only user of
SimpleAnalysis infrastructure. Now, given that we no longer have that
cache, this infrastructure is obsolete and, taking into account its
nature, we don't want any new solutions to be based on it.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed-By: asbirlea

Differential Revision: https://reviews.llvm.org/D73085
2020-01-23 13:58:30 +07:00
Daniil Suchkov 53a28bd891 [LICM] NFC. Remove AST caching infrastructure
Since LICM doesn't use AST caching any more (see D73081), this
infrastructure is now obsolete and we can remove it.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed-By: asbirlea

Differential Revision: https://reviews.llvm.org/D73084
2020-01-23 12:33:50 +07:00
Jonas Devlieghere cf2b498d28 [llvm/Transforms] Fix warning: private field 'MSSA' is not used 2020-01-22 18:07:53 -08:00
Alina Sbirlea adc4faf532 [IndVarSimplify] Teach IndVarSimplify to preserve MemorySSA. 2020-01-22 16:33:17 -08:00
Alina Sbirlea b5b6126d97 [IndVarSimplify] Cleanup spaces and reduce variable scope [NFCI]
Minor clean-ups + clang-format.
2020-01-22 15:32:20 -08:00
Alina Sbirlea 6baf31b7c1 [LoopIdiomRecognize] Reduce variable scope. [NFCI] 2020-01-22 15:30:08 -08:00
Alina Sbirlea efb130fc93 [LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available.
If MemorySSA analysis is analysis, LoopDeletion now preserves it.
2020-01-22 11:38:38 -08:00
Daniil Suchkov 7bdc83f340 [LICM] Don't cache AliasSetTrackers when run under legacy PM
Summary:
This is the first step towards complete removal of AST caching from
LICM. Attempts to keep LICM's AST cache up to date across passes can lead
to miscompiles like this one: https://bugs.llvm.org/show_bug.cgi?id=44320.

LICM has already switched to using MemorySSA to do sinking and hoisting
and only builds an AliasSetTracker on demand for the promoteToScalars
step, without caching it from one LICM instance to the next. Given this,
we don't have compile-time reasons to keep AST caching any more.
The only scenario where the caching would be used currently is when
using the LegacyPassManager and setting -enable-mssa-loop-dependency=false.

This switch should help us to surface any possible issues that may arise
along this way, also it turns subsequent removal of AST caching into NFC.

Reviewers: asbirlea, fhahn, efriedma, reames

Reviewed By: asbirlea

Subscribers: hiraditya, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73081
2020-01-22 13:16:45 +07:00
Florian Hahn f42994f228 [Matrix] Hide and describe matrix-propagate-shape option. 2020-01-21 14:28:47 -08:00
Guillaume Chatelet 46b9563cf6 [Alignment][NFC] Use Align with CreateElementUnorderedAtomicMemCpy
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, nicolasvasilache

Subscribers: hiraditya, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73041
2020-01-20 15:39:45 +01:00
Sjoerd Meijer 93175a5caa [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFCI
This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus
making it a generic loop utility function.  This allows to rewrite loop exit
values by just calling this function without running the whole IndVarSimplify
pass.

We use this in D72714 to rematerialise the iteration count in exit blocks, so
that we can clean-up loop update expressions inside the hardware-loops later.

Differential Revision: https://reviews.llvm.org/D72602
2020-01-20 09:05:00 +00:00
Drew Wock 0bcfafc5e7 [SeparateConstOffsetFromGEP] Fix: sext(a) + sext(b) -> sext(a + b) matches add and sub instructions with one another
During the SeparateConstOffsetFromGEP pass, signed extensions are distributed
to the values that feed into them and then later recombined. The recombination
stage is somewhat problematic- it doesn't differ add and sub instructions
from another when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern
in some instances.

An example- the IR contains:
%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extA + extB

The problematic optimization will transform that into:

%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extend subuAuB ; Obviously not semantically equivalent to the IR input.

This patch fixes that.

Patch by Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D65967
2020-01-17 12:22:52 -05:00
Kazu Hirata 53b68e676f Resubmit: [JumpThreading] Thread jumps through two basic blocks
This reverts commit 2d258ed931.  This
revision fixes the Windows build and adds a testcase for it, namely
thread-two-bbs3.ll.  My original patch improperly copied EH pads on
Windows.  This patch disregards jump threading opportunities having to
do with EH pads.

[JumpThreading] Thread jumps through two basic blocks

Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-16 12:33:37 -08:00
Arkady Shlykov c87982b467 Revert "[Loop Peeling] Add possibility to enable peeling on loop nests."
This reverts commit 3f3017e because there's a failure on peel-loop-nests.ll
with LLVM_ENABLE_EXPENSIVE_CHECKS on.

Differential Revision: https://reviews.llvm.org/D70304
2020-01-16 10:33:38 -08:00
Fedor Sergeev 3478551bf3 [GVN] introduce GVNOptions to control GVN pass behavior
There are a few global (cl::opt) controls that enable optional
behavior in GVN. Introduce GVNOptions that provide corresponding
per-pass instance controls.

That will allow to use GVN multiple times in pipeline each time
with different settings.

Reviewers: asbirlea, rnk, reames, skatkov, fhahn
Reviewed By: fhahn

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72732
2020-01-16 20:21:08 +03:00
Mircea Trofin 7acfda633f [llvm] Make new pass manager's OptimizationLevel a class
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.

In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.

This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.

The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72547
2020-01-16 09:00:56 -08:00
Fedor Sergeev 8a4d12ae5b [BasicBlock] add helper getPostdominatingDeoptimizeCall
It appears to be rather useful when analyzing Loops with multiple
deoptimizing exits, perhaps merged ones.
For now it is used in LoopPredication, will be adding more uses
in other loop passes.

Reviewers: asbirlea, fhahn, skatkov, spatel, reames
Reviewed By: reames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72754
2020-01-16 01:15:57 +03:00
Mircea Trofin 5466597fee [NFC] Refactor InlineResult for readability
Summary:
InlineResult is used both in APIs assessing whether a call site is
inlinable (e.g. llvm::isInlineViable) as well as in the function
inlining utility (llvm::InlineFunction). It means slightly different
things (can/should inlining happen, vs did it happen), and the
implicit casting may introduce ambiguity (casting from 'false' in
InlineFunction will default a message about hight costs,
which is incorrect here).

The change renames the type to a more generic name, and disables
implicit constructors.

Reviewers: eraman, davidxl

Reviewed By: davidxl

Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72744
2020-01-15 13:34:20 -08:00
Zhongduo Lin 34ba96a3d4 [NFC][IndVarSimplify] remove duplicate code in widenWithVariantLoadUseCodegen.
Summary: Duplicate code in widenWithVariantLoadUseCodegen is removed and also use assert to check unknown extension type as it should be filtered out by the pre condition check before calling this function.

Reviewers: az, sanjoy, sebpop, efriedma, javed.absar, sanjoy.google

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72652
2020-01-15 16:27:58 -05:00
Arkady Shlykov 3f3017e162 [Loop Peeling] Add possibility to enable peeling on loop nests.
Summary:
Current peeling implementation bails out in case of loop nests.
The patch introduces a field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.

Reviewers: fhahn, lebedev.ri, xbolva00

Reviewed By: xbolva00

Subscribers: xbolva00, hiraditya, zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D70304
2020-01-15 08:25:21 -08:00
Nuno Lopes 87407fc03c DSE: fix bug where we would only check libcalls for name rather than whole decl 2020-01-11 11:57:29 +00:00
Whitney Tsang d27a15fed7 [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function
pass.

Summary: This patch changes LoopUnrollAndJamPass to a function pass, and
keeps the loops traversal order same as defined in
FunctionToLoopPassAdaptor LoopPassManager.h.

The next patch will change the loop traversal to outer to inner order,
so more loops can be transform.

Discussion in llvm-dev mailing list:
https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D72230
2020-01-09 16:18:36 +00:00
Florian Hahn ccf24225e3 [Matrix] Update shape propagation to iterate until done.
This patch updates the shape propagation to iterate until no new shape
information is discovered.

As initial seed for the forward propagation, we use the matrix intrinsic
instructions. Both propagateShapeForward and propagateShapeBackward
return new work lists, with the instructions to be used for the next
iteration. When propagating forward, we record all instructions we added
new shape information for. When propagating backward, we record all
users of instructions we added new shape information for.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70901
2020-01-09 10:52:52 +00:00
Florian Hahn 7adf6644f5 [Matrix] Propagate and use shape information for loads.
This patch extends to shape propagation to also include load
instructions and implements shape aware lowering for vector loads.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70900
2020-01-09 10:21:20 +00:00
Florian Hahn 459ad8e97e [Matrix] Implement back-propagation of shape information.
This patch extends the shape propagation for matrix operations to also
propagate the shape of instructions to their operands.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70899
2020-01-09 09:48:07 +00:00
Kazu Hirata 2d258ed931 Revert "[JumpThreading] Thread jumps through two basic blocks"
It looks like my patch breaks the sanitizer-windows build:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324

This reverts commit ead815924e.
2020-01-08 13:58:39 -08:00
Kazu Hirata ead815924e [JumpThreading] Thread jumps through two basic blocks
Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247
2020-01-08 06:57:36 -08:00
Philip Reames 312a532dc0 [GVN/FP] Considate logic for reasoning about equality vs equivalance for floats
Factor out common logic into some reasonable commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't.

Differential Revision: https://reviews.llvm.org/D67126
2020-01-07 16:05:04 -08:00
Florian Hahn b8a3c34eee Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."
This reverts commit 51ef53f3bd, as it
breaks some bots.
2020-01-04 18:44:38 +00:00
Florian Hahn 51ef53f3bd [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-01-04 18:29:35 +00:00
Ankit 369a919514 Fix for a dangling point bug in DeadStoreElimination pass
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.

While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.

In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.

Reviewers: fhahn, bcahoon, efriedma

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65326
2020-01-03 14:28:44 +00:00
Florian Hahn dc2c9b0fcf [Matrix] Propagate and use shape info for binary operators.
This patch extends the current shape propagation and shape aware
lowering to also support binary operators. Those operators are uniform
with respect to their shape (shape of the input operands is the same as
the shape of their result).

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70898
2019-12-27 15:50:47 +00:00
Whitney Tsang d1f41b2ca9 [NFC][LoopFusion] Fix printing of the guard branch.
Reviewer: kbarton, jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71878
2019-12-26 02:45:29 +00:00
Florian Hahn 8d6f59b78a [Matrix] Use fmuladd for matrix.multiply if allowed.
If the matrix.multiply calls have the contract fast math flag, we can
use fmuladd. This als adds a command line option to force fmuladd
generation. We can retire this option once there is a clang-level
option.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70951
2019-12-23 14:49:14 +01:00
Florian Hahn 109e4e3851 [Matrix] Add forward shape propagation and first shape aware lowerings.
This patch adds infrastructure for forward shape propagation to
LowerMatrixIntrinsics. It also updates the pass to make use of
the shape information to break up larger vector operations and to
eliminate unnecessary conversion operations between columnwise matrixes
and flattened vectors: if shape information is available for an
instruction, lower the operation to a set of instructions operating on
columns. For example, a store of a matrix is broken down into separate
stores for each column. For users that do not have shape
information (e.g. because they do not yet support shape information
aware lowering), we pack the result columns into a flat vector and
update those users.

It also adds shape aware lowering for the first non-intrinsic
instruction: vector stores.

Example:

For
  %c  = call <4 x double> @llvm.matrix.transpose(<4 x double> %a, i32 2, i32 2)
  store <4 x double> %c, <4 x double>* %Ptr

We generate the code below without shape propagation. Note %9 which
combines the columns of the transposed matrix into a flat vector.

  %split = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 0, i32 1>
  %split1 = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 2, i32 3>
  %1 = extractelement <2 x double> %split, i64 0
  %2 = insertelement <2 x double> undef, double %1, i64 0
  %3 = extractelement <2 x double> %split1, i64 0
  %4 = insertelement <2 x double> %2, double %3, i64 1
  %5 = extractelement <2 x double> %split, i64 1
  %6 = insertelement <2 x double> undef, double %5, i64 0
  %7 = extractelement <2 x double> %split1, i64 1
  %8 = insertelement <2 x double> %6, double %7, i64 1
  %9 = shufflevector <2 x double> %4, <2 x double> %8, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  store <4 x double> %9, <4 x double>* %Ptr

With this patch, we propagate the 2x2 shape information from the
transpose to the store and we generate the code below. Note that we
store the columns directly and do not need an extra shuffle.

  %9 = bitcast <4 x double>* %Ptr to double*
  %10 = bitcast double* %9 to <2 x double>*
  store <2 x double> %4, <2 x double>* %10, align 8
  %11 = getelementptr double, double* %9, i32 2
  %12 = bitcast double* %11 to <2 x double>*
  store <2 x double> %8, <2 x double>* %12, align 8

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70897
2019-12-23 13:51:56 +01:00
Mark de Wever 098d3347e7 [Transforms] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71810
2019-12-22 19:20:17 +01:00
Bjorn Pettersson 89e3bb4502 [ConstantHoisting] Ignore unreachable bb:s when collecting candidates
Summary:
Ignore looking at blocks that are unreachable from entry when
collecting candidates for hosting.

Normally the consthoist pass is executed in the llc pipeline,
just after unreachableblockelim. So it is abnormal to have code
that is unreachable from the entry block. But when running the
pass as part of opt, for example as part of fuzzy testing, we
might trigger various kinds of asserts when collecting candidates
if we include unreachable blocks in that analysis.

It seems like a waste of time to hoist constants in unreachble
blocks, so the solution is to simply ignore such blocks when
collecting the hoisting candidates.

The two added test cases used to end up in two different asserts,
and the intention with the checks is just to verify that we no
longer fail.

Fixes: PR43903

Reviewers: spatel

Reviewed By: spatel

Subscribers: hiraditya, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71678
2019-12-19 15:07:55 +01:00
Kit Barton 3db1cf7a1e [LoopFusion] Use the LoopInfo::isRotatedForm method (NFC).
Loop fusion previously had a method to check whether a loop was in rotated form. This method has
been moved into the LoopInfo class. This patch removes the old isRotated method from loop fusion,
in favour of the new one in LoopInfo.
2019-12-18 15:04:25 -05:00
Whitney Tsang 36bdc3dc35 [LoopFusion] Move instructions from FC0.Latch to FC1.Latch.
Summary:This PR move instructions from FC0.Latch bottom up to the
beginning of FC1.Latch as long as they are proven safe.

To illustrate why this is beneficial, let's consider the following
example:
Before Fusion:
header1:
  br header2
header2:
  br header2, latch1
latch1:
  br header1, preheader3
preheader3:
  br header3
header3:
  br header4
header4:
  br header4, latch3
latch3:
  br header3, exit3

After Fusion (before this PR):
header1:
  br header2
header2:
  br header2, latch1
latch1:
  br header3
header3:
  br header4
header4:
  br header4, latch3
latch3:
  br header1, exit3

Note that preheader3 is removed during fusion before this PR.
Notice that we cannot fuse loop2 with loop4 as there exists block latch1
in between.
This PR move instructions from latch1 to beginning of latch3, and remove
block latch1. LoopFusion is now able to fuse loop nest recursively.

After Fusion (after this PR):
header1:
  br header2
header2:
  br header3
header3:
  br header4
header4:
  br header2, latch3
latch3:
  br header1, exit3

Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel,
bmahjour, etiotto
Reviewed By: kbarton, Meinersbur
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71165
2019-12-17 22:10:23 +00:00
Guillaume Chatelet 531c1161b9 Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove"
Summary:
This is a resubmit of D71473.

This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients.
Functions will be deprecated one by one and as in tree code is cleaned up.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: aaron.ballman, courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71547
2019-12-17 10:07:46 +01:00
Kit Barton ff07fc66d9 [LoopFusion] Restrict loop fusion to rotated loops.
Summary:
This patch restricts loop fusion to only consider rotated loops as valid candidates.
This simplifies the analysis and transformation and aligns with other loop optimizations.

Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel

Reviewed By: Meinersbur

Subscribers: ormris, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71025
2019-12-16 15:17:29 -05:00
Guillaume Chatelet 4658da10e4 Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove"
This reverts commit 181ab91efc.
2019-12-16 15:19:49 +01:00
Guillaume Chatelet 181ab91efc [Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove
Summary:
This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients.
Functions will be deprecated one by one and as in tree code is cleaned up.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71473
2019-12-16 13:35:55 +01:00
Bjorn Pettersson 1c49553c19 [BasicBlockUtils] Add utility to remove redundant dbg.value instrs
Summary:
Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the
goal to remove redundant dbg intrinsics from a basic block.

This can be useful after various transforms, as it might
be simpler to do a filtering of dbg intrinsics after the
transform than during the transform.
One primary use case would be to replace a too aggressive
removal done by MergeBlockIntoPredecessor, seen at loop
rotate (not done in this patch).

The elimination algorithm currently focuses on dbg.value
intrinsics and is doing two iterations over the BB.

First we iterate backward starting at the last instruction
in the BB. Whenever a consecutive sequence of dbg.value
instructions are found we keep the last dbg.value for
each variable found (variable fragments are identified
using the  {DILocalVariable, FragmentInfo, inlinedAt}
triple as given by the DebugVariable helper class).

Next we iterate forward starting at the first instruction
in the BB. Whenever we find a dbg.value describing a
DebugVariable (identified by {DILocalVariable, inlinedAt})
we save the {DIValue, DIExpression} that describes that
variables value. But if the variable already was mapped
to the same {DIValue, DIExpression} pair we instead drop
the second dbg.value.

To ease the process of making lit tests for this utility a
new pass is introduced called RedundantDbgInstElimination.
It can be executed by opt using -redundant-dbg-inst-elim.

Reviewers: aprantl, jmorse, vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71478
2019-12-16 11:41:21 +01:00
Nicola Zaghen 97572775d2 Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

This fixes the buildbot failures.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-13 14:30:21 +00:00
Florian Hahn 526244b187 [Matrix] Add first set of matrix intrinsics and initial lowering pass.
This is the first patch adding an initial set of matrix intrinsics and a
corresponding lowering pass. This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html

The first patch introduces four new intrinsics (transpose, multiply,
columnwise load and store) and a LowerMatrixIntrinsics pass, that
lowers those intrinsics to vector operations.

Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix
embedded in a <16 x float> vector) and the intrinsics take the dimension
information as parameters. Those parameters need to be ConstantInt.
For the memory layout, we initially assume column-major, but in the RFC
we also described how to extend the intrinsics to support row-major as
well.

For the initial lowering, we split the input of the intrinsics into a
set of column vectors, transform those column vectors and concatenate
the result columns to a flat result vector.

This allows us to lower the intrinsics without any shape propagation, as
mentioned in the RFC. In follow-up patches, we plan to submit the
following improvements:
 * Shape propagation to eliminate the embedding/splitting for each
   intrinsic.
 * Fused & tiled lowering of multiply and other operations.
 * Optimization remarks highlighting matrix expressions and costs.
 * Generate loops for operations on large matrixes.
 * More general block processing for operation on large vectors,
   exploiting shape information.

We would like to add dedicated transpose, columnwise load and store
intrinsics, even though they are not strictly necessary. For example, we
could instead emit a large shufflevector instruction instead of the
transpose. But we expect that to
  (1) become unwieldy for larger matrixes (even for 16x16 matrixes,
      the resulting shufflevector masks would be huge),
  (2) risk instcombine making small changes, causing us to fail to
      detect the transpose, preventing better lowerings

For the load/store, we are additionally planning on exploiting the
intrinsics for better alias analysis.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70456
2019-12-12 15:42:18 +00:00
Nicola Zaghen f798eb21ec Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same."
This reverts commit 5f6208778f.

This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll
const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
2019-12-12 10:29:54 +00:00
Nicola Zaghen 5f6208778f [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-12 10:07:01 +00:00
Reid Kleckner 85ba5f637a Rename TTI::getIntImmCost for instructions and intrinsics
Soon Intrinsic::ID will be a plain integer, so this overload will not be
possible.

Rename both overloads to ensure that downstream targets observe this as
a build failure instead of a runtime failure.

Split off from D71320

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D71381
2019-12-11 18:00:20 -08:00
Guillaume Chatelet 0a0d54b357 [Alignment][NFC] Introduce Align in IRBuilder
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71343
2019-12-11 14:41:23 +01:00
Guillaume Chatelet 8a7c52bc22 [Alignment][NFC] Introduce Align in SROA
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71277
2019-12-11 09:34:38 +01:00
Guillaume Chatelet 1b2842bf90 [Alignment][NFC] CreateMemSet use MaybeAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71213
2019-12-10 15:17:44 +01:00
Djordje Todorovic 9b9e995819 [DebugInfo][EarlyCSE] Use the salvageDebugInfoOrMarkUndef(); NFC
Use the newest API.

Differential Revision: https://reviews.llvm.org/D71061
2019-12-09 13:57:35 +01:00
Daniil Suchkov c4d8c6319f [LCSSA] Don't use VH callbacks to invalidate SCEV when creating LCSSA phis
In general ValueHandleBase::ValueIsRAUWd shouldn't be called when not
all uses of the value were actually replaced, though, currently
formLCSSAForInstructions calls it when it inserts LCSSA-phis.

Calls of ValueHandleBase::ValueIsRAUWd were added to LCSSA specifically
to update/invalidate SCEV. In the best case these calls duplicate some
of the work already done by SE->forgetValue, though in case when SCEV of
the value is SCEVUnknown, SCEV replaces the underlying value of
SCEVUnknown with the new value (i.e. acts like LCSSA-phi actually fully
replaces the value it is created for), which leads to SCEV being
corrupted because LCSSA-phi rarely dominates all uses of its inputs.

Fixes bug https://bugs.llvm.org/show_bug.cgi?id=44058.

Reviewers: fhahn, efriedma, reames, sanjoy.google

Reviewed By: fhahn

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70593
2019-12-06 13:21:49 +07:00
Florian Hahn 19071173fc Revert "[DSE] Fix for a dangling point bug in DeadStoreElimination."
The commit causes a failure:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20911

This reverts commit 1847fd9d85.
2019-12-05 19:29:21 +00:00
Ankit 1847fd9d85 [DSE] Fix for a dangling point bug in DeadStoreElimination.
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.

While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.

In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.

Patch by Ankit <quic_aankit@quicinc.com>

Reviewers: fhahn, bcahoon, efriedma

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D65326
2019-12-05 17:53:58 +00:00
Florian Hahn e8a5c17211 [LoopInterchange] Improve inner exit loop safety checks.
The PHI node checks for inner loop exits are too permissive currently.
As indicated by an existing comment, we should only allow LCSSA PHI
nodes that are part of reductions or are only used outside of the loop
nest. We ensure this by checking the users of the LCSSA PHIs.
Specifically, it is not safe to use an exiting value from the inner loop in the latch of the outer
loop.

It also moves the inner loop exit check before the outer loop exit
check.

Fixes PR43473.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D68144
2019-12-04 17:46:01 +00:00
Florian Hahn 4a9cde5a79 [SimpleLoopUnswitch] Invalidate the topmost loop with ExitBB as exiting.
SCEV caches the exiting blocks when computing exit counts. In
SimpleLoopUnswitch, we split the exit block of the loop to unswitch.

Currently we only invalidate the loop containing that exit block, but if
that block is the exiting block for a parent loop, we have stale cache
entries. We have to invalidate the top-most loop that contains the exit
block as exiting block. We might also be able to skip invalidating the
loop containing the exit block, if the exit block is not an exiting
block of that loop.

There are also 2 more places in SimpleLoopUnswitch, that use a similar
problematic approach to get the loop to invalidate. If the patch makes
sense, I will also update those places to a similar approach (they deal
with multiple exit blocks, so we cannot directly re-use
getTopMostExitingLoop).

Fixes PR43972.

Reviewers: skatkov, reames, asbirlea, chandlerc

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D70786
2019-12-04 11:32:09 +00:00
Hiroshi Yamauchi 8cdfdfeee6 [PGO][PGSO] Add an optional query type parameter to shouldOptimizeForSize.
Summary:
In case of a need to distinguish different query sites for gradual commit or
debugging of PGSO. NFC.

Reviewers: davidxl

Subscribers: hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70510
2019-12-02 13:54:13 -08:00
Whitney Tsang aaf7f05a96 [NFC][LoopFusion] Use isControlFlowEquivalent() from CodeMoverUtils.
Reviewer: kbarton, jdoerfert, Meinersbur, bmahjour, etiotto
Reviewed By: Meinersbur
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D70619
2019-11-25 17:54:42 +00:00
Florian Hahn 9a432161c6 [LoopInterchange] Adjust assertions when updating successors.
Currently the assertion in updateSuccessor is overly strict in some
cases and overly relaxed in other cases. For branches to the inner and
outer loop preheader it is too strict, because they can either be
unconditional branches or conditional branches with duplicate targets.
Both cases are fine and we can allow updating multiple successors.

On the other hand, we have to at least update one successor. This patch
adds such an assertion.
2019-11-24 19:37:16 +00:00
Kazu Hirata a195556628 [JumpThreading] NFC: Don't cache F.hasProfileData()
Summary:
With this patch, we no longer cache F.hasProfileData().  We simply
call the function again.

I'm doing this because:

- JumpThreadingPass also has a member variable named HasProfileData,
  which is very confusing,

- the function is very lightweight, and

- this patch makes JumpThreading::runOnFunction more consistent with
  JumpThreadingPass::run.

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70602
2019-11-22 08:51:14 -08:00
Kazu Hirata 1a58be2ac5 [JumpThreading] Use profile data even with the new pass manager
Summary:
Without this patch, the jump threading pass ignores profiling data
whenever we invoke the pass with the new pass manager.

Specifically, JumpThreadingPass::run calls runImpl with class variable
HasProfileData always set to false.  In turn, runImpl sets
HasProfileData to false again:

  HasProfileData = HasProfileData_;

In the end, we don't use profiling data at all with the new pass
manager.

This patch fixes the problem by passing F.hasProfileData() to runImpl.

The bug appears to have been introduced at:

  https://reviews.llvm.org/D41461

which removed local variable HasProfileData in JumpThreadingPass::run
even though there was one more use left in the same function.  As a
result, the remaining use ended referring to the class variable
instead.

Note that F.hasProfileData is an extremely lightweight function, so I
don't see the need to cache its result.  Once this patch is approved,
I'm planning to stop caching the result of F.hasProfileData in
runOnFunction.

Reviewers: wmi, eli.friedman

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70509
2019-11-22 08:21:48 -08:00
Alina Sbirlea fa09dddd70 [LoopInstSimplify] Move MemorySSA verification under flag.
The verification inside loop passes should be done under the
VerifyMemorySSA flag (enabled by EXPESIVE_CHECKS or explicitly with
opt), in order to not add to compile time during regular builds.
2019-11-21 17:01:24 -08:00
Philip Reames dfb7a9091a [LoopPred] Robustly handle partially unswitched loops
We may end up with a case where we have a widenable branch above the loop, but not all widenable branches within the loop have been removed.  Since a widenable branch inhibit SCEVs ability to reason about exit counts (by design), we have a tradeoff between effectiveness of this optimization and allowing future widening of the branches within the loop.  LoopPred is thought to be one of the most important optimizations for range check elimination, so let's pay the cost.
2019-11-21 15:44:36 -08:00
Vedant Kumar 844d97f650 Clang-trunk Generates Wrong Debug values with -O1
Bit-Tracking Dead Code Elimination (bdce) do not mark dbg.value as undef after
deleting instruction.  which shows invalid state of variable in debugger.  This
patches fixes this by marking the dbg.value as undef which depends on dead
instruction.

This fixes https://bugs.llvm.org/show_bug.cgi?id=41925

Patch by kamlesh kumar!

Differential Revision: https://reviews.llvm.org/D70040
2019-11-21 13:53:10 -08:00
Kazu Hirata 4f5d931c58 [JumpThreading] Refactor ThreadEdge
Summary:
This patch moves various checks from ThreadEdge to new function
TryThreadEdge The rational behind this is that I'd like to use
ThreadEdge without its checks in my upcoming patch.

This patch preserves lightweight checks as assertions in ThreadEdge.
ThreadEdge does not repeat the cost check, however.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70338
2019-11-21 12:38:22 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Philip Reames aaea24802b Broaden the definition of a "widenable branch"
As a reminder, a "widenable branch" is the pattern "br i1 (and i1 X, WC()), label %taken, label %untaken" where "WC" is the widenable condition intrinsics. The semantics of such a branch (derived from the semantics of WC) is that a new condition can be added into the condition arbitrarily without violating legality.

Broaden the definition in two ways:
    Allow swapped operands to the br (and X, WC()) form
    Allow widenable branch w/trivial condition (i.e. true) which takes form of br i1 WC()

The former is just general robustness (e.g. for X = non-instruction this is what instcombine produces). The later is specifically important as partial unswitching of a widenable range check produces exactly this form above the loop.

Differential Revision: https://reviews.llvm.org/D70502
2019-11-21 10:46:16 -08:00
James Y Knight e47d6da8a5 D'oh. Fix assert after a84922916e.
(Which was attempting to fix unused variable warning in NDEBUG mode after 8ba56f322a)
2019-11-20 22:22:51 -05:00
James Y Knight a84922916e Fix unused variable warning in NDEBUG mode after 8ba56f322a 2019-11-20 22:05:05 -05:00
Alina Sbirlea 5c5cf899ef [MemorySSA] Moving at the end often means before terminator.
Moving accesses in MemorySSA at InsertionPlace::End, when an instruction is
moved into a block, almost always means insert at the end of the block, but
before the block terminator. This matters when the block terminator is a
MemoryAccess itself (an invoke), and the insertion must be done before
the terminator for the update to be correct.

Insert an additional position: InsertionPlace:BeforeTerminator and update
current usages where this applies.

Resolves PR44027.
2019-11-20 17:11:00 -08:00
Alina Sbirlea da4baa2a6c [MemorySSA] Update analysis when the terminator is a memory instruction.
Update MemorySSA when moving the terminator instruction, as that may be a memory touching instruction.
Resolves PR44029.
2019-11-20 16:36:52 -08:00
Philip Reames 8ba56f322a Move widenable branch formation into makeGuardControlFlowExplicit helper
This is mostly NFC, but I removed the setting of the guard's calling convention onto the WC call.  Why?  Because it was untested, and was producing an ill defined output as the declaration's convention wasn't been changed leaving a mismatch which is UB.
2019-11-20 12:54:05 -08:00