Commit Graph

21018 Commits

Author SHA1 Message Date
Max Kazantsev 24c186ff00 Disable TermFolding in LoopSimplifyCFG until PR39783 is fixed
llvm-svn: 347844
2018-11-29 09:00:19 +00:00
Sam Parker d6ebf0108e [LoopStrengthReduce] ComplexityLimit as an option
Convert ComplexityLimit into a command line value.

Differential Revision: https://reviews.llvm.org/D54899

llvm-svn: 347843
2018-11-29 08:34:22 +00:00
Jeremy Morse 9b4cfa55b1 [DebugInfo] Give inlinable calls DILocs (PR39807)
In PR39807 we incorrectly handle circumstances where calls are common'd
from conditional blocks into the parent BB. Calls that can be inlined
must always have DebugLocs, however we strip them during commoning, which
the IR verifier asserts on.

Fix this by using applyMergedLocation: it will perform the same DebugLoc
stripping of conditional Locs, but will also generate an unknown location
DebugLoc that satisfies the requirement for inlinable calls to always have
locations.

Some of the prior logic for selecting a DebugLoc is now likely redundant;
I'll generate a follow-up to remove it (involves editing more regression
tests).

Differential Revision: https://reviews.llvm.org/D54997

llvm-svn: 347782
2018-11-28 17:58:45 +00:00
John Brawn 4557ffeb63 [LICM] Enable control flow hoisting by default
Differential Revision: https://reviews.llvm.org/D54949

llvm-svn: 347778
2018-11-28 17:23:03 +00:00
John Brawn 31c9769580 [LICM] Reapply r347190 "Make LICM able to hoist phis" with fix
This commit caused failures because it failed to correctly handle cases where
we hoist a phi, then hoist a use of that phi, then have to rehoist that use. We
need to make sure that we rehoist the use to _after_ the hoisted phi, which we
do by always rehoisting to the immediate dominator instead of just rehoisting
everything to the original preheader.

An option is also added to control whether control flow is hoisted, which is
off in this commit but will be turned on in a subsequent commit.

Differential Revision: https://reviews.llvm.org/D52827

llvm-svn: 347776
2018-11-28 17:21:49 +00:00
Nikita Popov 8d63aed459 [InstCombine] Combine saturating add/sub with constant operands
Combine
  sat(sat(X + C1) + C2) -> sat(X + (C1+C2))
and
  sat(sat(X - C1) - C2) -> sat(X - (C1+C2))
if the sign of C1 and C2 matches.

In the unsigned case we can compute C1+C2 with saturating arithmetic,
and InstSimplify will reduce this just to the saturation value. For
the signed case, we cannot perform the simplification if the result
of the addition overflows.
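
As a rough illustration of the fold above (not the InstCombine code itself), the sketch below models saturating adds on 4-bit integers and checks the claim exhaustively. The bit width and all helper names are assumptions chosen for the example; only the add form is modelled.

```python
# Toy model of sat(sat(X + C1) + C2) -> sat(X + (C1+C2)).
# BW = 4 is an assumption that keeps the exhaustive check small.
BW = 4
UMAX = (1 << BW) - 1
SMIN, SMAX = -(1 << (BW - 1)), (1 << (BW - 1)) - 1


def uadd_sat(x, c):
    """Unsigned saturating add."""
    return min(x + c, UMAX)


def sadd_sat(x, c):
    """Signed saturating add."""
    return max(SMIN, min(x + c, SMAX))


def unsigned_fold_holds():
    # In the unsigned case C1+C2 is itself computed with saturation.
    return all(
        uadd_sat(uadd_sat(x, c1), c2) == uadd_sat(x, uadd_sat(c1, c2))
        for x in range(UMAX + 1)
        for c1 in range(UMAX + 1)
        for c2 in range(UMAX + 1)
    )


def signed_fold_holds():
    # Only constants with matching signs whose sum does not overflow.
    values = range(SMIN, SMAX + 1)
    return all(
        sadd_sat(sadd_sat(x, c1), c2) == sadd_sat(x, c1 + c2)
        for x in values
        for c1 in values
        for c2 in values
        if (c1 >= 0) == (c2 >= 0) and SMIN <= c1 + c2 <= SMAX
    )
```

With mismatched signs the fold is not valid in this model: `sadd_sat(sadd_sat(SMAX, 1), -1)` gives `SMAX - 1`, while `sadd_sat(SMAX, 0)` gives `SMAX`.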

This change is part of https://reviews.llvm.org/D54534.

llvm-svn: 347773
2018-11-28 16:37:15 +00:00
Nikita Popov 42f89989a1 [InstCombine] Canonicalize ssub.sat to sadd.sat
Canonicalize ssub.sat(X, C) to sadd.sat(X, -C) if C is constant and
not signed minimum. This will help further optimizations to apply.
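
A small sketch of why the signed minimum has to be excluded, assuming an 8-bit type and hypothetical helper names: negating the signed minimum wraps back to itself, so the rewritten sadd.sat would move in the wrong direction.

```python
# Toy model of ssub.sat(X, C) -> sadd.sat(X, -C) on i8 (width is an assumption).
BW = 8
SMIN, SMAX = -(1 << (BW - 1)), (1 << (BW - 1)) - 1


def clamp(v):
    return max(SMIN, min(v, SMAX))


def sadd_sat(x, c):
    return clamp(x + c)


def ssub_sat(x, c):
    return clamp(x - c)


def wrapping_neg(c):
    """Two's-complement negation in BW bits; the signed minimum maps to itself."""
    return ((-c - SMIN) % (1 << BW)) + SMIN


def canonicalization_holds(c):
    """Check the rewrite for one constant C against every X."""
    return all(
        ssub_sat(x, c) == sadd_sat(x, wrapping_neg(c))
        for x in range(SMIN, SMAX + 1)
    )
```

In this model `canonicalization_holds(c)` is true for every constant except `SMIN`.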

This change is part of https://reviews.llvm.org/D54534.

llvm-svn: 347772
2018-11-28 16:37:09 +00:00
Nikita Popov 78a9295e15 [InstCombine] Use known overflow information for saturating add/sub
If ValueTracking can determine that the add/sub can never overflow,
replace it with the corresponding nuw/nsw add/sub.

Additionally, for the unsigned case, if ValueTracking determines
that the add/sub always overflows, replace the result with the
saturation value.

This change is part of https://reviews.llvm.org/D54534.

llvm-svn: 347770
2018-11-28 16:36:59 +00:00
Nikita Popov 085d24a8b3 [InstCombine] Canonicalize const arg for saturating adds
If a saturating add intrinsic has one constant argument, make sure
it is on the RHS. This will simplify further transformations.

This change is part of https://reviews.llvm.org/D54534.

llvm-svn: 347769
2018-11-28 16:36:52 +00:00
Xin Tong 53e52e47e8 [ThinLTO] Correct linkonce_any function import linkage. NFC.
Summary:
This is NFC, as we do not import non-odr vague linkage when computing
the import list for a module.

Reviewers: tejohnson, pcc

Subscribers: inglorion, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54928

llvm-svn: 347763
2018-11-28 15:16:35 +00:00
Alexey Bataev 579c2d9d64 [SLP]Fix PR39774: Set ReductionRoot if the original instruction is vectorized.
Summary:
If the original reduction root instruction was vectorized, it might be
removed from the tree. It means that the insertion point may become
invalidated and the whole vectorization of the reduction leads to the
incorrect output result.
The ReductionRoot instruction must be marked as externally used so it
could not be removed. Otherwise it might cause inconsistency with the
cost model and we may end up with too optimistic optimization.

Reviewers: RKSimon, spatel, hfinkel, mkuper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54955

llvm-svn: 347759
2018-11-28 14:34:11 +00:00
Florian Hahn fd6ea134f4 [PartialInliner] Make PHIs free in cost computation.
InlineCost also treats them as free and the current implementation
can cause assertion failures if PHI nodes are moved outside the region
from entry BBs to the region.

It also updates the code to use the instructionsWithoutDebug iterator.

Reviewers: davidxl, davide, vsk, graham-yiu-huawei

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D54748

llvm-svn: 347683
2018-11-27 18:17:27 +00:00
Tim Northover 81bff5e6ea InstCombine: add comment explaining malloc deletion. NFC.
I tried to change this, not quite realising the logic behind what we
were doing. Hopefully this comment will help the next person to come
along.

llvm-svn: 347653
2018-11-27 11:08:14 +00:00
Max Kazantsev 70b11c6d31 [LoopSimplifyCFG] Turn on term folding after underlying bug fixed
llvm-svn: 347641
2018-11-27 06:19:42 +00:00
Max Kazantsev c4e4d6449a [LoopSimplifyCFG] Fix corner case with duplicating successors
It fixes a bug where Phi inputs of the only live successor were not updated
when that successor is in the list of block's successors more than once.

Thanks @uabelho for finding this.

Differential Revision: https://reviews.llvm.org/D54849
Reviewed By: anna

llvm-svn: 347640
2018-11-27 06:17:21 +00:00
Xin Tong 04d49779a1 [ICP] Remove incompatible attributes at indirect-call promoted callsites.
Summary:
Remove incompatible attributes at indirect-call promoted callsites; not removing them results in
at least an IR verification error.

Reviewers: davidxl, xur, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54913

llvm-svn: 347605
2018-11-26 22:03:52 +00:00
Sanjay Patel 790af91803 [InstCombine] add helper function to reduce code duplication; NFC
llvm-svn: 347604
2018-11-26 22:00:41 +00:00
Florian Hahn 6615a7132a [IPSCCP] Use input operand instead of OriginalOp for ssa_copy.
OriginalOp of a Predicate refers to the original IR value,
before renaming. While solving in IPSCCP, we have to use
the operand of the ssa_copy instead, to avoid missing
updates for nested conditions on the same IR value.

Fixes PR39772.

llvm-svn: 347524
2018-11-25 16:32:02 +00:00
Nikita Popov 2c779c0e34 [InstCombine] Determine demanded and known bits for funnel shifts
Support funnel shifts in InstCombine demanded bits simplification.
If the shift amount is constant, we can determine both the demanded
bits of the operands, as well as the known bits of the result.

If one of the operands has no demanded bits, it will be replaced
by undef and the funnel shift will be simplified into a simple shift
due to the simplifications added in D54778.

Differential Revision: https://reviews.llvm.org/D54869

llvm-svn: 347515
2018-11-24 19:00:45 +00:00
Nikita Popov 6e81d421e1 [InstCombine] Simplify funnel shift with zero/undef operand to shift
The following simplifications are implemented:

 * `fshl(X, 0, C) -> shl X, C%BW`
 * `fshl(X, undef, C) -> shl X, C%BW` (assuming undef = 0)
 * `fshl(0, X, C) -> lshr X, BW-C%BW`
 * `fshl(undef, X, C) -> lshr X, BW-C%BW` (assuming undef = 0)
 * `fshr(X, 0, C) -> shl X, (BW-C%BW)`
 * `fshr(X, undef, C) -> shl X, BW-C%BW` (assuming undef = 0)
 * `fshr(0, X, C) -> lshr X, C%BW`
 * `fshr(undef, X, C) -> lshr X, C%BW` (assuming undef = 0)

The simplification is only performed if the shift amount C is constant,
because we can explicitly compute C%BW and BW-C%BW in this case.
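
The zero-operand rules listed above can be checked with a small model. This is a sketch on 8-bit values with hypothetical helper names; it only covers shift amounts where C%BW is non-zero, since a plain shift by the full bit width is not a valid IR shift.

```python
# Toy model of fshl/fshr on i8 (the width is an assumption for the example).
BW = 8
MASK = (1 << BW) - 1


def fshl(a, b, c):
    """Concatenate a:b, shift left by c % BW, keep the high BW bits."""
    s = c % BW
    return ((((a << BW) | b) << s) >> BW) & MASK


def fshr(a, b, c):
    """Concatenate a:b, shift right by c % BW, keep the low BW bits."""
    s = c % BW
    return (((a << BW) | b) >> s) & MASK


def shl(x, n):
    return (x << n) & MASK


def lshr(x, n):
    return x >> n


def funnel_folds_hold():
    for x in range(MASK + 1):
        for c in range(1, 3 * BW):
            s = c % BW
            if s == 0:
                continue  # skipped: would need a shift by the full width
            if fshl(x, 0, c) != shl(x, s):
                return False
            if fshl(0, x, c) != lshr(x, BW - s):
                return False
            if fshr(x, 0, c) != shl(x, BW - s):
                return False
            if fshr(0, x, c) != lshr(x, s):
                return False
    return True
```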

Differential Revision: https://reviews.llvm.org/D54778

llvm-svn: 347505
2018-11-23 22:45:08 +00:00
Max Kazantsev e1c2dc27d3 Disable LoopSimplifyCFG terminator folding by default
llvm-svn: 347486
2018-11-23 09:14:53 +00:00
Max Kazantsev cb8e240334 [LoopSimplifyCFG] Don't delete LCSSA Phis
When removing edges, we also update Phi inputs and may end up removing
a Phi if it has only one input. We should not do it for edges that leave the current
loop because these Phis are LCSSA Phis and need to be preserved.

Thanks @dmgreen for finding this!

Differential Revision: https://reviews.llvm.org/D54841

llvm-svn: 347484
2018-11-23 07:56:47 +00:00
Max Kazantsev b565e6093b [NFC] Assert that all blocks staying in loop are live
llvm-svn: 347458
2018-11-22 12:43:27 +00:00
Max Kazantsev 56a2443024 [NFC] Ensure deterministic order of dead exit blocks
llvm-svn: 347457
2018-11-22 12:33:41 +00:00
Max Kazantsev d9f59f8c80 [NFC] Simplify code by using standard exit blocks collection
llvm-svn: 347454
2018-11-22 10:48:30 +00:00
Fedor Sergeev 59246b6bfe [PM] correcting return value for new-pass-manager version of Scalarizer
Obvious mistake missed during D54695 review.

llvm-svn: 347432
2018-11-21 22:01:19 +00:00
Nikita Popov 6f54fb0052 [MergeFuncs] Generate alias instead of thunk if possible
The MergeFunctions pass was originally intended to emit aliases
instead of thunks where possible (unnamed_addr). However, for a
long time this functionality was behind a flag hardcoded to false,
bitrotted and was eventually removed in r309313.

Originally the functionality was first disabled in r108417 due to
lack of support for aliases in Mach-O. I believe that this is no
longer the case nowadays, but I am not really familiar with this area.

In the interest of being conservative, this patch reintroduces the
aliasing functionality behind a default disabled -mergefunc-use-aliases
flag.

Differential Revision: https://reviews.llvm.org/D53285

llvm-svn: 347407
2018-11-21 19:37:19 +00:00
Mikael Holmen b6f76002d9 [PM] Port Scalarizer to the new pass manager.
Patch by: markus (Markus Lavin)

Reviewers: chandlerc, fedor.sergeev

Reviewed By: fedor.sergeev

Subscribers: llvm-commits, Ka-Ka, bjope

Differential Revision: https://reviews.llvm.org/D54695

llvm-svn: 347392
2018-11-21 14:00:17 +00:00
Guozhi Wei c21fba1bab [LoopSink] Add preheader to alias set
This patch fixes PR39695.

The original LoopSink only considers memory aliases in the loop body. But PR39695 shows that instructions following the sink candidate in the preheader should also be checked. This is a conservative patch: it simply adds the whole preheader block to the alias set. It may lose some optimization opportunities, but I think that is very rare because: 1) in the most common case, st/ld to the same address, the load should already be optimized away; 2) the preheader is usually not very large.

Differential Revision: https://reviews.llvm.org/D54659

llvm-svn: 347325
2018-11-20 16:49:07 +00:00
Max Kazantsev c04b5307d1 Recommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches"
The initial version of the patch lacked Phi node updates in destinations of
removed edges. This version contains this update and tests for this situation.

Differential Revision: https://reviews.llvm.org/D54021

llvm-svn: 347289
2018-11-20 05:43:32 +00:00
Reid Kleckner 994a8451ba [Transforms] Prefer static and avoid namespaces, NFC
Put 'static' on three functions in an anonymous namespace as per our
coding style.

Remove the 'namespace llvm {}' around the .cpp file and explicitly
declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'.
I prefer this style for free functions because the compiler will error
out if the .h and .cpp files don't agree on the function name or
prototype.

llvm-svn: 347269
2018-11-19 22:19:05 +00:00
Benjamin Kramer fdd9b4fc8f Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches"
This reverts commits r347183 & r347184. Crashes while building libxml.

llvm-svn: 347260
2018-11-19 20:01:20 +00:00
Vedant Kumar 238533ec2e [InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi
Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi
improves backtrace quality.

Fixes llvm.org/PR38083.

llvm-svn: 347257
2018-11-19 19:55:02 +00:00
Vedant Kumar 4de31bba51 [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock
Add methods to BasicBlock which make it easier to efficiently check
whether a block has N (or more) predecessors.

This can be more efficient than using pred_size(), which is a linear
time operation.

We might consider adding similar methods for successors. I haven't done
so in this patch because succ_size() is already O(1).
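
The idea of bounding the walk can be sketched outside of LLVM. The helpers below are hypothetical stand-ins, not the BasicBlock methods: they answer "exactly N?" and "at least N?" while looking at no more than N + 1 items of an arbitrary iterable, instead of counting everything first.

```python
from itertools import islice


def has_n_items(iterable, n):
    """True if the iterable yields exactly n items; visits at most n + 1."""
    return sum(1 for _ in islice(iterable, n + 1)) == n


def has_n_items_or_more(iterable, n):
    """True if the iterable yields at least n items; visits at most n."""
    return sum(1 for _ in islice(iterable, n)) == n


def counting(iterable, counter):
    """Wrap an iterable and record in counter[0] how many items were pulled."""
    for item in iterable:
        counter[0] += 1
        yield item
```

For example, asking whether a million-element sequence has exactly one item pulls only two elements through `counting` before the answer is known.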

With this patch applied, I measured a 0.065% compile-time reduction in
user time for running `opt -O3` on the sqlite3 amalgamation (30 trials).
The change in mergeStoreIntoSuccessor alone saves 45 million linked list
iterations in a stage2 Release build of llc.

See llvm.org/PR39702 for a harder but more general way of achieving
similar results.

Differential Revision: https://reviews.llvm.org/D54686

llvm-svn: 347256
2018-11-19 19:54:27 +00:00
Benjamin Kramer 2cad359c91 Revert "[LICM] Make LICM able to hoist phis"
This reverts commit r347190.

llvm-svn: 347225
2018-11-19 16:51:57 +00:00
Anna Thomas 5e9215f02b [LV] Avoid vectorizing unsafe dependencies in uniform address
Summary:
Currently, when vectorizing stores to uniform addresses, the only
instance we prevent vectorization is if there are multiple stores to the
same uniform address causing an unsafe dependency.
This patch teaches LAA to avoid vectorizing loops that have an unsafe
cross-iteration dependency between a load and a store to the same uniform address.

Fixes PR39653.

Reviewers: Ayal, efriedma

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D54538

llvm-svn: 347220
2018-11-19 15:39:59 +00:00
John Brawn 12c046fba0 [LICM] Make LICM able to hoist phis
The general approach taken is to make note of loop invariant branches, then when
we see something conditional on that branch, such as a phi, we create a copy of
the branch and (empty versions of) its successors and hoist using that.

This has no impact by itself that I've been able to see, as LICM typically
doesn't see such phis as they will have been converted into selects by the time
LICM is run, but once we start doing phi-to-select conversion later it will be
important.

Differential Revision: https://reviews.llvm.org/D52827

llvm-svn: 347190
2018-11-19 11:31:24 +00:00
Max Kazantsev 8e3e33d138 [LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches
This patch introduces infrastructure and the simplest case for constant-folding
of branch and switch instructions within loop into unconditional branches.
It is useful as a cleanup for such passes as loop unswitching that sometimes
produce such branches.

Only the simplest case is supported in this patch: after the folding, no block
should become dead or stop being part of the loop. Support for more
sophisticated cases will go separately in follow-up patches.

Differential Revision: https://reviews.llvm.org/D54021
Reviewed By: anna

llvm-svn: 347183
2018-11-19 05:54:38 +00:00
Vedant Kumar e7b789b529 [ProfileSummary] Standardize methods and fix comment
Every Analysis pass has a get method that returns a reference of the Result of
the Analysis, for example, BlockFrequencyInfo
&BlockFrequencyInfoWrapperPass::getBFI().  I believe that
ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning
a pointer.

Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock,
respectively.  Most methods use BB in argument or variable names, while
methods usually refer to Basic Blocks as Blocks, instead of BB.  For example,
Function::getEntryBlock, Loop::getExitBlock, etc.

I also fixed one of the comments.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D54669

llvm-svn: 347182
2018-11-19 05:23:16 +00:00
Vedant Kumar 35f504c113 [CorrelatedValuePropagation] Preserve debug locations (PR38178)
Fix all of the missing debug location errors in CVP found by debugify.

This includes the missing-location-after-udiv-truncation case described
in llvm.org/PR38178.

llvm-svn: 347147
2018-11-18 00:29:58 +00:00
Fangrui Song 7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Fedor Sergeev 2e3e224e71 [SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch with
We need to control exponential behavior of loop-unswitch so we do not get
run-away compilation.

The suggested solution is to introduce a multiplier for an unswitch cost that
makes cost prohibitive as soon as there are too many candidates and too
many sibling loops (meaning we have already started duplicating loops
by unswitching).

It does solve the currently known problem with compile-time degradation
(PR 39544).

Tests are built on top of a recently implemented CHECK-COUNT-<num>
FileCheck directives.

Reviewed By: chandlerc, mkazantsev
Differential Revision: https://reviews.llvm.org/D54223

llvm-svn: 347097
2018-11-16 21:16:43 +00:00
Adrian Prantl 83d87520ed GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics.
This fixes PR39669.
https://bugs.llvm.org/show_bug.cgi?id=39669

llvm-svn: 347065
2018-11-16 17:47:21 +00:00
Eugene Leviant bf46e7410c [ThinLTO] Internalize readonly globals
An attempt to recommit r346584 after failure on OSX build bot.
Fixed cache key computation in ThinLTOCodeGenerator and added
test case

llvm-svn: 347033
2018-11-16 07:08:00 +00:00
Xin Tong 642c8d3575 [LTO] Load sample profile in LTO link step.
Summary:
Load sample profile in LTO link step.
ThinLTO calls populateModulePassManager to load the profile

Reviewers: tejohnson, davidxl, danielcdh

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54564

llvm-svn: 346971
2018-11-15 18:06:42 +00:00
Sanjay Patel bc56b2432d [InstCombine] fix rotate narrowing bug for non-pow-2 types
llvm-svn: 346968
2018-11-15 17:19:14 +00:00
Mandeep Singh Grang 0905fc77c1 [InstCombine] Remove a couple of asserts based on incorrect assumptions
Summary:
These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same.
This fixes PR39595.

Reviewers: craig.topper, spatel, dmgreen

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54359

llvm-svn: 346874
2018-11-14 17:55:07 +00:00
Sanjay Patel 6072842770 [InstCombine] fix formatting for matchBSwap(); NFC
We should have a similar function for matching rotate and/or 
funnel shift, so tidy up the related existing call.

llvm-svn: 346871
2018-11-14 16:03:36 +00:00
Florian Hahn 6df11868b5 [VPlan, SLP] Use SmallPtrSet for Candidates.
This slightly improves the candidate handling in getBest().

llvm-svn: 346870
2018-11-14 15:58:40 +00:00
Florian Hahn 02cb67deb9 [VPlan] Remove LLVM_DEBUG from VPlanSlp::dumpBundle.
The caller should take care of only calling it with debug enabled.

llvm-svn: 346860
2018-11-14 13:33:44 +00:00
Florian Hahn 2eca3728ee [VPlan] Update ifdef.
llvm-svn: 346858
2018-11-14 13:21:26 +00:00
Florian Hahn 09e516c54b [VPlan, SLP] Add simple SLP analysis on top of VPlan.
This patch adds an initial implementation of the look-ahead SLP tree
construction described in 'Look-Ahead SLP: Auto-vectorization in the Presence
of Commutative Operations, CGO 2018 by Vasileios Porpodas, Rodrigo C. O. Rocha,
Luís F. W. Góes'.

It returns an SLP tree represented as VPInstructions, with combined
instructions represented as a single, wider VPInstruction.

This initial version does not support instructions with multiple
different users (either inside or outside the SLP tree) or
non-instruction operands; it won't generate any shuffles or
insertelement instructions.

It also just adds the analysis that builds an SLP tree rooted in a set
of stores. It does not include any cost modeling or memory legality
checks. The plan is to integrate it with VPlan based cost modeling, once
available and to only apply it to operations that can be widened.

A follow-up patch will add support for replacing instructions in a
VPlan with their SLP counterparts.

Reviewers: Ayal, mssimpso, rengolin, mkuper, hfinkel, hsaito, dcaballe, vporpo, RKSimon, ABataev

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D4949

llvm-svn: 346857
2018-11-14 13:11:49 +00:00
Florian Hahn 505091a8f2 Recommit r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site).
The underlying problem causing the expensive-check failure was fixed in
rL346769.

llvm-svn: 346843
2018-11-14 10:04:30 +00:00
Reid Kleckner 41390b47de Revert r346810 "Preserve loop metadata when splitting exit blocks"
It broke the Windows self-host:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457

llvm-svn: 346823
2018-11-14 01:47:32 +00:00
Sanjay Patel a139564896 [InstCombine] fold funnel shift amount based on demanded bits
The shift amount of a funnel shift is modulo the scalar bitwidth:
http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic
...so we can use demanded bits analysis on that operand to simplify it
when we have a power-of-2 bitwidth.
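
A short sketch of why the power-of-2 restriction matters, using a hypothetical helper: the modulo by the bit width equals a mask of the low bits only when the width is a power of two, so only then can the high bits of the shift amount be treated as not demanded.

```python
def low_bits_suffice(bw, max_amt=1024):
    """True if (c % bw) == (c & (bw - 1)) for every shift amount below max_amt."""
    mask = bw - 1
    return all(c % bw == (c & mask) for c in range(max_amt))
```

For a width such as 32 the check passes, while for a non-power-of-2 width such as 24 it fails (for instance at `c = 8`).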

This is another step towards canonicalizing {shift/shift/or} to the 
intrinsics in IR.

Differential Revision: https://reviews.llvm.org/D54478

llvm-svn: 346814
2018-11-13 23:27:23 +00:00
Craig Topper 3c87c2a3c5 Preserve loop metadata when splitting exit blocks
LoopUtils.cpp contains a utility that splits a loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)

Patch by Chang Lin (clin1)

Differential Revision: https://reviews.llvm.org/D53876

llvm-svn: 346810
2018-11-13 23:06:49 +00:00
Sanjay Patel f8f12272e8 [InstCombine] canonicalize rotate patterns with cmp/select
The cmp+branch variant of this pattern is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924
...and as discussed there, we probably can't transform
that without a rotate intrinsic. We do have that now
via funnel shift, but we're not quite ready to 
canonicalize IR to that form yet. The case with 'select'
should already be transformed though, so that's this patch.

The sequence with negation followed by masking is what we
use in the backend and partly in clang (though that part 
should be updated).

https://rise4fun.com/Alive/TplC
  %cmp = icmp eq i32 %shamt, 0
  %sub = sub i32 32, %shamt
  %shr = lshr i32 %x, %shamt
  %shl = shl i32 %x, %sub
  %or = or i32 %shr, %shl
  %r = select i1 %cmp, i32 %x, i32 %or
  =>
  %neg = sub i32 0, %shamt
  %masked = and i32 %shamt, 31
  %maskedneg = and i32 %neg, 31
  %shl2 = lshr i32 %x, %masked
  %shr2 = shl i32 %x, %maskedneg
  %r = or i32 %shl2, %shr2
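
The before/after sequences above can be compared in a small model. This is a sketch, not a replacement for the Alive proof: it mirrors the two IR sequences on 32-bit values with hypothetical function names and only considers shift amounts 0..31, where the original sequence is well defined.

```python
# Toy model of the select-based rotate pattern and its masked replacement.
BW = 32
MASK = (1 << BW) - 1


def source(x, shamt):
    # The select on shamt == 0 guards the out-of-range shift by 32.
    if shamt == 0:
        return x
    shr = x >> shamt
    shl = (x << (BW - shamt)) & MASK
    return shr | shl


def target(x, shamt):
    neg = (-shamt) & MASK      # sub i32 0, %shamt
    masked = shamt & 31
    maskedneg = neg & 31
    return (x >> masked) | ((x << maskedneg) & MASK)


def rotate_fold_holds(samples):
    return all(
        source(x, s) == target(x, s) for x in samples for s in range(BW)
    )
```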

llvm-svn: 346807
2018-11-13 22:47:24 +00:00
Florian Hahn 107d0a8756 [CSP, Cloning] Update DuplicateInstructionsInSplitBetween to use DomTreeUpdater.
This patch updates DuplicateInstructionsInSplitBetween to update a DTU
instead of applying updates to the DT directly.

Given that there only are 2 users, also updated them in this patch to
avoid churn.

I slightly moved the code in CallSiteSplitting around to reduce the
places where we have to pass in DTU. If necessary, I could split those
changes in a separate patch.

This fixes missing DT updates when dealing with musttail calls in
CallSiteSplitting, by using DTU->deleteBB.

Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki

Reviewed By: NutshellySima

llvm-svn: 346769
2018-11-13 17:54:43 +00:00
Steven Wu fa43892d6f Revert "[ThinLTO] Internalize readonly globals"
This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2.

llvm-svn: 346768
2018-11-13 17:35:04 +00:00
Florian Hahn a4dc7feeea [VPlan] VPlan version of InterleavedAccessInfo.
This patch turns InterleaveGroup into a template with the instruction type
being a template parameter. It also adds a VPInterleavedAccessInfo class, which
only contains a mapping from VPInstructions to their respective InterleaveGroup.
As we do not have access to scalar evolution in VPlan, we convert
InterleavedAccessInfo to VPInterleavedAccessInfo.


Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D49489

llvm-svn: 346758
2018-11-13 15:58:18 +00:00
Zhizhou Yang cc633af55b Introduce DebugCounter into ConstProp pass
Summary:
This patch introduces DebugCounter into ConstProp pass at per-transformation level.

It will provide an option to skip first n or stop after n transformations for the whole ConstProp pass.

This will make debugging the pass easier and also makes it possible to bisect at the transformation level.

Reviewers: davide, fhahn

Reviewed By: fhahn

Subscribers: llozano, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D50094

llvm-svn: 346720
2018-11-13 00:31:22 +00:00
Sanjay Patel 35b1c2d19d [InstCombine] narrow width of rotate patterns, part 3
This is a longer variant for the pattern handled in
rL346713 
This one includes zexts. 

Eventually, we should canonicalize all rotate patterns 
to the funnel shift intrinsics, but we need a bit more
infrastructure to make sure the vectorizers handle those
intrinsics as well as the shift+logic ops.

https://rise4fun.com/Alive/FMn

Name: narrow rotateright
  %neg = sub i8 0, %shamt
  %rshamt = and i8 %shamt, 7
  %rshamtconv = zext i8 %rshamt to i32
  %lshamt = and i8 %neg, 7
  %lshamtconv = zext i8 %lshamt to i32
  %conv = zext i8 %x to i32
  %shr = lshr i32 %conv, %rshamtconv
  %shl = shl i32 %conv, %lshamtconv
  %or = or i32 %shl, %shr
  %r = trunc i32 %or to i8
  =>
  %maskedShAmt2 = and i8 %shamt, 7
  %negShAmt2 = sub i8 0, %shamt
  %maskedNegShAmt2 = and i8 %negShAmt2, 7
  %shl2 = lshr i8 %x, %maskedShAmt2
  %shr2 = shl i8 %x, %maskedNegShAmt2
  %r = or i8 %shl2, %shr2
llvm-svn: 346716
2018-11-12 22:52:25 +00:00
Sanjay Patel 98e427ccf2 [InstCombine] narrow width of rotate patterns, part 2 (PR39624)
The sub-pattern for the shift amount in a rotate can take on
several different forms, and there's apparently no way to
canonicalize those without seeing the entire rotate sequence.

This is the form noted in:
https://bugs.llvm.org/show_bug.cgi?id=39624

https://rise4fun.com/Alive/qnT

  %zx = zext i8 %x to i32
  %maskedShAmt = and i32 %shAmt, 7
  %shl = shl i32 %zx, %maskedShAmt
  %negShAmt = sub i32 0, %shAmt
  %maskedNegShAmt = and i32 %negShAmt, 7
  %shr = lshr i32 %zx, %maskedNegShAmt
  %rot = or i32 %shl, %shr
  %r = trunc i32 %rot to i8
  =>
  %truncShAmt = trunc i32 %shAmt to i8
  %maskedShAmt2 = and i8 %truncShAmt, 7
  %shl2 = shl i8 %x, %maskedShAmt2
  %negShAmt2 = sub i8 0, %truncShAmt
  %maskedNegShAmt2 = and i8 %negShAmt2, 7
  %shr2 = lshr i8 %x, %maskedNegShAmt2
  %r = or i8 %shl2, %shr2
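
As a sanity check of the narrowing above, the sketch below mirrors both sequences for an 8-bit value rotated inside a 32-bit type. Function names are hypothetical, and the model stands in for the Alive proof rather than for the InstCombine code.

```python
# Toy model of the wide (i32) rotate of a zero-extended i8 and its narrow form.
M32 = (1 << 32) - 1
M8 = (1 << 8) - 1


def wide(x, sh_amt):
    zx = x & M8                              # zext i8 %x to i32
    shl = (zx << (sh_amt & 7)) & M32
    neg = (-sh_amt) & M32
    shr = zx >> (neg & 7)
    return (shl | shr) & M8                  # trunc i32 %rot to i8


def narrow(x, sh_amt):
    t = sh_amt & M8                          # trunc i32 %shAmt to i8
    shl2 = (x << (t & 7)) & M8
    neg2 = (-t) & M8
    shr2 = x >> (neg2 & 7)
    return shl2 | shr2


def narrowing_holds(amounts):
    return all(
        wide(x, s) == narrow(x, s) for x in range(M8 + 1) for s in amounts
    )
```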

llvm-svn: 346713
2018-11-12 22:11:09 +00:00
Sanjay Patel ceab2329b6 [InstCombine] refactor code for matching shift amount of a rotate; NFC
As shown in existing test cases and with:
https://bugs.llvm.org/show_bug.cgi?id=39624
...we're missing at least 2 more patterns for rotate narrowing.

llvm-svn: 346711
2018-11-12 22:00:00 +00:00
Philip Reames b8d8db30ea [GC][InstCombine] Fix a potential iteration issue
Noticed via inspection.  Appears to be largely innocuous in practice, but a slight code change could have resulted in either visit order dependent missed optimizations or infinite loops.  May be a minor compile time problem today.

llvm-svn: 346698
2018-11-12 20:00:53 +00:00
Simon Pilgrim 631f2bf51e [CostModel] Add more realistic SK_ExtractSubvector generic costs.
Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles.

This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type.

llvm-svn: 346656
2018-11-12 14:25:23 +00:00
Max Kazantsev 7d49a3a816 [LICM] Hoist guards from non-header blocks
This patch relaxes overconservative checks on whether or not we could write
memory before we execute an instruction. This allows us to hoist guards out of
loops even if they are not in the header block.

Differential Revision: https://reviews.llvm.org/D50891
Reviewed By: fedor.sergeev

llvm-svn: 346643
2018-11-12 09:29:58 +00:00
Calixte Denizet c6fabeac11 [GCOV] Add options to filter files which must be instrumented.
Summary:
When making code coverage, a lot of files (like the ones coming from /usr/include) are removed when post-processing gcno/gcda, so they don't need to be instrumented or to appear in gcno/gcda.
The goal of the patch is to be able to filter the files we want to instrument. There are several advantages to doing that:
- improve speed (no overhead due to instrumentation on files we don't care about)
- reduce gcno/gcda size
- it gives the possibility to easily instrument only few files (e.g. ones modified in a patch) without changing the build system
- need to accept this patch to be enabled in clang: https://reviews.llvm.org/D52034

Reviewers: marco-c, vsk

Reviewed By: marco-c

Subscribers: llvm-commits, sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D52033

llvm-svn: 346641
2018-11-12 09:01:43 +00:00
Florian Hahn 9026d4ee9b [IPSCCP,PM] Preserve PDT in the new pass manager.
Reviewers: kuhar, chandlerc, NutshellySima, brzycki

Reviewed By: NutshellySima, brzycki

Differential Revision: https://reviews.llvm.org/D54317

llvm-svn: 346618
2018-11-11 20:22:45 +00:00
Sanjay Patel 4a12aa9791 [InstCombine] simplify code for merging stores; NFCI
llvm-svn: 346596
2018-11-10 20:29:25 +00:00
Eugene Leviant be8d19967a [ThinLTO] Internalize readonly globals
This patch allows internalising globals if all accesses to them
(from live functions) are from non-volatile load instructions

Differential revision: https://reviews.llvm.org/D49362

llvm-svn: 346584
2018-11-10 08:31:21 +00:00
Eli Friedman 15930bf352 [JumpThreading] Fix exponential time algorithm computing known values.
ComputeValueKnownInPredecessors has a "visited" set to prevent infinite
loops, since a value can be visited more than once.  However, the
implementation didn't prevent the algorithm from taking exponential
time. Instead of removing elements from the RecursionSet one at a time,
we should keep around the whole set until
ComputeValueKnownInPredecessors finishes, then discard it.
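
The difference between the two strategies can be shown on a toy graph. This is only a sketch under stated assumptions: the graph shape, the `keep_visited` switch and all names are hypothetical, and the walk stands in for the recursion rather than reproducing ComputeValueKnownInPredecessors.

```python
def diamond_chain(depth):
    """Each node uses the next node twice, so paths double at every level."""
    graph = {i: [i + 1, i + 1] for i in range(depth)}
    graph[depth] = []
    return graph


def count_visits(graph, root, keep_visited):
    """Count how many times the recursive walk processes a node."""
    visits = 0
    recursion_set = set()

    def walk(node):
        nonlocal visits
        if node in recursion_set:
            return                      # already seen: give up on this path
        recursion_set.add(node)
        visits += 1
        for operand in graph[node]:
            walk(operand)
        if not keep_visited:
            # Old behaviour in this model: forget the node on the way out,
            # which lets every path through the graph be walked again.
            recursion_set.discard(node)

    walk(root)
    return visits
```

On a chain of depth 10, removing elements one at a time processes 2047 nodes in this model, while keeping the whole set until the walk finishes processes 11.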

The testcase is synthetic because I was having trouble effectively
reducing the original.  But it's basically the same idea.

Instead of failing, we could theoretically cache the result instead.
But I don't think it would help substantially in practice.

Differential Revision: https://reviews.llvm.org/D54239

llvm-svn: 346562
2018-11-09 22:35:26 +00:00
Florian Hahn 9f878e9bae Revert r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site).
This caused a failure with EXPENSIVE_CHECKS

llvm-svn: 346492
2018-11-09 13:28:58 +00:00
Florian Hahn a1062f4b68 [IPSCCP,PM] Preserve DT in the new pass manager.
After D45330, Dominators are required for IPSCCP and can be preserved.

This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.

Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima

Reviewed By: chandlerc, kuhar, NutshellySima

Differential Revision: https://reviews.llvm.org/D47259

llvm-svn: 346486
2018-11-09 11:52:27 +00:00
Florian Hahn 52578f95c9 [CallSiteSplitting] Only record conditions up to the IDom(call site).
We can stop recording conditions once we reach the immediate dominator
of the block containing the call site. Conditions in predecessors of
that node will be the same for all paths to the call site, so splitting
on them is not beneficial.

This patch makes CallSiteSplitting dependent on the DT analysis, because
the immediate dominators seem to be the easiest way of finding the node
to stop at.

I had to update some existing tests, because they were checking for
conditions that were true/false on all paths to the call site. Those
should now be handled by instcombine/ipsccp.

Reviewers: davide, junbuml

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D44627

llvm-svn: 346483
2018-11-09 10:23:46 +00:00
Carlos Alberto Enciso fa9cf89734 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
In SimplifyCFG, when a conditional branch goes to BB1 and BB2, hoisting the common terminator instruction of the two blocks caused the debug line records associated with subsequent select instructions to become ambiguous. As a result, the debugger displayed unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53390

llvm-svn: 346481
2018-11-09 09:42:10 +00:00
Max Kazantsev 9883d1e1a7 [NFC] Add utility function for SafetyInfo updates for moveBefore
llvm-svn: 346472
2018-11-09 05:39:04 +00:00
Florian Hahn a684a99441 [LoopInterchange] Support reductions across inner and outer loop.
This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.
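
As an illustration of the kind of loop nest this targets (a sketch, not taken from the patch or its tests): `total` below is a reduction carried by both the outer and the inner loop. The function names are hypothetical. Because integer addition is associative and commutative, swapping the two loops leaves the result unchanged.

```python
def sum_original(matrix):
    total = 0  # reduction variable that spans the outer and the inner loop
    for i in range(len(matrix)):
        for j in range(len(matrix[0])):
            total += matrix[i][j]
    return total


def sum_interchanged(matrix):
    total = 0  # same reduction after the two loops have been swapped
    for j in range(len(matrix[0])):
        for i in range(len(matrix)):
            total += matrix[i][j]
    return total


matrix = [[1, 2, 3], [4, 5, 6]]
same_result = sum_original(matrix) == sum_interchanged(matrix)
```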

With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.

Fixes https://bugs.llvm.org/show_bug.cgi?id=30472

Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43245

llvm-svn: 346438
2018-11-08 20:44:19 +00:00
Pirama Arumuga Nainar e61652a384 [LTO] Drop non-prevailing definitions only if linkage is not local or appending
Summary:
This fixes PR 37422

In ELF, non-weak symbols can also be non-prevailing.  In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.

Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.

To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in funcimport.ll and
funcimport_var.ll

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D54125

llvm-svn: 346436
2018-11-08 20:10:07 +00:00
Tom Stellard 28d662164d InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfe
Summary:
When the 3rd argument to these intrinsics is zero, lowering them
to shift instructions produces poison values, since we end up with
shift amounts equal to the number of bits in the shifted value.  This
means we can only lower these intrinsics if we can prove that the
3rd argument is not zero.
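
A small model of the problem, assuming the lowering has the form shl followed by lshr on a 32-bit value (the exact instruction sequence in InstCombine is an assumption here). `POISON`, `shl`, `lshr`, and `ubfe_via_shifts` are hypothetical helpers; a shift by an amount equal to or larger than the bit width is modelled as poison.

```python
BITS = 32
MASK = (1 << BITS) - 1
POISON = None  # stand-in for an LLVM poison value


def shl(value, amount):
    """Left shift on a 32-bit value; oversized shift amounts yield poison."""
    if amount >= BITS:
        return POISON
    return (value << amount) & MASK


def lshr(value, amount):
    """Logical right shift on a 32-bit value; oversized amounts yield poison."""
    if amount >= BITS:
        return POISON
    return (value & MASK) >> amount


def ubfe_via_shifts(src, offset, width):
    """Extract `width` bits starting at bit `offset` using shl + lshr.

    Assumes offset + width <= BITS.
    """
    shifted = shl(src, BITS - offset - width)
    if shifted is POISON:
        return POISON
    return lshr(shifted, BITS - width)


extracted = ubfe_via_shifts(0xABCD, 4, 8)   # bits 4..11 of 0xABCD
zero_width = ubfe_via_shifts(0xABCD, 4, 0)  # right shift amount becomes 32
```

With a width of 8 the sketch extracts 0xBC; with a width of 0 the right shift amount equals the bit width and the result is poison, which is the case the patch guards against.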

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D53739

llvm-svn: 346422
2018-11-08 17:57:57 +00:00
Vedant Kumar d6699423f1 [CodeExtractor] Mark functions noreturn when applicable
This eliminates the outlining penalty for llvm.trap/unreachable, because
callers no longer have to emit cleanup/ret instructions after calling an
outlined `noreturn` function.

rdar://45523626

llvm-svn: 346421
2018-11-08 17:57:09 +00:00
Max Kazantsev 266c087b9d Return "[IndVars] Smart hard uses detection"
The patch has been reverted because it ended up prohibiting propagation
of a constant to exit value. For such values, we should skip all checks
related to hard uses because propagating a constant is always profitable.

Differential Revision: https://reviews.llvm.org/D53691

llvm-svn: 346397
2018-11-08 11:54:35 +00:00
Gil Rapaport 7b88bab386 [LSR] Combine unfolded offset into invariant register
LSR reassociates constants as unfolded offsets when the constants fit as
immediate add operands, which currently prevents such constants from being
combined later with loop invariant registers.
This patch modifies GenerateCombinations() to generate a second formula which
includes the unfolded offset in the combined loop-invariant register.

This commit fixes a bug in the original patch (committed at r345114, reverted
at r345123).

Differential Revision: https://reviews.llvm.org/D51861

llvm-svn: 346390
2018-11-08 09:01:19 +00:00
whitequark 73cb978495 [MergeFuncs] Improve ordering of equal functions
Summary:
MergeFunctions currently tries to process strong functions before
weak functions, because weak functions can simply call strong
functions, while a strong/weak function cannot call a weak function
(a backing strong function is needed).

This patch additionally tries to process external functions before
local functions, because we definitely have to keep the external
function, but may be able to drop the local one (and definitely
can if it is also unnamed_addr).
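
The ordering described above could be sketched as a sort key. This is only an illustration: the `Fn` record and `processing_order` are hypothetical, and giving the strong/weak criterion priority over external/local is an assumption that the message does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fn:
    name: str
    is_weak: bool   # weak functions are processed after strong ones
    is_local: bool  # local functions are processed after external ones


def processing_order(functions):
    # False sorts before True, so strong comes before weak and, within each
    # group, external comes before local. sorted() is stable.
    return sorted(functions, key=lambda f: (f.is_weak, f.is_local))


functions = [
    Fn("weak_local", True, True),
    Fn("strong_local", False, True),
    Fn("weak_external", True, False),
    Fn("strong_external", False, False),
]
ordered_names = [f.name for f in processing_order(functions)]
```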

Unfortunately, this exposes an existing bug in the implementation:
The FnTree and FNodesInTree structures can currently go out of
sync in the case where two weak functions are merged, because the
function in FnTree/FNodesInTree is RAUWed. This leaves it behind in
FnTree (this is intended, as it is the strong backing function which
should be used for further merges), while it is replaced in
FNodesInTree (this is not intended).

This is fixed by switching FNodesInTree from using a ValueMap to
using a DenseMap of AssertingVH.

This exposes another minor issue: Currently FNodesInTree is not
cleared after MergeFunctions finishes running. Currently, this is
potentially dangerous (e.g. if something else wants to RAUW a function
with a non-function), but at the very least it is unnecessary/inefficient.
After the change to use AssertingVH it becomes more problematic,
because there are certainly passes that remove functions.

This issue is fixed by clearing FNodesInTree at the end of the pass.

Reviewers: jfb, whitequark

Reviewed By: whitequark

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D53271

llvm-svn: 346386
2018-11-08 03:58:01 +00:00
whitequark 3580ac6125 [MergeFuncs] Call removeUsers() prior to unnamed_addr RAUW
Summary:
For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities.

Fix this by calling removeUsers() prior to RAUW.

Reviewers: jfb, whitequark

Reviewed By: whitequark

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D53262

llvm-svn: 346385
2018-11-08 03:57:55 +00:00
Reid Kleckner b41b372171 [sancov] Put .SCOV* sections into the right comdat groups on COFF
Avoids linker errors about relocations against discarded sections.

This was uncovered during the Chromium clang roll here:
https://chromium-review.googlesource.com/c/chromium/src/+/1321863#message-717516acfcf829176f6a2f50980f7a4bdd66469a

After this change, Chromium's libGLESv2 links successfully for me.

Reviewers: metzman, hans, morehouse

Differential Revision: https://reviews.llvm.org/D54232

llvm-svn: 346381
2018-11-08 00:57:33 +00:00
Rong Xu fb4bcc452c [PGO] Exit early if all count values are zero
If all the edge counts for a function are zero, skip count population and
annotation, as nothing will happen. This can save some compile time.

Differential Revision: https://reviews.llvm.org/D54212

llvm-svn: 346370
2018-11-07 23:51:20 +00:00
Fedor Sergeev f9a02a7006 [SimpleLoopUnswitch] partial unswitch needs to be careful when replacing invariants with constants
When partial unswitch operates on multiple conditions at once, e.g.:
   if (Cond1 || Cond2 || NonInv) ...

it should infer (and replace) values for individual conditions only on one
side of unswitch and not another.

More precisely only these derivations hold true:
   (Cond1 || Cond2) == false  =>  Cond1 == Cond2 == false
   (Cond1 && Cond2) == true   =>  Cond1 == Cond2 == true

Given the way we organize unswitching, this means replacing only on "continue" blocks
and never on "unswitched" ones. Since trivial unswitch does not have "unswitched"
blocks it does not have this problem.
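
The two derivations can be checked with a small truth table. This is an illustrative sketch; `forced_values` is a hypothetical helper that returns the operand values pinned down by a known result, or None when the result does not determine them.

```python
from itertools import product
from operator import and_, or_


def forced_values(op, result):
    """Return (c1, c2) if `op(c1, c2) == result` has exactly one solution."""
    rows = [
        (c1, c2)
        for c1, c2 in product([False, True], repeat=2)
        if op(c1, c2) == result
    ]
    return rows[0] if len(rows) == 1 else None


or_false = forced_values(or_, False)   # both conditions must be false
or_true = forced_values(or_, True)     # three possibilities, nothing is forced
and_true = forced_values(and_, True)   # both conditions must be true
and_false = forced_values(and_, False) # three possibilities, nothing is forced
```

Only the side where the result pins down both operands is safe for replacing the individual conditions with constants.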

Fixes PR 39568.

Reviewers: chandlerc, asbirlea
Differential Revision: https://reviews.llvm.org/D54211

llvm-svn: 346350
2018-11-07 20:05:11 +00:00
Mandeep Singh Grang d47d188b6f [LoopSink] Do not sink instructions into non-cold blocks
Summary: This fixes PR39570.

Reviewers: danielcdh, rnk, bkramer

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54181

llvm-svn: 346337
2018-11-07 18:26:24 +00:00
Florian Hahn ac86038b40 [NewGVN] Make sure we do not add a user to itself.
If we simplify an instruction to itself, we do not need to add a user to
itself. For congruence classes with a defining expression, we already
use a similar logic.

Fixes PR38259.

Reviewers: davide, efriedma, mcrosier

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D51168

llvm-svn: 346335
2018-11-07 17:20:07 +00:00
Sanjay Patel 57a08b3343 [InstCombine] propagate FMF for fcmp+fabs folds
By morphing the instruction rather than deleting and creating a new one,
we retain fast-math-flags and potentially other metadata (profile info?).

llvm-svn: 346331
2018-11-07 16:15:01 +00:00
Sanjay Patel bb521e63af [InstCombine] peek through fabs() when checking isnan()
That should be the end of the missing cases for this fold.
See earlier patches in this series:
rL346321
rL346324

llvm-svn: 346327
2018-11-07 15:44:26 +00:00
Sanjay Patel fa5f146872 [InstCombine] add folds for fcmp Pred fabs(X), 0.0
Similar to rL346321, we had folds for the ordered
versions of these compares already, so add the
unordered siblings for completeness.

llvm-svn: 346324
2018-11-07 15:33:03 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Sanjay Patel 76faf5145d [InstCombine] add fold for fabs(X) u< 0.0
The sibling fold for 'oge' --> 'ord' was already here,
but this half was missing. 

The result of fabs() must be positive or nan, so asking 
if the result is negative or nan is the same as asking 
if the result is nan.
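
A quick numeric check of this reasoning, written as a sketch in Python rather than LLVM IR. `fcmp_ult` is a hypothetical helper that models the unordered-or-less-than predicate: it is true if either operand is NaN or the first is less than the second.

```python
import math


def fcmp_ult(a, b):
    """Model of 'fcmp ult': unordered (either operand NaN) or a < b."""
    return math.isnan(a) or math.isnan(b) or a < b


def fold_holds(x):
    """Check that fabs(x) u< 0.0 gives the same answer as isnan(x)."""
    return fcmp_ult(math.fabs(x), 0.0) == math.isnan(x)


samples = [-2.5, -0.0, 0.0, 1.0, math.inf, -math.inf, math.nan]
all_hold = all(fold_holds(x) for x in samples)
```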

This is another step towards fixing:
https://bugs.llvm.org/show_bug.cgi?id=39475

llvm-svn: 346321
2018-11-07 15:11:32 +00:00
Sanjay Patel de58e93666 fix typos aggressively; NFC
llvm-svn: 346316
2018-11-07 14:35:36 +00:00
Sanjay Patel 7552d0d2e6 [InstCombine] do not shrink switch conditions to illegal types (PR29009)
This patch makes shrinking switch conditions less aggressive. The shrinking was introduced by:
rL274233

Note that we have 2 new bugs to track potential follow-ups that might have solved PR29009
in different ways:
https://bugs.llvm.org/show_bug.cgi?id=39569
https://bugs.llvm.org/show_bug.cgi?id=39578

Patch by:
@dendibakh (Denis Bakhvalov)

Differential Revision: https://reviews.llvm.org/D54115

llvm-svn: 346315
2018-11-07 14:12:41 +00:00
Calixte Denizet c3bed1e8e6 [GCOV] Flush counters before to avoid counting the execution before fork twice and for exec** functions we must flush before the call
Summary:
This is a replacement for the patch in https://reviews.llvm.org/D49460.
When we fork, the counters are duplicated as they are, so the values end up wrong when the gcda files are written for the parent and the child.
So just before the fork, we flush the counters, so that the parent and the child both start with new counters set to zero.
For exec** functions, we need to flush before the call to have some data.
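
The double counting can be modelled with plain integers. This is a simulation only: it does not call fork or the gcov runtime, and `simulate` with its variables is hypothetical. One counter is incremented 3 times before the fork and once in each process afterwards, so the true total is 5.

```python
def simulate(flush_before_fork):
    gcda = 0     # value accumulated in the on-disk profile
    counter = 0  # in-memory counter of the parent process

    counter += 3  # parent executes the instrumented line 3 times

    if flush_before_fork:
        gcda += counter  # write the counts out ...
        counter = 0      # ... and restart from zero

    child_counter = counter  # fork: the child gets a copy of the counter

    counter += 1        # parent executes the line once more
    child_counter += 1  # child executes the line once

    gcda += counter        # parent writes its counts at exit
    gcda += child_counter  # child writes its counts at exit
    return gcda


without_flush = simulate(flush_before_fork=False)
with_flush = simulate(flush_before_fork=True)
```

Without the flush the model reports 8, because the 3 pre-fork executions are written by both processes; with the flush it reports 5.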

Reviewers: vsk, davidxl, marco-c

Reviewed By: marco-c

Subscribers: llvm-commits, sylvestre.ledru, marco-c

Differential Revision: https://reviews.llvm.org/D53593

llvm-svn: 346313
2018-11-07 13:49:17 +00:00
Sanjay Patel d1172a0c20 [IR] add optional parameter for copying IR flags to compare instructions
As shown, this is used to eliminate redundant code in InstCombine,
and there are more cases where we should be using this pattern, but
we're currently unintentionally dropping flags. 

llvm-svn: 346282
2018-11-07 00:00:42 +00:00
Teresa Johnson cb397461e1 [ThinLTO] Split NotEligibleToImport into legality and inlinability flags
Summary:
The NotEligibleToImport flag on the GlobalValueSummary was set if it
isn't legal to import (e.g. because it references unpromotable locals)
and when it can't be inlined (in which case importing is pointless).

I split out the inlinable piece into a separate flag on the
FunctionSummary (doesn't make sense for aliases or global variables),
because in the future we may want to import for reasons other than
inlining.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53345

llvm-svn: 346261
2018-11-06 19:41:35 +00:00