Commit Graph

7070 Commits

Author SHA1 Message Date
Dehao Chen 190f17cae7 Use ProfileSummary:getProfileCount to get ScaledCount for ModuleSummary
Summary: ModuleSummary should use the standard interface of ProfileSummary::getProfileCount.

Reviewers: eraman, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31154

llvm-svn: 298404
2017-03-21 17:22:35 +00:00
Reid Kleckner b518054b87 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393
2017-03-21 16:57:19 +00:00
David Green da21170c49 [ConstantFolding] Fix to prevent constant folding having to repeatedly scan operands. NFCI
After the loop unroll threshold was increased in r295538, very
large constant expressions can be created. This prevents them
from having to be recursively scanned, leading to a compile
time blow-up.

Differential Revision: https://reviews.llvm.org/D30689

llvm-svn: 298356
2017-03-21 10:17:39 +00:00
Eli Friedman b1578d3612 [SCEV] Fix trip multiple calculation
If loop bound containing calculations like min(a,b), the Scalar
Evolution API getSmallConstantTripMultiple returns 4294967295 "-1"
as the trip multiple. The problem is that, SCEV use -1 * umax to
represent umin. The multiple constant -1 was returned, and the logic
of guarding against huge trip counts was skipped. Because -1 has 32
active bits.

The fix attempt to factor more general cases. First try to get the
greatest power of two divisor of trip count expression. In case
overflow happens, the trip count expression is still divisible by the
greatest power of two divisor returned. Returns 1 if not divisible by 2.

Patch by Huihui Zhang <huihuiz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D30840

llvm-svn: 298301
2017-03-20 20:25:46 +00:00
Simon Pilgrim 00b34996b4 Use MutableArrayRef for APFloat::convertToInteger
As discussed on D31074, use MutableArrayRef for destination integer buffers to help assert before stack overflows happen.

llvm-svn: 298253
2017-03-20 14:40:12 +00:00
Xin Tong aef0fcb191 Extract FindAvailablePtrLoadStore out of FindAvailableLoadedValue. NFCI
Summary:
Extract FindAvailablePtrLoadStore out of FindAvailableLoadedValue.
Prepare for upcoming change which will do phi-translation for load on
phi pointer in jump threading SimplifyPartiallyRedundantLoad.

This is in preparation for https://reviews.llvm.org/D30543

Reviewers: efriedma, sanjoy, davide, dberlin

Reviewed By: davide

Subscribers: junbuml, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D30524

llvm-svn: 298216
2017-03-19 15:27:52 +00:00
Brian Gesiak 1640e68728 [Analysis] bitreverse(undef) returns undef
Summary:
The reverse of an artbitrary bitpattern is also an arbitrary
bitpattern.

Reviewers: trentxintong, arsenm, majnemer

Reviewed By: majnemer

Subscribers: majnemer, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D31118

llvm-svn: 298201
2017-03-19 04:40:42 +00:00
Craig Topper 7cfd4a9d7a [ValueTracking] Remove deadish code from computeKnownBitsAddSub.
The code assigned to KnownZero, but later code unconditionally assigned over it. I'm pretty sure the later code can handle the same cases and more equally well.

llvm-svn: 298190
2017-03-18 18:21:46 +00:00
Craig Topper 3eb0d80de0 [ValueTracking] Add APInt::setSignBit and use it to replace ORing with getSignBit which will malloc if the bit width is larger than 64.
llvm-svn: 298180
2017-03-18 04:01:29 +00:00
Eli Friedman f7b060bd3e [SCEV] Use const Loop *L instead of Loop *L. NFC
Use const pointer in the trip count and trip multiple calculations.

Patch by Huihui Zhang <huihuiz@codeaurora.org>

llvm-svn: 298161
2017-03-17 22:19:52 +00:00
Michael Zolotukhin 99de88d1f3 [SCEV] Compute affine range in another way to avoid bitwidth extending.
Summary:
This approach has two major advantages over the existing one:
1. We don't need to extend bitwidth in our computations. Extending
bitwidth is a big issue for compile time as we often end up working with
APInts wider than 64bit, which is a slow case for APInt.
2. When we zero extend a wrapped range, we lose some information (we
replace the range with [0, 1 << src bit width)). Thus, avoiding such
extensions better preserves information.

Correctness testing:
I ran 'ninja check' with assertions that the new implementation of
getRangeForAffineAR gives the same results as the old one (this
functionality is not present in this patch). There were several failures
- I inspected them manually and found out that they all are caused by
the fact that we're returning more accurate results now (see bullet (2)
above).
Without such assertions 'ninja check' works just fine, as well as
SPEC2006.

Compile time testing:
CTMark/Os:
 - mafft/pairlocalalign	-16.98%
 - tramp3d-v4/tramp3d-v4	-12.72%
 - lencod/lencod	-11.51%
 - Bullet/bullet	-4.36%
 - ClamAV/clamscan	-3.66%
 - 7zip/7zip-benchmark	-3.19%
 - sqlite3/sqlite3	-2.95%
 - SPASS/SPASS	-2.74%
 - Average	-5.81%

Performance testing:
The changes are expected to be neutral for runtime performance.

Reviewers: sanjoy, atrick, pete

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30477

llvm-svn: 297992
2017-03-16 21:07:38 +00:00
Oliver Stannard 062041113f [ValueTracking] Out of range shifts might be undef
If it is possible for the RHS of a shift operation to be greater than or equal
to the bit-width, then the result might be undef, and we can't report any known
bits.

In some cases, this was allowing a transformation in instcombine which widened
an undef value from i1 to i32, increasing the range of values that a function
could return.

Differential revision: https://reviews.llvm.org/D30781

llvm-svn: 297724
2017-03-14 10:13:17 +00:00
Jonas Paulsson a48ea231c0 [TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.

Tests updates:

Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll

The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.

Review: Hal Finkel
https://reviews.llvm.org/D29540

llvm-svn: 297705
2017-03-14 06:35:36 +00:00
Anna Thomas a10e3e4c34 [LVI] Add Datalayout to the class LazyValueInfo since all its Impls require it. NFC
llvm-svn: 297583
2017-03-12 14:06:41 +00:00
Sanjoy Das 3f1e8e0102 Use a WeakVH for UnknownInstructions in AliasSetTracker
Summary:
This change solves the same problem as D30726, except that this only
throws out the bathwater.

AST was not correctly tracking and deleting UnknownInstructions via
handles.  The existing code only tracks "pointers" in its
`ASTCallbackVH`, so an UnknownInstruction (that isn't also def'ing a
pointer used by another memory instruction) never gets a
`ASTCallbackVH`.

There are two other ways to solve this problem:

 - Use the `PointerRec` scheme for both known and unknown instructions.
 - Use a `CallbackVH` that erases the offending Instruction from the
   UnknownInstruction list.

Both of the above changes seemed to be significantly (and unnecessarily
IMO) more complex than this.

Reviewers: chandlerc, dberlin, hfinkel, reames

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30849

llvm-svn: 297539
2017-03-11 01:15:48 +00:00
Davide Italiano 574e59786e [ProfileSummaryInfo] Remove unneeded braces. NFCI.
llvm-svn: 297506
2017-03-10 20:50:51 +00:00
Dehao Chen c2048155a0 Refactor the PSI to extract getCallSiteCount and remove checks for profile type.
Summary: There is no need to check profile count as only CallInst will have metadata attached.

Reviewers: eraman

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30799

llvm-svn: 297500
2017-03-10 19:45:16 +00:00
Michael Kuperstein 5fb39a7966 [SLP] Revert everything that has to do with memory access sorting.
This reverts r293386, r294027, r294029 and r296411.

Turns out the SLP tree isn't actually a "tree" and we don't handle
accessing the same packet of loads in several different orders well,
causing miscompiles.

Revert until we can fix this properly.

llvm-svn: 297493
2017-03-10 18:59:07 +00:00
Yaron Keren 1de4792c55 Implement getPassName() for IR printing passes.
llvm-svn: 297442
2017-03-10 07:09:20 +00:00
Dehao Chen 22645eea7a Do not use branch metadata to check if a basic block is hot.
Summary: We should not use that to check basic block hotness as optimization may mess it up.

Reviewers: eraman

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30800

llvm-svn: 297437
2017-03-10 01:44:37 +00:00
Sanjay Patel 962a8431ea [InstSimplify] allow folds for bool vector div/rem
llvm-svn: 297411
2017-03-09 21:56:03 +00:00
Sanjay Patel 2b1f6f4b92 [InstSimplify] vector div/rem with any zero element in divisor is undef
This was suggested as a DAG simplification in the review for rL297026 :
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435253.html
...but let's start with IR since we have actual docs for IR (LangRef).

Differential Revision:
https://reviews.llvm.org/D30665

llvm-svn: 297390
2017-03-09 16:20:52 +00:00
Teresa Johnson d820447212 Perform symbol binding for .symver versioned symbols
Summary:
In a .symver assembler directive like:
.symver name, name2@@nodename
"name2@@nodename" should get the same symbol binding as "name".

While the ELF object writer is updating the symbol binding for .symver
aliases before emitting the object file, not doing so when the module
inline assembly is handled by the RecordStreamer is causing the wrong
behavior in *LTO mode.

E.g. when "name" is global, "name2@@nodename" must also be marked as
global. Otherwise, the symbol is skipped when iterating over the LTO
InputFile symbols (InputFile::Symbol::shouldSkip). So, for example,
when performing any *LTO via the gold-plugin, the versioned symbol
definition is not recorded by the plugin and passed back to the
linker. If the object was in an archive, and there were no other symbols
needed from that object, the object would not be included in the final
link and references to the versioned symbol are undefined.

The llvm-lto2 tests added will give an error about an unused symbol
resolution without the fix.

Reviewers: rafael, pcc

Reviewed By: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D30485

llvm-svn: 297332
2017-03-09 00:19:49 +00:00
Amjad Aboud 5448e989a5 [SLP] Fixed non-deterministic behavior in Loop Vectorizer.
Differential Revision: https://reviews.llvm.org/D30638

llvm-svn: 297257
2017-03-08 05:09:10 +00:00
Sebastian Pop 4a4d245b19 Handle UnreachableInst in isGuaranteedToTransferExecutionToSuccessor
A block with an UnreachableInst does not transfer execution to a successor.
The problem was exposed by GVN-hoist. This patch fixes bug 32153.

Patch by Aditya Kumar.

Differential Revision: https://reviews.llvm.org/D30667

llvm-svn: 297254
2017-03-08 01:54:50 +00:00
Michael Kuperstein 768d013a03 [SLP] Revert r296863 due to miscompiles.
Details and reproducer are on the email thread for r296863.

llvm-svn: 297103
2017-03-06 23:54:51 +00:00
Sanjay Patel 0cb2ee9287 [InstSimplify] refactor related div/rem folds; NFCI
llvm-svn: 297052
2017-03-06 19:08:35 +00:00
Sanjay Patel 79a9ecbe80 [InstSimplify] remove misleading comments; NFC
Div/rem-of-0 does not cause faults/undef (not the same as div/rem-by-0).

llvm-svn: 297029
2017-03-06 16:49:35 +00:00
Sanjoy Das 1bd479dd5c [SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.

r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32.  This change reverses that change
by introducing a separate flag for CompareValueComplexity's threshold.

llvm-svn: 296992
2017-03-05 23:49:17 +00:00
Mohammad Shahid bdac9f30c0 [SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available
for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR.
The fix is to compute the mask for out of order memory accesses while building the vectorizable tree
instead of actual vectorization of vectorizable tree.It also needs to recompute the proper Lane for
external use of vectorizable scalars based on shuffle mask.

Reviewers: mkuper

Differential Revision: https://reviews.llvm.org/D30159

Change-Id: Ide8773ce0ad3562f3cf4d1a0ad0f487e2f60ce5d
llvm-svn: 296863
2017-03-03 10:02:47 +00:00
Hans Wennborg cc4ff78c9d Revert r296575 "[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available"
It caused miscompiles, e.g. in Chromium (PR32109).

llvm-svn: 296654
2017-03-01 18:57:16 +00:00
Igor Laevsky 37cba43604 [BasicAA] Take attributes into account when requesting modref info for a call site
Differential Revision: https://reviews.llvm.org/D29989

llvm-svn: 296617
2017-03-01 13:19:51 +00:00
Mohammad Shahid 175ffa8c35 [SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available
for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR.
The fix is to compute the mask for out of order memory accesses while building the vectorizable tree
instead of actual vectorization of vectorizable tree.

Reviewers: mkuper

Differential Revision: https://reviews.llvm.org/D30159

Change-Id: Id1e287f073fa4959713ba545fa4254db5da8b40d
llvm-svn: 296575
2017-03-01 03:51:54 +00:00
Francis Visoiu Mistrih 262ad16a3a [LCG] Fix EXPENSIVE_CHECKS typo. NFC
Differential Revision: https://reviews.llvm.org/D30434

llvm-svn: 296500
2017-02-28 18:34:55 +00:00
Dehao Chen a60cdd3881 Add function importing info from samplepgo profile to the module summary.
Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import proper functions if necessary. This patch implemented this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that points to functions need to be imported.

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D30053

llvm-svn: 296498
2017-02-28 18:09:44 +00:00
Michael Kuperstein c07cca85fb [SLP] Load sorting should not try to sort things that aren't loads.
We may get a VL where the first element is a load, but the others
aren't. Trying to sort such VLs can only lead to sorrow.

llvm-svn: 296411
2017-02-27 23:18:11 +00:00
Sanjoy Das 39a684d117 [ValueTracking] Don't do an unchecked shift in ComputeNumSignBits
Summary:
Previously we used to return a bogus result, 0, for IR like `ashr %val,
-1`.

I've also added an assert checking that `ComputeNumSignBits` at least
returns 1.  That assert found an already checked in test case where we
were returning a bad result for `ashr %val, -1`.

Fixes PR32045.

Reviewers: spatel, majnemer

Reviewed By: spatel, majnemer

Subscribers: efriedma, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30311

llvm-svn: 296273
2017-02-25 20:30:45 +00:00
Easwaran Raman a8b9cdc9e2 [InlineCost] Move the code in isGEPOffsetConstant to a lambda.
Differential revision: https://reviews.llvm.org/D30112

llvm-svn: 296208
2017-02-25 00:10:22 +00:00
Xin Tong 68ea9aa23a Fix Indentation. NFCI
llvm-svn: 296169
2017-02-24 20:59:26 +00:00
Adam Nemet 2531df3d49 [ORE] Remove ORE.emit{{.+}} functions
Last use was killed in my previous patch. The preferred way is now to
construct the remark, pipe things to it and pass it to ORE.emit.

llvm-svn: 296019
2017-02-23 21:32:53 +00:00
Adam Nemet 41b019a39c [LAA] Remove unused LoopAccessReport
The need for this removed when I converted everything to use the opt-remark
classes directly with the streaming interface.

llvm-svn: 296017
2017-02-23 21:17:36 +00:00
Davide Italiano e122d6885a [ModuleSummaryAnalysis] Don't crash when referencing unnamed globals.
Instead, just be conservative as these are unfrequent enough. Thanks
to Peter Collingbourne for the discussion about this on IRC.

llvm-svn: 295861
2017-02-22 18:53:38 +00:00
Justin Bogner 8281c81413 OptDiag: Add const to some interfaces that don't modify anything. NFC
This needed a const_cast for the dominator tree recalculation in
OptimizationRemarkEmitter, but we do that all over the place already
and it's safe.

llvm-svn: 295812
2017-02-22 07:38:17 +00:00
Sanjoy Das 5cd6c5cacf [ValueTracking] Make poison propagation more aggressive
Summary:
Motivation: fix PR31181 without regression (the actual fix is still in
progress).  However, the actual content of PR31181 is not relevant
here.

This change makes poison propagation more aggressive in the following
cases:

 1. poision * Val == poison, for any Val.  In particular, this changes
    existing intentional and documented behavior in these two cases:
     a. Val is 0
     b. Val is 2^k * N
 2. poison << Val == poison, for any Val
 3. getelementptr is poison if any input is poison

I think all of these are justified (and are axiomatically true in the
new poison / undef model):

1a: we need poison * 0 to be poison to allow transforms like these:

  A * (B + C) ==> A * B + A * C

If poison * 0 were 0 then the above transform could not be allowed
since e.g. we could have A = poison, B = 1, C = -1, making the LHS

  poison * (1 + -1) = poison * 0 = 0

and the RHS

  poison * 1 + poison * -1 = poison + poison = poison

1b: we need e.g. poison * 4 to be poison since we want to allow

  A * 4 ==> A + A + A + A

If poison * 4 were a value with all of their bits poison except the
last four; then we'd not be able to do this transform since then if A
were poison the LHS would only be "partially" poison while the RHS
would be "full" poison.

2: Same reasoning as (1b), we'd like have the following kinds
transforms be legal:

  A << 1 ==> A + A

Reviewers: majnemer, efriedma

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30185

llvm-svn: 295809
2017-02-22 06:52:32 +00:00
Sanjoy Das 7b0b408973 [ValueTracking] clang-format a section I'm about to touch; NFC
(Whitespace only change)

llvm-svn: 295690
2017-02-21 02:42:42 +00:00
Sanjay Patel fe67255961 [InstSimplify] add nsw/nuw (xor X, signbit), signbit --> X
The change to InstCombine in:
https://reviews.llvm.org/D29729
...exposes this missing fold in InstSimplify, so adding this
first to avoid a regression.

llvm-svn: 295573
2017-02-18 21:59:09 +00:00
Easwaran Raman 617f63640b Refactor instruction simplification code in visitors. NFC.
Several visitors check if operands to the instruction are constants,
either as it is or after looking up SimplifiedValues, check if the
result is a constant and update the SimplifiedValues map. This
refactoring splits it into a common function that does the checking of
whether the operands are constants and updating of the SimplifiedValues
table, and an instruction specific part that is implemented by each
instruction visitor as a lambda and passed to the common function.

Differential revision: https://reviews.llvm.org/D30104

llvm-svn: 295552
2017-02-18 17:22:52 +00:00
Justin Bogner d890f95bf6 OptDiag: Decouple backend diagnostics from debug info metadata
This creates and uses a DiagnosticLocation type rather than using
DebugLoc for this purpose in the backend diagnostics. This is NFC for
now, but will allow us to create locations for diagnostics without
having to create new metadata nodes when we don't have a DILocation.

llvm-svn: 295519
2017-02-18 00:42:23 +00:00
Matthew Simpson a899f86054 [LAA] Remove unused code (NFC)
llvm-svn: 295493
2017-02-17 20:46:52 +00:00
Peter Collingbourne 9421c2dc54 AssumptionCache: Disable the verifier by default, move it behind a hidden cl::opt and verify from releaseMemory().
This is a short term solution to the problem that many passes currently fail
to update the assumption cache. In the long term the verifier should not
be controllable with a flag. We should either fix all passes to correctly
update the assumption cache and enable the verifier unconditionally or
somehow arrange for the assumption list to be updated automatically by passes.

Differential Revision: https://reviews.llvm.org/D30003

llvm-svn: 295236
2017-02-15 21:10:09 +00:00