Commit Graph

9705 Commits

Author SHA1 Message Date
Mircea Trofin da92448828 [NFC][MLInliner] Presort instruction successions.
Differential Revision: https://reviews.llvm.org/D87489
2020-09-10 21:40:49 -07:00
Nikita Popov a5168bdb4a [DemandedBits][BDCE] Add support for min/max intrinsics
Add DemandedBits / BDCE support for min/max intrinsics: If the low
bits are not demanded in the result, they also aren't demanded in
the operands.

Differential Revision: https://reviews.llvm.org/D87161
2020-09-10 22:13:31 +02:00
Nikita Popov 99e78cb718 [DemandedBits] Add braces to large if (NFC)
While the if only contains a single statement, it happens to be
a huge switch. Add braces to make this code easier to read.
2020-09-10 22:13:27 +02:00
Christopher Tetreault 7ddfd9b3eb [SVE] Bail from VectorUtils heuristics for scalable vectors
Bail from maskIsAllZeroOrUndef and maskIsAllOneOrUndef prior to iterating over the number of
elements for scalable vectors.

Assert that the mask type is not scalable in possiblyDemandedEltsInMask .

Assert that the types are correct in all three functions.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87424
2020-09-10 12:29:37 -07:00
Max Kazantsev 8c0bbbade1 [NFC] Refactoring in SCEV: add missing `const` qualifiers 2020-09-10 19:06:37 +07:00
Max Kazantsev cde8fc65ae [NFC] Rename variables to avoid name confusion
Name `LI` is used for loop info, loop and load inst at the same
function, which causes a lot of confusion.
2020-09-10 13:41:10 +07:00
Juneyoung Lee a6183d0f02 [ValueTracking] isKnownNonZero, computeKnownBits for freeze
This implements support for isKnownNonZero, computeKnownBits when freeze is involved.

```
  br (x != 0), BB1, BB2
BB1:
  y = freeze x
```

In the above program, we can say that y is non-zero. The reason is as follows:

(1) If x was poison, `br (x != 0)` raised UB
(2) If x was fully undef, the branch again raised UB
(3) If x was non-zero partially undef, say `undef | 1`, `freeze x` will return a nondeterministic value which is also non-zero.
(4) If x was just a concrete value, it is trivial

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D75808
2020-09-10 08:07:38 +09:00
Mircea Trofin 4b15fc9ddb [NFC][MLInliner] Don't initialize in an assert.
Since the build bots have assertions enabled, this flew under the radar.
2020-09-09 09:56:07 -07:00
Juneyoung Lee 25ce1e0497 [ValueTracking] Add UndefOrPoison/Poison-only version of relevant functions
This patch adds isGuaranteedNotToBePoison and programUndefinedIfUndefOrPoison.

isGuaranteedNotToBePoison will be used at D75808. The latter function is used at isGuaranteedNotToBePoison.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84242
2020-09-09 20:00:26 +09:00
Max Kazantsev 795e4ee9d2 [NFC] Move functon from IndVarSimplify to SCEV
This function can be reused in other places.

Differential Revision: https://reviews.llvm.org/D87274
Reviewed By: fhahn, lebedev.ri
2020-09-09 11:20:59 +07:00
Krzysztof Parzyszek 055d209589 Handle masked loads and stores in MemoryLocation/Dependence
Differential Revision: https://reviews.llvm.org/D87061
2020-09-08 19:08:44 -05:00
Nikita Popov 8453fbf088 [ValueTracking] Compute known bits of min/max intrinsics
Implement known bits for the min/max intrinsics based on the
recently added KnownBits primitives.
2020-09-08 21:08:17 +02:00
Nikita Popov e97f3b1b43 [InstCombine] Fold abs of known negative operand
If we know that the abs operand is known negative, we can replace
it with a neg.

To avoid computing known bits twice, I've removed the fold for the
non-negative case from InstSimplify. Both the non-negative and the
negative case are handled by InstCombine now, with one known bits call.

Differential Revision: https://reviews.llvm.org/D87196
2020-09-08 20:14:35 +02:00
Jay Foad 5350e1b509 [KnownBits] Implement accurate unsigned and signed max and min
Use the new implementation in ValueTracking, SelectionDAG and
GlobalISel.

Differential Revision: https://reviews.llvm.org/D87034
2020-09-07 09:09:01 +01:00
dongAxis 1fd7dc4074 When dumping results of StackLifetime, it will print the following
log:

BB  [7, 8): begin {}, end {}, livein {}, liveout {}
BB  [1, 2): begin {}, end {}, livein {}, liveout {}
...

But it is not convenient to know what the basic block is.
So I add the basic block name to it.

Reviewed By: vitalybuka
TestPlan: check-llvm
Differential Revision: https://reviews.llvm.org/D87152
2020-09-07 11:43:16 +08:00
Nikita Popov b536cbaac5 [ValueTracking] Avoid known bits fallback for non-zero get check (NFCI)
The known bits fall back will never be able to infer a non-null
value here, so don't bother.
2020-09-06 23:16:38 +02:00
Nikita Popov ff218cbc84 [InstSimplify] Fold degenerate abs of abs form
This addresses the remaining issue from D87188. Due to a series of
folds, we may end up with abs-of-abs represented as
x == 0 ? -abs(x) : abs(x). Rather than recognizing this as a special
abs pattern and doing an abs-of-abs fold on it afterwards,
I'm directly folding this to one of the select operands in InstSimplify.

The general pattern falls into the "select with operand replaced"
category, but that fold is not powerful enough to recognize that
both hands of the select are the same for value zero.

Differential Revision: https://reviews.llvm.org/D87197
2020-09-06 09:43:08 +02:00
Florian Hahn 1ddb3a369f [LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments.
This adjusts the description of `llvm.memcpy` to also allow operands
to be equal. This is in line with what Clang currently expects.

This change is intended to be temporary and followed by re-introduce
a variant with the non-overlapping guarantee for cases where we can
actually ensure that property in the front-end.

See the links below for more details:
http://lists.llvm.org/pipermail/cfe-dev/2020-August/066614.html
and PR11763.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D86815
2020-09-05 19:18:23 +01:00
Nikita Popov ac87480bd8 [SCEV] Recognize min/max intrinsics
Recognize umin/umax/smin/smax intrinsics and convert them to the
already existing SCEV nodes of the same name.

In the future we'll want SCEVExpander to also produce the intrinsics,
but we're not ready for that yet.

Differential Revision: https://reviews.llvm.org/D87160
2020-09-05 16:30:11 +02:00
Nikita Popov 73104b0751 [InstSimplify] Fold min/max based on dominating condition
If we have a dominating condition that x >= y, then umax(x, y) is x,
etc. I'm doing this in InstSimplify as the corresponding transform
for the select form is also done there.

Differential Revision: https://reviews.llvm.org/D87168
2020-09-05 16:16:40 +02:00
Arthur Eubanks c9771391ce [NewPM][Lint] Port -lint to NewPM
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87057
2020-09-03 13:03:44 -07:00
Arthur Eubanks e440b4933a Revert "[NewPM][Lint] Port -lint to NewPM"
This reverts commit 883399c840.
2020-09-02 21:34:29 -07:00
Arthur Eubanks 883399c840 [NewPM][Lint] Port -lint to NewPM
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87057
2020-09-02 21:13:01 -07:00
Craig Topper b16e8687ab [CodeGenPrepare][X86] Teach optimizeGatherScatterInst to turn a splat pointer into GEP with scalar base and 0 index
This helps SelectionDAGBuilder recognize the splat can be used as a uniform base.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D86371
2020-09-02 20:44:12 -07:00
Eli Friedman 96ef6998df [InstCombine] Fix a couple crashes with extractelement on a scalable vector.
Differential Revision: https://reviews.llvm.org/D86989
2020-09-02 18:02:07 -07:00
Jordan Rupprecht c90f15d25a [NFC] Fix unused var in release build 2020-09-01 13:05:56 -07:00
Florian Hahn 0d966ae4b2 [Loads] Add canReplacePointersIfEqual helper.
This patch adds an initial, incomeplete and unsound implementation of
canReplacePointersIfEqual to check if a pointer value A can be replaced
by another pointer value B, that are deemed to be equivalent through
some means (e.g. information from conditions).

Note that is in general not sound to blindly replace pointers based on
equality, for example if they are based on different underlying objects.

LLVM's memory model is not completely settled as of now; see
https://bugs.llvm.org/show_bug.cgi?id=34548 for a more detailed
discussion.

The initial version of canReplacePointersIfEqual only rejects a very
specific case: replacing a pointer with a constant expression that is
not dereferenceable. Such a replacement is problematic and can be
restricted relatively easily without impacting most code. Using it to
limit replacements in GVN/SCCP/CVP only results in small differences in
7 programs out of MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto.

This patch is supposed to be an initial step to improve the current
situation and the helper should be made stricter in the future. But this
will require careful analysis of the impact on performance.

Reviewed By: aqjune

Differential Revision: https://reviews.llvm.org/D85524
2020-09-01 20:57:41 +01:00
Alina Sbirlea c292fba46f [MemorySSA] Update phi map with replacement value. 2020-09-01 11:56:40 -07:00
Alina Sbirlea 63844c116a [MemorySSA] Clean up single value phis.
MemoryPhis with a single value are correct, but can lead to errors when
updating. Clean up single entry Phis newly added when cloning blocks.
Resolves PR46574.
2020-08-31 19:26:08 -07:00
Nikita Popov 88b310f64b [InstSimplify] Reduce code duplication in simplifySelectWithICmpCond (NFC)
Canonicalize icmp ne to icmp eq and implement all the folds only once.
2020-08-29 22:38:49 +02:00
Nikita Popov a5be86fde5 [InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)
Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.

This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.

This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.

Differential Revision: https://reviews.llvm.org/D86834
2020-08-29 21:59:39 +02:00
Nikita Popov a400a61721 [LVI] Remove unnecessary lambda capture (NFC) 2020-08-29 21:33:19 +02:00
Nikita Popov 6d88f6efd4 Reapply [LVI] Normalize pointer behavior
This got reverted because a dependency was reverted. It has since
been reapplied, so reapply this as well.

-----

Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.

The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.

This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this more powerful than the previous implementation as
a side-effect, and this does actually seem benefitial in practice.

Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.

I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.

Differential Revision: https://reviews.llvm.org/D69914
2020-08-29 21:17:03 +02:00
Roman Lebedev c1b3e32118
[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable
See
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
and
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html

InstSimply is not allowed to perform simplifications to instructions
that are not def-reachable from the original instruction.
2020-08-29 09:58:08 +03:00
Owen Anderson ed90f15efb Revert "[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block"
This reverts commit 6102310d81.  It
appears to cause compilation non-determinism and caused stage3
mismatches.
2020-08-28 23:43:42 +00:00
serge-sans-paille 2296182181 Skip analysis re-computation when no changes are reported
This is a follow-up to https://reviews.llvm.org/D80707, generalized to
CallGraphSCC, Loop and Region

Differential Revision: https://reviews.llvm.org/D86442
2020-08-28 21:41:01 +02:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Florian Hahn fd6ebea50d [MemLoc] Support memcmp in MemoryLocation::getForArgument.
This patch adds support for memcmp in MemoryLocation::getForArgument.
memcmp reads from the first 2 arguments up to the number of bytes of the
third argument.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86725
2020-08-28 10:19:54 +01:00
Martin Storsjö db1ec04963 [ValueTracking] Remove a stray semicolon. NFC.
This silences warnings when built with GCC at least.
2020-08-28 09:24:10 +03:00
serge-sans-paille b1f4e5979b (Expensive) Check for Loop, SCC and Region pass return status
This generalizes the logic introduced in https://reviews.llvm.org/D80916 to
other passes.

It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report
their status.

Differential Revision: https://reviews.llvm.org/D86589
2020-08-28 07:56:35 +02:00
Alina Sbirlea d370836c20 [MemorySSA] Assert defining access is not a MemoryUse. 2020-08-27 18:21:10 -07:00
Vitaly Buka 23524fdece [ValueTracking] Replace recursion with Worklist
Now findAllocaForValue can handle nontrivial phi cycles.
2020-08-27 14:44:49 -07:00
Vitaly Buka a40660551e [StackSafety] Ignore allocas with partial lifetime markers
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86672
2020-08-27 13:54:41 -07:00
Vitaly Buka a6927c8621 [NFC][ValueTracking] Add OffsetZero into findAllocaForValue
For StackLifetime after finding alloca we need to check that
values ponting to the begining of alloca.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86692
2020-08-27 13:46:22 -07:00
Roman Lebedev b85f91fdce
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is basic block.
2020-08-27 22:32:03 +03:00
Simon Moll c48b06c44f [sda][nfc] clang-formatting 2020-08-27 18:27:44 +02:00
Roman Lebedev 6102310d81
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So i would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
2020-08-27 18:47:04 +03:00
Vitaly Buka 469debe027 [ValueTracking] Support select in findAllocaForValue 2020-08-27 02:13:52 -07:00
Nikita Popov d7c119d89c [InstSimplify] Fold min/max intrinsic based on icmp of operands
This is a reboot of D84655, now performing the inner icmp
simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp()
fold based on this, by moving the logic into icmp of min/max instead,
making it more general. We can't drop the folds for constant operands,
because those also allow undef, which we exclude here.

The tests use assumes for exhaustive coverage, and have a few
more examples of misc folds we get based on icmp simplification.

Differential Revision: https://reviews.llvm.org/D85929
2020-08-26 22:02:57 +02:00
Arthur Eubanks 098d3f9827 [InstSimplify] Simplify to vector constants when possible
InstSimplify should do all transformations that ConstProp does, but
one thing that ConstProp does that InstSimplify wouldn't is inline
vector instructions that are constants, e.g. into a ret.

Previously vector instructions wouldn't be inlined in InstSimplify
because llvm::Simplify*Instruction() would return nullptr for specific
instructions, such as vector instructions that were actually constants,
if it couldn't simplify them.

This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and
SimplifyShuffleVectorInst to return a vector constant when possible.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85946
2020-08-26 11:40:36 -07:00
Juneyoung Lee 684b43c0cf [IR] Add NoUndef attribute to Intrinsics.td
This patch adds NoUndef to Intrinsics.td.
The attribute is attached to llvm.assume's operand, because llvm.assume(undef)
is UB.
It is attached to pointer operands of several memory accessing intrinsics
as well.

This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check
unnecessary, so it is removed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86576
2020-08-27 02:54:48 +09:00
Mircea Trofin 7cfcecece0 [MLInliner] Simplify TFUTILS_SUPPORTED_TYPES
We only need the C++ type and the corresponding TF Enum. The other
parameter was used for the output spec json file, but we can just
standardize on the C++ type name there.

Differential Revision: https://reviews.llvm.org/D86549
2020-08-25 14:19:39 -07:00
Juneyoung Lee f753f5b050 [ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands
This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands.

Instead of special-casing llvm.assume, I think it is also a viable option to
add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86477
2020-08-26 04:40:21 +09:00
Nikita Popov 3a54b6a4b7 [MemDep] Use BatchAA when computing pointer dependencies
We're not changing IR while running a single MemDep query, so it's
safe to cache alias analysis results using BatchAA. This adds BatchAA
usage to getSimplePointerDependencyFrom(), which is non-intrusive --
covering larger parts (like a whole processNonLocalLoad query) is
also possible, but requires threading BatchAA through a bunch of APIs.

For the ThinLTO configuration, this is a 1% geomean improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D85583
2020-08-25 21:34:34 +02:00
Ta-Wei Tu abbd652dd6 [LoopNest] False negative of `arePerfectlyNested` with LCSSA loops
Summary: The LCSSA pass (required for all loop passes) sometimes adds
additional blocks containing LCSSA variables, and checkLoopsStructure
may return false even when the loops are perfectly nested in this case.
This is because the successor of the exit block of the inner loop now
points to the LCSSA block instead of the latch block of the outer loop.
Examples are shown in the test nests-with-lcssa.ll.

To fix the issue, the successor of the exit block of the inner loop can
now point to a block in which all instructions are LCSSA phi node
(except the terminator), and the sole successor of that block should
point to the latch block of the outer loop.

Reviewed By: Whitney, etiotto

Differential Revision: https://reviews.llvm.org/D86133
2020-08-25 16:20:52 +00:00
Mircea Trofin 8c63df2416 [MLInliner] Support training that doesn't require partial rewards
If we use training algorithms that don't need partial rewards, we don't
need to worry about an ir2native model. In that case, training logs
won't contain a 'delta_size' feature either (since that's the partial
reward).

Differential Revision: https://reviews.llvm.org/D86481
2020-08-24 17:36:29 -07:00
Sam Parker 2e194fe73b [SCEV] Still trying to fix windows buildbots 2020-08-24 10:26:48 +01:00
Alina Sbirlea f55ad3973d [DomTree] Extend update API to allow a post CFG view.
Extend the `applyUpdates` in DominatorTree to allow a post CFG view,
different from the current CFG.
This patch implements the functionality of updating an already up to
date DT, to the desired PostCFGView.
Combining a set of updates towards an up to date DT and a PostCFGView is
not yet supported.

Differential Revision: https://reviews.llvm.org/D85472
2020-08-21 17:23:08 -07:00
Arthur Eubanks b79889c2b1 [opt][NewPM] Add basic-aa in legacy PM compatibility mode
The legacy PM alias analysis pipeline by default includes basic-aa.
When running `opt -foo-pass` under the NPM and -disable-basic-aa is not
specified, use basic-aa.

This decreases the number of check-llvm failures under NPM from 913 to 752.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D86167
2020-08-21 14:05:07 -07:00
Roman Lebedev 5d7c5a5e99
[NFC] Port InstCount pass to new pass manager 2020-08-21 12:39:42 +03:00
Yevgeny Rouban 18bc400f97 [NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks
Both AfterPass and AfterPassInvalidated pass instrumentation
callbacks get additional parameter of type PreservedAnalyses.
This patch was created by @fedor.sergeev. I have just slightly
changed it.

Reviewers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81555
2020-08-21 16:10:42 +07:00
David Green 2b69efded0 [ARM][LV] Add a preferPredicatedReductionSelect target hook
As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to tail predicate most
reduction loops.

Differential Revision: https://reviews.llvm.org/D85980
2020-08-21 08:48:12 +01:00
Sanjay Patel 6f3511a01a [ValueTracking] define/use max recursion depth in header
There's a potential motivating case to increase this limit in PR47191:
http://bugs.llvm.org/PR47191

But first we should make it less hacky. The limit in InstCombine is directly tied
to this value because an increase there can cause asserts in the underlying value
tracking calls if not changed together. The usage in VectorUtils is independent,
but the comment suggests that we should use the same value unless there's a known
reason to diverge. There are similar limits in codegen analysis, but I think we
should leave those independent in case we intentionally want the optimization
power/cost to be different there.

Differential Revision: https://reviews.llvm.org/D86113
2020-08-19 16:56:59 -04:00
Mehdi Amini a407ec9b6d Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private.""
Was reverted because MLIR/Flang builds were broken, these APIs have been
fixed in the meantime.
2020-08-19 17:26:36 +00:00
Mehdi Amini 4fc56d70aa Revert "[NFC][llvm] Make the contructors of `ElementCount` private."
This reverts commit 264afb9e6a.
(and dependent 6b742cc48 and fc53bd610f)

MLIR/Flang are broken.
2020-08-19 17:21:37 +00:00
Francesco Petrogalli 264afb9e6a [NFC][llvm] Make the contructors of `ElementCount` private.
Differential Revision: https://reviews.llvm.org/D86120
2020-08-19 16:26:44 +00:00
Mircea Trofin 62fc44ca3c [MLInliner] In development mode, obtain the output specs from a file
Different training algorithms may produce models that, besides the main
policy output (i.e. inline/don't inline), produce additional outputs
that are necessary for the next training stage. To facilitate this, in
development mode, we require the training policy infrastructure produce
a description of the outputs that are interesting to it, in the form of
a JSON file. We special-case the first entry in the JSON file as the
inlining decision - we care about its value, so we can guide inlining
during training - but treat the rest as opaque data that we just copy
over to the training log.

Differential Revision: https://reviews.llvm.org/D85674
2020-08-17 16:56:47 -07:00
Tyker a79e604462 [AssumeBundles] Fix Bug in Assume Queries
this bug was causing miscompile.
now clang cant properly selfhost with -mllvm --enable-knowledge-retention

Reviewed By: jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D83507
2020-08-17 21:36:53 +02:00
Dávid Bolvanský 0f14b2e6cb Revert "[BPI] Improve static heuristics for integer comparisons"
This reverts commit 50c743fa71. Patch will be split to smaller ones.
2020-08-17 20:44:33 +02:00
Simon Pilgrim c1f6ce0c73 [DemandedBits] Improve accuracy of Add propagator
The current demand propagator for addition will mark all input bits at and right of the alive output bit as alive. But carry won't propagate beyond a bit for which both operands are zero (or one/zero in the case of subtraction) so a more accurate answer is possible given known bits.

I derived a propagator by working through truth tables and using a bit-reversed addition to make demand ripple to the right, but I'm not sure how to make a convincing argument for its correctness in the comments yet. Nevertheless, here's a minimal implementation and test to get feedback.

This would help in a situation where, for example, four bytes (<128) packed into an int are added with four others SIMD-style but only one of the four results is actually read.

Known A:     0_______0_______0_______0_______
Known B:     0_______0_______0_______0_______
AOut:        00000000001000000000000000000000
AB, current: 00000000001111111111111111111111
AB, patch:   00000000001111111000000000000000

Committed on behalf of: @rrika (Erika)

Differential Revision: https://reviews.llvm.org/D72423
2020-08-17 12:54:09 +01:00
Cullen Rhodes 2ccde3c96b [InlineCost] Fix scalable vectors in visitAlloca
Discovered as part of the VLS type work (see D85128).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85848
2020-08-17 10:34:27 +00:00
Vitaly Buka 3b348d9102 [NFC][StackSafety] Move out sort from the loop 2020-08-17 03:30:14 -07:00
Vitaly Buka e10e7829bf [StackSafety] Skip ambiguous lifetime analysis
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D84630
2020-08-16 18:05:52 -07:00
Vitaly Buka 47552a614a [StackSafety] Change how callee searched in index
Handle other than local linkage types.
2020-08-16 04:37:19 -07:00
Wenlei He 577e58bcc7 [InlineAdvisor] New inliner advisor to replay inlining from optimization remarks
This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context.

A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose.

This is a resubmit of https://reviews.llvm.org/D83743
2020-08-15 20:17:21 -07:00
Vitaly Buka fc4fd89852 [StackSafety] Use ValueInfo in ParamAccess::Call
This avoid GUID lookup in Index.findSummaryInModule.
Follow up for D81242.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D85269
2020-08-14 12:42:44 -07:00
Matt Morehouse 891b2be85d Revert "[NFC][StackSafety] Move out sort from the loop"
This reverts commit 0426e28419 due to ASan
buildbot failure.
2020-08-14 08:17:35 -07:00
Vitaly Buka 4c30d4b4e5 [NFC][StackSafety] Change map key comparison 2020-08-14 04:23:15 -07:00
Vitaly Buka 0426e28419 [NFC][StackSafety] Move out sort from the loop 2020-08-14 04:19:10 -07:00
Vitaly Buka 798eb71c3a [NFC][StackSafety] Dedup callees 2020-08-14 01:14:52 -07:00
Dávid Bolvanský 50c743fa71 [BPI] Improve static heuristics for integer comparisons
Similarly as for pointers, even for integers a == b is usually false.

GCC also uses this heuristic.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D85781
2020-08-13 19:54:27 +02:00
Simon Pilgrim 63863451d1 Fix unused variable warning. NFC.
Reduce the dyn_cast<> to a isa<> as that's all non-assert builds require, and move the cast<> inside the assert.
2020-08-13 15:43:20 +01:00
Dávid Bolvanský f9264995a6 Revert "[BPI] Improve static heuristics for integer comparisons"
This reverts commit 44587e2f7e. Sanitizer tests need to be updated.
2020-08-13 14:37:40 +02:00
Dávid Bolvanský 44587e2f7e [BPI] Improve static heuristics for integer comparisons
Similarly as for pointers, even for integers a == b is usually false.

GCC also uses this heuristic.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D85781
2020-08-13 14:23:58 +02:00
Dávid Bolvanský a0485421d2 Revert "[BPI] Improve static heuristics for integer comparisons"
This reverts commit 385c9d673f.
2020-08-13 12:59:15 +02:00
Dávid Bolvanský 385c9d673f [BPI] Improve static heuristics for integer comparisons
Similarly as for pointers, even for integers a == b is usually false.

GCC also uses this heuristic.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D85781
2020-08-13 12:45:40 +02:00
Ali Tamur 0581c0b0ee Revert "[SCEV] Look through single value PHIs."
This reverts commit e441b7a7a0.

This patch causes a compile error in tensorflow opensource project. The stack trace looks like:

Point of crash:
llvm/include/llvm/Analysis/LoopInfoImpl.h : line 35

(gdb) ptype *this
type = const class llvm::LoopBase<llvm::BasicBlock, llvm::Loop> [with BlockT = llvm::BasicBlock, LoopT = llvm::Loop]

(gdb) p *this
$1 = {ParentLoop = 0x0, SubLoops = std::vector of length 0, capacity 0, Blocks = std::vector of length 0, capacity 1,
  DenseBlockSet = {<llvm::SmallPtrSetImpl<llvm::BasicBlock const*>> = {<llvm::SmallPtrSetImplBase> = {<llvm::DebugEpochBase> = {Epoch = 3}, SmallArray = 0x1b2bf6c8, CurArray = 0x1b2bf6c8,
        CurArraySize = 8, NumNonEmpty = 0, NumTombstones = 0}, <No data fields>}, SmallStorage = {0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, IsInvalid = true}

(gdb) p *this->DenseBlockSet->CurArray
$2 = (const void *) 0xfffffffffffffffe

I will try to get a case from tensorflow or use creduce to get a small case.
2020-08-12 23:13:24 -07:00
Nikita Popov eba5f5f798 [ValueTracking] Add abs intrinsics support to computeConstantRange()
Implementation is the same as for SPF_ABS.
2020-08-12 22:28:46 +02:00
Nikita Popov e2040d38a1 [ValueTracking] Support min/max intrinsics in computeConstantRange()
The implementation is the same as for the SPF_* case.
2020-08-12 22:07:29 +02:00
Craig Topper a7a06ded8b Recommit "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and its follow up patches
This recommits the following patches now that D85684 has landed

1cf6f210a2 [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2 [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
2020-08-12 10:45:27 -07:00
Florian Hahn e441b7a7a0 [SCEV] Look through single value PHIs.
Now that SCEVExpander can preserve LCSSA form,
we do not have to worry about LCSSA form when
trying to look through PHIs. SCEVExpander will take
care of inserting LCSSA PHI nodes as required.

This increases precision of the analysis in some cases.

Reviewed By: mkazantsev, bmahjour

Differential Revision: https://reviews.llvm.org/D71539
2020-08-12 10:03:42 +01:00
Nikita Popov 06d567059e [InstSimplify] Respect CanUseUndef in more places
Similar to what we do in IIQ, add an isUndefValue() helper that
checks for undef values while respective CanUseUndef. This makes
it much easier to search for places that don't respect the flag
yet.
2020-08-11 21:53:33 +02:00
Dávid Bolvanský d68a2859ab [BPI] Teach BPI about bcmp function
bcmp is similar to memcmp
2020-08-11 20:44:53 +02:00
Nikita Popov d110d4aaff [InstSimplify] Forbid undef folds in expandBinOp
This is the replacement for D84250 based on D84792. As we recursively
fold with the same value twice, we need to disable undef folds,
to prevent an undef from being folded to two different values.

Reverting rG00f3579aea6e3d4a4b7464c3db47294f71cef9e4 and using the
test case from https://reviews.llvm.org/D83360#2145793, it no longer
performs the incorrect fold.

Differential Revision: https://reviews.llvm.org/D85684
2020-08-11 18:39:24 +02:00
Sanjay Patel 1470ce4a76 [InstSimplify] fold min/max with matching min/max operands
I think this is the last remaining translation of an existing
instcombine transform for the corresponding cmp+sel idiom.

This interpretation is more general though - we can remove
mismatched signed/unsigned combinations in addition to the
more obvious cases.

min/max(X, Y) must produce X or Y as the result, so this is
just another clause in the existing transform that was already
matching a min/max of min/max.
2020-08-11 11:23:15 -04:00
Florian Hahn 3483c28c5b [SCEV] ] If RHS >= Start, simplify (Start smax RHS) to RHS for trip counts.
This is the max version of D85046.

This change causes binary changes in 44 out of 237 benchmarks (out of
MultiSource/SPEC2000/SPEC2006)

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D85189
2020-08-11 13:20:24 +01:00
Juneyoung Lee 63b5b92bc9 [LazyValueInfo] Let getEdgeValueLocal look into freeze instructions
This patch makes getEdgeValueLocal more precise when a freeze instruction is
given, by adding support for freeze into constantFoldUser

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84629
2020-08-11 16:39:34 +09:00
Thomas Lively 514445e035 [WebAssembly][ConstantFolding] Fold fp-to-int truncation intrinsics
Constant fold both the trapping and saturating versions of the
WebAssembly truncation intrinsics. The tests are adapted from the
WebAssembly spec tests for the corresponding instructions.

Requested in PR46982.

Differential Revision: https://reviews.llvm.org/D85392
2020-08-10 12:40:05 -07:00
Mircea Trofin 211117b660 [NFC][MLInliner] remove curly braces for a few sinle-line loops 2020-08-10 09:32:21 -07:00
Mircea Trofin d5c81be3ca [NFC][MLInliner] Set up the logger outside the development mode advisor
This allows us to subsequently configure the logger for the case when we
use a model evaluator and want to log additional outputs.

Differential Revision: https://reviews.llvm.org/D85577
2020-08-10 09:22:17 -07:00