Commit Graph

5989 Commits

Author SHA1 Message Date
Kazu Hirata e53472de68 [Transforms] Use llvm::append_range (NFC) 2021-01-20 21:35:54 -08:00
Kazu Hirata 8f5da41c4d [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-20 21:35:52 -08:00
Dávid Bolvanský bb3f169b59 [BuildLibcalls, Attrs] Support more variants of C++'s new, add attributes for C++'s delete
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95095
2021-01-21 00:12:37 +01:00
Mircea Trofin 95ce32c787 [NFC] Move ImportedFunctionsInliningStatistics to Analysis
This is related to D94982. We want to call these APIs from the Analysis
component, so we can't leave them under Transforms.

Differential Revision: https://reviews.llvm.org/D95079
2021-01-20 13:18:03 -08:00
Nikita Popov 1c6d1e57c1 [PredicateInfo] Handle logical and/or
Teach PredicateInfo to handle logical and/or the same way as
bitwise and/or. This allows handling logical and/or inside IPSCCP
and NewGVN.
2021-01-20 21:03:07 +01:00
Nikita Popov ca4ed1e7ae [PredicateInfo] Generalize processing of conditions
Branch/assume conditions in PredicateInfo are currently handled in
a rather ad-hoc manner, with some arbitrary limitations. For example,
an `and` of two `icmp`s will be handled, but an `and` of an `icmp`
and some other condition will not. That also includes the case where
more than two conditions and and'ed together.

This patch makes the handling more general by looking through and/ors
up to a limit and considering all kinds of conditions (though operands
will only be taken for cmps of course).

Differential Revision: https://reviews.llvm.org/D94447
2021-01-20 20:40:41 +01:00
Mircea Trofin d97f776be5 Revert "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor"
This reverts commit e8aec763a5.
2021-01-20 11:19:34 -08:00
Mircea Trofin e8aec763a5 [NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor
When using 2 InlinePass instances in the same CGSCC - one for other
mandatory inlinings, the other for the heuristic-driven ones - the order
in which the ImportedFunctionStats would be output-ed would depend on
the destruction order of the inline passes, which is not deterministic.

This patch moves the ImportedFunctionStats responsibility to the
InlineAdvisor to address this problem.

Differential Revision: https://reviews.llvm.org/D94982
2021-01-20 11:07:36 -08:00
Dávid Bolvanský 16d6e85271 [BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94850
2021-01-20 19:45:23 +01:00
Joseph Tremoulet 40cd262c43 Loop peeling: check that latch is conditional branch
Loop peeling assumes that the loop's latch is a conditional branch.  Add
a check to canPeel that explicitly checks for this, and testcases that
otherwise fail an assertion when trying to peel a loop whose back-edge
is a switch case or the non-unwind edge of an invoke.

Reviewed By: skatkov, fhahn

Differential Revision: https://reviews.llvm.org/D94995
2021-01-20 11:01:16 -05:00
Florian Hahn 83daa49758
[LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands.
D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.

The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.

This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.

I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.

From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory, has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.

There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have be
vectorized earlier (by scalarzing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).

There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close look at actual performance and address any
follow-up issues.

I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.

Reviewed By: sanwou01

Differential Revision: https://reviews.llvm.org/D94232
2021-01-19 10:15:29 +00:00
Juneyoung Lee 395c737d9f [SimplifyCFG] Update SimplifyBranchOnICmpChain to recognize select form of and/or
This patch teaches SimplifyCFG::SimplifyBranchOnICmpChain to understand select form of
(x == C1 || x == C2 || ...) / (x != C1 && x != C2 && ...) and optimize them into switch if possible.
D93065 has more context about the transition, including links to the list of optimizations being updated.

Differential Revision: https://reviews.llvm.org/D93943
2021-01-19 08:53:40 +09:00
Kazu Hirata dc300beba7 [STLExtras] Add a default value to drop_begin
This patch adds the default value of 1 to drop_begin.

In the llvm codebase, 70% of calls to drop_begin have 1 as the second
argument.  The interface similar to with std::next should improve
readability.

This patch converts a couple of calls to drop_begin as examples.

Differential Revision: https://reviews.llvm.org/D94858
2021-01-18 10:16:34 -08:00
Florian Hahn e6d758de82
[InferAttrs] Mark some library functions as willreturn.
This patch marks some library functions as willreturn. On the first pass, I
excluded most functions that interact with streams/the filesystem.

Along with willreturn, it also adds nounwind to a set of math functions.
There probably are a few additional attributes we can add for those, but
that should be done separately.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94684
2021-01-18 13:40:21 +00:00
Roman Lebedev 32fc32317a
[SimplifyCFG] markAliveBlocks(): catchswitch: preserve PostDomTree
When removing catchpad's from catchswitch, if that removes a successor,
we need to record that in DomTreeUpdater.

This fixes PostDomTree preservation failure in an existing test.
This appears to be the single issue that i see in my current test coverage.
2021-01-17 01:21:05 +03:00
Kazu Hirata 19aacdb715 [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-16 09:40:53 -08:00
Dávid Bolvanský a1500105ee [SimplifyCFG] Optimize CFG when null is passed to a function with nonnull argument
Example:

```
__attribute__((nonnull,noinline)) char * pinc(char *p)  {
  return ++p;
}

char * foo(bool b, char *a) {
       return pinc(b ? 0 : a);

}
```

optimize to

```
char * foo(bool b, char *a) {
       return pinc(a);

}
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94180
2021-01-15 23:53:43 +01:00
Nick Desaulniers ed0fd567eb BreakCriticalEdges: do not split the critical edge from a CallBr indirect successor
Otherwise we'll fail the assertion in SplitBlockPredecessors() related
to splitting the edges from CallBr's.

Fixes: https://github.com/ClangBuiltLinux/linux/issues/1161
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1252

Reviewed By: void, MaskRay, jyknight

Differential Revision: https://reviews.llvm.org/D88438
2021-01-15 13:51:47 -08:00
Roman Lebedev a14c36fe27
[SimplifyCFG] switchToSelect(): don't forget to insert DomTree edge iff needed
DestBB might or might not already be a successor of SelectBB,
and it wasn't we need to ensure that we record the fact in DomTree.

The testcase used to crash in lazy domtree updater mode + non-per-function
domtree validity checks disabled.
2021-01-15 23:35:57 +03:00
Roman Lebedev c6654a4cda
[SimplifyCFG][BasicBlockUtils] Port SplitBlockPredecessors()/SplitLandingPadPredecessors() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Roman Lebedev 286cf6cb02
[SimplifyCFG] Port SplitBlockAndInsertIfThen() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Roman Lebedev c845c724c2
[Utils][SimplifyCFG] Port SplitBlock() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Roman Lebedev b81f75fa79
[Utils] splitBlockBefore() always operates on DomTreeUpdater, so take it, not DomTree
Even though not all it's users operate on DomTreeUpdater,
it itself internally operates on DomTreeUpdater,
so it must mean everything is fine with that,
so just do that globally.
2021-01-15 23:35:56 +03:00
Kazu Hirata 8a20e2b3d3 [llvm] Use Optional::getValueOr (NFC) 2021-01-12 21:43:50 -08:00
Hongtao Yu 175288a1af Add sample-profile-suffix-elision-policy attribute with -funique-internal-linkage-names.
Adding sample-profile-suffix-elision-policy attribute to functions whose linkage names are uniquefied so that their unique name suffix won't be trimmed when applying AutoFDO profiles.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D94455
2021-01-12 15:15:53 -08:00
Philip Reames caafdf07bb [LV] Weaken spuriously strong assert in LoopVersioning
LoopVectorize uses some utilities on LoopVersioning, but doesn't actually use it for, you know, versioning.  As a result, the precondition LoopVersioning expects is too strong for this user.  At the moment, LoopVectorize supports any loop with a unique exit block, so check the same precondition here.

Really, the whole class structure here is a mess.  We should separate the actual versioning from the metadata updates, but that's a bigger problem.
2021-01-12 12:57:13 -08:00
Roman Lebedev 90a92f8b4d
[NFCI][Utils/Local] removeUnreachableBlocks(): cleanup support for lazy DomTreeUpdater
When DomTreeUpdater is in lazy update mode, the blocks
that were scheduled to be removed, won't be removed
until the updates are flushed, e.g. by asking
DomTreeUpdater for a up-to-date DomTree.

From the function's current code, it is pretty evident
that the support for the lazy mode is an afterthought,
see e.g. how we roll-back NumRemoved statistic..

So instead of considering all the unreachable blocks
as the blocks-to-be-removed, simply additionally skip
all the blocks that are already scheduled to be removed
2021-01-12 02:09:47 +03:00
Roman Lebedev f9ba347706
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): don't insert a DomTree edge if it already exists
When we are adding edges to the terminator and potentially turning it
into a switch (if it wasn't already), it is possible that the
case we're adding will share it's destination with one of the
preexisting cases, in which case there is no domtree edge to add.

Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
2021-01-12 02:09:47 +03:00
Roman Lebedev c0de0a1b72
[SimplifyCFG] SimplifyBranchOnICmpChain(): don't insert a DomTree edge that already exists
BB was already always branching to EdgeBB, there is no edge to add.

Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
2021-01-12 02:09:46 +03:00
Roman Lebedev c22bc5f1f8
[SimplifyCFG] SwitchToLookupTable(): don't insert a DomTree edge that already exists
SI is the terminator of BB, so the edge we are adding obviously already existed.

Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
2021-01-12 02:09:46 +03:00
Hongtao Yu 32bcfcda4e Rename debug linkage name with -funique-internal-linkage-names
Functions that are renamed under -funique-internal-linkage-names have their debug linkage name updated as well.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D93747
2021-01-11 13:56:07 -08:00
Sriraman Tallam d8c6d24359 -funique-internal-linkage-names appends a hex md5hash suffix to the symbol name which is not demangler friendly, convert it to decimal.
Please see D93747 for more context which tries to make linkage names of internal
linkage functions to be the uniqueified names. This causes a problem with gdb
because breaking using the demangled function name will not work if the new
uniqueified name cannot be demangled. The problem is the generated suffix which
is a mix of integers and letters which do not demangle. The demangler accepts
either all numbers or all letters. This patch simply converts the hash to decimal.

There is no loss of uniqueness by doing this as the precision is maintained.
The symbol names get longer by a few characters though.

Differential Revision: https://reviews.llvm.org/D94154
2021-01-11 11:10:29 -08:00
Serguei Katkov 7f69860243 [LoopUnroll] Fix a crash
Loop peeling as a last step triggers loop simplification and this
can change the loop structure. As a result all cashed values like
latch branch becomes invalid.

Patch re-structure the code to take into account the possible
changes caused by peeling.

Reviewers: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour
Reviewed By: Meinersbur, fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D93686
2021-01-11 10:19:26 +07:00
Philip Reames 4739dd67e7 [LoopDeletion] Break backedge of outermost loops when known not taken
This is a resubmit of dd6bb367 (which was reverted due to stage2 build failures in 7c63aac), with the additional restriction added to the transform to only consider outer most loops.

As shown in the added test case, ensuring LCSSA is up to date when deleting an inner loop is tricky as we may actually need to remove blocks from any outer loops, thus changing the exit block set.   For the moment, just avoid transforming this case.  I plan to return to this case in a follow up patch and see if we can do better.

Original commit message follows...

The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.

Differential Revision: https://reviews.llvm.org/D93906
2021-01-10 16:02:33 -08:00
Roman Lebedev 8e8d214c4a
[NFCI][SimplifyCFG] Prefer to add Insert edges before Delete edges into DomTreeUpdater, if reasonable
This has a measurable impact on the number of DomTree recalculations.
While this doesn't handle all the cases,
it deals with the most obvious ones.
2021-01-11 00:30:44 +03:00
Florian Hahn c701f85c45
[STLExtras] Use return type from operator* of the wrapped iter.
Currently make_early_inc_range cannot be used with iterators with
operator* implementations that do not return a reference.

Most notably in the LLVM codebase, this means the User iterator ranges
cannot be used with make_early_inc_range, which slightly simplifies
iterating over ranges while elements are removed.

Instead of directly using BaseT::reference as return type of operator*,
this patch uses decltype to get the actual return type of the operator*
implementation in WrappedIteratorT.

This patch also updates a few places to use make use of
make_early_inc_range.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D93992
2021-01-10 14:41:13 +00:00
Florian Hahn d98fc62ae6 [SimplifyCFG] Keep !dgb metadata of moved instruction, if they match.
Currently SimplifyCFG drops the debug locations of 'bonus' instructions.
Such instructions are moved before the first branch. The reason for the
current behavior is that this could lead to surprising debug stepping,
if the block that's folded is dead.

In case the first branch and the instructions to be folded have the same
debug location, this shouldn't be an issue and we can keep the debug
location.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D93662
2021-01-09 19:15:16 +00:00
Kazu Hirata 9a7c03b800 [SCEV] Remove unused getOrInsertCanonicalInductionVariable (NFC)
The last use was removed on Mar 22, 2012 in commit
f47d0af551.
2021-01-09 09:24:56 -08:00
Kazu Hirata f62b93b9a2 [SCEV] Remove unused getExactExistingExpansion (NFC)
The last use was removed on Sep 4, 2018 in commit
2cbba56337.
2021-01-08 18:39:57 -08:00
Ruiling Song 8dddcc762d [Cloning] Copy metadata of global declarations
We have modules with metadata on declarations, and out-of-tree passes
use that metadata, and we need to clone those modules. We really expect
such metadata is kept during the clone operation.

Reviewed by: arsenm, aprantl

Differential Revision: https://reviews.llvm.org/D93451
2021-01-08 08:21:18 +08:00
Roman Lebedev f2f81c554b
[SimplifyCFG] markAliveBlocks(): switch to non-permissive DomTree updates
No actual changes needed, invoke can't have the same block as an unwind
destination and a normal destination.
2021-01-08 02:15:27 +03:00
Roman Lebedev d59f97bb3a
[SimplifyCFG] removeUnwindEdge(): switch to non-permissive DomTree updates
No actual changes needed, Catchswitch cannot unwind to one of its catchpads.
2021-01-08 02:15:27 +03:00
Roman Lebedev f0eba8ce2d
[SimplifyCFG] changeToCall(): switch to non-permissive DomTree updates
No actual changes needed, normal and unwind destinations of an invoke
can never be identical.
2021-01-08 02:15:27 +03:00
Roman Lebedev be0a31d13b
[SimplifyCFG] DeleteDeadBlocks(): switch to non-permissive DomTree updates
No actual changes needed, DetatchDeadBlocks() was already doing the right thing.
2021-01-08 02:15:27 +03:00
Roman Lebedev 66189212bb
[SimplifyCFG] MergeBlockIntoPredecessor(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same successor more than once.
2021-01-08 02:15:26 +03:00
Roman Lebedev 05adc73db0
[SimplifyCFG] changeToUnreachable(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same predecessor more than once.
2021-01-08 02:15:26 +03:00
Roman Lebedev 7600d7c7be
[SimplifyCFG] removeUnreachableBlocks(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same predecessor more than once.
2021-01-08 02:15:26 +03:00
Roman Lebedev 1f9b591ee6
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same predecessor more than once.
2021-01-08 02:15:25 +03:00
Roman Lebedev b3822728fa
[SimplifyCFG] ConstantFoldTerminator(): switch to non-permissive DomTree updates in `indirectbr` handling
... which requires not deleting edges that were just deleted already.
2021-01-08 02:15:25 +03:00
Roman Lebedev 36593a30a4
[SimplifyCFG] ConstantFoldTerminator(): switch to non-permissive DomTree updates in `SwitchInst` handling
... which requires not deleting edges that will still be present.
2021-01-08 02:15:24 +03:00
Roman Lebedev 16ab8e5f6d
[SimplifyCFG] ConstantFoldTerminator(): handle matching destinations of condbr earlier
We need to handle this case before dealing with the case of constant
branch condition, because if the destinations match, latter fold
would try to remove the DomTree edge that would still be present.

This allows to make that particular DomTree update non-permissive
2021-01-08 02:15:24 +03:00
dfukalov 6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Roman Lebedev 6be1fd6b20
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): drop reachable errneous assert
I have added it in d15d81c because it *seemed* correct, was holding
for all the tests so far, and was validating the fix added in the same
commit, but as David Major is pointing out (with a reproducer),
the assertion isn't really correct after all. So remove it.

Note that the d15d81c still fine.
2021-01-07 18:05:04 +03:00
Sidharth Baveja 048f184ee4 [SplitEdge] Add new parameter to SplitEdge to name the newly created basic block
Summary:
Currently SplitEdge does not support passing in parameter which allows you to
name the newly created BasicBlock.

This patch updates the function such that the name of the block can be passed
in, if users of this utility decide to do so.

Reviewed By: Whitney, bmahjour, asbirlea, jamieschmeiser

Differential Revision: https://reviews.llvm.org/D94176
2021-01-07 14:49:23 +00:00
Oliver Stannard 76f6b125ce Revert "[llvm] Use BasicBlock::phis() (NFC)"
Reverting because this causes crashes on the 2-stage buildbots, for
example http://lab.llvm.org:8011/#/builders/7/builds/1140.

This reverts commit 9b228f107d.
2021-01-07 09:43:33 +00:00
Kazu Hirata 9b228f107d [llvm] Use BasicBlock::phis() (NFC) 2021-01-06 18:27:35 -08:00
Alina Sbirlea 63aeaf754a [DominatorTree] Add support for mixed pre/post CFG views.
Add support for mixed pre/post CFG views.

Update usages of the MemorySSAUpdater to use the new DT API by
requesting the DT updates to be done by the MSSAUpdater.

Differential Revision: https://reviews.llvm.org/D93371
2021-01-06 14:53:09 -08:00
Arthur Eubanks 7fea561eb1 [CGSCC][Coroutine][NewPM] Properly support function splitting/outlining
Previously when trying to support CoroSplit's function splitting, we
added in a hack that simply added the new function's node into the
original function's SCC (https://reviews.llvm.org/D87798). This is
incorrect since it might be in its own SCC.

Now, more similar to the previous design, we have callers explicitly
notify the LazyCallGraph that a function has been split out from another
one.

In order to properly support CoroSplit, there are two ways functions can
be split out.

One is the normal expected "outlining" of one function into a new one.
The new function may only contain references to other functions that the
original did. The original function must reference the new function. The
new function may reference the original function, which can result in
the new function being in the same SCC as the original function. The
weird case is when the original function indirectly references the new
function, but the new function directly calls the original function,
resulting in the new SCC being a parent of the original function's SCC.
This form of function splitting works with CoroSplit's Switch ABI.

The second way of splitting is more specific to CoroSplit. CoroSplit's
Retcon and Async ABIs split the original function into multiple
functions that all reference each other and are referenced by the
original function. In order to keep the LazyCallGraph in a valid state,
all new functions must be processed together, else some nodes won't be
populated. To keep things simple, this only supports the case where all
new edges are ref edges, and every new function references every other
new function. There can be a reference back from any new function to the
original function, putting all functions in the same RefSCC.

This also adds asserts that all nodes in a (Ref)SCC can reach all other
nodes to prevent future incorrect hacks.

The original hacks in https://reviews.llvm.org/D87798 are no longer
necessary since all new functions should have been registered before
calling updateCGAndAnalysisManagerForPass.

This fixes all coroutine tests when opt's -enable-new-pm is true by
default. This also fixes PR48190, which was likely due to the previous
hack breaking SCC invariants.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D93828
2021-01-06 11:19:15 -08:00
Francesco Petrogalli dfd3384fee [InstCombine] Update valueCoversEntireFragment to use TypeSize
* Update valueCoversEntireFragment to use TypeSize.
* Add a regression test.
* Assertions have been added to protect untested codepaths.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91806
2021-01-06 17:14:59 +00:00
Roman Lebedev a14945c1db
[SimplifyCFG] SimplifyEqualityComparisonWithOnlyPredecessor(): really don't delete DomTree edges multiple times 2021-01-06 01:52:39 +03:00
Roman Lebedev 2b437fcd47
[SimplifyCFG] SwitchToLookupTable(): switch to non-permissive DomTree updates
... which requires not deleting a DomTree edge that we just deleted.
2021-01-06 01:52:38 +03:00
Roman Lebedev fa5447aa3f
[NFC][SimplifyCFG] SwitchToLookupTable(): pull out SI->getParent() into a variable 2021-01-06 01:52:38 +03:00
Roman Lebedev d15d81ce15
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): deal with each predecessor only once
If the predecessor is a switch, and BB is not the default destination,
multiple cases could have the same destination. and it doesn't
make sense to re-process the predecessor, because we won't make any changes,
once is enough.

I'm not sure this can be really tested, other than via the assertion
being added here, which fires without the fix.
2021-01-06 01:52:37 +03:00
Roman Lebedev fc96cb2dad
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): switch to non-permissive DomTree updates
... which requires not adding a DomTree edge that we just added.
2021-01-06 01:52:37 +03:00
Roman Lebedev 29ca7d5a1a
[SimplifyCFG] simplifyUnreachable(): fix handling of degenerate same-destination conditional branch
One would hope that it would have been already canonicalized into an
unconditional branch, but that isn't really guaranteed to happen
with SimplifyCFG's visitation order.
2021-01-06 01:52:36 +03:00
Roman Lebedev 3460719f58
[NFC][SimplifyCFG] Add a test with same-destination condidional branch
Reported by Mikael Holmén as post-commit feedback on
https://reviews.llvm.org/rG2d07414ee5f74a09fb89723b4a9bb0818bdc2e18#968162
2021-01-06 01:52:36 +03:00
Roman Lebedev f98535686e
[SimplifyCFG] simplifyUnreachable(): switch to non-permissive DomTree updates
... which requires not removing a DomTree edge if the switch's default
still points at that destination, because it can't be removed;
... and not processing the same predecessor more than once.
2021-01-06 01:52:36 +03:00
Atmn Patel f88a797521 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2021-01-05 09:56:16 -05:00
Simon Pilgrim a000366d05 [SimplifyIndVar] createWideIV - make WideIVInfo arg a const ref. NFCI.
The WideIVInfo arg is only ever used as a const.

Fixes cppcheck warning.
2021-01-05 10:31:45 +00:00
Roman Lebedev 32c47ebef1
[SimplifyCFG] SimplifyCondBranchToTwoReturns(): switch to non-permissive DomTree updates
... which requires not deleting an edge that just got deleted,
because we could be dealing with a block that didn't go through
ConstantFoldTerminator() yet, and thus has a degenerate cond br
with matching true/false destinations.
2021-01-05 01:26:37 +03:00
Roman Lebedev 110b3d7855
[SimplifyCFG] SimplifyEqualityComparisonWithOnlyPredecessor(): switch to non-permissive DomTree updates
... which requires not deleting an edge that just got deleted.
2021-01-05 01:26:37 +03:00
Roman Lebedev a8604e3d5b
[SimplifyCFG] simplifyIndirectBr(): switch to non-permissive DomTree updates
... which requires not deleting an edge that just got deleted.
2021-01-05 01:26:36 +03:00
Roman Lebedev 3fb57222c4
[NFCI] SimplifyCFG: switch to non-permissive DomTree updates, where possible
Notably, this doesn't switch *every* case, remaining cases
don't actually pass sanity checks in non-permissve mode,
and therefore require further analysis.

Note that SimplifyCFG still defaults to not preserving DomTree by default,
so this is effectively a NFC change.
2021-01-05 01:26:36 +03:00
Sanjay Patel 36263a7ccc [LoopUtils] remove redundant opcode parameter; NFC
While here, rename the inaccurate getRecurrenceBinOp()
because that was also used to get CmpInst opcodes.

The recurrence/reduction kind should always refer to the
expected opcode for a reduction. SLP appears to be the
only direct caller of createSimpleTargetReduction(), and
that calling code ideally should not be carrying around
both an opcode and a reduction kind.

This should allow us to generalize reduction matching to
use intrinsics instead of only binops.
2021-01-04 17:05:28 -05:00
Sanjay Patel 9766957524 [LoopUtils] reduce code for creatng reduction; NFC
We can return from each case instead creating a temporary
variable just to have a common return.
2021-01-04 16:05:03 -05:00
Sanjay Patel 58b6c5d932 [LoopUtils] reorder logic for creating reduction; NFC
If we are using a shuffle reduction, we don't need to
go through the switch on opcode - return early.
2021-01-04 16:05:02 -05:00
Whitney Tsang de6d43f16c Revert "[LoopNest] Allow empty basic blocks without loops"
This reverts commit 9a17bff4f7.
2021-01-04 20:42:21 +00:00
Whitney Tsang 9a17bff4f7 [LoopNest] Allow empty basic blocks without loops
Allow loop nests with empty basic blocks without loops in different
levels as perfect.

Reviewers: Meinersbur

Differential Revision: https://reviews.llvm.org/D93665
2021-01-04 19:59:50 +00:00
Philip Reames 7c63aac7bd Revert "[LoopDeletion] Break backedge of loops when known not taken"
This reverts commit dd6bb367d1.

Multi-stage builders are showing an assertion failure w/LCSSA not being preserved on entry to IndVars.  Reason isn't clear, reverting while investigating.
2021-01-04 09:50:47 -08:00
Philip Reames dd6bb367d1 [LoopDeletion] Break backedge of loops when known not taken
The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.

Differential Revision: https://reviews.llvm.org/D93906
2021-01-04 09:19:29 -08:00
Roman Lebedev 98cd1c33e3
[NFC][SimplifyCFG] Hoist 'original' DomTree verification from simplifyOnce() into run()
This is NFC since SimplifyCFG still currently defaults to not preserving DomTree.

SimplifyCFGOpt::simplifyOnce() is only be called from SimplifyCFGOpt::run(),
and can not be called externally, since SimplifyCFGOpt is defined in .cpp
This avoids some needless verifications, and is thus a bit faster
without sacrificing precision.
2021-01-04 01:02:02 +03:00
Roman Lebedev a7684940f0
[SimplifyCFG] SimplifyTerminatorOnSelect(): fix/tune DomTree updates
We only need to remove non-TrueBB/non-FalseBB successors,
and we only need to do that once. We don't need to insert
any new edges, because no new successors will be added.
2021-01-04 01:02:02 +03:00
Roman Lebedev 70935b9595
[NFC][SimplifyCFG] SimplifyTerminatorOnSelect(): pull out OldTerm->getParent() into a variable 2021-01-04 01:02:02 +03:00
Roman Lebedev 5fa241a657
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): fine-tune/fix DomTree preservation, take 2 2021-01-03 01:45:48 +03:00
Roman Lebedev 6a3a8d17eb
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): fine-tune/fix DomTree preservation 2021-01-03 01:45:48 +03:00
Roman Lebedev 7c8b8063b6
[SimplifyCFG][AMDGPU] AMDGPUUnifyDivergentExitNodes: SimplifyCFG isn't ready to preserve PostDomTree
There is a number of transforms in SimplifyCFG that take DomTree out of
DomTreeUpdater, and do updates manually. Until they are fixed,
user passes are unable to claim that PDT is preserved.

Note that the default for SimplifyCFG is still not to preserve DomTree,
so this is still effectively NFC.
2021-01-03 01:45:46 +03:00
Kazu Hirata 530c5af6a4 [Transforms] Construct SmallVector with iterator ranges (NFC) 2021-01-02 09:24:17 -08:00
Roman Lebedev b9da488ad7
[SimplifyCFG] Don't actually take DomTreeUpdater unless we intend to maintain DomTree validity
This guards against unintentional mistakes
like the one i just fixed in previous commit.
2021-01-02 14:40:55 +03:00
Roman Lebedev b4429f3cdd
[SimplifyCFG] Teach removeUndefIntroducingPredecessor to preserve DomTree 2021-01-02 01:01:20 +03:00
Roman Lebedev 657c1e09da
[SimplifyCFG] Teach eliminateDeadSwitchCases() to preserve DomTree, part 2 2021-01-02 01:01:18 +03:00
Roman Lebedev f1ce696056
[SimplifyCFG] Teach tryWidenCondBranchToCondBranch() to preserve DomTree 2021-01-02 01:01:17 +03:00
Kazu Hirata f43daf1b62 [SSAUpdater] Remove unused code InstrIsPHI (NFC)
The last use of this function was removed on Jan 4, 2018 in commit
commit 90ecac01e9.
2021-01-01 12:44:52 -08:00
Sanjay Patel c74e8539ff [Analysis] flatten enums for recurrence types
This is almost all mechanical search-and-replace and
no-functional-change-intended (NFC). Having a single
enum makes it easier to match/reason about the
reduction cases.

The goal is to remove `Opcode` from reduction matching
code in the vectorizers because that makes it harder to
adapt the code to handle intrinsics.

The code in RecurrenceDescriptor::AddReductionVar() is
the only place that required closer inspection. It uses
a RecurrenceDescriptor and a second InstDesc to sometimes
overwrite part of the struct. It seem like we should be
able to simplify that logic, but it's not clear exactly
which cmp+sel patterns that we are trying to handle/avoid.
2021-01-01 12:20:16 -05:00
Roman Lebedev 831636b0e6
[SimplifyCFG] SUCCESS! Teach createUnreachableSwitchDefault() to preserve DomTree
This pretty much concludes patch series for updating SimplifyCFG
to preserve DomTree. All 318 dedicated `-simplifycfg` tests now pass
with `-simplifycfg-require-and-preserve-domtree=1`.

There are a few leftovers that apparently don't have good test coverage.
I do not yet know what gaps in test coverage will the wider-scale testing
reveal, but the default flip might be close.
2021-01-01 03:25:25 +03:00
Roman Lebedev e1440d43bc
[SimplifyCFG] Teach tryToSimplifyUncondBranchWithICmpInIt() to preserve DomTree 2021-01-01 03:25:25 +03:00
Roman Lebedev 8866583953
[SimplifyCFG] Teach FoldValueComparisonIntoPredecessors() to preserve DomTree, part 2 2021-01-01 03:25:24 +03:00
Roman Lebedev a815b6b2b2
[SimplifyCFG] Teach eliminateDeadSwitchCases() to preserve DomTree, part 1 2021-01-01 03:25:24 +03:00
Roman Lebedev 0d2f219d4d
[SimplifyCFG] Teach SimplifyEqualityComparisonWithOnlyPredecessor() to preserve DomTree, part 3 2021-01-01 03:25:23 +03:00
Roman Lebedev 9f17dab1f4
[SimplifyCFG] Teach simplifyIndirectBr() to preserve DomTree 2021-01-01 03:25:23 +03:00
Roman Lebedev b7c463d7b8
[SimplifyCFG] Teach FoldBranchToCommonDest() to preserve DomTree, part 2 2021-01-01 03:25:23 +03:00
Roman Lebedev c1b825d4b8
[SimplifyCFG] Teach FoldValueComparisonIntoPredecessors() to preserve DomTree, part 1 2021-01-01 03:25:22 +03:00
Bogdan Graur 8bee4d4e8f Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
Test clang/test/Misc/loop-opt-setup.c fails when executed in Release.

This reverts commit 6f1503d598.

Reviewed By: SureYeaah

Differential Revision: https://reviews.llvm.org/D93956
2020-12-31 11:47:49 +00:00
Atmn Patel 6f1503d598 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-12-30 21:43:01 -05:00
Roman Lebedev 7f221c9196
[SimplifyCFG] Teach SwitchToLookupTable() to preserve DomTree 2020-12-30 23:58:41 +03:00
Roman Lebedev a17025aa61
[SimplifyCFG] Teach switchToSelect() to preserve DomTree 2020-12-30 23:58:40 +03:00
Roman Lebedev c45f765c0d
[SimplifyCFG] Teach SimplifyBranchOnICmpChain() to preserve DomTree 2020-12-30 23:58:40 +03:00
Sanjay Patel 8ca60db40b [LoopUtils] reduce FMF and min/max complexity when forming reductions
I don't know if there's some way this changes what the vectorizers
may produce for reductions, but I have added test coverage with
3567908 and 5ced712 to show that both passes already have bugs in
this area. Hopefully this does not make things worse before we can
really fix it.
2020-12-30 15:22:26 -05:00
Sanjay Patel e90ea76380 [IR] remove 'NoNan' param when creating FP reductions
This is no-functional-change-intended (AFAIK, we can't
isolate this difference in a regression test).

That's because the callers should be setting the IRBuilder's
FMF field when creating the reduction and/or setting those
flags after creating. It doesn't make sense to override this
one flag alone.

This is part of a multi-step process to clean up the FMF
setting/propagation. See PR35538 for an example.
2020-12-30 09:51:23 -05:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Kazu Hirata 16d20e2554 [Transforms/Utils] Construct SmallVector with iterator ranges (NFC) 2020-12-29 19:23:23 -08:00
Roman Lebedev 39a56f7f17
[SimplifyCFG] Teach SimplifyTerminatorOnSelect() to preserve DomTree 2020-12-30 00:48:12 +03:00
Roman Lebedev ec0b671a61
[SimplifyCFG] Teach SimplifyCondBranchToCondBranch() to preserve DomTree 2020-12-30 00:48:12 +03:00
Roman Lebedev 307156246f
[SimplifyCFG] Teach mergeConditionalStoreToAddress() to preserve DomTree 2020-12-30 00:48:11 +03:00
Roman Lebedev d4c0abb4a3
[SimplifyCFG] Teach FoldCondBranchOnPHI() to preserve DomTree 2020-12-30 00:48:11 +03:00
Roman Lebedev b8121b2e62
[SimplifyCFG] Teach SinkCommonCodeFromPredecessors() to preserve DomTree 2020-12-30 00:48:11 +03:00
Roman Lebedev 18c407bf4c
[SimplifyCFG] Teach HoistThenElseCodeToIf() to preserve DomTree 2020-12-30 00:48:10 +03:00
Roman Lebedev fe9bdd9621
[SimplifyCFG] Teach SimplifyEqualityComparisonWithOnlyPredecessor() to preserve DomTree, part 2 2020-12-30 00:48:10 +03:00
Roman Lebedev 6027e05dbf
[SimplifyCFG] Teach SimplifyEqualityComparisonWithOnlyPredecessor() to preserve DomTree, part 1 2020-12-30 00:48:10 +03:00
Sanjay Patel 8d18bc8e6d [Utils] reduce code in createTargetReduction(); NFC
The switch duplicated the translation in getRecurrenceBinOp().
This code is still weird because it translates to the TTI
ReductionFlags for min/max, but then createSimpleTargetReduction()
converts that back to RecurrenceDescriptor::MinMaxRecurrenceKind.
2020-12-29 15:56:19 -05:00
Roman Lebedev ef93f7a11c
[SimplifyCFG] FoldBranchToCommonDest: gracefully handle unreachable code ()
We might be dealing with an unreachable code,
so the bonus instruction we clone might be self-referencing.

There is a sanity check that all uses of bonus instructions
that are not in the original block with said bonus instructions
are PHI nodes, and that is obviously not the case
for self-referencing instructions..

So if we find such an use, just rewrite it.

Thanks to Mikael Holmén for the reproducer!

Fixes https://bugs.llvm.org/show_bug.cgi?id=48450#c8
2020-12-28 23:31:19 +03:00
Yevgeny Rouban d76c1d2247 [RS4GC] Lazily set changed flag when folding single entry phis
The function FoldSingleEntryPHINodes() is changed to return if
it has changed IR or not. This return value is used by RS4GC to
set the MadeChange flag respectively.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D93810
2020-12-28 10:54:21 +07:00
Florian Hahn 0ea3749b3c
[LV] Set up branch from middle block earlier.
Previously the branch from the middle block to the scalar preheader & exit
was being set-up at the end of skeleton creation in completeLoopSkeleton.
Inserting SCEV or runtime checks may result in LCSSA phis being created,
if they are required. Adjusting branches afterwards may break those
PHIs.

To avoid this, we can instead create the branch from the middle block
to the exit after we created the middle block, so we have the final CFG
before potentially adjusting/creating PHIs.

This fixes a crash for the included test case. For the non-crashing
case, this is almost a NFC with respect to the generated code. The
only change is the order of the predecessors of the involved branch
targets.

Note an assertion was moved from LoopVersioning() to
LoopVersioning::versionLoop. Adjusting the branches means loop-simplify
form may be broken before constructing LoopVersioning. But LV only uses
LoopVersioning to annotate the loop instructions with !noalias metadata,
which does not require loop-simplify form.

This is a fix for an existing issue uncovered by D93317.
2020-12-27 18:21:12 +00:00
Kazu Hirata 8299fb8f25 [Transforms] Use llvm::append_range (NFC) 2020-12-27 09:57:29 -08:00
Kazu Hirata 789d250613 [CodeGen, Transforms] Use *Map::lookup (NFC) 2020-12-27 09:57:27 -08:00
Kazu Hirata 46bea9b297 [Local] Remove unused function RemovePredecessorAndSimplify (NFC)
The last use of the function was removed on Sep 29, 2010 in commit
99c985c37d.
2020-12-25 09:35:20 -08:00
Roman Lebedev ff3749fc79
[NFC] SimplifyCFGOpt::simplifyUnreachable(): pacify unused variable warning
Thanks to Luke Benes for pointing it out.
2020-12-24 21:20:46 +03:00
Kazu Hirata df812115e3 [CodeGen, Transforms] Use llvm::any_of (NFC) 2020-12-24 09:08:36 -08:00
Roman Lebedev c043f5055e
[SimplifyCFG] Teach FoldBranchToCommonDest() to preserve DomTree, part 1
... for conditional branch case
2020-12-20 00:18:36 +03:00
Roman Lebedev 262ff9c23e
[SimplifyCFG] Teach TryToMergeLandingPad() to preserve DomTree 2020-12-20 00:18:36 +03:00
Roman Lebedev 6a1617d67c
[SimplifyCFG] Teach SimplifyCondBranchToTwoReturns() to preserve DomTree, part 2
... for the custom case returning void.
2020-12-20 00:18:36 +03:00
Roman Lebedev b94520c9ee
[SimplifyCFG] Teach SimplifyCondBranchToTwoReturns() to preserve DomTree, part 1
... for the general case of returning a value.
2020-12-20 00:18:35 +03:00
Roman Lebedev 4d87a6ad13
[NFCI][SimplifyCFG] SimplifyCondBranchToTwoReturns(): pull out BI->getParent() into a variable 2020-12-20 00:18:35 +03:00
Roman Lebedev 83659c7076
[SimplifyCFG] simplifySingleResume(): FoldReturnIntoUncondBranch() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.

Apparently, there were no dedicated tests just for that functionality,
so i'm adding one here.
2020-12-20 00:18:34 +03:00
Roman Lebedev b7d00e29b7
[SimplifyCFG] Teach simplifySingleResume() to preserve DomTree 2020-12-20 00:18:34 +03:00
Roman Lebedev c209b88dd4
[SimplifyCFG] Teach simplifyCommonResume() to preserve DomTree 2020-12-20 00:18:34 +03:00
Roman Lebedev 76e74d9395
[SimplifyCFG] Teach removeEmptyCleanup() to preserve DomTree 2020-12-20 00:18:33 +03:00
Roman Lebedev 4be8707e64
[SimplifyCFG] Teach FoldTwoEntryPHINode() to preserve DomTree
Still boring, simply drop all edges to successors of DomBlock,
and add an edge to to BB instead.
2020-12-20 00:18:33 +03:00
Roman Lebedev b43b77ff9b
[NFCI][SimlifyCFG] simplifyOnce(): also perform DomTree validation
And that exposes that a number of tests don't *actually* manage to
maintain DomTree validity, which is inline with my observations.

Once again, SimlifyCFG pass currently does not require/preserve DomTree
by default, so this is effectively NFC.
2020-12-20 00:18:32 +03:00
Whitney Tsang 2a814cd9e1 Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an
issue
that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore.
The issue occurs in SplitEdge when the Succ has a single predecessor
and the edge between the BB and Succ is not critical. This produces
the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
was added to splitBlockBefore to handle the issue and now produces
the correct result ‘BB->New->Succ’.

Below is an example of splitting the block bb1 at its first instruction.

/// Original IR
bb0:
	br bb1
bb1:
        %0 = mul i32 1, 2
	br bb2
bb2:
/// IR after splitEdge(bb0, bb1) using splitBasicBlock
bb0:
	br bb1
bb1:
	br bb1.split
bb1.split:
        %0 = mul i32 1, 2
	br bb2
bb2:
/// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
bb0:
	br bb1.split
bb1.split
	br bb1
bb1:
        %0 = mul i32 1, 2
	br bb2
bb2:

Differential Revision: https://reviews.llvm.org/D92200
2020-12-18 17:37:17 +00:00
Yevgeny Rouban f0e3d1d6ca [IndVars] Fix adding trunc instructions to unwind blocks
Truncate instruction must not be inserted before landing pads.
The insertion point is fixed.
2020-12-18 12:52:23 +07:00
Kazu Hirata b621116716 [Transforms] Use llvm::erase_if (NFC) 2020-12-17 19:53:10 -08:00
Rong Xu 31c0b8700b Fix clang-ppc64le-rhel buildbot build error
ix buildbot build error due to
commit 3733463d: [IR][PGO] Add hot func attribute and use hot/cold
attribute in func section
2020-12-17 19:14:43 -08:00
Roman Lebedev 2d07414ee5
[SimplifyCFG] Teach simplifyUnreachable() to preserve DomTree
Pretty boring, removeUnwindEdge() already known how to update DomTree,
so if we are to call it, we must first flush our own pending updates;
otherwise, we just stop predecessors from branching to us,
and for certain predecessors, stop their predecessors from
branching to them also.
2020-12-18 00:37:22 +03:00
Roman Lebedev 2ee724863e
[SimplifyCFG] ConstantFoldTerminator() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.

Fixes DomTree preservation for a number of tests,
all of which are marked as such so that they do not regress.
2020-12-18 00:37:22 +03:00
Roman Lebedev 164e0847a5
[SimplifyCFG] DeleteDeadBlock() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.

Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
2020-12-18 00:37:21 +03:00
Bangtian Liu 511cfe9441 Revert "Ensure SplitEdge to return the new block between the two given blocks"
This reverts commit d20e0c3444.
2020-12-17 21:00:37 +00:00
Nabeel Omer df2b9a3e02 [DebugInfo] Avoid re-ordering assignments in LCSSA
The LCSSA pass makes use of a function insertDebugValuesForPHIs() to
propogate dbg.value() intrinsics to newly inserted PHI instructions. Faulty
behaviour occurs when the parent PHI of a newly inserted PHI is not the
most recent assignment to a source variable. insertDebugValuesForPHIs ends
up propagating a value that isn't the most recent assignemnt.

This change removes the call to insertDebugValuesForPHIs() from LCSSA,
preventing incorrect dbg.value intrinsics from being propagated.
Propagating variable locations between blocks will occur later, during
LiveDebugValues.

Differential Revision: https://reviews.llvm.org/D92576
2020-12-17 16:17:32 +00:00
Bangtian Liu d20e0c3444 Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an
issue
that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore.
The issue occurs in SplitEdge when the Succ has a single predecessor
and the edge between the BB and Succ is not critical. This produces
the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
was added to splitBlockBefore to handle the issue and now produces
the correct result ‘BB->New->Succ’.

Below is an example of splitting the block bb1 at its first instruction.

/// Original IR
bb0:
	br bb1
bb1:
        %0 = mul i32 1, 2
	br bb2
bb2:
/// IR after splitEdge(bb0, bb1) using splitBasicBlock
bb0:
	br bb1
bb1:
	br bb1.split
bb1.split:
        %0 = mul i32 1, 2
	br bb2
bb2:
/// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
bb0:
	br bb1.split
bb1.split
	br bb1
bb1:
        %0 = mul i32 1, 2
	br bb2
bb2:

Differential Revision: https://reviews.llvm.org/D92200
2020-12-17 16:00:15 +00:00
Florian Hahn 75c04bfc61
[SimplifyCFG] Preserve !annotation in FoldBranchToCommonDest.
When folding a branch to a common destination, preserve !annotation on
the created instruction, if the terminator of the BB that is going to be
removed has !annotation. This should ensure that !annotation is attached
to the instructions that 'replace' the original terminator.

Reviewed By: jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D93410
2020-12-17 14:06:58 +00:00
dfukalov 9ed8e0caab [NFC] Reduce include files dependency and AA header cleanup (part 2).
Continuing work started in https://reviews.llvm.org/D92489:

Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h".

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92852
2020-12-17 14:04:48 +03:00
Roman Lebedev 5cce4aff18
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.

Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
2020-12-17 01:03:49 +03:00
Roman Lebedev 49dac4aca0
[SimplifyCFG] MergeBlockIntoPredecessor() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.

Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
2020-12-17 01:03:49 +03:00
Bangtian Liu c10757200d Revert "Ensure SplitEdge to return the new block between the two given blocks"
This reverts commit cf638d793c.
2020-12-16 11:52:30 +00:00
Bangtian Liu cf638d793c Ensure SplitEdge to return the new block between the two given blocks
This PR implements the function splitBasicBlockBefore to address an
issue
that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore.
The issue occurs in SplitEdge when the Succ has a single predecessor
and the edge between the BB and Succ is not critical. This produces
the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
was added to splitBlockBefore to handle the issue and now produces
the correct result ‘BB->New->Succ’.

Below is an example of splitting the block bb1 at its first instruction.

/// Original IR
bb0:
	br bb1
bb1:
        %0 = mul i32 1, 2
	br bb2
bb2:
/// IR after splitEdge(bb0, bb1) using splitBasicBlock
bb0:
	br bb1
bb1:
	br bb1.split
bb1.split:
        %0 = mul i32 1, 2
	br bb2
bb2:
/// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
bb0:
	br bb1.split
bb1.split
	br bb1
bb1:
        %0 = mul i32 1, 2
	br bb2
bb2:

Differential Revision: https://reviews.llvm.org/D92200
2020-12-15 23:32:29 +00:00
Roman Lebedev e113317958
[NFCI][SimplifyCFG] Add basic scaffolding for gradually making the pass DomTree-aware
Two observations:
1. Unavailability of DomTree makes it impossible to make
  `FoldBranchToCommonDest()` transform in certain cases,
   where the successor is dominated by predecessor,
   because we then don't have PHI's, and can't recreate them,
   well, without handrolling 'is dominated by' check,
   which doesn't really look like a great solution to me.
2. Avoiding invalidating DomTree in SimplifyCFG will
   decrease the number of `Dominator Tree Construction` by 5
   (from 28 now, i.e. -18%) in `-O3` old-pm pipeline
   (as per `llvm/test/Other/opt-O3-pipeline.ll`)
   This might or might not be beneficial for compile time.

So the plan is to make SimplifyCFG preserve DomTree, and then
eventually make DomTree fully required and preserved by the pass.

Now, SimplifyCFG is ~7KLOC. I don't think it will be nice
to do all this uplifting in a single mega-commit,
nor would it be possible to review it in any meaningful way.

But, i believe, it should be possible to do this in smaller steps,
introducing the new behavior, in an optional way, off-by-default,
opt-in option, and gradually fixing transforms one-by-one
and adding the flag to appropriate test coverage.

Then, eventually, the default should be flipped,
and eventually^2 the flag removed.

And that is what is happening here - when the new off-by-default option
is specified, DomTree is required and is claimed to be preserved,
and SimplifyCFG-internal assertions verify that the DomTree is still OK.
2020-12-16 00:38:00 +03:00
Nico Weber a852ee199c Reland "[MachineDebugify] Insert synthetic DBG_VALUE instructions"
This reverts commit 841f9c937f.
The change landed many months ago; something else broke those tests.
2020-12-14 22:34:23 -05:00
Nico Weber 841f9c937f Revert "[MachineDebugify] Insert synthetic DBG_VALUE instructions"
This reverts commit 2a5675f11d.
The tests it adds fail: https://reviews.llvm.org/D78135#2453736
2020-12-14 22:14:48 -05:00
Gulfem Savrun Yeniceri 7c0e3a77bc [clang][IR] Add support for leaf attribute
This patch adds support for leaf attribute as an optimization hint
in Clang/LLVM.

Differential Revision: https://reviews.llvm.org/D90275
2020-12-14 14:48:17 -08:00
Philip Reames f5fe8493e5 [LAA] Relax restrictions on early exits in loop structure
his is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be mostly NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist).  Moving the checks does end up changing some of the optimization warnings and debug output slightly, but nothing that appears to be a regression.

Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so.

Differential Revision: https://reviews.llvm.org/D92066
2020-12-14 12:44:01 -08:00
Roman Lebedev 59560e8589
[SimplifyCFG] FoldBranchToCommonDest(): temporairly put back restrictions on liveout uses of bonus instructions (PR48450)
Even though d38205144f was mostly a correct
fix for the external non-PHI users, it's not a *generally* correct fix,
because the 'placeholder' values in those trivial PHI's we create
shouldn't be *always* 'undef', but the PHI itself for the backedges,
else we end up with wrong value, as the `@pr48450_2` test shows.

But we can't just do that, because we can't check that the PHI
can be it's own incoming value when coming from certain predecessor,
because we don't have a dominator tree.

So until we can address this correctness problem properly,
ensure that we don't perform the transformation
if there are such problematic external uses.

Making dominator tree available there is going to be involved,
since `-simplifycfg` pass currently does not preserve/update domtree...
2020-12-14 20:14:31 +03:00
Roman Lebedev e8360a8e1e
[NFC][SimplifyCFG] FoldBranchToCommonDest(): pull out 'common successor' into a variable
Makes it easier to use it elsewhere
2020-12-14 20:14:31 +03:00
Kazu Hirata 5891ad4e22 [Transforms] Use llvm::erase_value (NFC) 2020-12-13 09:48:47 -08:00
Roman Lebedev d38205144f
[SimplifyCFG] FoldBranchToCommonDest(): bonus instrns must only be used by PHI nodes in successors (PR48450)
In particular, if the successor block, which is about to get a new
predecessor block, currently only has a single predecessor,
then the bonus instructions will be directly used within said successor,
which is fine, since the block with bonus instructions dominates that
successor. But once there's a new predecessor, the IR is no longer valid,
and we don't fix it, because we only update PHI nodes.

Which means, the live-out bonus instructions must be exclusively used
by the PHI nodes in successor blocks. So we have to form trivial PHI nodes.
which will then be successfully updated to recieve cloned bonus instns.

This all works fine, except for the fact that we don't have access to
the dominator tree, and we don't ignore unreachable code,
so we sometimes do end up having to deal with some weird IR.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48450
2020-12-13 00:06:57 +03:00
Fangrui Song b5ad32ef5c Migrate deprecated DebugLoc::get to DILocation::get
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.

The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D93087
2020-12-11 12:45:22 -08:00
Philip Reames 5171b7b40e [indvars] Common a bit of code [NFC] 2020-12-08 15:25:48 -08:00
Valentin Churavy 700cf7dcc9 [VNCoercion] Disallow coercion between different ni addrspaces
I'm not sure if it would be legal by the IR reference to introduce
an addrspacecast here, since the IR reference is a bit vague on
the exact semantics, but at least for our usage of it (and I
suspect for many other's usage) it is not. For us, addrspacecasts
between non-integral address spaces carry frontend information that the
optimizer cannot deduce afterwards in a generic way (though we
have frontend specific passes in our pipline that do propagate
these). In any case, I'm sure nobody is using it this way at
the moment, since it would have introduced inttoptrs, which
are definitely illegal.

Fixes PR38375

Co-authored-by: Keno Fischer <keno@alumni.harvard.edu>

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D50010
2020-12-07 20:19:48 -05:00
Fangrui Song 2832f3528c [Transforms] Delete unused declarations from NewGVN/CoroSplit/ValueMapper 2020-12-06 13:04:01 -08:00
Hiroshi Yamauchi f9c3954a6e Fix for Bug 48055.
Differential Revision: https://reviews.llvm.org/D92599
2020-12-04 11:05:01 -08:00
Max Kazantsev 12b6c5e682 Return "[IndVars] ICmpInst should not prevent IV widening"
This reverts commit 4bd35cdc3a.

The patch was reverted during the investigation. The investigation
shown that the patch did not cause any trouble, but just exposed
the existing problem that is addressed by the previous patch
"[IndVars] Quick fix LHS/RHS bug". Returning without changes.
2020-12-04 12:34:43 +07:00
Max Kazantsev 3df0daceb2 [IndVars] Quick fix LHS/RHS bug
The code relies on fact that LHS is the NarrowDef but never
really checks it. Adding the conservative restrictive check,
will follow-up with handling of case where RHS is a NarrowDef.
2020-12-04 12:34:42 +07:00
dfukalov 2ce38b3f03 [NFC] Reduce include files dependency.
1. Removed #include "...AliasAnalysis.h" in other headers and modules.
2. Cleaned up includes in AliasAnalysis.h.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92489
2020-12-03 18:25:05 +03:00
Max Kazantsev 4bd35cdc3a Revert "[IndVars] ICmpInst should not prevent IV widening"
This reverts commit 0c9c6ddf17.

We are seeing some failures with this patch locally. Not clear
if it's causing them or just triggering a problem in another
place. Reverting while investigating.
2020-12-03 18:01:41 +07:00
David Sherwood 71bd59f0cb [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute
In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:

  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
  ...
  !2 = !{!2, !3, !4}
  !3 = !{!"llvm.loop.vectorize.width", i32 8}
  !4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}

Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.

Differential Revision: https://reviews.llvm.org/D88962
2020-12-02 13:23:43 +00:00
Roman Lebedev 15f8060f6f
[SimplifyCFG] FoldBranchToCommonDest: don't require that cmp of br is last instruction
There is no correctness need for that, and since we allow live-out
uses, this could theoretically happen, because currently nothing
will move the cond to right before the branch in those tests.
But regardless, lifting that restriction even makes the transform
easier to understand.

This makes the transform happen in 81 more cases (+0.55%)
)
2020-12-01 15:13:06 +03:00
Roman Lebedev b0e9b7c59f
[NFC][SimplifyCFG] Add STATISTIC() to the FoldValueComparisonIntoPredecessors() fold 2020-11-30 12:27:16 +03:00
Max Kazantsev 0c9c6ddf17 [IndVars] ICmpInst should not prevent IV widening
If we decided to widen IV with zext, then unsigned comparisons
should not prevent widening (same for sext/sign comparisons).
The result of comparison in wider type does not change in this case.

Differential Revision: https://reviews.llvm.org/D92207
Reviewed By: nikic
2020-11-30 10:51:31 +07:00
Roman Lebedev b33fbbaa34
Reland [SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions
This was orginally committed in 2245fb8aaa.
but was immediately reverted in f3abd54958
because of a PHI handling issue.

Original commit message:

1. It doesn't make sense to enforce that the bonus instruction
   is only used once in it's basic block. What matters is
   whether those user instructions fit within our budget, sure,
   but that is another question.
2. It doesn't make sense to enforce that said bonus instructions
   are only used within their basic block. Perhaps the branch
   condition isn't using the value computed by said bonus instruction,
   and said bonus instruction is simply being calculated
   to be used in successors?

So iff we can clone bonus instructions, to lift these restrictions,
we just need to carefully update their external uses
to use the new cloned instructions.

Notably, this transform (even without this change) appears to be
poison-unsafe as per alive2, but is otherwise (including the patch) legal.

We don't introduce any new PHI nodes, but only "move" the instructions
around, i'm not really seeing much potential for extra cost modelling
for the transform, especially since now we allow at most one such
bonus instruction by default.

This causes the fold to fire +11.4% more (13216 -> 14725)
as of vanilla llvm test-suite + RawSpeed.

The motivational pattern is IEEE-754-2008 Binary16->Binary32
extension code:
ca57d77fb2/src/librawspeed/common/FloatingPoint.h (L115-L120)
^ that should be a switch, but it is not now: https://godbolt.org/z/bvja5v
That being said, even thought this seemed like this would fix it: https://godbolt.org/z/xGq3TM
apparently that fold is happening somewhere else afterall,
so something else also has a similar 'artificial' restriction.
2020-11-27 12:47:15 +03:00
Max Kazantsev faf183874c [IndVars] LCSSA Phi users should not prevent widening
When widening an IndVar that has LCSSA Phi users outside
the loop, we can safely widen it as usual and then truncate
the result outside the loop without hurting the performance.

Differential Revision: https://reviews.llvm.org/D91593
Reviewed By: skatkov
2020-11-27 11:19:54 +07:00
Roman Lebedev f3abd54958
Revert "[SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions"
Many bots are unhappy, at the very least missed a few codegen tests,
and possibly this has a logic hole inducing a miscompile
(will be really awesome to have ready reproducer..)

Need to investigate.

This reverts commit 2245fb8aaa.
2020-11-26 23:13:43 +03:00
Roman Lebedev 2245fb8aaa
[SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions
1. It doesn't make sense to enforce that the bonus instruction
   is only used once in it's basic block. What matters is
   whether those user instructions fit within our budget, sure,
   but that is another question.
2. It doesn't make sense to enforce that said bonus instructions
   are only used within their basic block. Perhaps the branch
   condition isn't using the value computed by said bonus instruction,
   and said bonus instruction is simply being calculated
   to be used in successors?

So iff we can clone bonus instructions, to lift these restrictions,
we just need to carefully update their external uses
to use the new cloned instructions.

Notably, this transform (even without this change) appears to be
poison-unsafe as per alive2, but is otherwise (including the patch) legal.

We don't introduce any new PHI nodes, but only "move" the instructions
around, i'm not really seeing much potential for extra cost modelling
for the transform, especially since now we allow at most one such
bonus instruction by default.

This causes the fold to fire +11.4% more (13216 -> 14725)
as of vanilla llvm test-suite + RawSpeed.

The motivational pattern is IEEE-754-2008 Binary16->Binary32
extension code:
ca57d77fb2/src/librawspeed/common/FloatingPoint.h (L115-L120)
^ that should be a switch, but it is not now: https://godbolt.org/z/bvja5v
That being said, even thought this seemed like this would fix it: https://godbolt.org/z/xGq3TM
apparently that fold is happening somewhere else afterall,
so something else also has a similar 'artificial' restriction.
2020-11-26 22:51:22 +03:00
Roman Lebedev 65db7d38e0
[NFC][SimplifyCFG] Add statistic to `FoldBranchToCommonDest()` fold 2020-11-26 22:51:21 +03:00
Zhengyang Liu 345fcccb33 Fix use-of-uninitialized-value in rG75f50e15bf8f
Differential Revision: https://reviews.llvm.org/D71126
2020-11-26 01:39:22 -07:00
Max Kazantsev 28d7ba1543 [IndVars] Use more precise context when eliminating narrowing
When deciding to widen narrow use, we may need to prove some facts
about it. For proof, the context is used. Currently we take the instruction
being widened as the context.

However, we may be more precise here if we take as context the point that
dominates all users of instruction being widened.

Differential Revision: https://reviews.llvm.org/D90456
Reviewed By: skatkov
2020-11-25 11:47:39 +07:00
Philip Reames 10ddb927c1 [SCEV] Use isa<> pattern for testing for CouldNotCompute [NFC]
Some older code - and code copied from older code - still directly tested against the singelton result of SE::getCouldNotCompute.  Using the isa<SCEVCouldNotCompute> form is both shorter, and more readable.
2020-11-24 18:47:49 -08:00
Kazu Hirata df73b8c174 [ValueMapper] Remove unused declaration remapFunction (NFC)
The function declaration with two parameters was introduced on Apr 16
2016 in commit f0d73f95c1 without a
corresponding definition.
2020-11-22 21:52:03 -08:00
Hongtao Yu f3c445697d [CSSPGO] IR intrinsic for pseudo-probe block instrumentation
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues:

1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.

We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality.

Let's now look at an example. Given the following LLVM IR:

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
   br i1 %cmp, label %bb1, label %bb2
bb1:
   br label %bb3
bb2:
   br label %bb3
bb3:
   ret void
}
```

The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
   br i1 %cmp, label %bb1, label %bb2
bb1:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
   br label %bb3
bb2:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
   br label %bb3
bb3:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
   ret void
}

```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86490
2020-11-20 10:39:24 -08:00
Max Kazantsev 515105f46b [NFC] Remove comment (commited ahead of time by mistake) 2020-11-19 16:28:34 +07:00
Max Kazantsev 7c601d09a7 [NFC] Move code earlier as preparation for further changes 2020-11-19 16:27:23 +07:00
Kazu Hirata 43c0e4f665 [Transforms] Use llvm::is_contained (NFC) 2020-11-18 20:42:22 -08:00
Nikita Popov f4a3969bff [Inline] Fix incorrectly dropped noalias metadata
This is the same fix as 23aeadb89d,
just for CloneScopedAliasMetadata rather than PropagateCallSiteMetadata.

In this case the previous outcome was incorrectly dropped metadata,
as it was not part of the computed metadata map.

The real change in the test is that the first load now retains
metadata, the rest of the changes are due to changes in metadata
numbering.
2020-11-18 21:22:50 +01:00
Nikita Popov 23aeadb89d [Inline] Fix incorrect noalias metadata application (PR48209)
The VMap also contains a mapping from Argument => Instruction,
where the instruction is part of the original function, not the
inlined one. The code was assuming that all the instructions in
the VMap were inlined.

This was a pre-existing problem for the loop access metadata, but
was extended to the more common noalias metadata by
27f647d117, thus causing miscompiles.

There is a similar assumption inside CloneAliasScopeMetadata(), so
that one likely needs to be fixed as well.
2020-11-18 20:52:58 +01:00
Nick Desaulniers f4c6080ab8 Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Matt Arsenault c5ce6036c1 Linker: Fix linking of byref types
This wasn't properly remapping the type like with the other
attributes, so this would end up hitting a verifier error after
linking different modules using byref.
2020-11-17 11:02:04 -05:00
Max Kazantsev 63dd1734b2 [NFC] Collect ext users into vector instead of finding them twice 2020-11-17 14:01:43 +07:00
Kazu Hirata 1da60f1d44 [Transforms] Use pred_empty (NFC) 2020-11-16 22:09:14 -08:00
Arthur Eubanks 7de6dcd246 [Debugify] Skip debugifying on special/immutable passes
With a function pass manager, it would insert debuginfo metadata before
getting to function passes while processing the pass manager, causing
debugify to skip while running the function passes.

Skip special passes + verifier + printing passes. Compared to the legacy
implementation of -debugify-each, this additionally skips verifier
passes. Probably no need to update the legacy version since it will be
obsolete soon.

This fixes 2 instcombine tests using -debugify-each under NPM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D91558
2020-11-16 20:39:46 -08:00
Roman Lebedev 6861d938e5
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485
the implementation is known-broken for certain inputs,
the bugreport was up for a significant amount of timer,
and there has been no activity to address it.
Therefore, just completely rip out all of misexpect handling.

I suspect, fixing it requires redesigning the internals of MD_misexpect.
Should anyone commit to fixing the implementation problem,
starting from clean slate may be better anyways.

This reverts commit 7bdad08429,
and some of it's follow-ups, that don't stand on their own.
2020-11-14 13:12:38 +03:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Max Kazantsev 0a1d394bf3 [NFC] Refactor loop-invariant getters to return Optional 2020-11-13 15:03:10 +07:00
Max Kazantsev d6dd938589 [IndVars] IV user should not prevent use widening
Sometimes the an instruction we are trying to widen is used by the IV
(which means the instruction is the IV increment). Currently this may
prevent its widening. We should ignore such user because it will be
dead once the transform is done anyways.

Differential Revision: https://reviews.llvm.org/D90920
Reviewed By: fhahn
2020-11-12 12:02:01 +07:00
Max Kazantsev 2e01ceafaa [IndVars] Recognize 'sub nuw' expressed as 'add' for widening
InstCombine canonicalizes 'sub nuw' instructions to 'add' without the
`nuw` flag. The typical case where we see it is decrementing induction
variables. For them, IndVars fails to prove that it's legal to widen them,
and inserts unprofitable `zext`'s.

This patch adds recognition of such pattern using SCEV.

Differential Revision: https://reviews.llvm.org/D89550
Reviewed By: fhahn, skatkov
2020-11-12 10:51:29 +07:00
Jonas Paulsson 89a1042b6a Make inferLibFuncAttributes() add SExt attribute on second arg to ldexp.
This was missing as discovered by the SystemZ multistage bot:
http://lab.llvm.org:8011/#/builders/8, where wrong code resulted when this
extension was not performed.

Thanks for review by Ulrich Weigand and Roman Lebedev.

Differential Revision: https://reviews.llvm.org/D90760
2020-11-10 18:32:15 +01:00
David Green c7e275388e [ARM] Don't aggressively unroll vector remainder loops
We already do not unroll loops with vector instructions under MVE, but
that does not include the remainder loops that the vectorizer produces.
These remainder loops will be rarely executed and are not worth
unrolling, as the trip count is likely to be low if they get executed at
all. Luckily they get llvm.loop.isvectorized to make recognizing them
simpler.

We have wanted to do this for a while but hit issues with low overhead
loops being reverted due to difficult registry allocation. With recent
changes that seems to be less of an issue now.

Differential Revision: https://reviews.llvm.org/D90055
2020-11-10 17:01:31 +00:00
Tim Northover f7fe7ea24d [MergeFunctions] fix function attribute comparison in FunctionComparator
The comparison of AttributeSets stopped after seeing a matching type attribute.
Subsequent mismatching attributes were not detected causing a crash.
2020-11-09 09:19:11 +00:00
Kazu Hirata 75e46c6328 [Mem2Reg] Use llvm::count instead of std::count (NFC) 2020-11-07 20:18:47 -08:00
Atmn Patel 04a0896487 Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
This reverts commit 0b17c6e447. This patch
causes a compile-time error in SCEV.
2020-11-07 00:32:12 -05:00
Atmn Patel 0b17c6e447 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-11-06 22:06:58 -05:00
Atmn Patel babc224c5d [LoopDeletion] Remove dead loops with no exit blocks
Currently, LoopDeletion refuses to remove dead loops with no exit blocks
because it cannot statically determine the control flow after it removes
the block. This leads to miscompiles if the loop is an infinite loop and
should've been removed.

Differential Revision: https://reviews.llvm.org/D90115
2020-11-06 17:08:34 -05:00
Giorgis Georgakoudis 700d2417d8 [CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers
CodeExtractor handles bitcasts in the extracted region that have
lifetime markers users in the outer region as outputs. That
creates unnecessary alloca/reload instructions and extra lifetime
markers. The patch identifies those cases, and replaces uses in
out-of-region lifetime markers with new bitcasts in the outer region.

**Example**
```
define void @foo() {
entry:
  %0 = alloca i32
  br label %extract

extract:
  %1 = bitcast i32* %0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 4, i8* %1)
  call void @use(i32* %0)
  br label %exit

exit:
  call void @use(i32* %0)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* %1)
  ret void
}
```

**Current extraction**
```
define void @foo() {
entry:
  %.loc = alloca i8*, align 8
  %0 = alloca i32, align 4
  br label %codeRepl

codeRepl:                                         ; preds = %entry
  %lt.cast = bitcast i8** %.loc to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast)
  %lt.cast1 = bitcast i32* %0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
  call void @foo.extract(i32* %0, i8** %.loc)
  %.reload = load i8*, i8** %.loc, align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast)
  br label %exit

exit:                                             ; preds = %codeRepl
  call void @use(i32* %0)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload)
  ret void
}

define internal void @foo.extract(i32* %0, i8** %.out) {
newFuncRoot:
  br label %extract

exit.exitStub:                                    ; preds = %extract
  ret void

extract:                                          ; preds = %newFuncRoot
  %1 = bitcast i32* %0 to i8*
  store i8* %1, i8** %.out, align 8
  call void @use(i32* %0)
  br label %exit.exitStub
}
```

**Extraction with patch**
```
define void @foo() {
entry:
  %0 = alloca i32, align 4
  br label %codeRepl

codeRepl:                                         ; preds = %entry
  %lt.cast1 = bitcast i32* %0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
  call void @foo.extract(i32* %0)
  br label %exit

exit:                                             ; preds = %codeRepl
  call void @use(i32* %0)
  %lt.cast = bitcast i32* %0 to i8*
  call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast)
  ret void
}

define internal void @foo.extract(i32* %0) {
newFuncRoot:
  br label %extract

exit.exitStub:                                    ; preds = %extract
  ret void

extract:                                          ; preds = %newFuncRoot
  %1 = bitcast i32* %0 to i8*
  call void @use(i32* %0)
  br label %exit.exitStub
}
```

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D90689
2020-11-05 17:01:08 -08:00
Sjoerd Meijer 7eb70158e4 [IndVarSimplify][SimplifyIndVar] Move WidenIV to Utils/SimplifyIndVar. NFCI.
This moves WidenIV from IndVarSimplify to Utils/SimplifyIndVar so that we have
createWideIV available as a generic helper utility. I.e., this is not only
useful in IndVarSimplify, but could be useful for loop transformations. For
example, motivation for this refactoring is the loop flatten transformation: if
induction variables in a loop nest can be widened, we can avoid having to
perform certain overflow checks, enabling this transformation.

Differential Revision: https://reviews.llvm.org/D90421
2020-11-05 16:52:47 +00:00
Xun Li 7f34aca083 [musttail] Unify musttail call preceding return checking
There is already an API in BasicBlock that checks and returns the musttail call if it precedes the return instruction.
Use it instead of manually checking in each place.

Differential Revision: https://reviews.llvm.org/D90693
2020-11-03 11:39:27 -08:00
Jameson Nash 59a6ab28c4 [GVN] small improvements to comments 2020-11-03 13:21:48 -05:00
Fangrui Song 98b9338588 [Debugify] Port -debugify-each to NewPM
Preemptively switch 2 tests to the new PM

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90365
2020-11-02 08:16:43 -08:00
Florian Hahn b3b993a7ad Reland "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts the revert commit 408c4408fa.

This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.

Original message:

On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.

This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.

This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.

I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
2020-11-02 15:39:29 +00:00
Nikita Popov 27f647d117 [Inliner] Consistently apply callsite noalias metadata
Previously, !noalias and !alias.scope metadata on the call site was
applied as part of CloneAliasScopeMetadata(), which short-circuits
if the callee does not use any noalias metadata itself. However,
these two things have no relation to each other.

Consistently apply !noalias and !alias.scope metadata by integrating
this into an existing function that handled !llvm.access.group and
!llvm.mem.parallel_loop_access metadata. The handling for all of
these metadata kinds essentially the same.
2020-10-31 10:54:45 +01:00
Arthur Eubanks 5c31b8b94f Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Florian Hahn 408c4408fa Revert "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts commit 73f01e3df5.

This appears to break
http://lab.llvm.org:8011/#/builders/85/builds/383.
2020-10-30 21:26:14 +00:00
Arthur Eubanks 10f2a0d662 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Pedro Tammela 70a495c7f0 [NFC][LoopSimplify] modernize for loops over LoopInfo
This patch modifies two for loops to use the range based syntax.
Since they are equivalent, this patch is tagged NFC.

Differential Revision: https://reviews.llvm.org/D90069
2020-10-30 16:50:07 +00:00
Simon Pilgrim ed577892cf Use cast<> instead of dyn_cast<> as we dereference the pointers immediately. NFCI.
Fix clang static analyzer warnings - we're better off relying on cast<> asserting on failure rather than a null dereference crash.
2020-10-30 15:20:40 +00:00
Simon Pilgrim b7c91a9b8e [SCEV] SCEVExpander::InsertNoopCastOfTo - reduce scope of pointer type. NFCI.
By reducing the scope of the dyn_cast<PointerType> we can make this a cast<PointerType> and avoid clang static analyzer null deference warnings.
2020-10-30 14:55:09 +00:00
Florian Hahn 73f01e3df5 [TTI] Add VecPred argument to getCmpSelInstrCost.
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.

This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.

This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.

I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.

Reviewed By: dmgreen, RKSimon

Differential Revision: https://reviews.llvm.org/D90070
2020-10-30 13:49:08 +00:00
Roman Lebedev 81fc53a36a
[SCEV] Introduce SCEVPtrToIntExpr (PR46786)
And use it to model LLVM IR's `ptrtoint` cast.

This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)

As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
  and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
  to not loose pointer-ness of an expression in more cases

As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.

But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89456
2020-10-30 11:13:35 +03:00
Stefanos Baziotis a3345300b6 [LCSSA] Doc for special treatment of PHIs
Differential Revision: https://reviews.llvm.org/D89739
2020-10-29 22:50:07 +02:00
Nikita Popov 20b386aae0 [LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0
was fine for all existing uses of this function though, as it is
always generated with fast flags right now, and thus nsz.
2020-10-29 21:45:13 +01:00
Dávid Bolvanský 7a2abf5aca [InferAttrs] Add nocapture/writeonly to string/mem libcalls
One step closer to fix PR47644.

Differential Revision: https://reviews.llvm.org/D89645
2020-10-29 20:06:43 +01:00
Max Kazantsev a5b2e795c3 [NFC][SCEV] Refactor monotonic predicate checks to return enums instead of bools
This patch gets rid of output parameter which is not needed for most users
and prepares this API for further refactoring.
2020-10-29 16:01:25 +07:00
Fangrui Song 39856d5d0b [Debugify] Move global namespace functions into llvm::
Also move exportDebugifyStats from tools/opt to Debugify.cpp
2020-10-28 19:11:41 -07:00
Vedant Kumar 5a3ef55a52 [Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746)
This patch changes MergeBlockIntoPredecessor to skip the call to
RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to
some compile-time issues spotted in LoopUnroll and SimplifyCFG.

The call to RemoveRedundantDbgInstrs appears to have changed the
worst-case behavior of the merging utility. Loosely speaking, it seems
to have gone from O(#phis) to O(#insts).

It might not be possible to mitigate this by scanning a block to
determine whether there are any debug intrinsics to remove, since such a
scan costs O(#insts).

So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly
little fallout from this, and most of it can be addressed by doing
RemoveRedundantDbgInstrs later. The exception is (the block-local
version of) SimplifyCFG, where it might just be too expensive to call
RemoveRedundantDbgInstrs.

Differential Revision: https://reviews.llvm.org/D88928
2020-10-27 10:12:59 -07:00
Simon Pilgrim bce770ffa6 Revert rG0905bd5c2fa42bd4c "[InstCombine] collectBitParts - add trunc support."
This reverts commit 0905bd5c2f.

Causing failures in multistage buildbots that I need to investigate
2020-10-27 13:43:54 +00:00
Nico Weber 2a4e704c92 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c6.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.
2020-10-27 09:26:21 -04:00
Simon Pilgrim 0905bd5c2f [InstCombine] collectBitParts - add trunc support.
This should allow us to remove the rather limited matchOrConcat fold and just use recognizeBSwapOrBitReverseIdiom.
2020-10-27 13:14:54 +00:00
Arthur Eubanks e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Sriraman Tallam ad1b9daa4b Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names.
Prepend the module name hash with a fixed string ".__uniq." which helps tools
that consume sampled profiles and attribute it to functions to understand
that this symbol belongs to a unique internal linkage type symbol.

Symbols with suffixes can result from various optimizations in the compiler.
Function Multiversioning, function splitting, parameter constant propogation,
unique internal linkage names.

External tools like sampled profile aggregators combine profiles from multiple
runs of a binary. They use various heuristics with symbols that have suffixes
to try and attribute the profile to the right function instance. For instance
multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different
should be attributed to the same source function if a single function is
versioned, using attribute target_clones (supported in GCC but yet to land in
LLVM). Similarly, functions that are split (split part having a .cold suffix)
could have profiles for both the original and split symbols but would be
aggregated and attributed to the original function that was split.

Unique internal linkage functions however have different source instances and
the aggregator must not put them together but attribute it to the appropriate
function instance. To be sure that we are dealing with a symbol of a unique
internal linkage function, we would like to prepend the hash with a known
string ".__uniq." which these tools can check to understand the suffix type.

Differential Revision: https://reviews.llvm.org/D89617
2020-10-26 14:24:28 -07:00
Simon Pilgrim 532f3bec3e [InstCombine] collectBitParts - add bitreverse intrinsic support. 2020-10-26 14:36:36 +00:00
TaWeiTu 060a4fccf1 [LoopVersioning] Form dedicated exits for versioned loop to preserve simplify form
The exit blocks of the versioned and non-versioned loops are not dedicated and thus the two loops are not in simplify form.
Insert dummy exit blocks after loop versioning with `formDedicatedExits()` to preserve the simplify form for subsequence passes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D89569
2020-10-24 21:40:46 +08:00
Artur Pilipenko 6ec2c5e402 GC-parseable element atomic memcpy/memmove
This change introduces a GC parseable lowering for element atomic
memcpy/memmove intrinsics. This way runtime can provide an
implementation which can take a safepoint during copy operation.

See "GC-parseable element atomic memcpy/memmove" thread on llvm-dev
for the background and details:
https://groups.google.com/g/llvm-dev/c/NnENHzmX-b8/m/3PyN8Y2pCAAJ

Differential Revision: https://reviews.llvm.org/D88861
2020-10-23 14:06:09 -07:00
Nick Desaulniers b7926ce6d7 [IR] add fn attr for no_stack_protector; prevent inlining on mismatch
It's currently ambiguous in IR whether the source language explicitly
did not want a stack a stack protector (in C, via function attribute
no_stack_protector) or doesn't care for any given function.

It's common for code that manipulates the stack via inline assembly or
that has to set up its own stack canary (such as the Linux kernel) would
like to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an __attribute__((__no_stack_protector__)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.

While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not very ergonomic or as
ergonomic as a function attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u

Typically, when inlining a callee into a caller, the caller will be
upgraded in its level of stack protection (see adjustCallerSSPLevel()).
By adding an explicit attribute in the IR when the function attribute is
used in the source language, we can now identify such cases and prevent
inlining.  Block inlining when the callee and caller differ in the case that one
contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`.

Fixes pr/47479.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D87956
2020-10-23 11:55:39 -07:00
OCHyams fea067bdfd [mem2reg] Remove dbg.values describing contents of dead allocas
This patch copies @vsk's fix to instcombine from D85555 over to mem2reg. The
motivation and rationale are exactly the same: When mem2reg removes an alloca,
it erases the dbg.{addr,declare} instructions which refer to the alloca. It
would be better to instead remove all debug intrinsics which describe the
contents of the dead alloca, namely all dbg.value(<dead alloca>, ...,
DW_OP_deref)'s.

As far as I can tell, prior to D80264 these `dbg.value+deref`s would have been
silently dropped instead of being made `undef`, so we're just returning to
previous behaviour with these patches.

Testing:
`llvm-lit llvm/test` and `ninja check-clang` gave no unexpected failures. Added
3 tests, each of which covers a dbg.value deletion path in mem2reg:
  mem2reg-promote-alloca-1.ll
  mem2reg-promote-alloca-2.ll
  mem2reg-promote-alloca-3.ll
The first is based on the dexter test inlining.c from D89543. This patch also
improves the debugging experience for loop.c from D89543, which suffers
similarly after arg promotion instead of inlining.
2020-10-23 04:46:56 +00:00
Caroline Concatto 2415636475 [SVE]Clarify TypeSize comparisons in llvm/lib/Transforms
Use isKnownXY comparators when one of the operands can be with
scalable vectors or getFixedSize() for all the other cases.

This patch also does bug fixes for getPrimitiveSizeInBits by using
getFixedSize() near the places with the TypeSize comparison.

Differential Revision: https://reviews.llvm.org/D89703
2020-10-23 09:15:17 +01:00
Vedant Kumar 099bffe7f7 Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)"
This reverts commit 26ee8aff2b.

It's necessary to insert bitcast the pointer operand of a lifetime
marker if it has an opaque pointer type.

rdar://70560161
2020-10-22 12:25:50 -07:00
Arthur Eubanks 92d9a3868a Port -instnamer to NPM
Some clang tests use this.

Reviewed By: akhuang

Differential Revision: https://reviews.llvm.org/D89931
2020-10-22 12:08:36 -07:00
Zequan Wu 2f29341114 Revert "Revert "SimplifyCFG: Clean up optforfuzzing implementation""
This reverts commit 716f7636e1.
2020-10-21 17:08:56 -07:00
Zequan Wu 716f7636e1 Revert "SimplifyCFG: Clean up optforfuzzing implementation"
See discussion: https://reviews.llvm.org/D89590
This reverts commit cdd006eec9.
2020-10-21 16:56:32 -07:00
Geoffrey Martin-Noble c17ae2916c Remove unnecessary header include which violates layering
This was introduced in https://reviews.llvm.org/D89774, but I don't
think it should be necessary.

Reviewed By: TaWeiTu, aeubanks

Differential Revision: https://reviews.llvm.org/D89843
2020-10-20 20:14:03 -07:00
Ta-Wei Tu 529ecd19df [NPM] port -unify-loop-exits to NPM
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D89774
2020-10-20 10:46:57 -07:00
Ta-Wei Tu 59286b36df [NPM] Port -mergereturn to NPM
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D89781
2020-10-20 10:33:58 -07:00
Atmn Patel 595c615606 [IR] Adds mustprogress as a LLVM IR attribute
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D85393
2020-10-20 03:09:57 -04:00
Jordan Rupprecht 8a377f1e3c [NFC] Inline assertion-only variable 2020-10-19 15:11:37 -07:00
Roman Lebedev e0567582b8
[NFCI][SCEV] Always refer to enum SCEVTypes as enum, not integer
The main tricky thing here is forward-declaring the enum:
we have to specify it's underlying data type.

In particular, this avoids the danger of switching over the SCEVTypes,
but actually switching over an integer, and not being notified
when some case is not handled.

I have updated most of such switches to be exaustive and not have
a default case, where it's pretty obvious to be the intent,
however not all of them.
2020-10-20 00:10:22 +03:00
Roman Lebedev 3355284b2d
[NFC][SCEVExpander] isHighCostExpansionHelper(): rewrite as a switch
If we switch over an enum, compiler can easily issue a diagnostic
if some case is not handled. However with an if cascade that isn't so.
Experimental evidence suggests new behavior to be superior.
2020-10-20 00:10:22 +03:00
Roman Lebedev d083d55c2c
[NFC][SCEV] Rename SCEVCastExpr into SCEVIntegralCastExpr
All existing SCEV cast types operate on integers.
D89456 will add SCEVPtrToIntExpr cast expression type.
I believe this is best for consistency.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89455
2020-10-19 10:59:53 +03:00
Dávid Bolvanský 65e94cc946 [InferAttrs] Add argmemonly attribute to string libcalls
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89602
2020-10-18 01:33:26 +02:00
Dávid Bolvanský 2a75e956e5 Revert "[InferAttrs] Add argmemonly attribute to string libcalls"
This reverts commit b77dd32a6f. Sanitizer tests are broken.
2020-10-17 23:29:02 +02:00
Dávid Bolvanský b77dd32a6f [InferAttrs] Add argmemonly attribute to string libcalls
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89602
2020-10-17 22:42:36 +02:00
Matt Arsenault 0a7cd99a70 Reapply "OpaquePtr: Add type to sret attribute"
This reverts commit eb9f7c28e5.

Previously this was incorrectly handling linking of the contained
type, so this merges the fixes from D88973.
2020-10-16 11:05:02 -04:00
Michael Liao 98f254960f [globalopt] Teach to look through `addrspacecast`.
- so that global variables in numbered address spaces could be properly
  analyzed.

Differential Revision: https://reviews.llvm.org/D89140
2020-10-16 08:43:09 -04:00
Florian Hahn 89c0124273 [LoopVersion] Unify SCEVChecks and alias check handling (NFC).
This is an initial cleanup of the way LoopVersioning interacts with LAA.

Currently LoopVersioning has 2 ways of initializing things:

1. Passing LAI and passing UseLAIChecks = true
2. Passing UseLAIChecks = false, followed by calling setSCEVChecks and
   setAliasChecks.

Both ways of initializing lead to the same result and the duplication
seems more complicated than necessary.

This patch removes the UseLAIChecks flag from the constructor and the
setSCEVChecks & setAliasChecks helpers and move initialization
exclusively to the constructor.

This simplifies things, by providing a single way to initialize
LoopVersioning and reducing duplication.

Reviewed By: Meinersbur, lebedev.ri

Differential Revision: https://reviews.llvm.org/D84406
2020-10-15 22:02:17 +01:00
Roman Lebedev 7ee6c40247
Revert "Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"" and it's follow-ups
While we haven't encountered an earth-shattering problem with this yet,
by now it is pretty evident that trying to model the ptr->int cast
implicitly leads to having to update every single place that assumed
no such cast could be needed. That is of course the wrong approach.

Let's back this out, and re-attempt with some another approach,
possibly one originally suggested by Eli Friedman in
https://bugs.llvm.org/show_bug.cgi?id=46786#c20
which should hopefully spare us this pain and more.

This reverts commits 1fb6104293,
7324616660,
aaafe350bb,
e92a8e0c74.

I've kept&improved the tests though.
2020-10-14 16:09:18 +03:00
Juneyoung Lee 9b3c2a72e4 [ValueTracking] Use assume's noundef operand bundle
This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89219
2020-10-14 20:16:33 +09:00
Roman Lebedev 1fb6104293
Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"
This relands commit 1c021c64ca which was
reverted in commit 17cec6a11a because
an assertion was being triggered, since `BuildConstantFromSCEV()`
wasn't updated to handle the case where the constant we want to truncate
is actually a pointer. I was unsuccessful in coming up with a test case
where we'd end there with constant zext/sext of a pointer,
so i didn't handle those cases there until there is a test case.

Original commit message:

While we indeed can't treat them as no-ops, i believe we can/should
do better than just modelling them as `unknown`. `inttoptr` story
is complicated, but for `ptrtoint`, it seems straight-forward
to model it just as a zext-or-trunc of unknown.

This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D88806
2020-10-12 23:02:55 +03:00
Hans Wennborg 17cec6a11a Revert 1c021c64c "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"
> While we indeed can't treat them as no-ops, i believe we can/should
> do better than just modelling them as `unknown`. `inttoptr` story
> is complicated, but for `ptrtoint`, it seems straight-forward
> to model it just as a zext-or-trunc of unknown.
>
> This may be important now that we track towards
> making inttoptr/ptrtoint casts not no-op,
> and towards preventing folding them into loads/etc
> (see D88979/D88789/D88788)
>
> Reviewed By: mkazantsev
>
> Differential Revision: https://reviews.llvm.org/D88806

It caused the following assert during Chromium builds:

  llvm/lib/IR/Constants.cpp:1868:
  static llvm::Constant *llvm::ConstantExpr::getTrunc(llvm::Constant *, llvm::Type *, bool):
  Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed.

See code review for a link to a reproducer.

This reverts commit 1c021c64ca.
2020-10-12 18:39:35 +02:00
Florian Hahn ad5541045a [LoopDeletion] Remove over-eager SCEV verification.
60b852092c introduced SCEV verification to
deleteDeadLoop, but it appears this check is currently a bit over-eager
and some users of deleteDeadLoop appear to only patch up SE after
calling it (e.g. PR47753).

Remove the extra check for now. We can consider adding it back after we
tracked down the source of the inconsistency for PR47753.
2020-10-12 16:18:30 +01:00
Roman Lebedev 1c021c64ca
[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown
While we indeed can't treat them as no-ops, i believe we can/should
do better than just modelling them as `unknown`. `inttoptr` story
is complicated, but for `ptrtoint`, it seems straight-forward
to model it just as a zext-or-trunc of unknown.

This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D88806
2020-10-12 11:04:03 +03:00
Arthur Eubanks 0689dab844 [FixIrreducible][NewPM] Port -fix-irreducible to NPM
In the NPM, a pass cannot depend on another non-analysis pass. So pin
the test that tests that -lowerswitch is run automatically to legacy PM.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D89051
2020-10-09 09:22:09 -07:00
Simon Pilgrim 8f0658ae67 [Transforms] CodeExtractor::verifyAssumptionCache - don't dereference a dyn_cast<>. NFCI.
Use cast<> as we immediately dereference the pointer afterwards - cast<> will assert if we fail.

Prevents clang static analyzer warning that we could deference a null pointer.
2020-10-08 19:04:30 +01:00
Reid Kleckner 940d7aaea9 Port StripGCRelocates pass to NPM
Fixes one test under NPM

Differential Revision: https://reviews.llvm.org/D88766
2020-10-07 14:41:29 -07:00
Reid Kleckner da48fe1732 [NPM] Port strip nonlinetable debuginfo pass to the new pass manager
Fixes a few tests in llvm/test/Transforms/Utils.

Differential Revision: https://reviews.llvm.org/D88762
2020-10-07 14:35:36 -07:00
Dávid Bolvanský 86429c4eaf [SimplifyLibCalls] Optimize mempcpy_chk to mempcpy 2020-10-06 17:08:46 +02:00
Dávid Bolvanský a4bae56ab8 Revert "[SLC] Optimize mempcpy_chk to mempcpy"
This reverts commit 3f1fd59de3.
2020-10-05 22:27:14 +02:00
Dávid Bolvanský 3f1fd59de3 [SLC] Optimize mempcpy_chk to mempcpy
As reported in PR46735:

void* f(void *d, const void *s, size_t l)
{
    return __builtin___mempcpy_chk(d, s, l, __builtin_object_size(d, 0));
}

This can be optimized to `return mempcpy(d, s, l);`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86019
2020-10-05 22:18:36 +02:00
Simon Pilgrim aacfe2be53 [InstCombine] recognizeBSwapOrBitReverseIdiom - add vector support
Add basic vector handling to recognizeBSwapOrBitReverseIdiom/collectBitParts - this works at the element level, all vector element operations must match (splat constants etc.) and there is no cross-element support (insert/extract/shuffle etc.).
2020-10-03 16:26:46 +01:00
Simon Pilgrim 347fd9955a [InstCombine] recognizeBSwapOrBitReverseIdiom - use generic CreateIntegerCast
Try to appease buildbots breakages due to D88578
2020-10-03 15:29:22 +01:00
Simon Pilgrim 3aa93f690b [InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied)
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.

Reapplied with early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type.

Differential Revision: https://reviews.llvm.org/D88578
2020-10-03 14:52:42 +01:00
Arthur Eubanks 321986fe68 [MetaRenamer][NewPM] Port metarenamer to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D88690
2020-10-02 15:42:25 -07:00
Simon Pilgrim 0364721e3e Revert rG3d14a1e982ad27 - "[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)"
This reverts commit 3d14a1e982.

This is breaking on some 2stage clang buildbots
2020-10-02 18:17:14 +01:00
Simon Pilgrim 3d14a1e982 [InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.

Differential Revision: https://reviews.llvm.org/D88578
2020-10-02 17:25:12 +01:00
Philip Reames f29645e7af [gvn] Handle a corner case w/vectors of non-integral pointers
If we try to coerce a vector of non-integral pointers to a narrower type (either narrower vector or single pointer), we use inttoptr and violate the semantics of non-integral pointers.  In theory, we can handle many of these cases, we just need to use a different code idiom to convert without going through inttoptr and back.

This shows up as wrong code bugs, and in some cases, crashes due to failed asserts.  Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.
2020-10-01 19:20:21 -07:00
Simon Pilgrim 29ac9fae54 [InstCombine] collectBitParts - convert to use PatterMatch matchers and avoid IntegerType casts.
Make sure we're using getScalarSizeInBits instead of cast<IntegerType> to get Type bit widths.

This is preliminary cleanup before we can start adding vector support to the bswap/bitreverse (element level) matching.
2020-10-01 16:44:14 +01:00
Simon Pilgrim bc730b5e43 [InstCombine] collectBitParts - use APInt directly to check for out of range bit shifts. NFCI. 2020-10-01 12:50:36 +01:00
Simon Pilgrim c722b32596 [InstCombine] recognizeBSwapOrBitReverseIdiom - merge the regular/trunc+zext paths. NFCI.
There doesn't seem to be any good reason for having a separate path for when we bswap/bitreverse at a smaller size than the destination size - so merge these to make the instruction generation a lot clearer.
2020-09-30 14:54:04 +01:00
Simon Pilgrim d5545a8993 [InstCombine] recognizeBSwapOrBitReverseIdiom - remove unnecessary cast. NFCI. 2020-09-30 14:44:15 +01:00
Simon Pilgrim 621c6c8962 [InstCombine] recognizeBSwapOrBitReverseIdiom - cleanup bswap/bitreverse detection loop. NFCI.
Early out if both pattern matches have failed (or we don't want them). Fix case of bit index iterator (and avoid Wshadow issue).
2020-09-30 14:19:18 +01:00
Simon Pilgrim 413b4998bd [InstCombine] recognizeBSwapOrBitReverseIdiom - use ArrayRef::back() helper. NFCI.
Post-commit feedback on D88316
2020-09-30 13:39:18 +01:00
Simon Pilgrim 05290eead3 InstCombine] collectBitParts - cleanup variable names. NFCI.
Fix a number of WShadow warnings (I was used as the instruction and index......) and fix cases to match style.

Also, replaced the Bit APInt mask check in AND instructions with a direct APInt[] bit check.
2020-09-30 13:25:32 +01:00
Simon Pilgrim af47d40b9c [InstCombine] recognizeBSwapOrBitReverseIdiom - recognise zext(bswap(trunc(x))) patterns (PR39793)
PR39793 demonstrated an issue where we fail to recognize 'partial' bswap patterns of the lower bytes of an integer source.

In fact, most of this is already in place collectBitParts suitably tags zero bits, so we just need to correctly handle this case by finding the zero'd upper bits and reducing the bswap pattern just to the active demanded bits.

Differential Revision: https://reviews.llvm.org/D88316
2020-09-30 12:07:19 +01:00
Simon Pilgrim ec3f24d453 [InstCombine] recognizeBSwapOrBitReverseIdiom - assert for correct bit providence indices. NFCI.
As suggested by @spatel on D88316
2020-09-30 11:16:33 +01:00
Jeremy Morse 05659606a2 Revert "[gardening] Replace some uses of setDebugLoc(DebugLoc()) with dropLocation(), NFC"
Some of the buildbots have croaked with this patch, for examples failures
that begin in this build:

  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/29933

This reverts commit 674f57870f.
2020-09-30 09:52:12 +01:00
Vedant Kumar 674f57870f [gardening] Replace some uses of setDebugLoc(DebugLoc()) with dropLocation(), NFC 2020-09-29 17:39:07 -07:00
Vedant Kumar 26ee8aff2b [CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)
Lifetime marker intrinsics support any pointer type, so CodeExtractor
does not need to bitcast to `i8*` in order to use these markers.
2020-09-29 16:34:36 -07:00
Juneyoung Lee 67aac915ba [BuildLibCalls] Add noundef to the returned pointers of allocators and argument of free
This patch adds noundef to the returned pointers of allocators (malloc, calloc, ...)
and the pointer argument of free.
The returned pointer of allocators cannot be poison or (partially) undef.
Since the pointer that is given to free should precisely have zero offset,
it cannot be poison or (partially) undef too.

For the size arguments of allocators, noundef wasn't attached simply because
I wasn't sure whether attaching it is okay or not.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87984
2020-09-30 02:13:48 +09:00
Florian Hahn 7bae2bc5a8 [LoopUtils] Only verify SE in builds with assertions.
Follow up to 60b852092c.
2020-09-29 13:39:23 +01:00
David Stenberg e6f332ef1e [IndVarSimplify] Fix Modified status for removal of overflow intrinsics
When removing an overflow intrinsic the Changed status in SimplifyIndvar
was not set, leading to the IndVarSimplify pass returning an incorrect
status.

This was caught using the check introduced by D80916.

As pointed out in the code review, a similar bug may exist for
eliminateTrunc().

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D85971
2020-09-29 13:20:59 +02:00
Florian Hahn 60b852092c [LoopDeletion] Forget loop before setting values to undef
After D71539, we need to forget the loop before setting the incoming
values of phi nodes in exit blocks, because we are looking through those
phi nodes now and the SCEV expression could depend on the loop phi. If
we update the phi nodes before forgetting the loop, we miss those users
during invalidation.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D88167
2020-09-29 10:38:44 +01:00
Dávid Bolvanský 155ac33394 [BuildLibCalls] Add noalias for strcat and stpcpy
strcat:
destination and source shall not overlap. (http://www.cplusplus.com/reference/cstring/strcat/)

stpcpy:
The strings may not overlap, and the destination string dest must be  large enough to receive the copy. (https://man7.org/linux/man-pages/man3/stpcpy.3.html)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D88335
2020-09-27 21:37:09 +02:00
Nikita Popov 9b959b59df [LVI] Require context instruction in external API (NFCI)
Require CxtI in getConstant() and getConstantRange() APIs.
Accordingly drop the BB parameter, as it is implied by
CxtI->getParent().

This makes sure we don't forget to pass the context instruction,
and makes the API contract clearer (also clean up the comments to
that effect -- the value holds at the context instruction, not
the end of the block).
2020-09-27 18:07:24 +02:00
Simon Pilgrim 2a0ca17f66 [InstCombine] collectBitParts - add fshl/fshr handling
Pulled from D87452, this is a fixed version of the collectBitParts fshl/fshr handling which as @nikic noticed wasn't checking for different providers or had correct bit ordering (which was hid by only testing shift amounts of bitwidth/2).

Differential Revision: https://reviews.llvm.org/D88292
2020-09-25 20:34:59 +01:00
Arthur Eubanks 6b1ce83a12 [NewPM][CGSCC] Handle newly added functions in updateCGAndAnalysisManagerForPass
This seems to fit the CGSCC updates model better than calling
addNewFunctionInto{Ref,}SCC() on newly created/outlined functions.
Now addNewFunctionInto{Ref,}SCC() are no longer necessary.

However, this doesn't work on newly outlined functions that aren't
referenced by the original function. e.g. if a() was outlined into b()
and c(), but c() is only referenced by b() and not by a(), this will
trigger an assert.

This also fixes an issue I was seeing with newly created functions not
having passes run on them.

Ran check-llvm with expensive checks.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87798
2020-09-23 15:22:18 -07:00
Hubert Tong 32c9991dab [InstCombine] Fix errno bug in pow expansion to sqrt
A conversion from `pow` to `sqrt` shall not call an `errno`-setting
`sqrt` with -//infinity//: the `sqrt` will set `EDOM` where the `pow`
call need not.

This patch avoids the erroneous (pun not intended) transformation by
applying the restrictions discussed in the thread for
https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.

The existing tests are updated (depending on emphasis in the checks for
library calls, avoidance of overlap, and overall coverage):
  - to add `ninf`, retaining the intended library call,
  - to use the intrinsic, retaining the use of `select`, or
  - to expect the replacement to not occur.

The following is tested:
  - The pow intrinsic folds to a `select` instruction to
    handle -//infinity//.
  - The pow library call folds, with `ninf`, to `sqrt` without the
    `select` instruction associated with handling -//infinity//.
  - The pow library call does not fold to `sqrt` without `ninf`.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D87877
2020-09-22 18:58:05 -04:00
Stefanos Baziotis 89c1e35f3c [LoopInfo] empty() -> isInnermost(), add isOutermost()
Differential Revision: https://reviews.llvm.org/D82895
2020-09-22 23:28:51 +03:00
Hubert Tong 6801950192 [InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case
The current code for handling pow(x, y) where y is an integer plus 0.5
is not explicitly guarded against attempting to transform the case where
abs(y) is exactly 0.5.

The latter case is meant to be handled by `replacePowWithSqrt`. Indeed,
if the pow(x, integer+0.5) case proceeds past a certain point, it will
hit an assertion by attempting to form pow(x, 0) using `getPow`.

This patch adds an explicit check to prevent attempting the
pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during
the review of D87877. This has the effect of retaining the shrinking of
`pow` to `powf` when the `sqrt` libcall cannot be formed.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D88066
2020-09-22 14:23:32 -04:00
Fangrui Song 6913812abc Fix some clang-tidy bugprone-argument-comment issues 2020-09-19 20:41:25 -07:00
Nikita Popov f4e5541809 [Local] Clean up enforceKnownAlignment() (NFC)
I want to export this function, and the current API was a bit
weird: It took an additional Alignment argument that didn't really
have anything to do with what the function does. Drop it, and
perform a max at the callsite.

Also rename it to tryEnforceAlignment().
2020-09-19 22:29:40 +02:00
Florian Hahn 1d8f2e5292 [SCEVExpander] Support expanding nonintegral pointers with constant base.
Currently SCEVExpander creates inttoptr for non-integral pointers if the
base is a null constant for example. This results in invalid IR.

This patch changes InsertNoopCastOfTo to emit a GEP & bitcast to convert
to a non-integral pointer. First, a GEP of i8* null is generated and the
integral value is used as index. The GEP is then bitcasted to the target
type.

This was exposed by D71539.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87827
2020-09-19 17:19:53 +01:00
Fangrui Song 76eec6c95b [SCEV] Fix an unused variable in -DLLVM_ENABLE_ASSERTIONS=off build 2020-09-18 16:19:05 -07:00
Roman Lebedev aadf55d1ce
[NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%)
This is functionally equivalent to the old implementation.

As per https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=4739e6e4eb54d3736e6457249c0919b30f6c855a&stat=instructions
this is a clear geomean compile-time regression-free win with overall geomean of `-0.08%`

32 PHI's appears to be the sweet spot; both the 16 and 64 performed worse:
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=c4efe1fbbfdf0305ac26cd19eacb0c7774cdf60e&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=e4989d1c67010d3339d1a40ff5286a31f10cfe82&stat=instructions

If we have more PHI's than that, we fall-back to the original DenseSet-based implementation,
so the not-so-fast cases will still be handled.

However compile-time isn't the main motivation here.
I can name at least 3 limitations of this CSE:
1. Assumes that all PHI nodes have incoming basic blocks in the same order (can be fixed while keeping the DenseMap)
2. Does not special-handle `undef` incoming values (i don't see how we can do this with hashing)
3. Does not special-handle backedge incoming values (maybe can be fixed by hashing backedge as some magical value)

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87408
2020-09-17 11:29:03 +03:00
Arthur Eubanks f7aa1563eb [LowerSwitch][NewPM] Port lowerswitch to NPM
Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87726
2020-09-15 18:18:31 -07:00
Wenlei He 2ea4c2c598 [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults
~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~

~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~

~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~
~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~
~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~

~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~

See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 .

This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes.

Testing
Ninja check

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D86156
2020-09-15 16:16:24 -07:00
Xun Li 7b4cc0961b [TSAN] Handle musttail call properly in EscapeEnumerator (and TSAN)
Call instructions with musttail tag must be optimized as a tailcall, otherwise could lead to incorrect program behavior.
When TSAN is instrumenting functions, it broke the contract by adding a call to the tsan exit function inbetween the musttail call and return instruction, and also inserted exception handling code.
This happend throguh EscapeEnumerator, which adds exception handling code and returns ret instructions as the place to insert instrumentation calls.
This becomes especially problematic for coroutines, because coroutines rely on tail calls to do symmetric transfers properly.
To fix this, this patch moves the location to insert instrumentation calls prior to the musttail call for ret instructions that are following musttail calls, and also does not handle exception for musttail calls.

Differential Revision: https://reviews.llvm.org/D87620
2020-09-15 15:20:05 -07:00
Simon Pilgrim 65c6ae3b6a [Utils] isLegalToPromote - Fix missing null check before writing to FailureReason.
The FailureReason input parameter maybe null, we check this in all other cases in the method but this one was missed somehow.

Fixes clang-tidy warning.
2020-09-15 14:49:04 +01:00
Sanjay Patel aa57c1c967 [InstCombine] fix bug in pow expansion
There at least one other bug related to pow -> sqrt transforms:
http://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html
...but we probably can't solve that without fixing this first.
2020-09-15 09:29:48 -04:00
Simon Pilgrim 4ff4708d39 collectBitParts - use const references. NFCI.
Fixes clang-tidy warnings first noticed on D87452.
2020-09-14 18:23:00 +01:00
Jay Foad 9a4476072e [UnifyLoopExits] Fix non-deterministic iteration order
This was causing random minor codegen differences in shaders compiled
with the AMDGPU backend.

Differential Revision: https://reviews.llvm.org/D87548
2020-09-14 09:09:58 +01:00
David Sherwood 1e1770a07e [SVE][CodeGen] Fix InlineFunction for scalable vectors
When inlining functions containing allocas of scalable vectors we
cannot specify the size in the lifetime markers, since we don't
know this at compile time.

Added new test here:

  test/Transforms/Inline/AArch64/sve-alloca-merge.ll

Differential Revision: https://reviews.llvm.org/D87139
2020-09-11 08:34:51 +01:00
Sam Parker 0bdf8c9127 [SCEV] Constant expansion cost at minsize
As code size is the only thing we care about at minsize, query the
cost of materialising immediates when calculating the cost of a SCEV
expansion. We also modify the CostKind to TCK_CodeSize for minsize,
instead of RecipThroughput.

Differential Revision: https://reviews.llvm.org/D76434
2020-09-10 08:21:11 +01:00
David Stenberg 48fc781438 [UnifyFunctionExitNodes] Fix Modified status for unreachable blocks
If a function had at most one return block, the pass would return false
regardless if an unified unreachable block was created.

This patch fixes that by refactoring runOnFunction into two separate
helper functions for handling the unreachable blocks respectively the
return blocks, as suggested by @bjope in a review comment.

This was caught using the check introduced by D80916.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D85818
2020-09-09 13:36:03 +02:00
Juneyoung Lee 36c8621638 [BuildLibCalls] Add more noundef to library functions
This patch follows D85345 and adds more noundef attributes to return values/arguments of library functions
that are mostly about accessing the file system or processes.

A few functions like `chmod` or `times` use typedef `mode_t` and `clock_t`.
They are neither struct nor union, so they cannot contain undef even if they're lowered to iN in IR. So, it is fine to add noundef to them.

- clock_t's actual type is size_t (C17, 7.27.1.3), so it isn't struct or union.

- For mode_t, either int or long is used in practice because programmers use bit manipulation. So, I think it is okay that it's never aggregate in practice.

After this patch, the remaining library functions are those that eagerly participate in optimizations: they can be removed, reordered, or
introduced by a transformation from primitive IR operations.
For them, a few testings is needed, since it may not be valid to add noundef anymore even if C standard says it's okay.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85894
2020-09-09 20:33:35 +09:00
David Stenberg 17dce2fe43 [UnifyFunctionExitNodes] Remove unused getters, NFC
The get{Return,Unwind,Unreachable}Block functions in
UnifyFunctionExitNodes have not been used for many years,
so just remove them.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D87078
2020-09-08 20:42:28 +02:00
Sam Parker 928c4b4b49 [SCEV] Refactor isHighCostExpansionHelper
To enable the cost of constants, the helper function has been
reorganised:
- A struct has been introduced to hold SCEV operand information so
  that we know the user of the operand, as well as the operand index.
  The Worklist now uses instead instead of a bare SCEV.
- The costing of each SCEV, and collection of its operands, is now
  performed in a helper function.

Differential Revision: https://reviews.llvm.org/D86050
2020-09-07 11:57:46 +01:00
Sam Parker 65f78e73ad [SimplifyCFG] Consider cost of combining predicates.
Modify FoldBranchToCommonDest to consider the cost of inserting
instructions when attempting to combine predicates to fold blocks.
The threshold can be controlled via a new option:
-simplifycfg-branch-fold-threshold which defaults to '2' to allow
the insertion of a not and another logical operator.

Differential Revision: https://reviews.llvm.org/D86526
2020-09-07 10:04:50 +01:00
serge-sans-paille 3a6f3fc160 Fix return status of SimplifyCFG
When a switch case is folded into default's case, that's an IR change that
should be reported, update ConstantFoldTerminator accordingly.

Differential Revision: https://reviews.llvm.org/D87142
2020-09-05 07:54:15 +02:00
Roman Lebedev 1dcb936cf6
[NFC][Local] EliminateDuplicatePHINodes(): add STATISTIC() 2020-08-29 22:03:18 +03:00
Roman Lebedev 961483a5ea
[NFCI][Local] Rewrite EliminateDuplicatePHINodes to optionally check hashing invariants
EarlyCSE has a mode to verify the invariant that hash equality equals
key equality, but EliminateDuplicatePHINodes() doesn't.

I've verified that this would have caught the stage2-stage3 mismatches
5ec2b757cc revert has fixed,
that were introduced last time in 3e69871ab5.
2020-08-29 22:03:10 +03:00
Roman Lebedev 5ec2b757cc
[Instruction] Speculatively undo isIdenticalToWhenDefined() PHI handling changes
The stage2-stage3 differences persist even without instcombine-based
PHI CSE, so this is the only possible reason.
2020-08-29 19:38:57 +03:00
Benjamin Kramer 8782c72765 Strength-reduce SmallVectors to arrays. NFCI. 2020-08-28 21:14:20 +02:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Florian Hahn 20e989e9de [BuildLibCalls] Add argmemonly to more lib calls.
strspn, strncmp, strcspn, strcasecmp, strncasecmp, memcmp, memchr,
memrchr, memcpy, memmove, memcpy, mempcpy, strchr, strrchr, bcmp
should all only access memory through their arguments.

I broke out strcoll, strcasecmp, strncasecmp because the result
depends on the locale, which might get accessed through memory.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86724
2020-08-28 09:50:38 +01:00
Florian Hahn 419c6948df [SimplifyLibCalls] Remove over-eager early return in strlen optzns.
Currently we bail out early for strlen calls with a GEP operand, if none
of the GEP specific optimizations fire. But there could be later
optimizations that still apply,  which we currently miss out on.

An example is that we do not apply the following optimization
   strlen(x) == 0 --> *x == 0

Unless I am missing something, there seems to be no reason for bailing
out early there.

Fixes PR47149.

Reviewed By: lebedev.ri, xbolva00

Differential Revision: https://reviews.llvm.org/D85886
2020-08-27 15:19:45 +01:00
Sam Parker 8ce450da32 [NFCI][SimplifyCFG] Combine select costs and checks
Combine the cost modelling and validity checks for the phi to select
conversion in SpeculativelyExecuteBB, extracting the logic out into
a function.
2020-08-24 09:16:11 +01:00
Amy Huang 5e3fd471ac [Cloning] Fix to cloning DISubprograms.
When trying to enable -debug-info-kind=constructor there was an assert
that occurs during debug info cloning ("mismatched subprogram between
llvm.dbg.value variable and !dbg attachment").
It appears that during llvm::CloneFunctionInto, a DISubprogram could be
duplicated when MapMetadata is called, and then added to the MD map again
when DIFinder gets a list of subprograms. This results in two different
versions of the DISubprogram.

This patch switches the order so that the DIFinder subprograms are
added before MapMetadata is called.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46784

Differential Revision: https://reviews.llvm.org/D86185
2020-08-21 11:54:56 -07:00
Florian Hahn 8eded24bf4 Recommit "[SCEVExpander] Add helper to clean up instrs inserted while expanding."
Recommit the patch after fixing an issue reported caused by the fact
that re-used values are also added to InsertedValues.

Additional tests have been added in 88818491b9

This reverts the revert commit 38884641f2.
2020-08-21 15:04:17 +01:00
Sam Parker bfc6d8b59b [NFC][SimplifyCFG] Formatting and variable rename 2020-08-21 13:11:17 +01:00
Sam Parker 47251582f5 [SimplifyCFG] Cost required selects
Before we speculatively execute a basic block, query the cost of
inserting the necessary select instructions against the phi folding
threshold. For non-trivial insertions, a more accurate decision can
probably be made during machine if-conversion. With minsize we query
the CodeSize cost, otherwise we use SizeAndLatency.

Differential Revision: https://reviews.llvm.org/D82438
2020-08-21 09:52:52 +01:00
Dávid Bolvanský f134fc4f1b Reland "[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)" 2020-08-15 12:14:57 +02:00
Martin Storsjö 3e7403a134 Revert "[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)"
This reverts commit 6dbf0cfcf7.

That commit caused failed assertions, e.g. like this:

$ cat sprintf-strcpy.c
char *ptr; void func(void) { ptr += sprintf(ptr, "%s", ""); }

$ clang -c sprintf-strcpy.c -O2 -target x86_64-linux-gnu
clang: ../lib/IR/Value.cpp:473: void llvm::Value::doRAUW(llvm::Value*,
llvm::Value::ReplaceMetadataUses): Assertion `New->getType() ==
getType() && "replaceAllUses of value with new value of different
type!"' failed.
2020-08-15 09:35:11 +03:00
Dávid Bolvanský f62de7c9c7 [SLC] Transform strncpy(dst, "text", C) to memcpy(dst, "text\0\0\0", C) for C <= 128 only
Transformation creates big strings for big C values, so bail out for C > 128.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86004
2020-08-15 01:53:32 +02:00
Jordan Rupprecht 38884641f2 Temporarily revert "[SCEVExpander] Add helper to clean up instrs inserted while expanding."
This reverts commit 7829c33084. The assertion is triggering on some internal code. A reduced test case is in progress.
2020-08-14 14:52:37 -07:00
Dávid Bolvanský 6dbf0cfcf7 [SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)
Transform sprintf(dst, "%s", str) -> strcpy(dst, str) if result is unused
Avoid sprintf(dest, "%s", str) -> llvm.memcpy(align 1 dest, align 1 str, strlen(str)+1) if optimizing for size.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85963
2020-08-14 23:48:53 +02:00
Arthur Eubanks 48cd5b72b1 Revert "[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)"
This reverts commit ab9fc8bae8.

Incorrect transformation if the result is used.
Causes breakages, e.g.
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/8193/
2020-08-13 21:05:03 -07:00
Dávid Bolvanský ab9fc8bae8 [SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)
Solves 46489
2020-08-14 00:05:55 +02:00
Dávid Bolvanský 5ef2287d36 [SLC] Optimize strncpy(a, a, C) to memcpy(a, a000, C)
Solves PR47154
2020-08-13 22:22:51 +02:00
David Stenberg e8ebebb0bd [InstCombine] Fix incorrect Modified status
When removing instructions from unreachable blocks, and only debug info
intrinsics were removed, InstCombine could incorrectly return a false
Modified status.

This is fixed by making removeAllNonTerminatorAndEHPadInstructions()
also return how many debug info intrinsics that were removed, and take
that into account.

This was caught using the check introduced by D80916.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D85839
2020-08-13 15:10:41 +02:00
Whitney Tsang aa994d9867 [NFC][LoopUnrollAndJam] Use BasicBlock::replacePhiUsesWith instead of
static function updatePHIBlocks.

Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D85673
2020-08-11 15:35:14 +00:00
Florian Hahn 7829c33084 [SCEVExpander] Add helper to clean up instrs inserted while expanding.
SCEVExpander already tracks which instructions have been inserted n
InsertedValues/InsertedPostIncValues. This patch adds an additional
vector to collect the instructions in insertion order. This can then be
used to remove exactly the instructions inserted by the expander.

This replaces ExpandedValuesCleaner, which in some cases might remove
values not inserted by the expander (e.g. if a value was dead before
insertion and is then used during expansion).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D84327
2020-08-11 09:30:31 +01:00
Juneyoung Lee ef018cb65c [BuildLibCalls] Add noundef to standard I/O functions
This patch adds noundef to return value and arguments of standard I/O functions.
With this patch, passing undef or poison to the functions becomes undefined
behavior in LLVM IR. Since undef/poison is lowered from operations having UB in C/C++,
passing undef to them was already UB in source.

With this patch, the functions cannot return undef or poison anymore as well.
According to C17 standard, ungetc/ungetwc/fgetpos/ftell can generate unspecified
value; 3.19.3 says unspecified value is a valid value of the relevant type,
and using unspecified value is unspecified behavior, which is not UB, so it
cannot be undef (using undef is UB when e.g. it is used at branch condition).

— The value of the file position indicator after a successful call to the ungetc function for a text stream, or the ungetwc function for any stream, until all pushed-back characters are read or discarded (7.21.7.10, 7.29.3.10).
— The details of the value stored by the fgetpos function (7.21.9.1).
— The details of the value returned by the ftell function for a text stream (7.21.9.4).

In the long run, most of the functions listed in BuildLibCalls should have noundefs; to remove redundant diffs which will anyway disappear in the future, I added noundef to a few more non-I/O functions as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85345
2020-08-10 10:58:25 +09:00
Florian Hahn 23817cbd0b [SCEVExpander] Make sure cast properly dominates Builder's IP.
The selected cast must properly dominate the Builder's IP, so we cannot
re-use the cast, if it matches the builder's IP.
2020-08-09 16:51:19 +01:00
Florian Hahn c70f0b9d4a [SCEVExpander] Avoid re-using existing casts if it means updating users.
Currently the SCEVExpander tries to re-use existing casts, even if they
are not exactly at the insertion point it was asked to create the cast.
To do so in some case, it creates a new cast at the insertion point and
updates all users to use the new cast.

This behavior is problematic, because it changes the IR outside of the
instructions created during the expansion. Therefore we cannot
completely undo all changes made during expansion.

This re-use should be only an extra optimization, so only using the new
cast in the expanded instructions should not be a correctness issue.
There are many cases equivalent instructions are created during
expansion.

This patch also adjusts findInsertPointAfter to skip instructions
inserted during expansion. This enables re-using existing casts without
the renaming any uses, by picking a better insertion point.

Reviewed By: efriedma, lebedev.ri

Differential Revision: https://reviews.llvm.org/D84399
2020-08-09 13:25:17 +01:00
Roman Lebedev e492f0e03b
[SimplifyCFG] Fix invoke->call fold w/ multiple invokes in presence of lifetime intrinsics
SimplifyCFG has two main folds for resumes - one when resume is directly
using the landingpad, and the other one where resume is using a PHI node.

While for the first case, we were already correctly ignoring all the
PHI nodes, and both the debug info intrinsics and lifetime intrinsics,
in the PHI-based-one, we weren't ignoring PHI's in the resume block,
and weren't ignoring lifetime intrinsics. That is clearly a bug.

On RawSpeed library, this results in +9.34% (+81) more invoke->call folds,
-0.19% (-39) landing pads, -0.24% (-81) invoke instructions
but +51 call instructions and -132 basic blocks.

Though, the run-time performance impact appears to be within the noise.
2020-08-08 20:00:28 +03:00
Roman Lebedev 1f452ac1d7
[NFC][SimplifyCFG] Rewrite isCleanupBlockEmpty() to be iterator_range-based 2020-08-08 20:00:28 +03:00
Roman Lebedev a587bf3eb0
[NFC][SimplifyCFG] Count the number of invokes turned into calls due to empty cleanup blocks 2020-08-08 20:00:27 +03:00
Arthur Eubanks 456f38a971 Fix layering violation Transforms/Utils -> Scalar
Introduced in D85063.
2020-08-03 11:53:23 -07:00
Arthur Eubanks 7c19c89dd5 [NewPM][LoopVersioning] Port LoopVersioning to NPM
Reviewed By: ychen, fhahn

Differential Revision: https://reviews.llvm.org/D85063
2020-08-03 10:32:09 -07:00
Florian Hahn 05b44f7eae [LCSSA] Provide option for caller to clean up unused PHIs.
formLCSSAForInstructions is used by SCEVExpander, which tracks all
inserted instructions including LCSSA phis using asserting value
handles. This means cleanup needs to happen in the caller.

Extend formLCSSAForInstructions  to take an optional pointer to a
vector. If this argument is non-nullptr, instead of directly deleting
the phis, add them to the vector, so the caller can process them.

This should address various PPC buildbot failures, including
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/40567
2020-08-01 20:43:19 +01:00
Florian Hahn a9b06a2c14 [LCSSA] Use IRBuilder for PHI creation.
Use IRBuilder instead PHINode::Create. This should not impact the
generated code, but IRBuilder provides a way to register callbacks for
inserted instructions, which is convenient for some users.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D85037
2020-08-01 18:44:15 +01:00
Chen Zheng 8c5edf5023 [SCEV] don't query getSCEV() for incomplete phis
querying getSCEV() for incomplete phis leads to wrong cache value in `ExprToIVMap`,
because incomplete phis may be simplified to same value before get SCEV expression.

Reviewed By: lebedev.ri, mkazantsev

Differential Revision: https://reviews.llvm.org/D77560
2020-08-01 02:38:54 -04:00
Sidharth Baveja b7cfa6ca92 [Loop Peeling] Separate the Loop Peeling Utilities from the Loop Unrolling Utilities
Summary: This patch separates the Loop Peeling Utilities from Loop Unrolling.
The reason for this change is that Loop Peeling is no longer only being used by
loop unrolling; Patch D82927 introduces loop peeling with fusion, such that
loops can be modified to have to same trip count, making them legal to be
peeled.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D83056
2020-07-31 18:31:58 +00:00
Florian Hahn 3b0d30ffd3 [SCEVExpander] Name temporary instructions for LCSSA insertion (NFC). 2020-07-31 18:16:46 +01:00
Vitaly Buka b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka 89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Vitaly Buka 61cab352e3 [NFC] Move findAllocaForValue into ValueTracking.h
Differential Revision: https://reviews.llvm.org/D84616
2020-07-30 18:22:59 -07:00
Simon Pilgrim 4a161bd8b3 LoopUnroll.cpp - pass std::vector by const reference to needToInsertPhisForLCSSA helper. NFCI.
Avoid an unnecessary pass by value.
2020-07-30 18:17:04 +01:00
Florian Hahn f75564ad4e Reland "[SCEVExpander] Add option to preserve LCSSA directly."
This reverts the revert commit dc28675768.

It includes a fix for Polly, which uses SCEVExpander on IR that is not
in LCSSA form. Set PreserveLCSSA = false in that case, to ensure we do
not introduce LCSSA phis where there were none before.
2020-07-29 20:41:53 +01:00
Florian Hahn dc28675768 Revert "[SCEVExpander] Add option to preserve LCSSA directly."
This reverts commit 99166fd4fb, because it
breaks the polly builders.

polly/test/Isl/CodeGen/invariant_load_escaping_second_scop.ll fails
because a apparently unnecessary LCSSA phi node is introduced.

Make the bots green again, while I take a closer look.
2020-07-29 19:19:04 +01:00
Florian Hahn 99166fd4fb [SCEVExpander] Add option to preserve LCSSA directly.
This patch teaches SCEVExpander to directly preserve LCSSA.

As it is currently, SCEV does not look through PHI nodes in loops,
as it might break LCSSA form. Once SCEVExpander can preserve
LCSSA form, it should be safe for SCEV to look through PHIs.

To preserve LCSSA form, this patch uses formLCSSAForInstructions
on operands of newly created instructions, if the definition is inside
a different loop than the new instruction.

The final value we return from expandCodeFor may also need LCSSA
phis, depending on the insert point. As no user for it exists there yet,
create a temporary instruction at the insert point, which can be passed
to formLCSSAForInstructions. This temporary instruction is removed
after LCSSA construction.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D71538
2020-07-29 15:07:37 +01:00
David Green 60280e9818 [Analysis] TTI: Add CastContextHint for getCastInstrCost
Currently, getCastInstrCost has limited information about the cast it's
rating, often just the opcode and types.  Sometimes there is a context
instruction as well, but it isn't trustworthy: for instance, when the
vectorizer is rating a plan, it calls getCastInstrCost with the old
instructions when, in fact, it's trying to evaluate the cost of the
instruction post-vectorization.  Thus, the current system can get the
cost of certain casts incorrect as the correct cost can vary greatly
based on the context in which it's used.

For example, if the vectorizer queries getCastInstrCost to evaluate the
cost of a sext(load) with tail predication enabled, getCastInstrCost
will think it's free most of the time, but it's not always free. On ARM
MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar
situations can come up with how masked loads can be extended when being
split.

To fix that, this path adds a new parameter to getCastInstrCost to give
it a hint about the context of the cast. It adds a CastContextHint enum
which contains the type of the load/store being created by the
vectorizer - one for each of the types it can produce.

Original patch by Pierre van Houtryve

Differential Revision: https://reviews.llvm.org/D79162
2020-07-29 13:32:53 +01:00
Johannes Doerfert 450dc09d69 [SROA][Mem2Reg] Use efficient droppable use API (after D83976)
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84804
2020-07-28 17:41:01 -05:00
Roman Lebedev 96d74530c0
[Reduce] Argument reduction: do deal with function declarations
We can happily turn function definitions into declarations,
thus obscuring their argument from being elided by this pass.

I don't believe there is a good reason to just ignore declarations.
likely even proper llvm intrinsics ones,
at worst the input becomes uninteresting.

The other question here is that all these transforms are all-or-nothing.
In some cases, should we be treating each use separately?

The main blocker here seemed to be that llvm::CloneFunctionInto()
does `&OldFunc->front()`, which inserts a nullptr into a densemap,
which is not happy about it and asserts.
2020-07-26 01:31:56 +03:00
Simon Pilgrim b5e14d78f1 SimplifyLibCalls - remove unnecessary header and forward declaration. NFC.
We include TargetLibraryInfo.h so don't need to forward declare it, and we don't need to include TargetLibraryInfo.h in SimplifyLibCalls.cpp as well.
2020-07-25 12:58:39 +01:00
Johannes Doerfert ce8928f2e4 [Mem2Reg] Teach promote to register about droppable instructions
This is the first of two patches to address PR46753. We basically allow
mem2reg to promote allocas that are used in doppable instructions, for
now that means `llvm.assume`. The uses of the alloca (or a bitcast or
zero offset GEP from there) are replaced by `undef` in the droppable
instructions.

Reviewed By: Tyker

Differential Revision: https://reviews.llvm.org/D83976
2020-07-24 15:15:38 -05:00
Johannes Doerfert ce2d69b557 [SROA][Mem2Reg] Do not crash on alloca + addrspacecast
SROA knows that it can look through addrspacecast but
PromoteMemoryToRegister did not handle them. This caused an assertion
error for the test case, exposed while running
`Transforms/PhaseOrdering/inlining-alignment-assumptions.ll` with D83978
applied.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D84085
2020-07-24 15:15:38 -05:00
Simon Pilgrim 0128b9505c Revert rG5dd566b7c7b78bd- "PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI."
This reverts commit 5dd566b7c7.

Causing some buildbot failures that I'm not seeing on MSVC builds.
2020-07-24 13:02:33 +01:00
Simon Pilgrim 5dd566b7c7 PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI.
PassManager.h is one of the top headers in the ClangBuildAnalyzer frontend worst offenders list.

This exposes a large number of implicit dependencies on various forward declarations/includes in other headers that need addressing.
2020-07-24 12:40:50 +01:00
Nikita Popov def48b0e88 [PredicateInfo][SCCP] Remove assertion (PR46814)
As long as RenamedOp is not guaranteed to be accurate, we cannot
assert here and should just return false. This was already done
for the other conditions in this function.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46814.
2020-07-23 19:36:51 +02:00
Florian Hahn ecd3f853a8 [SCEVExpander] Use IRBuilderCallbackInserter to call rememberInstruction.
Currently there are plenty of instructions that SCEVExpander creates but
does not track as created. IRBuilder allows specifying a callback
whenever an instruction is inserted. Use this to call
rememberInstruction automatically for each created instruction.

There are still a few rememberInstruction calls remaining, because in
some cases Inst::Create functions are used to construct instructions.

Suggested by @lebedev.ri in D75980.

Reviewers: mkazantsev, reames, sanjoy.google, lebedev.ri

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D84326
2020-07-23 14:25:28 +01:00
Hiroshi Yamauchi 557db6f8aa Reland D84057 [PGO][PGSO] Remove a temporary flag used for gradual rollout.
The revert was a misfire.

Remove the temporary flag PGSOIRPassOrTestOnly and the guard code which was used
for the staged rollout. This is a cleanup (NFC) as it's now false by default.

Differential Revision: https://reviews.llvm.org/D84057
2020-07-22 20:57:25 -07:00
Fangrui Song dbdda8232a Revert D84057 "[PGO][PGSO] Remove a temporary flag used for gradual rollout."
This reverts commit e64afefdf8. It caused
a PGO bootstrapped clang to crash on many source files.

`__llvm_profile_instrument_range` seems to trigger a null pointer dereference.

Call stack:
__llvm_profile_instrument_range
llvm::APInt::udiv(llvm::APInt const&) const
getRangeForAffineARHelper
2020-07-22 14:28:28 -07:00
Max Kazantsev 360ab70712 [SimplifyCFG] Do not create unneeded PR Phi in block with convergent calls
We do not thread blocks with convergent calls, but this check was missing
when we decide to insert PR Phis into it (which we only do for threading).

Differential Revision: https://reviews.llvm.org/D83936
Reviewed By: nikic
2020-07-22 13:53:50 +07:00
Hiroshi Yamauchi e64afefdf8 [PGO][PGSO] Remove a temporary flag used for gradual rollout.
Remove the temporary flag PGSOIRPassOrTestOnly and the guard code which was used
for the staged rollout. This is a cleanup (NFC) as it's now false by default.

Differential Revision: https://reviews.llvm.org/D84057
2020-07-20 11:12:11 -07:00
Florian Hahn e1270b16c9 [Matrix] Add TileInfo abstraction for tiled matrix code-gen.
This patch adds a TileInfo abstraction and utilities to
create a 3-level loop nest for tiling.

Reviewers: anemet

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D77550
2020-07-20 18:49:08 +01:00
Matt Arsenault 5e999cbe8d IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.

This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.

My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.

This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.

Background:

There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.

It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.

The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.

Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.

Additional context is in D79744.
2020-07-20 10:23:09 -04:00
Benjamin Kramer 44ab60f74d [LoopSimplify] Use SmallPtrSet and range for loops more. NFCI. 2020-07-20 15:00:59 +02:00
Roman Lebedev 04b729d076
[NFCI][SimplifyCFG] Guard common code hoisting with a (default-on) flag
Common code sinking is already guarded with a (with default-off!) flag,
so add a flag for hoisting, too.

D84108 will hopefully make hoisting off-by-default too.
2020-07-20 10:29:57 +03:00
Nikita Popov c6e13667e7 [PredicateInfo] Add a method to interpret predicate as cmp constraint
Both users of predicteinfo (NewGVN and SCCP) are interested in
getting a cmp constraint on the predicated value. They currently
implement separate logic for this. This patch adds a common method
for this in PredicateBase.

This enables a missing bit of PredicateInfo handling in SCCP: Now
the predicate on the condition itself is also used. For switches
it means we know that the switched-on value is the same as the case
value. For assumes/branches we know that the condition is true or
false.

Differential Revision: https://reviews.llvm.org/D83640
2020-07-19 15:34:32 +02:00
Chen Zheng 6d247f980d [SCEV][IndVarSimplify] insert point should not be block front.
Recommit after removing the unused cast instructions.

Differential Revision:  https://reviews.llvm.org/D80975
2020-07-17 22:25:10 -04:00
Sidharth Baveja 11e879d4f1 [Loop Simplify] Resolve an issue where metadata is not applied to a loop latch.
Summary:
This patch resolves an issue where the metadata of a loop is not added to the
new loop latch, and not removed from the old loop latch. This issue occurs in
the SplitBlockPredecessors function, which  adds a new block in a loop, and
in the case that the block passed into this function is the header of the loop,
the loop can be modified such that the latch of the loop is replaced.
This patch applies to the Loop Simplify pass since it ensures that each loop
has exit blocks which only have predecessors that are inside of the loop. In
the case that this is not true, the pass will create a new exit block for the
loop. This guarantees that the loop preheader/header will dominate the exit blocks.

Author: sidbav (Sidharth Baveja)

Reviewers: asbirlea (Alina Sbirlea), chandlerc (Chandler Carruth), Whitney (Whitney Tsang), bmahjour (Bardia Mahjour)

Reviewed By:  asbirlea (Alina Sbirlea)

Subscribers: hiraditya (Aditya Kumar), llvm-commits

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D83869
2020-07-17 14:02:14 +00:00
Jon Roelofs a0537fc35f [SimplifyCFG] Fix crash in the EXPENSIVE_CHECKS build
SimplifyCFG was incorrectly reporting to the pass manager that it had not made
changes after folding away a PHI.  This is detected in the EXPENSIVE_CHECKS
build when the function's hash changes.

Differential Revision: https://reviews.llvm.org/D83985
2020-07-16 15:34:41 -06:00
Nadav Rotem 8f0a8ed44e [InjectTLIMappings] Use StringRef instead of std::string for FN name.
https://reviews.llvm.org/D83797
2020-07-16 11:53:04 -07:00
Matt Arsenault 023883a834 IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref
When the byref attribute is added, there will need to be two similar
functions for the existing cases which have an associate value copy,
and byref which does not. Most, but not all of the existing uses will
use the existing version.

The associated size function added by D82679 also needs to
contextually differ, and will help eliminate a few places still
relying on pointee element types.
2020-07-16 13:50:49 -04:00
Roman Lebedev 2815429d08
[NFC][SimplifyCFG] HoistThenElseCodeToIf(): after hoisting terminator, do return Changed, not just true
Otherwise, if Changed was still false before that,
we would not account for that hoist in NumHoistCommonCode statistic.
2020-07-16 00:32:48 +03:00
Roman Lebedev 1cfc24fd67
[NFC][SimplifyCFG] HoistThenElseCodeToIf(): count number of common instruction "blocks" hoisted
I.e. out of all the times HoistThenElseCodeToIf() was called,
how many times did it actually hoist something?
2020-07-16 00:21:56 +03:00
Roman Lebedev 7b53ad88d4
[NFC][SimplifyCFG] HoistThenElseCodeToIf(): count number of common instructions hoisted 2020-07-16 00:21:56 +03:00
Roman Lebedev 3fc1defc0b
[NFC][SimplifyCFG] SinkCommonCodeFromPredecessors(): count number of instruction "blocks" actually sunk
Out of all the times the function was called,
how many times did we actually sink anything?
2020-07-16 00:21:56 +03:00
Roman Lebedev 9ed65c76c0
[NFC][SimplifyCFG] SinkCommonCodeFromPredecessors(): add debug output when failing to actually sink instr 2020-07-16 00:21:55 +03:00
Roman Lebedev 4c79864488
[NFC][SimplifyCFG] SinkCommonCodeFromPredecessors(): early return if nothing to sink
If we can't sink even one instruction, early return, to increase readability.
2020-07-16 00:21:55 +03:00
Roman Lebedev 702a3c6410
[NFC][SimplifyCFG] Rename statistic NumSinkCommons into NumSinkCommonInstrs
It really counts instructions added into common block,
not number of instruction groups sunk.
2020-07-16 00:21:55 +03:00
Roman Lebedev ce4459a0db
[NFC][LoopRotate] Add a statistic for how many times rotation failed due to the header size 2020-07-16 00:21:55 +03:00
Hongtao Yu f3731d34fa [LoopUnroll] Update branch weight for remainder loop
Unrolling a loop with compile-time unknown trip count results in a remainder loop. The remainder loop executes the remaining iterations of the original loop when the original trip count is not a multiple of the unroll factor. For better profile counts maintenance throughout the optimization pipeline, I'm assigning an artificial weight to the latch branch of the remainder loop.

A remainder loop runs up to as many times as the unroll factor subtracted by 1. Therefore I'm assigning the maximum possible trip count as the back edge weight. This should be more accurate than the default non-profile weight, which assumes the back edge runs much more frequently than the exit edge.

Differential Revision: https://reviews.llvm.org/D83187
2020-07-15 12:33:29 -07:00
Tim Northover 37b96d51d0 CodeGenPrep: remove AssertingVH references before deleting dead instructions.
CodeGenPrepare keeps fairly close track of various instructions it's
seen, particularly GEPs, in maps and vectors. However, sometimes those
instructions become dead and get removed while it's still executing.
This triggers AssertingVH references to them in an asserts build and
could lead to miscompiles in a release build (I've only seen a later
segfault though).

So this patch adds a callback to
RecursivelyDeleteTriviallyDeadInstructions which can make sure the
instruction about to be deleted is removed from CodeGenPrepare's data
structures.
2020-07-15 15:19:21 +01:00
Florian Hahn 9ea0d8c38f [LoopRotate] Remove unnecessary verifyMemorySSA calls.
The actual rotation happens in processLoop, so the second removed
call to verifyMemorySSA was unnecessary.

In fact, processLoop/rotateLoop already verify MemorySSA before
and after transforming each loop. Hence, both calls can be removed.

Pointed out by @lebedev.ri post-commit D51718.
2020-07-15 11:49:24 +01:00
Tyker 16f777f421 [NFC] Add debug and stat counters to assume queries and assume builder
Summary:
Add debug counter and stats counter to assume queries and assume builder
here is the collected stats on a build of check-llvm + check-clang.
  "assume-builder.NumAssumeBuilt": 2720879,
  "assume-builder.NumAssumesMerged": 761396,
  "assume-builder.NumAssumesRemoved": 1576212,
  "assume-builder.NumBundlesInAssumes": 6518809,
  "assume-queries.NumAssumeQueries": 85566380,
  "assume-queries.NumUsefullAssumeQueries": 2727360,
the NumUsefullAssumeQueries stat is actually pessimistic because in a few places queries
ask to keep providing information to try to get better information. and this isn't counted
as a usefull query evem tho it can be usefull

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83506
2020-07-14 21:49:14 +02:00
Logan Smith a19461d9e1 [NFC] Add 'override' keyword where missing in include/ and lib/.
This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual.

Differential Revision: https://reviews.llvm.org/D83709
2020-07-14 09:47:29 -07:00
serge-sans-paille 1cd1c1d62e Revert "[SCEV][IndVarSimplify] insert point should not be block front."
This reverts commit f1efb8bb4b.

Reverted because it doesn't correctly update the pass return status, see

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/9441/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Awiden-i32-i8ptr.ll
2020-07-14 14:24:26 +02:00
Jameson Nash 2c7a07b59d [GVN] teach ConstantFolding correct handling of non-integral addrspace casts
Here we teach the ConstantFolding analysis pass that it is not legal to
replace a load of a bitcast constant (having a non-integral addrspace)
with a bitcast of the value of that constant (with a different
non-integral addrspace).

But also teach it that certain bit patterns are always known and
convertable (a fact it already uses elsewhere). This required us to also
fix a globalopt test, since, after this change, LLVM is able to realize
that the test actually is a valid transform (NULL is always a known
bit-pattern) and so it doesn't need to emit the failure remarks for it.

Also simplify some of the negative tests for transforms by avoiding a
type change in their bitcast, and add positive versions of the same
tests, to show that they otherwise should work.

Differential Revision: https://reviews.llvm.org/D59730
2020-07-13 21:44:17 -04:00
Jameson Nash e244f86f4d [VNCoercion] avoid creating bitcast for zero offsets [NFCI]
This could previously make it more complicated for ConstantFolding
later, leading to a higher likelyhood it would have to reject the
expression, even though zero seems like probably the common case here.

Differential Revision: https://reviews.llvm.org/D59730
2020-07-13 21:44:17 -04:00
Gui Andrade bfa3b627c6 [InstCombine] Erase attribute lists for simplified libcalls
Currently, a transformation like pow(2.0, x) -> exp2(x) copies the pow
attribute list verbatim and applies it to exp2. This works out fine
when the attribute list is empty, but when it isn't clang may error due
due to the mismatch.

The source function and destination don't necessarily have anything
to do with one another, attribute-wise. So it makes sense to remove
the attribute lists (this is similar to what IPO does in this
situation).

This was discovered after implementing the `noundef` param attribute.

Differential Revision: https://reviews.llvm.org/D82820
2020-07-13 22:32:33 +00:00
Nikita Popov 353fa4403a [PredicateInfo] Place predicate info after assume
Place the ssa.copy instructions for assumes after the assume,
instead of before it. Both options are valid, but placing them
afterwards prevents assumes from being replaced with assume(true).
This fixes https://bugs.llvm.org/show_bug.cgi?id=37541 in NewGVN
and will avoid a similar issue in SCCP when we handle more
predicate infos.

Differential Revision: https://reviews.llvm.org/D83631
2020-07-13 21:10:11 +02:00
Michael Liao 0b4cf802fa [fix-irreducible] Skip unreachable predecessors.
Summary:
- Skip unreachable predecessors during header detection in SCC. Those
  unreachable blocks would be generated in the switch lowering pass in
  the corner cases or other frontends. Even though they could be removed
  through the CFG simplification, we should skip them during header
  detection.

Reviewers: sameerds

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83562
2020-07-11 10:08:44 -04:00
Sidharth Baveja e541e1b757 [NFC] Separate Peeling Properties into its own struct (re-land after minor fix)
Summary:
This patch separates the peeling specific parameters from the UnrollingPreferences,
and creates a new struct called PeelingPreferences. Functions which used the
UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-10 18:39:30 +00:00
SharmaRithik e71c7b593a [CodeMoverUtils] Move OrderedInstructions to CodeMoverUtils
Summary: This patch moves OrderedInstructions to CodeMoverUtils as It was
the only place where OrderedInstructions is required.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto, fhahn, nikic
Reviewed By: Whitney, nikic
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80643
2020-07-10 11:22:43 +05:30
Chen Zheng f1efb8bb4b [SCEV][IndVarSimplify] insert point should not be block front.
The block front may be a PHI node, inserting a cast instructions like
BitCast, PtrToInt, IntToPtr among PHIs is not right.

Reviewed By: lebedev.ri

Differential Revision:  https://reviews.llvm.org/D80975
2020-07-09 21:56:57 -04:00
Nikita Popov c0308fd154 [PredicateInfo] Print RenamedOp (NFC)
Make it easier to debug renaming issues.
2020-07-09 23:14:24 +02:00
Florian Hahn b805e94477 [PredicateInfo] Add additional RenamedOp field to PB.
OriginalOp of a predicate always refers to the original IR
value that was renamed. So for nested predicates of the same value, it
will always refer to the original IR value.

For the use in SCCP however, we need to find the renamed value that is
currently used in the condition associated with the predicate. This
patch adds a new RenamedOp field to do exactly that.

NewGVN currently relies on the existing behavior to merge instruction
metadata. A test case to check for exactly that has been added in
195fa4bfae.

Reviewers: efriedma, davide, nikic

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D78133
2020-07-09 09:51:18 +01:00
Nikita Popov 0b39d2d752 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit 0369dc98f9.

Many failing tests.
2020-07-08 21:43:32 +02:00
Gui Andrade ff7900d5de [LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.

This patch allows LLVM to parse the attribute.

Differential Revision: https://reviews.llvm.org/D83412
2020-07-08 19:02:04 +00:00
Sidharth Baveja 0369dc98f9 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:59:59 +00:00
Anh Tuyen Tran 6965af43e6 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit fead250b43.
2020-07-08 18:58:05 +00:00
Anh Tuyen Tran fead250b43 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:56:03 +00:00
SharmaRithik 082e395230 [CodeMoverUtils] Make specific analysis dependent checks optional
Summary: This patch makes code motion checks optional which are dependent on
specific analysis example, dominator tree, post dominator tree and dependence
info. The aim is to make the adoption of CodeMoverUtils easier for clients that
don't use analysis which were strictly required by CodeMoverUtils. This will
also help in diversifying code motion checks using other analysis example MSSA.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D82566
2020-07-07 20:11:07 +05:30
Roman Lebedev 69dca6efc6
[NFCI][IR] Introduce CallBase::Create() wrapper
Summary:
It is reasonably common to want to clone some call with different bundles.
Let's actually provide an interface to do that.

Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers

Reviewed By: nickdesaulniers

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83248
2020-07-07 01:16:36 +03:00
Nicolai Hähnle 76c5cb05a3 DomTree: Remove getChildren() accessor
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.

New methods are introduced to cover common access patterns.

Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83083
2020-07-06 21:58:11 +02:00
Roman Lebedev 11a3f040c7
[Utils] Make -assume-builder/-assume-simplify actually work on Old-PM
clang w/ old-pm currently would simply crash
when -mllvm  -enable-knowledge-retention=true is specified.

Clearly, these two passes had no Old-PM test coverage,
which would have shown the problem - not requiring AssumptionCacheTracker,
but then trying to always get it.

Also, why try to get domtree only if it's cached,
but at the same time marking it as required?
2020-07-04 21:06:36 +03:00
Guillaume Chatelet 8dbafd24d6 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82977
2020-07-02 11:28:02 +00:00
Sergey Dmitriev cb8faaacb5 [CallGraph] Add support for callback call sites
Summary:
This patch changes call graph analysis to recognize callback call sites
and add an artificial 'reference' call record from the broker function
caller to the callback function in the call graph. A presence of such
reference enforces bottom-up traversal order for callback functions in
CG SCC pass manager because callback function logically becomes a callee
of the broker function caller.

Reviewers: jdoerfert, hfinkel, sstefan1, baziotis

Reviewed By: jdoerfert

Subscribers: hiraditya, kuter, sstefan1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82572
2020-07-01 13:44:11 -07:00
Simon Pilgrim cfb5b144cf Fix Wdocumentation warnings by only tagging a param id once per doxygen comment block. NFC. 2020-07-01 12:01:19 +01:00
Max Kazantsev f01d9e6fc3 [SimplifyCFG] Fix inconsistency in block size assessment for threading
Sometimes SimplifyCFG may decide to perform jump threading. In order
to do it, it follows the following algorithm:

1. Checks if the block is small enough for threading;
2. If yes, inserts a PR Phi relying that the next iteration will remove it
   by performing jump threading;
3. The next iteration checks the block again and performs the threading.

This logic has a corner case: inserting the PR Phi increases block's size
by 1. If the block size at first check was max possible, one more Phi will
exceed this size, and we will neither perform threading nor remove the
created Phi node. As result, we will end up with worse IR than before.

This patch fixes this situation by excluding Phis from block size computation.
Excluding Phis from size computation for threading also makes sense by
itself because in case of threadign all those Phis will be removed.

Differential Revision: https://reviews.llvm.org/D81835
Reviewed By: asbirlea, nikic
2020-06-30 12:40:07 +07:00
serge-sans-paille b4130e6e99 Correctly report Changed status in FoldBranchToCommonDest
It's possible for the first loop trip(s) to set the `Changed` Status, and to a
later one to early exit, in which case `Changed` must be return.

Differential Revision: https://reviews.llvm.org/D82753
2020-06-29 18:13:42 +02:00
Vedant Kumar c1cad151b0 [debugify] Demote an error about empty locations to a warning
In https://reviews.llvm.org/D81198, we outlined a number of scenarios
where dropping debug locations is appropriate. Stop issuing an error
when this happens.
2020-06-26 14:55:02 -07:00
Simon Pilgrim 70f290d95c VNCoercion.cpp - remove unused includes. NFC. 2020-06-26 09:58:20 +01:00
Simon Pilgrim 8c2082e1dc GlobalsModRef.h - reduce CallGraph.h include to forward declarations. NFC.
Fix implicit include dependencies in source files.
2020-06-25 16:00:43 +01:00
Simon Pilgrim a53dddb3e9 Local.h - reduce includes to forward declarations. NFC.
Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.
2020-06-24 19:27:37 +01:00
Vedant Kumar f8bd6a75ed [SimplifyCFG] Drop debug loc in SpeculativelyExecuteBB
Summary:
According to HowToUpdateDebugInfo.rst:

```
Preserving the debug locations of speculated instructions can make
it seem like a condition is true when it's not (or vice versa), which
leads to a confusing single-stepping experience
```

This patch follows the recommendation to drop debug locations on
speculated instructions.

Reviewers: aprantl, davide

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82420
2020-06-23 18:25:52 -07:00
Ryan Santhiraraja f64dc4e686 Preserve GlobalsAA analysis result in InjectTLIMappings
InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue.

Patch by: Ryan Santhiraraja <rsanthir@quicinc.com>

Reviewers: fpetrogalli, joerg, fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D82343
2020-06-23 22:05:42 +01:00
Simon Pilgrim 36bc10e74a [Transforms] Ensure we include CommandLine.h if we declare any cl::opt flags 2020-06-23 12:11:51 +01:00
Roman Lebedev d57e9aca01
[IndVarSimplify] Don't replace IV user with unsafe loop-invariant (PR45360)
Summary:
As [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]] reports,
with new cost-model we can sometimes end up being able to expand `udiv`/`urem` instructions.
And that exposes at least one instance of when we do that
regardless of whether or not it is safe to do.
In this particular case, it's `SimplifyIndvar::replaceIVUserWithLoopInvariant()`.

It seems to me, we simply need to check with `isSafeToExpandAt()` first.

The test isn't great. I'm not sure how to make it only run `-indvars`.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]].

Reviewers: mkazantsev, reames, helloqirun

Reviewed By: mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82108
2020-06-23 13:53:15 +03:00
Arthur Eubanks d335c1317b Fix dynamic alloca detection in CloneBasicBlock
Summary:
Simply check AI->isStaticAlloca instead of reimplementing checks for
static/dynamic allocas.

Reviewers: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82328
2020-06-22 15:06:28 -07:00
Hiroshi Yamauchi 9e1decf743 [PGO][PGSO] Enable non-cold size opts under partial profile sample PGO.
Summary: Similar to D81020. Follow up D78949.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82053
2020-06-22 10:12:48 -07:00
Serguei Katkov eae0d2e9b2 Revert "[Peeling] Extend the scope of peeling a bit"
This reverts commit 29b2c1ca72.

The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!

Not sure the patch itself it wrong but revert to investigate the failure.
2020-06-22 17:48:29 +07:00
Serguei Katkov 29b2c1ca72 [Peeling] Extend the scope of peeling a bit
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.

Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.

Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
2020-06-22 12:17:44 +07:00
Yevgeny Rouban 6429471e8b [IR] Convert profile metadata in createCallMatchingInvoke()
When an invoke instruction is converted to a call its
profile metadata is dropped because it has incompatible
format (see commit 16ad6eeb94).
This patch adds an attempt to convert profile data to
format of the call instruction. This used to work well
before the commit dcfa78a4cc.

Reviewers: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82071
2020-06-20 12:10:31 +07:00
Tyker b7338fb1a6 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-19 10:32:26 +02:00
Matt Arsenault b13f6b0fe0 BypassSlowDivision: Fix dropping debug info
I don't know anything about debug info, but this seems like more work
should be necessary. This constructs a new IRBuilder and reconstructs
the original divides rather than moving the original.

One problem this has is if a div/rem pair are handled, both end up
with the same debugloc. I'm not sure how to fix this, since this uses
a cache when it sees the same input operands again, which will have
the first instance's location attached.
2020-06-18 17:27:19 -04:00
Christopher Tetreault 8d11ec66b6 [SVE] Remove calls to VectorType::getNumElements from Transforms/Utils
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82057
2020-06-18 13:39:14 -07:00
Sanjay Patel 46a285ad9e [IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC
The predicate can always be used to distinguish between icmp and fcmp,
so we don't need to keep repeating this check in the callers.
2020-06-18 15:47:06 -04:00
Davide Italiano 8cdd2a158c [SimplifyCFG] Update debug location when folding branch to common destination
Sometimes a dead block gets folded and the debug information is still
retained. This manifests as jumpy stepping in lldb, see the bugzilla PR
for an end-to-end C testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46008

Differential Revision:  https://reviews.llvm.org/D82062
2020-06-18 12:33:32 -07:00
Nick Desaulniers 88c965ba14 BreakCriticalEdges for callbr indirect dests
Summary:
llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors).  It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return).  Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).

Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018

Reviewers: craig.topper, jyknight, void, fhahn, efriedma

Reviewed By: efriedma

Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81607
2020-06-17 11:45:06 -07:00
Hans Wennborg 16ad6eeb94 [IR] Don't copy profile metadata in createCallMatchingInvoke()
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.

Differential revision: https://reviews.llvm.org/D81996
2020-06-17 11:18:23 +02:00
Tyker d7deef1206 Revert "[AssumeBundles] add cannonicalisation to the assume builder"
This reverts commit 90c50cad19.
2020-06-16 14:34:55 +02:00
Tyker 90c50cad19 [AssumeBundles] add cannonicalisation to the assume builder
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650
2020-06-16 13:12:35 +02:00
Jay Foad 6fdd5a28b7 Revert "[IR] Clean up dead instructions after simplifying a conditional branch"
This reverts commit 69bdfb075b.

Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343
2020-06-16 10:32:15 +01:00
Whitney Tsang 5225cd43e8 [LoopUnroll] Allow loops with multiple exiting blocks where loop latch
is not necessary one of them.

Summary: Currently LoopUnrollPass already allow loops with multiple
exiting blocks, but it is only allowed when the loop latch is one of the
exiting blocks.
When the loop latch is not an exiting block, then only single exiting
block is supported.
When possible, the single loop latch or the single exiting block
terminator is optimized to an unconditional branch in the unrolled loop.

This patch allows loops with multiple exiting blocks even if the loop
latch is not one of them. However, the optimization of exiting block
terminator to unconditional branch is not done when there exists more
than one exiting block.
Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour
Reviewed By: efriedma
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D81053
2020-06-14 18:44:18 +00:00
Florian Hahn 4495a6b141 [BreakCritEdges] Add option to opt-out of perserving loop-simplify.
This patch adds a new option to CriticalEdgeSplittingOptions to control
whether loop-simplify form must be preserved. It is them used by GVN to
indicate that loop-simplify form does not have to be preserved.

This fixes a crash exposed by 189efe295b.

If the critical edge we are splitting goes from a block inside a loop to
a block outside the loop, splitting the edge will create a new exit
block. As a result, the new block will branch to the original exit
block, which will add a non-loop predecessor, breaking loop-simplify
form. To preserve loop-simplify form, the predecessor blocks of the
original exit are split, but that does not work for blocks with
indirectbr terminators. If preserving loop-simplify form is requested,
bail out , before making any changes.

Reviewers: reames, hfinkel, davide, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81582
2020-06-12 11:47:13 +01:00
Alina Sbirlea 519b019a0a Verify MemorySSA after all updates.
Verify after completing all updates.
Resolves PR46275.
2020-06-11 18:48:41 -07:00
Jay Foad 69bdfb075b [IR] Clean up dead instructions after simplifying a conditional branch
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.

Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when
removePredecessor calls PHINode::removeIncomingValue.

Differential Revision: https://reviews.llvm.org/D80206
2020-06-11 14:53:01 +01:00
Jay Foad f45c65aa41 Revert "[IR] Clean up dead instructions after simplifying a conditional branch"
This reverts commit 4494e45316.

It caused problems for sanitizer buildbots.
2020-06-11 14:22:16 +01:00
Jay Foad 4494e45316 [IR] Clean up dead instructions after simplifying a conditional branch
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.

Differential Revision: https://reviews.llvm.org/D80206
2020-06-11 13:28:10 +01:00
Chris Jackson 4707bc2177 [DebugInfo] Refactor SalvageDebugInfo and SalvageDebugInfoForDbgValues
- Simplify the salvaging interface and the algorithm in InstCombine

Reviewers: vsk, aprantl, Orlando, jmorse, TWeaver

Reviewed by: Orlando

Differential Revision: https://reviews.llvm.org/D79863
2020-06-11 11:13:46 +01:00
serge-sans-paille 9daccb7a47 Correctly update Changed status for SimplifyCFG
Interestingly, this leads to better output in one of the test case.

Differential Revision: https://reviews.llvm.org/D81237
2020-06-10 16:54:15 +02:00
Marco Elver d3f89314ff [KernelAddressSanitizer] Make globals constructors compatible with kernel [v2]
[ v1 was reverted by c6ec352a6b due to
  modpost failing; v2 fixes this. More info:
  https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783 ]

This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:

* Disable generation of constructors that rely on linker features such
  as dead-global elimination.

* Only instrument globals *not* in explicit sections. The kernel uses
  sections for special globals, which we should not touch.

* Do not instrument globals that are prefixed with "__" nor that are
  aliased by a symbol that is prefixed with "__". For example, modpost
  relies on specially named aliases to find globals and checks their
  contents. Unfortunately modpost relies on size stored as ELF debug info
  and any padding of globals currently causes the debug info to cause size
  reported to be *with* redzone which throws modpost off.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493

Tested:
* With 'clang/test/CodeGen/asan-globals.cpp'.

* With test_kasan.ko, we can see:

  	BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]

* allyesconfig, allmodconfig (x86_64)

Reviewed By: glider

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81390
2020-06-10 15:08:42 +02:00
Hans Wennborg fc202c5fec [PGO] CallPromotion: Don't try to pass sret args to varargs functions
It's not allowed by the verifier.

Differential revision: https://reviews.llvm.org/D81409
2020-06-08 21:10:27 +02:00
Chris Jackson c6c65164af [DebugInfo] Reduce SalvageDebugInfo() functions
- Now all SalvageDebugInfo() calls will mark undef if the salvage
  attempt fails.

 Reviewed by: vsk, Orlando

 Differential Revision: https://reviews.llvm.org/D78369
2020-06-08 19:28:18 +01:00
Hiroshi Yamauchi b5632f4083 [PGO][PGSO] Enable non-cold code size opts under non-partial-profile sample PGO.
Summary: Following up D78949.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81020
2020-06-08 10:02:00 -07:00
Marco Elver c6ec352a6b Revert "[KernelAddressSanitizer] Make globals constructors compatible with kernel"
This reverts commit 866ee2353f.

Building the kernel results in modpost failures due to modpost relying
on debug info and inspecting kernel modules' globals:
https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783
2020-06-08 10:34:03 +02:00
Benjamin Kramer 3badd17b69 SmallPtrSet::find -> SmallPtrSet::count
The latter is more readable and more efficient. While there clean up
some double lookups. NFCI.
2020-06-07 22:38:08 +02:00
Simon Pilgrim f14d4c9c54 EHPersonalities.h - reduce Triple.h include to forward declaration. NFC.
Move implicit include dependencies down to source files.
2020-06-06 15:48:31 +01:00
Simon Pilgrim 5006e551d3 LoopAnalysisManager.h - reduce includes to forward declarations. NFC.
Move implicit include dependencies down to header/source files.
2020-06-06 14:06:46 +01:00
Marco Elver 866ee2353f [KernelAddressSanitizer] Make globals constructors compatible with kernel
Summary:
This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:

- Disable generation of constructors that rely on linker features such
  as dead-global elimination.

- Only emit constructors for globals *not* in explicit sections. The
  kernel uses sections for special globals, which we should not touch.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493

Tested:
1. With 'clang/test/CodeGen/asan-globals.cpp'.
2. With test_kasan.ko, we can see:

  	BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]

Reviewers: glider, andreyknvl

Reviewed By: glider

Subscribers: cfe-commits, nickdesaulniers, hiraditya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D80805
2020-06-05 20:20:46 +02:00
serge-sans-paille 2e5940cf29 Correctly report modified status for LoopSimplify
Differential Revision: https://reviews.llvm.org/D81235
2020-06-05 15:46:28 +02:00
Huihui Zhang bd43f78c76 [LSR][SCEVExpander] Avoid blind cast 'Factor' to SCEVConstant in FactorOutConstant.
Summary:
In SCEVExpander FactorOutConstant(), when GEP indexing into/over scalable vector,
it is legal for the 'Factor' in a MulExpr to be the size of a scalable vector
instead of a compile-time constant.

Current upstream crash with the test attached.

Reviewers: efriedma, sdesmalen, sanjoy.google, mkazantsev

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80973
2020-06-04 10:33:39 -07:00
Sanjay Patel dd54432a0f [InstNamer] use 'i' for Instructions, not 'tmp'
As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and
D80584, the name 'tmp' is almost always a bad choice, but we have
a legacy of regression tests with that name because it was baked
into utils/update_test_checks.py.

This change makes -instnamer more consistent (already using "arg"
and "bb", the common LLVM shorthand). And it avoids the conflict
in telling users of the FileCheck script to run "-instnamer" to
create a better regression test and having that cause a warn/fail
in update_test_checks.py.
2020-06-01 11:11:14 -04:00
Whitney Tsang 7873376bb3 [LoopUnroll] Fix build failure for allyesconfig.
Differential Revision: https://reviews.llvm.org/D80477.
2020-05-30 18:32:47 +00:00
Christopher Tetreault c8f1aca316 [SVE] Eliminate calls to default-false VectorType::get() from Utils
Reviewers: efriedma, c-rhodes, sdesmalen, xbolva00

Reviewed By: c-rhodes

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80337
2020-05-29 15:01:18 -07:00
Ehud Katz c710bb44a6 [Local] Prevent `invertCondition` from creating a redundant instruction
Prevent `invertCondition` from creating the inversion instruction, in
case the given value is an argument which has already been inverted.
Note that this approach has already been taken in case the given value
is an instruction (and not an argument).

Differential Revision: https://reviews.llvm.org/D80399
2020-05-29 21:08:22 +03:00
Whitney Tsang 4e74541a92 [LoopUnroll] Fix not-rotated.ll by adding back a limitation was unintentionally
removed in https://reviews.llvm.org/D80477
2020-05-29 03:05:58 +00:00
Whitney Tsang 1bc73b02d6 [LoopUnroll] Support loops with exiting block that is neither header nor
latch.

Summary: Remove the limitation in LoopUnrollPass that exiting block must
be either header or latch.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto,
fhahn, efriedma
Reviewed By: etiotto, fhahn, efriedma
Subscribers: efriedma, lkail, xbolva00, hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80477
2020-05-29 01:18:38 +00:00
Whitney Tsang 47ffc81830 Revert "[LoopUnroll] Support loops with exiting block that is neither header nor"
This reverts commit 2810582265.

Revert until
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/7334
is resolved.
2020-05-28 19:10:27 +00:00
Whitney Tsang 2810582265 [LoopUnroll] Support loops with exiting block that is neither header nor
latch.

Summary: Remove the limitation in LoopUnrollPass that exiting block must
be either header or latch.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto,
fhahn, efriedma
Reviewed By: etiotto, fhahn, efriedma
Subscribers: efriedma, lkail, xbolva00, hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80477
2020-05-28 18:27:09 +00:00
Sidharth Baveja 15b6730f07 Create utility function to Merge Adjacent Basic Blocks
Summary: The following code from
/llvm/lib/Transforms/Utils/LoopUnrollAndJam.cpp can be used by other
transformations:

while (!MergeBlocks.empty()) {
    BasicBlock *BB = *MergeBlocks.begin();
    BranchInst *Term = dyn_cast<BranchInst>(BB->getTerminator());
    if (Term && Term->isUnconditional() &&
L->contains(Term->getSuccessor(0))) {
      BasicBlock *Dest = Term->getSuccessor(0);
      BasicBlock *Fold = Dest->getUniquePredecessor();
      if (MergeBlockIntoPredecessor(Dest, &DTU, LI)) {
        // Don't remove BB and add Fold as they are the same BB
        assert(Fold == BB);
        (void)Fold;
        MergeBlocks.erase(Dest);
      } else
        MergeBlocks.erase(BB);
    } else
      MergeBlocks.erase(BB);
  }
Hence it should be separated into its own utility function.

Authored By: sidbav
Reviewer: Whitney, Meinersbur, asbirlea, dmgreen, etiotto
Reviewed By: asbirlea
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80583
2020-05-28 16:44:37 +00:00
Philip Reames 87bea912c2 [Statepoint] Replace uses of isX functions with idiomatic isa<X>
Now that all of the statepoint related routines have classes with isa support, let's cleanup.

I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.
2020-05-27 18:32:28 -07:00
Rithik Sharma eadf295956 [CodeMoverUtils] Use dominator tree level to decide the direction of
code motion

Summary: Currently isSafeToMoveBefore uses DFS numbering for determining
the relative position of instruction and insert point which is not
always correct. This PR proposes the use of Dominator Tree depth for the
same. If a node is at a higher level than the insert point then it is
safe to say that we want to move in the forward direction.
Authored By: RithikSharma
Reviewer: Whitney, nikic, bmahjour, etiotto, fhahn
Reviewed By: Whitney
Subscribers: fhahn, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80084
2020-05-27 18:02:06 +00:00
David Green 70d4a20299 [UnJ] Update LI for inner nested loops
This makes sure to correctly register the loop info of the children
of unroll and jammed loops. It re-uses some code from the unroller for
registering subloops.

Differential Revision: https://reviews.llvm.org/D80619
2020-05-27 14:36:38 +01:00
Djordje Todorovic 65030821d4 [NFC][Debugify] Format the CheckModuleDebugify output
This fixes the output of the check-debugify option.
Without the patch an example of running the option:

$ opt -check-debugify test.ll -S -o testDebugify.ll
CheckModuleDebugifySkipping module without debugify metadata

After the patch:

$ opt -check-debugify test.ll -S -o testDebugify.ll
CheckModuleDebugify: Skipping module without debugify metadata

Differential Revision: https://reviews.llvm.org/D80553
2020-05-27 10:32:40 +02:00
Florian Hahn 5cf90d6cf1 [LoopUnroll] Simplify latch/header block handling (NFC).
I think the current code dealing with connecting the unrolled iterations
is a bit more complicated than necessary currently. To connect the
unrolled iterations, we have to update the unrolled latch blocks to
branch to the header of the next unrolled iteration.

We need to do this regardless whether the latch is exiting or not.

Additionally, we try to turn the conditional branch in the exiting block
to an unconditional one. This is an optimization only; alternatively we
could leave the conditional branches in place and rely on other passes
to simplify the conditions.

Logically, this is a separate step from connecting the latches to the
headers, but it is convenient to fold them into the same loop, if the
latch is also exiting. For headers (or other non-latch exiting blocks,
this is done separately).

Hopefully the patch with additional comments makes things a bit clearer.

Reviewers: efriedma, dmgreen, hfinkel, Whitney

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D80544
2020-05-26 21:54:12 +01:00
Serge Pavlov 4d20e31f73 [FPEnv] Intrinsic llvm.roundeven
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.

Differential Revision: https://reviews.llvm.org/D75670
2020-05-26 19:24:58 +07:00
Florian Hahn 179c80117c [LoopUnroll] Remove dead NextBlocks argument (NFC). 2020-05-25 22:09:11 +01:00
Whitney Tsang 5d6c5b463c [LoopUtils] Use llvm::find
Summary: Fixes this build error:

llvm/lib/Transforms/Utils/LoopUtils.cpp:679:26: error: no matching
function for call to 'find'
      Loop::iterator I = find(ParentLoop->begin(), ParentLoop->end(),
L);
                         ^~~~
Authored By: orivej
Reviewer: Whitney
Reviewed By: Whitney
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80473
2020-05-25 13:34:56 +00:00
Matt Arsenault cdd006eec9 SimplifyCFG: Clean up optforfuzzing implementation
This should function as any other SimplifyCFGOption rather than having
the transform check and specially consider the attribute itself.
2020-05-23 13:49:50 -04:00
Michal Paszkowski 335de55fa3 Revert "Added a new IRCanonicalizer pass."
This reverts commit 14d358537f.
2020-05-23 13:51:43 +02:00
Michal Paszkowski 14d358537f Added a new IRCanonicalizer pass.
Summary:
Added a new IRCanonicalizer pass which aims to transform LLVM modules into
a canonical form by reordering and renaming instructions while preserving the
same semantics. The canonicalizer makes it easier to spot semantic differences
when diffing two modules which have undergone different passes.

Presentation: https://www.youtube.com/watch?v=c9WMijSOEUg

Reviewed by: plotfi

Differential Revision: https://reviews.llvm.org/D66029
2020-05-23 12:45:53 +02:00
Ehud Katz 111ddc57d3 [FlattenCFG] Fix `MergeIfRegion` in case then-path is empty
In case the then-path of an if-region is empty, then merging with the
else-path should be handled with the inverse of the condition (leading
to that path).

Fix PR37662

Differential Revision: https://reviews.llvm.org/D78881
2020-05-21 14:06:44 +03:00
Roman Lebedev b2df961231
[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835)
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as following:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.

The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
to not bill for the expansions that were already expanded, and we can reuse.

While it makes sense in general - if we know that we will expand some SCEV,
all the other SCEV's costs should account for that, which might cause
some of them to become non-high-cost too, and cause chain reaction.

But that isn't what we are doing here. We expand *all* SCEV's, unconditionally.
So every next SCEV's cost will be affected by the already-performed expansions
for previous SCEV's. Even if we are not planning on keeping
some of the expansions we performed.

Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.

As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.

What we instead should do for now, is not perform any expansions
until after we've queried all the costs.

Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.

See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]

Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma

Reviewed By: dmajor, mkazantsev

Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79787
2020-05-21 13:05:55 +03:00
Benjamin Kramer 5b0d1f04bf Fix a layering violation by not depending from Transforms/Utils on Transforms/Scalar.
NFC.
2020-05-21 09:51:58 +02:00
Yevgeny Rouban 8138487468 [BrachProbablityInfo] Set edge probabilities at once and fix calcMetadataWeights()
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting individual edge probability is error prone. More over it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user finished setting all
the probabilities.

Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-21 12:52:37 +07:00
Juneyoung Lee d9a4a24413 Add CanonicalizeFreezeInLoops pass
Summary:
If an induction variable is frozen and used, SCEV yields imprecise result
because it doesn't say anything about frozen variables.

Due to this reason, performance degradation happened after
https://reviews.llvm.org/D76483 is merged, causing
SCEV yield imprecise result and preventing LSR to optimize a loop.

The suggested solution here is to add a pass which canonicalizes frozen variables
inside a loop. To be specific, it pushes freezes out of the loop by freezing
the initial value and step values instead & dropping nsw/nuw flags from instructions used by freeze.
This solution was also mentioned at https://reviews.llvm.org/D70623 .

Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert

Reviewed By: fhahn

Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77523
2020-05-21 09:29:29 +09:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Benjamin Kramer 350dadaa8a Give helpers internal linkage. NFC. 2020-05-19 22:16:37 +02:00
Nikita Popov 5fae613a4f [LVI] Don't require DominatorTree in LVI (NFC)
After D76797 the dominator tree is no longer used in LVI, so we
can remove it as a pass dependency, and also get rid of the
dominator tree enabling/disabling logic in JumpThreading.

Apart from cleaning up the code, this also clarifies LVI
cache consistency, in that the LVI cache can no longer
depend on whether the DT was or wasn't enabled due to
pending DT updates at any given time.

Differential Revision: https://reviews.llvm.org/D76985
2020-05-19 20:21:46 +02:00
Jay Foad 9bc989a48d [InstCombine] Remove hasNoInfs check for pow(C,y) -> exp2(log2(C)*y)
We already check hasNoNaNs and that x is finite and strictly positive.
That only leaves the following special cases (taken from the Linux man
page for pow):

If x is +1, the result is 1.0 (even if y is a NaN).
If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity.
If the absolute value of x is greater than 1, and y is negative infinity, the result is +0.
If the absolute value of x is less than 1, and y is positive infinity, the result is +0.
If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity.

The first case is handled elsewhere, and this transformation preserves
all the others, so there is no need to limit it to hasNoInfs.

Differential Revision: https://reviews.llvm.org/D79409
2020-05-19 17:06:05 +01:00
Sameer Sahasrabuddhe 6c84884366 [LoopSimplify] don't separate nested loops with convergent calls
Summary:
When a loop has multiple backedges, loop simplification attempts to
separate them out into nested loops. This results in incorrect control
flow in the presence of some functions like a GPU barrier. This change
skips the transformation when such "convergent" function calls are
present in the loop body.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D80078
2020-05-19 09:22:39 +05:30
Vedant Kumar 623b254244 [Local] Do not ignore zexts in salvageDebugInfo, PR45923
Summary:
When salvaging a dead zext instruction, append a convert operation to
the DIExpressions of the debug uses of the instruction, to prevent the
salvaged value from being sign-extended.

I confirmed that lldb prints out the correct unsigned result for "f" in
the example from PR45923 with this changed applied.

rdar://63246143

Reviewers: aprantl, jmorse, chrisjackson, davide

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80034
2020-05-18 09:52:02 -07:00
Craig Topper 5f65faef2c ValueMapper does not preserve inline assembly dialect when remapping the type
Bug report: https://bugs.llvm.org/show_bug.cgi?id=45291

Patch by Tomasz Miąsko

Differential Revision: https://reviews.llvm.org/D80066
2020-05-17 14:57:50 -07:00
Eli Friedman 4f04db4b54 AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968.  Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.

Differential Revision: https://reviews.llvm.org/D80044
2020-05-16 14:53:16 -07:00
Mircea Trofin 08e2386dee Revert "Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs""
This reverts commit 454de99a6f.

The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.

Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

Original Differential Revision: https://reviews.llvm.org/D79917
2020-05-15 12:29:16 -07:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Scott Linder 03c44c7584 [NFC] Deduplicate comment in PromoteMemoryToRegister.cpp
This has been duplicated since before
2372a193ba, but that commit has it
appearing twice in the space of 10 lines of the same function body. It
could also be hoisted up to the point just after where the last
special-case is considered, but I want to keep the intent of the
original authors.

Committed as obvious without a review.
2020-05-15 15:18:07 -04:00
Nikita Popov f89f7da999 [IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.

In the future, this attribute may be replaced with data layout
properties.

Differential Revision: https://reviews.llvm.org/D78862
2020-05-15 19:41:07 +02:00
Anna Thomas 7cc3769adb [VectorUtils] Expose vector-function-abi-variant mangling as a utility.
Summary:
This change exposes the vector name mangling with LLVM ISA (used as part
of vector-function-abi-variant) as a utility.
This can then be used by front-ends that add this attribute.
Note that all parameters passed in to the function will be mangled with
the "v" token to identify that they are of of vector type. So, it is the
responsibility of the caller to confirm that all parameters in the
vectorized variant is of vector type.

Added unit test to show vector name mangling.

Reviewed-By: fpetrogalli, simoll

Differential Revision: https://reviews.llvm.org/D79867
2020-05-15 11:42:20 -04:00
Mircea Trofin 454de99a6f Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs"
This reverts commit 767db5be67.
2020-05-14 22:32:44 -07:00
Mircea Trofin 767db5be67 [llvm][NFC] Cleanup uses of std::function in Inlining-related APIs
Summary:
Replacing uses of std::function pointers or refs, or Optional, to
function_ref, since the usage pattern allows that. If the function is
optional, using a default parameter value (nullptr). This led to a few
parameter reshufles, to push all optionals to the end of the parameter
list.

Reviewers: davidxl, dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79917
2020-05-14 22:13:53 -07:00
Alina Sbirlea bd541b217f [NewPassManager] Add assertions when getting statefull cached analysis.
Summary:
Analyses that are statefull should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.

Reviewers: chandlerc

Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72893
2020-05-13 12:38:38 -07:00
Reid Kleckner 1370757dd0 Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC."
This reverts commit eef95f2746.

The new assertion about branch propability sums does not hold.
2020-05-13 08:23:09 -07:00
Yevgeny Rouban eef95f2746 [BrachProbablityInfo] Set edge probabilities at once. NFC.
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-13 13:55:36 +07:00
Zequan Wu cb22ab7403 Add nomerge function attribute to supress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
usecase where having this attribute would be beneficial to avoid
alternative work-arounds.

Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.

`nomerge` is different from `noline`. `noinline` prevents function from
inlining at callsites, but `nomerge` prevents multiple identical calls
from being merged into one.

This patch adds `nomerge` to disable the optimization in IR level. A
followup patch will be needed to let backend understands `nomerge` and
avoid tail merge at backend.

Reviewed By: asbirlea, rnk

Differential Revision: https://reviews.llvm.org/D78659
2020-05-12 16:49:20 -07:00
Tyker 78d85c2091 [AssumeBundles] fix crashes
Summary:
this patch fixe crash/asserts found in the test-suite.
the AssumeptionCache cannot be assumed to have all assumes contrary to what i tought.
prevent generation of information for terminators, because this can create broken IR in transfromation where we insert the new terminator before removing the old one.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79458
2020-05-11 11:52:21 +02:00
OCHyams da100de0a6 [NFC][DwarfDebug] Add test for variables with a single location which
don't span their entire scope.

The previous commit (6d1c40c171) is an older version of the test.

Reviewed By: aprantl, vsk

Differential Revision: https://reviews.llvm.org/D79573
2020-05-11 11:49:11 +02:00
Tyker 5957e058e4 [AssumeBundles] Remove non-determinisme from assume builder
Summary:
The assume builder was non-deterministic when working on unamed values.
this patch fixes this.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78616
2020-05-10 21:18:33 +02:00
Tyker 821a0f23d8 [AssumeBundles] Prevent generation of some redundant assumes
Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume bulider can also in some cases update an existing assume with better information.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78014
2020-05-10 19:23:59 +02:00
Florian Hahn 8528186b9b [LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC)
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78460
2020-05-10 17:39:26 +01:00
zoecarver f65f566aeb Re-commit: Mark values as trivially dead when their only use is a start or end lifetime intrinsic.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.

Currently, this only works for allocas, globals, and arguments.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79355
2020-05-08 12:24:10 -07:00
Ricky Zhou b38d77f185 [SimplifyCFG] Remap rewritten debug intrinsic operands.
FoldBranchToCommonDest clones instructions to a different basic block,
but handles debug intrinsics in a separate path. Previously, when
cloning debug intrinsics, their operands were not updated to reference
the correct cloned values. As a result, we would emit debug.value
intrinsics with broken operand references which are discarded in later
passes. This leads to incorrect debuginfo that reports incorrect values
for variables.

Fix this by remapping debug intrinsic operands when cloning them.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45667.

Differential Revision: https://reviews.llvm.org/D79602
2020-05-08 11:10:25 -07:00
Yevgeny Rouban b921543c49 SplitIndirectBrCriticalEdges: Fix Branch Probability update
Splitting critical edges for indirect branches
the SplitIndirectBrCriticalEdges() function may break branch
probabilities if target basic block happens to have unset
a probability for any of its successors. That is because in
such cases the getEdgeProbability(Target) function returns
probability 1/NumOfSuccessors and it is called after Target
was split (thus Target has a single successor). As the result
the correspondent successor of the split block gets
probability 100% but 1/NumOfSuccessors is expected (or better
be left unset).

Reviewers: yamauchi
Differential Revision: https://reviews.llvm.org/D78806
2020-05-07 15:31:44 +07:00
Whitney Tsang 0a52401ad6 [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels
loop nest.

Summary: As discussed in https://reviews.llvm.org/D73129.

Example
Before unroll and jam:

for
  A
  for
    B
    for
      C
    D
  E
After unroll and jam (currently):

for
  A
  A'
  for
    B
    for
      C
    D
    B'
    for
      C'
    D'
  E
  E'
After unroll and jam (Ideal):

for
  A
  A'
  for
    B
    B'
    for
      C
      C'
    D
    D'
  E
  E'
This is the first patch to change unroll and jam to work in the ideal
way.
This patch change the safety checks needed to make sure is safe to
unroll and jam in the ideal way.

Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: Meinersbur
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D76132
2020-05-06 21:47:44 +00:00
zoecarver 1998e796e9 Revert "Mark values as trivially dead when their only use is a start or end lifetime intrinsic."
This reverts commit 95aa28cc8f.
2020-05-06 11:07:22 -07:00
zoecarver 95aa28cc8f Mark values as trivially dead when their only use is a start or end lifetime intrinsic.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.

Currently, this only works for allocas, globals, and arguments.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79355
2020-05-06 10:58:08 -07:00
Jay Foad 22829ab5fa [InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y)
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.

Differential Revision: https://reviews.llvm.org/D79413
2020-05-05 16:25:48 +01:00
Jay Foad fa2783d79a [InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x)
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.

Differential Revision: https://reviews.llvm.org/D79407
2020-05-05 14:46:08 +01:00
Sergey Dmitriev f637334df9 [CallGraphUpdater] Removed references to calles when deleting function
Summary: Otherwise we can get unaccounted references to call graph nodes.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79382
2020-05-04 18:59:47 -07:00
Jay Foad e737847b8f [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func
optimizePow does not create any new calls to pow, so it should work
regardless of whether the pow library function is available. This allows
it to optimize the llvm.pow intrinsic on targets with no math library.

Based on a patch by Tim Renouf.

Differential Revision: https://reviews.llvm.org/D68231
2020-05-04 10:54:07 +01:00
Hongtao Yu 911e06f5eb [ICP] Handling must tail calls in indirect call promotion
Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization which could result an IR like:

    ; <label>:2192:
      br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483

    ; <label>:2199:                                   ; preds = %2192
      musttail call fastcc void @foo(i8* %2195), !dbg !226012
      br label %2202, !dbg !226012

    ; <label>:2201:                                   ; preds = %2192
      musttail call fastcc void %2197(i8* %2195), !dbg !226012
      br label %2202, !dbg !226012

    ; <label>:2202:                                   ; preds = %605, %2201, %2199
      ret void, !dbg !229485

This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like:

    ; <label>:2192:
      br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483

    ; <label>:2199:                                   ; preds = %2192
      musttail call fastcc void @foo(i8* %2195), !dbg !226012
      ret void, !dbg !229485

    ; <label>:2201:                                   ; preds = %2192
      musttail call fastcc void %2197(i8* %2195), !dbg !226012
      ret void, !dbg !229485

Differential Revision: https://reviews.llvm.org/D79258
2020-05-03 10:42:22 -07:00
Nikita Popov 60e9ee16b4 [MergeFuncs] Don't merge shufflevectors with different masks
When the shufflevector mask operand was converted into special
instruction data, the FunctionComparator was not updated to
account for this. As such, MergeFuncs will happily merge
shufflevectors with different masks.

This fixes https://bugs.llvm.org/show_bug.cgi?id=45773.

Differential Revision: https://reviews.llvm.org/D79261
2020-05-02 10:21:14 +02:00
Florian Hahn 19ab53f1e2 [LoopVersioning] Update setAliasChecks to take ArrayRef argument (NFC).
This cleanup was suggested as part of D78458.
2020-04-30 22:17:12 +01:00
Nikita Popov b74c6d2c9d [InlineFunction] Disable emission of alignment assumptions by default
In D74183 clang started emitting alignment for sret parameters
unconditionally. This caused a 1.5% compile-time regression on
tramp3d-v4. The reason is that we now generate many instance of IR like

    %ptrint = ptrtoint %class.GuardLayers* %guards_m to i64
    %maskedptr = and i64 %ptrint, 3
    %maskcond = icmp eq i64 %maskedptr, 0
    tail call void @llvm.assume(i1 %maskcond)

to preserve the alignment information during inlining. Based on IR
analysis, these assumptions also regress optimization. The attached
phase ordering test case illustrates two issues: One are instruction
count based optimization heuristics, which are affected by the four
additional instructions of the assumption. The other is blocking of
SROA due to ptrtoint casts (PR45763).

We already encountered the same problem in Rust, where we (unlike
Clang) generally prefer to emit alignment information absolutely
everywhere it is available. We were only able to do this after
hardcoding -preserve-alignment-assumptions-during-inlining=false,
because we were seeing significant optimization and compile-time
regressions otherwise.

This patch disables -preserve-alignment-assumptions-during-inlining
by default, because we should not be punishing people for adding
more alignment annotations.

Once the assume bundle work shakes out and we can represent (and use)
alignment assumptions using assume bundles, it should be possible to
re-enable this with reduced overhead.

Differential Revision: https://reviews.llvm.org/D76886
2020-04-30 23:12:54 +02:00
Arthur Eubanks a90948fd6e [NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79152
2020-04-30 09:42:13 -07:00
Mircea Trofin 3ab319b295 [llvm][NFC] Use CallBase explicitly instead of Instruction in FunctionComparator
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79098
2020-04-29 15:37:46 -07:00
Hiroshi Yamauchi 1831986826 [PGO][PGSO] Prep for enabling non-cold code size opts under non-partial-profile sample PGO.
Summary:
- Distinguish between partial-profile and non-partial-profile sample PGO.
- Add a flag for partial-profile sample PGO.
- Tune the sample PGO cutoff.
- No default behavior change (yet).

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78949
2020-04-29 08:57:47 -07:00
Mircea Trofin e61247c0a8 [llvm][NFC] Change parameter type to more specific CallBase in IndirectCallPromotion
Reviewers: dblaikie, craig.topper, wmi

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79047
2020-04-29 08:42:32 -07:00
Florian Hahn 616657b39c [LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.

Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78458
2020-04-28 21:47:31 +01:00
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Mircea Trofin cb56e9b923 [llvm][NFC] Use CallBase instead of Instruction in ProfileSummaryInfo
Summary:
getProfileCount requires the parameter be a valid CallBase, and its uses
reflect that.

Reviewers: dblaikie, craig.topper, wmi

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78940
2020-04-27 20:47:52 -07:00
Arthur Eubanks 3b0450acec Add IR constructs for preallocated (inalloca replacement)
Add llvm.call.preallocated.{setup,arg} instrinsics.
Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup.
Add "preallocated" parameter attribute, which is like byval but without the copy.

Verifier changes for these IR constructs.

See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md

Subscribers: hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74651
2020-04-27 16:15:50 -07:00
Sameer Sahasrabuddhe 8488763682 [NFC] UnifyLoopExits: correctly skip expensive checks 2020-04-27 15:10:35 +05:30
Tyker e5f8a77c19 [AssumeBundles] Refactor asssume builder
Summary:
refactor assume bulider for the next patch.
the assume builder now generate only one assume per attribute kind and per value they are on. to do this it takes the highest. this is desirable because currently, for all attributes the higest value is the most valuable.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78013
2020-04-25 13:43:52 +02:00
Benjamin Kramer 1d42764df7 Give helpers internal linkage. NFC. 2020-04-25 11:50:52 +02:00
Ehud Katz 64249f177e [CodeExtractor] Fix extraction of a value used only by intrinsics outside of region
We should only skip `lifetime` and `dbg` intrinsics when searching for users.
Other intrinsics are legit users that can't be ignored.

Without this fix, the testcase would result in an invalid IR. `memcpy`
will have a reference to the, now, external value (local to the
extracted loop function).

Fix PR42194

Differential Revision: https://reviews.llvm.org/D78749
2020-04-25 11:44:47 +03:00
Tyker 97ecd91e20 [NFC] Refactor SimplifyCFG to make propagating information easier.
Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77742
2020-04-24 22:22:20 +02:00
Craig Topper 81c5e83f7d [CallSite removal][Transform] Replace CallSite with CallBase in Utils. NFC
Differential Revision: https://reviews.llvm.org/D78780
2020-04-23 20:49:33 -07:00
Christopher Tetreault 7ca56c90bd [SVE] Remove calls to isScalable from Transforms
Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77756
2020-04-23 13:50:07 -07:00
Vedant Kumar 2fa656cdfd [Debugify] Do not require named metadata to be present when stripping
This allows -mir-strip-debug to be run without -debugify having run
before.
2020-04-22 17:03:39 -07:00
Vedant Kumar 2a5675f11d [MachineDebugify] Insert synthetic DBG_VALUE instructions
Summary:
Teach MachineDebugify how to insert DBG_VALUE instructions.  This can
help find bugs causing CodeGen differences when debug info is present.
DBG_VALUE instructions are only emitted when -debugify-level is set to
locations+variables.

There is essentially no attempt made to match up DBG_VALUE register
operands with the local variables they ought to correspond to. I'm not
sure how to improve the situation. In some cases (MachineMemOperand?)
it's possible to find the IR instruction a MachineInstr corresponds to,
but in general this seems to call for "undoing" the work done by ISel.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78135
2020-04-22 17:03:39 -07:00
Christopher Tetreault 2dea3f1298 [SVE] Add new VectorType subclasses
Summary:
Introduce new types for fixed width and scalable vectors.

Does not remove getNumElements yet so as to not break code during transition
period.

Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr

Reviewed By: sdesmalen

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm, #lldb

Differential Revision: https://reviews.llvm.org/D77587
2020-04-22 08:59:01 -07:00
Craig Topper 05a11974ae [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
Sameer Sahasrabuddhe 5a7a6382bc FixIrreducible: don't crash when moving a child loop
Summary:
When an irreducible SCC is converted into a new natural loop, existing
loops included in that SCC now become children of the new loop. The
logic that moves these loops from the parent loop to the new loop
invoked undefined behaviour when it modified the container that it was
iterating over. Fixed this by first extracting all the loops that are
to be removed from the parent.

Fixes bug 45623.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D78544
2020-04-22 07:47:30 +05:30
Simon Pilgrim d9af50efbc [Transforms] getOrEnforceKnownAlignment - fix MSVC result of 32-bit shift implicitly converted to 64 bits warning. NFCI
We don't overflow here so we can use a U64 shift directly.
2020-04-21 18:32:12 +01:00
Craig Topper 68b2e507e4 [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20 21:31:44 -07:00
Sriraman Tallam 365b60fc93 New pass to make internal linkage symbol names unique.
With clang option -funique-internal-linkage-symbols, symbols with
internal linkage get names with the module hash appended.

Differential Revision: https://reviews.llvm.org/D78243
2020-04-20 15:05:22 -07:00
Craig Topper fcc9d70260 Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign."
This is breaking the clang build.

This reverts commit 897409fb56.
2020-04-20 13:25:06 -07:00
Craig Topper 897409fb56 [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20 13:08:05 -07:00
Florian Hahn 4331b3812a [PredicateInfo] Use new Instruction::comesBefore instead of OI (NFC).
The recently added Instruction::comesBefore can be used instead of
OrderedInstructions.

Reviewers: rnk, nikic, efriedma

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D78452
2020-04-20 09:22:21 +01:00
Florian Hahn 7a87e8f90b [LoopUtils] Clean up includes, use forward decls if appropriate (NFC).
Most of the includes in LoopUtils.h are not required in the header and
they can be replaced by forward declarations.

Unfortunately includes of TargetTransformInfo.h and IVDescriptors.h pull
in a bunch of additional things, but there is no easy way to get rid of
them at the moment I think.
2020-04-19 19:44:29 +01:00
Craig Topper 7fde990694 Recommit "[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI"
With a tweak to avoid a linker error for passing
MaxAlignmentExponent by reference to std::min.
2020-04-18 13:51:57 -07:00
Nikita Popov a42fd18d0f [PredicateInfo] Factor out PredicateInfoBuilder (NFC)
When running IPSCCP on a module with many small functions, memory
usage is dominated by PredicateInfo, which is a huge structure
(partially due to some unfortunate nested SmallVector use). However,
most of it is actually only temporary state needed to build
predicate info, and does not need to be retained after initial
construction.

This patch factors out the predicate building logic and state
into a separate PrediceInfoBuilder, with the extra bonus that
it does not need to live in the header anymore.

Differential Revision: https://reviews.llvm.org/D78326
2020-04-18 22:34:38 +02:00
Craig Topper 44d63b7528 Revert "[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI"
This reverts commit e00cfe254d.

Seems to be causing a linker error on the build bots.
2020-04-18 13:23:29 -07:00
Craig Topper e00cfe254d [Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI
We previously clamped the trailing zero count to 31 bits. And
then clamped the final alignment to MaximumAlignment which is
1 << 29.

This patch simplifies this to just clamp the trailing zero to
29 using MaxAlignmentExponent.

I was looking into changing this function to use Align/MaybeAlign
and noticed this.

Differential Revision: https://reviews.llvm.org/D78418
2020-04-18 12:52:47 -07:00
Mircea Trofin 41ad8b7388 [llvm][NFC][CallSite] Remove CallSite from Evaluator.
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78395
2020-04-17 19:11:17 -07:00
Anna Thomas fd5e069d23 Fix buildbot failure due to obsolete CallSite usage
Fix buildbot failures due to ef49b1d97e
(which was a revert of a previous change).
2020-04-17 17:46:19 -04:00
Anna Thomas ef49b1d97e Revert "[InlineFunction] Update metadata on loads that are return values"
This reverts commit 1d0f757904 because of
https://bugs.llvm.org/show_bug.cgi?id=45590. Needs investigation.
2020-04-17 17:23:00 -04:00
Max Kazantsev 72c13446ce [NFC] Add missing 'const' notion to LCSSA-related functions
These functions don't really do any changes to loop info or
dominator tree. We should state this explicitly using 'const'.
2020-04-17 17:49:34 +07:00
Craig Topper 8e1408695c [CallSite removal][TargetLibraryInfo] Replace ImmutableCallSite with CallBase in one of the getLibFunc signatures. NFC
Differential Revision: https://reviews.llvm.org/D78083
2020-04-15 22:43:41 -07:00
Mircea Trofin 4213bc761a [llvm][NFC][CallSite] Removed CallSite from some implementation details.
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78256
2020-04-15 22:27:05 -07:00
Johannes Doerfert df675890b7 [CallGraphUpdater][NFC] Minor updates to D77855
I uploaded the old version accidentally instead of the one with these
minor adjustments requested by the reviewers.

Differential Revision: https://reviews.llvm.org/D77855
2020-04-15 21:26:35 -05:00
Johannes Doerfert 937025757c [CallGraphUpdater] Remove nodes from their SCC (old PM)
Summary:
We can and should remove deleted nodes from their respective SCCs. We
did not do this before and this was a potential problem even though I
couldn't locally trigger an issue. Since the `DeleteNode` would assert
if the node was not in the SCC, we know we only remove nodes from their
SCC and only once (when run on all the Attributor tests).

Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku

Subscribers: hiraditya, bollu, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77855
2020-04-15 18:38:50 -05:00
Johannes Doerfert 1b34b84ddd [CallGraphUpdater] Update the ExternalCallingNode for node replacements
Summary:
While it is uncommon that the ExternalCallingNode needs to be updated,
it can happen. It is uncommon because most functions listed as callees
have external linkage, modifying them is usually not allowed. That said,
there are also internal functions that have, or better had, their
"address taken" at construction time. We conservatively assume various
uses cause the address "to be taken". Furthermore, the user might have
become dead at some point. As a consequence, transformations, e.g., the
Attributor, might be able to replace a function that is listed
as callee of the ExternalCallingNode.

Since there is no function corresponding to the ExternalCallingNode, we
did just remove the node from the callee list if we replaced it (so
far). Now it would be preferable to replace it if needed and remove it
otherwise. However, removing the node has implications on the CGSCC
iteration. Locally, that caused some other nodes to be never visited
but it is for sure possible other (bad) side effects can occur. As it
seems conservatively safe to keep the new node in the callee list we
will do that for now.

Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku

Subscribers: hiraditya, bollu, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77854
2020-04-15 18:38:50 -05:00
Johannes Doerfert 7ec8d79385 [CallGraphUpdater] Properly remove strongly connected components (oldPM)
Summary:
The old code did eliminate references from and to functions that were
about to be deleted only just before we deleted them. This can cause
references from other functions that are supposed to be deleted to still
exist, depending on the order. If the functions form a strongly
connected component the problem manifests regardless of the order in
which we try to actually delete the functions.

This patch introduces a two step deletion. First we remove all
references and then we delete the function. Note that this only affects
the old call graph. There should not be any functional changes if no old
style call graph was given.

To test this we delete two strongly connected functions instead of one
in an existing test.

Reviewers: hfinkel

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77975
2020-04-15 18:38:49 -05:00
Craig Topper fbb804983d [CallSite removal][CloneFunction] Use CallSite instead of CallBase. NFC
Differential Revision: https://reviews.llvm.org/D78236
2020-04-15 15:38:02 -07:00
Benjamin Kramer 6f64daca8f Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints
No functionality change intended.
2020-04-15 12:51:38 +02:00
Sameer Sahasrabuddhe 7bb9f500e2 fix warning: specialization of template in different namespace
This is related to commit 8c11bc0cd0
which introduces the FixIrreducible pass. The warning seems hard to
reproduce locally. The latest attempt ought to work.
2020-04-15 15:57:53 +05:30
Sameer Sahasrabuddhe 8c11bc0cd0 Introduce fix-irreducible pass
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.

This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D77198

This restores commit 2ada8e2525.

Originally reverted with commit 44e09b59b8.
2020-04-15 15:05:51 +05:30
Sameer Sahasrabuddhe 44e09b59b8 Revert "Introduce fix-irreducible pass"
This reverts commit 2ada8e2525.

Buildbots produced compilation errors which I was not able to quickly
reproduce locally. Need more time to investigate.
2020-04-15 12:19:50 +05:30
Sameer Sahasrabuddhe 2ada8e2525 Introduce fix-irreducible pass
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.

This is a useful workaround for a limitation in the structurizer
which, which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D77198
2020-04-15 11:29:19 +05:30
Christopher Tetreault 8226d599ff [SVE] Remove calls to getBitWidth from Transforms
Reviewers: efriedma, sdesmalen, spatel, eugenis, chandlerc

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77896
2020-04-14 14:31:42 -07:00
Mircea Trofin 4aae4e3f48 [llvm][NFC] CallSite removal from inliner-related files
Summary: This removes CallSite from inliner files. Some dependencies where thus affected.

Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77991
2020-04-13 21:28:58 -07:00
Vedant Kumar 122a6bfb07 [Debugify] Strip added metadata in the -debugify-each pipeline
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.

As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77915
2020-04-13 10:55:17 -07:00
Tyker 813f438baa [AssumeBundles] adapt Assumption cache to assume bundles
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.

Reviewers: jdoerfert, hfinkel

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77402
2020-04-13 12:04:51 +02:00
Huihui Zhang 4bde7c5986 [NFC] Use VectorType::isScalable to align with ongoing VectorType refactor. 2020-04-12 15:39:13 -07:00
Mircea Trofin d2f1cd5d97 [llvm][NFC] Refactor uses of CallSite to CallBase - call promotion
Summary:
Updated CallPromotionUtils and impacted sites. Parameters that are
expected to be non-null, and return values that are guranteed non-null,
were replaced with CallBase references rather than pointers.

Left FIXME in places where more changes are facilitated by CallBase, but
aren't CallSites: Instruction* parameters or return values, for example,
where the contract that they are actually CallBase values.

Reviewers: davidxl, dblaikie, wmi

Reviewed By: dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77930
2020-04-12 08:27:29 -07:00
Huihui Zhang 6e7eeb44b3 [GVN] Fix VNCoercion for Scalable Vector.
Summary:
For VNCoercion, skip scalable vector when analysis rely on fixed size,
otherwise call TypeSize::getFixedSize() explicitly.

Add unit tests to check funtionality of GVN load elimination for scalable type.

Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau

Reviewed By: efriedma

Subscribers: bjope, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76944
2020-04-10 17:49:07 -07:00
Christopher Tetreault 00a1032412 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: rriddle, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77260
2020-04-09 13:35:41 -07:00
Zequan Wu eccfa35d53 Fix lifetime call in landingpad blocking Simplifycfg pass
Fix lifetime call in landingpad blocks simplifycfg from removing the
landingpad.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D77188
2020-04-09 13:07:32 -07:00
Johannes Doerfert cb0ecc5c33 [CallGraphUpdater] Remove dead constants before replacing a function
Dead constants might be left when a function is replaced, we can
gracefully handle this case and avoid complexity for the users who would
see an assertion otherwise.
2020-04-08 22:52:46 -05:00
Eli Friedman 565b56a72c [NFC] Clean up uses of LoadInst constructor. 2020-04-07 16:28:53 -07:00
Daniel Sanders 1adeeabb79 Add MIR-level debugify with only locations support for now
Summary:
Re-used the IR-level debugify for the most part. The MIR-level code then
adds locations to the MachineInstrs afterwards based on the LLVM-IR debug
info.

It's worth mentioning that the resulting locations make little sense as
the range of line numbers used in a Function at the MIR level exceeds that
of the equivelent IR level function. As such, MachineInstrs can appear to
originate from outside the subprogram scope (and from other subprogram
scopes). However, it doesn't seem worth worrying about as the source is
imaginary anyway.

There's a few high level goals this pass works towards:
* We should be able to debugify our .ll/.mir in the lit tests without
  changing the checks and still pass them. I.e. Debug info should not change
  codegen. Combining this with a strip-debug pass should enable this. The
  main issue I ran into without the strip-debug pass was instructions with MMO's and
  checks on both the instruction and the MMO as the debug-location is
  between them. I currently have a simple hack in the MIRPrinter to
  resolve that but the more general solution is a proper strip-debug pass.
* We should be able to test that GlobalISel does not lose debug info. I
  recently found that the legalizer can be unexpectedly lossy in seemingly
  simple cases (e.g. expanding one instr into many). I have a verifier
  (will be posted separately) that can be integrated with passes that use
  the observer interface and will catch location loss (it does not verify
  correctness, just that there's zero lossage). It is a little conservative
  as the line-0 locations that arise from conflicts do not track the
  conflicting locations but it can still catch a fair bit.

Depends on D77439, D77438

Reviewers: aprantl, bogner, vsk

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77446
2020-04-07 16:25:13 -07:00
Fangrui Song d2ef8c1f2c [ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker()
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.

If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.

The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74751
2020-04-07 15:46:01 -07:00
Eli Friedman 3f13ee8a00 [NFC] Modernize misc. uses of Align/MaybeAlign APIs.
Use the current getAlign() APIs where it makes sense, and use Align
instead of MaybeAlign when we know the value is non-zero.
2020-04-06 17:53:04 -07:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Daniel Sanders 15f7bc7857 Add option to limit Debugify to locations (omitting variables)
Summary:
It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE
around. In particular, because DEBUG_VALUE has the potential to change CodeGen
behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally
don't.

Reviewers: aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77438
2020-04-06 15:04:55 -07:00
Anna Thomas 1d0f757904 [InlineFunction] Update metadata on loads that are return values
This patch builds upon D76140 by updating metadata on pointer typed
loads in inlined functions, when the load is the return value, and the
callsite contains return attributes which can be updated as metadata on
the load.
Added test cases show this for nonnull, dereferenceable,
dereferenceable_or_null

Reviewed-By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76792
2020-04-05 14:50:10 -04:00
Nikita Popov 6896d559f3 [VNCoercion] Use IRBuilderBase; NFC
And remove include from header.
2020-04-04 12:44:50 +02:00
Roman Lebedev 7d572ef2dd
Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)"
As discussed in post-commit review in https://reviews.llvm.org/D73501
if the goal of this is to help vectorizer, then we should actually
be teaching vectorizer to do this, because right now this rewrite
is still budget-limited, which isn't what we'd want.

Additionally, while the rest of the patch series was universally profitable,
this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171)
exposing cost-modeling issues on ARM.

So let's just back this particular patch out. Once there's an undo transform,
this could be considered for reintegration.

This reverts commit 44edc6fd2c.
2020-04-03 20:15:04 +03:00
Anna Thomas bf7a16a768 [InlineFunction] Update valid return attributes at callsite within callee body
Consider a callee function that has a call (C) within it which feeds
into the return. When we inline that callee into a callsite that has
return attributes, we can backward propagate valid attributes to the
call (C) within that inlined callee body.

This is safe to do so only if we can guarantee transfer of execution to
successor in the window of instructions between return value (i.e. the
call C) and the return instruction.

Also, this is valid only for attributes which are a property of a
callsite and not those that are not dependent on the ABI, or a property
of the call itself.

Reviewed-By: reames, jdoerfert

Differential Revision: https://reviews.llvm.org/D76140
2020-04-02 14:13:12 -04:00
Benjamin Kramer dffc503187 Revert "[SimplifyLibCalls] Erase replaced instructions"
This reverts commit 2a77544ad5. This
introduces a use-after-free in Transforms/InstCombine/sincospi.ll.
Found by asan.
2020-04-02 17:30:47 +02:00