Commit Graph

32159 Commits

Author SHA1 Message Date
Florian Hahn 8086b0c8a8
[ConstraintElim] Drop bail out for scalable vectors after using getTrue
ConstantInt::getTrue/getFalse can materialize scalable vectors with all
lanes true/false.
2022-11-03 19:05:45 +00:00
LiDongjin d1cee3539f [LoopVectorize] Fix crash on "Cannot dereference end iterator!"(PR56627)
Check hasOneUser before user_back().

Differential Revision: https://reviews.llvm.org/D136227
2022-11-03 23:13:37 +08:00
Nikita Popov 68b24c3b44 [CVP] Simplify comparisons without constant operand
CVP currently only tries to simplify comparisons if there is a
constant operand. However, even if both are non-constant, we may
be able to determine the result of the comparison based on range
information.

IPSCCP is already capable of doing this, but because it runs very
early, it may miss some cases.

Differential Revision: https://reviews.llvm.org/D137253
2022-11-03 15:35:27 +01:00
bipmis 150fc73dda [AggressiveInstCombine] Avoid load merge/widen if stores are present b/w loads
This patch is to address the test cases in which the load has to be inserted at a right point. This happens when there is a store b/w the loads.

This patch reverts the loads merge in all cases when stores are present b/w loads and will eventually be replaced with proper fix and test cases.

Differential Revision: https://reviews.llvm.org/D137333
2022-11-03 14:32:07 +00:00
Nikita Popov b03f7c3365 [SimplifyCFG] Use range based for loop (NFC) 2022-11-03 15:21:45 +01:00
Nikita Popov 592a96c03b [SimplifyCFG] Extract code for tracking ephemeral values (NFC)
To allow reusing this in more places in SimplifyCFG.
2022-11-03 14:28:12 +01:00
Alexey Bataev f090e3c00f [SLP]Fix write after bounds.
Need to use comma instead of + symbol to prevent writing after bounds.
2022-11-03 05:30:41 -07:00
Alexey Bataev 8b015b2078 [SLP][NFC]Formatting and reduce number of iterations, NFC. 2022-11-03 05:30:13 -07:00
Florian Hahn 44d8f80b26
[ConstraintElim] Use ConstantInt::getTrue to create constants (NFC).
Use existing ConstantInt::getTrue/getFalse functionality instead of
custom getScalarConstOrSplat as suggested by @nikic.
2022-11-03 12:20:23 +00:00
Peter Waller e1790c8c29 Revert "[InstCombine] Remove redundant splats in InstCombineVectorOps"
This reverts commit 957eed0b1a.
2022-11-03 07:56:03 +00:00
Ye Luo 00b09a7b18 Revert "[AAPointerInfo] refactor how offsets and Access objects are tracked"
This reverts commit b756096b0c.
See regression https://github.com/llvm/llvm-project/issues/58774
2022-11-03 00:01:51 -05:00
Youling Tang d16b5c3504 [asan] Use proper shadow offset for loongarch64 in instrumentation passes
Instrumentation passes now use the proper shadow offset. There will be many
asan test failures without this patch. For example:

```
$ ./lib/asan/tests/LOONGARCH64LinuxConfig/Asan-loongarch64-calls-Test
AddressSanitizer:DEADLYSIGNAL
=================================================================
==651209==ERROR: AddressSanitizer: SEGV on unknown address 0x1ffffe2dfa9b (pc 0x5555585e151c bp 0x7ffffb9ec070 sp 0x7ffffb9ebfd0 T0)
==651209==The signal is caused by a UNKNOWN memory access.
```

Before the patch:

```
$ make check-asan
Testing Time: 36.13s
  Unsupported      : 205
  Passed           :  83
  Expectedly Failed:   1
  Failed           : 239
```

After the patch:
```
$ make check-asan
Testing Time: 58.98s
  Unsupported      : 205
  Passed           : 421
  Expectedly Failed:   1
  Failed           :  89
```

Differential Revision: https://reviews.llvm.org/D137013
2022-11-03 11:04:47 +08:00
Fangrui Song 1ada819c23 [asan] Default to -fsanitize-address-use-odr-indicator for non-Windows
This enables odr indicators on all platforms and private aliases on non-Windows.
Note that GCC also uses private aliases: this fixes bogus
`The following global variable is not properly aligned.` errors for interposed global variables

Fix https://github.com/google/sanitizers/issues/398
Fix https://github.com/google/sanitizers/issues/1017
Fix https://github.com/llvm/llvm-project/issues/36893 (we can restore D46665)

Global variables of non-hasExactDefinition() linkages (i.e.
linkonce/linkonce_odr/weak/weak_odr/common/external_weak) are not instrumented.
If an instrumented variable gets interposed to an uninstrumented variable due to
symbol interposition (e.g. in issue 36893, _ZTS1A in foo.so is resolved to _ZTS1A in
the executable), there may be a bogus error.

With private aliases, the register code will not resolve to a definition in
another module, and thus prevent the issue.

Cons: minor size increase. This is mainly due to extra `__odr_asan_gen_*` symbols.
(ELF) In addition, in relocatable files private aliases replace some relocations
referencing global symbols with .L symbols and may introduce some STT_SECTION symbols.

For lld, with -g0, the size increase is 0.07~0.09% for many configurations I
have tested: -O0, -O1, -O2, -O3, -O2 -ffunction-sections -fdata-sections
-Wl,--gc-sections. With -g1 or above, the size increase ratio will be even smaller.

This patch obsoletes D92078.

Don't migrate Windows for now: the static data member of a specialization
`std::num_put<char>::id` is a weak symbol, as well as its ODR indicator.
Unfortunately, link.exe (and lld without -lldmingw) generally doesn't support
duplicate weak definitions (weak symbols in different TUs likely pick different
defined external symbols and conflict).

Differential Revision: https://reviews.llvm.org/D137227
2022-11-02 19:21:33 -07:00
Florian Hahn 74d8628cf7
[ConstraintElimination] Skip compares with scalable vector types.
Materializing scalable vectors with boolean values is not implemented
yet. Skip those cases for now and leave a TODO.
2022-11-02 23:57:14 +00:00
Joseph Huber cccbd2a2b2 Revert "[Attributor][NFCI] Move MemIntrinsic handling into the initializer"
This was causing failures when optimizing codes with complex numbers.
Revert until a fix can be implemented.

This reverts commit 7fdf3564c0.
2022-11-02 15:28:50 -05:00
Florian Hahn b478d8b966
[ConstraintElimination] Generate true/false vectors for vector cmps.
This fixes crashes when vector compares can be simplified to true/false.
2022-11-02 18:13:35 +00:00
Rong Xu 8acb881c19 [PGO] Add a threshold for number of critical edges in PGO
For some auto-generated sources, we have a huge number of critical
edges (like from switch statements). We have seen instance of 183777
critical edges in one function.

After we split the critical edges in PGO instrumentation/profile-use
pass, the CFG is so large that we have compiler time issues in
downstream passes (like in machine CSE and block placement). Here I
add a threshold to skip PGO if the number of critical edges are too
large.

The threshold is large enough so that it will not affect the majority
of PGO compilation.

Also sync the logic for skipping instrumentation and profile-use. I
think this is the correct thing to do.

Differential Revision: https://reviews.llvm.org/D137184
2022-11-02 10:14:04 -07:00
Sanjay Patel f03b069c5b [InstCombine] fold mul with decremented "shl -1" factor (2nd try)
This is a corrected version of:
bc886e9b58

I made a copy-paste error that created an "add" instead of the
intended "sub" on that attempt. The regression tests showed the
bug, but I overlooked that.

As I said in a comment on issue #58717, the bug reports resulting
from the botched patch confirm that the pattern does occur in
many real-world applications, so hopefully eliminating the multiply
results in better code.

I added one more regression test in this version of the patch,
and here's an Alive2 proof to show that exact example:
https://alive2.llvm.org/ce/z/dge7VC

Original commit message:

This is a sibling to:
6064e92b0a
...but we canonicalize the shl+add to shl+xor,
so the pattern is different than I expected:
https://alive2.llvm.org/ce/z/8CX16e

I have not found any patterns that are safe
to propagate no-wrap, so that is not included
here.

Differential Revision: https://reviews.llvm.org/D137157
2022-11-02 09:30:01 -04:00
OCHyams 88d34d4626 [DebugInfo] Fix minor debug info bug in deleteDeadLoop
Using a DebugVariable as the set key rather than std::pair<DIVariable *,
DIExpression *> ensures we don't accidently confuse multiple instances of
inlined variables.

Reviewed By: jryans

Differential Revision: https://reviews.llvm.org/D133303
2022-11-02 13:28:05 +00:00
Sanjay Patel b24e2f6ef6 [InstCombine] use logical-and matcher to avoid crash
Follow-on to:
ec0b406e16

This should prevent crashing for example like issue #58552
by not matching a select-of-vectors-with-scalar-condition.

The test that shows a regression seems unlikely to occur
in real code.

This also picks up an optimization in the case where a real
(bitwise) logic op is used. We could already convert some
similar select ops to real logic via impliesPoison(), so
we don't see more diffs on commuted tests. Using commutative
matchers (when safe) might also handle one of the TODO tests.
2022-11-02 08:23:52 -04:00
Matt Devereau 957eed0b1a [InstCombine] Remove redundant splats in InstCombineVectorOps
Splatting the first vector element of the result of a BinOp, where any of the
BinOp's operands are the result of a first vector element splat can be simplified to
splatting the first vector element of the result of the BinOp

Differential Revision: https://reviews.llvm.org/D135876
2022-11-02 11:57:05 +00:00
Max Kazantsev 2baabd2c19 [LoopPredication][NFCI] Perform 'visited' check before pushing to worklist
This prevents duplicates to be pushed into the stack and hypothetically
should reduce memory footprint on ugly cornercases with multiple repeating
duplicates in 'and' tree.
2022-11-02 18:04:23 +07:00
Nikita Popov 134bda4b61 [ValueLattice] Use DL-aware folding in getCompare()
Use DL-aware ConstantFoldCompareInstOperands() API instead of
ConstantExpr API. The practical effect of this is that SCCP can
now fold comparisons that require DL.
2022-11-02 10:41:11 +01:00
Nikita Popov 7ad3bb527e Reapply [ValueLattice] Fix getCompare() for undef values
Relative to the previous attempt, this also updates the
ValueLattice unit tests.

-----

Resolve the TODO about incorrect getCompare() behavior. This can
be made more precise (e.g. by materializing the undef value and
performing constant folding on it), but for now just return an
unknown result to fix the correctness issue.

This should be NFC in terms of user-visible behavior, because the
only user of this method (SCCP) was already guarding against
UndefValue results.
2022-11-02 10:23:06 +01:00
Bjorn Pettersson 44bb4099cd [ConstraintElimination] Do not crash on vector GEP in decomposeGEP
Commit 359bc5c541 caused
 Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"'
failures in decomposeGEP when the GEP pointer operand is a vector.

Fix is to use DataLayout::getIndexTypeSizeInBits when fetching the
index size, as it will use the scalar type in case of a ptr vector.

Differential Revision: https://reviews.llvm.org/D137185
2022-11-02 10:04:11 +01:00
Fangrui Song 87d98a7300 [hwasan] Remove no-op setDSOLocal. NFC
PrivateLinkage and HiddenVisibility are implicitly dso_local.
2022-11-02 00:17:16 -07:00
Johannes Doerfert 7fdf3564c0 [Attributor][NFCI] Move MemIntrinsic handling into the initializer
The handling for MemIntrinsic is not dependent on optimistic
information, no need to put it in update for now. Added a TODO though.
2022-11-01 20:37:53 -07:00
Johannes Doerfert f89deef13e [Attributor][NFC] Hide verbose output behind `attributor-verbose` 2022-11-01 20:37:53 -07:00
Sanjay Patel ec0b406e16 [InstCombine] use logical-or matcher to avoid crash
This should prevent crashing for the example in issue #58552
by not matching a select-of-vectors-with-scalar-condition.

A similar change is likely needed for the related fold to
properly fix that kind of bug.

The test that shows a regression seems unlikely to occur
in real code.

This also picks up an optimization in the case where a real
(bitwise) logic op is used. We could already convert some
similar select ops to real logic via impliesPoison(), so
we don't see more diffs on commuted tests. Using commutative
matchers (when safe) might also handle one of the TODO tests.
2022-11-01 16:47:41 -04:00
Arthur Eubanks 8c49b01a1e [GlobalOpt] Don't remove inalloca from varargs functions
Varargs and inalloca have a weird interaction where varargs are actually
passed via the inalloca alloca. Removing inalloca breaks the varargs
because they're still not passed as separate arguments.

Fixes #58718

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D137182
2022-11-01 13:04:05 -07:00
Florian Hahn e83e289fd1
[SCEVExpander] Forget SCEV when replacing congruent phi.
Otherwise there may be stale values left over in the cache, causing a
SCEV verification failure.

Fixes #58702.
2022-11-01 17:49:38 +00:00
Nikita Popov 094d1298d1 Revert "[ValueLattice] Fix getCompare() for undef values"
This reverts commit 6de41e6d76.

Missed a unit test affected by this change, revert for now.
2022-11-01 17:03:40 +01:00
Nikita Popov 6de41e6d76 [ValueLattice] Fix getCompare() for undef values
Resolve the TODO about incorrect getCompare() behavior. This can
be made more precise (e.g. by materializing the undef value and
performing constant folding on it), but for now just return an
unknown result.
2022-11-01 16:41:24 +01:00
Sanjay Patel 4299b28a9b [InstCombine] add helper function for select-of-bools folds; NFC
This set of folds keeps growing, and it contains
bugs like issue #58552, so make it easier to
spot those via backtrace.
2022-11-01 11:06:18 -04:00
Florian Hahn 27f6091bb0
[ConstraintElimination] Fix nested GEP check.
At the moment, the implementation requires that the outer GEP has a
single index, the inner GEP can have an arbitrary indices, because the
general `decompose` helper is used.
2022-11-01 14:43:42 +00:00
skc7 87a20868eb [SLP] Extend reordering data of tree entry to support PHI nodes
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D136757
2022-11-01 04:50:04 +00:00
Sameer Sahasrabuddhe b756096b0c [AAPointerInfo] refactor how offsets and Access objects are tracked
AAPointerInfo now maintains a list of all Access objects that it owns, along
with the following maps:

- OffsetBins: OffsetAndSize -> { Access }
- InstTupleMap: RemoteI x LocalI -> Access

A RemoteI is any instruction that accesses memory. RemoteI is different from
LocalI if and only if LocalI is a call; then RemoteI is some instruction in the
callgraph starting from LocalI.

Motivation: When AAPointerInfo recomputes the offset for an instruction, it sets
the value to Unknown if the new offset is not the same as the old offset. The
instruction must now be moved from its current bin to the bin corresponding to
the new offset. This happens for example, when:

- A PHINode has operands that result in different offsets.
- The same remote inst is reachable from the same local inst via different paths
  in the callgraph:

```
               A (local inst)
               |
               B
              / \
             C1  C2
              \ /
               D (remote inst)

```
This fixes a bug where a store is incorrectly eliminated in a lit test.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D136526
2022-11-01 09:57:12 +05:30
Florian Mayer e1de7ac20f Revert "[InstCombine] fold mul with decremented "shl -1" factor"
This reverts commit bc886e9b58.

Broke MSAN bootstrap buildbots with Assertion `RangeAfterCopy % ExtraScale == 0 && "Extra instruction requires immediate to be aligned"' failed.
2022-10-31 17:39:05 -07:00
Usman Nadeem 32755786e0 [JumpThreading] Put a limit on the PHI nodes when duplicating a BB.
Do not duplicate a BB if it has a lot of PHI nodes.
If a threadable chain is too long then the number of duplicated PHI nodes
can add up, leading to a substantial increase in compile time when rewriting
the SSA.

Fixes https://github.com/llvm/llvm-project/issues/58203
Differential Revision: https://reviews.llvm.org/D136716

The threshold of 76 in this patch is reasonably high and reduces the compile
time of cldwat2m_macro.f90 in SPEC2017/cam4 from 80+min to <2min.

Change-Id: I153c89a8e0d89b206a5193dc1b908c67e320717e
2022-10-31 15:51:56 -07:00
Sanjay Patel 1f8ac37e2d [InstCombine] adjust branch on logical-and fold
The transform was just added with:
115d2f69a5
...but as noted in post-commit feedback, it was
confusingly coded. Now, we create the final
expected canonicalized form directly and put
an extra use check on the match, so we should
not ever end up with more instructions.
2022-10-31 17:39:29 -04:00
Philip Reames a819f6c8d1 [InstCombine] Allow simplify demanded transformations on scalable vectors
Differential Revision: https://reviews.llvm.org/D136475
2022-10-31 13:39:36 -07:00
Patrick Walton 01859da84b [AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory().
The pointsToConstantMemory() method returns true only if the memory pointed to
by the memory location is globally invariant. However, the LLVM memory model
also has the semantic notion of *locally-invariant*: memory that is known to be
invariant for the life of the SSA value representing that pointer. The most
common example of this is a pointer argument that is marked readonly noalias,
which the Rust compiler frequently emits.

It'd be desirable for LLVM to treat locally-invariant memory the same way as
globally-invariant memory when it's safe to do so. This patch implements that,
by introducing the concept of a *ModRefInfo mask*. A ModRefInfo mask is a bound
on the Mod/Ref behavior of an instruction that writes to a memory location,
based on the knowledge that the memory is globally-constant memory (in which
case the mask is NoModRef) or locally-constant memory (in which case the mask
is Ref). ModRefInfo values for an instruction can be combined with the
ModRefInfo mask by simply using the & operator. Where appropriate, this patch
has modified uses of pointsToConstantMemory() to instead examine the mask.

The most notable optimization change I noticed with this patch is that now
redundant loads from readonly noalias pointers can be eliminated across calls,
even when the pointer is captured. Internally, before this patch,
AliasAnalysis was assigning Ref to reads from constant memory; now AA can
assign NoModRef, which is a tighter bound.

Differential Revision: https://reviews.llvm.org/D136659
2022-10-31 13:03:41 -07:00
Sanjay Patel 115d2f69a5 [InstCombine] canonicalize branch with logical-and-not condition
https://alive2.llvm.org/ce/z/EfHlWN

In the motivating case from issue #58313,
this allows forming a duplicate 'not' op
which then gets CSE'd and simplifyCFG'd
and combined into the expected 'xor'.
2022-10-31 15:51:45 -04:00
Alexey Bataev 99f9bd4807 [SLP]Fix a crash in the analysis of the compatible cmp operands.
We can skip the analysis of the operands opcodes, can compare directly
them in some cases.
2022-10-31 09:47:25 -07:00
Sameer Sahasrabuddhe 2e7636c13b [AAPointerInfo] check for Unknown offsets in callee
When translating offset info from the callee at a call site, first check if the
offset is Unknown. Any offset in the caller should be added only if the callee
offset is valid.

Differential Revision: https://reviews.llvm.org/D137011
2022-10-31 22:15:56 +05:30
Mengxuan Cai eda3c93486 [LoopFuse] Ensure loops are in loop simplified form under new PM
Loop Fusion (Function Pass) requires loops in simplified form. With
legacy-pm, loop-simplify pass is added as a dependency for loop-fusion.
But the new pass manager does not always ensure this format. This patch
tries to invoke simplifyLoop() on loops that are not in simplified form
only for new PM.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D136781
2022-10-31 11:46:28 -04:00
Brendon Cahoon f59205aef9 [BasicBlockUtils] Add a new way for CreateControlFlowHub()
The existing way of creating the predicate in the guard blocks uses
a boolean value per outgoing block. This increases the number of live
booleans as the number of outgoing blocks increases. The new way added
in this change is to store one integer to represent the outgoing block
we want to branch to, then at each guard block, an integer equality
check is performed to decide which a specific outgoing block is taken.

Using an integer reduces the number of live values and decreases
register pressure especially in cases where there are a large number
of outgoing blocks. The integer based approach is used when the
number of outgoing blocks crosses a threshold, which is currently set
to 32.

Patch by Ruiling Song.

Differential review: https://reviews.llvm.org/D127831
2022-10-31 08:58:54 -05:00
Brendon Cahoon 9265b7fa83 NFC: restructure code for CreateControlFlowHub()
Differential review: https://reviews.llvm.org/D127830
2022-10-31 08:39:09 -05:00
Sanjay Patel bc886e9b58 [InstCombine] fold mul with decremented "shl -1" factor
This is a sibling to:
6064e92b0a
...but we canonicalize the shl+add to shl+xor,
so the pattern is different than I expected:
https://alive2.llvm.org/ce/z/8CX16e

I have not found any patterns that are safe
to propagate no-wrap, so that is not included
here.
2022-10-31 09:06:55 -04:00
Patrick Walton 36bbd68667 [InstCombine] Allow memcpys from constant memory to readonly nocapture parameters to be elided.
Currently, InstCombine can elide a memcpy from a constant to a local alloca if
that alloca is passed as a nocapture parameter to a *function* that's readnone
or readonly, but it can't forward the memcpy if the *argument* is marked
readonly nocapture, even though readonly guarantees that the callee won't
mutate the pointee through that pointer. This patch adds support for detecting
and handling such situations, which arise relatively frequently in Rust, a
frontend that liberally emits readonly.

A more general version of this optimization would use alias analysis to check
the call's ModRef info for the pointee, but I was concerned about blowing up
compile time, so for now I'm just checking for one of readnone on the function,
readonly on the function, or readonly on the parameter.

Differential Revision: https://reviews.llvm.org/D136822
2022-10-30 14:41:03 -07:00
Florian Hahn 1f1fb208da
[SimpleLoopUnswitch] Forget block and loop dispositions.
Also invalidate block and loop dispositions during non-trivial
unswitching.

Fixes #58564.
2022-10-30 20:44:22 +00:00
zhongyunde f58311796c [InstCombine] refactor the SimplifyUsingDistributiveLaws NFC
Precommit for D136015
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D137019
2022-10-30 21:04:06 +08:00
Patrick Walton cb2cb2d201 [InstCombine] Avoid deleting volatile memcpys.
InstCombine can replace memcpy to an alloca with a pointer directly to the
source in certain cases. Unfortunately, it also did so for volatile memcpys.
This patch makes it stop doing that.

This was discovered in D136822.

Differential Revision: https://reviews.llvm.org/D137031
2022-10-30 02:40:16 -07:00
Florian Hahn 43f0f1a66f
[VPlan] Use onlyFirstLaneUsed in sinkScalarOperands.
Replace custom code to check if only the first lane is used by generic
helper `onlyFirstLaneUsed`. This enables VPlan-based sinking in a few
additional cases and was suggested in D133760.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D136368
2022-10-29 19:45:19 +01:00
Sanjay Patel 6064e92b0a [InstCombine] fold mul with incremented "shl 1" factor
X * ((1 << Z) + 1) --> (X << Z) + X

https://alive2.llvm.org/ce/z/P-7WK9

It's possible that we could do better with propagating
no-wrap, but this carries over the existing logic and
appears to be correct.

The naming differences on the existing folds are a result
of using getName() to set the final value via Builder.
That makes it easier to transfer no-wrap rather than the
gymnastics required from the raw create instruction APIs.
2022-10-29 12:50:19 -04:00
Sanjay Patel 50000ec2cb [InstCombine] create helper function for mul patterns with 1<<X; NFC
There are at least 2 other potential patterns that could go here.
2022-10-29 12:50:19 -04:00
Sanjay Patel d344146857 [InstCombine] reduce code duplication in visitMul(); NFC 2022-10-29 09:26:02 -04:00
Arthur Eubanks 5404fe3456 Revert "[LegacyPM] Remove pipeline extension mechanism"
This reverts commit 4ea6ffb7e8.

Breaks various backends.
2022-10-28 10:26:58 -07:00
Arthur Eubanks 4ea6ffb7e8 [LegacyPM] Remove pipeline extension mechanism
Part of gradually removing the legacy PM optimization pipeline.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D136622
2022-10-28 10:16:52 -07:00
Momchil Velikov c9b4dc3a81 [FuncSpec][NFC] Avoid redundant computations of DominatorTree/LoopInfo
The `FunctionSpecialization` pass needs loop analysis results for its
cost function. For this purpose, it computes the `DominatorTree` and
`LoopInfo` for a function in `getSpecializationBonus`.  This function,
however, is called O(number of call sites x number of arguments), but
the DominatorTree/LoopInfo can be computed just once.

This patch plugs into the PassManager infrastructure to obtain
LoopInfo for a function and removes ad-hoc computation from
`getSpecializatioBonus`.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D136332
2022-10-28 16:08:41 +01:00
Momchil Velikov 14384c96df Recommit: [FuncSpec][NFC] Refactor finding specialisation opportunities
[recommitting after recommitting a dependency]

This patch reorders the traversal of function call sites and function
formal parameters to:

* do various argument feasibility checks (`isArgumentInteresting` )
  only once per argument, i.e. doing N-args checks instead of
  N-calls x  N-args checks.

* do hash table lookups only once per call site, i.e. N-calls
  lookups/inserts instead of N-call x N-args lookups/inserts.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D135968

Change-Id: I7d21081a2479cbdb62deac15f903d6da4f3b8529
2022-10-28 11:26:25 +01:00
Simon Pilgrim 3c71253eb7 ConstraintElimination - pass const DataLayout by reference in (recursive) MergeResults lambda capture. NFC.
There's no need to copy this and fixes a coverity remark about large copy by value
2022-10-28 11:20:09 +01:00
David Green 5dd7d2ce67 [InstCombine] Don't change switch table from desirable to illegal types
In InstCombine we treat i8/i16 as desirable, even if they are not legal.
The current logic in shouldChangeType will decide to convert from an
illegal but desirable type (such as an i8) to an illegal and undesirable
type (such as i3). This patch prevents changing the switch conditions to
an irregular type on like Arm/AArch64 where i8/i16 are not legal.

This is the same issue as https://reviews.llvm.org/D54115. In the case I
was looking it is was converting an i32 switch to an i8 switch, which
then became a i3 switch.

Differential Revision: https://reviews.llvm.org/D136763
2022-10-28 10:15:41 +01:00
Juan Manuel MARTINEZ CAAMAÑO 256f8b06c6 [StructurizeCFG][DebugInfo] Maintain DILocations in the branches created by StructurizeCFG
Make StructurizeCFG preserve the debug locations of the branch instructions it introduces.

Differential Revision: https://reviews.llvm.org/D135967
2022-10-28 02:51:02 -05:00
Alexey Bataev 2ec51f1c75 [SLP]Improve analysis of same/alternate code ops and scheduling.
Should improve compile time for analysis and vectorization.

Metric: SLP.NumVectorInstructions

Program                                                                                       SLP.NumVectorInstructions
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test  6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test   6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test           2023.00                   2022.00 -0.0%
test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test               148.00                    146.00 -1.4%

Generated more vector instructions.

Differential Revision: https://reviews.llvm.org/D127531
2022-10-27 16:29:16 -07:00
Alexey Bataev 8ce0c7b1c9 Revert "[SLP]Improve analysis of same/alternate code ops and scheduling."
This reverts commit dad64448c6 to fix
a crash in https://lab.llvm.org/buildbot/#/builders/74/builds/14584
2022-10-27 15:21:35 -07:00
wlei 63c27c5c75 Use getCanonicalFnName for callee name 2022-10-27 13:14:47 -07:00
Sanjay Patel fd90f542cf [InstCombine] improve efficiency of sub demanded bits; NFC
There's no reason to shrink a constant or simplify
an operand in 2 steps.

This matches what we currently do for 'add' (although that
seems like it should be altered to handle the commutative
case).
2022-10-27 15:28:05 -04:00
Alexey Bataev dad64448c6 [SLP]Improve analysis of same/alternate code ops and scheduling.
Should improve compile time for analysis and vectorization.

Metric: SLP.NumVectorInstructions

Program                                                                                       SLP.NumVectorInstructions
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test  6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test   6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test           2023.00                   2022.00 -0.0%
test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test               148.00                    146.00 -1.4%

Generated more vector instructions.

Differential Revision: https://reviews.llvm.org/D127531
2022-10-27 11:31:18 -07:00
eopXD 10da9844d0 [LSR] Drop LSR solution if it is less profitable than baseline
The LSR may suggest less profitable transformation to the loop. This
patch adds check to prevent LSR from generating worse code than what
we already have.

Since LSR affects nearly all targets, the patch is guarded by the
option 'lsr-drop-solution' and default as disable for now.

The next step should be extending an TTI interface to allow target(s)
to enable this enhancememnt.

Debug log is added to remind user of such choice to skip the LSR
solution.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D126043
2022-10-27 10:13:57 -07:00
Alexandros Lamprineas dbeaf6baa2 [FuncSpec] Do not overestimate the specialization bonus for users inside loops.
When calculating the specialization bonus for a given function argument,
we recursively traverse the chain of (certain) users, accumulating the
instruction costs. Then we exponentially increase the bonus to account
for loop nests. This is problematic for two reasons: (a) the users might
not themselves be inside the loop nest, (b) if they are we are accounting
for it multiple times. Instead we should be adjusting the bonus before
traversing the user chain.

This reduces the instruction count for CTMark (newPM-O3) when Function
Specialization is enabled without actually reducing the amount of
specializations performed (geomean: -0.001% non-LTO, -0.406% LTO).

Differential Revision: https://reviews.llvm.org/D136692
2022-10-27 15:26:11 +01:00
Sanjay Patel d2d23795ca [InstCombine] improve demanded bits for Sub operand 0
This is copying the code that was added for 'add' with D130075.
(That patch removed a fallthrough in the cases, but we can
probably still share at least some code again as a follow-up
cleanup, but I didn't want to risk it here.)

The reasoning is similar to the carry propagation for 'add':
if we don't demand low bits of the subtraction and the
subtrahend (aka RHS or operand 1) is known zero in those low
bits, then there can't be any borrowing required from the
higher bits of operand 0, so the low bits don't matter.

Also, the no-wrap flags can be propagated (and I think that
should be true for add too).

Here's an attempt to prove that in Alive2:
https://alive2.llvm.org/ce/z/xqh7Pa
(can add nsw or nuw to src and tgt, and it should still pass)

Differential Revision: https://reviews.llvm.org/D136788
2022-10-27 09:41:57 -04:00
Momchil Velikov 38f44ccfba Recommit: [FuncSpec] Fix specialisation based on literals
[fixed test to work with reverse iteration]

The `FunctionSpecialization` pass has support for specialising
functions, which are called with literal arguments. This functionality
is disabled by default and is enabled with the option
`-function-specialization-for-literal-constant` .  There are a few
issues with the implementation, though:

* even with the default, the pass will still specialise based on
   floating-point literals

* even when it's enabled, the pass will specialise only for the `i1`
    type (or `i2` if all of the possible 4 values occur, or `i3` if all
    of the possible 8 values occur, etc)

The reason for this is incorrect check of the lattice value of the
function formal parameter. The lattice value is `overdefined` when the
constant range of the possible arguments is the full set, and this is
the reason for the specialisation to trigger. However, if the set of
the possible arguments is not the full set, that must not prevent the
specialisation.

This patch changes the pass to NOT consider a formal parameter when
specialising a function if the lattice value for that parameter is:

* unknown or undef
* a constant
* a constant range with a single element

on the basis that specialisation is pointless for those cases.

Is also changes the criteria for picking up an actual argument to
specialise if the argument is:

* a LLVM IR constant
* has `constant` lattice value
 has `constantrange` lattice value with a single element.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135893

Change-Id: Iea273423176082ec51339aa66a5fe9fea83557ee
2022-10-27 12:48:20 +01:00
Sameer Sahasrabuddhe 0fd018f9a9 [NFC] [AAPointerInfo] OffsetAndSize is no longer an std::pair
The struct OffsetAndSize is a simple tuple of two int64_t. Treating it as a
derived class of std::pair has no special benefit, but it makes the code
verbose since we need get/set functions that avoid using "first" and "second" in
client code. Eliminating the std::pair makes this more readable.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D136745
2022-10-27 11:00:28 +05:30
wlei d6a0585dd1 [SampleFDO] Compute and report profile staleness metrics
When a profile is stale and profile mismatch could happen, the mismatched samples are discarded, so we'd like to compute the mismatch metrics to quantify how stale the profile is, which will suggest user to refresh the profile if the number is high.

Two sets of metrics are introduced here:

 - (Num_of_mismatched_funchash/Total_profiled_funchash), (Samples_of_mismached_func_hash / Samples_of_profiled_function) : Here it leverages the FunctionSamples's checksums attribute which is a feature of pseudo probe. When the source code CFG changes, the function checksums will be different, later sample loader will discard the whole functions' samples, this metrics can show the percentage of samples are discarded due to this.
 -  (Num_of_mismatched_callsite/Total_profiled_callsite), (Samples_of_mismached_callsite / Samples_of_profiled_callsite) : This shows how many mismatching for the callsite location as callsite location mismatch will affect the inlining which is highly correlated with the performance. It goes through all the callsite location in the IR and profile, use the call target name to match, report the num of samples in the profile that doesn't match a IR callsite.

This is implemented in a new class(SampleProfileMatcher) and under a switch("--report-profile-staleness"), we plan to extend it with a fuzzy profile matching feature in the future.

Reviewed By: hoy, wenlei, davidxl

Differential Revision: https://reviews.llvm.org/D136627
2022-10-26 21:06:52 -07:00
Johannes Rudolf Doerfert 41a278f56a [OpenMP][FIX] Do not add custom state machine eagerly in LTO runs
If we run LTO optimization we migth end up introducing a custom state machine
and later transforming the region into SPMD. This is a problem. While a follow
up will introduce a check for the SPMD conversion, this already prevents the
eager custom state machine generation. Only if the kernel init function is
defined, rather then declared, we will emit a custom state machine. SPMD-zation
can happen eagerly though. Tests are adjusted via a weak definition. The LTO
test was added to verify this works as expected.

Differential Revision: https://reviews.llvm.org/D136740
2022-10-26 10:40:11 -07:00
Alex Brachet 443e2a10f6 Reland "[PGO] Make emitted symbols hidden"
This was reverted because it was breaking when targeting Darwin which
tried to export these symbols which are now hidden. It should be safe
to just stop attempting to export these symbols in the clang driver,
though Apple folks will need to change their TAPI allow list described
in the commit where these symbols were originally exported
f538018562

Then reverted again because it broke tests on MacOS, they should be
fixed now.

Bug: https://github.com/llvm/llvm-project/issues/58265

Differential Revision: https://reviews.llvm.org/D135340
2022-10-26 17:13:05 +00:00
Momchil Velikov 9901583968 Revert "[FuncSpec] Fix specialisation based on literals"
This reverts commit a8b0f58017 because
of "reverse-iteration" buildbot failure.
2022-10-26 13:54:12 +01:00
Momchil Velikov 2c8a4c6e62 Revert "[FuncSpec][NFC] Refactor finding specialisation opportunities"
This reverts commit a8853924bd due to dependency
on a8b0f58017
2022-10-26 13:54:12 +01:00
Guillaume Chatelet 1a726cfa83 Take memset_inline into account in analyzeLoadFromClobberingMemInst
This appeared in https://reviews.llvm.org/D126903#3884061

Differential Revision: https://reviews.llvm.org/D136752
2022-10-26 09:50:13 +00:00
Momchil Velikov a8853924bd [FuncSpec][NFC] Refactor finding specialisation opportunities
This patch reorders the traversal of function call sites and function
formal parameters to:

* do various argument feasibility checks (`isArgumentInteresting` ) only once per argument, i.e. doing N-args checks instead of N-calls x N-args checks.

* do hash table lookups only once per call site, i.e. N-calls lookups/inserts instead of N-call x N-args lookups/inserts.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D135968
2022-10-26 10:18:35 +01:00
Momchil Velikov 606d25e545 [FuncSpec] Compute specialisation gain even when forcing specialisation
When rewriting the call sites to call the new specialised functions, a
single call site can be matched by two different specialisations - a
"less specialised" version of the function and a "more specialised"
version of the function, e.g.  for a function

    void f(int x, int y)

the call like `f(1, 2)` could be matched by either

    void f.1(int x /* int y == 2 */);

or

    void f.2(/* int x == 1, int y == 2 */);

The `FunctionSpecialisation` pass tries to match specialisation in the
order of decreasing gain, so "more specialised" functions are
preferred to "less specialised" functions. This breaks, however, when
using the flag `-force-function-specialization`, in which case the
cost/benefit analysis is not performed and all the specialisations are
equally preferable.

This patch makes the pass calculate specialisation gain and order the
specialisations accordingly even when `-force-function-specialization`
is used, under the assumption that this flag has purely debugging
purpose and it is reasonable to ignore the extra computing effort it
incurs.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D136180
2022-10-26 10:08:03 +01:00
Momchil Velikov a8b0f58017 [FuncSpec] Fix specialisation based on literals
The `FunctionSpecialization` pass has support for specialising
functions, which are called with literal arguments. This functionality
is disabled by default and is enabled with the option
`-function-specialization-for-literal-constant` .  There are a few
issues with the implementation, though:

* even with the default, the pass will still specialise based on
   floating-point literals

* even when it's enabled, the pass will specialise only for the `i1`
    type (or `i2` if all of the possible 4 values occur, or `i3` if all
    of the possible 8 values occur, etc)

The reason for this is incorrect check of the lattice value of the
function formal parameter. The lattice value is `overdefined` when the
constant range of the possible arguments is the full set, and this is
the reason for the specialisation to trigger. However, if the set of
the possible arguments is not the full set, that must not prevent the
specialisation.

This patch changes the pass to NOT consider a formal parameter when
specialising a function if the lattice value for that parameter is:

* unknown or undef
* a constant
* a constant range with a single element

on the basis that specialisation is pointless for those cases.

Is also changes the criteria for picking up an actual argument to
specialise if the argument is:

* a LLVM IR constant
* has `constant` lattice value
 has `constantrange` lattice value with a single element.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135893
2022-10-26 09:55:33 +01:00
Momchil Velikov 1a525dec7f [FuncSpec] Fix missed opportunities for function specialisation
When collecting the possible constant arguments to
specialise a function the compiler will abandon the search
on the first argument that is for some reason unsuitable as
a specialisation constant. Thus, depending on the traversal
order of the functions and call sites, the compiler can end
up with a different set of possible constants, hence with
different set of specialisations.

With this patch, the compiler will skip unsuitable
constants, but nevertheless will continue searching for
more.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135867
2022-10-25 23:19:48 +01:00
Philip Reames 269bc684e7 [LV][RISCV] Disable vectorization of epilogue loops
Epilogue loop vectorization is a feature in the vectorize intended to avoid running fully scalar code when the vector length of the main loop turns out to be either longer than the trip count of the actual loop, or with a huge remainder.

In practice, this feature appears to not have been well tuned. I honestly don't think it should be on by default at all, but it definitely shouldn't be on for RISCV. Note that other targets have also disabled it, but they've done so via disabling interleaving - which is, well, completely unrelated - and we don't want to do that for RISCV.

In the near term, many examples I'm seeing have terrible codegen for epilogue vectorization. We are greatly increasing code size for little value at reasonable VLEN values for small types. In the long term, the cases that epilogue vectorization are intended to handle are likely better handled via tail folding on RISCV.

As an aside, I also don't really trust the correctness of epilogue vectorization. The code structure is such that otherwise straight forward changes sometimes break only epilogue vectorization. The reuse of an existing vplan without careful validation opens significant room for nasty bugs. Given how rarely the code is exercised, that is not a good combination.

As such, this patch introduces a TTI hook, and completely disables epilogue vectorization on RISCV.

Differential Revision: https://reviews.llvm.org/D136695
2022-10-25 14:28:02 -07:00
Arthur Eubanks ef37504879 [Instrumentation] Remove legacy passes
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D136615
2022-10-25 13:11:07 -07:00
Alina Sbirlea d1b19da854 [LoopPeeling] Add flag to disable support for peeling loops with non-latch exits
Add a flag to allow disabling the changes in
https://reviews.llvm.org/D134803.

Differential Revision: https://reviews.llvm.org/D136643
2022-10-25 12:19:14 -07:00
Momchil Velikov c47739b45c [FuncSpec] Consider small noinline functions for specialisation
Small functions with size under a given threshold are not
considered for specialisaion on the presumption that they
are easy to inline. This does not apply to `noinline`
functions, though.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135862
2022-10-25 19:49:04 +01:00
Fangrui Song a527bda520 [LegacyPM] Remove DataFlowSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove DataFlowSanitizerLegacyPass.

Differential Revision: https://reviews.llvm.org/D124594
2022-10-25 10:55:29 -07:00
Simon Pilgrim 50fe87a5c8 [Transforms] classifyArgUse - don't deference pointer before null test
Reported here: https://pvs-studio.com/en/blog/posts/cpp/1003/ (N11)
2022-10-25 17:24:00 +01:00
Yaxun (Sam) Liu 9d5adc7e49 Revert "reland e5581df60a [SimplifyCFG] accumulate bonus insts cost"
This reverts commit bd7949bcd8.

Revert this patch since reviwers have different opinions regarding
the approach in post-commit review.

Will open RFC for further discussion.

Differential Revision: https://reviews.llvm.org/D132408
2022-10-25 12:15:39 -04:00
zhongyunde 620cff096a [InstCombine] Fold series of instructions into mull for more types
Relax the constraint of wide/vectors types.
Address the comment https://reviews.llvm.org/D136015?id=469189#inline-1314520

Reviewed By: spatel, chfast
Differential Revision: https://reviews.llvm.org/D136661
2022-10-25 23:04:46 +08:00
Nico Weber 76745d2b58 Revert "[PGO] Make emitted symbols hidden"
This reverts commit 04877284b4.
Looks like this is still breaking the test
Profile-x86_64 :: instrprof-darwin-dead-strip.c
(see comment on https://reviews.llvm.org/D135340).
2022-10-25 08:54:47 -04:00
LiaoChunyu e6c8418aab [ObjCARC][NFC] Fix defined but not used warning from D135041
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D136665
2022-10-25 15:16:42 +08:00
Kevin Athey 31bfa4a69b [MSAN] Add handleCountZeroes for ctlz and cttz.
This addresses a bug where vector versions of ctlz are creating false positive reports.

Depends on D136369

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D136523
2022-10-24 17:31:34 -07:00
Roy Sundahl 0c35b6165c [ASAN] Don't inline when -asan-max-inline-poisoning-size=0
When -asan-max-inline-poisoning-size=0, all shadow memory access should be
outlined (through asan calls). This was not occuring when partial poisoning
was required on the right side of a variable's redzone. This diff contains
the changes necessary to implement and utilize  __asan_set_shadow_01() through
__asan_set_shadow_07(). The change is necessary for the full abstraction of
the asan implementation and will enable experimentation with alternate strategies.

Differential Revision: https://reviews.llvm.org/D136197
2022-10-24 14:17:59 -07:00
Sanjay Patel 5dcfc32822 [InstCombine] allow more commutative matches for logical-and to select fold
This is a sibling transform to the fold just above it. That was changed
to allow the corresponding commuted patterns with:
3073074562
e1bd759ea5
8628e6df70
2022-10-24 16:40:43 -04:00
Yaxun (Sam) Liu bd7949bcd8 reland e5581df60a [SimplifyCFG] accumulate bonus insts cost
Fixed compile time increase due to always constructing LocalCostTracker.
Now only construct LocalCostTracker when needed.
2022-10-24 15:43:53 -04:00
Craig Topper 1edc51b56a [InstCombine] Explicitly check for scalable TypeSize.
Instead of assuming it is a fixed size.

Reviewed By: peterwaller-arm

Differential Revision: https://reviews.llvm.org/D136517
2022-10-24 12:29:06 -07:00
Alex Brachet 04877284b4 [PGO] Make emitted symbols hidden
This was reverted because it was breaking when targeting Darwin which
tried to export these symbols which are now hidden. It should be safe
to just stop attempting to export these symbols in the clang driver,
though Apple folks will need to change their TAPI allow list described
in the commit where these symbols were originally exported
f538018562

Bug: https://github.com/llvm/llvm-project/issues/58265

Differential Revision: https://reviews.llvm.org/D135340
2022-10-24 19:05:10 +00:00
Alexey Bataev da4e0f7ac5 [SLP][NFC]Fix PR58476: Fix compile time for reductions, NFC.
Improve O(N^2) to O(N) in some cases, reduce number of allocations by
reserving memory.
Also, improve analysis of loads reduction values to avoid analysis
of not vectorizable cases.
2022-10-24 10:13:24 -07:00
zhongyunde 81713e893a [InstCombine] Fold series of instructions into mull
The following sequence should be folded into in0 * in1
      In0Lo = in0 & 0xffffffff; In0Hi = in0 >> 32;
      In1Lo = in1 & 0xffffffff; In1Hi = in1 >> 32;
      m01 = In1Hi * In0Lo; m10 = In1Lo * In0Hi; m00 = In1Lo * In0Lo;
      addc = m01 + m10;
      ResLo = m00 + (addc >> 32);

Reviewed By: spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D136015
2022-10-25 01:09:37 +08:00
Kevin P. Neal cfb88ee3ba [StrictFP][IPSCCP] Constant fold intrinsics with metadata arguments
This teaches the SCCP Solver how to constant fold more intrinsics. Constant
folding appears to be just as good as D115737 but much, much lower in code
change impact as suggested by nikic.

The constrained floating-point intrinsics all take at least one metadata
argument and were the motivation for the change.

Differential Revision: https://reviews.llvm.org/D136466
2022-10-24 11:43:20 -04:00
Ahmed Bougacha bddd9b6b91 [InstCombine] Combine ptrauth sign/resign + auth/resign intrinsics.
(sign|resign) + (auth|resign) can be folded by omitting the middle
sign+auth component if the key and discriminator match.

Differential Revision: https://reviews.llvm.org/D132383
2022-10-24 08:03:14 -07:00
Matt Arsenault 597b9b7e8e CodeExtractor: Fix assertion with non-0 alloca address spaces
emitCallAndSwitchStatement creates placeholder allocas to pass
to these, so the types need to match.
2022-10-23 15:16:55 -07:00
Mike Hommey 86e57e66da [InstCombine] Bail out of casting calls when a conversion from/to byval is involved.
Fixes #58307

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D135738
2022-10-23 09:49:48 +02:00
Kazu Hirata 3f8d2c917c Ensure newlines at the end of files (NFC) 2022-10-22 09:29:40 -07:00
Sanjay Patel 8628e6df70 [InstCombine] use freeze to enable poison-safe logic->select fold
Without a freeze, this transform can leak poison to the output:
https://alive2.llvm.org/ce/z/GJuF9i

This makes the transform as uniform as possible, and it can help
reduce patterns like issue #58313 (although that particular
example probably still needs another transform).

Differential Revision: https://reviews.llvm.org/D136527
2022-10-22 10:42:14 -04:00
Thomas Symalla fc26a75280 [NFC] Fixed several misspellings of "Splitter" in Scalarizer
Spliiter => Splitter
2022-10-22 15:13:56 +02:00
Sanjay Patel e1bd759ea5 [InstCombine] allow more matches for logical-ands --> select
This allows patterns with real 'and' instructions because
those are safe to transform:
https://alive2.llvm.org/ce/z/7-U_Ak
2022-10-22 08:15:50 -04:00
Arthur Eubanks 4153f989ba [ObjCARC] Remove legacy PM versions of optimization passes
This doesn't touch objc-arc-contract because that's in the codegen pipeline.
However, this does move its corresponding initialize function into initializeCodegen().

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D135041
2022-10-21 13:40:54 -07:00
Sanjay Patel 3073074562 [InstCombine] allow more commutative matches for logical-and to select fold
When the common value is part of either select condition,
this is safe to reduce. Otherwise, it is not poison-safe
(with the select form of the pattern):
https://alive2.llvm.org/ce/z/FxQTzB

This is another patch motivated by issue #58313.
2022-10-21 13:29:13 -04:00
Wael Yehia 461a1836d3 [PGO][AIX] Improve dummy var retention and allow -bcdtors:csect linking.
1) Use a static array of pointer to retain the dummy vars.
2) Associate liveness of the array with that of the runtime hook variable
   __llvm_profile_runtime.
3) Perform the runtime initialization through the runtime hook variable.
4) Preserve the runtime hook variable using the -u linker flag.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D136192
2022-10-21 16:32:42 +00:00
Sanjay Patel d7fecf26f4 [InstCombine] allow some commutative matches for logical-and to select fold
This is obviously correct for real logic instructions,
and it also works for the poison-safe variants that use
selects:
https://alive2.llvm.org/ce/z/wyHiwX

This is motivated by the lack of 'xor' folding seen in issue #58313.
This more general fold should help reduce some of those patterns,
but I'm not sure if this specific case does anything for that
particular example.
2022-10-21 11:28:38 -04:00
Sanjay Patel f6fc3e23b9 [InstCombine] refactor matching code for logical ands; NFCI
Separating the matches makes it easier
to enhance for commutative patterns.
2022-10-21 11:28:38 -04:00
Sanjay Patel bf75e937bb [InstCombine] match logical and/or more generally in fold to select
This allows the regular bitwise logic opcodes in addition to the
poison-safe select variants:
https://alive2.llvm.org/ce/z/8xB9gy

Handling commuted variants safely is likely trickier, so that's
left to another patch.
2022-10-21 09:03:36 -04:00
Florian Hahn fd236772f5
[IndVars] Forget SCEV for value after simplifying condition.
Additional SCEV verification highlighted a case where the cached loop
dispositions where incorrect after simplifying a condition in IndVars
and moving the user in LoopDeletion. Fix it by invalidating ICmp and all
its users.

Fixes #58515.
2022-10-21 11:18:01 +01:00
Nikita Popov e9754f0211 [IR] Add support for memory attribute
This implements IR and bitcode support for the memory attribute,
as specified in https://reviews.llvm.org/D135597.

The new attribute is not used for anything yet (and as such, the
old memory attributes are unaffected).

Differential Revision: https://reviews.llvm.org/D135592
2022-10-21 12:11:25 +02:00
Florian Hahn 7eb4ec1c75
[VPlan] Print predicates for widened cmp instructions (NFC). 2022-10-21 08:54:11 +01:00
Michael Francis 922f42d531 [clang][AIX] Fix mcount name and call arguments
Currently, compiling a program with the `-pg` flag will result in an
undefined symbol error for `.mcount`. This revision fixes the call to
use `__mcount`, which requires a pointer argument to a pointer-sized
object (unique per inserted call) on AIX.

This is only a partial fix. This patch should fix the `-pg` flag's
behaviour on AIX to work with code you are compiling, but it will not
link against standard libraries with `mcount` instrumentation calls. The
next step is to add profiled libraries to the linker search paths in the
Clang driver for the AIX toolchain when linking with `-pg`.

Differential Review: https://reviews.llvm.org/D135384
2022-10-20 16:20:00 -04:00
William Huang 6c767cef5a [InstCombine] Canonicalize GEP of GEP by swapping constant-indexed GEP to the back
Canonicalize GEP of GEP by swapping GEP with some suffix constant indices to the back (and GEP with all constant indices to the back of that), this allows more constant index GEP merging to happen. Exceptions are: If swapping violates use-def relations, or anti-optimizes LICM

For constant indexed GEP of GEP, if they cannot be merged directly, they will be casted to i8* and merged.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D125845
2022-10-20 17:41:26 +00:00
Paul Walker ab8257ca0e [NFC] Fix a few whitespace inconsistencies. 2022-10-20 14:52:25 +00:00
OCHyams f825214411 [DebugInfo][NFC] Refactor debug intrinsic copy and delete to instead just move
Reviewed By: jryans

Differential Revision: https://reviews.llvm.org/D133304
2022-10-20 15:12:49 +01:00
Florian Hahn e25ed058bc
[LV] Use buildScalarSteps to also handle VF = 1. (NFCI)
The code in buildScalarSteps already properly handles creating the
scalar induction values with VF = 1. Use it directly instead of using
extra code to handle that case.

Suggested by @Ayal in D133760.
2022-10-20 14:30:01 +01:00
Nikita Popov 7c32c7e777 Reapply [FunctionAttrs] Make location classification more precise
Reapplying after the fix for volatile modelling in D135863.

-----

Don't add argmem if the pointer is clearly not an argument (e.g.
a global). I don't think this makes a difference right now, but
gives more obvious results with D135780.
2022-10-20 15:13:20 +02:00
Nabeel Omer e1fd6d49a3 [InstCombine] Fix assert condition in `foldSelectShuffleOfSelectShuffle`
Bug introduced in e239198cdb.

The assert() is making an assumption that the resulting shuffle mask
will always select elements from both vectors, this is untrue in the
case of two shuffles being folded if the former shuffle has a mask with
undef elements in it. In such a case folding the shuffles might result
in a mask which only selects from one of the vectors because the other
elements (in the mask) are undef.

Differential Revision: https://reviews.llvm.org/D136256
2022-10-20 12:10:54 +00:00
Florian Hahn 3a4aa24fd1
[LoopSimplifyCFG] Forget loop and block dispos after merging blocks.
This fixes another case where block and loop dispositions weren't
properly invalidate after changing the CFG.

Fixes #58489.
2022-10-20 11:23:29 +01:00
Nikita Popov bed31153b7 [FuncAttrs] Extract code for adding a location access (NFC)
This code is the same for accesses from call arguments and for
accesses from other (single-location) instructions. Extract i
into a common function.
2022-10-20 12:01:27 +02:00
Nikita Popov f2fe289374 [FunctionAttrs] Volatile operations can access inaccessible memory
Per LangRef, volatile operations are allowed to access the location
of their pointer argument, plus inaccessible memory:

> Any volatile operation can have side effects, and any volatile
> operation can read and/or modify state which is not accessible
> via a regular load or store in this module.
> [...]
> The allowed side-effects for volatile accesses are limited. If
> a non-volatile store to a given address would be legal, a volatile
> operation may modify the memory at that address. A volatile
> operation may not modify any other memory accessible by the
> module being compiled. A volatile operation may not call any
> code in the current module.

FunctionAttrs currently does not model this and ends up marking
functions with volatile accesses on arguments as argmemonly,
even though they should be inaccessiblemem_or_argmemonly.

Differential Revision: https://reviews.llvm.org/D135863
2022-10-20 11:57:10 +02:00
Alexey Bataev b8b740c834 [SLP][NFC]Remove unused variable, NFC. 2022-10-19 12:35:27 -07:00
Fangrui Song c80b12d352 Revert D135427 "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally"
This reverts commit 8ef3fd8d59.

I mentioned that GlobalAlias was not handled. It turns out GlobalAlias has to be handled in the same patch (as opposed to in a follow-up),
as otherwise clang codegen of C5/D5 constructor/destructor would regress (https://reviews.llvm.org/D135427#3869003).
2022-10-19 11:24:12 -07:00
Florian Hahn d72fcee8f4
[VPlan] Add VPValue::isDefinedOutsideVectorRegions helper (NFC).
@Ayal suggested a better named helper than using `!getDef()` to check if
a value is invariant across all parts.

The property we are using here is that the VPValue is defined outside
any vector loop region. There's a TODO left to handle recipes defined in
pre-header blocks.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133666
2022-10-19 13:20:30 +01:00
bipmis 38f3e44997 [AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.

Differential Revision: https://reviews.llvm.org/D135137
2022-10-19 11:22:58 +01:00
Nikita Popov 747f27d97d [AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)
Follow up on D135962, renaming the method name to match the new
type name.
2022-10-19 11:03:54 +02:00
Nikita Popov 1a9d9823c5 [AA] Rename uses of FunctionModRefBehavior (NFC)
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variables names FMRB/MRB to ME, to match the
new type name.
2022-10-19 10:54:47 +02:00
Alexey Bataev 087dadfd37 [SLP]Generalize cost model.
Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
  cost model for extractelement instructions.

cpu2017
   511.povray_r             0.57
   520.omnetpp_r           -0.98
   521.wrf_r               -0.01
   525.x264_r               3.59 <+
   526.blender_r           -0.12
   531.deepsjeng_r         -0.07
   538.imagick_r           -1.42
Geometric mean:  0.21

Differential Revision: https://reviews.llvm.org/D115757
2022-10-18 11:55:59 -07:00
Alexey Bataev 62267e8de0 Revert "[SLP]Generalize cost model."
This reverts commit f12fb91188 and
f5c747bfbe to fix detected non-initialized
var use.
2022-10-18 11:25:59 -07:00
Sjoerd Meijer f7c42a278b Revert "Recommit "[LoopFlatten] Enable it by default""
This reverts commit 5b9597f59a.

A miscompilation was reported:
https://github.com/llvm/llvm-project/issues/58441

Reverting this while I look at that.
2022-10-18 23:36:36 +05:30
Alexey Bataev f5c747bfbe [SLP][NFC]Fix a warning for ?: with enum/unsigned, NFC. 2022-10-18 10:08:05 -07:00
Florian Hahn c65513444b
[IndVars] Forget SCEV for instruction and users before replacing it.
Extra invalidation is needed here to clear stale values to fix a
verification failure.

Fixes #58440.
2022-10-18 17:38:14 +01:00
Alexey Bataev f12fb91188 [SLP]Generalize cost model.
Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
  cost model for extractelement instructions.

cpu2017
   511.povray_r             0.57
   520.omnetpp_r           -0.98
   521.wrf_r               -0.01
   525.x264_r               3.59 <+
   526.blender_r           -0.12
   531.deepsjeng_r         -0.07
   538.imagick_r           -1.42
Geometric mean:  0.21

Differential Revision: https://reviews.llvm.org/D115757
2022-10-18 08:49:32 -07:00
Arthur Eubanks 6219ec07c6 [SROA] Don't speculate phis with different load user types
Fixes an SROA crash.

Fallout from opaque pointers since with typed pointers we'd bail out at the bitcast.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D136119
2022-10-18 08:44:13 -07:00
Sanjay Patel 44b7da89d7 [InstCombine] fmul nnan X, 0.0 --> copysign(0.0, X)
https://alive2.llvm.org/ce/z/ybgM5F

Differential Revision: https://reviews.llvm.org/D136166
2022-10-18 11:34:02 -04:00
Sanjay Patel d16989607b [InstCombine] reduce code duplication in visitBranchInst(); NFCI 2022-10-18 11:34:02 -04:00
Florian Hahn a8e9742bd4
[IndVarSimplify] Clear block and loop dispositions after moving instr.
Moving an instruction can invalidate the cached block dispositions of
the corresponding SCEV. Invalidate the cached dispositions.

Also fixes a copy-paste error in forgetBlockAndLoopDispositions where
the start expression S was removed from BlockDispositions in the loop
but not the current values. This was also exposed by the new test case.

Fixes #58439.
2022-10-18 16:18:14 +01:00
Alexey Bataev e79532d28c [SLP][NFC]Try to fix MSVC buildbots with a workaround, NFC. 2022-10-18 07:50:10 -07:00
uabkaka da137d041b [SimplifyLibCalls] Add NoUndef/NonNull/Dereferenceable attributes to iprintf/siprintf
When SimplifyLibCalls fail to optimize printf and sprintf it add
NoUndef/NonNull/Dereferenceable attributes. This patch add the same attributes
if SimplifyLibCalls optimize printf/sprintf into the integer only
iprintf/siprintf.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D136140
2022-10-18 16:36:35 +02:00
Alexey Bataev 6a6fc4890d [SLP][NFC]Formatting of the getEntryCost function, NFC. 2022-10-18 07:18:26 -07:00
Florian Hahn e302fa89aa
[LoopUnroll] Forget exit values when making changes.
When unrolling, the exit values in LCSSA phis will get updated.
Invalidate cached SCEV values for those phis in case SCEV looked through
a exit phi.

Fixes #58340.
2022-10-18 15:12:24 +01:00
Max Kazantsev f884a4c957 [NFC] Reuse NonTrivialUnswitchCandidate instead of std::pair 2022-10-18 14:00:53 +07:00
Arthur Eubanks 308b4bca14 [NFC][SROA] Update comment to use opaque pointers for clarity 2022-10-17 16:37:29 -07:00
Daniel Sanders 021e6e05d3 [instsimplify] Move (extelt (inselt Vec, Value, Index), Index) -> Value from InstCombine
As requested in https://reviews.llvm.org/D135625#3858141

Differential Revision: https://reviews.llvm.org/D136099
2022-10-17 15:22:06 -07:00
Matthias Braun 6d972ad2d8 ControlHeightReduction: Remove assert check in shouldApply
Remove assertion checking for non-empty `ProfileSummaryInfo`.

Differential Revision: https://reviews.llvm.org/D133706
2022-10-17 13:10:13 -07:00
Florian Hahn 6db71b8f14
[ConstraintElim] Use helper to allow overflow for coefficients of GEPs
If the arithmetic for indices of inbounds GEPs overflows, the result is
poison. This means it is also OK for the coefficients to overflow. GEP
decomposition is limited to cases where the index size is <= 64 bit,
which can be represented by int64_t used for the coefficients in the
constraint system.
2022-10-17 20:30:43 +01:00
Sjoerd Meijer 5b9597f59a Recommit "[LoopFlatten] Enable it by default"
The sanitizer bots turned green again after another change went in, i.e.
revert 26dd64ba9c, so I don't think this
patch was causing the problems.
2022-10-17 23:27:19 +05:30
Sjoerd Meijer a71c4e4fbb Revert "[LoopFlatten] Enable it by default"
This reverts commit 233659c7ae.

I see some sanitizer build bot failures. Not sure if it is change
causing it, but let's see if a revert returns the bots to green...
2022-10-17 22:14:20 +05:30
Sanjay Patel 8d76fbb5f0 [VectorCombine] fix crashing on match of non-canonical fneg
We can't assume that operand 0 is the negated operand because
the matcher handles "fsub -0.0, X" (and also +0.0 with FMF).

By capturing the extract within the match, we avoid the bug
and make the transform more robust (can't assume that this
pass will only see canonical IR).
2022-10-17 10:47:48 -04:00
Nikita Popov 779fd39684 Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
Relative to the previous attempt, this is rebased over the
InstSimplify fix in ac74e7a780,
which addresses the miscompile reported in PR58401.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
2022-10-17 16:11:05 +02:00
Florian Hahn 699396131f
Revert "Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify"
This reverts commit 333246b48e.

It looks like this patch causes a mis-compile:
https://github.com/llvm/llvm-project/issues/58401

Fixes #58401.
2022-10-17 12:56:28 +01:00
Sjoerd Meijer 233659c7ae [LoopFlatten] Enable it by default
LoopFlatten has been in the code base off by default for years, but this
enables it to run by default. Downstream this has been running for
years, so it has been exposed to quite some code. Then around the time
we switched to the NPM, several fixes went in related to updating the
MemorySSA state and we moved it to a loop pass manager, which both
helped preventing rerunning certain analysis passes, and thus helped a
bit with compile-times.

About compile-times, adding a pass isn't free, but this should see only
very minor increases. The pass is relatively simple and there shouldn't
be anything algorithmically expensive because all it does is looking at
inner/outer loops and it checks assumptions on loop increments and
indices. If we see increases, I expect this to mainly come from
invalidation of analysis info, and perhaps subsequent passes to trigger
and do more. Despite its simplicity/restrictions, it triggers in most
code-bases, which makes it worth to enable this by default.

Differential Revision: https://reviews.llvm.org/D109958
2022-10-17 17:11:39 +05:30
Chuanqi Xu 1cedc51ff5 [Coroutines] Don't merge readnone calls in presplit coroutines
Another alternative to fix the thread identification problem in
coroutines.

We plan to fix this problem by unifying memory effecting attributes. See
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
But it may be a long-term project. And it is a pity that the coroutines
can't resume in different threads for years. So this one is temporary
fix. It may cause unnecessary performance regression for coroutines. But
correctness are more important. And this one is planned to be reverted
after we are able to unify the memory effecting attributes actually.

Reviewed By: jdoerfert, rjmccall

Differential Revision: https://reviews.llvm.org/D135550
2022-10-17 10:22:43 +08:00
Kazu Hirata 5ea3155565 [llvm] Use llvm::find (NFC) 2022-10-16 16:21:00 -07:00
Florian Hahn 462ab9810d
[ConstraintElim] Fix signed integer overflow for inbounds GEP.
For inbounds GEPs, signed overflow yields poison, so it is fine for the
coefficients to wrap as well. This fixes an UBSan failure.
2022-10-16 23:25:28 +01:00
Florian Hahn aec0c1009f
[ConstraintElim] Replace custom GEP index handling by using existing code
Instead of duplicating the existing decomposition code for GEP indices
just use the existing code by calling the existing decompose function on
the index expression and multiply the result's coefficients by the scale of
the index.

This both reduces code duplication and generalizes the pattern we can
handle.
2022-10-16 21:53:11 +01:00
Florian Hahn a4635ec710
[ConstraintElim] Support `add nsw` for unsigned preds with positive ops.
If both operands of an `add nsw` are known positive, it can be treated
the same as `add nuw` and added to the unsigned system.

https://alive2.llvm.org/ce/z/6gprff
2022-10-16 20:25:14 +01:00
Sanjay Patel e5ee0b06d6 [InstCombine] try to determine "exact" for sdiv
If the divisor is a power-of-2 or negative-power-of-2 and the dividend
is known to have >= trailing zeros than the divisor, the division is exact:
https://alive2.llvm.org/ce/z/UGBksM (general proof)
https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests)

This isn't the most direct optimization (we could create ashr in these
examples instead of relying on existing folds for exact divides), but
it's possible that there's a more general constraint than just a pow2
divisor, so this might be extended in the future.

This should solve issue #58348.

Differential Revision: https://reviews.llvm.org/D135970
2022-10-16 10:59:56 -04:00
Sanjay Patel 340ae45be0 [InstCombine] use isKnownNonNegative() for readability; NFCI
This should be functionally equivalent - both calls are thin
wrappers around computeKnownBits(). We'll probably want to use
known-bits directly in follow-up patches because that could
determine "exact" for example (see issue #58348).
2022-10-16 10:59:56 -04:00
Kazu Hirata b2f41e9ac1 [Vectorize] Use std::conditional_t (NFC) 2022-10-15 14:52:25 -07:00
Florian Hahn 7c1b80e35c
[ConstraintElim] Support unsigned decomposition of mul/shl nuw..const
Support decomposition for `mul/shl nuw` with constant operand for unsigned
queries. Those expressions should not wrap in the unsigned sense and can
be added directly to the unsigned system.
2022-10-15 21:28:08 +01:00
Florian Hahn f12684d36e
[ConstraintElim] Support signed decomposition of `add nsw`.
Add support decomposition for `add nsw` for signed queries.
`add nsw` won't wrap and can be directly added to the signed
system.
2022-10-15 18:34:03 +01:00
Zequan Wu 82035ec777 Revert "[PGO] Make emitted symbols hidden"
This reverts commit ecac223b0e.

The commit causes instrprof-darwin-dead-strip.c to fail on mac.
2022-10-14 15:23:26 -07:00
Florian Hahn 16cf666bb7
[Loop] Move block and loop dispo invalidation to makeLoopInvariant.
makeLoopInvariant may recursively move its operands to make them
invariant, before moving the passed in instruction. Those recursively
moved instructions are currently missed when invalidating block and loop
dispositions.

To address this, move the invalidation code to Loop::makeLoopInvariant.

Fixes #58314.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D135909
2022-10-14 21:58:14 +01:00
Zain Jaffal 0c8dde551c [ConstraintElimination] Move logic for replacing ssub overflow users (NFC)
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D134044
2022-10-14 21:14:21 +01:00
Argyrios Kyrtzidis d877e3fe71 [Transforms/ObjCARC] Fix non-deterministic output of `ObjCARCOptPass`
`ProvenanceAnalysis::related()` was assuming that the order of parameters for `relatedCheck()` was not affecting
the result but this was not the case when both parameters were `PHINode`s.
Due to this assumption `ProvenanceAnalysis::related()` was ordering the parameters based on pointer value which resulted in
non-deterministic behavior.

To address this change `relatedPHI()` so that it gives the same result independent of the parameter order.

rdar://100325456

Differential Revision: https://reviews.llvm.org/D135376
2022-10-14 12:26:58 -07:00
Craig Topper d3366efd43 [LV] Simplify register usage code and avoid double map lookups. NFC
Instead of checking whether a map entry exists to decide if we should
initialize it or add to it, we can rely on the map entry being constructed
and initialized to 0 before the addition happens.

For the std::max case, I've made a reference to the map entry to
avoid looking it up twice.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135977
2022-10-14 11:55:48 -07:00
Florian Hahn 5a68e578ca
[ConstraintElim] Add debug message when decomposition fails. 2022-10-14 11:02:05 +01:00
Wolfgang Pieb b43a1d1bd9 [PGO] Do not create block count annotations when all weights are 0,
avoiding an assertion.

A BB with a nonzero count, whose successor blocks all have 0 counts, could
cause an assertion. Don't create any branch weights in this case.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D134203
2022-10-13 14:57:42 -07:00
Sanjay Patel d85505a932 [InstCombine] fold logical and/or to xor
(A | B) & ~(A & B) --> A ^ B

https://alive2.llvm.org/ce/z/qpFMns

We already have the equivalent fold for real
logic instructions, but this pattern may occur
with selects too.

This is part of solving issue #58313.
2022-10-13 16:12:20 -04:00
Florian Hahn 572d5d374c
[ConstraintElim] Add support for GEPs with multiple indices.
Lift restriction on GEPs with a single index by iterating over all
indices and joining the {Coefficient, Variable} entries for all indices
together.
2022-10-13 21:08:33 +01:00
Alex Brachet ecac223b0e [PGO] Make emitted symbols hidden
This was reverted because it was breaking when targeting Darwin which
tried to export these symbols which are now hidden. It should be safe
to just stop attempting to export these symbols in the clang driver,
though Apple folks will need to change their TAPI allow list described
in the commit where these symbols were originally exported
f538018562

Bug: https://github.com/llvm/llvm-project/issues/58265

Differential Revision: https://reviews.llvm.org/D135340
2022-10-13 19:47:15 +00:00
Florian Hahn 71c49d189a
[ConstraintElim] Move check-and-replace logic to helper function (NFC).
Move logic to check and replace conditions to a helper function. This
isolates the code, allows using early returns, reduces the
indentation and simplifies eliminateConstraints.
2022-10-13 18:58:37 +01:00
Nikita Popov b54b84fde6 [MemCpyOpt] Add additional debug output (NFC) 2022-10-13 17:03:44 +02:00
Alexey Bataev c787986cdd [SLP]Improve costs of vectorized loads/stores by analyzing GEPs.
When generating masked gathers nodes, SLP vectorizer accounts the cost
of the GEPs for loads as part of the scalar-vector transformation cost
estimation. But it does not do it for vectorized loads/stores, while it
may completely remove some of the GEPs completely. Because of this in
some cases masked gather operation can be much more profitable rather
than regular vectorization (masked-gather cost + vector GEP - scalar
loads + GEPs comparing to vectorized loads - scalar loads).
Added the analysis of the removed scalarGEPs for vectorized load/store nodes for better cost estimation.

Differential Revision: https://reviews.llvm.org/D135282
2022-10-13 07:20:41 -07:00
Philip Reames fe755af3a9 Revert "Remove PlaceSafepoints pass"
This reverts commit cb66e123c6.  It was reported via https://reviews.llvm.org/rGcb66e123c6bc82a793300b6fb3ecbed79c58f557#1132969 that the Microsoft.NET compiler is still using this pass.
2022-10-13 07:17:25 -07:00
Florian Hahn 019049a1ca
[ConstraintElim] Use MulOverflow to avoid UB on signed overflow.
This fixes an UBSan failure after 359bc5c541. For inbounds GEP with
index sizes <= 64, having the coefficients overflowing is fine.
2022-10-13 13:57:43 +01:00
Nikita Popov d44cd1bbeb Revert "[FunctionAttrs] Make location classification more precise"
This reverts commit b05f5b90a1.

There are thread sanitizer buildbot failures in simple_stack.c.
I think that's because this ended up affecting the handling of
volatile accesses to allocas. Reverting for now.
2022-10-13 12:11:04 +02:00
Nikita Popov b05f5b90a1 [FunctionAttrs] Make location classification more precise
Don't add argmem if the pointer is clearly not an argument (e.g.
a global). I don't think this makes a difference right now, but
gives more obvious results with D135780.
2022-10-13 11:24:23 +02:00
Florian Hahn 359bc5c541
[ConstraintElim] Bail out for GEPs when index size > 64 bits.
Limit pointer decomposition to pointers with index sizes of at most 64
bits. int64_t is used for coefficients, so as long as the index size <=
64 bits we should be able to represent all pointer offsets.

Pointer decomposition is limited to inbounds GEPs, so if a index
computation would overflow the result is poison, so it doesn't matter
that the coefficient overflows.

This allows replacing MulOverflow with regular multiplications.
2022-10-13 10:19:30 +01:00
Nikita Popov 440ce05fbf [FunctionAttrs] Handle potential access of captured argument
We have to account for accesses to argument memory via captures.
I don't think there's any way to make this produce incorrect
results right now (because as soon as "other" is set, we lose the
ability to infer argmemonly), but this avoids incorrect results
once we have more precise representation.
2022-10-13 11:15:36 +02:00
Nikita Popov 5b3776842f [FunctionAttrs] Account for memory effects of inalloca/preallocated
The code for inferring memory attributes on arguments claims that
inalloca/preallocated arguments are always clobbered:
d71ad41080/llvm/lib/Transforms/IPO/FunctionAttrs.cpp (L640-L642)

However, we would still infer memory attributes for the whole
function without taking this into account, so we could still end
up inferring readnone for the function. This adds an argument
clobber if there are any inalloca/preallocated arguments.

Differential Revision: https://reviews.llvm.org/D135783
2022-10-13 10:20:17 +02:00
Florian Hahn 0ebd288338
[ConstraintElim] Move GEP decomposition code to separate fn (NFC).
Breaks up a large function and allows for the use to early exits.
2022-10-12 20:39:05 +01:00
Arthur Eubanks f59e1bcc22 [PrintPipeline] Handle CoroConditionalWrapper and add more verification
Add a check (can be disabled via a flag) that the pipeline we generate is actually parsable.
Can be disabled because we don't expect to handle every pass in -print-pipeline-passes.

Fixes #58280.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135703
2022-10-12 09:36:45 -07:00
Sanjay Patel 7b9482df3d [InstCombine] fold sdiv with common shl amount in operands
(X << Z) / (Y << Z) --> X / Y

https://alive2.llvm.org/ce/z/CLKzqT

This requires a surprising "nuw" constraint because we have
to guard against immediate UB via signed-div overflow with
-1 divisor.

This extends 008a89037a and is another transform
derived from issue #58137.
2022-10-12 11:32:15 -04:00
Alexey Bataev d71ad41080 [SLP]Fix insertpoint of the extractellements instructions to avoid reshuffle crash.
Need to set the insertpoint for extractelement to point to the first
instruction in the node to avoid possible crash during external uses
combine  process. Without it we may endup with the incorrect
transformation.

Differential Revision: https://reviews.llvm.org/D135591
2022-10-12 08:18:30 -07:00
Sanjay Patel 008a89037a [InstCombine] fold udiv with common shl amount in operands
(X << Z) / (Y << Z) --> X / Y

https://alive2.llvm.org/ce/z/E5eaxU

This fixes the motivating example from issue #58137,
but it is not the most general transform. We should
probably also convert left-shift in the divisor to
right-shift in the dividend for that, but that exposes
another missed canonicalization for shifts and adds.
2022-10-12 11:12:26 -04:00
Jordan Rupprecht cbae57c0e1 [NFC] Ignore unused var in no-asserts builds 2022-10-12 08:11:10 -07:00
Alexey Bataev 1be3428ea0 [SLP]Fix PR58177: Improve isUndefVector function to avoid extra freeze.
Freeze instruction in some cases makes codegen worse, so need to be very
careful when emitting it. Instead improve analysis in isUndefVector
function to generate mask of unused elements and use it in the analysis.

Differential Revision: https://reviews.llvm.org/D135382
2022-10-12 07:32:54 -07:00
Sanjay Patel fe97f95036 [InstCombine] propagate "exact" through folds of div
These folds were added recently with:
6b869be810
8da2fa856f
...but they didn't account for the "exact" attribute,
and that can be safely propagated:
https://alive2.llvm.org/ce/z/F_WhnR
https://alive2.llvm.org/ce/z/ft9Cgr
2022-10-12 09:25:05 -04:00
Sanjay Patel d117ee25b8 [InstCombine] add helper function for div+shl folds; NFC
There are at least 2 similar patterns that could be added here,
and the existing fold can be improved because it fails to
propagate "exact".
2022-10-12 09:25:04 -04:00
Florian Hahn c1fe52bfa6
[VPlan] Remove dead recipes before sinking.
optimizeInductions may leave dead recipes which can prevent sinking.
Sinking on the other hand should not introduce new dead recipes, so
clean up dead recipes before sinking.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D133762
2022-10-12 12:49:42 +01:00
Max Kazantsev fbad5fdc03 [NFC] Perform all legality checks for non-trivial unswitch in one function
They have been scattered over the code. For better structuring, perform
them in one place. Potential CT drop is possible because we collect exit
blocks twice, but it's small price to pay for much better code structure.
2022-10-12 18:35:12 +07:00
Max Kazantsev 6bfcac612f [SimpleLoopUnswitch][NFC] Separate legality checks from cost computation
These are semantically two different stages, but were entwined in the
old implementation. Now cost computation does not do legality checks,
and they all are done beforehead.
2022-10-12 13:31:36 +07:00
Max Kazantsev 421728b40c [NFC] Factor out computation of best unswitch cost candidate
Split out a major peice of this method to make code more readable.
2022-10-12 12:36:46 +07:00
Fangrui Song 8ef3fd8d59 [LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally
For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its
leader has been made available_externally. This violates the COMDAT rule that
its members must be retained or discarded as a unit.

To fix this, update the regular LTO change D34803 to track local linkage
GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.)

This fixes two problems.

(a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to
linger around (unreferenced, hence benign), and is now correctly discarded.
```
int foo();
inline int v = foo();
```

(b) Fix https://github.com/llvm/llvm-project/issues/58215:
as a size optimization, we place private `__profd_` in a COMDAT with a
`__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to
a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change
makes the `__profd_` available_externally.

```
cat > c.h <<'eof'
extern void bar();
inline __attribute__((noinline)) void foo() {}
eof
cat > m1.cc <<'eof'
int main() {
  bar();
  foo();
}
eof
cat > m2.cc <<'eof'
__attribute__((noinline)) void bar() {
  foo();
}
eof

clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen
rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw

clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen
rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw
```

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D135427
2022-10-11 15:30:07 -07:00
Florian Hahn be611ef7fa
[LoopRotation] Also drop block dispositions.
LoopRotation may also fold basic blocks, so cached block dispositions
also need to be dropped.

Fixes #58291.
2022-10-11 15:25:27 +01:00
Sanjay Patel 7ec604a317 [InstCombine] try harder to cancel out mul/div
((Op1 * X) / Y) / Op1 --> X / Y
https://alive2.llvm.org/ce/z/JYxWjA

InstSimplify handles the more basic mul+div pattern with
shared operand, but we don't seem to have any reassociation
folds to handle cases where the common op is further away.

This is a generalization of 9cff4711ac and another
transform derived from issue #58137.
2022-10-11 09:51:51 -04:00
Max Kazantsev 91aa9097ae [NFC] Factor out collection of unswitch candidate to a separate function
Just to make the code more structured and easier to understand.
2022-10-11 19:35:16 +07:00
Max Kazantsev f18979912d [NFC] Refine API in SimpleLoopUnswitch: add missing const notions 2022-10-11 19:35:16 +07:00
Max Kazantsev a8a07890aa [NFC] Refine API: add missing const notion in hasPartialIVCondition 2022-10-11 19:35:16 +07:00
Nikita Popov df8264c46a [SimplifyLibCalls] Use helper methods to query attributes (NFC) 2022-10-11 11:41:28 +02:00
Daniel Sanders 4a95a64e4a [instcombine] (extelt (inselt Vec, Value, Index), Index) -> Value
When Index is variable but still trivially known to be equal we can use Value
from before the insertion, possibly eliminating the vector.

Reverts a functional change from:
Author: Philip Reames <listmail@philipreames.com>
Date:   Wed Dec 8 12:21:10 2021 -0800

    [instcombine] A couple style tweaks to visitExtractElementInst [nfc]

Thanks to Michele Scandale for identifying the bug

Differential Revision: https://reviews.llvm.org/D135625
2022-10-10 15:41:53 -07:00
Sanjay Patel baab4aa1ba [VectorCombine] convert scalar fneg with insert/extract to vector fneg
insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index --> shuffle DestVec, (fneg SrcVec), Mask

This is a specialized form of what could be a more general fold for a binop.
It's also possible that fneg is overlooked by SLP in this kind of
insert/extract pattern since it's a unary op.

This shows up in the motivating example from #issue 58139, but it won't solve
it (that probably requires some x86-specific backend changes). There are also
some small enhancements (see TODO comments) that can be done as follow-up
patches.

Differential Revision: https://reviews.llvm.org/D135278
2022-10-10 14:59:56 -04:00
Jordan Rupprecht fb27fd5f88 Revert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally"
This reverts commit 4fbe33593c. It causes linking errors, with details provided internally. (Hopefully the author/reviewers will be able to upstream the internal repro).
2022-10-10 11:40:45 -07:00
Florian Hahn 4b6bd1c9d5
[LoopSimplifyCFG] Clear SCEV dispositions when removing dead blocks.
When removing loops & blocks we also need to clear the SCEV dispositions
as they may now contain incorrect values.

Fixes #58262.
2022-10-10 18:08:35 +01:00
Florian Hahn 80e49f49e4
[ConstraintElimination] Bail out for GEPs with scalable vectors.
This fixes a crash with scalable vectors, thanks @nikic for spotting
this!
2022-10-10 16:01:20 +01:00
Shubham Narlawar b920407cf5 [LICM] Disable thread-safety checks in single-thread model
If the single-thread model is used, or the
-licm-force-thread-model-single flag is specified, skip checks
related to thread-safety. This means that store promotion for
conditionally executed stores only requires proof of
dereferenceability and writability, but not of thread-safety. For
example, this enables promotion of stores to (non-constant) globals,
as well as captured allocas.

Fixes https://github.com/llvm/llvm-project/issues/50537.

Differential Revision: https://reviews.llvm.org/D130466
2022-10-10 16:51:16 +02:00
Alex Brachet deb82d4a20 Revert "[PGO] Make emitted symbols hidden"
This reverts commit 4ea1a647ff.

This breaks on Darwin which tries to export these symbols
ebb258d3b0/clang/lib/Driver/ToolChains/Darwin.cpp (L1363)

I'll try to reland which that removed and approval from
Apple folks.
2022-10-10 14:37:59 +00:00
Sanjay Patel 9cff4711ac [InstCombine] fold udiv with common factor
((X *nuw Y) >> Z) / X --> Y >> Z

https://alive2.llvm.org/ce/z/x3kKnq

This is similar to 6b869be810 / 8da2fa856f, but I have
not found a signed equivalent, so it's just an unsigned
match for now.
2022-10-10 08:12:06 -04:00
Nikita Popov 874c0327e7 [Attributor] Use ConstantFoldLoadFromConst()
When determining the initial value of the object, use the constant
folding API to load a given type at a given offset in the global
initializer. This makes it work for cases where the load doesn't
directly correspond to an aggregate member.

Differential Revision: https://reviews.llvm.org/D135435
2022-10-10 10:17:37 +02:00
Florian Hahn fee8f561bd
[ConstraintElimination] Include index type scale.
The current decomposition for GEPs did not correctly handle cases where
GEPs access different source types. Adjust the constraints by including
the indexed type-size as coefficients.

Further generalization to allow GEPs with more than one index is a
needed general follow-up improvement.
2022-10-09 21:53:30 +01:00
luxufan eaf6e2fc33 [DSE] Relax constraint on isGuaranteedLoopInvariant
If the location ptr to be killed is in no loop and the Function does not
have irreducible loops, then we can regard it as loop invariant.

Differential Revision: https://reviews.llvm.org/D135369
2022-10-06 03:01:21 +00:00
Florian Hahn 11a6e64ba7
[ConstraintElim] Move logic to get constraint for solving to helper.
Move common logic shared by callers of getConstraint that use the result
to query the constraint system to a new helper getConstraintForSolving.

This includes common legality checks (i.e. not an equality constraint,
no new variables) and the logic to query the unsigned system if possible
for signed predicates.
2022-10-09 10:44:36 +01:00
Fangrui Song 4fbe33593c [LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally
See the updated linkonce_resolution_comdat.ll. For a local linkage GV in a
non-prevailing COMDAT, it remains defined while its leader has been made
available_externally. This violates the COMDAT rule that its members must be
retained or discarded as a unit.

To fix this, update the regular LTO change D34803 to track local linkage
GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.)

Fix https://github.com/llvm/llvm-project/issues/58215:
as a size optimization, we place private `__profd_` in a COMDAT with a
`__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to
a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change
makes the `__profd_` available_externally.

```
cat > c.h <<'eof'
extern void bar();
inline __attribute__((noinline)) void foo() {}
eof
cat > m1.cc <<'eof'
#include "c.h"
int main() {
  bar();
  foo();
}
eof
cat > m2.cc <<'eof'
#include "c.h"
__attribute__((noinline)) void bar() {
  foo();
}
eof

clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen
rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw
# one _Z3foov

clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen
rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw
# one _Z3foov
```

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D135427
2022-10-08 11:09:43 -07:00
Florian Hahn e0136a62cc
[ConstraintElimination] Support chained GEPs with constant offsets.
Handle the (gep (gep ....), C) case by incrementing the constant
coefficient of the inner GEP, if C is a constant.
2022-10-08 16:59:27 +01:00
Florian Hahn 73950f26f5
[LV] Replace check with assert for reduction resume values (NFC).
At this point, we need to have resume values for all inductions. If not,
this would result in silent mis-compiles.
2022-10-08 16:26:10 +01:00
Florian Hahn be858bda69
[ConstraintElimination] Remove unused function (NFC). 2022-10-08 16:05:56 +01:00
Sanjay Patel eccb9a77c6 [InstCombine] fold exact sdiv to ashr (2nd try)
The 1st attempt failed to updated the test checks as expected.

Original commit message:

sdiv exact X, (1<<ShAmt) --> ashr exact X, ShAmt (if shl is non-negative)

https://alive2.llvm.org/ce/z/kB6VF7

It would probably be better to use ValueTracking to replace this
and the existing transform above it, but the analysis does not
account for the no-wrap properly, and it's not immediately clear
to me how to fix it.
2022-10-08 10:09:44 -04:00
Sanjay Patel 68d4dbc2c1 Revert "[InstCombine] fold exact sdiv to ashr"
This reverts commit fe15290e0c.
The test checks were not updated as expected.
2022-10-08 10:02:03 -04:00
Sanjay Patel fe15290e0c [InstCombine] fold exact sdiv to ashr
sdiv exact X, (1<<ShAmt) --> ashr exact X, ShAmt (if shl is non-negative)

https://alive2.llvm.org/ce/z/kB6VF7

It would probably be better to use ValueTracking to replace this
and the existing transform above it, but the analysis does not
account for the no-wrap properly, and it's not immediately clear
to me how to fix it.
2022-10-08 09:23:46 -04:00
Florian Hahn 9d31d1c214
[ConstraintElimination] Use logic from 3771310eed for queries only.
The logic added in 3771310eed was placed sub-optimally. Applying the
transform in ::getConstraint meant that it would also impact conditions
that are added to the system by the signed <-> unsigned transfer logic.

This meant we failed to add some signed facts to the signed system. To
make sure we still add as many useful facts to the signed/unsigned
systems, move the logic to the point where we query the system.
2022-10-08 11:03:45 +01:00
Florian Hahn 13ac102726
[LoopSimplifyCFG] Invalidate SCEV dispositions.
Clear all dispositions if there are any dead blocks (which will get
removed later) and also clear dispositions for removed instructions.

Clearing all dispositions in case there are dead blocks happens first,
which should avoid traversing SCEV use-lists for invalidating
dispositions for individual values.

Fixes #58179.
2022-10-07 21:35:42 +01:00
Florian Hahn 19ad1cd5ce
Recommit "[SCEV] Support clearing Block/LoopDispositions for a single value."
This reverts commit 92f698f01f.

The updated version of the patch includes handling for non-SCEVable
types. A test case has been added in ec86e9a99b.
2022-10-07 20:15:44 +01:00
Philip Reames cb66e123c6 Remove PlaceSafepoints pass
This patch was added way back in the beginning of the work which became the statepoint infrastructure. The idea was that safepoints could be inserted late in the optimization pipeline. This is true if the only concern is garbage collection, but this approach turned out to be incompatible with the requirement to also support deoptimization at safepoints.

In theory, this pass would still be quite useful for an AOT compiled language which wants to support garbage collection, but we have no known users, and haven't for over 5 years. Time to remove unused code. If someone wants to use this, restoring it would not be hard. The immediate motivation for removal is that this is one of the last passes remaining which hasn't been ported to the new pass manager and the (straight forward) work to do so is not justified for unused code.

Differential Revision: https://reviews.llvm.org/D135371
2022-10-07 11:51:00 -07:00
Sanjay Patel 3e6767ed5f [InstCombine] propagate 'exact' when converting ashr to lshr
The shift amount is not changing, so if we guaranteed
shifting out zeros before, those bits are still zeros.

https://alive2.llvm.org/ce/z/sokQca
2022-10-07 13:17:19 -04:00
Florian Hahn 92f698f01f
Revert "[SCEV] Support clearing Block/LoopDispositions for a single value."
This reverts commit 9e931439dd.

This commit causes a crash when TSan, e.g. with
https://lab.llvm.org/buildbot/#/builders/70/builds/28309/steps/10/logs/stdio

Reverting while I extract a reproducer and submit a fix.
2022-10-07 17:58:54 +01:00
Sanjay Patel bdfefac9a4 [InstCombine] refactor sdiv by (negative) power-of-2 folds; NFCI
It's probably better to try harder on this kind of
pattern by using ValueTracking.
2022-10-07 11:35:17 -04:00
Florian Hahn 9e931439dd
[SCEV] Support clearing Block/LoopDispositions for a single value.
Extend forgetBlockAndLoopDisposition to allow clearing information for a
single value. This can be useful when only a single value is changed,
e.g. because the instruction is moved.

We also need to clear the cached values for all SCEV users, because they
may depend on the starting value's disposition.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D134614
2022-10-07 16:07:17 +01:00
Florian Hahn 3771310eed
[ConstraintElimination] Convert to unsigned Pred if possible.
Convert SLE/SLT predicates to unsigned equivalents if both operands are
known to be signed-positive.

https://alive2.llvm.org/ce/z/tBeiZr
2022-10-07 12:27:36 +01:00
Nikita Popov b43a4d0850 [LoopPeeling] Support peeling loops with non-latch exits
Loop peeling currently requires that a) the latch is exiting
b) a branch and c) other exits are unreachable/deopt. This patch
removes all of these limitations, and adds the necessary branch
weight updating support. It essentially works the same way as
before with latch -> exiting terminator and
loop trip count -> per exit trip count.

It's worth noting that there are still other limitations in
profitability heuristics: This patch enables peeling of loops to
make conditions invariant (which is pretty much always highly
profitable if possible), while peeling to make loads dereferenceable
still checks that non-latch exits are unreachable and PGO-based
peeling has even more conditions. Those checks could be relaxed
later if we consider those cases profitable.

The motivation for this change is that loops using iterator adaptors
in Rust often optimize very badly, and end up with a loop phi of the
form phi(true, false) in the final result. Peeling eliminates that
phi and conditions based on it, which enables a lot of follow-on
simplification.

Differential Revision: https://reviews.llvm.org/D134803
2022-10-07 12:35:52 +02:00
Nikita Popov ccf53cae32 [ValueTracking] Remove unused Offset argument in getConstantStringInfo() (NFC) 2022-10-07 11:35:55 +02:00
Dmitry Makogon 8307f6c854 [LoopPredication] Insert assumes of conditions of predicated guards
As LoopPredication performs non-equivalent transforms removing some
checks from loops, other passes may not be able to perform transforms
they'd be able to do if the checks were left in loops.

This patch makes LoopPredication insert assumes of the replaced
conditions either after a guard call or in the true block of
widenable condition branch.

Differential Revision: https://reviews.llvm.org/D135354
2022-10-07 16:10:24 +07:00
Nikita Popov 333246b48e Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
Relative to the previous attempt, this adjusts simplification to
use the correct context instruction: We need to use the terminator
of the incoming block, not the original instruction.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
2022-10-07 11:04:19 +02:00
Alina Sbirlea b9898e7ed1 Revert "Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify"
This reverts commit e94619b955.
2022-10-06 13:12:24 -07:00
Alexey Bataev 323ed2308a [SLP]Improve/fix CSE analysis of the blocks/instructions.
Added analysis for invariant extractelement instructions and improved
detection of the CSE blocks for generated extractelement instructions.

Differential Revision: https://reviews.llvm.org/D135279
2022-10-06 12:08:48 -07:00
Alex Brachet 4ea1a647ff [PGO] Make emitted symbols hidden
Differential Revision: https://reviews.llvm.org/D135340
2022-10-06 18:28:16 +00:00
Bjorn Pettersson 0db4b1d1a8 [SimplifyLibCalls] Adjust code comment in optimizeStringLength. NFC
The limitation in LibCallSimplifier::optimizeStringLength to only
optimize when the string is an i8 array was changed already in
commit 50ec0b5dce back in 2017.

We still only simplify when 's' points at an array of 'CharSize', so
the comment is still valid in the sense that we do not support
arbitrary array types.

Differential Revision: https://reviews.llvm.org/D135261
2022-10-06 20:00:27 +02:00
Arthur Eubanks ae5733346f Revert "[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI"
This reverts commit cd8f3e7581.

Causes miscompiles, see D132657
2022-10-06 10:36:02 -07:00
Sanjay Patel 8da2fa856f [InstCombine] fold sdiv with hidden common factor
(X * Y) s/ (X << Z) --> Y s/ (1 << Z)

https://alive2.llvm.org/ce/z/yRSddG

issue #58137
2022-10-06 13:11:50 -04:00
Florian Hahn a7ac0dd0cf
[ConstraintElimination] Generalize AND matching.
Extend more general matching used for chains of ORs to also support
chains of ANDs.
2022-10-06 17:17:38 +01:00
Sanjay Patel 6b869be810 [InstCombine] fold udiv with hidden common factor
(X * Y) u/ (X << Z) --> Y u>> Z

https://alive2.llvm.org/ce/z/4G9D_W
2022-10-06 11:35:27 -04:00
Florian Hahn 8e3e96298f
[ConstraintElimination] Order cmps for signed <-> unsigned transfer first.
Make sure conditions with constant operands come before conditions
without constant operands. This increases the effectiveness of the
current signed <-> unsigned fact transfer logic.
2022-10-06 15:56:25 +01:00
Florian Hahn 349375d093
[ConstraintElimination] Generalize OR matching.
Extend OR handling to traverse chains of ORs.
2022-10-06 11:56:22 +01:00
Nikita Popov 028874dd61 [Local] Fix unused variable warnings (NFC) 2022-10-06 10:30:59 +02:00
Florian Hahn 7449570ff7
[ConstraintElimination] Use ConstraintTy::IsSigned instead of Predicate.
This should be NFC and ensure the sign of the constraint is used
consistently in the future.
2022-10-06 07:51:49 +01:00
Carl Ritson c316332e17 [Sink] Allow sinking of invariant loads across critical edges
Invariant loads can always be sunk.

Reviewed By: foad, arsenm

Differential Revision: https://reviews.llvm.org/D135133
2022-10-06 09:21:12 +09:00
Florian Hahn 9aa004a04c
[ConstraintElimination] Convert NewIndices to vector and rename (NFCI).
The callers of getConstraint only require a list of new variables.
Update the naming and types to make this clearer.
2022-10-05 16:25:00 +01:00
Johannes Doerfert e18736149c [Attributor] Teach AAPointerInfo about atomic cmxchg and rmw
The atomic operations behave similar to a store except that we don't
know the new value and we read the result first.
2022-10-05 06:48:00 -07:00
Johannes Doerfert 93e51fa444 [Attributor] AAPointerInfo can model non-escaping call uses
If a call base use will not capture a pointer we can approximate the
effects. This is important especially for readnone/only uses. Even
may-write uses are not too bad with reachability in place. Capturing
is the problem as we loose track of update sides.
2022-10-05 06:29:14 -07:00
Johannes Doerfert 477e8e10f0 [Attributor] Teach AAPointerInfo to look into aggregates
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
2022-10-05 06:19:47 -07:00
Nikita Popov 5fa14ee835 [MemCpyOpt] Don't hoist above producer of pointer operand
This was already handled correctly below, but not checked for the
original store pointer operand. Encountered when converting tests
to opaque pointers, where the intermediate bitcast goes away.
2022-10-05 14:52:33 +02:00
David Stuttard d1d7d2235c [AggressiveInstCombine] Fix cases where non-opaque pointers are used
In the case of non-opaque pointers, when combining consecutive loads,
need to bitcast the pointer source to the combined type size, otherwise
asserts are triggered.

Differential Revision: https://reviews.llvm.org/D135249
2022-10-05 13:42:46 +01:00
Nikita Popov e94619b955 Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
The infinite loop seen on buildbots should be fixed by
11897708c0 (assuming there are not
multiple infinite combine loops...)

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
2022-10-05 14:00:19 +02:00
Nikita Popov 11897708c0 [InstCombine] Directly replace instr in foldIntegerTypedPHI() (NFCI)
Rather than inserting a ptrtoint + inttoptr pair, directly replace
the inttoptr with the new phi node. This ensures that no other
transform can undo it before the pair gets folded away.

This avoids the infinite loop when combined with D134954.

This is NFCI in the sense that it shouldn't make a difference, but
could due to different worklist order.
2022-10-05 13:28:23 +02:00
Florian Hahn 469f0fc6a6
[SimpleLoopUnswitch] Clear dispos in deleteDeadBlocksFromLoop.
SimpleLoopUnswitch may remove blocks from loops. Clear block and loop
dispositions in that case, to clean up invalid entries in the cache.

Fixes #58158.

Fixes #58159.
2022-10-05 10:28:15 +01:00
Johannes Doerfert a9557115b4 [Attributor] Qualify variables to avoid clashes in the future 2022-10-04 19:43:04 -07:00
Gulfem Savrun Yeniceri d7592bbb03 Revert "Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify"
This reverts commit e1dd2cd063 because
the original commit b20e34b39f had
a dramatic increase in the build time of RTfuzzer, which caused Fuchsia
Clang toolchain builders to timeout:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8801248587754572961/overview
2022-10-04 20:57:34 +00:00
Florian Hahn 4f827318e3
[LoopVersioning,LLE] Clear LoopAccessInfoManager after making changes.
Loop versioning changes the control-flow, which may impact SCEVs cached
by for other loops in LoopAccessInfoManager. Clear the manager after
making changes.

Fixes #57825.

Depends on D134609.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134611
2022-10-04 21:35:42 +01:00
Ram-NK a58b6acf1f [NFC][LoopInterchange] Clean up of irrelevent dependency checking with
isOuterMostDepPositive()

The function isOuterMostDepPositive() is checked after negative dependence
vectors are normalized to be non-negative, so there will not be any negative
dependency ('>' as the outermost non-equal sign) after normalization. And
therefore the check in isOuterMostDepPositive() is irrelevent and redundant.

Reviewed By: congzhe

Differential Revision: https://reviews.llvm.org/D132982
2022-10-04 14:54:08 -04:00
Alexey Bataev ab9a81f736 [SLP]Try to emit canonical shuffle with undef operand.
In the canonical form of the shuffle the poison/undef operand is the
second operand, the patch tries to emit canonical form for partial
vectorization of the buildvector sequence.
Also, this patch starts emitting freeze instruction for shuffles with undef indices if the second shuffle operan is undef, not poison. It is an initial step to D93818, where undef mask element are treated as returning poison value.

Differential Revision: https://reviews.llvm.org/D134377
2022-10-04 08:16:07 -07:00
Nikita Popov e1dd2cd063 Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
Reapply with a fix for the case where an operand simplified back
to the original phi: We need to map this case to the new phi node.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
2022-10-04 15:18:34 +02:00
Alex Richardson 16f9c5577d [SimplifyLibCalls] Retain attributes added by Builder.CreateMem*
This currently does not make much of a difference (only one tests is
affected), but it is helpful e.g. for the out-of-tree CHERI target where
Builder.CreateMemCpy() can add attributes other than parameter alignment.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D135075
2022-10-04 13:11:34 +00:00
Bjorn Pettersson 491ac8f3e8 [LibCalls] Cast Char argument to 'int' before calling emitFPutC
The helpers in BuildLibCalls normally expect that the Value
arguments already have the correct type (matching the lib call
signature). And exception has been emitFPutC which casted the Char
argument to 'int' using CreateIntCast. This patch moves the cast to
the caller instead of doing it inside emitFPutC.

I think it makes sense to make the BuildLibCall API:s a bit
more consistent this way, despite the need to handle the int cast
in two different places now.

Differential Revision: https://reviews.llvm.org/D135066
2022-10-04 12:52:05 +02:00
Bjorn Pettersson aa1b64cc42 [BuildLibCalls] Use TLI to get 'int' and 'size_t' type sizes
Stop assuming that an 'int' is 32 bits in helpers that emit libcalls
to lib functions that had 'int' in the signature. For most targets
this is NFC. For a target with 16 bit 'int' type this could help out
detecting if trying to emit a libcall with incorrect signature.

Similarly we now derive the type mapping to 'size_t' by asking TLI
about the size of 'size_t'. This should be NFC (at least for in-tree
targets) since getSizeTSize(), in TLI, is deriving the size in the
same way as DataLayout::getIntPtrType().

Differential Revision: https://reviews.llvm.org/D135065
2022-10-04 12:52:05 +02:00
Bjorn Pettersson 73e8d95d28 [BuildLibCalls] Name types to identify when 'int' and 'size_t' is assumed. NFC
Lots of BuildLibCalls helpers are using Builder::getInt32Ty to get
a type matching an 'int', and DataLayout::getIntPtrType to get a
type matching 'size_t'. The former is not true for all targets, since
and 'int' isn't always 32 bits. And the latter is a bit weird as well
as the definition of DataLayout::getIntPtrType isn't clearly mapping
it to 'size_t'.

This patch is not aiming at solving any such problems. It is merely
highlighting when a libcall is expecting to use 'int' and 'size_t'
by naming the types as IntTy and SizeTTy when preparing the type
signatures for the emitted libcalls.

Differential Revision: https://reviews.llvm.org/D135064
2022-10-04 12:52:05 +02:00
Florian Hahn 825e16969e
[LAA] Pass LoopAccessInfoManager instead of GetLAA function.
Use LoopAccessInfoManager directly instead of various GetLAA lambdas.

Depends on D134608.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134609
2022-10-04 11:51:25 +01:00
Florian Hahn e399dd601f
[SimpleLoopUnswitch] Clear block and loop dispos after destroying loop.
SimpleLoopUnswitch may remove loops. Clear block and loop dispositions,
to clean up invalid entries in the cache.

Fixes #58136.
2022-10-04 10:27:52 +01:00
Nikita Popov 635f93dff7 [SimplifyLibCalls] Place deref attr even if nonnull already set
If nonnull is already set, we currently skip setting both nonnull
and dereferenceable. Make these independent, to avoid regressions
when additional nonnull attributes are inferred earlier.
2022-10-04 11:26:15 +02:00
Nikita Popov 0f32f0e147 Revert "[InstCombine] Switch foldOpIntoPhi() to use InstSimplify"
This reverts commit b20e34b39f.

This causes RAUW type mismatch assertions on some buildbots,
reverting for now.
2022-10-04 11:17:09 +02:00
Nikita Popov b20e34b39f [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
2022-10-04 10:12:14 +02:00
Florian Hahn b2daeb9706
[ConstraintElimination] Re-enable debug print when adding facts. (NFC)
Also add test coverage for important debug output.
2022-10-03 20:51:58 +01:00
Florian Hahn bdfe986d66
[ConstraintElimination] Simplify logic for using inverse predicate (NFC)
Recent improvements to the code structure mean we don't need to reset
the condition's predicate in the IR and later restore it. Remove the
restorer logic.
2022-10-03 18:35:59 +01:00
Florian Hahn 017552e8b8
[ConstraintElimination] Remove stray comment (NFC).
The comment doens't apply in the current context, remove it.
2022-10-03 18:20:48 +01:00
Florian Hahn a2efc29e99
[ConstraintElimination] Remove unused StackEntry::IsNot field. (NFC)
The field is no unused and can be removed.
2022-10-03 18:05:25 +01:00
Alex Richardson 3890a456d8 [SimplifyLibCalls] Reduce code duplication. NFC
Reviewed By: nikic, nickdesaulniers, xbolva00

Differential Revision: https://reviews.llvm.org/D135073
2022-10-03 15:44:00 +00:00
Yevgeny Rouban e351821088 Fix compilation of CodeLayout.cpp for MacOS
llvm/lib/Transforms/Utils/CodeLayout.cpp uses std::abs() with double argument,
which is provided by cmath header, which is not explicitly included into CodeLayout.cpp.
The implicit include in llvm/include/llvm/Support/MathExtras.h was removed in
commit 16544cbe64

Inserting explicit include of cmath into CodeLayout.cpp in order to fix build on MacOS.

Committed on behalf of alsemenov (Aleksei Semenov)
Reviewed By: thieta
Differential Revision: https://reviews.llvm.org/D135072
2022-10-03 21:47:43 +07:00
Bjorn Pettersson 66fcdfca4d [Analysis][SimplifyLibCalls] Refactor code related to size_t in lib func signatures. NFC
Added a helper in TargetLibraryInfo to get size of "size_t" in bits,
given a Module reference. The new getSizeTSize helper is using the
same strategy as for example isValidProtoForLibFunc has been using
in the past, assuming that the size can be derived by asking
DataLayout about the size/type of a pointer to int.

FortifiedLibCallSimplifier::optimizeStrpCpyChk was changed to use
the new getSizeTSize helper instead of assuming that sizeof(size_t)
is equal to sizeof(int*) by itself (that is the assumption used in
TargetLibraryInfoImpl::getSizeTSize so the result will be the same).

Having a common helper for this ensure that we use the same strategy
when deriving the size of "size_t" in different parts of the code.
One bonus with this refactoring (basing it on Module instead of just
DataLayout) is that it makes it easier to override this for a specific
target triple, in case the assumption of using getPointerSizeInBits
wouldn't hold.

Differential Revision: https://reviews.llvm.org/D110585
2022-10-03 12:02:50 +02:00
Sanjay Patel 2e87333bfe [InstCombine] convert mul by negative-pow2 to negate and shift
This is an unusual canonicalization because we create an extra instruction,
but it's likely better for analysis and codegen (similar reasoning as D133399).

InstCombine::Negator may create this kind of multiply from negate and shift,
but this should not conflict because of the narrow negation.

I don't know how to create a fully general proof for this kind of transform in
Alive2, but here's an example with bitwidths similar to one of the regression
tests:
https://alive2.llvm.org/ce/z/J3jTjR

Differential Revision: https://reviews.llvm.org/D133667
2022-10-02 12:22:25 -04:00
Florian Hahn 3fe6ddd999
[ConstraintElimination] Update Changed status in ssub simplification.
Update tryToSimplifyOverflowMath to indicate whether the function made
any changes to the IR.
2022-10-02 14:25:51 +01:00
Arthur Eubanks 5df4ab55f9 [llvm] Migrate PAEval to new pass manager 2022-10-01 16:41:58 -07:00
Florian Hahn 7c0ff64b0f
[LAA] Change to function analysis for new PM.
At the moment, LoopAccessAnalysis is a loop analysis for the new pass
manager. The issue with that is that LAI caches SCEV expressions and
modifications in a loop may impact SCEV expressions in other loops, but
we do not have a convenient way to invalidate LAI for other loops
withing a loop pipeline.

To avoid this issue, turn it into a function analysis which returns a
manager object that keeps track of the individual LAI objects per loop.

Fixes #50940.

Fixes #51669.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134606
2022-10-01 15:44:27 +01:00
Teresa Johnson 43417d8159 [MemProf] Update metadata during inlining
Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original).

Depends on D128142.

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D128143
2022-09-30 19:21:15 -07:00
Teresa Johnson 4d243348fb Revert "[MemProf] Update metadata during inlining" and preceeding commit
This reverts commit 0d7f3464ce and
commit f9403ca41e. The latter was
"Profile matching and IR annotation for memprof profiles." and was left
from a bad rebase from a commit already pushed upstream.
2022-09-30 17:01:30 -07:00
Teresa Johnson 0d7f3464ce [MemProf] Update metadata during inlining
Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.

Depends on D128142.

Differential Revision: https://reviews.llvm.org/D128143
2022-09-30 16:46:17 -07:00
Teresa Johnson f9403ca41e Profile matching and IR annotation for memprof profiles.
See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]*

* Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceeding patch adding the basic
metadata and verification support.

The matching is performed during the normal PGO annotation phase, to
ensure that the inlines applied in the IR at that point are a subset
of the inlines in the profiled binary and thus reflected in the
profile's call stacks. This is important because the call frames are
associated with functions in the profile based on the inlining in the
symbolized call stacks, and this simplifies locating the subset of
profile data relevant for matching onto each function's IR.

The PGOInstrumentationUse pass is enhanced to perform matching for
whatever combination of memprof and regular PGO profile data exists in
the profile.

Using the utilities introduced in D128854:
The memprof profile data for each context is converted to "cold" or
"notcold" based on parameterized thresholds for size, access count, and
lifetime. The memprof allocation contexts are trimmed to the minimal
amount of context required to uniquely identify whether the context is
cold or not cold. For allocations where all profiled contexts have the
same allocation type, no memprof metadata is attached and instead the
allocation call is directly annotated with an attribute specifying the
alloction type. This is the same attributed that will be applied to
allocation calls once cloned for different contexts, and later used
during LibCall simplification to emit allocation hints [4].

Depends on D128141 and D128854.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
[2] https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
[3] https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165
[4] ab87cf382d

Differential Revision: https://reviews.llvm.org/D128142
2022-09-30 16:46:17 -07:00
Sanjay Patel 2053070443 [SCCP] remove unnecessary check for constant when folding sext->zext
I'm not sure how to test this because we seem to constant-fold
all examples already. We changed this code to use the common
isNonNegative() helper, so it should not be necessary to avoid
a constant. This makes the code uniform for all transforms.
2022-09-30 17:26:10 -04:00
Sanjay Patel 89a5d804c1 [SCCP] add a code comment about sitofp -> uitofp; NFC
D134975 would have added this fold, but we decided it's
not worth doing without some evidence of benefit.
2022-09-30 17:26:10 -04:00
Florian Hahn 04c711c78d
[ConstraintElimination] Make sure the variable is available before use.
This fixes a crash when trying to access an index for a value where we
don't have a known index.

Fixes #58009.
2022-09-30 18:09:01 +01:00
Nikita Popov d40dcb0b8d [LICM] Collect more scalar promotion stats (NFC)
Collect more statistics for scalar promotion. In particular,
keep track of how many promotion candidates there were, and
whether it is a load or a load/store promotion.
2022-09-30 16:07:52 +02:00
Simon Pilgrim 5849fcb635 Revert rG1b7089fe67b924bdd5ecef786a34bdba7a88778f "[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions"
Revert rGef89409a59f3b79ae143b33b7d8e6ee6285aa42f "Fix 'unused-lambda-capture' gcc warning. NFCI."
Revert rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 "[SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds"

Revert ScalarizationOverheadBuilder sequence from D134605 - when accumulating extraction costs by Type (instead of specific Value), we are not distinguishing enough when they are coming from the same source or not, and we always just count the cost once. This needs addressing before we can use getScalarizationOverhead properly.
2022-09-30 11:22:48 +01:00
Florian Hahn 8ae0d9aa07
[LoopDeletion] Clear block & loop dispo cache after breaking backedge.
breakLoopBackedge may remove blocks and loops. Also clear block &
loop disposition to avoid the cache containing invalid blocks and loops.

The coverage for the change is provided when using an ASAN build of opt
to run the LoopDeletion unit tests; without the fix, pointers to invalid
objects would be used.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D134663
2022-09-30 11:21:58 +01:00