31623 Commits

Author SHA1 Message Date
Sanjay Patel 2e87333bfe [InstCombine] convert mul by negative-pow2 to negate and shift
This is an unusual canonicalization because we create an extra instruction,
but it's likely better for analysis and codegen (similar reasoning as D133399).

InstCombine::Negator may create this kind of multiply from negate and shift,
but this should not conflict because of the narrow negation.

I don't know how to create a fully general proof for this kind of transform in
Alive2, but here's an example with bitwidths similar to one of the regression
tests:
https://alive2.llvm.org/ce/z/J3jTjR
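
As a complement to the Alive2 link, the identity behind the transform can be checked exhaustively at a small bit width. The sketch below is not the InstCombine code; it models an 8-bit integer with wrapping arithmetic and compares `x * -(2^k)` against `(0 - x) << k` for every value. The function names are invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if x * -(2^k) equals (0 - x) << k for every 8-bit x,
// using wrapping (modulo 256) arithmetic as a stand-in for an i8 value.
bool negPow2MulMatchesNegShl(unsigned k) {
  assert(k < 8 && "shift amount must be smaller than the bit width");
  // -(2^k) reduced modulo 256.
  const uint8_t negPow2 = static_cast<uint8_t>(0u - (1u << k));
  for (unsigned v = 0; v < 256; ++v) {
    const uint8_t x = static_cast<uint8_t>(v);
    // Original form: multiply by the negative power of two.
    const uint8_t viaMul = static_cast<uint8_t>(x * negPow2);
    // Canonical form: negate first, then shift left by k.
    const uint8_t negated = static_cast<uint8_t>(0u - x);
    const uint8_t viaShl = static_cast<uint8_t>(negated << k);
    if (viaMul != viaShl)
      return false;
  }
  return true;
}

// Checks every legal shift amount for the 8-bit model.
bool holdsForAllShifts() {
  for (unsigned k = 0; k < 8; ++k)
    if (!negPow2MulMatchesNegShl(k))
      return false;
  return true;
}
```

This only covers one bit width and ignores wrap flags such as nsw, so it is a sanity check rather than a proof.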

Differential Revision: https://reviews.llvm.org/D133667
2022-10-02 12:22:25 -04:00
Florian Hahn 3fe6ddd999
[ConstraintElimination] Update Changed status in ssub simplification.
Update tryToSimplifyOverflowMath to indicate whether the function made
any changes to the IR.
2022-10-02 14:25:51 +01:00
Arthur Eubanks 5df4ab55f9 [llvm] Migrate PAEval to new pass manager 2022-10-01 16:41:58 -07:00
Florian Hahn 7c0ff64b0f
[LAA] Change to function analysis for new PM.
At the moment, LoopAccessAnalysis is a loop analysis for the new pass
manager. The issue with that is that LAI caches SCEV expressions and
modifications in a loop may impact SCEV expressions in other loops, but
we do not have a convenient way to invalidate LAI for other loops
within a loop pipeline.

To avoid this issue, turn it into a function analysis which returns a
manager object that keeps track of the individual LAI objects per loop.
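
The shape of such a manager object can be sketched roughly as follows. This is a simplified model, not the LLVM class: `LoopInfoResult`, `PerLoopResultManager`, and the use of loop names as keys are all assumptions made for the illustration. The point is that one function-level object owns the per-loop results, so all of them can be dropped together.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a per-loop analysis result.
struct LoopInfoResult {
  std::string loopName;
};

// Hypothetical manager: one object per function, caching results per loop.
class PerLoopResultManager {
  std::map<std::string, std::unique_ptr<LoopInfoResult>> cache;
  unsigned computations = 0;

public:
  // Compute the result for a loop on first request, then reuse it.
  const LoopInfoResult &getInfo(const std::string &loopName) {
    assert(!loopName.empty() && "a loop needs a name in this model");
    std::unique_ptr<LoopInfoResult> &slot = cache[loopName];
    if (!slot) {
      slot = std::make_unique<LoopInfoResult>(LoopInfoResult{loopName});
      ++computations;
    }
    return *slot;
  }

  // Drop every cached result at once, including results for other loops.
  void clear() { cache.clear(); }

  unsigned numComputations() const { return computations; }
  std::size_t numCached() const { return cache.size(); }
};
```

Because the cache lives at function scope, a change in one loop can invalidate the results of every other loop with a single `clear()` call.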

Fixes #50940.

Fixes #51669.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134606
2022-10-01 15:44:27 +01:00
Teresa Johnson 43417d8159 [MemProf] Update metadata during inlining
Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original).

Depends on D128142.

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D128143
2022-09-30 19:21:15 -07:00
Teresa Johnson 4d243348fb Revert "[MemProf] Update metadata during inlining" and preceding commit
This reverts commit 0d7f3464ce and
commit f9403ca41e. The latter was
"Profile matching and IR annotation for memprof profiles." and was left
from a bad rebase from a commit already pushed upstream.
2022-09-30 17:01:30 -07:00
Teresa Johnson 0d7f3464ce [MemProf] Update metadata during inlining
Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.

Depends on D128142.

Differential Revision: https://reviews.llvm.org/D128143
2022-09-30 16:46:17 -07:00
Teresa Johnson f9403ca41e Profile matching and IR annotation for memprof profiles.
See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]*

* Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceding patch adding the basic
metadata and verification support.

The matching is performed during the normal PGO annotation phase, to
ensure that the inlines applied in the IR at that point are a subset
of the inlines in the profiled binary and thus reflected in the
profile's call stacks. This is important because the call frames are
associated with functions in the profile based on the inlining in the
symbolized call stacks, and this simplifies locating the subset of
profile data relevant for matching onto each function's IR.

The PGOInstrumentationUse pass is enhanced to perform matching for
whatever combination of memprof and regular PGO profile data exists in
the profile.

Using the utilities introduced in D128854:
The memprof profile data for each context is converted to "cold" or
"notcold" based on parameterized thresholds for size, access count, and
lifetime. The memprof allocation contexts are trimmed to the minimal
amount of context required to uniquely identify whether the context is
cold or not cold. For allocations where all profiled contexts have the
same allocation type, no memprof metadata is attached and instead the
allocation call is directly annotated with an attribute specifying the
allocation type. This is the same attribute that will be applied to
allocation calls once cloned for different contexts, and later used
during LibCall simplification to emit allocation hints [4].

Depends on D128141 and D128854.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
[2] https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
[3] https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165
[4] ab87cf382d

Differential Revision: https://reviews.llvm.org/D128142
2022-09-30 16:46:17 -07:00
Sanjay Patel 2053070443 [SCCP] remove unnecessary check for constant when folding sext->zext
I'm not sure how to test this because we seem to constant-fold
all examples already. We changed this code to use the common
isNonNegative() helper, so it should not be necessary to avoid
a constant. This makes the code uniform for all transforms.
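
The fold rests on the fact that sign extension and zero extension agree exactly when the sign bit is clear. A minimal sketch of that fact for an 8-bit to 32-bit extension is below; the helper names are invented here, and `isNonNegative8` is a local stand-in, not the isNonNegative() helper mentioned above.

```cpp
#include <cstdint>

// Sign-extend an 8-bit pattern to 32 bits by replicating bit 7 upward.
uint32_t signExtend8To32(uint8_t bits) {
  return (bits & 0x80u) ? (0xFFFFFF00u | bits) : bits;
}

// Zero-extend an 8-bit pattern to 32 bits.
uint32_t zeroExtend8To32(uint8_t bits) { return bits; }

// Local stand-in: a value is non-negative when its sign bit is clear.
bool isNonNegative8(uint8_t bits) { return (bits & 0x80u) == 0; }

// Returns true if, for every 8-bit pattern, the two extensions are equal
// exactly when the value is non-negative.
bool extensionsAgreeExactlyWhenNonNegative() {
  for (unsigned v = 0; v < 256; ++v) {
    const uint8_t bits = static_cast<uint8_t>(v);
    const bool same = signExtend8To32(bits) == zeroExtend8To32(bits);
    if (same != isNonNegative8(bits))
      return false;
  }
  return true;
}
```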
2022-09-30 17:26:10 -04:00
Sanjay Patel 89a5d804c1 [SCCP] add a code comment about sitofp -> uitofp; NFC
D134975 would have added this fold, but we decided it's
not worth doing without some evidence of benefit.
2022-09-30 17:26:10 -04:00
Florian Hahn 04c711c78d
[ConstraintElimination] Make sure the variable is available before use.
This fixes a crash when trying to access an index for a value where we
don't have a known index.

Fixes #58009.
2022-09-30 18:09:01 +01:00
Nikita Popov d40dcb0b8d [LICM] Collect more scalar promotion stats (NFC)
Collect more statistics for scalar promotion. In particular,
keep track of how many promotion candidates there were, and
whether it is a load or a load/store promotion.
2022-09-30 16:07:52 +02:00
Simon Pilgrim 5849fcb635 Revert rG1b7089fe67b924bdd5ecef786a34bdba7a88778f "[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions"
Revert rGef89409a59f3b79ae143b33b7d8e6ee6285aa42f "Fix 'unused-lambda-capture' gcc warning. NFCI."
Revert rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 "[SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds"

Revert ScalarizationOverheadBuilder sequence from D134605 - when accumulating extraction costs by Type (instead of specific Value), we are not distinguishing enough when they are coming from the same source or not, and we always just count the cost once. This needs addressing before we can use getScalarizationOverhead properly.
2022-09-30 11:22:48 +01:00
Florian Hahn 8ae0d9aa07
[LoopDeletion] Clear block & loop dispo cache after breaking backedge.
breakLoopBackedge may remove blocks and loops. Also clear block &
loop disposition to avoid the cache containing invalid blocks and loops.

The coverage for the change is provided when using an ASAN build of opt
to run the LoopDeletion unit tests; without the fix, pointers to invalid
objects would be used.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D134663
2022-09-30 11:21:58 +01:00
Florian Hahn 9933a2e9fd
[SCEVExpander] Move LCSSA fixup to ::expand.
Move LCSSA fixup from ::expandCodeForImpl to ::expand(). This has
the advantage that we directly preserve LCSSA nodes here instead of
relying on doing so in rememberInstruction. It also ensures that we
don't add the non-LCSSA-safe value to InsertedExpressions.

Alternative to D132704.

Fixes #57000.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D134739
2022-09-29 20:49:56 +01:00
luxufan f079ba76cf [DSE] Eliminate noop store even though there is clobbering between LoadI and StoreI
For a noop store of the form LoadI followed by StoreI, the invariant
that must hold is that the memory state of the related MemoryLoc
before LoadI is the same as before StoreI.
For this example:
```
define void @pr49927(i32* %q, i32* %p) {
  %v = load i32, i32* %p, align 4
  store i32 %v, i32* %q, align 4
  store i32 %v, i32* %p, align 4
  ret void
}
```
Here the definition of the store's destination differs from the
definition of the load's location, which seems to break the invariant
mentioned above. However, that definition writes the value produced by
LoadI, so the invariant still holds and the clobber can safely be
ignored.
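
The reasoning can be replayed on a toy memory model. The sketch below is an assumption-laden illustration, not the DSE implementation: memory is an array of whole cells, and the two pointers are indices that are either equal or fully disjoint (partial overlap is not modelled). It runs the three-instruction sequence from the IR example with and without the final store and compares the resulting memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Memory = std::array<int, 4>;

// Mirrors the IR example: load from p, store to q, store the value back to p.
Memory withFinalStore(Memory mem, std::size_t p, std::size_t q) {
  assert(p < mem.size() && q < mem.size());
  const int v = mem[p]; // %v = load %p
  mem[q] = v;           // store %v, %q (the intervening clobber)
  mem[p] = v;           // store %v, %p (the candidate noop store)
  return mem;
}

// Same sequence with the candidate noop store removed.
Memory withoutFinalStore(Memory mem, std::size_t p, std::size_t q) {
  assert(p < mem.size() && q < mem.size());
  const int v = mem[p];
  mem[q] = v;
  return mem;
}

// True if dropping the final store never changes memory, for every pair of
// cell indices, including the aliasing case p == q.
bool finalStoreIsNoop(const Memory &initial) {
  for (std::size_t p = 0; p < initial.size(); ++p)
    for (std::size_t q = 0; q < initial.size(); ++q)
      if (withFinalStore(initial, p, q) != withoutFinalStore(initial, p, q))
        return false;
  return true;
}
```

In this model the intervening store either writes a different cell or writes the loaded value back to the same cell, so the cell at p holds the loaded value either way.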

Fixes https://github.com/llvm/llvm-project/issues/49271

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D132657
2022-09-29 00:51:56 +00:00
Nikola Tesic b5d28f3ea5 [Debugify][OriginalDIMode] Make HTML reporting infrastructure more resilient
Debugify in OriginalDebugInfo mode (verify-each-debuginfo-preserve), when used
in parallel builds of large projects, can produce an incorrect report. More
precisely, simultaneous writes to the JSON report file can produce malformed
JSON objects describing the Debug Info bugs that were found.
This patch uses a lock/unlock mechanism to protect the JSON report file and
also makes the script llvm/utils/llvm-original-di-preservation.py resilient to
corrupted lines in the report file, which ensures that the HTML report is created.

Differential Revision: https://reviews.llvm.org/D115616
2022-09-29 16:48:06 +02:00
eopXD 8cbdb1e081 [LSR][NFC] Add missing constness 2022-09-29 06:30:50 -07:00
Nikita Popov 412141663c Reapply [FunctionAttrs] Infer precise FMRB
The previous version of the patch would incorrectly convert an
existing argmemonly attribute into an inaccessiblemem_or_argmemonly
attribute.

-----

This updates checkFunctionMemoryAccess() to infer a precise
FunctionModRefBehavior, rather than an approximation split into
read/write and argmemonly.

Afterwards, we still map this back to imprecise function attributes.
This still allows us to infer some cases that we previously did not
handle, namely inaccessiblememonly and inaccessiblemem_or_argmemonly.
In practice, this means we get better memory attributes in the
presence of intrinsics like @llvm.assume.

Differential Revision: https://reviews.llvm.org/D134527
2022-09-29 14:02:15 +02:00
Florian Hahn 080a1e2bbb
[LV] Create createInductionResumeValue helper (NFC).
Factor out the logic to create induction resume values for a specific
induction. This will be used in D92132 to support widened IVs during
epilogue vectorization.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D134211
2022-09-29 11:13:01 +01:00
Juan Manuel MARTINEZ CAAMAÑO 52545e603b [DebugInfo][InferAddressSpaces] Propagate DebugLoc when cloning an instruction in InferAddressSpaces
Differential Revision: https://reviews.llvm.org/D134428
2022-09-29 08:43:37 +00:00
Juan Manuel MARTINEZ CAAMAÑO e9716c64ec [StructurizeCFG] Remove impossible case and replace by assert
In addition, replace outdated XFAIL test by a new one.

Differential Revision: https://reviews.llvm.org/D134439
2022-09-29 08:27:49 +00:00
Florian Hahn 9247b012d6
[SCEVExpander] Use CreateBitOrPointerCast instead of builder (NFC).
Simplify the code by using CastInst::CreateBitOrPointerCast directly. By
not going through the builder, the temporary instruction also won't get
registered in InsertedValues & co, which means less work overall and
simplifies the clean-up.
2022-09-29 09:24:39 +01:00
Gulfem Savrun Yeniceri 5bdf22e743 [InstrProfiling] Fix emitting runtime hook once
https://reviews.llvm.org/D134254 introduced an issue on Fuchsia
target, which does not unconditionally emit runtime hook.
It used containsProfilingIntrinsics(M) after intrinsics are lowered.
So, this patch fixes the issue by capturing the result of that
function invocation before intrinsics are lowered.

Differential Revision: https://reviews.llvm.org/D134841
2022-09-29 01:21:49 +00:00
Florian Mayer e06c9b63bc [NFC] [HWASan] remove unnecessary cast 2022-09-28 17:48:19 -07:00
Florian Mayer 0401dc2913 [MTE] [HWASan] unify isInterestingAlloca
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D134779
2022-09-28 15:52:34 -07:00
serge-sans-paille 16544cbe64 [iwyu] Move <cmath> out of llvm/Support/MathExtras.h
Interestingly, MathExtras.h doesn't use any <cmath> declarations, so move the
include out of that header and include it where needed.

No functional change intended, but there's no longer a transitive include
from MathExtras.h to cmath.
2022-09-28 20:49:01 +02:00
Mingming Liu ac28efa6c1 [SimplifyCFG][TransformUtils] Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches.
- Before this patch, loop metadata (if it exists) will override the metadata of each predecessor; if the predecessor block already has loop metadata, the original loop metadata won't be preserved and could cause missed loop transformations (see 'test2' in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll).

To illustrate how inner-loop metadata might be dropped before this patch:

CFG Before

      entry
        |
        v
 ---> while.cond   ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md1)
 |       |  |______|
 |       v
 |    while.cond.exit (md2)
 |       |
 |_______|

CFG After

       entry
         |
         v
 ---> while.cond.rewrite  ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md2)
 |_______|  |______|

Basically, when 'while.cond.exit' is folded into 'while.cond', 'md2' overrides 'md1' and 'md1' is dropped from the CFG.

Differential Revision: https://reviews.llvm.org/D134152
2022-09-28 10:48:14 -07:00
bipmis 3b49a9fcf6 [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load, and the only use of this pattern is with a wider load. In this case, for non-atomic/non-volatile loads, reduce the pattern to a combined load, which would improve the cost of inlining, unrolling, vectorization, etc.
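
Pattern 2 can be illustrated outside of IR with two byte loads. The sketch below assumes a little-endian host, where the byte at the higher address is the one that gets shifted; the function names are invented for this example and nothing here is taken from the patch itself.

```cpp
#include <cstdint>
#include <cstring>

// Detects host byte order by inspecting the first byte of a known value.
bool isLittleEndianHost() {
  const uint16_t probe = 1;
  uint8_t firstByte = 0;
  std::memcpy(&firstByte, &probe, 1);
  return firstByte == 1;
}

// Source pattern: two zero-extended byte loads merged with shift and or.
uint32_t mergeWithShiftAndOr(const uint8_t *bytes) {
  return (static_cast<uint32_t>(bytes[1]) << 8) |
         static_cast<uint32_t>(bytes[0]);
}

// Target pattern: one 16-bit load, zero-extended to 32 bits.
uint32_t mergeWithWideLoad(const uint8_t *bytes) {
  uint16_t wide = 0;
  std::memcpy(&wide, bytes, sizeof(wide));
  return wide;
}
```

On a little-endian host both functions return the same value for any two adjacent bytes; on a big-endian host the byte roles would be swapped.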

Fix the error reported on reverse load merge.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-28 17:32:47 +01:00
Sanjay Patel e239198cdb [InstCombine] fold select shuffles with shared operand together
We don't combine generic shuffles together in IR, but select
shuffles are a special-case because a select shuffle of a
select shuffle is just another select shuffle; codegen is
expected to efficiently lower those (select shuffles are also
the canonical form of a vector select with constant condition).
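
To see why a select shuffle of a select shuffle is again a select shuffle, it helps to model one as a per-lane choice between two vectors. The sketch below is a toy model with invented names (`selectShuffle`, `foldMasks`) and a boolean mask instead of LLVM's integer shuffle mask; it folds two select shuffles that share operand `a` into one.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4;
using Vec = std::array<int, kLanes>;
// true: take the lane from the first operand; false: from the second.
using Mask = std::array<bool, kLanes>;

// A select shuffle never moves elements across lanes; it only picks,
// lane by lane, which operand supplies the element.
Vec selectShuffle(const Vec &first, const Vec &second, const Mask &takeFirst) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = takeFirst[i] ? first[i] : second[i];
  return out;
}

// Mask for selectShuffle(selectShuffle(a, b, inner), a, outer) rewritten as
// a single selectShuffle(a, b, folded).
Mask foldMasks(const Mask &inner, const Mask &outer) {
  Mask folded{};
  for (std::size_t i = 0; i < kLanes; ++i)
    // The lane comes from a unless the outer shuffle picks the inner result
    // and the inner shuffle picked b.
    folded[i] = !outer[i] || inner[i];
  return folded;
}
```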
2022-09-28 11:56:27 -04:00
Sameer Sahasrabuddhe 3f078b308b [AAPointerInfo] OffsetInfo: Unassigned is distinct from Unknown
A User like the PHINode may be visited multiple times for the same pointer along
different def-use edges. The uninitialized state of OffsetInfo at the first
visit needs to be distinct from the Unknown value that may be assigned after
processing the PHINode. Without that, a PHINode with all inputs Unknown is never
followed to its uses. This results in incorrect optimization because some
interfering accesses are missed.

Differential Revision: https://reviews.llvm.org/D134704
2022-09-28 20:31:36 +05:30
Benjamin Kramer 0fb2676c24 Revert "[FunctionAttrs] Infer precise FMRB"
This reverts commit 97dfa53626.

It can make DSE crash. Reduced test case at
https://reviews.llvm.org/P8291
2022-09-28 16:57:43 +02:00
Florian Hahn ed47bc8b58
[SCEVExpander] Remove dead Root argument from expandCodeForImpl (NFC).
The argument is unused and can be removed.
2022-09-28 12:08:36 +01:00
Florian Hahn 2b23a58924
[LoopDeletion] Forget block and loop dispositions after deleting loop.
After deleting a loop, the block and loop dispositions need to be
cleared. As we don't know which SCEVs in the loop/blocks may be
impacted, completely clear the cache. This should also fix some cases
where deleted loops remained in the LoopDispositions cache.

This fixes a verification failure surfaced by D134531.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D134613
2022-09-28 11:33:43 +01:00
Simon Pilgrim 926ccfef03 [SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds
Workaround for a chromium bug reported on D134605 - test case will be added later
2022-09-28 11:03:37 +01:00
Igor Kirillov 2d60d7ba1a [LoopVectorize][Fix] Crash when invariant store address is calculated inside loop
Fixes #57572

Generally, the LICM pass is responsible for moving code that calculates
an invariant address out of the loop, as it only needs to be calculated
once. In the rare case where that does not happen, we do not vectorize
the loop.

Differential Revision: https://reviews.llvm.org/D133687
2022-09-28 10:33:50 +01:00
Philip Reames f6d110e26f [LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc]
This is a purely NFC restructure in advance of a change which actually exposes zero strides.  This is mostly because I find this interface confusing each time I look at it.
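
The interface idea, keeping "no stride" separate from "stride of zero", can be sketched with `std::optional` from C++17. The helper below is hypothetical and unrelated to the real getPtrStride signature; it computes the constant difference between consecutive offsets and reports failure as an empty optional rather than as 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not LLVM's getPtrStride): returns the constant
// difference between consecutive offsets, or std::nullopt if there is none.
std::optional<int64_t> constantStride(const std::vector<int64_t> &offsets) {
  if (offsets.size() < 2)
    return std::nullopt; // not enough accesses to define a stride
  const int64_t stride = offsets[1] - offsets[0];
  for (std::size_t i = 2; i < offsets.size(); ++i)
    if (offsets[i] - offsets[i - 1] != stride)
      return std::nullopt; // differences vary, so no constant stride
  return stride; // may legitimately be 0
}
```

With this shape a caller can tell a genuine zero stride (every access hits the same offset) apart from an access pattern that has no constant stride at all.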
2022-09-27 15:55:44 -07:00
Philip Reames 899ebd7e99 [LV] Remove two unused default arguments [nfc] 2022-09-27 14:33:53 -07:00
Martin Sebor e80e134c77 [InstCombine] Add support for stpncpy folding
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D130922
2022-09-27 14:44:33 -06:00
Doru Bercea c9adeca501 Move allocas converted from __kmpc_alloc_shared to entry block. 2022-09-27 17:16:58 +00:00
Philip Reames dc7387b587 [LV] Adjust cost model to use uniform store lowering for unpredicated uniform stores
Follow up to D133580; adjust the cost model to prefer uniform store lowering for scalable stores which are unpredicated.

The impact here isn't in the uniform store lowering quality itself. InstCombine happily converts the scatter form into the single store form. The main impact is in letting the rest of the cost model make choices based on the knowledge that the vector will be scalarized on use.

Differential Revision: https://reviews.llvm.org/D134460
2022-09-27 07:28:40 -07:00
Simon Pilgrim ef89409a59 Fix 'unused-lambda-capture' gcc warning. NFCI. 2022-09-27 15:15:43 +01:00
Simon Pilgrim 1b7089fe67 [SLP] Add ScalarizationOverheadBuilder helper to track vector extractions
Instead of accumulating all extraction costs separately and then adjusting for repeated subvector extractions, this patch collects all the extractions and then converts to calls to getScalarizationOverhead to improve the accuracy of the costs.

I'm not entirely satisfied with the getExtractWithExtendCost handling yet - this still just adds all the getExtractWithExtendCost costs together - it really needs to be replaced with a "getScalarizationOverheadWithExtend", but that will require further refactoring first.

This replaces my initial attempt in D124769.

Differential Revision: https://reviews.llvm.org/D134605
2022-09-27 14:49:07 +01:00
Florian Hahn 3abaa3760d
[LSR] Preserve LCSSA in expander when rewriting loop exit values.
The expanded values when rewriting exit values need to preserve LCSSA.
Ask SCEVExpander to preserve LCSSA to ensure that.

Fixes #58007.
2022-09-27 09:58:48 +01:00
Nikita Popov 97dfa53626 [FunctionAttrs] Infer precise FMRB
This updates checkFunctionMemoryAccess() to infer a precise
FunctionModRefBehavior, rather than an approximation split into
read/write and argmemonly.

Afterwards, we still map this back to imprecise function attributes.
This still allows us to infer some cases that we previously did not
handle, namely inaccessiblememonly and inaccessiblemem_or_argmemonly.
In practice, this means we get better memory attributes in the
presence of intrinsics like @llvm.assume.

Differential Revision: https://reviews.llvm.org/D134527
2022-09-27 10:14:35 +02:00
Florian Hahn 275bee32ad
[LoopUnroll] Forget block and loop dispositions during unrolling.
After unrolling a loop, the block and loop dispositions need to be
cleared. As we don't know which SCEVs in the loop/blocks may be
impacted, completely clear the cache. This should also fix some cases
where deleted loops remained in the LoopDispositions cache.

This fixes a verification failure surfaced by D134531.

I am planning on reviewing/updating the existing uses of
forgetLoopDispositions to check if they should be replaced by
forgetBlockAndLoopDispositions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D134612
2022-09-27 08:49:04 +01:00
Sebastian Peryt 46fc75ab28 [NFC][2/n] Remove PruneEH pass
The second patch in the series to remove the legacy PM and the
associated -enable-new-pm=0 flag targets a pass that
has not been ported to the new PM: PruneEH.
Discussion about this can be found in D44415.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134686
2022-09-26 18:38:04 -07:00
Sanjay Patel def6cbd2bd [InstCombine] add assert/test for zext to i1
This is a test to verify that we do not crash with the
problem noted in issue #57986. The root problem should
be fixed with a prior change to InstSimplify.
2022-09-26 16:01:25 -04:00
Matt Arsenault 473e83b95a GuardWidening: Pass through AssumptionCache (NFC) 2022-09-26 14:53:00 -04:00
Matt Arsenault 9bf1aea224 LoopPeel: Pass through AssumptionCache (NFC) 2022-09-26 14:52:59 -04:00