Commit Graph

10881 Commits

Author SHA1 Message Date
Philip Reames 01801d5274 [rs4gc] Fix a latent bug around attribute stripping for intrinsics
This change fixes a latent bug which was exposed by a change currently in review (https://reviews.llvm.org/D99802#2685032).

The story on this is a bit involved.  Without this change, what ended up happening with the pending review was that we'd strip attributes off intrinsics, and then selectiondag would fail to lower the intrinsic.  Why?  Because the lowering of the intrinsic relies on the presence of the readonly attribute.  We don't have a matcher to select the case where there's a glue node needed.

Now, on the surface, this still seems like a codegen bug.  However, here it gets fun.  I was unable to reproduce this with a standalone test at all, and was pretty much struck until skatkov provided the critical detail.  This reproduces only when RS4GC and codegen are run in the same process and context.  Why?  Because it turns out we can't roundtrip the stripped attribute through serialized IR!

We'll happily print out the missing attribute, but when we parse it back, the auto-upgrade logic has a side effect of blindly overwriting attributes on intrinsics with those specified in Intrinsics.td.  This makes it impossible to exercise SelectionDAG from a standalone test case.

At this point, I decided to treat this an RS4GC bug as a) we don't need to strip in this case, and b) I could write a test which shows the correct behavior to ensure this doesn't break again in the future.

As an aside, I'd originally set out to handle libfuncs too - since in theory they might have the same issues - but backed away quickly when I realized how the semantics of builtin, nobuiltin, and no-builtin-x all interacted.  I'm utterly convinced that no part of the optimizer handles that correctly, and decided not to open that can of worms here.
2021-04-19 13:14:07 -07:00
Nikita Popov d440f9a326 [LICM] Make capture check more precise
During store promotion, we check whether the pointer was captured
to exclude potential reads from other threads. However, we're only
interested in captures before or inside the loop. Check this using
PointerMayBeCapturedBefore against the loop header.

Differential Revision: https://reviews.llvm.org/D100706
2021-04-19 20:34:23 +02:00
Evgeniy Brevnov 35e95c6817 [CVP] processCallSite returns wrong status
Recently processMinMaxIntrinsic has been added and we started to observe a number of analysis get invalidated after CVP. The problem is CVP conservatively returns 'true'  even if there were no modifications to IR. I found one more place besides processMinMaxIntrinsic  which has the same problem. I think processMinMaxIntrinsic and similar should better have boolean return status to prevent similar issue reappear in future.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D100538
2021-04-19 12:13:22 +07:00
Serge Guelton d6de1e1a71 Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string).
throughout the codebase, this led to inelegant checks ranging from

        if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

to

        if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

Introduce a getValueAsBool that normalize the check, with the following
behavior:

no attributes or attribute set to "false" => return false
attribute set to "true" => return true

Differential Revision: https://reviews.llvm.org/D99299
2021-04-17 08:17:33 +02:00
Marcythm f8cf3b9931
[LICM][NFC] Fix typo
fixed some typos which may lead to misunderstandings in LICM.cpp

Reviewed By: nikic, asbirlea
Differential Revision: https://reviews.llvm.org/D100470
2021-04-16 09:42:00 +08:00
Arthur Eubanks 9c776c2fa2 [NFC][NewPM] Remove some AnalysisManager invalidate methods
These were misleading, they're more of a "clear" than an "invalidate".

We shouldn't be individually clearing analysis results. Either we clear
all analyses when some IR becomes invalid, or we properly go through
invalidation.

There was only one use of this, which can be simulated with
AM.invalidate(F, PA).

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D100519
2021-04-15 16:51:26 -07:00
Stelios Ioannou bf147c4653 [LSR] Fix for pre-indexed generated constant offset
This patch changed the isLegalUse check to ensure that
LSRInstance::GenerateConstantOffsetsImpl generates an
offset that results in a legal addressing mode and
formula. The check is changed to look similar to the
assert check used for illegal formulas.

Differential Revision: https://reviews.llvm.org/D100383

Change-Id: Iffb9e32d59df96b8f072c00f6c339108159a009a
2021-04-15 16:44:42 +01:00
Florian Hahn 5a3ff24b12
[NewGVN] Add phi-of-ops operands if no real PHI is created.
If the PHI-of-ops simplifies to an existing value, no real PHI is
created, which means the dependencies between the
PHI-of-ops and its operands is not materialized in IR. At the
moment, we fail to create a real PHI node for the PHI-of-ops,
because the PHI-of-ops root instruction is not re-visited if
one of the PHI-of-ops operands changes. We need to add the
operands as additional users in this case.

Even with this patch, there are still some dependencies
missing. I will continue tackling the outstanding
reporeted crashes in this area.

Fixes PR36501, PR42422, PR42557.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D66924
2021-04-15 08:25:10 +01:00
Sjoerd Meijer 39d29817f3 [SCCP] Follow up of rGbbab9f986c6d. NFC.
This addresses the linter messages, mainly the inconsistent capitalisation of
member functions.
2021-04-14 17:14:46 +01:00
Sjoerd Meijer bbab9f986c [SCCP] Create SCCP Solver
This refactors SCCP and creates a SCCPSolver interface and class so that it can
be used by other passes and transformations. We will use this in D93838, which
adds a function specialisation pass.

This is based on an early version by Vinay Madhusudan.

Differential Revision: https://reviews.llvm.org/D93762
2021-04-14 14:58:03 +01:00
Evgeniy Brevnov e50aa1af2d [NARY][NFC] Use hasNUsesOrMore instead of getNumUses since it's more
efficient.
2021-04-13 09:29:49 +07:00
Nick Desaulniers 237d4ee835 [JumpThreading] merge debug info when merging select+br
Jump threading can replace select then unconditional branch with
conditional branch, but when doing so loses debug info.

This destructive transform is eventually leading to a failed Verifier
run during full LTO builds of the Linux kernel with CFI and KCOV
enabled, as reported in PR39531.

ModuleSanitizerCoveragePass will insert calls to
__sanitizer_cov_trace_pc, and sometimes split critical edges,
using whatever debug info may or may not exist for the branch for
the added libcall. Since we can inline calls to
__sanitizer_cov_trace_pc due to LTO, this can lead to the error
observed in PR39531 when the debug info isn't propagated to
the libcall, because of prior destructive transforms that failed to
retain debug info.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D100137
2021-04-12 17:51:21 -07:00
Evgeniy Brevnov 36b932d6a3 [NARY] Don't optimize min/max if there are side uses
Say we have
%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%2,%a)

The optimization will try to reassociate the later one so that we can rewrite it to %3=min(%1, %c) and remove %2.
But if %2 has another uses outside of %3 then we can't remove %2 and end up with:

%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%1, %c)

This doesn't harm by itself except it is not profitable and changes IR for no good reason.
What is bad it triggers next iteration which finds out that optimization is applicable to %2 and %3 and generates:

%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%1,%c)
%4=min(%2,%a)

and so on...

The solution is to prevent optimization in the first place if intermediate result (%2) has side uses and
known to be not removed.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D100170
2021-04-12 12:43:54 +07:00
Roman Lebedev 13fca9d816
[NFCI][SimplifyCFG] mergeEmptyReturnBlocks(): improve Dominator Tree updating
Same as with previous patches.
2021-04-11 23:56:23 +03:00
Roman Lebedev 005881e96e
[LoopIdiom] left-shift-until-bittest: set all allowed no-wrap flags on add/sub
I've checked each one of these with alive2,
and this is both correct and precise.
2021-04-11 18:08:07 +03:00
Roman Lebedev 9829f5e6b1
[CVP] @llvm.[us]{min,max}() intrinsics handling
If we can tell that either one of the arguments is taken,
bypass the intrinsic.

Notably, we are indeed fine with non-strict predicate:
* UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf
      https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i
* UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx
* SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W
* SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr

Much like with all other comparison handling in CVP,
while we could sort-of handle two Value's,
at least for plain ICmpInst it does not appear to be worthwhile.

This only fires 78 times on test-suite + dt + rs,
but we don't canonicalize to these yet. (only SCEV produces them)
2021-04-11 00:33:47 +03:00
Roman Lebedev f041757e9c
[NFC][JumpThreading] Increment 'NumFolds' statistic all places terminator becomes uncond 2021-04-10 21:24:29 +03:00
Roman Lebedev a407738def
[NFC][CVP] Add statistic for function pointer argument non-null-ness deduction 2021-04-10 21:23:20 +03:00
Roman Lebedev fe7b3ad8d5
[CVP] LVI: Use in-block values when checking value signedness domain
This has a huge positive impact on all the folds that use these helpers,
as it can be seen on vanilla test-suite + rawspeed + darktable:
correlated-value-propagation.NumSRems             +75.68% (+ 28)
correlated-value-propagation.NumAShrs             +63.87% (+198)
correlated-value-propagation.NumSDivs             +49.42% (+127)
correlated-value-propagation.NumSExt              + 8.85% (+593)
correlated-value-propagation.NumUDivURemsNarrowed	+ 8.65% (+34)

... while having pretty minimal compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=e8c7f43e2c2c6f3581ec1c6489ec21ad9f98958a&to=4cd197711e58ee1b2faeee0c35eea54540185569&stat=instructions
2021-04-10 21:10:59 +03:00
Roman Lebedev 257eda0794
[NFC][LVI] getPredicateAt(): drop default value for UseBlockValue
The default is likely wrong.
Out of all the callees, only a single one needs to pass-in false (JumpThread),
everything else either already passes true, or should pass true.

Until the default is flipped, at least make it harder to unintentionally
add new callees with UseBlockValue=false.
2021-04-10 20:46:01 +03:00
Roman Lebedev c329a47d9e
[CVP] @llvm.abs() handling
Iff we know the sigdness domain of the argument,
we can either skip @llvm.abs, or do negation directly.

Notably, INT_MIN can belong to either domain:
* X u<= INT_MIN --> X  is always fine
  https://alive2.llvm.org/ce/z/QB8j-C https://alive2.llvm.org/ce/z/7sFKpS
* X s<= 0 --> -X  is always fine
  https://alive2.llvm.org/ce/z/QbGSyq https://alive2.llvm.org/ce/z/APsN84

If all else fails, try to inferr NSW flag:
https://alive2.llvm.org/ce/z/qCJfYm
2021-04-10 16:47:31 +03:00
dfukalov c1a88e007b [AA][NFC] Convert AliasResult to class containing offset for PartialAlias case.
Add an ability to store `Offset` between partially aliased location. Use this
storage within returned `ResultAlias` instead of caching it in `AAQueryInfo`.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98718
2021-04-09 13:26:09 +03:00
dfukalov d066079728 [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98027
2021-04-09 12:54:22 +03:00
Max Kazantsev baf17e2cc9 [NFC] Move statictic increment out of helper 2021-04-09 16:32:35 +07:00
Max Kazantsev 275f3a2540 [GVN][NFC] Factor out load elimination logic via PRE for reuse 2021-04-09 16:12:25 +07:00
Arthur Eubanks 4c89bcadf6 [LICM] Hoist loads with invariant.group metadata
Previously loading the vtable used in calling a virtual method in a loop
was not hoisted out of the loop. This fixes that.

canSinkOrHoistInst() itself doesn't check that the load operands are
loop invariant, callers also check that separately.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99784
2021-04-08 21:57:37 -07:00
Serguei Katkov d2e15a83a6 [RS4GC] Cleanup meetBDVState. NFC.
meetBDVState looks pretty difficult to read and follow.
This is purely NFC but doing several things:

1) Combine meet and meetBDVState
2) Move the function to be a member of BDVState
3) Make BDVState be a mutable object
4) Convert switch to sequence of ifs
5) Adds comments.

Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99064
2021-04-09 10:20:25 +07:00
Arthur Eubanks c5d1ccbcdf [GVN] Properly invalidate ICF cache when we simplify a value
This fixes a "Cached first special instruction is wrong!" assert.

The assert fires because replacing a value with another can cause an
instruction to no longer be "special" to ICF. In this case,
devirtualization happened, turning an indirect call to a
call to a willreturn function which is no longer special.

Reviewed By: nikic, rnk

Differential Revision: https://reviews.llvm.org/D99977
2021-04-08 14:01:57 -07:00
Nikita Popov 59a2f67011 [LoopRotate] Don't split loop pass manager
After D99249 we use three different loop pass managers for LICM,
LoopRotate and LICM+LoopUnswitch. This happens because LazyBFI
and LazyBPI are not preserved by LoopRotate (note that D74640
is no longer needed). Avoid this by marking them as preserved.

My understanding of D86156 is that it is okay to simply preserve
them (which LoopUnswitch already does for the same reason) and
rely on callbacks to deal with deleted blocks.

Differential Revision: https://reviews.llvm.org/D99843
2021-04-08 22:05:18 +02:00
Congzhe Cao ce2db9005d [LoopInterchange] Fix transformation bugs in loop interchange
After loop interchange, the (old) outer loop header should not jump to
the `LoopExit`. Note that the old outer loop becomes the new inner loop
after interchange. If we branched to `LoopExit` then after interchange
we would jump directly from the (new) inner loop header to `LoopExit`
without executing the rest of outer loop.

This patch modifies adjustLoopBranches() such that the old outer
loop header (which becomes the new inner loop header) jumps to the
old inner loop latch which becomes the new outer loop latch after
interchange.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D98475
2021-04-08 14:58:13 -04:00
Stephen Tozer 140757bfaa [DebugInfo] Prevent invalid debug info being produced during LoopStrengthReduce
During LoopStrengthReduce, some of the SSA values that are used by debug values
may be lost and/or salvaged. After LSR we attempt to recover any undef debug
values, including any that were salvaged but then lost their values afterwards,
by replacing the lost values with any live equal values (plus a possible
constant offset) that have been gathered prior to running LSR. When we do this
we restore the debug value's original DIExpression, to undo any salvaging (as we
have gone back to using the original debug value).

This process can currently produce invalid debug info if the number of operands
has changed by salvaging during LSR. Replacing old values during the
applyEqualValues step does not change the number of location operands, which
means that when we restore the old DIExpression we may have a mismatch between
the number of operands used by the debug value and the number of operands
referenced by the DIExpression. This patch fixes this by restoring the full
original location metadata at the start of the applyEqualValues step, so that
there is no mismatch in operand count between the debug value and its
DIExpression.

Differential Revision: https://reviews.llvm.org/D98644
2021-04-08 13:04:48 +01:00
Congzhe Cao 593cb46550 Revert "[LoopInterchange] Fix transformation bugs in loop interchange"
This reverts commit 6ec68bd815d00c1eec2a6b9766452554f0e6cb61.
2021-04-07 21:17:30 -04:00
CongzheUalberta f5645ea65f [LoopInterchange] Fix transformation bugs in loop interchange
After loop interchange, the (old) outer loop header should not jump to
`LoopExit`. Note that the old outer loop becomes the new inner loop
after interchange. If we branched to `LoopExit` then after interchange
we would jump directly from the (new) inner loop header to `LoopExit`
without executing the rest of (new) outer loop.

This patch modifies adjustLoopBranches() such that the old outer
loop header (which becomes the new inner loop header) jumps to the
old inner loop latch which becomes the new outer loop latch after
interchange.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D98475
2021-04-07 20:55:44 -04:00
Craig Topper 5fc0e98d9a [LoopIdiomRecognize] Minor cleanups to the FFS idiom matching. NFC
-Make sure of the CreateShl/LShr/AShr methods that take a uint64_t
instead of creating a ConstantInt for 1 ourselves.
-Use Builder.getInt1 or ConstantInt::getBool instead of a conditional.
-Pull out repeated calls to getType.
2021-04-07 10:03:14 -07:00
Philip Reames 908215b346 Use AssumeInst in a few more places [nfc]
Follow up to a6d2a8d6f5.  These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.
2021-04-06 13:18:53 -07:00
Philip Reames 9ef6aa020b Plumb AssumeInst through operand bundle apis [nfc]
Follow up to a6d2a8d6f5.  This covers all the public interfaces of the bundle related code.  I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.
2021-04-06 12:53:53 -07:00
Philip Reames a6d2a8d6f5 Add a subclass of IntrinsicInst for llvm.assume [nfc]
Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class.  A follow up change will do the same for the newer assumption query/bundle mechanisms.
2021-04-06 11:16:22 -07:00
Arthur Eubanks 4e83e59eb8 [GVN] Add missing ICF update
performScalarPREInsertion() inserts instructions into blocks that we
need to tell ImplicitControlFlowTracking about, otherwise the ICF cache
may be invalid.

Fixes PR49193.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99909
2021-04-06 10:13:42 -07:00
Philip Reames 21d4839948 Move GCRelocateInst and GCResultInst to IntrinsicInst.h [nfc]
These two are part of the IntrinsicInst class hierarchy and it helps to cut down on some redundant includes.
2021-04-06 08:33:15 -07:00
Philip Reames 52ecd94cfb Remove last remnants of PR49607 migration [NFC]
The key change (4f5e92c) to switch gc.result and gc.relocate to being readnone landed nearly two weeks ago, and we haven't seen any fallout.  Time to remove the code added to make reverting easy.
2021-04-06 07:56:55 -07:00
Simon Pilgrim b8aba76a4e LoopFlatten - CanWidenIV - Fix uninitialized variable warnings and use for-range loop. NFCI.
Fix static analysis uninitialized variable warnings, and use for-range loop iteration across WideIVs array.
2021-04-06 12:24:20 +01:00
Arthur Eubanks ea0e2ca1ac [SROA] Allow SROA on pointers with invariant group intrinsic uses
When we are able to SROA an alloca, we know all uses of it, meaning we
don't have to preserve the invariant group intrinsics and metadata.

It's possible that we could lose information regarding redundant
loads/stores, but that's unlikely to have any real impact since right
now the only user is Clang and vtables.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99760
2021-04-05 19:53:40 -07:00
Ta-Wei Tu 6a82ace5f2 [LoopFusion] Bails out if only the second candidate is guarded (PR48060)
If only the second candidate loop is guarded while the first one is not, fusioning
two loops might not be valid but this check is currently missing.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48060

Reviewed By: sidbav

Differential Revision: https://reviews.llvm.org/D99716
2021-04-06 01:08:56 +08:00
Dimitry Andric 6abb92f210 [SCCP] Avoid modifying AdditionalUsers while iterating over it
When run under valgrind, or with a malloc that poisons freed memory,
this can lead to segfaults or other problems.

To avoid modifying the AdditionalUsers DenseMap while still iterating,
save the instructions to be notified in a separate SmallPtrSet, and use
this to later call OperandChangedState on each instruction.

Fixes PR49582.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D98602
2021-04-02 19:05:59 +02:00
Philip Reames 2c4548e18e [rs4gc] Use loops instead of straightline code for attribute stripping [nfc]
Mostly because I'm about to add more attributes and the straightline copies get much uglier.  What's currently there isn't too bad.
2021-04-02 09:25:15 -07:00
Philip Reames a505801e2b [rs4gc] Strip nofree and nosync attributes when lowering from abstract model
The safepoints being inserted exists to free memory, or coordinate with another thread to do so.  Thus, we must strip any inferred attributes and reinfer them after the lowering.

I'm not aware of any active miscompiles caused by this, but since I'm working on strengthening inference of both and leveraging them in the optimization decisions, I figured a bit of future proofing was warranted.
2021-04-02 09:12:24 -07:00
Evgeniy Brevnov 2388aae401 [NARY-REASSOCIATE] Support reassociation of min/max
Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available.

Reviewed By: mkazantsev, lebedev.ri

Differential Revision: https://reviews.llvm.org/D88287
2021-04-02 15:30:13 +07:00
Juneyoung Lee c664769330 [AssumeBundles] offset should be added to correctly calculate align
This is a patch to fix the bug in alignment calculation (see https://reviews.llvm.org/D90529#2619492).

Consider this code:

```
call void @llvm.assume(i1 true) ["align"(i32* %a, i32 32, i32 28)]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 -1
; aligment of %arrayidx?
```

The llvm.assume guarantees that `%a - 28` is 32-bytes aligned, meaning that `%a` is 32k + 28 for some k.
Therefore `a - 4` cannot be 32-bytes aligned but the existing code was calculating the pointer as 32-bytes aligned.

The reason why this happened is as follows.
`DiffSCEV` stores `%arrayidx - %a` which is -4.
`OffSCEV` stores the offset value of “align”, which is 28.
`DiffSCEV` + `OffSCEV` = 24 should be used for `a - 4`'s offset from 32k, but `DiffSCEV` - `OffSCEV` = 32 was being used instead.

Reviewed By: Tyker

Differential Revision: https://reviews.llvm.org/D98759
2021-04-02 12:32:05 +09:00
Philip Reames 91790c6785 [indvars[ Fix pr49802 by checking for SCEVCouldNotCompute
The code is assuming that having an exact exit count for the loop implies that exit counts for every exit are known.  This used to be true, but when we added handling for dead exits we broke this invariant.  The new invariant is that an exact loop count implies that any exits non trivially dead have exit counts.

We could have fixed this by either a) explicitly checking for a dead exit, or b) just testing for SCEVCouldNotCompute.  I chose the second as it was simpler.

(Debugging this took longer than it should have since I'd mistyped the original assert and it wasn't checking what it was meant to...)

p.s. Sorry for the lack of test case.  Getting things into a state to actually hit this is difficult and fragile.  The original repro involves loop-deletion leaving SCEV in a slightly inprecise state which lets us bypass other transforms in IndVarSimplify on the way to this one.  All of my attempts to separate it into a standalone test failed.
2021-04-01 17:53:44 -07:00
Yevgeny Rouban 1ed53d44d8 [LoopFlatten] Do not report CFG analyses as up-to-date
Removes CFGAnalyses from the preserved analyses set
returned by LoopFlattenPass::run().

Reviewed By: Dave Green, Ta-Wei Tu

Differential Revision: https://reviews.llvm.org/D99700
2021-04-01 15:52:36 +07:00