Commit Graph

11755 Commits

Author SHA1 Message Date
zhongyunde d485c1b73e [LoopDataPrefetch] Fix crash when TTI doesn't set CacheLineSize
Fix https://github.com/llvm/llvm-project/issues/56681

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130418
2022-07-26 13:08:42 +08:00
Warren Ristow 3bbd380a5b [Reassociate][NFC] Use an appropriate dyn_cast for BinaryOperator
In D129523, it was noted that there is are some questionable naked casts
from Instruction to BinaryOperator, which could be addressed by doing a
dyn_cast directly to BinaryOperator, avoiding the need for the later cast.
This cleans up that casting.

Reviewed By: nikic, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D130448
2022-07-25 10:24:43 -07:00
Warren Ristow 3089b411a4 [Reassociate][NFC] Consistent checking for FastMathFlags suitability
In D129523, it was noted that the approach to check whether a value can
have FastMathFlags was done in different ways, and they should be made
consistent.  This patch makes minor changes to fix that.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D130408
2022-07-24 17:44:30 -07:00
Fangrui Song 7225213c0a [LegacyPM] Remove {,PostInline}EntryExitInstrumenterPass
Following recent changes removing non-core features of the legacy
PM/optimization pipeline.
2022-07-23 15:30:15 -07:00
Max Kazantsev a40af8589e [RS4GC] Handle special cases in unreachable code for memcpy/memmov
The existing code doesn't expect dummy values (undef, poison, null-derived
constants etc) as arguments of these intrinsics. However, they can be there
in unreached code. Currently we fail trying to find base for them.

Handle these cases separately. Return null as base for them to be consistent
with the handling in the main algorithm in findBaseDefiningValue.

Differential Revision: https://reviews.llvm.org/D129561
Reviewed By: apilipenko
2022-07-22 11:30:43 +07:00
Nikita Popov 46e6dd84b7 [MemoryBuiltins] Remove isFreeCall() function (NFC)
Remove isFreeCall() in favor of getFreedOperand(). Replace the
two remaining uses with a getFreedOperand() != nullptr check, as
they only care that something is getting freed. (The usage in DSE
is correct as such. The allocator-related checks in CFLGraph look
rather questionable in general.)
2022-07-21 14:44:23 +02:00
Nikita Popov c81dff3c30 [MemoryBuiltins] Add getFreedOperand() function (NFCI)
We currently assume in a number of places that free-like functions
free their first argument. This is true for all hardcoded free-like
functions, but with the new attribute-based design, the freed
argument is supposed to be indicated by the allocptr attribute.

To make sure we handle this correctly once allockind(free) is
respected, add a getFreedOperand() helper which returns the freed
argument, rather than just indicating whether the call frees *some*
argument.

This migrates most but not all users of isFreeCall() to the new
API. The remaining users are a bit more tricky.
2022-07-21 12:39:35 +02:00
Arthur Eubanks 13aa2c1c3b [DSE] Revisit pointers that may no longer escape after removing another store
In dependent-capture, previously we'd see that %tmp4 is captured due to
the first store. We'd cache this info in CapturedBeforeReturn and
InvisibleToCallerAfterRet. Then the first store is then removed, causing
the cached values to be wrong.

We also need to revisit everything because normally we work backwards
when removing stores at the end of the function, but in this case
removing an earlier store causes a later store to be removable.

No compile time impact:
https://llvm-compile-time-tracker.com/compare.php?from=56796ae1a8db4c85dada28676f8303a5a3609c63&to=21b7e5248ffc423cd36c9d4a020085e363451465&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D123686
2022-07-19 09:30:34 -07:00
Max Kazantsev 82309831c3 [LoopSimplifyCFG] Prevent use-def dominance breach by handling dead exits. PR56243
One of the transforms in LoopSimplifyCFG demands that the LCSSA form is
truly maintained for all values, tokens included, otherwise it may end up creating
a use that is not dominated by def (and Phi creation for tokens is impossible).
Detect this situation and prevent transform for it early.

Differential Revision: https://reviews.llvm.org/D129984
Reviewed By: efriedma
2022-07-19 15:54:12 +07:00
Nikita Popov 21e2f133a8 [LoopSimplifyCFG] Revert accidental change
This change was included in an unrelated change
b57d61384c
and was of course not intended for commit...
2022-07-18 15:30:13 +02:00
Nikita Popov b57d61384c [ConstantRangeTest] Move nowrap binop tests to generic infrastructure (NFC)
Move testing for add/sub with nowrap flags to TestBinaryOpExhaustive,
rather than separate homegrown exhaustive testing functions.
2022-07-18 15:14:17 +02:00
Kazu Hirata 8b3ed1fa98 Remove redundant return statements (NFC)
Identified with readability-redundant-control-flow.
2022-07-17 15:37:46 -07:00
Philip Reames 6ab686eb86 [LSR] Allow already invariant operand for ICmpZero matching [try 2]
Changes since initial commit:

* Wrapping a pointer in an SCEV unknown hides the base, and SCEV is only able to compute a subtraction when the bases are known to be equal. This results in a SCEVCouldNotCompute flowing forward and triggering asserts. Test case added in d767b392.
* isLoopInvariant returns true for instructions outside the loop, but not necessarily *above* the loop. Since this code is allowed to visit uses of an IV outside of a loop, we have to make sure the operands of the compare are both invariant and dominating the header. Test case added in 2aed3cdb.

Original commit message follows...

The ICmpZero matching is checking to see if the expression is loop invariant per SCEV and expandable. This allows expressions inside the loop which can be made loop invariant to be seamlessly expanded, but is overly conservative for expressions which already *are* loop invariant.

As a simple justification for why this is correct, consider a loop invariant urem as RHS vs an alternate function with that same urem wrapped inside a helper call. Why would it be legal to match the later, but not the former?

Differential Revision: https://reviews.llvm.org/D129793
2022-07-15 13:29:43 -07:00
Warren Ristow c650793049 [Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'
Compiling with '-ffast-math' tuns on all the FastMathFlags (FMF), as
expected, and that enables FP reassociation. Only the two FMF flags
'reassoc' and 'nsz' are technically required to perform reassociation,
but disabling other unrelated FMF bits is needlessly suppressing the
optimization.

This patch fixes that needless suppression, and makes appropriate
adjustments to test-cases, fixing some outstanding TODOs in the process.

Fixes: #56483

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D129523
2022-07-15 11:44:35 -07:00
Philip Reames 6fe766beba Revert "[LSR] Allow already invariant operand for ICmpZero matching"
This reverts commit 9153515a7b.  Builtbot crash was reported in the commit thread, reverting while investigating.
2022-07-15 10:47:57 -07:00
Philip Reames 9153515a7b [LSR] Allow already invariant operand for ICmpZero matching
The ICmpZero matching is checking to see if the expression is loop invariant per SCEV and expandable. This allows expressions inside the loop which can be made loop invariant to be seamlessly expanded, but is overly conservative for expressions which already *are* loop invariant.

As a simple justification for why this is correct, consider a loop invariant urem as RHS vs an alternate function with that same urem wrapped inside a helper call. Why would it be legal to match the later, but not the former?

Differential Revision: https://reviews.llvm.org/D129793
2022-07-15 09:51:00 -07:00
Nikita Popov f75ccadcdd [LSR] Create SCEVExpander earlier, use member isSafeToExpand() (NFC)
This is a followup to D129630, which switches LSR to the member
isSafeToExpand() variant, and removes the freestanding function.

This is done by creating the SCEVExpander early (already during the
analysis phase). Because the SCEVExpander is now available for the
whole lifetime of LSRInstance, I've also made it into a member
variable, rather than passing it around in even more places.

Differential Revision: https://reviews.llvm.org/D129769
2022-07-15 09:41:23 +02:00
Warren Ristow 230c8c56f2 [Reassociate] Cleanup minor missed optimizations
In analyzing issue #56483, it was noticed that running `opt` with
`-reassociate` was missing some minor optimizations. For example,
there were cases where the running `opt` on IR with floating-point
instructions that have the `fast` flags applied, sometimes resulted in
less efficient code than the input IR (things like dead instructions
left behind, and missed reassociations). These were sometimes noted
in the test-files with TODOs, to investigate further. This commit
fixes some of these problems, removing some TODOs in the process.

FTR, I refer to these as "minor" missed optimizations, because when
running a full clang/llvm compilation, these inefficiencies are not
happening, as other passes clean that residue up. Regardless, having
cleaner IR produced by `opt`, makes assessing the quality of fixes done
in `opt` easier.
2022-07-14 08:21:04 -07:00
Brendon Cahoon c945d88d2b Revert "[StructurizeCFG] Improve basic block ordering"
This reverts commit f1b05a0a2b.

Need to revert to due to issues identified with testing. The
transformation is incorrect for blocks that contain convergent
instructions.
2022-07-14 09:40:51 -05:00
Nikita Popov 9e6e631b38 [LoopPredication] Use isSafeToExpandAt() member function (NFC)
As a followup to D129630, this switches a usage of the freestanding
function in LoopPredication to use the member variant instead. This
was the last use of the freestanding function, so drop it entirely.
2022-07-14 14:49:07 +02:00
Nikita Popov dcf4b733ef [SCEVExpander] Make CanonicalMode handing in isSafeToExpand() more robust (PR50506)
isSafeToExpand() for addrecs depends on whether the SCEVExpander
will be used in CanonicalMode. At least one caller currently gets
this wrong, resulting in PR50506.

Fix this by a) making the CanonicalMode argument on the freestanding
functions required and b) adding member functions on SCEVExpander
that automatically take the SCEVExpander mode into account. We can
use the latter variant nearly everywhere, and thus make sure that
there is no chance of CanonicalMode mismatch.

Fixes https://github.com/llvm/llvm-project/issues/50506.

Differential Revision: https://reviews.llvm.org/D129630
2022-07-14 14:41:51 +02:00
Nikita Popov 7a43b382ce [IndVars] Make sure header phi simplification preserves LCSSA form
When simplifying instructions, make sure that the replacement
preserves LCSSA form. This fixes the issue reported at:
https://reviews.llvm.org/D129293#3650851
2022-07-14 11:46:48 +02:00
Kazu Hirata 611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Nikita Popov af49bed933 [IndVars] Simplify instructions after replacing header phi with preheader value
After replacing a loop phi with the preheader value, it's usually
possible to simplify some of the using instructions, so do that as
part of replaceLoopPHINodesWithPreheaderValues().

Doing this as part of IndVars is valuable, because it may make GEPs
in the loop have constant offsets and allow the following SROA run
to succeed (as demonstrated in the PhaseOrdering test).

Differential Revision: https://reviews.llvm.org/D129293
2022-07-13 10:27:04 +02:00
Nikita Popov a5ee62a141 [IndVars] Call replaceLoopPHINodesWithPreheaderValues() for already constant exits
Currently we only call replaceLoopPHINodesWithPreheaderValues() if
optimizeLoopExits() replaces the exit with an unconditional exit.
However, it is very common that this already happens as part of
eliminateIVComparison(), in which case we're leaving behind the
dead header phi.

Tweak the early bailout for already-constant exits to also call
replaceLoopPHINodesWithPreheaderValues().

Differential Revision: https://reviews.llvm.org/D129214
2022-07-13 09:43:21 +02:00
ChenYang Li 6d036b83d1 [JumpThreading] Avoid threadThroughTwoBasicBlocks when PredPred BB ends with indirectbranch
Since we can't change the destination of indirectbr, so when
encounter indirectbr as PredPredBB terminator, we should pass it.

Differential Revision: https://reviews.llvm.org/D129193
2022-07-08 09:29:17 +02:00
Nikita Popov 34a5c2bcf2 [BasicBlockUtils] Allow critical edge splitting with callbr terminators
After D129205, we support SplitBlockPredecessors() for predecessors
with callbr terminators. This means that it is now also safe to
invoke critical edge splitting for an edge coming from a callbr
terminator. Remove checks in various passes that were protecting
against that.

Differential Revision: https://reviews.llvm.org/D129256
2022-07-08 09:20:44 +02:00
Nikita Popov 40a4078e14 [BasicBlockUtils] Allow splitting predecessors with callbr terminators
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.

The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.

I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.

Differential Revision: https://reviews.llvm.org/D129205
2022-07-07 09:13:25 +02:00
Vir Narula 89a99ec900
[GVN] Bug fix to reportMayClobberedLoad remark
Bug fix to avoid assert crashing when generating remarks for GVN crashing.

Intention of assert is correct but ignores edge case of instructions being equivalent.

Reduced input that causes crash when remarks are turned on:
```
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx12.0.0"

define ptr @ReplaceWithTidy(ptr %zz_hold) {
cond.end480.us:
  %0 = load ptr, ptr null, align 8
  store ptr %0, ptr %0, align 8
  store ptr null, ptr %zz_hold, align 8
  %1 = load ptr, ptr %0, align 8
  store ptr %1, ptr null, align 8
  ret ptr null
}
```

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D129235
2022-07-06 17:42:05 -07:00
Zaara Syeda dbf6ab5ef9 [LSR] Fix bug for optimizing unused IVs to final values
This is a fix for a crash reported for https://reviews.llvm.org/D118808
The fix is to only consider PHINodes which are induction phis.
Fixes #55529

Differential Revision: https://reviews.llvm.org/D125990
2022-07-05 12:30:58 -04:00
Nikita Popov 93cbdaef04 [Reassociate] Avoid ConstantExpr::get()
Use ConstantFoldBinaryOpOperands() instead, to handle the case
where not all binary ops have a constant expression variant.

This is a bit awkward because we only want to pop the element from
Ops once we're sure that it has folded.
2022-07-04 15:17:22 +02:00
Nuno Lopes 53dc0f1078 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-07-03 14:34:03 +01:00
Nuno Lopes 022bd92c78 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-03 12:32:19 +01:00
Nuno Lopes 7c4f45f87a Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]
This reverts commits 47e6f98f84 and 3e701bcd2a
2022-07-01 23:53:41 +01:00
Nuno Lopes 47e6f98f84 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-01 23:31:31 +01:00
Nuno Lopes 373571dbb4 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-06-30 23:01:43 +01:00
Nuno Lopes 0586d1cac2 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-06-30 21:47:31 +01:00
Nikita Popov 014c4bdb9d [VNCoercion] Use ConstantFoldLoadFromConst API (NFCI)
Nowdays we have a generic constant folding API to load a type from
an offset. It should be able to do anything that VNCoercion can do.

This avoids the weird templating between IRBuilder and ConstantFolder
in one function, which is will stop working as the IRBuilderFolder
moves from CreateXYZ to FoldXYZ APIs.

Unfortunately, this doesn't eliminate this pattern from VNCoercion
entirely yet.
2022-06-30 14:52:27 +02:00
Nikita Popov 10c531cd5b [SCCP] Simplify CFG in SCCP as well
Currently, we only remove dead blocks and non-feasible edges in
IPSCCP, but not in SCCP. I'm not aware of any strong reason for
that difference, so this patch updates SCCP to perform the CFG
cleanup as well.

Compile-time impact seems to be pretty minimal, in the 0.05%
geomean range on CTMark.

For the test case from https://reviews.llvm.org/D126962#3611579
the result after -sccp now looks like this:

    define void @test(i1 %c) {
    entry:
      br i1 %c, label %unreachable, label %next
    next:
      unreachable
    unreachable:
      call void @bar()
      unreachable
    }

-jump-threading does nothing on this, but -simplifycfg will produce
the optimal result.

Differential Revision: https://reviews.llvm.org/D128796
2022-06-30 09:25:03 +02:00
Nikita Popov 2124b2f0e6 [JumpThreading] Avoid ConstantExpr::get() (NFCI)
This code requires the result to be an UndefValue/ConstantInt
anyway (checked by getKnownConstant), so we are only interested
in the case where this folds.
2022-06-29 16:43:05 +02:00
Nikita Popov 0af53fcb99 [SROA] Don't create constant expressions (NFC)
Use IRBuilder instead, which will fold these. Just to clarify
that this does not actually create any udiv expression.
2022-06-29 11:51:22 +02:00
Guillaume Chatelet 3c126d5fe4 [Alignment] Replace commonAlignment with std::min
`commonAlignment` is a shortcut to pick the smallest of two `Align`
objects. As-is it doesn't bring much value compared to `std::min`.

Differential Revision: https://reviews.llvm.org/D128345
2022-06-28 07:15:02 +00:00
Congzhe Cao b941857b40 [LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.

The patch proposed to use a new cost model for loop interchange,
which is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of
loops [loop0, loop1, loop2, ...] where loop0 should be replaced as
the outermost loop, loop1 should be placed one more level inside, and
loop2 one more level inside, etc. What loop cache analysis does is not
only more comprehensive than the current cost model, it is also a "one-shot"
query which means that we only need to query it once during the entire
loop interchange pass, which is better than the current cost model where
we query it every time we check whether it is profitable to interchange
two loops. Thus complexity is reduced, especially after D120386 where we
do more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some
corrections. One change that applies to all tests is that we added an option
`-cache-line-size=64` to the RUN lines. This is ensure that loop
cache analysis receives a valid number of cache line size for correct
analysis. Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but
keep it as fall-back in case the new cost model did not run successfully.
This is because currently we have some limitations in delinearization, which
sometimes makes loop cache analysis bail out. The longer term goal is to
enhance delinearization and eventually remove the legacy cost model
compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-28 00:08:37 -04:00
Kazu Hirata d08f34b592 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-26 18:31:51 -07:00
Kazu Hirata a81b64a1fb [llvm] Use Optional::has_value instead of Optional::hasValue (NFC)
This patch replaces x.hasValue() with x.has_value() where x is not
contextually convertible to bool.
2022-06-26 16:10:42 -07:00
Nuno Lopes 6ef9a2ad01 [LICM] Use poison to replace unreachable values instead of undef [NFC] 2022-06-26 14:56:35 +01:00
Nuno Lopes 3fa2411dc5 [LoopSimplifyCFG] use poison when replacing dead instructions instead of undef [NFC] 2022-06-26 14:15:55 +01:00
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00