Commit Graph

11185 Commits

Author SHA1 Message Date
Rosie Sumpter 46abd1fbe8 [LoopFlatten] Fix assertion failure in checkOverflow
There is an assertion failure in computeOverflowForUnsignedMul
(used in checkOverflow) due to the inner and outer trip counts
having different types. This occurs when the IV has been widened,
but the loop components are not successfully rediscovered.
This is fixed by some refactoring of the code in findLoopComponents
which identifies the trip count of the loop.
2021-08-13 10:07:49 +01:00
Florian Hahn f999312872
Recommit "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts the revert 28c04794df.

The failing MLIR test that caused the revert should be fixed  in this
version.

Also includes a PPC test fix previously in 1f87c7c478.
2021-08-12 18:31:57 +01:00
maekawatoshiki dd3eea6566 [LICM] Support sinking in LNICM
Currently, LNICM pass does not support sinking instructions out of loop nest.
This patch enables LNICM to sink down as many instructions to the exit block of outermost loop as possible.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D107219
2021-08-13 00:56:26 +09:00
Mehdi Amini 28c04794df Revert "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts commit a1ef81de35.

Broke the MLIR buildbot.
2021-08-12 11:57:19 +00:00
Florian Hahn a1ef81de35
[Matrix] Overload stride arg in matrix.columnwise.load/store.
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.

This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.

Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.

Fixes PR51304.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D107349
2021-08-12 10:45:25 +01:00
Christudasan Devadasan 5d940b71ae Reapply "SROA: Enhance speculateSelectInstLoads"
Originally committed as ffc3fb665d
Reverted in fcf2d5f402 due to an
assertion failure.

Original commit message:

Allow the folding even if there is an
intervening bitcast.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106667
2021-08-11 22:58:54 -04:00
Christopher Di Bella c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Nikita Popov 17db125b48 [MemCpyOpt] Optimize MemoryDef insertion
When converting a store into a memset, we currently insert the new
MemoryDef after the store MemoryDef, which requires all uses to be
renamed to the new def using a whole block scan. Instead, we can
insert the new MemoryDef before the store and not rename uses,
because we know that the location is immediately overwritten, so
all uses should still refer to the old MemoryDef. Those uses will
get renamed when the old MemoryDef is actually dropped, which is
efficient.

I expect something similar can be done for some of the other MSSA
updates in MemCpyOpt. This is an alternative to D107513, at least
for this particular case.

Differential Revision: https://reviews.llvm.org/D107702
2021-08-10 21:28:29 +02:00
Christudasan Devadasan fcf2d5f402 Revert "SROA: Enhance speculateSelectInstLoads"
This reverts commit ffc3fb665d.
2021-08-09 01:13:39 -04:00
Nikita Popov 88003cea1c [MemCpyOpt] Remove MemDepAnalysis-based implementation
The MemorySSA-based implementation has been enabled for a few months
(since D94376). This patch drops the old MDA-based implementation
entirely.

I've kept this to only the basic cleanup of dropping various
conditions -- the code could be further cleaned up now that there
is only one implementation.

Differential Revision: https://reviews.llvm.org/D102113
2021-08-07 22:35:44 +02:00
Christudasan Devadasan ffc3fb665d SROA: Enhance speculateSelectInstLoads
Allow the folding even if there is an
intervening bitcast.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106667
2021-08-07 09:09:14 -04:00
Artem Belevich 6a9cf21f5a [CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA.
Attempt to enable MemCpyOpt unconditionally in D104801 uncovered the fact that
there are users that do not expect LLVM to materialize `memset` intrinsic.

While other passes can do that, too, MemCpyOpt triggers it more frequently and
breaks sanitizers and some downstream users.

For now introduce a flag to force-enable the flag and opt-in only CUDA
compilation with NVPTX back-end.

Differential Revision: https://reviews.llvm.org/D106401
2021-08-06 11:13:52 -07:00
Michael Liao d1cacd5928 [MemCpyOpt] Teach memcpyopt to handle loads from the constant memory.
- Loads from the constant memory (either explicit one or as the source
  of memory transfer intrinsics) won't alias any stores.

Reviewed By: asbirlea, efriedma

Differential Revision: https://reviews.llvm.org/D107605
2021-08-06 12:43:52 -04:00
Chris Jackson 113a06f7a5 {DebugInfo][LSR] Don't cache dbg.value that are already undef
The SCEV-based salvaging method caches dbg.value information pre-LSR so
that salvaging may be attempted post-LSR. If the dbg.value are already
undef pre-LSR then a salvage attempt would be fruitless, so avoid
caching them.

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D107448
2021-08-05 19:16:43 +01:00
Kazu Hirata 72661f337a [Transforms] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2021-08-05 08:53:17 -07:00
eopXD fd7f6a3c81 [NFC][LoopIdiom] rename boolean variable NegStride to IsNegStride
Rename variable for better code readability.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107570
2021-08-05 23:11:42 +08:00
Dawid Jurczak 06206a8cd1 [BuildLibCalls][NFC] Remove redundant attribute list from emitCalloc
Additionally with this patch aligned DSE which is the only user of emitCalloc.

Differential Revision: https://reviews.llvm.org/D103523
2021-08-05 16:18:38 +02:00
eopXD 26aa1bbe97 [NFCI] [LoopIdiom] Let processLoopStridedStore take StoreSize as SCEV instead of unsigned
Letting it take SCEV allows further modification on the function to optimize
if the StoreSize / Stride is runtime determined.

This is a preceeding of D107353.
The big picture is to let LoopIdiom deal with runtime-determined sizes.

Reviewed By: Whitney, lebedev.ri

Differential Revision: https://reviews.llvm.org/D104595
2021-08-05 13:21:48 +08:00
Nikita Popov bb15861e14 [MemCpyOpt] Relax libcall checks
Rather than blocking the whole MemCpyOpt pass if the libcalls are
not available, only disable creation of new memset/memcpy intrinsics
where only load/stores were used previously. This only affects the
store merging and load-store conversion optimization. Other
optimizations are derived from existing intrinsics, which are
well-defined in the absence of libcalls -- not having the libcalls
just means that call simplification won't convert them to intrinsics.

This is a weaker variation of D104801, which dropped these checks
entirely. Ideally we would not couple emission of intrinsics to
libcall availability at all, but as the intrinsics may be legalized
to libcalls we need to be a bit careful right now.

Differential Revision: https://reviews.llvm.org/D106769
2021-08-04 21:17:51 +02:00
Dawid Jurczak 238139be09 [DSE][NFC] Clean up DeadStoreElimination from unused variables
Differential Revision: https://reviews.llvm.org/D106446
2021-08-04 19:44:40 +02:00
Chris Jackson 21ee38e24f [DebugInfo][LSR] Avoid crashes on large integer inputs
SCEV-based salvaging in LSR translates SCEVs to DIExpressions. SCEVs may
contain very large integers but the translation does not support
integers greater than 64 bits. This patch adds checks to ensure
conversions of these large integers is not attempted. A regression test
is added to ensure no such translation is attempted.

Reviewed by: StephenTozer

PR: https://bugs.llvm.org/show_bug.cgi?id=51329

Differential Revision: https://reviews.llvm.org/D107438
2021-08-04 15:51:22 +01:00
Roman Lebedev 6f6e9a867f
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107271
2021-08-03 00:57:26 +03:00
Nikita Popov c7770574f9 Revert "[unroll] Move multiple exit costing into consumer pass [NFC]"
This reverts commit 76940577e4.

This causes Transforms/LoopUnroll/ARM/multi-blocks.ll to fail.
2021-08-02 22:23:34 +02:00
Philip Reames 76940577e4 [unroll] Move multiple exit costing into consumer pass [NFC]
This aligns the multiple exit costing with all the other cost decisions.  Note that UnrollAndJam, which is the only other caller of the original home of this code, unconditionally bails out of multiple exit loops.
2021-08-02 12:46:23 -07:00
Nikita Popov 380b8a603c [DFAJumpThreading] Use SmallPtrSet for Visited (NFC)
This set is only used for contains checks, so there is no need to
use std::set.
2021-08-02 21:30:25 +02:00
Nikita Popov 3f7aea1a37 [DFAJumpThreading] Use insert return value (NFC)
Rather than find + insert. Also use range based for loop.
2021-08-02 21:21:21 +02:00
Nikita Popov 84602f98c6 [DFAJumpThreading] Remove unnecessary includes (NFC)
This file uses neither unordered_map nor unordered_set.
2021-08-02 21:13:30 +02:00
Nikita Popov e97524cba2 [DFAJumpThreading] Mark DT as preserved in LegacyPM
It is marked as preserved in NewPM, but not LegacyPM.
2021-08-02 21:13:30 +02:00
Rosie Sumpter f117ed542f [LoopFlatten] Fix missed LoopFlatten opportunity
When the limit of the inner loop is a known integer, the InstCombine
pass now causes the transformation e.g. imcp ult i32 %inc, tripcount ->
icmp ult %j, tripcount-step (where %j is the inner loop induction
variable and %inc is add %j, step), which is now accounted for when
identifying the trip count of the loop. This is also an acceptable use
of %j (provided the step is 1) so is ignored as long as the compare
that it's used in is also the condition of the inner branch.

Differential Revision: https://reviews.llvm.org/D105802
2021-08-02 11:09:54 +01:00
Sanjay Patel f2a322bfcf [SROA] prevent crash on large memset length (PR50910)
I don't know much about this pass, but we need a stronger
check on the memset length arg to avoid an assert. The
current code was added with D59000.
The test is reduced from:
https://llvm.org/PR50910

Differential Revision: https://reviews.llvm.org/D106462
2021-07-31 14:07:30 -04:00
Brendon Cahoon c4c379d633 [LoopStrengthReduction] Fix pointer extend asserts
Additional asserts were added to ScalarEvolution to enforce
pointer/int type rules. An assert is triggered when the LSR pass
attempts to extend a pointer SCEV in GenerateTruncates.

This patch changes GenerateTruncates to exit early if the Formaula
contains a ScaledReg or BaseReg with a pointer type.

Differential Revision: https://reviews.llvm.org/D107185
2021-07-30 17:24:08 -04:00
Dawid Jurczak 5c315bee8c [DSE] Transform memset + malloc --> calloc (PR25892)
After this change DSE can eliminate malloc + memset and emit calloc.
It's https://reviews.llvm.org/D101440 follow-up.

Differential Revision: https://reviews.llvm.org/D103009
2021-07-29 18:34:10 +02:00
Rosie Sumpter fab5659c79 Revert "[LoopFlatten] Fix missed LoopFlatten opportunity"
This reverts commit 2df8bf9339.

Reverting because it causes an assertion failure.
2021-07-29 15:52:45 +01:00
Jeremy Morse 2537120c87 Follow-up to D105207, only salvage affine SCEVs to avoid a crash
SCEVToIterCountExpr only expects to be fed affine expressions, but
DbgRewriteSalvageableDVIs is feeding it non-affine induction variables.
Following this up with an obvious fix, will add test coverage too if
this avoids D105207 being reverted.
2021-07-29 11:48:08 +01:00
Rosie Sumpter 2df8bf9339 [LoopFlatten] Fix missed LoopFlatten opportunity
When the trip count of the inner loop is a constant, the InstCombine
pass now causes the transformation e.g. imcp ult i32 %inc, tripcount ->
icmp ult %j, tripcount-step (where %j is the inner loop induction
variable and %inc is add %j, step), which is now accounted for when
identifying the trip count of the loop. This is also an acceptable use
of %j (provided the step is 1) so is ignored as long as the compare
that it's used in is also the condition of the inner branch.

Differential Revision: https://reviews.llvm.org/D105802
2021-07-29 09:47:41 +01:00
Chris Jackson 0ba8595287 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
Reapply commit d675b594f4 that was
reverted due to buildbot failures. A simple fix has been applied to
remove an assertion.

Differential Revision: https://reviews.llvm.org/D105207
2021-07-28 23:04:59 +01:00
Sjoerd Meijer bc43078fe8 [LoopFlatten] Fix bug where SCEVCouldNotCompute object is used
The SCEV method getBackedgeTakenCount() returns a SCEVCouldNotCompute
object if the backedge-taken count is unpredictable. This fix ensures
there is no longer an attempt to use such an object to find the trip
count.

Patch by: Rosie Sumpter.

Differential Revision: https://reviews.llvm.org/D106970
2021-07-28 18:35:08 +01:00
Chris Jackson 3992896043 Revert "[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR"
Reverted due to buildbot failures.
This reverts commit d675b594f4.
2021-07-28 16:44:54 +01:00
Chris Jackson d675b594f4 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
Reapply commit 796b84d26f that was
reverted due to reports of crashes. A minor change now guards against
getVariableLocationOperand() returning a nullptr.

Differential Revision: https://reviews.llvm.org/D106659
2021-07-28 16:28:46 +01:00
Sanjay Patel 5b83261c15 [DivRemPairs] make sure we have a valid CFG for hoisting division
This transform was added with e38b7e8948
and as shown in:
https://llvm.org/PR51241
...it could crash without an extra check of the blocks.

There might be a more compact way to write this constraint,
but we can't just count the successors/predecessors without
affecting a test that includes a switch instruction.
2021-07-28 11:09:12 -04:00
Chris Jackson 04b94c7cae Revert "[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR"
Crashes were reported on the upstreamm revision:
https://reviews.llvm.org/D105207

This reverts commit 796b84d26f.
2021-07-28 10:05:54 +01:00
Benjamin Kramer 05815c9f63 Remove unused include that's also a layering violation. NFC. 2021-07-27 21:21:55 +02:00
Adam Nemet d87d3615f7 [Matrix] Fix shape for factored transpose
The shape of the input is C x R.

Differential Revision: https://reviews.llvm.org/D106722
2021-07-27 11:36:13 -07:00
Adam Nemet bf7eb48454 [Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo
As an instruction is replaced in optimizeTransposes RAUW will replace it in
the ShapeMap (ShapeMap is ValueMap so that uses are updated).  In
finalizeLowering however we skip updating uses if they are in the ShapeMap
since they will be lowered separately at which point we pick up the lowered
operands.

In the testcase what happened was that since we replaced the doubled-transpose
with the shuffle, it ended up in the ShapeMap.  As we lowered the
columnwise-load the use in the shuffle was not updated.  Then as we removed
the original columnwise-load we changed that to an undef.  I.e. we ended up
with:

```
%shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32>
                                   ^^^^^
                                  <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6>
```

Besides the fix itself, I have fortified this last bit.  As we change uses to
undef when removing instruction we track the undefed instruction to make sure
we eventually remove those too.  This would have caught the issue at compile
time.

Differential Revision: https://reviews.llvm.org/D106714
2021-07-27 11:36:13 -07:00
Alexey Zhikhartsev 02077da7e7 Add jump-threading optimization for deterministic finite automata
The current JumpThreading pass does not jump thread loops since it can
result in irreducible control flow that harms other optimizations. This
prevents switch statements inside a loop from being optimized to use
unconditional branches.

This code pattern occurs in the core_state_transition function of
Coremark. The state machine can be implemented manually with goto
statements resulting in a large runtime improvement, and this transform
makes the switch implementation match the goto version in performance.

This patch specifically targets switch statements inside a loop that
have the opportunity to be threaded. Once it identifies an opportunity,
it creates new paths that branch directly to the correct code block.
For example, the left CFG could be transformed to the right CFG:

```
          sw.bb                        sw.bb
        /   |   \                    /   |   \
   case1  case2  case3          case1  case2  case3
        \   |   /                /       |       \
        latch.bb             latch.2  latch.3  latch.1
         br sw.bb              /         |         \
                           sw.bb.2     sw.bb.3     sw.bb.1
                            br case2    br case3    br case1
```

Co-author: Justin Kreiner @jkreiner
Co-author: Ehsan Amiri @amehsan

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D99205
2021-07-27 14:34:04 -04:00
Anna Thomas 8ee5759fd5 Strip undef implying attributes when moving calls
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.

This patch introduces such an API and uses it in relevant passes. See
updated tests.

Fix for PR50744.

Reviewed By: nikic, jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D104641
2021-07-27 10:57:05 -04:00
Tres Popp d9e3449aa8 Handle unused variable when assertions are disabled 2021-07-27 15:43:06 +02:00
Chris Jackson 796b84d26f [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reapplies commit 76f3ffb2b2 that was
reverted due to buildbot failures.

- Update lit tests with REQUIRES condition.
- Abandon salvage attempt if SCEVUnknown::getValue() returns nullptr.

Differential Revision: https://reviews.llvm.org/D105207
2021-07-27 14:22:09 +01:00
Chris Jackson 1930c4410d [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reverts commit 76f3ffb2b2 because
of a failure on sanitixer-X86-64-linux-autoconf.
2021-07-27 13:36:56 +01:00
Chris Jackson 76f3ffb2b2 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This patch extends salvaging of debuginfo in the Loop Strength Reduction
(LSR) pass by translating Scalar Evaluations (SCEV) into DIExpressions.
The method is as follows:
- Cache dbg.value intrinsics that are salvageable.
- Obtain a loop Induction Variable (IV) from ScalarExpressionExpander or
  the loop header.
- Translate the IV SCEV into an expression that recovers the current
  loop iteration count. Combine this with the dbg.value's location
  op SCEV to create a DIExpression that salvages the value.

Review by: jmorse

Differential Revision: https://reviews.llvm.org/D105207
2021-07-27 13:00:36 +01:00
Rosie Sumpter 491ac28028 [LoopFlatten] Use SCEV and Loop APIs to identify increment and trip count
Replace pattern-matching with existing SCEV and Loop APIs as a more
robust way of identifying the loop increment and trip count. Also
rename 'Limit' as 'TripCount' to be consistent with terminology.

Differential Revision: https://reviews.llvm.org/D106580
2021-07-27 08:42:59 +01:00
Johannes Doerfert 25a3130d89 [Local] Do not introduce a new `llvm.trap` before `unreachable`
This is the second attempt to remove the `llvm.trap` insertion after
https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6
reverted the first one. It is not clear what the exact issue was back
then and it might already be gone by now, it has been >5 years after
all.

Replaces D106299.

Differential Revision: https://reviews.llvm.org/D106308
2021-07-26 23:33:36 -05:00
Nikita Popov f921bf6049 [MergeICmps] Collect block instructions once (NFC)
Collect the relevant instructions for a given BCECmpBlock once
on construction, rather than repeating this logic in multiple
places.
2021-07-26 18:07:20 +02:00
Nikita Popov c691651c53 [MergeICmps] Try to fix MSVC build failure
Apparently this fails to line up the types -- try to sidestep the
issue entirely by writing the code in a more reasonable way: Walk
over the operands and perform a set lookup, rather than walking
over the set and performing an operand scan.
2021-07-26 17:31:27 +02:00
Nikita Popov 0d3807b365 [MergeICmps] Separate out BCECmp and use Optional (NFC)
Separate out the BCECmp part from BCECmpBlock, which just stores
the comparison atoms without the branch instruction. At the same
time switch the code to return Optional<> rather than objects in
invalid state and partially constructed objects.
2021-07-26 17:06:43 +02:00
Nikita Popov 33146857e9 [IR] Consider non-willreturn as side effect (PR50511)
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.

I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.

Differential Revision: https://reviews.llvm.org/D106749
2021-07-26 16:35:14 +02:00
Nikita Popov f502683750 [MergeICmps] Relax sinking check
The check for sinking instructions past the load + cmp sequence
currently checks for side-effects, which includes writing to memory
and unwinding. However, I don't believe we care about sinking the
instructions past an unwind (as they don't have any side-effects
themselves).

Differential Revision: https://reviews.llvm.org/D106591
2021-07-23 22:16:11 +02:00
Dawid Jurczak bc536c7101 Revert "[DSE] Transform memset + malloc --> calloc (PR25892)"
This reverts commit 43234b1595.

Reason: We should detect that we are implementing 'calloc' and bail out.
2021-07-23 11:51:59 +02:00
Fangrui Song 3b181568db [Matrix] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off build after D106457. NFC 2021-07-22 11:33:02 -07:00
Adam Nemet ce5b1320a7 [Matrix] Fix miscompile for NT matmul if the transpose has other use
We should only add the fake lowering entry for the matrix remark if the
transpose is not lowered on its own.  `MapVector::insert` is used to insert
the entry during proper lowering which does not overwrite the fake entry in
the map.

We actually had test coverage for this but the reference output code was
wrong; it was storing undef rather than the transposed column.

Also add an assert that would have caught this.

Differential Revision: https://reviews.llvm.org/D106457
2021-07-22 10:45:56 -07:00
Dawid Jurczak 11338e998d [LoopIdiom] Transform memmove-like loop into memmove (PR46179)
The purpose of patch is to learn Loop idiom recognition pass how to recognize simple memmove patterns
in similar way like GCC: https://godbolt.org/z/fh95e83od
LoopIdiomRecognize already has machinery for memset and memcpy recognition, patch tries to extend exisiting capabilities with minimal effort.

Differential Revision: https://reviews.llvm.org/D104464
2021-07-22 13:05:43 +02:00
Sanjay Patel f14495dc75 [SROA] avoid crash on memset with constant expression length
https://llvm.org/PR50888
2021-07-21 15:20:28 -04:00
Rosie Sumpter 44c9adb414 [LoopFlatten][LoopInfo] Use Loop to identify latch compare instruction
Make getLatchCmpInst non-static and use it in LoopFlatten as a more
robust way of identifying the compare.

Differential Revision: https://reviews.llvm.org/D106256
2021-07-21 10:14:18 +01:00
Dawid Jurczak 43234b1595 [DSE] Transform memset + malloc --> calloc (PR25892)
After this change DSE can eliminate malloc + memset and emit calloc.
It's https://reviews.llvm.org/D101440 follow-up.

Differential Revision: https://reviews.llvm.org/D103009
2021-07-20 11:39:05 +02:00
Artem Belevich 1a43ee65d1 Revert "[MemCpyOpt] Enable memcpy optimizations unconditionally."
This reverts commit 2c98298a75 which breaks
sanitizers.
2021-07-19 14:27:41 -07:00
Artem Belevich b988d69ea2 [infer-address-spaces] Handle complex non-pointer constexpr arguments.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51099

Differential Revision: https://reviews.llvm.org/D106098
2021-07-19 12:15:52 -07:00
Artem Belevich 2c98298a75 [MemCpyOpt] Enable memcpy optimizations unconditionally.
The patch does not depend on the availability of the library functions for
memcpy/memset as it operates on LLVM intrinsics.  The optimizations are useful
on the targets that have these functions disabled (e.g. NVPTX & AMDGPU).

Differential Revision: https://reviews.llvm.org/D104801
2021-07-19 11:58:02 -07:00
maekawatoshiki 74f0f9a455 [LICM] Create LoopNest Invariant Code Motion (LNICM) pass
This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works.
Basically, LNICM only hoists invariants out of loop nest (not a loop) to keep/make perfect loop nest. This enables later optimizations that require perfect loop nest.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D104180
2021-07-20 00:31:18 +09:00
Rosie Sumpter 34d6820551 [LoopFlatten] Use Loop to identify loop induction phi. NFC
Replace code which identifies induction phi with helper function
getInductionVariable to improve robustness.

Differential Revision: https://reviews.llvm.org/D106045
2021-07-19 09:06:57 +01:00
Congzhe Cao a7b7d22d6e [LoopInterchange] Check lcssa phis in the inner latch in scenarios of multi-level nested loops
We already know that we need to check whether lcssa
phis are supported in inner loop exit block or in
outer loop exit block, and we have logic to check
them already. Presumably the inner loop latch does
not have lcssa phis and there is no code that deals
with lcssa phis in the inner loop latch. However,
that assumption is not true, when we have loops
with more than two-level nesting. This patch adds
checks for lcssa phis in the inner latch.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102300
2021-07-16 11:59:20 -04:00
Max Kazantsev f98ed74f69 [LSR] Handle case 1*reg => reg. PR50918
This patch addresses assertion failure in case when the only found formula for LSR
is `1*reg => reg` which was supposed to be an impossible situation, however there
is a test that shows it is possible.

In this case, we can use scale register with scale of 1 as the missing base register.

Reviewed By: huihuiz, reames
Differential Revision: https://reviews.llvm.org/D105009
2021-07-16 11:33:59 +07:00
Arthur Eubanks 5366de7375 [SimpleLoopUnswitch] Don't non-trivially unswitch loops with catchswitch exits
SplitBlock() can't handle catchswitch.

Fixes PR50973.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D105672
2021-07-14 14:07:28 -07:00
Arthur Eubanks 0024ec59a0 [NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch
To help with debugging non-trivial unswitching issues.

Don't care about the legacy pass, nobody is using it.

If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't
default to the empty constructor for the pass params. We should still
let the parser take care of it in case the parser has its own defaults.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D105933
2021-07-13 16:09:42 -07:00
Eli Friedman 5d1ba53404 [LoopReroll] Add an extra defensive check to avoid SCEV assertion.
Make sure getMinusSCEV() didn't return a pointer.  The following check
would never succeed if it was a pointer, anyway, but calling
getMulExpr() on a pointer SCEV now asserts.
2021-07-13 12:17:09 -07:00
Arthur Eubanks 489742991f [NFC] Inline variable to prevent unused variable warning 2021-07-13 09:57:59 -07:00
Arthur Eubanks ab5693aa4a [OpaquePtr] Use byval type more 2021-07-13 09:34:34 -07:00
Arthur Eubanks 113a807977 [OpaquePtr] Get load/store type without PointerType::getElementType() 2021-07-13 09:34:34 -07:00
Yevgeny Rouban 88024a724c [RS4GC] Use one DVCache for both inlineGetBaseAndOffset() and insertParsePoints()
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103240
2021-07-12 18:13:00 +07:00
Craig Topper e38b7e8948 [DivRemPairs] Add an initial case for hoisting to a common predecessor.
This patch adds support for hoisting the division and maybe the
remainder for control flow graphs like this.

```
PredBB
  |  \
  |  Rem
  |  /
 Div
```

If we have DivRem we'll hoist both to PredBB. If not we'll just
hoist Div and expand Rem using the Div.

This improves our codegen for something like this

```
__uint128_t udivmodti4(__uint128_t dividend, __uint128_t divisor, __uint128_t *remainder) {
    if (remainder != 0)
        *remainder = dividend % divisor;
    return dividend / divisor;
}
```

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D87555
2021-07-11 10:03:07 -07:00
Eli Friedman 9c4baf5101 [ScalarEvolution] Strictly enforce pointer/int type rules.
Rules:

1. SCEVUnknown is a pointer if and only if the LLVM IR value is a
   pointer.
2. SCEVPtrToInt is never a pointer.
3. If any other SCEV expression has no pointer operands, the result is
   an integer.
4. If a SCEVAddExpr has exactly one pointer operand, the result is a
   pointer.
5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other
   pointer operands, the result is a pointer.
6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a
   pointer.
7. Otherwise, the SCEV expression is invalid.

I'm not sure how useful rule 6 is in practice.  If we exclude it, we can
guarantee that ScalarEvolution::getPointerBase always returns a
SCEVUnknown, which might be a helpful property. Anyway, I'll leave that
for a followup.

This is basically mop-up at this point; all the changes with significant
functional effects have landed.  Some of the remaining changes could be
split off, but I don't see much point.

Differential Revision: https://reviews.llvm.org/D105510
2021-07-09 17:29:26 -07:00
Arthur Eubanks 9a7afae492 [OpaquePtr][InferAddrSpace] Use PointerType::getWithSamePointeeType() 2021-07-09 10:29:08 -07:00
Roman Lebedev 4e332cd41a
Revert "Transform memset + malloc --> calloc (PR25892)"
It broke `check-msan`, see e.g. https://lab.llvm.org/buildbot/#/builders/18/builds/1934

This reverts commit 375694a07b.
2021-07-09 16:26:48 +03:00
Max Kazantsev 9c5e65691e [LoopDeletion] Handle switch in proving that loop exits on first iteration
Added check for switch-terminated blocks in loops.
Now if a block is terminated with a switch, we try to find out which of the
cases is taken on 1st iteration and mark corresponding edge from the block
to the case successor as live.

Patch by Dmitry Makogon!

Differential Revision: https://reviews.llvm.org/D105688
Reviewed By: nikic, mkazantsev
2021-07-09 18:03:34 +07:00
Dawid Jurczak 375694a07b Transform memset + malloc --> calloc (PR25892)
After this change DSE can eliminate malloc + memset and emit calloc.
It's https://reviews.llvm.org/D101440 follow-up.

Differential Revision: https://reviews.llvm.org/D103009
2021-07-09 11:01:12 +02:00
Bjorn Pettersson 472462c472 [NewPM] Consistently use 'simplifycfg' rather than 'simplify-cfg'
There was an alias between 'simplifycfg' and 'simplify-cfg' in the
PassRegistry. That was the original reason for this patch, which
effectively removes the alias.

This patch also replaces all occurrances of 'simplify-cfg'
by 'simplifycfg'. Reason for choosing that form for the name is
that it matches the DEBUG_TYPE for the pass, and the legacy PM name
and also how it is spelled out in other passes such as
'loop-simplifycfg', and in other options such as
'simplifycfg-merge-cond-stores'.

I for some reason the name should be changed to 'simplify-cfg' in
the future, then I think such a renaming should be more widely done
and not only impacting the PassRegistry.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D105627
2021-07-09 09:47:03 +02:00
David Blaikie 1def2579e1 PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
2021-07-08 13:37:57 -07:00
Max Kazantsev 19885c7adf [NFC] Remove duplicate function calls
Removed repeated call of L->getHeader(). Now using previously stored return value.

Patch by Dmitry Makogon!

Differential Revision: https://reviews.llvm.org/D105535
Reviewed By: mkazantsev
2021-07-07 17:02:36 +07:00
Eli Friedman 7ac1c7bead Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we
want to forbid multiplying a pointer by -1 (or any other value). This
means we can't blindly subtract pointers.

There are a few ways we could deal with this:
1. We could completely forbid subtracting pointers in getMinusSCEV()
2. We could forbid subracting pointers with different pointer bases
(this patch).
3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code
that works with normal pointers will also work with non-integral
pointers. And it seems like there are very few places that actually
benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV
still ends up subtracting pointers if they have the same base.  This
should eliminate the shared pointer base, but eventually we'll need to
rewrite it to avoid negating the pointer base. I plan to do this as a
separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one
case that is significantly affected is ICmpZero handling in LSR (which
is the source of almost all the test changes).  The resulting changes
seem okay to me, but suggestions welcome.  As an alternative, I tried
explicitly ptrtoint'ing the operands, but the result doesn't seem
obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out
how to repair it to test what it was actually trying to test.

Recommitting with fix to MemoryDepChecker::isDependent.

Differential Revision: https://reviews.llvm.org/D104806
2021-07-06 12:16:05 -07:00
Eli Friedman a6d081b2cb Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers."
This reverts commit 74d6ce5d5f.

Seeing crashes on buildbots in MemoryDepChecker::isDependent.
2021-07-06 11:17:13 -07:00
Eli Friedman 74d6ce5d5f [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we
want to forbid multiplying a pointer by -1 (or any other value). This
means we can't blindly subtract pointers.

There are a few ways we could deal with this:
1. We could completely forbid subtracting pointers in getMinusSCEV()
2. We could forbid subracting pointers with different pointer bases
(this patch).
3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code
that works with normal pointers will also work with non-integral
pointers. And it seems like there are very few places that actually
benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV
still ends up subtracting pointers if they have the same base.  This
should eliminate the shared pointer base, but eventually we'll need to
rewrite it to avoid negating the pointer base. I plan to do this as a
separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one
case that is significantly affected is ICmpZero handling in LSR (which
is the source of almost all the test changes).  The resulting changes
seem okay to me, but suggestions welcome.  As an alternative, I tried
explicitly ptrtoint'ing the operands, but the result doesn't seem
obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out
how to repair it to test what it was actually trying to test.

Differential Revision: https://reviews.llvm.org/D104806
2021-07-06 10:54:41 -07:00
Jon Roelofs 37b6e03c18 [Intrinsics] Make MemCpyInlineInst a MemCpyInst
This opens up more optimization opportunities in passes that already handle MemCpyInst's.

Differential revision: https://reviews.llvm.org/D105247
2021-07-02 10:25:24 -07:00
Florian Hahn a3ca578eb9
[Matrix] Fix crash during fusion if the same load is re-used.
This patch fixes a crash when the same load is used for both operands of
a fuseable multiply.
2021-07-02 14:00:17 +01:00
Florian Hahn 7655061cc6
[Matrix] Hoist address computation before multiply to enable fusion.
If the store address does not dominate the matrix multiply, try to hoist
address computation instructions without side-effects and/or memory
reads before the multiply, to allow fusion.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D105193
2021-07-02 09:52:11 +01:00
Evgeniy Brevnov 9568811cb8 [NFC][DSE]Change 'do-while' to 'for' loop to simplify code structure
With 'for' loop there is is a single place where 'Current' is adjusted. It helps to avoid copy paste and makes a bit easy to understand overall loop controll flow.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D101044
2021-07-02 10:00:47 +07:00
Craig Topper 066524ea54 [ScalarizeMaskedMemIntrin][SelectionDAGBuilder] Use the element type to calculate alignment for gather/scatter when alignment operand is 0.
Previously we used the vector type, but we're loading/storing
invididual elements so I think only element alignment should matter.

Noticed while looking at the code for something else so I don't
have a test case.

Differential Revision: https://reviews.llvm.org/D105220
2021-07-01 19:08:47 -07:00
Reshabh Sharma ae983de6cc [InferAddressSpaces] NFC: For noop IntToPtr/PtrToInt pair cast to operator instead of PtrToInt
Compiler crashes at an assertion while casting operands to PtrToIntInst at some cases when
ptrtoint is present as an explicit operand to inttoptr. Explicit instruction operator as
operand can not be casted to an Instruction.

This patch replaces cast from PtrToInst to Operator which are later checked for constant
expressions.

Differential Revision: https://reviews.llvm.org/D105002
2021-06-28 19:24:26 +05:30
Max Kazantsev d58514d41c [LSR][NFC] Make sure that after the canonicalization the formula is canonical 2021-06-28 12:50:04 +07:00
Max Kazantsev 7c73c2ede8 [LoopDeletion] Benefit from branches by undef conditions when symbolically executing 1st iteration
We can exploit branches by `undef` condition. Frankly, the LangRef says that
such branches are UB, so we can assume that all outgoing edges of such blocks
are dead.

However, from practical perspective, we know that this is not supported correctly
in some other places. So we are being conservative about it.

Branch by undef is treated in the following way:
- If it is a loop-exiting branch, we always assume it exits the loop;
- If not, we arbitrarily assume it takes `true` value.

Differential Revision: https://reviews.llvm.org/D104689
Reviewed By: nikic
2021-06-28 11:39:46 +07:00
Nikita Popov e81702912e [DSE] Preserve address space
Preserve address space when inserting i8* cast.
2021-06-27 20:26:00 +02:00
Nikita Popov 9aa951e80e [MemCpyOpt] Preserve address space
Preserve address space when generating the cast to i8*.
2021-06-27 20:21:19 +02:00