Commit Graph

31465 Commits

Author SHA1 Message Date
Florian Hahn ac80b0e84f
[LV] Mark Instr as const in scalarizeInstruction. (NFC).
This is to reduce the diff in follow-up changes.
2022-09-13 09:10:02 +01:00
Max Kazantsev 86d5586d78 [SCEVExpander] Recompute poison-generating flags on hoisting. PR57187
Instruction being hoisted could have nuw/nsw flags inferred from the old
context, and we cannot simply move it to the new location keeping them
because we are going to introduce new uses to them that didn't exist before.

Example in https://github.com/llvm/llvm-project/issues/57187 shows how
this can produce branch by poison from initially well-defined program.

This patch forcefully recomputes poison-generating flag in the new context.

Differential Revision: https://reviews.llvm.org/D132022
Reviewed By: fhahn, nikic
2022-09-13 12:56:35 +07:00
Kazu Hirata 9606608474 [llvm] Use x.empty() instead of llvm::empty(x) (NFC)
I'm planning to deprecate and eventually remove llvm::empty.

I thought about replacing llvm::empty(x) with std::empty(x), but it
turns out that all uses can be converted to x.empty().  That is, no
use requires the ability of std::empty to accept C arrays and
std::initializer_list.

Differential Revision: https://reviews.llvm.org/D133677
2022-09-12 13:34:35 -07:00
Sanjay Patel 53eede597e [InstCombine] look through 'not' of ctlz/cttz op with 0-is-undef
https://alive2.llvm.org/ce/z/MNsC1S

This pattern was flagged at:
https://discourse.llvm.org/t/instcombines-select-optimizations-dont-trigger-reliably/64927
2022-09-12 15:06:21 -04:00
Benjamin Kramer 2675c41671 [DFSan] Don't crash with the legacy pass manager
TargetLibraryInfo isn't optional, so we have to provide it even with the
lageacy stuff. Ideally we wouldn't need it anymore but there are still
users out there that are stuck on the legacy PM.

Differential Revision: https://reviews.llvm.org/D133685
2022-09-12 19:11:55 +02:00
A-Wadhwani de3445e0ef [SROA] Create additional vector type candidates based on store and load slices
This patch adds additional vector types to be considered when doing
promotion in SROA, based on the types of the store and load slices. This
provides more promotion opportunities, by potentially using an optimal
"intermediate" vector type.

For example, the following code would currently not be promoted to a
vector, since `__m128i` is a `<2 x i64>` vector.

```

__m128i packfoo0(int a, int b, int c, int d) {
  int r[4] = {a, b, c, d};
  __m128i rm;
  std::memcpy(&rm, r, sizeof(rm));
  return rm;
}
```

```
packfoo0(int, int, int, int):
        mov     dword ptr [rsp - 24], edi
        mov     dword ptr [rsp - 20], esi
        mov     dword ptr [rsp - 16], edx
        mov     dword ptr [rsp - 12], ecx
        movaps  xmm0, xmmword ptr [rsp - 24]
        ret
```

By also considering the types of the elements, we could find that the
`<4 x i32>` type would be valid for promotion, hence removing the memory
accesses for this function. In other words, we can explore other new
vector types, with the same size but different element types based on
the load and store instructions from the Slices, which can provide us
more promotion opportunities.

Additionally, the step for removing duplicate elements from the
`CandidateTys` vector was not using an equality comparator, which has
been fixed.

Differential Revision: https://reviews.llvm.org/D132096
2022-09-12 09:55:37 -07:00
Sanjay Patel 4ca25c66d4 [Reassociate] prevent partial undef negation replacement
As shown in the examples in issue #57683, we allow matching
vectors with poison (undef) in this transform (and possibly more),
but we can't then use the partially defined value as a replacement
value in other expressions blindly.

This seems to be avoided in simpler examples of reassociation,
and other passes should be able to clean up the redundant op
seen in these tests.
2022-09-12 12:28:34 -04:00
Florian Hahn 3fd1cc2574
[SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.
Adding the pre-header to CSEBlocks ensures instructions are CSE'd even
after hoisting.

This was original discovered by @atrick a while ago.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D133649
2022-09-12 15:53:31 +01:00
Alexey Bataev dfe1e9dd79 [SLP]Improve reordering of clustered reused scalars.
If the reused scalars are clustered, i.e. each part of the reused mask
contains all elements of the original scalars exactly once, we can
reorder those clusters to improve the whole ordering of of the clustered
vectors.

Differential Revision: https://reviews.llvm.org/D133524
2022-09-12 06:52:25 -07:00
Max Kazantsev 0e465c0c2f [IRCE] Bail in case of pointer types. PR40539
We should not unconditionally expect that SCEVable types are all integers
because SCEV can also be computed for pointers. Bail in this case.
2022-09-12 16:01:25 +07:00
Djordje Todorovic b080d0bae8 Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit df868edee5, as it
introduces a bug found by Alive2 (more on the rGdf868edee561).
2022-09-12 08:23:07 +02:00
Johannes Doerfert c922cac868 Revert "[Attributor] AAPointerInfo should allow "harmless" uses"
Revert "[Attributor] Teach AAPointerInfo to look into aggregates"

This reverts commit 844f6c5d03 and
4ed0a88cd8 as they broke the buildbots
that run openmp/libomptarget/test/offloading/bug49021.cpp.
2022-09-11 21:37:54 -07:00
Johannes Doerfert 844f6c5d03 [Attributor] AAPointerInfo should allow "harmless" uses
If a call base use will not capture a pointer we can approximate the
effects. This is important especially for readnone/only uses.
2022-09-11 20:16:11 -07:00
Johannes Doerfert 4ed0a88cd8 [Attributor] Teach AAPointerInfo to look into aggregates
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
2022-09-11 20:16:11 -07:00
Johannes Doerfert b046ebdc01 [Attributor][FIX] Conservatively handle ptr2int, don't crash
If a pointer-2-int cast is found we give up on AAPointerInfo for now.
This caused a crash before.

Reported by John Tramm (@jtramm).
2022-09-11 20:16:11 -07:00
Johannes Doerfert 21711039e3 [OpenMP] Allow the Attributor to look at functions we also internalized
This is important as we have accesses to globals in those which we need to
categorize.
2022-09-11 20:16:11 -07:00
Junduo Dong 6975ab7126 [Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework
The previous implementation of time tracing in NewPassManager is direct but messive.

The key codes are like the demo below:
```
  /// Runs the function pass across every function in the module.
  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                        LazyCallGraph &CG, CGSCCUpdateResult &UR) {
      /// ...
      PreservedAnalyses PassPA;
      {
        TimeTraceScope TimeScope(Pass.name());
        PassPA = Pass.run(F, FAM);
      }
      /// ...
 }
```

It can be bothered to judge where should we add the tracing codes by hands.

With the PassInstrumentation framework, we can easily add `Before/After` callback
functions to add time tracing codes.

Differential Revision: https://reviews.llvm.org/D131960
2022-09-11 05:42:55 -07:00
Florian Hahn 69d9bb2aad
[VPlan] Check recipe uses instead of type of underlying instr (NFC).
Suggested by @Ayal post-commit, to reduce the dependence on the
underlying instruction in favor of information available directly for
the recipe.
2022-09-11 12:24:44 +01:00
Marc Auberer 09cdddea0c [InstCombine] Fold x + (x | -x) to x & (x - 1)
Fixes #57531

This transformation may be particularly useful on x86-64,
because x & (x - 1) can be performed by a single blsr instruction.

Differential Revision: https://reviews.llvm.org/D133362
2022-09-11 06:14:24 -04:00
Alexey Bader 2bb5535b58 [StripDeadDebugInfo] Drop dead CUs
In situations when a submodule is extracted from big module (i.e. using
CloneModule) a lot of debug info is copied via metadata nodes. Despite of
the fact that part of that info is not linked to any instruction in extracted
IR file, StripDeadDebugInfo pass doesn't drop them.
Strengthen criteria for debug info that should be kept in a module:
- Only those compile units are left that referenced by a subprogram debug info
node that is attached to a function definition in the module or to an instruction
in the module that belongs to an inlined function.

Signed-off-by: Mikhail Lychkov <mikhail.lychkov@intel.com>

Differential Revision: https://reviews.llvm.org/D122163
2022-09-11 01:31:03 -07:00
Vitaly Buka b51d1f1fbd [msan] Don't deppend on argumens evaluation order 2022-09-10 15:28:32 -07:00
Vitaly Buka 71c5e7b26a [msan] Do not deppend on arguments evaluation order
Clang and GCC do this differently making IR inconsistent.
https://lab.llvm.org/buildbot#builders/6/builds/13120
2022-09-10 13:50:32 -07:00
Vitaly Buka 1819d5999c [NFC][msan] Remove unused return type 2022-09-10 12:20:54 -07:00
Vitaly Buka 6fc31712f1 [msan] Relax handling of llvm.masked.expandload and llvm.masked.gather
This is work around for new false positives. Real implementation will
follow.
2022-09-10 12:19:16 -07:00
Manuel Brito b51c6130ef Use PoisonValue instead of UndefValue when RAUWing unreachable code [NFC]
Replacing the following instances of UndefValue with PoisonValue, where the UndefValue is used as an arbitrary value:

- llvm/lib/CodeGen/WinEHPrepare.cpp
`demotePHIsOnFunclets`: RAUW arbitrary value for lingering uses of removed PHI nodes

 - llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
`FoldSingleEntryPHINodes`: Removes a self-referential single entry phi node.

 - llvm/lib/Transforms/Utils/CallGraphUpdater.cpp
`finalize`: Remove all references to removed functions.

- llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
`cleanup`: the result is not used then the inserted instructions are removed.

 - llvm/tools/bugpoint/CrashDebugger.cpp
`TestInts`:  the program is cloned and instructions are removed to narrow down source of crash.

Differential Revision: https://reviews.llvm.org/D133640
2022-09-10 14:28:01 +01:00
Florian Hahn da734473fa
[LV] Remove now dead variable after 2a78890b7b (NFC). 2022-09-09 20:25:55 +01:00
Florian Hahn 2a78890b7b
[VPlan] Move SCEV expansion for pointer induction to VPExpandSCEV (NFC).
Use VPExpandSCEVRecipe to expand the step of pointer inductions. This
cleanup addresses a corresponding FIXME.

It should be NFC, as steps for pointer induction must be constants,
which makes expansion trivial.
2022-09-09 19:20:13 +01:00
Sanjay Patel 6113e6738d [InstCombine] move/adjust comments about demanded bits; NFC
The code has been moved/copied around, but the comments were not updated to match.
2022-09-09 11:48:20 -04:00
Philip Reames a33d98e20a [LV] Pull out common expression [nfc] 2022-09-09 07:31:46 -07:00
Philip Reames edb26268ce [VPlan] Only generate single instr for stores uniform across all parts.
Extend the approach taken by D133019 to store instructions.

Differential Revision: https://reviews.llvm.org/D133497
2022-09-09 07:15:12 -07:00
Nikita Popov a9f312c7f4 [AST] Use BatchAA in aliasesUnknownInst() (NFCI) 2022-09-09 15:54:48 +02:00
Sebastian Neubauer c7750c522e Add helper func to get first non-alloca position
The LLVM performance tips suggest that allocas should be placed at the
beginning of the entry block. So far, llvm doesn’t provide any helper to
find that position.

Add BasicBlock::getFirstNonPHIOrDbgOrAlloca and IRBuilder::SetInsertPointPastAllocas(Function*)
that get an insert position after the (static) allocas at the start of a
function and use it in ShadowStackGCLowering.

Differential Revision: https://reviews.llvm.org/D132554
2022-09-09 15:39:53 +02:00
Nikita Popov 4ab77d1677 [LICM] Allow promotion with non-load/store users
If there are non-load/store users of the promoted pointer, we
currently abort promotion. However, having such users isn't really
relevant to the transform. We already separately check that a)
there are no instructions that modref the promoted pointer and
b) that a pointer capture disables store promotion.

In the affected @test_captured_in_loop test case we have a readnone
capture of the promoted pointer, which means that load promotion
can be performed (while store promotion cannot).

Differential Revision: https://reviews.llvm.org/D133485
2022-09-09 13:09:59 +02:00
Djordje Todorovic df868edee5 "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit 053841c562.

We faced a use-after-free after pushing the D113291, since the
foldSqrt() has a call to eraseFromParent(). The function
should be at the end of the main loop that folds the patterns.
This patch fixes that.
2022-09-09 10:29:39 +02:00
Vitaly Buka 1cf5c7fe8c [msan] Disambiguate warnings debug location
If multiple warnings created on the same instruction (debug location)
it can be difficult to figure out which input value is the cause.

This patches chains origins just before the warning using last origins
update debug information.

To avoid inflating the binary unnecessarily, do this only when uncertainty is
high enough, 3 warnings by default. On average it adds 0.4% to the
.text size.

Reviewed By: kda, fmayer

Differential Revision: https://reviews.llvm.org/D133232
2022-09-08 14:17:07 -07:00
Vitaly Buka 0f2f1c2be1 [sanitizers] Invalidate GlobalsAA
GlobalsAA is considered stateless as usually transformations do not introduce
new global accesses, and removed global access is not a problem for GlobalsAA
users.
Sanitizers introduce new global accesses:
 - Msan and Dfsan tracks origins and parameters with TLS, and to store stack origins.
  - Sancov uses global counters. HWAsan store tag state in TLS.
  - Asan modifies globals, but I am not sure if invalidation is required.

I see no evidence that TSan needs invalidation.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133394
2022-09-08 14:00:43 -07:00
Sanjay Patel 444f08c832 [InstCombine] fold icmp of truncated left shift, part 2
(trunc (1 << Y) to iN) == 2**C --> Y == C
(trunc (1 << Y) to iN) != 2**C --> Y != C
https://alive2.llvm.org/ce/z/xnFPo5

Follow-up to d9e1f9d759. This was a suggested
enhancement mentioned in issue #51889.
2022-09-08 12:44:02 -04:00
Philip Reames 4c4c0d2c06 [LV] Use safe-divisor lowering for fixed vectors if profitable
This extends the safe-divisor widening scheme recently added for scalable vectors to handle fixed vectors as well.

Differential Revision: https://reviews.llvm.org/D132591
2022-09-08 09:15:54 -07:00
Joe Loser 5e96cea1db [llvm] Use std::size instead of llvm::array_lengthof
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.

Change call sites to use `std::size` instead.

Differential Revision: https://reviews.llvm.org/D133429
2022-09-08 09:01:53 -06:00
Djordje Todorovic 7aec9ddcfd Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit f879939157.
2022-09-08 17:01:16 +02:00
Sanjay Patel d9e1f9d759 [InstCombine] Fold icmp of truncated left shift
(trunc (1 << Y) to iN) == 0 --> Y u>= N
(trunc (1 << Y) to iN) != 0 --> Y u<  N

These can be generalized in several ways as noted by the TODO
items, but this handles the pattern in the motivating bug report.

Fixes #51889

Differential Revision: https://reviews.llvm.org/D115480
2022-09-08 10:48:14 -04:00
Djordje Todorovic f879939157 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ" 2022-09-08 16:36:46 +02:00
Florian Hahn 422cf99161
[VPlan] Only generate single instr for loads uniform across all parts.
VPReplicateRecipe::isUniform actually means uniform-per-parts, hence a
scalar instruction is generated per-part.

This is a potential alternative D132892. For now the current patch only
catches cases where the address is trivially invariant (defined outside
VPlan), while D132892 catches any address that is considered invariant
by SCEV AFAICT.

It should be possible to hoist fully invariant recipes feeding loads out
of the vector loop region as well, but in practice LICM should do that
already.

This version of the patch artificially limits this to loads to make it
easier to compare, but this restriction should be easily liftable.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133019
2022-09-08 14:27:58 +01:00
Chenbing Zheng 01cea7ac10 [InstCombine] extractvalue (any_mul_with_overflow X, 2^n), 0 -> X << n
Alive2: https://alive2.llvm.org/ce/z/JLmabt (umul)
        https://alive2.llvm.org/ce/z/J_ruXR  (smul)
        https://alive2.llvm.org/ce/z/o9SVSz (vector)

Reviewed By: spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D133188
2022-09-08 11:12:55 +08:00
Sami Tolvanen 52967a5306 [InstCombine] Fix a crash in -kcfi debug block
Don't attempt to print out DebugLoc as we may not have one.
2022-09-07 22:59:12 +00:00
Marco Elver 97c2220565 [SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass
Introduces the SanitizerBinaryMetadata instrumentation pass which uses
the new MD_pcsections metadata kinds to instrument certain types of
instructions and functions required for breakpoint-based sanitizers.

The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. GWP-TSan will require information about atomic
accesses; to unambiguously determine if an access is atomic or not, we
also require "covered" information which code has been compiled with
SanitizerBinaryMetadata instrumentation enabled.

[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D130887
2022-09-07 21:25:40 +02:00
Sanjay Patel 85b289377b [SCCP] convert signed div/rem to unsigned for non-negative operands, 2nd try
The original commit ( fe1f3cfc26 ) was reverted because it could
crash / assert when trying to fold a value that was replaced
by a constant. In that case, there might not be an entry for the
constant in the solver yet.

This version adds a check for that possibility along with tests to
exercise that pattern (they used to crash).

Original commit message:
This extends the transform added with D81756 to handle div/rem opcodes.
For example:
https://alive2.llvm.org/ce/z/cX6za6

This replicates part of what CVP already does, but the motivating example
from issue #57472 demonstrates a phase ordering problem - we convert
branches to select before CVP runs and miss the transform.

Differential Revision: https://reviews.llvm.org/D133198
2022-09-07 11:56:29 -04:00
Sanjay Patel 7c57180900 [InstCombine] fold add+negate through select into sub
This transform came up as a potential DAGCombine in D133282,
so I wanted to see how it escaped in IR too.

We do general folds in InstCombiner::SimplifySelectsFeedingBinaryOp()
by checking if either arm of a select simplifies when the trailing
binop is threaded into the select.

So as long as one side simplifies, it's a good fold to combine a
negate and add into 1 subtract.

This is an example with a zero arm in the select:
https://alive2.llvm.org/ce/z/Hgu_Tj

And this models the tests with a cancelling 'not' op:
https://alive2.llvm.org/ce/z/BuzVV_

Differential Revision: https://reviews.llvm.org/D133369
2022-09-07 08:23:35 -04:00
Aaron Kogon ae05b9dc30 Sink/hoist memory instructions between loop fusion candidates
Currently, instructions in the preheader of the second of two fusion
candidates are sunk and hoisted whenever possible, to try to allow the
loops to fuse. Memory instructions are skipped, and are never sunk or
hoisted. This change adds memory instructions for sinking/hoisting
consideration.

This change uses DependenceAnalysis to check if a mem inst in the
preheader of FC1 depends on an instruction in FC0's header, across
which it will be hoisted, or FC1's header, across which it will be
sunk. We reject cases where the dependency is a data hazard.

Differential Revision: https://reviews.llvm.org/D131606
2022-09-07 07:42:00 -04:00
Nikita Popov f42d92611d [Reassociate] Avoid ConstantExpr::getFNeg() (NFCI)
Use ConstantFoldUnaryOpOperand() instead. Also make the code below
robust against non-instruction users, just in case it doesn't fold.
2022-09-07 10:48:08 +02:00