Commit Graph

11525 Commits

Author SHA1 Message Date
Nico Weber 304a5a7a14 Revert "[ValueTracking] Added support to deduce PHI Nodes values being a power of 2"
This reverts commit d5c130f17e.
Breaks tests, see https://reviews.llvm.org/D125332#3525819
2022-05-19 15:05:30 -04:00
William Huang d5c130f17e [ValueTracking] Added support to deduce PHI Nodes values being a power of 2
Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D125332
2022-05-19 18:39:13 +00:00
Jay Foad 6bec3e9303 [APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557
2022-05-19 11:23:13 +01:00
Philip Reames f7988d08a8 Revert "[BasicAA] Remove unneeded special case for malloc/calloc"
This reverts commit 9b1e00738c.

Nikic reported in commit thread that I had forgotten history here, and that a) we'd tried this before, and b) had to revert due to an unexpected codegen impact.  Current measurements confirm the same issue still exists.
2022-05-18 07:35:27 -07:00
NAKAMURA Takumi 6ca7eb2c6d [SCEV] Part 1, Serialize function calls in function arguments.
Evaluation odering in function call arguments is implementation-dependent.
In fact, gcc evaluates bottom-top and clang does top-bottom.

Fixes #55283 partially.

Part of https://reviews.llvm.org/D125627
2022-05-18 23:20:08 +09:00
Philip Reames 9b1e00738c [BasicAA] Remove unneeded special case for malloc/calloc
This code pre-exists the generic handling for inaccessiblememonly.  If we remove it and update one test with inaccessiblememonly, nothing else changes.  Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.
2022-05-17 20:45:14 -07:00
Nikita Popov b9b71c2b87 [LVI] Compute range for xor
We do have a non-trivial implementation for binaryXor() now.
2022-05-17 10:18:38 +02:00
Yang Keao 7dce9eb6e5 [DomPrinter] Migrate -dot-dom to the new pass manager.
In D123677, @YangKeao provided an implementation of `DOTGraphTraits{Viewer,Printer}` in the new pass manager. This commit migrates the `DomPrinter` and `DomViewer` to the new pass manager.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D124904
2022-05-16 15:07:16 -05:00
Nikita Popov 356d47ccb9 [ValueTracking] Handle and/or on RHS of isImpliedCondition()
isImpliedCondition() currently handles and/or on the LHS, but not
on the RHS, resulting in asymmetric behavior. This patch adds two
new implication rules:

 * LHS ==> (RHS1 || RHS2) if LHS ==> RHS1 or LHS ==> RHS2
 * LHS ==> !(RHS1 && RHS2) if LHS ==> !RHS1 or LHS ==> !RHS2

Differential Revision: https://reviews.llvm.org/D125551
2022-05-16 16:30:26 +02:00
Florian Hahn b7315ffc3c
[LAA,LV] Add initial support for pointer-diff memory checks.
This patch adds initial support for a pointer diff based runtime check
scheme for vectorization. This scheme requires fewer computations and
checks than the existing full overlap checking, if it is applicable.

The main idea is to only check if source and sink of a dependency are
far enough apart so the accesses won't overlap in the vector loop. To do
so, it is sufficient to compute the difference and compare it to the
`VF * UF * AccessSize`. It is sufficient to check
`(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards
dependence in the vector loop with the given VF and UF. If Src >=u Sink,
there is not dependence preventing vectorization, hence the overflow
should not matter and using the ULT should be sufficient.

Note that the initial version is restricted in multiple ways:

1. Pointers must only either be read or written, by a single
   instruction (this allows re-constructing source/sink for
   dependences with the available information)
 2. Source and sink pointers must be add-recs, with matching steps
 3. The step must be a constant.
 3. abs(step) == AccessSize.

Most of those restrictions can be relaxed in the future.

See https://github.com/llvm/llvm-project/issues/53590.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D119078
2022-05-16 15:27:22 +01:00
NAKAMURA Takumi da7d8de1e4 ScalarEvolution.cpp: Reformat. 2022-05-15 20:51:27 +09:00
Sanjay Patel ee6754c277 [ValueTracking] recognize sub X, (X % Y) as not overflowing
I fixed some poison-safety violations on related patterns in InstCombine
and noticed that we missed adding nsw/nuw on them, so this adds clauses
to the underlying analysis for that.

We need the undef input restriction to make this safe according to Alive2:
https://alive2.llvm.org/ce/z/48g9K8

Differential Revision: https://reviews.llvm.org/D125500
2022-05-13 09:59:41 -04:00
Nikita Popov ddfee07519 [InstSimplify] Fold and/or using implied conditions
This adds two conjugated folds:

 * A | B -> B if A implies B (https://alive2.llvm.org/ce/z/R6GU4j)
 * A & B -> A if A implies B (https://alive2.llvm.org/ce/z/EGMqyy)

If A and B are icmps themselves, we will usually fold this through
other logic already (though the tests show a couple additional cases
we previously missed). However, isImpliedCond() also supports A
being of the form X & Y, which allows us to handle cases like
(X & Y) | B where X implies B. This addresses the regression from
D125398.

Something that notably doesn't work yet is the (X | Y) & B case.
This is due to an asymmetry in the isImpliedCondition()
implementation that will have to be addressed separately.

Differential Revision: https://reviews.llvm.org/D125530
2022-05-13 15:09:14 +02:00
Florian Hahn 5890b30105
[LAA] Initial support for runtime checks with pointer selects.
Scaffolding support for generating runtime checks for multiple SCEV expressions
per pointer. The initial version just adds support for looking through
a single pointer select.

The more sophisticated logic for analyzing forks is in D108699

Reviewed By: huntergr

Differential Revision: https://reviews.llvm.org/D114487
2022-05-12 19:33:48 +01:00
Arthur Eubanks 7e0802aeb5 [BasicAA] Fix order in which we pass MemoryLocations to alias()
D98718 caused the order of Values/MemoryLocations we pass to alias() to
be significant due to storing the offset in the PartialAlias case. But
some callers weren't audited and were still passing swapped arguments,
causing the returned PartialAlias offset to be negative in some
cases. For example, the newly added unittests would return -1
instead of 1.

Fixes #55343, a miscompile.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D125328
2022-05-10 12:05:38 -07:00
Nikita Popov c077510bb1 [InstSimplify] Handle unknown function context in pointer icmp fold (PR54615)
This issue reproduces in the context of LoopDeletion, because the
bitcast does not get simplified away there. For a plain -inst-simplify
run the bitcast would get folded away first.

Fixes https://github.com/llvm/llvm-project/issues/54615.
2022-05-10 11:48:43 +02:00
Andrew Litteken 96345f773c [IRSim] Remove early check from similarity matching such that commutative instructions are checked correctly when using the same value.
When the first commutative instruction in a region using the same value in both positions was compared to a corresponding instruction with two different values, there was an early check that determined that since the values were new, it was true that these values acted in the same way structurally. If this was not contradicted later in the program, the regions were marked as similar. This removes that check, so that it is clear that the same value cannot be mapped to two different values.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D124775
2022-05-09 22:59:09 -05:00
Mircea Trofin c35ad9ee4f [mlgo] Support exposing more features than those supported by models
This allows the compiler to support more features than those supported by a
model. The only requirement (development mode only) is that the new
features must be appended at the end of the list of features requested
from the model. The support is transparent to compiler code: for
unsupported features, we provide a valid buffer to copy their values;
it's just that this buffer is disconnected from the model, so insofar
as the model is concerned (AOT or development mode), these features don't
exist. The buffers are allocated at setup - meaning, at steady state,
there is no extra allocation (maintaining the current invariant). These
buffers has 2 roles: one, keep the compiler code simple. Second, allow
logging their values in development mode. The latter allows retraining
a model supporting the larger feature set starting from traces produced
with the old model.

For release mode (AOT-ed models), this decouples compiler evolution from
model evolution, which we want in scenarios where the toolchain is
frequently rebuilt and redeployed: we can first deploy the new features,
and continue working with the older model, until a new model is made
available, which can then be picked up the next time the compiler is built.

Differential Revision: https://reviews.llvm.org/D124565
2022-05-09 18:01:21 -07:00
Michael Kruse 6b3b87376b [polly] migrate -polly-show to the new pass manager
Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D123678
2022-05-09 14:04:29 -05:00
Michael Kruse a6b399ad79 [PassManager] Implement DOTGraphTraitsViewer under NPM
Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D123677
2022-05-09 14:04:28 -05:00
Alexey Bataev 9dc4ced204 [SLP]Try partial store vectorization if supported by target.
We can try to vectorize number of stores less than MinVecRegSize
/ scalar_value_size, if it is allowed by target. Gives an extra
opportunity for the vectorization.

Fixes PR54985.

Differential Revision: https://reviews.llvm.org/D124284
2022-05-09 09:48:15 -07:00
Nikita Popov 68e1ba8188 [SCEV] Fold umin_seq using known predicate
Fold %x umin_seq %y to %x if %x ule %y. This also subsumes the
special handling for constant operands, as if %y is constant this
folds to umin via implied poison reasoning, and if %x is constant
then either %x is not zero and it folds to umin, or it is known
zero, in which case it is ule anything.
2022-05-09 16:35:08 +02:00
Nikita Popov 18eaff1510 [ScalarEvolution] Fold %x umin_seq %y if %x cannot be zero
Fold %x umin_seq %y to %x umin %y if %x cannot be zero. They only
differ in semantics for %x==0.

More generally %x *_seq %y folds to %x * %y if %x cannot be the
saturation fold (though currently we only have umin_seq).
2022-05-09 15:11:05 +02:00
Serge Pavlov eb28da89a6 [InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions executed in runtime. It happened because constrained
intrinsics usually have side effect, it is used to model the interaction
with floating-point environment. In some cases side effect is actually
absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it actually absents.

Differential Revision: https://reviews.llvm.org/D118426
2022-05-07 19:04:11 +07:00
Nikita Popov 47c559d6c1 [SCEV] Fold umin_seq to umin using implied poison reasoning
Similar to how we convert logical and/or to bitwise and/or, we should
also convert umin_seq to umin based on implied poison reasoning. In
%x umin_seq %y, if %y being poison implies %x being poison, then we
don't need the sequential evaluation: Having %y contribute towards
the result will never make the result more poisonous. An important
corollary of this is that if %y is never poison, we also don't need
the sequential evaluation.

This avoids some of the regressions in D124910.

Differential Revision: https://reviews.llvm.org/D124921
2022-05-05 09:43:49 +02:00
Yangguang Li 3a8266902b [SCEV] Removed an unnecessary assertion
The assertion is to check we always get backedge taken count
(`BECount`) of zero when the exit condition is in select form
(`isa<BinaryOperation>(ExitCond)`) and the exit limit for the
first operand is zero `EL0.ExactNotTaken->isZero()`). However
the assertion is checking that the exit condition is NOT in
select form. Removing the the whole assertion since we now handle
select form in ScalarEvolution::getSequentialMinMaxExpr.

Reviewed By: reames, nikic

Differential Revision: https://reviews.llvm.org/D122835
2022-05-03 17:26:27 -04:00
Augie Fackler 1deea714b3 BuildLibCalls: simplify switch statement slightly
Per feedback on D123086 after submit.

Also added a test for vec_malloc et al attribute inference to show it's
doing the right thing.

The new tests exposed a defect, corrected by adding vec_free to the list of
free functions in MemoryBuiltins.cpp, which had been overlooked all the
way back in D94710, over a year ago.

Differential Revision: https://reviews.llvm.org/D124859
2022-05-03 13:17:33 -04:00
Nikita Popov 47255834e7 [ValueTracking] A and (B & ~A) have no common bits set
This extends haveNoCommonBitsSet() to two additional cases, allowing
the following folds:

 * `A + (B & ~A)` --> `A | (B & ~A)`
    (https://alive2.llvm.org/ce/z/crxxhN)
 * `A + ((A & B) ^ B)` --> `A | ((A & B) ^ B)`
    (https://alive2.llvm.org/ce/z/A_wsH_)

These should further fold to just `A | B`, though this currently
only works in the first case.

The reason why the second fold is necessary is that we consider
this to be the canonical form if B is a constant. (I did check
whether we can change that, but it looks like a number of folds
depend on the current canonicalization, so I ended up adding both
patterns here.)

Differential Revision: https://reviews.llvm.org/D124763
2022-05-03 11:33:27 +02:00
Igor Kirillov 4e5e042d9a [LoopVectorize] Support reductions that store intermediary result
Adds ability to vectorize loops containing a store to a loop-invariant
address as part of a reduction that isn't converted to SSA form due to
lack of aliasing info. Runtime checks are generated to ensure the store
does not alias any other accesses in the loop.

Ordered fadd reductions are not yet supported.

Differential Revision: https://reviews.llvm.org/D110235
2022-05-03 10:12:30 +01:00
David Green 6f81903e89 [LV][SLP] Mark fptosi_sat as vectorizable
This adds fptosi_sat and fptoui_sat to the list of trivially
vectorizable functions, mainly so that the loop vectorizer can vectorize
the instruction. Marking them as trivially vectorizable also allows them
to be SLP vectorized, and Scalarized.

The signature of a fptosi_sat requires two type overrides
(@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only
take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd
to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first
operand of the intrinsic as a overloaded (but not scalar) operand.

Differential Revision: https://reviews.llvm.org/D124358
2022-05-03 09:32:34 +01:00
Bardia Mahjour 363b3a645a fix warning caused by ef4ecc3cef 2022-05-02 17:06:27 -04:00
Bardia Mahjour ef4ecc3cef [LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculating cost
Reviewed By: congzhe, etiotto

Differential Revision: https://reviews.llvm.org/D123400
2022-05-02 16:49:10 -04:00
Nikita Popov 597946a4dd [ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr
ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)"
to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by
itself, correct, but does came with two issues:

1. It unnecessarily broadens provenance by introducing an inttoptr.
   We generally prefer not to introduce inttoptr during optimization.
2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0,
   which further folds to null. In that case provenance becomes
   incorrect. This has been observed as a real-world miscompile with
   rustc.

We should probably address that incorrect inttoptr 0 fold at some
point, but in either case we should also drop this inttoptr-introducing
fold. Instead, replace it with a fold rooted at
ptrtoint(getelementptr), which seems to cover the original
motivation for this fold (test2 in the changed file).

Differential Revision: https://reviews.llvm.org/D124677
2022-05-02 10:24:46 +02:00
Congzhe Cao c428a3d2a0 [LoopCacheAnalysis] Enable delinearization of fixed sized arrays
Currently loop cache cost (LCC) cannot analyze fix-sized arrays
since it cannot delinearize them. This patch adds the capability
to delinearize fix-sized arrays to LCC. Most of the code is ported
from DependenceAnalysis.cpp and some refactoring will be done in a
next patch.

Reviewed By: #loopoptwg, Meinersbur

Differential Revision: https://reviews.llvm.org/D122857
2022-04-29 16:01:27 -04:00
Roman Lebedev 981ed72a17
[NFC][SCEV] Refactor `createNodeForSelectViaUMinSeq()` out of `createNodeForSelectOrPHIViaUMinSeq()` 2022-04-29 02:37:06 +03:00
Mircea Trofin 49942d595f [NFC] remove const from FunctionPropertiesAnalysis::run, keep on Result
The goal in 75881d8b02 was just modifying what `Result` is, didn't
need to also modify ::run.
2022-04-28 15:10:21 -07:00
Mircea Trofin 75881d8b02 [NFC] const-ed the return type of FunctionPropertiesAnalysis
The result is a data bag, this makes sure it's signaled to a user that
the data can't be mutated when, for example, doing something like:

auto &R = FAM.getResult<FunctionPropertiesAnalysis>(F)
...
R.Uses++
2022-04-28 12:42:16 -07:00
Alexey Bataev 75e1cf4a6a [COST]Improve cost model for shuffles in SLP.
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486
2022-04-28 10:04:41 -07:00
Alexey Bataev 9861ca0c23 Revert "[COST]Improve cost model for shuffles in SLP."
This reverts commit 29a470e380 to fix
a crash reported in https://reviews.llvm.org/D100486#3479989.
2022-04-28 08:11:56 -07:00
Chris Jackson c792884589 [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
Reland 3f2b76ec90 with the test corrected
to require x86-registered-target.

Differential Revision: https://reviews.llvm.org/D120169
2022-04-28 14:21:56 +01:00
Chris Jackson cd5f9efc4d Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]"
This reverts commit 3f2b76ec90.
2022-04-28 14:07:31 +01:00
Chris Jackson 3f2b76ec90 [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
Reland commit 74273d575f following a fix
for a memory leak. The DVIRecoveryRecord vectors now use unique_ptr.

Differential Revision: https://reviews.llvm.org/D120169
2022-04-28 13:55:49 +01:00
Kirill Stoimenov 761366e6ae Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]"
This reverts commit 74273d575f.

Buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/22795
Failing with memory leak.
2022-04-27 23:11:48 +00:00
Alexey Bataev 29a470e380 [COST]Improve cost model for shuffles in SLP.
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486
2022-04-27 10:56:26 -07:00
Chris Jackson 74273d575f [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
This relands commit 8f550368b1.

The test is amended with REQUIRES: x86-registered-target, in line with
the other debuginfo-scev-salvage tests.

Differential Revision: https://reviews.llvm.org/D120169
2022-04-27 13:10:30 +01:00
Chris Jackson 855752e563 Revert [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics[2/2]
This reverts commit 8f550368b1.
2022-04-27 13:06:03 +01:00
Chris Jackson 8f550368b1 [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]
Second of two patches to extend SCEV-based salvaging to dbg.value
intrinsics that have multiple location ops pre-LSR. This second patch
adds the core implementation.

Reviewers: @StephenTozer, @djtodoro

Differential Revision: https://reviews.llvm.org/D120169
2022-04-27 12:47:35 +01:00
Vasileios Porpodas fa8a9fea47 Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 6a9bbd9f20.

Code review: https://reviews.llvm.org/D124202
2022-04-26 14:02:40 -07:00
Vasileios Porpodas 6a9bbd9f20 Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 55ce296d6f.
2022-04-26 11:25:26 -07:00
Vasileios Porpodas 55ce296d6f [SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`
Before this patch `Args` was used to pass a broadcat's arguments by SLP.
This patch changes this. `Args` is now used for passing the operands of
the shuffle.

Differential Revision: https://reviews.llvm.org/D124202
2022-04-26 11:11:29 -07:00
Mircea Trofin b1fa5ac3ba [mlgo] Factor out TensorSpec
This is a simple datatype with a few JSON utilities, and is independent
of the underlying executor. The main motivation is to allow taking a
dependency on it on the AOT side, and allow us build a correctly-sized
buffer in the cases when the requested feature isn't supported by the
model. This, in turn, allows us to grow the feature set supported by the
compiler in a backward-compatible way; and also collect traces exposing
the new features, but starting off the older model, and continue
training from those new traces.

Differential Revision: https://reviews.llvm.org/D124417
2022-04-25 18:35:46 -07:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Jun Ma c0022b4bb1 [InlineCost] Set LastCallToStaticBonus in ML inlining models.
This patch set LastCallToStaticBonus based on check, it has
no noticeable size reduction on an internal workload and linux kernel
with Os/Oz.

Differential Revision: https://reviews.llvm.org/D124233
2022-04-24 09:34:19 +08:00
Florian Hahn d43c083ab6
[SCEV] Use getConstant to construct SCEV for ConstantInt (NFC).
We already know that we will construct a SCEVConstant. Directly use
getConstant, rather than going through getSCEV.
2022-04-23 11:12:59 +01:00
Chang-Sun Lin Jr 7ee30a0e24 [NFC][LAA] Match-up type sizes for possible extensions, based on actual bit-size rather than rounded-up byte size.
Differential Revision: https://reviews.llvm.org/D119200
2022-04-22 23:16:20 -07:00
Mircea Trofin e4794ff5c6 [mlgo][nfc] Decouple TensorSpec from tensorflow.
The motivation is twofold:

1) Allow plugging in a different training-time evaluator, e.g.
   TFLite-based, etc.

2) Allow using TensorSpec for AOT, too, to support evolution: we start
   by extracting a superset of the features currently supported by a
   model. For the tensors the model does not support, we just return a
   valid, but useless, buffer. This makes using a 'smaller' model (less
   supported tensors) transparent to the compiler. The key is to
   dimension the buffer appropriately, and we already have TensorSpec
   modeling that info.

The only coupling was due to the reliance of a TF internal API for
getting the element size, but for the types we are interested in,
`sizeof` is sufficient.

A subsequent change will yank out TensorSpec in its own module.

Differential Revision: https://reviews.llvm.org/D124045
2022-04-21 15:37:01 -07:00
Vasileios Porpodas 889588ee97 [SLP] Refactoring isLegalBroadcastLoad() to use `ElementCount`.
Replacing `unsigned` with `ElementCount` in the argument of `isLegalBroadcastLoad()`.
This helps reduce the diff of a future SLP patch for AArch64.
2022-04-21 10:19:00 -07:00
Alexey Bataev 2cca53c815 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 09:37:16 -07:00
Alexey Bataev 5f7ac15912 Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer."
This reverts commit 2f49163b33 to fix
a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284
2022-04-20 06:35:55 -07:00
Alexey Bataev 2f49163b33 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process the long shuffles (working across several actual
vector registers) in the best way if we take the actual register
represantion into account. We can build more correct representation of
register shuffles, improve number of recognised buildvector sequences.
Also, same function can be used to improve the cost model for the
shuffles. in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 05:32:56 -07:00
Nikita Popov f767a7d115 [DomTreeUpdater] Remove deprecated methods
Remove the insertEdge(), insertEdgeRelaxed(), deleteEdge() and
deleteEdgeRelaxed() methods, which have been deprecated three
years ago.
2022-04-20 12:14:29 +02:00
Andrew Litteken 3de29ad209 [IRSim] Ignore debug instructions when creating canonical numbering
When constructing canonical relationships between two regions, the first instruction of a basic block from the first region is used to find the corresponding basic block from the second region. However, debug instructions are not included in similarity matching, and therefore do not have a canonical numbering. This patch makes sure to ignore the debug instructions when finding the first instruction in a basic block.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D123903
2022-04-19 13:18:28 -05:00
Arthur Eubanks a7e20a8a7a [CallPrinter] Port CallPrinter passes to new pass manager
Port the legacy CallGraphViewer and CallGraphDOTPrinter to work with the new pass manager.

Addresses issue https://github.com/llvm/llvm-project/issues/54323
Adds back related tests that were removed in commits d53a4e7b4a and 9e9d9aba14

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D122989
2022-04-18 10:02:18 -07:00
Andrew Litteken a919d3d888 [IROutliner] Ensure that incoming blocks of PHINodes are included in the unique numbering gneration for phi nodes for each exit path
Issue: https://github.com/llvm/llvm-project/issues/54431

PHINodes that need to be generated to accommodate a PHINode outside the region due to different output paths need to have their own numbering to determine the number of output schemes required to properly handle all the outlined regions. This numbering was previously only determined by the order and values of the incoming values, as well as the parent block of the PHINode. This adds the incoming blocks to the calculation of a hash value for these PHINodes as well, and the supporting infrastructure to give each block in a region a corresponding canonical numbering.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D122207
2022-04-14 12:13:17 -05:00
Kevin P. Neal d43d9e1d5c [FPEnv][InstSimplify] Fold fsub -0.0, -X ==> X
Currently the fsub optimizations in InstSimplify don't know how to fold
-0.0 - (-X) to X when the constrained intrinsics are used. This adds partial
support. The rest of the support will come later with work on the IR
matchers.

This review is split out from D107285.

Differential Revision: https://reviews.llvm.org/D123396
2022-04-14 11:48:54 -04:00
Congzhe Cao 557b131c88 [DA] Refactor with a better API
Refactor from iteratively using BitCastInst::getOperand()
to using stripPointerCasts() instead. This is an improvement
since now we are able to analyze more cases, please refer
to test cases added in this patch.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D123559
2022-04-13 14:51:48 -04:00
serge-sans-paille 262eba01b3 Revert "[ValueTracking] Make getStringLenth aware of strdup"
This reverts commit e810d55809.

The commit was not taken into account the fact that strduped string could be
modified. Checking if such modification happens would make the function very
costly, without a test case in mind it's not worth the effort.
2022-04-13 19:17:28 +02:00
Muhammad Omair Javaid 42ebfa8269 Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"
This reverts commit 64b6192e81.

This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage:

https://lab.llvm.org/buildbot/#/builders/176/builds/1515

llvm-tblgen crashes after applying this patch.
2022-04-13 04:53:07 +05:00
Johannes Doerfert 9dc7da3f9c [GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics
This is a long standing problem that resurfaces once in a while [0].
There might actually be two problems because I'm not 100% sure if the
issue underlying https://reviews.llvm.org/D115302 would be solved by
this or not. Anyway.

In 2008 we thought intrinsics do not read/write globals passed to them:
d4133ac315
This is not correct given that intrinsics can synchronize threads and
cause effects to effectively become visible.

NOTE: I did not yet modify any tests but only tried out the reproducer
      of https://github.com/llvm/llvm-project/issues/54851.

Fixes: https://github.com/llvm/llvm-project/issues/54851

[0] https://discourse.llvm.org/t/bug-gvn-memdep-bug-in-the-presence-of-intrinsics/59402

Differential Revision: https://reviews.llvm.org/D123531
2022-04-12 16:42:50 -05:00
Nikita Popov 1d530b914e [InstSimplify] Don't fold phi of poison and trapping const expr (PR49839)
Folding this case would result in the constant expression being
executed unconditionally, which may introduce a new trap.

Fixes https://github.com/llvm/llvm-project/issues/49839.
2022-04-12 17:32:25 +02:00
serge-sans-paille e810d55809 [ValueTracking] Make getStringLenth aware of strdup
During strlen compile-time evaluation, make it possible to track size of
strduped strings.

Differential Revision: https://reviews.llvm.org/D123497
2022-04-12 14:47:29 +02:00
Nikita Popov 8d5c8d57c6 [InlineCost] Check that function types match
Retain the behavior we get without opaque pointers: A call to a
known function with different function type is considered an
indirect call.

This fixes the crash reported in https://reviews.llvm.org/D123300#3444772.
2022-04-12 11:05:33 +02:00
Arthur Eubanks b22ffc7b98 [CaptureTracking] Ignore ephemeral values in EarliestEscapeInfo
And thread DSE's ephemeral values to EarliestEscapeInfo.

This allows more precise analysis in DSEState::isReadClobber() via BatchAA.

Followup to D123162.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123342
2022-04-08 10:07:26 -07:00
Nikita Popov 930a68765d [Loads] Check type size in bits during store to load forwarding
Rather than checking the rounded type store size, check the type
size in bits. We don't want to forward a store of i1 to a load
of i8 for example, even though they have the same type store size.
The padding bits have unspecified contents.

This is a partial fix for the issue reported at
https://reviews.llvm.org/D115924#inline-1179482,
the problem also needs to be addressed more generally in the
constant folding code.
2022-04-08 17:29:29 +02:00
Nikita Popov 4e85b427dd [MemoryBuiltins] Remove unnecessary lambda capture (NFC) 2022-04-08 10:13:37 +02:00
serge-sans-paille aa15ea47e2 [builtin_object_size] Basic support for posix_memalign
It actually implements support for seeing through loads, using alias analysis to
refine the result.

This is rather limited, but I didn't want to rely on more than available
analysis at that point (to be gentle with compilation time), and it does seem to
catch common scenario, as showcased by the included tests.

Differential Revision: https://reviews.llvm.org/D122431
2022-04-08 09:31:11 +02:00
Evgeniy Brevnov da41214d65 Add support for atomic memory copy lowering
Currently, the utility supports lowering of non atomic memory transfer routines only. This patch adds support for atomic version of memcopy. This may be useful for targets not supporting atomic memcopy.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118443
2022-04-08 10:41:31 +07:00
Arthur Eubanks 17fdaccccf [CaptureTracking] Ignore ephemeral values when determining pointer escapeness
Ephemeral values cannot cause a pointer to escape.

No change in compile time:
https://llvm-compile-time-tracker.com/compare.php?from=4371710085ba1c376a094948b806ddd3b88319de&to=c5ddbcc4866f38026737762ee8d7b9b00395d4f4&stat=instructions

This partially fixes some regressions caused by more calls to `__builtin_assume` (D122397).

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D123162
2022-04-07 10:11:14 -07:00
Augie Fackler d6a7da5ae3 MemoryBuiltins: only claim an allocator family on builtin functions
This lines up with other parts of the codebase that only use special
knowledge about allocator functions if they're builtins.

Differential Revision: https://reviews.llvm.org/D123053
2022-04-07 12:38:45 -04:00
Augie Fackler 5f09498a11 MemoryBuiltins: also check function definition for allocalign
This got changed to use hasAttrSomewhere() during review, and I didn't
notice until today when I was writing some tests for another part of
this system that using hasAttrSomewhere only checked the callsite for
allocalign, rather than both the callsite and the definition. This fixes
that by introducing a helper method.

Differential Revision: https://reviews.llvm.org/D121641
2022-04-07 12:38:44 -04:00
Alina Sbirlea 50d41f3e0d [MSSA] Print memory phis when inspecting walker.
This makes the MemorySSA and MemorySSA Walker printers consistent.
Invokation `-print<memoryssa-walker>` should also have the MemoryPhis.
2022-04-06 16:06:14 -07:00
Augie Fackler 33b1f41914 MemoryBuiltins: getAllocAlignment is now useful for non-allocator funcs
This has been true since dba73135c8, but
didn't matter until now because clang wasn't emitting allocalign
attributes.

Differential Revision: https://reviews.llvm.org/D121640
2022-04-06 09:51:38 -04:00
Martin Storsjö 46776f7556 Fix warnings about variables that are set but only used in debug mode
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.

Differential Revision: https://reviews.llvm.org/D123117
2022-04-06 10:01:46 +03:00
Tom Honermann c54ad13602 [Lint][Verifier] NFC: Rename 'Assert*' macros to 'Check*'.
The LLVM IR verifier and analysis linter defines and uses several macros in
code that performs validation of IR expectations. Previously, these macros
were named with an 'Assert' prefix. These names were misleading since the
macro definitions are not conditioned on build kind; they are defined
identically in builds that have asserts enabled and those that do not. This
was confusing since an LLVM developer might expect these macros to be
conditionally enabled as 'assert' is. Further confusion was possible since
the LLVM IR verifier is implicitly disabled (in Clang::ConstructJob()) for
builds without asserts enabled, but only for Clang driver invocations; not
for clang -cc1 invocations. This could make it appear that the macros were
not active for builds without asserts enabled, e.g. when investigating
behavior using the Clang driver, and thus lead to surprises when running
tests that exercise the clang -cc1 interface.

This change renames this set of macros as follows:
  Assert -> Check
  AssertDI -> CheckDI
  AssertTBAA -> CheckTBAA
2022-04-05 15:34:35 -04:00
Nikita Popov 516333d632 [ValueTracking] Handle non-pow2 align assume bundle (PR53693)
https://reviews.llvm.org/D119414 clarified that this is legal IR,
so handle it gracefully. (We could aggressively use the fact that
the pointer must be a null pointer in that case, but I'm not
bothering with that.)

Fixes https://github.com/llvm/llvm-project/issues/53693.
2022-04-05 16:48:40 +02:00
Jingu Kang 64b6192e81 [AArch64] Set maximum VF with shouldMaximizeVectorBandwidth
Set the maximum VF of AArch64 with 128 / the size of smallest type in loop.

Differential Revision: https://reviews.llvm.org/D118979
2022-04-05 13:16:52 +01:00
Hirochika Matsumoto 447a4485c5 [InstSimplify] Fold (ctpop(X) == N) || (X != 0) into X != 0 where N > 0
(ctpop(X) == N) || (X != 0) --> (X != 0) https://alive2.llvm.org/ce/z/udgUVV
(ctpop(X) != N) && (X == 0) --> (X == 0) https://alive2.llvm.org/ce/z/9dq-cR

Differential Revision: https://reviews.llvm.org/D122757
2022-04-04 23:23:34 +09:00
Augie Fackler e90bce8f91 CallBase: fix getFnAttr so it also checks the function
Prior to this change, CallBase::hasFnAttr checked the called function to
see if it had an attribute if it wasn't set on the CallBase, but
getFnAttr didn't do the same delegation, which led to very confusing
behavior. This patch fixes the issue by making CallBase::getFnAttr also
check the function under the same circumstances.

Test changes look (to me) like they're cleaning up redundant attributes
which no longer get specified both on the callee and call. We also clean
up the one ad-hoc implementation of this getter over in InlineCost.cpp.

Differential Revision: https://reviews.llvm.org/D122821
2022-04-03 23:19:23 -04:00
Artur Pilipenko 4fbde1ef40 Fix MemorySSAUpdater::insertDef for dead code
Fix for https://github.com/llvm/llvm-project/issues/51257.

Differential Revision: https://reviews.llvm.org/D122601
2022-03-31 16:32:35 -07:00
serge-sans-paille 01be9be2f2 Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
2022-03-29 09:00:21 +02:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Fangrui Song dcad676958 [CGSCC] Use make_early_inc_range. NFC 2022-03-23 15:31:09 -07:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Vasileios Porpodas 79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb0 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Philip Reames b880bde92b Add missing dependencies to mayHaveNonDefUseDependency
Two interesting ommissions:
* When reordering in either direction, reordering two calls which both
  contain inf-loops is illegal.  This one is possibly a change in behavior
  for certain callers (e.g. fixes a latent bug.)
* When moving down, control dependence must be respected by checking the
  inverse of isSafeToSpeculativeExecute.  Current callers all seem to
  handle this case - though admitted, I did not do an exhaustive audit.
  Most seem to be only interested in moving upwards within a block.  This
  is mostly a case of future proofing an API so that it implements what
  the comments says, not just what current callers need.

Noticed via inspection.  I don't have a test case.
2022-03-21 10:15:36 -07:00
Philip Reames ee7324b898 Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc] 2022-03-21 10:01:40 -07:00
serge-sans-paille 39b02d49cc [instcombine] Support and test __builtin_object_size interaction with __strdup and __strndup
Differential Revision: https://reviews.llvm.org/D122005
2022-03-21 11:30:51 +01:00
serge-sans-paille d8e0a6d5e9 [LowerConstantIntrinsics] Support phi operand in __builtin_object_size folder
The implementation is just a generalization of the Select handler.
We're no trying to be smart and compute any kind of fixed point.

Differential Revision: https://reviews.llvm.org/D121897
2022-03-21 11:30:50 +01:00