Commit Graph

31018 Commits

Author SHA1 Message Date
Nikita Popov 9604601c93 [SimplifyCFG] Remove redundant checks for hoisting (NFCI)
These conditions are later checked in the HoistTerminator code
path. Checking them here is somewhat confusing, because this code
only checks the first instruction in the block, which is not
necessarily the terminator.
2022-07-04 10:53:54 +02:00
Florian Hahn b4694229aa
[LV] Simplify setDebugLocFromInst by using early exit (NFC).
Suggested as separate improvement in D128657.
2022-07-04 09:25:26 +01:00
Sanjay Patel f9f40aa10d [InstCombine] fold negated low-bit-mask to cmp+select
(-(X & 1)) & Y --> (X & 1) == 0 ? 0 : Y
https://alive2.llvm.org/ce/z/rhpH3i

This is noted as a missing IR canonicalization in issue #55618.
We already managed to fix codegen to the expected form.
2022-07-03 12:25:26 -04:00
Nuno Lopes 53dc0f1078 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-07-03 14:34:03 +01:00
Nuno Lopes 022bd92c78 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-03 12:32:19 +01:00
Florian Hahn b0da3c6fa4
[VPlan] Move setDebugLocFromInst to VPTransformState (NFC).
The moved helpers are only used for codegen. It will allow moving the
remaining ::execute implementations out of LoopVectorize.cpp.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D128657
2022-07-02 15:18:17 +01:00
Johannes Doerfert 07766f4070 [Attributor] Move heap2stack allocas to the entry block if possible
If we are certainly not in a loop we can directly emit the heap2stack
allocas in the function entry block. This will help to get rid of them
(SROA) and avoid stacksave/restore intrinsics when the function is
inlined.
2022-07-01 21:34:12 -05:00
Nuno Lopes 7c4f45f87a Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]
This reverts commits 47e6f98f84 and 3e701bcd2a
2022-07-01 23:53:41 +01:00
Nuno Lopes 47e6f98f84 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-01 23:31:31 +01:00
Sanjay Patel 9c8a39c67b [InstCombine] restrict select of bit-tests to constant shift amounts
This transform is responsible for a long-standing miscompile
as discussed in issue #47012 (was bugzilla #47668).

There was a proposal to correct it in D88432, but that was
abandoned and there hasn't been any recent activity to fix
it AFAICT.

The original patch D45108 started with a constant-shift-only
restriction and only expanded during review, so I don't think
there's much risk of perf regression on the motivating code.
2022-07-01 16:24:34 -04:00
Martin Sebor 0d68ff87d2 [InstCombine] Transform strrchr to memrchr for constant strings
Add an emitter for the memrchr common extension and simplify the strrchr
call handler to use it. This enables transforming calls with the empty
string to the test C ? S : 0.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128954
2022-07-01 11:10:00 -06:00
Nikita Popov 65d59b4265 [LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266)
LoopSimplify only requires that the loop predecessor has a single
successor and is safe to hoist into -- it doesn't necessarily have
to be an unconditional BranchInst.

Adjust LoopDeletion to assert conditions closer to what it actually
needs for correctness, namely a single successor and a
side-effect-free terminator (as the terminator is getting dropped).

Fixes https://github.com/llvm/llvm-project/issues/56266.
2022-07-01 16:13:35 +02:00
Florian Hahn 0dddf04cab
[LV] Don't optimize exit cond during epilogue vectorization.
At the moment, the same VPlan can be used code generation of both the
main vector and epilogue vector loop. This can lead to wrong results, if
the plan is optimized based on the VF of the main vector loop and then
re-used for the epilogue loop.

One example where this is problematic is if the scalar loops need to
execute at least one iteration, e.g. due to interleave groups.

To prevent mis-compiles in the short-term, disable optimizing exit
conditions for VPlans when using epilogue vectorization. The proper fix
is to avoid re-using the same plan for both loops, which will require
support for cloning plans first.

Fixes #56319.
2022-07-01 13:48:38 +01:00
Nikita Popov fabe915705 [SimplifyLibCalls] Use inbounds GEP
When converting strchr(p, '\0') to p + strlen(p) we know that
strlen() must return an offset that is inbounds of the allocated
object (otherwise it would be UB), so we can use an inbounds GEP.
An equivalent argument can be made for the other cases.
2022-07-01 14:31:44 +02:00
Sanjay Patel ab372cdd6f [InstCombine] add code comment for icmp transform; NFC
This was accidentally left out of cc88445a91
2022-07-01 08:21:55 -04:00
Florian Hahn 583abd0e36
[VPlan] Move addMetadata to VPTransformState (NFC).
The moved helpers are only used for codegen. It will allow moving the
remaining ::execute implementations out of LoopVectorize.cpp.

Depends on D127966.
Depends on D127965.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127968
2022-07-01 12:03:25 +01:00
Nikita Popov 9b994593cc [SCCP] Only handle unknown lattice values in resolvedUndefsIn()
This is a minor refinement of resolvedUndefsIn(), mostly for clarity.
If the value of an instruction is undef, then that's already a legal
final result -- we can safely rauw such an instruction with undef.
We only need to mark unknown values as overdefined, as that's the
result we get for an instruction that has not been processed because
it has an undef operand.

Differential Revision: https://reviews.llvm.org/D128251
2022-07-01 09:14:37 +02:00
Chen Zheng 39fe49aa57 [Inline] don't add noalias metadata for unknown objects.
The unidentified objects recognized in `getUnderlyingObjects` may
still alias to the noalias parameter because `getUnderlyingObjects`
may not check deep enough to get the underlying object because of
`MaxLookup`. The real underlying object for the unidentified object
 may still be the noalias parameter.

Originally Patched By: tingwang

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D127202
2022-07-01 02:16:55 -04:00
Alexey Bataev 4be3fc35aa [SLP][NFC]Cleanup up operands of the removed insertelements, NFC.
Replace all operands of the insertelement instruction, replaced by
shuffles, by poisons to avoid false-positive reports about incorrect function.
2022-06-30 17:51:43 -07:00
Nuno Lopes 373571dbb4 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-06-30 23:01:43 +01:00
William Huang a9119143a2 [InstCombine] Changing constant-indexed GEP of GEP to i8* for merging
When merging GEP of GEP with constant indices, if the second GEP's offset is not divisible by the first GEP's element size, convert both type to i8* and merge.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D125934
2022-06-30 21:26:11 +00:00
Nuno Lopes 0586d1cac2 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-06-30 21:47:31 +01:00
Craig Topper e633f8cd14 [InstCombine] Fix a Wparentheses warning in an assert. NFC 2022-06-30 13:03:32 -07:00
Sanjay Patel cc88445a91 [InstCombine] canonicalize 'icmp (trunc X), C' to 'icmp (X & Mask), C'
I looked at canonicalizing in the other direction, but that causes
many potential regressions and infinite loops because we already
(possibly wrongly) canonicalize "trunc X to i1" into an and+icmp.

This has a data layout restriction to avoid creating illegal
mask instructions, but we could remove that if we can show
that the backend can undo this when needed.

The motivating example from issue #56119 is modeled by the
PhaseOrdering test.
2022-06-30 15:51:39 -04:00
Martin Sebor 3a743a5892 [InstCombine] Fix memrchr logic error that prevents folding
Correct a logic bug in the memrchr enhancement added in D123629 that
makes it ineffective in a subset of cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128856
2022-06-30 11:35:26 -06:00
Nikita Popov f34dcf2763 [IRBuilder] Migrate all binops to folding API
Migrate all binops to use FoldXYZ rather than CreateXYZ APIs,
which are compatible with InstSimplifyFolder and fallible constant
folding.

Rather than continuing to add one method for every single operator,
add a generic FoldBinOp (plus variants for nowrap, exact and fmf
operators), which we would need anyway for CreateBinaryOp.

This change is not NFC because IRBuilder with InstSimplifyFolder
may perform more folding. However, this patch changes SCEVExpander
to not use the folder in InsertBinOp to minimize practical impact
and keep this change as close to NFC as possible.
2022-06-30 16:41:17 +02:00
Nikita Popov 588e229bf9 [VNCoercion] Separate constant/non-constant mem intrinsic implementations (NFCI)
This means we no longer need to have the same API between IRBuilder
and IRBuilderFolder.

The constant case is substantially simpler, so implementing it
separately isn't an undue burden.
2022-06-30 15:26:06 +02:00
Nikita Popov 014c4bdb9d [VNCoercion] Use ConstantFoldLoadFromConst API (NFCI)
Nowdays we have a generic constant folding API to load a type from
an offset. It should be able to do anything that VNCoercion can do.

This avoids the weird templating between IRBuilder and ConstantFolder
in one function, which is will stop working as the IRBuilderFolder
moves from CreateXYZ to FoldXYZ APIs.

Unfortunately, this doesn't eliminate this pattern from VNCoercion
entirely yet.
2022-06-30 14:52:27 +02:00
Florian Hahn 68884dde70
[LV] Move LoopVersioning creation to LVP::execute.
At the moment LoopVersioning is only created for inner-loop
vectorization. This patch moves it to LVP::execute, which means it will
also be added for epilogue vectorization. As a consequence, the proper
noalias metadata is now also added to epilogue vector loops.

LVer will be moved to VPTransformState as follow-up.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127966
2022-06-30 12:14:32 +01:00
Sanjay Patel 7c4b90a98d [InstCombine] fix overzealous assert in icmp-shr fold
The assert was added with 0399473de8 and is correct for that
pattern, but it is off-by-1 with the enhancement in d4f39d8333.

The transforms are still correct with the new pre-condition:
https://alive2.llvm.org/ce/z/6_6ghm
https://alive2.llvm.org/ce/z/_GTBUt

And as shown in the new test, the transform is expected with
'ult' - in that case, the icmp reduces to test if the shift
amount is 0.
2022-06-30 06:28:48 -04:00
Nikita Popov 1579fc62fe [Evaluator] Add missing LLVM_DEBUG()
Missed these in 41f0b6a781, resulting
in unconditional debug output.
2022-06-30 11:54:47 +02:00
Chen Zheng b05801de35 [InlineFunction] Only check pointer arguments for a call
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128529
2022-06-30 05:39:47 -04:00
Nikita Popov 41f0b6a781 [Evaluator] Use ConstantFoldInstOperands()
For instructions that don't need any special handling, use
ConstantFoldInstOperands(), rather than re-implementing individual
cases.

This is probably not NFC because it can handle cases the previous
code missed (e.g. vector operations).
2022-06-30 11:10:17 +02:00
Nikita Popov a6d4b4138f [ConstantFold] Supports compares in ConstantFoldInstOperands()
Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).

This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.
2022-06-30 11:05:24 +02:00
Florian Hahn 24b5f8e0d0
[VPlan] Make sure optimizeInductions removes wide ind from scalar plan.
In some cases, there may be widened users of inductions even though the
plan includes the scalar VF. In those cases, make sure we still replace
the VPWidenIntOrFpInductionRecipe with scalar steps, as otherwise we may
try to execute a VPWidenIntOrFpInductionRecipe with a scalar VF.

Alternatively the patch could also split the range if needed.

This fixes a crash exposed by D123720.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D128755
2022-06-30 09:11:48 +01:00
Nikita Popov 10c531cd5b [SCCP] Simplify CFG in SCCP as well
Currently, we only remove dead blocks and non-feasible edges in
IPSCCP, but not in SCCP. I'm not aware of any strong reason for
that difference, so this patch updates SCCP to perform the CFG
cleanup as well.

Compile-time impact seems to be pretty minimal, in the 0.05%
geomean range on CTMark.

For the test case from https://reviews.llvm.org/D126962#3611579
the result after -sccp now looks like this:

    define void @test(i1 %c) {
    entry:
      br i1 %c, label %unreachable, label %next
    next:
      unreachable
    unreachable:
      call void @bar()
      unreachable
    }

-jump-threading does nothing on this, but -simplifycfg will produce
the optimal result.

Differential Revision: https://reviews.llvm.org/D128796
2022-06-30 09:25:03 +02:00
Chuanqi Xu 0b5ead6590 [WebAssembly] Don't set musttail for coroutines when tail-call is not
enabled

The C++20 Coroutines couldn't be compiled to WebAssembly due to an
optimization named symmetric transfer requires the support for musttail
calls but WebAssembly doesn't support it yet.

This patch tries to fix the problem by adding a supportsTailCalls
method to TargetTransformImpl to skip the symmetric transfer when
tail-call feature is not supported.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D128794
2022-06-30 11:15:40 +08:00
zhongyunde 404479b4b0 [InstCombine] Use known bits to determine exact int->fp cast
Reviewed By: spatel, nikic

Differential Revision: https://reviews.llvm.org/D127854
2022-06-30 09:45:11 +08:00
Florian Hahn 6d5f814357
[LoopUnrollRuntime] Invalidate SCEV for exit phi in ConnectProlog.
ConnectProlog adds new incoming values to exit phi nodes which can
change the SCEV for the phi after 20d798bd47.

Fix is analog to cfc741bc0e.

Fixes #56286.
2022-06-29 20:28:43 +01:00
Florian Hahn 9a35f19e3e
[UnrollRuntime] Invalidate SCEVs for modified phis in ConnectEpilog.
ConnectEpilog adds new incoming values to exit phi nodes which can
change the SCEV for the phi after 20d798bd47.

Fix is analog to cfc741bc0e.

Fixes #56282.
2022-06-29 18:26:00 +01:00
Sanjay Patel d4f39d8333 [InstCombine] add fold for (ShiftC >> X) >u C
This is the 'ugt' sibling to:
0399473de8

Decrement the input compare constant (and implicitly
decrement the new compare constant):
https://alive2.llvm.org/ce/z/iELmct
2022-06-29 12:30:01 -04:00
Nikita Popov bdba8278d9 [VectorCombine] Avoid ConstantExpr::get() (NFC)
Use IRBuilder APIs instead, which will still constant fold.
2022-06-29 17:17:52 +02:00
Nikita Popov 2124b2f0e6 [JumpThreading] Avoid ConstantExpr::get() (NFCI)
This code requires the result to be an UndefValue/ConstantInt
anyway (checked by getKnownConstant), so we are only interested
in the case where this folds.
2022-06-29 16:43:05 +02:00
Nikita Popov df698a5762 [InstCombine] Avoid some calls to ConstantExpr::get() (NFCI)
Replace some calls to ConstantExpr::get() with IRBuilder APIs
(which will also constant fold if possible).
2022-06-29 16:26:02 +02:00
Nikita Popov 0af53fcb99 [SROA] Don't create constant expressions (NFC)
Use IRBuilder instead, which will fold these. Just to clarify
that this does not actually create any udiv expression.
2022-06-29 11:51:22 +02:00
Pavel Samolysov 3d9ce9e43d [ArgPromotion] Remove all the getters and ReplaceCallSite (NFC)
AARGetter is an abstraction over a source of the `AAResults` introduced
to support the legacy pass manager as well as the modern one. Since the
Argument Promotion pass doesn't support the legacy pass manager anymore,
the abstraction is not required and `AAResults` may be used directly.

The instance of the `FunctionAnalysisManager` is passed through the
functions to get all the required analyses just wherever they are
required and do not use the awkward getter callbacks.

The `ReplaceCallSite` parameter was required for the legacy pass manager
only and isn't used anymore, so the parameter has been eliminated.

Differential Revision: https://reviews.llvm.org/D128727
2022-06-29 10:45:11 +03:00
Pavel Samolysov 8958057fb1 [ArgPromotion] Move isDenselyPacked static member (NFC)
The `isDenselyPacked` static member of the `ArgumentPromotionPass` class
is not used in the class itself anymore. The single known user of the
function is in the `AttributorAttributes.cpp` file, so the function has
been moved into the file.

Differential Revision: https://reviews.llvm.org/D128725
2022-06-29 10:45:10 +03:00
Martin Sebor 8827679826 [InstCombine] Fold strncmp of constant arrays and variable size
Extend the solution accepted in D127766 to strncmp and simplify
strncmp(A, B, N) calls with constant A and B and variable N to
the equivalent of

  N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0)

where Pos is the offset of either the first mismatch between A
and B or the terminating null character if both A and B are equal
strings.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D128089
2022-06-28 15:59:14 -06:00
Martin Sebor e263a7670e [InstCombine] Look through more casts when folding memchr and memcmp
Enhance getConstantDataArrayInfo to let the memchr and memcmp library
call folders look through arbitrarily long sequences of bitcast and
GEP instructions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128364
2022-06-28 15:58:42 -06:00
Alexey Bataev bf4dcbd2df [SLP]Fix PR56251: Do not remove the reordering from the root node, being used as an operand.
If the root order itself does not require reordering, we can just
remove its reorder mask safely (e.g., if the root node is a vector of
phis). But if this node is used as an operand in the graph, we cannot
delete the reordering, need to keep it. Otherwise the graph nodes are
not synchronized with the operands. It may cause an extra gather
instruction(s) or a compiler crash.
Also, need to be very careful when selecting the gather nodes for
reordering since there might several gather nodes with the same scalars
and we can try to reorder just the same node many times instead of
different nodes.

Differential Revision: https://reviews.llvm.org/D128680
2022-06-28 13:42:05 -07:00
Leonard Chan 9553d69580 [NFC][HWASan] Refactor hwasan pass
This moves some code for getting PC and SP into their own functions. Since SP
is also retrieved in the prologue and getting the stack tag, we can cache the
SP if we get it once in the prologue. This caching will really only be relevant
in D128387 where StackBaseTag may not be set in the prologue if __hwasan_tls
is not used.

Differential Revision: https://reviews.llvm.org/D128551
2022-06-28 12:09:20 -07:00
Pavel Samolysov 170c4d21bd [ArgPromotion] Unify byval promotion with non-byval
It makes sense to handle byval promotion in the same way as non-byval
but also allowing `store` instructions. However, these should
use the same checks as the `load` instructions do, i.e. be part of the
`ArgsToPromote` collection. For these instructions, the check for
interfering modifications can be disabled, though. The promotion
algorithm itself has been modified a lot: all the accesses (i.e. loads
and stores) are rewritten to the emitted `alloca` instructions. To
optimize these new `alloca`s out, the `PromoteMemToReg` function from
`Transforms/Utils/PromoteMemoryToRegister.cpp` file is invoked after
promotion.

In order to let the `PromoteMemToReg` promote as many `alloca`s as it
is possible, there should be no `GEP`s from the `alloca`s. To
eliminate the `GEP`s, its own `alloca` is generated for every argument
part because a single `alloca` for the whole argument (that
significantly simplifies the code of the pass though) unfortunately
cannot be used.

The idea comes from the following discussion:
https://reviews.llvm.org/D124514#3479676

Differential Revision: https://reviews.llvm.org/D125485
2022-06-28 15:19:58 +03:00
Mikhail Goncharov c6c124ca80 Fixed unused variable warning. 2022-06-28 11:44:16 +02:00
Florian Hahn 03975b7f0e
[VPlan] Move recipe implementations to separate file (NFC).
This patch moves the code for recipe implementations to a separate file.

The benefits are:
 * Keep VPlan.cpp smaller => faster compile-time during parallel builds.
 * Keep code for logical units together

As a follow-up I am also planning on moving all ::execute
implemetnations from LoopVectorize.cpp over to the new file, which
should help to reduce the size of the file a bit.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127965
2022-06-28 10:34:30 +01:00
Nikita Popov 5548e807b5 [IR] Remove support for extractvalue constant expression
This removes the extractvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
extractvalue is already not supported in bitcode, so we do not need
to worry about bitcode auto-upgrade.

Uses of ConstantExpr::getExtractValue() should be replaced with
IRBuilder::CreateExtractValue() (if the fact that the result is
constant is not important) or ConstantFoldExtractValueInstruction()
(if it is). Though for this particular case, it is also possible
and usually preferable to use getAggregateElement() instead.

The C API function LLVMConstExtractValue() is removed, as the
underlying constant expression no longer exists. Instead,
LLVMBuildExtractValue() should be used (which will constant fold
or create an instruction). Depending on the use-case,
LLVMGetAggregateElement() may also be used instead.

Differential Revision: https://reviews.llvm.org/D125795
2022-06-28 10:40:17 +02:00
Guillaume Chatelet 3c126d5fe4 [Alignment] Replace commonAlignment with std::min
`commonAlignment` is a shortcut to pick the smallest of two `Align`
objects. As-is it doesn't bring much value compared to `std::min`.

Differential Revision: https://reviews.llvm.org/D128345
2022-06-28 07:15:02 +00:00
wlei 7e86b13c63 [CSSPGO][llvm-profgen] Reimplement SampleContextTracker using context trie
This is the followup patch to https://reviews.llvm.org/D125246 for the `SampleContextTracker` part. Before the promotion and merging of the context is based on the SampleContext(the array of frame), this causes a lot of cost to the memory. This patch detaches the tracker from using the array ref instead to use the context trie itself. This can save a lot of memory usage and benefit both the compiler's CS inliner and llvm-profgen's pre-inliner.

One structure needs to be specially treated is the `FuncToCtxtProfiles`, this is used to get all the functionSamples for one function to do the merging and promoting. Before it search each functions' context and traverse the trie to get the node of the context. Now we don't have the context inside the profile, instead we directly use an auxiliary map `ProfileToNodeMap` for profile , it initialize to create the FunctionSamples to TrieNode relations and keep updating it during promoting and merging the node.

Moreover, I was expecting the results before and after remain the same, but I found that the order of FuncToCtxtProfiles matter and affect the results. This can happen on recursive context case, but the difference should be small. Now we don't have the context, so I just used a vector for the order, the result is still deterministic.

Measured on one huge size(12GB) profile from one of our internal service. The profile similarity difference is 99.999%, and the running time is improved by 3X(debug mode) and the memory is reduced from 170GB to 90GB.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D127031
2022-06-27 23:22:21 -07:00
wlei aa58b7b1e3 [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie
Follow-up patch to https://reviews.llvm.org/D125246, support `computeSummaryAndThreshold` based on context trie.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D127026
2022-06-27 23:22:21 -07:00
Congzhe Cao b941857b40 [LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.

The patch proposed to use a new cost model for loop interchange,
which is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of
loops [loop0, loop1, loop2, ...] where loop0 should be replaced as
the outermost loop, loop1 should be placed one more level inside, and
loop2 one more level inside, etc. What loop cache analysis does is not
only more comprehensive than the current cost model, it is also a "one-shot"
query which means that we only need to query it once during the entire
loop interchange pass, which is better than the current cost model where
we query it every time we check whether it is profitable to interchange
two loops. Thus complexity is reduced, especially after D120386 where we
do more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some
corrections. One change that applies to all tests is that we added an option
`-cache-line-size=64` to the RUN lines. This is ensure that loop
cache analysis receives a valid number of cache line size for correct
analysis. Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but
keep it as fall-back in case the new cost model did not run successfully.
This is because currently we have some limitations in delinearization, which
sometimes makes loop cache analysis bail out. The longer term goal is to
enhance delinearization and eventually remove the legacy cost model
compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-28 00:08:37 -04:00
Vitaly Buka 6824eee942 [asan] Add missing dependency on Demangle
Follow up to D127911.
2022-06-27 15:10:02 -07:00
Mitch Phillips dacfa24f75 Delete 'llvm.asan.globals' for global metadata.
Now that we have the sanitizer metadata that is actually on the global
variable, and now that we use debuginfo in order to do symbolization of
globals, we can delete the 'llvm.asan.globals' IR synthesis.

This patch deletes the 'location' part of the __asan_global that's
embedded in the binary as well, because it's unnecessary. This saves
about ~1.7% of the optimised non-debug with-asserts clang binary.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D127911
2022-06-27 14:40:40 -07:00
Philip Reames 20dd3297b1 [LV] Allow scalable vectorization with vscale = 1
This change is a bit subtle. If we have a type like <vscale x 1 x i64>, the vectorizer will currently reject vectorization. The reason is that a type like <1 x i64> is likely to get simply rescalarized, and the vectorizer doesn't want to be in the game of simple unrolling.

(I've given the example in terms of 1 x types which use a single register, but the same issue exists for any N x types which use N registers. e.g. RISCV LMULs.)

This change distinguishes scalable types from fixed types under the reasoning that converting to a scalable type isn't unrolling. Because the actual vscale isn't known until runtime, using a vscale type is potentially very profitable.

This makes an important, but unchecked, assumption. Specifically, the scalable type is assumed to only be legal per the cost model if there's actually a scalable register class which is distinct from the scalar domain. This is, to my knowledge, true for all targets which return non-invalid costs for scalable vector ops today, but in theory, we could have a target decide to lower scalable to fixed length vector or even scalar registers. If that ever happens, we'd need to revisit this code.

In practice, this patch unblocks scalable vectorization for ELEN types on RISCV.

Let me sketch one alternate implementation I considered. We could have restricted this to when we know a minimum value for vscale. Specifically, for the default +v extension for RISCV, we actually know that vscale >= 2 for ELEN types. However, doing it this way means we can't generate scalable vectors when using the various embedded vector extensions which have a minimum vscale of 1.

Differential Revision: https://reviews.llvm.org/D128542
2022-06-27 13:38:57 -07:00
Yuanfang Chen e2e9e708e5 [Coroutine] Remove the '!func_sanitize' metadata for split functions
There is no proper RTTI for these split functions. So just delete the
metadata.

Fixes https://github.com/llvm/llvm-project/issues/49689.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D116130
2022-06-27 12:09:13 -07:00
Yuanfang Chen 6678f8e505 [ubsan] Using metadata instead of prologue data for function sanitizer
Information in the function `Prologue Data` is intentionally opaque.
When a function with `Prologue Data` is duplicated. The self (global
value) references inside `Prologue Data` is still pointing to the
original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.

This patch detaches the information from function `Prologue Data`
and attaches it to a function metadata node.

This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D115844
2022-06-27 12:09:13 -07:00
Joseph Huber c7243f21d3 [OpenMP] Only strip runtime attributes if needed
Summary:
Currently in OpenMPOpt we strip `noinline` attributes from runtime
functions. This is here because the device bitcode library that we link
has problems with needed definitions getting prematurely optimized out.
This is only necessary for OpenMP offloading to GPUs so we should narrow
the scope for where we spend time doing this. In the future this
shouldn't be necessary as we move to using a linked library rather than
pulling in a bitcode library in Clang.
2022-06-27 13:35:41 -04:00
Nikita Popov f65c88c42f [GlobalOpt] Fix memset handling in global ctor evaluation (PR55859)
The global ctor evaluator currently handles by checking whether the
memset memory is already zero, and skips it in that case. However,
it only actually checks the first byte of the memory being set.

This patch extends the code to check all bytes being set. This is
done byte-by-byte to avoid converting undef values to zeros in
larger reads. However, the handling is still not completely correct,
because there might still be padding bytes (though probably this
doesn't matter much in practice, as I'd expect global variable
padding to be zero-initialized in practice).

Mostly fixes https://github.com/llvm/llvm-project/issues/55859.

Differential Revision: https://reviews.llvm.org/D128532
2022-06-27 16:50:49 +02:00
Bradley Smith a83aa33d1b [IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.

Differential Revision: https://reviews.llvm.org/D127976
2022-06-27 10:48:45 +00:00
Nikita Popov cde402778a [FunctionAttrs] Add missing pass dependency
This pass depends on AAResults. This fixes the ocaml IPO binding
tests.
2022-06-27 10:15:06 +02:00
Nikita Popov 217e85761c [ArgPromotion] Remove legacy PM support
Support for the legacy pass manager in ArgPromotion causes
complications in D125485. As the legacy pass manager for middle-end
optimizations is unsupported, drop ArgPromotion from the legacy
pipeline, rather than introducing additional complexity to deal
with it.

Differential Revision: https://reviews.llvm.org/D128536
2022-06-27 09:42:17 +02:00
Chuanqi Xu 24e53b01d5 Revert "[Coroutines] Only do symmetric transfer if optimization is on"
This reverts commit 7782e080e8. According
to the discussion of WG21, symmetric transfer is a desired feature.
2022-06-27 10:54:56 +08:00
Kazu Hirata d08f34b592 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-26 18:31:51 -07:00
Kazu Hirata a81b64a1fb [llvm] Use Optional::has_value instead of Optional::hasValue (NFC)
This patch replaces x.hasValue() with x.has_value() where x is not
contextually convertible to bool.
2022-06-26 16:10:42 -07:00
Nuno Lopes 6ef9a2ad01 [LICM] Use poison to replace unreachable values instead of undef [NFC] 2022-06-26 14:56:35 +01:00
Nuno Lopes 3fa2411dc5 [LoopSimplifyCFG] use poison when replacing dead instructions instead of undef [NFC] 2022-06-26 14:15:55 +01:00
Nuno Lopes d46fa1fc58 [ArgumentPromotion] use poison when replacing dead instructions instead of undef [NFC] 2022-06-26 13:44:05 +01:00
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Pavel Samolysov 6e3d4712b9 [DeadArgElim] Replace insert with emplace (NFC) 2022-06-25 10:31:27 +03:00
Mitch Phillips f57066401e [HWASan] Use new IR attribute for communicating unsanitized globals.
Globals that shouldn't be sanitized are currently communicated to HWASan
through the use of the llvm.asan.globals IR metadata. Now that we have
an on-GV attribute, use it.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D127543
2022-06-24 12:04:11 -07:00
Mingming Liu e0d069598b [Inline] Annotate inline pass name with link phase information for analysis.
The annotation is flag gated; flag is turned off by default.

Differential Revision: https://reviews.llvm.org/D125495
2022-06-24 10:06:43 -07:00
Alexey Bataev 2faacf61a5 [SLP]Improve shuffles cost estimation where possible.
Improved/fixed cost modeling for shuffles by providing masks, improved
cost model for non-identity insertelements.

Differential Revision: https://reviews.llvm.org/D115462
2022-06-24 09:28:01 -07:00
Arthur Eubanks e422c0d3b2 [GlobalOpt] Perform store->dominated load forwarding for stored once globals
The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway.

The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store.

Compile time tracker:
https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions

Reviewed By: nikic, asbirlea, jdoerfert

Differential Revision: https://reviews.llvm.org/D128128
2022-06-24 09:09:26 -07:00
Florian Hahn cb69ba4faa
[LV] Create RT checks once VF/IC are selected, track scalar cost.
This patch updates LV to generate runtime after the VF & IC are selected. It
allows deciding whether to vectorize with runtime checks or not based on
their cost compared to the vector loop.

It also updates VectorizationFactor to include the scalar cost.

Reviewed By: lebedev.ri, dmgreen

Differential Revision: https://reviews.llvm.org/D75981
2022-06-24 17:42:11 +02:00
Nikita Popov 871197d0a3 [MemoryBuiltins] Accept any value in getInitialValueOfAllocation() (NFC)
Drop the requirement that getInitialValueOfAllocation() must be
passed an allocator function, shifting the responsibility for
checking that into the function (which it does anyway). The
motivation is to avoid some calls to isAllocationFn(), which has
somewhat ill-defined semantics (given the number of
allocator-related attributes we have floating around...)

(For this function, all we eventually need is an allockind of
zeroed or uninitialized.)

Differential Revision: https://reviews.llvm.org/D127274
2022-06-24 16:08:07 +02:00
Nikita Popov e523baa664 [InlineFunction] Slightly clarify noalias scope calculation (NFC)
Rename CanDeriveViaCapture -> RequiresNoCaptureBefore, drop
unnecessary const cast, reformat some code avoid an ugly
super-indented comment.
2022-06-24 12:31:46 +02:00
Florian Hahn b18141a8f2
[VPlan] Set VFs included in plan before last set of VPTransforms (NFC).
This allows VPlanTransforms to query the VFs included in the plan in the
future.
2022-06-24 10:16:56 +02:00
Florian Hahn 92f87787b3
Recommit "[ConstraintElimination] Transfer info from ULT to signed system."
This reverts commit 94ed2caf70.

The issue with no-determinism with the test has been fixed in
d9526e8a52.
2022-06-24 09:27:14 +02:00
Evgenii Stepanov 878309cc54 Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed
integer overflow: 6148914691236517209 * 100 cannot be represented in
type 'long'

https://lab.llvm.org/buildbot/#/builders/5/builds/25185

This reverts commit 1b24fe34b0.
2022-06-23 16:10:53 -07:00
Congzhe Cao 1b24fe34b0 [LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.

The patch proposed to use a new cost model for loop interchange,
which is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of
loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the
outermost loop, loop1 should be placed one more level inside, and loop2
one more level inside, etc. What loop cache analysis does is not only more
comprehensive than the current cost model, it is also a "one-shot" query
which means that we only need to query it once during the entire loop
interchange pass, which is better than the current cost model where we
query it every time we check whether it is profitable to interchange two
loops. Thus complexity is reduced, especially after D120386 where we do
more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some corrections.
One change that applies to all tests is that we added an option
`-cache-line-size=64` to the RUN lines. This is ensure that loop cache
analysis receives a valid number of cache line size for correct analysis.
Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but keep it
as fall-back in case the new cost model did not run successfully. This is
because currently we have some limitations in delinearization, which sometimes
makes loop cache analysis bail out. The longer term goal is to enhance
delinearization and eventually remove the legacy cost model compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-23 16:34:57 -04:00
Philip Reames 46ea4b5ea1 [LV] Avoid a crash when costing a uniform store which doesn't correspond to a legal scatter
If we have an unaligned uniform store, then when costing a scalable VF we can't emit code to scalarize it.  (Well, we could, but we haven't implemented that case.)  This change replaces an assert with a cost-model bailout such that we reject vectorization with the scalable VF instead of crashing.
2022-06-23 12:41:09 -07:00
Alexey Bataev 3b6edef15d [SLP]Fix a crash when reorder masked gather nodes with reused scalars.
If the masked gather nodes must be reordered, we can just reorder
scalars, just like for gather nodes. But if the node contains reused
scalars, it must be handled same way as a regular vectorizable node,
since need to reorder reused mask, not the scalars directly.

Differential Revision: https://reviews.llvm.org/D128360
2022-06-23 11:32:30 -07:00
Florian Hahn d9526e8a52
[ConstraintElimination] Use stable_sort to sort worklist.
If there are multiple constraints in the same block, at the moment the
order they are processed may be different depending on the sort
implementation.

Use stable_sort to ensure consistent ordering.
2022-06-23 19:22:15 +02:00
Florian Hahn 94ed2caf70
Revert "[ConstraintElimination] Transfer info from ULT to signed system."
This reverts commit 316e106f49.

This breaks a bot with expensive checks.
2022-06-23 17:27:33 +02:00
Florian Hahn 316e106f49
[ConstraintElimination] Transfer info from ULT to signed system.
If A u< B holds, then A s>= 0 && A s< B holds if B s>= 0.

https://alive2.llvm.org/ce/z/RrNxHh
2022-06-23 17:17:01 +02:00
Florian Hahn 9a33f3975e
[ConstraintElimination] Transfer info from SLT to unsigned system.
If A s< B holds, then A u< also holds, if A s>= 0.

https://alive2.llvm.org/ce/z/J4JZuN
2022-06-23 15:57:59 +02:00
chenglin.bi 30e49a3794 [InstCombine] Optimise shift+and+boolean conversion pattern to simple comparison
if (`C1` is pow2) & (`(C2 & ~(C1-1)) + C1)` is pow2):
    ((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1));
https://alive2.llvm.org/ce/z/EJAl1R
    ((C1 << X) & C2) != 0 -> X  < (Log2(C2+C1) - Log2(C1));
https://alive2.llvm.org/ce/z/3bVRVz

And remove dead code.

Fix: https://github.com/llvm/llvm-project/issues/56124

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D126591
2022-06-23 21:53:07 +08:00
Florian Hahn 569d84fe99
[VPlan] Remove dead recipes across whole plan.
This extends removeDeadRecipe to remove recipes across the whole plan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127580
2022-06-23 13:36:02 +02:00
Florian Hahn 24a98881cd
[ConstraintElimination] Transfer info from SGT to unsigned system.
If A >s B then A >=u 0, if B >=s -1.

https://alive2.llvm.org/ce/z/cncGKi
2022-06-23 11:04:51 +02:00
Fangrui Song 1ffd2d99c2 Revert D115462 "[SLP]Improve shuffles cost estimation where possible."
This reverts commit cac60940b7.

Caused -Os -fsanitize=memory -march=haswell miscompile to pytorch/cpuinfo.
See my latest comment (may update) on D115462.
2022-06-22 23:16:31 -07:00
Fangrui Song a411bc11d6 Revert "[SLP]Fix a crash when insert subvector is out of range."
This reverts commit f1ee2738b3.

Revert due to the revert of a dependent commit `[SLP]Improve shuffles cost estimation where possible.`
2022-06-22 23:16:25 -07:00
Serguei Katkov 5e1ccdf960 [RS4GC] Handle freeze case for vector
Finding BDV for vector value does not handle freeze instruction.
Adding its handling as it is done for scalar case.

Reviewed By: apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D128254
2022-06-23 11:58:41 +07:00
Mingming Liu bc856eb3fc [SampleProfile][Inline] Annotate sample profile inline remarks with link phase (prelink/postlink) information.
Differential Revision: https://reviews.llvm.org/D126833
2022-06-22 17:00:53 -07:00
Florian Mayer 9320a32bb9 [MTE] [HWASan] Use LoopInfo for reachability queries.
The reachability queries default to "reachable" after exploring too many
basic blocks. LoopInfo helps it skip over the whole loop.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D127917
2022-06-22 15:28:49 -07:00
Adrian Tong 4e555a3df4 Fix a misspell. NFC 2022-06-22 21:23:21 +00:00
Brendon Cahoon f1b05a0a2b [StructurizeCFG] Improve basic block ordering
StructurizeCFG linearizes the successors of branching basic block
by adding Flow blocks to record the true/false path for branches
and back edges. This patch reduces the number of Phi values needed
to capture the control flow path by improving the basic block
ordering.

Previously, StructurizeCFG adds loop exit blocks outside of the
loop. StructurizeCFG sets a boolean value to indicate the path
taken, and all exit block live values extend to after the loop.
For loops with a large number of exits blocks, this creates a
huge number of values that are maintained, which increases
compilation time and register pressure. This is problem
especially with ASAN, which adds early exits to blocks with
unreachable instructions for each instrumented check in the loop.

In specific cases, this patch reduces the number of values needed
after the loop by moving the exit block into the loop. This is
done for blocks that have a single predecessor and single successor
by moving the block to appear just after the predecessor.

Differential Revision: https://reviews.llvm.org/D123231
2022-06-22 16:10:41 -05:00
Brendon Cahoon e13248ab0e [UnifyLoopExits] Reduce number of guard blocks
UnifyLoopExits creates a single exit, a control flow hub, for
loops with multiple exits. There is an input to the block for
each loop exiting block and an output from the block for each
loop exit block. Multiple checks, or guard blocks, are needed
to branch to the correct exit block.

For large loops with lots of exit blocks, all the extra guard
blocks cause problems for StructurizeCFG and subsequent passes.
This patch reduces the number of guard blocks needed when the
exit blocks branch to a common block (e.g., an unreachable
block). The guard blocks are reduced by changing the inputs
and outputs of the control flow hub. The inputs are the exit
blocks and the outputs are the common block.

Reducing the guard blocks enables StructurizeCFG to reorder the
basic blocks in the CFG to reduce the values that exit a loop
with multiple exits. This reduces the compile-time of
StructurizeCFG and also reduces register pressure.

Differential Revision: https://reviews.llvm.org/D123230
2022-06-22 15:44:23 -05:00
Evgenii Stepanov 5011b4ca0e Revert "[Attributor] Ensure to use the proper liveness AA"
Reason: memory leaks

This reverts commit 083010312a.
2022-06-22 13:40:45 -07:00
Florian Mayer 476ced4b89 [MTE] [HWASan] Support diamond lifetimes.
We were overly conservative and required a ret statement to be dominated
completely be a single lifetime.end marker. This is quite restrictive
and leads to two problems:

* limits coverage of use-after-scope, as we degenerate to
  use-after-return;
* increases stack usage in programs, as we have to remove all lifetime
  markers if we degenerate to use-after-return, which prevents
  reuse of stack slots by the stack coloring algorithm.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D127905
2022-06-22 11:16:34 -07:00
Florian Mayer acc9721e38 [NFC] [HWASan] Remove indirection for getting analyses.
This was necessary for code reuse between the old and new passmanager.
With the old pass-manager gone, this is no longer necessary.

Reviewed By: eugenis, myhsu

Differential Revision: https://reviews.llvm.org/D127913
2022-06-22 10:53:20 -07:00
Mingming Liu 67dc8021a1 [Support] Change TrackingStatistic and NoopStatistic to use uint64_t instead of unsigned.
Binary size of `clang` is trivial; namely, numerical value doesn't
change when measured in MiB, and `.data` section increases from 139Ki to
173 Ki.

Differential Revision: https://reviews.llvm.org/D128070
2022-06-22 10:11:40 -07:00
Max Kazantsev cff4f04e2e [LSR] Don't allow zero quotient as scale ref. PR56160
Scale reg should never be zero, so when the quotient is zero, we
cannot assign it there. Limit this transform to avoid this situation.

Differential Revision: https://reviews.llvm.org/D128339
Reviewed By: eopXD
2022-06-22 23:33:57 +07:00
Guillaume Chatelet 57ffff6db0 Revert "[NFC] Remove dead code"
This reverts commit 8ba2cbff70.
2022-06-22 14:55:47 +00:00
Guillaume Chatelet 8ba2cbff70 [NFC] Remove dead code 2022-06-22 13:33:58 +00:00
Florian Hahn 098b0b18a7
[ConstraintElimination] Transfer info from SGE to unsigned system.
This patch adds a new transferToOtherSystem helper that tries to
transfer information from signed predicates to the unsigned system and
vice versa.

The initial version adds A >=u B for A >=s B && B >=s 0

https://alive2.llvm.org/ce/z/8b6F9i
2022-06-22 15:27:59 +02:00
Nikita Popov 1f88d80408 [SCCP] Don't mark edges feasible when resolving undefs
As branch on undef is immediate undefined behavior, there is no need
to mark one of the edges as feasible. We can leave all the edges
non-feasible. In IPSCCP, we can replace the branch with an unreachable
terminator.

Differential Revision: https://reviews.llvm.org/D126962
2022-06-22 10:28:27 +02:00
Florian Hahn ac62b8f704
[ConstraintElimination] Update addFact to take Predicate and ops (NFC).
This allows adding facts without necessarily having a corresponding
CmpInst.
2022-06-22 08:36:41 +02:00
Pavel Samolysov f44bf3805a [DeadArgElim] Reformat the pass in accordance with the code style
The code has been reformatted in accordance with the code style. Some
function comments were extended to the Doxygen ones and reworded a bit
to eliminate the duplication of the function's/class' name in the
comment.

Differential Revision: https://reviews.llvm.org/D128168
2022-06-22 09:13:00 +03:00
chenglin.bi 810b5c471f [NewGVN] add context instruction for SimplifyQuery
NewGVN will find operator from other context. ValueTracking currently doesn't have a way to run completely without context instruction.
So it will use operator itself as conext instruction.
If the operator in another branch will never be executed but it has an assume, it may caused value tracking use the assume to do wrong simpilfy.

It would be better to make these simplification queries not use context at all, but that would require some API changes.
For now we just use the orignial instruction as context instruction to fix the issue.

Fix #56039

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D127942
2022-06-22 12:25:24 +08:00
Serguei Katkov 8f891b7c39 [LoopVectorize] Uninitialized phi node leads to a crash in SSAUpdater.
createInductionResumeValues creates a phi node placeholder
without filling incoming values. Then it generates the incoming values.

It includes triggering of SCEV expander which may invoke SSAUpdater.
SSAUpdater has an optimization to detect number of predecessors
basing on incoming values if there is phi node.
In case phi node is not filled with incoming values - the number of predecessors
is detected as 0 and this leads to segmentation fault.

In other words SSAUpdater expects that phi is in good shape while
LoopVectorizer breaks this requirement.

The fix is just prepare all incoming values first and then build a phi node.

Reviewed By: fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D128033
2022-06-22 10:49:27 +07:00
Johannes Doerfert b7cc3b10c5 [Attributor][FIX] Avoid empty bin in AAPointerInfo
This avoid creating empty bins in AAPointerInfo which can lead to
segfaults. Also ensure we do not try to translate from callee to caller
except if we really take the argument state and move it to the call site
argument state.

Fixes: https://github.com/llvm/llvm-project/issues/55726
2022-06-21 21:30:57 -05:00
Johannes Doerfert 083010312a [Attributor] Ensure to use the proper liveness AA
When determining liveness via Attributor::isAssumedDead(...) we might
end up without a liveness AA or with one pointing into another function.
Neither is helpful and we will avoid both from now on.

Reapplied after fixing the ASAN error which caused the revert:
db68a25ca9
2022-06-21 21:28:26 -05:00
Vasileios Porpodas 7a9ad25769 Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles"
This reverts commit 6d6268dcbf.

Review: https://reviews.llvm.org/D125712
2022-06-21 18:35:29 -07:00
Vasileios Porpodas 6d6268dcbf Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles"
This reverts commit 6f88acf410.
2022-06-21 17:07:21 -07:00
Vasileios Porpodas 6f88acf410 [SLP][X86] Improve reordering to consider alternate instruction bundles
During the reordering transformation we should try to avoid reordering bundles
like fadd,fsub because this may block them being matched into a single vector
instruction in x86.
We do this by checking if a TreeEntry is such a pattern and adding it to the
list of TreeEntries with orders that need to be considered.

Differential Revision: https://reviews.llvm.org/D125712
2022-06-21 16:44:48 -07:00
Florian Hahn 88ce403c6a
[LV] Add new block to place recurrence splice, if needed.
In some cases, a recurrence splice instructions needs to be inserted
between to regions, for example if the regions get re-arranged during
sinking.

Fixes #56146.
2022-06-21 21:54:37 +02:00
Heejin Ahn 27e4afcea7 [DSE] Don't remove nounwind invokes
For non-mem-intrinsic and non-lifetime `CallBase`s, the current
`isRemovable` function only checks if the `CallBase` 1. has no uses 2.
will return 3. does not throw:
80fb782336/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (L1017)

But we should also exclude invokes even in case they don't throw,
because they are terminators and thus cannot be removed. While it
doesn't seem to make much sense for `invoke`s to have an `nounwind`
target, this kind of code can be generated and is also valid bitcode.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128224
2022-06-21 11:54:09 -07:00
Martin Sebor b19194c032 [InstCombine] handle subobjects of constant aggregates
Remove the known limitation of the library function call folders to only
work with top-level arrays of characters (as per the TODO comment in
the code) and allows them to also fold calls involving subobjects of
constant aggregates such as member arrays.
2022-06-21 11:55:14 -06:00
Alexey Bataev d4ee43153d [SLP][NFC]Fix a warning in a comparison, NFC.
Fixed signedness warning.
2022-06-21 10:19:47 -07:00
serge-sans-paille aaf1630ac3 [Scalarizer] No need to gather a scattered extracted element
ExtractElement does not produce a vector out of a vector, so there's no need to
call a gather once done.

Fix #54469

Credits to npopov@redhat.com for the original approach.

Differential Revision: https://reviews.llvm.org/D126012
2022-06-21 18:43:54 +02:00
Arthur Eubanks b5db65e0da Reland [GlobalOpt] Preserve CFG analyses
The only place we modify the CFG is when calling
removeUnreachableBlocks(), so insert a callback there which invalidates
analyses for that function (or recomputes DT in the legacy PM).

We may delete functions, make sure to clear analyses for those
functions. (this was missed in the original revision)

Small compile time wins across the board:
https://llvm-compile-time-tracker.com/compare.php?from=f444ea8ce0aaaa5ec1a4129809389da15cc41396&to=698f41f4fc26cbf1006ed5d88e9d658edfc5b749&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128145
2022-06-21 09:19:59 -07:00
Alexey Bataev f1ee2738b3 [SLP]Fix a crash when insert subvector is out of range.
If the OffsetBeg + InsertVecSz is greater than VecSz, need to estimate
the cost as shuffle of 2 vector, not as insert of subvector. Otherwise,
the inserted subvector is out of range and compiler may crash.

Differential Revision: https://reviews.llvm.org/D128071
2022-06-21 07:16:35 -07:00
Florian Hahn 4ea6891f95
[ConstraintElimination] Remove unneeded StackEntry::Condition (NFC).
The field was only used for debug printing. Print constraint from the
system instead.
2022-06-21 15:57:29 +02:00
Florian Hahn 2a9313ee0b
[ConstraintElimination] Move logic to check condition to helper (NFC). 2022-06-21 11:50:33 +02:00
Kazu Hirata 7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Kazu Hirata d66cbc565a Don't use Optional::hasValue (NFC) 2022-06-20 20:26:05 -07:00
Kazu Hirata 0916d96d12 Don't use Optional::hasValue (NFC) 2022-06-20 20:17:57 -07:00
Florian Hahn 6dd772d348
[ConstraintElimination] Move logic to get a constraint to helper (NFC). 2022-06-20 21:34:07 +02:00
Kazu Hirata ad7ce1e769 Don't use Optional::hasValue (NFC) 2022-06-20 11:49:10 -07:00
Kazu Hirata 5413bf1bac Don't use Optional::hasValue (NFC) 2022-06-20 11:33:56 -07:00
Kazu Hirata e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Arthur Eubanks 13ff7d6f39 Revert "[GlobalOpt] Perform store->dominated load forwarding for stored once globals"
This reverts commit 6f348b146b.

Am seeing internal test failures plus a linux kernel breakage reported due to this.
2022-06-20 10:26:47 -07:00
Arthur Eubanks 1cd2c72bef Revert "[GlobalOpt] Preserve CFG analyses"
This reverts commit cc65f3e167.

Causes crashes: https://github.com/llvm/llvm-project/issues/56131
2022-06-20 10:25:10 -07:00
Guillaume Chatelet 589c8d6fb9 [NFC] Simplify alignment code in MemorySanitizer 2022-06-20 15:15:53 +00:00
Guillaume Chatelet 7296811910 [NFC] Simplify alignment code in CoroFrame 2022-06-20 15:15:52 +00:00
Florian Hahn cebe7ae881
[ConstraintElimination] Move logic to add constraint to helper (NFC). 2022-06-20 17:08:35 +02:00
Florian Hahn bd9632afd2
[ConstraintElimination] Move StackEntry up, to allow use earlier (NFC). 2022-06-20 16:40:42 +02:00
Florian Hahn cfc741bc0e
[LoopPeel] Forget SCEV for updated exit phi values.
LoopPeel add new incoming values to exit phi nodes which can change the
SCEV for the phi after 20d798bd47.

Forget SCEVs for such phis.

Fixes #56044.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128164
2022-06-20 13:19:27 +02:00
Guillaume Chatelet f1255186c7 [NFC][Alignment] Remove max functions between Align and MaybeAlign
`llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are
not used often enough to be required. They also make the code more opaque.

Differential Revision: https://reviews.llvm.org/D128121
2022-06-20 08:37:48 +00:00
Guillaume Chatelet 009fe0755e [Alignment] Remove multiply by MaybeAlign 2022-06-20 08:37:15 +00:00