Commit Graph

Florian Hahn e7ec1746a6
[SCEV] Avoid creating unnecessary SCEVs for SelectInsts.
After 675080a453, we always create SCEVs for all operands of a
SelectInst. This can cause notable compile-time regressions compared to
the recursive algorithm, which only evaluates the operands if the select
is in a form for which we can create a usable expression.

This approach adds additional logic to getOperandsToCreate to only
queue operands for selects if we will later be able to construct a
usable SCEV.
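
Conceptually, the filter looks something like this (a hedged sketch with
assumed helper names and checks, not the actual ScalarEvolution code):

    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // Assumed shape of the check: only queue a select's operands when the
    // later construction step will be able to build a usable SCEV for it;
    // otherwise the select ends up as an opaque SCEVUnknown.
    static void queueSelectOperands(const SelectInst *SI,
                                    SmallVectorImpl<const Value *> &Worklist) {
      // Assumption: mirror whatever pattern createSCEV accepts, e.g. only
      // integer/pointer-typed selects.
      if (!SI->getType()->isIntOrPtrTy())
        return;
      Worklist.push_back(SI->getCondition());
      Worklist.push_back(SI->getTrueValue());
      Worklist.push_back(SI->getFalseValue());
    }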

Unfortunately this introduces a bit of coupling between actual SCEV
construction for selects and getOperandsToCreate, but I am not sure if
there are better alternatives to address the regression mentioned for
675080a453.

This doesn't have any notable compile-time impact on CTMark.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129731
2022-07-14 09:23:47 -07:00
Philip Reames 3bc09c7da5 [SCEVExpander] Allow udiv with isKnownNonZero(RHS) + add vscale case
Motivation here is to unblock LSR's ability to use ICmpZero uses - the major effect of which is to enable count-down IVs. The test changes reflect this goal, but the potential impact is much broader since this isn't a change in LSR at all.

SCEVExpander needs(*) to prove that expanding the expression is safe anywhere the SCEV expression is valid. In general, we can't expand any node which might fault (or exhibit UB) unless we can either a) prove it won't fault, or b) guard the faulting case. We'd been allowing non-zero constants here; this change extends it to non-zero values.

vscale is never zero. This is already implemented in ValueTracking, and this change just adds the same logic in SCEV's range computation (which in turn drives isKnownNonZero). We should common up some logic here, but let's do that in separate changes.
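
As a rough illustration of the legality rule (a sketch with assumed helper
names, not the SCEVExpander code): an unguarded division may only be emitted
when the divisor is provably non-zero; otherwise the faulting case has to be
guarded, e.g. via a select on the divisor.

    #include "llvm/Analysis/ValueTracking.h"
    #include "llvm/IR/IRBuilder.h"
    using namespace llvm;

    // Sketch only: emit an unguarded udiv when the divisor is known non-zero
    // (e.g. a non-zero constant or vscale); otherwise substitute 1 for a zero
    // divisor so the expanded code cannot fault.
    static Value *emitSafeUDiv(IRBuilder<> &B, Value *LHS, Value *RHS,
                               const DataLayout &DL) {
      if (isKnownNonZero(RHS, DL))
        return B.CreateUDiv(LHS, RHS);
      Value *IsZero = B.CreateICmpEQ(RHS, ConstantInt::get(RHS->getType(), 0));
      Value *SafeRHS =
          B.CreateSelect(IsZero, ConstantInt::get(RHS->getType(), 1), RHS);
      return B.CreateUDiv(LHS, SafeRHS);
    }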

(*) As an aside, "needs" is such an interesting word here. First, we don't actually need to guard this at all; we could choose to emit a select for the RHS of every udiv and remove this code entirely. Secondly, the property being checked here is way too strong. What the client actually needs is to expand the SCEV at some particular point in some particular loop. In the examples, the original urem dominates that loop and yet we completely ignore that information when analyzing legality. I don't plan to actively pursue either direction, just noting it for future reference.

Differential Revision: https://reviews.llvm.org/D129710
2022-07-14 08:56:58 -07:00
Dawid Jurczak d71128d97d [NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand>
This patch is a follow-up to https://reviews.llvm.org/D129468 and addresses
one of the comments from that review: https://reviews.llvm.org/D129468#3643295

Differential Revision: https://reviews.llvm.org/D129565
2022-07-14 17:22:32 +02:00
Kazu Hirata 611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Kazu Hirata 30d3f56e33 [Analysis] clang-format InlineAdvisor.cpp (NFC) 2022-07-13 13:38:50 -07:00
Max Kazantsev 30e33b4b81 [SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional 2022-07-13 18:54:25 +07:00
Peter Waller 8acf74fd56 [InstCombine][SVE] Bail out of isSafeToLoadUnconditionally for scalable types
`isSafeToLoadUnconditionally` currently assumes sized types. Bail out for now.
This fixes a TypeSize warning reachable from instcombine via (load (select
cond, ptr, ptr)).

Differential Revision: https://reviews.llvm.org/D129477
2022-07-13 10:07:36 +00:00
Dawid Jurczak 165240fe38 [NFC] Fix compile time regression seen on some benchmarks after a630ea3003 commit
The goal of this change is to fix most of the compile-time slowdown seen after commit a630ea3003 on the lencod and sqlite3 benchmarks.
There are 3 improvements included in this patch:

1. In getNumOperands, when possible, get the value directly from SmallNumOps.
2. Inline getLargePtr by moving its definition to the header.
3. In TBAAStructTypeNode::getField, get all operands at once instead of taking them one by one in a loop (see the sketch below).
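
A rough sketch of the pattern in point 3 (hypothetical helper, not the real
TBAAStructTypeNode::getField code): fetch the operand list once and iterate
it, instead of calling getOperand(i) on every loop iteration.

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/IR/Metadata.h"
    using namespace llvm;

    // Hedged sketch: hoist the operand fetch out of the loop.
    static ArrayRef<MDOperand> getAllOperands(const MDNode *Node) {
      return ArrayRef<MDOperand>(Node->op_begin(), Node->getNumOperands());
    }
    // Callers then do:  for (const MDOperand &Op : getAllOperands(Node)) ...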

Differential Revision: https://reviews.llvm.org/D129468
2022-07-12 15:00:27 +02:00
Aiden Grossman f3939dc509 [mlgo] Simplify autogenerated regalloc model
Currently the autogenerated regalloc model will sometimes
output an incorrect LR index to evict instead of the first LR
with the mask set to 1. This trips an assertion within
the MLRegallocAdvisor that the evicted LR has a mask of 1. This
patch, made possible by https://reviews.llvm.org/D124565, simplifies
the autogenerated model by taking away all unnecessary features and
getting rid of the functions that were previously used to mix in all
the necessary inputs so they wouldn't get pruned by the Tensorflow
XLA AOT compiler. This is no longer necessary after the previously
mentioned patch. This also fixes the nondeterministic behavior
that is sometimes observed where the autogenerated model will
simply output 0 instead of the correct index.

Reviewed By: yundiqian

Differential Revision: https://reviews.llvm.org/D129254
2022-07-11 13:23:31 -07:00
Mircea Trofin 24c6c35270 [mlgo] Don't provide default model URLs
Pointed out in Issue #56432: the current reference models may not be
quite friendly to open source projects. Their purpose is only
illustrative - the expectation is that projects would train their own.
To avoid unintentionally pulling such a model, the URL cmake setting
was made to require an explicit user setting.

Differential Revision: https://reviews.llvm.org/D129342
2022-07-11 07:37:14 -07:00
David Sherwood 03fee6712a [LoopVectorize] Add option to use active lane mask for loop control flow
Currently, for vectorised loops that use the get.active.lane.mask
intrinsic we only use the mask for predicated vector operations,
such as masked loads and stores, etc. The loop itself is still
controlled by comparing the canonical induction variable with the
trip count. However, for some targets this is inefficient when it's
cheap to use the mask itself to control the loop.

This patch adds support for using the active lane mask for control
flow by:

1. Generating the active lane mask for the next iteration of the
vector loop, rather than the current one. If there are still any
remaining iterations then at least the first bit of the mask will
be set.
2. Extracting the first bit of this mask and using it for the
conditional branch.

I did this by creating a new VPActiveLaneMaskPHIRecipe that sets
up the initial PHI values in the vector loop pre-header. I've also
made use of the new BranchOnCond VPInstruction for the final
instruction in the loop region.
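
A scalar C++ analogue of the new control-flow shape (purely illustrative; the
real change operates on VPlan recipes and the get.active.lane.mask intrinsic):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Illustration: compute the "active lane mask" for the *next* iteration
    // and branch on its first bit, instead of comparing the induction
    // variable against the trip count.
    void process(std::vector<int> &Data, std::size_t VF) {
      std::size_t TC = Data.size();
      std::size_t I = 0;
      bool FirstLaneActive = TC > 0;        // mask for the first iteration
      while (FirstLaneActive) {
        std::size_t End = std::min(I + VF, TC);
        for (std::size_t Lane = I; Lane < End; ++Lane)  // "masked" body
          Data[Lane] += 1;
        I += VF;
        FirstLaneActive = I < TC;           // first bit of next iteration's mask
      }
    }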

Differential Revision: https://reviews.llvm.org/D125301
2022-07-11 13:46:55 +01:00
Nicolai Hähnle ede600377c ManagedStatic: remove many straightforward uses in llvm
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)

Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
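
A generic before/after sketch of the pattern (not a specific call site from
the patch):

    #include <string>
    // Before: a lazily-initialized global via ManagedStatic.
    //   static llvm::ManagedStatic<std::string> BannerStorage;
    // After: a getter with a function-local static, initialized on first use
    // and no longer tied to llvm_shutdown().
    static std::string &getBanner() {
      static std::string Banner = "example";
      return Banner;
    }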

Differential Revision: https://reviews.llvm.org/D129120
2022-07-10 10:29:15 +02:00
Nicolai Hähnle e9ce1a5880 Revert "ManagedStatic: remove many straightforward uses in llvm"
This reverts commit e6f1f06245.

Reverting due to a failure on the fuchsia-x86_64-linux buildbot.
2022-07-10 09:54:30 +02:00
Nicolai Hähnle e6f1f06245 ManagedStatic: remove many straightforward uses in llvm
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120
2022-07-10 09:15:08 +02:00
Wenlei He a78f436c3f [Inliner] Make recursive inlinee stack size limit tunable
For recursive callers, we want to be conservative when inlining callees with large stack size. We currently have a limit `InlineConstants::TotalAllocaSizeRecursiveCaller`, but that is hard coded.

We found the current limit insufficient to suppress problematic inlining that bloats stack size for deep recursion. This change adds a switch to make the limit tunable as a mitigation.
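
Such a switch would look roughly like this (the flag name and default below
are made up for illustration, not taken from the patch):

    #include "llvm/Support/CommandLine.h"
    using namespace llvm;

    // Hypothetical flag name and default value, for illustration only.
    static cl::opt<unsigned> RecursiveCallerStackSizeLimit(
        "example-recursive-caller-alloca-limit", cl::Hidden, cl::init(1024),
        cl::desc("Max total alloca size of a callee considered for inlining "
                 "into a recursive caller"));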

Differential Revision: https://reviews.llvm.org/D129411
2022-07-08 21:32:39 -07:00
Nikita Popov d686ea32b1 [ConstantFolding] Guard against unfolded FP binop
Check that the operation actually folded before trying to flush
denormals. A minor variation of the pr33453 test exposed this
with the FP binops marked as undesirable.
2022-07-08 17:45:33 +02:00
Nikita Popov 4a579abd9f [GlobalsModRef] Don't override getModRefBehavior() for CallBase
BasicAA will already call getModRefBehavior() on the Function of
the CallBase if there are no operand bundles. This happens through
getBestAAResults(), i.e. it is a recursive call that will query
other AA providers, not just the BasicAA implementation.

As such, there is no need to reimplement the same functionality
in GlobalsModRef, a combination of BasicAA and GlobalsModRef already
handles it. This does mean that this no longer works under
-disable-basic-aa, but that's a testing only option.
2022-07-07 10:35:44 +02:00
Nikita Popov f96cb66d19 [ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC)
As constant expressions can no longer trap, it only makes sense to
call isSafeToSpeculativelyExecute on Instructions, so limit the
API to accept only them, rather than general Operators or Values.
2022-07-06 11:12:49 +02:00
Nikita Popov 8ee913d83b [IR] Remove Constant::canTrap() (NFC)
As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.
2022-07-06 10:36:47 +02:00
Nikita Popov 935570b2ad [ConstExpr] Don't create div/rem expressions
This removes creation of udiv/sdiv/urem/srem constant expressions,
in preparation for their removal. I've added a
ConstantExpr::isDesirableBinOp() predicate to determine whether
an expression should be created for a certain operator.

With this patch, div/rem expressions can still be created through
explicit IR/bitcode; forbidding them entirely will be the next step.
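
A sketch of what such an opcode filter can look like (illustrative only; see
the patch for the actual isDesirableBinOp definition):

    #include "llvm/IR/Instruction.h"
    using namespace llvm;

    // Sketch: div/rem are no longer desirable as constant expressions
    // because they can trap.
    static bool isDesirableBinOpSketch(unsigned Opcode) {
      switch (Opcode) {
      case Instruction::UDiv:
      case Instruction::SDiv:
      case Instruction::URem:
      case Instruction::SRem:
        return false;
      default:
        return true;
      }
    }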

Differential Revision: https://reviews.llvm.org/D128820
2022-07-05 15:54:53 +02:00
Nikita Popov e4d1d0cc2c [SCEV] Fix isImpliedViaMerge() with values from previous iteration (PR56242)
When trying to prove an implied condition on a phi by proving it
for all incoming values, we need to be careful about values coming
from a backedge, as these may refer to a previous loop iteration.
A variant of this issue was fixed in D101829, but the dominance
condition used there isn't quite right: It checks that the value
dominates the incoming block, which doesn't exclude backedges
(values defined in a loop will usually dominate the loop latch,
which is the incoming block of the backedge).

Instead, we should be checking for domination of the phi block.
Any values defined inside the loop will not dominate the loop
header phi.
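
A conceptual C++ analogue of the hazard (illustrative only): a fact proven
about the value computed in the current iteration says nothing about the
value arriving over the backedge, which was computed one iteration earlier.

    #include <cstdio>

    // `Prev` plays the role of the phi's backedge input.
    int main() {
      int Prev = 0;                  // incoming value from the preheader
      for (int I = 1; I <= 5; ++I) {
        int Cur = I * I;             // defined inside the loop: dominates the
                                     // latch, but not the loop-header phi
        std::printf("cur=%d prev=%d\n", Cur, Prev);
        Prev = Cur;                  // value carried over the backedge
      }
      return 0;
    }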

Fixes https://github.com/llvm/llvm-project/issues/56242.

Differential Revision: https://reviews.llvm.org/D128640
2022-07-05 15:31:23 +02:00
Nikita Popov f93cd56262 [BPI] Avoid ConstantExpr::get()
Use ConstantFoldBinaryOpOperands() instead, to prepare for the case
where not all binary operators have a constant expression form.

I believe this code actually intended to set OnlyIfReduced=true,
however ConstantExpr::get() actually accepts a Flags argument at
that position (and OnlyIfReducedTy as the next argument), so this
ended up creating a constant expression with some random flag
(probably exact or nuw, depending on the opcode).
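
Roughly, the pitfall and the replacement (a hedged sketch based on my reading
of the ConstantExpr::get and ConstantFolding APIs, not quoted from the patch):

    #include "llvm/Analysis/ConstantFolding.h"
    #include "llvm/IR/Constants.h"
    using namespace llvm;

    // Pitfall: ConstantExpr::get(Opc, C1, C2, /*Flags=*/1) puts the 1 into
    // the Flags parameter, not OnlyIfReduced. The replacement folds without
    // ever materializing an expression:
    static Constant *tryFold(unsigned Opc, Constant *C1, Constant *C2,
                             const DataLayout &DL) {
      return ConstantFoldBinaryOpOperands(Opc, C1, C2, DL); // null if no fold
    }
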
2022-07-04 16:04:26 +02:00
Nikita Popov 4905bcac00 [ConstantFolding] Check return value of ConstantFoldInstOperandsImpl()
This operation is fallible, but ConstantFoldConstantImpl() is not.
If we fail to fold, we should simply return the original expression.

I don't think this can cause any issues right now, but it becomes
a problem once we make ConstantFoldInstOperandsImpl() not create a
constant expression for everything it possibly could.
2022-07-04 14:19:59 +02:00
Chen Zheng 2c3784cff8 [SCEV] recognize llvm.annotation intrinsic
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D127835
2022-07-03 21:02:50 -04:00
Nuno Lopes 53dc0f1078 [NFC] Switch a few uses of undef to poison as placeholders for unreachable code 2022-07-03 14:34:03 +01:00
Nikita Popov 560e694d48 [AST] Don't assert instruction reads/writes memory (PR51333)
This function is well-defined for an instruction that doesn't access
memory (and thus trivially doesn't alias anything in the AST), so
drop the assert. We can end up with a readnone call here if we
originally created a MemoryDef for an indirect call, which was
later replaced with a direct readnone call.

Fixes https://github.com/llvm/llvm-project/issues/51333.

Differential Revision: https://reviews.llvm.org/D127947
2022-07-01 17:04:48 +02:00
Nikita Popov c8bd3e7825 [SCEV] Remove unnecessary pointer handling in BuildConstantFromSCEV (NFCI)
Nowadays, we do not allow pointers in multiplies, and adds can only
have a single pointer, which is also guaranteed to be last by
complexity sorting. As such, we can somewhat simplify the treatment
of pointer types.
2022-07-01 16:28:56 +02:00
Chen Zheng 758de0e931 [InstructionSimplify] handle denormal input for fcmp
Handle denormal constant input for fcmp instructions based on the
denormal handling mode.
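
A self-contained illustration of the underlying idea (standard C++, not the
InstructionSimplify code): under a flush-to-zero denormal mode, a denormal
input behaves like a (signed) zero in the comparison.

    #include <cfloat>
    #include <cmath>
    #include <cstdio>

    // Emulate a mode that flushes denormal inputs to zero, preserving sign.
    static double flushDenormalToZero(double X) {
      return std::fpclassify(X) == FP_SUBNORMAL ? std::copysign(0.0, X) : X;
    }

    int main() {
      double Denormal = DBL_MIN / 4.0;                       // subnormal
      std::printf("%d\n", flushDenormalToZero(Denormal) == 0.0);  // prints 1
      return 0;
    }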

Reviewed By: spatel, dcandler

Differential Revision: https://reviews.llvm.org/D128647
2022-07-01 03:51:28 -04:00
Nikita Popov 9ac386495d [ConstExpr] Don't create insertvalue expressions
In preparation for the removal in D128719, this stops creating
insertvalue constant expressions (well, unless they are directly
used in LLVM IR).

Differential Revision: https://reviews.llvm.org/D128792
2022-07-01 09:23:28 +02:00
Fangrui Song 27abff670b Remove unneeded cl::ZeroOrMore. NFC 2022-06-30 19:11:27 -07:00
Nikita Popov 0445c340ff [ConstantFold] Support loads in ConstantFoldInstOperands()
This allows all constant folding to happen through a single
function, without requiring special handling for loads at each
call-site.

This may not be NFC because some callers currently don't do that
special handling.
2022-06-30 12:18:15 +02:00
Nikita Popov 54fcde42c0 [InlineCost] Simplify constant folding
Use a common ConstantFoldInstOperands-based constant folding
implementation, instead of specifying the folding function for
each function individually. Going through the generic handling
doesn't appear to have any significant compile-time impact.

As the test change shows, this is not NFC, because we now use
DataLayout-aware constant folding, which can do slightly better
in some cases (e.g. those involving GEPs).
2022-06-30 11:49:17 +02:00
Nikita Popov a6d4b4138f [ConstantFold] Support compares in ConstantFoldInstOperands()
Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).

This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.
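
Roughly what the call-site simplification looks like (a sketch based on my
reading of the ConstantFolding API; the helper names here are illustrative):

    #include "llvm/Analysis/ConstantFolding.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    static Constant *foldOld(Instruction *I, ArrayRef<Constant *> Ops,
                             const DataLayout &DL) {
      // Old pattern: every caller had to special-case compares first.
      if (auto *Cmp = dyn_cast<CmpInst>(I))
        return ConstantFoldCompareInstOperands(Cmp->getPredicate(), Ops[0],
                                               Ops[1], DL);
      return ConstantFoldInstOperands(I, Ops, DL);
    }

    static Constant *foldNew(Instruction *I, ArrayRef<Constant *> Ops,
                             const DataLayout &DL) {
      // New pattern: one entry point handles compares (and insertvalue) too.
      return ConstantFoldInstOperands(I, Ops, DL);
    }
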
2022-06-30 11:05:24 +02:00
Chuanqi Xu 0b5ead6590 [WebAssembly] Don't set musttail for coroutines when tail-call is not
enabled

C++20 coroutines couldn't be compiled to WebAssembly because an
optimization named symmetric transfer requires support for musttail
calls, which WebAssembly doesn't support yet.

This patch tries to fix the problem by adding a supportsTailCalls
method to TargetTransformImpl to skip the symmetric transfer when the
tail-call feature is not supported.
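
Conceptually, the gate looks like this (a hedged sketch, not the actual
coroutine-splitting code; the query is assumed to be exposed through
TargetTransformInfo):

    #include "llvm/Analysis/TargetTransformInfo.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // Only mark the resume call as musttail when the target can actually
    // lower tail calls; otherwise fall back to an ordinary call.
    static void markSymmetricTransfer(CallInst *ResumeCall,
                                      const TargetTransformInfo &TTI) {
      if (TTI.supportsTailCalls())
        ResumeCall->setTailCallKind(CallInst::TCK_MustTail);
    }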

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D128794
2022-06-30 11:15:40 +08:00
Nikita Popov 30ea6a0636 [SCEV] Don't create udiv constant expression (NFC)
Work on APInts to make it clear that this will not create a
constant expression.

This code path is not reached if the RHS is zero.
2022-06-29 14:35:05 +02:00
Florian Hahn 675080a453
[SCEV] Construct SCEV iteratively.
This patch updates SCEV construction to work iteratively instead of recursively
in most cases. It resolves stack overflow issues when trying to construct SCEVs
for certain inputs, e.g. PR45201.

The basic approach is to use a worklist to queue operands of V which
need to be created before V. To do so, the current patch adds a
getOperandsToCreate function which collects the operands SCEV
construction depends on for a given value. This is a slight duplication
with createSCEV.
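
A generic, self-contained sketch of the worklist shape (not the
ScalarEvolution implementation): a value stays on the worklist until all of
its operands have results, and only then is it built.

    #include <map>
    #include <vector>

    struct Node {
      int Id;
      std::vector<const Node *> Ops;   // operands this node depends on
    };

    static int build(const Node *Root, std::map<const Node *, int> &Built) {
      std::vector<const Node *> Worklist{Root};
      while (!Worklist.empty()) {
        const Node *N = Worklist.back();
        bool Ready = true;
        for (const Node *Op : N->Ops)
          if (!Built.count(Op)) {      // analogous to getOperandsToCreate
            Worklist.push_back(Op);
            Ready = false;
          }
        if (!Ready)
          continue;                    // build the operands first
        Worklist.pop_back();
        int Result = N->Id;            // stand-in for "createSCEV(N)"
        for (const Node *Op : N->Ops)
          Result += Built[Op];
        Built[N] = Result;
      }
      return Built[Root];
    }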

At the moment, SCEVs for phis are still created recursively.

Fixes #32078, #42594, #44546, #49293, #49599, #55333, #55511

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D114650
2022-06-29 11:29:31 +01:00
Nikita Popov 16033ffdd9 [ConstExpr] Remove more leftovers of extractvalue expression (NFC)
Remove some leftover bits of extractvalue handling after the
removal in D125795.
2022-06-29 10:45:19 +02:00
Martin Sebor e263a7670e [InstCombine] Look through more casts when folding memchr and memcmp
Enhance getConstantDataArrayInfo to let the memchr and memcmp library
call folders look through arbitrarily long sequences of bitcast and
GEP instructions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128364
2022-06-28 15:58:42 -06:00
Nikita Popov 5548e807b5 [IR] Remove support for extractvalue constant expression
This removes the extractvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
extractvalue is already not supported in bitcode, so we do not need
to worry about bitcode auto-upgrade.

Uses of ConstantExpr::getExtractValue() should be replaced with
IRBuilder::CreateExtractValue() (if the fact that the result is
constant is not important) or ConstantFoldExtractValueInstruction()
(if it is). Though for this particular case, it is also possible
and usually preferable to use getAggregateElement() instead.
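
Typical migrations look roughly like this (a sketch; exact call sites vary):

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/IRBuilder.h"
    using namespace llvm;

    static Value *migrate(IRBuilder<> &B, Constant *Agg, unsigned Idx) {
      // Old: Constant *C = ConstantExpr::getExtractValue(Agg, {Idx});
      // If only a particular element is needed, prefer:
      if (Constant *Elt = Agg->getAggregateElement(Idx))
        return Elt;
      // Otherwise build an instruction (IRBuilder constant-folds when it can):
      return B.CreateExtractValue(Agg, {Idx});
    }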

The C API function LLVMConstExtractValue() is removed, as the
underlying constant expression no longer exists. Instead,
LLVMBuildExtractValue() should be used (which will constant fold
or create an instruction). Depending on the use-case,
LLVMGetAggregateElement() may also be used instead.

Differential Revision: https://reviews.llvm.org/D125795
2022-06-28 10:40:17 +02:00
Bradley Smith a83aa33d1b [IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundamental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.

Differential Revision: https://reviews.llvm.org/D127976
2022-06-27 10:48:45 +00:00
Nikita Popov 327307d9d4 [SCEV] Assert that GEP source element type is sized (NFC)
This is checked by the IR verifier, so replace the condition with
an assert.
2022-06-27 10:51:09 +02:00
Florian Hahn e4e22b6d80
[SCEV] Use SCEVUnknown(poison) instead of SCEVUnknown(undef).
Use poison instead of undef for SCEVUnknown of unreachable values.
This should be in line with the movement to replace undef with poison
when possible.

Suggested in D114650.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128586
2022-06-27 09:33:05 +01:00
Kazu Hirata a81b64a1fb [llvm] Use Optional::has_value instead of Optional::hasValue (NFC)
This patch replaces x.hasValue() with x.has_value() where x is not
contextually convertible to bool.
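
For illustration, the same rename shown on std::optional, whose interface
llvm::Optional mirrors:

    #include <optional>

    bool hasPayload(const std::optional<int> &X) {
      // Old spelling: X.hasValue()   (llvm::Optional only)
      // New spelling, outside boolean contexts:
      return X.has_value();
      // In a conditional, the implicit bool conversion is used instead:
      //   if (X) { ... }
    }
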
2022-06-26 16:10:42 -07:00
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Mircea Trofin 7ae92a69c2 [MLInliner] No need to invalidate everything post-inlining.
We really just need to invalidate loop info and the dominator tree, in
addition to the FunctionPropertiesInfo we were invalidating originally.
Doing more adds unnecessary compile time overhead.
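
The narrower invalidation would look roughly like this (a sketch of new-PM
API usage, not the patch itself; FunctionPropertiesAnalysis is elided):

    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/IR/Dominators.h"
    #include "llvm/IR/PassManager.h"
    using namespace llvm;

    // Preserve everything except the analyses that inlining actually breaks.
    static void invalidateAfterInlining(Function &F,
                                        FunctionAnalysisManager &FAM) {
      PreservedAnalyses PA = PreservedAnalyses::all();
      PA.abandon<DominatorTreeAnalysis>();
      PA.abandon<LoopAnalysis>();
      FAM.invalidate(F, PA);
    }
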
2022-06-24 18:22:06 -07:00
Mingming Liu e0d069598b [Inline] Annotate inline pass name with link phase information for analysis.
The annotation is flag gated; flag is turned off by default.

Differential Revision: https://reviews.llvm.org/D125495
2022-06-24 10:06:43 -07:00
Dawid Jurczak b7e7f4e1b6 [InlineCost] Improve debugging experience by adding a print of the initial inlining cost
Differential Revision: https://reviews.llvm.org/D127597
2022-06-24 16:27:26 +02:00
Nikita Popov 871197d0a3 [MemoryBuiltins] Accept any value in getInitialValueOfAllocation() (NFC)
Drop the requirement that getInitialValueOfAllocation() must be
passed an allocator function, shifting the responsibility for
checking that into the function (which it does anyway). The
motivation is to avoid some calls to isAllocationFn(), which has
somewhat ill-defined semantics (given the number of
allocator-related attributes we have floating around...)

(For this function, all we eventually need is an allockind of
zeroed or uninitialized.)

Differential Revision: https://reviews.llvm.org/D127274
2022-06-24 16:08:07 +02:00