llvm-project

Commit Graph

Author	SHA1	Message	Date
Graham Hunter	db8fcb2c25	[LAA] Add recursive IR walker for forked pointers This builds on the previous forked pointers patch, which only accepted a single select as the pointer to check. A recursive function to walk through IR has been added, which searches for either a loop-invariant or addrec SCEV. This will only handle a single fork at present, so selects of selects or a GEP with a select for both the base and offset will be rejected. There is also a recursion limit with a cli option to change it. Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D108699	2022-07-18 12:06:17 +01:00
Kazu Hirata	601b3a13de	[Analysis] Qualify auto variables in for loops (NFC)	2022-07-16 23:26:34 -07:00
Kazu Hirata	92a1b2afc8	[Analysis] Remove isArithmeticRecurrenceKind The last use was removed on Jul 30, 2021 in commit `9d35594993`.	2022-07-16 13:23:32 -07:00
Max Kazantsev	883e83d5fe	[NFC][SCEV] Rename variable to correspond to its current meaning	2022-07-15 22:33:57 +07:00
Nikita Popov	2659e1bf4b	[SCEV] List all binops in getOperandsToCreate() Explicitly list all binops rather than having a default case. There were two bugs here: 1. U->getOpcode() was used instead of BO->Opcode, which means we used the logic for the wrong opcode in some cases. 2. SCEV construction does not support LShr. We should return unknown for it rather than recursing into the operands.	2022-07-15 17:08:48 +02:00
Florian Hahn	e7ec1746a6	[SCEV] Avoid creating unnecessary SCEVs for SelectInsts. After `675080a453`, we always create SCEVs for all operands of a SelectInst. This can cause notable compile-time regressions compared to the recursive algorithm, which only evaluates the operands if the select is in a form we can create a usable expression. This approach adds additional logic to getOperandsToCreate to only queue operands for selects if we will later be able to construct a usable SCEV. Unfortunately this introduces a bit of coupling between actual SCEV construction for selects and getOperandsToCreate, but I am not sure if there are better alternatives to address the regression mentioned for `675080a453`. This doesn't have any notable compile-time impact on CTMark. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D129731	2022-07-14 09:23:47 -07:00
Philip Reames	3bc09c7da5	[SCEVExpander] Allow udiv with isKnownNonZero(RHS) + add vscale case Motivation here is to unblock LSR's ability to use ICmpZero uses - the major effect of which is to enable count down IVs. The test changes reflect this goal, but the potential impact is much broader since this isn't a change in LSR at all. SCEVExpander needs(*) to prove that expanding the expression is safe anywhere the SCEV expression is valid. In general, we can't expand any node which might fault (or exhibit UB) unless we can either a) prove it won't fault, or b) guard the faulting case. We'd been allowing non-zero constants here; this change extends it to non-zero values. vscale is never zero. This is already implemented in ValueTracking, and this change just adds the same logic in SCEV's range computation (which in turn drives isKnownNonZero). We should common up some logic here, but let's do that in separate changes. (*) As an aside, "needs" is such an interesting word here. First, we don't actually need to guard this at all; we could choose to emit a select for the RHS of every udiv and remove this code entirely. Secondly, the property being checked here is way too strong. What the client actually needs is to expand the SCEV at some particular point in some particular loop. In the examples, the original urem dominates that loop and yet we completely ignore that information when analyzing legality. I don't plan to actively pursue either direction, just noting it for future reference. Differential Revision: https://reviews.llvm.org/D129710	2022-07-14 08:56:58 -07:00
Dawid Jurczak	d71128d97d	[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand> This patch is a follow-up to https://reviews.llvm.org/D129468 and addresses one of the comments from that review: https://reviews.llvm.org/D129468#3643295 Differential Revision: https://reviews.llvm.org/D129565	2022-07-14 17:22:32 +02:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Kazu Hirata	30d3f56e33	[Analysis] clang-format InlineAdvisor.cpp (NFC)	2022-07-13 13:38:50 -07:00
Max Kazantsev	30e33b4b81	[SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional	2022-07-13 18:54:25 +07:00
Peter Waller	8acf74fd56	[InstCombine][SVE] Bail out of isSafeToLoadUnconditionally for scalable types `isSafeToLoadUnconditionally` currently assumes sized types. Bail out for now. This fixes a TypeSize warning reachable from instcombine via (load (select cond, ptr, ptr)). Differential Revision: https://reviews.llvm.org/D129477	2022-07-13 10:07:36 +00:00
Dawid Jurczak	165240fe38	[NFC] Fix compile time regression seen on some benchmarks after `a630ea3003` commit The goal of this change is to fix most of the compile-time slowdown seen after commit `a630ea3003` on the lencod and sqlite3 benchmarks. There are 3 improvements included in this patch: 1. In getNumOperands, get the value directly from SmallNumOps when possible. 2. Inline getLargePtr by moving its definition to the header. 3. In TBAAStructTypeNode::getField, get all operands once instead of taking operands one by one in a loop. Differential Revision: https://reviews.llvm.org/D129468	2022-07-12 15:00:27 +02:00
Aiden Grossman	f3939dc509	[mlgo] Simplify autogenerated regalloc model Currently the autogenerated regalloc model will sometimes output an incorrect LR index to evict instead of the first LR with the mask set to 1. This trips an assertion within the MLRegallocAdvisor that the evicted LR has a mask of 1. This patch, made possible by https://reviews.llvm.org/D124565, simplifies the autogenerated model by taking away all unnecessary features and getting rid of the functions that were previously used to mix in all the necessary inputs so they wouldn't get pruned by the Tensorflow XLA AOT compiler. This is no longer necessary after the previously mentioned patch. This also fixes the nondeterministic behavior that is sometimes observed where the autogenerated model will simply output 0 instead of the correct index. Reviewed By: yundiqian Differential Revision: https://reviews.llvm.org/D129254	2022-07-11 13:23:31 -07:00
Mircea Trofin	24c6c35270	[mlgo] Don't provide default model URLs Pointed out in Issue #56432: the current reference models may not be quite friendly to open source projects. Their purpose is only illustrative - the expectation is that projects would train their own. To avoid unintentionally pulling such a model, made the URL cmake setting require explicit user setting. Differential Revision: https://reviews.llvm.org/D129342	2022-07-11 07:37:14 -07:00
David Sherwood	03fee6712a	[LoopVectorize] Add option to use active lane mask for loop control flow Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop. This patch adds support for using the active lane mask for control flow by: 1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set. 2. Extract the first bit of this mask and use this bit for the conditional branch. I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region. Differential Revision: https://reviews.llvm.org/D125301	2022-07-11 13:46:55 +01:00
Nicolai Hähnle	ede600377c	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in `e9ce1a5880` due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 10:29:15 +02:00
Nicolai Hähnle	e9ce1a5880	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit `e6f1f06245`. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.	2022-07-10 09:54:30 +02:00
Nicolai Hähnle	e6f1f06245	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 09:15:08 +02:00
Wenlei He	a78f436c3f	[Inliner] Make recursive inlinee stack size limit tunable For recursive callers, we want to be conservative when inlining callees with large stack size. We currently have a limit `InlineConstants::TotalAllocaSizeRecursiveCaller`, but that is hard coded. We found the current limit insufficient to suppress problematic inlining that bloats stack size for deep recursion. This change adds a switch to make the limit tunable as a mitigation. Differential Revision: https://reviews.llvm.org/D129411	2022-07-08 21:32:39 -07:00
Nikita Popov	d686ea32b1	[ConstantFolding] Guard against unfolded FP binop Check that the operation actually folded before trying to flush denormals. A minor variation of the pr33453 test exposed this with the FP binops marked as undesirable.	2022-07-08 17:45:33 +02:00
Nikita Popov	4a579abd9f	[GlobalsModRef] Don't override getModRefBehavior() for CallBase BasicAA will already call getModRefBehavior() on the Function of the CallBase if there are no operand bundles. This happens through getBestAAResults(), i.e. it is a recursive call that will query other AA providers, not just the BasicAA implementation. As such, there is no need to reimplement the same functionality in GlobalsModRef, a combination of BasicAA and GlobalsModRef already handles it. This does mean that this no longer works under -disable-basic-aa, but that's a testing only option.	2022-07-07 10:35:44 +02:00
Nikita Popov	f96cb66d19	[ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC) As constant expressions can no longer trap, it only makes sense to call isSafeToSpeculativelyExecute on Instructions, so limit the API to accept only them, rather than general Operators or Values.	2022-07-06 11:12:49 +02:00
Nikita Popov	8ee913d83b	[IR] Remove Constant::canTrap() (NFC) As integer div/rem constant expressions are no longer supported, constants can no longer trap and are always safe to speculate. Remove the Constant::canTrap() method and its usages.	2022-07-06 10:36:47 +02:00
Nikita Popov	935570b2ad	[ConstExpr] Don't create div/rem expressions This removes creation of udiv/sdiv/urem/srem constant expressions, in preparation for their removal. I've added a ConstantExpr::isDesirableBinOp() predicate to determine whether an expression should be created for a certain operator. With this patch, div/rem expressions can still be created through explicit IR/bitcode, forbidding them entirely will be the next step. Differential Revision: https://reviews.llvm.org/D128820	2022-07-05 15:54:53 +02:00
Nikita Popov	e4d1d0cc2c	[SCEV] Fix isImpliedViaMerge() with values from previous iteration (PR56242) When trying to prove an implied condition on a phi by proving it for all incoming values, we need to be careful about values coming from a backedge, as these may refer to a previous loop iteration. A variant of this issue was fixed in D101829, but the dominance condition used there isn't quite right: It checks that the value dominates the incoming block, which doesn't exclude backedges (values defined in a loop will usually dominate the loop latch, which is the incoming block of the backedge). Instead, we should be checking for domination of the phi block. Any values defined inside the loop will not dominate the loop header phi. Fixes https://github.com/llvm/llvm-project/issues/56242. Differential Revision: https://reviews.llvm.org/D128640	2022-07-05 15:31:23 +02:00
Nikita Popov	f93cd56262	[BPI] Avoid ConstantExpr::get() Use ConstantFoldBinaryOpOperands() instead, to prepare for the case where not all binary operators have a constant expression form. I believe this code actually intended to set OnlyIfReduced=true, however ConstantExpr::get() actually accepts a Flags argument at that position (and OnlyIfReducedTy as the next argument), so this ended up creating a constant expression with some random flag (probably exact or nuw depending on which).	2022-07-04 16:04:26 +02:00
Nikita Popov	4905bcac00	[ConstantFolding] Check return value of ConstantFoldInstOperandsImpl() This operation is fallible, but ConstantFoldConstantImpl() is not. If we fail to fold, we should simply return the original expression. I don't think this can cause any issues right now, but it becomes a problem once we make ConstantFoldInstOperandsImpl() not create a constant expression for everything it possibly could.	2022-07-04 14:19:59 +02:00
Chen Zheng	2c3784cff8	[SCEV] recognize llvm.annotation intrinsic Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D127835	2022-07-03 21:02:50 -04:00
Nuno Lopes	53dc0f1078	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-07-03 14:34:03 +01:00
Nikita Popov	560e694d48	[AST] Don't assert instruction reads/writes memory (PR51333) This function is well-defined for an instruction that doesn't access memory (and thus trivially doesn't alias anything in the AST), so drop the assert. We can end up with a readnone call here if we originally created a MemoryDef for an indirect call, which was later replaced with a direct readnone call. Fixes https://github.com/llvm/llvm-project/issues/51333. Differential Revision: https://reviews.llvm.org/D127947	2022-07-01 17:04:48 +02:00
Nikita Popov	c8bd3e7825	[SCEV] Remove unnecessary pointer handling in BuildConstantFromSCEV (NFCI) Nowadays, we do not allow pointers in multiplies, and adds can only have a single pointer, which is also guaranteed to be last by complexity sorting. As such, we can somewhat simplify the treatment of pointer types.	2022-07-01 16:28:56 +02:00
Chen Zheng	758de0e931	[InstructionSimplify] handle denormal input for fcmp Handle denormal constant input for fcmp instructions based on the denormal handling mode. Reviewed By: spatel, dcandler Differential Revision: https://reviews.llvm.org/D128647	2022-07-01 03:51:28 -04:00
Nikita Popov	9ac386495d	[ConstExpr] Don't create insertvalue expressions In preparation for the removal in D128719, this stops creating insertvalue constant expressions (well, unless they are directly used in LLVM IR). Differential Revision: https://reviews.llvm.org/D128792	2022-07-01 09:23:28 +02:00
Fangrui Song	27abff670b	Remove unneeded cl::ZeroOrMore. NFC	2022-06-30 19:11:27 -07:00
Nikita Popov	0445c340ff	[ConstantFold] Support loads in ConstantFoldInstOperands() This allows all constant folding to happen through a single function, without requiring special handling for loads at each call-site. This may not be NFC because some callers currently don't do that special handling.	2022-06-30 12:18:15 +02:00
Nikita Popov	54fcde42c0	[InlineCost] Simplify constant folding Use a common ConstantFoldInstOperands-based constant folding implementation, instead of specifying the folding function for each function individually. Going through the generic handling doesn't appear to have any significant compile-time impact. As the test change shows, this is not NFC, because we now use DataLayout-aware constant folding, which can do slightly better in some cases (e.g. those involving GEPs).	2022-06-30 11:49:17 +02:00
Nikita Popov	a6d4b4138f	[ConstantFold] Supports compares in ConstantFoldInstOperands() Support compares in ConstantFoldInstOperands(), instead of forcing the use of ConstantFoldCompareInstOperands(). Also handle insertvalue (extractvalue was already handled). This removes a footgun, where many uses of ConstantFoldInstOperands() need a separate check for compares beforehand. It's particularly insidious if called on a constant expression, because it doesn't fail in that case, but will just not do DL-dependent folding.	2022-06-30 11:05:24 +02:00
Chuanqi Xu	0b5ead6590	[WebAssembly] Don't set musttail for coroutines when tail-call is not enabled C++20 coroutines could not be compiled to WebAssembly because an optimization named symmetric transfer requires support for musttail calls, which WebAssembly doesn't support yet. This patch tries to fix the problem by adding a supportsTailCalls method to TargetTransformImpl to skip the symmetric transfer when the tail-call feature is not supported. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D128794	2022-06-30 11:15:40 +08:00
Nikita Popov	30ea6a0636	[SCEV] Don't create udiv constant expression (NFC) Work on APInts to make it clear that this will not create a constant expression. This code path is not reached if the RHS is zero.	2022-06-29 14:35:05 +02:00
Florian Hahn	675080a453	[SCEV] Construct SCEV iteratively. This patch updates SCEV construction to work iteratively instead of recursively in most cases. It resolves stack overflow issues when trying to construct SCEVs for certain inputs, e.g. PR45201. The basic approach is to use a worklist to queue operands of V which need to be created before V. To do so, the current patch adds a getOperandsToCreate function which collects the operands SCEV construction depends on for a given value. This is a slight duplication with createSCEV. At the moment, SCEVs for phis are still created recursively. Fixes #32078, #42594, #44546, #49293, #49599, #55333, #55511 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D114650	2022-06-29 11:29:31 +01:00
Nikita Popov	16033ffdd9	[ConstExpr] Remove more leftovers of extractvalue expression (NFC) Remove some leftover bits of extractvalue handling after the removal in D125795.	2022-06-29 10:45:19 +02:00
Martin Sebor	e263a7670e	[InstCombine] Look through more casts when folding memchr and memcmp Enhance getConstantDataArrayInfo to let the memchr and memcmp library call folders look through arbitrarily long sequences of bitcast and GEP instructions. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128364	2022-06-28 15:58:42 -06:00
Nikita Popov	5548e807b5	[IR] Remove support for extractvalue constant expression This removes the extractvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. extractvalue is already not supported in bitcode, so we do not need to worry about bitcode auto-upgrade. Uses of ConstantExpr::getExtractValue() should be replaced with IRBuilder::CreateExtractValue() (if the fact that the result is constant is not important) or ConstantFoldExtractValueInstruction() (if it is). Though for this particular case, it is also possible and usually preferable to use getAggregateElement() instead. The C API function LLVMConstExtractValue() is removed, as the underlying constant expression no longer exists. Instead, LLVMBuildExtractValue() should be used (which will constant fold or create an instruction). Depending on the use-case, LLVMGetAggregateElement() may also be used instead. Differential Revision: https://reviews.llvm.org/D125795	2022-06-28 10:40:17 +02:00
Bradley Smith	a83aa33d1b	[IR] Move vector.insert/vector.extract out of experimental namespace These intrinsics are now fundamental for SVE code generation and have been present for a year and a half, hence move them out of the experimental namespace. Differential Revision: https://reviews.llvm.org/D127976	2022-06-27 10:48:45 +00:00
Nikita Popov	327307d9d4	[SCEV] Assert that GEP source element type is sized (NFC) This is checked by the IR verifier, so replace the condition with an assert.	2022-06-27 10:51:09 +02:00
Florian Hahn	e4e22b6d80	[SCEV] Use SCEVUnknown(poison) instead of SCEVUnknown(undef). Use poison instead of undef for SCEVUnknown of unreachable values. This should be in line with the movement to replace undef with poison when possible. Suggested in D114650. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128586	2022-06-27 09:33:05 +01:00
Kazu Hirata	a81b64a1fb	[llvm] Use Optional::has_value instead of Optional::hasValue (NFC) This patch replaces x.hasValue() with x.has_value() where x is not contextually convertible to bool.	2022-06-26 16:10:42 -07:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Mircea Trofin	7ae92a69c2	[MLInliner] No need to invalidate everything post-inlining. We really just need to invalidate loop info and the dominator tree, in addition to the FunctionPropertiesInfo we were invalidating originally. Doing more adds unnecessary compile time overhead.	2022-06-24 18:22:06 -07:00
Mingming Liu	e0d069598b	[Inline] Annotate inline pass name with link phase information for analysis. The annotation is flag gated; flag is turned off by default. Differential Revision: https://reviews.llvm.org/D125495	2022-06-24 10:06:43 -07:00
Dawid Jurczak	b7e7f4e1b6	[InlineCost] Improve debugging experience by adding print about initial inlining cost Differential Revision: https://reviews.llvm.org/D127597	2022-06-24 16:27:26 +02:00
Nikita Popov	871197d0a3	[MemoryBuiltins] Accept any value in getInitialValueOfAllocation() (NFC) Drop the requirement that getInitialValueOfAllocation() must be passed an allocator function, shifting the responsibility for checking that into the function (which it does anyway). The motivation is to avoid some calls to isAllocationFn(), which has somewhat ill-defined semantics (given the number of allocator-related attributes we have floating around...) (For this function, all we eventually need is an allockind of zeroed or uninitialized.) Differential Revision: https://reviews.llvm.org/D127274	2022-06-24 16:08:07 +02:00
Nikita Popov	54eff7da3c	[AA] Export isEscapeSource() API (NFC) Export API that was previously private to BasicAliasAnalysis and will be used in D127202.	2022-06-24 11:59:15 +02:00
Nikita Popov	bcadfc2595	[BasicAA] Handle passthru calls in isEscapeSource() isEscapeSource() currently considers all call return values as escape sources. However, CaptureTracking can look through certain calls, so we shouldn't consider these as escape sources either. The corresponding CaptureTracking code is: `7c9a3825b8/llvm/lib/Analysis/CaptureTracking.cpp (L332-L333)` Differential Revision: https://reviews.llvm.org/D128444	2022-06-24 11:00:57 +02:00
Wolfgang Pieb	c50e6f590c	[Inline] Introduce a backend option to suppress inlining of functions with large stack sizes. The hidden option max-inline-stacksize=<N> prevents the inlining of functions with a stack size larger than N. Reviewed By: mtrofin, aeubanks Differential Review: https://reviews.llvm.org/D127988	2022-06-23 10:57:46 -07:00
David Green	bd1a4c8565	[ValueTracking] Teach isKnownNonZero that a vscale is never 0. A llvm.vscale will always be at least 1, never zero. Teaching that to isKnownNonZero can help fold away some statically known compares. Differential Revision: https://reviews.llvm.org/D128217	2022-06-23 15:25:24 +01:00
Mingming Liu	bc856eb3fc	[SampleProfile][Inline] Annotate sample profile inline remarks with link phase (prelink/postlink) information. Differential Revision: https://reviews.llvm.org/D126833	2022-06-22 17:00:53 -07:00
Vasileios Porpodas	7a9ad25769	Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles" This reverts commit `6d6268dcbf`. Review: https://reviews.llvm.org/D125712	2022-06-21 18:35:29 -07:00
Vasileios Porpodas	6d6268dcbf	Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles" This reverts commit `6f88acf410`.	2022-06-21 17:07:21 -07:00
Vasileios Porpodas	6f88acf410	[SLP][X86] Improve reordering to consider alternate instruction bundles During the reordering transformation we should try to avoid reordering bundles like fadd,fsub because this may block them being matched into a single vector instruction in x86. We do this by checking if a TreeEntry is such a pattern and adding it to the list of TreeEntries with orders that need to be considered. Differential Revision: https://reviews.llvm.org/D125712	2022-06-21 16:44:48 -07:00
Martin Sebor	b19194c032	[InstCombine] handle subobjects of constant aggregates Remove the known limitation of the library function call folders to only work with top-level arrays of characters (as per the TODO comment in the code) and allows them to also fold calls involving subobjects of constant aggregates such as member arrays.	2022-06-21 11:55:14 -06:00
Mircea Trofin	3f8e4169c1	[FunctionPropertiesAnalysis] Generalize support for unreachable Generalized support for subgraphs that get rendered unreachable, for both `call` and `invoke` cases. Differential Revision: https://reviews.llvm.org/D127921	2022-06-21 08:18:01 -07:00
Nikita Popov	ed63fcb232	[GlobalsModRef] Remove check for allocator calls As the FIXME already indicates, I don't see why this code would be necessary. If there's a call to an allocator function, that should get treated just like any other function call -- usually it will be a declaration and handled conservatively based on memory attributes only. There should be no need to explicitly force it to be modref. No test failures either, so I think this is just dead code. Differential Revision: https://reviews.llvm.org/D127273	2022-06-21 14:24:13 +02:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Kazu Hirata	064a08cd95	Don't use Optional::hasValue (NFC)	2022-06-20 20:05:16 -07:00
Kazu Hirata	5413bf1bac	Don't use Optional::hasValue (NFC)	2022-06-20 11:33:56 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
David Candler	d3919a8cc5	[ConstantFolding] Respect denormal handling mode attributes when folding instructions Depending on the environment, a floating point instruction should treat denormal inputs as zero, and/or flush a denormal output to zero. Denormals are not currently accounted for when an instruction gets folded to a constant, which can lead to differences in output between a folded and an unfolded instruction when running on the target. The denormal handling mode can be set by the function level attribute denormal-fp-math, which this patch uses to determine whether any denormal inputs to or outputs from folding should be zero, and that the sign is set appropriately. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D116952	2022-06-20 16:41:46 +01:00
Guillaume Chatelet	d3cf49e984	[Alignment] Remove alignTo version taking a MaybeAlign	2022-06-20 15:15:53 +00:00
Arthur Eubanks	6dd17a2b34	[CallGraph] Don't preserve CallGraph when function CFG analyses are preserved The call graph has nothing to do with function CFGs. Fixes a crash in a future change that exposes this bug.	2022-06-19 13:01:08 -07:00
Sanjay Patel	4022551a15	[ValueTracking] recognize sub X, (X -nuw Y) as not overflowing This extends a similar pattern from D125500 and D127754. If we know that operand 1 (RHS) of a subtract is itself a non-overflowing subtract from operand 0 (LHS), then the final/outer subtract is also non-overflowing: https://alive2.llvm.org/ce/z/Bqan8v InstCombine uses this analysis to trigger a narrowing optimization, so that is what the first changed test shows. The last test models a motivating case from issue #48013. In that example, we determine 'nuw' on the first sub from the urem, then we determine that the 2nd sub can be narrowed, and that leads to eliminating both subtracts. There are still several missing subtract narrowing optimizations demonstrated in the tests above the diffs shown here - those should be handled in InstCombine with another set of patches.	2022-06-19 15:12:19 -04:00
Kazu Hirata	129b531c9c	[llvm] Use value_or instead of getValueOr (NFC)	2022-06-18 23:07:11 -07:00
Kazu Hirata	b254d67160	[llvm] Call *set::insert without checking membership first (NFC)	2022-06-18 08:32:54 -07:00
Florian Hahn	e9cced2739	Recommit "[LAA] Initial support for runtime checks with pointer selects." This reverts commit `7aa8a67882`. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.	2022-06-17 21:06:26 +02:00
Sterling Augustine	df6087ee37	Move debug-only code inside LLVM_DEBUG to prevent unused variable warnings.	2022-06-16 14:01:26 -07:00
Congzhe Cao	4c77d0276b	[Delinearization] Refactoring of fixed-size array delinearization This is a follow-up patch to D122857 where we added delinearization of fixed-size arrays to loop cache analysis, which resulted in some duplicate code, i.e., "tryDelinearizeFixedSize()", in LoopCacheCost.cpp and DependenceAnalysis.cpp. Refactoring is done in this patch. This patch refactors out the main logic of "tryDelinearizeFixedSize()" as "tryDelinearizeFixedSizeImpl()" and moves it to Delinearization.cpp, such that clients can reuse "llvm::tryDelinearizeFixedSizeImpl()" wherever they would like to delinearize fixed-size arrays. Currently it has two users, i.e., DependenceAnalysis.cpp and LoopCacheCost.cpp. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D124745	2022-06-16 16:03:41 -04:00
Congzhe Cao	a9dccb0072	[TargetTransformInfo] Added an opt/llc option for cache line size In some passes we need a valid cache line size to do analysis or transformation, e.g., loop cache analysis and loop data prefetch. However, for some backend targets, `TTIImpl->getCacheLineSize()` is not implemented and hence `TTI.getCacheLineSize()` would just return 0, which eventually might produce an invalid result. In this patch we add a user-specified opt/llc option for cache line size. If the option is specified by users we use the value supplied, otherwise we fall back to the default value obtained from `TTIImpl->getCacheLineSize()`. The powerpc target already has such an option; this patch generalizes this option to TargetTransformInfo.cpp. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D127342	2022-06-16 15:57:51 -04:00
Mircea Trofin	7f24e574d4	[MLInliner] Don't inline call sites in unreachable basic blocks This requires DominatorTree be updated, which we do in the ml inliner case, but not in the default case, and the cost of doing so is noticeable to compile time for the latter[1]. So the patch only affects the ML inliner. [1] https://llvm-compile-time-tracker.com/compare.php?from=9fc0aa45e3312944431ba7e1ca0cec99c613992b&to=7af461b1ce0d9138211ef5f883f35d5b9ddf47be&stat=wall-time Differential Revision: https://reviews.llvm.org/D127899	2022-06-16 09:14:22 -07:00
Jin Xin Ng	aaff3fb6d5	[mlgo] Fix accounting for SCC splits Previously if the inliner split an SCC such that an empty one remained, the MLInlineAdvisor could potentially lose track of the EdgeCount if a subsequent CGSCC pass modified the calls of a function that was initially in the SCC pre-split. Saving the seen nodes in onPassEntry resolves this. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D127693	2022-06-15 10:53:23 -07:00
Mircea Trofin	22a1f998f7	FunctionPropertiesAnalysis: handle callsite BBs that lose edges There could be successors that were reached before but now are only reachable from elsewhere in the CFG. Suppose the following diamond CFG (lines are arrows pointing down): A / \ B C \ / D There's a call site in C that is inlined. Upon doing that, it turns out it expands to: call void @llvm.trap() unreachable D isn't reachable from C anymore, but we did discount it when we set up FunctionPropertiesUpdater, so we need to re-include it here. The patch also updates loop accounting to use LoopInfo rather than traverse BBs. Differential Revision: https://reviews.llvm.org/D127353	2022-06-14 15:19:44 -07:00
Paul Robinson	10affe74ed	[PS5] Make library function availability match PS4	2022-06-14 12:47:06 -07:00
Sanjay Patel	8605b4d8c5	[ValueTracking] recognize sub X, (X -nsw Y) as not overflowing This extends a similar pattern from D125500. If we know that operand 1 (RHS) of a subtract is itself a non-overflowing subtract from operand 0 (LHS), then the final/outer subtract is also non-overflowing: https://alive2.llvm.org/ce/z/Bqan8v InstCombine uses this analysis to trigger a narrowing optimization, so that is what the first changed test shows. The last test models the motivating case from issue #48013. In that example, we determine 'nsw' on the first sub from the srem, then we determine that the 2nd sub can be narrowed, and that leads to eliminating both subtracts. This works for unsigned sub too, but I left that out to keep the patch minimal. If this looks ok, I will follow up with that change. There are also several missing subtract narrowing optimizations demonstrated in the tests above the diffs shown here - those should be handled in InstCombine with another set of patches. Differential Revision: https://reviews.llvm.org/D127754	2022-06-14 14:51:49 -04:00
Paul Robinson	c36eebb52e	[PS5] Use __gxx_personality_v0 for TSan	2022-06-14 10:39:34 -07:00
Jin Xin Ng	9f2b873a7d	[inliner] Add per-SCC-pass InlineAdvisor printing option Adds option to print the contents of the Inline Advisor after each SCC Inliner pass Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D127689	2022-06-14 08:06:52 -07:00
Kazu Hirata	5c41b0f429	[Analysis] Remove getUniqueInstruction (NFC) The last use was removed on Apr 7, 2022 in commit `5cefe7d9f5`.	2022-06-13 14:26:20 -07:00
Nikita Popov	7e64a29e58	[InstSimplify][IR] Handle trapping constant aggregate (PR49839) Unfortunately, it's not just constant expressions that can trap, we might also have a trapping constant expression nested inside a constant aggregate. Perform the check during phi folding on Constant rather than ConstantExpr, and extend the Constant::mayTrap() implementation to also recurse into ConstantAggregates, not just ConstantExprs. Fixes https://github.com/llvm/llvm-project/issues/49839.	2022-06-13 12:35:17 +02:00
Mircea Trofin	7e7021ca1a	[mlgo] Update FunctionPropertyCache after invalidating analyses The update depends on LoopInfo, so we need that refreshed first, not after. Differential Revision: https://reviews.llvm.org/D127467	2022-06-10 16:18:14 -07:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the intrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
Nikita Popov	d77f944832	[LoopInfo] Add getOutermostLoop() (NFC) This is a recurring pattern, add an API function for it.	2022-06-10 11:48:21 +02:00
Philip Reames	f85c5079b8	Pipe potentially invalid InstructionCost through CodeMetrics Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred. On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost. I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change. Differential Revision: https://reviews.llvm.org/D127131	2022-06-09 15:17:24 -07:00
Florian Hahn	20d798bd47	Recommit "[SCEV] Look through single value PHIs." (take 3) This reverts commit `1fbdbb5595`. All known issues surfaced by this patch should have been fixed now. The fixes included fixing issues with SCEV expansion in LV and DA's reliance on LCSSA phis.	2022-06-09 15:20:10 +01:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Mircea Trofin	b8c39eb275	Fix FunctionPropertiesAnalysis updating callsite in 1-BB loop If the callsite is in a single BB loop, we need to exclude the BB from the successor set (in which it'd be a member), because that set forms a boundary at which we stop traversing the CFG, when re-ingesting BBs after inlining; but after inlining, the callsite BB's new successors should be visited. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D127178	2022-06-08 14:32:00 -07:00
Jin Xin Ng	a3a7826d82	[mlgo] Disable accounting upon ForceStop Once ForceStop is set to true, we only return positive inlining advice when it is mandatory; There is no need for further node/edge accounting. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D127245	2022-06-08 14:26:06 -07:00
Bardia Mahjour	c9677f6db4	[DA] Handle mismatching loop levels by considering them non-linear To represent various loop levels within a nest, DA implements a special numbering scheme (see comment atop establishNestingLevels). The goal of this numbering scheme appears to be representing each unique loop distinctively by using as little memory as possible. This numbering scheme is simple when the source and destination of the dependence are in the same loop. In such cases the level is simply the depth of the loop in which src and dst reside. When the src and dst are not in the same loop, we could run into the following situation exposed by https://reviews.llvm.org/D71539. This patch fixes this by detecting such cases in checkSubscripts and treating them as non-linear/non-affine. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D110973	2022-06-08 11:15:37 -04:00
Chuanqi Xu	0e10f12844	[NFC] Remove commented cerr debugging loggings There are some unused, commented-out cerr debug logging statements in the code. There is no reason to keep such commented-out debug helpers in the product.	2022-06-08 15:58:06 +08:00
Philip Reames	8a0cd23326	Revert "[MemDep][NFCI] Remove redundant dyn_cast, replace with cast" This reverts commit `180d3f251d`. This commit is simply wrong. IsLoad is set within the same file based on modref state, not whether the instruction is a LoadInst. This went uncaught because cast<Ty>(X) has been broken. See https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context.	2022-06-07 13:21:31 -07:00
William Huang	ba26e45ca9	[ValueTracking] Add support to deduce a PHI node being a power of 2 if each incoming value is a power of 2. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D124889	2022-06-07 18:52:31 +00:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to `557efc9a8b`. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
Max Kazantsev	8555e59a71	[NFC][MemDep] Remove unnecessary Worklist.clear This execution path leads to return 'false' where the Worklist will be deallocated anyways. No need to clear it separately.	2022-06-03 12:31:44 +07:00
Florian Hahn	78c6b1488f	[CaptureTracking] Increase limit and use it for all visited uses. Currently the MaxUsesToExplore limit only applies to the number of users per value, not the total number of users to explore. The current limit of 20 pessimizes IR with opaque pointers in some cases. Without opaque pointers, we have deeper pointer def-use chains in general due to extra bitcasts and geps for structs with index 0. With opaque pointers the def-use chain is not as deep but wider, due to bitcasts & 0-geps missing. To improve the situation for opaque pointers, this patch does 2 things: 1. Apply the limit to the total number of uses visited. From the wording in the description of the option it seems like this may be the original intention. With the current implementation we could still end up walking a lot of uses. 2. Increase the limit to 100. This is quite arbitrary, but enables a good number of additional optimizations. Those adjustments have a noticeable compile-time impact though. In part that is likely due to additional transformations (and conversely the current baseline misses optimizations after switching to opaque pointers). This recovers some regressions that showed up after enabling opaque pointers. 
Limit=100: * NewPM-O3: +0.21% * NewPM-ReleaseThinLTO: +0.87% * NewPM-ReleaseLTO-g: +0.46% https://llvm-compile-time-tracker.com/compare.php?from=2e50ecb2ef4e1da1aeab05bcf66380068e680991&to=7e6fbe519d958d09f32f01d5d44a622f551e2031&stat=instructions Limit=60: * NewPM-O3: +0.14% * NewPM-ReleaseThinLTO: +0.41% * NewPM-ReleaseLTO-g: +0.21% https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=520563fdc146319aae90d06f88d87f2e9e1247b7&stat=instructions Limit=40: * NewPM-O3: +0.11% * NewPM-ReleaseThinLTO: +0.12% * NewPM-ReleaseLTO-g: +0.09% https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=c9182576e9fe3f1c84a71479665aef91a416318c&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126236	2022-06-02 21:43:58 +01:00
Mingming Liu	8601f269f1	[Inline][Remark][NFC] Optionally provide inline context to inline advisor. This patch has no functional change, and merely a preparation patch for main functional change. The motivating use case is to annotate inline remark pass name with context information (e.g. prelink or postlink, CGSCC or always-inliner), see D125495 for more details. Differential Revision: https://reviews.llvm.org/D126824	2022-06-02 13:14:30 -07:00
Alexander Kornienko	7aa8a67882	Revert "[LAA] Initial support for runtime checks with pointer selects." This reverts commit `5890b30105` as per discussion on the review thread: https://reviews.llvm.org/D114487#3547560.	2022-06-01 15:24:27 +02:00
Nikita Popov	03aceab08b	[ValueTracking] Enable -branch-on-poison-as-ub by default Now that SimpleLoopUnswitch and other transforms no longer introduce branch on poison, enable the -branch-on-poison-as-ub option by default. The practical impact of this is mostly better flag preservation in SCEV, and some freeze instructions no longer being necessary. Differential Revision: https://reviews.llvm.org/D125299	2022-06-01 10:46:06 +02:00
Mircea Trofin	f46dd19b48	[mlgo] Incrementally update FunctionPropertiesInfo during inlining Re-computing FunctionPropertiesInfo after each inlining may be very time consuming: in certain cases, e.g. large caller with lots of callsites, and when the overall IR doesn't increase (thus not tripping a size bloat threshold). This patch addresses this by incrementally updating FunctionPropertiesInfo. Differential Revision: https://reviews.llvm.org/D125841	2022-05-31 17:27:32 -07:00
Mehdi Amini	aff271930e	Fix warning for unused variable in the non-assert build (NFC)	2022-05-30 16:21:38 +00:00
Simon Moll	18c1ee04de	Re-land "[VP] vp intrinsics are not speculatable" with test fix Update the llvmir-intrinsics.mlir test to account for the modified attribute sets. This reverts commit `2e2a8a2d90`.	2022-05-30 14:41:15 +02:00
Mehdi Amini	2e2a8a2d90	Revert "[VP] vp intrinsics are not speculatable" This reverts commit `78a18d2b54`. Break MLIR bot: https://lab.llvm.org/buildbot/#/builders/61/builds/27127	2022-05-30 12:26:16 +00:00
Max Kazantsev	7e5a730473	[MemDep][NFC] Remove duplicating check in `if` and `else` branch Same check is done whether the condition is true or false. Just hoist it out of conditional.	2022-05-30 17:43:00 +07:00
Simon Moll	78a18d2b54	[VP] vp intrinsics are not speculatable VP intrinsics show UB if the %evl parameter is out of bounds - they must not carry the speculatable attribute. The out-of-bounds UB disappears when the %evl parameter is expanded into the mask or expansion replaces the entire VP intrinsic with non-VP code. This patch - Removes the speculatable attribute on all VP intrinsics. - Generalizes the isSafeToSpeculativelyExecute function to let VP expansion know whether the VP intrinsic replacement will be speculatable. VP expansion may only discard %evl where this is the case. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D125296	2022-05-30 12:20:05 +02:00
Max Kazantsev	180d3f251d	[MemDep][NFCI] Remove redundant dyn_cast, replace with cast When `IsLoad` is `true`, we don't need to check if the instruction is actually a load with dyn_cast. Saves some petty amount of CT.	2022-05-30 17:17:55 +07:00
eopXD	6a84579243	[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D126350	2022-05-27 15:22:23 -07:00
William Huang	35b0955aa5	[ValueTracking] Added support to deduce PHI Nodes values being a power of 2 Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D126018	2022-05-26 20:30:31 +00:00
Florian Hahn	6af5f5697c	[SCEV] Collect conditions from assumes same way as for branches. Also collect conditions from assume up-front in applyLoopGuards. This allows re-using the logic to handle logical ANDs as assume conditions. It should pave the road for a fix for #55645.	2022-05-26 18:17:13 +01:00
serge-sans-paille	fb67d683db	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `7030654296` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D126417	2022-05-26 08:12:34 +02:00
Nikita Popov	8a6698b523	[ValueTracking] Loads with !dereferenceable metadata cannot be undef/poison A load with !dereferenceable or !dereferenceable_or_null metadata must return a well-defined (non-undef/poison) value. Effectively they imply !noundef. This is the same as we do for the dereferenceable(N) attribute. This should fix https://github.com/llvm/llvm-project/issues/55672, or at least the specific case discussed there. Differential Revision: https://reviews.llvm.org/D126296	2022-05-25 09:54:04 +02:00
Sanjay Patel	e8c20d995b	[IR] add and use pattern match specialization for sqrt intrinsic; NFC This was included in D126190 originally, but it's independent and a useful change for readability.	2022-05-23 14:16:30 -04:00
Jingu Kang	bb82f74612	Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"" This reverts commit `42ebfa8269`. The commit from https://reviews.llvm.org/D125918 has fixed the stage 2 build failure. Differential Revision: https://reviews.llvm.org/D118979	2022-05-23 16:15:45 +01:00
Peter Waller	ade47bdc31	[LV] Improve register pressure estimate at high VFs Previously, `getRegUsageForType` was implemented using `getTypeLegalizationCost`. `getRegUsageForType` is used by the loop vectorizer to estimate the register pressure caused by using a vector type. However, `getTypeLegalizationCost` currently only appears to understand splitting and not scalarization, so significantly underestimates the register requirements. Instead, use `getNumRegisters`, which understands when scalarization can occur (via computeRegisterProperties). This was discovered while investigating D118979 (Set maximum VF with shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the loop vectorizer previously ends up costing a v128i1 as 2 v64i* registers where it actually occupies 128 i32 registers. I'm sending this patch early for comment, I'm still doing some sanity checking with LNT. I note that getRegisterClassForType appears to return VectorRC even though the type in question (large vNi1 types) ends up occupying scalar registers. That might be worth fixing too. Differential Revision: https://reviews.llvm.org/D125918	2022-05-23 07:57:45 +00:00
Nikita Popov	c8b675eaa1	[SCEV] Use umin_seq for BECount of multi-exit loops When computing the BECount for multi-exit loops, we need to combine individual exit counts using umin_seq rather than umin. This is because an earlier exit may exit on the first iteration, in which case later exit expressions will not be evaluated and could be poisonous. We cannot propagate potential poison values from later exits. In particular, this avoids the introduction of "branch on poison" UB when optimizing multi-exit loops. Differential Revision: https://reviews.llvm.org/D124910	2022-05-21 15:48:14 +02:00
Craig Topper	f2df53b750	[InstructionSimplify] Remove multiple 'break' after 'return'. NFC	2022-05-20 10:23:57 -07:00
Nico Weber	304a5a7a14	Revert "[ValueTracking] Added support to deduce PHI Nodes values being a power of 2" This reverts commit `d5c130f17e`. Breaks tests, see https://reviews.llvm.org/D125332#3525819	2022-05-19 15:05:30 -04:00
William Huang	d5c130f17e	[ValueTracking] Added support to deduce PHI Nodes values being a power of 2 Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D125332	2022-05-19 18:39:13 +00:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Philip Reames	f7988d08a8	Revert "[BasicAA] Remove unneeded special case for malloc/calloc" This reverts commit `9b1e00738c`. Nikic reported in commit thread that I had forgotten history here, and that a) we'd tried this before, and b) had to revert due to an unexpected codegen impact. Current measurements confirm the same issue still exists.	2022-05-18 07:35:27 -07:00
NAKAMURA Takumi	6ca7eb2c6d	[SCEV] Part 1, Serialize function calls in function arguments. Evaluation ordering in function call arguments is implementation-dependent. In fact, gcc evaluates bottom-top and clang does top-bottom. Fixes #55283 partially. Part of https://reviews.llvm.org/D125627	2022-05-18 23:20:08 +09:00
Philip Reames	9b1e00738c	[BasicAA] Remove unneeded special case for malloc/calloc This code pre-exists the generic handling for inaccessiblememonly. If we remove it and update one test with inaccessiblememonly, nothing else changes. Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.	2022-05-17 20:45:14 -07:00
Nikita Popov	b9b71c2b87	[LVI] Compute range for xor We do have a non-trivial implementation for binaryXor() now.	2022-05-17 10:18:38 +02:00
Yang Keao	7dce9eb6e5	[DomPrinter] Migrate -dot-dom to the new pass manager. In D123677, @YangKeao provided an implementation of `DOTGraphTraits{Viewer,Printer}` in the new pass manager. This commit migrates the `DomPrinter` and `DomViewer` to the new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D124904	2022-05-16 15:07:16 -05:00
Nikita Popov	356d47ccb9	[ValueTracking] Handle and/or on RHS of isImpliedCondition() isImpliedCondition() currently handles and/or on the LHS, but not on the RHS, resulting in asymmetric behavior. This patch adds two new implication rules: * LHS ==> (RHS1 || RHS2) if LHS ==> RHS1 or LHS ==> RHS2 * LHS ==> !(RHS1 && RHS2) if LHS ==> !RHS1 or LHS ==> !RHS2 Differential Revision: https://reviews.llvm.org/D125551	2022-05-16 16:30:26 +02:00
Florian Hahn	b7315ffc3c	[LAA,LV] Add initial support for pointer-diff memory checks. This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to the `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is no dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 4. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078	2022-05-16 15:27:22 +01:00
NAKAMURA Takumi	da7d8de1e4	ScalarEvolution.cpp: Reformat.	2022-05-15 20:51:27 +09:00
Sanjay Patel	ee6754c277	[ValueTracking] recognize sub X, (X % Y) as not overflowing I fixed some poison-safety violations on related patterns in InstCombine and noticed that we missed adding nsw/nuw on them, so this adds clauses to the underlying analysis for that. We need the undef input restriction to make this safe according to Alive2: https://alive2.llvm.org/ce/z/48g9K8 Differential Revision: https://reviews.llvm.org/D125500	2022-05-13 09:59:41 -04:00
Nikita Popov	ddfee07519	[InstSimplify] Fold and/or using implied conditions This adds two conjugated folds: * A | B -> B if A implies B (https://alive2.llvm.org/ce/z/R6GU4j) * A & B -> A if A implies B (https://alive2.llvm.org/ce/z/EGMqyy) If A and B are icmps themselves, we will usually fold this through other logic already (though the tests show a couple additional cases we previously missed). However, isImpliedCond() also supports A being of the form X & Y, which allows us to handle cases like (X & Y) | B where X implies B. This addresses the regression from D125398. Something that notably doesn't work yet is the (X | Y) & B case. This is due to an asymmetry in the isImpliedCondition() implementation that will have to be addressed separately. Differential Revision: https://reviews.llvm.org/D125530	2022-05-13 15:09:14 +02:00
Florian Hahn	5890b30105	[LAA] Initial support for runtime checks with pointer selects. Scaffolding support for generating runtime checks for multiple SCEV expressions per pointer. The initial version just adds support for looking through a single pointer select. The more sophisticated logic for analyzing forks is in D108699 Reviewed By: huntergr Differential Revision: https://reviews.llvm.org/D114487	2022-05-12 19:33:48 +01:00
Arthur Eubanks	7e0802aeb5	[BasicAA] Fix order in which we pass MemoryLocations to alias() D98718 caused the order of Values/MemoryLocations we pass to alias() to be significant due to storing the offset in the PartialAlias case. But some callers weren't audited and were still passing swapped arguments, causing the returned PartialAlias offset to be negative in some cases. For example, the newly added unittests would return -1 instead of 1. Fixes #55343, a miscompile. Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D125328	2022-05-10 12:05:38 -07:00
Nikita Popov	c077510bb1	[InstSimplify] Handle unknown function context in pointer icmp fold (PR54615) This issue reproduces in the context of LoopDeletion, because the bitcast does not get simplified away there. For a plain -inst-simplify run the bitcast would get folded away first. Fixes https://github.com/llvm/llvm-project/issues/54615.	2022-05-10 11:48:43 +02:00
Andrew Litteken	96345f773c	[IRSim] Remove early check from similarity matching such that commutative instructions are checked correctly when using the same value. When the first commutative instruction in a region using the same value in both positions was compared to a corresponding instruction with two different values, there was an early check that determined that since the values were new, it was true that these values acted in the same way structurally. If this was not contradicted later in the program, the regions were marked as similar. This removes that check, so that it is clear that the same value cannot be mapped to two different values. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D124775	2022-05-09 22:59:09 -05:00
Mircea Trofin	c35ad9ee4f	[mlgo] Support exposing more features than those supported by models This allows the compiler to support more features than those supported by a model. The only requirement (development mode only) is that the new features must be appended at the end of the list of features requested from the model. The support is transparent to compiler code: for unsupported features, we provide a valid buffer to copy their values; it's just that this buffer is disconnected from the model, so insofar as the model is concerned (AOT or development mode), these features don't exist. The buffers are allocated at setup - meaning, at steady state, there is no extra allocation (maintaining the current invariant). These buffers have 2 roles: first, keep the compiler code simple; second, allow logging their values in development mode. The latter allows retraining a model supporting the larger feature set starting from traces produced with the old model. For release mode (AOT-ed models), this decouples compiler evolution from model evolution, which we want in scenarios where the toolchain is frequently rebuilt and redeployed: we can first deploy the new features, and continue working with the older model, until a new model is made available, which can then be picked up the next time the compiler is built. Differential Revision: https://reviews.llvm.org/D124565	2022-05-09 18:01:21 -07:00
Michael Kruse	6b3b87376b	[polly] migrate -polly-show to the new pass manager Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123678	2022-05-09 14:04:29 -05:00
Michael Kruse	a6b399ad79	[PassManager] Implement DOTGraphTraitsViewer under NPM Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123677	2022-05-09 14:04:28 -05:00
Alexey Bataev	9dc4ced204	[SLP]Try partial store vectorization if supported by target. We can try to vectorize number of stores less than MinVecRegSize / scalar_value_size, if it is allowed by target. Gives an extra opportunity for the vectorization. Fixes PR54985. Differential Revision: https://reviews.llvm.org/D124284	2022-05-09 09:48:15 -07:00
Nikita Popov	68e1ba8188	[SCEV] Fold umin_seq using known predicate Fold %x umin_seq %y to %x if %x ule %y. This also subsumes the special handling for constant operands, as if %y is constant this folds to umin via implied poison reasoning, and if %x is constant then either %x is not zero and it folds to umin, or it is known zero, in which case it is ule anything.	2022-05-09 16:35:08 +02:00
Nikita Popov	18eaff1510	[ScalarEvolution] Fold %x umin_seq %y if %x cannot be zero Fold %x umin_seq %y to %x umin %y if %x cannot be zero. They only differ in semantics for %x==0. More generally %x _seq %y folds to %x %y if %x cannot be the saturation fold (though currently we only have umin_seq).	2022-05-09 15:11:05 +02:00
Serge Pavlov	eb28da89a6	[InstCombine] Remove side effect of replaced constrained intrinsics If a constrained intrinsic call was replaced by some value, it was not removed in some cases. The dangling instruction resulted in useless instructions being executed at runtime. This happened because constrained intrinsics usually have a side effect, which is used to model the interaction with the floating-point environment. In some cases the side effect is actually absent or can be ignored. This change adds specific treatment of constrained intrinsics so that their side effect can be removed if it is actually absent. Differential Revision: https://reviews.llvm.org/D118426	2022-05-07 19:04:11 +07:00
Nikita Popov	47c559d6c1	[SCEV] Fold umin_seq to umin using implied poison reasoning Similar to how we convert logical and/or to bitwise and/or, we should also convert umin_seq to umin based on implied poison reasoning. In %x umin_seq %y, if %y being poison implies %x being poison, then we don't need the sequential evaluation: Having %y contribute towards the result will never make the result more poisonous. An important corollary of this is that if %y is never poison, we also don't need the sequential evaluation. This avoids some of the regressions in D124910. Differential Revision: https://reviews.llvm.org/D124921	2022-05-05 09:43:49 +02:00
Yangguang Li	3a8266902b	[SCEV] Removed an unnecessary assertion The assertion is to check we always get backedge taken count (`BECount`) of zero when the exit condition is in select form (`isa<BinaryOperation>(ExitCond)`) and the exit limit for the first operand is zero (`EL0.ExactNotTaken->isZero()`). However the assertion is checking that the exit condition is NOT in select form. Removing the whole assertion since we now handle select form in ScalarEvolution::getSequentialMinMaxExpr. Reviewed By: reames, nikic Differential Revision: https://reviews.llvm.org/D122835	2022-05-03 17:26:27 -04:00
Augie Fackler	1deea714b3	BuildLibCalls: simplify switch statement slightly Per feedback on D123086 after submit. Also added a test for vec_malloc et al attribute inference to show it's doing the right thing. The new tests exposed a defect, corrected by adding vec_free to the list of free functions in MemoryBuiltins.cpp, which had been overlooked all the way back in D94710, over a year ago. Differential Revision: https://reviews.llvm.org/D124859	2022-05-03 13:17:33 -04:00
Nikita Popov	47255834e7	[ValueTracking] A and (B & ~A) have no common bits set This extends haveNoCommonBitsSet() to two additional cases, allowing the following folds: * `A + (B & ~A)` --> `A | (B & ~A)` (https://alive2.llvm.org/ce/z/crxxhN) * `A + ((A & B) ^ B)` --> `A | ((A & B) ^ B)` (https://alive2.llvm.org/ce/z/A_wsH_) These should further fold to just `A | B`, though this currently only works in the first case. The reason why the second fold is necessary is that we consider this to be the canonical form if B is a constant. (I did check whether we can change that, but it looks like a number of folds depend on the current canonicalization, so I ended up adding both patterns here.) Differential Revision: https://reviews.llvm.org/D124763	2022-05-03 11:33:27 +02:00
Igor Kirillov	4e5e042d9a	[LoopVectorize] Support reductions that store intermediary result Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235	2022-05-03 10:12:30 +01:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single one. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as an overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
Bardia Mahjour	363b3a645a	fix warning caused by `ef4ecc3cef`	2022-05-02 17:06:27 -04:00
Bardia Mahjour	ef4ecc3cef	[LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculating cost Reviewed By: congzhe, etiotto Differential Revision: https://reviews.llvm.org/D123400	2022-05-02 16:49:10 -04:00
Nikita Popov	597946a4dd	[ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)" to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by itself, correct, but does come with two issues: 1. It unnecessarily broadens provenance by introducing an inttoptr. We generally prefer not to introduce inttoptr during optimization. 2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0, which further folds to null. In that case provenance becomes incorrect. This has been observed as a real-world miscompile with rustc. We should probably address that incorrect inttoptr 0 fold at some point, but in either case we should also drop this inttoptr-introducing fold. Instead, replace it with a fold rooted at ptrtoint(getelementptr), which seems to cover the original motivation for this fold (test2 in the changed file). Differential Revision: https://reviews.llvm.org/D124677	2022-05-02 10:24:46 +02:00
Congzhe Cao	c428a3d2a0	[LoopCacheAnalysis] Enable delinearization of fixed sized arrays Currently loop cache cost (LCC) cannot analyze fixed-size arrays since it cannot delinearize them. This patch adds the capability to delinearize fixed-size arrays to LCC. Most of the code is ported from DependenceAnalysis.cpp and some refactoring will be done in a follow-up patch. Reviewed By: #loopoptwg, Meinersbur Differential Revision: https://reviews.llvm.org/D122857	2022-04-29 16:01:27 -04:00
Roman Lebedev	981ed72a17	[NFC][SCEV] Refactor `createNodeForSelectViaUMinSeq()` out of `createNodeForSelectOrPHIViaUMinSeq()`	2022-04-29 02:37:06 +03:00
Mircea Trofin	49942d595f	[NFC] remove const from FunctionPropertiesAnalysis::run, keep on Result The goal in `75881d8b02` was just modifying what `Result` is, didn't need to also modify ::run.	2022-04-28 15:10:21 -07:00
Mircea Trofin	75881d8b02	[NFC] const-ed the return type of FunctionPropertiesAnalysis The result is a data bag, this makes sure it's signaled to a user that the data can't be mutated when, for example, doing something like: auto &R = FAM.getResult<FunctionPropertiesAnalysis>(F) ... R.Uses++	2022-04-28 12:42:16 -07:00
Alexey Bataev	75e1cf4a6a	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning of the incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-28 10:04:41 -07:00
Alexey Bataev	9861ca0c23	Revert "[COST]Improve cost model for shuffles in SLP." This reverts commit `29a470e380` to fix a crash reported in https://reviews.llvm.org/D100486#3479989.	2022-04-28 08:11:56 -07:00
Chris Jackson	c792884589	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Reland `3f2b76ec90` with the test corrected to require x86-registered-target. Differential Revision: https://reviews.llvm.org/D120169	2022-04-28 14:21:56 +01:00
Chris Jackson	cd5f9efc4d	Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]" This reverts commit `3f2b76ec90`.	2022-04-28 14:07:31 +01:00
Chris Jackson	3f2b76ec90	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Reland commit `74273d575f` following a fix for a memory leak. The DVIRecoveryRecord vectors now use unique_ptr. Differential Revision: https://reviews.llvm.org/D120169	2022-04-28 13:55:49 +01:00
Kirill Stoimenov	761366e6ae	Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]" This reverts commit `74273d575f`. Buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/22795 Failing with memory leak.	2022-04-27 23:11:48 +00:00
Alexey Bataev	29a470e380	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning of the incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-27 10:56:26 -07:00
Chris Jackson	74273d575f	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] This relands commit `8f550368b1`. The test is amended with REQUIRES: x86-registered-target, in line with the other debuginfo-scev-salvage tests. Differential Revision: https://reviews.llvm.org/D120169	2022-04-27 13:10:30 +01:00
Chris Jackson	855752e563	Revert [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics[2/2] This reverts commit `8f550368b1`.	2022-04-27 13:06:03 +01:00
Chris Jackson	8f550368b1	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Second of two patches to extend SCEV-based salvaging to dbg.value intrinsics that have multiple location ops pre-LSR. This second patch adds the core implementation. Reviewers: @StephenTozer, @djtodoro Differential Revision: https://reviews.llvm.org/D120169	2022-04-27 12:47:35 +01:00
Vasileios Porpodas	fa8a9fea47	Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `6a9bbd9f20`. Code review: https://reviews.llvm.org/D124202	2022-04-26 14:02:40 -07:00
Vasileios Porpodas	6a9bbd9f20	Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `55ce296d6f`.	2022-04-26 11:25:26 -07:00
Vasileios Porpodas	55ce296d6f	[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost` Before this patch `Args` was used to pass a broadcast's arguments by SLP. This patch changes this. `Args` is now used for passing the operands of the shuffle. Differential Revision: https://reviews.llvm.org/D124202	2022-04-26 11:11:29 -07:00
Mircea Trofin	b1fa5ac3ba	[mlgo] Factor out TensorSpec This is a simple datatype with a few JSON utilities, and is independent of the underlying executor. The main motivation is to allow taking a dependency on it on the AOT side, and allow us build a correctly-sized buffer in the cases when the requested feature isn't supported by the model. This, in turn, allows us to grow the feature set supported by the compiler in a backward-compatible way; and also collect traces exposing the new features, but starting off the older model, and continue training from those new traces. Differential Revision: https://reviews.llvm.org/D124417	2022-04-25 18:35:46 -07:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Jun Ma	c0022b4bb1	[InlineCost] Set LastCallToStaticBonus in ML inlining models. This patch sets LastCallToStaticBonus based on a check; it has no noticeable size reduction on an internal workload and the Linux kernel with Os/Oz. Differential Revision: https://reviews.llvm.org/D124233	2022-04-24 09:34:19 +08:00
Florian Hahn	d43c083ab6	[SCEV] Use getConstant to construct SCEV for ConstantInt (NFC). We already know that we will construct a SCEVConstant. Directly use getConstant, rather than going through getSCEV.	2022-04-23 11:12:59 +01:00
Chang-Sun Lin Jr	7ee30a0e24	[NFC][LAA] Match-up type sizes for possible extensions, based on actual bit-size rather than rounded-up byte size. Differential Revision: https://reviews.llvm.org/D119200	2022-04-22 23:16:20 -07:00
Mircea Trofin	e4794ff5c6	[mlgo][nfc] Decouple TensorSpec from tensorflow. The motivation is twofold: 1) Allow plugging in a different training-time evaluator, e.g. TFLite-based, etc. 2) Allow using TensorSpec for AOT, too, to support evolution: we start by extracting a superset of the features currently supported by a model. For the tensors the model does not support, we just return a valid, but useless, buffer. This makes using a 'smaller' model (fewer supported tensors) transparent to the compiler. The key is to dimension the buffer appropriately, and we already have TensorSpec modeling that info. The only coupling was due to the reliance on a TF internal API for getting the element size, but for the types we are interested in, `sizeof` is sufficient. A subsequent change will yank out TensorSpec into its own module. Differential Revision: https://reviews.llvm.org/D124045	2022-04-21 15:37:01 -07:00
Vasileios Porpodas	889588ee97	[SLP] Refactoring isLegalBroadcastLoad() to use `ElementCount`. Replacing `unsigned` with `ElementCount` in the argument of `isLegalBroadcastLoad()`. This helps reduce the diff of a future SLP patch for AArch64.	2022-04-21 10:19:00 -07:00
Alexey Bataev	2cca53c815	[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer. We can process the long shuffles (working across several actual vector registers) in the best way if we take the actual register representation into account. We can build more correct representation of register shuffles, improve number of recognised buildvector sequences. Also, same function can be used to improve the cost model for the shuffles in future patches. Part of D100486 Differential Revision: https://reviews.llvm.org/D115653	2022-04-20 09:37:16 -07:00
Alexey Bataev	5f7ac15912	Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer." This reverts commit `2f49163b33` to fix a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284	2022-04-20 06:35:55 -07:00
Alexey Bataev	2f49163b33	[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer. We can process the long shuffles (working across several actual vector registers) in the best way if we take the actual register representation into account. We can build more correct representation of register shuffles, improve number of recognised buildvector sequences. Also, same function can be used to improve the cost model for the shuffles in future patches. Part of D100486 Differential Revision: https://reviews.llvm.org/D115653	2022-04-20 05:32:56 -07:00
Nikita Popov	f767a7d115	[DomTreeUpdater] Remove deprecated methods Remove the insertEdge(), insertEdgeRelaxed(), deleteEdge() and deleteEdgeRelaxed() methods, which were deprecated three years ago.	2022-04-20 12:14:29 +02:00
Andrew Litteken	3de29ad209	[IRSim] Ignore debug instructions when creating canonical numbering When constructing canonical relationships between two regions, the first instruction of a basic block from the first region is used to find the corresponding basic block from the second region. However, debug instructions are not included in similarity matching, and therefore do not have a canonical numbering. This patch makes sure to ignore the debug instructions when finding the first instruction in a basic block. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D123903	2022-04-19 13:18:28 -05:00
Arthur Eubanks	a7e20a8a7a	[CallPrinter] Port CallPrinter passes to new pass manager Port the legacy CallGraphViewer and CallGraphDOTPrinter to work with the new pass manager. Addresses issue https://github.com/llvm/llvm-project/issues/54323 Adds back related tests that were removed in commits `d53a4e7b4a` and `9e9d9aba14` Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D122989	2022-04-18 10:02:18 -07:00
Andrew Litteken	a919d3d888	[IROutliner] Ensure that incoming blocks of PHINodes are included in the unique numbering generation for phi nodes for each exit path Issue: https://github.com/llvm/llvm-project/issues/54431 PHINodes that need to be generated to accommodate a PHINode outside the region due to different output paths need to have their own numbering to determine the number of output schemes required to properly handle all the outlined regions. This numbering was previously only determined by the order and values of the incoming values, as well as the parent block of the PHINode. This adds the incoming blocks to the calculation of a hash value for these PHINodes as well, and the supporting infrastructure to give each block in a region a corresponding canonical numbering. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D122207	2022-04-14 12:13:17 -05:00
Kevin P. Neal	d43d9e1d5c	[FPEnv][InstSimplify] Fold fsub -0.0, -X ==> X Currently the fsub optimizations in InstSimplify don't know how to fold -0.0 - (-X) to X when the constrained intrinsics are used. This adds partial support. The rest of the support will come later with work on the IR matchers. This review is split out from D107285. Differential Revision: https://reviews.llvm.org/D123396	2022-04-14 11:48:54 -04:00
Congzhe Cao	557b131c88	[DA] Refactor with a better API Refactor from iteratively using BitCastInst::getOperand() to using stripPointerCasts() instead. This is an improvement since now we are able to analyze more cases, please refer to test cases added in this patch. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D123559	2022-04-13 14:51:48 -04:00
serge-sans-paille	262eba01b3	Revert "[ValueTracking] Make getStringLenth aware of strdup" This reverts commit `e810d55809`. The commit did not take into account the fact that a strduped string could be modified. Checking whether such a modification happens would make the function very costly; without a test case in mind it's not worth the effort.	2022-04-13 19:17:28 +02:00
Muhammad Omair Javaid	42ebfa8269	Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth" This reverts commit `64b6192e81`. This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage: https://lab.llvm.org/buildbot/#/builders/176/builds/1515 llvm-tblgen crashes after applying this patch.	2022-04-13 04:53:07 +05:00
Johannes Doerfert	9dc7da3f9c	[GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics This is a long standing problem that resurfaces once in a while [0]. There might actually be two problems because I'm not 100% sure if the issue underlying https://reviews.llvm.org/D115302 would be solved by this or not. Anyway. In 2008 we thought intrinsics do not read/write globals passed to them: `d4133ac315` This is not correct given that intrinsics can synchronize threads and cause effects to effectively become visible. NOTE: I did not yet modify any tests but only tried out the reproducer of https://github.com/llvm/llvm-project/issues/54851. Fixes: https://github.com/llvm/llvm-project/issues/54851 [0] https://discourse.llvm.org/t/bug-gvn-memdep-bug-in-the-presence-of-intrinsics/59402 Differential Revision: https://reviews.llvm.org/D123531	2022-04-12 16:42:50 -05:00
Nikita Popov	1d530b914e	[InstSimplify] Don't fold phi of poison and trapping const expr (PR49839) Folding this case would result in the constant expression being executed unconditionally, which may introduce a new trap. Fixes https://github.com/llvm/llvm-project/issues/49839.	2022-04-12 17:32:25 +02:00
serge-sans-paille	e810d55809	[ValueTracking] Make getStringLenth aware of strdup During strlen compile-time evaluation, make it possible to track size of strduped strings. Differential Revision: https://reviews.llvm.org/D123497	2022-04-12 14:47:29 +02:00
Nikita Popov	8d5c8d57c6	[InlineCost] Check that function types match Retain the behavior we get without opaque pointers: A call to a known function with different function type is considered an indirect call. This fixes the crash reported in https://reviews.llvm.org/D123300#3444772.	2022-04-12 11:05:33 +02:00
Arthur Eubanks	b22ffc7b98	[CaptureTracking] Ignore ephemeral values in EarliestEscapeInfo And thread DSE's ephemeral values to EarliestEscapeInfo. This allows more precise analysis in DSEState::isReadClobber() via BatchAA. Followup to D123162. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123342	2022-04-08 10:07:26 -07:00
Nikita Popov	930a68765d	[Loads] Check type size in bits during store to load forwarding Rather than checking the rounded type store size, check the type size in bits. We don't want to forward a store of i1 to a load of i8 for example, even though they have the same type store size. The padding bits have unspecified contents. This is a partial fix for the issue reported at https://reviews.llvm.org/D115924#inline-1179482, the problem also needs to be addressed more generally in the constant folding code.	2022-04-08 17:29:29 +02:00
Nikita Popov	4e85b427dd	[MemoryBuiltins] Remove unnecessary lambda capture (NFC)	2022-04-08 10:13:37 +02:00
serge-sans-paille	aa15ea47e2	[builtin_object_size] Basic support for posix_memalign It actually implements support for seeing through loads, using alias analysis to refine the result. This is rather limited, but I didn't want to rely on more than the available analysis at that point (to be gentle with compilation time), and it does seem to catch common scenarios, as showcased by the included tests. Differential Revision: https://reviews.llvm.org/D122431	2022-04-08 09:31:11 +02:00
Evgeniy Brevnov	da41214d65	Add support for atomic memory copy lowering Currently, the utility supports lowering of non-atomic memory transfer routines only. This patch adds support for the atomic version of memcopy. This may be useful for targets not supporting atomic memcopy. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D118443	2022-04-08 10:41:31 +07:00
Arthur Eubanks	17fdaccccf	[CaptureTracking] Ignore ephemeral values when determining pointer escapeness Ephemeral values cannot cause a pointer to escape. No change in compile time: https://llvm-compile-time-tracker.com/compare.php?from=4371710085ba1c376a094948b806ddd3b88319de&to=c5ddbcc4866f38026737762ee8d7b9b00395d4f4&stat=instructions This partially fixes some regressions caused by more calls to `__builtin_assume` (D122397). Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D123162	2022-04-07 10:11:14 -07:00
Augie Fackler	d6a7da5ae3	MemoryBuiltins: only claim an allocator family on builtin functions This lines up with other parts of the codebase that only use special knowledge about allocator functions if they're builtins. Differential Revision: https://reviews.llvm.org/D123053	2022-04-07 12:38:45 -04:00
Augie Fackler	5f09498a11	MemoryBuiltins: also check function definition for allocalign This got changed to use hasAttrSomewhere() during review, and I didn't notice until today when I was writing some tests for another part of this system that using hasAttrSomewhere only checked the callsite for allocalign, rather than both the callsite and the definition. This fixes that by introducing a helper method. Differential Revision: https://reviews.llvm.org/D121641	2022-04-07 12:38:44 -04:00
Alina Sbirlea	50d41f3e0d	[MSSA] Print memory phis when inspecting walker. This makes the MemorySSA and MemorySSA Walker printers consistent. Invocation `-print<memoryssa-walker>` should also have the MemoryPhis.	2022-04-06 16:06:14 -07:00
Augie Fackler	33b1f41914	MemoryBuiltins: getAllocAlignment is now useful for non-allocator funcs This has been true since `dba73135c8`, but didn't matter until now because clang wasn't emitting allocalign attributes. Differential Revision: https://reviews.llvm.org/D121640	2022-04-06 09:51:38 -04:00
Martin Storsjö	46776f7556	Fix warnings about variables that are set but only used in debug mode Add void casts to mark the variables used, next to the places where they are used in assert or `LLVM_DEBUG()` expressions. Differential Revision: https://reviews.llvm.org/D123117	2022-04-06 10:01:46 +03:00
Tom Honermann	c54ad13602	[Lint][Verifier] NFC: Rename 'Assert' macros to 'Check'. The LLVM IR verifier and analysis linter define and use several macros in code that performs validation of IR expectations. Previously, these macros were named with an 'Assert' prefix. These names were misleading since the macro definitions are not conditioned on build kind; they are defined identically in builds that have asserts enabled and those that do not. This was confusing since an LLVM developer might expect these macros to be conditionally enabled as 'assert' is. Further confusion was possible since the LLVM IR verifier is implicitly disabled (in Clang::ConstructJob()) for builds without asserts enabled, but only for Clang driver invocations; not for clang -cc1 invocations. This could make it appear that the macros were not active for builds without asserts enabled, e.g. when investigating behavior using the Clang driver, and thus lead to surprises when running tests that exercise the clang -cc1 interface. This change renames this set of macros as follows: Assert -> Check AssertDI -> CheckDI AssertTBAA -> CheckTBAA	2022-04-05 15:34:35 -04:00
Nikita Popov	516333d632	[ValueTracking] Handle non-pow2 align assume bundle (PR53693) https://reviews.llvm.org/D119414 clarified that this is legal IR, so handle it gracefully. (We could aggressively use the fact that the pointer must be a null pointer in that case, but I'm not bothering with that.) Fixes https://github.com/llvm/llvm-project/issues/53693.	2022-04-05 16:48:40 +02:00
Jingu Kang	64b6192e81	[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth Set the maximum VF of AArch64 with 128 / the size of smallest type in loop. Differential Revision: https://reviews.llvm.org/D118979	2022-04-05 13:16:52 +01:00
Hirochika Matsumoto	447a4485c5	[InstSimplify] Fold (ctpop(X) == N) || (X != 0) into X != 0 where N > 0 (ctpop(X) == N) || (X != 0) --> (X != 0) https://alive2.llvm.org/ce/z/udgUVV (ctpop(X) != N) && (X == 0) --> (X == 0) https://alive2.llvm.org/ce/z/9dq-cR Differential Revision: https://reviews.llvm.org/D122757	2022-04-04 23:23:34 +09:00
Augie Fackler	e90bce8f91	CallBase: fix getFnAttr so it also checks the function Prior to this change, CallBase::hasFnAttr checked the called function to see if it had an attribute if it wasn't set on the CallBase, but getFnAttr didn't do the same delegation, which led to very confusing behavior. This patch fixes the issue by making CallBase::getFnAttr also check the function under the same circumstances. Test changes look (to me) like they're cleaning up redundant attributes which no longer get specified both on the callee and call. We also clean up the one ad-hoc implementation of this getter over in InlineCost.cpp. Differential Revision: https://reviews.llvm.org/D122821	2022-04-03 23:19:23 -04:00
Artur Pilipenko	4fbde1ef40	Fix MemorySSAUpdater::insertDef for dead code Fix for https://github.com/llvm/llvm-project/issues/51257. Differential Revision: https://reviews.llvm.org/D122601	2022-03-31 16:32:35 -07:00
serge-sans-paille	01be9be2f2	Cleanup includes: final pass Cleanup a few extra files, this closes the work on libLLVM dependencies on my side. Impact on libLLVM preprocessed output: -35876 lines Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D122576	2022-03-29 09:00:21 +02:00
Vasileios Porpodas	39aa202aff	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `e6ead19b77`.	2022-03-23 18:32:17 -07:00
Fangrui Song	dcad676958	[CGSCC] Use make_early_inc_range. NFC	2022-03-23 15:31:09 -07:00
Arthur Eubanks	e6ead19b77	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash." This reverts commit `27bd8f9492`. Causes crashes, see comments in D121973	2022-03-23 10:57:45 -07:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `f7d7d2a08d`.	2022-03-22 16:41:55 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit `79613185d3`. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit `9136145eb0` broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Philip Reames	b880bde92b	Add missing dependencies to mayHaveNonDefUseDependency Two interesting omissions: * When reordering in either direction, reordering two calls which both contain inf-loops is illegal. This one is possibly a change in behavior for certain callers (e.g. fixes a latent bug.) * When moving down, control dependence must be respected by checking the inverse of isSafeToSpeculativeExecute. Current callers all seem to handle this case - though admittedly, I did not do an exhaustive audit. Most seem to be only interested in moving upwards within a block. This is mostly a case of future proofing an API so that it implements what the comment says, not just what current callers need. Noticed via inspection. I don't have a test case.	2022-03-21 10:15:36 -07:00
Philip Reames	ee7324b898	Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc]	2022-03-21 10:01:40 -07:00
serge-sans-paille	39b02d49cc	[instcombine] Support and test __builtin_object_size interaction with __strdup and __strndup Differential Revision: https://reviews.llvm.org/D122005	2022-03-21 11:30:51 +01:00
serge-sans-paille	d8e0a6d5e9	[LowerConstantIntrinsics] Support phi operand in __builtin_object_size folder The implementation is just a generalization of the Select handler. We're not trying to be smart and compute any kind of fixed point. Differential Revision: https://reviews.llvm.org/D121897	2022-03-21 11:30:50 +01:00
Kazu Hirata	9aa52ba574	[Analysis] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)	2022-03-20 18:21:40 -07:00
Arthur Eubanks	ddc702376a	[NewPM] Don't skip SCCs not in current RefSCC With D107249 I saw huge compile time regressions on a module (150s -> 5700s). This turned out to be due to a huge RefSCC in the module. As we ran the function simplification pipeline on functions in the SCCs in the RefSCC, some of those SCCs would be split out to their RefSCC, a child of the current RefSCC. We'd skip the remaining SCCs in the huge RefSCC because the current RefSCC is now the RefSCC just split out, then revisit the original huge RefSCC from the beginning. This happened many times because many functions in the RefSCC were optimizable to the point of becoming their own RefSCC. This patch makes it so we don't skip SCCs not in the current RefSCC so that we split out all the child RefSCCs on the first iteration of RefSCC. When we split out a RefSCC, we invalidate the original RefSCC and add the remainder of the SCCs into a new RefSCC in RCWorklist. This happens repeatedly until we finish visiting all SCCs, at which point there is only one valid RefSCC in RCWorklist from the original RefSCC containing all the SCCs that were not split out, and we visit that. For example, in the newly added test cgscc-refscc-mutation-order.ll, we'd previously run instcombine in this order: f1, f2, f1, f3, f1, f4, f1 Now it's: f1, f2, f3, f4, f1 This can cause more passes to be run in some specific cases, e.g. if f1<->f2 gets optimized to f1<-f2, we'd previously run f1, f2; now we run f1, f2, f2. This improves kimwitu++ compile times by a lot (12-15% for various -O3 configs): https://llvm-compile-time-tracker.com/compare.php?from=2371c5a0e06d22b48da0427cebaf53a5e5c54635&to=00908f1d67400cab1ad7bcd7cacc7558d1672e97&stat=instructions Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D121953	2022-03-18 14:16:29 -07:00
Florian Hahn	1b7ef6aac8	[BasicAA] Account for wrapping when using abs(VarIndex) >= abs(Scale). The patch adds an extra check to only set MinAbsVarIndex if abs(V * Scale) won't wrap. In the absence of IsNSW, try to use the bitwidths of the original V and Scale to rule out wrapping. Attempt to model https://alive2.llvm.org/ce/z/HE8ZKj The code in the else if below probably needs the same treatment, but I need to come up with a test first. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D121695	2022-03-18 14:41:15 +00:00
Kevin P. Neal	bd050a34fe	[FPEnv][InstSimplify] Teach CannotBeNegativeZero() about constrained intrinsics. Currently some optimizations are disabled because llvm::CannotBeNegativeZero() does not know how to deal with the constrained intrinsics. This patch fixes that by extending the existing implementation. Differential Revision: https://reviews.llvm.org/D121483	2022-03-18 10:24:48 -04:00
Nikita Popov	6ffb3ad631	[SCEV] Use constant ranges when determining reachable blocks (PR54434) This avoids false positive verification failures if the condition is not literally true/false, but SCEV still makes use of the fact that a loop is not reachable through more complex reasoning. Fixes https://github.com/llvm/llvm-project/issues/54434.	2022-03-18 12:04:35 +01:00
Nikita Popov	f96428e16d	[MemorySSA] Don't optimize uses during construction This changes MemorySSA to be constructed in unoptimized form. MemorySSA::ensureOptimizedUses() can be called to optimize all uses (once). This should be done by passes where having optimized uses is beneficial, either because we're going to query all uses anyway, or because we're doing def-use walks. This should help reduce the compile-time impact of MemorySSA for some use cases (the reason why I started looking into this is D117926), which can avoid optimizing all uses upfront, and instead only optimize those that are actually queried. Actually, we have an existing use-case for this, which is EarlyCSE. Disabling eager use optimization there gives a significant compile-time improvement, because EarlyCSE will generally only query clobbers for a subset of all uses (this change is not included in this patch). Differential Revision: https://reviews.llvm.org/D121381	2022-03-18 09:56:16 +01:00
Vasileios Porpodas	9136145eb0	Revert "[SLP] Fix lookahead operand reordering for splat loads." due to build failures This reverts commit `5efa78985b`.	2022-03-17 18:22:04 -07:00
Vasileios Porpodas	5efa78985b	[SLP] Fix lookahead operand reordering for splat loads. Splat loads are inexpensive in X86. For a 2-lane vector we need just one instruction: `movddup (%reg), xmm0`. Using the standard Splat score leads to worse code. This patch adds a new score dedicated for splat loads. Please note that a splat is usually three IR instructions: - It is usually a load and 2 inserts: %ld = load double, double* %gep %ins1 = insertelement <2 x double> poison, double %ld, i32 0 %ins2 = insertelement <2 x double> %ins1, double %ld, i32 1 - But it can also be a load, an insert and a shuffle: %ld = load double, double* %gep %ins = insertelement <2 x double> poison, double %ld, i32 0 %shf = shufflevector <2 x double> %ins, <2 x double> poison, <2 x i32> zeroinitializer Because of this some of the lit tests contain more IR instructions. Differential Revision: https://reviews.llvm.org/D121354	2022-03-17 18:05:54 -07:00
Jay Foad	a3a4591856	[LegacyPassManager] Move structural hashing into Pass classes. NFC. Move structural hashing into virtual methods on Pass. This will allow MachineFunctionPass to override the method to add hashing of the MachineFunction. Differential Revision: https://reviews.llvm.org/D120123	2022-03-17 09:51:12 +00:00
Nikita Popov	f3cbe60aa9	[AAEval] Remove unused function (NFC)	2022-03-16 10:25:45 +01:00
Nikita Popov	57d57b1afd	[AAEval] Make compatible with opaque pointers With opaque pointers, we cannot use the pointer element type to determine the LocationSize for the AA query. Instead, -aa-eval tests are now required to have an explicit load or store for any pointer they want to compute alias results for, and the load/store types are used to determine the location size. This may affect ordering of results, and sorting within one result, as the type is not considered part of the sorted string anymore. To somewhat minimize the churn, printing still uses faux typed pointer notation.	2022-03-16 10:02:11 +01:00
Dmitry Makogon	361034ba78	[NFC] Add LazyValueInfo::clear method This method just calls LazyValueInfoImpl::clear	2022-03-15 17:52:50 +07:00
Arthur Eubanks	4fc7c55fff	[NewPM] Actually recompute GlobalsAA before module optimization pipeline RequireAnalysis<GlobalsAA> doesn't actually recompute GlobalsAA. GlobalsAA isn't invalidated (unless specifically invalidated) because it's self-updating via ValueHandles, but can be imprecise during the self-updates. Rather than invalidating GlobalsAA, which would invalidate AAManager and any analyses that use AAManager, create a new pass that recomputes GlobalsAA. Fixes #53131. Differential Revision: https://reviews.llvm.org/D121167	2022-03-14 09:42:34 -07:00
Arthur Eubanks	55cf09ae26	[ValueTracking] Simplify llvm::isPointerOffset() We still need the code after stripAndAccumulateConstantOffsets() since it doesn't handle GEPs of scalable types and non-constant but identical indexes. Differential Revision: https://reviews.llvm.org/D120523	2022-03-14 09:32:36 -07:00
Nikita Popov	04b717c423	[TLI] Check that malloc argument has type size_t DSE assumes that this is the case when forming a calloc from a malloc + memset pair. For tests, either update the malloc signature or change the data layout.	2022-03-14 17:22:24 +01:00
Andrew Litteken	0c4bbd293e	[IRSim] Make sure the first instruction of a block doesn't get missed if it is the first valid instruction in Module. If an instruction is first legal instruction in the module, and is the only legal instruction in its basic block, it will be ignored by the outliner due to a length check inherited from the older version of the outliner that was restricted to outlining within a single basic block. This removes that check, and updates any tests that broke because of it. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D120786	2022-03-13 23:13:09 -05:00
Andrew Litteken	1643f01232	[IRSim][IROutliner] Ignoring Musttail Function Musttail calls require extra handling to properly propagate the calling convention information and tail call information. The outliner does not currently do this, so we ignore call instructions that utilize the swifttailcc and tailcc calling convention as well as functions marked with the attribute musttail. Reviewers: paquette, aschwaighofer Differential Revision: https://reviews.llvm.org/D120733	2022-03-13 19:27:25 -05:00
Andrew Litteken	66f90fdff1	Revert "[IRSim][IROutliner] Ignoring Musttail Function" This reverts commit `c7037c7257`. Pushed too soon	2022-03-13 19:26:51 -05:00
Andrew Litteken	c7037c7257	[IRSim][IROutliner] Ignoring Musttail Function	2022-03-13 18:57:24 -05:00
serge-sans-paille	3d219d805c	Add missing include under EXPENSIVE_CHECKS	2022-03-12 18:54:29 +01:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Johannes Doerfert	d6e09ce86f	[CaptureTracking][NFCI] Expose capture tracking logic The logic exposed by this patch via `llvm::DetermineUseCaptureKind` was part of `llvm::PointerMayBeCaptured`. In the Attributor we want to keep track of the work list items but still reuse the logic if a use might capture a value. A follow up for the Attributor removes ~100 lines of code and complexity while making future handling of simplified values possible. Differential Revision: https://reviews.llvm.org/D121272	2022-03-11 22:56:16 -06:00
Anna Thomas	a4aa97d578	[InlineCost] Add cl::opt for target attributes compatibility check. NFC This patch adds a CL option for avoiding the attribute compatibility check between caller and callee in TTI. TTI attribute compatibility checks for target CPU and target features. In our downstream compiler, this attribute always remains the same between callee and caller. By avoiding the addition of this attribute to each of our inline candidate (and then checking them here during inline cost), we save some compile time. The option is kept false, so this change is an NFC upstream.	2022-03-11 18:05:16 -05:00
Nikita Popov	806450805d	[ConstFold] Don't fold calls with mismatching function type With opaque pointers, this is no longer ensured through pointer type identity.	2022-03-11 14:09:23 +01:00
Nikita Popov	02c2106002	[InstSimplify] Handle vector GEP when simplifying zero indices If the base is a scalar and the index is a vector, we can't simplify, as this is effectively a splat operation.	2022-03-11 10:56:44 +01:00
Sanjay Patel	b48fe158e0	[Analysis] remove bogus smin/smax pattern detection This is a revert of `cfcc42bdc`. The analysis is wrong as shown by the minimal tests for instcombine: https://alive2.llvm.org/ce/z/y9Dp8A There may be a way to salvage some of the other tests, but that can be done as follow-ups. This avoids a miscompile and fixes #54311.	2022-03-09 17:50:34 -05:00
Florian Hahn	f98125abb2	Revert "[PassManager] Add pretty stack entries before P->run() call." This reverts commit `128745cc26`. This increased compile-time unnecessarily. Revert this change and follow ups `2c7afadb47` & `add0c5856d`. http://llvm-compile-time-tracker.com/compare.php?from=338dfcd60f843082bb589b287d890dbd9394eb82&to=128745cc2681c284bc6d0150a319673a6d6e8424&stat=instructions	2022-03-09 18:46:32 +00:00
Florian Hahn	128745cc26	[PassManager] Add pretty stack entries before P->run() call. This patch adds PrettyStackEntries before running passes. The entries include the pass name and the IR unit the pass runs on. The information is used to print additional information when a pass crashes, including the name and a reference to the IR unit on which it crashed. This is similar to the behavior of the legacy pass manager. The improved stack trace now includes: Stack dump: 0. Program arguments: bin/opt -loop-vectorize -force-vector-width=4 crash.ll 1. Running pass 'ModuleToFunctionPassAdaptor' on module 'crash.ll' 2. Running pass 'LoopVectorizePass' on function '@a' Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D120993	2022-03-09 13:01:09 +00:00
Nikita Popov	ba8ee4a43e	[SCEV] Verify all IR -> SCEV mappings This extends SCEV verification to check not only backedge-taken counts, but all entries in the IR -> SCEV cache. The restrictions are the same as for the BECount case, i.e. we ignore expressions based on undef, we only diagnose constant deltas (there are way too many false positives otherwise) and we limit to reachable code. Differential Revision: https://reviews.llvm.org/D121104	2022-03-09 09:33:22 +01:00
Arthur Eubanks	53e5e58670	[NewPM][Inliner] Make inlined calls to functions in same SCC as callee exponentially expensive Introduce a new attribute "function-inline-cost-multiplier" which multiplies the inline cost of a call site (or all calls to a callee) by the multiplier. When processing the list of calls created by inlining, check each call to see if the new call's callee is in the same SCC as the original callee. If so, set the "function-inline-cost-multiplier" attribute of the new call site to double the original call site's attribute value. This does not happen when the original call site is intra-SCC. This is an alternative to D120584, which marks the call sites as noinline. Hopefully fixes PR45253. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D121084	2022-03-07 23:51:09 -08:00
Florian Hahn	a2979c8399	[IVDescriptors] Bail out instead of asserting that order is expected. When dealing with multiple phis that depend on each other, the order might have been changed and may not match the expectation. If that happens, bail out, rather than asserting. Fixes https://github.com/llvm/llvm-project/issues/54218 Fixes https://github.com/llvm/llvm-project/issues/54233 Fixes https://github.com/llvm/llvm-project/issues/54254	2022-03-07 19:57:26 +00:00
Nikita Popov	81b43b23e4	[SCEV] Enable verification under EXPENSIVE_CHECKS SCEV verification should no longer affect results of subsequent queries, and our lit tests as well as llvm-test-suite pass with SCEV verification enabled, so I think we can enable it by default under EXPENSIVE_CHECKS now. Differential Revision: https://reviews.llvm.org/D120708	2022-03-07 09:53:00 +01:00
Nikita Popov	d1e880acaa	[SCEV] Enable verification in LoopPM Currently, we hardly ever actually run SCEV verification, even in tests with -verify-scev. This is because the NewPM LPM does not verify SCEV. The reason for this is that SCEV verification can actually change the result of subsequent SCEV queries, which means that you see different transformations depending on whether verification is enabled or not. To allow verification in the LPM, this limits verification to BECounts that have actually been cached. It will not calculate new BECounts. BackedgeTakenInfo::getExact() is still not entirely readonly, it still calls getUMinFromMismatchedTypes(). But I hope that this is not problematic in the same way. (This could be avoided by performing the umin in the other SCEV instance, but this would require duplicating some of the code.) Differential Revision: https://reviews.llvm.org/D120551	2022-03-07 09:46:20 +01:00
Nikita Popov	8133778d3c	[SCEV] Fully invalidate SCEVUnknown on RAUW When a SCEVUnknown gets RAUWd, we currently drop it from the folding set, but don't forget memoized values. I believe we should be treating RAUW the same way as deletion here and invalidate all caches and dependent expressions. I don't have any specific cases where this causes issues right now, but it does address the FIXME in https://reviews.llvm.org/D119488. Differential Revision: https://reviews.llvm.org/D120033	2022-03-07 09:28:28 +01:00
Florian Hahn	de8ac485e5	[IVDescriptor] Remove SinkCandidate from SinkAfter before re-sinking. This ensures the right order in the sink-after map is maintained. If we re-sink an instruction, it must be sunk after all earlier instructions have been sunk. Fixes https://github.com/llvm/llvm-project/issues/54223	2022-03-05 19:48:26 +00:00
Arthur Eubanks	f909aed671	Revert "[SCEV] Infer ranges for SCC consisting of cycled Phis" This reverts commit `fc539b0004`. Causes miscompiles, see D110620.	2022-03-04 19:52:44 -08:00
Augie Fackler	dba73135c8	getAllocAlignment: respect allocalign attribute if present As with allocsize(), we prefer the table data to attributes. Differential Revision: https://reviews.llvm.org/D118263	2022-03-04 15:57:54 -05:00
Augie Fackler	5e4c75db3b	InstructionCombining: avoid eliding mismatched alloc/free pairs Prior to this change LLVM would happily elide a call to any allocation function and a call to any free function operating on the same unused pointer. This can cause problems in some obscure cases, for example if the body of operator::new can be inlined but the body of operator::delete can't, as in this example from jyknight: #include <stdlib.h> #include <stdio.h> int allocs = 0; void *operator new(size_t n) { allocs++; void *mem = malloc(n); if (!mem) abort(); return mem; } __attribute__((noinline)) void operator delete(void *mem) noexcept { allocs--; free(mem); } void deleteit(int*i) { delete i; } int main() { int*i = new int; deleteit(i); if (allocs != 0) printf("MEMORY LEAK! allocs: %d\n", allocs); } This patch addresses the issue by introducing the concept of an allocator function family and uses it to make sure that alloc/free function pairs are only removed if they're in the same family. Differential Revision: https://reviews.llvm.org/D117356	2022-03-04 10:41:10 -05:00
Florian Hahn	5a60260efe	[IVDescriptor] Use DT to check order of Previous, OtherPrev. Previous and OtherPrev may not be in the same block. Use DT::dominates instead of local comesBefore. DT::dominates is already used earlier to check the order of Previous and SinkCandidate. Fixes https://github.com/llvm/llvm-project/issues/54195	2022-03-04 11:07:42 +00:00
Jez Ng	dd29597e10	[LTO] Initialize canAutoHide() using canBeOmittedFromSymbolTable() Per discussion on https://reviews.llvm.org/D59709#inline-1148734, this seems like the right course of action. `canBeOmittedFromSymbolTable()` subsumes and generalizes the previous logic. In addition to handling `linkonce_odr` `unnamed_addr` globals, we now also internalize `linkonce_odr` + `local_unnamed_addr` constants. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D120173	2022-03-03 19:04:11 -05:00
Arthur Eubanks	41e792d725	[CostModel] Change printer pass wording to work with update_analyze_test_checks.py update_analyze_test_checks.py looks for very specific wording, update the printer pass to match the legacy `-analyze -cost-model` wording.	2022-03-03 10:10:48 -08:00
Craig Topper	608161225e	[InstCombine][Analysis] Move getFCmpCode and getPredForFCmpCode to CmpInstAnalysis. NFC The similar getICmpCode and getPredForICmpCode are already there. This moves FP for consistency. I think InstCombine is currently the only user of both. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120754	2022-03-03 09:33:24 -08:00
Florian Hahn	139215af8e	[IVDescriptor] Find original 'Previous' for first-order recurrences. This patch extends first-order recurrence handling to support cases where we already sunk an instruction for a different recurrence, but LastPrev comes before Previous. To handle those cases correctly, we need to find the earliest entry for the sink-after chain, because this references the Previous from the original recurrence. This is needed to ensure we use the correct instruction as sink point. Depends on D118558. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D118642	2022-03-03 16:41:26 +00:00
serge-sans-paille	81a1760cac	Revert "Add missing include under EXPENSIVE_CHECK" This reverts commit `eeaca53df7`. It's a duplicate of https://reviews.llvm.org/rG50874a188b94a25827963956887b878d3701509a	2022-03-03 07:56:34 +01:00
serge-sans-paille	a494ae43be	Cleanup includes: TransformsUtils Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120741	2022-03-01 21:00:07 +01:00
serge-sans-paille	eeaca53df7	Add missing include under EXPENSIVE_CHECK This is a followup to 344f8ec3048b6eeef94569800acb012f794ad372 It should fix https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/21961/console	2022-03-01 21:00:06 +01:00
Fangrui Song	50874a188b	Fix -DLLVM_ENABLE_EXPENSIVE_CHECKS=on build after D120659	2022-03-01 11:36:25 -08:00
Mircea Trofin	261419273a	Fix build breaks on ml-* bots introduced by include cleanups	2022-03-01 11:29:18 -08:00
Craig Topper	7bc6667845	[Analysis] Simplify the interface to llvm::getICmpCode. NFC Instead of passing an InstCmpInt * and a bool just pass the predicate from the caller. I'm considering moving the similar FCmp functions from InstCombine over here and this makes the interface consistent with what is used for FCmp. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120609	2022-03-01 09:53:27 -08:00
serge-sans-paille	71c3a5519d	Cleanup includes: LLVMAnalysis Number of lines output by preprocessor: before: 1065940348 after: 1065307662 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120659	2022-03-01 18:01:54 +01:00
Nikita Popov	aeab6167b0	[SCEV] Only verify BECounts for reachable loops (PR50523) For unreachable loops, any BECount is legal, and since D98706 SCEV can make use of this for loops that are unreachable due to constant branches. To avoid false positives, adjust SCEV verification to only check BECounts in reachable loops. Fixes https://github.com/llvm/llvm-project/issues/50523. Differential Revision: https://reviews.llvm.org/D120651	2022-03-01 11:52:35 +01:00
Nikita Popov	3c53d3a733	[InlineCost] Use SmallPtrSet for DeadBlocks (NFC) This set is only used with contains operations, so there is no need to use a SetVector.	2022-02-28 15:26:22 +01:00
Serge Pavlov	6982c38cb1	[ConstantFolding] Fix folding of constrained compare intrinsics The change fixes treatment of constrained compare intrinsics if compared values are of vector type. Differential revision: https://reviews.llvm.org/D110322	2022-02-27 10:19:19 +07:00
Nikita Popov	2d0fc3e46f	[SCEV] Return ArrayRef from getSCEVValues() (NFC) Return a read-only view on this set. For the one internal use, directly access ExprValueMap.	2022-02-25 09:32:22 +01:00
Nikita Popov	d9715a7266	[SCEV] Don't try to reuse expressions with offset SCEVs ExprValueMap currently tracks not only which IR Values correspond to a given SCEV expression, but additionally stores that it may be expanded in the form X+Offset. In theory, this allows reusing existing IR Values in more cases. In practice, this doesn't seem to be particularly useful (the test changes are rather underwhelming) and adds a good bit of complexity. Per https://github.com/llvm/llvm-project/issues/53905, we have an invalidation issue with these offseted expressions. Differential Revision: https://reviews.llvm.org/D120311	2022-02-25 09:16:48 +01:00
Mircea Trofin	7e3606f43c	[ScalarEvolution] Control flag for nonstrict inequalities in finite loops D118090 causes a pretty significant (19%) regression in some Eigen benchmarks. Investigating is a bit time consuming as the compilation unit where this occurs is large. Rather than revert, this patch adds a flag controlling that behavior (enabled by default).	2022-02-23 17:56:35 -08:00
Malhar Jajoo	9f1c6fbf11	[LAA] Add remarks for unbounded array access Adds new optimization remarks when loop vectorization fails due to the compiler being unable to find bound of an array access inside a loop Differential Revision: https://reviews.llvm.org/D115873	2022-02-23 15:57:39 +00:00
Sanjay Patel	fc3b34c508	[InstSimplify] remove shift that is redundant with part of funnel shift In D111530, I suggested that we add some relatively basic pattern-matching folds for shifts and funnel shifts and avoid a more specialized solution if possible. We can start by implementing at least one of these in IR because it's easier to write the code and verify with Alive2: https://alive2.llvm.org/ce/z/qHpmNn This will need to be adapted/extended for SDAG to handle the motivating bug ( #49541 ) because the patterns only appear later with that example (added some tests: `bb850d422b`) This can be extended within InstSimplify to handle cases where we 'and' with a shift too (in that case, kill the funnel shift). We could also handle patterns where the shift and funnel shift directions are inverted, but I think it's better to canonicalize that instead to avoid pattern-match case explosion. Differential Revision: https://reviews.llvm.org/D120253	2022-02-23 09:10:01 -05:00
Thomas Preud'homme	40f9081958	[LAA] Add missing newline in debug print	2022-02-23 13:25:16 +00:00
Nikita Popov	6777ec9e4d	[ValueTracking] Support signed intrinsic clamp This is the same special logic we apply for SPF signed clamps when computing the number of sign bits, just for intrinsics. This just uses the same logic as the select case, but there's multiple directions this could be improved in: We could also use the num sign bits from the clamped value, we could do this during constant range calculation, and there's probably unsigned analogues for the constant range case at least.	2022-02-23 12:45:16 +01:00
Bill Wendling	a5bbc6ef99	[NFC] Remove unnecessary "#include"s from header files	2022-02-23 01:20:48 -08:00
Kerry McLaughlin	12fb133eba	[LoopVectorize] Support conditional in-loop vector reductions Extends getReductionOpChain to look through Phis which may be part of the reduction chain. adjustRecipesForReductions will now also create a CondOp for VPReductionRecipe if the block is predicated and not only if foldTailByMasking is true. Changes were required in tryToBlend to ensure that we don't attempt to convert the reduction Phi into a select by returning a VPBlendRecipe. The VPReductionRecipe will create a select between the Phi and the reduction. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D117580	2022-02-22 12:04:35 +00:00
Max Kazantsev	ad3b1fe472	[SCEV] Do not erase LoopUsers. PR53969 This patch fixes a logical error in how we work with `LoopUsers` map. It maps a loop onto a set of AddRecs that depend on it. The Addrecs are added to this map only once when they are created and put to the `UniqueSCEVs` map. The only purpose of this map is to make sure that, whenever we forget a loop, all (directly or indirectly) dependent SCEVs get forgotten too. Current code erases SCEVs from dependent set of a given loop whenever we forget this loop. This is not a correct behavior due to the following scenario: 1. We have a loop `L` and an AddRec `AR` that depends on it; 2. We modify something in the loop, but don't destroy it. We still call forgetLoop on it; 3. `AR` is no longer dependent on `L` according to `LoopUsers`. It is erased from `ValueExprMap` and `ExprValue` map, but still exists in UniqueSCEVs; 4. We can later request the very same AddRec for the very same loop again, and get existing SCEV `AR`. 5. Now, `AR` exists and is used again, but its notion that it depends on `L` is lost; 6. Then we decide to delete `L`. `AR` will not be forgotten because we have lost it; 7. Just you wait when you run into a dangling pointer problem, or any other kind of problem because an active SCEV is now referencing a non-existent loop. The solution to this is to stop erasing values from `LoopUsers`. Yes, we will maybe forget something that is already not used, but it's cheap. This fixes a functional bug and potentially may have negative compile time impact on methods with huge or numerous loops. Differential Revision: https://reviews.llvm.org/D120303 Reviewed By: nikic	2022-02-22 17:24:39 +07:00
David Sherwood	dc0657277f	Fix warning introduced by `47eff645d8`	2022-02-22 09:37:16 +00:00
David Sherwood	47eff645d8	[InstCombine] Bail out of load-store forwarding for scalable vector types This patch fixes an invalid TypeSize->uint64_t implicit conversion in FoldReinterpretLoadFromConst. If the size of the constant is scalable we bail out of the optimisation for now. Tests added here: Transforms/InstCombine/load-store-forward.ll Differential Revision: https://reviews.llvm.org/D120240	2022-02-22 09:26:04 +00:00
Max Kazantsev	40d06c4ce9	[SCEV][NFC] Replace contains+insert check with insert.second	2022-02-21 20:11:13 +07:00
Philip Reames	34a9642af8	Revert "[instsimplify] Simplify HaveNonOverlappingStorage per review suggestion on D120133 [NFC]" This reverts commit `3a6be124cc`. This appears to have caused a stage2 build failure: https://lab.llvm.org/buildbot/#/builders/168/builds/4813 Will investigate further on Monday and recommit.	2022-02-18 15:36:15 -08:00
Whitney Tsang	e7afbea8ca	[MemorySSA] Clear VisitedBlocks per query The problem can be shown from the newly added test case. There are two invocations to MemorySSAUpdater::moveToPlace, and the internal data structure VisitedBlocks is changed in the first invocation, and reused in the second invocation. In between the two invocations, there is a change to the CFG, and MemorySSAUpdater is notified about the change. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D119898	2022-02-18 15:36:19 -05:00
Philip Reames	3a6be124cc	[instsimplify] Simplify HaveNonOverlappingStorage per review suggestion on D120133 [NFC]	2022-02-18 11:33:15 -08:00
Philip Reames	ff2e4c04c4	[instsimplify] Assume storage for byval args doesn't overlap allocas, globals, or other byval args This allows us to discharge many pointer comparisons based on byval arguments. Differential Revision: https://reviews.llvm.org/D120133	2022-02-18 11:08:01 -08:00
Philip Reames	bf296ea6bb	[instsimplify] Clarify assumptions about disjoint memory regions [NFC]	2022-02-18 08:51:18 -08:00
Philip Reames	5ecf218eca	[instsimplify] Add a comment hinting how compares involving two globals are handled [NFC]	2022-02-18 08:41:30 -08:00
Philip Reames	f6510e6d6f	[instsimplify] Factor out a helper for alloca bounds checking [NFC] At the moment, this just groups comments with a reasonably named predicate, but I plan to add other cases to this in the near future.	2022-02-18 07:40:22 -08:00

11852 Commits