This patch moves getInlineCostWrapper to an anonymous namespace.
While I am at it, I'm moving the function closer to the beginning of
the file so that I can use it elsewhere in the file without a forward
declaration.
InlineOrder::front is a remnant from the era when we had nested
"while" loops in the module inliner, with the inner one grouping the
call sites with the same caller.
Now that we have a simple "while" loop draining the priority queue, we
can just use InlineOrder::pop.
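A minimal sketch of the resulting drain loop, assuming the InlineOrder
interface at the time of this patch (element type and variable names
are illustrative):
```
while (!Calls->empty()) {
  auto [CB, InlineHistoryID] = Calls->pop();
  // Evaluate the inline cost of *CB and inline if profitable; no
  // inner loop grouping call sites by caller is needed anymore.
}
```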
Differential Revision: https://reviews.llvm.org/D134121
We check to see if a given CallBase is a sole call to a local function
at multiple places in InlineCost.cpp. This patch factors out the
common code.
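The factored-out predicate looks roughly like this (a sketch based on
the description above; the in-tree helper may differ in details):
```
static bool isSoleCallToLocalFunction(const CallBase &CB,
                                      const Function &Callee) {
  // The call site is the sole call to the callee if the callee has
  // local linkage and exactly one live use, namely this call.
  return Callee.hasLocalLinkage() && Callee.hasOneLiveUse() &&
         &Callee == CB.getCalledFunction();
}
```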
Differential Revision: https://reviews.llvm.org/D134114
UseInlinePriority specifies the priority function. This patch
simplifies the code by moving UseInlinePriority closer to the actual
consumer -- the switch statement inside getInlineOrder.
Differential Revision: https://reviews.llvm.org/D134100
DefaultInlineOrder was largely an exercise in generalizing the
traversal order of call sites within the inliner.
Now that the module inliner is starting to form its shape, there is no
point in sharing DefaultInlineOrder between the module inliner and the
CGSCC inliner. DefaultInlineOrder and all the other inline orders are
mutually exclusive in the following sense:
- The use of DefaultInlineOrder doesn't make sense in the module
inliner because there is no priority inherent in the order in which
call sites are added to the list of call sites -- SmallVector.
- The use of any other inline order doesn't make sense in the CGSCC
inliner because little prioritization can be done within one CGSCC.
This patch essentially reverts the addition of DefaultInlineOrder so
that the loop structure of Inliner.cpp looks like the state just
before we started working on the module inliner (circa June 2021).
At the same time, we remove the choice of DefaultInlineOrder from
UseInlinePriority.
Differential Revision: https://reviews.llvm.org/D134080
These classes are referred to only from getInlineOrder in
InlineOrder.cpp. This patch hides the entire class declarations and
definitions in InlineOrder.cpp.
Differential Revision: https://reviews.llvm.org/D134056
Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.
To give two examples of why this is useful:
* D117095 highlights a weakness of ModRef modelling in the presence
of operand bundles. For a memcpy call with deopt operand bundle,
we want to say that it can read any memory, but only write argument
memory. This would allow such calls to be treated like any other call.
However, we currently can't express this and have to say that it
can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
where threadid Refs outside pre-split coroutines can be ignored
(like other accesses to constant memory). The current representation
does not allow modelling this precisely.
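As a rough sketch of the per-location representation described above
(names are illustrative, not the exact in-tree API):
```
// One ModRefInfo per location instead of a single ModRefInfo for the
// whole function.
ModRefInfo ArgMR   = MRB.getModRef(FunctionModRefBehavior::ArgMem);
ModRefInfo OtherMR = MRB.getModRef(FunctionModRefBehavior::Other);
// memcpy with a deopt bundle could then be modelled as
// ArgMem -> ModRef, InaccessibleMem/Other -> Ref: it may read any
// memory but writes only argument memory.
```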
The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.
Differential Revision: https://reviews.llvm.org/D130896
This call is expensive, so don't perform it for zero indices.
Also rename the variable to use Alloc rather than Alloca; this
doesn't have anything to do with allocas in particular.
Also remove the new-pass-manager version of ExpandLargeDivRem because there is no way
yet to access TargetLowering in the new pass manager.
Differential Revision: https://reviews.llvm.org/D133691
The previous implementation of time tracing in the new pass manager is direct but messy.
The key code looks like the demo below:
```
/// Runs the function pass across every function in the module.
PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                      LazyCallGraph &CG, CGSCCUpdateResult &UR) {
  /// ...
  PreservedAnalyses PassPA;
  {
    TimeTraceScope TimeScope(Pass.name());
    PassPA = Pass.run(F, FAM);
  }
  /// ...
}
```
It is tedious to judge by hand where we should add the tracing code.
With the PassInstrumentation framework, we can easily add `Before/After`
callbacks that emit the time tracing code.
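For illustration, the callback-based version could look roughly like
this, assuming access to a `PassInstrumentationCallbacks &PIC` (the
callback and TimeProfiler APIs exist as shown; the exact registration
in this patch may differ):
```
PIC.registerBeforeNonSkippedPassCallback([](StringRef PassID, Any IR) {
  if (timeTraceProfilerEnabled())
    timeTraceProfilerBegin(PassID, /*Detail=*/"");
});
PIC.registerAfterPassCallback(
    [](StringRef PassID, Any IR, const PreservedAnalyses &) {
      if (timeTraceProfilerEnabled())
        timeTraceProfilerEnd();
    });
```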
Differential Revision: https://reviews.llvm.org/D131960
In the Tensorflow C lib utilities, an error gets thrown if some features
haven't been passed into the model (due to differences in ordering,
which no longer exist with the transition to TFLite). However, this is
not currently the case when using TFLiteUtils. This patch makes some
minor changes to throw an error when not all inputs of the model have
been passed; if unhandled, this results in a segfault within TFLite.
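The added guard amounts to something like the following sketch
(`BoundInputs` is a hypothetical name for the features actually
supplied; the real check lives in TFLiteUtils):
```
// Fail loudly instead of segfaulting inside TFLite later.
if (BoundInputs.size() != Interpreter->inputs().size())
  return llvm::createStringError(llvm::inconvertibleErrorCode(),
                                 "not all model inputs were provided");
```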
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D133451
This adds the ExpandLargeDivRem pass to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion).
The X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.
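A hypothetical illustration of the hook's contract (the hook name
below is made up for illustration and may not match the one added
here):
```
// Return the widest div/rem the target handles natively; anything
// wider is expanded by ExpandLargeDivRem.
unsigned AArch64TTIImpl::maxSupportedDivRemBitWidth() const {
  return 128;
}
```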
Differential Revision: https://reviews.llvm.org/D130076
The if-statement should check whether TFLITE is on or not, rather than whether the variable is specified.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D132902
CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if it's better to move an expensive instruction used in a select behind a branch instead.
This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86, as we need to use TCK_SizeAndLatency as a uop count (so it's compatible with various target-specific buffer sizes; see D132288), but we can have instructions with a low TCK_SizeAndLatency value that should still be treated as 'expensive' (FDIV, for example). By adding an isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.
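The wrapper's default behavior is essentially the existing check (a
sketch; the in-tree signature may differ):
```
bool isExpensiveToSpeculativelyExecute(const Instruction *I) const {
  // Default: preserve the current threshold; targets (e.g. X86) can
  // override this to flag ops like FDIV that are cheap in uop count
  // but expensive to actually execute.
  return getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >=
         TargetTransformInfo::TCC_Expensive;
}
```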
The current code is basically just emulating what the analysis manager does.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D132581
We currently instrument CallBrInst but do not annotate it with
the branch weight. This patch enables PGO annotation of CallBrInst.
Differential Revision: https://reviews.llvm.org/D133040
This is a fix for PR57025 and an alternative to D131776. The problem
in the phi-translation-to-wrong-context.ll test case is that phi
translation of %gep.j into if2 picks %gep.i as the result. While this
instruction has the correct pointer address, it occurs in a context
where %i != 0. As such, we get a NoAlias result for the store in
if2, even though they do alias for %i == 0 (which is legal in the
original context of the pointer).
PHITranslateValue already has a MustDominate option, which can be
used to restrict PHI translation results to values that dominate the
translated-into block. However, this is more aggressive than what we
need and would significantly regress GVN results. In particular, if
we have a pointer value that does not require any translation, then
it is fine to continue using that value in the predecessor, because
the context is still correct for the original query. We only run into
problems if PHITranslateSubExpr() picks a completely random
instruction in a context that may have preconditions that do not hold.
Fix this by always performing the dominance checks in
PHITranslateSubExpr(), without enabling the more general MustDominate
requirement.
Fixes https://github.com/llvm/llvm-project/issues/57025. This also
fixes the test case for https://github.com/llvm/llvm-project/issues/30999,
but I'm not sure whether that's just the particular test case,
or a general solution to the problem.
Differential Revision: https://reviews.llvm.org/D132935
Also, some local variables were renamed in accordance with the code
style, and occurrences of `std::tie` and `.first`/`.second` member
accesses were replaced with structured bindings.
Differential Revision: https://reviews.llvm.org/D132806
This patch replaces calls to GreatestCommonDivisor64 with std::gcd
where both arguments are known to be of unsigned types no larger than
64 bits in size.
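For example (`std::gcd` lives in `<numeric>` and is constexpr since
C++17):
```
#include <cstdint>
#include <numeric>

// Both operands are unsigned and at most 64 bits wide, so std::gcd is
// a drop-in replacement for GreatestCommonDivisor64.
static_assert(std::gcd(std::uint64_t(48), std::uint64_t(18)) == 6);
```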
Otherwise when we visit all libcalls in
updateCGAndAnalysisManagerForPass(), the old libcall is dead and doesn't
have a node.
We treat libcalls conservatively in LazyCallGraph because any function
may introduce calls to them out of thin air.
It is weird to change the signature of a libcall, since introducing calls
to the libcall with a different signature may break things, but other
passes like deadargelim already do it, so let's preserve this behavior for now.
Fixes an issue found in D128830.
Reviewed By: psamolysov
Differential Revision: https://reviews.llvm.org/D132764
The simpler diff-checks require pointers with add-recs from the same
innermost loop, but this property wasn't checked completely. Add the
missing check to ensure both addrecs are in the innermost loop.
Fixes #57315.
When we have a dependency with a dependence distance which can only be hit on an iteration beyond the actual trip count of the loop, we can ignore that dependency when analyzing said loop. We already had this code, but had restricted it solely to unknown dependence distances. This change applies it to all dependence distances.
Without this code, we relied on the vectorizer reducing VF such that our infeasible dependence was respected. This usually worked out to about the same result, but not always. For fixed length vectorization, this could mean a smaller VF than optimal being chosen or additional runtime checks. For scalable vectorization - where the bounds on access implied by VF are broader - we could often not find a feasible VF at all.
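For illustration, a made-up loop where the dependence distance exceeds
the trip count:
```
void f(int *a) {
  // The store is always 100 elements ahead of the load, but the loop
  // runs only 64 iterations, so a[i] and a[i + 100] never overlap
  // here and the dependency can be ignored for any VF.
  for (int i = 0; i < 64; i++)
    a[i + 100] = a[i] + 1;
}
```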
Differential Revision: https://reviews.llvm.org/D131924
This is a long-standing FIXME with a non-FMF test that exposes
the bug as shown in issue #57357.
It's possible that there's still a way to miscompile by
mis-identifying/mis-folding FP min/max patterns, but
this patch only exposes a couple of seemingly minor
regressions while preventing the broken transform.
TFLite is a lightweight, statically linkable[1] model evaluator, supporting a
subset of what the full tensorflow library does, sufficient for the
types of scenarios we envision having. It is also faster.
We still use saved models as "source of truth" - 'release' mode's AOT
starts from a saved model; and the ML training side operates in terms of
saved models.
Using TFLite solves the following problems compared to using the full TF
C API:
- a compiler-friendly implementation for runtime-loadable (as opposed
to AOT-embedded) models: it's statically linked; it can be built via
cmake;
- solves an issue we had when building the compiler with both AOT and
full TF C API support, whereby, due to a packaging issue on the TF
side, we needed to have the pip package and the TF C API library at
the same version. We have no such constraints now.
The main liability is that it supports a subset of what the full TF
framework does. We do not expect that to cause an issue, but should that
be the case, we can always revert to using the full framework
(after also figuring out a way to address the problems that motivated
the move to TFLite).
Details:
This change switches the development mode to TFLite. Models are still
expected to be placed in a directory - i.e. the parameters to clang
don't change; what changes is the directory content: we still need
an `output_spec.json` file; but instead of the saved_model protobuf and
the `variables` directory, we now just have one file, `model.tflite`.
The change includes a utility showing how to take a saved model and
convert it to TFLite, which it uses for testing.
The full TF implementation can still be built (not side-by-side). We
intend to remove it shortly, after patching downstream dependencies. The
build behavior, however, prioritizes TFLite - i.e. trying to enable both
full TF C API and TFLite will just pick TFLite.
[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
This change completes the process of replacing OperandValueKind and OperandValueProperties, which were previously passed independently in this API, with a single container class which contains both.
This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through.
I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional, to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.
This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through the TTI implementation) allows use of the same properties in store costing as in arithmetic costing.
This completes the client side transition to the OperandValueInfo version of this routine. Backend TTI implementations still use the prior versions for now.
OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling. We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so.
This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works. Target TTI implementations still use the split flags. I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.
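The merged container looks roughly like this (helper names are
approximate; see TargetTransformInfo.h for the authoritative
definition):
```
struct OperandValueInfo {
  OperandValueKind Kind = OK_AnyValue;
  OperandValueProperties Properties = OP_None;

  bool isConstant() const {
    return Kind == OK_UniformConstantValue ||
           Kind == OK_NonUniformConstantValue;
  }
  bool isPowerOf2() const { return Properties == OP_PowerOf2; }
};
```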
The initial implementation had requirements on positive/negative range
crossings that were too weak. Not crossing zero with nuw is not enough,
for two reasons:
- If ArLHS has negative step, it may turn from positive to negative
without crossing 0 boundary from left to right (and crossing right to
left doesn't count for unsigned);
- If ArLHS crosses SINT_MAX boundary, it still turns from positive to
negative;
In fact we require that ArLHS always stays non-negative or negative,
which can be enforced by the following set of preconditions:
- both nuw and nsw;
- positive step (this restriction looks liftable);
Because of the positive step, boundary crossing is only possible from
the left part to the right part. And because of the no-wrap flags, it
is guaranteed
to never happen.
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or were just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.
Differential Revision: https://reviews.llvm.org/D132287
Follow-up to 7f1262a322. That patch avoided removing the
call, but it still allowed the constant-folded result. This
makes the behavior consistent with 1-arg libm folding: if the
call potentially raises an exception, then we just bail out.
It seems likely that there are other corner-cases like this,
but the tests are incomplete, so we have lived with these
discrepancies for a long time. This was untested before the
constant folding was expanded in D127964.
These may raise an error (set errno) as discussed in the post-commit
comments for D127964, so we can't fold away the call and potentially
alter that behavior.
A debug message in `LoopAccessAnalysis` did not have a newline in it, causing printed debug messages to be formatted incorrectly.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D132172
If the incoming previous value of a fixed-order recurrence is a phi in
the header, go through incoming values from the latch until we find a
non-phi value. Use this as the new Previous; all uses in the header
will be dominated by the original phi, but need to be moved after
the non-phi previous value.
At the moment, fixed-order recurrences are modeled as a chain of
first-order recurrences.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D119661
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.
Original Patch by @samparker (Sam Parker)
Differential Revision: https://reviews.llvm.org/D79483
Currently, clang ignores the 0 initialisation in finite math mode.
For example:
```
double f_prod = 0;
double arr[1000];
for (size_t i = 0; i < 1000; i++) {
  f_prod *= arr[i];
}
```
Clang will ignore that `f_prod` is set to zero and will generate assembly that iterates over the loop.
Reviewed By: fhahn, spatel
Differential Revision: https://reviews.llvm.org/D131672
Handle cases where a forked pointer has an add or sub instruction
before reaching a select.
Reviewed By: fhahn
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D130278
If computeKnownBits encounters a phi node, and we fail to determine any known bits through direct analysis, see if the incoming value is part of a branch condition feeding the phi.
Handle cases where icmp(IncomingValue PRED Constant) is driving a branch instruction feeding that phi node - at the moment this only handles EQ/ULT/ULE predicate cases as they are the most straightforward to handle and most likely for branch-loop 'max upper bound' cases - we can extend this if/when necessary.
I investigated a more general icmp(LHS PRED RHS) KnownBits system, but the hard limits we put on value tracking depth through phi nodes meant that we were mainly catching constants anyhow.
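A small made-up example of the ULT case:
```
unsigned clamp(unsigned X) {
  unsigned R;
  if (X < 32)
    R = X;   // On this edge the branch condition implies X u< 32.
  else
    R = 31;
  return R;  // phi(X, 31): computeKnownBits can now prove R u< 32,
             // i.e. bits 5..31 are known zero.
}
```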
Fixes the pointless vectorization in PR38280 / Issue #37628 (excessive unrolling still needs handling though)
Differential Revision: https://reviews.llvm.org/D131838
This reverts commit 354fa0b480.
Returning as is. The patch was reverted due to a miscompile, but
this patch is not causing it. This patch made it possible to infer
some nuw flags in code guarded by a `false` condition, and then something
else managed to propagate the flag from dead code outside.
Returning the patch to be able to reproduce the issue.
We already support SGE, so the same logic should hold for SLE with
the LHS and RHS swapped.
I didn't see this in the wild. Just happened to walk past this code
and thought it was odd that it was asymmetric in what condition
codes it handled.
Reviewed By: spatel, reames
Differential Revision: https://reviews.llvm.org/D131805
This reverts commit 34ae308c73.
Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.
The value of the attribute is a size in bytes. It has the effect of
suppressing inlining of functions whose stack sizes exceed the given value.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D129904
Contextual knowledge may be used to prove invariance of some conditions.
For example, in this case:
```
; %len >= 0
guard(%iv = {start,+,1}<nuw> <s %len)
guard(%iv = {start,+,1}<nuw> <u %len)
```
the 2nd check always fails if `start` is negative and always passes otherwise.
It looks like there are more opportunities of this kind that are still to be
implemented in the future.
Differential Revision: https://reviews.llvm.org/D129753
Reviewed By: apilipenko
My most recent change for D131607 had a formatting error that I didn't
notice until after I committed it. Let me fix it now so changes to this
file will be back-to-back from me.
Another ticket split out of D107285, this extends the optimization
of 0.0 - -X to just X when using constrained intrinsics and the
optimization is allowed.
If the negation of X is done with fsub then the match fails because of
the lack of IR Matcher support for constrained intrinsics.
While I'm here, remove some TODO notices since the work is no longer
planned.
Differential Revision: https://reviews.llvm.org/D131607
From the opengroup specifications, atan2 may fail if the result
underflows and atan may fail if the argument is subnormal, but
we assume that does not happen and eliminate the calls if we
can constant fold the result at compile-time.
Differential Revision: https://reviews.llvm.org/D127964
After D121595 was committed, I noticed regressions associated with
vectorising loops with small trip counts by tail folding with scalable
vectors. As a solution to those issues, I propose to introduce a
minimal trip count threshold value.
Differential Revision: https://reviews.llvm.org/D130755
While moving from the TF C API to TFLite, we found that the argmax op in TFLite does not work for int64 inputs, so we cast the int64 inputs to int32 to make the TFLite argmax op work.
Differential Revision: https://reviews.llvm.org/D131462
We get a couple of improvements from recognizing swapped
operand patterns that were not handled by the replicated
code.
This should also enable simplifying larger patterns as
seen in issue #56653 and issue #56654, but that requires
enhancements to isImpliedCondition() itself.
Given a poison constant as input, the dyn_cast to a ConstantInt would
fail so we would fall through to the generic code that attempts to fold
each element of the input vectors. The inputs to these intrinsics are
not vectors though, leading to a compile time crash. Instead bail out
properly for poison values by returning nullptr. This doesn't try to
define what poison means for these intrinsics.
Fixes #56945
A const reference is preferred over a non-null const pointer.
`Type *` is kept as is to match the other overload.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D131197
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.
Differential Revision: https://reviews.llvm.org/D131114