This patch removes the bail-out for signed predicates and non-positive
strides in howManyLessThans and updates computeMaxBECountForLT to return
SCEVCouldNotCompute for signed predicates with negative strides.
AFAICT the bail-out was only added because computeMaxBECountForLT may
not handle negative signed strides correctly. Instead of bailing out
earlier and never calling computeMaxBECountForLT, we can return
SCEVCouldNotCompute from computeMaxBECountForLT itself.
The max backedge taken count will be computed as the max value of the
symbolic backedge taken count.
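A rough sketch of the guard (simplified; the helper name and the exact
predicate are assumptions, not the upstream code):

#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// Sketch: computeMaxBECountForLT gives up on signed predicates whose
// stride may be negative; the caller then falls back to the max value
// of the symbolic backedge taken count.
static const SCEV *maxBECountForLTSketch(ScalarEvolution &SE,
                                         const SCEV *Stride, bool IsSigned) {
  if (IsSigned && !SE.isKnownPositive(Stride))
    return SE.getCouldNotCompute();
  return Stride; // placeholder for the actual max BE count computation
}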
This improves precision in cases where we can compute symbolic backedge
taken counts and also fixes a crash.
Fixes #57818.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D135667
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variable names FMRB/MRB to ME, to match the
new type name.
Moving an instruction can invalidate the cached block dispositions of
the corresponding SCEV. Invalidate the cached dispositions.
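As a sketch, the pattern now is (moveAndInvalidate is a hypothetical
helper, not an upstream API):

#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Hypothetical helper: after moving an instruction, drop the cached
// block and loop dispositions so SCEV does not reuse stale answers.
static void moveAndInvalidate(Instruction *I, Instruction *InsertPt,
                              ScalarEvolution &SE) {
  I->moveBefore(InsertPt);
  SE.forgetBlockAndLoopDispositions();
}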
Also fixes a copy-paste error in forgetBlockAndLoopDispositions where
the loop removed the start expression S from BlockDispositions instead
of the current values. This was also exposed by the new test case.
Fixes #58439.
InstSimplify currently checks whether the instruction simplifies
back to itself, and returns undef in that case. Generally, this
should only occur in unreachable code.
However, this was also done for the simplifyInstructionWithOperands()
API. There, the instruction only serves as a template that provides
the opcode and other non-operand data, so simplifying back to the same
"instruction" may be expected. This caused PR58401 in conjunction with
D134954.
As such, move this check into simplifyInstruction() only. The only
other caller of simplifyInstructionWithOperands() also handles the
self-simplification case explicitly.
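A rough sketch of the resulting structure (simplified; not the exact
upstream code):

#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Sketch: only the simplifyInstruction() entry point maps a
// self-simplification to undef (it should only happen in unreachable
// code); simplifyInstructionWithOperands() returns the result as-is.
static Value *simplifyInstructionSketch(Instruction *I,
                                        const SimplifyQuery &Q) {
  SmallVector<Value *, 8> Ops(I->operands());
  Value *Result = simplifyInstructionWithOperands(I, Ops, Q);
  if (Result == I)
    return UndefValue::get(I->getType());
  return Result;
}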
When looking for underlying objects, if we encounter one that we
have already seen, then we should skip it (as it has already been
checked) rather than bail out. In particular, this adds support
for the case where we have a loop use of a phi recurrence.
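A sketch of the visited-set pattern (hypothetical traversal, simplified
from the actual code):

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Value.h"
using namespace llvm;

// Sketch: skip objects we have already seen instead of bailing out, so
// a phi recurrence that reaches itself through the loop is handled.
static void walkUnderlyingObjects(const Value *Root) {
  SmallVector<const Value *, 8> Worklist = {Root};
  SmallPtrSet<const Value *, 8> Visited;
  while (!Worklist.empty()) {
    const Value *V = Worklist.pop_back_val();
    if (!Visited.insert(V).second)
      continue; // already checked; don't give up on the whole query
    // ... inspect V and push its underlying objects onto Worklist ...
  }
}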
makeLoopInvariant may recursively move its operands to make them
invariant, before moving the passed in instruction. Those recursively
moved instructions are currently missed when invalidating block and loop
dispositions.
To address this, move the invalidation code to Loop::makeLoopInvariant.
Fixes #58314.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D135909
If we have translated across a cycle backedge, the same SSA value
for the condition might be referring to two different loop iterations.
Use the isValueEqualInPotentialCycles() helper to avoid assuming
equality in that case.
Accesses to constant memory are not observable and should be
reported as readnone, not readonly. This is consistent with what
we do for normal (non-call) instructions: For those, the TBAA
metadata will result in pointsToConstantMemory() returning true,
which will then result in a NoModRef result, not a Ref result.
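Illustrative sketch (hypothetical helper; the actual change is in the
call-site ModRef logic):

#include "llvm/Analysis/AliasAnalysis.h"
using namespace llvm;

// Sketch: a read of provably-constant memory is not observable, so it
// is NoModRef rather than Ref, matching the non-call instruction path.
static ModRefInfo classifyRead(AAResults &AA, const MemoryLocation &Loc) {
  if (AA.pointsToConstantMemory(Loc))
    return ModRefInfo::NoModRef;
  return ModRefInfo::Ref;
}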
Differential Revision: https://reviews.llvm.org/D135864
When training large models, we encounter use-after-free bugs when
writing to the input tensors for various MLGO models. This patch fixes the
issue by marking the tensors as "persistent".
Differential Revision: https://reviews.llvm.org/D135739
The algorithm in allLoopPathsLeadToBlock() does not correctly handle
the case where the loop latch is part of the predecessor set: in that
case, we may take the backedge (escaping to a different loop
iteration) and not execute other latch successors. This can happen if
the latch is part of an inner cycle.
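A hedged sketch of the missing guard (hypothetical helper name):

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/BasicBlock.h"
using namespace llvm;

// Sketch: if the latch itself is a predecessor, control may leave via
// the backedge into the next iteration, so the other latch successors
// are not guaranteed to execute.
static bool mayEscapeViaBackedge(
    const SmallPtrSetImpl<const BasicBlock *> &Predecessors,
    const BasicBlock *Latch) {
  return Predecessors.count(Latch);
}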
Fixes https://github.com/llvm/llvm-project/issues/57780.
Differential Revision: https://reviews.llvm.org/D134279
This extends the existing SCEV verification to catch cache invalidation
issues as in #57837.
The validation logic is similar to the recently added loop disposition
cache validation in bb68b2402d.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D134531
If the single-thread model is used, or the
-licm-force-thread-model-single flag is specified, skip checks
related to thread-safety. This means that store promotion for
conditionally executed stores only requires proof of
dereferenceability and writability, but not of thread-safety. For
example, this enables promotion of stores to (non-constant) globals,
as well as captured allocas.
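For reference, a sketch of how such a flag is typically declared (the
flag name is as above; the description text is an assumption):

#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Sketch: when set, LICM treats the program as single-threaded and
// drops the thread-safety requirement for store promotion.
static cl::opt<bool> ForceThreadModelSingle(
    "licm-force-thread-model-single",
    cl::desc("Force single-thread model in LICM"), cl::Hidden,
    cl::init(false));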
Fixes https://github.com/llvm/llvm-project/issues/50537.
Differential Revision: https://reviews.llvm.org/D130466
Extend forgetBlockAndLoopDispositions to allow clearing information for a
single value. This can be useful when only a single value is changed,
e.g. because the instruction is moved.
We also need to clear the cached values for all SCEV users, because they
may depend on the starting value's disposition.
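A usage sketch of the extended API (hypothetical call site):

#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Sketch: after moving a single instruction, clear dispositions for
// just that value; SCEV also clears the cached values of its users.
static void afterMove(ScalarEvolution &SE, Instruction *I,
                      Instruction *InsertPt) {
  I->moveBefore(InsertPt);
  SE.forgetBlockAndLoopDispositions(I);
}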
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D134614
When SimplifyLibCalls is dealing with wchar_t (e.g. optimizing wcslen)
it uses ValueTracking helpers with a CharSize/ElementSize that isn't
8, but rather 16 or 32 (to match with the size in bits of a wchar_t).
The problem I've seen is that llvm::getConstantDataArrayInfo takes
both an "ElementSize" argument (basically indicating the size of a
char/element in bits) and an "Offset" which, AFAICT, is an offset in
units of "number of elements". It also uses
stripAndAccumulateConstantOffsets to get a "StartIdx", which AFAICT
is calculated in bytes. The returned Slice.Length is based on
arithmetic that adds/subtracts variables with different units (bytes
vs. elements). Most notably, I think the "StartIdx" must be scaled by
the "ElementSize" to get correct results.
The symptom of the above problem was seen in the wcslen-1.ll test
case, which was miscompiled.
This patch is supposed to resolve the bug by converting between
bytes and elements when needed.
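A minimal sketch of the conversion (hypothetical helper; the real code
folds this into the offset arithmetic):

#include <cstdint>

// Sketch: StartIdx from stripAndAccumulateConstantOffsets is in bytes,
// while Offset and Slice.Length are in elements, so rescale before
// mixing the two units.
static uint64_t bytesToElements(uint64_t Bytes,
                                uint64_t ElementSizeInBits) {
  return Bytes / (ElementSizeInBits / 8);
}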
Differential Revision: https://reviews.llvm.org/D135263
Currently, AAResultBase (from which alias analysis providers inherit)
stores a reference back to the AAResults aggregation it is part of,
so it can perform recursive alias analysis queries via
getBestAAResults().
This patch removes the back-reference from AAResultBase to AAResults,
and instead passes the used aggregation through the AAQueryInfo.
This can be used to perform recursive AA queries using the full
aggregation.
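Conceptually (a sketch, not the upstream declarations), the change
looks like this:

#include "llvm/Analysis/AliasAnalysis.h"
using namespace llvm;

// Before: each provider inherited a stored back-reference, roughly
//   class AAResultBase { AAResults &AAR; ... };   // removed
// After: the aggregation travels with the per-query state, so every
// provider can issue recursive queries against the full aggregation.
struct QueryInfoSketch {
  AAResults &AAR; // full aggregation available during the query
};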
Differential Revision: https://reviews.llvm.org/D94363
https://alive2.llvm.org/ce/z/oShzr3
This was noted as a missing fold in D134876 (with additional
examples based on issue #58046).
I'm assuming that fmul with a zero operand is rare enough
that the use of ValueTracking will not noticeably increase
compile-time.
This adjusts a PowerPC codegen test that was added with D88388
because it would get folded away and no longer provide coverage
for the bug fix.
Simplify LoopAccessLegacyAnalysis by using LoopAccessInfoManager from
D134606. As a side-effect this also removes printing support from
LoopAccessLegacyAnalysis.
Depends on D134606.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D134608
The attached test case can cause an LLVM crash in buildVPlanWithVPRecipes
because an invalid VPlan is generated.
FIRST-ORDER-RECURRENCE-PHI ir<%792> = phi ir<%501>, ir<%806>
CLONE ir<%804> = fdiv ir<1.000000e+00>, vp<%17> // use of %17
CLONE ir<%806> = load ir<%805>
EMIT vp<%17> = first-order splice ir<%792> ir<%806> // def of %17
...
There is a use-before-def error on %17.
When the vectorizer generates a VPlan, it generates a "first-order splice"
instruction for a loop-carried variable after its definition. All related PHI
users are changed to use this "first-order splice" result and are moved after
it. The move is guided by a MapVector, SinkAfter, whose contents are filled in
by RecurrenceDescriptor::isFixedOrderRecurrence.
Let's look at the first PHI and its related instructions:
%v792 = phi double [ %v806, %Loop ], [ %d1, %Entry ]
%v802 = fdiv double %v794, %v792
%v804 = fdiv double 1.000000e+00, %v792
%v806 = load double, ptr %v805, align 8
%v806 is a loop-carried variable and %v792 is the related PHI instruction. The
vectorizer will generate a new "first-order splice" instruction for %v806, and
it will be used by %v802 and %v804. So %v802 and %v804 will be moved after
%v806 and its "first-order splice" instruction, and SinkAfter contains
%v802 -> %v806
%v804 -> %v802
This means %v802 should be moved after %v806 and %v804 after %v802. Note that
the order is important.
When isFixedOrderRecurrence processes the PHI instruction %v794, the
related instructions are:
%v793 = phi double [ %v813, %Loop ], [ %d1, %Entry ]
%v794 = phi double [ %v793, %Loop ], [ %d2, %Entry ]
%v802 = fdiv double %v794, %v792
%v813 = load double, ptr %v812, align 8
This time the related loop-carried variable is %v813, and its user is %v802,
so %v802 should also be moved after %v813. But %v802 is already in SinkAfter;
because %v813 comes later than %v806, the original %v802 entry in SinkAfter is
deleted and a new %v802 entry is added. Now SinkAfter contains
%v804 -> %v802
%v802 -> %v813
With these data, %v802 can still be moved after all its operands, but %v804
can't be moved after %v806 and its "first-order splice" instruction, which
causes the use-before-def error.
So when removing and re-inserting an instruction I in SinkAfter, we should
also recursively remove the instructions targeting I and re-insert them into
SinkAfter. But for simplicity this patch just bails out in that case.
Differential Revision: https://reviews.llvm.org/D134083
The constant is already commuted for an fmul opcode,
but this code can be called more directly for fma,
so we have to swap for that caller. There are tests
in InstSimplify and InstCombine to verify that this
works as expected.
Added a helper in TargetLibraryInfo to get the size of "size_t" in
bits, given a Module reference. The new getSizeTSize helper uses the
same strategy that, for example, isValidProtoForLibFunc has been using
in the past, assuming that the size can be derived by asking
DataLayout about the size/type of a pointer to int.
FortifiedLibCallSimplifier::optimizeStrpCpyChk was changed to use
the new getSizeTSize helper instead of assuming that sizeof(size_t)
is equal to sizeof(int*) by itself (that is the assumption used in
TargetLibraryInfoImpl::getSizeTSize so the result will be the same).
Having a common helper for this ensures that we use the same strategy
when deriving the size of "size_t" in different parts of the code.
One bonus with this refactoring (basing it on Module instead of just
DataLayout) is that it makes it easier to override this for a specific
target triple, in case the assumption of using getPointerSizeInBits
wouldn't hold.
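A usage sketch of the new helper (hypothetical call site):

#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Module.h"
using namespace llvm;

// Sketch: derive the integer type matching size_t for a module via the
// helper, instead of assuming sizeof(size_t) == sizeof(int*).
static IntegerType *getSizeTType(const TargetLibraryInfo &TLI, Module &M) {
  return IntegerType::get(M.getContext(), TLI.getSizeTSize(M));
}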
Differential Revision: https://reviews.llvm.org/D110585
At the moment, LoopAccessAnalysis is a loop analysis for the new pass
manager. The issue with that is that LAI caches SCEV expressions and
modifications in a loop may impact SCEV expressions in other loops, but
we do not have a convenient way to invalidate LAI for other loops
within a loop pipeline.
To avoid this issue, turn it into a function analysis which returns a
manager object that keeps track of the individual LAI objects per loop.
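A usage sketch (hypothetical pass code, simplified):

#include "llvm/Analysis/LoopAccessAnalysis.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/PassManager.h"
using namespace llvm;

// Sketch: fetch the per-function manager once, then query per-loop
// LoopAccessInfo on demand; the manager owns and caches the LAI objects.
static void queryLAI(Function &F, FunctionAnalysisManager &FAM,
                     LoopInfo &LI) {
  LoopAccessInfoManager &LAIs = FAM.getResult<LoopAccessAnalysis>(F);
  for (Loop *L : LI) {
    const LoopAccessInfo &LAI = LAIs.getInfo(*L);
    (void)LAI; // ... inspect dependences / runtime checks here ...
  }
}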
Fixes #50940.
Fixes #51669.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D134606
Update both memprof and callsite metadata to reflect inlined functions.
For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite.
For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.
Depends on D128142.
Differential Revision: https://reviews.llvm.org/D128143
See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]*
* Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceding patch adding the basic
metadata and verification support.
The matching is performed during the normal PGO annotation phase, to
ensure that the inlines applied in the IR at that point are a subset
of the inlines in the profiled binary and thus reflected in the
profile's call stacks. This is important because the call frames are
associated with functions in the profile based on the inlining in the
symbolized call stacks, and this simplifies locating the subset of
profile data relevant for matching onto each function's IR.
The PGOInstrumentationUse pass is enhanced to perform matching for
whatever combination of memprof and regular PGO profile data exists in
the profile.
Using the utilities introduced in D128854:
The memprof profile data for each context is converted to "cold" or
"notcold" based on parameterized thresholds for size, access count, and
lifetime. The memprof allocation contexts are trimmed to the minimal
amount of context required to uniquely identify whether the context is
cold or not cold. For allocations where all profiled contexts have the
same allocation type, no memprof metadata is attached and instead the
allocation call is directly annotated with an attribute specifying the
allocation type. This is the same attribute that will be applied to
allocation calls once cloned for different contexts, and later used
during LibCall simplification to emit allocation hints [4].
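Purely illustrative sketch of the classification step (the names and
the exact threshold logic are assumptions; the real logic lives in the
MemoryProfileInfo utilities):

#include <cstdint>

// Illustration only: a context whose profiled behavior crosses the
// parameterized thresholds is labeled cold, otherwise notcold.
enum class AllocationType { NotCold, Cold };
static AllocationType classifyContext(uint64_t AccessCount,
                                      uint64_t Lifetime,
                                      uint64_t MinLifetimeForCold) {
  return (AccessCount == 0 || Lifetime >= MinLifetimeForCold)
             ? AllocationType::Cold
             : AllocationType::NotCold;
}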
Depends on D128141 and D128854.
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
[2] https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
[3] https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165
[4] ab87cf382d
Differential Revision: https://reviews.llvm.org/D128142
This is an extension of the existing min/max+select fold (which already
has a very large number of variations) to allow a vector shuffle because
that's what we have in the motivating example from issue #42100.
A couple of Alive2 checks of variants (I don't know how to generalize
these in Alive):
https://alive2.llvm.org/ce/z/jUFAqT
And verify the PR42100 test:
https://alive2.llvm.org/ce/z/3EcASf
It's possible there is some generalization of the fold or a
VectorCombine/SLP answer for the motivating test, but I haven't found a
better/smaller solution yet.
We can also add even more variants here as follow-up patches. For example,
we can have shuffle followed by min/max; we also don't have this
canonicalization or the reverse:
https://alive2.llvm.org/ce/z/StHD9f
Differential Revision: https://reviews.llvm.org/D134879
breakLoopBackedge may remove blocks and loops. Also clear block and
loop dispositions to avoid the cache containing invalid blocks and loops.
The coverage for the change is provided when using an ASAN build of opt
to run the LoopDeletion unit tests; without the fix, pointers to invalid
objects would be used.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D134663
The bulk of the implementation is common between 'release' mode (==AOT-ed
model) and 'development' mode (for training), the main difference is
that in development mode, we may also log features (for training logs),
inject scoring information and then produce the log file.
Differential Revision: https://reviews.llvm.org/D133616
This patch teaches the module inliner a traversal order designed for
the instrumentation FDO (+ThinLTO) scenario.
The new traversal order prioritizes call sites in the following order:
1. Those call sites that are expected to reduce the caller size
2. Those call sites that have gone through the cost-benefit analysis
3. The remaining call sites
With this fairly simple traversal order, a large internal benchmark
yields performance comparable to the bottom-up inliner -- both in
terms of the execution performance and .text* sizes.
Big thanks to Liqiang Tao for the module inliner infrastructure.
I still have hacks outside this patch to prevent excessively long
compilation or .text* size explosion. I'm trying to come up with
acceptable solutions in the near future.
Differential Revision: https://reviews.llvm.org/D134376
A CFG with cycles may require additional passes of the "while (Changed)"
iteration to propagate data back from later blocks to earlier blocks,
when ordered according to depth_first.
The OR logic used for ::May converges to a stable state faster than the
AND logic used for ::Must.
The better solution would be to switch to some form of worklist queue,
but given that this one is good enough, I will consider doing that
later.
We can switch ::Must to OR logic if we calculate "may be dead" instead
of "must be alive" directly and then convert the values to match the
existing interface.
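A sketch of the fixed-point scheme described above (hypothetical
helper; the per-block transfer function is passed in as an assumption):

#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// Sketch: a single depth-first pass cannot propagate facts back across
// backedges, so iterate the whole traversal until nothing changes.
static void runToFixedPoint(Function &F,
                            function_ref<bool(BasicBlock *)> UpdateBlock) {
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (BasicBlock *BB : depth_first(&F.getEntryBlock()))
      Changed |= UpdateBlock(BB);
  }
}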
Additionally, this fixes correctness in the "@cycle" test.
Reviewed By: kstoimenov, fmayer
Differential Revision: https://reviews.llvm.org/D134796