llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	ce44357216	Analysis: Add AssumptionCache to isSafeToSpeculativelyExecute Does not update any of the uses.	2022-09-19 19:25:22 -04:00
Matt Arsenault	fd37ab6cf6	InstCombine: Pass AssumptionCache through isDereferenceablePointer	2022-09-19 19:10:51 -04:00
Matt Arsenault	0d8ffcc532	Analysis: Add AssumptionCache argument to isDereferenceableAndAlignedPointer This does not try to pass it through from the end users.	2022-09-19 18:57:33 -04:00
Alexey Bataev	ce39bdbd65	[SLP][NFC]Reorder gather nodes with reused scalars, NFC. The compiler does not reorder the gather nodes with reused scalars, just does it for operands of the user nodes. This currently does not affect the compiler but breaks internal logic of the SLP graph. In future, it is supposed to actually use all nodes instead of just a list of operands and this will affect the vectorization result. Also, did some early check to avoid complex logic in cost estimation analysis, should improve compile time a bit.	2022-09-19 14:00:17 -07:00
Vitaly Buka	6f3276d57e	[msan] Check mask and pointers shadow Msan has a default handler for unknown instructions, which previously applied to these as well. However, depending on the mask, not all pointers or passthru parts will be used. This allows other passes to insert undef into some arguments. As a result, the default strict instruction handler can produce false reports. Reviewed By: kda, kstoimenov Differential Revision: https://reviews.llvm.org/D133678	2022-09-19 13:09:56 -07:00
Florian Hahn	582f8ef19f	[LV] Keep track of cost-based ScalarAfterVec in VPWidenPointerInd. Epilogue vectorization uses isScalarAfterVectorization to check if widened versions for inductions need to be generated and bails out in those cases. At the moment, there are scenarios where isScalarAfterVectorization returns true but VPWidenPointerInduction::onlyScalarsGenerated would return false, causing widening. This can lead to widened phis with incorrect start values being created in the epilogue vector body. This patch addresses the issue by storing the cost-model decision in VPWidenPointerInductionRecipe and restoring the behavior before `151c144`. This effectively reverts `151c144`, but the long-term fix is to properly support widened inductions during epilogue vectorization Fixes #57712.	2022-09-19 18:14:35 +01:00
Craig Topper	90a004b4a1	[LV] Remove FIXME about NoImplicitFloat. NFC My understanding is that NoImplicitFloat, despite its name, is supposed to disable all vectors not just float vectors. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D134084	2022-09-19 10:01:02 -07:00
Nikita Popov	dd61726d5b	Revert "[SimplifyCFG] accumulate bonus insts cost" This reverts commit `e5581df60a`. This causes major compile-time regressions, about 2-3% end-to-end on CTMark.	2022-09-19 14:46:43 +02:00
Max Kazantsev	92e9bddc49	[LoopRotate] Drop loop dispositions when rotating loops. PR56260 This is required because if there is a pure loop-invariant instruction, Loop Rotation may decide to not clone it and just hoist it instead. If SCEV has previously cached that it was loop-variant (not being smart enough to prove invariance), we may end up with inconsistent cache state (which may later trigger false-negative assertion failures checking that something was invariant). This is a conservative fix that unconditionally drops the dispositions. We could drop them only if the hoisting has actually happened, but it would take some time to understand whether that is safe with all the other things this function does. Differential Revision: https://reviews.llvm.org/D134167 Reviewed By: fhahn	2022-09-19 18:01:02 +07:00
Max Kazantsev	21a9abc1ce	[LoopFuse] Drop loop dispositions before reassigning blocks to other loop This bug was found by recent improvement in SCEV verifier. The code in LoopFuse directly reassigns blocks to be a part of a different loop, which should automatically invalidate all related cached loop dispositions. Differential Revision: https://reviews.llvm.org/D134173 Reviewed By: nikic	2022-09-19 17:43:06 +07:00
Max Kazantsev	818b1ab84e	[SCEV][NFC] Remove unused parameter from forgetLoopDispositions Let's be honest about it, we don't drop loop dispositions for particular loops. Remove the parameter that misleadingly makes it apparent that we do.	2022-09-19 14:06:42 +07:00
Yaxun (Sam) Liu	e5581df60a	[SimplifyCFG] accumulate bonus insts cost SimplifyCFG folds bool foo() { if (cond1) return false; if (cond2) return false; return true; } as bool foo() { if (cond1 | cond2) return false; return true; } 'cond2' is called 'bonus insts' in branch folding because they introduce overhead: the original CFG could exit early, but the folded CFG always executes them. SimplifyCFG calculates the costs of 'bonus insts' of folding a BB into its predecessor BB which shares the destination. If it is below bonus-inst-threshold, SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed. When SimplifyCFG calculates the cost of 'bonus insts', it only considers the 'bonus insts' in the BB currently being considered for folding. This causes an issue for unrolled loops which share destinations, e.g. bool foo(int *a) { for (int i = 0; i < 32; i++) if (a[i] > 0) return false; return true; } After unrolling, it becomes bool foo(int *a) { if(a[0]>0) return false; if(a[1]>0) return false; //... if(a[31]>0) return false; return true; } SimplifyCFG will merge each BB with its predecessor BB, and ends up with 32 'bonus insts' which are always executed, which is much slower than the original CFG. The root cause is that SimplifyCFG does not consider the accumulated cost of 'bonus insts' which are folded from different BB's. This patch fixes that by introducing a ValueMap to track costs of 'bonus insts' coming from different BB's into the same BB, and cuts off if the accumulated cost exceeds a threshold. Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault Differential Revision: https://reviews.llvm.org/D132408	2022-09-18 20:21:14 -04:00
Sanjay Patel	d6498abc24	[InstCombine] remove multi-use add demanded constant fold This was originally part of D133788. There are no visible regressions. All of the diffs show a large unsigned constant becoming a small negative constant. This should be better for analysis (and slightly less compile-time) and codegen.	2022-09-18 14:23:43 -04:00
Kazu Hirata	5e5a6c5b07	Use std::conditional_t (NFC)	2022-09-18 10:25:06 -07:00
Marc Auberer	f52dd920d4	[InstCombine] Fix bug when folding x + (x | -x) to x & (x - 1) Addresses concern: https://reviews.llvm.org/rG09cdddea0c4d284c2c22f5dfade40a60850c5ea7 There was a copy/paste mistake in the code. Updated code and test ref. Differential Revision: https://reviews.llvm.org/D134135	2022-09-18 13:16:12 -04:00
Sanjay Patel	1d1d1e6f22	[InstCombine] fold full-shift of sdiv to icmp+extend This is a disguised sign-bit test with offset: (X / +DivC) >> (Width - 1) --> ext (X <= -DivC) (X / -DivC) >> (Width - 1) --> ext (X >= +DivC) https://alive2.llvm.org/ce/z/cO8JO4 We don't match/test poison in the sdiv constant because that would be immediate undefined behavior.	2022-09-18 13:13:14 -04:00
Kazu Hirata	d3b95ecc98	[ModuleInliner] Remove InlineOrder::front (NFC) InlineOrder::front is a remnant from the era when we had a nested "while" loops in the module inliner, with the inner one grouping the call sites with the same caller. Now that we have a simple "while" loop draining the priority queue, we can just use InlineOrder::pop. Differential Revision: https://reviews.llvm.org/D134121	2022-09-18 08:49:44 -07:00
Benjamin Kramer	b987fe4972	Silence unused variable warning in release builds. NFC	2022-09-18 09:15:32 +02:00
Kazu Hirata	284f0397e2	[Transforms] Merge function attributes within InlineFunction (NFC) In the past, we've had a bug resulting in a compiler crash after forgetting to merge function attributes (D105729). This patch teaches InlineFunction to merge function attributes. This way, we minimize the "time" when the IR is valid, but the function attributes are not. Differential Revision: https://reviews.llvm.org/D134117	2022-09-17 23:10:23 -07:00
Kazu Hirata	6e4fbd2f51	[ModuleInliner] Set Changed earlier (NFC) It makes more sense to set Changed to true immediately after a successful inlining.	2022-09-17 14:16:32 -07:00
Kazu Hirata	31b91356bc	[ModuleInliner] Don't include SetVector.h (NFC) We don't use SetVector in the module inliner.	2022-09-17 12:17:52 -07:00
Kazu Hirata	5faf4bf195	[ModuleInliner] Move UseInlinePriority to InlineOrder.cpp (NFC) UseInlinePriority specifies the priority function. This patch simplifies the code by moving UseInlinePriority closer to the actual consumer -- the switch statement inside getInlineOrder. Differential Revision: https://reviews.llvm.org/D134100	2022-09-17 11:41:28 -07:00
Florian Hahn	7914e53e31	[ConstraintElimination] Fix crash when combining results. `f213128b29` didn't account for the possibility that the result of decompose may be empty. Fix that by explicitly checking. Use a newly introduced helper to also reduce some duplication. Thanks @bjope for finding the issue!	2022-09-17 14:47:38 +01:00
Kazu Hirata	6e30a9cc08	[Inliner] Retire DefaultInlineOrder (NFC) DefaultInlineOrder was largely an exercise in generalizing the traversal order of call sites within the inliner. Now that the module inliner is starting to form its shape, there is no point in sharing DefaultInlineOrder between the module inliner and the CGSCC inliner. DefaultInlineOrder and all the other inline orders are mutually exclusive in the following sense: - The use of DefaultInlineOrder doesn't make sense in the module inliner because there is no priority inherent in the order in which call sites are added to the list of call sites -- SmallVector. - The use of any other inline order doesn't make sense in the CGSCC inliner because little prioritization can be done within one CGSCC. This patch essentially reverts the addition of DefaultInlineOrder so that the loop structure of Inliner.cpp looks like the state just before we started working on the module inliner (circa June 2021). At the same time, we remove the choice of DefaultInlineOrder from UseInlinePriority. Differential Revision: https://reviews.llvm.org/D134080	2022-09-16 15:36:40 -07:00
Alexey Bataev	5d13b12674	[SLP]Improve isUndefVector function by adding insertelement analysis. Added the mask and the analysis of the buildvector sequence in the isUndefVector function, improves codegen and cost estimation. Metric: SLP.NumVectorInstructions Program SLP.NumVectorInstructions results results0 diff test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 27362.00 27360.00 -0.0% Metric: size..text Program size..text results results0 diff test-suite :: External/SPEC/CFP2017rate/508.namd_r/508.namd_r.test 805299.00 806035.00 0.1% 526.blender_r - some extra code is vectorized. 508.namd_r - some extra code is optimized out. Differential Revision: https://reviews.llvm.org/D133891	2022-09-16 14:36:38 -07:00
Teresa Johnson	c2cf93c1a9	[WPD/LTT] Lower type test feeding assumes via phi correctly This fixes https://github.com/llvm/llvm-project/issues/57616. Type test lowering in ThinLTO modules relies on having type id summaries set up for the referenced types, which provide the type test resolution. If there is no summary, the type tests are lowered to false. At the very least, a default type id summary gives the type tests a resolution of Unknown, which is handled correctly (ignored by the first invocation of LTT, and lowered to true by the second). WPD sets up the type id summaries (with a default type test resolution) as it is processing the type tests, but only does this for the patterns handled by WPD, which is a type test directly feeding an assume. In the case of type tests feeding an assume via a phi, the type id summary was not being set up, leading to the type tests being lowered to false incorrectly. Fix this by adding the default type id summary entries for all type ids used on globals during index-only WPD. This is not an issue for hybrid (split-lto-unit) LTO, as in that case the type test resolution is determined and set up during LTT, since the type definitions are in the regular LTO split module, and exported via the summary to the ThinLTO split module. Differential Revision: https://reviews.llvm.org/D134012	2022-09-16 13:50:01 -07:00
Kazu Hirata	9111920af8	[ModuleInliner] clang-format ModuleInliner.cpp (NFC)	2022-09-16 09:41:42 -07:00
Kazu Hirata	4475470529	[ModuleInliner] Remove a stale comment (NFC) These comments refer to the nested loop in the module inliner where the inner loop grouped call sites from the same caller. We don't group call sites anymore, so the comment has become stale.	2022-09-16 09:37:43 -07:00
Kazu Hirata	42a90e6017	[ModuleInliner] Remove a redundant variable (NFC) In the CGSCC inliner, DidInline was used as an indicator to update the call graph. In the module inliner, DidInline is always true at the end of the "while" loop, so we can just drop it.	2022-09-16 09:32:02 -07:00
Kazu Hirata	513717ddd0	[ModuleInliner] Remove a write-only variable (NFC) InlinedCallees is a remnant from the CGSCC inliner. We don't use it in the module inliner.	2022-09-16 09:15:53 -07:00
Kazu Hirata	77501bfab8	[IPO] Simplify the module inliner loop (NFC) In the bottom-up inliner, we have a two-level nested "while" loop, with the inner one grouping call sites with the same caller. We need to do so to keep CGSCC up to date. Now, with the module inliner, we don't have any per-caller work. We don't update CGSCC. Plus, the caller will likely keep changing as we pop call sites in some priority order. This patch simply removes the inner "while" loop while indenting its body. Further cleanup is possible, but that's left for follow-up patches. Differential Revision: https://reviews.llvm.org/D133969	2022-09-16 08:56:18 -07:00
Sanjay Patel	6174da2299	[InstCombine] reduce code duplication in foldICmpMulConstant(); NFC	2022-09-16 10:39:54 -04:00
Vitaly Buka	f0c2ffa8f8	[msan] Add msan-insert-check DEBUG_COUNTER	2022-09-15 21:52:58 -07:00
Gulfem Savrun Yeniceri	d6aed77f0d	[InstrProfiling] No runtime hook for unused funcs This is a reland of https://reviews.llvm.org/D122336. Original patch caused a problem in collecting coverage in Fuchsia because it was returning early without putting unused function names into __llvm_prf_names section. This patch fixes that issue. The original commit message is as follows: CoverageMappingModuleGen generates a coverage mapping record even for unused functions with internal linkage, e.g. static int foo() { return 100; } Clang frontend eliminates such functions, but InstrProfiling pass still emits runtime hook since there is a coverage record. Fuchsia uses runtime counter relocation, and pulling in profile runtime for unused functions causes a linker error: undefined hidden symbol: __llvm_profile_counter_bias. Since https://reviews.llvm.org/D98061, we do not hook profile runtime for the binaries in which none of the translation units have been instrumented in Fuchsia. This patch extends that for the instrumented binaries that consist of only unused functions. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D122336	2022-09-16 02:05:09 +00:00
Navid Emamdoost	3e52c0926c	Add -fsanitizer-coverage=control-flow Reviewed By: kcc, vitalybuka, MaskRay Differential Revision: https://reviews.llvm.org/D133157	2022-09-15 15:56:04 -07:00
Sanjay Patel	aafaa2f4fc	[SCCP] convert ashr to lshr for non-negative shift value This is similar to the existing signed instruction folds. We get the obvious minimal patterns in other passes, but this avoids potential missed folds when the multi-block tests are converted to selects.	2022-09-15 13:54:52 -04:00
Craig Topper	ace05124f5	[IntegerDivision][AMDGPU] Use CreateLogicalOr to block poison propagation. There are two ctlz intrinsics here with the zero_is_poison flag set. There are also two comparisons that check if either of the inputs to the ctlzs is zero. We need to use a logical or to block the poison from the ctlz if either of the inputs is zero. Reviewed By: arsenm, aqjune Differential Revision: https://reviews.llvm.org/D130680	2022-09-15 09:38:02 -07:00
Sanjay Patel	02a27b3890	[InstCombine] fold X*X == 0 --> X == 0 This is safe when the mul does not overflow: https://alive2.llvm.org/ce/z/LedVVP This could be extended to handle non-zero compare constants and non-squared multiplies.	2022-09-15 12:02:50 -04:00
Evgeniy Brevnov	03a102e3b2	[JumpThreading][NFC] Reuse existing DT instead of recomputation (newPM) This is the same change as `503d5771b6` with the same intent but for new pass manager.	2022-09-15 12:27:57 +07:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `7539e9cf81`.	2022-09-15 03:08:46 +00:00
Vitaly Buka	f221720e82	[nfc][msan] getShadowOriginPtr on <N x ptr> Some vector instructions can benefit from Addr as <N x ptr>. Differential Revision: https://reviews.llvm.org/D133681	2022-09-14 19:18:52 -07:00
Vitaly Buka	f404169f24	[NFC][msan] Rename variables to match definition	2022-09-14 19:16:27 -07:00
Vitaly Buka	2209be15a5	[NFC][msan] Convert some code to early returns Reviewed By: kda Differential Revision: https://reviews.llvm.org/D133673	2022-09-14 19:16:11 -07:00
Vitaly Buka	bcf3d666b4	[NFC][msan] Simplify llvm.masked.load origin code Reviewed By: kda Differential Revision: https://reviews.llvm.org/D133652	2022-09-14 19:14:29 -07:00
Vitaly Buka	d421223e25	[msan] Resolve FIXME from D133880 We don't need to change tests: we convertToBool unconditionally only before OR.	2022-09-14 18:55:57 -07:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Vitaly Buka	bf204881b6	[msan] Change logic of ClInstrumentationWithCallThreshold According to logs, ClInstrumentationWithCallThreshold is a workaround for a slow backend with a large number of basic blocks. However, I can't reproduce that one, but I see significant slowdown after ClCheckConstantShadow. Without ClInstrumentationWithCallThreshold the compiler is able to eliminate many of the branches. So maybe we should drop ClInstrumentationWithCallThreshold completely. For now I just change the logic to ignore constant shadow so it will not trigger callback fallback too early. Reviewed By: kstoimenov Differential Revision: https://reviews.llvm.org/D133880	2022-09-14 14:58:12 -07:00
Florian Hahn	7f3ff9d3c0	[ConstraintElimination] Track if variables are positive in constraint. Keep track if variables are known positive during constraint decomposition, aggregate the information when building the constraint object and encode the extra information as constraints to be used during reasoning.	2022-09-14 18:43:54 +01:00
Alexey Bataev	d647312e3f	[SLP][NFC]Extract getLastInstructionInBundle function for better dependence checking, NFC. Part of D110978	2022-09-14 08:43:15 -07:00
Zain Jaffal	8253f7e286	[InstCombine] Optimize multiplication where both operands are negated Handle the case where both operands are negated in matrix multiplication Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133695	2022-09-14 16:29:39 +01:00
Nikita Popov	b1cd393f9e	[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI) Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead. To give two examples of why this is useful: * D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely. The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring. Differential Revision: https://reviews.llvm.org/D130896	2022-09-14 16:34:41 +02:00
Florian Hahn	efd3ec47d9	[ConstraintElimination] Clear new indices directly in getConstraint(NFC) Instead of checking if any of the new indices has a non-zero coefficient before using the constraint, do this directly when constructing the constraint.	2022-09-14 15:31:25 +01:00
Sanjay Patel	73919a87e9	[InstCombine] try multi-use demanded bits folds for 'add' This patch enables a multi-use demanded bits fold (motivated by issue #57576): https://alive2.llvm.org/ce/z/DsZakh This mimics transforms that we already do on the single-use path. Originally, this patch did not include the last part to form a constant, but that can be removed independently to reduce risk. It's not clear what the effect of either change will be when viewed end-to-end. This is expected to be neutral or a slight win for compile-time. See the "add-demand2" series for experimental timing results: https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions&remote=rotateright Differential Revision: https://reviews.llvm.org/D133788	2022-09-14 09:30:59 -04:00
Alexey Bataev	796af0c027	[SLP] Move getInsertIndex function, NFC. Part of D110978.	2022-09-14 06:22:52 -07:00
Florian Hahn	f213128b29	[ConstraintElimination] Further de-compose operands of add operations. This simply extends the existing logic to look through adds and combine the components as done in other places already.	2022-09-14 12:00:32 +01:00
Kazu Hirata	d3649c2be4	[Vectorize] Fix a warning This patch fixes: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:5879:5: error: expression result unused [-Werror,-Wunused-value]	2022-09-13 09:30:06 -07:00
Arthur Eubanks	5a33d1f0b9	[SimplifyCFG] Don't hoist allocas D129370 started hoisting allocas across stacksave/stackrestore boundaries which is wrong. Reviewed By: chill, rnk Differential Revision: https://reviews.llvm.org/D133730	2022-09-13 09:23:39 -07:00
Valery N Dmitriev	18dde772d6	[SLP] Unify main/alternate selection for CmpInst instructions Make main/alternate operation selection logic for CmpInst consistent across SLP vectorizer. Differential Revision: https://reviews.llvm.org/D133430	2022-09-13 09:20:25 -07:00
Florian Hahn	ac80b0e84f	[LV] Mark Instr as const in scalarizeInstruction. (NFC). This is to reduce the diff in follow-up changes.	2022-09-13 09:10:02 +01:00
Max Kazantsev	86d5586d78	[SCEVExpander] Recompute poison-generating flags on hoisting. PR57187 Instruction being hoisted could have nuw/nsw flags inferred from the old context, and we cannot simply move it to the new location keeping them because we are going to introduce new uses to them that didn't exist before. Example in https://github.com/llvm/llvm-project/issues/57187 shows how this can produce branch by poison from initially well-defined program. This patch forcefully recomputes poison-generating flag in the new context. Differential Revision: https://reviews.llvm.org/D132022 Reviewed By: fhahn, nikic	2022-09-13 12:56:35 +07:00
Kazu Hirata	9606608474	[llvm] Use x.empty() instead of llvm::empty(x) (NFC) I'm planning to deprecate and eventually remove llvm::empty. I thought about replacing llvm::empty(x) with std::empty(x), but it turns out that all uses can be converted to x.empty(). That is, no use requires the ability of std::empty to accept C arrays and std::initializer_list. Differential Revision: https://reviews.llvm.org/D133677	2022-09-12 13:34:35 -07:00
Sanjay Patel	53eede597e	[InstCombine] look through 'not' of ctlz/cttz op with 0-is-undef https://alive2.llvm.org/ce/z/MNsC1S This pattern was flagged at: https://discourse.llvm.org/t/instcombines-select-optimizations-dont-trigger-reliably/64927	2022-09-12 15:06:21 -04:00
Benjamin Kramer	2675c41671	[DFSan] Don't crash with the legacy pass manager TargetLibraryInfo isn't optional, so we have to provide it even with the legacy stuff. Ideally we wouldn't need it anymore but there are still users out there that are stuck on the legacy PM. Differential Revision: https://reviews.llvm.org/D133685	2022-09-12 19:11:55 +02:00
A-Wadhwani	de3445e0ef	[SROA] Create additional vector type candidates based on store and load slices This patch adds additional vector types to be considered when doing promotion in SROA, based on the types of the store and load slices. This provides more promotion opportunities, by potentially using an optimal "intermediate" vector type. For example, the following code would currently not be promoted to a vector, since `__m128i` is a `<2 x i64>` vector. ``` __m128i packfoo0(int a, int b, int c, int d) { int r[4] = {a, b, c, d}; __m128i rm; std::memcpy(&rm, r, sizeof(rm)); return rm; } ``` ``` packfoo0(int, int, int, int): mov dword ptr [rsp - 24], edi mov dword ptr [rsp - 20], esi mov dword ptr [rsp - 16], edx mov dword ptr [rsp - 12], ecx movaps xmm0, xmmword ptr [rsp - 24] ret ``` By also considering the types of the elements, we could find that the `<4 x i32>` type would be valid for promotion, hence removing the memory accesses for this function. In other words, we can explore other new vector types, with the same size but different element types based on the load and store instructions from the Slices, which can provide us more promotion opportunities. Additionally, the step for removing duplicate elements from the `CandidateTys` vector was not using an equality comparator, which has been fixed. Differential Revision: https://reviews.llvm.org/D132096	2022-09-12 09:55:37 -07:00
Sanjay Patel	4ca25c66d4	[Reassociate] prevent partial undef negation replacement As shown in the examples in issue #57683, we allow matching vectors with poison (undef) in this transform (and possibly more), but we can't then use the partially defined value as a replacement value in other expressions blindly. This seems to be avoided in simpler examples of reassociation, and other passes should be able to clean up the redundant op seen in these tests.	2022-09-12 12:28:34 -04:00
Florian Hahn	3fd1cc2574	[SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs. Adding the pre-header to CSEBlocks ensures instructions are CSE'd even after hoisting. This was originally discovered by @atrick a while ago. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D133649	2022-09-12 15:53:31 +01:00
Alexey Bataev	dfe1e9dd79	[SLP]Improve reordering of clustered reused scalars. If the reused scalars are clustered, i.e. each part of the reused mask contains all elements of the original scalars exactly once, we can reorder those clusters to improve the whole ordering of the clustered vectors. Differential Revision: https://reviews.llvm.org/D133524	2022-09-12 06:52:25 -07:00
Max Kazantsev	0e465c0c2f	[IRCE] Bail in case of pointer types. PR40539 We should not unconditionally expect that SCEVable types are all integers because SCEV can also be computed for pointers. Bail in this case.	2022-09-12 16:01:25 +07:00
Djordje Todorovic	b080d0bae8	Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""" This reverts commit `df868edee5`, as it introduces a bug found by Alive2 (more on the rGdf868edee561).	2022-09-12 08:23:07 +02:00
Johannes Doerfert	c922cac868	Revert "[Attributor] AAPointerInfo should allow "harmless" uses" Revert "[Attributor] Teach AAPointerInfo to look into aggregates" This reverts commit `844f6c5d03` and `4ed0a88cd8` as they broke the buildbots that run openmp/libomptarget/test/offloading/bug49021.cpp.	2022-09-11 21:37:54 -07:00
Johannes Doerfert	844f6c5d03	[Attributor] AAPointerInfo should allow "harmless" uses If a call base use will not capture a pointer we can approximate the effects. This is important especially for readnone/only uses.	2022-09-11 20:16:11 -07:00
Johannes Doerfert	4ed0a88cd8	[Attributor] Teach AAPointerInfo to look into aggregates If we have a constant aggregate, e.g., as an initializer, we usually failed to extract the proper value/type from it. This patch provides the size and offset information necessary to extract the right part of the constant.	2022-09-11 20:16:11 -07:00
Johannes Doerfert	b046ebdc01	[Attributor][FIX] Conservatively handle ptr2int, don't crash If a pointer-2-int cast is found we give up on AAPointerInfo for now. This caused a crash before. Reported by John Tramm (@jtramm).	2022-09-11 20:16:11 -07:00
Johannes Doerfert	21711039e3	[OpenMP] Allow the Attributor to look at functions we also internalized This is important as we have accesses to globals in those which we need to categorize.	2022-09-11 20:16:11 -07:00
Junduo Dong	6975ab7126	[Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework The previous implementation of time tracing in NewPassManager is direct but messy. The key code looks like the demo below: ``` /// Runs the function pass across every function in the module. PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &CG, CGSCCUpdateResult &UR) { /// ... PreservedAnalyses PassPA; { TimeTraceScope TimeScope(Pass.name()); PassPA = Pass.run(F, FAM); } /// ... } ``` It is tedious to decide by hand where the tracing code should be added. With the PassInstrumentation framework, we can easily add `Before/After` callback functions that do the time tracing. Differential Revision: https://reviews.llvm.org/D131960	2022-09-11 05:42:55 -07:00
Florian Hahn	69d9bb2aad	[VPlan] Check recipe uses instead of type of underlying instr (NFC). Suggested by @Ayal post-commit, to reduce the dependence on the underlying instruction in favor of information available directly for the recipe.	2022-09-11 12:24:44 +01:00
Marc Auberer	09cdddea0c	[InstCombine] Fold x + (x \| -x) to x & (x - 1) Fixes #57531 This transformation may be particularly useful on x86-64, because x & (x - 1) can be performed by a single blsr instruction. Differential Revision: https://reviews.llvm.org/D133362	2022-09-11 06:14:24 -04:00
Alexey Bader	2bb5535b58	[StripDeadDebugInfo] Drop dead CUs In situations when a submodule is extracted from big module (i.e. using CloneModule) a lot of debug info is copied via metadata nodes. Despite of the fact that part of that info is not linked to any instruction in extracted IR file, StripDeadDebugInfo pass doesn't drop them. Strengthen criteria for debug info that should be kept in a module: - Only those compile units are left that referenced by a subprogram debug info node that is attached to a function definition in the module or to an instruction in the module that belongs to an inlined function. Signed-off-by: Mikhail Lychkov <mikhail.lychkov@intel.com> Differential Revision: https://reviews.llvm.org/D122163	2022-09-11 01:31:03 -07:00
Vitaly Buka	b51d1f1fbd	[msan] Don't depend on arguments evaluation order	2022-09-10 15:28:32 -07:00
Vitaly Buka	71c5e7b26a	[msan] Do not depend on arguments evaluation order Clang and GCC do this differently making IR inconsistent. https://lab.llvm.org/buildbot#builders/6/builds/13120	2022-09-10 13:50:32 -07:00
Vitaly Buka	1819d5999c	[NFC][msan] Remove unused return type	2022-09-10 12:20:54 -07:00
Vitaly Buka	6fc31712f1	[msan] Relax handling of llvm.masked.expandload and llvm.masked.gather This is work around for new false positives. Real implementation will follow.	2022-09-10 12:19:16 -07:00
Manuel Brito	b51c6130ef	Use PoisonValue instead of UndefValue when RAUWing unreachable code [NFC] Replacing the following instances of UndefValue with PoisonValue, where the UndefValue is used as an arbitrary value: - llvm/lib/CodeGen/WinEHPrepare.cpp `demotePHIsOnFunclets`: RAUW arbitrary value for lingering uses of removed PHI nodes - llvm/lib/Transforms/Utils/BasicBlockUtils.cpp `FoldSingleEntryPHINodes`: Removes a self-referential single entry phi node. - llvm/lib/Transforms/Utils/CallGraphUpdater.cpp `finalize`: Remove all references to removed functions. - llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp `cleanup`: the result is not used then the inserted instructions are removed. - llvm/tools/bugpoint/CrashDebugger.cpp `TestInts`: the program is cloned and instructions are removed to narrow down source of crash. Differential Revision: https://reviews.llvm.org/D133640	2022-09-10 14:28:01 +01:00
Florian Hahn	da734473fa	[LV] Remove now dead variable after `2a78890b7b` (NFC).	2022-09-09 20:25:55 +01:00
Florian Hahn	2a78890b7b	[VPlan] Move SCEV expansion for pointer induction to VPExpandSCEV (NFC). Use VPExpandSCEVRecipe to expand the step of pointer inductions. This cleanup addresses a corresponding FIXME. It should be NFC, as steps for pointer induction must be constants, which makes expansion trivial.	2022-09-09 19:20:13 +01:00
Sanjay Patel	6113e6738d	[InstCombine] move/adjust comments about demanded bits; NFC The code has been moved/copied around, but the comments were not updated to match.	2022-09-09 11:48:20 -04:00
Philip Reames	a33d98e20a	[LV] Pull out common expression [nfc]	2022-09-09 07:31:46 -07:00
Philip Reames	edb26268ce	[VPlan] Only generate single instr for stores uniform across all parts. Extend the approach taken by D133019 to store instructions. Differential Revision: https://reviews.llvm.org/D133497	2022-09-09 07:15:12 -07:00
Nikita Popov	a9f312c7f4	[AST] Use BatchAA in aliasesUnknownInst() (NFCI)	2022-09-09 15:54:48 +02:00
Sebastian Neubauer	c7750c522e	Add helper func to get first non-alloca position The LLVM performance tips suggest that allocas should be placed at the beginning of the entry block. So far, llvm doesn’t provide any helper to find that position. Add BasicBlock::getFirstNonPHIOrDbgOrAlloca and IRBuilder::SetInsertPointPastAllocas(Function*) that get an insert position after the (static) allocas at the start of a function and use it in ShadowStackGCLowering. Differential Revision: https://reviews.llvm.org/D132554	2022-09-09 15:39:53 +02:00
Nikita Popov	4ab77d1677	[LICM] Allow promotion with non-load/store users If there are non-load/store users of the promoted pointer, we currently abort promotion. However, having such users isn't really relevant to the transform. We already separately check that a) there are no instructions that modref the promoted pointer and b) that a pointer capture disables store promotion. In the affected @test_captured_in_loop test case we have a readnone capture of the promoted pointer, which means that load promotion can be performed (while store promotion cannot). Differential Revision: https://reviews.llvm.org/D133485	2022-09-09 13:09:59 +02:00
Djordje Todorovic	df868edee5	"Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"" This reverts commit `053841c562`. We faced a use-after-free after pushing the D113291, since the foldSqrt() has a call to eraseFromParent(). The function should be at the end of the main loop that folds the patterns. This patch fixes that.	2022-09-09 10:29:39 +02:00
Vitaly Buka	1cf5c7fe8c	[msan] Disambiguate warnings debug location If multiple warnings created on the same instruction (debug location) it can be difficult to figure out which input value is the cause. This patches chains origins just before the warning using last origins update debug information. To avoid inflating the binary unnecessarily, do this only when uncertainty is high enough, 3 warnings by default. On average it adds 0.4% to the .text size. Reviewed By: kda, fmayer Differential Revision: https://reviews.llvm.org/D133232	2022-09-08 14:17:07 -07:00
Vitaly Buka	0f2f1c2be1	[sanitizers] Invalidate GlobalsAA GlobalsAA is considered stateless as usually transformations do not introduce new global accesses, and removed global access is not a problem for GlobalsAA users. Sanitizers introduce new global accesses: - Msan and Dfsan tracks origins and parameters with TLS, and to store stack origins. - Sancov uses global counters. HWAsan store tag state in TLS. - Asan modifies globals, but I am not sure if invalidation is required. I see no evidence that TSan needs invalidation. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D133394	2022-09-08 14:00:43 -07:00
Sanjay Patel	444f08c832	[InstCombine] fold icmp of truncated left shift, part 2 (trunc (1 << Y) to iN) == 2^C --> Y == C (trunc (1 << Y) to iN) != 2^C --> Y != C https://alive2.llvm.org/ce/z/xnFPo5 Follow-up to `d9e1f9d759`. This was a suggested enhancement mentioned in issue #51889.	2022-09-08 12:44:02 -04:00
Philip Reames	4c4c0d2c06	[LV] Use safe-divisor lowering for fixed vectors if profitable This extends the safe-divisor widening scheme recently added for scalable vectors to handle fixed vectors as well. Differential Revision: https://reviews.llvm.org/D132591	2022-09-08 09:15:54 -07:00
Joe Loser	5e96cea1db	[llvm] Use std::size instead of llvm::array_lengthof LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpful for C++17 or later: `std::size` already has support for C-style arrays. Change call sites to use `std::size` instead. Differential Revision: https://reviews.llvm.org/D133429	2022-09-08 09:01:53 -06:00
Djordje Todorovic	7aec9ddcfd	Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"" This reverts commit `f879939157`.	2022-09-08 17:01:16 +02:00
Sanjay Patel	d9e1f9d759	[InstCombine] Fold icmp of truncated left shift (trunc (1 << Y) to iN) == 0 --> Y u>= N (trunc (1 << Y) to iN) != 0 --> Y u< N These can be generalized in several ways as noted by the TODO items, but this handles the pattern in the motivating bug report. Fixes #51889 Differential Revision: https://reviews.llvm.org/D115480	2022-09-08 10:48:14 -04:00
Djordje Todorovic	f879939157	Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"	2022-09-08 16:36:46 +02:00
Florian Hahn	422cf99161	[VPlan] Only generate single instr for loads uniform across all parts. VPReplicateRecipe::isUniform actually means uniform-per-parts, hence a scalar instruction is generated per-part. This is a potential alternative D132892. For now the current patch only catches cases where the address is trivially invariant (defined outside VPlan), while D132892 catches any address that is considered invariant by SCEV AFAICT. It should be possible to hoist fully invariant recipes feeding loads out of the vector loop region as well, but in practice LICM should do that already. This version of the patch artificially limits this to loads to make it easier to compare, but this restriction should be easily liftable. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133019	2022-09-08 14:27:58 +01:00
Chenbing Zheng	01cea7ac10	[InstCombine] extractvalue (any_mul_with_overflow X, 2^n), 0 -> X << n Alive2: https://alive2.llvm.org/ce/z/JLmabt (umul) https://alive2.llvm.org/ce/z/J_ruXR (smul) https://alive2.llvm.org/ce/z/o9SVSz (vector) Reviewed By: spatel, RKSimon Differential Revision: https://reviews.llvm.org/D133188	2022-09-08 11:12:55 +08:00
Sami Tolvanen	52967a5306	[InstCombine] Fix a crash in -kcfi debug block Don't attempt to print out DebugLoc as we may not have one.	2022-09-07 22:59:12 +00:00
Marco Elver	97c2220565	[SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass Introduces the SanitizerBinaryMetadata instrumentation pass which uses the new MD_pcsections metadata kinds to instrument certain types of instructions and functions required for breakpoint-based sanitizers. The first intended user of the binary metadata emitted will be a variant of GWP-TSan [1]. GWP-TSan will require information about atomic accesses; to unambiguously determine if an access is atomic or not, we also require "covered" information which code has been compiled with SanitizerBinaryMetadata instrumentation enabled. [1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D130887	2022-09-07 21:25:40 +02:00
Sanjay Patel	85b289377b	[SCCP] convert signed div/rem to unsigned for non-negative operands, 2nd try The original commit ( `fe1f3cfc26` ) was reverted because it could crash / assert when trying to fold a value that was replaced by a constant. In that case, there might not be an entry for the constant in the solver yet. This version adds a check for that possibility along with tests to exercise that pattern (they used to crash). Original commit message: This extends the transform added with D81756 to handle div/rem opcodes. For example: https://alive2.llvm.org/ce/z/cX6za6 This replicates part of what CVP already does, but the motivating example from issue #57472 demonstrates a phase ordering problem - we convert branches to select before CVP runs and miss the transform. Differential Revision: https://reviews.llvm.org/D133198	2022-09-07 11:56:29 -04:00
Sanjay Patel	7c57180900	[InstCombine] fold add+negate through select into sub This transform came up as a potential DAGCombine in D133282, so I wanted to see how it escaped in IR too. We do general folds in InstCombiner::SimplifySelectsFeedingBinaryOp() by checking if either arm of a select simplifies when the trailing binop is threaded into the select. So as long as one side simplifies, it's a good fold to combine a negate and add into 1 subtract. This is an example with a zero arm in the select: https://alive2.llvm.org/ce/z/Hgu_Tj And this models the tests with a cancelling 'not' op: https://alive2.llvm.org/ce/z/BuzVV_ Differential Revision: https://reviews.llvm.org/D133369	2022-09-07 08:23:35 -04:00
Aaron Kogon	ae05b9dc30	Sink/hoist memory instructions between loop fusion candidates Currently, instructions in the preheader of the second of two fusion candidates are sunk and hoisted whenever possible, to try to allow the loops to fuse. Memory instructions are skipped, and are never sunk or hoisted. This change adds memory instructions for sinking/hoisting consideration. This change uses DependenceAnalysis to check if a mem inst in the preheader of FC1 depends on an instruction in FC0's header, across which it will be hoisted, or FC1's header, across which it will be sunk. We reject cases where the dependency is a data hazard. Differential Revision: https://reviews.llvm.org/D131606	2022-09-07 07:42:00 -04:00
Nikita Popov	f42d92611d	[Reassociate] Avoid ConstantExpr::getFNeg() (NFCI) Use ConstantFoldUnaryOpOperand() instead. Also make the code below robust against non-instruction users, just in case it doesn't fold.	2022-09-07 10:48:08 +02:00
Vitaly Buka	4c18670776	[NFC][sancov] Rename ModuleSanitizerCoveragePass	2022-09-06 20:55:39 -07:00
Vitaly Buka	5e38b2a456	[NFC][msan] Rename ModuleMemorySanitizerPass	2022-09-06 20:30:35 -07:00
Ruobing Han	fb45f3c948	[SimpleLoopUnswitch] Skip non-trivial unswitching of cold functions In the current main branch, all cold loops will not be applied non-trivial unswitch. As reported in D129599, skipping these cold loops will incur regression in SPEC benchmark. Thus, instead of skipping cold loops, now only skipping loops in cold functions. Reviewed By: alexgatea, aeubanks Differential Revision: https://reviews.llvm.org/D133275	2022-09-06 19:13:31 -04:00
Vitaly Buka	93600eb50c	[NFC][asan] Rename ModuleAddressSanitizerPass	2022-09-06 15:02:11 -07:00
Vitaly Buka	e7bac3b9fa	[msan] Convert Msan to ModulePass The MemorySanitizerPass function pass violated requirement 4 of function passes, which is to not insert globals. Msan needs to insert globals for origin tracking and parameter tracking. https://llvm.org/docs/WritingAnLLVMPass.html#the-functionpass-class Reviewed By: kstoimenov, fmayer Differential Revision: https://reviews.llvm.org/D133336	2022-09-06 15:01:04 -07:00
Vitaly Buka	b4257d3bf5	[tsan] Replace mem intrinsics with calls to interceptors After https://reviews.llvm.org/rG463aa814182a23 tsan replaces llvm intrinsics with calls to glibc functions. However this approach is fragile, as slight changes in pipeline can return llvm intrinsics back. In particular InstCombine can do that. Msan/Asan already declare own version of these memory functions for the similar purpose. KCSAN, or anything that uses something else than compiler-rt, needs to implement this callbacks. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D133268	2022-09-06 13:09:31 -07:00
Florian Hahn	27e7db54eb	Revert "[SCCP] convert signed div/rem to unsigned for non-negative operands" This reverts commit `fe1f3cfc26`. It looks like this commit breaks building llvm-test-suite. To reproduce, run `opt -passes=ipsccp` on the IR below. @g = internal global i32 256, align 4 define void @test() { entry: %0 = load i32, ptr @g, align 4 %div = sdiv i32 %0, undef ret void }	2022-09-06 18:21:51 +01:00
Florian Hahn	2fb68c0628	[ConstraintElimination] Replace pair with named struct (NFC). This slightly improves the readability and allows further extensions in follow-ups.	2022-09-06 18:04:04 +01:00
Vitaly Buka	c51a12d598	Revert "[tsan] Replace mem intrinsics with calls to interceptors" Breaks http://45.33.8.238/macm1/43944/step_4.txt https://lab.llvm.org/buildbot/#/builders/70/builds/26926 This reverts commit `77654a65a3`.	2022-09-06 09:47:33 -07:00
Sanjay Patel	ae117e1c1b	[InstCombine] remove dead code for add (select cond, (sub), 0); NFC This pattern is handled more generally in SimplifySelectsFeedingBinaryOp(). Tests to confirm that added to the add.ll test file in the previous commit.	2022-09-06 12:19:50 -04:00
Doru Bercea	0b1160fdeb	Fix OpenMP Opt for target without a parallel region. Remove ctx redeclaration. Format code. Remove parallel check. Modify tests. Clean-up code. Fix another test. Move code to helper functions. Format file. Minor fixes.	2022-09-06 16:04:53 +00:00
Vitaly Buka	77654a65a3	[tsan] Replace mem intrinsics with calls to interceptors After https://reviews.llvm.org/rG463aa814182a23 tsan replaces llvm intrinsics with calls to glibc functions. However this approach is fragile, as slight changes in pipeline can return llvm intrinsics back. In particular InstCombine can do that. Msan/Asan already declare own version of these memory functions for the similar purpose. KCSAN, or anything that uses something else than compiler-rt, needs to implement this callbacks. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D133268	2022-09-06 08:25:32 -07:00
Sanjay Patel	fe1f3cfc26	[SCCP] convert signed div/rem to unsigned for non-negative operands This extends the transform added with D81756 to handle div/rem opcodes. For example: https://alive2.llvm.org/ce/z/cX6za6 This replicates part of what CVP already does, but the motivating example from issue #57472 demonstrates a phase ordering problem - we convert branches to select before CVP runs and miss the transform. Differential Revision: https://reviews.llvm.org/D133198	2022-09-06 08:58:15 -04:00
Sanjay Patel	dd6eb4d67f	[InstCombine] reduce code duplication; NFC	2022-09-06 08:19:30 -04:00
Arthur Eubanks	7e3aa8f01a	Revert "[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests" This reverts commit `57fd866551`. Causes crashes, see comments in D132581.	2022-09-05 15:42:48 -07:00
Momchil Velikov	078899cd64	[SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions SimplifyCFG does some common code hoisting, which is limited to hoisting a sequence of identical instruction in identical order and stops at the first non-identical instruction. This patch allows hoisting instruction pairs over same-length sequences of non-matching instructions. The linear asymptotic complexity of the algorithm stays the same, there's an extra parameter `simplifycfg-hoist-common-skip-limit` serving to limit compilation time and/or the size of the hoisted live ranges. The patch improves SPECv6/525.x264_r by about 10%. Reviewed By: nikic, dmgreen Differential Revision: https://reviews.llvm.org/D129370	2022-09-05 15:13:46 +01:00
Tian Zhou	8fa432be4f	[InstCombine] reduce test-for-overflow of shifted value Fixes #57338. The added code makes the following transformations: For unsigned predicates / eq / ne: icmp pred (x << 1), x --> icmp getSignedPredicate(pred) x, 0 icmp pred x, (x << 1) --> icmp getSignedPredicate(pred) 0, x Some examples: https://alive2.llvm.org/ce/z/ckn4cj https://alive2.llvm.org/ce/z/h-4bAQ Differential Revision: https://reviews.llvm.org/D132888	2022-09-05 09:51:51 -04:00
Florian Hahn	408ebe5e3a	[VPlan] Move VPWidenCallRecipe to VPlanRecipes.cpp (NFC). Depends on D132585. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D132586	2022-09-05 10:48:29 +01:00
Nikita Popov	388b684354	[LICM] Separate check for writability and thread-safety (NFCI) This used a single check to make sure that the object is both writable and thread-local. Separate them out to make the deficiencies in the current code more obvious.	2022-09-05 09:43:17 +02:00
Florian Hahn	ba3d29f871	[LCSSA] Update unreachable uses with poison. Users of LCSSA may not expect non-phi uses when checking the uses outside a loop, which may cause crashes. This is due to the fact that we do not update uses in unreachable blocks. To ensure all reachable uses outside the loop are phis, update uses in unreachable blocks to use poison in dead code. Fixes #57508.	2022-09-04 22:26:18 +01:00
Kazu Hirata	7d8c2d17eb	[llvm] Use range-based for loops (NFC) Identified with modernize-loop-convert.	2022-09-03 23:27:25 -07:00
Fangrui Song	9fc679b87c	[SanitizerCoverage] Simplify pc-table and improve test. NFC	2022-09-03 14:29:21 -07:00
Kazu Hirata	9eca5ed790	[llvm] Use std::enable_if_t (NFC)	2022-09-03 11:17:44 -07:00
Kazu Hirata	fedc59734a	[llvm] Use range-based for loops (NFC)	2022-09-03 11:17:40 -07:00
Sanjay Patel	22e1f66f26	[SCCP] add helper function for replacing signed operations; NFC Preliminary refactoring for planned enhancement in D133198.	2022-09-03 10:30:10 -04:00
Sanjay Patel	5c759edc57	[InstCombine] reduce another or-xor bitwise logic pattern ~(A & ?) | (A ^ B) --> ~((A & ?) & B) https://alive2.llvm.org/ce/z/mxex6V This is similar to `9d218b61cc` where we peeked through another logic op to find a common operand.	2022-09-03 09:32:08 -04:00
Richard Smith	053841c562	Revert "[AggressiveInstCombine] Lower Table Based CTTZ" This reverts commit `fec01ee3f5`. According to asan, this patch introduces a heap use after free.	2022-09-02 16:19:09 -07:00
Francis Visoiu Mistrih	c5b10f348e	[Matrix] Use print instead of dump for matrix-print-after-transpose-opt We should be able to use this option even if LLVM_ENABLE_DUMP is not on. (should fix the bots too)	2022-09-02 16:12:21 -07:00
Francis Visoiu Mistrih	81bdb4068d	[Matrix] Simplify matmuls with scalars If one of the operands is a transposed splat, the transpose can be removed. This is useful to simplify when transposes are distributed to operands of a matmul: * k^T -> k * (A * k)^t -> A^t * k Differential Revision: https://reviews.llvm.org/D130177	2022-09-02 15:50:25 -07:00
Sameer Sahasrabuddhe	46b293cb3f	[Attributor] Simplify offset calculation for a constant GEP Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132931	2022-09-02 23:53:51 +05:30
Arthur Eubanks	57fd866551	[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests The current code is basically just emulating what the analysis manager does. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D132581	2022-09-02 10:55:53 -07:00
Djordje Todorovic	fec01ee3f5	[AggressiveInstCombine] Lower Table Based CTTZ This patch introduces recognition of table-based ctz implementation during the AggressiveInstCombine. This fixes the [0]. [0] https://bugs.llvm.org/show_bug.cgi?id=46434 Differential Revision: https://reviews.llvm.org/D113291	2022-09-02 17:26:55 +02:00
Jolanta Jensen	958abe864a	[LoopLoadElim] Add stores with matching sizes as load-store candidates We are not building up a proper list of load-store candidates because we are throwing away stores where the type don't match the load. This patch adds stores with matching store sizes as candidates. Author of the original patch: David Sherwood. Differential Revision: https://reviews.llvm.org/D130233	2022-09-02 13:11:25 +01:00
Muhammad Omair Javaid	18de7c6a3b	Revert "[InstCombine] Treat passing undef to noundef params as UB" This reverts commit `c911befaec`. It has broken LLDB Arm/AArch64 Linux buildbots. I don't really understand the underlying reason. Reverting for now to make the buildbots green. https://reviews.llvm.org/D133036	2022-09-02 16:09:50 +05:00
Mikael Holmen	51d4c7ceea	[GlobalOpt] Fix debug variance problem in hasOnlyColdCalls hasOnlyColdCalls skipped over calls to intrinsics, but it did so after checking the linkage of the called function. This meant that the presence of a call to a debug intrinsic could affect the outcome of the optimization. In my original reproducer (for an out of tree target) it was particularly interesting, because the actual IR after GlobalOpt was not different with debug intrinsics present, so -print-after-all printouts didn't show anything there. However, without debuginfo, GlobalOpt went further and ran BlockFrequencyAnalysis and (more importantly) LoopAnalysis, and later on in the pipeline, instcombine behaved in different ways when LoopInfo was present. So a call to a dbg.declare prevented running LoopAnalysis in GlobalOpt, which later prevented InstCombine from doing an optimization. The dbg-intrinsic-loopanalysis.ll testcase tries to expose this. Then I also noted that adding a dbg.declare actually made the existing testcase colccc_coldsites.ll generate different code, so I modified that to now test it behaves the same way with and without the dbg.declare. Reviewed By: nikic, fhahn Differential Revision: https://reviews.llvm.org/D133193	2022-09-02 12:29:44 +02:00
Sergey Kachkov	be37caca00	[JumpThreading] Process range comparisons with non-local cmp instructions Use getPredicateOnEdge method if value is a non-local compare-with-a-constant instruction, that can give more precise results than getConstantOnEdge. Differential Revision: https://reviews.llvm.org/D131956	2022-09-02 12:22:45 +02:00
Nikita Popov	c453e5b901	Revert "[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI" This reverts commit `cd8f3e7581`. As pointed out by Eli on the review, this is missing an alignment check. The value might be written at an offset.	2022-09-02 09:28:48 +02:00
Nikita Popov	639d912282	[LICM] Allow load-only scalar promotion in the presence of unwinding Currently, we bail out of scalar promotion if the loop may unwind and the memory may be visible on unwind. This is because we can't insert stores of the promoted value on unwind edges. However, nowadays scalar promotion also has support for only promoting loads, while leaving stores in place. This kind of promotion is safe even in the presence of unwinding. Differential Revision: https://reviews.llvm.org/D133111	2022-09-02 09:27:13 +02:00
luxufan	cd8f3e7581	[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI For noop store of the form of LoadI and StoreI, An invariant should be kept is that the memory state of the related MemoryLoc before LoadI is the same as before StoreI. For this example: ``` define void @pr49927(i32* %q, i32* %p) { %v = load i32, i32* %p, align 4 store i32 %v, i32* %q, align 4 store i32 %v, i32* %p, align 4 ret void } ``` Here the definition of the store's destination is different with the definition of the load's destination, which it seems that the invariant mentioned above is broken. But the definition of the store's destination would write a value that is LoadI, actually, the invariant is still kept. So we can safely ignore it. Differential Revision: https://reviews.llvm.org/D132657	2022-09-02 06:37:41 +00:00
Vitaly Buka	ad3a77df2d	[msan] Fix debug info with getNextNode When we want to add instrumentation after an instruction, instrumentation still should keep debug info of the instruction. Reviewed By: kda, kstoimenov Differential Revision: https://reviews.llvm.org/D133091	2022-09-01 20:13:56 -07:00
Chenbing Zheng	d30cf77cb1	[InstCombine] complete fold extractvalue (any_mul_with_overflow X, -1) When we do extractvalue (any_mul_with_overflow X, -1) --> (-X and icmp), the fold partly failed to match a vector constant with a poison element. This patch tries to fix it. Alive2: https://alive2.llvm.org/ce/z/2rGp_3 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D132996	2022-09-02 10:58:42 +08:00
Vitaly Buka	ad2b356f85	[msan] Use no-origin functions when possible Saves 1.8% of .text size on CTMark Reviewed By: kda Differential Revision: https://reviews.llvm.org/D133077	2022-09-01 19:18:38 -07:00
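
The fold in commit 09cdddea0c, x + (x | -x) to x & (x - 1), can be sanity-checked outside the compiler. The following is a minimal sketch, not code from the patch: it assumes wrapping 16-bit unsigned arithmetic as a stand-in for an LLVM integer type, and the names `addOrNeg`, `clearLowestSetBit`, `foldHoldsForAllU16`, and `checkFold` are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: x + (x | -x).
// The work is done modulo 2^32 and truncated; the low 16 bits match what
// wrapping 16-bit arithmetic would produce.
constexpr std::uint16_t addOrNeg(std::uint16_t x) {
  const std::uint32_t wide = x;
  const std::uint32_t neg = 0u - wide; // two's-complement negation
  return static_cast<std::uint16_t>(wide + (wide | neg));
}

// Right-hand side of the fold: x & (x - 1), which clears the lowest set bit.
constexpr std::uint16_t clearLowestSetBit(std::uint16_t x) {
  const std::uint32_t wide = x;
  return static_cast<std::uint16_t>(wide & (wide - 1u));
}

// Returns true if both sides agree for every 16-bit input.
inline bool foldHoldsForAllU16() {
  for (std::uint32_t v = 0; v <= 0xFFFFu; ++v) {
    const auto x = static_cast<std::uint16_t>(v);
    if (addOrNeg(x) != clearLowestSetBit(x))
      return false;
  }
  return true;
}

// Convenience wrapper for callers that want a hard failure on mismatch.
inline void checkFold() { assert(foldHoldsForAllU16()); }
```

The intuition is that x | -x sets every bit from the lowest set bit of x upward, so it equals the negated lowest set bit; adding it to x subtracts that bit, which is what x & (x - 1) computes.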

31623 Commits
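
The two truncated-left-shift folds listed above (d9e1f9d759 and its follow-up 444f08c832) can be checked by brute force for one concrete pair of widths. This is a rough sketch under stated assumptions, not the InstCombine implementation: the shift is modeled in 32 bits and truncated to 8 bits (N = 8), shift amounts are restricted to values below 32, and the constant in the follow-up is read as a power of two 2^C with C < N. The helper names `truncShl1`, `zeroCompareFoldHolds`, `powerOfTwoCompareFoldHolds`, and `checkFolds` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc (1 << Y) to i8` with the shift performed in 32 bits.
// Callers must keep y below 32, mirroring the IR rule that an oversized
// shift amount does not produce a usable value.
constexpr std::uint8_t truncShl1(unsigned y) {
  return static_cast<std::uint8_t>(std::uint32_t{1} << y);
}

// d9e1f9d759: (trunc (1 << Y) to iN) == 0  -->  Y u>= N, here with N = 8.
inline bool zeroCompareFoldHolds() {
  for (unsigned y = 0; y < 32; ++y)
    if ((truncShl1(y) == 0) != (y >= 8))
      return false;
  return true;
}

// 444f08c832 (assumed reading): (trunc (1 << Y) to iN) == 2^C  -->  Y == C,
// checked for every C below the narrow width.
inline bool powerOfTwoCompareFoldHolds() {
  for (unsigned c = 0; c < 8; ++c)
    for (unsigned y = 0; y < 32; ++y)
      if ((truncShl1(y) == (1u << c)) != (y == c))
        return false;
  return true;
}

// Convenience wrapper for callers that want a hard failure on mismatch.
inline void checkFolds() {
  assert(zeroCompareFoldHolds());
  assert(powerOfTwoCompareFoldHolds());
}
```

Both checks rest on the same observation: the single set bit of 1 << Y survives truncation only when Y is below the narrow width, and then it sits at bit position Y.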