llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	9604601c93	[SimplifyCFG] Remove redundant checks for hoisting (NFCI) These conditions are later checked in the HoistTerminator code path. Checking them here is somewhat confusing, because this code only checks the first instruction in the block, which is not necessarily the terminator.	2022-07-04 10:53:54 +02:00
Florian Hahn	b4694229aa	[LV] Simplify setDebugLocFromInst by using early exit (NFC). Suggested as separate improvement in D128657.	2022-07-04 09:25:26 +01:00
Sanjay Patel	f9f40aa10d	[InstCombine] fold negated low-bit-mask to cmp+select (-(X & 1)) & Y --> (X & 1) == 0 ? 0 : Y https://alive2.llvm.org/ce/z/rhpH3i This is noted as a missing IR canonicalization in issue #55618. We already managed to fix codegen to the expected form.	2022-07-03 12:25:26 -04:00
Nuno Lopes	53dc0f1078	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-07-03 14:34:03 +01:00
Nuno Lopes	022bd92c78	[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]	2022-07-03 12:32:19 +01:00
Florian Hahn	b0da3c6fa4	[VPlan] Move setDebugLocFromInst to VPTransformState (NFC). The moved helpers are only used for codegen. It will allow moving the remaining ::execute implementations out of LoopVectorize.cpp. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D128657	2022-07-02 15:18:17 +01:00
Johannes Doerfert	07766f4070	[Attributor] Move heap2stack allocas to the entry block if possible If we are certainly not in a loop we can directly emit the heap2stack allocas in the function entry block. This will help to get rid of them (SROA) and avoid stacksave/restore intrinsics when the function is inlined.	2022-07-01 21:34:12 -05:00
Nuno Lopes	7c4f45f87a	Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] This reverts commits `47e6f98f84` and `3e701bcd2a`	2022-07-01 23:53:41 +01:00
Nuno Lopes	47e6f98f84	[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]	2022-07-01 23:31:31 +01:00
Sanjay Patel	9c8a39c67b	[InstCombine] restrict select of bit-tests to constant shift amounts This transform is responsible for a long-standing miscompile as discussed in issue #47012 (was bugzilla #47668). There was a proposal to correct it in D88432, but that was abandoned and there hasn't been any recent activity to fix it AFAICT. The original patch D45108 started with a constant-shift-only restriction and only expanded during review, so I don't think there's much risk of perf regression on the motivating code.	2022-07-01 16:24:34 -04:00
Martin Sebor	0d68ff87d2	[InstCombine] Transform strrchr to memrchr for constant strings Add an emitter for the memrchr common extension and simplify the strrchr call handler to use it. This enables transforming calls with the empty string to the test C ? S : 0. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128954	2022-07-01 11:10:00 -06:00
Nikita Popov	65d59b4265	[LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266) LoopSimplify only requires that the loop predecessor has a single successor and is safe to hoist into -- it doesn't necessarily have to be an unconditional BranchInst. Adjust LoopDeletion to assert conditions closer to what it actually needs for correctness, namely a single successor and a side-effect-free terminator (as the terminator is getting dropped). Fixes https://github.com/llvm/llvm-project/issues/56266.	2022-07-01 16:13:35 +02:00
Florian Hahn	0dddf04cab	[LV] Don't optimize exit cond during epilogue vectorization. At the moment, the same VPlan can be used for code generation of both the main vector and epilogue vector loop. This can lead to wrong results if the plan is optimized based on the VF of the main vector loop and then re-used for the epilogue loop. One example where this is problematic is if the scalar loops need to execute at least one iteration, e.g. due to interleave groups. To prevent mis-compiles in the short-term, disable optimizing exit conditions for VPlans when using epilogue vectorization. The proper fix is to avoid re-using the same plan for both loops, which will require support for cloning plans first. Fixes #56319.	2022-07-01 13:48:38 +01:00
Nikita Popov	fabe915705	[SimplifyLibCalls] Use inbounds GEP When converting strchr(p, '\0') to p + strlen(p) we know that strlen() must return an offset that is inbounds of the allocated object (otherwise it would be UB), so we can use an inbounds GEP. An equivalent argument can be made for the other cases.	2022-07-01 14:31:44 +02:00
Sanjay Patel	ab372cdd6f	[InstCombine] add code comment for icmp transform; NFC This was accidentally left out of `cc88445a91`	2022-07-01 08:21:55 -04:00
Florian Hahn	583abd0e36	[VPlan] Move addMetadata to VPTransformState (NFC). The moved helpers are only used for codegen. It will allow moving the remaining ::execute implementations out of LoopVectorize.cpp. Depends on D127966. Depends on D127965. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D127968	2022-07-01 12:03:25 +01:00
Nikita Popov	9b994593cc	[SCCP] Only handle unknown lattice values in resolvedUndefsIn() This is a minor refinement of resolvedUndefsIn(), mostly for clarity. If the value of an instruction is undef, then that's already a legal final result -- we can safely rauw such an instruction with undef. We only need to mark unknown values as overdefined, as that's the result we get for an instruction that has not been processed because it has an undef operand. Differential Revision: https://reviews.llvm.org/D128251	2022-07-01 09:14:37 +02:00
Chen Zheng	39fe49aa57	[Inline] don't add noalias metadata for unknown objects. The unidentified objects recognized in `getUnderlyingObjects` may still alias to the noalias parameter because `getUnderlyingObjects` may not check deep enough to get the underlying object because of `MaxLookup`. The real underlying object for the unidentified object may still be the noalias parameter. Originally Patched By: tingwang Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D127202	2022-07-01 02:16:55 -04:00
Alexey Bataev	4be3fc35aa	[SLP][NFC] Clean up operands of the removed insertelements, NFC. Replace all operands of the insertelement instructions that were replaced by shuffles with poison, to avoid false-positive reports about an incorrect function.	2022-06-30 17:51:43 -07:00
Nuno Lopes	373571dbb4	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-06-30 23:01:43 +01:00
William Huang	a9119143a2	[InstCombine] Changing constant-indexed GEP of GEP to i8* for merging When merging GEP of GEP with constant indices, if the second GEP's offset is not divisible by the first GEP's element size, convert both types to i8* and merge. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D125934	2022-06-30 21:26:11 +00:00
Nuno Lopes	0586d1cac2	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-06-30 21:47:31 +01:00
Craig Topper	e633f8cd14	[InstCombine] Fix a Wparentheses warning in an assert. NFC	2022-06-30 13:03:32 -07:00
Sanjay Patel	cc88445a91	[InstCombine] canonicalize 'icmp (trunc X), C' to 'icmp (X & Mask), C' I looked at canonicalizing in the other direction, but that causes many potential regressions and infinite loops because we already (possibly wrongly) canonicalize "trunc X to i1" into an and+icmp. This has a data layout restriction to avoid creating illegal mask instructions, but we could remove that if we can show that the backend can undo this when needed. The motivating example from issue #56119 is modeled by the PhaseOrdering test.	2022-06-30 15:51:39 -04:00
Martin Sebor	3a743a5892	[InstCombine] Fix memrchr logic error that prevents folding Correct a logic bug in the memrchr enhancement added in D123629 that makes it ineffective in a subset of cases. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128856	2022-06-30 11:35:26 -06:00
Nikita Popov	f34dcf2763	[IRBuilder] Migrate all binops to folding API Migrate all binops to use FoldXYZ rather than CreateXYZ APIs, which are compatible with InstSimplifyFolder and fallible constant folding. Rather than continuing to add one method for every single operator, add a generic FoldBinOp (plus variants for nowrap, exact and fmf operators), which we would need anyway for CreateBinaryOp. This change is not NFC because IRBuilder with InstSimplifyFolder may perform more folding. However, this patch changes SCEVExpander to not use the folder in InsertBinOp to minimize practical impact and keep this change as close to NFC as possible.	2022-06-30 16:41:17 +02:00
Nikita Popov	588e229bf9	[VNCoercion] Separate constant/non-constant mem intrinsic implementations (NFCI) This means we no longer need to have the same API between IRBuilder and IRBuilderFolder. The constant case is substantially simpler, so implementing it separately isn't an undue burden.	2022-06-30 15:26:06 +02:00
Nikita Popov	014c4bdb9d	[VNCoercion] Use ConstantFoldLoadFromConst API (NFCI) Nowadays we have a generic constant folding API to load a type from an offset. It should be able to do anything that VNCoercion can do. This avoids the weird templating between IRBuilder and ConstantFolder in one function, which will stop working as the IRBuilderFolder moves from CreateXYZ to FoldXYZ APIs. Unfortunately, this doesn't eliminate this pattern from VNCoercion entirely yet.	2022-06-30 14:52:27 +02:00
Florian Hahn	68884dde70	[LV] Move LoopVersioning creation to LVP::execute. At the moment LoopVersioning is only created for inner-loop vectorization. This patch moves it to LVP::execute, which means it will also be added for epilogue vectorization. As a consequence, the proper noalias metadata is now also added to epilogue vector loops. LVer will be moved to VPTransformState as follow-up. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D127966	2022-06-30 12:14:32 +01:00
Sanjay Patel	7c4b90a98d	[InstCombine] fix overzealous assert in icmp-shr fold The assert was added with `0399473de8` and is correct for that pattern, but it is off-by-1 with the enhancement in `d4f39d8333`. The transforms are still correct with the new pre-condition: https://alive2.llvm.org/ce/z/6_6ghm https://alive2.llvm.org/ce/z/_GTBUt And as shown in the new test, the transform is expected with 'ult' - in that case, the icmp reduces to test if the shift amount is 0.	2022-06-30 06:28:48 -04:00
Nikita Popov	1579fc62fe	[Evaluator] Add missing LLVM_DEBUG() Missed these in `41f0b6a781`, resulting in unconditional debug output.	2022-06-30 11:54:47 +02:00
Chen Zheng	b05801de35	[InlineFunction] Only check pointer arguments for a call Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128529	2022-06-30 05:39:47 -04:00
Nikita Popov	41f0b6a781	[Evaluator] Use ConstantFoldInstOperands() For instructions that don't need any special handling, use ConstantFoldInstOperands(), rather than re-implementing individual cases. This is probably not NFC because it can handle cases the previous code missed (e.g. vector operations).	2022-06-30 11:10:17 +02:00
Nikita Popov	a6d4b4138f	[ConstantFold] Supports compares in ConstantFoldInstOperands() Support compares in ConstantFoldInstOperands(), instead of forcing the use of ConstantFoldCompareInstOperands(). Also handle insertvalue (extractvalue was already handled). This removes a footgun, where many uses of ConstantFoldInstOperands() need a separate check for compares beforehand. It's particularly insidious if called on a constant expression, because it doesn't fail in that case, but will just not do DL-dependent folding.	2022-06-30 11:05:24 +02:00
Florian Hahn	24b5f8e0d0	[VPlan] Make sure optimizeInductions removes wide ind from scalar plan. In some cases, there may be widened users of inductions even though the plan includes the scalar VF. In those cases, make sure we still replace the VPWidenIntOrFpInductionRecipe with scalar steps, as otherwise we may try to execute a VPWidenIntOrFpInductionRecipe with a scalar VF. Alternatively the patch could also split the range if needed. This fixes a crash exposed by D123720. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D128755	2022-06-30 09:11:48 +01:00
Nikita Popov	10c531cd5b	[SCCP] Simplify CFG in SCCP as well Currently, we only remove dead blocks and non-feasible edges in IPSCCP, but not in SCCP. I'm not aware of any strong reason for that difference, so this patch updates SCCP to perform the CFG cleanup as well. Compile-time impact seems to be pretty minimal, in the 0.05% geomean range on CTMark. For the test case from https://reviews.llvm.org/D126962#3611579 the result after -sccp now looks like this: define void @test(i1 %c) { entry: br i1 %c, label %unreachable, label %next next: unreachable unreachable: call void @bar() unreachable } -jump-threading does nothing on this, but -simplifycfg will produce the optimal result. Differential Revision: https://reviews.llvm.org/D128796	2022-06-30 09:25:03 +02:00
Chuanqi Xu	0b5ead6590	[WebAssembly] Don't set musttail for coroutines when tail-call is not enabled C++20 coroutines couldn't be compiled to WebAssembly because an optimization named symmetric transfer requires support for musttail calls, which WebAssembly doesn't support yet. This patch tries to fix the problem by adding a supportsTailCalls method to TargetTransformImpl to skip the symmetric transfer when the tail-call feature is not supported. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D128794	2022-06-30 11:15:40 +08:00
zhongyunde	404479b4b0	[InstCombine] Use known bits to determine exact int->fp cast Reviewed By: spatel, nikic Differential Revision: https://reviews.llvm.org/D127854	2022-06-30 09:45:11 +08:00
Florian Hahn	6d5f814357	[LoopUnrollRuntime] Invalidate SCEV for exit phi in ConnectProlog. ConnectProlog adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Fix is analogous to `cfc741bc0e`. Fixes #56286.	2022-06-29 20:28:43 +01:00
Florian Hahn	9a35f19e3e	[UnrollRuntime] Invalidate SCEVs for modified phis in ConnectEpilog. ConnectEpilog adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Fix is analogous to `cfc741bc0e`. Fixes #56282.	2022-06-29 18:26:00 +01:00
Sanjay Patel	d4f39d8333	[InstCombine] add fold for (ShiftC >> X) >u C This is the 'ugt' sibling to: `0399473de8` Decrement the input compare constant (and implicitly decrement the new compare constant): https://alive2.llvm.org/ce/z/iELmct	2022-06-29 12:30:01 -04:00
Nikita Popov	bdba8278d9	[VectorCombine] Avoid ConstantExpr::get() (NFC) Use IRBuilder APIs instead, which will still constant fold.	2022-06-29 17:17:52 +02:00
Nikita Popov	2124b2f0e6	[JumpThreading] Avoid ConstantExpr::get() (NFCI) This code requires the result to be an UndefValue/ConstantInt anyway (checked by getKnownConstant), so we are only interested in the case where this folds.	2022-06-29 16:43:05 +02:00
Nikita Popov	df698a5762	[InstCombine] Avoid some calls to ConstantExpr::get() (NFCI) Replace some calls to ConstantExpr::get() with IRBuilder APIs (which will also constant fold if possible).	2022-06-29 16:26:02 +02:00
Nikita Popov	0af53fcb99	[SROA] Don't create constant expressions (NFC) Use IRBuilder instead, which will fold these. Just to clarify that this does not actually create any udiv expression.	2022-06-29 11:51:22 +02:00
Pavel Samolysov	3d9ce9e43d	[ArgPromotion] Remove all the getters and ReplaceCallSite (NFC) AARGetter is an abstraction over a source of the `AAResults` introduced to support the legacy pass manager as well as the modern one. Since the Argument Promotion pass doesn't support the legacy pass manager anymore, the abstraction is not required and `AAResults` may be used directly. The instance of the `FunctionAnalysisManager` is passed through the functions to get all the required analyses just wherever they are required and do not use the awkward getter callbacks. The `ReplaceCallSite` parameter was required for the legacy pass manager only and isn't used anymore, so the parameter has been eliminated. Differential Revision: https://reviews.llvm.org/D128727	2022-06-29 10:45:11 +03:00
Pavel Samolysov	8958057fb1	[ArgPromotion] Move isDenselyPacked static member (NFC) The `isDenselyPacked` static member of the `ArgumentPromotionPass` class is not used in the class itself anymore. The single known user of the function is in the `AttributorAttributes.cpp` file, so the function has been moved into the file. Differential Revision: https://reviews.llvm.org/D128725	2022-06-29 10:45:10 +03:00
Martin Sebor	8827679826	[InstCombine] Fold strncmp of constant arrays and variable size Extend the solution accepted in D127766 to strncmp and simplify strncmp(A, B, N) calls with constant A and B and variable N to the equivalent of N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0) where Pos is the offset of either the first mismatch between A and B or the terminating null character if both A and B are equal strings. Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D128089	2022-06-28 15:59:14 -06:00
Martin Sebor	e263a7670e	[InstCombine] Look through more casts when folding memchr and memcmp Enhance getConstantDataArrayInfo to let the memchr and memcmp library call folders look through arbitrarily long sequences of bitcast and GEP instructions. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128364	2022-06-28 15:58:42 -06:00
Alexey Bataev	bf4dcbd2df	[SLP]Fix PR56251: Do not remove the reordering from the root node, being used as an operand. If the root order itself does not require reordering, we can just remove its reorder mask safely (e.g., if the root node is a vector of phis). But if this node is used as an operand in the graph, we cannot delete the reordering and need to keep it. Otherwise the graph nodes are not synchronized with the operands, which may cause extra gather instruction(s) or a compiler crash. Also, we need to be very careful when selecting the gather nodes for reordering, since there might be several gather nodes with the same scalars and we could try to reorder the same node many times instead of different nodes. Differential Revision: https://reviews.llvm.org/D128680	2022-06-28 13:42:05 -07:00
Leonard Chan	9553d69580	[NFC][HWASan] Refactor hwasan pass This moves some code for getting PC and SP into their own functions. Since SP is also retrieved in the prologue and getting the stack tag, we can cache the SP if we get it once in the prologue. This caching will really only be relevant in D128387 where StackBaseTag may not be set in the prologue if __hwasan_tls is not used. Differential Revision: https://reviews.llvm.org/D128551	2022-06-28 12:09:20 -07:00
Pavel Samolysov	170c4d21bd	[ArgPromotion] Unify byval promotion with non-byval It makes sense to handle byval promotion in the same way as non-byval but also allowing `store` instructions. However, these should use the same checks as the `load` instructions do, i.e. be part of the `ArgsToPromote` collection. For these instructions, the check for interfering modifications can be disabled, though. The promotion algorithm itself has been modified a lot: all the accesses (i.e. loads and stores) are rewritten to the emitted `alloca` instructions. To optimize these new `alloca`s out, the `PromoteMemToReg` function from `Transforms/Utils/PromoteMemoryToRegister.cpp` file is invoked after promotion. In order to let the `PromoteMemToReg` promote as many `alloca`s as it is possible, there should be no `GEP`s from the `alloca`s. To eliminate the `GEP`s, its own `alloca` is generated for every argument part because a single `alloca` for the whole argument (that significantly simplifies the code of the pass though) unfortunately cannot be used. The idea comes from the following discussion: https://reviews.llvm.org/D124514#3479676 Differential Revision: https://reviews.llvm.org/D125485	2022-06-28 15:19:58 +03:00
Mikhail Goncharov	c6c124ca80	Fixed unused variable warning.	2022-06-28 11:44:16 +02:00
Florian Hahn	03975b7f0e	[VPlan] Move recipe implementations to separate file (NFC). This patch moves the code for recipe implementations to a separate file. The benefits are: * Keep VPlan.cpp smaller => faster compile-time during parallel builds. * Keep code for logical units together. As a follow-up I am also planning on moving all ::execute implementations from LoopVectorize.cpp over to the new file, which should help to reduce the size of the file a bit. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D127965	2022-06-28 10:34:30 +01:00
Nikita Popov	5548e807b5	[IR] Remove support for extractvalue constant expression This removes the extractvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. extractvalue is already not supported in bitcode, so we do not need to worry about bitcode auto-upgrade. Uses of ConstantExpr::getExtractValue() should be replaced with IRBuilder::CreateExtractValue() (if the fact that the result is constant is not important) or ConstantFoldExtractValueInstruction() (if it is). Though for this particular case, it is also possible and usually preferable to use getAggregateElement() instead. The C API function LLVMConstExtractValue() is removed, as the underlying constant expression no longer exists. Instead, LLVMBuildExtractValue() should be used (which will constant fold or create an instruction). Depending on the use-case, LLVMGetAggregateElement() may also be used instead. Differential Revision: https://reviews.llvm.org/D125795	2022-06-28 10:40:17 +02:00
Guillaume Chatelet	3c126d5fe4	[Alignment] Replace commonAlignment with std::min `commonAlignment` is a shortcut to pick the smallest of two `Align` objects. As-is it doesn't bring much value compared to `std::min`. Differential Revision: https://reviews.llvm.org/D128345	2022-06-28 07:15:02 +00:00
wlei	7e86b13c63	[CSSPGO][llvm-profgen] Reimplement SampleContextTracker using context trie This is the follow-up patch to https://reviews.llvm.org/D125246 for the `SampleContextTracker` part. Previously, the promotion and merging of contexts was based on the SampleContext (the array of frames), which cost a lot of memory. This patch detaches the tracker from the array ref and uses the context trie itself instead. This can save a lot of memory and benefits both the compiler's CS inliner and llvm-profgen's pre-inliner. One structure that needs special treatment is `FuncToCtxtProfiles`, which is used to get all the functionSamples for one function to do the merging and promoting. Previously, it searched each function's context and traversed the trie to get the node of the context. Now we don't have the context inside the profile; instead we directly use an auxiliary map, `ProfileToNodeMap`, which is initialized with the FunctionSamples to TrieNode relations and kept up to date during promoting and merging of nodes. Moreover, I was expecting the results before and after to remain the same, but I found that the order of FuncToCtxtProfiles matters and affects the results. This can happen in the recursive context case, but the difference should be small. Now we don't have the context, so I just used a vector for the order; the result is still deterministic. Measured on one huge (12GB) profile from one of our internal services. The profile similarity difference is 99.999%, and the running time is improved by 3X(debug mode) and the memory is reduced from 170GB to 90GB. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D127031	2022-06-27 23:22:21 -07:00
wlei	aa58b7b1e3	[CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie Follow-up patch to https://reviews.llvm.org/D125246, support `computeSummaryAndThreshold` based on context trie. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D127026	2022-06-27 23:22:21 -07:00
Congzhe Cao	b941857b40	[LoopInterchange] New cost model for loop interchange This is another attempt to land this patch. The patch proposes a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be placed as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query, which means that we only need to query it once during the entire loop interchange pass. This is better than the current cost model, where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is to ensure that loop cache analysis receives a valid cache line size for correct analysis. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as a fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model completely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-28 00:08:37 -04:00
Vitaly Buka	6824eee942	[asan] Add missing dependency on Demangle Follow up to D127911.	2022-06-27 15:10:02 -07:00
Mitch Phillips	dacfa24f75	Delete 'llvm.asan.globals' for global metadata. Now that we have the sanitizer metadata that is actually on the global variable, and now that we use debuginfo in order to do symbolization of globals, we can delete the 'llvm.asan.globals' IR synthesis. This patch deletes the 'location' part of the __asan_global that's embedded in the binary as well, because it's unnecessary. This saves about 1.7% of the optimised non-debug with-asserts clang binary. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D127911	2022-06-27 14:40:40 -07:00
Philip Reames	20dd3297b1	[LV] Allow scalable vectorization with vscale = 1 This change is a bit subtle. If we have a type like <vscale x 1 x i64>, the vectorizer will currently reject vectorization. The reason is that a type like <1 x i64> is likely to get simply rescalarized, and the vectorizer doesn't want to be in the game of simple unrolling. (I've given the example in terms of 1 x types which use a single register, but the same issue exists for any N x types which use N registers. e.g. RISCV LMULs.) This change distinguishes scalable types from fixed types under the reasoning that converting to a scalable type isn't unrolling. Because the actual vscale isn't known until runtime, using a vscale type is potentially very profitable. This makes an important, but unchecked, assumption. Specifically, the scalable type is assumed to only be legal per the cost model if there's actually a scalable register class which is distinct from the scalar domain. This is, to my knowledge, true for all targets which return non-invalid costs for scalable vector ops today, but in theory, we could have a target decide to lower scalable to fixed length vector or even scalar registers. If that ever happens, we'd need to revisit this code. In practice, this patch unblocks scalable vectorization for ELEN types on RISCV. Let me sketch one alternate implementation I considered. We could have restricted this to when we know a minimum value for vscale. Specifically, for the default +v extension for RISCV, we actually know that vscale >= 2 for ELEN types. However, doing it this way means we can't generate scalable vectors when using the various embedded vector extensions which have a minimum vscale of 1. Differential Revision: https://reviews.llvm.org/D128542	2022-06-27 13:38:57 -07:00
Yuanfang Chen	e2e9e708e5	[Coroutine] Remove the '!func_sanitize' metadata for split functions There is no proper RTTI for these split functions. So just delete the metadata. Fixes https://github.com/llvm/llvm-project/issues/49689. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D116130	2022-06-27 12:09:13 -07:00
Yuanfang Chen	6678f8e505	[ubsan] Using metadata instead of prologue data for function sanitizer Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated, the self (global value) references inside `Prologue Data` still point to the original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`. This patch detaches the information from function `Prologue Data` and attaches it to a function metadata node. This and D116130 fix https://github.com/llvm/llvm-project/issues/49689. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D115844	2022-06-27 12:09:13 -07:00
Joseph Huber	c7243f21d3	[OpenMP] Only strip runtime attributes if needed Summary: Currently in OpenMPOpt we strip `noinline` attributes from runtime functions. This is here because the device bitcode library that we link has problems with needed definitions getting prematurely optimized out. This is only necessary for OpenMP offloading to GPUs so we should narrow the scope for where we spend time doing this. In the future this shouldn't be necessary as we move to using a linked library rather than pulling in a bitcode library in Clang.	2022-06-27 13:35:41 -04:00
Nikita Popov	f65c88c42f	[GlobalOpt] Fix memset handling in global ctor evaluation (PR55859) The global ctor evaluator currently handles memset by checking whether the memset memory is already zero, and skips it in that case. However, it only actually checks the first byte of the memory being set. This patch extends the code to check all bytes being set. This is done byte-by-byte to avoid converting undef values to zeros in larger reads. However, the handling is still not completely correct, because there might still be padding bytes (though this probably doesn't matter much, as I'd expect global variable padding to be zero-initialized in practice). Mostly fixes https://github.com/llvm/llvm-project/issues/55859. Differential Revision: https://reviews.llvm.org/D128532	2022-06-27 16:50:49 +02:00
Bradley Smith	a83aa33d1b	[IR] Move vector.insert/vector.extract out of experimental namespace These intrinsics are now fundamental for SVE code generation and have been present for a year and a half, hence move them out of the experimental namespace. Differential Revision: https://reviews.llvm.org/D127976	2022-06-27 10:48:45 +00:00
Nikita Popov	cde402778a	[FunctionAttrs] Add missing pass dependency This pass depends on AAResults. This fixes the ocaml IPO binding tests.	2022-06-27 10:15:06 +02:00
Nikita Popov	217e85761c	[ArgPromotion] Remove legacy PM support Support for the legacy pass manager in ArgPromotion causes complications in D125485. As the legacy pass manager for middle-end optimizations is unsupported, drop ArgPromotion from the legacy pipeline, rather than introducing additional complexity to deal with it. Differential Revision: https://reviews.llvm.org/D128536	2022-06-27 09:42:17 +02:00
Chuanqi Xu	24e53b01d5	Revert "[Coroutines] Only do symmetric transfer if optimization is on" This reverts commit `7782e080e8`. According to the discussion of WG21, symmetric transfer is a desired feature.	2022-06-27 10:54:56 +08:00
Kazu Hirata	d08f34b592	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-26 18:31:51 -07:00
Kazu Hirata	a81b64a1fb	[llvm] Use Optional::has_value instead of Optional::hasValue (NFC) This patch replaces x.hasValue() with x.has_value() where x is not contextually convertible to bool.	2022-06-26 16:10:42 -07:00
Nuno Lopes	6ef9a2ad01	[LICM] Use poison to replace unreachable values instead of undef [NFC]	2022-06-26 14:56:35 +01:00
Nuno Lopes	3fa2411dc5	[LoopSimplifyCFG] use poison when replacing dead instructions instead of undef [NFC]	2022-06-26 14:15:55 +01:00
Nuno Lopes	d46fa1fc58	[ArgumentPromotion] use poison when replacing dead instructions instead of undef [NFC]	2022-06-26 13:44:05 +01:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Pavel Samolysov	6e3d4712b9	[DeadArgElim] Replace insert with emplace (NFC)	2022-06-25 10:31:27 +03:00
Mitch Phillips	f57066401e	[HWASan] Use new IR attribute for communicating unsanitized globals. Globals that shouldn't be sanitized are currently communicated to HWASan through the use of the llvm.asan.globals IR metadata. Now that we have an on-GV attribute, use it. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D127543	2022-06-24 12:04:11 -07:00
Mingming Liu	e0d069598b	[Inline] Annotate inline pass name with link phase information for analysis. The annotation is flag gated; flag is turned off by default. Differential Revision: https://reviews.llvm.org/D125495	2022-06-24 10:06:43 -07:00
Alexey Bataev	2faacf61a5	[SLP]Improve shuffles cost estimation where possible. Improved/fixed cost modeling for shuffles by providing masks, improved cost model for non-identity insertelements. Differential Revision: https://reviews.llvm.org/D115462	2022-06-24 09:28:01 -07:00
Arthur Eubanks	e422c0d3b2	[GlobalOpt] Perform store->dominated load forwarding for stored once globals The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway. The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store. Compile time tracker: https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions Reviewed By: nikic, asbirlea, jdoerfert Differential Revision: https://reviews.llvm.org/D128128	2022-06-24 09:09:26 -07:00
Florian Hahn	cb69ba4faa	[LV] Create RT checks once VF/IC are selected, track scalar cost. This patch updates LV to generate runtime checks after the VF & IC are selected. It allows deciding whether to vectorize with runtime checks or not based on their cost compared to the vector loop. It also updates VectorizationFactor to include the scalar cost. Reviewed By: lebedev.ri, dmgreen Differential Revision: https://reviews.llvm.org/D75981	2022-06-24 17:42:11 +02:00
Nikita Popov	871197d0a3	[MemoryBuiltins] Accept any value in getInitialValueOfAllocation() (NFC) Drop the requirement that getInitialValueOfAllocation() must be passed an allocator function, shifting the responsibility for checking that into the function (which it does anyway). The motivation is to avoid some calls to isAllocationFn(), which has somewhat ill-defined semantics (given the number of allocator-related attributes we have floating around...) (For this function, all we eventually need is an allockind of zeroed or uninitialized.) Differential Revision: https://reviews.llvm.org/D127274	2022-06-24 16:08:07 +02:00
Nikita Popov	e523baa664	[InlineFunction] Slightly clarify noalias scope calculation (NFC) Rename CanDeriveViaCapture -> RequiresNoCaptureBefore, drop unnecessary const cast, reformat some code to avoid an ugly super-indented comment.	2022-06-24 12:31:46 +02:00
Florian Hahn	b18141a8f2	[VPlan] Set VFs included in plan before last set of VPTransforms (NFC). This allows VPlanTransforms to query the VFs included in the plan in the future.	2022-06-24 10:16:56 +02:00
Florian Hahn	92f87787b3	Recommit "[ConstraintElimination] Transfer info from ULT to signed system." This reverts commit `94ed2caf70`. The issue with non-determinism in the test has been fixed in `d9526e8a52`.	2022-06-24 09:27:14 +02:00
Evgenii Stepanov	878309cc54	Revert "[LoopInterchange] New cost model for loop interchange" llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed integer overflow: 6148914691236517209 * 100 cannot be represented in type 'long' https://lab.llvm.org/buildbot/#/builders/5/builds/25185 This reverts commit `1b24fe34b0`.	2022-06-23 16:10:53 -07:00
Congzhe Cao	1b24fe34b0	[LoopInterchange] New cost model for loop interchange This is the second attempt to land this patch. The patch proposes to use a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be placed as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. One change that applies to all tests is that we added an option `-cache-line-size=64` to the RUN lines. This is to ensure that loop cache analysis receives a valid cache line size for correct analysis. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model completely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-23 16:34:57 -04:00
Philip Reames	46ea4b5ea1	[LV] Avoid a crash when costing a uniform store which doesn't correspond to a legal scatter If we have an unaligned uniform store, then when costing a scalable VF we can't emit code to scalarize it. (Well, we could, but we haven't implemented that case.) This change replaces an assert with a cost-model bailout such that we reject vectorization with the scalable VF instead of crashing.	2022-06-23 12:41:09 -07:00
Alexey Bataev	3b6edef15d	[SLP]Fix a crash when reordering masked gather nodes with reused scalars. If the masked gather nodes must be reordered, we can just reorder scalars, just like for gather nodes. But if the node contains reused scalars, it must be handled the same way as a regular vectorizable node, since we need to reorder the reused mask, not the scalars directly. Differential Revision: https://reviews.llvm.org/D128360	2022-06-23 11:32:30 -07:00
Florian Hahn	d9526e8a52	[ConstraintElimination] Use stable_sort to sort worklist. If there are multiple constraints in the same block, at the moment the order they are processed may be different depending on the sort implementation. Use stable_sort to ensure consistent ordering.	2022-06-23 19:22:15 +02:00
Florian Hahn	94ed2caf70	Revert "[ConstraintElimination] Transfer info from ULT to signed system." This reverts commit `316e106f49`. This breaks a bot with expensive checks.	2022-06-23 17:27:33 +02:00
Florian Hahn	316e106f49	[ConstraintElimination] Transfer info from ULT to signed system. If A u< B holds, then A s>= 0 && A s< B holds if B s>= 0. https://alive2.llvm.org/ce/z/RrNxHh	2022-06-23 17:17:01 +02:00
Florian Hahn	9a33f3975e	[ConstraintElimination] Transfer info from SLT to unsigned system. If A s< B holds, then A u< also holds, if A s>= 0. https://alive2.llvm.org/ce/z/J4JZuN	2022-06-23 15:57:59 +02:00
chenglin.bi	30e49a3794	[InstCombine] Optimise shift+and+boolean conversion pattern to simple comparison if (`C1` is pow2) & (`(C2 & ~(C1-1)) + C1)` is pow2): ((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1)); https://alive2.llvm.org/ce/z/EJAl1R ((C1 << X) & C2) != 0 -> X < (Log2(C2+C1) - Log2(C1)); https://alive2.llvm.org/ce/z/3bVRVz And remove dead code. Fix: https://github.com/llvm/llvm-project/issues/56124 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D126591	2022-06-23 21:53:07 +08:00
Florian Hahn	569d84fe99	[VPlan] Remove dead recipes across whole plan. This extends removeDeadRecipe to remove recipes across the whole plan. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D127580	2022-06-23 13:36:02 +02:00
Florian Hahn	24a98881cd	[ConstraintElimination] Transfer info from SGT to unsigned system. If A >s B then A >=u 0, if B >=s -1. https://alive2.llvm.org/ce/z/cncGKi	2022-06-23 11:04:51 +02:00
Fangrui Song	1ffd2d99c2	Revert D115462 "[SLP]Improve shuffles cost estimation where possible." This reverts commit `cac60940b7`. Caused -Os -fsanitize=memory -march=haswell miscompile to pytorch/cpuinfo. See my latest comment (may update) on D115462.	2022-06-22 23:16:31 -07:00
Fangrui Song	a411bc11d6	Revert "[SLP]Fix a crash when insert subvector is out of range." This reverts commit `f1ee2738b3`. Revert due to the revert of a dependent commit `[SLP]Improve shuffles cost estimation where possible.`	2022-06-22 23:16:25 -07:00
Serguei Katkov	5e1ccdf960	[RS4GC] Handle freeze case for vector Finding BDV for vector value does not handle freeze instruction. Adding its handling as it is done for scalar case. Reviewed By: apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D128254	2022-06-23 11:58:41 +07:00
Mingming Liu	bc856eb3fc	[SampleProfile][Inline] Annotate sample profile inline remarks with link phase (prelink/postlink) information. Differential Revision: https://reviews.llvm.org/D126833	2022-06-22 17:00:53 -07:00
Florian Mayer	9320a32bb9	[MTE] [HWASan] Use LoopInfo for reachability queries. The reachability queries default to "reachable" after exploring too many basic blocks. LoopInfo helps it skip over the whole loop. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D127917	2022-06-22 15:28:49 -07:00
Adrian Tong	4e555a3df4	Fix a misspell. NFC	2022-06-22 21:23:21 +00:00
Brendon Cahoon	f1b05a0a2b	[StructurizeCFG] Improve basic block ordering StructurizeCFG linearizes the successors of a branching basic block by adding Flow blocks to record the true/false path for branches and back edges. This patch reduces the number of Phi values needed to capture the control flow path by improving the basic block ordering. Previously, StructurizeCFG added loop exit blocks outside of the loop. StructurizeCFG sets a boolean value to indicate the path taken, and all exit block live values extend to after the loop. For loops with a large number of exit blocks, this creates a huge number of values that are maintained, which increases compilation time and register pressure. This is a problem especially with ASAN, which adds early exits to blocks with unreachable instructions for each instrumented check in the loop. In specific cases, this patch reduces the number of values needed after the loop by moving the exit block into the loop. This is done for blocks that have a single predecessor and single successor by moving the block to appear just after the predecessor. Differential Revision: https://reviews.llvm.org/D123231	2022-06-22 16:10:41 -05:00
Brendon Cahoon	e13248ab0e	[UnifyLoopExits] Reduce number of guard blocks UnifyLoopExits creates a single exit, a control flow hub, for loops with multiple exits. There is an input to the block for each loop exiting block and an output from the block for each loop exit block. Multiple checks, or guard blocks, are needed to branch to the correct exit block. For large loops with lots of exit blocks, all the extra guard blocks cause problems for StructurizeCFG and subsequent passes. This patch reduces the number of guard blocks needed when the exit blocks branch to a common block (e.g., an unreachable block). The guard blocks are reduced by changing the inputs and outputs of the control flow hub. The inputs are the exit blocks and the outputs are the common block. Reducing the guard blocks enables StructurizeCFG to reorder the basic blocks in the CFG to reduce the values that exit a loop with multiple exits. This reduces the compile-time of StructurizeCFG and also reduces register pressure. Differential Revision: https://reviews.llvm.org/D123230	2022-06-22 15:44:23 -05:00
Evgenii Stepanov	5011b4ca0e	Revert "[Attributor] Ensure to use the proper liveness AA" Reason: memory leaks This reverts commit `083010312a`.	2022-06-22 13:40:45 -07:00
Florian Mayer	476ced4b89	[MTE] [HWASan] Support diamond lifetimes. We were overly conservative and required a ret statement to be dominated completely by a single lifetime.end marker. This is quite restrictive and leads to two problems: * limits coverage of use-after-scope, as we degenerate to use-after-return; * increases stack usage in programs, as we have to remove all lifetime markers if we degenerate to use-after-return, which prevents reuse of stack slots by the stack coloring algorithm. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D127905	2022-06-22 11:16:34 -07:00
Florian Mayer	acc9721e38	[NFC] [HWASan] Remove indirection for getting analyses. This was necessary for code reuse between the old and new passmanager. With the old pass-manager gone, this is no longer necessary. Reviewed By: eugenis, myhsu Differential Revision: https://reviews.llvm.org/D127913	2022-06-22 10:53:20 -07:00
Mingming Liu	67dc8021a1	[Support] Change TrackingStatistic and NoopStatistic to use uint64_t instead of unsigned. The binary size increase of `clang` is trivial; namely, the numerical value doesn't change when measured in MiB, and the `.data` section increases from 139Ki to 173Ki. Differential Revision: https://reviews.llvm.org/D128070	2022-06-22 10:11:40 -07:00
Max Kazantsev	cff4f04e2e	[LSR] Don't allow zero quotient as scale reg. PR56160 Scale reg should never be zero, so when the quotient is zero, we cannot assign it there. Limit this transform to avoid this situation. Differential Revision: https://reviews.llvm.org/D128339 Reviewed By: eopXD	2022-06-22 23:33:57 +07:00
Guillaume Chatelet	57ffff6db0	Revert "[NFC] Remove dead code" This reverts commit `8ba2cbff70`.	2022-06-22 14:55:47 +00:00
Guillaume Chatelet	8ba2cbff70	[NFC] Remove dead code	2022-06-22 13:33:58 +00:00
Florian Hahn	098b0b18a7	[ConstraintElimination] Transfer info from SGE to unsigned system. This patch adds a new transferToOtherSystem helper that tries to transfer information from signed predicates to the unsigned system and vice versa. The initial version adds A >=u B for A >=s B && B >=s 0 https://alive2.llvm.org/ce/z/8b6F9i	2022-06-22 15:27:59 +02:00
Nikita Popov	1f88d80408	[SCCP] Don't mark edges feasible when resolving undefs As branch on undef is immediate undefined behavior, there is no need to mark one of the edges as feasible. We can leave all the edges non-feasible. In IPSCCP, we can replace the branch with an unreachable terminator. Differential Revision: https://reviews.llvm.org/D126962	2022-06-22 10:28:27 +02:00
Florian Hahn	ac62b8f704	[ConstraintElimination] Update addFact to take Predicate and ops (NFC). This allows adding facts without necessarily having a corresponding CmpInst.	2022-06-22 08:36:41 +02:00
Pavel Samolysov	f44bf3805a	[DeadArgElim] Reformat the pass in accordance with the code style The code has been reformatted in accordance with the code style. Some function comments were extended to the Doxygen ones and reworded a bit to eliminate the duplication of the function's/class' name in the comment. Differential Revision: https://reviews.llvm.org/D128168	2022-06-22 09:13:00 +03:00
chenglin.bi	810b5c471f	[NewGVN] add context instruction for SimplifyQuery NewGVN will find operators from other contexts. ValueTracking currently doesn't have a way to run completely without a context instruction, so it will use the operator itself as the context instruction. If the operator is in another branch that will never be executed but has an assume, value tracking may use the assume to perform a wrong simplification. It would be better to make these simplification queries not use context at all, but that would require some API changes. For now we just use the original instruction as the context instruction to fix the issue. Fix #56039 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D127942	2022-06-22 12:25:24 +08:00
Serguei Katkov	8f891b7c39	[LoopVectorize] Uninitialized phi node leads to a crash in SSAUpdater. createInductionResumeValues creates a phi node placeholder without filling incoming values. Then it generates the incoming values. This includes triggering the SCEV expander, which may invoke SSAUpdater. SSAUpdater has an optimization to detect the number of predecessors based on incoming values if there is a phi node. In case the phi node is not filled with incoming values, the number of predecessors is detected as 0 and this leads to a segmentation fault. In other words, SSAUpdater expects that the phi is in good shape while LoopVectorizer breaks this requirement. The fix is to prepare all incoming values first and then build a phi node. Reviewed By: fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D128033	2022-06-22 10:49:27 +07:00
Johannes Doerfert	b7cc3b10c5	[Attributor][FIX] Avoid empty bin in AAPointerInfo This avoids creating empty bins in AAPointerInfo which can lead to segfaults. Also ensure we do not try to translate from callee to caller except if we really take the argument state and move it to the call site argument state. Fixes: https://github.com/llvm/llvm-project/issues/55726	2022-06-21 21:30:57 -05:00
Johannes Doerfert	083010312a	[Attributor] Ensure to use the proper liveness AA When determining liveness via Attributor::isAssumedDead(...) we might end up without a liveness AA or with one pointing into another function. Neither is helpful and we will avoid both from now on. Reapplied after fixing the ASAN error which caused the revert: `db68a25ca9`	2022-06-21 21:28:26 -05:00
Vasileios Porpodas	7a9ad25769	Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles" This reverts commit `6d6268dcbf`. Review: https://reviews.llvm.org/D125712	2022-06-21 18:35:29 -07:00
Vasileios Porpodas	6d6268dcbf	Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles" This reverts commit `6f88acf410`.	2022-06-21 17:07:21 -07:00
Vasileios Porpodas	6f88acf410	[SLP][X86] Improve reordering to consider alternate instruction bundles During the reordering transformation we should try to avoid reordering bundles like fadd,fsub because this may block them being matched into a single vector instruction in x86. We do this by checking if a TreeEntry is such a pattern and adding it to the list of TreeEntries with orders that need to be considered. Differential Revision: https://reviews.llvm.org/D125712	2022-06-21 16:44:48 -07:00
Florian Hahn	88ce403c6a	[LV] Add new block to place recurrence splice, if needed. In some cases, a recurrence splice instruction needs to be inserted between two regions, for example if the regions get re-arranged during sinking. Fixes #56146.	2022-06-21 21:54:37 +02:00
Heejin Ahn	27e4afcea7	[DSE] Don't remove nounwind invokes For non-mem-intrinsic and non-lifetime `CallBase`s, the current `isRemovable` function only checks if the `CallBase` 1. has no uses 2. will return 3. does not throw: `80fb782336/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (L1017)` But we should also exclude invokes even in case they don't throw, because they are terminators and thus cannot be removed. While it doesn't seem to make much sense for `invoke`s to have a `nounwind` target, this kind of code can be generated and is also valid bitcode. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128224	2022-06-21 11:54:09 -07:00
Martin Sebor	b19194c032	[InstCombine] handle subobjects of constant aggregates Remove the known limitation of the library function call folders to only work with top-level arrays of characters (as per the TODO comment in the code) and allow them to also fold calls involving subobjects of constant aggregates such as member arrays.	2022-06-21 11:55:14 -06:00
Alexey Bataev	d4ee43153d	[SLP][NFC]Fix a warning in a comparison, NFC. Fixed signedness warning.	2022-06-21 10:19:47 -07:00
serge-sans-paille	aaf1630ac3	[Scalarizer] No need to gather a scattered extracted element ExtractElement does not produce a vector out of a vector, so there's no need to call a gather once done. Fix #54469 Credits to npopov@redhat.com for the original approach. Differential Revision: https://reviews.llvm.org/D126012	2022-06-21 18:43:54 +02:00
Arthur Eubanks	b5db65e0da	Reland [GlobalOpt] Preserve CFG analyses The only place we modify the CFG is when calling removeUnreachableBlocks(), so insert a callback there which invalidates analyses for that function (or recomputes DT in the legacy PM). We may delete functions, make sure to clear analyses for those functions. (this was missed in the original revision) Small compile time wins across the board: https://llvm-compile-time-tracker.com/compare.php?from=f444ea8ce0aaaa5ec1a4129809389da15cc41396&to=698f41f4fc26cbf1006ed5d88e9d658edfc5b749&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128145	2022-06-21 09:19:59 -07:00
Alexey Bataev	f1ee2738b3	[SLP]Fix a crash when insert subvector is out of range. If OffsetBeg + InsertVecSz is greater than VecSz, we need to estimate the cost as a shuffle of 2 vectors, not as an insert of a subvector. Otherwise, the inserted subvector is out of range and the compiler may crash. Differential Revision: https://reviews.llvm.org/D128071	2022-06-21 07:16:35 -07:00
Florian Hahn	4ea6891f95	[ConstraintElimination] Remove unneeded StackEntry::Condition (NFC). The field was only used for debug printing. Print constraint from the system instead.	2022-06-21 15:57:29 +02:00
Florian Hahn	2a9313ee0b	[ConstraintElimination] Move logic to check condition to helper (NFC).	2022-06-21 11:50:33 +02:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Kazu Hirata	d66cbc565a	Don't use Optional::hasValue (NFC)	2022-06-20 20:26:05 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Florian Hahn	6dd772d348	[ConstraintElimination] Move logic to get a constraint to helper (NFC).	2022-06-20 21:34:07 +02:00
Kazu Hirata	ad7ce1e769	Don't use Optional::hasValue (NFC)	2022-06-20 11:49:10 -07:00
Kazu Hirata	5413bf1bac	Don't use Optional::hasValue (NFC)	2022-06-20 11:33:56 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Arthur Eubanks	13ff7d6f39	Revert "[GlobalOpt] Perform store->dominated load forwarding for stored once globals" This reverts commit `6f348b146b`. Am seeing internal test failures plus a linux kernel breakage reported due to this.	2022-06-20 10:26:47 -07:00
Arthur Eubanks	1cd2c72bef	Revert "[GlobalOpt] Preserve CFG analyses" This reverts commit `cc65f3e167`. Causes crashes: https://github.com/llvm/llvm-project/issues/56131	2022-06-20 10:25:10 -07:00
Guillaume Chatelet	589c8d6fb9	[NFC] Simplify alignment code in MemorySanitizer	2022-06-20 15:15:53 +00:00
Guillaume Chatelet	7296811910	[NFC] Simplify alignment code in CoroFrame	2022-06-20 15:15:52 +00:00
Florian Hahn	cebe7ae881	[ConstraintElimination] Move logic to add constraint to helper (NFC).	2022-06-20 17:08:35 +02:00
Florian Hahn	bd9632afd2	[ConstraintElimination] Move StackEntry up, to allow use earlier (NFC).	2022-06-20 16:40:42 +02:00
Florian Hahn	cfc741bc0e	[LoopPeel] Forget SCEV for updated exit phi values. LoopPeel adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Forget SCEVs for such phis. Fixes #56044. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128164	2022-06-20 13:19:27 +02:00
Guillaume Chatelet	f1255186c7	[NFC][Alignment] Remove max functions between Align and MaybeAlign `llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are not used often enough to be required. They also make the code more opaque. Differential Revision: https://reviews.llvm.org/D128121	2022-06-20 08:37:48 +00:00
Guillaume Chatelet	009fe0755e	[Alignment] Remove multiply by MaybeAlign	2022-06-20 08:37:15 +00:00

31018 Commits