Updated LowerGuardIntrinsic and LowerWidenableCondition to check for
users of the respective intrinsic, instead of checking for guards and
widenable conditions by traversing the entire function.
This is an NFC. Should save some compile time.
This reverts the revert commit 1ddc719680.
This version of the patch sets the initial available value to poison,
which resolves an issue with the SSAUpdater breaking LCSSA form.
IMO when the user provides an unroll pragma, the compiler should always
respect it. It is not clear to me why the loop unroll pass currently
ensures that the unrolled loop size is limited by PragmaUnrollThreshold.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D119148
When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.
The dead load can introduce UB, if the function is marked as
writeonly.
Fixes #51248.
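A minimal sketch of the situation (hypothetical IR, not taken from the patch): @g is only stored to inside the loop, so the store can be sunk into the exit block, and the pre-header load that used to be created would have no users:
```
@g = global i32 0

define void @f(i32 %n) writeonly {
entry:
  br label %loop

loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
  store i32 %iv, ptr @g          ; only stored, never reloaded in the loop
  %iv.next = add i32 %iv, 1
  %cmp = icmp ult i32 %iv.next, %n
  br i1 %cmp, label %loop, label %exit

exit:
  ; after sinking, a single store of the final %iv value lands here; no
  ; load of @g is created in the pre-header, so no UB in this writeonly
  ; function
  ret void
}
```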
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123473
And thread DSE's ephemeral values to EarliestEscapeInfo.
This allows more precise analysis in DSEState::isReadClobber() via BatchAA.
Followup to D123162.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123342
Loop Strength Reduce sometimes optimizes away all uses of an induction variable
from a loop but leaves the IV increments. When the only remaining use of the IV
is the PHI in the exit block, this patch will call rewriteLoopExitValues to
replace the exit block PHI with the final value of the IV to skip the updates
in each loop iteration.
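A minimal sketch of the pattern (hypothetical IR, simplified): after LSR has removed all in-loop uses of the IV, only the increment and the exit-block PHI remain, and the PHI can be rewritten to the IV's final value:
```
define i32 @f(i32 %n) {
entry:
  br label %loop

loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
  %iv.next = add nuw nsw i32 %iv, 1
  %cmp = icmp ult i32 %iv.next, %n
  br i1 %cmp, label %loop, label %exit

exit:
  ; rewriteLoopExitValues can replace this PHI with the final IV value
  ; computed by SCEV, so the in-loop updates can be skipped
  %res = phi i32 [ %iv.next, %loop ]
  ret i32 %res
}
```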
Differential Revision: https://reviews.llvm.org/D118808
This makes MemorySSA in LoopSink required, and removes the AST-based
implementation, as well as the related support code in LICM.
Differential Revision: https://reviews.llvm.org/D123288
Similar to the problem in 0bb25b4603, bitcasts that are inserted must
dominate all uses. When rewriting "values" with "new values" that have
the updated address space, we may replace the "new value" with a bitcast
if one of the original users is an addrspacecast. This bitcast must
be inserted before ALL users, not only before the addrspacecast.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122964
LoopSink with the legacy pass manager still uses AST, because we
can't compute MemorySSA conditionally. I think now that the legacy
pass manager will be removed soon(TM) we don't need to care about
compile-time impact here anymore. Additionally, since MemorySSA is
no longer eagerly optimized, the impact is actually not that high
anymore (~0.2% geomean regression on CTMark).
This just makes legacy PM and new PM behavior line up -- as a
followup I'll drop these options entirely and make MemorySSA use
mandatory.
Differential Revision: https://reviews.llvm.org/D123216
Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly
modified case is as follows.
void f(int e[10][10][10], int f[10][10][10]) {
  for (int a = 0; a < 10; a++)
    for (int b = 0; b < 10; b++)
      for (int c = 0; c < 10; c++)
        f[c][b][a] = e[c][b][a];
}
The ideal access pattern after running interchange is supposed to be the following:
void f(int e[10][10][10], int f[10][10][10]) {
  for (int c = 0; c < 10; c++)
    for (int b = 0; b < 10; b++)
      for (int a = 0; a < 10; a++)
        f[c][b][a] = e[c][b][a];
}
Currently loop interchange is limited to picking the innermost loop and finding an order
that is locally optimal for it. However, the pass fails to produce the globally optimal
loop access order; for more complex examples, what we get can be quite far from the
globally optimal ordering.
What is proposed in this patch is to perform interchange in a "bubble-sort" fashion.
By comparing neighbors in `LoopList` in each iteration, we are able to move each loop
to its most appropriate place, so this approach tries to achieve the
globally optimal ordering.
The motivating example above is added as a test case.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D120386
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.
Differential Revision: https://reviews.llvm.org/D123117
Partially inlining a libcall that has the musttail attribute
leads to broken LLVM IR, triggering an assertion in the IR verifier.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D123116
As discussed on https://github.com/llvm/llvm-project/issues/54682,
MemorySSA currently has a bug when computing the clobber of calls
that access loop-varying locations. I think a "proper" fix for this
on the MemorySSA side might be non-trivial, but we can easily work
around this in MemCpyOpt:
Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess()
call to find a clobber on either the src or dest location, and then
refines it for the src and dest clobber. This was intended as an
optimization, as the location-less API is cached, while the
location-affected APIs are not.
However, I don't think this really makes a difference in practice,
because I don't think anything will use the cached clobbers on
those calls later anyway. On CTMark, this patch seems to be very
mildly positive actually.
So I think this is a reasonable way to avoid the problem for now,
though MemorySSA should also get a fix.
Differential Revision: https://reviews.llvm.org/D122911
The range calculation in walkForwards() assumes that the ranges of
the operands have already been calculated. With the used visit
order, this is not necessarily the case when there are multiple
roots. (There is nothing guaranteeing that instructions are visited
in topological order.)
Fix this by queuing instructions for reprocessing if the operand
ranges haven't been calculated yet.
Fixes https://github.com/llvm/llvm-project/issues/54669.
Differential Revision: https://reviews.llvm.org/D122817
The search for the clobbering call is fairly expensive if uses are not optimized at construction. Defer the clobber walk to the point in the implementation we need it; there are a bunch of bailouts before that point. (e.g. If the source pointer is not an alloca, we can't do callslotopt.)
On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.
This refactor makes it easier to extend the logic to collect information
from blocks in the future, without further increasing the size of
eliminateConstraints.
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) For IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
Instead of first creating a lambda for calculating the range,
then collecting the ranges for the operands, and then calling the
lambda on those ranges, we can first calculate the operand ranges
and then calculate the result directly in the switch.
According to the LLVM debug info update guide: https://llvm.org/docs/HowToUpdateDebugInfo.html,
"Hoisting identical instructions which appear in several successor
blocks into a predecessor block. In this case there is no single
merged instruction. The rule for dropping locations applies".
Thanks to Yuanbo Li for reporting this.
Reviewed By: dblaikie
Reviewers: sebpop, tejohnson, dblaikie
Differential Revision: https://reviews.llvm.org/D122730
Factor in the TBAA of adjacent stores instead of just the head store
when merging stores into a memset. We were seeing GVN remove a load that
had a TBAA that matched the 2nd store because GVN determined it didn't
match the TBAA of the memset. The memset had the TBAA of only the first
store.
i.e. Loading the field pi_ of shared_count after memset to create an
array of shared_ptr
template<class T>
class shared_ptr {
  T *p;
  shared_count refcount;
};
class shared_count {
  sp_counted_base *pi_;
};
Differential Revision: https://reviews.llvm.org/D122205
Avoids merge errors when opaque pointers are loaded into different types.
Reviewed by: jcranmer-intel, hiraditya
Differential Revision: https://reviews.llvm.org/D122521
According to the definition of canonical form, a formula is canonical
if, when the scale reg does not contain an addrec for loop L, none of
the bases contain an addrec for this loop.
The critical word here is "contains".
The current canonical form checker does not check the "containing"
property but the "is" property: it checks whether a reg is an addrec,
not whether it contains one.
Fix the checker and the canonicalizing utility to follow the definition.
Without this fix, in the attached test the base formula
reg((-1 * {0,+,8}<nuw><nsw><%bb2>)<nsw>) + 1*reg((8 * (%arg /u 8))<nuw>)
is considered canonical even though a base contains an addrec,
while the modified formula we want to insert,
reg({0,+,8}<nuw><nsw><%bb2>) + 1*reg((-8 * (%arg /u 8))),
is considered not canonical.
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D122457
We have the same code repeated in both callers; sink it into the callee.
The motivation here isn't just code style: we can also defer the relatively expensive aliasing checks until the cheap structural preconditions have been validated. (e.g. Don't bother with aliasing checks if src is not an alloca.) This helps compile time significantly.
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) For IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
EarlyCSE currently optimizes all MemoryUses upfront. However,
EarlyCSE only actually queries the clobbering memory access for
a subset of uses, namely those where a CSE candidate has already
been identified. Delaying use optimization to the clobber query
improves compile-time in practice.
This change is not NFC because EarlyCSE has a limit on the number
of clobber queries (EarlyCSEMssaOptCap), in which case it falls
back to the defining access. The defining access for uses will now
no longer coincide with the optimized access.
If there are performance regressions from this change, we should
be able to address them by raising this limit.
Differential Revision: https://reviews.llvm.org/D121987
Rather than iterating over users and comparing operands, iterate
over uses and check operand number. Otherwise, we'll end up
promoting a store twice if it has two equal operands.
This can only happen with opaque pointers, as otherwise both
operands differ by a level of indirection, so a bitcast would have
to be involved.
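A hypothetical example of the affected case: with opaque pointers, the value operand and the pointer operand of a store can be the same value, so iterating over users would process the store once per equal operand:
```
define void @f() {
  %a = alloca ptr
  store ptr %a, ptr %a   ; value operand == pointer operand
  ret void
}
```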
Fixes https://github.com/llvm/llvm-project/issues/54495.
There are a bunch of code improvements in the patch: marking everything
that can be const as const, and fixing some typos in comments.
The patch also removes the shadowing parameter TTI from the
rewriteWithNewAddressSpaces method; the TTI parameter is not required
because the same field is available in the class.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D121671
[SCCP] do not clean up dead blocks that have their address taken
Fixes a crash observed in IPSCCP.
Because the SCCPSolver has already internalized BlockAddresses as
Constants or ConstantExprs, we don't want to try to update their Values
in the ValueLatticeElement. Instead, continue to propagate these
BlockAddress Constants, continue converting BasicBlocks to unreachable,
but don't delete the "dead" BasicBlocks which happen to have their
address taken. Leave replacing the BlockAddresses to another pass.
Fixes: https://github.com/llvm/llvm-project/issues/54238
Fixes: https://github.com/llvm/llvm-project/issues/54251
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D121744
This adds a new option to control AllowSpeculation added in D119965 when
using `-passes=...`.
This allows reproducing #54023 using opt.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D121944
The way the pass is actually used in the optimization pipeline,
TLI will be available, but this is not the case when running just
-lower-constant-intrinsics in tests, which ends up being quite
confusing.
Require TLI unconditionally, as we usually do.
This changes MemorySSA to be constructed in unoptimized form.
MemorySSA::ensureOptimizedUses() can be called to optimize all
uses (once). This should be done by passes where having optimized
uses is beneficial, either because we're going to query all uses
anyway, or because we're doing def-use walks.
This should help reduce the compile-time impact of MemorySSA for
some use cases (the reason why I started looking into this is
D117926), which can avoid optimizing all uses upfront, and instead
only optimize those that are actually queried.
Actually, we have an existing use-case for this, which is EarlyCSE.
Disabling eager use optimization there gives a significant
compile-time improvement, because EarlyCSE will generally only query
clobbers for a subset of all uses (this change is not included in
this patch).
Differential Revision: https://reviews.llvm.org/D121381
LoopSimplifyCFG may process loops that are not in
loop-simplify/canonical form. For loops not in canonical form, exit
blocks may be reachable from non-loop blocks and we cannot consider them
dead merely because they are not reachable from the loop itself.
Unfortunately the smallest test I could come up with requires running
multiple passes:
-passes='loop-mssa(loop-instsimplify,loop-simplifycfg,simple-loop-unswitch)'
The reason is that loops are canonicalized at the beginning of loop
pipelines, so a later transform has to break canonical form in a way
that breaks LoopSimplifyCFG's dead-exit analysis.
Alternatively we could try to require all loop passes to maintain
canonical form. That in turn would also require additional verification.
Fixes #54023, #49931.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D121925
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) For IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Differential Revision: https://reviews.llvm.org/D115907
When a load extends past the extent of the alloca, SROA will
restrict the slice size to extend to the end of the alloca only.
However, presplitting was asserting that the load size and the
slice size match exactly, which does not hold in this case.
Relax the assertion to only require that the load size is greater
than or equal to the slice size.
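A hypothetical illustration of the case: the load reads past the end of the alloca, so the slice is restricted to the 4 bytes of the alloca, while the load size stays 8:
```
define i64 @f() {
  %a = alloca i32        ; 4-byte object
  %v = load i64, ptr %a  ; 8-byte load extends past the alloca's extent
  ret i64 %v
}
```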
Context: I needed this for https://github.com/google/iree/pull/8474 .
I found that TSan instrumentation expects vector sizes to be <= 16,
and in my project (IREE) we have tests with higher vector sizes.
That left some test functions uninstrumented, resulting in crashes as
instrumented code called into them.
Differential Revision: https://reviews.llvm.org/D121182
This patch is motivated by pr48057, where an output dependency is not detected
because loop interchange did not check a store instruction against itself.
This patch fixes that deficiency.
Reviewed By: bmahjour, Meinersbur, #loopoptwg
Differential Revision: https://reviews.llvm.org/D118102
This patch adds PrettyStackEntries before running passes. The entries
include the pass name and the IR unit the pass runs on.
This information is used to print additional details when a pass
crashes, including the pass name and a reference to the IR unit on which it
crashed. This is similar to the behavior of the legacy pass manager.
The improved stack trace now includes:
Stack dump:
0. Program arguments: bin/opt -loop-vectorize -force-vector-width=4 crash.ll
1. Running pass 'ModuleToFunctionPassAdaptor' on module 'crash.ll'
2. Running pass 'LoopVectorizePass' on function '@a'
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D120993
After moving the CanAdd check in c60cdb44f7 and using it for
the assume cases as well, the passed-in block may not have a branch
instruction as its terminator. This can trigger the assertion. Given the new
use case, the assertion doesn't add value any longer and can be removed.
Fixes https://github.com/llvm/llvm-project/issues/54281
When decomposing constraints for unsigned conditions, we can use
negative values by zero-extending them, as long as they are less than
the maximum constraint value.
Fixes https://github.com/llvm/llvm-project/issues/54224
Add missing CanAdd check before adding a condition from an assume
to the successor blocks. When adding information from an assume to
successor blocks, we need to perform the same CanAdd check as we do when adding
a condition from a branch.
Fixes https://github.com/llvm/llvm-project/issues/54217
This patch extends ConstraintElimination to also remove dead variables
when removing a constraint. When a constraint is removed because it is
out of scope, all new variables added for this constraint can also be
removed.
This keeps the total size of the systems much smaller, because it
reduces the number of variables drastically.
It also fixes a bug where variables were removed incorrectly.
Fixes https://github.com/llvm/llvm-project/issues/54228
Currently, we hardly ever actually run SCEV verification, even in
tests with -verify-scev. This is because the NewPM LPM does not
verify SCEV. The reason for this is that SCEV verification can
actually change the result of subsequent SCEV queries, which means
that you see different transformations depending on whether
verification is enabled or not.
To allow verification in the LPM, this limits verification to
BECounts that have actually been cached. It will not calculate
new BECounts.
BackedgeTakenInfo::getExact() is still not entirely readonly,
it still calls getUMinFromMismatchedTypes(). But I hope that this
is not problematic in the same way. (This could be avoided by
performing the umin in the other SCEV instance, but this would
require duplicating some of the code.)
Differential Revision: https://reviews.llvm.org/D120551
Skip phi nodes in the preheader. They may not be considered loop
invariant by the assertion below.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D121010
There seems to be one more uncaught problem: SROA may now end up trying
to re-re-repromote the just-promoted shadow alloca, and do that endlessly.
This reverts commit adc0984d81.
This is inspired by the original variant of D109749 by Graham Hunter,
but is a more general version.
Roughly, instead of promoting the alloca, we call it
a shadow/backing alloca, go through all its slices,
clone(!) instructions that operated on it,
but make them operate on the cloned alloca,
and promote cloned alloca instead.
This keeps the shadow/backing alloca, and all the original instructions
around, which results in said shadow/backing alloca being
a perfect mirror/representation of the promoted alloca's content,
so calls that take the alloca as arguments (non-capturingly!)
can be supported.
For now, we require that the calls also don't modify the alloca's content,
but that is only to simplify the initial implementation,
and that will be supported in a follow-up.
Overall, this leads to *smaller* codesize:
https://llvm-compile-time-tracker.com/compare.php?from=a8b4f5bbab62091835205f3d648902432a4a5b58&to=aeae054055b125b011c1122f82c86457e159436f&stat=size-total
and is roughly neutral compile-time wise:
https://llvm-compile-time-tracker.com/compare.php?from=a8b4f5bbab62091835205f3d648902432a4a5b58&to=aeae054055b125b011c1122f82c86457e159436f&stat=instructions
This relands commit 703240c71f,
that was reverted by commit 7405581f7c,
because the assertion `isa<LoadInst>(OrigInstr)` didn't hold in practice,
as the newly added test `@select_of_ptrs` shows:
If the pointers into alloca are used by select's/PHI's, then even if
we manage to fracture the alloca, some sub-alloca's will likely remain.
And if there are any non-capturing calls, then we will also decide to
keep the original backing alloca around, and we suddenly ~doubled
the alloca size, and the amount of memory traffic.
I'm not sure if this is a problem or we could live with it,
but let's leave that for later...
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D113520
Recommit without changes over 53abe3ff66,
which addressed the cause of the reported crash.
-----
With opaque pointers, the zero-offset load will generally not use
a GEP. Allow a direct load without GEP, which is treated the same
way as a zero-offset GEP.
Use the overload that supports moving into an empty block. I don't
think that this situation can occur right now, but it can happen
with the change from e7fb1c15cb,
and the test is derived from the issue reported there.
This builds on @fhahn's D112313, and caches the liveOnEntry node as an optimized access. D112313 only cached a known clobber. This change adds caching the fact that no clobber exists. It still does not cache may-clobber results.
Differential Revision: https://reviews.llvm.org/D120842
There is a general WalkerStepLimit adjustment higher up in the
loop, and I don't see any reason why this particular case would
need additional adjustment. Furthermore, this could underflow.
With opaque pointers, the zero-offset load will generally not use
a GEP. Allow a direct load without GEP, which is treated the same
way as a zero-offset GEP.
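For illustration (hypothetical sketch): with typed pointers, the zero-offset access would typically go through a GEP, while with opaque pointers it is simply a direct load on the pointer, which is now treated like a zero-offset GEP:
```
define i32 @f(ptr %p) {
  ; no "getelementptr ..., i64 0" is emitted under opaque pointers;
  ; the direct load is equivalent to loading through a zero-offset GEP
  %v = load i32, ptr %p
  ret i32 %v
}
```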
The "Correlated Value Propagation" pass was missing a case when handling select instructions. It was only handling the "false" constant value, while in NVPTX the select may have the condition (and thus the branches) inverted, for example:
```
loop:
%phi = phi i32* [ %sel, %loop ], [ %x, %entry ]
%f = tail call i32* @f(i32* %phi)
%cmp1 = icmp ne i32* %f, %y
%sel = select i1 %cmp1, i32* %f, i32* null
%cmp2 = icmp eq i32* %sel, null
br i1 %cmp2, label %return, label %loop
```
But the select condition can be inverted:
```
%cmp1 = icmp eq i32* %f, %y
%sel = select i1 %cmp1, i32* null, i32* %f
```
The fix is to enhance "Correlated Value Propagation" to handle both branches of the select instruction.
Reviewed By: nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D119643
D118623 added code to fold not-of-compare into a compare
with the inverted predicate, if the compare had no other
uses. This relies on accurate use lists in the IR but it
was run before setPhiValues, when some phi inputs are still
stored in a data structure on the side, instead of being
real uses in the IR. The effect was that a phi that should
be using the original compare result would now get an
inverted result instead.
Fix this by moving simplifyConditions after setPhiValues.
Differential Revision: https://reviews.llvm.org/D120312
Currently writtenBetween can miss clobbers of Loc between End and Start,
if End is a MemoryUse.
To guarantee we see all write clobbers of Loc between Start and End
for MemoryUses, restrict to Start and End being in the same block
and check all accesses between them.
This fixes two miscompiles illustrated in
llvm/test/Transforms/MemCpyOpt/memcpy-byval-forwarding-clobbers.ll
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D119929
This patch simplifies constraint handling by removing the
ConstraintListTy wrapper struct and moving the Preconditions directly
into ConstraintTy. This reduces the amount of memory needed for managing
constraints.
The only use case for ConstraintListTy was adding 2 constraints to model
ICMP_EQ conditions. But this can be handled by adding an IsEq flag. When
adding an equality constraint, we need to add the constraint and the
inverted constraint.
LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (https://github.com/llvm/llvm-project/issues/53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249, LICM would only be run after LoopRotate. Running LoopRotate prior to LICM prevents an instruction hoist from being speculative if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This discards the additional information, resulting in performance losses.
This PR modifies LICM to accept a "speculative" parameter which allows LICM to be set to perform information-losing speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after LoopRotate is performed, preserving this additional information.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D119965
This will bail out on target specific intrinsics. If those are deemed
important enough for EarlyCSE to handle, we can augment MemIntrinsicInfo
with an access type for TargetTransformInfo::getTgtMemIntrinsic() to
handle.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D120077
The assertion verifying that a newly computed value matches what is
already cached used stripPointerCasts() to strip bitcasts, however the
values can be not only pointers, but also vectors of pointers. That is
problematic because stripPointerCasts() doesn't handle vectors of
pointers. This patch introduces an ad-hoc utility function to strip all
bitcasts regardless of the value type.
Reviewed By: skatkov, reames
Differential Revision: https://reviews.llvm.org/D119994
This introduces a new "ptrauth" operand bundle to be used in
call/invoke. At the IR level, it's semantically equivalent to an
@llvm.ptrauth.auth followed by an indirect call, but it additionally
provides hardening, by preventing the intermediate raw
pointer from being exposed.
This mostly adds the IR definition, verifier checks, and support in
a couple of general helper functions. Clang IRGen and backend support
will come separately.
Note that we'll eventually want to support this bundle in indirectbr as
well, for similar reasons. indirectbr currently doesn't support bundles
at all, and the IR data structures need to be updated to allow that.
Differential Revision: https://reviews.llvm.org/D113685
If a cast is needed when replacing uses with newly created values, the
cast must be inserted after the instruction that defines the new value.
Fixes: SWDEV-321215
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D119524
If we had some source value from which we could infer an address space,
and it went through a ptrtoint/inttoptr pair, this would fail, since bitcast
can't change the address space.
Fixes issue 53665.
To avoid incorrectly merging GEPs with different source types
under opaque pointers.
To avoid increasing the Expression structure size, this reuses the
existing type member. The code does not rely on this to be the
expression result type, it's only used as a disambiguator.
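A hypothetical example of the GEPs that must not be merged: under opaque pointers both GEPs have identical operands, but the different source element types yield different offsets:
```
define void @f(ptr %p) {
  %g1 = getelementptr i32, ptr %p, i64 1   ; %p + 4 bytes
  %g2 = getelementptr i64, ptr %p, i64 1   ; %p + 8 bytes
  ret void
}
```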
Note that this doesn't actually cause the top level predicate to become a non-union just yet.
The * above comes from a case in the LoopVectorizer where a predicate which is later proven no longer blocks vectorization due to a change from checking if predicates exists to whether the predicate is possibly false.
With typed pointers the pointer operand type checks the address space
and the load/store type. With opaque pointers we have to check the
load/store type separately.
Enabled loop interchange support for floating point reductions
if it is allowed to reorder floating point operations.
Previously, when we encountered a floating point PHI node in the
outer loop exit block, we bailed out, since we could not detect
floating point reductions in the early days. Now we remove this
limitation, since we are able to detect floating point reductions.
Reviewed By: #loopoptwg, Meinersbur
Differential Revision: https://reviews.llvm.org/D117450
This patch adds initial support for signed conditions. To do so,
ConstraintElimination maintains two separate systems, one with facts
from signed and one for unsigned conditions.
To start with this means information from signed and unsigned conditions
is kept completely separate. When it is safe to do so, information from
signed conditions may be also transferred to the unsigned system and
vice versa. That's left for follow-ups.
In the initial version, decomposition of signed values just handles
constants and otherwise just uses the value, without trying to
decompose the operation. Again this can be extended in follow-up
changes.
The main benefit of this limited signed support is proving >=s 0
pre-conditions added in D118799. But even this initial version also
fixes PR53273.
Depends on D118799.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D118806
With this patch pre-conditions can be added to a list of constraints.
Constraints with pre-conditions can only be used if all pre-conditions
are satisfied when the constraint is used.
The pre-conditions at the moment are specified as a list of
(Predicate, Value *, Value *) tuples. This allows checking them easily,
like any other condition, using the existing infrastructure.
This then is used to limit GEP decomposition to cases where we can
prove that offsets are signed positive.
This fixes a couple of incorrect transforms where GEP offsets were
assumed to be signed positive, but they were not.
Note that this effectively disables GEP decomposition, as there's no
support for reasoning about signed predicates. D118806 adds initial
signed support.
Fixes PR49624.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D118799
This header is very large (3M lines once expanded) and was included in locations
where DWARF-specific information was not needed.
More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.
This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].
As a consequence of that patch, downstream users may need to manually include some extra
files:
llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h
In some situations, code may be relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h; this hidden
dependency now needs to be made explicit.
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 10978519
before: 11245451
Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions
Differential Revision: https://reviews.llvm.org/D118781
This makes the statepoint methods in IRBuilder accept a
FunctionCallee, which carries both the callee and function type.
This is used to add the elementtype attribute to the statepoint call.
RS4GC requires an additional tweak to actually preserve that attribute
-- previously the attributes on the call were completely overwritten.
Differential Revision: https://reviews.llvm.org/D118886
Finding the re-materialization chain for a derived pointer does not depend on
the call site. To avoid repeating this search for each call site, it can be extracted into
a separate routine.
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D118676
Assertion added in f50821cff0 confirms that the DT is indeed nonnull.
Change it to a reference instead of a pointer to make this explicit in
FusionCandidate.
Suggested in D118472.
After adding another value kind in 8a12cae862, Value * pointers do not
have enough available empty bits to store the kind (e.g. on ARM).
To address this, the patch replaces the PointerIntPair with separate
value and kind fields.
This patch extends the available-value logic to detect loads
of pointer-selects that can be replaced by a value select.
For example, consider the code below:
loop:
  %sel.phi = phi i32* [ %start, %ph ], [ %sel, %loop ]
  %l = load %ptr
  %l.sel = load %sel.phi
  %sel = select cond, %ptr, %sel.phi
  ...
exit:
  %res = load %sel
  use(%res)
The load of the pointer phi can be replaced by a load of the start value
outside the loop and a new phi/select chain based on the loaded values,
as illustrated below
%l.start = load %start
loop:
  %sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %loop ]
  %l = load %ptr
  %sel.prom = select cond, %l, %sel.phi.prom
  ...
exit:
  use(%sel.prom)
This is a first step towards allowing vectorization of loops using common libc++
library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs):
#include <vector>
#include <algorithm>
int foo(const std::vector<int> &V) {
  return *std::min_element(V.begin(), V.end());
}
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D118143
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless it took a lot of care in avoiding hidden header dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k fewer lines to process is not that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
The code paths analyzed (all constructor invocations of fusion
candidate) pass in a non-null DT.
Adding this assert as requested in D118472 before converting this to a
reference argument.
Clean up code in the peelLoop API. We already have usage of the DT without guarding
against a null DT, so this change constant-folds the remaining null DT
checks.
Also make the argument a reference so that it is clear the argument is
a non-null DT.
Extracted from D118472.
We tracked down some non-determinism in compilation output to the
DFAJumpThreading pass. These changes fixed our issue:
* Make the DefMap type a MapVector to make its iteration order depend on
insertion order.
* Sort the values to be inserted into NewDefs by instruction order to
make the insertion order deterministic. Since these values come from
iterating over a ValueMap, which doesn't have deterministic iteration
order, I couldn't fix this at its source.
Reviewed By: alexey.zhikhar
Differential Revision: https://reviews.llvm.org/D118590
In some cases StructurizeCFG inserts i1 xor instructions to invert
predicates. Add a quick loop to clean these up afterwards if we can get
away with modifying an existing compare instruction instead.
(StructurizeCFG is generally run late in the pipeline so instcombine
does not clean them up for us.)
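A sketch of the cleanup on hypothetical IR, assuming the compare has no other uses:
```
define i1 @before(i32 %a, i32 %b) {
  %cmp = icmp eq i32 %a, %b
  %inv = xor i1 %cmp, true   ; i1 xor inserted by StructurizeCFG
  ret i1 %inv
}

define i1 @after(i32 %a, i32 %b) {
  %inv = icmp ne i32 %a, %b  ; existing compare modified in place instead
  ret i1 %inv
}
```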
Differential Revision: https://reviews.llvm.org/D118623
PointerToBase is a mapping from a potentially derived pointer to its base.
Since we are in SSA form, if the base of a derived pointer is
available at the def of the derived pointer, the same base will be available at any
point where the derived pointer is alive.
So the mapping of derived pointer to base pointer is not a property
of a call site but is the same at the function level.
Reviewers: reames, yrouban
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D118604
When upgrading a loop of loads/stores to a memcpy, the existing pass does not keep the existing aliasing information. This patch allows the existing aliasing information to be kept.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D108221
phi([undef, A], [x, B]) -> x is only correct if x is guaranteed to be
a non-poison value.
Otherwise we would be changing an undef to poison on the branch through A.
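A hypothetical illustration: if %x may be poison, folding the phi to %x turns the undef observed on the path through A into poison, a strictly worse value:
```
define i32 @f(i1 %c, i32 %x) {
entry:
  br i1 %c, label %A, label %B

A:
  br label %merge

B:
  br label %merge

merge:
  ; fold to %x only if %x is known non-poison
  %phi = phi i32 [ undef, %A ], [ %x, %B ]
  ret i32 %phi
}
```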
Differential Revision: https://reviews.llvm.org/D117907
When creating an alloca to copy a matrix due to memory conflicts, those
allocas used to use VectorTypes, which forced them to have huge
alignments for large vectors.
This patch updates LowerMatrixIntrinsics to use a corresponding array
type, like Clang already does, to get more manageable alignments.
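A sketch of the effect (hypothetical values; the exact natural alignment is target-dependent):
```
define void @copy_tmp() {
  ; before: a vector-typed alloca gets the vector's large natural alignment
  %t.vec = alloca <256 x double>            ; e.g. align 2048
  ; after: the equivalent array type keeps the alignment manageable
  %t.arr = alloca [256 x double], align 8
  ret void
}
```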
Reviewed By: anemet, thegameg
Differential Revision: https://reviews.llvm.org/D118239
This patch adds a struct to manage a list of constraints. It simplifies
a follow-up change, that adds pre-conditions that must hold before a
list of constraints can be used.
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
This should be no functional change, as the cases supported by the
helper and the cases supported by DSE are currently the same, the
code structure is just slightly different.
The phi-of-ops functionality has a function OpIsSafeForPHIOfOps
to determine when it's safe to create the new phi.
But this function only checks for the obvious dominator conditions
and ignores memory.
This patch takes the conservative approach and disables phi-of-ops
whenever there's a load that doesn't dominate the phi, as its
value may be affected by a store inside the loop.
This can be improved later to check aliasing between the
load/stores.
Fixes https://llvm.org/PR53277
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D117999
This extract a common isNotVisibleOnUnwind() helper into
AliasAnalysis, which handles allocas, byval arguments and noalias
calls. After D116998 this could also handle sret arguments. We
have similar logic in DSE and MemCpyOpt, which will be switched
to use this helper as well.
The noalias call case is a bit different from the others, because
it also requires that the object is not captured. The caller is
responsible for doing the appropriate check.
Differential Revision: https://reviews.llvm.org/D117000
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
Together with the previous commit, which mainly documents LoopFlatten's
overall strategy better, this addresses a concern raised as a FIXME comment in D110587;
the code refactoring (NFC) introduces functions (also for the SCEV usage) to
make this clearer.
This method is intended for use in places that cannot be reached
with opaque pointers, or part of deprecated methods. This makes
it easier to see that some uses of getPointerElementType() don't
need further action.
Differential Revision: https://reviews.llvm.org/D117870
This is a recommit of the patch without changes. The reason for
the revert has been addressed in D117679.
-----
The user scanning loop above looks through pointer casts, so we
also need to strip pointer casts in the capture check. Previously
the source was incorrectly considered not captured if a bitcast
was passed to the call.
This is a recommit of the patch without changes. The reason for
the revert has been addressed in D117679.
-----
Call slot optimization is currently supposed to be prevented if
the call can capture the source pointer. Due to an implementation
bug, this check currently doesn't trigger if a bitcast of the source
pointer is passed instead. I'm somewhat afraid of the fallout of
fixing this bug (due to heavy reliance on call slot optimization
in rust), so I'd like to strengthen the capture reasoning a bit first.
In particular, I believe that the capture is fine as long as a)
the call itself cannot depend on the pointer identity, because
neither dest has been captured before/at nor src before the
call and b) there is no potential use of the captured pointer
before the lifetime of the source alloca ends, either due to
lifetime.end or a return from a function. At that point the
potentially captured pointer becomes dangling.
Differential Revision: https://reviews.llvm.org/D115615
Call slot optimization currently merges the metadata between the
call and the load. However, we also need to merge in the metadata
of the store.
Part of the reason why we might have gotten away with this
previously is that usually the load and the store are the same
instruction (a memcpy); the issue can only show up if call slot
optimization occurs on an actual load/store pair.
This addresses the issue reported in
https://reviews.llvm.org/D115615#3251386.
Differential Revision: https://reviews.llvm.org/D117679
Change the DSE calloc handling to assume that it is
inaccessiblememonly, i.e. the defining access is liveOnEntry.
Differential Revision: https://reviews.llvm.org/D117543
I would like to move LoopFlatten from loop pass manager LPM2 to LPM1 (D116612),
but that is an LPM that uses MemorySSA, so LoopFlatten needs to preserve
MemorySSA; this patch adds that. More specifically, LoopFlatten restructures the
CFG and with this change the MSSA state is updated accordingly, where we also
update the DomTree. LoopFlatten doesn't rewrite/optimise/delete load or store
instructions, so I have not added any MSSA updates for that.
Differential Revision: https://reviews.llvm.org/D116660
ConstantHoist currently only hoists GEPs if there is no notional
overindexing. As this transform only hoists address arithmetic,
it shouldn't care about whether any overindexing occurs or not.
There is one caveat: If the hoisted base GEP is inbounds, and a
later non-inbounds GEP is rewritten in terms of it, the value
may be incorrectly poisoned. To avoid this, restrict the transform
to inbounds GEPs for now, as the notional overindexing check
effectively did that as well. The inbounds restriction could be
dropped by dropping inbounds from the base GEP expression.
Differential Revision: https://reviews.llvm.org/D117201
This caused a miscompile due to call slot optimization replacing a call
argument without considering the call's !noalias metadata, see discussion on
the code review.
> Call slot optimization is currently supposed to be prevented if
> the call can capture the source pointer. Due to an implementation
> bug, this check currently doesn't trigger if a bitcast of the source
> pointer is passed instead. I'm somewhat afraid of the fallout of
> fixing this bug (due to heavy reliance on call slot optimization
> in rust), so I'd like to strengthen the capture reasoning a bit first.
>
> In particular, I believe that the capture is fine as long as a)
> the call itself cannot depend on the pointer identity, because
> neither dest has been captured before/at nor src before the
> call and b) there is no potential use of the captured pointer
> before the lifetime of the source alloca ends, either due to
> lifetime.end or a return from a function. At that point the
> potentially captured pointer becomes dangling.
>
> Differential Revision: https://reviews.llvm.org/D115615
Also reverting the dependent commit:
> [MemCpyOpt] Look through pointer casts when checking capture
>
> The user scanning loop above looks through pointer casts, so we
> also need to strip pointer casts in the capture check. Previously
> the source was incorrectly considered not captured if a bitcast
> was passed to the call.
This reverts commit 487a34ed9d
and 00e6869463.
canSkipDef() currently skips inaccessiblememonly calls, but not
if they are allocation functions. This check was added in D103009,
but actually seems to be a leftover from a previous implementation
in D101440. canSkipDef() is not used on the storeIsNoop() path,
where the relevant transform ended up being implemented.
Differential Revision: https://reviews.llvm.org/D117005
This fixes a crash I observed in issue #48708 where the LSR
pass tries to insert an instruction in a basic block with only a
catchswitch statement in there. This happens because the Phi node
being evaluated assumes the same value for different basic blocks.
If the basic block associated with the incoming value of the operand
being evaluated has an EHPad terminator LSR skips optimizing it.
But if that incoming value can come from multiple different blocks
there can be some incoming basic blocks which are terminated in
an EHPad. If these are then rewritten in RewriteForPhi the ones
containing an EHPad terminator will hit the "Insertion point must
be a normal instruction" assert in AdjustInsertPositionForExpand.
This fix makes CollectLoopInvariantFixupsAndFormulae also ignore
cases where the same value has another incoming basic block with an
EHPad, same as it already does in case the primary value has one.
Patch by Lorenz Brun <lorenz@brun.one>
Differential Revision: https://reviews.llvm.org/D98378
In the process of rewriting `alloca`s and `phi`s that use them, the SROA
pass can try to insert a non-PHI instruction by calling
`getFirstInsertionPt()`, which is not possible in a catchswitch BB. This
CL makes us bail out in these cases.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D117168
Currently loop interchange only supports loops with one inner loop
induction variable. This patch adds support for transformation with
more than one inner loop induction variables. The induction PHIs and
induction increment instructions are moved/duplicated properly to the
new outer header and the new outer latch, respectively.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D114917
After e734e8286b, it is possible to end up in
a situation where an `indirectbr` is fed by a cast, which is in turn fed by
an operation which only produces integers.
`indirectbr` expects a block address, however these operations can't produce
that.
There were several asserts in `computeValueKnownInPredecessorsImpl` which check
that we're not looking for a block address if we're walking through something
which can never produce one.
Since it's now possible to hit these asserts, this changes them into actual
checks which return false if `Preference` is not `WantInteger`.
This adds a testcase which verifies that we don't crash anymore in these
situations.
Differential Revision: https://reviews.llvm.org/D99814
This doesn't require callers to put the pointer operand and the indices
in a container like a vector when calling the function. This is not
really an issue with the existing callers. But when using it from
IRBuilder the inputs are available as separate pointer value and indices
ArrayRef.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D117038
This patch enables loop interchange with multiple outer loop
induction variables, and hence removes the limitation that only
a single outer loop induction variable is supported. In fact, it
turns out that the current pass already trivially supports multiple
outer indvars, which is the result of a previous patch
`https://reviews.llvm.org/D102743`. Therefore, this patch removed that
limitation and provides test cases for multiple outer indvars.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D114916
This is required to query the legality more precisely in the LoopVectorizer.
This adds another TTI function named 'forceScalarizeMaskedGather/Scatter'
to work around the hack introduced for MVE, where
isLegalMaskedGather/Scatter would return an answer by second-guessing
where the function was called from, based on the Type passed in (vector
vs scalar). The new interface makes this explicit. It is also used by
X86 to check for vector widths where gather/scatters aren't profitable
(or don't exist) for certain subtargets.
Differential Revision: https://reviews.llvm.org/D115329
This change removes a direct check for calloc-like allocation functions, and instead handles the generic case where we're storing a constant to constant initialized memory. This is mostly to remove the call to isCallocLike, but if someone downstream happens to have an initialized alloc which initializes to e.g. -1, this will also kick in for them. (I don't know of such an example ftr.)
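A hypothetical example of the generic case: calloc returns zero-initialized memory, so storing zero into it stores the value the memory already holds and can be removed:
```
declare ptr @calloc(i64, i64)

define ptr @f() {
  %p = call ptr @calloc(i64 1, i64 4)
  store i32 0, ptr %p   ; no-op: stores the memory's initial value
  ret ptr %p
}
```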
For these "visible on unwind/ret" checks we only care about the
fact that no other code has access to the pointer (unless it
escapes). A noalias call is sufficient for this, it does not
have to be a known allocation function.
This is basically the same change as D116728, but for DSE rather
than LICM.
getNumberOfRegisters takes a ClassID as its argument. It shouldn't be passed a bool. Assuming the bool meant vector or not, we should call getRegisterClassForType first.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D116903
SROA has 3 data-structures where it stores sets of instructions that should
be deleted:
- DeadUsers -> instructions that are UB or have no users
- DeadOperands -> instructions that are UB or operands of useless phis
- DeadInsts -> "dead" instructions, including loads of uninitialized memory
with users
The first 2 sets can be RAUW'd with poison instead of undef. This is a no-brainer, as UB
can be replaced with poison, and for instructions with no users RAUW is a
NOP.
The 3rd case cannot currently be replaced with poison because the set mixes in
the loads of uninit memory. I leave that alone for now.
Another case where we can use poison is in the construction of vectors from
multiple loads. The base vector for the first insertelement is now poison, as
it doesn't matter: it is fully overwritten by the inserts.
Differential Revision: https://reviews.llvm.org/D116887
There is no need to sort inserted instructions by dominance, as the
deletion loop still requires RAUW with undef before deleting. Removing
instructions in reverse insertion order should still ensure that the
number of use-list updates is kept to a minimum.
This is a recurring pattern; we can consolidate three copies into one. The main motivation is to reduce usages of isMallocLike.
The original commit (which was quickly reverted) didn't account for the fact that the allocation function could be an invoke; test coverage for that case is added in this commit.
There was a limitation in legality that in the original inner loop latch,
no instruction was allowed between the induction variable increment
and the branch instruction. This is because we used to split the
inner latch at the induction variable increment instruction. Since
now we have split at the inner latch branch instruction and have
properly duplicated instructions over to the split block, we remove
this limitation.
Please refer to the test case updates to see how we now interchange
loops where instructions exist between the induction variable
increment and the branch instruction.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D115238
dyn_cast<> can return null - use cast<> instead to assert the cast is valid before dereferencing the casted pointer.
Fixes static-analyzer null dereference warning.
Fix static analysis warning by using cast<> instead of dyn_cast<> as both isa<> and isGuaranteedToExecuteForEveryIteration expect a non-null Instruction pointer.
When determining whether the memory is local to the function (and
we can thus introduce spurious writes without thread-safety issues),
check for a noalias call rather than the hardcoded list of memory
allocation functions. Noalias calls are the more general way to
determine allocation functions, as long as we're only interested
in the property that the returned value is distinct from any other
accessible memory.
Differential Revision: https://reviews.llvm.org/D116728
There was a limitation in legality that in the original inner loop latch,
no instruction was allowed between the induction variable increment
and the branch instruction. This is because we used to split the
inner latch at the induction variable increment instruction. Since
now we have split at the inner latch branch instruction and have
properly duplicated instructions over to the split block, we remove
this limitation.
Please refer to the test case updates to see how we now interchange
loops where instructions exist between the induction variable increment
and the branch instruction.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D115238
This patch delays the updates of the dominator tree to the very end of
the pass instead of applying them in small increments after each basic
block.
This improves the runtime of the pass in particular in pathological
cases because now the updater sees the full extend of the updates and
can decide whether it is faster to apply the changes incrementally or
just recompute the full tree from scratch.
Put differently, thanks to this patch, we can take advantage of the
improvements that Chijun Sima <simachijun@gmail.com> made in the
dominator tree updater a while ago with commit 32fd196cbf4d: "Teach the
DominatorTree fallback to recalculation when applying updates to speedup
JT (PR37929)".
This change is NFC but can improve the runtime of the compiler
dramatically in some pathological cases (where the pass was pushing a
lot (several thousand) of small updates (fewer than 6)).
For instance on the motivating example we went from 300+ sec to less
than a second.
Differential Revision: https://reviews.llvm.org/D116610
The naming has come up as a source of confusion in several recent reviews. onlyWritesMemory is consistent with onlyReadsMemory, which we use for the corresponding readonly case as well.
The user scanning loop above looks through pointer casts, so we
also need to strip pointer casts in the capture check. Previously
the source was incorrectly considered not captured if a bitcast
was passed to the call.
Call slot optimization is currently supposed to be prevented if
the call can capture the source pointer. Due to an implementation
bug, this check currently doesn't trigger if a bitcast of the source
pointer is passed instead. I'm somewhat afraid of the fallout of
fixing this bug (due to heavy reliance on call slot optimization
in rust), so I'd like to strengthen the capture reasoning a bit first.
In particular, I believe that the capture is fine as long as a)
the call itself cannot depend on the pointer identity, because
neither dest (before or at the call) nor src (before the call) has
been captured, and b) there is no potential use of the captured
pointer before the lifetime of the source alloca ends, either through
lifetime.end or a return from the function. At that point the
potentially captured pointer becomes dangling.
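A hedged C-level illustration of condition b) (all names are made up):

  struct S { int x[4]; };
  void init(struct S *);
  void f(struct S *); /* may capture its argument */

  void g(struct S *dest) {
    struct S src;
    init(&src);
    f(&src);     /* &src may be captured here */
    *dest = src; /* call slot candidate: copy src into dest */
  } /* src's lifetime ends; any captured pointer is now dangling, so no
       conforming use of it can distinguish a direct write to dest */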
Differential Revision: https://reviews.llvm.org/D115615
This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situations, like
the creation of an Attribute with a dummy value because the only
relevant part was the attribute kind.
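A hedged usage sketch (Ctx and AL are placeholders for an LLVMContext and an AttributeList):

  #include "llvm/IR/Attributes.h"

  // Build a set of attribute kinds to strip; only the kind matters,
  // so no dummy Attribute values are needed.
  llvm::AttributeMask Mask;
  Mask.addAttribute(llvm::Attribute::NoAlias);
  Mask.addAttribute(llvm::Attribute::NonNull);
  AL = AL.removeFnAttributes(Ctx, Mask);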
Differential Revision: https://reviews.llvm.org/D116110
This reverts commit fd4808887e.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be
explicitly initialized in the copy constructor [-Wextra]
If the killing store overwrites the whole object, we know that the
preceding store is dead, regardless of the accessed offset or size.
This case was previously only handled if the size of the dead store
was also known.
This allows us to perform conventional DSE for calls that write to
an argument (but without known size).
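A hedged source-level illustration (assuming s is short enough to fit):

  #include <string.h>

  void sink(char *);

  void h(const char *s) {
    char buf[16];
    strcpy(buf, s);             /* store of unknown size into buf */
    memset(buf, 0, sizeof buf); /* killing store: whole object    */
    sink(buf);                  /* the strcpy above is dead       */
  }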
Differential Revision: https://reviews.llvm.org/D116267
Remove the assertion about the pointer element type and only check
that the stride is one. Ultimately, the actual pointer type here
doesn't matter, because SCEVExpander would insert appropriate
casts if necessary.
Otherwise, it is possible that the state defined in the determinator
block defines the state for the next iteration of the loop, rather than
for the current one.
Fixes llvm-test-suite's
SingleSource/Regression/C/gcc-c-torture/execute/pr80421.c
Differential Revision: https://reviews.llvm.org/D115832
This fixes a typo in 81d69e1bda.
Of course we should only skip the particular store if it isn't
removable, not bail out of the whole loop. Add a test to cover
this case.
At this point the instruction may either have an analyzable
write or be a terminator. For terminators, isRemovable() is not
necessarily well-defined. Move the check until after we have ensured
that it is not a terminator.
The only non-trivial change here is that the isReadClobber()
check for redundant stores is now on the DefLoc, not the
UpperLoc. This is semantically the right location to use, though
in practice it makes no difference (the locations are either the
same, or the def inst does not read).
We have Value::stripInBoundsConstantOffsets(), which does what we
want here, but the inbounds requirement isn't actually necessary.
We should probably add Value::stripConstantOffsets() as well.
Remove the special casing for intrinsics in MemoryLocation::getForDest()
and handle them through the general attribute based code. On the DSE
side, this means that isRemovable() now needs to handle more than a
hardcoded list of intrinsics. We consider everything apart from
volatile memory intrinsics and lifetime markers to be removable.
This allows us to perform DSE on intrinsics that DSE has not been
specially taught about, using a matrix store as an example here.
There is an interesting test change for invariant.start, but I
believe that optimization is correct. It only looks a bit odd
because the code is immediate UB anyway.
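A hedged sketch of the relaxed removability rule (not the exact in-tree code):

  #include "llvm/IR/IntrinsicInst.h"

  static bool isRemovableCall(const llvm::CallBase *CB) {
    // Memory intrinsics are removable unless volatile.
    if (auto *MI = llvm::dyn_cast<llvm::MemIntrinsic>(CB))
      return !MI->isVolatile();
    // Lifetime markers must be kept.
    if (auto *II = llvm::dyn_cast<llvm::IntrinsicInst>(CB))
      if (II->isLifetimeStartOrEnd())
        return false;
    return true; // subject to the usual attribute-based checks
  }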
Differential Revision: https://reviews.llvm.org/D116210
Fix handling of alloc-like instructions in isGuaranteedLoopInvariant().
The check was not valid when the 'KillingDef' was outside of the loop
while the 'CurrentDef' was inside the loop. In that case, the
'KillingDef' only overwrites the definition from the last iteration of
the loop, not the ones from all iterations. Therefore it does not make
the 'CurrentDef' dead, and we must not remove it.
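A hedged C-level shape of the miscompile (save() is a made-up escape):

  #include <stdlib.h>

  void save(int *);

  void k(int n) {
    int *p = 0;
    for (int i = 0; i < n; ++i) {
      p = (int *)malloc(sizeof(int));
      save(p); /* each fresh allocation escapes */
      *p = i;  /* CurrentDef: one store per allocation; not dead */
    }
    if (p)
      *p = 0;  /* KillingDef: only overwrites the last allocation */
  }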
Fixes https://github.com/llvm/llvm-project/issues/52774.
Reviewed By: Florian Hahn
Differential Revision: https://reviews.llvm.org/D115965
It is not necessary to explicitly check which attributes are
present, and only add those to the builder. We can simply list
all attributes that need to be stripped and remove them
unconditionally. This also allows us to use some nicer APIs that
don't require mucking about with attribute list indices.
We can only drop calls if they have an analyzable write, the return
value is not used, they don't throw and they don't diverge. The last
two conditions were previously not checked, because all the libcalls
with analyzable writes already happened to satisfy those conditions
anyway. This may not be true for generalizations (with D115904 in mind).
No test changes because the necessary attributes are already inferred
for currently supported libcalls.
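A hedged sketch of the four conditions (illustrative, not the in-tree implementation):

  #include "llvm/IR/InstrTypes.h"

  static bool canDropWritingCall(const llvm::CallBase *CB) {
    return CB->use_empty()    // the return value is not used
        && CB->doesNotThrow() // the call cannot unwind
        && CB->willReturn();  // the call cannot diverge
    // ...and the write itself must be analyzable.
  }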
Differential Revision: https://reviews.llvm.org/D115962
These are deprecated and should be replaced with getAlign().
Some of these asserts don't do anything, because Load/Store/AllocaInst
never have an align value of 0.
If a loop isn't forced to be unrolled, we want to avoid unrolling it
when there is an explicit unroll-and-jam pragma. This is to prevent
automatic unrolling from interfering with the user-requested
transformation.
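A hedged source-level example (assuming clang's unroll_and_jam loop pragma):

  // The outer loop carries an explicit unroll-and-jam request; the
  // runtime unroller should then leave the inner loop alone.
  void saxpy2d(int n, int m, float **a, const float *b) {
  #pragma clang loop unroll_and_jam(enable)
    for (int i = 0; i < n; ++i)
      for (int j = 0; j < m; ++j)
        a[i][j] += b[j];
  }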
Differential Revision: https://reviews.llvm.org/D114886
Symbolic execution using PredicateInfo is only done for the ssa.copy
intrinsic. It's using two potential sources for building the expression:
1. the Value that the intrinsic is a copy of, and
2. the Value from the constraint in PredicateInfo
It's possible to get into an infinite loop when choosing between these
two, as described in PR31613.
This patch swaps the two values (i.e. chooses the second one for the
expression) if that same second value was chosen before; this breaks
the cycle.
In the testcases provided, where there is a contradiction between the
value from symbolic execution and the assume instruction, NewGVN
reduces the assume to assume(false).
Resolves PR31613.
Differential Revision: https://reviews.llvm.org/D110907
An expression guarded in the loop entry can be folded prior to
comparison. This patch follows up on D107353 and makes LIR able to deal
with nested for-loops.
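A hedged illustration of the kind of nested loop LIR can now flatten (assuming the rows are contiguous):

  // Recognizable as a single memset covering the whole array:
  void zero10x10(int a[10][10]) {
    for (int i = 0; i < 10; ++i)
      for (int j = 0; j < 10; ++j)
        a[i][j] = 0;
    /* => memset(a, 0, 10 * 10 * sizeof(int)) */
  }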
Reviewed By: qianzhen, bmahjour
Differential Revision: https://reviews.llvm.org/D108112
Poison-generating flags can be retained during CSE on the earlier
instruction, *if* the earlier instruction being poison causes UB. For
now, always take the AND of the flags for floating-point instructions.
https://alive2.llvm.org/ce/z/4K3D7P
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D115247
This reverts change 2c391a5/D87551. As noted in the llvm-dev thread "LICM as canonical form" sent earlier today, introducing this was a major design change made without sufficient cause.
A profile-driven LICM is not an unreasonable design; it simply is not what we have. Switching to such a model requires a lot more work than just this patch, and broad agreement that it is the right direction for the optimizer as a whole.
Worth noting is that all the tests included in the reverted change are probably handled if we allow running unconstrained LICM, and later run LoopSink. As such, we have no public examples which motivate a profit-based hoisting approach.
DSE has some extra logic to determine the write location of library
calls like str*cpy and str*cat. This patch moves the logic to a new
MemoryLocation::getForDest variant, which takes a call and TLI.
This patch should be NFC, because no other places take advantage of the
new helper yet.
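A hedged usage sketch (CB and TLI are placeholders for a call site and the function's TargetLibraryInfo):

  #include "llvm/Analysis/MemoryLocation.h"

  // Query the write location of the call through the new helper
  // instead of DSE-local logic.
  if (auto DestLoc = llvm::MemoryLocation::getForDest(CB, TLI)) {
    // DestLoc->Ptr is the pointer the call writes through;
    // DestLoc->Size may be imprecise for the str* routines.
  }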
Suggested by @reames post-commit 7eec832def.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D114872
When doing load/store promotion within LICM, if we cannot prove that
it is safe to sink the store, we won't hoist the load, even though we
can prove the load could be dereferenced and moved outside the loop.
This patch implements the load promotion by moving the load to the
loop preheader and inserting a proper PHI in the loop. The store is
kept as is in the loop. By doing this, we avoid reloading from the
memory location in each iteration.
Please consider this small example:
loop {
  var = *ptr;
  if (var) break;
  *ptr = var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
  var1 = phi(var0, var2);
  if (var1) break;
  var2 = var1 + 1;
  *ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289
Commit 5c77aa2b91 ("[unroll] Use early return in shouldFullUnroll [nfc]")
wasn't quite NFC, since !(x <= y) is x > y rather than x >= y.
Credit to Justin Bogner for spotting the bug.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D114894
This allows for better optimization of 'stores-of-existing-values' and
possibly helps passes further down the pipeline.
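A hedged illustration of a store of an existing value (function names are made up):

  void touch(int *p) {
    int v = *p; /* load the current value */
    /* ... code that provably does not modify *p ... */
    *p = v;     /* no-op store of the existing value: removable */
  }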
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D113712