llvm-project

Author	SHA1	Message	Date
Matteo Favaro	633e090528	[DSE] Allow ptrs defined in the entry block in IsGuaranteedLoopInvariant. The IsGuaranteedLoopInvariant function is making sure to check if the incoming pointer is guaranteed to be loop invariant, therefore I think the case where the pointer is defined in the entry block of a function automatically guarantees the pointer to be loop invariant, as the entry block of a function cannot have predecessors or be part of a loop. I implemented this small patch and tested it using ninja check-llvm-unit and ninja check-llvm. I added a contained test file that shows the problem and used opt -O3 -debug on it to make sure the case is not currently handled (in fact the debug log is showing that the DSE pass is bailing out when testing if the killer store is able to clobber the dead store). Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D96979	2021-02-23 12:00:44 +00:00
Kazu Hirata	fb74e1e78a	[Transforms/Scalar] Use range-based for loops (NFC)	2021-02-04 21:18:05 -08:00
Kazu Hirata	56edfcada9	[Target, Transforms] Use contains (NFC)	2020-12-19 10:43:19 -08:00
Nikita Popov	1f1145006b	[DSE] Use correct memory location for read clobber check MSSA DSE starts at a killing store, finds an earlier store and then checks that the earlier store is not read along any paths (without being killed first). However, it uses the memory location of the killing store for that, not the earlier store that we're attempting to eliminate. This has a number of problems: * Mismatches between what BasicAA considers aliasing and what DSE considers an overwrite (even though both are correct in isolation) can result in miscompiles. This is PR48279, which D92045 tries to fix in a different way. The problem is that we're using a location from a store that is potentially not executed and thus may be UB, in which case analysis results can be arbitrary. * Metadata on the killing store may be used to determine aliasing, but there is no guarantee that the metadata is valid, as the specific killing store may not be executed. Using the metadata on the earlier store is valid (it is the store we're removing, so on any execution where its removal may be observed, it must be executed). * The location is imprecise. For full overwrites the killing store will always have a location that is larger than or equal to the earlier access location, so it's beneficial to use the earlier access location. This is not the case for partial overwrites, in which case either location might be smaller. There is some room for improvement here. Using the earlier access location means that we can no longer cache which accesses are read for a given killing store, as we may be querying different locations. However, it turns out that simply dropping the cache has no notable impact on compile-time. Differential Revision: https://reviews.llvm.org/D93523	2020-12-18 20:26:53 +01:00
Nikita Popov	e728024808	[DSE] Pass MemoryLocation by const ref (NFC)	2020-12-16 21:47:46 +01:00
Evgeniy Brevnov	2d1b024d06	[DSE][NFC] Need to be careful mixing signed and unsigned types Currently in some places we use a signed type to represent the size of an access and put explicit casts from unsigned to signed. For example: int64_t EarlierSize = int64_t(Loc.Size.getValue()); Even though it doesn't lose bits (immediately), it may overflow and we end up with a negative size. Potentially that causes later code to work incorrectly. A simple example is a check that the size is not negative. I think it would be safer and clearer if we use an unsigned type for the size and handle it appropriately. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D92648	2020-12-08 16:53:37 +07:00
Greg Parker	bcc802fa36	[DSE] Remove a redundant call to getLocForWriteEx() Differential Revision: https://reviews.llvm.org/D92263	2020-11-30 21:12:24 -08:00
Nikita Popov	4df8efce80	[AA] Split up LocationSize::unknown() Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object). This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses. The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way. Differential Revision: https://reviews.llvm.org/D91649	2020-11-26 18:39:55 +01:00
Nikita Popov	393b9e9db3	[MemLoc] Require LocationSize argument (NFC) When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two different states, and callers should make an explicit choice regarding the kind of MemoryLocation they want to have.	2020-11-19 21:45:52 +01:00
Simon Pilgrim	b11eaf5617	[DSE] Don't dereference a dyn_cast<> result - use cast<> instead. NFCI. We were relying on the dyn_cast<> succeeding - better use cast<> and have it assert that it's the correct type than dereference a null result.	2020-11-08 13:07:45 +00:00
Florian Hahn	aab71d4443	[DSE] Use same logic as legacy impl to check if free kills a location. This patch updates DSE + MemorySSA to use the same check as the legacy implementation to determine if a location is killed by a free call. This changes the existing behavior so that a free does not kill locations before the start of the freed pointer. This should fix PR48036.	2020-10-31 20:09:25 +00:00
Evgeniy Brevnov	3d31adaec4	[DSE] Improve partial overlap detection Currently isOverwrite returns OW_MaybePartial even for accesses known not to overlap. This is not a big problem for the legacy implementation (since isPartialOverwrite follows isOverwrite and clarifies the result). By contrast, the SSA-based version does a lot of work to later find out that accesses don't overlap. Besides the negative impact on compile time, we quickly reach MemorySSAPartialStoreLimit and miss optimization opportunities. Note: In fact, I think it would be a cleaner implementation if isOverwrite returned a fully clarified result in the first place without the need to call isPartialOverwrite. This can be done as a follow up. What do you think? Reviewed By: fhahn, asbirlea Differential Revision: https://reviews.llvm.org/D90371	2020-10-30 22:23:20 +07:00
Florian Hahn	05e4f7bde9	[DSE] Remove noop stores after killing stores for a MemoryDef. Currently we fail to eliminate some noop stores if there is a kill-able store between the starting def and the load. This is because we eliminate noop stores first. In practice it seems like eliminating noop stores after the main elimination for a def covers slightly more cases. This patch improves the number of stores slightly in 2 cases for X86 -O3 -flto Same hash: 235 (filtered out) Remaining: 2 Metric: dse.NumRedundantStores Program base patch diff test-suite...ce/Benchmarks/PAQ8p/paq8p.test 2.00 3.00 50.0% test-suite...006/453.povray/453.povray.test 18.00 21.00 16.7% There might be other phase ordering issues, but it appears that they do not show up in the test-suite/SPEC2000/SPEC2006. We can always tune the ordering later. Partly fixes PR47887. Reviewed By: asbirlea, zoecarver Differential Revision: https://reviews.llvm.org/D89650	2020-10-30 09:40:15 +00:00
Florian Hahn	b82f80057d	[DSE] Use walker to skip noalias stores between current & clobber def. Instead of getting the defining access we should be able to use getClobberingMemoryAccess to skip non-aliasing MemoryDefs. No additional checks should be needed, because we only remove the starting def if it matches the defining access of the load. All we need to worry about is that there are no (may)alias stores between the starting def and the load and getClobberingMemoryAccess should guarantee that. Partly fixes PR47887. This improves the number of redundant stores removed in some cases (numbers below for MultiSource, SPEC2000, SPEC2006 on X86 with -flto -O3). Same hash: 226 (filtered out) Remaining: 11 Metric: dse.NumRedundantStores Program base patch1 diff test-suite...:: External/Povray/povray.test 1.00 5.00 400.0% test-suite...chmarks/MallocBench/gs/gs.test 1.00 3.00 200.0% test-suite...0/253.perlbmk/253.perlbmk.test 21.00 37.00 76.2% test-suite...0.perlbench/400.perlbench.test 24.00 37.00 54.2% test-suite.../Applications/SPASS/SPASS.test 3.00 4.00 33.3% test-suite...006/453.povray/453.povray.test 15.00 18.00 20.0% test-suite...T2006/445.gobmk/445.gobmk.test 27.00 29.00 7.4% test-suite.../CINT2006/403.gcc/403.gcc.test 136.00 137.00 0.7% test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0% test-suite.../Benchmarks/Bullet/bullet.test NaN 3.00 nan% test-suite.../Benchmarks/Ptrdist/bc/bc.test NaN 1.00 nan% Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89647	2020-10-28 11:01:25 +00:00
Florian Hahn	2e58010208	[DSE] Do not scan users of memory terminators for further reads. isMemTerminator checks if the current def is a memory terminator that terminates the memory pointed to by DefLoc. We do not have to add any of their users to the worklist, because the follow-on users cannot read the memory in question. This leads to more stores eliminated in the presence of lifetime calls. Previously we added the users of those intrinsics to the worklist, limiting elimination. In terms of removed stores, this gives a nice boost on some benchmarks (MultiSource/SPEC2000/SPEC2006 on X86 with -flto -O3): Same hash: 205 (filtered out) Remaining: 32 Metric: dse.NumFastStores Program base patch diff test-suite...000/197.parser/197.parser.test 4.00 8.00 100.0% test-suite...rolangs-C++/family/family.test 4.00 7.00 75.0% test-suite...marks/7zip/7zip-benchmark.test 1722.00 2189.00 27.1% test-suite...CFP2000/177.mesa/177.mesa.test 30.00 38.00 26.7% test-suite :: External/Nurbs/nurbs.test 44.00 49.00 11.4% test-suite...lications/sqlite3/sqlite3.test 115.00 128.00 11.3% test-suite...006/447.dealII/447.dealII.test 2715.00 3013.00 11.0% test-suite...ProxyApps-C++/CLAMR/CLAMR.test 237.00 261.00 10.1% test-suite...tions/lambda-0.1.3/lambda.test 40.00 44.00 10.0% test-suite...3.xalancbmk/483.xalancbmk.test 1366.00 1475.00 8.0% test-suite...abench/jpeg/jpeg-6a/cjpeg.test 13.00 14.00 7.7% test-suite...oxyApps-C++/miniFE/miniFE.test 43.00 46.00 7.0% test-suite...lications/ClamAV/clamscan.test 230.00 246.00 7.0% test-suite...006/450.soplex/450.soplex.test 284.00 299.00 5.3% test-suite...nsumer-jpeg/consumer-jpeg.test 21.00 22.00 4.8%	2020-10-20 16:55:22 +01:00
Florian Hahn	6439fde6d4	[DSE] Bail out from getLocForWriteEx if call is not argmemonly/inacc_mem. This change should currently not have any impact, but guard against further inconsistencies between MemoryLocation and function attributes.	2020-10-20 14:37:53 +01:00
Florian Hahn	f5cf7f544b	[DSE] Do not consider 'noop' intrinsics as read-clobbers. isNoopIntrinsic returns true for some intrinsics that are modeled in MemorySSA but do not actually read or write any memory and do not block DSE. Such intrinsics should not be considered as read-clobbers.	2020-10-18 15:51:05 +01:00
Florian Hahn	51ff04567b	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." After investigation by @asbirlea, the issue that caused the revert appears to be an issue in the original source, rather than a problem with the compiler. This patch enables MemorySSA DSE again. This reverts commit `915310bf14`.	2020-10-16 09:02:53 +01:00
zoecarver	6c25816d7b	[DSE] Look through memory PHI arguments when removing noop stores in MSSA. Summary: Adds support for "following" memory through MSSA PHI arguments. This will help catch more noop stores that exist between blocks. Originally part of D79391. Reviewers: fhahn, jfb, asbirlea Differential Revision: https://reviews.llvm.org/D82588	2020-10-01 10:42:02 -07:00
Florian Hahn	915310bf14	Revert "[DSE] Switch to MemorySSA-backed DSE by default." There appears to be a mis-compile with MemorySSA-backed DSE in combination with llvm.lifetime.end. It currently appears like DSE is doing the right thing and the llvm.lifetime.end markers are incorrect. The reverted patch uncovers the mis-compile. This patch temporarily switches back to the legacy DSE implementation, while we investigate. This reverts commit `9d172c8e9c`.	2020-09-26 18:35:27 +01:00
Florian Hahn	8f0466edc0	[DSE] Unify & fix mem terminator location checks. When looking for memory defs killed by memory terminators the code currently incorrectly ignores the size argument of llvm.lifetime.end. This patch updates the code to use isMemTerminator and updates isMemTerminator to use isOverwrite() to make sure locations that are outside the range marked as dead by llvm.lifetime.end are not considered. Note that isOverwrite is only used for llvm.lifetime.end, because free-like functions make the whole underlying object dead.	2020-09-26 13:47:50 +01:00
Florian Hahn	9d172c8e9c	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." This switches to using DSE + MemorySSA by default again, after fixing the issues reported after the first commit. Notable fixes `fc82006331`, `a0017c2bc2`. This reverts commit `3a59628f3c`.	2020-09-18 11:05:00 +01:00
Florian Hahn	3a59628f3c	Revert "[DSE] Switch to MemorySSA-backed DSE by default." This reverts commit `fb109c42d9`. Temporarily revert due to a mis-compile pointed out at D87163.	2020-09-15 18:07:56 +01:00
Florian Hahn	f715d81c9d	[DSE] Only eliminate candidates that always store the same loc. AliasAnalysis/MemoryLocation does not account for loops. Two MemoryLocation can be must-overwrite, even if the first one writes multiple locations in a loop. This patch prevents removing such stores, by only considering candidates that are known to be loop invariant, or executed in the same BB. Currently the invariant check is quite conservative and only considers Alloca and Alloca-like instructions and arguments as invariant base pointers. It also considers GEPs with all constant indices and invariant bases as invariant. This can be improved in the future, but the current implementation has only minor impact on the total number of stores eliminated (25903 vs 26047 for the baseline). There are some 2-10% swings for some individual benchmarks. In roughly half of the cases, the number of stores removed increases actually, because we skip candidates that are unlikely to be valid candidates early.	2020-09-14 12:06:58 +01:00
Florian Hahn	e082dee2b5	[DSE] Bail out on MemoryPhis when deleting stores at end of function. When deleting stores at the end of a function, we have to do PHI translation, otherwise we might miss reads in different iterations of a loop. See multiblock-loop-carried-dependence.ll for details. This fixes a mis-compile and surprisingly also increases the number of eliminated stores from 26047 to 26572 for MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto. This is most likely because we save budget by not exploring through MemoryPhis, which are less likely to result in valid candidates for elimination. The issue was reported post-commit for `fb109c42d9`.	2020-09-12 19:05:59 +01:00
Krzysztof Parzyszek	f92908cc74	[DSE] Make sure that DSE+MSSA can handle masked stores Differential Revision: https://reviews.llvm.org/D87414	2020-09-11 10:00:21 -05:00
Florian Hahn	fb109c42d9	[DSE] Switch to MemorySSA-backed DSE by default. The tests have been updated and I plan to move them from the MSSA directory up. Some end-to-end tests needed small adjustments. One difference to the legacy DSE is that legacy DSE also deletes trivially dead instructions that are unrelated to memory operations. Because MemorySSA-backed DSE just walks the MemorySSA, we only visit/check memory instructions. But removing unrelated dead instructions is not really DSE's job and other passes will clean up. One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll, but I think this comes down to legacy DSE not handling instructions that may throw correctly in that case. To cover this with MemorySSA-backed DSE, we need an update to llvm.coro.begin to treat its return value to belong to the same underlying object as the passed pointer. There are some minor cases MemorySSA-backed DSE currently misses, e.g. related to atomic operations, but I think those can be implemented after the switch. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores goes from ~17500 (legacy DSE) to ~26300 (MemorySSA-backed). More numbers and details in the thread on llvm-dev. Impact on CTMark: ``` Legacy Pass Manager exec instrs size-text O3 + 0.60% - 0.27% ReleaseThinLTO + 1.00% - 0.42% ReleaseLTO-g. + 0.77% - 0.33% RelThinLTO (link only) + 0.87% - 0.42% RelLO-g (link only) + 0.78% - 0.33% ``` http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions ``` New Pass Manager exec instrs. size-text O3 + 0.95% - 0.25% ReleaseThinLTO + 1.34% - 0.41% ReleaseLTO-g. + 1.71% - 0.35% RelThinLTO (link only) + 0.96% - 0.41% RelLO-g (link only) + 2.21% - 0.35% ``` http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions Reviewed By: asbirlea, xbolva00, nikic Differential Revision: https://reviews.llvm.org/D87163	2020-09-10 22:24:32 +01:00
Florian Hahn	a5ec99da6e	[DSE] Support eliminating memcpy.inline. MemoryLocation has been taught about memcpy.inline, which means we can get the memory locations read and written by it. This means DSE can handle memcpy.inline	2020-09-10 13:19:25 +01:00
Florian Hahn	9969c317ff	[DSE,MemorySSA] Handle atomic stores explicitly in isReadClobber. Atomic stores are modeled as MemoryDef to model the fact that they may not be reordered, depending on the ordering constraints. Atomic stores that are monotonic or weaker do not limit re-ordering, so we do not have to treat them as potential read clobbers. Note that llvm/test/Transforms/DeadStoreElimination/MSSA/atomic.ll already contains a set of negative test cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87386	2020-09-09 23:01:58 +01:00
Krzysztof Parzyszek	81ff2d30a9	[DSE] Handle masked stores	2020-09-09 13:31:31 -05:00
Florian Hahn	c7b7c32f4a	[DSE,MemorySSA] Increase walker limit a bit. This slightly bumps the walker limit so that it covers more cases while not increasing compile-time too much: http://llvm-compile-time-tracker.com/compare.php?from=0fc1c2b51ba0cfb9145139af35be638333865251&to=91144a50ea4fa82c0c877e77784f60371640b263&stat=instructions	2020-09-08 14:55:46 +01:00
Florian Hahn	efb8e156da	[DSE,MemorySSA] Add an early check for read clobbers to traversal. Depending on the benchmark, this early exit can save a substantial amount of compile-time: http://llvm-compile-time-tracker.com/compare.php?from=505f2d817aa8e07ba98e5fd4a8f6ff0666f89df1&to=eb4e441147f9b4b7a5fcbbc57428cadbe9e01f10&stat=instructions	2020-09-07 23:22:10 +01:00
Florian Hahn	16bb71fd4f	[DSE,MemorySSA] Add a few additional debug messages.	2020-09-06 20:31:00 +01:00
Florian Hahn	00eb6fef08	[DSE,MemorySSA] Check for throwing instrs between killing/killed def. We also have to check all uses between the killing & killed def and check if any of them is throwing.	2020-09-04 18:54:59 +01:00
Florian Hahn	86d817d7cf	[DSE,MemorySSA] Skip defs without analyzable write locations. Similar to other checks above, if there is no write location for a def, it cannot be considered for elimination and can be skipped.	2020-08-30 21:56:25 +01:00
Florian Hahn	42c57c294d	[DSE,MemorySSA] Simplify code, EarlierAccess is a MemoryDef (NFC). After recent changes, we return early if Current is a MemoryPhi, so EarlierAccess can only be a MemoryDef.	2020-08-30 21:31:57 +01:00
Florian Hahn	31cdb29de4	[DSE,MemorySSA] Return early when hitting a MemoryPhi. A MemoryPhi can never be eliminated. If we hit one, return the Phi, so the caller can continue traversing the incoming accesses. This saves some unnecessary read clobber checks and improves compile-time http://llvm-compile-time-tracker.com/compare.php?from=1ffc58b6d098ce8fa71f3a80fe75b990f633f921&to=d0fa8d1982380b57d7b6067528104bc373dbe07a&stat=instructions	2020-08-29 18:28:26 +01:00
Florian Hahn	43aa7227df	[DSE,MemorySSA] Check if Current is valid for elimination first. This changes getDomMemoryDef to check if a Current is a valid candidate for elimination before checking for reads. Before the change, we were spending a lot of compile-time in checking for read accesses for Current that might not even be removable. This patch flips the logic, so we skip Current if they cannot be removed before checking all their uses. This is much more efficient in practice. It also adds a more aggressive limit for checking partially overlapping stores. The main problem with overlapping stores is that we do not know if they will lead to elimination until seeing all of them. This patch adds a new limit for overlapping store candidates, which keeps the number of modified overlapping stores roughly the same. This is another substantial compile-time improvement (while also increasing the number of stores eliminated). Geomean -O3 -0.67%, ReleaseThinLTO -0.97%. http://llvm-compile-time-tracker.com/compare.php?from=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&to=2e630629b43f64b60b282e90f0d96082fde2dacc&stat=instructions Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86487	2020-08-28 11:19:04 +01:00
Florian Hahn	bb024c3c4e	[DSE,MemorySSA] Remove short-cut to check if all paths are covered. The post-order number early continue does not work in some cases, e.g. if a path from EarlierAccess to an exit includes a node that dominates EarlierAccess in a cycle. The short-cut only has very minor impact on compile-time, so it seems straight-forward to remove it for now: http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions Fixes PR47285.	2020-08-27 12:42:40 +01:00
Florian Hahn	e717fdb0f1	[DSE,MemorySSA] Traverse use-def chain without MemSSA Walker. For DSE with MemorySSA it is beneficial to manually traverse the defining access, instead of using a MemorySSA walker, so we can better control the number of steps together with other limits and also weed out invalid/unprofitable paths early on. This patch requires a follow-up patch to be most effective, which I will share soon after putting this patch up. This temporarily XFAIL's the limit tests, because we now explore more MemoryDefs that may not alias/clobber the killing def. This will be improved/fixed by the follow-up patch. This patch also renames some `Dom` variables to `Earlier`, because the dominance relation is not really used/important here and potentially confusing. This patch allows us to aggressively cut down compile time, geomean -O3 -0.64%, ReleaseThinLTO -1.65%, at the expense of fewer stores removed. Subsequent patches will increase the number of removed stores again, while keeping compile-time in check. http://llvm-compile-time-tracker.com/compare.php?from=d8e3294118a8c5f3f97688a704d5a05b67646012&to=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&stat=instructions Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86486	2020-08-27 10:02:02 +01:00
Florian Hahn	e19ef1aab5	[DSE,MemorySSA] Cache accesses with/without reachable read-clobbers. Currently we repeatedly check the same uses for read clobbers in some cases. We can avoid unnecessary checks by keeping track of the memory accesses we already found read clobbers for. To do so, we just add memory accesses causing read-clobbers to a set. Note that marking all visited accesses as read-clobbers would be too pessimistic, as that might include accesses not on any path to the actual read clobber. If we do not find any read-clobbers, we can add all visited instructions to another set and use that to skip the same accesses in the next call. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D75025	2020-08-25 08:48:46 +01:00
Florian Hahn	d1a1cce5b1	[DSE,MemorySSA] Do not use callCapturesBefore in isReadClobber. Using callCapturesBefore potentially improves the precision and the number of stores we can remove. But in practice, it seems to have very little impact in terms of stores removed. For example, for SPEC2000/SPEC2006/MultiSource with -O3 -flto, ~50 more stores are removed (out of ~26900 stores removed). But in terms of compile-time, it is very expensive and the patch gives substantial compile-time improvements: Geomean O3 -0.24%, ReleaseThinLTO -0.47%, ReleaseLTO-g -0.39%. http://llvm-compile-time-tracker.com/compare.php?from=612a0bff88ed906c83b82f079d4c49e5fecfb9d0&to=e6c86b96d20d97dd88e903a409bd8d39b6114312&stat=instructions	2020-08-24 16:19:42 +01:00
Florian Hahn	b99a5eb659	[DSE,MemorySSA] Delay PointerMayBeCaptured calls until actually needed. Avoid computing InvisibleToCallerBefore/AfterRet up front. In most cases, this information is not really needed. Instead, introduce helper functions to compute and cache the result on demand. Notably, this also does not use PointerMayBeCapturedBefore for isInvisibleToCallerBeforeRet, as it requires the killing MemoryDef as starting instruction, making the caching ineffective. But it appears the use of PointerMayBeCapturedBefore has very limited benefits in practice (e.g. on SPEC2000/SPEC2006/MultiSource there are no binary changes with -O3 -flto). Refrain from using it for now, to limit compile-time. This gives some nice compile-time improvements: http://llvm-compile-time-tracker.com/compare.php?from=db9345f6810f379a36752dc52caf5230585d0ebd&to=b4d091047e1b8a3d377d200137b79d03aca65663&stat=instructions	2020-08-24 14:05:44 +01:00
Florian Hahn	2431b143ae	[DSE,MemorySSA] Limit elimination at end of function to single UO. Limit elimination of stores at the end of a function to MemoryDefs with a single underlying object, to save compile time. In practice, the case with multiple underlying objects does not seem very important. For -O3 -flto on MultiSource/SPEC2000/SPEC2006 this results in a total of 2 more stores being eliminated. We can always re-visit that in the future.	2020-08-24 13:00:17 +01:00
Florian Hahn	2843c9fe0a	[DSE,MemorySSA] Keep single DL instance in DSEState (NFC). Small cleanup, also removes one instance of getting DataLayout without using it later.	2020-08-23 15:56:38 +01:00
Florian Hahn	5e7e2162d4	[DSE,MemorySSA] Use BatchAA for AA queries. We can use BatchAA to avoid some repeated AA queries. We only remove stores, so I think we will get away with using a single BatchAA instance for the complete run. The changes in AliasAnalysis.h mirror the changes in D85583. The change improves compile-time by roughly 1%. http://llvm-compile-time-tracker.com/compare.php?from=67ad786353dfcc7633c65de11601d7823746378e&to=10529e5b43809808e8c198f88fffd8f756554e45&stat=instructions This is part of the patches to bring down compile-time to the level referenced in http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86275	2020-08-22 08:36:35 +01:00
Florian Hahn	9f7350672e	[DSE,MemorySSA] Handle atomicrmw/cmpxchg conservatively. This adds conservative handling of AtomicRMW/AtomicCmpXChg to isDSEBarrier, similar to atomic loads and stores.	2020-08-21 10:42:42 +01:00
Florian Hahn	a0e92ffd0d	[DSE,MemorySSA] Split off partial tracking from isOverwrite. When traversing memory uses to look for aliasing reads/writes, we only care about complete overwrites. This patch splits the partial overwrite tracking off from isOverwrite, which avoids some unnecessary work when checking for read/write clobbers with MemorySSA-DSE. This gives a relatively small improvement: http://llvm-compile-time-tracker.com/compare.php?from=ef2a2f77f87553a0a4a39f518eb9ac86b756bda6&to=658f3905dd96d3415f3782adc712c79fa59a4665&stat=instructions This is part of the patches to bring down compile-time to the level referenced in http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86280	2020-08-21 09:13:59 +01:00
Florian Hahn	c0cbe6453a	[DSE] Remove dead argument from removePartiallyOverlappedStores (NFC). The argument is unused and can be removed.	2020-08-19 19:33:52 +01:00
Florian Hahn	1a55fbceaa	[DSE,MemorySSA] Use NumRedundantStores instead of NumNoopStores. Legacy DSE uses NumRedundantStores, while MemorySSA DSE uses NumNoopStores. We should just use the same counter.	2020-08-19 08:50:33 +01:00
Florian Hahn	4cc20aa743	[DSE,MemorySSA] Skip access already dominated by a killing def. If we already found a killing def (= a def that completely overwrites the location) that dominates an access, we can skip processing it further. This does not help with compile-time, but increases the number of memory accesses we can process with the same scan budget, leading to more stores being eliminated. Improvements with this change Same hash: 203 (filtered out) Remaining: 34 Metric: dse.NumFastStores Program base dom diff test-suite...rolangs-C++/family/family.test 2.00 4.00 100.0% test-suite...ProxyApps-C++/CLAMR/CLAMR.test 172.00 229.00 33.1% test-suite...ks/Prolangs-C/agrep/agrep.test 10.00 12.00 20.0% test-suite...oxyApps-C++/miniFE/miniFE.test 44.00 51.00 15.9% test-suite...marks/7zip/7zip-benchmark.test 1285.00 1474.00 14.7% test-suite...006/450.soplex/450.soplex.test 254.00 289.00 13.8% test-suite...006/447.dealII/447.dealII.test 2466.00 2798.00 13.5% test-suite...000/197.parser/197.parser.test 9.00 10.00 11.1% test-suite.../Benchmarks/nbench/nbench.test 85.00 91.00 7.1% test-suite...ce/Applications/siod/siod.test 68.00 72.00 5.9% test-suite...ications/JM/lencod/lencod.test 786.00 824.00 4.8% test-suite...6/464.h264ref/464.h264ref.test 765.00 798.00 4.3% test-suite.../Benchmarks/Ptrdist/bc/bc.test 105.00 109.00 3.8% test-suite...lications/obsequi/Obsequi.test 29.00 28.00 -3.4% test-suite...3.xalancbmk/483.xalancbmk.test 1322.00 1367.00 3.4% test-suite...chmarks/MallocBench/gs/gs.test 118.00 122.00 3.4% test-suite...T2006/401.bzip2/401.bzip2.test 60.00 62.00 3.3% test-suite...6/482.sphinx3/482.sphinx3.test 30.00 31.00 3.3% test-suite...rks/tramp3d-v4/tramp3d-v4.test 862.00 887.00 2.9% test-suite...telecomm-gsm/telecomm-gsm.test 78.00 80.00 2.6% test-suite...ediabench/gsm/toast/toast.test 78.00 80.00 2.6% test-suite.../Applications/SPASS/SPASS.test 163.00 167.00 2.5% test-suite...lications/ClamAV/clamscan.test 240.00 245.00 2.1% test-suite...006/453.povray/453.povray.test 1392.00 1419.00 1.9% test-suite...000/255.vortex/255.vortex.test 211.00 215.00 1.9% test-suite...:: External/Povray/povray.test 1295.00 1317.00 1.7% test-suite...lications/sqlite3/sqlite3.test 175.00 177.00 1.1% test-suite...T2000/256.bzip2/256.bzip2.test 99.00 100.00 1.0% test-suite...0/253.perlbmk/253.perlbmk.test 629.00 635.00 1.0% test-suite.../CINT2006/403.gcc/403.gcc.test 1183.00 1194.00 0.9% test-suite.../CINT2000/176.gcc/176.gcc.test 647.00 653.00 0.9% test-suite...ications/JM/ldecod/ldecod.test 512.00 516.00 0.8% test-suite...0.perlbench/400.perlbench.test 1026.00 1034.00 0.8% test-suite...-typeset/consumer-typeset.test 1876.00 1877.00 0.1% Geomean difference 7.3%	2020-08-17 20:54:48 +01:00
Florian Hahn	df4756ec6c	[DSE,MemorySSA] Check for underlying objects first. isWriteAtEndOfFunction needs to check all memory uses of Def, which is much more expensive than getting the underlying objects in practice. Switch the call order, as recommended by the TODO, which was added as per an earlier review. This shaves off a bit of compile-time.	2020-08-17 18:52:18 +01:00
Florian Hahn	139810449b	[DSE,MemorySSA] Account for ScanLimit == 0 on entry. Currently the code does not account for the fact that getDomMemoryDef can be called with ScanLimit == 0, if we reached the limit while processing an earlier access. Also tighten the check a bit more and bump the scan limit now that it is handled properly. In some cases, this brings a 2x speedup in terms of compile-time.	2020-08-17 17:55:14 +01:00
Florian Hahn	3b0878a370	[DSE,MSSA] Fix crash when using tryToMergePartialOverlappingStores. We are re-using tryToMergePartialOverlappingStores, which requires earlier to dominate Later. In the long run, tryToMergePartialOverlappingStores should be re-written using MemorySSA. Fixes PR46513.	2020-08-13 12:07:56 +01:00
Vitaly Buka	b0eb40ca39	[NFC] Remove unused GetUnderlyingObject parameter Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621	2020-07-31 02:10:03 -07:00
Vitaly Buka	89051ebace	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Matt Arsenault	023883a834	IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref When the byref attribute is added, there will need to be two similar functions for the existing cases which have an associated value copy, and byref which does not. Most, but not all of the existing uses will use the existing version. The associated size function added by D82679 also needs to contextually differ, and will help eliminate a few places still relying on pointee element types.	2020-07-16 13:50:49 -04:00
John Brawn	20854d85e1	[DSE,MSSA] Recognise init_trampoline in getLocForWriteEx This fixes an instance where MemorySSA-using Dead Store Elimination is failing to do a transformation that the non-MemorySSA-using version does. Differential Revision: https://reviews.llvm.org/D83783	2020-07-15 12:18:58 +01:00
Florian Hahn	80970ac875	[DSE,MSSA] Eliminate stores by terminators (free,lifetime.end). This patch adds support for eliminating stores by free & lifetime.end calls. We can remove stores that are not read before calling a memory terminator and we can eliminate all stores after a memory terminator until we see a new lifetime.start. The second case seems to not really trigger much in practice though. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72410	2020-07-08 08:59:46 +01:00
Nicolai Hähnle	dfcc68c528	DomTree: Remove getRoots() accessor Summary: Avoid exposing details about how roots are stored. This enables subsequent type-erasure changes. v5: - cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d Reviewers: arsenm, RKSimon, mehdi_amini, courbet Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83085	2020-07-06 21:58:11 +02:00
Nuno Lopes	7f903873b8	DSE: fix builtin function recognition to take decl into account	2020-07-02 10:28:47 +01:00
Florian Hahn	4837daf883	[DSE,MSSA] Check if Def is removable only when we try to remove it. Non-removable MemoryDefs can still eliminate other defs. Update the isRemovable checks to apply only to candidates for removal.	2020-06-25 14:01:10 +01:00
Florian Hahn	4e62c6359c	[DSE] Eliminate stores at the end of the function. This patch adds support for eliminating MemoryDefs that do not have any aliasing users, which indicates that there are no reads/writes to the memory location until the end of the function. To eliminate such defs, we have to ensure that the underlying object is not visible in the caller and does not escape via returning. We need a separate check for that, as InvisibleToCaller does not consider returns. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker, george.burgess.iv Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72631	2020-06-24 12:58:20 +01:00
Florian Hahn	ff4de8683a	[DSE,MSSA] Treat `store 0` after calloc as noop stores. This patch extends storeIsNoop to also detect stores of 0 to an calloced object. This basically ports the logic from legacy DSE to the MemorySSA backed version. It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3 LTO: Same hash: 218 (filtered out) Remaining: 19 Metric: dse.NumNoopStores Program base patch2 diff test-suite...CFP2000/177.mesa/177.mesa.test 1.00 15.00 1400.0% test-suite...6/482.sphinx3/482.sphinx3.test 1.00 14.00 1300.0% test-suite...lications/ClamAV/clamscan.test 2.00 28.00 1300.0% test-suite...CFP2006/433.milc/433.milc.test 1.00 8.00 700.0% test-suite...pplications/oggenc/oggenc.test 2.00 9.00 350.0% test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0% test-suite.../CINT2006/403.gcc/403.gcc.test NaN 137.00 nan% test-suite...libquantum/462.libquantum.test NaN 3.00 nan% test-suite...6/464.h264ref/464.h264ref.test NaN 7.00 nan% test-suite...decode/alacconvert-decode.test NaN 2.00 nan% test-suite...encode/alacconvert-encode.test NaN 2.00 nan% test-suite...ications/JM/ldecod/ldecod.test NaN 9.00 nan% test-suite...ications/JM/lencod/lencod.test NaN 39.00 nan% test-suite.../Applications/lemon/lemon.test NaN 2.00 nan% test-suite...pplications/treecc/treecc.test NaN 4.00 nan% test-suite...hmarks/McCat/08-main/main.test NaN 4.00 nan% test-suite...nsumer-lame/consumer-lame.test NaN 3.00 nan% test-suite.../Prolangs-C/bison/mybison.test NaN 1.00 nan% test-suite...arks/mafft/pairlocalalign.test NaN 30.00 nan% Reviewers: efriedma, zoecarver, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D82204	2020-06-23 21:01:39 +01:00
Florian Hahn	a822ec75cc	[DSE,MSSA] Treat passed by value args as invisible to caller. This updates the MemorySSA backed implementation to treat arguments passed by value similar to allocas: they are assumed to be invisible to the caller. This is similar to how they are treated in legacy DSE. Reviewers: efriedma, asbirlea, george.burgess.iv Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82222	2020-06-23 08:58:51 +01:00
Florian Hahn	328c8642e2	[DSE,MSSA] Reorder DSE blocking checks. Currently we stop exploring candidates too early in some cases. In particular, we can continue checking the defining accesses of non-removable MemoryDefs and defs without analyzable write location (read clobbers are already ruled out using MemorySSA at this point).	2020-06-22 17:16:34 +01:00
Florian Hahn	0e19ff02d8	[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).	2020-06-22 10:58:53 +01:00
Florian Hahn	40569db7b3	[DSE,MSSA] Move reachability check to main loop. As we traverse the CFG backwards, we could end up reaching unreachable blocks. For unreachable blocks, we won't have computed post order numbers and because DomAccess is reachable, unreachable blocks cannot be on any path from it. This fixes a crash with unreachable blocks.	2020-06-21 16:38:10 +01:00
Florian Hahn	120c059292	[DSE,MSSA] Port partial store merging. Port partial constant store merging logic to MemorySSA backed DSE. The heavy lifting is done by the existing helper function. It is used in context where we already ensured that the later instruction can eliminate the earlier one, if it is a complete overwrite.	2020-06-15 18:41:46 +01:00
Florian Hahn	71a91b9837	[DSE] Hoist partial store merging code into function (NFC). Hoist the general logic into a new function, because it can be re-used by the MemorySSA backed DSE as well.	2020-06-15 17:44:24 +01:00
Florian Hahn	8c61f13a0f	[DSE,MSSA] Delete instruction after printing it. Also enables a now-passing test case, that exposed a crash caused by the wrong order.	2020-06-15 16:01:36 +01:00
Florian Hahn	97e7147e34	[DSE,MSSA] Fix location order in isOverwrite call. isOverwrite expects the later location as the first argument and the earlier location as the second. The adjusted call is intended to check whether CC overwrites DefLoc.	2020-06-13 20:39:00 +01:00
Florian Hahn	67671024c8	[DSE,MSSA] Relax post-dom restriction for objs visible after return. This patch relaxes the post-dominance requirement for accesses to objects visible after the function returns. Instead of requiring the killing def to post-dominate the access to eliminate, the set of 'killing blocks' (= blocks that completely overwrite the original access) is collected. If all paths from the access to eliminate and an exit block go through a killing block, the access can be removed. To check this property, we first get the common post-dominator block for the killing blocks. If this block does not post-dominate the access block, there may be a path from DomAccess to an exit block not involving any killing block. Otherwise we have to check if there is a path from the DomAccess to the common post-dominator, that does not contain a killing block. If there is no such path, we can remove DomAccess. For this check, we start at the common post-dominator and then traverse the CFG backwards. Paths are terminated when we hit a killing block or a block that is not executed between DomAccess and a killing block according to the post-order numbering (if the post order number of a block is greater than the one of DomAccess, the block cannot be in a path starting at DomAccess).
This gives the following improvements on the total number of stores after DSE for MultiSource, SPEC2K, SPEC2006: Tests: 237 Same hash: 206 (filtered out) Remaining: 31 Metric: dse.NumRemainingStores Program base new100 diff test-suite...CFP2000/188.ammp/188.ammp.test 3624.00 3544.00 -2.2% test-suite...ch/g721/g721encode/encode.test 128.00 126.00 -1.6% test-suite.../Benchmarks/Olden/mst/mst.test 73.00 72.00 -1.4% test-suite...CFP2006/433.milc/433.milc.test 3202.00 3163.00 -1.2% test-suite...000/186.crafty/186.crafty.test 5062.00 5010.00 -1.0% test-suite...-typeset/consumer-typeset.test 40460.00 40248.00 -0.5% test-suite...Source/Benchmarks/sim/sim.test 642.00 639.00 -0.5% test-suite...nchmarks/McCat/09-vor/vor.test 642.00 644.00 0.3% test-suite...lications/sqlite3/sqlite3.test 35664.00 35563.00 -0.3% test-suite...T2000/300.twolf/300.twolf.test 7202.00 7184.00 -0.2% test-suite...lications/ClamAV/clamscan.test 19475.00 19444.00 -0.2% test-suite...INT2000/164.gzip/164.gzip.test 2199.00 2196.00 -0.1% test-suite...peg2/mpeg2dec/mpeg2decode.test 2380.00 2378.00 -0.1% test-suite.../Benchmarks/Bullet/bullet.test 39335.00 39309.00 -0.1% test-suite...:: External/Povray/povray.test 36951.00 36927.00 -0.1% test-suite...marks/7zip/7zip-benchmark.test 67396.00 67356.00 -0.1% test-suite...6/464.h264ref/464.h264ref.test 31497.00 31481.00 -0.1% test-suite...006/453.povray/453.povray.test 51441.00 51416.00 -0.0% test-suite...T2006/401.bzip2/401.bzip2.test 4450.00 4448.00 -0.0% test-suite...Applications/kimwitu++/kc.test 23481.00 23471.00 -0.0% test-suite...chmarks/MallocBench/gs/gs.test 6286.00 6284.00 -0.0% test-suite.../CINT2000/254.gap/254.gap.test 13719.00 13715.00 -0.0% test-suite.../Applications/SPASS/SPASS.test 30345.00 30338.00 -0.0% test-suite...006/450.soplex/450.soplex.test 15018.00 15016.00 -0.0% test-suite...ications/JM/lencod/lencod.test 27780.00 27777.00 -0.0% test-suite.../CINT2006/403.gcc/403.gcc.test 105285.00 105276.00 -0.0% There might be potential to pre-compute 
some of the information of which blocks are on the path to an exit for each block, but the overall benefit might be comparatively small. On the set of benchmarks, the CFG check is successful in 15738 of the 20322 times we reach it. The total number of iterations in the CFG check is 187810, so on average we need less than 10 steps in the check loop. Bumping the threshold in the loop from 50 to 150 gives a few small improvements, but I don't think they warrant such a big bump at the moment. This is all pending further tuning in the future. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: george.burgess.iv Differential Revision: https://reviews.llvm.org/D78932	2020-06-10 10:39:25 +01:00
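The backwards CFG walk described in 67671024c8 can be sketched as follows. This is a toy model, not the code from the patch: the names `post_order_numbers` and `killed_on_all_paths`, the dictionary-based CFG, and the `limit` parameter are all hypothetical, and the sketch assumes the caller has already established that the common post-dominator post-dominates the block containing DomAccess and is not itself a killing block.

```python
def post_order_numbers(succs, entry):
    """Number the blocks reachable from `entry` in DFS post-order."""
    numbers, visited = {}, set()

    def visit(block):
        visited.add(block)
        for succ in succs.get(block, []):
            if succ not in visited:
                visit(succ)
        numbers[block] = len(numbers)

    visit(entry)
    return numbers


def killed_on_all_paths(succs, entry, dom_block, killing_blocks,
                        common_pdom, limit=50):
    """Return True if every path from dom_block to common_pdom goes
    through a killing block. `limit` is an assumed budget for the walk."""
    # Build predecessor lists, since the walk goes backwards.
    preds = {}
    for block, block_succs in succs.items():
        for succ in block_succs:
            preds.setdefault(succ, []).append(block)

    po = post_order_numbers(succs, entry)
    upper_bound = po[dom_block]

    worklist = list(preds.get(common_pdom, []))
    seen = set(worklist)
    i = 0
    while i < len(worklist):
        current = worklist[i]
        i += 1
        if current in killing_blocks:
            continue  # this path is terminated by an overwrite
        if current == dom_block:
            return False  # reached DomAccess without passing a killing block
        if current not in po:
            continue  # unreachable block, cannot be on a path from dom_block
        if po[current] > upper_bound:
            continue  # not on a path starting at dom_block
        for pred in preds.get(current, []):
            if pred not in seen:
                seen.add(pred)
                worklist.append(pred)
        if len(worklist) >= limit:
            return False  # give up conservatively
    return True


diamond = {"entry": ["A"], "A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
bypass = {"entry": ["A"], "A": ["B", "C", "E"], "B": ["D"], "C": ["D"],
          "E": ["D"], "D": []}
print(killed_on_all_paths(diamond, "entry", "A", {"B", "C"}, "D"))
print(killed_on_all_paths(bypass, "entry", "A", {"B", "C"}, "D"))
```

In the `diamond` CFG both branches overwrite the location, so the store in `A` could be removed; in `bypass` the extra block `E` offers a path around the killing blocks, so the check fails.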
Chris Jackson	c6c65164af	[DebugInfo] Reduce SalvageDebugInfo() functions - Now all SalvageDebugInfo() calls will mark undef if the salvage attempt fails. Reviewed by: vsk, Orlando Differential Revision: https://reviews.llvm.org/D78369	2020-06-08 19:28:18 +01:00
Benjamin Kramer	3badd17b69	SmallPtrSet::find -> SmallPtrSet::count The latter is more readable and more efficient. While there clean up some double lookups. NFCI.	2020-06-07 22:38:08 +02:00
serge-sans-paille	424510095d	Correctly report modified status for DSE Differential Revision: https://reviews.llvm.org/D81233	2020-06-05 15:59:42 +02:00
zoecarver	065bf124fd	[DSE] Remove noop stores in MSSA. Adds a simple fast-path check for the pattern: v = load ptr store v to ptr I took the tests from the bugzilla post, I can add more if needed (but I think these should be sufficient). Refs: https://bugs.llvm.org/show_bug.cgi?id=45795 Differential Revision: https://reviews.llvm.org/D79391	2020-05-30 09:57:30 -07:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Arthur Eubanks	a90948fd6e	[NFC] Rename ByValOrInalloca to PassPointeeByValue Summary: In preparation for preallocated. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79152	2020-04-30 09:42:13 -07:00
Florian Hahn	2f3e86b318	[DSE,MSSA] Continue checking more remaining candidates with dbgcnt. After changing the candidate iteration strategy, we should continue with the next candidate, rather than breaking out of the loop.	2020-04-26 16:59:32 +01:00
Florian Hahn	46a04940e8	[DSE] Add stat for remaining stores after DSE. Using the existing NumFastStores statistic can be misleading when comparing the impact of DSE patches. For example, consider the case where a store gets removed from a function before it is inlined into another function. A less powerful DSE might only remove the store from functions it has been inlined into, which will result in more stores being removed, but no difference in the actual number of stores after DSE. The new stat provides the absolute number of stores surviving after DSE. Reviewers: dmgreen, bryant, asbirlea, jfb Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D78830	2020-04-25 16:12:55 +01:00
Florian Hahn	e1235831c4	[DSE,MSSA] Improve debug output (NFC). This patch slightly improves the formatting of the debug output, adds a few missing outputs and makes some existing outputs more consistent with the rest.	2020-04-24 17:50:08 +01:00
Florian Hahn	44ce588670	[DSE,MSSA] Skip checking write clobber for DomAccess (NFC). There is no need to check if the starting access is a write clobber, and all of its uses have already been checked.	2020-04-24 17:16:22 +01:00
Mircea Trofin	ceb7f308b8	[llvm][NFC][CallSite] Removed CallSite from few implementation details Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78724	2020-04-23 10:36:36 -07:00
Florian Hahn	cf9ee49b4d	[DSE] Lift post-dominance for objs not accessible in caller. We can eliminate MemoryDefs of objects not accessible after the function returns (e.g. alloca), if there are no reads between the MemoryDef and any function exits. We can stop traversing paths that completely overwrite the memory location of the MemoryDef. This patch was split off D73763. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: asbirlea, george.burgess.iv Differential Revision: https://reviews.llvm.org/D77736	2020-04-15 11:37:14 +01:00
Tyker	086de7673e	[AssumeBundles] preserve knowledge in DSE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77404	2020-04-14 12:48:15 +02:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef, if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Chris Jackson	135709aa90	[DebugInfo] Ensure dead store elimination can mark an operand value as undefined - Correct a debug info salvage and add a test Reviewers: aprantl, vsk Differential Revision: https://reviews.llvm.org/D76930 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45080	2020-03-30 14:58:14 +01:00
Florian Hahn	3a8372ed02	[DSE] Support traversing MemoryPhis. For MemoryPhis, we have to avoid that the MemoryPhi may be executed before the access we are currently looking at. To do this we do a post-order numbering of the basic blocks in the function and bail out once we reach a MemoryPhi with a larger (or equal) post-order block number than the current MemoryAccess. This changes the order in which we visit stores for elimination. This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72148	2020-03-20 07:51:42 +00:00
Artur Pilipenko	02e3d5c3a2	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2): a = calloc(n+1) for (int i = 0; i < n; i++) { store 1, a[i+1] // (1) store 0, a[i] // (2) } The fix is to do PHI translation while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
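To see why removing store (2) in 02e3d5c3a2 is a miscompile, the loop can be modelled directly. This is an illustrative sketch only: `run_loop` and its `keep_store_2` flag are hypothetical names, and a Python list of zeros stands in for the calloc'ed buffer.

```python
def run_loop(n, keep_store_2=True):
    """Model the loop from the commit message on a zero-initialised buffer."""
    a = [0] * (n + 1)          # stands in for a = calloc(n+1)
    for i in range(n):
        a[i + 1] = 1           # store (1)
        if keep_store_2:
            a[i] = 0           # store (2)
    return a


print(run_loop(3))                      # original program
print(run_loop(3, keep_store_2=False))  # after the incorrect elimination
```

Store (2) looks like a redundant zero store into calloc'ed memory, but for `i > 0` the slot `a[i]` was set to 1 by store (1) in the previous iteration, so the two runs produce different buffers.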
Florian Hahn	b8d638d337	[DSE,MSSA] Do not attempt to remove un-removable memdefs. We have to skip MemoryDefs that cannot be removed. This fixes a crash in the newly added test case and fixes a wrong case in memset-and-memcpy.ll.	2020-02-25 13:31:46 +00:00
Florian Hahn	af69d5e10e	[DSE] Track overlapping stores. Add a map from BasicBlocks to overlap intervals. For partial writes, we can keep track of those in IOLs. We only add candidates that are valid for eliminations. Reviewers: dmgreen, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D73757	2020-02-23 15:44:40 +00:00
Florian Hahn	134bab7cd5	[DSE,MSSA] Add debug counter. Can be used like -debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20 Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72147	2020-02-21 17:04:37 +00:00
Reid Kleckner	0c2b09a9b6	[IR] Lazily number instructions for local dominance queries Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly deletes dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664	2020-02-18 14:44:24 -08:00
Florian Hahn	81dbb6aec6	Recommit "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This includes a fix for the sanitizer failures. This reverts the revert commit `42f8b915eb`.	2020-02-12 14:17:50 +00:00
Kadir Cetinkaya	42f8b915eb	Revert "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This reverts commit `d0c4d4fe09`. Revert "[DSE,MSSA] Move more passing test cases from todo to simple.ll." This reverts commit `02266e64bb`. Revert "[DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE." This reverts commit `74f03e4ff0`.	2020-02-11 15:34:48 +01:00
Florian Hahn	d0c4d4fe09	[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk). This patch adds a first version of a MemorySSA based DSE. It is missing a lot of features, which will get added as follow-ups, to help to keep the review manageable. The patch uses the following general approach: given a MemoryDef, walk upwards to find clobbering MemoryDefs that may be killed by the starting def. Then check that there are no uses that may read the location of the original MemoryDef in between both MemoryDefs. A bit more concretely: For all MemoryDefs StartDef: 1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards. 2. Check that there are no reads between DomAccess and the StartDef by checking all uses starting at DomAccess and walking until we see StartDef. 3. For each found DomDef, check that: 1. There are no barrier instructions between DomDef and StartDef (like throws or stores with ordering constraints). 2. StartDef is executed whenever DomDef is executed. 3. StartDef completely overwrites DomDef. 4. Erase DomDef from the function and MemorySSA. The patch uses a very simple approach to guarantee that no throwing instructions are between 2 stores: We only allow accesses to stack objects, accesses that are in the same basic block if the block does not contain any throwing instructions or accesses in functions that do not contain any throwing instructions. This will get lifted later. Besides adding support for the missing cases, there is plenty of additional potential for improvements as follow-up work, e.g. the way we visit stores (could be just a traversal of the MemorySSA, rather than collecting them up-front), using the alias information discovered during walking to optimize the MemorySSA. This is loosely based on D40480 by Dave Green. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72700	2020-02-10 11:52:11 +00:00
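The bottom-up walk outlined in d0c4d4fe09 can be illustrated on a single straight-line block. This is a heavily simplified sketch and not the MemorySSA-based implementation: `eliminate_dead_stores` and the tuple encoding of instructions are hypothetical, locations are assumed to either match exactly or not alias at all, and barriers, partial overwrites, and control flow are ignored.

```python
def eliminate_dead_stores(block):
    """block: list of ("store", loc, value) or ("load", loc) tuples.

    For each store (the starting def), walk upwards to the nearest earlier
    access of the same location. If that access is a store, it is fully
    overwritten with no read in between, so it is dead.
    """
    dead = set()
    for k, inst in enumerate(block):
        if inst[0] != "store":
            continue
        loc = inst[1]
        for j in range(k - 1, -1, -1):
            earlier = block[j]
            if earlier[1] != loc:
                continue       # unrelated location, keep walking upwards
            if earlier[0] == "store":
                dead.add(j)    # overwritten before any read
            break              # a read or the killed store ends the walk
    return [inst for idx, inst in enumerate(block) if idx not in dead]


block = [
    ("store", "p", 1),   # dead: overwritten by the next store to p
    ("store", "q", 2),
    ("store", "p", 3),   # live: read by the load below
    ("load", "p"),
    ("store", "p", 4),
]
print(eliminate_dead_stores(block))
```

Only the first store to `p` is removed; the second one is read before being overwritten, which corresponds to step 2 of the commit message (no reads between the two defs).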
Florian Hahn	da52b9c118	[DSE] Add tests for MemorySSA based DSE. This copies the DSE tests into a MSSA subdirectory to test the MemorySSA backed DSE implementation, without disturbing the original tests. Differential Revision: https://reviews.llvm.org/D72145	2020-02-10 10:28:43 +00:00
Artur Pilipenko	34547ac959	NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween Separated from https://reviews.llvm.org/D68006 review.	2020-01-31 15:22:33 -08:00

452 Commits