llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	6439fde6d4	[DSE] Bail out from getLocForWriteEx if call is not argmemonly/inacc_mem. This change should currently not have any impact, but guard against further inconsistencies between MemoryLocation and function attributes.	2020-10-20 14:37:53 +01:00
Florian Hahn	f5cf7f544b	[DSE] Do not consider 'noop' intrinsics as read-clobbers. isNoopIntrinsic returns true for some intrinsics that are modeled in MemorySSA but do not actually read or write any memory and do not block DSE. Such intrinsics should not be considered as read-clobbers.	2020-10-18 15:51:05 +01:00
Florian Hahn	51ff04567b	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." After investigation by @asbirlea, the issue that caused the revert appears to be an issue in the original source, rather than a problem with the compiler. This patch enables MemorySSA DSE again. This reverts commit `915310bf14`.	2020-10-16 09:02:53 +01:00
zoecarver	6c25816d7b	[DSE] Look through memory PHI arguments when removing noop stores in MSSA. Summary: Adds support for "following" memory through MSSA PHI arguments. This will help catch more noop stores that exist between blocks. Originally part of D79391. Reviewers: fhahn, jfb, asbirlea Differential Revision: https://reviews.llvm.org/D82588	2020-10-01 10:42:02 -07:00
Florian Hahn	915310bf14	Revert "[DSE] Switch to MemorySSA-backed DSE by default." There appears to be a mis-compile with MemorySSA-backed DSE in combination with llvm.lifetime.end. It currently appears like DSE is doing the right thing and the llvm.lifetime.end markers are incorrect. The reverted patch uncovers the mis-compile. This patch temporarily switches back to the legacy DSE implementation, while we investigate. This reverts commit `9d172c8e9c`.	2020-09-26 18:35:27 +01:00
Florian Hahn	8f0466edc0	[DSE] Unify & fix mem terminator location checks. When looking for memory defs killed by memory terminators the code currently incorrectly ignores the size argument of llvm.lifetime.end. This patch updates the code to use isMemTerminator and updates isMemTerminator to use isOverwrite() to make sure locations that are outside the range marked as dead by llvm.lifetime.end are not considered. Note that isOverwrite is only used for llvm.lifetime.end, because free-like functions make the whole underlying object dead.	2020-09-26 13:47:50 +01:00
Florian Hahn	9d172c8e9c	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." This switches to using DSE + MemorySSA by default again, after fixing the issues reported after the first commit. Notable fixes `fc82006331`, `a0017c2bc2`. This reverts commit `3a59628f3c`.	2020-09-18 11:05:00 +01:00
Florian Hahn	3a59628f3c	Revert "[DSE] Switch to MemorySSA-backed DSE by default." This reverts commit `fb109c42d9`. Temporarily revert due to a mis-compile pointed out at D87163.	2020-09-15 18:07:56 +01:00
Florian Hahn	f715d81c9d	[DSE] Only eliminate candidates that always store the same loc. AliasAnalysis/MemoryLocation does not account for loops. Two MemoryLocations can be must-overwrite, even if the first one writes multiple locations in a loop. This patch prevents removing such stores, by only considering candidates that are known to be loop invariant, or executed in the same BB. Currently the invariant check is quite conservative and only considers Alloca and Alloca-like instructions and arguments as invariant base pointers. It also considers GEPs with all constant indices and invariant bases as invariant. This can be improved in the future, but the current implementation has only minor impact on the total number of stores eliminated (25903 vs 26047 for the baseline). There are some 2-10% swings for some individual benchmarks. In roughly half of the cases, the number of stores removed actually increases, because we skip candidates that are unlikely to be valid candidates early.	2020-09-14 12:06:58 +01:00
Florian Hahn	e082dee2b5	[DSE] Bail out on MemoryPhis when deleting stores at end of function. When deleting stores at the end of a function, we have to do PHI translation, otherwise we might miss reads in different iterations of a loop. See multiblock-loop-carried-dependence.ll for details. This fixes a mis-compile and surprisingly also increases the number of eliminated stores from 26047 to 26572 for MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto. This is most likely because we save budget by not exploring through MemoryPhis, which are less likely to result in valid candidates for elimination. The issue was reported post-commit for `fb109c42d9`.	2020-09-12 19:05:59 +01:00
Krzysztof Parzyszek	f92908cc74	[DSE] Make sure that DSE+MSSA can handle masked stores Differential Revision: https://reviews.llvm.org/D87414	2020-09-11 10:00:21 -05:00
Florian Hahn	fb109c42d9	[DSE] Switch to MemorySSA-backed DSE by default. The tests have been updated and I plan to move them from the MSSA directory up. Some end-to-end tests needed small adjustments. One difference to the legacy DSE is that legacy DSE also deletes trivially dead instructions that are unrelated to memory operations. Because MemorySSA-backed DSE just walks the MemorySSA, we only visit/check memory instructions. But removing unrelated dead instructions is not really DSE's job and other passes will clean up. One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll, but I think this comes down to legacy DSE not handling instructions that may throw correctly in that case. To cover this with MemorySSA-backed DSE, we need an update to llvm.coro.begin to treat its return value as belonging to the same underlying object as the passed pointer. There are some minor cases MemorySSA-backed DSE currently misses, e.g. related to atomic operations, but I think those can be implemented after the switch. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores goes from ~17500 (legacy DSE) to ~26300 (MemorySSA-backed). More numbers and details in the thread on llvm-dev. Impact on CTMark:
```
Legacy Pass Manager
                        exec instrs    size-text
O3                      + 0.60%        - 0.27%
ReleaseThinLTO          + 1.00%        - 0.42%
ReleaseLTO-g.           + 0.77%        - 0.33%
RelThinLTO (link only)  + 0.87%        - 0.42%
RelLO-g (link only)     + 0.78%        - 0.33%
```
http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
```
New Pass Manager
                        exec instrs.   size-text
O3                      + 0.95%        - 0.25%
ReleaseThinLTO          + 1.34%        - 0.41%
ReleaseLTO-g.           + 1.71%        - 0.35%
RelThinLTO (link only)  + 0.96%        - 0.41%
RelLO-g (link only)     + 2.21%        - 0.35%
```
http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions Reviewed By: asbirlea, xbolva00, nikic Differential Revision: https://reviews.llvm.org/D87163	2020-09-10 22:24:32 +01:00
Florian Hahn	a5ec99da6e	[DSE] Support eliminating memcpy.inline. MemoryLocation has been taught about memcpy.inline, which means we can get the memory locations read and written by it. This means DSE can handle memcpy.inline	2020-09-10 13:19:25 +01:00
Florian Hahn	9969c317ff	[DSE,MemorySSA] Handle atomic stores explicitly in isReadClobber. Atomic stores are modeled as MemoryDef to model the fact that they may not be reordered, depending on the ordering constraints. Atomic stores that are monotonic or weaker do not limit re-ordering, so we do not have to treat them as potential read clobbers. Note that llvm/test/Transforms/DeadStoreElimination/MSSA/atomic.ll already contains a set of negative test cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87386	2020-09-09 23:01:58 +01:00
Krzysztof Parzyszek	81ff2d30a9	[DSE] Handle masked stores	2020-09-09 13:31:31 -05:00
Florian Hahn	c7b7c32f4a	[DSE,MemorySSA] Increase walker limit a bit. This slightly bumps the walker limit so that it covers more cases while not increasing compile-time too much: http://llvm-compile-time-tracker.com/compare.php?from=0fc1c2b51ba0cfb9145139af35be638333865251&to=91144a50ea4fa82c0c877e77784f60371640b263&stat=instructions	2020-09-08 14:55:46 +01:00
Florian Hahn	efb8e156da	[DSE,MemorySSA] Add an early check for read clobbers to traversal. Depending on the benchmark, this early exit can save a substantial amount of compile-time: http://llvm-compile-time-tracker.com/compare.php?from=505f2d817aa8e07ba98e5fd4a8f6ff0666f89df1&to=eb4e441147f9b4b7a5fcbbc57428cadbe9e01f10&stat=instructions	2020-09-07 23:22:10 +01:00
Florian Hahn	16bb71fd4f	[DSE,MemorySSA] Add a few additional debug messages.	2020-09-06 20:31:00 +01:00
Florian Hahn	00eb6fef08	[DSE,MemorySSA] Check for throwing instrs between killing/killed def. We also have to check all uses between the killing & killed def and check if any of them is throwing.	2020-09-04 18:54:59 +01:00
Florian Hahn	86d817d7cf	[DSE,MemorySSA] Skip defs without analyzable write locations. Similar to other checks above, if there is no write location for a def, it cannot be considered for elimination and can be skipped.	2020-08-30 21:56:25 +01:00
Florian Hahn	42c57c294d	[DSE,MemorySSA] Simplify code, EarlierAccess is a MemoryDef (NFC). After recent changes, we return early if Current is a MemoryPhi, so EarlierAccess can only be a MemoryDef.	2020-08-30 21:31:57 +01:00
Florian Hahn	31cdb29de4	[DSE,MemorySSA] Return early when hitting a MemoryPhi. A MemoryPhi can never be eliminated. If we hit one, return the Phi, so the caller can continue traversing the incoming accesses. This saves some unnecessary read clobber checks and improves compile-time http://llvm-compile-time-tracker.com/compare.php?from=1ffc58b6d098ce8fa71f3a80fe75b990f633f921&to=d0fa8d1982380b57d7b6067528104bc373dbe07a&stat=instructions	2020-08-29 18:28:26 +01:00
Florian Hahn	43aa7227df	[DSE,MemorySSA] Check if Current is valid for elimination first. This changes getDomMemoryDef to check if Current is a valid candidate for elimination before checking for reads. Before the change, we were spending a lot of compile-time in checking for read accesses for Current that might not even be removable. This patch flips the logic, so we skip Current if it cannot be removed before checking all its uses. This is much more efficient in practice. It also adds a more aggressive limit for checking partially overlapping stores. The main problem with overlapping stores is that we do not know if they will lead to elimination until seeing all of them. This patch adds a new limit for overlapping store candidates, which keeps the number of modified overlapping stores roughly the same. This is another substantial compile-time improvement (while also increasing the number of stores eliminated). Geomean -O3 -0.67%, ReleaseThinLTO -0.97%. http://llvm-compile-time-tracker.com/compare.php?from=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&to=2e630629b43f64b60b282e90f0d96082fde2dacc&stat=instructions Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86487	2020-08-28 11:19:04 +01:00
Florian Hahn	bb024c3c4e	[DSE,MemorySSA] Remove short-cut to check if all paths are covered. The post-order number early continue does not work in some cases, e.g. if a path from EarlierAccess to an exit includes a node that dominates EarlierAccess in a cycle. The short-cut only has very minor impact on compile-time, so it seems straight-forward to remove it for now: http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions Fixes PR47285.	2020-08-27 12:42:40 +01:00
Florian Hahn	e717fdb0f1	[DSE,MemorySSA] Traverse use-def chain without MemSSA Walker. For DSE with MemorySSA it is beneficial to manually traverse the defining access, instead of using a MemorySSA walker, so we can better control the number of steps together with other limits and also weed out invalid/unprofitable paths early on. This patch requires a follow-up patch to be most effective, which I will share soon after putting this patch up. This temporarily XFAIL's the limit tests, because we now explore more MemoryDefs that may not alias/clobber the killing def. This will be improved/fixed by the follow-up patch. This patch also renames some `Dom` variables to `Earlier`, because the dominance relation is not really used/important here and potentially confusing. This patch allows us to aggressively cut down compile time, geomean -O3 -0.64%, ReleaseThinLTO -1.65%, at the expense of fewer stores removed. Subsequent patches will increase the number of removed stores again, while keeping compile-time in check. http://llvm-compile-time-tracker.com/compare.php?from=d8e3294118a8c5f3f97688a704d5a05b67646012&to=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&stat=instructions Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86486	2020-08-27 10:02:02 +01:00
Florian Hahn	e19ef1aab5	[DSE,MemorySSA] Cache accesses with/without reachable read-clobbers. Currently we repeatedly check the same uses for read clobbers in some cases. We can avoid unnecessary checks by keeping track of the memory accesses we already found read clobbers for. To do so, we just add memory accesses causing read-clobbers to a set. Note that marking all visited accesses as read-clobbers would be too pessimistic, as that might include accesses not on any path to the actual read clobber. If we do not find any read-clobbers, we can add all visited instructions to another set and use that to skip the same accesses in the next call. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D75025	2020-08-25 08:48:46 +01:00
Florian Hahn	d1a1cce5b1	[DSE,MemorySSA] Do not use callCapturesBefore in isReadClobber. Using callCapturesBefore potentially improves the precision and the number of stores we can remove. But in practice, it seems to have very little impact in terms of stores removed. For example, for SPEC2000/SPEC2006/MultiSource with -O3 -flto, ~50 more stores are removed (out of ~26900 stores removed). But in terms of compile-time, it is very expensive and the patch gives substantial compile-time improvements: Geomean O3 -0.24%, ReleaseThinLTO -0.47%, ReleaseLTO-g -0.39%. http://llvm-compile-time-tracker.com/compare.php?from=612a0bff88ed906c83b82f079d4c49e5fecfb9d0&to=e6c86b96d20d97dd88e903a409bd8d39b6114312&stat=instructions	2020-08-24 16:19:42 +01:00
Florian Hahn	b99a5eb659	[DSE,MemorySSA] Delay PointerMayBeCaptured calls until actually needed. Avoid computing InvisibleToCallerBefore/AfterRet up front. In most cases, this information is not really needed. Instead, introduce helper functions to compute and cache the result on demand. Notably, this also does not use PointerMayBeCapturedBefore for isInvisibleToCallerBeforeRet, as it requires the killing MemoryDef as starting instruction, making the caching ineffective. But it appears the use of PointerMayBeCapturedBefore has very limited benefits in practice (e.g. on SPEC2000/SPEC2006/MultiSource there are no binary changes with -O3 -flto). Refrain from using it for now, to limit-compile-time. This gives some nice compile-time improvements: http://llvm-compile-time-tracker.com/compare.php?from=db9345f6810f379a36752dc52caf5230585d0ebd&to=b4d091047e1b8a3d377d200137b79d03aca65663&stat=instructions	2020-08-24 14:05:44 +01:00
Florian Hahn	2431b143ae	[DSE,MemorySSA] Limit elimination at end of function to single UO. Limit elimination of stores at the end of a function to MemoryDefs with a single underlying object, to save compile time. In practice, the case with multiple underlying objects does not seem very important. For -O3 -flto on MultiSource/SPEC2000/SPEC2006 this results in a total of 2 more stores being eliminated. We can always re-visit that in the future.	2020-08-24 13:00:17 +01:00
Florian Hahn	2843c9fe0a	[DSE,MemorySSA] Keep single DL instance in DSEState (NFC). Small cleanup, also removes one instance of getting DataLayout without using it later.	2020-08-23 15:56:38 +01:00
Florian Hahn	5e7e2162d4	[DSE,MemorySSA] Use BatchAA for AA queries. We can use BatchAA to avoid some repeated AA queries. We only remove stores, so I think we will get away with using a single BatchAA instance for the complete run. The changes in AliasAnalysis.h mirror the changes in D85583. The change improves compile-time by roughly 1%. http://llvm-compile-time-tracker.com/compare.php?from=67ad786353dfcc7633c65de11601d7823746378e&to=10529e5b43809808e8c198f88fffd8f756554e45&stat=instructions This is part of the patches to bring down compile-time to the level referenced in http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86275	2020-08-22 08:36:35 +01:00
Florian Hahn	9f7350672e	[DSE,MemorySSA] Handle atomicrmw/cmpxchg conservatively. This adds conservative handling of AtomicRMW/AtomicCmpXChg to isDSEBarrier, similar to atomic loads and stores.	2020-08-21 10:42:42 +01:00
Florian Hahn	a0e92ffd0d	[DSE,MemorySSA] Split off partial tracking from isOverwrite. When traversing memory uses to look for aliasing reads/writes, we only care about complete overwrites. This patch splits off the partial overwrite tracking from isOverwrite, so that isOverwrite itself skips the partial overwrite tracking. This avoids some unnecessary work when checking for read/write clobbers with MemorySSA-DSE. This gives a relatively small improvement: http://llvm-compile-time-tracker.com/compare.php?from=ef2a2f77f87553a0a4a39f518eb9ac86b756bda6&to=658f3905dd96d3415f3782adc712c79fa59a4665&stat=instructions This is part of the patches to bring down compile-time to the level referenced in http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86280	2020-08-21 09:13:59 +01:00
Florian Hahn	c0cbe6453a	[DSE] Remove dead argument from removePartiallyOverlappedStores (NFC). The argument is unused and can be removed.	2020-08-19 19:33:52 +01:00
Florian Hahn	1a55fbceaa	[DSE,MemorySSA] Use NumRedundantStores instead of NumNoopStores. Legacy DSE uses NumRedundantStores, while MemorySSA DSE uses NumNoopStores. We should just use the same counter.	2020-08-19 08:50:33 +01:00
Florian Hahn	4cc20aa743	[DSE,MemorySSA] Skip access already dominated by a killing def. If we already found a killing def (= a def that completely overwrites the location) that dominates an access, we can skip processing it further. This does not help with compile-time, but increases the number of memory accesses we can process with the same scan budget, leading to more stores being eliminated. Improvements with this change:
Same hash: 203 (filtered out)
Remaining: 34
Metric: dse.NumFastStores
Program                                          base      dom       diff
test-suite...rolangs-C++/family/family.test      2.00      4.00      100.0%
test-suite...ProxyApps-C++/CLAMR/CLAMR.test      172.00    229.00    33.1%
test-suite...ks/Prolangs-C/agrep/agrep.test      10.00     12.00     20.0%
test-suite...oxyApps-C++/miniFE/miniFE.test      44.00     51.00     15.9%
test-suite...marks/7zip/7zip-benchmark.test      1285.00   1474.00   14.7%
test-suite...006/450.soplex/450.soplex.test      254.00    289.00    13.8%
test-suite...006/447.dealII/447.dealII.test      2466.00   2798.00   13.5%
test-suite...000/197.parser/197.parser.test      9.00      10.00     11.1%
test-suite.../Benchmarks/nbench/nbench.test      85.00     91.00     7.1%
test-suite...ce/Applications/siod/siod.test      68.00     72.00     5.9%
test-suite...ications/JM/lencod/lencod.test      786.00    824.00    4.8%
test-suite...6/464.h264ref/464.h264ref.test      765.00    798.00    4.3%
test-suite.../Benchmarks/Ptrdist/bc/bc.test      105.00    109.00    3.8%
test-suite...lications/obsequi/Obsequi.test      29.00     28.00     -3.4%
test-suite...3.xalancbmk/483.xalancbmk.test      1322.00   1367.00   3.4%
test-suite...chmarks/MallocBench/gs/gs.test      118.00    122.00    3.4%
test-suite...T2006/401.bzip2/401.bzip2.test      60.00     62.00     3.3%
test-suite...6/482.sphinx3/482.sphinx3.test      30.00     31.00     3.3%
test-suite...rks/tramp3d-v4/tramp3d-v4.test      862.00    887.00    2.9%
test-suite...telecomm-gsm/telecomm-gsm.test      78.00     80.00     2.6%
test-suite...ediabench/gsm/toast/toast.test      78.00     80.00     2.6%
test-suite.../Applications/SPASS/SPASS.test      163.00    167.00    2.5%
test-suite...lications/ClamAV/clamscan.test      240.00    245.00    2.1%
test-suite...006/453.povray/453.povray.test      1392.00   1419.00   1.9%
test-suite...000/255.vortex/255.vortex.test      211.00    215.00    1.9%
test-suite...:: External/Povray/povray.test      1295.00   1317.00   1.7%
test-suite...lications/sqlite3/sqlite3.test      175.00    177.00    1.1%
test-suite...T2000/256.bzip2/256.bzip2.test      99.00     100.00    1.0%
test-suite...0/253.perlbmk/253.perlbmk.test      629.00    635.00    1.0%
test-suite.../CINT2006/403.gcc/403.gcc.test      1183.00   1194.00   0.9%
test-suite.../CINT2000/176.gcc/176.gcc.test      647.00    653.00    0.9%
test-suite...ications/JM/ldecod/ldecod.test      512.00    516.00    0.8%
test-suite...0.perlbench/400.perlbench.test      1026.00   1034.00   0.8%
test-suite...-typeset/consumer-typeset.test      1876.00   1877.00   0.1%
Geomean difference                                                   7.3%
	2020-08-17 20:54:48 +01:00
Florian Hahn	df4756ec6c	[DSE,MemorySSA] Check for underlying objects first. isWriteAtEndOfFunction needs to check all memory uses of Def, which is much more expensive than getting the underlying objects in practice. Switch the call order, as recommended by the TODO, which was added as per an earlier review. This shaves off a bit of compile-time.	2020-08-17 18:52:18 +01:00
Florian Hahn	139810449b	[DSE,MemorySSA] Account for ScanLimit == 0 on entry. Currently the code does not account for the fact that getDomMemoryDef can be called with ScanLimit == 0, if we reached the limit while processing an earlier access. Also tighten the check a bit more and bump the scan limit now that it is handled properly. In some cases, this brings a 2x speedup in terms of compile-time.	2020-08-17 17:55:14 +01:00
Florian Hahn	3b0878a370	[DSE,MSSA] Fix crash when using tryToMergePartialOverlappingStores. We are re-using tryToMergePartialOverlappingStores, which requires Earlier to dominate Later. In the long run, tryToMergePartialOverlappingStores should be re-written using MemorySSA. Fixes PR46513.	2020-08-13 12:07:56 +01:00
Vitaly Buka	b0eb40ca39	[NFC] Remove unused GetUnderlyingObject parameter. Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621	2020-07-31 02:10:03 -07:00
Vitaly Buka	89051ebace	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Matt Arsenault	023883a834	IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref. When the byref attribute is added, there will need to be two similar functions: one for the existing cases, which have an associated value copy, and one for byref, which does not. Most, but not all, of the existing uses will use the existing version. The associated size function added by D82679 also needs to contextually differ, and will help eliminate a few places still relying on pointee element types.	2020-07-16 13:50:49 -04:00
John Brawn	20854d85e1	[DSE,MSSA] Recognise init_trampoline in getLocForWriteEx This fixes an instance where MemorySSA-using Dead Store Elimination is failing to do a transformation that the non-MemorySSA-using version does. Differential Revision: https://reviews.llvm.org/D83783	2020-07-15 12:18:58 +01:00
Florian Hahn	80970ac875	[DSE,MSSA] Eliminate stores by terminators (free,lifetime.end). This patch adds support for eliminating stores by free & lifetime.end calls. We can remove stores that are not read before calling a memory terminator and we can eliminate all stores after a memory terminator until we see a new lifetime.start. The second case seems to not really trigger much in practice though. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72410	2020-07-08 08:59:46 +01:00
Nicolai Hähnle	dfcc68c528	DomTree: Remove getRoots() accessor Summary: Avoid exposing details about how roots are stored. This enables subsequent type-erasure changes. v5: - cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d Reviewers: arsenm, RKSimon, mehdi_amini, courbet Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83085	2020-07-06 21:58:11 +02:00
Nuno Lopes	7f903873b8	DSE: fix builtin function recognition to take decl into account	2020-07-02 10:28:47 +01:00
Florian Hahn	4837daf883	[DSE,MSSA] Check if Def is removable only when we try to remove it. Non-removable MemoryDefs can still eliminate other defs. Update the isRemovable checks to apply only to candidates for removal.	2020-06-25 14:01:10 +01:00
Florian Hahn	4e62c6359c	[DSE] Eliminate stores at the end of the function. This patch add support for eliminating MemoryDefs that do not have any aliasing users, which indicates that there are no reads/writes to the memory location until the end of the function. To eliminate such defs, we have to ensure that the underlying object is not visible in the caller and does not escape via returning. We need a separate check for that, as InvisibleToCaller does not consider returns. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker, george.burgess.iv Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72631	2020-06-24 12:58:20 +01:00
Florian Hahn	ff4de8683a	[DSE,MSSA] Treat `store 0` after calloc as noop stores. This patch extends storeIsNoop to also detect stores of 0 to a calloced object. This basically ports the logic from legacy DSE to the MemorySSA backed version. It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3 LTO:
Same hash: 218 (filtered out)
Remaining: 19
Metric: dse.NumNoopStores
Program                                          base     patch2    diff
test-suite...CFP2000/177.mesa/177.mesa.test      1.00     15.00     1400.0%
test-suite...6/482.sphinx3/482.sphinx3.test      1.00     14.00     1300.0%
test-suite...lications/ClamAV/clamscan.test      2.00     28.00     1300.0%
test-suite...CFP2006/433.milc/433.milc.test      1.00     8.00      700.0%
test-suite...pplications/oggenc/oggenc.test      2.00     9.00      350.0%
test-suite.../CINT2000/176.gcc/176.gcc.test      6.00     6.00      0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test      NaN      137.00    nan%
test-suite...libquantum/462.libquantum.test      NaN      3.00      nan%
test-suite...6/464.h264ref/464.h264ref.test      NaN      7.00      nan%
test-suite...decode/alacconvert-decode.test      NaN      2.00      nan%
test-suite...encode/alacconvert-encode.test      NaN      2.00      nan%
test-suite...ications/JM/ldecod/ldecod.test      NaN      9.00      nan%
test-suite...ications/JM/lencod/lencod.test      NaN      39.00     nan%
test-suite.../Applications/lemon/lemon.test      NaN      2.00      nan%
test-suite...pplications/treecc/treecc.test      NaN      4.00      nan%
test-suite...hmarks/McCat/08-main/main.test      NaN      4.00      nan%
test-suite...nsumer-lame/consumer-lame.test      NaN      3.00      nan%
test-suite.../Prolangs-C/bison/mybison.test      NaN      1.00      nan%
test-suite...arks/mafft/pairlocalalign.test      NaN      30.00     nan%
Reviewers: efriedma, zoecarver, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D82204	2020-06-23 21:01:39 +01:00
Florian Hahn	a822ec75cc	[DSE,MSSA] Treat passed by value args as invisible to caller. This updates the MemorySSA backed implementation to treat arguments passed by value similar to allocas: they are assumed to be invisible in the caller. This is similar to how they are treated in legacy DSE. Reviewers: efriedma, asbirlea, george.burgess.iv Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82222	2020-06-23 08:58:51 +01:00
Florian Hahn	328c8642e2	[DSE,MSSA] Reorder DSE blocking checks. Currently we stop exploring candidates too early in some cases. In particular, we can continue checking the defining accesses of non-removable MemoryDefs and defs without analyzable write location (read clobbers are already ruled out using MemorySSA at this point).	2020-06-22 17:16:34 +01:00
Florian Hahn	0e19ff02d8	[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).	2020-06-22 10:58:53 +01:00
Florian Hahn	40569db7b3	[DSE,MSSA] Move reachability check to main loop. As we traverse the CFG backwards, we could end up reaching unreachable blocks. For unreachable blocks, we won't have computed post order numbers and because DomAccess is reachable, unreachable blocks cannot be on any path from it. This fixes a crash with unreachable blocks.	2020-06-21 16:38:10 +01:00
Florian Hahn	120c059292	[DSE,MSSA] Port partial store merging. Port partial constant store merging logic to MemorySSA backed DSE. The heavy lifting is done by the existing helper function. It is used in a context where we already ensured that the later instruction can eliminate the earlier one, if it is a complete overwrite.	2020-06-15 18:41:46 +01:00
Florian Hahn	71a91b9837	[DSE] Hoist partial store merging code into function (NFC). Hoist the general logic into a new function, because it can be re-used by the MemorySSA backed DSE as well.	2020-06-15 17:44:24 +01:00
Florian Hahn	8c61f13a0f	[DSE,MSSA] Delete instruction after printing it. Also enables a now-passing test case, that exposed a crash caused by the wrong order.	2020-06-15 16:01:36 +01:00
Florian Hahn	97e7147e34	[DSE,MSSA] Fix location order in isOverwrite call. isOverwrite expects the later location as first argument and the earlier location as second argument. The adjusted call is intended to check whether CC overwrites DefLoc.	2020-06-13 20:39:00 +01:00
Florian Hahn	67671024c8	[DSE,MSSA] Relax post-dom restriction for objs visible after return. This patch relaxes the post-dominance requirement for accesses to objects visible after the function returns. Instead of requiring the killing def to post-dominate the access to eliminate, the set of 'killing blocks' (= blocks that completely overwrite the original access) is collected. If all paths from the access to eliminate and an exit block go through a killing block, the access can be removed. To check this property, we first get the common post-dominator block for the killing blocks. If this block does not post-dominate the access block, there may be a path from DomAccess to an exit block not involving any killing block. Otherwise we have to check if there is a path from DomAccess to the common post-dominator that does not contain a killing block. If there is no such path, we can remove DomAccess. For this check, we start at the common post-dominator and then traverse the CFG backwards. Paths are terminated when we hit a killing block or a block that is not executed between DomAccess and a killing block according to the post-order numbering (if the post order number of a block is greater than the one of DomAccess, the block cannot be in a path starting at DomAccess).
This gives the following improvements on the total number of stores after DSE for MultiSource, SPEC2K, SPEC2006:
Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores
Program                                          base        new100      diff
test-suite...CFP2000/188.ammp/188.ammp.test      3624.00     3544.00     -2.2%
test-suite...ch/g721/g721encode/encode.test      128.00      126.00      -1.6%
test-suite.../Benchmarks/Olden/mst/mst.test      73.00       72.00       -1.4%
test-suite...CFP2006/433.milc/433.milc.test      3202.00     3163.00     -1.2%
test-suite...000/186.crafty/186.crafty.test      5062.00     5010.00     -1.0%
test-suite...-typeset/consumer-typeset.test      40460.00    40248.00    -0.5%
test-suite...Source/Benchmarks/sim/sim.test      642.00      639.00      -0.5%
test-suite...nchmarks/McCat/09-vor/vor.test      642.00      644.00      0.3%
test-suite...lications/sqlite3/sqlite3.test      35664.00    35563.00    -0.3%
test-suite...T2000/300.twolf/300.twolf.test      7202.00     7184.00     -0.2%
test-suite...lications/ClamAV/clamscan.test      19475.00    19444.00    -0.2%
test-suite...INT2000/164.gzip/164.gzip.test      2199.00     2196.00     -0.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test      2380.00     2378.00     -0.1%
test-suite.../Benchmarks/Bullet/bullet.test      39335.00    39309.00    -0.1%
test-suite...:: External/Povray/povray.test      36951.00    36927.00    -0.1%
test-suite...marks/7zip/7zip-benchmark.test      67396.00    67356.00    -0.1%
test-suite...6/464.h264ref/464.h264ref.test      31497.00    31481.00    -0.1%
test-suite...006/453.povray/453.povray.test      51441.00    51416.00    -0.0%
test-suite...T2006/401.bzip2/401.bzip2.test      4450.00     4448.00     -0.0%
test-suite...Applications/kimwitu++/kc.test      23481.00    23471.00    -0.0%
test-suite...chmarks/MallocBench/gs/gs.test      6286.00     6284.00     -0.0%
test-suite.../CINT2000/254.gap/254.gap.test      13719.00    13715.00    -0.0%
test-suite.../Applications/SPASS/SPASS.test      30345.00    30338.00    -0.0%
test-suite...006/450.soplex/450.soplex.test      15018.00    15016.00    -0.0%
test-suite...ications/JM/lencod/lencod.test      27780.00    27777.00    -0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test      105285.00   105276.00   -0.0%
There might be potential to pre-compute some of the information of which blocks are on the path to an exit for each block, but the overall benefit might be comparatively small. On the set of benchmarks, the CFG check is successful 15738 out of the 20322 times we reach it. The total number of iterations in the CFG check is 187810, so on average we need fewer than 10 steps in the check loop. Bumping the threshold in the loop from 50 to 150 gives a few small improvements, but I don't think they warrant such a big bump at the moment. This is all pending further tuning in the future. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: george.burgess.iv Differential Revision: https://reviews.llvm.org/D78932	2020-06-10 10:39:25 +01:00
Chris Jackson	c6c65164af	[DebugInfo] Reduce SalvageDebugInfo() functions - Now all SalvageDebugInfo() calls will mark undef if the salvage attempt fails. Reviewed by: vsk, Orlando Differential Revision: https://reviews.llvm.org/D78369	2020-06-08 19:28:18 +01:00
Benjamin Kramer	3badd17b69	SmallPtrSet::find -> SmallPtrSet::count The latter is more readable and more efficient. While there clean up some double lookups. NFCI.	2020-06-07 22:38:08 +02:00
serge-sans-paille	424510095d	Correctly report modified status for DSE Differential Revision: https://reviews.llvm.org/D81233	2020-06-05 15:59:42 +02:00
zoecarver	065bf124fd	[DSE] Remove noop stores in MSSA. Adds a simple fast-path check for the pattern: v = load ptr; store v to ptr. I took the tests from the bugzilla post, I can add more if needed (but I think these should be sufficient). Refs: https://bugs.llvm.org/show_bug.cgi?id=45795 Differential Revision: https://reviews.llvm.org/D79391	2020-05-30 09:57:30 -07:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Arthur Eubanks	a90948fd6e	[NFC] Rename ByValOrInalloca to PassPointeeByValue Summary: In preparation for preallocated. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79152	2020-04-30 09:42:13 -07:00
Florian Hahn	2f3e86b318	[DSE,MSSA] Continue checking more remaining candidates with dbgcnt. After changing the candidate iteration strategy, we should continue with the next candidate, rather than breaking out of the loop.	2020-04-26 16:59:32 +01:00
Florian Hahn	46a04940e8	[DSE] Add stat for remaining stores after DSE. Using the existing NumFastStores statistic can be misleading when comparing the impact of DSE patches. For example, consider the case where a store gets removed from a function before it is inlined into another function. A less powerful DSE might only remove the store from functions it has been inlined into, which will result in more stores being removed, but no difference in the actual number of stores after DSE. The new stat provides the absolute number of stores surviving after DSE. Reviewers: dmgreen, bryant, asbirlea, jfb Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D78830	2020-04-25 16:12:55 +01:00
Florian Hahn	e1235831c4	[DSE,MSSA] Improve debug output (NFC). This patch slightly improves the formatting of the debug output, adds a few missing outputs and makes some existing outputs more consistent with the rest.	2020-04-24 17:50:08 +01:00
Florian Hahn	44ce588670	[DSE,MSSA] Skip checking write clobber for DomAccess (NFC). There is no need to check if the starting access is a write clobber and all of its uses have already been checked.	2020-04-24 17:16:22 +01:00
Mircea Trofin	ceb7f308b8	[llvm][NFC][CallSite] Removed CallSite from few implementation details Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78724	2020-04-23 10:36:36 -07:00
Florian Hahn	cf9ee49b4d	[DSE] Lift post-dominance for objs not accessible in caller. We can eliminate MemoryDefs of objects not accessible after the function returns (e.g. alloca), if there are no reads between the MemoryDef and any function exits. We can stop traversing paths that completely overwrite the memory location of the MemoryDef. This patch was split off D73763. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: asbirlea, george.burgess.iv Differential Revision: https://reviews.llvm.org/D77736	2020-04-15 11:37:14 +01:00
Tyker	086de7673e	[AssumeBundles] preserve knowledge in DSE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77404	2020-04-14 12:48:15 +02:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Chris Jackson	135709aa90	[DebugInfo] Ensure dead store elimination can mark an operand value as undefined - Correct a debug info salvage and add a test Reviewers: aprantl, vsk Differential Revision: https://reviews.llvm.org/D76930 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45080	2020-03-30 14:58:14 +01:00
Florian Hahn	3a8372ed02	[DSE] Support traversing MemoryPhis. For MemoryPhis, we have to avoid that the MemoryPhi may be executed before the access we are currently looking at. To do this we do a post-order numbering of the basic blocks in the function and bail out once we reach a MemoryPhi with a larger (or equal) post-order block number than the current MemoryAccess. This changes the order in which we visit stores for elimination. This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72148	2020-03-20 07:51:42 +00:00
Artur Pilipenko	02e3d5c3a2	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2):
  a = calloc(n+1)
  for (int i = 0; i < n; i++) {
    store 1, a[i+1] // (1)
    store 0, a[i]   // (2)
  }
The fix is to do PHI translation while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
Florian Hahn	b8d638d337	[DSE,MSSA] Do not attempt to remove un-removable memdefs. We have to skip MemoryDefs that cannot be removed. This fixes a crash in the newly added test case and fixes a wrong case in memset-and-memcpy.ll.	2020-02-25 13:31:46 +00:00
Florian Hahn	af69d5e10e	[DSE] Track overlapping stores. Add a map from BasicBlocks to overlap intervals. For partial writes, we can keep track of those in IOLs. We only add candidates that are valid for eliminations. Reviewers: dmgreen, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D73757	2020-02-23 15:44:40 +00:00
Florian Hahn	134bab7cd5	[DSE,MSSA] Add debug counter. Can be used like -debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20 Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72147	2020-02-21 17:04:37 +00:00
Reid Kleckner	0c2b09a9b6	[IR] Lazily number instructions for local dominance queries Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly deletes dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664	2020-02-18 14:44:24 -08:00
Florian Hahn	81dbb6aec6	Recommit "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This includes a fix for the sanitizer failures. This reverts the revert commit `42f8b915eb`.	2020-02-12 14:17:50 +00:00
Kadir Cetinkaya	42f8b915eb	Revert "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This reverts commit `d0c4d4fe09`. Revert "[DSE,MSSA] Move more passing test cases from todo to simple.ll." This reverts commit `02266e64bb`. Revert "[DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE." This reverts commit `74f03e4ff0`.	2020-02-11 15:34:48 +01:00
Florian Hahn	d0c4d4fe09	[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk). This patch adds a first version of a MemorySSA based DSE. It is missing a lot of features, which will get added as follow-ups, to help to keep the review manageable. The patch uses the following general approach: given a MemoryDef, walk upwards to find clobbering MemoryDefs that may be killed by the starting def. Then check that there are no uses that may read the location of the original MemoryDef in between both MemoryDefs. A bit more concretely: For all MemoryDefs StartDef:
1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards.
2. Check that there are no reads between DomAccess and the StartDef by checking all uses starting at DomAccess and walking until we see StartDef.
3. For each found DomDef, check that:
   1. There are no barrier instructions between DomDef and StartDef (like throws or stores with ordering constraints).
   2. StartDef is executed whenever DomDef is executed.
   3. StartDef completely overwrites DomDef.
4. Erase DomDef from the function and MemorySSA.
The patch uses a very simple approach to guarantee that no throwing instructions are between 2 stores: We only allow accesses to stack objects, accesses that are in the same basic block if the block does not contain any throwing instructions, or accesses in functions that do not contain any throwing instructions. This will get lifted later. Besides adding support for the missing cases, there is plenty of additional potential for improvements as follow-up work, e.g. the way we visit stores (could be just a traversal of the MemorySSA, rather than collecting them up-front), using the alias information discovered during walking to optimize the MemorySSA. This is loosely based on D40480 by Dave Green. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72700	2020-02-10 11:52:11 +00:00
Florian Hahn	da52b9c118	[DSE] Add tests for MemorySSA based DSE. This copies the DSE tests into a MSSA subdirectory to test the MemorySSA backed DSE implementation, without disturbing the original tests. Differential Revision: https://reviews.llvm.org/D72145	2020-02-10 10:28:43 +00:00
Artur Pilipenko	34547ac959	NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween Separated from https://reviews.llvm.org/D68006 review.	2020-01-31 15:22:33 -08:00
Nuno Lopes	87407fc03c	DSE: fix bug where we would only check libcalls for name rather than whole decl	2020-01-11 11:57:29 +00:00
Ankit	369a919514	Fix for a dangling point bug in DeadStoreElimination pass The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction. While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed to by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration. In the patch, we maintain a list of throwing instructions encountered previously and use the last non-deleted throwing instruction from the container. Reviewers: fhahn, bcahoon, efriedma Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65326	2020-01-03 14:28:44 +00:00
Florian Hahn	19071173fc	Revert "[DSE] Fix for a dangling point bug in DeadStoreElimination." The commit causes a failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20911 This reverts commit `1847fd9d85`.	2019-12-05 19:29:21 +00:00
Ankit	1847fd9d85	[DSE] Fix for a dangling point bug in DeadStoreElimination. The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction. While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed to by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration. In the patch, we maintain a list of throwing instructions encountered previously and use the last non-deleted throwing instruction from the container. Patch by Ankit <quic_aankit@quicinc.com> Reviewers: fhahn, bcahoon, efriedma Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65326	2019-12-05 17:53:58 +00:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
recompiles  touches  affected_files  header
342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
314730      234      1345            llvm/include/llvm/InitializePasses.h
307036      118      2602            llvm/include/llvm/ADT/APInt.h
213049      59       3611            llvm/include/llvm/Support/MathExtras.h
170422      47       3626            llvm/include/llvm/Support/Compiler.h
162225      45       3605            llvm/include/llvm/ADT/Optional.h
158319      63       2513            llvm/include/llvm/ADT/Triple.h
140322      39       3598            llvm/include/llvm/ADT/StringRef.h
137647      59       2333            llvm/include/llvm/Support/Error.h
131619      73       1803            llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Guillaume Chatelet	5b99c189b3	[Alignment][NFC] Convert StoreInst to MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69303 llvm-svn: 375499	2019-10-22 12:55:32 +00:00
Teresa Johnson	9c27b59cec	Change TargetLibraryInfo analysis passes to always require Function Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284	2019-09-07 03:09:36 +00:00
Bjorn Pettersson	d63a2bb35f	[DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores Summary: The DeadStoreElimination pass now skips doing PartialStoreMerging when stores overlap according to OW_PartialEarlierWithFullLater and at least one of the stores has a store size that is different from the size of the type being stored. This solves problems seen in https://bugs.llvm.org/show_bug.cgi?id=41949 for which we in the past could end up with mis-compiles or assertions. The content and location of the padding bits is not formally described (or undefined) in the LangRef at the moment. So the solution is chosen on the basis that we cannot assume anything about the padding bits when having a store that clobbers more memory than indicated by the type of the value that is stored (such as storing an i6 using an 8-bit store instruction). Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949 Reviewers: spatel, efriedma, fhahn Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62250 llvm-svn: 361605	2019-05-24 08:32:02 +00:00
Bjorn Pettersson	71e8c6f20f	Add "const" in GetUnderlyingObjects. NFC Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with having those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072	2019-04-24 06:55:50 +00:00
Jeremy Morse	32afe6a1f8	[DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc Bug: https://bugs.llvm.org/show_bug.cgi?id=41175 In the bug test case the DSE pass is shortening the range of memory that a memset is working on. A getelementptr is generated so that the new starting address can be passed to memset. This instruction was not given a DebugLoc. To fix the bug, copy the DebugLoc from the memset instruction. Patch by Orlando Cazalet-Hyams! Differential Revision: https://reviews.llvm.org/D60556 llvm-svn: 358270	2019-04-12 09:47:35 +00:00
Florian Hahn	9b41a7320d	Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Updated to use DenseMap::insert instead of [] operator for insertion, to avoid a crash caused by epoch checks. This reverts commit `2b85de4383`. llvm-svn: 357257	2019-03-29 14:10:24 +00:00
Florian Hahn	2b85de4383	Revert Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Another buildbot failure http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20402 clang-9: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/include/llvm/ADT/DenseMap.h:1228: llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type* llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::operator->() const [with KeyT = const llvm::Instruction; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<const llvm::Instruction>; Bucket = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; bool IsConst = false; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::pointer = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>]: Assertion `isHandleInSync() && "invalid iterator access!"' failed. 0. 
Program arguments: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name ArchiveCommandLine.cpp -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu skylake-avx512 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -coverage-notes-file /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip/Output/ArchiveCommandLine.llvm.gcno -resource-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0 -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/include -I ../../../include -D _GNU_SOURCE -D __STDC_LIMIT_MACROS -D NDEBUG -D BREAK_HANDLER -D UNICODE -D _UNICODE -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/C -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/myWindows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/include_windows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP -I . 
-D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D NDEBUG -D _REENTRANT -D ENV_UNIX -D _7ZIP_LARGE_PAGES -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward -internal-isystem /usr/local/include -internal-isystem /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=gnu++98 -fdeprecated-macro -fdebug-compilation-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -ferror-limit 19 -fmessage-length 0 -pthread -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o Output/ArchiveCommandLine.llvm.o -x c++ /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/7zip/UI/Common/ArchiveCommandLine.cpp -faddrsig This reverts r357222 (git commit `64cccfcc72`) llvm-svn: 357227	2019-03-29 00:22:26 +00:00
Florian Hahn	64cccfcc72	Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Recommitting after addressing a buildbot failure. This reverts commit `c87869ebea`. llvm-svn: 357222	2019-03-28 23:11:00 +00:00
Florian Hahn	c87869ebea	Revert [DSE] Preserve basic block ordering using OrderedBasicBlock. This reverts r357208 (git commit `c0bfd37d38`) This causes a buildbot failure: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/16124 FAILED: lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/install/stage2/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/IR -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR -Iinclude -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3 -UNDEBUG -fno-exceptions -fno-rtti -MD -MT lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -MF lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o.d -o lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -c /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR/IRBuilder.cpp clang-9: /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/Analysis/OrderedBasicBlock.cpp:38: bool llvm::OrderedBasicBlock::comesBefore(const llvm::Instruction *, const llvm::Instruction *): Assertion `!(LastInstFound == BB->end() && NextInstPos != 0) && "Instruction supposed to be in NumberedInsts"' failed. llvm-svn: 357211	2019-03-28 20:36:24 +00:00
Florian Hahn	c0bfd37d38	[DSE] Preserve basic block ordering using OrderedBasicBlock. By extending OrderedBB to allow removing and replacing cached instructions, we can preserve OrderedBBs in DSE easily. This eliminates one source of quadratic compile time in DSE. Fixes PR38829. Reviewers: rnk, efriedma, hfinkel Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59789 llvm-svn: 357208	2019-03-28 20:02:33 +00:00
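The bottom-up walk described in commit `d0c4d4fe09` can be illustrated with a deliberately small model. The sketch below is not LLVM code and does not use MemorySSA: it assumes a single straight-line block, treats locations as symbolic names that never alias or partially overlap, and ignores throwing instructions and ordering constraints. The names `Op` and `find_dead_stores` are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One memory operation in a straight-line block (toy model, not LLVM IR)."""
    kind: str  # "store" or "load"
    loc: str   # symbolic location; distinct names are assumed never to alias


def find_dead_stores(ops):
    """Return the sorted indices of stores that are overwritten before being read.

    For every store (the "starting def"), walk upwards through the block:
      * a load of the same location means the earlier value is still needed,
        so the walk stops;
      * an earlier store to the same location with no read in between is
        completely overwritten by the starting def, so it is dead.
    """
    dead = set()
    for i, start in enumerate(ops):
        if start.kind != "store":
            continue
        for j in range(i - 1, -1, -1):
            earlier = ops[j]
            if earlier.loc != start.loc:
                continue  # unrelated location in this no-alias model
            if earlier.kind == "load":
                break     # a read in between keeps earlier stores alive
            dead.add(j)   # earlier store is killed by the starting def
            break         # stores above j are handled when j is the start
    return sorted(dead)


if __name__ == "__main__":
    block = [
        Op("store", "a"),  # index 0: overwritten by index 3 without a read
        Op("store", "b"),  # index 1: read at index 2, so it stays
        Op("load", "b"),
        Op("store", "a"),
        Op("store", "b"),
    ]
    print(find_dead_stores(block))
```

In this model only the first store to `a` is reported. The real pass additionally has to prove that the killing store executes whenever the earlier one does and that no barrier sits between them, which is what steps 3.1 to 3.3 of the commit message cover.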

437 Commits