Commit Graph

381 Commits

Author SHA1 Message Date
Florian Hahn 9d172c8e9c Recommit "[DSE] Switch to MemorySSA-backed DSE by default."
This switches to using DSE + MemorySSA by default again, after
fixing the issues reported after the first commit.

Notable fixes fc82006331, a0017c2bc2.

This reverts commit 3a59628f3c.
2020-09-18 11:05:00 +01:00
Florian Hahn 3a59628f3c Revert "[DSE] Switch to MemorySSA-backed DSE by default."
This reverts commit fb109c42d9.

Temporarily revert due to a mis-compile pointed out at D87163.
2020-09-15 18:07:56 +01:00
Florian Hahn f715d81c9d [DSE] Only eliminate candidates that always store the same loc.
AliasAnalysis/MemoryLocation does not account for loops. Two
MemoryLocation can be must-overwrite, even if the first one writes
multiple locations in a loop.

This patch prevents removing such stores, by only considering candidates
that are known to be loop invariant, or executed in the same BB.

Currently the invariant check is quite conservative and only considers
Alloca and Alloca-like instructions and arguments as invariant base pointers.
It also considers GEPs with all constant indices and invariant bases as
invariant.

This can be improved in the future, but the current implementation has
only minor impact on the total number of stores eliminated (25903 vs
26047 for the baseline). There are some 2-10% swings for some individual
benchmarks. In roughly half of the cases, the number of stores removed
increases actually, because we skip candidates that are unlikely to be
valid candidates early.
2020-09-14 12:06:58 +01:00
Florian Hahn e082dee2b5 [DSE] Bail out on MemoryPhis when deleting stores at end of function.
When deleting stores at the end of a function, we have to do PHI
translation, otherwise we might miss reads in different iterations of a
loop. See multiblock-loop-carried-dependence.ll for details.

This fixes a mis-compile and surprisingly also increases the number of
eliminated stores from 26047 to 26572 for MultiSource/SPEC2000/SPEC2006
on X86 with -O3 -flto. This is most likely because we save budget by not
exploring through MemoryPhis, which are less likely to result in valid
candidates for elimination.

The issue was reported post-commit for fb109c42d9.
2020-09-12 19:05:59 +01:00
Krzysztof Parzyszek f92908cc74 [DSE] Make sure that DSE+MSSA can handle masked stores
Differential Revision: https://reviews.llvm.org/D87414
2020-09-11 10:00:21 -05:00
Florian Hahn fb109c42d9 [DSE] Switch to MemorySSA-backed DSE by default.
The tests have been updated and I plan to move them from the MSSA
directory up.

Some end-to-end tests needed small adjustments. One difference to the
legacy DSE is that legacy DSE also deletes trivially dead instructions
that are unrelated to memory operations. Because MemorySSA-backed DSE
just walks the MemorySSA, we only visit/check memory instructions. But
removing unrelated dead instructions is not really DSE's job and other
passes will clean up.

One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll,
but I think this comes down to legacy DSE not handling instructions that
may throw correctly in that case. To cover this with MemorySSA-backed
DSE, we need an update to llvm.coro.begin to treat it's return value to
belong to the same underlying object as the passed pointer.

There are some minor cases MemorySSA-backed DSE currently misses, e.g. related
to atomic operations, but I think those can be implemented after the switch.

This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html

For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores
goes from ~17500 (legayc DSE) to ~26300 (MemorySSA-backed). More numbers
and details in the thread on llvm-dev.

Impact on CTMark:
```
                                     Legacy Pass Manager
                        exec instrs    size-text
O3                       + 0.60%        - 0.27%
ReleaseThinLTO           + 1.00%        - 0.42%
ReleaseLTO-g.            + 0.77%        - 0.33%
RelThinLTO (link only)   + 0.87%        - 0.42%
RelLO-g (link only)      + 0.78%        - 0.33%
```
http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
```
                                     New Pass Manager
                       exec instrs.   size-text
O3                       + 0.95%       - 0.25%
ReleaseThinLTO           + 1.34%       - 0.41%
ReleaseLTO-g.            + 1.71%       - 0.35%
RelThinLTO (link only)   + 0.96%       - 0.41%
RelLO-g (link only)      + 2.21%       - 0.35%
```
http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions

Reviewed By: asbirlea, xbolva00, nikic

Differential Revision: https://reviews.llvm.org/D87163
2020-09-10 22:24:32 +01:00
Florian Hahn a5ec99da6e [DSE] Support eliminating memcpy.inline.
MemoryLocation has been taught about memcpy.inline, which means we can
get the memory locations read and written by it. This means DSE can
handle memcpy.inline
2020-09-10 13:19:25 +01:00
Florian Hahn 9969c317ff [DSE,MemorySSA] Handle atomic stores explicitly in isReadClobber.
Atomic stores are modeled as MemoryDef to model the fact that they may
not be reordered, depending on the ordering constraints.

Atomic stores that are monotonic or weaker do not limit re-ordering, so
we do not have to treat them as potential read clobbers.

Note that llvm/test/Transforms/DeadStoreElimination/MSSA/atomic.ll
already contains a set of negative test cases.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87386
2020-09-09 23:01:58 +01:00
Krzysztof Parzyszek 81ff2d30a9 [DSE] Handle masked stores 2020-09-09 13:31:31 -05:00
Florian Hahn c7b7c32f4a [DSE,MemorySSA] Increase walker limit a bit.
This slightly bumps the walker limit so that it covers more cases while
not increasing compile-time too much:
http://llvm-compile-time-tracker.com/compare.php?from=0fc1c2b51ba0cfb9145139af35be638333865251&to=91144a50ea4fa82c0c877e77784f60371640b263&stat=instructions
2020-09-08 14:55:46 +01:00
Florian Hahn efb8e156da [DSE,MemorySSA] Add an early check for read clobbers to traversal.
Depending on the benchmark, this early exit can save a substantial
amount of compile-time:

http://llvm-compile-time-tracker.com/compare.php?from=505f2d817aa8e07ba98e5fd4a8f6ff0666f89df1&to=eb4e441147f9b4b7a5fcbbc57428cadbe9e01f10&stat=instructions
2020-09-07 23:22:10 +01:00
Florian Hahn 16bb71fd4f [DSE,MemorySSA] Add a few additional debug messages. 2020-09-06 20:31:00 +01:00
Florian Hahn 00eb6fef08 [DSE,MemorySSA] Check for throwing instrs between killing/killed def.
We also have to check all uses between the killing & killed def and
check if any of them is throwing.
2020-09-04 18:54:59 +01:00
Florian Hahn 86d817d7cf [DSE,MemorySSA] Skip defs without analyzable write locations.
Similar to other checks above, if there is no write location for a def,
it cannot be considered for elimination and can be skipped.
2020-08-30 21:56:25 +01:00
Florian Hahn 42c57c294d [DSE,MemorySSA] Simplify code, EarlierAccess is be a MemoryDef (NFC).
After recent changes, we return early if Current is a MemoryPhi, so
EarlierAccess can only be a MemoryDef.
2020-08-30 21:31:57 +01:00
Florian Hahn 31cdb29de4 [DSE,MemorySSA] Return early when hitting a MemoryPhi.
A MemoryPhi can never be eliminated. If we hit one, return the Phi, so
the caller can continue traversing the incoming accesses.

This saves some unnecessary read clobber checks and improves
compile-time
http://llvm-compile-time-tracker.com/compare.php?from=1ffc58b6d098ce8fa71f3a80fe75b990f633f921&to=d0fa8d1982380b57d7b6067528104bc373dbe07a&stat=instructions
2020-08-29 18:28:26 +01:00
Florian Hahn 43aa7227df [DSE,MemorySSA] Check if Current is valid for elimination first.
This changes getDomMemoryDef to check if a Current is a valid
candidate for elimination before checking for reads. Before the change,
we were spending a lot of compile-time in checking for read accesses for
Current that might not even be removable.

This patch flips the logic, so we skip Current if they cannot be
removed before checking all their uses. This is much more efficient in
practice.

It also adds a more aggressive limit for checking partially overlapping
stores. The main problem with overlapping stores is that we do not know
if they will lead to elimination until seeing all of them. This patch
limits adds a new limit for overlapping store candidates, which keeps
the number of modified overlapping stores roughly the same.

This is another substantial compile-time improvement (while also
increasing the number of stores eliminated). Geomean -O3 -0.67%,
ReleaseThinLTO -0.97%.

http://llvm-compile-time-tracker.com/compare.php?from=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&to=2e630629b43f64b60b282e90f0d96082fde2dacc&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86487
2020-08-28 11:19:04 +01:00
Florian Hahn bb024c3c4e [DSE,MemorySSA] Remove short-cut to check if all paths are covered.
The post-order number early continue does not work in some cases, e.g.
if a path from EarlierAccess to an exit includes a node that dominates
EarlierAccess in a cycle.

The short-cut only has very minor impact on compile-time, so it seems
straight-forward to remove it for now:

http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions

Fixes PR47285.
2020-08-27 12:42:40 +01:00
Florian Hahn e717fdb0f1 [DSE,MemorySSA] Traverse use-def chain without MemSSA Walker.
For DSE with MemorySSA it is beneficial to manually traverse the
defining access, instead of using a MemorySSA walker, so we can
better control the number of steps together with other limits and
also weed out invalid/unprofitable paths early on.

This patch requires a follow-up patch to be most effective, which I will
share soon after putting this patch up.

This temporarily XFAIL's the limit tests, because we now explore more
MemoryDefs that may not alias/clobber the killing def. This will be
improved/fixed by the follow-up patch.

This patch also renames some `Dom*` variables to `Earlier*`, because the
dominance relation is not really used/important here and potentially
confusing.

This patch allows us to aggressively cut down compile time, geomean
-O3 -0.64%, ReleaseThinLTO -1.65%, at the expense of fewer stores
removed. Subsequent patches will increase the number of removed stores
again, while keeping compile-time in check.

http://llvm-compile-time-tracker.com/compare.php?from=d8e3294118a8c5f3f97688a704d5a05b67646012&to=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86486
2020-08-27 10:02:02 +01:00
Florian Hahn e19ef1aab5 [DSE,MemorySSA] Cache accesses with/without reachable read-clobbers.
Currently we repeatedly check the same uses for read clobbers in some
cases. We can avoid unnecessary checks by keeping track of the memory
accesses we already found read clobbers for. To do so, we just add
memory access causing read-clobbers to a set. Note that marking all
visited accesses as read-clobbers would be to pessimistic, as that might
include accesses not on any path to  the actual read clobber.

If we do not find any read-clobbers, we can add all visited instructions
to another set and use that to skip the same accesses in the next call.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D75025
2020-08-25 08:48:46 +01:00
Florian Hahn d1a1cce5b1 [DSE,MemorySSA] Do not use callCapturesBefore in isReadClobber.
Using callCapturesBefore potentially improves the precision and the
number of stores we can remove. But in practice, it seems to have very
little impact in terms of stores removed. For example, for
SPEC2000/SPEC2006/MultiSource with -O3 -flto, ~50 more stores are
removed (out of ~26900 stores removed). But in terms of compile-time, it
is very expensive and the patch gives substantial compile-time
improvements: Geomean O3 -0.24%, ReleaseThinLTO -0.47%, ReleaseLTO-g
-0.39%.

http://llvm-compile-time-tracker.com/compare.php?from=612a0bff88ed906c83b82f079d4c49e5fecfb9d0&to=e6c86b96d20d97dd88e903a409bd8d39b6114312&stat=instructions
2020-08-24 16:19:42 +01:00
Florian Hahn b99a5eb659 [DSE,MemorySSA] Delay PointerMayBeCaptured calls until actually needed.
Avoid computing InvisibleToCallerBefore/AfterRet up front. In most
cases, this information is not really needed. Instead, introduce helper
functions to compute and cache the result on demand.

Notably, this also does not use PointerMayBeCapturedBefore for
isInvisibleToCallerBeforeRet, as it requires the killing MemoryDef as
starting instruction, making the caching ineffective. But it appears the
use of PointerMayBeCapturedBefore has very limited benefits in practice
(e.g. on SPEC2000/SPEC2006/MultiSource there are no binary changes with
-O3 -flto). Refrain from using it for now, to limit-compile-time.

This gives some nice compile-time improvements:
http://llvm-compile-time-tracker.com/compare.php?from=db9345f6810f379a36752dc52caf5230585d0ebd&to=b4d091047e1b8a3d377d200137b79d03aca65663&stat=instructions
2020-08-24 14:05:44 +01:00
Florian Hahn 2431b143ae [DSE,MemorySSA] Limit elimination at end of function to single UO.
Limit elimination of stores at the end of a function to MemoryDefs with
a single underlying object, to save compile time.

In practice, the case with multiple underlying objects seems not very
important in practice. For -O3 -flto on MultiSource/SPEC2000/SPEC2006
this results in a total of 2 more stores being eliminated.

We can always re-visit that in the future.
2020-08-24 13:00:17 +01:00
Florian Hahn 2843c9fe0a [DSE,MemorySSA] Keep single DL instance in DSEState (NFC).
Small cleanup, also removes one instance of getting DataLayout without
using it later.
2020-08-23 15:56:38 +01:00
Florian Hahn 5e7e2162d4 [DSE,MemorySSA] Use BatchAA for AA queries.
We can use BatchAA to avoid some repeated AA queries. We only remove
stores, so I think we will get away with using a single BatchAA instance
for the complete run.

The changes in AliasAnalysis.h mirror the changes in D85583.

The change improves compile-time by roughly 1%.
http://llvm-compile-time-tracker.com/compare.php?from=67ad786353dfcc7633c65de11601d7823746378e&to=10529e5b43809808e8c198f88fffd8f756554e45&stat=instructions

This is part of the patches to bring down compile-time to the level
referenced in
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86275
2020-08-22 08:36:35 +01:00
Florian Hahn 9f7350672e [DSE,MemorySSA] Handle atomicrmw/cmpxchg conservatively.
This adds conservative handling of AtomicRMW/AtomicCmpXChg to
isDSEBarrier, similar to atomic loads and stores.
2020-08-21 10:42:42 +01:00
Florian Hahn a0e92ffd0d [DSE,MemorySSA] Split off partial tracking from isOverwite.
When traversing memory uses to look for aliasing reads/writes, we only
care about complete overwrites. This patch splits off the partial
overwrite tracking from isOverwrite This avoids some unnecessary work
when checking for read/write clobbers with MemorySSA-DSE.
isOverwrite, which skips the partial overwrite tracking.

This gives a relatively small improvement
http://llvm-compile-time-tracker.com/compare.php?from=ef2a2f77f87553a0a4a39f518eb9ac86b756bda6&to=658f3905dd96d3415f3782adc712c79fa59a4665&stat=instructions

This is part of the patches to bring down compile-time to the level
referenced in
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86280
2020-08-21 09:13:59 +01:00
Florian Hahn c0cbe6453a [DSE] Remove dead argument from removePartiallyOverlappedStores (NFC).
The argument is unused and can be removed.
2020-08-19 19:33:52 +01:00
Florian Hahn 1a55fbceaa [DSE,MemorySSA] Use NumRedundantStores instead of NumNoopStores.
Legacy DSE uses NumRedundantStores, while MemorySSA DSE uses
NumNoopStores. We should just use the same counter.
2020-08-19 08:50:33 +01:00
Florian Hahn 4cc20aa743 [DSE,MemorySSA] Skip access already dominated by a killing def.
If we already found a killing def (= a def that completely overwrites
the location) that dominates an access, we can skip processing it
further.

This does not help with compile-time, but increases the number of memory
accesses we can process with the same scan budget, leading to more
stores being eliminated.

Improvements with this change

Same hash: 203 (filtered out)
Remaining: 34
Metric: dse.NumFastStores

Program                                        base    dom     diff
 test-suite...rolangs-C++/family/family.test     2.00    4.00  100.0%
 test-suite...ProxyApps-C++/CLAMR/CLAMR.test   172.00  229.00  33.1%
 test-suite...ks/Prolangs-C/agrep/agrep.test    10.00   12.00  20.0%
 test-suite...oxyApps-C++/miniFE/miniFE.test    44.00   51.00  15.9%
 test-suite...marks/7zip/7zip-benchmark.test   1285.00 1474.00 14.7%
 test-suite...006/450.soplex/450.soplex.test   254.00  289.00  13.8%
 test-suite...006/447.dealII/447.dealII.test   2466.00 2798.00 13.5%
 test-suite...000/197.parser/197.parser.test     9.00   10.00  11.1%
 test-suite.../Benchmarks/nbench/nbench.test    85.00   91.00   7.1%
 test-suite...ce/Applications/siod/siod.test    68.00   72.00   5.9%
 test-suite...ications/JM/lencod/lencod.test   786.00  824.00   4.8%
 test-suite...6/464.h264ref/464.h264ref.test   765.00  798.00   4.3%
 test-suite.../Benchmarks/Ptrdist/bc/bc.test   105.00  109.00   3.8%
 test-suite...lications/obsequi/Obsequi.test    29.00   28.00  -3.4%
 test-suite...3.xalancbmk/483.xalancbmk.test   1322.00 1367.00  3.4%
 test-suite...chmarks/MallocBench/gs/gs.test   118.00  122.00   3.4%
 test-suite...T2006/401.bzip2/401.bzip2.test    60.00   62.00   3.3%
 test-suite...6/482.sphinx3/482.sphinx3.test    30.00   31.00   3.3%
 test-suite...rks/tramp3d-v4/tramp3d-v4.test   862.00  887.00   2.9%
 test-suite...telecomm-gsm/telecomm-gsm.test    78.00   80.00   2.6%
 test-suite...ediabench/gsm/toast/toast.test    78.00   80.00   2.6%
 test-suite.../Applications/SPASS/SPASS.test   163.00  167.00   2.5%
 test-suite...lications/ClamAV/clamscan.test   240.00  245.00   2.1%
 test-suite...006/453.povray/453.povray.test   1392.00 1419.00  1.9%
 test-suite...000/255.vortex/255.vortex.test   211.00  215.00   1.9%
 test-suite...:: External/Povray/povray.test   1295.00 1317.00  1.7%
 test-suite...lications/sqlite3/sqlite3.test   175.00  177.00   1.1%
 test-suite...T2000/256.bzip2/256.bzip2.test    99.00  100.00   1.0%
 test-suite...0/253.perlbmk/253.perlbmk.test   629.00  635.00   1.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test   1183.00 1194.00  0.9%
 test-suite.../CINT2000/176.gcc/176.gcc.test   647.00  653.00   0.9%
 test-suite...ications/JM/ldecod/ldecod.test   512.00  516.00   0.8%
 test-suite...0.perlbench/400.perlbench.test   1026.00 1034.00  0.8%
 test-suite...-typeset/consumer-typeset.test   1876.00 1877.00  0.1%
 Geomean difference                                             7.3%
2020-08-17 20:54:48 +01:00
Florian Hahn df4756ec6c [DSE,MemorySSA] Check for underlying objects first.
isWriteAtEndOfFunction needs to check all memory uses of Def, which is
much more expensive than getting the underlying objects in practice.
Switch the call order, as recommended by the TODO, which was added as
per an earlier review.

This shaves off a bit of compile-time.
2020-08-17 18:52:18 +01:00
Florian Hahn 139810449b [DSE,MemorySSA] Account for ScanLimit == 0 on entry.
Currently the code does not account for the fact that getDomMemoryDef
can be called with ScanLimit == 0, if we reached the limit while
processing an earlier access. Also tighten the check a bit more and bump
the scan limit now that it is handled properly.

In some cases, this brings a 2x speedup in terms of compile-time.
2020-08-17 17:55:14 +01:00
Florian Hahn 3b0878a370 [DSE,MSSA] Fix crash when using tryToMergePartialOverlappingStores.
We are re-using tryToMergePartialOverlappingStores, which requires
earlier to domiante Later. In the long run,
tryToMergeParialOverlappingStores should be re-written using MemorySSA.

Fixes PR46513.
2020-08-13 12:07:56 +01:00
Vitaly Buka b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka 89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Matt Arsenault 023883a834 IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref
When the byref attribute is added, there will need to be two similar
functions for the existing cases which have an associate value copy,
and byref which does not. Most, but not all of the existing uses will
use the existing version.

The associated size function added by D82679 also needs to
contextually differ, and will help eliminate a few places still
relying on pointee element types.
2020-07-16 13:50:49 -04:00
John Brawn 20854d85e1 [DSE,MSSA] Recognise init_trampoline in getLocForWriteEx
This fixes an instance where MemorySSA-using Dead Store Elimination is failing
to do a transformation that the non-MemorySSA-using version does.

Differential Revision: https://reviews.llvm.org/D83783
2020-07-15 12:18:58 +01:00
Florian Hahn 80970ac875 [DSE,MSSA] Eliminate stores by terminators (free,lifetime.end).
This patch adds support for eliminating stores by free & lifetime.end
calls. We can remove stores that are not read before calling a memory
terminator and we can eliminate all stores after a memory terminator
until we see a new lifetime.start. The second case seems to not really
trigger much in practice though.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72410
2020-07-08 08:59:46 +01:00
Nicolai Hähnle dfcc68c528 DomTree: Remove getRoots() accessor
Summary:
Avoid exposing details about how roots are stored. This enables subsequent
type-erasure changes.

v5:
- cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE

Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83085
2020-07-06 21:58:11 +02:00
Nuno Lopes 7f903873b8 DSE: fix builtin function recognition to take decl into account 2020-07-02 10:28:47 +01:00
Florian Hahn 4837daf883 [DSE,MSSA] Check if Def is removable only wen we try to remove it.
Non-removable MemoryDefs can still eliminate other defs. Update the
isRemovable checks to only candidates for removal.
2020-06-25 14:01:10 +01:00
Florian Hahn 4e62c6359c [DSE] Eliminate stores at the end of the function.
This patch add support for eliminating MemoryDefs that do not have any
aliasing users, which indicates that there are no reads/writes to the
memory location until the end of the function.

To eliminate such defs, we have to ensure that the underlying object is
not visible in the caller and does not escape via returning. We need a
separate check for that, as InvisibleToCaller does not consider returns.

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker, george.burgess.iv

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D72631
2020-06-24 12:58:20 +01:00
Florian Hahn ff4de8683a [DSE,MSSA] Treat `store 0` after calloc as noop stores.
This patch extends storeIsNoop to also detect stores of 0 to an calloced
object. This basically ports the logic from legacy DSE to the MemorySSA
backed version.

It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3
LTO:

Same hash: 218 (filtered out)
Remaining: 19
Metric: dse.NumNoopStores

Program                                        base   patch2 diff
 test-suite...CFP2000/177.mesa/177.mesa.test     1.00  15.00 1400.0%
 test-suite...6/482.sphinx3/482.sphinx3.test     1.00  14.00 1300.0%
 test-suite...lications/ClamAV/clamscan.test     2.00  28.00 1300.0%
 test-suite...CFP2006/433.milc/433.milc.test     1.00   8.00 700.0%
 test-suite...pplications/oggenc/oggenc.test     2.00   9.00 350.0%
 test-suite.../CINT2000/176.gcc/176.gcc.test     6.00   6.00  0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test    NaN   137.00  nan%
 test-suite...libquantum/462.libquantum.test    NaN     3.00  nan%
 test-suite...6/464.h264ref/464.h264ref.test    NaN     7.00  nan%
 test-suite...decode/alacconvert-decode.test    NaN     2.00  nan%
 test-suite...encode/alacconvert-encode.test    NaN     2.00  nan%
 test-suite...ications/JM/ldecod/ldecod.test    NaN     9.00  nan%
 test-suite...ications/JM/lencod/lencod.test    NaN    39.00  nan%
 test-suite.../Applications/lemon/lemon.test    NaN     2.00  nan%
 test-suite...pplications/treecc/treecc.test    NaN     4.00  nan%
 test-suite...hmarks/McCat/08-main/main.test    NaN     4.00  nan%
 test-suite...nsumer-lame/consumer-lame.test    NaN     3.00  nan%
 test-suite.../Prolangs-C/bison/mybison.test    NaN     1.00  nan%
 test-suite...arks/mafft/pairlocalalign.test    NaN    30.00  nan%

Reviewers: efriedma, zoecarver, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D82204
2020-06-23 21:01:39 +01:00
Florian Hahn a822ec75cc [DSE,MSSA] Treat passed by value args as invisible to caller.
This updates the MemorySSA backed implementation to treat arguments
passed by value similar to allocas: in they are assumed to be invisible
in the caller. This is similar to how they are treated in legacy DSE.

Reviewers: efriedma, asbirlea, george.burgess.iv

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D82222
2020-06-23 08:58:51 +01:00
Florian Hahn 328c8642e2 [DSE,MSSA] Reorder DSE blocking checks.
Currently we stop exploring candidates too early in some cases.

In particular, we can continue checking the defining accesses of
non-removable MemoryDefs and defs without analyzable write location
(read clobbers are already ruled out using MemorySSA at this point).
2020-06-22 17:16:34 +01:00
Florian Hahn 0e19ff02d8 [DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC). 2020-06-22 10:58:53 +01:00
Florian Hahn 40569db7b3 [DSE,MSSA] Move reachability check to main loop.
As we traverse the CFG backwards, we could end up reaching unreachable
blocks. For unreachable blocks, we won't have computed post order
numbers and because DomAccess is reachable, unreachable blocks cannot be
on any path from it.

This fixes a crash with unreachable blocks.
2020-06-21 16:38:10 +01:00
Florian Hahn 120c059292 [DSE,MSSA] Port partial store merging.
Port partial constant store merging logic to MemorySSA backed DSE. The
heavy lifting is done by the existing helper function. It is used in
context where we already ensured that the later instruction can
eliminate the earlier one, if it is a complete overwrite.
2020-06-15 18:41:46 +01:00
Florian Hahn 71a91b9837 [DSE] Hoist partial store merging code into function (NFC).
Hoist the general logic into a new function, because it can be re-used
by the MemorySSA backed DSE as well.
2020-06-15 17:44:24 +01:00
Florian Hahn 8c61f13a0f [DSE,MSSA] Delete instructions after printing it.
Also enables a now-passing test case, that exposed a crash caused by the
wrong order.
2020-06-15 16:01:36 +01:00