In the case where we have an operand defined by a lod of the
same memory location. Historically this was a VariableExpression
because we wanted to make sure they ended up in the same class,
but if we create the right expression, they end up in the same
class anyway.
Fixes PR32897. Thanks to Dan for the detailed discussion and the
fix suggestion.
llvm-svn: 303475
This was here because we don't want to switch leaders too much,
in order to avoid fixpoint(ing) issue, but it's not sure if it
matters in practice.
A first step towards fixing PR32897.
llvm-svn: 303473
This is a complicated bug involving two issues:
1. What do we do with phi nodes when we prove all arguments are not
live?
2. When is it safe to use value leaders to determine if we can ignore
an argumnet?
llvm-svn: 303453
Summary:
NewGVN: Handle equivalence between phi of ops and op of phis.
This makes our GVN mostly-complete. It would be complete, modulo some
deliberate choices we make. This means it detects roughly all herband
equivalences in polynomial time, including cases notoriously hard for
other GVN's to detect. It also detects a very large swath of the
cases we currently rely on instcombine to detect that involve folding
upwards through phis.
Fixes PR 31125, 31463, PR 31868
Reviewers: davide
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D32151
llvm-svn: 303444
We can have cycles between PHIs and this causes singleReachablePhi()
to call itself indefintely (until we run out of stack). The proper
solution would be that of computing SCCs, but it's not worth for
now, so just keep a visited set and give up when we find a cycle.
Thanks to Dan for the discussion/help with this.
Fixes PR33014.
llvm-svn: 303393
verifyMemoryCongruency() filters out trivially dead MemoryDef(s),
as we find them immediately dead, before moving from TOP to a new
congruence class.
This fixes the same problem for PHI(s) skipping MemoryPhis if all
the operands are dead.
Differential Revision: https://reviews.llvm.org/D33044
llvm-svn: 303100
This code was missing a check for stores, so we were thinking the
congruency class didn't have any memory members, and reset the
memory leader.
Differential Revision: https://reviews.llvm.org/D33056
llvm-svn: 302905
The way we currently define congruency for two PHIExpression(s) is:
1) The operands to the phi functions are congruent
2) The PHIs are defined in the same BasicBlock.
NewGVN works under the assumption that phi operands are in predecessor
order, or at least in some consistent order. OTOH, is valid IR:
patatino:
%meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ]
%banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ]
br label %end
and the in-memory representations of the two SSA registers have an
inconsistent order. This violation of NewGVN assumptions results into
two PHIs found congruent when they're not. While we think it's useful
to have always a consistent order enforced, let's fix this in NewGVN
sorting uses in predecessor order before creating a PHI expression.
Differential Revision: https://reviews.llvm.org/D32990
llvm-svn: 302552
In the testcase attached, we believe %tmp1 implies %tmp4.
where:
br i1 %tmp1, label %bb2, label %bb7
br i1 %tmp4, label %bb5, label %bb7
because Wwhile looking at PredicateInfo stuffs we end up calling
isImpliedTrueByMatchingCmp() with the arguments backwards.
Differential Revision: https://reviews.llvm.org/D32718
llvm-svn: 301849
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.
All of this allows the iterator to be used with common STL facilities,
standard algorithms, etc.
Doing this exposed some missing facilities in the iterator facade that
I've fixed and required some work to the actual iterator to fully
support the necessary API.
Differential Revision: https://reviews.llvm.org/D31548
llvm-svn: 300032
Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transform/Utils.
NFC.
llvm-svn: 299980
memorydefs, not just stores. Along the way, we audit and fixup issues
about how we were tracking memory leaders, and improve the verifier
to notice more memory congruency issues.
llvm-svn: 299682
Summary:
Depends on D30928.
This adds support for coercion of stores and memory instructions that do not require insertion to process.
Another few tests down.
I added the relevant tests from rle.ll
Reviewers: davide
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D30929
llvm-svn: 299330
processing the congruence class of the store.
Because we use the stored value of a store as the def, it isn't dead
just because it appears as a def when it comes from a store.
Note: I have not hit any cases with the memory code as it is where
this breaks anything, just because of what memory congruences we
actually allow. In a followup that improves memory congruence,
this bug actually breaks real stuff (but the verifier catches it).
llvm-svn: 299300
Summary:
Depends on D29606 and D29682
Makes us pass GVN's edge.ll (we also will pass a few other testcases
they just need cleaning up).
Thoughts on the Predicate* hiearchy of classes especially welcome :)
(it's not clear to me how best to organize it, and currently, the getBlock* seems ... uglier than maybe wasting a field somewhere or something).
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29747
llvm-svn: 295889
Summary: This begins using the predicateinfo pass in NewGVN.
Reviewers: davide
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D29682
llvm-svn: 295583
it is dead or unreachable, as it should be.
This also makes the leader of INITIAL undef, enabling us to handle
irreducibility properly.
Summary:
This lets us verify, more than we do now, that we didn't screw up
value numbering.
Reviewers: davide
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D29842
llvm-svn: 294844
This reverts commit r293196
Besides making things look nicer, ATM, we'd like to preserve analysis
more than we'd like to destroy the CFG. We'll probably revisit in the future
llvm-svn: 293501
Summary:
This adds basic dead and redundant store elimination to
NewGVN. Unlike our current DSE, it will happily do cross-block DSE if
it meets our requirements.
We get a bunch of DSE's simple.ll cases, and some stuff it doesn't.
Unlike DSE, however, we only try to eliminate stores of the same value
to the same memory location, not just general stores to the same
memory location.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29149
llvm-svn: 293258
Summary:
This does not actually fix the testcase in PR31761 (discussion is
ongoing on the testcase), but does fix a bug it exposes, where stores
were not properly clobbering loads.
We accomplish this by unifying the memory equivalence infratructure
back into the normal congruence infrastructure, and then properly
destroying congruence classes when memory state leaders disappear.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29195
llvm-svn: 293216
Don't call `isTriviallyDeadInstructions()` once we discover that
an instruction is dead. Instead, set DFS number zero (as suggested
by Danny) and forget about it (this also speeds up things as we
won't try to reprocess that block).
Differential Revision: https://reviews.llvm.org/D28930
llvm-svn: 292676
Summary:
This rewrites store expression/leader handling. We no longer use the
value operand as the leader, instead, we store it separately. We also
now store the stored value as part of the expression, and compare it
when comparing stores for equality. This enables us to get rid of a
bunch of our previous hacks and machinations, as the existing
machinery takes care of everything *except* updating the stored value
on classes. The only time we have to update it is if the storecount
goes to 0, and when we do, we destroy it.
Since we no longer use the value operand as the leader, during elimination, we have to use the value operand. Doing this also fixes a bunch of store forwarding cases we were missing.
Any value operand we use is guaranteed to either be updated by previous eliminations, or minimized by future ones.
(IE the fact that we don't use the most dominating value operand when it's not a constant does not affect anything).
Sadly, this change also exposes that we didn't pay attention to the
output of the pr31594.ll test, as it also very clearly exposes the
same store leader bug we are fixing here.
(I added pr31682.ll anyway, but maybe we think that's too large to be useful)
On the plus side, propagate-ir-flags.ll now passes due to the
corrected store forwarding.
This change was 3 stage'd on darwin and linux, with the full test-suite.
Reviewers:
davide
Subscribers:
llvm-commits
llvm-svn: 292648
Part of the assert has been left active for further debugging.
The other part has been turned into a stat for tracking for the
moment.
llvm-svn: 292583
Summary:
This is a testcase where phi node cycling happens, and because we do
not order the leaders by domination or anything similar, the leader
keeps changing.
Using std::set for the members is too expensive, and we actually don't
need them sorted all the time, only at leader changes.
We could keep both a set and a vector, and keep them mostly sorted and
resort as necessary, or use a set and a fibheap, but all of this seems
premature.
After running some statistics, we are able to avoid the vast majority
of sorting by keeping a "next leader" field. Most congruence classes only have
leader changes once or twice during GVN.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28594
llvm-svn: 291968
It was always zero. When we move a store from `initial` to its
own congruency class, we end up with a negative store count, which
is obviously wrong.
Also, while here, change StoreCount to be signed so that the assertions
actually fire.
Ack'ed by Daniel Berlin.
llvm-svn: 291725
classes, and updating checking to allow for equivalence through
reachability.
(Sadly, the checking here is not perfect, and can't be made perfect,
so we'll have to disable it after we are satisfied with correctness.
Right now it is just "very unlikely" to happen.)
llvm-svn: 291698
Summary: LLVM's non-standard notion of phi nodes means we can't both try to substitute for undef in phi nodes *and* use phi nodes as leaders all the time. This changes NewGVN to use the same semantics as SimplifyPHINode to decide which phi nodes are equivalent.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28312
llvm-svn: 291308
Apparently my suggestion of using ternary doesn't really work
as clang complains about incompatible types on LHS and RHS. Some
GCC versions happen to accept the code but clang behaviour is
correct here.
llvm-svn: 290822
Summary:
This avoids the very fragile code for null expressions. We could also use a denseset that tracks which things have null expressions instead, but that seems pretty fragile and premature optimization.
This resolves a number of infinite loop cases, test reductions coming.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28193
llvm-svn: 290816
Summary: Previously, we tried to fix up the equivalences during symbolic evaluation. This does not work. Now, we change the equivalences during congruence finding, where it belongs. We also initialize the equivalence table to give a maximal answer.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28192
llvm-svn: 290815
Summary:
The optimal iteration order for this problem is RPO order. We want to
process as many preds of a backedge as we can before we process the
backedge.
At the same time, as we add predicate handling, we want to be able to
touch instructions that are dominated by a given block by
ranges (because a change in value numbering a predicate possibly
affects all users we dominate that are using that predicate).
If we don't do it this way, we can't do value inference over
backedges (the paper covers this in depth).
The newgvn branch currently overshoots the last part, and guarantees
that it will touch *at least* the right set of instructions, but it
does touch more. This is because the bitvector instruction ranges are
currently generated in RPO order (so we take the max and the min of
the ranges of dominated blocks, which means there are some in the
middle we didn't have to touch that we did).
We can do better by sorting the dominator tree, and then just using
dominator tree order.
As a preliminary, the dominator tree has some RPO guarantees, but not
enough. It guarantees that for a given node, your idom must come
before you in the RPO ordering. It guarantees no relative RPO ordering
for siblings. We add siblings in whatever order they appear in the module.
So that is what we fix.
We sort the children array of the domtree into RPO order, and then use
the dominator tree for ordering, instead of RPO, since the dominator
tree is now a valid RPO ordering.
Note: This would help any other pass that iterates a forward problem
in dominator tree order. Most of them are single pass. It will still
maximize whatever result they compute. We could also build the
dominator tree in this order, but our incremental updates would still
put it out of sort order, and recomputing the sort order is almost as
hard as general incremental updates of the domtree.
Also note that the sorting does not affect any tests, etc. Nothing
depends on domtree order, including the verifier, the equals
functions for domtree nodes, etc.
How much could this matter, you ask?
Here are the current numbers.
This is generated by running NewGVN over all files in LLVM.
Note that once we propagate equalities, the differences go up by an
order of magnitude or two (IE instead of 29, the max ends up in the
thousands, since the worst case we add a factor of N, where N is the
number of branch predicates). So while it doesn't look that stark for
the default ordering, it gets *much much* worse. There are also
programs in the wild where the difference is already pretty stark
(2 iterations vs hundreds).
RPO ordering:
759040 Number of iterations is 1
112908 Number of iterations is 2
Default dominator tree ordering:
755081 Number of iterations is 1
116234 Number of iterations is 2
603 Number of iterations is 3
27 Number of iterations is 4
2 Number of iterations is 5
1 Number of iterations is 7
Dominator tree sorted:
759040 Number of iterations is 1
112908 Number of iterations is 2
<yay!>
Really bad ordering (sort domtree siblings in postorder. not quite the
worst possible, but yeah):
754008 Number of iterations is 1
21 Number of iterations is 10
8 Number of iterations is 11
6 Number of iterations is 12
5 Number of iterations is 13
2 Number of iterations is 14
2 Number of iterations is 15
3 Number of iterations is 16
1 Number of iterations is 17
2 Number of iterations is 18
96642 Number of iterations is 2
1 Number of iterations is 20
2 Number of iterations is 21
1 Number of iterations is 22
1 Number of iterations is 29
17266 Number of iterations is 3
2598 Number of iterations is 4
798 Number of iterations is 5
273 Number of iterations is 6
186 Number of iterations is 7
80 Number of iterations is 8
42 Number of iterations is 9
Reviewers: chandlerc, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28129
llvm-svn: 290699
emplace_back is not faster if it is equivalent to push_back. In this cases emplaced value had the
same type that the one stored in container. It is ugly and it might be even slower (see
Scott Meyers presentation about emplacement).
llvm-svn: 290685
Mostly use a bit more idiomatic C++ where we can,
so we can combine some things later.
Reviewers: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28111
llvm-svn: 290550