function to find the globals, iterate over all of the globals directly. This
speeds the function up from 14s to 6.3s on perlbmk, reducing DSA time from
53s to 46s.
llvm-svn: 10996
This reduces the number of nodes allocated, then immediately merged and DNE'd
from 2193852 to 1298049. Unfortunately this only speeds DSA up by ~1.5s (of
53s), because it's spending most of its time waddling through the scalar map :(
llvm-svn: 10992
Also, use RC::merge when possible, reducing the number of nodes allocated, then immediately merged away from 2985444 to 2193852 on perlbmk.
llvm-svn: 10991
it to be off. If it looks like it's completely unnecessary after testing, I
will remove it completely (which is the hope).
* Callers of the DSNode "copy ctor" can now choose to not copy links.
* Make node collapsing not create a garbage node in some cases, avoiding a
memory allocation, and a subsequent DNE.
* When merging types, allow two functions of different types to be merged
without collapsing.
* Use DSNodeHandle::isNull more often instead of DSNodeHandle::getNode() == 0,
as it is much more efficient.
*** Implement the new, more efficient reachability cloner class
In addition to only cloning nodes that are reachable from interesting
roots, this also fixes the huge inefficiency we had where we cloned lots
of nodes, only to merge them away immediately after they were cloned.
Now we only actually allocate a node if there isn't one to merge it into.
* Eliminate the now-obsolete cloneReachable* and clonePartiallyInto methods
* Rewrite updateFromGlobalsGraph to use the reachability cloner
* Rewrite mergeInGraph to use the reachability cloner
* Disable the scalar map scanning code in removeTriviallyDeadNodes. In large
SCC's, this is extremely expensive. We need a better data structure for the
scalar map, because we really want to scan the unique node handles, not ALL
of the scalars.
* Remove the incorrect SANER_CODE_FOR_CHECKING_IF_ALL_REFERRERS_ARE_FROM_SCALARMAP code.
* Move the code for eliminating integer nodes from the trivially dead
eliminator to the dead node eliminator.
* removeDeadNodes no longer uses removeTriviallyDeadNodes, as it contains a
superset of the node removal power.
* Only futz around with the globals graph in removeDeadNodes if it is modified
llvm-svn: 10987
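The reachability cloner idea described in this entry can be sketched roughly as follows. This is a minimal illustration, not the actual DSA code: Node, Graph, ReachabilityCloner and getClonedNode here are simplified, hypothetical stand-ins, nodes are matched only by global name, and "merging" just reuses the existing destination node instead of doing real unification.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a DS graph node.
struct Node {
  std::string global;        // non-empty if the node represents a global
  std::vector<Node*> links;  // outgoing edges
};

// Hypothetical graph that owns its nodes and indexes globals by name.
struct Graph {
  std::vector<std::unique_ptr<Node>> nodes;
  std::map<std::string, Node*> globals;

  Node* addNode(const std::string& global = "") {
    nodes.push_back(std::unique_ptr<Node>(new Node()));
    Node* n = nodes.back().get();
    n->global = global;
    if (!global.empty()) globals[global] = n;
    return n;
  }
};

// Clones only nodes reachable from the roots it is asked about, and only
// allocates a destination node when there is nothing to merge into.
class ReachabilityCloner {
  Graph& dest;
  std::map<const Node*, Node*> cloned;  // source node -> node in dest

public:
  unsigned allocated = 0;  // nodes actually created in dest

  explicit ReachabilityCloner(Graph& d) : dest(d) {}

  Node* getClonedNode(const Node* src) {
    if (!src) return nullptr;
    auto it = cloned.find(src);
    if (it != cloned.end()) return it->second;

    Node* dn = nullptr;
    if (!src->global.empty()) {
      auto g = dest.globals.find(src->global);
      if (g != dest.globals.end()) dn = g->second;  // reuse, no allocation
    }
    if (!dn) {
      dn = dest.addNode(src->global);
      ++allocated;
    }
    cloned[src] = dn;  // record before recursing so cycles terminate

    for (const Node* succ : src->links) {
      Node* ds = getClonedNode(succ);
      if (std::find(dn->links.begin(), dn->links.end(), ds) == dn->links.end())
        dn->links.push_back(ds);
    }
    return dn;
  }
};
```

Calling getClonedNode on each interesting root copies just the reachable subgraph; source nodes that no root reaches are never visited, and the allocated counter shows how many destination nodes had to be created.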
map was only used to implement a marginal GlobalsGraph optimization, and it
actually slows the analysis down (due to the overhead of keeping it), so just
eliminate it entirely.
llvm-svn: 10955
in terms of it.
Though clonePartiallyInto is not cloning partial graphs yet, this change
dramatically speeds up inlining of graphs with many scalars. For example,
this change speeds up the BU pass on 253.perlbmk from 69s to 36s, because
it avoids iteration over the scalar map, which can get pretty large.
llvm-svn: 10951
used to eliminate the hard coded, hacked in, sparc specific, global TargetData.
Changing the TargetData used to actually match the code fixes problems, and
eliminates a crash.
llvm-svn: 9659
and (2) faster inlining by cloning only reachable nodes. In particular:
(1) Added DSGraph::cloneReachableSubgraph and DSGraph::cloneReachableNodes
to clone the subgraph reachable from a set of root nodes, into the
current graph, merging the global nodes into those in the current graph.
The TD pass now uses this for faster inlining, and so does the
next function.
(2) Added DSGraph::updateFromGlobalGraph() to rematerialize nodes from the
globals graph into the current graph in both BU and TD passes.
(3) `I' flags are removed from all nodes in the globals graph, because they
are difficult to maintain correctly and are not needed anyway.
(4) Aux. function calls are only removed to the globals graph if they
will never be resolved. (This is what fixed gap.) The immediate
reason is that if we took these out of a function (and moved them to
the globals graph) we would need to rematerialize these nodes into the
function graph for every function in the BU pass. The longer term
problem is that we would need to find a way to remove them from the
globals graph iff they have been resolved on all paths through the
call graph.
llvm-svn: 7187
* Add new MultiObject flag to DSNode which keeps track of whether or not
multiple objects have been merged into the node, allowing must-alias info
to be tracked.
llvm-svn: 6794
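As a rough illustration of how such a flag could be used, here is a small sketch. FlagNode, mergeWith and mayTreatAsMustAlias are hypothetical names, and the sketch assumes that every merge of two nodes sets the flag; the real rule in DSNode may be narrower.

```cpp
#include <cassert>

// Hypothetical node carrying a small set of flag bits.
struct FlagNode {
  enum Flags : unsigned {
    Alloca      = 1u << 0,
    Heap        = 1u << 1,
    Global      = 1u << 2,
    MultiObject = 1u << 3  // more than one object was merged into this node
  };

  unsigned flags;

  explicit FlagNode(unsigned f) : flags(f) {}

  bool isMultiObject() const { return (flags & MultiObject) != 0; }

  // Assumption for this sketch: any merge means the node now stands for
  // several objects, so the flag is set unconditionally.
  void mergeWith(const FlagNode& other) {
    flags |= other.flags | MultiObject;
  }
};

// Two pointers that resolve to the same node can only be treated as
// must-alias candidates while that node still stands for a single object.
inline bool mayTreatAsMustAlias(const FlagNode* a, const FlagNode* b) {
  return a != nullptr && a == b && !a->isMultiObject();
}
```

A client would still need to compare offsets and other properties before concluding must-alias; the flag only rules out nodes that are known to represent several objects.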
This helps a lot of testcases, for example:
              New Time  New #Nodes  Old Time  Old #Nodes
254.gap:       91.1024       21605   91.1397       22657
povray31:       2.7807        8613    3.0152       10338
255.vortex:     1.2034        8153    1.2172        8822
moria:           .6756        3150     .7054        3877
300.twolf:       .1652        2010     .1851        3270
Typically, testcases which use long and ulong integers a lot get better, e.g. povray above.
llvm-svn: 5566
fact that the incoming nodes may be merged away at intermediate
steps. Use an extra level of indirection via DSNodeHandles
to track the nodes being merged. All this now happens in a
static helper function MergeNodes().
llvm-svn: 4947
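The indirection described here can be sketched with forwarding pointers. This is a simplified illustration under stated assumptions, not the real MergeNodes(): Handle and Node are hypothetical, each node has a single outgoing link, and a merged-away node simply forwards to the node that absorbed it.

```cpp
#include <cassert>

struct Node;

// Hypothetical handle that always resolves to the live representative,
// even if the node it was created for has since been merged away.
class Handle {
  mutable Node* n;

public:
  explicit Handle(Node* node = nullptr) : n(node) {}
  bool isNull() const { return n == nullptr; }  // no forwarding walk needed
  Node* get() const;                            // defined after Node
};

struct Node {
  Node* forward = nullptr;  // set once this node has been merged away
  Handle link;              // single outgoing edge, to keep the sketch small
};

inline Node* Handle::get() const {
  while (n && n->forward) n = n->forward;  // follow and cache forwarding
  return n;
}

// Merge the node behind b into the node behind a. The outgoing edges are
// captured as handles first, because the recursive merge may merge away
// either of the incoming nodes at an intermediate step.
inline void mergeNodes(Handle a, Handle b) {
  Node* keep = a.get();
  Node* gone = b.get();
  if (!keep || !gone || keep == gone) return;

  Handle keepLink = keep->link;
  Handle goneLink = gone->link;
  gone->forward = keep;
  gone->link = Handle();

  if (keepLink.isNull())
    a.get()->link = goneLink;  // re-resolve through the handle
  else
    mergeNodes(keepLink, goneLink);
}
```

Because callers hold handles rather than raw node pointers, a cyclic graph in which the "kept" node is itself merged away during the recursion still resolves to the right survivor afterwards.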
* The globals vector was getting broken and unsorted, which caused vortex
to get badly pessimized
* Node offset handling was really poor; in particular, we were not merging
types with offsets correctly. This caused several graphs to not be
merged.
llvm-svn: 4699
* Implement the first step of the Globals graph: Deleting nodes from function
graphs. In practice, these nodes need to be moved to the globals graph, but
this will be taken care of later. Note that the graphs computed right now are
not strictly correct!
llvm-svn: 4681
are marked DEAD. This helped track down some bugs
- Fix a bunch of bugs where we were doing work on nodes after they became dead
- Add support for aux function call sites
- Add support for not cloning call sites
llvm-svn: 4633
the BU class.
This will be used by the IPModRef class to do stuff; eventually, perhaps the
TD pass will use it also.
Speaking of the TD pass, this also eliminates the self recursive case, which
was broken, and couldn't occur anyway.
llvm-svn: 4599
nodes in the graph are memory objects, which is very nice. This also greatly
reduces the size and memory footprint for DSGraphs. For example, the local
DSGraph for llu went from 65 to 13 nodes with this change. As a side bonus,
dot seems to lay out the graphs slightly better too. :)
llvm-svn: 4488
be fixed, but we are getting much closer now.
* Make DSNode::TypeRec a full fledged DSTypeRec type.
* Add methods used to update and access the typerecords elements
* Add methods to query if and to cause a node to be completely folded
* DSGraph construction doesn't use the allocation type for anything at all,
now nodes get their type information based on how they are used.
* Fixed a bug with global value handling introduced in the last checkin
* GEP support is now much better, arrays are handled correctly. The array
flag is now updated in type records. There are still cases that are not
handled yet (we do not detect pessimizations), but getting much closer.
llvm-svn: 4465
Make copy ctor have two versions to avoid dealing with conditional template
argument. DSCallSite ctor now takes all arguments instead of taking one
and being populated later.
llvm-svn: 4240
used in the DS graphs. Essentially, what was vector<DSNodeHandle>
before is now a DSCallSite with the same vector, plus pointers to the
CallInst and the caller Function. The special-purpose class
BUDataStructure::CallSite is no longer needed.
llvm-svn: 4228
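The shape of such a call site record might look roughly like the sketch below. CallInst, Function and NodeHandle are empty placeholders here, and CallSiteSketch with its accessors is a hypothetical illustration of "the same vector, plus pointers to the CallInst and the caller Function", not the real DSCallSite interface.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Placeholders standing in for the real LLVM and DSA types.
struct CallInst {};
struct Function {};
struct NodeHandle {
  int node;
  explicit NodeHandle(int n = -1) : node(n) {}
};

// Hypothetical call site record: the vector of node handles that used to
// stand alone, bundled with the call instruction and the calling function.
class CallSiteSketch {
  CallInst* inst;
  Function* caller;
  std::vector<NodeHandle> handles;

public:
  CallSiteSketch(CallInst& ci, Function& f, std::vector<NodeHandle> h)
      : inst(&ci), caller(&f), handles(std::move(h)) {}

  CallInst& getCallInst() const { return *inst; }
  Function& getCaller() const { return *caller; }
  std::size_t numHandles() const { return handles.size(); }

  const NodeHandle& getHandle(std::size_t i) const {
    assert(i < handles.size() && "handle index out of range");
    return handles[i];
  }
};
```

Keeping the instruction and caller next to the handles means one record carries everything a pass needs to know about a call, so no pass-specific wrapper class is required.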
* BUClosure doesn't have to worry about global nodes
* TDClosure now works with global nodes
* Reenable DNE on TD pass, now that globals work right
llvm-svn: 4220
* Now any and all bytes of a DSNode can be merged together individually. This
is necessary to support the full generality of C and support aliasing
correctly.
llvm-svn: 4008
This changes both the bottom-up and top-down propagation so that
globals and other external objects do not have to appear in every
function, but only in functions in which they are referenced or
they can be used to access something else that is referenced.
llvm-svn: 3170
methods
* Eliminate AnalysisID: Now it is just a typedef for const PassInfo*
* Simplify how AnalysisID's are initialized
* Eliminate Analysis/Writer.cpp/.h: incorporate printing functionality into
the analyses themselves.
llvm-svn: 3115
* Add new RegisterOpt/RegisterAnalysis templates for registering passes that
are to show up in opt or analyze
* Register Analyses now
* Change optimizations to use RegisterOpt instead of RegisterPass
* Add support for different "PassType's"
* Remove getPassName implementations from various subclasses
llvm-svn: 3112
Each DSGraph now keeps a list of pending callers that have not
been inlined into the function represented by that graph.
It also keeps a copy of the original call nodes before the BU pass
eliminates some of them.
llvm-svn: 2965
* Make the function pointer argument explicit for call nodes
* Eliminate unreachable global values
* Merge call nodes that are identical
llvm-svn: 2266
* Simplify a lot of the inlining stuff. There are still problems, but not
many
* Break up the Function representation to have a vector for every different
node type so it is fast to find nodes of a particular flavor.
* Do more intelligent merging of call values
* Allow elimination of unreachable shadow and allocation nodes
* Generalize indistinguishability testing to allow merging of identical calls.
* Increase shadow node merging power
llvm-svn: 2010