llvm-project/llvm/lib/Analysis
Juergen Ributzka 38b67d0caf Add Constant Hoisting Pass
This pass identifies expensive constants to hoist and coalesces them to
better prepare it for SelectionDAG-based code generation. This works around the
limitations of the basic-block-at-a-time approach.

First it scans all instructions for integer constants and calculates its
cost. If the constant can be folded into the instruction (the cost is
TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't
consider it expensive and leave it alone. This is the default behavior and
the default implementation of getIntImmCost will always return TCC_Free.

If the cost is more than TCC_BASIC, then the integer constant can't be folded
into the instruction and it might be beneficial to hoist the constant.
Similar constants are coalesced to reduce register pressure and
materialization code.

When a constant is hoisted, it is also hidden behind a bitcast to force it to
be live-out of the basic block. Otherwise the constant would be just
duplicated and each basic block would have its own copy in the SelectionDAG.
The SelectionDAG recognizes such constants as opaque and doesn't perform
certain transformations on them, which would create a new expensive constant.

This optimization is only applied to integer constants in instructions and
simple (this means not nested) constant cast experessions. For example:
%0 = load i64* inttoptr (i64 big_constant to i64*)

Reviewed by Eric

llvm-svn: 200022
2014-01-24 18:23:08 +00:00
..
IPA Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
AliasAnalysis.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
AliasAnalysisCounter.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
AliasAnalysisEvaluator.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
AliasDebugger.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
AliasSetTracker.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
Analysis.cpp [PM] Make the verifier work independently of any pass manager. 2014-01-19 02:22:18 +00:00
BasicAliasAnalysis.cpp Fix known typos 2014-01-24 17:20:08 +00:00
BlockFrequencyInfo.cpp BlockFrequencyInfo: Readded getEntryFreq. 2013-12-20 22:11:11 +00:00
BranchProbabilityInfo.cpp [block-freq] Teach branch probability how to return the edge weight in between a BasicBlock and one of its successors. 2013-12-14 02:24:25 +00:00
CFG.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
CFGPrinter.cpp Use the new script to sort the includes of every file under lib. 2012-12-03 16:50:05 +00:00
CMakeLists.txt delinearization of arrays 2013-11-12 22:47:20 +00:00
CaptureTracking.cpp Make nocapture analysis work with addrspacecast 2014-01-14 19:11:52 +00:00
CodeMetrics.cpp Begin fleshing out an interface in TTI for modelling the costs of 2013-01-22 11:26:02 +00:00
ConstantFolding.cpp Add addrspacecast instruction. 2013-11-15 01:34:59 +00:00
CostModel.cpp Get right cost for addrspacecast in cost model 2014-01-22 20:30:16 +00:00
Delinearization.cpp Re-sort all of the includes with ./utils/sort_includes.py so that 2014-01-07 11:48:04 +00:00
DependenceAnalysis.cpp Fix known typos 2014-01-24 17:20:08 +00:00
DomPrinter.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
DominanceFrontier.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
IVUsers.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
InstCount.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
InstructionSimplify.cpp InstSimplify: Make shift, select and GEP simplifications vector-aware. 2014-01-24 17:09:53 +00:00
Interval.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
IntervalPartition.cpp
LLVMBuild.txt LLVMBuild: Introduce a common section which currently has a list of the 2011-12-12 22:45:54 +00:00
LazyValueInfo.cpp Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. 2013-07-04 01:31:24 +00:00
LibCallAliasAnalysis.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
LibCallSemantics.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
Lint.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
Loads.cpp Change GetPointerBaseWithConstantOffset's DataLayout argument from a 2013-01-31 02:00:45 +00:00
LoopInfo.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
LoopPass.cpp [PM] Rename the IR printing pass header to a more generic and correct 2014-01-12 11:10:32 +00:00
Makefile
MemDepPrinter.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
MemoryBuiltins.cpp Teach MemoryBuiltins about address spaces 2013-12-14 00:27:48 +00:00
MemoryDependenceAnalysis.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
ModuleDebugInfoPrinter.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
NoAliasAnalysis.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
PHITransAddr.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
PostDominators.cpp [PM] Pull the generic graph algorithms and data structures for dominator 2014-01-13 10:52:56 +00:00
PtrUseVisitor.cpp Hoist the GEP constant address offset computation to a common home on 2012-12-11 10:29:10 +00:00
README.txt
RegionInfo.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
RegionPass.cpp Remove the the block_node_iterator of Region, replace it by the block_iterator. 2012-08-27 13:49:24 +00:00
RegionPrinter.cpp Use the new script to sort the includes of every file under lib. 2012-12-03 16:50:05 +00:00
ScalarEvolution.cpp Fix known typos 2014-01-24 17:20:08 +00:00
ScalarEvolutionAliasAnalysis.cpp Use the new script to sort the includes of every file under lib. 2012-12-03 16:50:05 +00:00
ScalarEvolutionExpander.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
ScalarEvolutionNormalization.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
SparsePropagation.cpp Move all of the header files which are involved in modelling the LLVM IR 2013-01-02 11:36:10 +00:00
TargetTransformInfo.cpp Add Constant Hoisting Pass 2014-01-24 18:23:08 +00:00
Trace.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
TypeBasedAliasAnalysis.cpp TBAA: fix PR17620. 2013-10-22 01:40:25 +00:00
ValueTracking.cpp Don't speculate loads under ThreadSanitizer 2013-11-21 07:29:28 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//