llvm-project

Juergen Ributzka 38b67d0caf Add Constant Hoisting Pass

This pass identifies expensive constants to hoist and coalesces them to better prepare them for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach.

First it scans all instructions for integer constants and calculates their cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free.

If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code.

When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant.

This optimization is only applied to integer constants in instructions and simple (that is, not nested) constant cast expressions. For example:

  %0 = load i64* inttoptr (i64 big_constant to i64*)

Reviewed by Eric

llvm-svn: 200022

2014-01-24 18:23:08 +00:00
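The scan-and-classify step described in the commit message can be sketched roughly as follows. This is a toy model in Python, not the pass itself: TCC_FREE and TCC_BASIC are stand-ins for the cost categories named above, and toy_int_imm_cost, find_hoist_candidates, and the (name, operands) instruction encoding are all hypothetical names made up for illustration. The sketch only groups identical constants; the coalescing of similar constants and the bitcast trick are not modelled.

```python
from collections import defaultdict

# Hypothetical stand-ins for the cost categories named in the commit message.
TCC_FREE = 0
TCC_BASIC = 1


def toy_int_imm_cost(value):
    """Hypothetical cost model: immediates that fit in 16 signed bits are free."""
    if -(1 << 15) <= value < (1 << 15):
        return TCC_FREE
    # Pretend anything larger needs two instructions to materialize.
    return 2 * TCC_BASIC


def find_hoist_candidates(instructions, cost_fn=toy_int_imm_cost):
    """Map each expensive integer constant to the instructions that use it.

    `instructions` is a list of (name, operands) pairs. Integer operands are
    treated as immediates; all other operands are ignored.
    """
    uses = defaultdict(list)
    for name, operands in instructions:
        for operand in operands:
            # Only constants costing more than TCC_BASIC are worth hoisting.
            if isinstance(operand, int) and cost_fn(operand) > TCC_BASIC:
                uses[operand].append(name)
    return dict(uses)
```

With this toy cost model, a small immediate such as 7 is left alone, while a wide constant used by two instructions is reported once together with both users, which is the grouping that lets a single materialization be shared.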
IPA	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
AliasAnalysis.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
AliasAnalysisCounter.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
AliasAnalysisEvaluator.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
AliasDebugger.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
AliasSetTracker.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
Analysis.cpp	[PM] Make the verifier work independently of any pass manager.	2014-01-19 02:22:18 +00:00
BasicAliasAnalysis.cpp	Fix known typos	2014-01-24 17:20:08 +00:00
BlockFrequencyInfo.cpp	BlockFrequencyInfo: Readded getEntryFreq.	2013-12-20 22:11:11 +00:00
BranchProbabilityInfo.cpp	[block-freq] Teach branch probability how to return the edge weight in between a BasicBlock and one of its successors.	2013-12-14 02:24:25 +00:00
CFG.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
CFGPrinter.cpp	Use the new script to sort the includes of every file under lib.	2012-12-03 16:50:05 +00:00
CMakeLists.txt	delinearization of arrays	2013-11-12 22:47:20 +00:00
CaptureTracking.cpp	Make nocapture analysis work with addrspacecast	2014-01-14 19:11:52 +00:00
CodeMetrics.cpp	Begin fleshing out an interface in TTI for modelling the costs of	2013-01-22 11:26:02 +00:00
ConstantFolding.cpp	Add addrspacecast instruction.	2013-11-15 01:34:59 +00:00
CostModel.cpp	Get right cost for addrspacecast in cost model	2014-01-22 20:30:16 +00:00
Delinearization.cpp	Re-sort all of the includes with ./utils/sort_includes.py so that	2014-01-07 11:48:04 +00:00
DependenceAnalysis.cpp	Fix known typos	2014-01-24 17:20:08 +00:00
DomPrinter.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
DominanceFrontier.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
IVUsers.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
InstCount.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
InstructionSimplify.cpp	InstSimplify: Make shift, select and GEP simplifications vector-aware.	2014-01-24 17:09:53 +00:00
Interval.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
IntervalPartition.cpp	…
LLVMBuild.txt	LLVMBuild: Introduce a common section which currently has a list of the	2011-12-12 22:45:54 +00:00
LazyValueInfo.cpp	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.	2013-07-04 01:31:24 +00:00
LibCallAliasAnalysis.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
LibCallSemantics.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
Lint.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
Loads.cpp	Change GetPointerBaseWithConstantOffset's DataLayout argument from a	2013-01-31 02:00:45 +00:00
LoopInfo.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
LoopPass.cpp	[PM] Rename the IR printing pass header to a more generic and correct	2014-01-12 11:10:32 +00:00
Makefile	…
MemDepPrinter.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
MemoryBuiltins.cpp	Teach MemoryBuiltins about address spaces	2013-12-14 00:27:48 +00:00
MemoryDependenceAnalysis.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
ModuleDebugInfoPrinter.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
NoAliasAnalysis.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
PHITransAddr.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
PostDominators.cpp	[PM] Pull the generic graph algorithms and data structures for dominator	2014-01-13 10:52:56 +00:00
PtrUseVisitor.cpp	Hoist the GEP constant address offset computation to a common home on	2012-12-11 10:29:10 +00:00
README.txt	…
RegionInfo.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
RegionPass.cpp	Remove the block_node_iterator of Region, replace it by the block_iterator.	2012-08-27 13:49:24 +00:00
RegionPrinter.cpp	Use the new script to sort the includes of every file under lib.	2012-12-03 16:50:05 +00:00
ScalarEvolution.cpp	Fix known typos	2014-01-24 17:20:08 +00:00
ScalarEvolutionAliasAnalysis.cpp	Use the new script to sort the includes of every file under lib.	2012-12-03 16:50:05 +00:00
ScalarEvolutionExpander.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
ScalarEvolutionNormalization.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
SparsePropagation.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
TargetTransformInfo.cpp	Add Constant Hoisting Pass	2014-01-24 18:23:08 +00:00
Trace.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
TypeBasedAliasAnalysis.cpp	TBAA: fix PR17620.	2013-10-22 01:40:25 +00:00
ValueTracking.cpp	Don't speculate loads under ThreadSanitizer	2013-11-21 07:29:28 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
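To see why the two forms agree, note that an add recurrence {c0,+,c1,+,c2} takes the value c0 + c1*C(i,1) + c2*C(i,2) at iteration i, so {1,+,3,+,2} is 1 + 3*i + i*(i-1) = (i+1)*(i+1). The sketch below checks this numerically and also models the expanded expression with wrapping arithmetic. It is an illustration in Python rather than anything taken from ScalarEvolution: chrec_at and expanded_exit_value are hypothetical helper names, and the choice of exit iteration i = %n - 1 is an assumption inferred from the (-2 + %n) * (-1 + %n) / 2 term in the expression above.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def chrec_at(coeffs, i):
    """Value of the add recurrence {c0,+,c1,+,c2,...} at iteration i >= 0."""
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def expanded_exit_value(n):
    """Model the i65-based expression with wrapping integer arithmetic."""
    a = (n - 2) & MASK64            # i64 (-2 + %n)
    b = (n - 1) & MASK64            # i64 (-1 + %n)
    product = (a * b) & MASK65      # multiply of the zero-extended values in i65
    half = (product >> 1) & MASK64  # /u 2, then trunc to i64
    return (-2 + 2 * half + 3 * n) & MASK64
```

Under these assumptions both chrec_at([1, 3, 2], n - 1) and expanded_exit_value(n) match n * n modulo 2^64. The extra 65th bit is what keeps the halved product exact, which a single i64 multiply of %n by itself would not need.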

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

  ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

  (-1 * (trunc i64 undef to i32))
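The fold is valid because truncation commutes with negation in modular arithmetic: the low 32 bits of (-1 * %arg5) and the low 32 bits of %arg5 always sum to zero modulo 2^32, so the first two terms cancel. A small Python check of that identity is sketched below. The helper names are hypothetical, and undef is modelled as an arbitrary but fixed 64-bit value u, which is a simplifying assumption about undef semantics.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(x):
    """Keep the low 32 bits of a 64-bit value."""
    return x & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution forms, with `u` standing in for undef."""
    neg_arg5 = (-1 * arg5) & MASK64  # i64 (-1 * %arg5), wrapped to 64 bits
    total = (
        trunc_i64_to_i32(neg_arg5)
        + trunc_i64_to_i32(arg5)
        - trunc_i64_to_i32(u)
    )
    return total & MASK32


def folded(u):
    """The simplified form: (-1 * (trunc i64 undef to i32))."""
    return (-1 * trunc_i64_to_i32(u)) & MASK32
```

For any 64-bit arg5 and u, unfolded(arg5, u) and folded(u) give the same 32-bit result, which is the equivalence the README entry asks ScalarEvolution to recognize.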

//===---------------------------------------------------------------------===//