forked from OSchip/llvm-project
Hoisting and sinking instructions out of conditional blocks enables additional vectorization by: 1. Executing memory accesses unconditionally. 2. Reducing the number of instructions that need predication. After disabling early hoisting / sinking, we miss out on a few vectorization opportunities. One of those is causing a ~10% performance regression in one of the Geekbench benchmarks on AArch64. This patch tries to recover the regression by running hoisting/sinking as part of a SimplifyCFG run after LoopRotate and before LoopVectorize. Note that in the legacy pass-manager, we run LoopRotate just before vectorization again and there's no SimplifyCFG run in between, so the sinking/hoisting may impact the later run of LoopRotate. But the impact should be limited and the benefit of hoisting/sinking at this stage should outweigh the risk of not rotating. Compile-time impact looks slightly positive for most cases. http://llvm-compile-time-tracker.com/compare.php?from=2ea7fb7b1c045a7d60fcccf3df3ebb26aa3699e5&to=e58b4a763c691da651f25996aad619cb3d946faf&stat=instructions NewPM-O3: geomean -0.19% NewPM-ReleaseThinLTO: geomean -0.54% NewPM-ReleaseLTO-g: geomean -0.03% With a few benchmarks seeing a notable increase, but also some improvements. Alternative to D101290. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D101468 |
||
---|---|---|
AlwaysInliner.cpp | ||
Annotation2Metadata.cpp | ||
ArgumentPromotion.cpp | ||
Attributor.cpp | ||
AttributorAttributes.cpp | ||
BarrierNoopPass.cpp | ||
BlockExtractor.cpp | ||
CMakeLists.txt | ||
CalledValuePropagation.cpp | ||
ConstantMerge.cpp | ||
CrossDSOCFI.cpp | ||
DeadArgumentElimination.cpp | ||
ElimAvailExtern.cpp | ||
ExtractGV.cpp | ||
ForceFunctionAttrs.cpp | ||
FunctionAttrs.cpp | ||
FunctionImport.cpp | ||
GlobalDCE.cpp | ||
GlobalOpt.cpp | ||
GlobalSplit.cpp | ||
HotColdSplitting.cpp | ||
IPO.cpp | ||
IROutliner.cpp | ||
InferFunctionAttrs.cpp | ||
InlineSimple.cpp | ||
Inliner.cpp | ||
Internalize.cpp | ||
LoopExtractor.cpp | ||
LowerTypeTests.cpp | ||
MergeFunctions.cpp | ||
OpenMPOpt.cpp | ||
PartialInlining.cpp | ||
PassManagerBuilder.cpp | ||
PruneEH.cpp | ||
SCCP.cpp | ||
SampleContextTracker.cpp | ||
SampleProfile.cpp | ||
SampleProfileProbe.cpp | ||
StripDeadPrototypes.cpp | ||
StripSymbols.cpp | ||
SyntheticCountsPropagation.cpp | ||
ThinLTOBitcodeWriter.cpp | ||
WholeProgramDevirt.cpp |
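
The commit message above argues that hoisting and sinking instructions out of conditional blocks helps the vectorizer. As a rough source-level illustration only (this is not what SimplifyCFG emits, and the function names `scale_branchy` and `scale_sunk` are hypothetical), the sketch below shows a loop whose store sits in both arms of a branch, next to a hand-written equivalent where the loads are hoisted and the common multiply and store are sunk below a select:

```cpp
#include <cstddef>
#include <vector>

// Before: the multiply and the store to out[i] are duplicated in both arms of
// the branch, so each of them executes under control flow.
// Assumes a, b and out all have the same length.
void scale_branchy(const std::vector<int>& a, const std::vector<int>& b,
                   std::vector<int>& out) {
  for (std::size_t i = 0; i < out.size(); ++i) {
    if (a[i] > b[i]) {
      out[i] = a[i] * 2;
    } else {
      out[i] = b[i] * 2;
    }
  }
}

// After: a hand-written approximation of the hoisted/sunk form.
// Both loads run unconditionally, the branch collapses to a select,
// and a single multiply and store remain in the loop body.
void scale_sunk(const std::vector<int>& a, const std::vector<int>& b,
                std::vector<int>& out) {
  for (std::size_t i = 0; i < out.size(); ++i) {
    int x = a[i];               // hoisted load
    int y = b[i];               // hoisted load
    int larger = x > y ? x : y; // select instead of a branch
    out[i] = larger * 2;        // sunk multiply and store
  }
}
```

Both functions compute the same result; the second form has a straight-line loop body with unconditional memory accesses, which is the shape the commit message says is easier to vectorize without predication.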