We call tail-call-elim near the beginning of the pipeline, but that is too early to annotate calls that get added later.

In the motivating case from issue #47852, the missing 'tail' on memset leads to sub-optimal codegen.

I experimented with removing the early instance of tail-call-elim instead of just adding another pass, but that appears to be slightly worse for compile-time: +0.15% vs. +0.08% time.

"tailcall" shows adding the pass; "tailcall2" shows moving the pass to later, then adding the original early pass back (so 1596886802 is functionally equivalent to 180b0439dc):
https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions&remote=rotateright

Note that there was an effort to split the tail call functionality into 2 passes - that could help reduce compile-time if we find that this change costs more in compile-time than expected based on the preliminary testing:
D60031

Differential Revision: https://reviews.llvm.org/D130374

| | | |
|---|---|---|
| interleave_IC.ll | ||
| interleaved-pointer-runtime-check-unprofitable.ll | ||
| large-loop-rdx.ll | ||
| lit.local.cfg | ||
| massv-altivec.ll | ||
| massv-calls.ll | ||
| massv-nobuiltin.ll | ||
| massv-unsupported.ll | ||
| optimal-epilog-vectorization-profitability.ll | ||
| optimal-epilog-vectorization.ll | ||
| pr30990.ll | ||
| pr41179.ll | ||
| reg-usage.ll | ||
| small-loop-rdx.ll | ||
| stride-vectorization.ll | ||
| vectorize-bswap.ll | ||
| vectorize-only-for-real.ll | ||
| vsx-tsvc-s173.ll | ||
| widened-massv-call.ll | ||
| widened-massv-vfabi-attr.ll | ||