This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.
Differential Revision: http://reviews.llvm.org/D18523
llvm-svn: 267725
Commit r259357 was reverted because it caused PR26629. We were assuming all
roots of a vectorizable tree could be truncated to the same width, which is not
the case in general. This commit reapplies the patch along with a fix and a new
test case to ensure we don't regress because of this issue again. This should
fix PR26629.
llvm-svn: 261212
This patch is the second attempt to reapply commit r258404. There was bug in
the initial patch and subsequent fix (mentioned below).
The initial patch caused an assertion because we were computing smaller type
sizes for instructions that cannot be demoted. The fix first determines the
instructions that will be demoted, and then applies the smaller type size to
only those instructions.
This should fix PR26239 and PR26307.
llvm-svn: 258929
This commit exposes a crash in computeKnownBits on the Chromium buildbots.
Reverting to investigate.
Reference: https://llvm.org/bugs/show_bug.cgi?id=26307
llvm-svn: 258812
We were hitting an assertion because we were computing smaller type sizes for
instructions that cannot be demoted. The fix first determines the instructions
that will be demoted, and then applies the smaller type size to only those
instructions.
This should fix PR26239.
llvm-svn: 258705
This change attempts to produce vectorized integer expressions in bit widths
that are narrower than their scalar counterparts. The need for demotion arises
especially on architectures in which the small integer types (e.g., i8 and i16)
are not legal for scalar operations but can still be used in vectors. Like
similar work done within the loop vectorizer, we rely on InstCombine to perform
the actual type-shrinking. We use the DemandedBits analysis and
ComputeNumSignBits from ValueTracking to determine the minimum required bit
width of an expression.
Differential revision: http://reviews.llvm.org/D15815
llvm-svn: 258404
The fix uniques the bundle of getelementptr indices we are about to vectorize
since it's possible for the same index to be used by multiple instructions.
The original commit message is below.
[SLP] Vectorize the index computations of getelementptr instructions.
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.
llvm-svn: 257918
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.
Differential Revision: http://reviews.llvm.org/D14829
llvm-svn: 257800