Summary:
LSV replaces multiple adjacent loads with one vectorized load and a
bunch of extractelement instructions. This patch makes the
extractelement instructions' names match those of the original loads,
for (hopefully) improved readability.
Reviewers: asbirlea, tstellarAMD
Subscribers: arsenm, mzolotukhin
Differential Revision: https://reviews.llvm.org/D23748
llvm-svn: 280818
Summary:
LSV was using two vector sets (heads and tails) to track pairs of adjiacent position to vectorize.
A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions
in heads(H) and tails(T) match, which is not the case is there are multiple tails for the same head.
e.g.:
i1: store a[0]
i2: store a[1]
i3: store a[1]
Leads to:
H: i1
T: i2 i3
Instead of:
H: i1 i1
T: i2 i3
So the positions for instructions that follow i3 will have different indexes in H/T.
This patch resolves PR29148.
This issue also surfaced the fact that if the chain is too long, and TLI
returns a "not-fast" answer, the whole chain will be abandoned for
vectorization, even though a smaller one would be beneficial.
Added a testcase and FIXME for this.
Reviewers: tstellarAMD, arsenm, jlebar
Subscribers: mzolotukhin, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D24057
llvm-svn: 280179
Summary:
getVectorizablePrefix previously didn't work properly in the face of
aliasing loads/stores. It unwittingly assumed that the loads/stores
appeared in the BB in address order. If they didn't, it would do the
wrong thing.
Reviewers: asbirlea, tstellarAMD
Subscribers: arsenm, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D22535
llvm-svn: 276072
Summary:
Previously, the insertion point for stores was the last instruction in
Chain *before calling getVectorizablePrefixEndIdx*. Thus if
getVectorizablePrefixEndIdx didn't return Chain.size(), we still would
insert at the last instruction in Chain.
This patch changes our internal API a bit in an attempt to make it less
prone to this sort of error. As a result, we end up recalculating the
Chain's boundary instructions, but I think worrying about the speed hit
of this is a premature optimization right now.
Reviewers: asbirlea, tstellarAMD
Subscribers: mzolotukhin, arsenm, llvm-commits
Differential Revision: https://reviews.llvm.org/D22534
llvm-svn: 276056
Summary:
Aiming to correct the ordering of loads/stores. This patch changes the
insert point for loads to the position of the first load.
It updates the ordering method for loads to insert before, rather than after.
Before this patch the following sequence:
"load a[1], store a[1], store a[0], load a[2]"
Would incorrectly vectorize to "store a[0,1], load a[1,2]".
The correctness check was assuming the insertion point for loads is at
the position of the first load, when in practice it was at the last
load. An alternative fix would have been to invert the correctness check.
The current fix changes insert position but also requires reordering of
instructions before the vectorized load.
Updated testcases to reflect the changes.
Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm
Subscribers: mzolotukhin
Differential Revision: http://reviews.llvm.org/D22071
llvm-svn: 275117
Summary:
GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundry set is not being tested by isVectorizable().
Partially solve reordering of instructions. More extensive solution to follow.
Reviewers: tstellarAMD, llvm-commits, jlebar
Subscribers: escha, arsenm, mzolotukhin
Differential Revision: http://reviews.llvm.org/D21934
llvm-svn: 274389
integer.
Fixes issues on some architectures where we use arithmetic ops to build
vectors, which can cause bad things to happen for loads/stores of mixed
types.
Patch by Fiona Glaser
llvm-svn: 274307