Add new type of tree node for `InsertElementInst` chain forming vector.
These instructions could be either removed, or replaced by shuffles during
vectorization and we can add this node to cost model, so naturally estimating
their cost, getting rid of `CompensateCost` tricks and reducing further work
for InstCombine. This fixes PR40522 and PR35732 in a natural way. Also this
patch is the first step towards revectorization of partially vectorization
(to fix PR42022 completely). After adding inserts to tree the next step is
to add vector instructions there (for instance, to merge `store <2 x float>`
and `store <2 x float>` to `store <4 x float>`).
Fixes PR40522 and PR35732.
Differential Revision: https://reviews.llvm.org/D98714
This commit copies existing tests at llvm/Transforms and replaces
'insertelement undef' in those files with 'insertelement poison'.
(see https://reviews.llvm.org/D93586)
Tests listed using this script:
grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq |
wc -l
Tests updated:
file_org=llvm/test/Transforms/$1
file=${file_org%.ll}-inseltpoison.ll
cp $file_org $file
sed -i -E 's/^([^;]*)insertelement <(.*)> undef/\1insertelement <\2> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
echo "$file : should be manually updated"
# I manually updated the script
exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
This was probably fixed long ago, but I don't see a test
that lines up with the example and target in the bug report:
https://bugs.llvm.org/show_bug.cgi?id=13837
...so adding it here.
llvm-svn: 323269
Essentially the same as the GEP change in r230786.
A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)
import fileinput
import sys
import re
pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")
for line in sys.stdin:
sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))
Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7649
llvm-svn: 230794
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.
This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.
* This doesn't modify gep operators, only instructions (operators will be
handled separately)
* Textual IR changes only. Bitcode (including upgrade) and changing the
in-memory representation will be in separate changes.
* geps of vectors are transformed as:
getelementptr <4 x float*> %x, ...
->getelementptr float, <4 x float*> %x, ...
Then, once the opaque pointer type is introduced, this will ultimately look
like:
getelementptr float, <4 x ptr> %x
with the unambiguous interpretation that it is a vector of pointers to float.
* address spaces remain on the pointer, not the type:
getelementptr float addrspace(1)* %x
->getelementptr float, float addrspace(1)* %x
Then, eventually:
getelementptr float, ptr addrspace(1) %x
Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.
update.py:
import fileinput
import sys
import re
ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
def conv(match, line):
if not match:
return line
line = match.groups()[0]
if len(match.groups()[5]) == 0:
line += match.groups()[2]
line += match.groups()[3]
line += ", "
line += match.groups()[1]
line += "\n"
return line
for line in sys.stdin:
if line.find("getelementptr ") == line.find("getelementptr inbounds"):
if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
line = conv(re.match(ibrep, line), line)
elif line.find("getelementptr ") != line.find("getelementptr ("):
line = conv(re.match(normrep, line), line)
sys.stdout.write(line)
apply.sh:
for name in "$@"
do
python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
rm -f "$name.tmp"
done
The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh
After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).
The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.
Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7636
llvm-svn: 230786
The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math)
when converting scalar instructions into vectors. But this isn't a simple copy - we need to take
the intersection (the logical 'and') of the sets of flags on the scalars.
The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after:
http://reviews.llvm.org/D4015http://llvm.org/viewvc/llvm-project?view=revision&revision=211339
The vast majority of changed files are existing tests that were not propagating IR flags, but I've
also added a new test file for focused testing of IR flag possibilities.
Differential Revision: http://reviews.llvm.org/D5172
llvm-svn: 217051
By vectorizing a series of srl, or, ... instructions we have obfuscated the
intention so much that the backend does not know how to fold this code away.
radar://15336950
llvm-svn: 193573