llvm-project

Commit Graph

Author	SHA1	Message	Date
Bjorn Pettersson	d58512b2e3	[SLP] Don't vectorize stores of non-packed types (like i1, i2) In the spirit of commit `fc783e91e0` (llvm-svn: 248943) we shouldn't vectorize stores of non-packed types (i.e. types that has padding between consecutive variables in a scalar layout, but being packed in a vector layout). The problem was detected as a miscompile in a downstream test case. Reviewed By: anton-afanasyev Differential Revision: https://reviews.llvm.org/D94446	2021-01-14 11:30:33 +01:00
Sanjay Patel	e433ca28ec	[SLP] add reduction test for FMF; NFC	2021-01-13 11:43:51 -05:00
Bjorn Pettersson	dd07d60ec3	[SLP] Add test case showing a bug when dealing with padded types We shouldn't vectorize stores of non-packed types (i.e. types that has padding between consecutive variables in a scalar layout, but being packed in a vector layout). The problem was detected as a miscompile in a downstream test case. This is a pre-commit of a test case for the fix in D94446.	2021-01-12 16:35:33 +01:00
Alexey Bataev	0e57084d0e	[SLP][NFC]Add a test for reused shrink check, NFC.	2021-01-08 06:23:23 -08:00
Alexander Belyaev	bcbdeafa9c	Revert "[SLP]Need shrink the load vector after reordering." This reverts commit `4284afdf94`. This changes computed values in fused_batchnorm_test_cpu. Not equal to tolerance rtol=1e-06, atol=0.001 Mismatched value: a is different from b. not close where = (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]), array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5])) not close lhs = [-0.6636615 -0.9804948 -1.148275 -0.68193716 -0.8572368 -0.65046215 -0.6993756 -1.2244141 -1.0938729 -0.50369143 -0.51830524 -0.738452 -0.7214286 -0.48115745 -0.9380924 -0.9341769 -0.5916775 -1.2896856 -0.7264182 -0.9746917 -0.783249 -0.7659018 -0.86214024 -0.47784212] not close rhs = [ 0.44102234 0.12418899 -0.04359123 0.42274666 0.24744703 0.45422167 0.40530816 -0.11973029 0.01081094 0.6009924 0.5863786 0.3662318 0.38325527 0.62352633 0.1665914 0.1705069 0.5130063 -0.18500176 0.37826565 0.12999213 0.3214348 0.338782 0.24254355 0.62684166] not close dif = [1.1046839 1.1046838 1.1046838 1.1046839 1.1046839 1.1046839 1.1046838 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046838 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046838 1.1046838] not close tol = [0.00100044 0.00100012 0.00100004 0.00100042 0.00100025 0.00100045 0.00100041 0.00100012 0.00100001 0.0010006 0.00100059 0.00100037 0.00100038 0.00100062 0.00100017 0.00100017 0.00100051 0.00100019 0.00100038 0.00100013 0.00100032 0.00100034 0.00100024 0.00100063]	2021-01-08 14:42:26 +01:00
Alexey Bataev	4284afdf94	[SLP]Need shrink the load vector after reordering. After merging the shuffles, we cannot rely on the previous shuffle anymore and need to shrink the final shuffle, if it is required. Reported in D92668 Differential Revision: https://reviews.llvm.org/D93967	2021-01-07 04:50:48 -08:00
Juneyoung Lee	3a60a1f165	[InstSimplify] Fold insertelement vec, poison, idx into vec This is a simple patch that adds folding from `insertelement vec, poison, idx` into `vec`. Alive2 proof: https://alive2.llvm.org/ce/z/2y2vbC Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93994	2021-01-07 10:10:14 +09:00
Juneyoung Lee	8871a4b4ca	[Constant] Update ConstantVector::get to return poison if all input elems are poison The diff was reviewed at D93994	2021-01-07 09:26:07 +09:00
Juneyoung Lee	4a8e6ed2f7	[SLP,LV] Use poison constant vector for shufflevector/initial insertelement This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef. This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector. The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ). Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D94061	2021-01-06 11:22:50 +09:00
Nikita Popov	766cf7f32e	[InstSimplify] Fold division by zero to poison Div/rem by zero is immediate undefined behavior and anything goes. Currently we fold it to undef, this patch changes it to fold to poison instead, which is slightly stronger. Differential Revision: https://reviews.llvm.org/D93995	2021-01-03 20:52:45 +01:00
Alexey Bataev	bf2a78fd4a	[SLP]Add a test for correct use of the reordered loads, NFC.	2021-01-01 08:27:59 -08:00
Sanjay Patel	3567908d8c	[SLP] add fadd reduction test to show broken FMF propagation; NFC	2020-12-30 11:27:50 -05:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Juneyoung Lee	278aa65cc4	[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93793	2020-12-30 04:21:04 +09:00
Juneyoung Lee	9d70dbdc2b	[InstCombine] use poison as placeholder for undemanded elems Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic because undef isn’t undefined enough. Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison. This makes a few straightforward optimizations incorrect, such as: ``` ; https://bugs.llvm.org/show_bug.cgi?id=44185 define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %xv = insertelement <4 x float> %q, float %x, i32 2 %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef } ret <4 x float> %r ; %r[3] is undef } => define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %r = insertelement <4 x float> %y, float %x, i32 1 ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison } Transformation doesn't verify! ERROR: Target is more poisonous than source ``` I’d like to suggest 1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too) 2. Updating shufflevector’s semantics to return poison element if mask is undef Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay. m_Undef() matches PoisonValue as well, so existing optimizations will still fire. The only concern is hidden miscompilations that will go incorrect when poison constant is given. A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :( Instead, I’ll simply locally maintain the tests and run Alive2. If there is any bug found, I’ll report it. Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93586	2020-12-28 08:58:15 +09:00
Juneyoung Lee	db7a2f347f	Precommit transform tests that have poison as insertelement's placeholder This commit copies existing tests at llvm/Transforms and replaces 'insertelement undef' in those files with 'insertelement poison'. (see https://reviews.llvm.org/D93586) Tests listed using this script: grep -R -E '^[^;]insertelement <.> undef,' . \| cut -d":" -f1 \| uniq \| wc -l Tests updated: file_org=llvm/test/Transforms/$1 file=${file_org%.ll}-inseltpoison.ll cp $file_org $file sed -i -E 's/^([^;])insertelement <(.)> undef/\1insertelement <\2> poison/g' $file head -1 $file \| grep "Assertions have been autogenerated by utils/update_test_checks.py" -q if [ "$?" == 1 ]; then echo "$file : should be manually updated" # I manually updated the script exit 1 fi python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file	2020-12-24 11:46:17 +09:00
Sanjay Patel	f6929c0195	[SLP] add reduction tests for maxnum/minnum intrinsics; NFC	2020-12-22 16:05:39 -05:00
Stanislav Mekhanoshin	87d7757bbe	[SLP] Control maximum vectorization factor from TTI D82227 has added a proper check to limit PHI vectorization to the maximum vector register size. That unfortunately resulted in at least a couple of regressions on SystemZ and x86. This change reverts PHI handling from D82227 and replaces it with a more general check in SLPVectorizerPass::tryToVectorizeList(). Moved to tryToVectorizeList() it allows to restart vectorization if initial chunk fails. However, this function is more general and handles not only PHI but everything which SLP handles. If vectorization factor would be limited to maximum vector register size it would limit much more vectorization than before leading to further regressions. Therefore a new TTI callback getMaximumVF() is added with the default 0 to preserve current behavior and limit nothing. Then targets can decide what is better for them. The callback gets ElementSize just like a similar getMinimumVF() function and the main opcode of the chain. The latter is to avoid regressions at least on the AMDGPU. We can have loads and stores up to 128 bit wide, and <2 x 16> bit vector math on some subtargets, where the rest shall not be vectorized. I.e. we need to differentiate based on the element size and operation itself. Differential Revision: https://reviews.llvm.org/D92059	2020-12-14 08:49:40 -08:00
Anton Afanasyev	fac7c7ec3c	[SLP] Fix vector element size for the store chains Vector element size could be different for different store chains. This patch prevents wrong computation of maximum number of elements for that case. Differential Revision: https://reviews.llvm.org/D93192	2020-12-14 15:51:43 +03:00
Anton Afanasyev	b8c847ee73	[SLP][Test] Precommit test for D93192 This test shows failure of combined stores chains vectorization	2020-12-14 09:23:47 +03:00
Anton Afanasyev	e5bf2e8989	[SLP] Use the width of value truncated just before storing For stores chain vectorization we choose the size of vector elements to ensure we fit to minimum and maximum vector register size for the number of elements given. This patch corrects vector element size choosing the width of value truncated just before storing instead of the width of value stored. Fixes PR46983 Differential Revision: https://reviews.llvm.org/D92824	2020-12-09 16:38:45 +03:00
Simon Pilgrim	41d0666391	[SLP][X86] Extend PR46983 tests to include SSE2,SSE42,AVX512BW test coverage Noticed while reviewing D92824	2020-12-08 12:41:47 +00:00
Anton Afanasyev	6c3f56efa6	[SLP][Test] Differentiate SSE/AVX512 test coverage (NFC) Add test coverage for SSE/AVX512 for insert-after-bundle.ll test. Prepare this test for accurate showing of PR46983 fix.	2020-12-08 12:00:52 +03:00
Anton Afanasyev	50bff64158	[SLP][Test] Add test for PR46983	2020-12-07 21:07:40 +03:00
Alexey Bataev	438682de6a	[SLP]Merge reorder and reuse shuffles. It is possible to merge reuse and reorder shuffles and reduce the total cost of the ivectorization tree/number of final instructions. Differential Revision: https://reviews.llvm.org/D92668	2020-12-07 07:50:00 -08:00
Alexey Bataev	97c08db84e	[SLP]Update test checks, NFC.	2020-12-07 06:12:05 -08:00
Sjoerd Meijer	5110ff0817	[AArch64][CostModel] Fix cost for mul <2 x i64> This was modeled to have a cost of 1, but since we do not have a MUL.2d this is scalarized into vector inserts/extracts and scalar muls. Motivating precommitted test is test/Transforms/SLPVectorizer/AArch64/mul.ll, which we don't want to SLP vectorize. Test Transforms/LoopVectorize/AArch64/extractvalue-no-scalarization-required.ll unfortunately needed changing, but the reason is documented in LoopVectorize.cpp:6855: // The cost of executing VF copies of the scalar instruction. This opcode // is unknown. Assume that it is the same as 'mul'. which I will address next as a follow up of this. Differential Revision: https://reviews.llvm.org/D92208	2020-11-30 11:36:55 +00:00
Sjoerd Meijer	a2016dc887	[AArch64][SLP] Precommit tests which would be better not to SLP vectorize. NFC.	2020-11-27 13:43:16 +00:00
Florian Hahn	926681b6be	[CostModel] Add basic implementation of getGatherScatterOpCost. Add a basic implementation of getGatherScatterOpCost to BasicTTIImpl. The implementation estimates the cost of scalarizing the loads/stores, the cost of packing/extracting the individual lanes and the cost of only selecting enabled lanes. This more accurately reflects the current cost on targets like AArch64. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91984	2020-11-26 12:02:25 +00:00
Florian Hahn	3a1c6cec15	[AArch64] Add tests for masked.gather costs.	2020-11-23 17:33:27 +00:00
Matt Arsenault	1d1234b2a4	OpaquePtr: Update more tests to use typed sret	2020-11-20 20:08:43 -05:00
Matt Arsenault	20c43d6bd5	OpaquePtr: Bulk update tests to use typed sret	2020-11-20 17:58:26 -05:00
Matt Arsenault	06c192d454	OpaquePtr: Bulk update tests to use typed byval Upgrade of the IR text tests should be the only thing blocking making typed byval mandatory. Partially done through regex and partially manual.	2020-11-20 14:00:46 -05:00
Anton Afanasyev	6f1c07b23a	[SLP][Test] Update pr47269.ll test. NFC Expand test for PR47269 to better demonstrate changes introduced by D90445.	2020-11-20 18:33:57 +03:00
Benjamin Kramer	4dbe12e866	[SLP] Use the minimum alignment of the load bundle when forming a masked.gather Instead of the first load. That works when vectorizing contiguous loads, but not for gathers. Fixes a miscompile introduced in `fcad8d3635`.	2020-11-18 12:53:39 +01:00
Sanjay Patel	08834979e3	[SLP] avoid unreachable code crash/infloop Example based on the post-commit comments for D88735.	2020-11-17 15:10:23 -05:00
Anton Afanasyev	fcad8d3635	[SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic For the scattered operands of load instructions it makes sense to use gathering load intrinsic, which can lower to native instruction for X86/AVX512 and ARM/SVE. This also enables building vectorization tree with entries containing scattered operands. The next step is to add scattered store. Fixes PR47629 and PR47623 Differential Revision: https://reviews.llvm.org/D90445	2020-11-17 18:11:45 +03:00
Simon Pilgrim	10f8156e79	[SLPVectorizer][X86] Remove unused check-prefixes	2020-11-09 11:17:08 +00:00
Simon Pilgrim	119e4550dd	[SLPVectorizer][X86] Remove unused check-prefixes	2020-11-08 14:03:55 +00:00
Simon Pilgrim	b215adf4ed	[SLP][AMDGPU] Regenerate packed-math tests and remove unused check prefix	2020-11-06 17:27:13 +00:00
Florian Hahn	d8d1cc647d	[SLP] Also try to vectorize incoming values of PHIs . Currently we do not consider incoming values of PHIs as roots for SLP vectorization. This means we miss scenarios like the one in the test case and PR47670. It appears quite straight-forward to consider incoming values of PHIs as roots for vectorization, but I might be missing something that makes this problematic. In terms of vectorized instructions, this applies to quite a few benchmarks across MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto Same hash: 185 (filtered out) Remaining: 52 Metric: SLP.NumVectorInstructions Program base patch diff test-suite...ProxyApps-C++/HPCCG/HPCCG.test 9.00 27.00 200.0% test-suite...C/CFP2000/179.art/179.art.test 8.00 22.00 175.0% test-suite...T2006/458.sjeng/458.sjeng.test 14.00 30.00 114.3% test-suite...ce/Benchmarks/PAQ8p/paq8p.test 11.00 18.00 63.6% test-suite...s/FreeBench/neural/neural.test 12.00 18.00 50.0% test-suite...rimaran/enc-3des/enc-3des.test 65.00 95.00 46.2% test-suite...006/450.soplex/450.soplex.test 63.00 89.00 41.3% test-suite...ProxyApps-C++/CLAMR/CLAMR.test 177.00 250.00 41.2% test-suite...nchmarks/McCat/18-imp/imp.test 13.00 18.00 38.5% test-suite.../Applications/sgefa/sgefa.test 26.00 35.00 34.6% test-suite...pplications/oggenc/oggenc.test 100.00 133.00 33.0% test-suite...6/482.sphinx3/482.sphinx3.test 103.00 134.00 30.1% test-suite...oxyApps-C++/miniFE/miniFE.test 169.00 213.00 26.0% test-suite.../Benchmarks/Olden/tsp/tsp.test 59.00 73.00 23.7% test-suite...TimberWolfMC/timberwolfmc.test 503.00 622.00 23.7% test-suite...T2006/456.hmmer/456.hmmer.test 65.00 79.00 21.5% test-suite...libquantum/462.libquantum.test 58.00 68.00 17.2% test-suite...ternal/HMMER/hmmcalibrate.test 84.00 98.00 16.7% test-suite...ications/JM/ldecod/ldecod.test 351.00 401.00 14.2% test-suite...arks/VersaBench/dbms/dbms.test 52.00 57.00 9.6% test-suite...ce/Benchmarks/Olden/bh/bh.test 118.00 128.00 8.5% test-suite.../Benchmarks/Bullet/bullet.test 6355.00 6880.00 8.3% test-suite...nsumer-lame/consumer-lame.test 480.00 519.00 8.1% test-suite...000/183.equake/183.equake.test 226.00 244.00 8.0% test-suite...chmarks/Olden/power/power.test 105.00 113.00 7.6% test-suite...6/471.omnetpp/471.omnetpp.test 92.00 99.00 7.6% test-suite...ications/JM/lencod/lencod.test 1173.00 1261.00 7.5% test-suite...0/253.perlbmk/253.perlbmk.test 55.00 59.00 7.3% test-suite...oxyApps-C/miniAMR/miniAMR.test 92.00 98.00 6.5% test-suite...chmarks/MallocBench/gs/gs.test 446.00 473.00 6.1% test-suite.../CINT2006/403.gcc/403.gcc.test 464.00 491.00 5.8% test-suite...6/464.h264ref/464.h264ref.test 998.00 1055.00 5.7% test-suite...006/453.povray/453.povray.test 5711.00 6007.00 5.2% test-suite...FreeBench/distray/distray.test 102.00 107.00 4.9% test-suite...:: External/Povray/povray.test 4184.00 4378.00 4.6% test-suite...DOE-ProxyApps-C/CoMD/CoMD.test 112.00 117.00 4.5% test-suite...T2006/445.gobmk/445.gobmk.test 104.00 108.00 3.8% test-suite...CI_Purple/SMG2000/smg2000.test 789.00 819.00 3.8% test-suite...yApps-C++/PENNANT/PENNANT.test 233.00 241.00 3.4% test-suite...marks/7zip/7zip-benchmark.test 417.00 428.00 2.6% test-suite...arks/mafft/pairlocalalign.test 627.00 643.00 2.6% test-suite.../Benchmarks/nbench/nbench.test 259.00 265.00 2.3% test-suite...006/447.dealII/447.dealII.test 4641.00 4732.00 2.0% test-suite...lications/ClamAV/clamscan.test 106.00 108.00 1.9% test-suite...CFP2000/177.mesa/177.mesa.test 1639.00 1664.00 1.5% test-suite...oxyApps-C/RSBench/rsbench.test 66.00 65.00 -1.5% test-suite.../CINT2000/252.eon/252.eon.test 3416.00 3444.00 0.8% test-suite...CFP2000/188.ammp/188.ammp.test 1846.00 1861.00 0.8% test-suite.../CINT2000/176.gcc/176.gcc.test 152.00 153.00 0.7% test-suite...CFP2006/444.namd/444.namd.test 3528.00 3544.00 0.5% test-suite...T2006/473.astar/473.astar.test 98.00 98.00 0.0% test-suite...frame_layout/frame_layout.test NaN 39.00 nan% On ARM64, there appears to be a slight regression on SPEC2006, which might be interesting to investigate: test-suite...T2006/473.astar/473.astar.test 0.9% Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D88735	2020-11-06 12:50:32 +00:00
Simon Moll	d3b33a7810	[VE][TTI] don't advertise vregs/vops Claim to not have any vector support to dissuade SLP, LV and friends from generating SIMD IR for the VE target. We will take this back once vector isel is stable. Reviewed By: kaz7, fhahn Differential Revision: https://reviews.llvm.org/D90462	2020-11-06 11:12:10 +01:00
Anton Afanasyev	e8d67ef2dc	[SLP][X86][Test] Extend test coverage for PR47629 Add two cases for `<i32 x 8>`. Precommit for PR47629 and D90445. NFC	2020-11-03 17:51:24 +03:00
Florian Hahn	d9cbf39a37	[SLP] Pass VecPred argument to getCmpSelInstrCost. Check if all compares in VL have the same predicate and pass it to getCmpSelInstrCost, to improve cost-modeling on targets that only support compare/select combinations for certain uniform predicates. This leads to additional vectorization in some cases ``` Same hash: 217 (filtered out) Remaining: 19 Metric: SLP.NumVectorInstructions Program base slp2 diff test-suite...marks/SciMark2-C/scimark2.test 11.00 26.00 136.4% test-suite...T2006/445.gobmk/445.gobmk.test 79.00 135.00 70.9% test-suite...ediabench/gsm/toast/toast.test 54.00 71.00 31.5% test-suite...telecomm-gsm/telecomm-gsm.test 54.00 71.00 31.5% test-suite...CI_Purple/SMG2000/smg2000.test 426.00 542.00 27.2% test-suite...ch/g721/g721encode/encode.test 30.00 24.00 -20.0% test-suite...000/186.crafty/186.crafty.test 116.00 138.00 19.0% test-suite...ications/JM/ldecod/ldecod.test 697.00 765.00 9.8% test-suite...6/464.h264ref/464.h264ref.test 822.00 886.00 7.8% test-suite...chmarks/MallocBench/gs/gs.test 154.00 162.00 5.2% test-suite...nsumer-lame/consumer-lame.test 621.00 651.00 4.8% test-suite...lications/ClamAV/clamscan.test 223.00 231.00 3.6% test-suite...marks/7zip/7zip-benchmark.test 680.00 695.00 2.2% test-suite...CFP2000/177.mesa/177.mesa.test 2121.00 2129.00 0.4% test-suite...:: External/Povray/povray.test 2406.00 2412.00 0.2% test-suite...TimberWolfMC/timberwolfmc.test 634.00 634.00 0.0% test-suite...CFP2006/433.milc/433.milc.test 1036.00 1036.00 0.0% test-suite.../Benchmarks/nbench/nbench.test 321.00 321.00 0.0% test-suite...ctions-flt/Reductions-flt.test NaN 5.00 nan% ``` Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D90124	2020-11-03 10:16:43 +00:00
Dávid Bolvanský	980b860e67	[SLP] Added testcase for PR47623	2020-11-02 16:02:50 +01:00
Simon Pilgrim	b51b424c67	[SLP][X86] Add AVX512VL test target coverage for PR47629 As suggested on D90445 - the AVX512F test case alone won't handle 128/256-bit vector gather pattern very well	2020-11-02 11:40:40 +00:00
Florian Hahn	799033d8c5	Reland "[SLP] Consider alternatives for cost of select instructions." This reverts the revert commit `a1b53db324`. This patch includes a fix for a reported issue, caused by matchSelectPattern returning UMIN for selects of pointers in some cases by looking to some connected casts. For now, ensure integer instrinsics are only returned for selects of ints or int vectors.	2020-10-31 16:52:36 +00:00
Florian Hahn	a1b53db324	Revert "[SLP] Consider alternatives for cost of select instructions." This reverts commit `1922570489`. This appears to cause a crash in the following example a, b, c; l() { int e = a, f = l, g, h, i, j; float d = c, k = b; for (;;) for (; g < f; g++) { k[h] = d[i]; k[h - 1] = d[j]; h += e << 1; i += e; } } clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.	2020-10-30 21:26:14 +00:00
Simon Pilgrim	1eeae43107	[SLP][X86] Extend target coverage for PR47629 As suggested on D90445, add tests for various SSE/AVX levels and more complex gep pointer offsets	2020-10-30 15:20:41 +00:00
Nikita Popov	20b386aae0	[LoopUtils] Fix neutral value for vector.reduce.fadd Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always generated with fast flags right now, and thus nsz.	2020-10-29 21:45:13 +01:00

1 2 3 4 5 ...

763 Commits