llvm-project

Commit Graph

Author	SHA1	Message	Date
Nirav Dave	d32a421f75	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293184 which is failing in LTO builds llvm-svn: 293188	2017-01-26 16:46:13 +00:00
Nirav Dave	de6516c466	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear to require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward load-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293184	2017-01-26 16:02:24 +00:00
Simon Dardis	548a53f5ee	[mips] Fix Mips MSA intrinsics The usage of some MIPS MSA intrinsics that took immediates could crash LLVM during lowering. This patch addresses that behaviour. Crucially this patch also makes the use of intrinsics with out-of-range immediates produce an internal error. The ld,st intrinsics would trigger an assertion failure for MIPS64 as their lowering would attempt to add an i32 offset to an i64 pointer. Reviewers: vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D25438 llvm-svn: 291571	2017-01-10 16:40:57 +00:00
Simon Dardis	0e9e237310	[mips] Honour -mno-odd-spreg for vector splat (again) Previously, the lowering of FILL_FW would use the MSA128W register class when performing a vector splat. Instead it should be honouring -mno-odd-spreg and only use the even registers when performing a splat from word to vector register. Logical follow-on from r230235. This fixes PR/31369. A previous commit was missing the test case and had another differential in it. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D28373 llvm-svn: 291566	2017-01-10 15:53:10 +00:00
Nirav Dave	f5bf03c7ef	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667	2016-12-14 16:43:44 +00:00
Nirav Dave	8527ab0ad2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after removing load-store factoring through token factors in favor of improved token factor operand pruning. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? 
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659	2016-12-14 15:44:26 +00:00
Nirav Dave	bedb5d906c	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226	2016-12-09 17:18:24 +00:00
Nirav Dave	fd51ff4fd8	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? 
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221	2016-12-09 16:15:12 +00:00
Simon Dardis	40a5040cd8	[mips] Add tests for half precision floating point support. These should have been part of r287349. llvm-svn: 287574	2016-11-21 20:34:10 +00:00
Nirav Dave	a81682aad4	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r284151 which appears to be triggering LTO failures on Hexagon llvm-svn: 284157	2016-10-13 20:23:25 +00:00
Nirav Dave	4b36957243	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. 
CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151	2016-10-13 19:20:16 +00:00
Nirav Dave	e524f50882	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r282600 due to test failures with MCJIT llvm-svn: 282604	2016-09-28 16:37:50 +00:00
Nirav Dave	e17e055b75	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. 
CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600	2016-09-28 15:50:43 +00:00
Hrvoje Varga	00d96ee7b9	[mips] Clang generates unaligned offset for MSA instruction st.d Differential Revision: https://reviews.llvm.org/D19475 llvm-svn: 277323	2016-08-01 06:46:20 +00:00
Daniel Sanders	6a73883c48	[mips] Correct label prefixes for N32 and N64. Summary: N32 and N64 follow the standard ELF conventions (.L) whereas O32 uses its own ($). This fixes the majority of object differences between -fintegrated-as and -fno-integrated-as. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D22412 llvm-svn: 275967	2016-07-19 10:49:03 +00:00
Daniel Sanders	0d97270ae5	[mips] Use --check-prefixes where appropriate. NFC. llvm-svn: 273669	2016-06-24 12:23:17 +00:00
Daniel Sanders	d3bb20821d	[mips][msa] Fix register/register-class mismatches in emitINSERT_DF_VIDX(). Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21068 llvm-svn: 272765	2016-06-15 08:43:23 +00:00
Daniel Sanders	d2a49ec3ab	[mips][msa] copyPhysReg() should not set RegState::Define on result of CTCMSA. Summary: The machine verifier reports 'Explicit operand marked as def' when it is manually specified even though it agrees with the operand info. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21065 llvm-svn: 272646	2016-06-14 09:11:33 +00:00
Petar Jovanovic	e578e970cb	[mips] Make Static a default relocation model for MIPS codegen This change follows up defaults for GCC and Clang, so LLVM does not differ from them. While a number of the test files are touched with this change, they all keep the old (expected) behaviour with the explicit option: "-relocation-model=pic" The tests that have not been touched are insensitive to relocation model. Differential Revision: http://reviews.llvm.org/D17995 llvm-svn: 265949	2016-04-11 15:24:23 +00:00
Daniel Sanders	0f596814e9	[mips][msa] Remove copy_u.d and move copy_u.w to MSA64. Summary: The forwards compatibility strategy employed by MIPS is to consider registers to be infinitely sign-extended. Then on ISA's with a wider register, the result of existing instructions are sign-extended to register width and zero-extended counterparts are added. copy_u.w on MSA32 and copy_u.w on MSA64 violate this strategy and we have therefore corrected the MSA specs to fix this. We still keep track of sign/zero-extension during legalization but we now match copy_s.[wd] where required. No change required to clang since __builtin_msa_copy_u_[wd] will map to copy_s.[wd] where appropriate for the target. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13472 llvm-svn: 250887	2015-10-21 09:58:54 +00:00
Daniel Sanders	c8cd58fa26	[mips] Correct and improve special-case shuffle instructions. Summary: The documentation writes vectors highest-index first whereas LLVM-IR writes them lowest-index first. As a result, instructions defined in terms of left_half() and right_half() had the halves reversed. In addition to correcting them, they have been improved to allow shuffles that use the same operand twice or in reverse order. For example, ilvev used to accept masks of the form: <0, n, 2, n+2, 4, n+4, ...> but now accepts: <0, 0, 2, 2, 4, 4, ...> <n, n, n+2, n+2, n+4, n+4, ...> <0, n, 2, n+2, 4, n+4, ...> <n, 0, n+2, 2, n+4, 4, ...> One further improvement is that splati.[bhwd] is now the preferred instruction for splat-like operations. The other special shuffles are no longer used for splats. This led to the discovery that <0, 0, ...> would not cause splati.[hwd] to be selected and this has also been fixed. This fixes the enc-3des test from the test-suite on Mips64r6 with MSA. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9660 llvm-svn: 237689	2015-05-19 12:24:52 +00:00
Daniel Sanders	eda60d217b	[mips] Generate code for insert/extract operations when using the N64 ABI and MSA. Summary: When using the N64 ABI, element-indices use the i64 type instead of i32. In many cases, we can use iPTR to account for this but additional patterns and pseudo's are also required. This fixes most (but not quite all) failures in the test-suite when using N64 and MSA together. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9342 llvm-svn: 236494	2015-05-05 10:32:24 +00:00
Daniel Sanders	4160c802d9	[mips][msa] Test basic operations for the N32 ABI too. Summary: This required adding instruction aliases for dneg. N64 will be enabled shortly but requires additional bugfixes. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9341 llvm-svn: 236489	2015-05-05 08:48:35 +00:00
Daniel Sanders	59f89aa8ed	[mips][msa] Rename main check prefix to 'ALL' in basic operations tests. NFC Summary: The majority of the checks are subtarget independent. The few that aren't will be corrected shortly. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9340 llvm-svn: 236220	2015-04-30 09:57:37 +00:00
Daniel Sanders	fa159165be	[mips][msa] Use CHECK-LABEL where missing, and remove checks matching the .size directive. NFC. Summary: Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9339 llvm-svn: 236219	2015-04-30 09:56:30 +00:00
Daniel Sanders	90b059d555	[mips] Add missing signext attributes to MSA basic operations tests. NFC. Summary: This doesn't make much difference to MIPS32, but it will simplify a MIPS64r6 bugfix which will follow shortly by removing unnecessary sign-extension of parameters. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9338 llvm-svn: 236216	2015-04-30 09:24:09 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
David Blaikie	79e6c74981	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float> %x, ... ->getelementptr float, <4 x float> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. 
update.py: import fileinput import sys import re ibrep = re.compile(r"(^.?[^%\w]getelementptr inbounds )(((?:<\d x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") normrep = re.compile( r"(^.?[^%\w]getelementptr )(((?:<\d* x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll \| xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786	2015-02-27 19:29:02 +00:00
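The textual change described in the row above can be illustrated with a much smaller script. This is a simplified sketch, not the update.py from the commit: the pattern `GEP` and the helper `add_explicit_type` are hypothetical names, and the sketch only handles pointer operands named by `%` or `@` (no vectors of pointers, no constant-expression `getelementptr (...)` operators, which the commit says are handled separately).

```python
import re

# Hypothetical, simplified pattern (not the commit's update.py):
#   group 1: "getelementptr " plus an optional "inbounds "
#   group 2: the pointee type, e.g. "float" or "[4 x i32]"
#   group 3: an optional " addrspace(N)" followed by the pointer "*"
# The lookahead only accepts operands that start with % or @.
GEP = re.compile(
    r"(getelementptr (?:inbounds )?)([^,(]+?)((?: addrspace\(\d+\))?\*) (?=[%@])"
)


def add_explicit_type(line: str) -> str:
    """Rewrite 'getelementptr T* %p' as 'getelementptr T, T* %p'."""
    return GEP.sub(r"\1\2, \2\3 ", line)


if __name__ == "__main__":
    samples = [
        "%p = getelementptr inbounds i32* %a, i64 1",
        "%q = getelementptr float addrspace(1)* %x, i32 1",
        "%r = getelementptr float, float* %x, i32 1",  # already migrated
    ]
    for old in samples:
        print(add_explicit_type(old))
```

Lines already in the new form are left alone because the pointee-type group cannot contain a comma, so running the sketch twice over the same input gives the same result.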
Matt Arsenault	6cc00429ff	Fix fmul combines with constant splat vectors Fixes things like fmul x, 2 -> fadd x, x llvm-svn: 215820	2014-08-16 10:14:19 +00:00
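The rewrite `fmul x, 2 -> fadd x, x` named in the row above is value-preserving because doubling and self-addition round to the same IEEE 754 result. A minimal sketch of that equivalence on a splat of 2.0 follows; the helper names `fmul_splat_two` and `fadd_self` are hypothetical and model the vector operations with plain Python lists, not LLVM's DAG combiner.

```python
import math


def fmul_splat_two(vec):
    """Model of 'fmul x, <2.0, 2.0, ...>' on a list of floats."""
    return [x * 2.0 for x in vec]


def fadd_self(vec):
    """Model of 'fadd x, x' on the same list."""
    return [x + x for x in vec]


if __name__ == "__main__":
    # Includes a signed zero, a value that overflows, and a subnormal.
    lanes = [1.5, -0.0, 1e308, float("inf"), 5e-324]
    doubled = fmul_splat_two(lanes)
    added = fadd_self(lanes)
    print(doubled == added)
    # Signed zero is preserved by both forms.
    print(math.copysign(1.0, doubled[1]) == math.copysign(1.0, added[1]))
```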
Toma Tabacu	726f1ea2c5	[mips] Improve robustness of some tests. Summary: This is done by removing some hardcoded registers like $at or expecting a single digit register to be selected. Contains work done by Matheus Almeida. Reviewers: matheusalmeida, dsanders Reviewed By: dsanders Subscribers: tomatabacu Differential Revision: http://reviews.llvm.org/D4227 llvm-svn: 215640	2014-08-14 13:10:48 +00:00
Zoran Jovanovic	6a29b55a5a	[mips][mips64r6] Added LSA/DLSA instructions Differential Revision: http://reviews.llvm.org/D3897 llvm-svn: 211346	2014-06-20 09:28:09 +00:00
Daniel Sanders	e296a0fce5	[mips][msa] Fix vector insertions where the index is variable Summary: This isn't supported directly so we rotate the vector by the desired number of elements, insert to element zero, then rotate back. The i64 case generates rather poor code on MIPS32. There is an obvious optimisation to be made in future (do both insert.w's inside a shared rotate/unrotate sequence) but for now it's sufficient to select valid code instead of aborting. Depends on D3536 Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3537 llvm-svn: 207640	2014-04-30 12:09:32 +00:00
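The rotate, insert, rotate-back sequence described in the row above can be sketched on a plain list. This is an illustrative model only: `rotate_left` and `insert_variable_index` are hypothetical names, and the rotation direction is an assumption chosen so that the target element lands at index 0.

```python
def rotate_left(vec, count):
    """Return a copy of vec rotated left by count elements (count may be negative)."""
    count %= len(vec)
    return vec[count:] + vec[:count]


def insert_variable_index(vec, value, index):
    """Insert value at a run-time index using only a fixed-index insert at element 0."""
    rotated = rotate_left(vec, index)    # element `index` is now element 0
    rotated[0] = value                   # the only insert that is supported directly
    return rotate_left(rotated, -index)  # undo the rotation


if __name__ == "__main__":
    print(insert_variable_index([10, 11, 12, 13], 99, 2))
```

The input list is not modified, since each rotation builds a new list.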
Daniel Sanders	6857800b67	[mips][msa] Use CHECK-LABEL in basic_operations*.ll Differential Revision: http://reviews.llvm.org/D3536 llvm-svn: 207529	2014-04-29 14:28:58 +00:00
Daniel Sanders	b3268e71e2	[mips][msa] Fix element extraction where the index is variable. Summary: This isn't supported directly so we splat the vector element and extract the most convenient copy. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3530 llvm-svn: 207524	2014-04-29 13:31:37 +00:00
Daniel Sanders	f88a29e66a	[mips] Correct lowering of VECTOR_SHUFFLE to VSHF. Summary: VECTOR_SHUFFLE concatenates the vectors in a vectorwise fashion: <0b00, 0b01> + <0b10, 0b11> -> <0b00, 0b01, 0b10, 0b11> VSHF concatenates the vectors in a bitwise fashion: <0b00, 0b01> + <0b10, 0b11> -> 0b0100 + 0b1110 -> 0b01001110 = <0b10, 0b11, 0b00, 0b01> We must therefore swap the operands to get the correct result. The test case that discovered the issue was MultiSource/Benchmarks/nbench. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3142 llvm-svn: 204480	2014-03-21 16:56:51 +00:00
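The operand swap in the row above follows from the two concatenation orders. The sketch below models only what the commit message states: `vector_shuffle` indexes into first-then-second, while the hypothetical `vshf_model` indexes into second-then-first (the element order that results from the bitwise concatenation). It is not a model of the real VSHF encoding or its control-vector format.

```python
def vector_shuffle(first, second, mask):
    """VECTOR_SHUFFLE-style selection: indices refer to first + second."""
    concat = first + second
    return [concat[i] for i in mask]


def vshf_model(first, second, mask):
    """Assumed VSHF-style selection: the first operand lands in the upper half."""
    concat = second + first
    return [concat[i] for i in mask]


if __name__ == "__main__":
    a, b = [0b00, 0b01], [0b10, 0b11]
    mask = [0, 3, 2, 1]
    # Swapping the operands makes the two selections agree.
    print(vector_shuffle(a, b, mask) == vshf_model(b, a, mask))
    # Without the swap, the halves are exchanged.
    print(vshf_model(a, b, [0, 1, 2, 3]))
```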
Daniel Sanders	df22154579	[mips] BSEL's and BINS[RL] operands are reversed compared to the vselect node used in the pattern. Summary: Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes. The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite. During review, we also found that some of the existing CodeGen tests were incorrect and fixed them: * bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'. * vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order. * compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match. The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case. Reviewers: matheusalmeida, jacksprat Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3028 llvm-svn: 203657	2014-03-12 11:54:00 +00:00
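The operand orders quoted in the row above ('IfClear, IfSet, CondMask' for bmnz and 'CondMask, IfClear, IfSet' for bsel) can be written out as a bitwise select. This is a sketch on 8-bit integers with hypothetical helper names; it models only the operand ordering stated in the commit message and says nothing about which input is tied to the result register.

```python
def select_bits(cond_mask, if_clear, if_set):
    """Per bit: take if_set where cond_mask is 1, if_clear where it is 0."""
    return (if_clear & ~cond_mask) | (if_set & cond_mask)


def bsel_model(cond_mask, if_clear, if_set):
    """Operand order quoted for bsel: CondMask, IfClear, IfSet."""
    return select_bits(cond_mask, if_clear, if_set)


def bmnz_model(if_clear, if_set, cond_mask):
    """Operand order quoted for bmnz: IfClear, IfSet, CondMask."""
    return select_bits(cond_mask, if_clear, if_set)


if __name__ == "__main__":
    mask, if_clear, if_set = 0b11110000, 0b10101010, 0b01010101
    # Same operation, different argument positions.
    print(bsel_model(mask, if_clear, if_set) == bmnz_model(if_clear, if_set, mask))
    print(format(bsel_model(mask, if_clear, if_set), "08b"))
```

Passing the arguments in the wrong position silently selects the wrong bits, which is the kind of mistake the commit describes in the existing tests.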
Daniel Sanders	d920770add	[mips][msa] Correct the behaviour of the COPY_FW pseudo on lanes 2 and 3. Summary: Previously, attempting to extract lanes 2 and 3 would actually extract lane 1. The MSA CodeGen tests only covered lanes 0 and 1. Differential Revision: http://llvm-reviews.chandlerc.com/D2935 llvm-svn: 202848	2014-03-04 13:54:30 +00:00
Daniel Sanders	fa961d76f0	[mips] Prevent %lo relocation being used on MSA loads and stores. Summary: Parts of the compiler still believed MSA load/stores have a 16-bit offset when it is actually 10-bit. Corrected this, and fixed a closely related issue this uncovered where load/stores with 10-bit and 12-bit offsets (MSA and microMIPS respectively) could not load/store using offsets from the stack/frame pointer. They accepted frameindex+offset, but not frameindex by itself. Reviewers: jacksprat, matheusalmeida Reviewed By: jacksprat Differential Revision: http://llvm-reviews.chandlerc.com/D2888 llvm-svn: 202717	2014-03-03 14:31:21 +00:00
Nico Rieck	a0abeb3548	Fix more broken CHECK lines llvm-svn: 201493	2014-02-16 13:28:39 +00:00
Matheus Almeida	4b27eb588c	[mips][msa] Add DLSA instruction. llvm-svn: 201081	2014-02-10 12:05:17 +00:00
Matheus Almeida	b4133b25e7	[mips][msa] Update FileCheck prefix in preparation for the addition of Mips64 tests. No functional changes. llvm-svn: 201080	2014-02-10 11:30:09 +00:00
Matheus Almeida	1ace1f1236	[mips][msa] Add insert.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200543	2014-01-31 13:31:20 +00:00
Matheus Almeida	8114cf70aa	Update FileCheck prefixes in preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200541	2014-01-31 13:05:56 +00:00
Matheus Almeida	ec079d9e1d	[mips][msa] Add fill.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200400	2014-01-29 15:12:02 +00:00
Matheus Almeida	4cb577c614	[mips][msa] CHECK-DAG-ize MSA 2r_vector_scalar.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200399	2014-01-29 14:32:03 +00:00
Matheus Almeida	74070327b2	[mips][msa] Add copy_{u,s}.d. These instructions are only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200398	2014-01-29 14:05:28 +00:00
Matheus Almeida	a64f0600f3	[mips][msa] CHECK-DAG-ize MSA elm_copy.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200395	2014-01-29 13:51:34 +00:00
Andrea Di Biagio	f09a357765	[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. llvm-svn: 200234	2014-01-27 18:45:30 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Daniel Sanders	b825a634d6	[mips][msa] Correct pattern for LSA Summary: $rs and $rt were the wrong way round in the .td and the testcase wasn't strict enough to detect the mistake. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D2554 llvm-svn: 199498	2014-01-17 15:40:05 +00:00
Daniel Sanders	c309be2f1f	[mips][msa] Correct sld and sldi builtins. Summary: The result register of these instructions is also the first operand. Reviewers: jacksprat, dsanders Reviewed By: dsanders Differential Revision: http://llvm-reviews.chandlerc.com/D2362 Differential Revision: http://llvm-reviews.chandlerc.com/D2363 llvm-svn: 196910	2013-12-10 11:37:00 +00:00
Daniel Sanders	3519dce968	[mips][msa] Fix invalid generated code when lowering FrameIndex involving unaligned offsets. Summary: The MSA ld.[bhwd] and st.[bhwd] instructions scale the immediate by the element size before use as an offset. The offset must therefore be a multiple of the element size to be valid in these instructions. However, an unaligned base address is valid in MSA. This commit causes the compiler to emit valid code when the calculated offset is not a multiple of the element size by accounting for the offset using addiu and using a zero offset in the load/store. Depends on D2338 Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D2339 llvm-svn: 196777	2013-12-09 12:47:12 +00:00
Daniel Sanders	26a5a7475e	[mips][msa] Fix suboptimal FrameIndex lowering for ld.[hwd] and st.[hwd] Summary: The immediate in these instructions is scaled before use as an offset. They therefore have a wider reach than ld.b/st.b. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D2338 llvm-svn: 196775	2013-12-09 11:50:16 +00:00
Daniel Sanders	7fd68d6018	[mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex. This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s when the stack frame is between 512 and 32,768 bytes in size. llvm-svn: 195973	2013-11-30 13:47:57 +00:00
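The three FrameIndex commits above (r195973, r196775, r196777) all turn on the same legality rule: the immediate is 10 bits wide, it is scaled by the element size, and so the byte offset must be a multiple of that size. The following is a minimal sketch of that rule, not LLVM's actual code. The function name `fitsScaledImm10` is hypothetical, and the sketch assumes the immediate is signed (range -512 to 511), which the commit messages do not state outright.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM.
// Assumption: the load/store immediate is a signed 10-bit value that the
// hardware multiplies by the element size (1, 2, 4, or 8 bytes).
bool fitsScaledImm10(std::int64_t byteOffset, std::int64_t elemSize) {
  assert(elemSize == 1 || elemSize == 2 || elemSize == 4 || elemSize == 8);

  // An offset that is not a multiple of the element size cannot be encoded,
  // because the low bits are implied by the scaling.
  if (byteOffset % elemSize != 0)
    return false;

  // After dividing out the scale, the value must fit in 10 signed bits.
  std::int64_t imm = byteOffset / elemSize;
  return imm >= -512 && imm <= 511;
}
```

Under these assumptions, `fitsScaledImm10(512, 1)` is false while `fitsScaledImm10(512, 2)` is true, which matches the wider reach of ld.[hwd]/st.[hwd] described in r196775. When the check fails, r196777 describes materializing the offset with addiu and using a zero offset in the load/store.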
Daniel Sanders	b021c6fdbd	Fixed tryFoldToZero() for vector types that need expansion. Summary: Moved the requirement for SelectionDAG::getConstant() to return legally typed nodes slightly earlier. There were two optional DAGCombine passes that were missed out and were required to produce type-legal DAGs. Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant(). This provides support for both promoted and expanded vector types whereas the previous code only supported promoted vector types. Fixes a "Type for zero vector elements is not legal" assertion detected by an llvm-stress generated test. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2251 llvm-svn: 195635	2013-11-25 11:14:43 +00:00
Daniel Sanders	b516aae48e	[mips][msa] Add test case that should have been added in r195456. llvm-svn: 195469	2013-11-22 15:47:18 +00:00
Daniel Sanders	fd8e416879	[mips][msa] Float vector constants cannot use ldi.[wd] directly. Bitcast from the appropriate integer vector type. Fixes an instruction selection failure detected by llvm-stress. llvm-svn: 195444	2013-11-22 11:24:50 +00:00
Daniel Sanders	c8c50fb41f	[mips][msa] Fix a corner case in performORCombine() when combining nodes into VSELECT. Mask == ~InvMask asserts if the width of Mask and InvMask differ. The combine isn't valid (with two exceptions, see below) if the widths differ so test for this before testing Mask == ~InvMask. In the specific cases of Mask=~0 and InvMask=0, as well as Mask=0 and InvMask=~0, the combine is still valid. However, there are more appropriate combines that could be used in these cases such as folding x & 0 to 0, or x & ~0 to x. llvm-svn: 195364	2013-11-21 16:11:31 +00:00
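The ordering described in r195364 (compare widths first, then test Mask == ~InvMask) can be illustrated with a small stand-in for a fixed-width integer. This is only a sketch: `FixedWidthBits`, `lowBitsMask`, and `isInverseMask` are hypothetical names, not LLVM types or functions, and the sketch conservatively rejects every width mismatch, including the all-ones/all-zeros exceptions the commit message mentions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a fixed-width constant (width in bits, 1..64).
struct FixedWidthBits {
  unsigned width;
  std::uint64_t value;
};

// Mask covering the low `width` bits.
std::uint64_t lowBitsMask(unsigned width) {
  assert(width >= 1 && width <= 64);
  return width == 64 ? ~std::uint64_t{0}
                     : ((std::uint64_t{1} << width) - 1);
}

// True when both constants have the same width and are bitwise inverses.
// The width test comes first so that the comparison is never attempted on
// operands of different widths.
bool isInverseMask(const FixedWidthBits &mask, const FixedWidthBits &invMask) {
  if (mask.width != invMask.width)
    return false;
  std::uint64_t m = lowBitsMask(mask.width);
  return (mask.value & m) == (~invMask.value & m);
}
```

For example, an 8-bit 0x0F and an 8-bit 0xF0 are accepted, while an 8-bit 0x0F paired with a 16-bit 0xFFF0 is rejected before any bitwise comparison takes place.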
Daniel Sanders	edc071b815	Add support for legalizing SETNE/SETEQ by inverting the condition code and the result of the comparison. Summary: LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse condition and requesting that the caller invert the result of the condition. The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do so as follows: SETCC, BR_CC: Invert the result of the SETCC with SelectionDAG::getNOT() SELECT_CC: Swap the true/false operands. This is necessary for MSA which lacks an integer SETNE instruction. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2229 llvm-svn: 195355	2013-11-21 13:24:49 +00:00
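The two caller-side strategies named in r195355 (invert the result for SETCC/BR_CC, swap the true/false operands for SELECT_CC) can be sketched on plain scalars. This is an illustration under stated assumptions rather than the SelectionDAG code: every function name below is hypothetical, and `setEq` simply stands in for a target that provides only an integer "equal" comparison.

```cpp
#include <cassert>

// Stand-in for the only comparison the target provides (hypothetical).
bool setEq(int a, int b) { return a == b; }

// SETCC / BR_CC case: compute SETNE as NOT(SETEQ).
bool setNeViaInvert(int a, int b) { return !setEq(a, b); }

// SELECT_CC case: keep the SETEQ comparison and swap the two result operands
// instead of inverting the condition.
int selectNeViaSwap(int a, int b, int ifTrue, int ifFalse) {
  return setEq(a, b) ? ifFalse : ifTrue;
}

// Checks both rewrites against the direct a != b form on a small range.
void checkSetNeLegalization() {
  for (int a = -2; a <= 2; ++a) {
    for (int b = -2; b <= 2; ++b) {
      assert(setNeViaInvert(a, b) == (a != b));
      assert(selectNeViaSwap(a, b, 10, 20) == ((a != b) ? 10 : 20));
    }
  }
}
```

Swapping the operands in the select case avoids the extra NOT that the SETCC case needs, which is presumably why the commit handles the two kinds of caller differently.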
Daniel Sanders	6e664bcef3	[mips][msa/dsp] Only do DSP combines if DSP is enabled. Fixes a crash (null pointer dereferenced) when MSA is enabled. llvm-svn: 195343	2013-11-21 11:40:14 +00:00
Daniel Sanders	6747c64068	[mips][msa] Merge basic_operations_little.ll into basic_operations.ll. Now that FileCheck supports multiple check prefixes, we don't need to keep the little and big endian versions of this test separate anymore. Merge them back together. llvm-svn: 194826	2013-11-15 17:24:41 +00:00
Daniel Sanders	50b8041066	Fix illegal DAG produced by SelectionDAG::getConstant() for v2i64 type Summary: When getConstant() is called for an expanded vector type, it is split into multiple scalar constants which are then combined using appropriate build_vector and bitcast operations. In addition to the usual big/little endian differences, the case where the element-order of the vector does not have the same endianness as the elements themselves is also accounted for. For example, for v4i32 on big-endian MIPS, the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is <0123,4567,89AB,CDEF>. Handling this case turns out to be a nop since getConstant() returns a splatted vector (so reversing the element order doesn't change the value) This fixes a number of cases in MIPS MSA where calling getConstant() during operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger differences between illegal and legal types such as legalizing v2i64 into v8i16. lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling getConstant() so this function has been updated in the same patch. For the sake of transparency, the steps I've taken since the review are: * Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed that the MIPS tests were falsely passing because a polymorphic function was not actually polymorphic in the reviewed patch. * Fixed the tests that were now failing. This involved deleting the code to handle the MIPS MSA element-order (which was previously doing a byte-order swap instead of an element-order swap). This left isVectorEltOrderLittleEndian() unused and it was deleted. * Fixed build failures caused by rebasing beyond r194467-r194472. These build failures involved the bset, bneg, and bclr instructions added in these commits using lowerMSASplatImm() in a way that was no longer valid after this patch. 
Some of these were fixed by calling SelectionDAG::getConstant() instead, others were fixed by a new function getBuildVectorSplat() that provided the removed functionality of lowerMSASplatImm() in a more sensible way. Reviewers: bkramer Reviewed By: bkramer CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1973 llvm-svn: 194811	2013-11-15 12:56:49 +00:00
Daniel Sanders	1ede3002fa	[mips][msa] Build all the tests in little and big endian modes and correct an incorrect test. Summary: This patch (correctly) breaks some MSA tests by exposing the cases when SelectionDAG::getConstant() produces illegal types. These have been temporarily marked XFAIL and the XFAIL flag will be removed when SelectionDAG::getConstant() is fixed. There are three categories of failure: * Immediate instructions are not selected in one endian mode. * Immediates used in ldi.[bhwd] must be different according to endianness. (this only affects cases where the 'wrong' ldi is used to load the correct bitpattern. E.g. (bitcast:v2i64 (build_vector:v4i32 ...))) * Non-immediate instructions that rely on immediates affected by the previous two categories as part of their match pattern. For example, the bset match pattern is the vector equivalent of 'ws \| (1 << wt)'. One test needed correcting to expect different output depending on whether big or little endian was in use. This test was test/CodeGen/Mips/msa/basic_operations.ll and experiences the second category of failure shown above. The little endian version of this test is named basic_operations_little.ll and will be merged back into basic_operations.ll in a follow up commit now that FileCheck supports multiple check prefixes. Reviewers: bkramer, jacksprat, dsanders Reviewed By: dsanders CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1972 llvm-svn: 194806	2013-11-15 11:04:16 +00:00
Daniel Sanders	8b59af15ed	[mips][msa] Enable inline assembly for MSA. Like GCC, this re-uses the 'f' constraint and a new 'w' print-modifier: asm ("ldi.w %w0, 1" : "=f"(result)); Unlike GCC, the 'w' print-modifier is not _required_ to produce the intended output. This is a consequence of differences in the internal handling of the registers in each compiler. To be source-compatible between the compilers, users must use the 'w' print-modifier. MSA registers (including control registers) are supported in clobber lists. llvm-svn: 194476	2013-11-12 12:56:01 +00:00
Daniel Sanders	3f6eb546d3	[mips][msa] Added support for matching bclr, and bclri from normal IR (i.e. not intrinsics) llvm-svn: 194471	2013-11-12 10:45:18 +00:00
Daniel Sanders	a5bc99f164	[mips][msa] Added support for matching bset, bseti, bneg, and bnegi from normal IR (i.e. not intrinsics) llvm-svn: 194469	2013-11-12 10:31:49 +00:00
Daniel Sanders	44657ef6e5	[mips][msa] Change constant used in ori tests to avoid conflict with bseti (also xori to avoid bnegi) Upcoming commit(s) are going to add support for bseti and bnegi. This would cause some existing tests to (correctly) change behaviour and emit a different instruction. This patch prevents this by changing the constant used in ori and xori tests so that they will not be matchable by the bseti and bnegi patterns when these instructions are matchable from normal IR. llvm-svn: 194467	2013-11-12 10:14:18 +00:00
Daniel Sanders	a1840d2f88	Vector forms of SHL, SRA, and SRL can be constant folded using SimplifyVBinOp too Reviewers: dsanders Reviewed By: dsanders CC: llvm-commits, nadav Differential Revision: http://llvm-reviews.chandlerc.com/D1958 llvm-svn: 194393	2013-11-11 17:23:41 +00:00
Matheus Almeida	c051a40506	[mips][msa] CHECK-DAG-ize MSA 3r-a.ll test. No functional changes. llvm-svn: 194391	2013-11-11 16:46:20 +00:00
Matheus Almeida	ce207fa078	[mips][msa] CHECK-DAG-ize MSA 2rf_int_float.ll test. No functional changes. llvm-svn: 194390	2013-11-11 16:38:55 +00:00
Matheus Almeida	fed22ad33b	[mips][msa] CHECK-DAG-ize MSA 2rf_float_int.ll test. No functional changes. llvm-svn: 194389	2013-11-11 16:31:46 +00:00
Matheus Almeida	c596839e67	[mips][msa] CHECK-DAG-ize MSA 2rf.ll test. No functional changes. llvm-svn: 194387	2013-11-11 16:24:53 +00:00
Matheus Almeida	9826d07a2f	[mips][msa] CHECK-DAG-ize MSA 2r.ll test. No functional changes. llvm-svn: 194386	2013-11-11 16:16:53 +00:00
Daniel Sanders	d5f554f0bb	[mips][msa] Correct definition of bins[lr] and CHECK-DAG-ize related tests llvm-svn: 193695	2013-10-30 15:45:42 +00:00
Daniel Sanders	ab94b537d7	[mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics) Also corrected the definition of the intrinsics for these instructions (the result register is also the first operand), and added intrinsics for bsel and bseli to clang (they already existed in the backend). These four operations are mostly equivalent to bsel, and bseli (the difference is which operand is tied to the result). As a result some of the tests changed as described below. bitwise.ll: - bsel.v test adapted so that the mask is unknown at compile-time. This stops it emitting bmnzi.b instead of the intended bsel.v. - The bseli.b test now tests the right thing. Namely the case when one of the values is a uimm8, rather than when the condition is a uimm8 (which is covered by bmnzi.b) compare.ll: - bsel.v tests now (correctly) emit bmnz.v instead of bsel.v because this is the same operation (see MSA.txt). i8.ll - CHECK-DAG-ized test. - bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands because this is the same operation (see MSA.txt). - bseli.b still emits bseli.b though because the immediate makes it distinguishable from bmnzi.b. vec.ll: - CHECK-DAG-ized test. - bmz.v tests now (correctly) emit bmnz.v with swapped operands (see MSA.txt). - bsel.v tests now (correctly) emit bmnz.v with swapped operands (see MSA.txt). llvm-svn: 193693	2013-10-30 15:20:38 +00:00
Daniel Sanders	d74b130cc9	[mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) This required correcting the definition of the bins[lr]i intrinsics because the result is also the first operand. It also required removing the (arbitrary) check for 32-bit immediates in MipsSEDAGToDAGISel::selectVSplat(). Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d because the constant is legalized into a ConstantPool. Similar things can happen with binsri.d with more than 10 bits set in the mask. The resulting code when this happens is correct but not optimal. llvm-svn: 193687	2013-10-30 14:45:14 +00:00
Daniel Sanders	53fe6c4d56	[mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b). where $mask is a constant splat. This allows bitwise operations to make use of bsel. It's also a stepping stone towards matching bins[lr], and bins[lr]i from normal IR. Two sets of similar tests have been added in this commit. The bsel_* functions test the case where binsri cannot be used. The binsr_*_i functions will start to use the binsri instruction in the next commit. llvm-svn: 193682	2013-10-30 13:51:01 +00:00
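The rewrite in r193682, (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b), rests on a bit-level identity that can be checked on a single 8-bit lane. The sketch below is illustrative only: `andOrForm`, `bitwiseSelect`, and `checkEquivalence` are hypothetical names, and it models the select as a per-bit choice, which is how the commit message describes bsel being used.

```cpp
#include <cassert>
#include <cstdint>

// The AND/OR shape that the combine looks for, on one 8-bit lane.
std::uint8_t andOrForm(std::uint8_t a, std::uint8_t b, std::uint8_t mask) {
  return static_cast<std::uint8_t>((a & mask) |
                                   (b & static_cast<std::uint8_t>(~mask)));
}

// Per-bit select: take the bit from a where mask is 1, from b where it is 0.
std::uint8_t bitwiseSelect(std::uint8_t mask, std::uint8_t a, std::uint8_t b) {
  std::uint8_t result = 0;
  for (int bit = 0; bit < 8; ++bit) {
    std::uint8_t one = static_cast<std::uint8_t>(1u << bit);
    result |= (mask & one) ? (a & one) : (b & one);
  }
  return result;
}

// Checks that both forms agree for every mask with fixed sample inputs.
void checkEquivalence() {
  for (unsigned mask = 0; mask < 256; ++mask) {
    std::uint8_t m = static_cast<std::uint8_t>(mask);
    assert(andOrForm(0xAB, 0xCD, m) == bitwiseSelect(m, 0xAB, 0xCD));
  }
}
```

With mask 0x0F, a = 0xAB, and b = 0xCD, both forms give 0xCB: the low nibble comes from a and the high nibble from b.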
Daniel Sanders	e7ef0c817b	[mips][msa] Added support for matching splat.[bhw] from normal IR (i.e. not intrinsics) splat.d is implemented but this subtest is currently disabled. This is because it is difficult to match the appropriate IR on MIPS32. There is a patch under review that should help with this so I hope to enable the subtest soon. llvm-svn: 193680	2013-10-30 13:07:44 +00:00
Daniel Sanders	a952160078	[mips][msa] Added support for matching fexp2 from normal IR (i.e. not intrinsics) llvm-svn: 193239	2013-10-23 10:36:52 +00:00
Matheus Almeida	70fbf77546	[mips][msa] Fix definition of SLD instruction. The second parameter of the SLD intrinsic is the number of columns (GPR) to slide left the source array. llvm-svn: 193076	2013-10-21 11:47:56 +00:00
Daniel Sanders	eb4003651d	[mips][msa] Added a regression test that depended on multiple patches to pass. llvm-svn: 192961	2013-10-18 09:52:21 +00:00
Daniel Sanders	a4eaf59f9e	[mips][msa] Added lsa instruction llvm-svn: 192895	2013-10-17 13:38:20 +00:00
Daniel Sanders	66f5e46a2d	Fix r192888: test/CodeGen/Mips/msa/3r_ld_st.ll should have been deleted llvm-svn: 192889	2013-10-17 12:36:35 +00:00
Daniel Sanders	1dfddc73dc	[mips][msa] Added support for build_vector for v4f32 and v2f64. llvm-svn: 192699	2013-10-15 13:14:41 +00:00
Matheus Almeida	49b7564717	[mips][msa] Improves robustness of the test by enhancing pattern matching. llvm-svn: 192446	2013-10-11 13:18:01 +00:00
Daniel Sanders	50e5ed3d08	[mips][msa] Added support for matching maddv.[bhwd], and msubv.[bhwd] from normal IR (i.e. not intrinsics) llvm-svn: 192438	2013-10-11 10:50:42 +00:00
Daniel Sanders	e67bd87c48	[mips][msa] Added support for matching fmsub.[wd] from normal IR (i.e. not intrinsics) llvm-svn: 192435	2013-10-11 10:27:32 +00:00
Daniel Sanders	d7103f3187	[mips][msa] Added support for matching fmadd.[wd] from normal IR (i.e. not intrinsics) llvm-svn: 192430	2013-10-11 10:14:25 +00:00
Daniel Sanders	015972bd95	[mips][msa] Added support for matching ffint_[us].[wd], and ftrunc_[us].[wd] from normal IR (i.e. not intrinsics) llvm-svn: 192429	2013-10-11 10:00:06 +00:00
Daniel Sanders	0210dd4b93	[mips][msa] Added support for matching mod_[us] from normal IR (i.e. not intrinsics) llvm-svn: 191737	2013-10-01 10:22:35 +00:00
Daniel Sanders	6098b33515	[mips][msa] Implemented insert.d intrinsic. This intrinsic is lowered into an equivalent INSERT_VECTOR_ELT which is further lowered into a sequence of insert.w's on MIPS32. llvm-svn: 191521	2013-09-27 13:36:54 +00:00
Daniel Sanders	c72593e69a	[mips][msa] Implemented fill.d intrinsic. This intrinsic is lowered into an equivalent BUILD_VECTOR which is further lowered into a sequence of insert.w's on MIPS32. llvm-svn: 191519	2013-09-27 13:20:41 +00:00
Daniel Sanders	7f3d946fb7	[mips][msa] Implemented copy_[us].d intrinsic. This intrinsic is lowered into equivalent copy_s.w instructions during legalization. llvm-svn: 191518	2013-09-27 13:04:21 +00:00
Daniel Sanders	a515070eb3	[mips][msa] Implemented insert_vector_elt for v4f32 and v2f64. For v4f32 and v2f64, INSERT_VECTOR_ELT is matched by a pseudo-insn which is later expanded to appropriate insve.[wd] insns. llvm-svn: 191515	2013-09-27 12:31:32 +00:00
Daniel Sanders	39bb8ba023	[mips][msa] Implemented extract_vector_elt for v4f32 or v2f64 For v4f32 and v2f64, EXTRACT_VECTOR_ELT is matched by a pseudo-insn which may be expanded to subregister copies and/or instructions as appropriate. llvm-svn: 191514	2013-09-27 12:17:32 +00:00
Daniel Sanders	9ea9ff2da7	[mips][msa] Added support for MSA registers to copyPhysReg llvm-svn: 191512	2013-09-27 12:03:51 +00:00
Daniel Sanders	7e51fe19d5	[mips][msa] Added support for matching splati from normal IR (i.e. not intrinsics) Updated some of the vshf since they (correctly) emit splati's now llvm-svn: 191511	2013-09-27 11:48:57 +00:00
Daniel Sanders	1b1e25b7c5	[mips][msa] MSA requires FR=1 mode (64-bit FPU register file). Report fatal error when using it in FR=0 mode. llvm-svn: 191498	2013-09-27 10:08:31 +00:00
Daniel Sanders	36c671e2c7	[mips][msa] Expand all truncstores and loadexts for MSA as well as DSP llvm-svn: 191496	2013-09-27 09:44:59 +00:00
Daniel Sanders	f4f1a872ca	[mips][msa] Added missing check in performSRACombine Reviewers: jacksprat, dsanders Reviewed By: dsanders Differential Revision: http://llvm-reviews.chandlerc.com/D1755 llvm-svn: 191495	2013-09-27 09:25:29 +00:00


190 Commits