llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	040a36c176	[SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits Pre-commit as discussed on D27657 llvm-svn: 289425	2016-12-12 10:29:43 +00:00
Simon Pilgrim	54945a12ec	[SelectionDAG] Add ability for computeKnownBits to peek through bitcasts from 'large element' scalar/vector to 'small element' vector. Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types. llvm-svn: 289329	2016-12-10 17:00:00 +00:00
Simon Pilgrim	017b7a71d8	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED) Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type. llvm-svn: 289232	2016-12-09 17:53:11 +00:00
Matt Arsenault	38d8ed2b75	AMDGPU: Fix i128 mul llvm-svn: 289231	2016-12-09 17:49:14 +00:00
Nirav Dave	bedb5d906c	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226	2016-12-09 17:18:24 +00:00
Nirav Dave	fd51ff4fd8	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221	2016-12-09 16:15:12 +00:00
Simon Pilgrim	b9eb99f570	Use SelectionDAG.getSplatBuildVector helper. NFCI. llvm-svn: 289220	2016-12-09 16:01:50 +00:00
Simon Pilgrim	bf9c0e7434	[SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI. Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218	2016-12-09 15:23:41 +00:00
Simon Pilgrim	15f1f828b5	[SelectionDAG] Add additional checks to CONCAT_VECTORS creation Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214	2016-12-09 14:27:52 +00:00
Simon Pilgrim	e4050a2961	[SelectionDAG] Add partial BITCAST support to computeKnownBits Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200	2016-12-09 10:13:45 +00:00
Daniel Jasper	f51e05ffbc	Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes" This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194	2016-12-09 09:04:51 +00:00
Nicolai Haehnle	f08dc90253	[SelectionDAG] Add expansion and promotion of [US]MUL_LOHI Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050	2016-12-08 14:08:14 +00:00
Simon Pilgrim	ba05d41095	[SelectionDAG] Add knownbits support for vector demandedelts in SMAX/SMIN/UMAX/UMIN opcodes llvm-svn: 288926	2016-12-07 17:54:00 +00:00
Simon Pilgrim	967325b373	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes llvm-svn: 288916	2016-12-07 16:28:21 +00:00
Simon Pilgrim	ff79f31328	[SelectionDAG] Removed old knownbits TODO comment. NFCI. EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range. llvm-svn: 288913	2016-12-07 15:31:12 +00:00
Eli Friedman	0a76e3241f	[CodeGen] Fix result type for SMULO/UMULO legalization On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857	2016-12-06 22:49:36 +00:00
Simon Pilgrim	dd6ca639d5	[DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842	2016-12-06 19:09:37 +00:00
Simon Pilgrim	1577b39f51	[SelectionDAG] We can ignore knownbits from an undef shuffle vector index if we don't actually demand that element llvm-svn: 288839	2016-12-06 18:58:25 +00:00
Simon Pilgrim	29c17f3f58	Avoid repeated calls to Op.getOpcode(). NFCI. llvm-svn: 288814	2016-12-06 14:50:09 +00:00
Sanjay Patel	1f158d6955	[TargetLowering] add special-case for demanded bits analysis of 'not' We treat bitwise 'not' as a special operation and try not to reduce its all-ones mask. Presumably, this is because a 'not' may be cheaper than a generic 'xor' or it may get folded into another logic op if the target has those. However, if we can remove a logic instruction by changing the xor's constant mask value, that should always be a win. Note that the IR version of SimplifyDemandedBits() does not treat 'not' as a special-case currently (although that's marked with a FIXME). So if you run this IR through -instcombine, you should get the same end result. I'm hoping to add a different backend transform that will expose this problem though, so I need to solve this first. Differential Revision: https://reviews.llvm.org/D27356 llvm-svn: 288676	2016-12-05 15:58:21 +00:00
Matt Arsenault	92fede361f	DAG: Fold out out of bounds insert_vector_elt getNode already prevents formation of out of bounds constant extract_vector_elts. Do the same for insert_vector_elt. llvm-svn: 288603	2016-12-03 23:03:26 +00:00
Nicolai Haehnle	33ca182c91	[DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default Summary: When X = 0 and Y = inf, the original code produces inf, but the transformed code produces nan. So this transform (and its relatives) should only be used when the no-infs-fp-math flag is explicitly enabled. Also disable the transform using fmad (intermediate rounding) when unsafe-math is not enabled, since it can reduce the precision of the result; consider this example with binary floating point numbers with two bits of mantissa: x = 1.01 y = 111 x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step) x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero) The example relies on rounding towards zero at least in the second step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578 Reviewers: RKSimon, tstellarAMD, spatel, arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26602 llvm-svn: 288506	2016-12-02 16:06:18 +00:00
Peter Collingbourne	ab85225be4	IR: Change the gep_type_iterator API to avoid always exposing the "current" type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458	2016-12-02 02:24:42 +00:00
Justin Bogner	35c5e58f8c	SDAG: Avoid a large, usually empty SmallVector in a recursive function This SmallVector is using up 128 bytes on the stack every time despite almost always being empty[1], and since this function can recurse quite deeply that adds up to a lot of overhead. We've seen this run afoul of ulimits in some cases with ASAN on. Replacing the SmallVector with a std::vector trades an occasional heap allocation for vastly less stack usage. [1]: I gathered some stats on an internal test suite and the vector was non-empty in only 45,000 of 10,000,000 calls to this function. llvm-svn: 288441	2016-12-02 00:11:01 +00:00
Matthias Braun	d0ee66c2e9	Move most EH from MachineModuleInfo to MachineFunction Recommitting r288293 with some extra fixes for GlobalISel code. Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288405	2016-12-01 19:32:15 +00:00
Nicolai Haehnle	da7e4017c6	[SelectionDAG] Rename and clarify visitFMULForFMADCombine (NFC) Summary: Suggested by @spatel in D26602. Reviewers: spatel, hfinkel Subscribers: spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D27260 llvm-svn: 288336	2016-12-01 14:04:13 +00:00
Eric Christopher	e70b7c3dfb	Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction" This appears to have broken the global isel bot: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console This reverts commit r288293. llvm-svn: 288322	2016-12-01 07:50:12 +00:00
Matthias Braun	ed14cb0604	Move most EH from MachineModuleInfo to MachineFunction Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288293	2016-11-30 23:49:01 +00:00
Matthias Braun	ef331eff5a	Move VariableDbgInfo from MachineModuleInfo to MachineFunction VariableDbgInfo is per function data, so it makes sense to have it with the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27186 llvm-svn: 288292	2016-11-30 23:48:50 +00:00
Nicolai Haehnle	73a9a27b5a	[SelectionDAG] Refactor TargetLowering::expandMUL (NFC) Summary: Further preparation for the expansion of MUL_LOHI added in D24956. Reviewers: efriedma, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27064 llvm-svn: 288248	2016-11-30 16:26:33 +00:00
Warren Ristow	d9777c1dbb	Test commit. Comment changes. NFC. llvm-svn: 288100	2016-11-29 02:37:13 +00:00
Sanjay Patel	2bd32b05fb	[DAG] clean up foldSelectCCToShiftAnd(); NFCI llvm-svn: 288088	2016-11-28 23:05:55 +00:00
Sanjay Patel	1cf9aff659	[DAG] add helper function for selectcc --> and+shift transforms; NFC llvm-svn: 288073	2016-11-28 21:47:41 +00:00
Nirav Dave	a413361798	Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor" This reverts commit r287773 which caused issues with ppc64le builds. llvm-svn: 288035	2016-11-28 14:30:29 +00:00
Simon Pilgrim	c5fb167df0	Use SDValue helpers instead of explicitly going via SDValue::getNode(). NFCI llvm-svn: 287941	2016-11-25 17:25:21 +00:00
Craig Topper	8c4cdf06db	[DAGCombine] Teach DAG combine that if both inputs of a vselect are the same, then the condition doesn't matter and the vselect can be removed. Selects with scalar condition already handle this correctly. llvm-svn: 287904	2016-11-24 21:48:52 +00:00
Nicolai Haehnle	934470f536	[SelectionDAG] Early-out in TargetLowering::expandMUL (NFC) Summary: Reduce indentation level; preparation for D24956. Reviewers: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27063 llvm-svn: 287831	2016-11-23 22:14:20 +00:00
Nirav Dave	cf34556330	[DAG] Improve loads-from-store forwarding to handle TokenFactor Forward store values to matching loads down through token factors. Factored from D14834. Reviewers: jyknight, hfinkel Subscribers: hfinkel, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D26080 llvm-svn: 287773	2016-11-23 16:48:35 +00:00
John Brawn	150addb45c	[DAGCombiner] Fix infinite loop in vector mul/shl combining We have the following DAGCombiner transformations: (mul (shl X, c1), c2) -> (mul X, c2 << c1) (mul (shl X, C), Y) -> (shl (mul X, Y), C) (shl (mul x, c1), c2) -> (mul x, c1 << c2) Usually the constant shift is optimised by SelectionDAG::getNode when it is constructed, by SelectionDAG::FoldConstantArithmetic, but when we're dealing with vectors and one of those vector constants contains an undef element FoldConstantArithmetic does not fold and we enter an infinite loop. Fix this by making FoldConstantArithmetic use getNode to decide how to fold each vector element, the same as FoldConstantVectorArithmetic does, and rather than adding the constant shift to the work list instead only apply the transformation if it's already been folded into a constant, as if it's not we're going to loop endlessly. Additionally add missing NoOpaques to one of those transformations, which I noticed when writing the tests for this. Differential Revision: https://reviews.llvm.org/D26605 llvm-svn: 287766	2016-11-23 16:05:51 +00:00
Elena Demikhovsky	09375d98b8	Type legalization for compressstore and expandload intrinsics. Implemented widening (v2f32) and splitting (v16f64). On splitting, I use "popcnt" to calculate memory increment. More type legalization work will come in the next patches. llvm-svn: 287761	2016-11-23 13:58:24 +00:00
Simon Pilgrim	72e43570b7	[SelectionDAG] ComputeNumSignBits of TRUNCATE operations Add basic ComputeNumSignBits support for TRUNCATE ops for cases where the source's number of sign bits overlaps with the truncated size. Improves X86 SIGN_EXTEND_IN_REG vector cases which were needlessly sign extending boolean vector results. Differential Revision: https://reviews.llvm.org/D26851 llvm-svn: 287635	2016-11-22 11:29:19 +00:00
Matt Arsenault	b30d2aca58	DAG: Ignore call site attributes when emitting target intrinsic A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. llvm-svn: 287593	2016-11-21 22:56:42 +00:00
Simon Pilgrim	5662074ba3	[VectorLegalizer] Remove EVT::getSizeInBits code duplications. NFCI. We were calling SVT.getSizeInBits() several times in a row - just call it once and reuse the result. llvm-svn: 287556	2016-11-21 18:24:44 +00:00
Simon Pilgrim	49d7eda968	[SelectionDAG] Add ComputeNumSignBits support for CONCAT_VECTORS opcode llvm-svn: 287541	2016-11-21 14:36:19 +00:00
Simon Pilgrim	7a6b6d5656	Fix spelling mistakes in SelectionDAG comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287487	2016-11-20 13:14:57 +00:00
Simon Pilgrim	e40900dddd	[SelectionDAG] Add knownbits support for CONCAT_VECTOR opcode llvm-svn: 287387	2016-11-18 22:21:22 +00:00
Matthias Braun	9f15a79e5d	Timer: Track name and description. The previously used "names" are rather descriptions (they use multiple words and contain spaces), use short programming language identifier like strings for the "names" which should be used when exporting to machine parseable formats. Also removed an unused TimerGroup from Hexagon. Differential Revision: https://reviews.llvm.org/D25583 llvm-svn: 287369	2016-11-18 19:43:18 +00:00
Simon Pilgrim	c4d733cd6a	Fix spelling in comment. NFC. llvm-svn: 287222	2016-11-17 12:03:05 +00:00
Chris Bieneman	05c279fc4b	[CMake] NFC. Updating CMake dependency specifications This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system. llvm-svn: 287206	2016-11-17 04:36:50 +00:00
Ahmed Bougacha	bd6ce9a247	[CodeGen] Pass references, not pointers, to MMI helpers. NFC. While there, rename them to follow the coding style. llvm-svn: 287169	2016-11-16 22:25:03 +00:00

7882 Commits