llvm-project

Commit Graph

Author	SHA1	Message	Date
Stepan Dyatkovskiy	157bb42e27	Fix for PR18102. Issue outcomes from DAGCombiner::MergeConsequtiveStores, more precisely from mem-ops sequence sorting. Consider, how MergeConsequtiveStores works for next example: store i8 1, a[0] store i8 2, a[1] store i8 3, a[1] ; a[1] again. return ; DAG starts here 1. Method will collect all the 3 stores. 2. It sorts them by distance from the base pointer (farthest with highest index). 3. It takes first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction. The point is, we can't determine here which 'store' instruction would be the second after sorting ('store 2' or 'store 3'). It happens that 'store 3' would be the second, and 'store 2' would be the third. So after merging we have the next result: store i16 (1 \| 3 << 8), base ; is a[0] but bit-casted to i16 store i8 2, a[1] So actually we swapped 'store 3' and 'store 2' and got wrong contents in a[1]. Fix: In sort routine just also take into account mem-op sequence number. llvm-svn: 200201	2014-01-27 09:18:31 +00:00
Hal Finkel	dbebb52a2f	Disable the use of TBAA when using AA in CodeGen There are currently two issues, of which I currently know, that prevent TBAA from being correctly usable in CodeGen: 1. Stack coloring does not update TBAA when merging allocas. This is easy enough to fix, but is not the largest problem. 2. CGP inserts ptrtoint/inttoptr pairs when sinking address computations. Because BasicAA does not handle inttoptr, we'll often miss basic type punning idioms that we need to catch so we don't miscompile real-world code (like LLVM). I don't yet have a small test case for this, but this fixes self hosting a non-asserts build of LLVM on PPC64 when using -enable-aa-sched-mi and -misched=shuffle. llvm-svn: 200093	2014-01-25 19:24:54 +00:00
Hal Finkel	9b2617a5a8	Add combiner-aa-only-func (debug only) This option (which is !NDEBUG only) allows restricting the use of alias analysis in DAGCombiner to a specific function. This has proved extremely valuable to isolating bugs related to this feature, and mirrors the misched-only-func option provided by the new instruction scheduler. llvm-svn: 200088	2014-01-25 17:32:39 +00:00
Hal Finkel	5fb07341f1	Improve descriptions of combiner-alias-analysis and combiner-global-alias-analysis llvm-svn: 200087	2014-01-25 17:32:37 +00:00
Juergen Ributzka	f26beda7c7	Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062	2014-01-25 02:02:55 +00:00
Hans Wennborg	4d67a2e85a	Revert "Add Constant Hoisting Pass" (r200034) This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemnted in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058	2014-01-25 01:18:18 +00:00
Juergen Ributzka	4f3df4ad64	Add Constant Hoisting Pass Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034	2014-01-24 20:18:00 +00:00
Hal Finkel	51a9838049	Fix DAGCombiner::GatherAllAliases to account for non-chain dependencies DAGCombiner::GatherAllAliases, which is only used when AA used is enabled during DAGCombine, had a fundamentally incorrect assumption for which this change compensates. GatherAllAliases, which is used to find aliasing predecessor chain nodes (so that a better chain can be selected for a load or store to enable subsequent optimizations) assumed that walking up the chain would always catch all possibly-aliasing loads and stores. This is not true: To really find all aliases, we also need to search for aliases through the value operand of a store, etc. Consider the following situation: Token1 = ... L1 = load Token1, %52 S1 = store Token1, L1, %51 L2 = load Token1, %52+8 S2 = store Token1, L2, %51+8 Token2 = Token(S1, S2) L3 = load Token2, %53 S3 = store Token2, L3, %52 L4 = load Token2, %53+8 S4 = store Token2, L4, %52+8 If we search for aliases of S3 (which loads address %52), and we look only through the chain, then we'll miss the trivial dependence on L1 (which loads from %52). We then might change all loads and stores to use Token1 as their chain operand, which could result in copying %53 into %52 before copying %52 into %51 (which should happen first). The problem is, however, that searching for such data dependencies can become expensive, and the cost is not directly related to the chain depth. Instead, we'll rule out such configurations by insisting that we've visited all chain users (except for users of the original chain, which is not necessary). When doing this, we need to look through nodes we don't care about (otherwise, things like register copies will interfere with trivial use cases). Unfortunately, I don't have a small test case for this problem. Creating the underlying situation is not hard (a pair of memcpys will do it), but arranging for the default instruction schedule to be incorrect is very fragile. This unbreaks self hosting on PPC64 when using -mllvm -combiner-global-alias-analysis -mllvm -combiner-alias-analysis. llvm-svn: 200033	2014-01-24 20:12:02 +00:00
Juergen Ributzka	50e7e80d00	Revert "Add Constant Hoisting Pass" This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024	2014-01-24 18:40:30 +00:00
Hal Finkel	ccc18e1330	Restrict FindBetterChain DAG combines to unindexed nodes These transformations obviously won't work for indexed (pre/post-inc) loads and stores. In practice, I'm not sure there is any benefit to enabling them for indexed nodes because other transformations that these might enable likely also won't handle indexed nodes. I don't have an in-tree test case that hits this problem, but an upcoming bug fix will make it much more likely. llvm-svn: 200023	2014-01-24 18:25:26 +00:00
Juergen Ributzka	38b67d0caf	Add Constant Hoisting Pass This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022	2014-01-24 18:23:08 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Elena Demikhovsky	9d56f1e0e5	AVX512: combining setcc and zext is wrong on AVX512 because vector compare instruction puts result in mask register. llvm-svn: 199798	2014-01-22 12:26:19 +00:00
Owen Anderson	fb00d5bc7c	Allow SMUL_LOHI and UMUL_LOHI to be narrow to MUL on targets where MUL is Custom rather than Legal. Even if the target is doing some kind of expansion for MUL, it's pretty much guaranteed to be more efficent than whatever it does for SMUL_LOHI or UMUL_LOHI! llvm-svn: 199678	2014-01-20 18:41:34 +00:00
Andrea Di Biagio	d7c03ec348	[DAGCombiner] Fix a wrong check in method SimplifyVBinOp. This fixes a regression intruced by r199135. Revision 199135 tried to simplify part of the logic in method DAGCombiner::SimplifyVBinOp introducing calls to method BuildVectorSDNode::isConstant(). However, that revision wrongly changed the check performed by method SimplifyVBinOp to identify dag nodes that can be folded. Before revision 199135, that method only tried to simplify vector binary operations if both operands were build_vector of Constant/ConstantFP/Undef only. After revision 199135, method SimplifyVBinop tried to simplify also vector binary operations with only one constant operand. This fixes the problem restoring the old behavior of SimplifyVBinOp. llvm-svn: 199328	2014-01-15 19:51:32 +00:00
Juergen Ributzka	6840282c99	[DAG] Refactor ReassociateOps - no functional change intended. llvm-svn: 199146	2014-01-13 21:49:25 +00:00
Juergen Ributzka	7384405f23	[DAG] Teach DAG to also reassociate vector operations This commit teaches DAG to reassociate vector ops, which in turn enables constant folding of vector op chains that appear later on during custom lowering and DAG combine. Reviewed by Andrea Di Biagio llvm-svn: 199135	2014-01-13 20:51:35 +00:00
Richard Sandiford	15cfc1c33c	Handle masked rotate amounts At the moment we expect rotates to have the form: (or (shl X, Y), (shr X, Z)) where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C (including 0, the most natural choice). llvm-svn: 198861	2014-01-09 10:56:42 +00:00
Richard Sandiford	0f264db3c6	Match the InstCombine form of rotates by X+C InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X), so a rotate left of a 32-bit Y by X+C could appear as either: (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C)))) without InstCombine or: (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X))) with it. We already matched the first form. This patch handles the second too. llvm-svn: 198860	2014-01-09 10:49:40 +00:00
Andrea Di Biagio	23df4e4a2d	Teach the DAGCombiner how to fold 'vselect' dag nodes according to the following two rules: 1) fold (vselect (build_vector AllOnes), A, B) -> A 2) fold (vselect (build_vector AllZeros), A, B) -> B llvm-svn: 198777	2014-01-08 18:33:04 +00:00
Richard Sandiford	95c864d9bd	[DAGCombiner] Factor duplicated rotate code into a separate function No functional change intended. llvm-svn: 198768	2014-01-08 15:40:47 +00:00
Kevin Qin	5cd73c9e0a	[AArch64 NEON] Fix invalid constant used in vselect condition. There is a wrong assumption that the vector element type and the type of each ConstantSDNode in the build_vector were the same. However, when promoting the integer operand of a legally typed build_vector, the operand type and the vector element type do not need to be the same (See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in LegalizeIntegerTypes.cpp). in AArch64 backend, the following dag sequence: C0: i1 = Constant<0> C1: i1 = Constant<-1> V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0 is type-legalized into: NewC0: i32 = Constant<0> NewC1: i32 = Constant<1> V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0 Forcing a getZeroExtend to VTBits to ensure that the new constant is correctly. llvm-svn: 198582	2014-01-06 02:26:10 +00:00
Kevin Qin	ede9ce1933	Fix a bug in DAGcombiner about zero-extend after setcc. For AArch64 backend, if DAGCombiner see "sext(setcc)", it will combine them together to a single setcc with extended value type. Then if it see "zext(setcc)", it assumes setcc is Vxi1, and try to create "(and (vsetcc), (1, 1, ...)". While setcc isn't Vxi1, DAGcombiner will create wrong node and get wrong code emitted. llvm-svn: 198190	2013-12-30 02:05:13 +00:00
Andrea Di Biagio	46dcddb350	Teach DAGCombiner how to fold a SIGN_EXTEND_INREG of a BUILD_VECTOR of ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR. For example, given the following sequence of dag nodes: i32 C = Constant<1> v4i32 V = BUILD_VECTOR C, C, C, C v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1 The SIGN_EXTEND_INREG node can be folded into a build_vector since the vector in input is a BUILD_VECTOR of constants. The optimized sequence is: i32 C = Constant<-1> v4i32 Result = BUILD_VECTOR C, C, C, C llvm-svn: 198084	2013-12-27 20:20:28 +00:00
Richard Sandiford	d1093636cc	Extend (truncate (load)) folding DAGCombiner could fold (truncate (load)) -> smaller load if the original load was the width of the truncation result or wider. This patch extends it to handle cases where the original load was narrower (and so the extension type stays the same). llvm-svn: 197030	2013-12-11 11:37:27 +00:00
Nadav Rotem	6eee080450	Fix PR18162 - Incorrect assertion assumed that the SDValue resno is zero. llvm-svn: 196858	2013-12-10 01:13:59 +00:00
Alp Toker	f907b891da	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Bill Wendling	9200bb08f9	Unrevert r195599 with testcase fix. I'm not sure how it was checking for the wrong values... PR18023. llvm-svn: 195670	2013-11-25 18:05:22 +00:00
Amara Emerson	f59125f5bb	Revert r195599 as it broke the builds. llvm-svn: 195636	2013-11-25 11:24:18 +00:00
Daniel Sanders	b021c6fdbd	Fixed tryFoldToZero() for vector types that need expansion. Summary: Moved the requirement for SelectionDAG::getConstant() to return legally typed nodes slightly earlier. There were two optional DAGCombine passes that were missed out and were required to produce type-legal DAGs. Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant(). This provides support for both promoted and expanded vector types whereas the previous code only supported promoted vector types. Fixes a "Type for zero vector elements is not legal" assertion detected by an llvm-stress generated test. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2251 llvm-svn: 195635	2013-11-25 11:14:43 +00:00
Bill Wendling	e3c48709ed	Don't look past volatile loads. A volatile load should block us from trying to coalesce stores. PR18023 llvm-svn: 195599	2013-11-25 05:01:21 +00:00
Tom Stellard	9cbd2c5581	Split SETCC if VSELECT requires splitting too. This patch is a rewrite of the original patch commited in r194542. Instead of relying on the type legalizer to do the splitting for us, we now peform the splitting ourselves in the DAG combiner. This is necessary for the case where the vector mask is a legal type after promotion and still wouldn't require splitting. Patch by: Juergen Ributzka NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195397	2013-11-22 00:39:23 +00:00
Benjamin Kramer	bb1dd73d3e	DAGCombiner: Partially revert r192795, getNOT was fixed not to create illegal constants. llvm-svn: 194959	2013-11-17 10:40:03 +00:00
Matt Arsenault	c5559bb14b	Add target hook to prevent folding some bitcasted loads. This is to avoid this transformation in some cases: fold (conv (load x)) -> (load (conv*)x) On architectures that don't natively support some vector loads efficiently casting the load to a smaller vector of larger types and loading is more efficient. Patch by Micah Villmow. llvm-svn: 194783	2013-11-15 04:42:23 +00:00
Juergen Ributzka	34c652d34d	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. This patch reapplies r193676 with an additional fix for the Hexagon backend. The SystemZ backend has already been fixed by r194148. The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 194542	2013-11-13 01:57:54 +00:00
Daniel Sanders	a1840d2f88	Vector forms of SHL, SRA, and SRL can be constant folded using SimplifyVBinOp too Reviewers: dsanders Reviewed By: dsanders CC: llvm-commits, nadav Differential Revision: http://llvm-reviews.chandlerc.com/D1958 llvm-svn: 194393	2013-11-11 17:23:41 +00:00
Juergen Ributzka	3bd686d493	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." Now Hexagon and SystemZ are not happy with it :-( llvm-svn: 193677	2013-10-30 06:36:19 +00:00
Juergen Ributzka	6ad05d6b95	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask has usually the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 193676	2013-10-30 05:48:18 +00:00
Richard Sandiford	981fdeb477	[DAGCombiner] Respect volatility when checking for aliases Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518	2013-10-28 12:00:00 +00:00
Richard Sandiford	39c1ce4dc1	Keep TBAA info when rewriting SelectionDAG loads and stores Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517	2013-10-28 11:17:59 +00:00
Nadav Rotem	d369d4bdf9	Optimize concat_vectors(X, undef) -> scalar_to_vector(X). This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393	2013-10-25 06:41:18 +00:00
Andrea Di Biagio	561badf717	Fix edge condition in DAGCombiner to improve codegen of shift sequences. When canonicalizing dags according to the rule (shl (zext (shr X, c1) ), c1) ==> (zext (shl (shr X, c1), c1)) remember to add the new shl dag to the DAGCombiner worklist of nodes. If we don't explicitly add it to the worklist of nodes to visit, we may not trigger later on the rule that folds the shift left + logical shift right into a AND instruction with bitmask. llvm-svn: 192883	2013-10-17 11:02:58 +00:00
Jack Carter	d4e9615d1c	[projects/test-suite] White space and long line fixes. No functionality changes. llvm-svn: 192863	2013-10-17 01:34:33 +00:00
Benjamin Kramer	00eb07b791	DAGCombiner: Don't fold xor into not if getNOT would introduce an illegal constant. This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly because i64 is illegal. It would be nice if getNOT would handle this transparently, but I don't see a way to generate a legal constant there right now. Fixes PR17487. llvm-svn: 192795	2013-10-16 14:16:19 +00:00
Quentin Colombet	de0e06234c	[DAGCombiner] Reapply load slicing (192471) with a test that explicitly set sse4.2 support. This should fix the buildbots. Original commit message: [DAGCombiner] Slice a big load in two loads when the element are next to each other in memory and the target has paired load and performs post-isel loads combining. E.g., this optimization will transform something like this: a = load i64* addr b = trunc i64 a to i32 c = lshr i64 a, 32 d = trunc i64 c to i32 into: b = load i32* addr1 d = load i32* addr2 Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and performs post-isel loads combining. One should overload TargetLowering::hasPairedLoad to provide this information. The default is false. <rdar://problem/14477220> llvm-svn: 192476	2013-10-11 18:29:42 +00:00
Quentin Colombet	5aee63d9e3	[DAGCombiner] Revert load slicing (r192471), until I figure out why it fails on ubuntu. llvm-svn: 192474	2013-10-11 18:17:17 +00:00
Quentin Colombet	41dc258f71	[DAGCombiner] Slice a big load in two loads when the element are next to each other in memory and the target has paired load and performs post-isel loads combining. E.g., this optimization will transform something like this: a = load i64* addr b = trunc i64 a to i32 c = lshr i64 a, 32 d = trunc i64 c to i32 into: b = load i32* addr1 d = load i32* addr2 Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and performs post-isel loads combining. One should overload TargetLowering::hasPairedLoad to provide this information. The default is false. <rdar://problem/14477220> llvm-svn: 192471	2013-10-11 18:01:14 +00:00
Hal Finkel	dbc7a8a8a3	Fix DAGCombiner::visitFP_EXTEND to ignore indexed loads DAGCombiner::visitFP_EXTEND will apply the following transformation: fold (fpext (load x)) -> (fpext (fptrunc (extload x))) but the implementation does not handle indexed loads (pre/post inc.), but did not specifically ignore them either (unlike for extending loads, which it already ignored), causing an assert when the transformation was applied to an indexed load. This is the minimal fix for correctness (causing the transformation to be skipped for indexed loads). Unfortunately, I don't have an in-tree test case. llvm-svn: 191989	2013-10-04 22:18:12 +00:00
Jin-Gu Kang	0bf8241d4b	Added checking code whehter target supports specific dag combining about rotate or not. The corresponding dag patterns are as following: "DAGCombier::MatchRotate" function in DAGCombiner.cpp Pattern1 // fold (or (shl (ext x), (ext y)), // (srl (ext x), (ext (sub 32, y)))) -> // (ext (rotl x, y)) // fold (or (shl (ext x), (ext y)), // (srl (ext x), (ext (sub 32, y)))) -> // (ext (rotr x, (sub 32, y))) pattern2 // fold (or (shl (ext x), (ext (sub 32, y))), // (srl (ext x), (ext y))) -> // (ext (rotl x, y)) // fold (or (shl (ext x), (ext (sub 32, y))), // (srl (ext x), (ext y))) -> // (ext (rotr x, (sub 32, y))) llvm-svn: 191905	2013-10-03 15:58:48 +00:00
Andrea Di Biagio	56ce9c4e78	Re-apply the change from r191393 with fix for pr17380. This change fixes the problem reported in pr17380 and re-add the dagcombine transformation ensuring that the value types are always legal if the transformation is triggered after Legalization took place. Added the test case from pr17380. llvm-svn: 191509	2013-09-27 11:37:05 +00:00
Andrea Di Biagio	549d6605a0	Revert r191393 since it caused pr17380. llvm-svn: 191438	2013-09-26 16:54:01 +00:00
Andrea Di Biagio	9f3313109f	Teach DAGCombiner how to canonicalize dags according to the rule (shl (zext (shr A, X)), X) => (zext (shl (shr A, X), X)). The rule only triggers when there are no other uses of the zext to avoid materializing more instructions. This helps the DAGCombiner understand that the shl/shr sequence can then be converted into an and instruction. llvm-svn: 191393	2013-09-25 19:01:01 +00:00
Benjamin Kramer	64bdb29a83	DAGCombiner: Unify rotate matching for extended and unextended amounts. No functionality change, lots of indentation changes. llvm-svn: 191303	2013-09-24 14:21:28 +00:00
Kay Tiong Khoo	9195a5b081	fix typo: than -> then llvm-svn: 191214	2013-09-23 18:43:51 +00:00
Juergen Ributzka	f043a65327	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." This reverts commit r191130. llvm-svn: 191138	2013-09-21 15:09:46 +00:00
Juergen Ributzka	e9a80fc912	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask for the given target. This mask has usually te same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. llvm-svn: 191130	2013-09-21 04:55:18 +00:00
David Blaikie	9d117ab7ef	Add braces to suppress Clang's dangling-else warning. These violations were introduced in r191049 llvm-svn: 191059	2013-09-20 00:33:11 +00:00
Kai Nacke	d09bb4614b	PR16726: extend rol/ror matching C-like languages promote types like unsigned short to unsigned int before performing an arithmetic operation. Currently the rotate matcher in the DAGCombiner does not consider this situation. This commit extends the DAGCombiner in the way that the pattern (or (shl ([az]ext x), (ext y)), (srl ([az]ext x), (ext (sub 32, y)))) is folded into ([az]ext (rotl x, y)) The matching is restricted to aext and zext because in this cases the upper bits are either undefined or known. Test case is included. This fixes PR16726. llvm-svn: 191049	2013-09-19 23:00:28 +00:00
Kai Nacke	2d967b2751	Revert PR16726: extend rol/ror matching There is a buildbot failure. Need to investigate this. llvm-svn: 191048	2013-09-19 22:53:36 +00:00
Kai Nacke	4eaf6444fa	PR16726: extend rol/ror matching C-like languages promote types like unsigned short to unsigned int before performing an arithmetic operation. Currently the rotate matcher in the DAGCombiner does not consider this situation. This commit extends the DAGCombiner in the way that the pattern (or (shl ([az]ext x), (ext y)), (srl ([az]ext x), (ext (sub 32, y)))) is folded into ([az]ext (rotl x, y)) The matching is restricted to aext and zext because in this cases the upper bits are either undefined or known. Test case is included. This fixes PR16726. llvm-svn: 191045	2013-09-19 22:36:39 +00:00
Benjamin Kramer	d443e4a080	DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width. PR17283. llvm-svn: 191000	2013-09-19 13:28:20 +00:00
Hal Finkel	31658834e6	Prevent assert in CombinerGlobalAA with null values DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we can't use AA in this case (if we try, then the casting code in AA will assert). llvm-svn: 190763	2013-09-15 02:19:49 +00:00
Hal Finkel	5ef4dccdce	Use TargetSubtargetInfo::useAA() in DAGCombine This uses the TargetSubtargetInfo::useAA() function to control the defaults of the -combiner-alias-analysis and -combiner-global-alias-analysis options. llvm-svn: 189564	2013-08-29 03:29:55 +00:00
Juergen Ributzka	11c52c601a	Fix a typo and coding style of a previous commit. No functional change. llvm-svn: 189526	2013-08-28 22:33:58 +00:00
Tim Northover	819bfb5a25	DAGCombiner: make sure or/shl/srl really has zero high bits before forming bswap We want to convert code like (or (srl N, 8), (shl N, 8)) into (srl (bswap N), const), but this is only valid if the bits above 16 on the source pattern are 0, the checks we were doing on this were slightly wrong before. llvm-svn: 189348	2013-08-27 13:46:45 +00:00
Tom Stellard	838e2344ec	SelectionDAG: Remove unnecessary uses of TargetLowering::getPointerTy() If we have a binary operation like ISD:ADD, we can set the result type equal to the result type of one of its operands rather than using TargetLowering::getPointerTy(). Also, any use of DAG.getIntPtrConstant(C) as an operand for a binary operation can be replaced with: DAG.getConstant(C, OtherOperand.getValueType()); llvm-svn: 189227	2013-08-26 15:06:10 +00:00
Juergen Ributzka	3db39dc1ae	Teach BaseIndexOffset::match to identify base pointers in loops. The small utility function that pattern matches Base + Index + Offset patterns for loads and stores fails to recognize the base pointer for loads/stores from/into an array at offset 0 inside a loop. As a result DAGCombiner::MergeConsecutiveStores was not able to merge all stores. This commit fixes the issue by adding an additional pattern match and also a test case. Reviewer: Nadav llvm-svn: 188936	2013-08-21 21:53:38 +00:00
Craig Topper	d9c2783d8f	Replace getValueType().getSimpleVT() with getSimpleValueType(). llvm-svn: 188442	2013-08-15 02:44:19 +00:00
Jim Grosbach	327ccc787e	DAG: Combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2) A common idiom is to use zero and all-ones as sentinal values and to check for both in a single conditional ("x != 0 && x != (unsigned)-1"). That generates code, for i32, like: testl %edi, %edi setne %al cmpl $-1, %edi setne %cl andb %al, %cl With this transform, we generate the simpler: incl %edi cmpl $1, %edi seta %al Similar improvements for other integer sizes and on other platforms. In general, combining the two setcc instructions into one is better. rdar://14689217 llvm-svn: 188315	2013-08-13 21:30:58 +00:00
Craig Topper	309dfefb6f	Optimize mask generation for one of the DAG combiner shufflevector cases. llvm-svn: 187961	2013-08-08 07:38:55 +00:00
Tom Stellard	d42c594960	TargetLowering: Add getVectorIdxTy() function v2 This virtual function can be implemented by targets to specify the type to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT, INSERT_SUBVECTOR, EXTRACT_SUBVECTOR. The default implementation returns the result from TargetLowering::getPointerTy() The previous code was using TargetLowering::getPointerTy() for vector indices, because this is guaranteed to be legal on all targets. However, using TargetLowering::getPointerTy() can be a problem for targets with pointer sizes that differ across address spaces. On such targets, when vectors need to be loaded or stored to an address space other than the default 'zero' address space (which is the address space assumed by TargetLowering::getPointerTy()), having an index that is a different size than the pointer can lead to inefficient pointer calculations, (e.g. 64-bit adds for a 32-bit address space). There is no intended functionality change with this patch. llvm-svn: 187748	2013-08-05 22:22:01 +00:00
Quentin Colombet	6bf4baa408	[DAGCombiner] insert_vector_elt: Avoid building a vector twice. This patch prevents the following combine when the input vector is used more than once. insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx => build_vector elt0, ..., NewEltIdx, ..., eltN The reasons are: - Building a vector may be expensive, so try to reuse the existing part of a vector instead of creating a new one (think big vectors). - elt0 to eltN now have two users instead of one. This may prevent some other optimizations. llvm-svn: 187396	2013-07-30 00:24:09 +00:00
Tom Stellard	c54731aa9d	DAGCombiner: Pass the correct type to TargetLowering::isF(Abs\|Neg)Free This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior. llvm-svn: 187007	2013-07-23 23:55:03 +00:00
Craig Topper	b94011fd28	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Craig Topper	e0b711864c	Pass SmallVector by const reference instead of by value. llvm-svn: 186243	2013-07-13 07:43:40 +00:00
Stephen Lin	10947502e5	Remove trailing whitespac llvm-svn: 186032	2013-07-10 20:47:39 +00:00
Stephen Lin	73de7bf5de	AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics: 1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations. 2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation. 3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs. The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples. llvm-svn: 185956	2013-07-09 18:16:56 +00:00
Hal Finkel	6c29bd9088	DAGCombine tryFoldToZero cannot create illegal types after type legalization When folding sub x, x (and other similar constructs), where x is a vector, the result is a vector of zeros. After type legalization, make sure that the input zero elements have a legal type. This type may be larger than the result's vector element type. This was another bug found by llvm-stress. llvm-svn: 185949	2013-07-09 17:02:45 +00:00
Stephen Lin	8e8424eb17	Style fixes: remove unnecessary braces for one-statement if blocks, no else after return, etc. No funcionality change. llvm-svn: 185893	2013-07-09 00:44:49 +00:00
Stephen Lin	cfe7f352c7	Remove trailing whitespace from SelectionDAG/*.cpp llvm-svn: 185780	2013-07-08 00:37:03 +00:00
Benjamin Kramer	c7332b2796	DAGCombiner: Don't drop extension behavior when shrinking a load when unsafe. ReduceLoadWidth unconditionally drops extensions from loads. Limit it to the case when all of the bits the extension would otherwise produce are dropped by the shrink. It would be possible to shrink the load in more cases by merging the extensions, but this isn't trivial and a very rare case. I left a TODO for that case. Fixes PR16551. llvm-svn: 185755	2013-07-06 14:05:09 +00:00
Tim Northover	6823900e55	DAGCombiner: fix use-counting issue when forming zextload DAGCombiner was counting all uses of a load node when considering whether it's worth combining into a zextload. Really, it wants to ignore the chain and just count real uses. rdar://problem/13896307 llvm-svn: 185419	2013-07-02 09:58:53 +00:00
Elena Demikhovsky	fed077be03	Fixed a comment. llvm-svn: 184933	2013-06-26 12:15:53 +00:00
Elena Demikhovsky	6769c50d9e	Optimized integer vector multiplication operation by replacing it with shift/xor/sub when it is possible. Fixed a bug in SDIV, where the const operand is not a splat constant vector. llvm-svn: 184931	2013-06-26 10:55:03 +00:00
Michael Liao	62ebfd8786	Fix PR16360 When (srl (anyextend x), c) is folded into (anyextend (srl x, c)), the high bits are not cleared. Add 'and' to clear off them. llvm-svn: 184575	2013-06-21 18:45:27 +00:00
Stephen Lin	605207fe75	SelectionDAG: slightly refactor DAGCombiner::visitSELECT_CC to avoid redudant checks... This doesn't really effect performance due to all the relevant calls being transparent but is clearer. llvm-svn: 184027	2013-06-15 04:03:33 +00:00
Matt Arsenault	d2f0332a29	Introduce getSelect usage and use more getSelectCC llvm-svn: 184012	2013-06-14 22:04:37 +00:00
Stephen Lin	4e69d01b67	SelectionDAG: minor fix to order of operands in comments to match the code llvm-svn: 184008	2013-06-14 21:33:58 +00:00
Stephen Lin	e31f2d2d54	SelectionDAG: Fix incorrect condition checks in some cases of folding FADD/FMUL combinations; also improve accuracy of comments llvm-svn: 183993	2013-06-14 18:17:35 +00:00
Andrew Trick	ef9de2a739	Track IR ordering of SelectionDAG nodes 2/4. Change SelectionDAG::getXXXNode() interfaces as well as call sites of these functions to pass in SDLoc instead of DebugLoc. llvm-svn: 182703	2013-05-25 02:42:55 +00:00
Michael J. Spencer	df1ecbd734	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Matt Arsenault	75865923c9	Add LLVMContext argument to getSetCCResultType llvm-svn: 182180	2013-05-18 00:21:46 +00:00
Matt Arsenault	04126234e5	Replace redundant code Use EVT::changeExtendedVectorElementTypeToInteger instead of doing the same thing that it does llvm-svn: 182165	2013-05-17 21:43:43 +00:00
Bob Wilson	c5c0823724	Remove redundant variable introduced by r181682. llvm-svn: 181721	2013-05-13 19:02:31 +00:00
Hao Liu	bc60196951	Fix PR15950 A bug in DAG Combiner about undef mask llvm-svn: 181682	2013-05-13 02:07:05 +00:00
Benjamin Kramer	a5d59333b3	DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)). PR15948. llvm-svn: 181597	2013-05-10 14:09:52 +00:00
David Majnemer	386ab7f872	DAGCombiner: Simplify inverted bit tests Fold (xor (and x, y), y) -> (and (not x), y) This removes an opportunity for a constant to appear twice. llvm-svn: 181395	2013-05-08 06:44:42 +00:00
Michael Kuperstein	ac868757d0	Fix slightly too aggressive conact_vector optimization. (Would sometimes optimize away conacts used to extend a vector with undef values) llvm-svn: 181186	2013-05-06 08:06:13 +00:00
Nadav Rotem	e5a2dda372	Optimize away nop CONCAT_VECTOR nodes. Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector. rdar://13402653 PR15866 llvm-svn: 180871	2013-05-01 19:18:51 +00:00
Silviu Baranga	af7e8c367f	Re-write the address propagation code for pre-indexed loads/stores to take into account some previously misssed cases (PRE_DEC addressing mode, the offset and base address are swapped, etc). This should fix PR15581. llvm-svn: 180609	2013-04-26 15:52:24 +00:00
Benjamin Kramer	d56ffc709d	DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars. This already helps SSE2 x86 a lot because it lacks an efficient way to represent a vector select. The long term goal is to enable the backend to match a canonicalized pattern into a single instruction (e.g. vabs or pabs). llvm-svn: 180597	2013-04-26 09:19:19 +00:00
Owen Anderson	2d4cca35c3	DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC without first checking the target's vector boolean contents. This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents. The included change fixes the PowerPC tests, and was OK'd by Hal. llvm-svn: 180129	2013-04-23 18:09:28 +00:00
Tim Northover	a2b533906a	Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE. llvm-svn: 179939	2013-04-20 12:32:17 +00:00
Benjamin Kramer	bbae991db6	DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible. This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl: _Store: ldr.w r9, [sp] vmov d17, r3, r9 vmov d16, r1, r2 vst1.8 {d16, d17}, [r0] bx lr Differential Revision: http://llvm-reviews.chandlerc.com/D647 llvm-svn: 179106	2013-04-09 17:41:43 +00:00
Arnold Schwaighofer	d6c6e868b2	DAGCombiner: Merge store/loads when we have extload/truncstores This is helps on architectures where i8,i16 are not legal but we have byte, and short loads/stores. Allowing us to merge copies like the one below on ARM. copy(char a, char b, int n) { do { int t0 = a[0]; int t1 = a[1]; b[0] = t0; b[1] = t1; radar://13536387 llvm-svn: 178546	2013-04-02 15:58:51 +00:00
Arnold Schwaighofer	6752366ed7	Merge load/store sequences with adresses: base + index + offset We would also like to merge sequences that involve a variable index like in the example below. int index = *idx++ int i0 = c[index+0]; int i1 = c[index+1]; b[0] = i0; b[1] = i1; By extending the parsing of the base pointer to handle dags that contain a base, index, and offset we can handle examples like the one above. The dag for the code above will look something like: (load (i64 add (i64 copyfromreg %c) (i64 signextend (i8 load %index)))) (load (i64 add (i64 copyfromreg %c) (i64 signextend (i32 add (i32 signextend (i8 load %index)) (i32 1))))) The code that parses the tree ignores the intermediate sign extensions. However, if there is a sign extension it needs to be on all indexes. (load (i64 add (i64 copyfromreg %c) (i64 signextend (add (i8 load %index) (i8 1)))) vs (load (i64 add (i64 copyfromreg %c) (i64 signextend (i32 add (i32 signextend (i8 load %index)) (i32 1))))) radar://13536387 llvm-svn: 178483	2013-04-01 18:12:58 +00:00
Benjamin Kramer	9335443236	DAGCombine: visitXOR can replace a node without returning it, bail out in that case. Fixes the crash reported in PR15608. llvm-svn: 178429	2013-03-30 21:28:18 +00:00
Michael Liao	bb05a1d7b5	Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx) - Handle the case where the result of 'insert_subvect' is bitcasted before 'extract_subvec'. This removes the redundant insertf128/extractf128 pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer. llvm-svn: 177945	2013-03-25 23:47:35 +00:00
Shuxin Yang	93b1f12ac1	Disable some unsafe-fp-math DAG-combine transformation after legalization. For instance, following transformation will be disabled: x + x + x => 3.0f * x; The problem of these transformations is that it introduces a FP constant, which following Instruction-Selection pass cannot handle. Reviewed by Nadav, thanks a lot! rdar://13445387 llvm-svn: 177933	2013-03-25 22:52:29 +00:00
Richard Relph	61046a9727	Avoid generating ISD::SELECT for vector operands to SIGN_EXTEND llvm-svn: 176881	2013-03-12 18:17:18 +00:00
Tom Stellard	b1588fc057	DAGCombiner: Use correct value type for checking legality of BR_CC v3 LegalizeDAG.cpp uses the value of the comparison operands when checking the legality of BR_CC, so DAGCombiner should do the same. v2: - Expand more BR_CC value types for NVPTX v3: - Expand correct BR_CC value types for Hexagon, Mips, and XCore. llvm-svn: 176694	2013-03-08 15:36:57 +00:00
Benjamin Kramer	3238dc0c61	DAGCombiner: Make the post-legalize vector op optimization more aggressive. A legal BUILD_VECTOR goes in and gets constant folded into another legal BUILD_VECTOR so we don't lose any legality here. The problematic PPC optimization that made this check necessary was fixed recently. llvm-svn: 175759	2013-02-21 15:24:35 +00:00
Arnold Schwaighofer	3f9568e921	DAGCombiner: Fold pointless truncate, bitcast, buildvector series (2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y))) can be folded into a (2xi32) (buildvector i32 a, i32 b). Such a DAG would cause uneccessary vdup instructions followed by vmovn instructions. We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in the vectorized version of the code below. double A[N]; double B[N]; void test_double_compare_to_double() { int i; for(i=0;i<N;i++) A[i] = (double)(A[i] < B[i]); } radar://13191881 Fixes bug 15283. llvm-svn: 175670	2013-02-20 21:33:32 +00:00
Nadav Rotem	495b1a43c1	Dont merge consecutive loads/stores into vectors when noimplicitfloat is used. llvm-svn: 175190	2013-02-14 18:28:52 +00:00
Owen Anderson	cc068993ee	Add some legality checks for SETCC before introducing it in the DAG combiner post-operand legalization. llvm-svn: 175149	2013-02-14 09:07:33 +00:00
Paul Redmond	288604ed0c	PR14562 - Truncation of left shift became undef DAGCombiner::ReduceLoadWidth was converting (trunc i32 (shl i64 v, 32)) into (shl i32 v, 32) into undef. To prevent this, check the shift count against the final result size. Patch by: Kevin Schoedel Reviewed by: Nadav Rotem llvm-svn: 174972	2013-02-12 15:21:21 +00:00
Pete Cooper	10a3ae7039	Check type for legality before forming a select from loads. Sorry for the lack of a test case. I tried writing one for i386 as i know selects are illegal on this target, but they are actually considered legal by isel and expanded later. I can't see any targets to trigger this, but checking for the legality of a node before forming it is general goodness. llvm-svn: 174934	2013-02-12 03:14:50 +00:00
Hal Finkel	2581905f81	DAGCombiner: Constant folding around pre-increment loads/stores Previously, even when a pre-increment load or store was generated, we often needed to keep a copy of the original base register for use with other offsets. If all of these offsets are constants (including the offset which was combined into the addressing mode), then this is clearly unnecessary. This change adjusts these other offsets to use the new incremented address. llvm-svn: 174746	2013-02-08 21:35:47 +00:00
Owen Anderson	de89ecf1fc	Reapply r174343, with a fix for a scary DAG combine bug where it failed to differentiate between the alignment of the base point of a load, and the overall alignment of the load. This caused infinite loops in DAG combine with the original application of this patch. ORIGINAL COMMIT LOG: When the target-independent DAGCombiner inferred a higher alignment for a load, it would replace the load with one with the higher alignment. However, it did not place the new load in the worklist, which prevented later DAG combines in the same phase (for example, target-specific combines) from ever seeing it. This patch corrects that oversight, and updates some tests whose output changed due to slightly different DAGCombine outputs. llvm-svn: 174431	2013-02-05 19:24:39 +00:00
NAKAMURA Takumi	3753b28cd2	Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load," It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts. llvm-svn: 174374	2013-02-05 14:44:16 +00:00
Owen Anderson	a47fdbb032	When the target-independent DAGCombiner inferred a higher alignment for a load, it would replace the load with one with the higher alignment. However, it did not place the new load in the worklist, which prevented later DAG combines in the same phase (for example, target-specific combines) from ever seeing it. This patch corrects that oversight, and updates some tests whose output changed due to slightly different DAGCombine outputs. llvm-svn: 174343	2013-02-05 06:25:30 +00:00
Shuxin Yang	cadd8a068e	rdar://13126763 Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression "x + xx" into "x 3.0". llvm-svn: 174239	2013-02-02 00:22:03 +00:00
Nadav Rotem	9450fcfff1	Revert 172708. The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model. llvm-svn: 172968	2013-01-20 08:35:56 +00:00
Elena Demikhovsky	f6a30e05d5	Optimization for the following SIGN_EXTEND pairs: v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865. llvm-svn: 172708	2013-01-17 09:59:53 +00:00
Bill Schmidt	d006c6938b	This patch addresses an incorrect transformation in the DAG combiner. The included test case is derived from one of the GCC compatibility tests. The problem arises after the selection DAG has been converted to type-legalized form. The combiner first sees a 64-bit load that can be converted into a pre-increment form. The original load feeds into a SRL that isolates the upper 32 bits of the loaded doubleword. This looks like an opportunity for DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load. However, this transformation is not valid, as the replacement load is not a pre-increment load. The pre-increment load produces an extra result, which feeds a subsequent add instruction. The replacement load only has one result value, and this value is propagated to all uses of the pre- increment load, including the add. Because the add is looking for the second result value as its operand, it ends up attempting to add a constant to a token chain, resulting in a crash. So the patch simply disables this transformation for any load with more than two result values. llvm-svn: 172480	2013-01-14 22:04:38 +00:00
Evan Cheng	5652a8df32	Fix a DAG combine bug visitBRCOND() is transforming br(xor(x, y)) to br(x != y). It cahced XOR's operands before calling visitXOR() but failed to update the operands when visitXOR changed the XOR node. rdar://12968664 llvm-svn: 171999	2013-01-09 20:56:40 +00:00
Chandler Carruth	95f83e0155	Sink AddrMode back into TargetLowering, removing one of the most peculiar headers under include/llvm. This struct still doesn't make a lot of sense, but it makes more sense down in TargetLowering than it did before. llvm-svn: 171739	2013-01-07 15:14:13 +00:00
Tom Stellard	567f886eb0	DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two mistakes: 1. It was checking the legality of scalar INT_TO_FP nodes and then generating vector nodes. 2. It was passing the result value type to TargetLoweringInfo::getOperationAction() when it should have been passing the value type of the first operand. llvm-svn: 171420	2013-01-02 22:13:01 +00:00
Chandler Carruth	9fb823bbd4	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Bill Wendling	698e84fc4f	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Nadav Rotem	b1dd52450e	Refactor DAGCombinerInfo. Change the different booleans that indicate if we are before or after different runs of DAGCo, with the CombineLevel enum. Also, added a new API for checking if we are running before or after the LegalizeVectorOps phase. llvm-svn: 171142	2012-12-27 06:47:41 +00:00
Bob Wilson	3365b80290	Do not introduce vector operations in functions marked with noimplicitfloat. <rdar://problem/12879313> llvm-svn: 170630	2012-12-20 01:36:20 +00:00
Patrik Hagglund	ffd057a3e1	Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT. llvm-svn: 170524	2012-12-19 10:19:55 +00:00
Elena Demikhovsky	14a4af0e66	Optimized load + SIGN_EXTEND patterns in the X86 backend. llvm-svn: 170506	2012-12-19 07:50:20 +00:00
Evan Cheng	bf0baa9de7	Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039 llvm-svn: 170078	2012-12-13 01:34:32 +00:00
Manman Ren	82751a105c	DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion rdar://12838504 llvm-svn: 169951	2012-12-12 01:13:50 +00:00
Patrik Hagglund	e98b7a0389	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	a970281106	Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT. llvm-svn: 169843	2012-12-11 09:51:27 +00:00
Chandler Carruth	b27041c50b	Fix a miscompile in the DAG combiner. Previously, we would incorrectly try to reduce the width of this load, and would end up transforming: (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32) to (truncate (zextload i32 <ptr+4> as i64) to i32) We lost the sext attached to the load while building the narrower i32 load, and replaced it with a zext because lshr always zext's the results. Instead, bail out of this combine when there is a conflict between a sextload and a zext narrowing. The rest of the DAG combiner still optimize the code down to the proper single instruction: movswl 6(...),%eax Which is exactly what we wanted. Previously we read past the end and missed the sign extension: movl 6(...), %eax llvm-svn: 169802	2012-12-11 00:36:57 +00:00
Craig Topper	d8005db486	Teach DAG combine to handle vector add/sub with vectors of all 0s. llvm-svn: 169727	2012-12-10 08:12:29 +00:00
Craig Topper	5ea3bdd75b	Remove extra blank line. llvm-svn: 169692	2012-12-09 08:20:52 +00:00
Craig Topper	a183ddb0fe	Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined. llvm-svn: 169684	2012-12-08 22:49:19 +00:00
Nadav Rotem	ac450eb59e	Fix a bug in the code that merges consecutive stores. Previously we did not check if loads that happen in between stores alias with the first store in the chain, only with the second store onwards. llvm-svn: 169516	2012-12-06 17:34:13 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Nadav Rotem	1157e1410c	Allow merging multiple store sequences on the same chain. llvm-svn: 169111	2012-12-02 17:14:09 +00:00
Nadav Rotem	307d767177	When combining consecutive stores allow loads in between the stores, if the loads do not alias. llvm-svn: 168832	2012-11-29 00:00:08 +00:00
Rafael Espindola	c79532d101	Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333. llvm-svn: 167912	2012-11-14 05:08:56 +00:00
Owen Anderson	15fd6ac4ba	Be careful not to optimize a SELECT_CC into a SETCC post-legalization if the SETCC node would be illegal. llvm-svn: 167344	2012-11-03 00:17:26 +00:00
Owen Anderson	b351c8d692	Add a few more simple fast-math constant propagations and cancellations. llvm-svn: 167200	2012-11-01 02:00:53 +00:00
Ulrich Weigand	3abb34389d	In various places throughout the code generator, there were special checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other. llvm-svn: 166958	2012-10-29 18:35:49 +00:00
Michael Liao	5922979e49	Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x)) - If more than 1 elemennts are defined and target supports the vectorized conversion, use the vectorized one instead to reduce the strength on conversion operation. llvm-svn: 166546	2012-10-24 04:14:18 +00:00
Jakub Staszak	a6addc2741	Keep coding standard. Don't evaluate getNumOperands() every time. llvm-svn: 166531	2012-10-24 00:38:25 +00:00
Michael Liao	6d106b7bfd	Clean up code and put transformation on (build_vec (ext x)) into a helper func llvm-svn: 166519	2012-10-23 23:06:52 +00:00
Michael Liao	2c2358036d	Simplify condition checking as CONCAT assume all inputs of the same type. llvm-svn: 166260	2012-10-19 03:17:00 +00:00
Nadav Rotem	d5f8859672	In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other. rdar://12513091 llvm-svn: 166196	2012-10-18 18:06:48 +00:00
Michael Liao	3ac8201ea4	Revert part of r166049 back and enable test case in r166125. - Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid when '...' are all 'undef's. - r166125 relies on this transformation. llvm-svn: 166155	2012-10-17 23:45:54 +00:00
Michael Liao	c87d98dbc8	Revert r166049 - In general, it's unsafe for this transformation. llvm-svn: 166135	2012-10-17 22:41:15 +00:00
Michael Liao	7a442c8031	Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i - If the extracted vector has the same type of all vectored being concatenated together, it should be simplified directly into v_i, where i is the index of the element being extracted. llvm-svn: 166125	2012-10-17 20:48:33 +00:00
Michael Liao	19006206a1	Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x) llvm-svn: 166049	2012-10-16 19:38:35 +00:00
Nadav Rotem	35315fea70	Refactor the AddrMode class out of TLI to its own header file. This class is used by LSR and a number of places in the codegen. This is the first step in de-coupling LSR from TLI, and creating a new interface in between them. llvm-svn: 165455	2012-10-08 23:06:34 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Benjamin Kramer	db5fb3bfe8	Remove unused but set variable flagged by GCC. llvm-svn: 165331	2012-10-05 20:08:45 +00:00
Benjamin Kramer	62f7fb977c	Simplify code, don't or a bool with an uint64_t. No functionality change. llvm-svn: 165321	2012-10-05 18:19:44 +00:00
Nadav Rotem	b27777ff02	When merging connsecutive stores, use vectors to store the constant zero. llvm-svn: 165267	2012-10-04 22:35:15 +00:00
Nadav Rotem	ac92066b0c	Fix a cycle in the DAG. In this code we replace multiple loads with a single load and multiple stores with a single load. We create the wide loads and stores (and their chains) before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge loads with a different chain. When that happened, the assumption that it is safe to RAUW broke and a cycle was introduced. llvm-svn: 165148	2012-10-03 19:30:31 +00:00
Nadav Rotem	7cbc12a41d	A DAGCombine optimization for mergeing consecutive stores to memory. The optimization is not profitable in many cases because modern processors perform multiple stores in parallel and merging stores prior to merging requires extra work. We handle two main cases: 1. Store of multiple consecutive constants: q->a = 3; q->4 = 5; In this case we store a single legal wide integer. 2. Store of multiple consecutive loads: int a = p->a; int b = p->b; q->a = a; q->b = b; In this case we load/store either ilegal vector registers or legal wide integer registers. llvm-svn: 165125	2012-10-03 16:11:15 +00:00
Nadav Rotem	abbe665154	Revert r164910 because it causes failures to several phase2 builds. llvm-svn: 164911	2012-09-30 07:17:56 +00:00
Nadav Rotem	45715b25f7	A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases because moden processos can store multiple values in parallel, and preparing the consecutive store requires some work. We only handle these cases: 1. Consecutive stores where the values and consecutive loads. For example: int a = p->a; int b = p->b; q->a = a; q->b = b; 2. Consecutive stores where the values are constants. Foe example: q->a = 4; q->b = 5; llvm-svn: 164910	2012-09-30 06:24:14 +00:00
Duncan Sands	fb9d30dd64	Speculatively revert commit 164885 (nadav) in the hope of ressurecting a pile of buildbots. Original commit message: A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases because moden processos can store multiple values in parallel, and preparing the consecutive store requires some work. We only handle these cases: 1. Consecutive stores where the values and consecutive loads. For example: int a = p->a; int b = p->b; q->a = a; q->b = b; 2. Consecutive stores where the values are constants. Foe example: q->a = 4; q->b = 5; llvm-svn: 164890	2012-09-29 10:25:35 +00:00
Craig Topper	5f9791fd2f	Tidy up to match coding standards. Remove 'else' after 'return' and moving operators to end of preceding line. No functional change intended. llvm-svn: 164887	2012-09-29 07:18:53 +00:00
Craig Topper	65161fa493	Replace a couple if/elses around similar calls with conditional operators on the varying arguments. No functional change. llvm-svn: 164886	2012-09-29 06:54:22 +00:00
Nadav Rotem	a2e7ea2f18	A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases because moden processos can store multiple values in parallel, and preparing the consecutive store requires some work. We only handle these cases: 1. Consecutive stores where the values and consecutive loads. For example: int a = p->a; int b = p->b; q->a = a; q->b = b; 2. Consecutive stores where the values are constants. Foe example: q->a = 4; q->b = 5; llvm-svn: 164885	2012-09-29 06:33:25 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Nadav Rotem	841c9a84d0	Fix 80-col violations. llvm-svn: 164297	2012-09-20 08:53:31 +00:00
Nadav Rotem	24a822a5cb	Fix a dagcombine optimization. The optimization attempts to optimize a bitcast of fneg to integers by xoring the high-bit. This fails if the source operand is a vector because we need to negate each of the elements in the vector. Fix rdar://12281066 PR13813. llvm-svn: 163802	2012-09-13 14:54:28 +00:00
Craig Topper	8238461211	Teach DAG combiner to constant fold FABS of a BUILD_VECTOR of ConstantFPs. Factor similar code out of FNEG DAG combiner. llvm-svn: 163587	2012-09-11 01:45:21 +00:00
James Molloy	1e5c611815	Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types. llvm-svn: 163511	2012-09-10 14:01:21 +00:00
Craig Topper	03f39773e0	Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants. llvm-svn: 163483	2012-09-09 22:58:45 +00:00
Roman Divacky	9338344acb	Constify this properly. Found by gcc48 -Wcast-qual. llvm-svn: 163256	2012-09-05 22:15:49 +00:00
Silviu Baranga	3f40d87207	Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value. llvm-svn: 163203	2012-09-05 08:57:21 +00:00
Owen Anderson	90e0eaffa8	Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode. llvm-svn: 163051	2012-09-01 06:04:27 +00:00
Michael Liao	ec385012ae	Fix typo llvm-svn: 163049	2012-09-01 04:09:16 +00:00
Owen Anderson	cc61f87cf7	Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants. llvm-svn: 162956	2012-08-30 23:35:16 +00:00
Stepan Dyatkovskiy	99120e04be	Rejected 169195. As Duncan commented, bitcasting to proper type is wrong approach. We need to insert some valid TRANCATE node here. llvm-svn: 162354	2012-08-22 09:33:55 +00:00
Stepan Dyatkovskiy	6a638ec521	Fixed DAGCombiner bug (found and localized by James Malloy): The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it consists purely of get_vector_elts from one or two source vectors. If so, it either makes a concat_vectors node or a shufflevector node. However, it doesn't check the element type width of the underlying vector, so if you have this sequence: Node0: v4i16 = ... Node1: i32 = extract_vector_elt Node0 Node2: i32 = extract_vector_elt Node0 Node3: v16i8 = BUILD_VECTOR Node1, Node2, ... It will attempt to: Node0: v4i16 = ... NewNode1: v16i8 = concat_vectors Node0, ... Where this is actually invalid because the element width is completely different. This causes an assertion failure on DAG legalization stage. Fix: If output item type of BUILD_VECTOR differs from input item type. Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to: Node0: v4i16 = ... NewNode1: v8i16 = concat_vectors Node0, ... NewNode2: v16i8 = bitcast NewNode1 llvm-svn: 162195	2012-08-20 07:57:06 +00:00
Owen Anderson	a40319b7f1	Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC. llvm-svn: 161807	2012-08-13 23:32:49 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Nadav Rotem	9056076cab	Fixed DAGCombine optimizations which generate select_cc for targets that do not support it (X86 does not lower select_cc). PR: 13428 Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160619	2012-07-23 07:59:50 +00:00
Bill Wendling	d163405df8	Remove tabs. llvm-svn: 160475	2012-07-19 00:04:14 +00:00
Evan Cheng	e6a3b03ee0	Back out r160101 and instead implement a dag combine to recover from instcombine transformation. llvm-svn: 160387	2012-07-17 18:54:11 +00:00
Nadav Rotem	a62368c965	Refactor the code that checks that all operands of a node are UNDEFs. Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs. Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160229	2012-07-15 08:38:23 +00:00
Nadav Rotem	018921002e	Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. llvm-svn: 160221	2012-07-14 21:30:27 +00:00
Owen Anderson	b8844d6744	Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization. This is a speculative fix for a problem on Mips reported by Akira Hatanaka. llvm-svn: 160036	2012-07-11 06:38:55 +00:00
Nadav Rotem	d908ddc186	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991	2012-07-10 13:25:08 +00:00
Owen Anderson	d4b841f8f9	Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values. Previously, this would become an integer extension operation, followed by a real integer->float conversion. llvm-svn: 159957	2012-07-09 20:31:12 +00:00
Evan Cheng	4c6f917d34	Make sure type is not extended or untyped before create a constant of the type. No test case. Found by inspection. llvm-svn: 159179	2012-06-26 01:19:33 +00:00
Lang Hames	b8650f106a	Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (E.g. FMAs). The behavior of this option is intended to match the behaviour specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't effect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956	2012-06-22 01:09:09 +00:00
Pete Cooper	5b61422d80	Fix potential crash if DAGCombine on stores sees a half type llvm-svn: 158927	2012-06-21 18:00:39 +00:00
Pete Cooper	fe5b84b404	Add users of a MERGE_VALUE node to the worklist to process again when the node is removed. Sorry, no test case. Foudn it by inspection of the code llvm-svn: 158839	2012-06-20 19:35:43 +00:00
Hal Finkel	8a31138521	Fix DAGCombine to deal with ext-conversion of pre/post_inc loads. The test case for this will come with the PPC indexed preinc loads commit. llvm-svn: 158822	2012-06-20 15:42:48 +00:00
Lang Hames	39fb1d08dc	Add DAG-combines for aggressive FMA formation. This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757	2012-06-19 22:51:23 +00:00
Lang Hames	a33db65bd9	Make comment slightly more helpful. llvm-svn: 158467	2012-06-14 20:37:15 +00:00
Owen Anderson	0eda3e1de6	Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention. llvm-svn: 157708	2012-05-30 18:54:50 +00:00
Owen Anderson	c7aaf523e1	Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node. llvm-svn: 157707	2012-05-30 18:50:39 +00:00
Jim Grosbach	92f6adc8be	DAGCombiner should not change the type of an extract_vector index. When a combine twiddles an extract_vector, care should be take to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately. rdar://11391009 llvm-svn: 156419	2012-05-08 20:56:07 +00:00
Owen Anderson	ab63d84252	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	41b0665b5b	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	b5f167c660	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Elena Demikhovsky	8d7e56c409	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Jakob Stoklund Olesen	beb9469d5c	Register DAGUpdateListeners with SelectionDAG. Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. llvm-svn: 155248	2012-04-20 22:08:46 +00:00
Hal Finkel	e0cf6397fd	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Nadav Rotem	9d376b6578	Reapply 154397. Original message: Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154490	2012-04-11 08:26:11 +00:00
Duncan Sands	4f53074cca	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Owen Anderson	3efc8f22bd	Revert r154397, which was causing make check failures on the buildbots. llvm-svn: 154414	2012-04-10 18:02:12 +00:00
Nadav Rotem	065564d85a	Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154397	2012-04-10 14:58:31 +00:00
Anton Korobeynikov	4d1220de34	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Rafael Espindola	1d9672bdce	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Rafael Espindola	8f62b3248e	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Craig Topper	9c3da316ec	Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309	2012-04-09 07:19:09 +00:00
Craig Topper	e5893f64e8	Remove unnecessary 'else' on an 'if' that always returns llvm-svn: 154308	2012-04-09 05:59:53 +00:00
Craig Topper	e3ad4834ae	Optimize code slightly. No functionality change. llvm-svn: 154307	2012-04-09 05:55:33 +00:00
Craig Topper	5894fe430a	Replace some explicit checks with asserts for conditions that should never happen. llvm-svn: 154305	2012-04-09 05:16:56 +00:00
Benjamin Kramer	bb6ff08766	Silence sign-compare warning. llvm-svn: 154297	2012-04-08 19:04:45 +00:00
Duncan Sands	2f1dc3814b	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Nadav Rotem	71d07ae5cb	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	5f8397a934	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Owen Anderson	98f2c0c384	Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901	2012-04-02 22:10:29 +00:00
Nadav Rotem	702f080767	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Nadav Rotem	b078350872	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Chris Lattner	1cc25e8a40	fix what looks like a real logic bug, found by PVS-Studio (part of PR12357) llvm-svn: 153513	2012-03-27 16:27:21 +00:00
Craig Topper	aaeae98936	When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend. llvm-svn: 153078	2012-03-20 05:28:39 +00:00
Duncan Sands	3fb2fc6edb	Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala. llvm-svn: 153035	2012-03-19 15:35:44 +00:00
Nadav Rotem	6fd1d32c63	When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations. llvm-svn: 152784	2012-03-15 08:49:06 +00:00
Bill Wendling	df170db2f6	Add a xform to the DAG combiner. Transform: (fsub x, (fadd x, y)) -> (fneg y) and (fsub x, (fadd y, x)) -> (fneg y) if 'unsafe math' is specified. <rdar://problem/7540295> llvm-svn: 152777	2012-03-15 05:12:00 +00:00
Evan Cheng	d5f8e5766c	Fortify r152675 a bit. Although I'm not able to come up with a test case that would trigger the truncation case. llvm-svn: 152678	2012-03-13 22:16:11 +00:00
Evan Cheng	7bf83096df	DAG combine incorrectly optimize (i32 vextract (v4i16 load $addr), c) to (i16 load $addr+csizeof(i16)) and replace uses of (i32 vextract) with the i16 load. It should issue an extload instead: (i32 extload $addr+csizeof(i16)). rdar://11035895 llvm-svn: 152675	2012-03-13 22:00:52 +00:00
Benjamin Kramer	e1e549d617	Give dagcombiner's worklist some inline capacity. llvm-svn: 152454	2012-03-10 00:23:58 +00:00
Evan Cheng	80893ce5f5	Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation). llvm-svn: 152162	2012-03-06 23:33:32 +00:00
Owen Anderson	2ee7c4dfc5	Make it possible for a target to mark FSUB as Expand. This requires providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal. llvm-svn: 152079	2012-03-06 00:29:31 +00:00
James Molloy	862fe49c55	Teach the DAGCombiner that certain loadext nodes followed by ANDs can be converted to zeroexts. llvm-svn: 150957	2012-02-20 12:02:38 +00:00
James Molloy	920ae8c642	Remove extraneous #include and spelling mistake introduced in r150669. llvm-svn: 150670	2012-02-16 09:48:07 +00:00
James Molloy	67b6b11b52	Modify the algorithm when traversing the DAGCombiner's worklist to be O(log N) for all operations. This fixes a horrible worst case with lots of nodes where 99% of the time was being spent in std::remove. llvm-svn: 150669	2012-02-16 09:17:04 +00:00
Nadav Rotem	0c65064dbe	Fix a bug in DAGCombine for the optimization of BUILD_VECTOR. We cant generate a shuffle node from two vectors of different types. llvm-svn: 150383	2012-02-13 12:42:26 +00:00
Nadav Rotem	34ca89afa8	This patch addresses the problem of poor code generation for the zext v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes. The DAGCombiner has two optimizations that can mitigate the problem. First, if all of the operands of a BUILD_VECTOR node are extracted from an ZEXT/ANYEXT nodes, then it is possible to create a new simplified BUILD_VECTOR which uses UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes. Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle vector instruction. In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be shuffled into a wide YMM register. This patch modifes the second optimization and allows the creation of shuffle vectors even when the newly generated vector and the original vector from which we extract the values are of different types. llvm-svn: 150340	2012-02-12 15:05:31 +00:00
Nadav Rotem	4f4546b73a	Add additional documentation to the extract-and-trunc dagcombine optimization. llvm-svn: 149823	2012-02-05 11:39:23 +00:00
Nadav Rotem	5399f4d6bf	The type-legalizer often scalarizes code. One of the common patterns is extract-and-truncate. In this patch we optimize this pattern and convert the sequence into extract op of a narrow type. This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases. llvm-svn: 149692	2012-02-03 13:18:25 +00:00
Nadav Rotem	fb6ddee0e9	Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT. llvm-svn: 148337	2012-01-17 21:44:01 +00:00
Craig Topper	02cb0fb136	Teach DAG combiner to turn a BUILD_VECTOR of UNDEFs into an UNDEF of vector type. llvm-svn: 148297	2012-01-17 09:09:48 +00:00
Benjamin Kramer	5a377e28da	DAGCombiner: Deduplicate code. llvm-svn: 148217	2012-01-15 11:50:43 +00:00
Evan Cheng	fa8326334b	DAGCombine's logic for forming pre- and post- indexed loads / stores were being overly conservative. It was concerned about cases where it would prohibit folding simple [r, c] addressing modes. e.g. ldr r0, [r2] ldr r1, [r2, #4] => ldr r0, [r2], #4 ldr r1, [r2] Change the logic to look for such cases which allows it to form indexed memory ops more aggressively. rdar://10674430 llvm-svn: 148086	2012-01-13 01:37:24 +00:00
Chandler Carruth	55b2cdee26	Teach the X86 instruction selection to do some heroic transforms to detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. This happens in real code as follows: unsigned x = my_accelerator_table[input >> 11]; Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4-bytes, which means this implicitly gets turned into (once lowered out of a GEP): (unsigned)((char)my_accelerator_table + ((input >> 11) << 2)); The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured. In order for all of this to work, one part of the DAG needs to be canonicalized still further* than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern. llvm-svn: 147936	2012-01-11 08:41:08 +00:00
Craig Topper	0515cd41e4	Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X) llvm-svn: 147733	2012-01-07 18:31:09 +00:00
Craig Topper	43a1bd6ac7	Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc. llvm-svn: 147728	2012-01-07 09:06:39 +00:00
Chandler Carruth	e041a30bb9	Prevent a DAGCombine from firing where there are two uses of a combined-away node and the result of the combine isn't substantially smaller than the input, it's just canonicalized. This is the first part of a significant (7%) performance gain for Snappy's hot decompression loop. llvm-svn: 147604	2012-01-05 11:05:55 +00:00
Craig Topper	279c77b677	Implement VECTOR_SHUFFLE canonicalizations during DAG combine. llvm-svn: 147525	2012-01-04 08:07:43 +00:00
Eli Friedman	e96286cdf2	Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2. llvm-svn: 147283	2011-12-26 22:49:32 +00:00
Chandler Carruth	637cc6a8aa	Initial CodeGen support for CTTZ/CTLZ where a zero input produces an undefined result. This adds new ISD nodes for the new semantics, selecting them when the LLVM intrinsic indicates that the undef behavior is desired. The new nodes expand trivially to the old nodes, so targets don't actually need to do anything to support these new nodes besides indicating that they should be expanded. I've done this for all the operand types that I could figure out for all the targets. Owners of various targets, please review and let me know if any of these are incorrect. Note that the expand behavior is conservatively correct, and exactly matches LLVM's current behavior with these operations. Ideally this patch will not change behavior in any way. For example the regtest suite finds the exact same instruction sequences coming out of the code generator. That's why there are no new tests here -- all of this is being exercised by the existing test suite. Thanks to Duncan Sands for reviewing the various bits of this patch and helping me get the wrinkles ironed out with expanding for each target. Also thanks to Chris for clarifying through all the discussions that this is indeed the approach he was looking for. That said, there are likely still rough spots. Further review much appreciated. llvm-svn: 146466	2011-12-13 01:56:10 +00:00
Eli Friedman	f9081a8afe	Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target. llvm-svn: 146015	2011-12-07 03:55:52 +00:00
Eli Friedman	0e58cba286	Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494. llvm-svn: 145996	2011-12-07 00:11:56 +00:00
Nick Lewycky	50f02cb21b	Move global variables in TargetMachine into new TargetOptions class. As an API change, now you need a TargetOptions object to create a TargetMachine. Clang patch to follow. One small functionality change in PTX. PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it. llvm-svn: 145714	2011-12-02 22:16:29 +00:00
Evan Cheng	4a5b2040e2	Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead. Conservatively returns zero when the GV does not specify an alignment nor is it initialized. Previously it returns ABI alignment for type of the GV. However, if the type is a "packed" type, then the under-specified alignments is attached to the load / store instructions. In that case, the alignment of the type cannot be trusted. rdar://10464621 llvm-svn: 145300	2011-11-28 22:37:34 +00:00
Evan Cheng	a4b6404cf0	DAG combine should not increase alignment of loads / stores with alignment less than ABI alignment. These are loads / stores from / to "packed" data structures. Their alignments are intentionally under-specified. rdar://10301431 llvm-svn: 145273	2011-11-28 20:42:56 +00:00
Eli Friedman	ff1eaa7578	Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393. llvm-svn: 144863	2011-11-16 23:50:22 +00:00
Jay Foad	70679df664	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144634	2011-11-15 07:50:46 +00:00
Eli Friedman	9d448e4a42	Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029. llvm-svn: 144438	2011-11-12 00:35:34 +00:00
Lang Hames	b85fcd07df	Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported. Add support for trimming constants to GetDemandedBits. This fixes some funky constant generation that occurs when stores are expanded for targets that don't support unaligned stores natively. llvm-svn: 144102	2011-11-08 18:56:23 +00:00
Pete Cooper	82cd9e81fc	Added invariant field to the DAG.getLoad method and changed all calls. When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses llvm-svn: 144100	2011-11-08 18:42:53 +00:00
Richard Osborne	561fac4d4e	Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV() and TargetLowering::BuildUDIV(). Fixes PR11283 llvm-svn: 143964	2011-11-07 17:09:05 +00:00
Nadav Rotem	f310361a7d	Cleanup. Document. Make sure that this build_vector optimization only runs before the op legalizer and that the used type is legal. llvm-svn: 143358	2011-10-31 20:08:25 +00:00
Benjamin Kramer	a4eba41b7a	Silence compiler warning. llvm-svn: 143308	2011-10-30 08:39:55 +00:00
Nadav Rotem	bf6568b5d6	Add a new DAGCombine optimization for BUILD_VECTOR. If all of the inputs are zero/any_extended, create a new simple BV which can be further optimized by other BV optimizations. llvm-svn: 143297	2011-10-29 21:23:04 +00:00
Eli Friedman	e9e356ad6b	Don't crash on 128-bit sdiv by constant. Found by inspection. llvm-svn: 143095	2011-10-27 02:06:39 +00:00
Eli Friedman	3e9ef907e0	Remove a couple redundant checks. llvm-svn: 142959	2011-10-25 20:34:22 +00:00
Bob Wilson	681561901d	Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS. svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands with illegal types, even before type legalization. For this testcase, that led to one BUILD_VECTOR with i16 operands and another with promoted i32 operands, which triggered the assertion. llvm-svn: 142370	2011-10-18 17:34:47 +00:00
Dan Gohman	e83e1b2d2c	Fix SimplifySelectCC to add newly created nodes to the DAGCombiner worklist, as it may be possible to perform further optimization on them. llvm-svn: 140349	2011-09-22 23:01:29 +00:00
Bruno Cardoso Lopes	6cb23f6e7f	Add a DAGCombine for subvector extracts to remove useless chains of subvector inserts and extracts. Initial patch by Rackover, Zvi with some tweak done by me. llvm-svn: 140204	2011-09-20 23:19:33 +00:00
Eli Friedman	b7910b79f5	Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407	2011-09-09 21:04:06 +00:00
Duncan Sands	f2641e1bc1	Add codegen support for vector select (in the IR this means a select with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159	2011-09-06 19:07:46 +00:00
Benjamin Kramer	68ed46ce9a	Roll back the rest of r126557. It's a hack that will break in some obscure cases. llvm-svn: 138130	2011-08-19 22:39:31 +00:00
Nadav Rotem	62da15a330	Revert r137310 because it does not optimize any code on ToT llvm-svn: 137466	2011-08-12 17:15:04 +00:00
Nadav Rotem	61140e1028	[AVX] When joining two XMM registers into a YMM register, make sure that the lower XMM register gets in first. This will allow the SUBREG pattern to elliminate the first vector insertion. llvm-svn: 137310	2011-08-11 16:49:36 +00:00
Eli Friedman	cbd3ba91b7	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Eric Christopher	d6300d2956	Add a dag combine pattern for folding C2-(A+C1) -> (C2-C1)-A Fixes rdar://9761830 llvm-svn: 135123	2011-07-14 01:12:15 +00:00
Lang Hames	5a00499e87	Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. The hasPredecessorHelper function allows predecessors to be cached to speed up repeated invocations. This fixes PR10186. X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X) Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with empty Visited and Worklist sets (i.e. no caching over invocations). Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited and Worklist to speed up repeated calls. The Visited set is searched for X before going to the worklist to further search the DAG if necessary. llvm-svn: 134592	2011-07-07 04:31:51 +00:00
Benjamin Kramer	8665f8d916	Revert a part of r126557 which could create unschedulable DAGs. llvm-svn: 134067	2011-06-29 13:47:25 +00:00
Jay Foad	83be361b8a	Replace the existing forms of ConstantArray::get() with a single form that takes an ArrayRef. llvm-svn: 133615	2011-06-22 09:24:39 +00:00
Evan Cheng	4c0bd9629d	Teach dag combine to match halfword byteswap patterns. 1. (((x) & 0xFF00) >> 8) \| (((x) & 0x00FF) << 8) => (bswap x) >> 16 2. ((x&0xff)<<8)\|((x&0xff00)>>8)\|((x&0xff000000)>>8)\|((x&0x00ff0000)<<8)) => (rotl (bswap x) 16) This allows us to eliminate most of the def : Pat patterns for ARM rev16 revsh instructions. It catches many more cases for ARM and x86. rdar://9609108 llvm-svn: 133503	2011-06-21 06:01:08 +00:00
Nick Lewycky	6d677cfdd8	Add a DAGCombine for (ext (binop (load x), cst)). llvm-svn: 133124	2011-06-16 01:15:49 +00:00
Nadav Rotem	d2d9bdb2b0	Enable the simplification of truncating-store after fixing the usage of GetDemandBits (which must operate on the vector element type). Fix the a usage of getZeroExtendInReg which must also be done on scalar types. llvm-svn: 133052	2011-06-15 11:19:12 +00:00
Chad Rosier	818e116723	When pattern matching during instruction selection make sure shl x,1 is not converted to add x,x if x is a undef. add undef, undef does not guarantee that the resulting low order bit is zero. Fixes <rdar://problem/9453156> and <rdar://problem/9487392>. llvm-svn: 133022	2011-06-14 22:29:10 +00:00
Nadav Rotem	571ae19af7	Disable trunc-store simplification on vectors. llvm-svn: 132984	2011-06-14 07:18:26 +00:00
Eli Friedman	1877ac9937	Change this DAGCombine to build AND of SHR instead of SHR of AND; this matches the ordering we prefer in instcombine. Part of rdar://9562809. The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now. llvm-svn: 132809	2011-06-09 22:14:44 +00:00
Devang Patel	efec7715ec	Revert 121907 (it causes llc crash) and apply original patch from PR9817. llvm-svn: 131926	2011-05-23 22:04:42 +00:00
Benjamin Kramer	2fd48f2730	Implement mulo x, 2 -> addo x, x in DAGCombiner. llvm-svn: 131800	2011-05-21 18:31:55 +00:00
Dan Gohman	4298df6d86	Misc. code cleanups. llvm-svn: 131495	2011-05-17 22:20:36 +00:00
Nadav Rotem	8a7beb80f0	Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain). If there is a store after the load node, then there is a chain, which means that there is another user. Thus, asking hasOneUser would fail. Instead we ask hasNUsesOfValue on the 'data' value. llvm-svn: 131183	2011-05-11 14:40:50 +00:00
Duncan Sands	6be291a2cd	Indent properly, no functionality change. llvm-svn: 131082	2011-05-09 08:03:33 +00:00
Eli Friedman	55b0acd624	PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext. Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. llvm-svn: 129650	2011-04-16 23:25:34 +00:00
Owen Anderson	a519284fec	Fix another instance of the DAG combiner not using the correct type for the RHS of a shift. llvm-svn: 129522	2011-04-14 17:30:49 +00:00
Chris Lattner	41c80e89f3	have dag combine zap "store undef", which can be formed during call lowering with undef arguments. llvm-svn: 129185	2011-04-09 02:32:02 +00:00
Cameron Zwarich	8c7bbc09e2	Add a RemoveFromWorklist method to DCI. This is needed to do some complicated transformations in target-specific DAG combines without causing DAGCombiner to delete the same node twice. If you know of a better way to avoid this (see my next patch for an example), please let me know. llvm-svn: 128758	2011-04-02 02:40:26 +00:00
Evan Cheng	adb9c03e41	Avoid replacing the value of a directly stored load with the stored value if the load is indexed. rdar://9117613. llvm-svn: 127440	2011-03-11 00:48:56 +00:00
Stuart Hastings	6b4007dec6	Can't introduce floating-point immediate constants after legalization. Radar 9056407. llvm-svn: 126864	2011-03-02 19:36:30 +00:00
Nadav Rotem	b00913028f	Fix typos in the comments. llvm-svn: 126565	2011-02-27 07:40:43 +00:00
Benjamin Kramer	26691d9660	Add some DAGCombines for (adde 0, 0, glue), which are useful to optimize legalized code for large integer arithmetic. 1. Inform users of ADDEs with two 0 operands that it never sets carry 2. Fold other ADDs or ADDCs into the ADDE if possible It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code. llvm-svn: 126557	2011-02-26 22:48:07 +00:00
Owen Anderson	b2c80da4ae	Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS. llvm-svn: 126518	2011-02-25 21:41:48 +00:00
Nadav Rotem	502f1b943f	Enable support for vector sext and trunc: Limit the folding of any_ext and sext into the load operation to scalars. Limit the active-bits trunc optimization to scalars. Document vector trunc and vector sext in LangRef. Similar to commit 126080 (for enabling zext). llvm-svn: 126424	2011-02-24 21:01:34 +00:00
Nadav Rotem	25f2ac948b	Fix 9267; Add vector zext support. The DAGCombiner folds the zext into complex load instructions. This patch prevents this optimization on vectors since none of the supported targets knows how to perform load+vector_zext in one instruction. llvm-svn: 126080	2011-02-20 12:37:50 +00:00
Stuart Hastings	81c4306005	Swap VT and DebugLoc operands of getExtLoad() for consistency with other getNode() methods. Radar 9002173. llvm-svn: 125665	2011-02-16 16:23:55 +00:00
Eric Christopher	e5ca1e0506	Refactor zero folding slightly. Clean up todo. llvm-svn: 125651	2011-02-16 04:50:12 +00:00
Eric Christopher	ef72141a75	The change for PR9190 wasn't quite right. We need to avoid making the transformation if we can't legally create a build vector of the correct type. Check that we can make the transformation first, and add a TODO to refactor this code with similar cases. Fixes: PR9223 and rdar://9000350 llvm-svn: 125631	2011-02-16 01:10:03 +00:00
Chris Lattner	e95d195014	Revisit my fix for PR9028: the issue is that DAGCombine was generating i8 shift amounts for things like i1024 types. Add an assert in getNode to prevent this from occuring in the future, fix the buggy transformation, revert my previous patch, and document this gotcha in ISDOpcodes.h llvm-svn: 125465	2011-02-13 19:09:16 +00:00
Nadav Rotem	db2f54811d	A fix for 9165. The DAGCombiner created illegal BUILD_VECTOR operations. The patch added a check that either illegal operations are allowed or that the created operation is legal. llvm-svn: 125435	2011-02-12 14:40:33 +00:00
Nadav Rotem	a49a02a04f	SimplifySelectOps can only handle selects with a scalar condition. Add a check that the condition is not a vector. llvm-svn: 125398	2011-02-11 19:57:47 +00:00
Nadav Rotem	18f6a33457	Fix #9190 The bug happens when the DAGCombiner attempts to optimize one of the patterns of the SUB opcode. It tries to create a zero of type v2i64. This type is legal on 32bit machines, but the initializer of this vector (i64) is target dependent. Currently, the initializer attempts to create an i64 zero constant, which fails. Added a flag to tell the DAGCombiner to create a legal zero, if we require that the pass would generate legal types. llvm-svn: 125391	2011-02-11 19:20:37 +00:00
Evan Cheng	d42641c6b5	Given a pair of floating point load and store, if there are no other uses of the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 llvm-svn: 124708	2011-02-02 01:06:55 +00:00
Richard Osborne	272e084bca	Fix bug where ReduceLoadWidth was creating illegal ZEXTLOAD instructions. llvm-svn: 124587	2011-01-31 17:41:44 +00:00
Benjamin Kramer	946e1522b6	Teach DAGCombine to fold fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2) when c1 equals the amount of bits that are truncated off. This happens all the time when a smul is promoted to a larger type. On x86-64 we now compile "int test(int x) { return x/10; }" into movslq %edi, %rax imulq $1717986919, %rax, %rax movq %rax, %rcx shrq $63, %rcx sarq $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax" addl %ecx, %eax This fires 96 times in gcc.c on x86-64. llvm-svn: 124559	2011-01-30 16:38:43 +00:00
Benjamin Kramer	65bb14d368	Add the missing sub identity "A-(A-B) -> B" to DAGCombine. This happens e.g. for code like "X - X%10" where we lower the modulo operation to a series of multiplies and shifts that are then subtracted from X, leading to this missed optimization. llvm-svn: 124532	2011-01-29 12:34:05 +00:00
Anton Korobeynikov	2f93128109	Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there. llvm-svn: 123170	2011-01-10 12:39:04 +00:00
Benjamin Kramer	1f4dfbbcb0	DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455	2010-12-22 23:17:45 +00:00
Chris Lattner	cafc1e60bb	Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392	2010-12-22 08:02:57 +00:00
Chris Lattner	9a499e96eb	more cleanups, move a check for "roundedness" earlier to reject unhanded cases faster and simplify code. llvm-svn: 122391	2010-12-22 08:01:44 +00:00
Chris Lattner	222374d886	reduce indentation and improve comments, no functionality change. llvm-svn: 122389	2010-12-22 07:36:50 +00:00
Dale Johannesen	a94e36bbee	Reapply 122353-122355 with fixes. 122354 was wrong; the shift type was needed one place, the shift count type another. The transform in 123555 had the same problem. llvm-svn: 122366	2010-12-21 21:55:50 +00:00
Dale Johannesen	87c47499c6	Revert 122353-122355 for the moment, they broke stuff. llvm-svn: 122360	2010-12-21 21:22:27 +00:00
Dale Johannesen	caf42aa6a4	Add a new transform to DAGCombiner. llvm-svn: 122355	2010-12-21 20:10:51 +00:00
Dale Johannesen	fa5dc82fda	Get the type of a shift from the shift, not from its shift count operand. These should be the same but apparently are not always, and this is cleaner anyway. This improves the code in an existing test. llvm-svn: 122354	2010-12-21 20:06:19 +00:00
Dale Johannesen	d64931df77	Shift by the word size is invalid IR; don't create it. llvm-svn: 122353	2010-12-21 20:00:06 +00:00
Chris Lattner	2a7ff99979	fix some typos llvm-svn: 122349	2010-12-21 18:05:22 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Dale Johannesen	0a291a36f2	Cosmetic changes. llvm-svn: 122259	2010-12-20 20:10:50 +00:00
Bob Wilson	5408144add	Fix a DAGCombiner crash when folding binary vector operations with constant BUILD_VECTOR operands where the element type is not legal. I had previously changed this code to insert TRUNCATE operations, but that was just wrong. llvm-svn: 122102	2010-12-17 23:06:49 +00:00
Dale Johannesen	cd538afa52	Add a transform to DAG Combiner. This improves the code for the case where 32-bit divide by constant is turned into 64-bit multiply by constant. 8771012. llvm-svn: 122090	2010-12-17 21:45:49 +00:00
Chris Lattner	15090e1eb0	take care of some todos, transforming [us]mul_lohi into a wider mul if the wider mul is legal. llvm-svn: 121848	2010-12-15 06:04:19 +00:00
Chris Lattner	b86dceea1b	when transforming a MULHS into a wider MUL, there is no need to SRA the result, the top bits are truncated off anyway, just use SRL. llvm-svn: 121846	2010-12-15 05:51:39 +00:00
Chris Lattner	10bd29f1d4	Add a couple dag combines to transform mulhi/mullo into a wider multiply when the wider type is legal. This allows us to compile: define zeroext i16 @test1(i16 zeroext %x) nounwind { entry: %div = udiv i16 %x, 33 ret i16 %div } into: test1: # @test1 movzwl 4(%esp), %eax imull $63551, %eax, %eax # imm = 0xF83F shrl $21, %eax ret instead of: test1: # @test1 movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F mulw 4(%esp) andl $65504, %edx # imm = 0xFFE0 movl %edx, %eax shrl $5, %eax ret Implementing rdar://8760399 and example #4 from: http://blog.regehr.org/archives/320 We should implement the same thing for [su]mul_hilo, but I don't have immediate plans to do this. llvm-svn: 121696	2010-12-13 08:39:01 +00:00
Eric Christopher	d9e8eac235	80-col fixups. llvm-svn: 121356	2010-12-09 04:48:06 +00:00
Jay Foad	583abbc4df	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Bob Wilson	f9b96c474f	Fix a comment typo. llvm-svn: 120235	2010-11-28 06:51:19 +00:00
Wesley Peck	527da1b6e2	Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept. llvm-svn: 119990	2010-11-23 03:31:01 +00:00
Duncan Sands	c92331b984	Fix thinko: we must turn select(anyext, sext) into sext(select) not anyext(select). Spotted by Frits van Bommel. llvm-svn: 119739	2010-11-18 21:16:28 +00:00
Duncan Sands	12f3b3b44f	The DAGCombiner was threading select over pairs of extending loads even if the extension types were not the same. The result was that if you fed a select with sext and zext loads, as in the testcase, then it would get turned into a zext (or sext) of the select, which is wrong in the cases when it should have been an sext (resp. zext). Reported and diagnosed by Sebastien Deldon. llvm-svn: 119728	2010-11-18 20:05:18 +00:00
Dan Gohman	5db8921422	Fix DAGCombiner to avoid folding a sext-in-reg or similar through a shl in order to fold it into a load. llvm-svn: 118471	2010-11-09 01:54:35 +00:00
Eric Christopher	c6418b105a	Just return undef for invalid masks or elts, and since we're doing that, just do it earlier too. llvm-svn: 118195	2010-11-03 20:44:42 +00:00
Eric Christopher	fcc9e6848a	If we have an undef mask our Elt will be -1 for our access, handle this by using an undef as a pointer. Fixes rdar://8625016 llvm-svn: 118164	2010-11-03 09:36:40 +00:00
Dan Gohman	68fb004616	Fix DAGCombiner to avoid going into an infinite loop when it encounters (and:i64 (shl:i64 (load:i64), 1), 0xffffffff). This fixes rdar://8606584. llvm-svn: 118143	2010-11-03 01:47:46 +00:00

... 5 6 7 8 9 ...

1384 Commits