llvm-project

Commit Graph

Author	SHA1	Message	Date
Juergen Ributzka	a415943590	[FastISel] Fix patchpoint lowering to set the result register. Always update the value map with the result register (if there is one), for the patchpoint instruction we created to replace the target-specific call instruction. llvm-svn: 213033	2014-07-15 02:22:43 +00:00
Andrea Di Biagio	2152a6c78b	[DAGCombiner] Avoid calling method 'isShuffleMaskLegal' on illegal vector types. This patch fixes a crasher in method 'DAGCombiner::visitOR' due to an invalid call to method 'isShuffleMaskLegal'. On x86, method 'isShuffleMaskLegal' always expects a legal vector value type in input. With this patch, we immediately check if the input OR dag node has a legal vector type; we only try to fold a OR dag node into a single shufflevector if we know that the resulting shuffle will have a legal type. This is to avoid calling method 'isShuffleMaskLegal' on a potentially illegal vector value type. Added a new test-case to file 'CodeGen/X86/combine-or.ll' to verify that DAGCombiner doesn't crash in the attempt to check/combine an OR between shuffles with illegal types. llvm-svn: 213020	2014-07-15 00:02:32 +00:00
Andrea Di Biagio	3960a9571f	[DAGCombiner] Add more rules to combine shuffle vector dag nodes. This patch teaches the DAGCombiner how to fold a pair of shuffles according to rules: 1. shuffle(shuffle A, B, M0), B, M1) -> shuffle(A, B, M2) 2. shuffle(shuffle A, B, M0), A, M1) -> shuffle(A, B, M3) The new rules would only trigger if the resulting shuffle has legal type and legal mask. Added test 'combine-vec-shuffle-3.ll' to verify that DAGCombiner correctly folds shuffles on x86 when the resulting mask is legal. Also added some negative cases to verify that we avoid introducing illegal shuffles. llvm-svn: 213001	2014-07-14 22:46:26 +00:00
Sanjay Patel	b49bf168f2	fixed typo llvm-svn: 212966	2014-07-14 18:21:07 +00:00
Andrea Di Biagio	67d8b2e2b0	[DAGCombiner] Fix a crash caused by a missing check for legal type when trying to fold shuffles. Verify that DAGCombiner does not crash when trying to fold a pair of shuffles according to rule (added at r212539): (shuffle (shuffle A, Undef, M0), Undef, M1) -> (shuffle A, Undef, M2) The DAGCombiner avoids folding shuffles if the resulting shuffle dag node is not legal for the target. That means, the resulting shuffle must have legal type and legal mask. Before, the DAGCombiner only called method 'TargetLowering::isShuffleMaskLegal' to check if it was "safe" to fold according to the above-mentioned rule. However, this caused a crash in the x86 backend since method 'isShuffleMaskLegal' always expects to be called on a legal vector type. llvm-svn: 212915	2014-07-13 21:02:14 +00:00
Reid Kleckner	fb9519838a	Avoid a warning from MSVC on "*/" in this code by inserting a space llvm-svn: 212862	2014-07-12 00:06:46 +00:00
Juergen Ributzka	3d9e6755e4	[FastISel] Add target-independent patchpoint intrinsic support. WIP. This implements the target-independent lowering for the patchpoint intrinsic. Targets have to implement the FastLowerCall hook to support this intrinsic. Related to <rdar://problem/17427052> llvm-svn: 212849	2014-07-11 22:19:02 +00:00
Juergen Ributzka	8179e9e5ad	[FastISel] Add basic infrastructure to support a target-independent call lowering hook in FastISel. WIP The infrastructure mimics the call lowering we have already in place for SelectionDAG, but with limitations. For example structure return demotion and non-simple types are not supported (yet). Currently every backend has its own implementation and duplicated code for call lowering. There is also no specified interface that could be called from target-independent code. The target-hook is opt-in and doesn't affect current implementations. llvm-svn: 212848	2014-07-11 22:01:42 +00:00
Juergen Ributzka	4ce9863d0b	[FastISel] Make isInTailCallPosition independent of SelectionDAG. Break out the arguemnts required from SelectionDAG, so that this function can also be used by FastISel. llvm-svn: 212844	2014-07-11 20:50:47 +00:00
Juergen Ributzka	5dd32136b9	[FastISel] Breakout intrinsic lowering into a separate function and add a target-hook. Create a separate helper function for target-independent intrinsic lowering. Also add an target-hook that allows to directly call into a target-sepcific intrinsic lowering method. Currently the implementation is opt-in and doesn't affect existing target implementations. llvm-svn: 212843	2014-07-11 20:42:12 +00:00
Oliver Stannard	6eda6ffc0c	ARM: Allow __fp16 as a function arg or return type for AArch64 ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. llvm-svn: 212812	2014-07-11 13:33:46 +00:00
Jan Vesely	eca89d283e	SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer Move the code to a helper function to allow calls from TypeLegalizer. No functionality change intended Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Owen Anderson <resistor@mac.com> llvm-svn: 212772	2014-07-10 22:40:18 +00:00
Matt Arsenault	3332b70627	Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."" Don't try to convert the select condition type. llvm-svn: 212750	2014-07-10 18:21:04 +00:00
Andrea Di Biagio	b2921c7ca0	[DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches the DAGCombiner how to fold shuffles according to the following new rules: 1. shuffle(shuffle(x, y), undef) -> x 2. shuffle(shuffle(x, y), undef) -> y 3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef) 4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef) The backend avoids to combine shuffles according to rules 3. and 4. if the resulting shuffle does not have a legal mask. This is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes during vector legalization. Added test case combine-vec-shuffle-2.ll to verify that we correctly triggers the new rules when combining shuffles. llvm-svn: 212748	2014-07-10 18:04:55 +00:00
Chandler Carruth	0b666e0648	[x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous to the zero-extend-vector-inreg node introduced previously for the same purpose: manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit. This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the the shuffle legalization is scalarizing. Funny thing, I've been working on that. Some initial experiments with this and SSE2 scenarios is showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward. llvm-svn: 212714	2014-07-10 12:32:32 +00:00
NAKAMURA Takumi	f862ce8908	Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine." This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations. llvm-svn: 212708	2014-07-10 11:37:28 +00:00
Daniel Sanders	cbd44c591d	Make it possible for ints/floats to return different values from getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 llvm-svn: 212697	2014-07-10 10:18:12 +00:00
Hao Liu	71224b02fb	[AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector. llvm-svn: 212677	2014-07-10 03:41:50 +00:00
Matt Arsenault	658c5576d1	Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine. Do this if the truncate is free and the select is legal. llvm-svn: 212640	2014-07-09 19:12:07 +00:00
Chandler Carruth	5865a73a82	[x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were not widening the input type to the node sufficiently to let the ext take place in a register. This would in turn result in a mysterious bitcast assertion failure downstream. First change here is to add back the helpful assert I had in an earlier version of the code to catch this immediately. Next change is to add support to the type legalization to detect when we have widened the operand either too little or too much (for whatever reason) and find a size-matched legal vector type to convert it to first. This can also fail so we get a new fallback path, but that seems OK. With this, we no longer crash on vec_cast2.ll when using widening. I've also added the CHECK lines for the zero-extend cases here. We still need to support sign-extend and trunc (or something) to get plausible code for the other two thirds of this test which is one of the regression tests that showed the most scalarization when widening was force-enabled. Slowly closing in on widening being a viable legalization strategy without it resorting to scalarization at every turn. =] llvm-svn: 212614	2014-07-09 12:36:54 +00:00
Chandler Carruth	14cad41e14	Sink two variables only used in an assert into the assert itself. Should fix the release builds with Werror. llvm-svn: 212612	2014-07-09 11:13:16 +00:00
Chandler Carruth	afe4b2507e	[x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 llvm-svn: 212610	2014-07-09 10:58:18 +00:00
Chandler Carruth	f0a33b71e9	[SDAG] At the suggestion of Hal, switch to an output parameter that tracks which elements of the build vector are in fact undef. This should make actually inpsecting them (likely in my next patch) reasonably pretty. Also makes the output parameter optional as it is clear now that most users are happy with undefs in their splats. llvm-svn: 212581	2014-07-09 00:41:34 +00:00
Andrea Di Biagio	d261e98f3d	[DAG] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches how to fold a shuffle according to rule: shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2) We do this only if the resulting mask M2 is legal; this is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes. This patch has the advantage of being target independent, since it works on ISD nodes. Therefore, all targets (not only x86) can take advantage of this rule. The idea behind this patch is that most shuffle pairs can be safely combined before we run the legalizer on vector operations. This allows us to combine/simplify dag nodes earlier in the process and not only immediately before instruction selection stage. That said. This patch is not meant to replace any existing target specific combine rules; backends might still introduce new shuffles during legalization stage. Also, this rule is very simple and avoids to aggressively optimize shuffles. llvm-svn: 212539	2014-07-08 15:22:29 +00:00
Chandler Carruth	142e966261	[x86,SDAG] Sink the logic for folding shuffles of splats more aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could make N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. llvm-svn: 212517	2014-07-08 08:45:38 +00:00
Chandler Carruth	efbce58775	[SDAG] Actually check for a non-constant splat and clarify comments around the handling of UNDEF lanes in boolean vector content analysis. The code before my changes here also failed to check for non-constant splats in a buildvector. I have no idea how to trigger this, I just spotted by inspection when trying to understand the code. It seems extremely unlikely to be worth the trouble to teach the only caller of this code (DAG combining setcc patterns) how to cleverly handle undef lanes, so I've just commented more thoroughly that we're giving up there. llvm-svn: 212515	2014-07-08 07:44:15 +00:00
Chandler Carruth	b844e72e85	[SDAG] Build up a more rich set of APIs for querying build-vector SDAG nodes about whether they are splats. This is factored out and improved from r212324 which got reverted as it was far too aggressive. The new API should help more conservatively handle buildvectors that are a mixture of splatted and undef values. No functionality change at this point. The hope is to slowly re-introduce the undef-tolerant optimization of splats, but each time being forced to make a concious decision about how to handle the undefs in a way that doesn't lead to contradicting assumptions about the collapsed value. Hal has pointed out in discussions that this may not end up being the desired API and instead it may be more convenient to get a mask of the undef elements or something similar. I'm starting simple and will expand the API as I adapt actual callers and see exactly what they need. llvm-svn: 212514	2014-07-08 07:19:55 +00:00
Chandler Carruth	beeacac0b3	[x86] Revert r212324 which was too aggressive w.r.t. allowing undef lanes in vector splats. The core problem here is that undef lanes can't unilaterally be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212475	2014-07-07 19:03:32 +00:00
Chandler Carruth	5d79bb5d32	[x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324	2014-07-04 08:11:49 +00:00
Eric Christopher	c1058df66f	Move function dependent resetting of a subtarget variable out of the subtarget. This involved having the movt predicate take the current function - since we care about size in instruction selection for whether or not to use movw/movt take the function so we can check the attributes. This required adding the current MachineFunction to FastISel and propagating through. llvm-svn: 212309	2014-07-04 01:55:26 +00:00
Ulrich Weigand	f236bb1b5b	Fix ppcf128 component access on little-endian systems The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274	2014-07-03 15:06:47 +00:00
Chandler Carruth	99b1104c46	[x86] Fix the completely broken vector widening legalization of bswap. This operation was classified as a binary operation in the widening logic for some reason (clearly, untested). It is in fact a unary operation. Add a RUN line to a test to exercise this for x86. Note that again the vector widening strategy doesn't regress anything and in one case removes a totally unecessary instruction that we couldn't avoid when promoting the element type. llvm-svn: 212257	2014-07-03 07:04:38 +00:00
Chandler Carruth	c1bedac3bd	[cleanup] Hoist an if-else chain on ISD opcodes (really designed for switches) into a switch, and sink them into a dispatch function that can return the result rather than awkward variable setting with breaks. llvm-svn: 212166	2014-07-02 06:23:34 +00:00
Chandler Carruth	722289f311	[cleanup] Remove dead 'break;' statements that I meant to nuke in r212158 but missed. Thanks to Craig for spotting the goof! llvm-svn: 212159	2014-07-02 04:39:34 +00:00
Chandler Carruth	2746c2861f	[cleanup] Hoist the promotion dispatch logic into the promote function so that we can use return to express it more cleanly and avoid so many nested switch statements. llvm-svn: 212158	2014-07-02 03:07:15 +00:00
Chandler Carruth	1cfa895c4a	[cleanup] Nuke the 'VectorOp' bit of the promote method names. This doesn't add any information for methods in the VectorLegalizer class that clearly take SDAG operations to legalize. llvm-svn: 212157	2014-07-02 03:07:11 +00:00
Chandler Carruth	68adf1568a	[x86] Clean up and modernize the doxygen and API comments for the vector operation legalization code. llvm-svn: 212155	2014-07-02 02:16:57 +00:00
Juergen Ributzka	190305b648	[FastISel] Factor out stackmap intrinsic selection code into a dedicated helper method. NFCI. llvm-svn: 212140	2014-07-01 22:25:49 +00:00
Juergen Ributzka	3bd03c7099	[DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI. The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135	2014-07-01 22:01:54 +00:00
Alp Toker	cf21875d41	Fix 'platform-specific' hyphenations llvm-svn: 212056	2014-06-30 18:57:16 +00:00
Craig Topper	66e588be09	Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to simplify some code. llvm-svn: 211993	2014-06-29 00:40:57 +00:00
Chad Rosier	5235973ee0	[AArch64] Fix memset ICE when memset value is f128. llvm-svn: 211960	2014-06-27 21:05:09 +00:00
Alp Toker	e69170a110	Revert "Introduce a string_ostream string builder facilty" Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814	2014-06-26 22:52:05 +00:00
Alp Toker	614717388c	Introduce a string_ostream string builder facilty string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface. small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap. This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation. The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface. llvm-svn: 211749	2014-06-26 00:00:48 +00:00
Kevin Qin	93d45ecdbf	[AArch64] Fix a build_vector pattern match fail caused by defect in isBuildVectorAllZeros(). llvm-svn: 211567	2014-06-24 05:37:27 +00:00
Benjamin Kramer	b7f5fb5751	Legalizer: Add support for splitting insert_subvectors. We handle this by spilling the whole thing to the stack and doing the insertion as a store. PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled. llvm-svn: 211435	2014-06-21 12:56:42 +00:00
Jingyue Wu	37fcb5919d	[ValueTracking] Extend range metadata to call/invoke Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281	2014-06-19 16:50:16 +00:00
Tim Northover	d82ed2e581	DAG: move sret demotion into most basic LowerCallTo implementation. It looks like there are two versions of LowerCallTo here: the SelectionDAGBuilder one is designed to operate on LLVM IR, and the TargetLowering one in the case where everything is at DAG level. Previously, only the SelectionDAGBuilder variant could handle demoting an impossible return to sret semantics (before delegating to the TargetLowering version), but this functionality is also useful for certain libcalls (e.g. 128-bit operations on 32-bit x86). So this commit moves the sret handling down a level. rdar://problem/17242889 llvm-svn: 211155	2014-06-18 11:52:44 +00:00
Tom Stellard	aad4659470	SelectionDAG: Expand i64 = FP_TO_SINT i32 llvm-svn: 211108	2014-06-17 16:53:07 +00:00
Tim Northover	65277a2bc0	LegalizeDAG: make sure cast is unsigned before using FP_TO_UINT. It's valid to use FP_TO_SINT when asking for a smaller type (e.g. all "unsigned int16" values fit into a "signed int32"), but the reverse isn't true. Unfortunately, I'm not actually aware of any architecture with asymmetric FP_TO_SINT and FP_TO_UINT handling and the logic happens to work in the symmetric case, so I can't actually write a test for this. llvm-svn: 210986	2014-06-15 09:27:20 +00:00
Eric Christopher	f047bfd115	The hazard recognizer only needs a subtarget, not a target machine so make it take one. Fix up all users accordingly. llvm-svn: 210948	2014-06-13 22:38:52 +00:00
Tim Northover	420a216817	IR: add "cmpxchg weak" variant to support permitted failure. This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903	2014-06-13 14:24:07 +00:00
Juergen Ributzka	454d374e37	[FastISel][X86] - Add branch weights Add branch weights to branch instructions, so that the following passes can optimize based on it (i.e. basic block ordering). llvm-svn: 210863	2014-06-13 00:45:11 +00:00
Juergen Ributzka	349777d3ea	[FastISel][X86] Add MachineMemOperand to load/store instructions. This commit adds MachineMemOperands to load and store instructions. This allows the peephole optimizer to fold load instructions. Unfortunatelly the peephole optimizer currently doesn't run at -O0. llvm-svn: 210858	2014-06-12 23:27:57 +00:00
Tom Stellard	7783b0adf4	Revert "SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors" This reverts commit r210540, adds a testcase for the regression it caused, and marks the R600 test it was supposed to fix as XFAIL. llvm-svn: 210792	2014-06-12 16:04:47 +00:00
Juergen Ributzka	04558dc77a	[FastISel] Add support for the stackmap intrinsic. This implements target-independent FastISel lowering for the stackmap intrinsic. llvm-svn: 210742	2014-06-12 03:29:26 +00:00
Eric Christopher	576d36ae05	Have isInTailCallPosition take the DAG so that we can use the version of TargetLowering/Machine from there on the way to avoiding TargetMachine in TargetLowering. llvm-svn: 210579	2014-06-10 20:39:38 +00:00
Juergen Ributzka	89fe23e888	[FastISel] Collect statistics about failing intrinsic calls. Add more instruction-specific statistics about failing intrinsic calls during FastISel. llvm-svn: 210556	2014-06-10 18:17:00 +00:00
Tom Stellard	3787b12255	SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC The SelectionDAG bad a special case for ISD::SELECT_CC, where it would allow targets to specify: setOperationAction(ISD::SELECT_CC, MVT::Other, Expand); to indicate that they wanted to expand ISD::SELECT_CC for all types. This wasn't applied correctly everywhere, and it makes writing new DAG patterns with ISD::SELECT_CC difficult. llvm-svn: 210541	2014-06-10 16:01:29 +00:00
Tom Stellard	b9a023383e	SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors This prevents a future commit from regressing: test/CodeGen/R600/setcc-equivalent.ll llvm-svn: 210540	2014-06-10 16:01:25 +00:00
Tom Stellard	3ca1bfc728	SelectionDAG: Expand SELECT_CC to SELECT + SETCC This consolidates code from the Hexagon, R600, and XCore targets. No functionality change intended. llvm-svn: 210539	2014-06-10 16:01:22 +00:00
Andrea Di Biagio	f99dd64f0a	[X86] Add target combine rules for horizontal add/sub. This patch adds new target specific combine rules to identify horizontal add/sub idioms from BUILD_VECTOR dag nodes. This patch also teaches the DAGCombiner how to canonicalize sequences of insert_vector_elt dag nodes according to the following rule: (insert_vector_elt (insert_vector_elt A, I0), I1) -> (insert_vecto_elt (insert_vector_elt A, I1), I0) This new canonicalization rule only triggers if the inner insert_vector dag node has exactly one use; also, both indices must be known constants, and I1 < I0. This last rule made it possible to write a simpler algorithm to identify horizontal add/sub patterns because now we don't have to worry about the ordering of insert_vector_elt dag nodes. llvm-svn: 210477	2014-06-09 16:54:41 +00:00
Andrea Di Biagio	4db1abea15	[DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG. This patch modifies SelectionDAGBuilder to construct SDNodes with associated NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator instructions. Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing nsw/nuw/exact flags during codegen. Patch by Marcello Maggioni. llvm-svn: 210467	2014-06-09 12:32:53 +00:00
Eric Christopher	0dd8d486b3	Have TargetSelectionDAGInfo take a DataLayout initializer rather than a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366	2014-06-06 19:04:48 +00:00
Adam Nemet	b4690e3fd1	[SelectionDAG] Force cycle detection in AssignTopologicalOrder before aborting DAG cycle detection is only enabled with ENABLE_EXPENSIVE_CHECKS. However we can run it just before we would crash in order to provide more informative diagnostics. Now in addition to the "Overran sorted position" message we also get the Node printed if a cycle was detected. Tested by building several configs: Debug+Assert, Debug+Assert+Check (this is ENABLE_EXPENSIVE_CHECKS), Release+Assert and Release. Also tried that the AssignTopologicalOrder assert produces the expected results. llvm-svn: 209977	2014-05-31 16:23:20 +00:00
Adam Nemet	7d39430a14	[SelectionDAG] Pass DAG to checkForCycles Pass the DAG down to checkForCycles from all callers where we have it. This allows target-specific nodes to be printed properly. Also print some missing newlines. llvm-svn: 209976	2014-05-31 16:23:17 +00:00
Andrea Di Biagio	446a527905	[X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type. This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type. This patch adds two new combine rules: 1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>) 2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>). Both rules are only triggered on the type-legalized DAG. In particular, rule 1. is a target specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2. is a target independet rule that attempts to move a shuffle immediately after a binary operation. llvm-svn: 209930	2014-05-30 23:17:53 +00:00
Filipe Cabecinhas	82111f12fb	Convert a vselect into a concat_vector if possible Summary: If both vector args to vselect are concat_vectors and the condition is constant and picks half a vector from each argument, convert the vselect into a concat_vectors. Added a test. The ConvertSelectToConcatVector is assuming it doesn't get vselects with arguments of, for example, <undef, undef, true, true>. Those get taken care of in the checks above its call. Reviewers: nadav, delena, grosbach, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3916 llvm-svn: 209929	2014-05-30 23:03:11 +00:00
Rafael Espindola	92945eee80	[pr19636] Fix known bit computation in urem instruction with power of two. Patch by Andrey Kuharev. llvm-svn: 209902	2014-05-30 15:00:45 +00:00
Tim Northover	d622e1282c	SelectionDAG: skip barriers for unordered atomic operations Unordered is strictly weaker than monotonic, so if the latter doesn't have any barriers then the former certainly shouldn't. rdar://problem/16548260 llvm-svn: 209901	2014-05-30 14:41:51 +00:00
Hao Liu	4091450181	Fix an assertion failure caused by v1i64 in DAGCombiner Shrink. llvm-svn: 209798	2014-05-29 09:19:07 +00:00
Michael J. Spencer	f375d80635	[x86] Fold extract_vector_elt of a load into the Load's address computation. An address only use of an extract element of a load can be simplified to a load. Without this the result of the extract element is spilled to the stack so that an address is available. llvm-svn: 209788	2014-05-29 01:42:45 +00:00
Matt Arsenault	3ee3746374	Fix wrong setcc result type when legalizing uaddo/usubo No test because no in-tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. Patch by Ke Bai llvm-svn: 209771	2014-05-28 20:51:42 +00:00
Rafael Espindola	59f7eba2b5	[pr19844] Add thread local mode to aliases. This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759	2014-05-28 18:15:43 +00:00
Hal Finkel	2c77fe59d9	Revert "[DAGCombiner] Split up an indexed load if only the base pointer value is live" This reverts r208640 (I've just XFAILed the test) because it broke ppc64/Linux self-hosting. Because nearly every regression test triggers a segfault, I hope this will be easy to fix. llvm-svn: 209747	2014-05-28 15:33:19 +00:00
Tim Northover	4f1909f1da	ARM: teach AAPCS-VFP to deal with Cortex-M4. Cortex-M4 only has single-precision floating point support, so any LLVM "double" type will have been split into 2 i32s by now. Fortunately, the consecutive-register framework turns out to be precisely what's needed to reconstruct the double and follow AAPCS-VFP correctly! rdar://problem/17012966 llvm-svn: 209650	2014-05-27 10:43:38 +00:00
Benjamin Kramer	7bd6bee385	Legalizer: Make bswap promotion safe for vectors. llvm-svn: 209202	2014-05-20 09:42:31 +00:00
Benjamin Kramer	f3ad23551d	SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. llvm-svn: 209123	2014-05-19 13:12:38 +00:00
Saleem Abdulrasool	f3a5a5c546	Target: remove old constructors for CallLoweringInfo This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082	2014-05-17 21:50:17 +00:00
Saleem Abdulrasool	9f664c1083	Target: change member from reference to pointer This is a preliminary step to help ease the construction of CallLoweringInfo. Changing the construction to a chained function pattern requires that the parameter be nullable. However, rather than copying the vector, save a pointer rather than the reference to permit a late binding of the arguments. llvm-svn: 209080	2014-05-17 21:50:01 +00:00
Rafael Espindola	e0098928c9	Delete getAliasedGlobal. llvm-svn: 209040	2014-05-16 22:37:03 +00:00
Jay Foad	5a29c367f7	Instead of littering asserts throughout the code after every call to computeKnownBits, consolidate them into one assert at the end of computeKnownBits itself. llvm-svn: 208876	2014-05-15 12:12:55 +00:00
Jay Foad	a0653a3e6c	Rename ComputeMaskedBits to computeKnownBits. "Masked" has been inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811	2014-05-14 21:14:37 +00:00
Jay Foad	e48d9e8efe	Update the comments for ComputeMaskedBits, which lost its Mask parameter in r154011. llvm-svn: 208757	2014-05-14 08:00:07 +00:00
Pete Cooper	7fd1d725b9	Use a logical not when inverting SetCC. This unfortunately doesn't fire on any targets so I couldn't find a test case to trigger it. The problem occurs when a non-i1 setcc is inverted. For example 'i8 = setcc' will get 'xor 0xff' to invert this. This is clearly wrong when the boolean contents are ZeroOrOne. This patch introduces getLogicalNOT and updates SetCC legalisation to use it. Reviewed by Hal Finkel. llvm-svn: 208641	2014-05-12 23:26:58 +00:00
Adam Nemet	5d78558c2b	[DAGCombiner] Split up an indexed load if only the base pointer value is live Right now the load may not get DCE'd because of the side-effect of updating the base pointer. This can happen if we lower a read-modify-write of an illegal larger type (e.g. i48) such that the modification only affects one of the subparts (the lower i32 part but not the higher i16 part). See the testcase. In order to spot the dead load we need to revisit it when SimplifyDemandedBits decided that the value of the load is masked off. This is the CommitTargetLoweringOpt piece. I checked compile time with ARM64 by sending SPEC bitcode files through llc. No measurable change. Fixes <rdar://problem/16031651> llvm-svn: 208640	2014-05-12 23:00:03 +00:00
Matt Arsenault	2adca6090f	Make SimplifyDemandedBits understand BUILD_PAIR llvm-svn: 208598	2014-05-12 17:14:48 +00:00
Hal Finkel	f0e086a0bc	Pass the value type to TLI::getRegisterByName We must validate the value type in TLI::getRegisterByName, because if we don't and the wrong type was used with the IR intrinsic, then we'll assert (because we won't be able to find a valid register class with which to construct the requested copy operation). For PPC64, additionally, the type information is necessary to decide between the 64-bit register and the 32-bit subregister. No functionality change. llvm-svn: 208508	2014-05-11 19:29:07 +00:00
Oliver Stannard	c24f2171ca	ARM: HFAs must be passed in consecutive registers When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must be passed in a block of consecutive floating-point registers, or on the stack. This means that unused floating-point registers cannot be back-filled with part of an HFA, however this can currently happen. This patch, along with the corresponding clang patch (http://reviews.llvm.org/D3083) prevents this. llvm-svn: 208413	2014-05-09 14:01:47 +00:00
Matt Arsenault	5f2fd4b22a	Fix using wrong result type for setcc. When reducing the bitwidth of a comparison against a constant, the original setcc's result type was used, which was incorrect. No test since I don't think any other in tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. llvm-svn: 208236	2014-05-07 18:26:58 +00:00
Renato Golin	c7aea40ec6	Implememting named register intrinsics This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104	2014-05-06 16:51:25 +00:00
Benjamin Kramer	6dd9f8feb3	Satisfy GCC's urgent need for parentheses around ‘&&’ within ‘\|\|’. llvm-svn: 207871	2014-05-02 21:28:49 +00:00
Tim Northover	820e041a3c	DAGCombine: prevent formation of illegal ConstantFP nodes. llvm-svn: 207850	2014-05-02 17:25:02 +00:00
Benjamin Kramer	42d262f410	Allow SelectionDAG::FoldConstantArithmetic to work when it's called with a vector VT but scalar values. llvm-svn: 207835	2014-05-02 12:35:22 +00:00
Alexey Samsonov	f74bde6735	Convert more loops to range-based equivalents llvm-svn: 207714	2014-04-30 22:17:38 +00:00
Weiming Zhao	7f6daf1799	[ARM64] Prevent bit extraction to be adjusted by following shift For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx more difficult. For example: Given %shr = lshr i64 %x, 4 %and = and i64 %shr, 15 %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and %0 = load i64* %arrayidx With current shift folding, it takes 3 instrs to compute base address: lsr x8, x0, #1 and x8, x8, #0x78 add x8, x9, x8 If using ubfx, it only needs 2 instrs: ubfx x8, x0, #4, #4 add x8, x9, x8, lsl #3 This fixes bug 19589 llvm-svn: 207702	2014-04-30 21:07:24 +00:00
Craig Topper	2d2aa0ca1f	Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently. llvm-svn: 207616	2014-04-30 07:17:30 +00:00
Jim Grosbach	2eb60fdc85	Tidy up whitespace. llvm-svn: 207583	2014-04-29 22:41:50 +00:00
Craig Topper	9d74a5a5f1	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. llvm-svn: 207511	2014-04-29 07:58:41 +00:00
Eric Christopher	83dd2fad2a	We already calculate WideVT above, just reuse it. Patch by Jan Vesely <jan.vesely@rutgers.edu>. llvm-svn: 207455	2014-04-28 22:24:57 +00:00
Craig Topper	8c0b4d0791	Convert more SelectionDAG functions to use ArrayRef. llvm-svn: 207397	2014-04-28 05:57:50 +00:00
Craig Topper	633d99b62d	Convert AddNodeIDNode and SelectionDAG::getNodeIfExiists to use ArrayRef<SDValue> llvm-svn: 207383	2014-04-27 23:22:43 +00:00
Craig Topper	b2ba83cd30	Convert SelectionDAGISel::MorphNode to use ArrayRef. llvm-svn: 207379	2014-04-27 19:21:20 +00:00
Craig Topper	131de82adb	Convert SelectionDAG::MorphNodeTo to use ArrayRef. llvm-svn: 207378	2014-04-27 19:21:16 +00:00
Craig Topper	481fb2879f	Convert SelectionDAG::SelectNodeTo to use ArrayRef. llvm-svn: 207377	2014-04-27 19:21:11 +00:00
Craig Topper	dd5e16dd34	Convert one last signature of getNode to take an ArrayRef of SDUse. llvm-svn: 207376	2014-04-27 19:21:06 +00:00
Craig Topper	bb5330725e	Convert SDNode constructor to use ArrayRef. llvm-svn: 207375	2014-04-27 19:21:02 +00:00
Craig Topper	64941d9786	Convert SelectionDAG::getMergeValues to use ArrayRef. llvm-svn: 207374	2014-04-27 19:20:57 +00:00
Craig Topper	2d7d6052c6	Const-correct SelectionDAG::getAtomic. llvm-svn: 207373	2014-04-27 19:20:47 +00:00
Benjamin Kramer	6bca8ef667	SelectionDAG: Aggressively fold shuffles of constant splats. llvm-svn: 207352	2014-04-27 11:41:06 +00:00
Benjamin Kramer	da4841b3a9	DAGCombiner: Simplify code a bit, make more transforms work with vectors. llvm-svn: 207338	2014-04-26 23:09:49 +00:00
Craig Topper	206fcd450a	Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size. llvm-svn: 207329	2014-04-26 19:29:41 +00:00
Craig Topper	48d114bed1	Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>. llvm-svn: 207327	2014-04-26 18:35:24 +00:00
Craig Topper	963c5d5ef8	Remove an unused version of getMemIntrinsicNode and getNode. Additionally, these were calling makeVTList with the pointers passed in which would were unlikely to belong to SelectionDAG and likely would have just been stack pointers. llvm-svn: 207326	2014-04-26 18:35:13 +00:00
Benjamin Kramer	ad0168702a	Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors. llvm-svn: 207316	2014-04-26 13:00:53 +00:00
Benjamin Kramer	4dae598bc8	DAGCombiner: Turn divs of vector splats into vectorized multiplications. Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed. llvm-svn: 207315	2014-04-26 12:06:28 +00:00
Juergen Ributzka	a6bda8bae2	[DAG] During DAG legalization keep opaque constants even after expanding. The included test case would return the incorrect results, because the expansion of an shift with a constant shift amount of 0 would generate undefined behavior. This is because ExpandShiftByConstant assumes that all shifts by constants with a value of 0 have already been optimized away. This doesn't happen for opaque constants and usually this isn't a problem, because opaque constants won't take this code path - they are not supposed to. In the case that the opaque constant has to be expanded by the legalizer, the legalizer would drop the opaque flag. In this case we hit the limitations of ExpandShiftByConstant and create incorrect code. This commit fixes the legalizer by not dropping the opaque flag when expanding opaque constants and adding an assertion to ExpandShiftByConstant to catch this not supported case in the future. This fixes <rdar://problem/16718472> llvm-svn: 207304	2014-04-26 02:58:04 +00:00
Adrian Prantl	32da88923a	This reapplies r207235 with an additional bugfixes caught by the msan buildbot - do not insert debug intrinsics before phi nodes. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207269	2014-04-25 20:49:25 +00:00
Adrian Prantl	d2d9b76e48	Revert "This reapplies r207130 with an additional testcase+and a missing check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250	2014-04-25 18:18:09 +00:00
Adrian Prantl	f5834a4b49	This reapplies r207130 with an additional testcase+and a missing check for AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207235	2014-04-25 17:01:00 +00:00
Adrian Prantl	6e5de2ea06	Revert "This reapplies r207130 with an additional testcase+and a missing check for" Typo in testcase. llvm-svn: 207166	2014-04-25 00:42:50 +00:00
Adrian Prantl	3512190ab3	This reapplies r207130 with an additional testcase+and a missing check for AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207165	2014-04-25 00:38:40 +00:00
Adrian Prantl	ff4282a204	Revert "Debug info for optimized code: Support variables that are on the stack and" This reverts commit 207130 for buildbot breakage. llvm-svn: 207162	2014-04-25 00:04:49 +00:00
Adrian Prantl	f4223918de	Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine-intrinsics testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207130	2014-04-24 17:41:45 +00:00
Hao Liu	c636d15284	Fix an infinite loop bug in DAG Combine about keeping transfering between ANY_EXTEND and SIGN_EXTEND. llvm-svn: 206873	2014-04-22 09:57:06 +00:00
Chandler Carruth	1b9dde087e	[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837	2014-04-22 02:02:50 +00:00
Chandler Carruth	e96dd8975f	[Modules] Make Support/Debug.h modular. This requires it to not change behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro after header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to suppor headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822	2014-04-21 22:55:11 +00:00
Chandler Carruth	6d23a7b600	[Modules] Sink the DEBUG_TYPE macro out of LegalizeTypes.h and into the various .cpp files. This macro is inherently non-modular, and it wasn't even needed in this header file. llvm-svn: 206775	2014-04-21 19:43:07 +00:00
Matt Arsenault	443252c011	Fix unnecessary line break llvm-svn: 206772	2014-04-21 18:39:13 +00:00
Yaron Keren	d7ba46b287	Patch by Vadim Chugunov Win64 stack unwinder gets confused when execution flow "falls through" after a call to 'noreturn' function. This fixes the "missing epilogue" problem by emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows. A secondary use for it would be for anyone wanting to make double-sure that 'noreturn' functions, indeed, do not return. llvm-svn: 206684	2014-04-19 13:47:43 +00:00
Tim Northover	863a789a99	DAGCombiner: don't optimise non-existant litpool load This particular DAG combine is designed to kick in when both ConstantFPs will end up being loaded via a litpool, however those nodes have a semi-legal status, dictated by isFPImmLegal so in some cases there wouldn't have been a litpool in the first place. Don't try to be clever in those circumstances. Picked up while merging some AArch64 tests. llvm-svn: 206365	2014-04-16 09:03:09 +00:00
Craig Topper	abb4ac7f87	Convert SelectionDAG::getVTList to use ArrayRef llvm-svn: 206357	2014-04-16 06:10:51 +00:00
Craig Topper	ada0857679	[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. llvm-svn: 206356	2014-04-16 04:21:27 +00:00
Akira Hatanaka	3d90f99d1a	Make FastISel::SelectInstruction return before target specific fast-isel code handles Intrinsic::trap if TargetOptions::TrapFuncName is set. This fixes a bug in which the trap function was not taken into consideration when a program was compiled without optimization (at -O0). <rdar://problem/16291933> llvm-svn: 206323	2014-04-15 21:30:06 +00:00
Robert Lougher	a9bf2463b9	Revert r191049/r191059 as it can produce wrong code (see PR17975). It has already been reverted on the 3.4 branch in r196521. llvm-svn: 206311	2014-04-15 18:34:24 +00:00
Tim Northover	2f553f326a	FastISel: constrain the RegClass of operands when emitting instructions. ARM64 suffered multiple -verify-machineinstr failures (principally over the xsp/xzr issue) because FastISel was completely ignoring which subset of the general-purpose registers each instruction required. More fixes are coming in ARM64 specific FastISel, but this should cover the generic problems. llvm-svn: 206283	2014-04-15 13:59:49 +00:00
Nick Lewycky	aad475b324	Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead. llvm-svn: 206255	2014-04-15 07:22:52 +00:00
Craig Topper	c0196b1b40	[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. llvm-svn: 206142	2014-04-14 00:51:57 +00:00
Hal Finkel	3b48d08f54	Reenable use of TBAA during CodeGen We had disabled use of TBAA during CodeGen (even when otherwise using AA) because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to miss basic type punning that it should catch (and, thus, we'd fail to override TBAA when we should). However, when AA is in use during CodeGen, CGP now uses normal GEPs and bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a result, BasicAA should be able to make us do the right thing in the face of type-punning, and it seems safe to enable use of TBAA again. self-hosting seems fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle. Note: We still don't update TBAA when merging stack slots, although because BasicAA should now catch all such cases, this is no longer a blocking issue. Nevertheless, I plan to commit code to deal with this properly in the near future. llvm-svn: 206093	2014-04-12 01:26:00 +00:00
Matt Arsenault	9ec3cf2c8a	Move ExtractVectorElements to SelectionDAG. This seems generally useful, and makes sense to go along with SplitVector. llvm-svn: 206041	2014-04-11 17:47:30 +00:00
Tom Stellard	a1a5d9aa2e	SelectionDAG: Use helper function to improve legalization of ISD::MUL The TargetLowering::expandMUL() helper contains lowering code extracted from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more ISD::MUL patterns without having to use a library call. llvm-svn: 206037	2014-04-11 16:12:01 +00:00
Tom Stellard	b3a7fa2d17	SelectionDAG: Factor ISD::MUL lowering code out of DAGTypeLegalizer This code has been moved to a new function in the TargetLowering class called expandMUL(). The purpose of this is to be able to share lowering code between the SelectionDAGLegalize and DAGTypeLegalizer classes. No functionality changed intended. llvm-svn: 206036	2014-04-11 16:11:58 +00:00
Jim Grosbach	5d049b9732	[c++11] Range'ify use list loops in InstrEmitter. llvm-svn: 206015	2014-04-11 01:13:16 +00:00
Jim Grosbach	e816003d3f	[c++11] Range'ify use list loops in DAGCombiner. llvm-svn: 206014	2014-04-11 01:13:13 +00:00
Jim Grosbach	cad4cd6c9e	SelectionDAG: Don't constant fold target-specific nodes. FoldConstantArithmetic() only knows how to deal with a few target independent ISD opcodes. Bail early if it sees a target-specific ISD node. These node do funny things with operand types which may break the assumptions of the code that follows, and there's no actual folding that can be done anyway. For example, non-constant 256 bit vector shifts on X86 have a shift-amount operand that's a 128-bit v4i32 vector regardless of what the first operand type is and that breaks the assumption that the operand types must match. rdar://16530923 llvm-svn: 205937	2014-04-09 23:28:11 +00:00
Quentin Colombet	0b1a5584d6	[DAGCombiner] DAG combine does not know how to combine indexed loads with sign/zero/any extensions. However a few places were not checking properly the property of the load and were turning an indexed load into a regular extended load. Therefore the indexed value was lost during the process and this was triggering an assertion. <rdar://problem/16389332> llvm-svn: 205923	2014-04-09 20:03:05 +00:00
Matt Arsenault	aaf9623d55	Bug 19348: Check for legal ExtLoad operation before folding (aext (zextload x)) -> (aext (truncate (*extload x))) Patch by Stanislav Mekhanoshin! llvm-svn: 205805	2014-04-08 21:40:37 +00:00
Andrew Trick	8d007bb5d4	Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing up compile time. Fixes PR16365 - Extremely slow compilation in -O1 and -O2. The SD scheduler has a quadratic implementation of load clustering which absolutely blows up compile time for large blocks with constant pool loads. The MI scheduler has a better implementation of load clustering. However, we have not done the work yet to completely eliminate the SD scheduler. Some benchmarks still seem to benefit from early load clustering, although maybe by chance. As an intermediate term fix, I just put a nice limit on the number of DAG users to search before finding a match. With this limit there are no binary differences in the LLVM test suite, and the PR16365 test case does not suffer any compile time impact from this routine. llvm-svn: 205738	2014-04-07 21:29:22 +00:00
Matt Arsenault	cf6f688a40	Add DAG parameter to ComputeNumSignBitsForTargetNode This way, you can check the number of sign bits in the operands. The depth parameter it already has is pretty useless without this. llvm-svn: 205649	2014-04-04 20:13:13 +00:00
Tim Northover	0e5eaae1cb	DAGLegalize: add last-ditch type-legalization for VSELECT. When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide that the result is OK (v1i64 is legal on AArch64, for example) but it still need scalarising because of that v1i1. There was no code to do this though. AArch64 and ARM64 have DAG combines to produce efficient code and prevent that occuring in most such situations, but there are edge cases that they miss. This adds a legalization to cope with that. llvm-svn: 205626	2014-04-04 14:49:30 +00:00
Tim Northover	07a8ff4892	ARM64: handle v1i1 types arising from setcc properly. There were several overlapping problems here, and this solution is closely inspired by the one adopted in AArch64 in r201381. Firstly, scalarisation of v1i1 setcc operations simply fails if the input types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and allows AArch64 code to be simplified slightly. Second, vselect with such a setcc feeding into it ends up in ScalarizeVectorOperand, where it's not handled. I experimented with an implementation, but found that whatever DAG came out was rather horrific. I think Hao's DAG combine approach is a good one for quality, though there are edge cases it won't catch (to be fixed separately). Should fix PR19335. llvm-svn: 205625	2014-04-04 14:49:21 +00:00
Craig Topper	840beec2d0	Make consistent use of MCPhysReg instead of uint16_t throughout the tree. llvm-svn: 205610	2014-04-04 05:16:06 +00:00
Eric Christopher	bfb38badc1	Fix for PR 19261: llc doesn't generate nodes for unconditional fall-through branches for targets without FastISel implementation (X86 has it, but can be disabled by "-fast-isel=false") in SelectionDAGBuilder::visitBr(). So for line 4 in the following testcase 1: void foo(int i){ 2: switch(i){ 3: default: 4: break; 5: } 6: return; 7: } there is no corresponding line in .debug_line section, and a debugger cannot set a breakpoint at line 4. Fix this by always emitting a branch when we're not optimizing and add a testcase to ensure that there's code on every line we'd want to break. Patch by Daniil Fukalov. llvm-svn: 205529	2014-04-03 12:11:51 +00:00
Juergen Ributzka	fcd2e94ecc	Add comments and test case for [DAG] Keep the opaque constant flag when performing unary constant folding operations (r204737). llvm-svn: 205474	2014-04-02 22:21:01 +00:00
Matt Arsenault	e407ae9846	Make isSetCCEquivalent respect the TargetBooleanContents llvm-svn: 205336	2014-04-01 18:13:26 +00:00
Matt Arsenault	6310c3f667	Add helpers for checking if a value is a target boolean constant. llvm-svn: 205335	2014-04-01 18:13:22 +00:00
Hal Finkel	b811b6d0d1	Add an optional ability to expand larger BUILD_VECTORs with shuffles This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit of the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243	2014-03-31 19:42:55 +00:00
Hal Finkel	1977514287	Add a TLI hook to control when BUILD_VECTOR might be expanded using shuffles There are two general methods for expanding a BUILD_VECTOR node: 1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them together. 2. Build the vector on the stack and then load it. Currently, we use a fixed heuristic: If there are only one or two unique defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and vector shuffles (provided that the required shuffle mask is legal). Otherwise, always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this can still be a good idea depending on what tricks the target can play when lowering the resulting shuffle. If the target can't do anything special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic leads to sub-optimal code (two stack loads instead of one). Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a build vector of a particular type are likely to be optimial, this adds a new TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type and the count of unique defined values. If this function returns true, then method (1) will be used, subject to the constraint that all of the necessary shuffles are legal (as determined by isShuffleMaskLegal). If this function returns false, then method (2) is always used. This commit does not enhance the current code to support expanding a build_vector with more than two unique values using shuffles, but I'll commit an implementation of the more-general case shortly. llvm-svn: 205230	2014-03-31 17:48:10 +00:00
Hal Finkel	02807595fb	Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELT When the loop vectorizer vectorizes code that uses the loop induction variable, we often end up with IR like this: %b1 = insertelement <2 x i32> undef, i32 %v, i32 0 %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer %i = add <2 x i32> %b2, <i32 2, i32 3> If the add in this example is not legal (as is the case on PPC with VSX), it will be scalarized, and we'll end up with a number of extract_vector_elt nodes with the vector shuffle as the input operand, and that vector shuffle is fed by one or more build_vector nodes. By the time that vector operations are expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by looking through the vector shuffle (to make sure that no illegal operations are created), and so the extract_vector_elt -> vector shuffle -> build_vector is never simplified to an operand of the build vector. By looking at build_vectors through a shuffle we fix this particular situation, preventing a vector from being built, only to be deconstructed again (for the scalarized add) -- an expensive proposition when this all needs to be done via the stack. We probably want a more comprehensive fix here where we look back recursively through any shuffles to any build_vectors or scalar_to_vectors, etc. but that can come later. llvm-svn: 205179	2014-03-31 11:43:19 +00:00
Hal Finkel	90adf0fe06	Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. llvm-svn: 205153	2014-03-30 15:10:18 +00:00
Benjamin Kramer	fd719b9551	Avoid storing Twines. While there nested ifs into a helper function. No functionality change. llvm-svn: 205108	2014-03-29 16:54:29 +00:00
Rafael Espindola	24a669d225	Prevent alias from pointing to weak aliases. This adds back r204781. Original message: Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934	2014-03-27 15:26:56 +00:00
Renato Golin	c0a3c1d66b	Add @llvm.clear_cache builtin Implementing the LLVM part of the call to __builtin___clear_cache which translates into an intrinsic @llvm.clear_cache and is lowered by each target, either to a call to __clear_cache or nothing at all incase the caches are unified. Updating LangRef and adding some tests for the implemented architectures. Other archs will have to implement the method in case this builtin has to be compiled for it, since the default behaviour is to bail unimplemented. A Clang patch is required for the builtin to be lowered into the llvm intrinsic. This will be done next. llvm-svn: 204802	2014-03-26 12:52:28 +00:00
Rafael Espindola	65481d7b97	Revert "Prevent alias from pointing to weak aliases." This reverts commit r204781. I will follow up to with msan folks to see what is what they were trying to do with aliases to weak aliases. llvm-svn: 204784	2014-03-26 06:14:40 +00:00
Rafael Espindola	3b712a84a9	Prevent alias from pointing to weak aliases. Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204781	2014-03-26 04:48:47 +00:00
Juergen Ributzka	e2e16844f5	[DAG] Keep the opaque constant flag when performing unary constant folding operations. Usually opaque constants shouldn't be folded, unless they are simple unary operations that don't create new constants. Although this shouldn't drop the opaque constant flag. This commit fixes this. Related to <rdar://problem/14774662> llvm-svn: 204737	2014-03-25 18:01:20 +00:00
Matt Arsenault	b22426c510	Fix creating illegal setcc cond codes. If GT/UGT or LT/ULT were set to expand, a comparison with a constant would replace it with the illegal cond code. There are several more places later in this function that will have the same basic problem. Theoretically R600 should hit this problem for a test, but for some reason it doesn't. llvm-svn: 204727	2014-03-25 16:09:21 +00:00
Tom Stellard	c9a67a2b6d	SelectionDAG: Allow promotion of SELECT nodes from float to int types And vice-versa, as long as the types are the same width. There are a few R600 tests that will cover this. llvm-svn: 204616	2014-03-24 16:07:28 +00:00
Nuno Lopes	31617266ea	remove a bunch of unused private methods found with a smarter version of -Wunused-member-function that I'm playwing with. Appologies in advance if I removed someone's WIP code. include/llvm/CodeGen/MachineSSAUpdater.h \| 1 include/llvm/IR/DebugInfo.h \| 3 lib/CodeGen/MachineSSAUpdater.cpp \| 10 -- lib/CodeGen/PostRASchedulerList.cpp \| 1 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp \| 10 -- lib/IR/DebugInfo.cpp \| 12 -- lib/MC/MCAsmStreamer.cpp \| 2 lib/Support/YAMLParser.cpp \| 39 --------- lib/TableGen/TGParser.cpp \| 16 --- lib/TableGen/TGParser.h \| 1 lib/Target/AArch64/AArch64TargetTransformInfo.cpp \| 9 -- lib/Target/ARM/ARMCodeEmitter.cpp \| 12 -- lib/Target/ARM/ARMFastISel.cpp \| 84 -------------------- lib/Target/Mips/MipsCodeEmitter.cpp \| 11 -- lib/Target/Mips/MipsConstantIslandPass.cpp \| 12 -- lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp \| 21 ----- lib/Target/NVPTX/NVPTXISelDAGToDAG.h \| 2 lib/Target/PowerPC/PPCFastISel.cpp \| 1 lib/Transforms/Instrumentation/AddressSanitizer.cpp \| 2 lib/Transforms/Instrumentation/BoundsChecking.cpp \| 2 lib/Transforms/Instrumentation/MemorySanitizer.cpp \| 1 lib/Transforms/Scalar/LoopIdiomRecognize.cpp \| 8 - lib/Transforms/Scalar/SCCP.cpp \| 1 utils/TableGen/CodeEmitterGen.cpp \| 2 24 files changed, 2 insertions(+), 261 deletions(-) llvm-svn: 204560	2014-03-23 17:09:26 +00:00
Andrea Di Biagio	5b0aacf1c7	[DAG] Fix an assertion failure caused by an invalid cast in method 'BuildVectorSDNode::isConstantSplat' This patch renames method 'isConstantSplat' as 'getConstantSplatValue' (mainly for consistency reasons), and rewrites its logic to ensure that we always perform a legal 'cast<ConstantSDNode>'. Added test shift-combine-crash.ll to verify that DAGCombiner no longer crashes with an assertion failure in the attempt to simplify a vector shift by a vector of all undef counts. llvm-svn: 204536	2014-03-22 01:47:22 +00:00
Kevin Qin	275ce91243	Fix an assertion caused by using inline asm with indirect register inputs. llvm-svn: 204425	2014-03-21 02:14:50 +00:00
Raul E. Silvera	a9dafe6793	Add support for scalarizing/splitting vector bswap. Summary: SLP Vectorization of intrinsics (r203707) has exposed cases where the expansion of vector bswap is failing (PR19151). Reviewers: hfinkel CC: chandlerc Differential Revision: http://llvm-reviews.chandlerc.com/D3104 llvm-svn: 204163	2014-03-18 17:49:12 +00:00
Andrea Di Biagio	28f46d9f39	[DAGCombiner] teach how to simplify xor/and/or nodes according to the following rules: 1) (AND (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (AND (A, B), C, Mask) 2) (OR (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (OR (A, B), C, Mask) 3) (XOR (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (XOR (A, B), V_0, Mask) 4) (AND (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (C, AND (A, B), Mask) 5) (OR (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (C, OR (A, B), Mask) 6) (XOR (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (V_0, XOR (A, B), Mask) llvm-svn: 204160	2014-03-18 17:12:59 +00:00
Matt Arsenault	985b9de485	Make DAGCombiner work on vector bitshifts with constant splat vectors. llvm-svn: 204071	2014-03-17 18:58:01 +00:00
Adam Nemet	24381f1cb7	[VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16 Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get promoted to fp_to_sint v8f32->v8i32. This is a legal operation on AVX. For that to work properly, we also need to teach the legalizer about the specific promotion required here. The default vector promotion uses bitcasting to a vector type of the same total size. We want to promote the vector element type, effectively widening the operation and then truncating the result. This is analogous to the current logic of how int_to_fp is promoted. The change also factors out some code from the int_to_fp promotion code to ValueType::widenIntegerVectorElementType. This is now shared between int_to_fp and fp_to_int. There is no longer need for the custom lowering of fp_to_sint f32->v8i16 in X86. It can now go through the new target-independent fp_to_*int promotion logic. I also checked that no other target uses Promote for these ops yet, so there shouldn't be any unexpected change in behavior. Fixes <rdar://problem/16202247> llvm-svn: 204058	2014-03-17 17:06:14 +00:00
Owen Anderson	16c6bf49b7	Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865	2014-03-13 23:12:04 +00:00
Owen Anderson	abb90c9ddb	Phase 1 of refactoring the MachineRegisterInfo iterators to make them suitable for use with C++11 range-based for-loops. The gist of phase 1 is to remove the skipInstruction() and skipBundle() methods from these iterators, instead splitting each iterator into a version that walks operands, a version that walks instructions, and a version that walks bundles. This has the result of making some "clever" loops in lib/CodeGen more verbose, but also makes their iterator invalidation characteristics much more obvious to the casual reader. (Making them concise again in the future is a good motivating case for a pre-incrementing range adapter!) Phase 2 of this undertaking with consist of removing the getOperand() method, and changing operator*() of the operand-walker to return a MachineOperand&. At that point, it should be possible to add range views for them that work as one might expect. llvm-svn: 203757	2014-03-13 06:02:25 +00:00
Patrik Hagglund	1da3512166	Replace '#include ValueTypes.h' with forward declarations. In some cases the include is pushed "downstream" (or removed if unused). llvm-svn: 203644	2014-03-12 08:00:24 +00:00
Benjamin Kramer	f8502272ef	Remove copy ctors that did the same thing as the default one. The code added nothing but potentially disabled move semantics and made types non-trivially copyable. llvm-svn: 203563	2014-03-11 11:32:49 +00:00
Tim Northover	e94a518a22	IR: add a second ordering operand to cmpxhg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559	2014-03-11 10:48:52 +00:00
Matt Arsenault	95b714c749	Fix non 2-space indentation. llvm-svn: 203514	2014-03-11 00:01:25 +00:00
Chandler Carruth	cdf4788401	[C++11] Add range based accessors for the Use-Def chain of a Value. This requires a number of steps. 1) Move value_use_iterator into the Value class as an implementation detail 2) Change it to actually be a Use iterator rather than a User iterator. 3) Add an adaptor which is a User iterator that always looks through the Use to the User. 4) Wrap these in Value::use_iterator and Value::user_iterator typedefs. 5) Add the range adaptors as Value::uses() and Value::users(). 6) Update all of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque. Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lies of code. The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over Users rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of Ts, but that can be separated into another patch, and it isn't yet 100% clear whether this is the right move. However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =] llvm-svn: 203364	2014-03-09 03:16:01 +00:00
Craig Topper	7b883b314c	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203339	2014-03-08 06:31:39 +00:00
Adam Nemet	7f928f14f4	[DAGCombiner] Distribute TRUNC through AND in rotation amount This is already done for shifts. Allow it for rotations as well. E.g.: (rotl:i32 x, (trunc (and y, 31))) -> (rotl:i32 x, (and (trunc y), 31)) Use the newly factored-out distributeTruncateThroughAnd. With this patch and some X86.td tweaks we should be able to remove redundant masking of the rotation amount like in the example above. HW implicitly performs this masking. The testcase will be added as part of the X86 patch. llvm-svn: 203316	2014-03-07 23:56:30 +00:00
Adam Nemet	5117f5dffc	[DAGCombiner] Recognize another rotation idiom This is the new idiom: x<<(y&31) \| x>>((0-y)&31) which is recognized as: x ROTL (y&31) The change refines matchRotateSub. In Neg & (OpSize - 1) == (OpSize - Pos) & (OpSize - 1), if Pos is Pos' & (OpSize - 1) we can just use Pos' instead of Pos. llvm-svn: 203315	2014-03-07 23:56:28 +00:00
Adam Nemet	c6553a8354	[DAGCombiner] Slightly improve readability of matchRotateSub Slightly change the wording in the function comment. Originally, it can be misunderstood as we turned the input into two subsequent rotates. Better connect the comment which talks about Mask and the code which used LoBits. Renamed variable to MaskLoBits. llvm-svn: 203314	2014-03-07 23:56:24 +00:00
Arnold Schwaighofer	d33e942958	ISel: Make VSELECT selection terminate in cases where the condition type has to be split and the result type widened. When the condition of a vselect has to be split it makes no sense widening the vselect and thereby widening the condition. We end up in an endless loop of widening (vselect result type) and splitting (condition mask type) doing this. Instead, split both the condition and the vselect and widen the result. I ran this over the test suite with i686 and mattr=+sse and saw no regressions. Fixes PR18036. llvm-svn: 203311	2014-03-07 23:25:55 +00:00
Andrea Di Biagio	6292a140ee	[X86] Teach the DAGCombiner how to fold a OR of two shufflevector nodes. This patch teaches the DAGCombiner how to fold a binary OR between two shufflevector into a single shuffle vector when possible. The rules are: 1. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask1) 2. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf B, A, Mask2) The DAGCombiner can take advantage of the fact that OR is commutative and compute two possible shuffle masks (Mask1 and Mask2) for the resulting shuffle node. Before folding a dag according to either rule 1 or 2, DAGCombiner verifies that the resulting shuffle mask is legal for the target. DAGCombiner would firstly try to fold according to 1.; If not possible then it will try to fold according to 2. If both Mask1 and Mask2 are illegal then we conservatively don't fold the OR instruction. llvm-svn: 203156	2014-03-06 20:19:52 +00:00
Matt Arsenault	f9a995d68c	R600: Fix extloads from i8 / i16 to i64. This appears to only be working for global loads. Private and local break for other reasons. llvm-svn: 203135	2014-03-06 17:34:12 +00:00
Chandler Carruth	9a4c9e597b	[Layering] Move DebugInfo.h into the IR library where its implementation already lives. llvm-svn: 203046	2014-03-06 00:46:21 +00:00
Chandler Carruth	9205140772	[Layering] Move DebugLoc.h into the IR library. The implementation already lived there and it is where it belongs -- this is the in-memory debug location representation. This is just cleanup -- Modules can actually cope with this, but that doesn't make it right. After chatting with folks that have out-of-tree stuff, going ahead and moving the rest of the headers seems preferable. llvm-svn: 202960	2014-03-05 10:30:38 +00:00
Andrew Trick	fbb278c541	Make stackmap machineinstrs clobber the scratch regs too. Patchpoints already did this. Doing it for stackmaps is a convenience for the runtime in the event that it needs to scratch register to patch or perform a runtime call thunk. Unlike patchpoints, we just assume the AnyRegCC calling convention. This is the only language and target independent calling convention specific to stackmaps so makes sense. Although the calling convention is not currently used to select the scratch registers. llvm-svn: 202943	2014-03-05 07:08:16 +00:00
Hans Wennborg	0c72fd2b4e	Fix unused variable in FunctionLoweringInfo.cpp llvm-svn: 202932	2014-03-05 03:21:23 +00:00
Hans Wennborg	acb842d523	Check for dynamic allocas and inline asm that clobbers sp before building selection dag (PR19012) In X86SelectionDagInfo::EmitTargetCodeForMemcpy we check with MachineFrameInfo to make sure that ESI isn't used as a base pointer register before we choose to emit rep movs (which clobbers esi). The problem is that MachineFrameInfo wouldn't know about dynamic allocas or inline asm that clobbers the stack pointer until SelectionDAGBuilder has encountered them. This patch fixes the problem by checking for such things when building the FunctionLoweringInfo. Differential Revision: http://llvm-reviews.chandlerc.com/D2954 llvm-svn: 202930	2014-03-05 02:43:26 +00:00
Adam Nemet	67483897a5	[DAGCombiner] Factor out distributeTruncateThroughAnd Currently this code is duplicated across visitSHL, visitSRA and visitSRL. The plan is to add rotates as clients to this new function. There is no functional change intended here. llvm-svn: 202908	2014-03-04 23:28:31 +00:00
Chandler Carruth	219b89b987	[Modules] Move CallSite into the IR library where it belogs. It is abstracting between a CallInst and an InvokeInst, both of which are IR concepts. llvm-svn: 202816	2014-03-04 11:01:28 +00:00
Benjamin Kramer	d6f1f84f51	[C++11] Replace llvm::tie with std::tie. The old implementation is no longer needed in C++11. llvm-svn: 202644	2014-03-02 13:30:33 +00:00
Benjamin Kramer	b6d0bd48bd	[C++11] Replace llvm::next and llvm::prior with std::next and std::prev. Remove the old functions. llvm-svn: 202636	2014-03-02 12:27:27 +00:00
Benjamin Kramer	3a377bce4e	Now that we have C++11, turn simple functors into lambdas and remove a ton of boilerplate. No intended functionality change. llvm-svn: 202588	2014-03-01 11:47:00 +00:00
Hal Finkel	ab51ecd4fc	Fix visitTRUNCATE for legal i1 values This extract-and-trunc vector optimization cannot work for i1 values as currently implemented, and so I'm disabling this for now for i1 values. In the future, this can be fixed properly. Soon I'll commit support for i1 CR bit tracking in the PowerPC backend, and this will be covered by one of the existing regression tests. llvm-svn: 202449	2014-02-28 00:26:45 +00:00
Matt Arsenault	b598f7b869	Add missing const llvm-svn: 202074	2014-02-24 21:01:18 +00:00
Matt Arsenault	58a7639698	Trivial code simplification llvm-svn: 202073	2014-02-24 21:01:15 +00:00
Quentin Colombet	4db08df18e	[DAGCombiner] PCMP* sets its result to all ones or zeros so we can AND with the shifted mask rather than masking and shifting separately. The patch adds this transformation to the DAGCombiner: (shl (and (setcc:i8v16 ...) N01C) N1C) -> (and (setcc:i8v16 ...) N01C<<N1C) <rdar://problem/16054492> Patch by Adam Nemet <anemet@apple.com> llvm-svn: 201906	2014-02-21 23:42:41 +00:00
Rafael Espindola	5f57f462a8	Rename a few more DataLayout variables from TD to DL. llvm-svn: 201870	2014-02-21 18:34:28 +00:00
Rafael Espindola	ea09c595a6	Rename a DebugLoc variable to DbgLoc and a DataLayout to DL. This is quiet a bit less confusing now that TargetData was renamed DataLayout. llvm-svn: 201606	2014-02-18 22:05:46 +00:00
Rafael Espindola	7c68bebb9c	Rename some member variables from TD to DL. TargetData was renamed DataLayout back in r165242. llvm-svn: 201581	2014-02-18 15:33:12 +00:00
Juergen Ributzka	2b97f9b211	[DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder. This fix checks the original LLVM IR node to identify opaque constants by looking for the bitcast-constant pattern. Originally we looked at the generated SDNode, but this might lead to incorrect results. The SDNode could have been generated by an constant expression that was folded to a constant. This fixes <rdar://problem/16050719> llvm-svn: 201291	2014-02-13 04:19:26 +00:00
Juergen Ributzka	d1777cc344	[Stackmaps] Improve the stackmap lowering code in the SelectionDAGBuilder. We are now no longer relying on the target-specific call lowering implementation to lower a stackmap intrinsic call. Instead we perform the call lowering in a target-independent way directly in the stackmap lowering code. This simplifies the code and removes the need to fixup the code after the target-specific call lowering. llvm-svn: 201263	2014-02-12 22:17:13 +00:00
Juergen Ributzka	aa30da30bb	[Stackmaps] Fix the ID type to be i64 also for stackmaps (as we claim in the documenation) The ID type for the stackmap and patchpoint intrinsics are in both cases i64. This fixes an zero extend in the SelectionDAGBuilder that still used i32. This also updates the target independent instructions STACKMAP and PATCHPOINT to use the correct type. llvm-svn: 201262	2014-02-12 22:17:10 +00:00
Robert Lougher	7d9084ffa1	Teach the DAGCombiner how to fold concat_vector nodes when the input is two BUILD_VECTOR nodes, e.g.: (concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4)) -> (BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4) This fixes an issue with AVX, where a sequence was not recognized as a 256-bit vbroadcast due to the concat_vectors. llvm-svn: 201158	2014-02-11 15:42:46 +00:00
Juergen Ributzka	fa0eba6c8b	[DAG] Don't pull the binary operation though the shift if the operands have opaque constants. During DAGCombine visitShiftByConstant assumes that certain binary operations with only constant operands can always be folded successfully. This is no longer true when the constant is opaque. This commit fixes visitShiftByConstant by not performing the optimization for opaque constants. Otherwise we would end up in an infinite DAGCombine loop. llvm-svn: 200900	2014-02-06 04:09:06 +00:00
Matt Arsenault	1b55dd9a81	Pass address space to allowsUnalignedMemoryAccesses llvm-svn: 200888	2014-02-05 23:16:05 +00:00
Matt Arsenault	25793a3f22	Add address space argument to allowsUnalignedMemoryAccess. On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887	2014-02-05 23:15:53 +00:00
Craig Topper	7ca1d18055	Add CheckChildInteger to ISelMatcher operations. Removes nearly 2000 bytes from X86 matcher table. llvm-svn: 200821	2014-02-05 05:44:28 +00:00
Hal Finkel	5c968d9440	Expand vector bswap in LegalizeVectorOps ISD::BSWAP was missing from the list of node types that should be expanded element-wise. llvm-svn: 200705	2014-02-03 17:27:25 +00:00
Reid Kleckner	f5b76518c9	Implement inalloca codegen for x86 with the new inalloca design Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 llvm-svn: 200596	2014-01-31 23:50:57 +00:00
Reid Kleckner	dfbed59cc2	Don't put non-static allocas in the static alloca map Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. llvm-svn: 200593	2014-01-31 23:45:12 +00:00
Manman Ren	413a6cb42b	This patch teaches the DAGCombiner how to fold insert_subvector nodes when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506	2014-01-31 01:10:35 +00:00
Owen Anderson	60a4678c42	DAGCombine should not produce ISD::OR nodes after operation legalization if they're not legal. llvm-svn: 200503	2014-01-31 00:51:43 +00:00
Manman Ren	4ece7452ba	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502	2014-01-31 00:42:44 +00:00
Manman Ren	7407e0e31c	Revert r200431 due to bot failures. llvm-svn: 200434	2014-01-30 00:53:27 +00:00
Manman Ren	104e0c80cc	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. llvm-svn: 200431	2014-01-30 00:24:37 +00:00
Andrea Di Biagio	b6d39afbda	[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313	2014-01-28 12:53:56 +00:00
Juergen Ributzka	659ce00d60	[TLI] Add a new hook to TargetLowering to query the target if a load of a constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271	2014-01-28 01:20:14 +00:00
Matt Arsenault	5f2a92a26c	Fix sext(setcc) -> select_cc using wrong type for setcc. Also update the comment, since it actually produces a select (setcc) instead of select_cc. It was checking and using the setcc result type for the type of the sext, instead of the type of the compared items. In my problem case, the sext was to i32 and was used as the setcc type, but the expected type was i64. No test since I haven't been able to hit the problem with this on any in-tree targets. llvm-svn: 200249	2014-01-27 21:41:54 +00:00
Andrea Di Biagio	f09a357765	[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. llvm-svn: 200234	2014-01-27 18:45:30 +00:00
Stepan Dyatkovskiy	157bb42e27	Fix for PR18102. Issue outcomes from DAGCombiner::MergeConsequtiveStores, more precisely from mem-ops sequence sorting. Consider, how MergeConsequtiveStores works for next example: store i8 1, a[0] store i8 2, a[1] store i8 3, a[1] ; a[1] again. return ; DAG starts here 1. Method will collect all the 3 stores. 2. It sorts them by distance from the base pointer (farthest with highest index). 3. It takes first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction. The point is, we can't determine here which 'store' instruction would be the second after sorting ('store 2' or 'store 3'). It happens that 'store 3' would be the second, and 'store 2' would be the third. So after merging we have the next result: store i16 (1 \| 3 << 8), base ; is a[0] but bit-casted to i16 store i8 2, a[1] So actually we swapped 'store 3' and 'store 2' and got wrong contents in a[1]. Fix: In sort routine just also take into account mem-op sequence number. llvm-svn: 200201	2014-01-27 09:18:31 +00:00
Kevin Qin	fb9871ff50	[AArch64 NEON] Fix pattern match failed on FP_ROUND from v1f128 to v1f64. llvm-svn: 200109	2014-01-26 02:19:35 +00:00
Hal Finkel	dbebb52a2f	Disable the use of TBAA when using AA in CodeGen There are currently two issues, of which I currently know, that prevent TBAA from being correctly usable in CodeGen: 1. Stack coloring does not update TBAA when merging allocas. This is easy enough to fix, but is not the largest problem. 2. CGP inserts ptrtoint/inttoptr pairs when sinking address computations. Because BasicAA does not handle inttoptr, we'll often miss basic type punning idioms that we need to catch so we don't miscompile real-world code (like LLVM). I don't yet have a small test case for this, but this fixes self hosting a non-asserts build of LLVM on PPC64 when using -enable-aa-sched-mi and -misched=shuffle. llvm-svn: 200093	2014-01-25 19:24:54 +00:00
Hal Finkel	9b2617a5a8	Add combiner-aa-only-func (debug only) This option (which is !NDEBUG only) allows restricting the use of alias analysis in DAGCombiner to a specific function. This has proved extremely valuable to isolating bugs related to this feature, and mirrors the misched-only-func option provided by the new instruction scheduler. llvm-svn: 200088	2014-01-25 17:32:39 +00:00
Hal Finkel	5fb07341f1	Improve descriptions of combiner-alias-analysis and combiner-global-alias-analysis llvm-svn: 200087	2014-01-25 17:32:37 +00:00
Juergen Ributzka	f26beda7c7	Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062	2014-01-25 02:02:55 +00:00
Hans Wennborg	4d67a2e85a	Revert "Add Constant Hoisting Pass" (r200034) This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemnted in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058	2014-01-25 01:18:18 +00:00
Juergen Ributzka	4f3df4ad64	Add Constant Hoisting Pass Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034	2014-01-24 20:18:00 +00:00
Hal Finkel	51a9838049	Fix DAGCombiner::GatherAllAliases to account for non-chain dependencies DAGCombiner::GatherAllAliases, which is only used when AA used is enabled during DAGCombine, had a fundamentally incorrect assumption for which this change compensates. GatherAllAliases, which is used to find aliasing predecessor chain nodes (so that a better chain can be selected for a load or store to enable subsequent optimizations) assumed that walking up the chain would always catch all possibly-aliasing loads and stores. This is not true: To really find all aliases, we also need to search for aliases through the value operand of a store, etc. Consider the following situation: Token1 = ... L1 = load Token1, %52 S1 = store Token1, L1, %51 L2 = load Token1, %52+8 S2 = store Token1, L2, %51+8 Token2 = Token(S1, S2) L3 = load Token2, %53 S3 = store Token2, L3, %52 L4 = load Token2, %53+8 S4 = store Token2, L4, %52+8 If we search for aliases of S3 (which loads address %52), and we look only through the chain, then we'll miss the trivial dependence on L1 (which loads from %52). We then might change all loads and stores to use Token1 as their chain operand, which could result in copying %53 into %52 before copying %52 into %51 (which should happen first). The problem is, however, that searching for such data dependencies can become expensive, and the cost is not directly related to the chain depth. Instead, we'll rule out such configurations by insisting that we've visited all chain users (except for users of the original chain, which is not necessary). When doing this, we need to look through nodes we don't care about (otherwise, things like register copies will interfere with trivial use cases). Unfortunately, I don't have a small test case for this problem. Creating the underlying situation is not hard (a pair of memcpys will do it), but arranging for the default instruction schedule to be incorrect is very fragile. This unbreaks self hosting on PPC64 when using -mllvm -combiner-global-alias-analysis -mllvm -combiner-alias-analysis. llvm-svn: 200033	2014-01-24 20:12:02 +00:00
Juergen Ributzka	50e7e80d00	Revert "Add Constant Hoisting Pass" This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024	2014-01-24 18:40:30 +00:00
Hal Finkel	ccc18e1330	Restrict FindBetterChain DAG combines to unindexed nodes These transformations obviously won't work for indexed (pre/post-inc) loads and stores. In practice, I'm not sure there is any benefit to enabling them for indexed nodes because other transformations that these might enable likely also won't handle indexed nodes. I don't have an in-tree test case that hits this problem, but an upcoming bug fix will make it much more likely. llvm-svn: 200023	2014-01-24 18:25:26 +00:00
Juergen Ributzka	38b67d0caf	Add Constant Hoisting Pass This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022	2014-01-24 18:23:08 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Owen Anderson	77e4d44411	Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal. This is a horrible bit of code. We're calling a simplification routine in the middle of type legalization. We tell the simplification routine that it's running after legalization, but some of the types it will encounter will be illegal! The fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated. llvm-svn: 199847	2014-01-22 22:34:17 +00:00
Elena Demikhovsky	9d56f1e0e5	AVX512: combining setcc and zext is wrong on AVX512 because vector compare instruction puts result in mask register. llvm-svn: 199798	2014-01-22 12:26:19 +00:00
Owen Anderson	fb00d5bc7c	Allow SMUL_LOHI and UMUL_LOHI to be narrow to MUL on targets where MUL is Custom rather than Legal. Even if the target is doing some kind of expansion for MUL, it's pretty much guaranteed to be more efficent than whatever it does for SMUL_LOHI or UMUL_LOHI! llvm-svn: 199678	2014-01-20 18:41:34 +00:00
Andrea Di Biagio	d7c03ec348	[DAGCombiner] Fix a wrong check in method SimplifyVBinOp. This fixes a regression intruced by r199135. Revision 199135 tried to simplify part of the logic in method DAGCombiner::SimplifyVBinOp introducing calls to method BuildVectorSDNode::isConstant(). However, that revision wrongly changed the check performed by method SimplifyVBinOp to identify dag nodes that can be folded. Before revision 199135, that method only tried to simplify vector binary operations if both operands were build_vector of Constant/ConstantFP/Undef only. After revision 199135, method SimplifyVBinop tried to simplify also vector binary operations with only one constant operand. This fixes the problem restoring the old behavior of SimplifyVBinOp. llvm-svn: 199328	2014-01-15 19:51:32 +00:00
Jakob Stoklund Olesen	b6b35a4955	Always let value types influence register classes. When creating a virtual register for a def, the value type should be used to pick the register class. If we only use the register class constraint on the instruction, we might pick a too large register class. Some registers can store values of different sizes. For example, the x86 xmm registers can hold f32, f64, and 128-bit vectors. The three different value sizes are represented by register classes with identical register sets: FR32, FR64, and VR128. These register classes have different spill slot sizes, so it is important to use the right one. The register class constraint on an instruction doesn't necessarily care about the size of the value its defining. The value type determines that. This fixes a problem where InstrEmitter was picking 32-bit register classes for 64-bit values on SPARC. llvm-svn: 199187	2014-01-14 06:18:38 +00:00
Juergen Ributzka	6840282c99	[DAG] Refactor ReassociateOps - no functional change intended. llvm-svn: 199146	2014-01-13 21:49:25 +00:00
Juergen Ributzka	7384405f23	[DAG] Teach DAG to also reassociate vector operations This commit teaches DAG to reassociate vector ops, which in turn enables constant folding of vector op chains that appear later on during custom lowering and DAG combine. Reviewed by Andrea Di Biagio llvm-svn: 199135	2014-01-13 20:51:35 +00:00
Andrew Trick	7daf6a45f4	Hide the pre-RA-sched= option. This is a very confusing option for a feature that will go away. -enable-misched is exposed instead to help triage issues with the new scheduler. llvm-svn: 199133	2014-01-13 20:08:27 +00:00
Nico Rieck	b5262d6d8f	Fix non-deterministic SDNodeOrder-dependent codegen Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050	2014-01-12 14:09:17 +00:00
Alp Toker	798060e006	Fix 'ned' typo in doc comment Patch by Jasper Neumann! llvm-svn: 199007	2014-01-11 14:01:43 +00:00
Richard Sandiford	15cfc1c33c	Handle masked rotate amounts At the moment we expect rotates to have the form: (or (shl X, Y), (shr X, Z)) where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C (including 0, the most natural choice). llvm-svn: 198861	2014-01-09 10:56:42 +00:00

... 3 4 5 6 7 ...

6600 Commits