llvm-project

Commit Graph

Author	SHA1	Message	Date
Keno Fischer	9aae445e09	[Utils] Insert DW_OP_bit_piece when only describing part of the variable Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext to find a value to describe the variable (in the expectation that those zext/sext instruction will go away later). However, those values do not cover the entire variable and thus need a DW_OP_bit_piece. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16061 llvm-svn: 257534	2016-01-12 22:46:09 +00:00
Sanjay Patel	53ba88dbb0	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, 0.5) calls Also, propagate the FMF to the newly created sqrt() call. llvm-svn: 257503	2016-01-12 19:06:35 +00:00
Sanjay Patel	a252815bc1	function names start with a lower case letter ; NFC llvm-svn: 257496	2016-01-12 18:03:37 +00:00
Sanjay Patel	6002e78a06	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(exp(x)) calls See also: http://reviews.llvm.org/rL255555 http://reviews.llvm.org/rL256871 http://reviews.llvm.org/rL256964 http://reviews.llvm.org/rL257400 http://reviews.llvm.org/rL257404 http://reviews.llvm.org/rL257414 llvm-svn: 257491	2016-01-12 17:30:37 +00:00
Sanjay Patel	e896ede7f1	[LibCallSimplifier] use instruction-level fast-math-flags to transform log calls Also, add tests to verify that we're checking 'fast' on both calls of each transform pair, tighten the CHECK lines, and give the tests more meaningful names. This is a continuation of: http://reviews.llvm.org/rL255555 http://reviews.llvm.org/rL256871 http://reviews.llvm.org/rL256964 http://reviews.llvm.org/rL257400 http://reviews.llvm.org/rL257404 llvm-svn: 257414	2016-01-11 23:31:48 +00:00
Sanjay Patel	6c1ddbb7b6	[LibCallSimplifier] don't allow sqrt transform unless all ops are unsafe Fix the FIXME added with: http://reviews.llvm.org/rL257400 llvm-svn: 257404	2016-01-11 22:50:36 +00:00
Sanjay Patel	9f67dadea2	more space; NFC llvm-svn: 257401	2016-01-11 22:35:39 +00:00
Sanjay Patel	683f29735f	[LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt calls This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 The intent of the patch is to preserve the current behavior of the transform except that we use the sqrt instruction's 'fast' attribute as a trigger rather than the function-level attribute. But this raises a bug noted by the new FIXME comment. In order to do this transform: sqrt((x * x) * y) ---> fabs(x) * sqrt(y) ...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'. If any of those ops is strict, we should bail out. Differential Revision: http://reviews.llvm.org/D15937 llvm-svn: 257400	2016-01-11 22:34:19 +00:00
Teresa Johnson	b43257d594	Split resolveCycles(bool AllowTemps) into two interfaces and document Address review feedback from r255909. Move body of resolveCycles(bool AllowTemps) to resolveRecursivelyImpl(bool AllowTemps). Revert resolveCycles back to asserting on temps, and add new resolveNonTemporaries interface to invoke the new implementation with AllowTemps=true. Document the differences between these interfaces, specifically the effect on RAUW support and uniquing. Call appropriate interface from ValueMapper. llvm-svn: 257389	2016-01-11 21:37:41 +00:00
Chen Li	509ff21300	Code refactoring for commit r257278. llvm-svn: 257366	2016-01-11 19:20:53 +00:00
David Majnemer	d9833ea579	[JumpThreading] Don't forget to report that the IR changed JumpThreading's runOnFunction is supposed to return true if it made any changes. JumpThreading has a call to removeUnreachableBlocks which may result in changes to the IR but runOnFunction didn't appropriately account for this possibility, leading to badness. While we are here, make sure to call LazyValueInfo::eraseBlock in removeUnreachableBlocks; JumpThreading preserves LVI. This fixes PR26096. llvm-svn: 257279	2016-01-10 07:13:04 +00:00
Chen Li	c375450e3f	Fix a control flow problem in commit rL257277. llvm-svn: 257278	2016-01-10 06:13:32 +00:00
Chen Li	1689c2f54b	[SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad. Summary: This is a fix of D13718. D13718 was committed but then reverted because of the following bug: https://llvm.org/bugs/show_bug.cgi?id=25299 This patch fixes the issue shown in the bug. Reviewers: majnemer, reames Subscribers: jevinskie, llvm-commits Differential Revision: http://reviews.llvm.org/D14308 llvm-svn: 257277	2016-01-10 05:48:01 +00:00
Justin Bogner	e9fb228d59	LoopInfo: Simplify ownership of Loop objects It's strange that LoopInfo mostly owns the Loop objects, but that it defers deleting them to the loop pass manager. Instead, change the oddly named "updateUnloop" to "markAsRemoved" and have it queue the Loop object for deletion. We can't delete the Loop immediately when we remove it, since we need its pointer identity still, so we'll mark the object as "invalid" so that clients can see what's going on. llvm-svn: 257191	2016-01-08 19:08:53 +00:00
Easwaran Raman	7f18729039	Remove CloningDirector and associated code With the removal of the old landing pad code in r249918, CloningDirector is not used anywhere else. NFCI. llvm-svn: 257185	2016-01-08 18:23:17 +00:00
Sanjay Patel	c2d6461a4a	[LibCallSimplifier] less indenting; NFCI llvm-svn: 256973	2016-01-06 20:52:21 +00:00
Chen Li	78bde83003	[SplitLandingPadPredecessors] Create a PHINode for the original landingpad only if it has some uses Summary: This patch adds a check in SplitLandingPadPredecessors to see if the original landingpad instruction has any uses. If not, we don't need to create a PHINode for it in the joint block since it's going to be dead code anyway. The motivation for this patch is that we found a bug where SplitLandingPadPredecessors created a PHINode of token type landingpad, which failed the verifier since a PHINode cannot be of token type. However, the created PHINode will never be used in our code pattern. This patch will work around this bug, and we might add support in SplitLandingPadPredecessors to handle token type landingpad with uses in the future. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15835 llvm-svn: 256972	2016-01-06 20:32:05 +00:00
Sanjay Patel	cddcd7256c	[LibCallSimplifier] use instruction-level fast-math-flags for tan/atan transform llvm-svn: 256964	2016-01-06 19:23:35 +00:00
David Majnemer	b70e23c390	[SimplifyLibCalls] Teach SimplifyLibCalls about operand bundles If we replace one call-site with another, be sure to move over any operand bundles that lingered on the old call-site. This fixes PR26036. llvm-svn: 256912	2016-01-06 05:01:34 +00:00
Sanjay Patel	c7ddb7fcdb	A (B + C) = A B + A C ; NFCI llvm-svn: 256884	2016-01-06 00:32:15 +00:00
Manuel Jacob	3eedd11329	[Statepoints] Check for the "gc-leaf-function" attribute on call sites as well. Reviewers: sanjoy, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15900 llvm-svn: 256875	2016-01-05 23:59:08 +00:00
Sanjay Patel	29095ea1b0	[LibCallSimplifier] use instruction-level fast-math-flags for fmin/fmax transforms llvm-svn: 256871	2016-01-05 20:46:19 +00:00
David Majnemer	59eb733af1	[SimplifyCFG] Further improve our ability to remove redundant catchpads In r256814, we managed to remove catchpads which were trivially redundant because they were the same SSA value. We can do better using the same algorithm but with a smarter data structure by hashing the SSA values within the catchpad and comparing them structurally. llvm-svn: 256815	2016-01-05 07:42:17 +00:00
David Majnemer	2fa8651a8f	[SimplifyCFG] Remove redundant catchpads Remove duplicate catchpad handlers from a catchswitch. llvm-svn: 256814	2016-01-05 06:27:50 +00:00
Joseph Tremoulet	0d808888c1	[WinEH] Simplify unreachable catchpads Summary: At least for CoreCLR, a catchpad which immediately executes an `unreachable` instruction indicates that the exception can never have a matching type, and so such catchpads can be removed, and so can their catchswitches if the catchswitch becomes empty. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15846 llvm-svn: 256809	2016-01-05 02:37:41 +00:00
Eric Christopher	49a7d6c473	Clarify that the bypassSlowDivision optimization operates on a single BB [v2] Update some comments to be more explicit. Change bypassSlowDivision and the functions it calls so that they take BasicBlocks and Instructions, rather than Function::iterator&s and BasicBlock::iterator&s. Change the APIs so that the caller is responsible for updating the iterator, rather than the callee. This makes control flow much easier to follow. Patch by Justin Lebar! llvm-svn: 256789	2016-01-04 23:18:58 +00:00
Sanjay Patel	bee05caa6b	[LibCallSimplifier] propagate FMF when shrinking binary calls llvm-svn: 256682	2015-12-31 23:40:59 +00:00
Sanjay Patel	aa23114cb4	[LibCallSimplifier] propagate FMF when shrinking unary calls llvm-svn: 256679	2015-12-31 21:52:31 +00:00
Sanjay Patel	96475cbd22	Variable names start with an upper case letter; NFC llvm-svn: 256676	2015-12-31 16:16:58 +00:00
Sanjay Patel	d707db97a9	fix formatting; NFC llvm-svn: 256675	2015-12-31 16:10:49 +00:00
Teresa Johnson	96f7f81aa3	[ThinLTO] Rename variables used in metadata linking (NFC) As suggested in review for r255909, rename MDMaterialized to AllowTemps, and identify the name of the boolean flag being set in calls to saveMetadataList. llvm-svn: 256653	2015-12-30 21:13:55 +00:00
Craig Topper	582d8ecf6a	[Transforms] Use asserts instead of ifs around llvm_unreachable. NFC llvm-svn: 256405	2015-12-25 02:04:17 +00:00
Sanjoy Das	ab0626e35f	Nonnull elements in OperandBundleCallSites are not all Instructions `CloneAndPruneIntoFromInst` sometimes RAUW's dead instructions with `undef` before erasing them (to avoid deleting instructions that still have uses). This changes the `WeakVH` in `OperandBundleCallSites` to hold an `undef`, and we need to guard against this eventuality in `llvm::InlineFunction`. llvm-svn: 256110	2015-12-19 22:40:28 +00:00
Keno Fischer	00cbf9a69a	Clean up the processing of dbg.value in various places Summary: First up is instcombine, where in the dbg.declare -> dbg.value conversion, the llvm.dbg.value needs to be called on the actual loaded value, rather than the address (since the whole point of this transformation is to be able to get rid of the alloca). Further, now that that's cleaned up, we can remove a hack in the backend, that would add an implicit OP_deref if the argument to dbg.value was an alloca. This stems from before the existence of DIExpression and is no longer necessary since the deref can be expressed explicitly. Now, in order to make sure that the tests pass with this change, we need to correct the printing of DEBUG_VALUE comments to take into account the expression, which wasn't taken into account before. Unfortunately, for both these changes, there were a number of incorrect test cases (mostly the wrong number of DW_OP_derefs, but also a couple where the test itself was broken more badly). aprantl and I have gone through and adjusted these test cases in order to make them pass with these fixes and in some cases to make sure they're actually testing what they are meant to test. Reviewers: aprantl Subscribers: dsanders Differential Revision: http://reviews.llvm.org/D14186 llvm-svn: 256077	2015-12-19 02:02:44 +00:00
Andrew Kaylor	123048d26a	[WinEH] Update LCSSA to handle catchswitch with handlers inside and outside a loop Differential Revision: http://reviews.llvm.org/D15630 llvm-svn: 256005	2015-12-18 18:12:35 +00:00
Teresa Johnson	0e7c82cb69	[ThinLTO/LTO] Don't link in unneeded metadata Summary: Third patch split out from http://reviews.llvm.org/D14752. Only map in needed DISubroutine metadata (imported or otherwise linked in functions and other DISubroutine referenced by inlined instructions). This is supported for ThinLTO, LTO and llvm-link --only-needed, with associated tests for each one. Depends on D14838. Reviewers: dexonsmith, joker.eph Subscribers: davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D14843 llvm-svn: 256003	2015-12-18 17:51:37 +00:00
Teresa Johnson	e5a6191732	[ThinLTO] Metadata linking for imported functions Summary: Second patch split out from http://reviews.llvm.org/D14752. Maps metadata as a post-pass from each module when importing complete, suturing up final metadata to the temporary metadata left on the imported instructions. This entails saving the mapping from bitcode value id to temporary metadata in the importing pass, and from bitcode value id to final metadata during the metadata linking postpass. Depends on D14825. Reviewers: dexonsmith, joker.eph Subscribers: davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D14838 llvm-svn: 255909	2015-12-17 17:14:09 +00:00
Justin Bogner	883a3ea67f	LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC As of r255720, the loop pass manager will DTRT when passes update the loop info for removed loops, so they no longer need to reach into LPPassManager APIs to do this kind of transformation. This change very nearly removes the need for the LPPassManager to even be passed into loop passes - the only remaining pass that uses the LPM argument is LoopUnswitch. llvm-svn: 255797	2015-12-16 18:40:20 +00:00
James Molloy	3d21dcf3ed	[SimplifyCFG] Don't create unnecessary PHIs In conditional store merging, we were creating PHIs when we didn't need to. If the value to be predicated isn't defined in the block we're predicating, then it doesn't need a PHI at all (because we only deal with triangles and diamonds, any value not in the predicated BB must dominate the predicated BB). This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite. Now with a fix (and fixed tests) for the conformance issue seen in Chromium. llvm-svn: 255767	2015-12-16 14:12:44 +00:00
David Majnemer	3bb88c0210	[WinEH] Use operand bundles to describe call sites SimplifyCFG allows tail merging with code which terminates in unreachable which, in turn, makes it possible for an invoke to end up in a funclet which it was not originally part of. Using operand bundles on invokes allows us to determine whether or not an invoke was part of a funclet in the source program. Furthermore, it allows us to unambiguously answer questions about the legality of inlining into call sites which the personality may have trouble with. Differential Revision: http://reviews.llvm.org/D15517 llvm-svn: 255674	2015-12-15 21:27:27 +00:00
Justin Bogner	843fb204b7	LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669	2015-12-15 19:40:57 +00:00
Sanjay Patel	38a022623a	[SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818) This is the last general step to allow more IR-level speculation with a safety harness in place in CodeGenPrepare. The intent is to restore the behavior enabled by: http://reviews.llvm.org/rL228826 but prevent bad performance such as: https://llvm.org/bugs/show_bug.cgi?id=24818 Earlier patches in this sequence: D12882 (disable SimplifyCFG speculation for expensive instructions) D13297 (have CGP despeculate expensive ops) D14630 (have CGP despeculate special versions of cttz/ctlz) As shown in the test cases, we only have two instructions currently affected: ctz for some x86 and fdiv generally. Allowing exactly one expensive instruction is a bit of a hack, but it lines up with what is currently implemented in CGP. If we make the despeculation more general in CGP, we can make the speculation here more liberal. A follow-up patch will adjust the cost for sqrt and possibly other typically expensive math intrinsics (currently everything is cheap by default). GPU targets would likely want to override those expensive default costs (just as they probably should already override the cost of div/rem) because just about any math is cheaper than control-flow on those targets. Differential Revision: http://reviews.llvm.org/D15213 llvm-svn: 255660	2015-12-15 17:38:29 +00:00
Reid Kleckner	db9a91e324	Revert "Don't create unnecessary PHIs" This reverts commit r255489. It causes test failures in Chromium and does not appear to respect the AlternativeV parameter. llvm-svn: 255562	2015-12-14 22:36:57 +00:00
David Majnemer	bbfc7219ef	[IR] Remove terminatepad It turns out that terminatepad gives little benefit over a cleanuppad which calls the termination function. This is not sufficient to implement fully generic filters but MSVC doesn't support them which makes terminatepad a little over-designed. Depends on D15478. Differential Revision: http://reviews.llvm.org/D15479 llvm-svn: 255522	2015-12-14 18:34:23 +00:00
Sanjay Patel	af674fbfd9	getParent() ^ 3 == getModule() ; NFCI llvm-svn: 255511	2015-12-14 17:24:23 +00:00
James Molloy	2b1e101e99	Don't create unnecessary PHIs In conditional store merging, we were creating PHIs when we didn't need to. If the value to be predicated isn't defined in the block we're predicating, then it doesn't need a PHI at all (because we only deal with triangles and diamonds, any value not in the predicated BB must dominate the predicated BB). This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite. llvm-svn: 255489	2015-12-14 10:57:01 +00:00
David Majnemer	8a1c45d6e8	[IR] Reformulate LLVM's EH funclet IR While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticeable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of its output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422	2015-12-12 05:38:55 +00:00
James Molloy	1bb6ea5e2d	[Mem2Reg] Respect optnone Mem2Reg shouldn't be optimizing a function that is marked optnone. There is a test checking this that fails when mem2reg is explicitly added to the standard pass pipeline. llvm-svn: 255336	2015-12-11 13:36:59 +00:00
Sanjoy Das	ccd14566e2	Add arg_begin() and arg_end() to CallInst and InvokeInst; NFCI - This simplifies the CallSite class, arg_begin / arg_end are now simple wrapper getters. - In several places, we were creating CallSite instances solely to call arg_begin and arg_end. With this change, that's no longer required. llvm-svn: 255226	2015-12-10 06:39:02 +00:00
Sanjoy Das	9abfb0b429	Use WeakVH to keep track of calls with operand bundles in CloneCodeInfo `CloneAndPruneIntoFromInst` can DCE instructions after cloning them into the new function, and so an AssertingVH is too strong. This change switches CloneCodeInfo to use a std::vector<WeakVH>. llvm-svn: 255148	2015-12-09 20:33:52 +00:00
Sanjoy Das	1f8fd88873	Delete trailing whitespace; NFC llvm-svn: 255147	2015-12-09 20:33:45 +00:00
Michael Zolotukhin	78760ee73d	Revert "Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."" The bug in IndVarSimplify was fixed in r254976, r254977, so I'm reapplying the original patch for avoiding redundant LCSSA recomputation. This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466. llvm-svn: 255133	2015-12-09 18:20:28 +00:00
Silviu Baranga	9cd9a7e310	Re-commit r255115, with the PredicatedScalarEvolution class moved to ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform and Analysis modules: [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewriting. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255122	2015-12-09 16:06:28 +00:00
Silviu Baranga	ad1ccb357b	Revert r255115 until we figure out how to fix the bot failures. llvm-svn: 255117	2015-12-09 15:25:28 +00:00
Silviu Baranga	41eb682501	[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewriting. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255115	2015-12-09 15:03:52 +00:00
Rafael Espindola	cab951dd46	Return a std::unique_ptr from CloneModule. NFC. llvm-svn: 255078	2015-12-08 23:57:17 +00:00
Sanjoy Das	8a954a0553	[OperandBundles] Fix a transform in simplifycfg Reviewers: pcc, majnemer, reames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D15345 llvm-svn: 255062	2015-12-08 22:26:08 +00:00
Sanjoy Das	8da1f95916	[OperandBundles] Remove unnecessary constructor The StringRef constructor is unnecessary (since we're converting to std::string anyway), and having it requires an explicit call to StringRef's or std::string's constructor. llvm-svn: 255000	2015-12-08 03:50:32 +00:00
Rafael Espindola	b6d56a7655	Create llvm.global_ctors in the new format. llvm-svn: 254878	2015-12-06 16:18:25 +00:00
Weiming Zhao	8213072a45	[SimplifyLibCalls] Optimization for pow(x, n) where n is some constant Summary: In order to avoid calling the pow function we generate repeated fmuls when n is a positive or negative whole number. For each exponent we pre-compute Addition Chains in order to minimize the number of fmuls. Refer: http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html We pre-compute addition chains for exponents up to 32 (which results in a max of 7 fmuls). For example: 4 = 2+2 5 = 2+3 6 = 3+3 and so on Hence, pow(x, 4.0) ==> y = fmul x, x x = fmul y, y ret x For negative exponents, we simply compute the reciprocal of the final result. Note: This transformation is only enabled under fast-math. Patch by Mandeep Singh Grang <mgrang@codeaurora.org> Reviewers: weimingz, majnemer, escha, davide, scanon, joerg Subscribers: probinson, escha, llvm-commits Differential Revision: http://reviews.llvm.org/D13994 llvm-svn: 254776	2015-12-04 22:00:47 +00:00
David Majnemer	70497c696a	Move EH-specific helper functions to a more appropriate place No functionality change is intended. llvm-svn: 254562	2015-12-02 23:06:39 +00:00
Rafael Espindola	baa3bf8f76	Bring r254336 back: The difference is that now we don't error on out-of-comdat access to internal global values. We copy them instead. This seems to match the expectation of COFF linkers (see pr25686). Original message: Start deciding earlier what to link. A traditional linker is roughly split in symbol resolution and "copying stuff". The two tasks are badly mixed in lib/Linker. This starts splitting them apart. With this patch there are no direct calls to linkGlobalValueBody or linkGlobalValueProto. Everything is linked via MapValue. This also includes a few fixes: * A GV goes undefined if the comdat is dropped (comdat11.ll). * We error if an internal GV goes undefined (comdat13.ll). * We don't link an unused comdat. The first two match the behavior of an ELF linker. The second one is equivalent to running globaldce on the input. llvm-svn: 254418	2015-12-01 15:19:48 +00:00
Evgeniy Stepanov	42f3b12274	[safestack] Protect byval function arguments. Detect unsafe byval function arguments and move them to the unsafe stack. llvm-svn: 254353	2015-12-01 00:40:05 +00:00
Rafael Espindola	e9841a6bb5	This reverts commit r254336 and r254344. They broke a bot and I am debugging why. llvm-svn: 254347	2015-11-30 23:54:19 +00:00
Rafael Espindola	c109200c53	Start deciding earlier what to link. A traditional linker is roughly split in symbol resolution and "copying stuff". The two tasks are badly mixed in lib/Linker. This starts splitting them apart. With this patch there are no direct calls to linkGlobalValueBody or linkGlobalValueProto. Everything is linked via MapValue. This also includes a few fixes: * A GV goes undefined if the comdat is dropped (comdat11.ll). * We error if an internal GV goes undefined (comdat13.ll). * We don't link an unused comdat. The first two match the behavior of an ELF linker. The second one is equivalent to running globaldce on the input. llvm-svn: 254336	2015-11-30 22:01:43 +00:00
Davide Italiano	1aeed6a955	[SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math. llvm-svn: 254317	2015-11-30 19:36:35 +00:00
Davide Italiano	0b14f29285	[SimplifyLibCalls] Don't crash if the function doesn't have a name. llvm-svn: 254265	2015-11-29 21:58:56 +00:00
Davide Italiano	e2db58cfb8	[SimplifyLibCalls] Cross out implemented transformations. llvm-svn: 254264	2015-11-29 21:00:43 +00:00
Davide Italiano	b8b7133c94	[SimplifyLibCalls] Transform log(pow(x, y)) -> y*log(x). This one is enabled only under -ffast-math. There are cases where the difference between the value computed and the correct value is huge even for ffast-math, e.g. as Steven pointed out: x = -1, y = -4 log(pow(-1, -4)) = 0 -4*log(-1) = NaN I checked what GCC does and apparently they do the same optimization (which results in the dramatic difference). Future work might try to make this (slightly) less bad. Differential Revision: http://reviews.llvm.org/D14400 llvm-svn: 254263	2015-11-29 20:58:04 +00:00
Davide Italiano	da3beebad1	[SimplifyLibCalls] Use any_of(). Suggested by David Blaikie! llvm-svn: 254239	2015-11-28 22:27:48 +00:00
Benjamin Kramer	89766e5b1d	[SimplifyLibCalls] Fix inverted condition that led to an uninitialized memory read below. Found by msan! llvm-svn: 254238	2015-11-28 21:43:12 +00:00
Rafael Espindola	19b52383c5	Simplify the linking of recursive data. Now the ValueMapper has two callbacks. The first one maps the declaration. The ValueMapper records the mapping and then materializes the body/initializer. llvm-svn: 254209	2015-11-27 20:28:19 +00:00
Davide Italiano	ac0953a2e6	[SimplifyLibCalls] Use range-based loop. NFC. llvm-svn: 254193	2015-11-27 08:05:40 +00:00
Benjamin Kramer	fb419e71f4	[SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call. Fixes the crasher in PR25651 and related crashers using the same pattern. llvm-svn: 254145	2015-11-26 09:51:17 +00:00
Sanjoy Das	c521c7bea5	[OperandBundles] Extract duplicated code into a helper function, NFC llvm-svn: 254047	2015-11-25 00:42:24 +00:00
Weiming Zhao	45d4cb9a14	[Utils] Put includes in correct order. NFC. Summary: Followed the guidelines in: http://llvm.org/docs/CodingStandards.html#include-style However, I noticed that uppercase named headers come before lowercase ones throughout the codebase. So kept them as is. Patch by Mandeep Singh Grang <mgrang@codeaurora.org> Reviewers: majnemer, davide, jmolloy, atrick Subscribers: sanjoy Differential Revision: http://reviews.llvm.org/D14939 llvm-svn: 254005	2015-11-24 18:57:06 +00:00
Weiming Zhao	8d5c08f591	[SimplifyLibCalls] Removed some TODOs which are already implemented. NFC. Summary: D14302 implements tan(atan(x)) -> x D14045 implements pow(exp(x), y) -> exp(x*y) Patch by Mandeep Singh Grang <mgrang@codeaurora.org> Reviewers: majnemer, davide Differential Revision: http://reviews.llvm.org/D14882 llvm-svn: 253768	2015-11-21 06:10:20 +00:00
Dehao Chen	014fb55711	Fix the debug build breakage that getDiscriminator is called by mistake. llvm-svn: 253597	2015-11-19 20:29:27 +00:00
Michael Zolotukhin	6c11c04db3	Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible." The change exposed a bug in IndVarSimplify (PR25578), which led to a failure (PR25538). When the bug is fixed, this patch can be reapplied. The tests are kept in tree, as they're useful anyway, and will not break with this revert. llvm-svn: 253596	2015-11-19 20:28:32 +00:00
Dehao Chen	23e2278e27	Reimplement discriminator assignment algorithm. Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line. Reviewers: dblaikie, davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14738 llvm-svn: 253594	2015-11-19 19:53:05 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Davide Italiano	c5cedd195a	[SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math. Differential Revision: http://reviews.llvm.org/D14466 llvm-svn: 253521	2015-11-18 23:21:32 +00:00
Davide Italiano	455ea11d13	[BuildLibCalls] EmitStrNLen() is dead code. Garbage collect. llvm-svn: 253514	2015-11-18 22:29:38 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen.
I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Igor Laevsky	7310c68e85	Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)" Failing clang test is now fixed by the r253458. llvm-svn: 253459	2015-11-18 14:50:18 +00:00
Sanjoy Das	f79d3449c5	[OperandBundles] Tighten OperandBundleDef's interface; NFC llvm-svn: 253446	2015-11-18 08:30:07 +00:00
Sanjoy Das	2d16145acf	Teach the inliner to track deoptimization state Summary: This change teaches LLVM's inliner to track and suitably adjust deoptimization state (tracked via deoptimization operand bundles) as it inlines through call sites. The operation is described in more detail in the LangRef changes. Reviewers: reames, majnemer, chandlerc, dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14552 llvm-svn: 253438	2015-11-18 06:23:38 +00:00
Michael Zolotukhin	927bdba29d	[PR25538]: Fix a failure caused by r253126. In r253126 we stopped to recompute LCSSA after loop unrolling in all cases, except the unrolling is full and at least one of the loop exits is outside the parent loop. In other cases the transformation should not break LCSSA, but it turned out, that we also call SimplifyLoop on the parent loop, which might break LCSSA by itself. This fix just triggers LCSSA recomputation in this case as well. I'm committing it without a test case for now, but I'll try to invent one. It's a bit tricky because in an isolated test LoopSimplify would be scheduled before LoopUnroll, and thus will change the test and hide the problem. llvm-svn: 253253	2015-11-16 21:17:26 +00:00
Davide Italiano	ed5cc95d22	[SimplifyLibCalls] Generalize a comment. This doesn't apply only to sqrt. llvm-svn: 253224	2015-11-16 16:54:28 +00:00
Pavel Labath	978060ce2f	Don't generate discriminators for calls to debug intrinsics Summary: This fails a check in Verifier.cpp, which checks for location matches between the declared variable and the !dbg attachments. Reviewers: dnovillo, dblaikie, danielcdh Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14657 llvm-svn: 253194	2015-11-16 10:40:38 +00:00
Keno Fischer	2ac0c27001	Also map the personality function in CloneFunctionInto Summary: The Old personality function gets copied over, but the Materializer didn't have a chance to inspect it (e.g. to fix up references to the correct module for the target function). Also add a verifier check that makes sure the personality routine is in the same module as the function whose personality it is. Reviewers: majnemer Subscribers: jevinskie, llvm-commits Differential Revision: http://reviews.llvm.org/D14474 llvm-svn: 253183	2015-11-16 05:13:30 +00:00
Teresa Johnson	83d03ddbf6	Fix mapping of unmaterialized global values during metadata linking Summary: The patch to move metadata linking after global value linking didn't correctly map unmaterialized global values to null as desired. They were in fact mapped to the source copy. It largely worked by accident since most module linker clients destroyed the source module which caused the source GVs to be replaced by null, but caused a failure with LTO linking on Windows: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312869.html The problem is that a null return value from materializeValueFor is handled by mapping the value to self. This is the desired behavior when materializeValueFor is passed a non-GlobalValue. The problem is how to distinguish that case from the case where we really do want to map to null. This patch addresses this by passing in a new flag to the value mapper indicating that unmapped global values should be mapped to null. Other Value types are handled as before. Note that the documented behavior of asserting on unmapped values when the flag RF_IgnoreMissingValues isn't set is currently disabled with FIXME notes due to bootstrap failures. I modified these disabled asserts so when they are eventually enabled again it won't assert for the unmapped values when the new RF_NullMapMissingGlobalValues flag is set. I also considered using a callback into the value materializer, but a flag seemed cleaner given that there are already existing flags. I also considered modifying materializeValueFor to return the input value when we want to map to source and then treat a null return to mean map to null. However, there are other value materializer subclasses that implement materializeValueFor, and they would all need to be audited and the return values possibly changed, which seemed error-prone. 
Reviewers: dexonsmith, joker.eph Subscribers: pcc, llvm-commits Differential Revision: http://reviews.llvm.org/D14682 llvm-svn: 253170	2015-11-15 14:50:14 +00:00
Michael Zolotukhin	8ef44f93ca	Don't recompute LCSSA after loop-unrolling when possible. Summary: Currently we always recompute LCSSA for outer loops after unrolling an inner loop. That leads to compile time problem when we have big loop nests, and we can solve it by avoiding unnecessary work. For instance, if we only do partial unrolling, we don't break LCSSA, so we don't need to rebuild it. Also, if all exits from the inner loop are inside the enclosing loop, then complete unrolling won't break LCSSA either. I replaced unconditional LCSSA recomputation with conditional recomputation + unconditional assert and added several tests, which were failing when I experimented with it. Soon I plan to follow up with a similar patch for recalculation of dominators tree. Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14526 llvm-svn: 253126	2015-11-14 05:51:41 +00:00
Davide Italiano	b883b01a8e	[SimplifyLibCalls] Make a function shorter. NFC. llvm-svn: 252970	2015-11-12 23:39:00 +00:00
Diego Novillo	0354a9f67b	SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators The discriminators pass relied on the presence of llvm.dbg.cu to decide whether to add discriminators, but this fails in the case where debug info is only enabled partially when -fprofile-sample-use is active. The reason llvm.dbg.cu is not present in these cases is to prevent codegen from emitting debug info (as it is only used for the sample profile pass). This changes the discriminators pass to also emit discriminators even when debug info is not being emitted. llvm-svn: 252763	2015-11-11 17:54:37 +00:00
Renato Golin	0e77d72b0a	Revert "Strip metadata when speculatively hoisting instructions" This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as well as some x86, et al. llvm-svn: 252623	2015-11-10 18:01:16 +00:00
Igor Laevsky	01c3692a10	Strip metadata when speculatively hoisting instructions This is a fix for PR24059. When we are hoisting instruction above some condition it may turn out that metadata on this instruction was control dependent on the condition. This metadata becomes invalid and we need to drop it. This patch should cover most obvious places of speculative execution (which I have found by grepping isSafeToSpeculativelyExecute). I think there are more cases but at least this change covers the severe ones. Differential Revision: http://reviews.llvm.org/D14398 llvm-svn: 252604	2015-11-10 14:10:31 +00:00
Dehao Chen	3656e3064b	Add discriminators for call instructions that are from the same line and same basic block. Summary: Call instructions that are from the same line and same basic block needs to have separate discriminators to distinguish between different callsites. Reviewers: davidxl, dnovillo, dblaikie Subscribers: dblaikie, probinson, llvm-commits Differential Revision: http://reviews.llvm.org/D14464 llvm-svn: 252492	2015-11-09 17:30:38 +00:00
Silviu Baranga	2910a4f6b1	Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates Summary: LAA currently generates a set of SCEV predicates that must be checked by users. In the case of Loop Distribute/Loop Load Elimination, no such predicates could have been emitted, since we don't allow stride versioning. However, in the future there could be SCEV predicates that will need to be checked. This change adds support for SCEV predicate versioning in the Loop Distribute, Loop Load Eliminate and the loop versioning infrastructure. Reviewers: anemet Subscribers: mssimpso, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14240 llvm-svn: 252467	2015-11-09 13:26:09 +00:00
Duncan P. N. Exon Smith	83c4b68720	ADT: Remove last implicit ilist iterator conversions, NFC Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371	2015-11-07 00:01:16 +00:00
Davide Italiano	d9f87b4642	[SimplifyLibCalls] Don't hardcode the function name. llvm-svn: 252342	2015-11-06 21:05:07 +00:00
Sanjoy Das	55ea67cea7	[ValueTracking] Add parameters to isImpliedCondition; NFC Summary: This change makes the `isImpliedCondition` interface similar to the rest of the functions in ValueTracking (in that it takes a DataLayout, AssumptionCache etc.). This is an NFC, intended to make a later diff less noisy. Depends on D14369 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14391 llvm-svn: 252333	2015-11-06 19:01:08 +00:00
Peter Collingbourne	d4bff30370	DI: Reverse direction of subprogram -> function edge. Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219	2015-11-05 22:03:56 +00:00
Davide Italiano	a345877ce8	[SimplifyLibCalls] Use hasFloatVersion(). NFCI. llvm-svn: 252186	2015-11-05 19:18:23 +00:00
James Molloy	9e959ac397	[SimplifyCFG] Tweak heuristic for merging conditional stores We were correctly skipping dbginfo intrinsics and terminators, but the initial bailout wasn't, causing it to bail out on almost any block. llvm-svn: 252152	2015-11-05 08:40:19 +00:00
Davide Italiano	51507d2ad8	[SimplifyLibCalls] New transformation: tan(atan(x)) -> x This is enabled only under -ffast-math. So, instead of emitting: 4007b0: 50 push %rax 4007b1: e8 8a fd ff ff callq 400540 <atanf@plt> 4007b6: 58 pop %rax 4007b7: e9 94 fd ff ff jmpq 400550 <tanf@plt> 4007bc: 0f 1f 40 00 nopl 0x0(%rax) for: float mytan(float x) { return tanf(atanf(x)); } we emit a single retq. Differential Revision: http://reviews.llvm.org/D14302 llvm-svn: 252098	2015-11-04 23:36:56 +00:00
James Molloy	4de84ddec9	[SimplifyCFG] Merge conditional stores We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code: if (c & flag1) a = x; if (c & flag2) a = y; ... There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to. It is, however, legal to merge the stores together and do the store once: tmp = undef; if (c & flag1) tmp = x; if (c & flag2) tmp = y; if (c & flag1 || c & flag2) *a = tmp; The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too. This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure. The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate. This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite. llvm-svn: 252051	2015-11-04 15:28:04 +00:00
Davide Italiano	c8a7913f23	[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y) This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code. Before (on FreeBSD x86-64): 4007f0: 50 push %rax 4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp) 4007f6: e8 75 fd ff ff callq 400570 <exp2@plt> 4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1 400800: 58 pop %rax 400801: e9 7a fd ff ff jmpq 400580 <pow@plt> 400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40080d: 00 00 00 After: 4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0 4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt> 4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Differential Revision: http://reviews.llvm.org/D14045 llvm-svn: 251976	2015-11-03 20:32:23 +00:00
Davide Italiano	b7487e6b8d	[SimplifyLibCalls] Remove variables that are not used. NFC. llvm-svn: 251852	2015-11-02 23:07:14 +00:00
Davide Italiano	e84d4da234	[SimplifyLibCalls] Merge two if statements. NFC. llvm-svn: 251845	2015-11-02 22:33:26 +00:00
Artur Pilipenko	5c5011d503	Preserve load alignment and dereferenceable metadata during some transformations Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D13953 llvm-svn: 251809	2015-11-02 17:53:51 +00:00
Davide Italiano	5cdf915191	Simplify a check. NFC. llvm-svn: 251757	2015-11-01 00:09:16 +00:00
Davide Italiano	396f3eeafb	[SimplifyLibCalls] Factor out other common code. llvm-svn: 251754	2015-10-31 23:17:45 +00:00
Davide Italiano	3817486c47	[SimplifyLibCalls] Remove dead code. llvm-svn: 251737	2015-10-31 08:28:10 +00:00
Dehao Chen	49359bf3d7	Recommit r251680 (also need to update clang test) Update the discriminator assignment algorithm * If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it. * If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt. original code: ; #1 int foo(int i) { ; #2 if (i == 3 || i == 5) return 100; else return 99; ; #3 } ; i == 3: discriminator 0 ; i == 5: discriminator 2 ; return 100: discriminator 1 ; return 99: discriminator 3 llvm-svn: 251689	2015-10-30 05:07:15 +00:00
Dehao Chen	4d84b9321e	Revert r251680: Update the discriminator assignment algorithm * If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it. * If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt. original code: ; #1 int foo(int i) { ; #2 if (i == 3 || i == 5) return 100; else return 99; ; #3 } ; i == 3: discriminator 0 ; i == 5: discriminator 2 ; return 100: discriminator 1 ; return 99: discriminator 3 llvm-svn: 251685	2015-10-30 04:29:05 +00:00
Dehao Chen	9a5d2b18e0	Update the discriminator assignment algorithm * If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it. * If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt. original code: ; #1 int foo(int i) { ; #2 if (i == 3 || i == 5) return 100; else return 99; ; #3 } ; i == 3: discriminator 0 ; i == 5: discriminator 2 ; return 100: discriminator 1 ; return 99: discriminator 3 llvm-svn: 251680	2015-10-30 02:38:29 +00:00
Dehao Chen	7ddf7865b4	clang-format lib/Transforms/Utils/AddDiscriminators.cpp llvm-svn: 251656	2015-10-29 21:25:33 +00:00
Philip Reames	846e3e41ed	[SimplifyCFG] Constant fold a branch implied by its incoming edge The most common use case is when eliminating redundant range checks in an example like the following: c = a[i+1] + a[i]; Note that all the smarts of the transform (the implication engine) is already in ValueTracking and is tested directly through InstructionSimplify. Differential Revision: http://reviews.llvm.org/D13040 llvm-svn: 251596	2015-10-29 03:11:49 +00:00
Davide Italiano	a904e520c2	[SimplifyLibCalls] Factor out common unsafe-math checks. llvm-svn: 251595	2015-10-29 02:58:44 +00:00
David Majnemer	492937095f	[SimplifyCFG] Don't DCE catchret because the successor is unreachable CatchReturnInst has side-effects: it runs a destructor. This destructor could conceivably run forever/call exit/etc. and should not be removed. llvm-svn: 251461	2015-10-27 22:43:56 +00:00
Davide Italiano	c692688cbd	[SimplifyLibCalls] Use range-based loop. No functional change. llvm-svn: 251383	2015-10-27 04:17:51 +00:00
David Blaikie	94c83370b5	Move the canonical header to the top of its matching cpp file as per coding convention This ensures that the header will be verified to be standalone (and avoid mistakes like the one fixed in r251178) llvm-svn: 251326	2015-10-26 18:40:56 +00:00
Sanjoy Das	15c4c4604f	[LCSSA] Unbreak build, don't reuse L; NFC The build broke in r251248. llvm-svn: 251251	2015-10-25 19:27:17 +00:00
Sanjoy Das	331521c688	[LCSSA] Use range for loops; NFC llvm-svn: 251248	2015-10-25 19:08:32 +00:00
Chen Li	7009cd3554	Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad. llvm-svn: 251149	2015-10-23 21:13:01 +00:00
Sanjoy Das	0a1bee8a80	[Inliner] Don't inline through callsites with operand bundles Summary: This change teaches the LLVM inliner to not inline through callsites with unknown operand bundles. Currently all operand bundles are "unknown" operand bundles but in the near future we will add support for inlining through some select kinds of operand bundles. Reviewers: reames, chandlerc, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14001 llvm-svn: 251141	2015-10-23 20:09:55 +00:00
Chen Li	c6e28782d8	[SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad. Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction. Reviewers: hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13718 llvm-svn: 251061	2015-10-22 20:48:38 +00:00
David Majnemer	dc3b67b4ca	[SimplifyCFG] Don't use-after-free an SSA value SimplifyTerminatorOnSelect didn't consider the possibility that the condition might be related to one of PHI nodes. This fixes PR25267. llvm-svn: 250922	2015-10-21 18:22:24 +00:00
Philip Reames	a956cc7f08	Revert 250343 and 250344 Turns out this approach is buggy. In discussion about follow on work, Sanjoy pointed out that we could be subject to circular logic problems. Consider: if (i u< L) leave() if ((i + 1) u< L) leave() print(a[i] + a[i+1]) If we know that L is less than UINT_MAX, we could possibly prove (in a control dependent way) that i + 1 does not overflow. This gives us: if (i u< L) leave() if ((i +nuw 1) u< L) leave() print(a[i] + a[i+1]) If we now do the transform this patch proposed, we end up with: if ((i +nuw 1) u< L) leave_appropriately() print(a[i] + a[i+1]) That would be a miscompile when i==-1. The problem here is that the control dependent nuw bits got used to prove something about the first condition. That's obviously invalid. This won't happen today, but since I plan to enhance LVI/CVP with exactly that transform at some point in the not too distant future... llvm-svn: 250430	2015-10-15 16:51:00 +00:00
Philip Reames	b42db21de8	[SimplifyCFG] Speculatively flatten CFG based on profiling metadata If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path. The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences. Differential Revision: http://reviews.llvm.org/D13070 llvm-svn: 250343	2015-10-14 22:46:19 +00:00
David Majnemer	eba62796cb	[InlineFunction] Correctly inline TerminatePadInst We forgot to append the terminatepad's arguments which resulted in us treating the old terminatepad as an argument to the new terminatepad causing us to crash immediately. Instead, add the old terminatepad's arguments to the new terminatepad. This fixes PR25155. llvm-svn: 250234	2015-10-13 22:08:17 +00:00
Duncan P. N. Exon Smith	5b4c837c58	TransformUtils: Remove implicit ilist iterator conversions, NFC Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142	2015-10-13 02:39:05 +00:00
Oliver Stannard	939724cd02	GlobalOpt does not treat externally_initialized globals correctly GlobalOpt currently merges stores into the initialisers of internal, externally_initialized globals, but should not do so as the value of the global may change between the initialiser and any code in the module being run. llvm-svn: 250035	2015-10-12 13:20:52 +00:00
Sanjoy Das	c21a05a3a4	[PlaceSafepoints] Extract out `callsGCLeafFunction`, NFC Summary: This will be used in a later change to RewriteStatepointsForGC. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13490 llvm-svn: 249777	2015-10-08 23:18:30 +00:00
Sanjoy Das	0015e5a088	[IndVars] Preserve LCSSA in `eliminateIdentitySCEV` Summary: After r249211, SCEV can see through some LCSSA phis. Add a `replacementPreservesLCSSAForm` check before replacing uses of these phi nodes with a simplified use of the induction variable to avoid breaking LCSSA. Fixes 25047. Depends on D13460. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13461 llvm-svn: 249575	2015-10-07 17:38:31 +00:00
Hans Wennborg	083ca9bb32	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Sanjoy Das	5c8bead46d	[IndVars] Don't break dominance in `eliminateIdentitySCEV` Summary: After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and Y are related in the dominator tree, even if X is an operand to Y (I've included a toy example in comments, and a real example as a test case). This commit changes `SimplifyIndVar` to require a `DominatorTree`. I don't think this is a problem because `ScalarEvolution` requires it anyway. Fixes PR25051. Depends on D13459. Reviewers: atrick, hfinkel Subscribers: joker.eph, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13460 llvm-svn: 249471	2015-10-06 21:44:49 +00:00
Sanjoy Das	088bb0ea9f	[IndVars] Extract out eliminateIdentitySCEV, NFC Summary: Reflow a comment while at it. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13459 llvm-svn: 249470	2015-10-06 21:44:39 +00:00
Piotr Padlewski	dc9b2cfc50	invariant.group handling in GVN The most important part required to make clang devirtualization work ( ͡°͜ʖ ͡°). The code is able to find non local dependencies, but unfortunately because the caller can only handle local dependencies, I had to add some restrictions to look for dependencies only in the same BB. http://reviews.llvm.org/D12992 llvm-svn: 249196	2015-10-02 22:12:22 +00:00
Bruno Cardoso Lopes	b491a2d641	[SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization When trying to optimize fortified library functions use the right location to insert new instructions in order to preserve correct def-use order. This fixes an issue where a misplaced instruction definition would happen to be after one of its use after a RAUW, forming invalid IR. This behavior was introduced by r227250. Differential Revision: http://reviews.llvm.org/D13301 rdar://problem/22802369 llvm-svn: 249092	2015-10-01 22:43:53 +00:00
Evgeniy Stepanov	f608111d1b	Fix debug info with SafeStack. llvm-svn: 248933	2015-09-30 19:55:43 +00:00
Evgeniy Stepanov	d8b86f7cdc	Move dbg.declare intrinsics when merging and replacing allocas. Place new and update dbg.declare calls immediately after the corresponding alloca. Current code in replaceDbgDeclareForAlloca puts the new dbg.declare at the end of the basic block. LLVM codegen has problems emitting debug info in a situation when dbg.declare appears after all uses of the variable. This usually kinda works for inlining and ASan (two users of this function) but not for SafeStack (see the pending change in http://reviews.llvm.org/D13178). llvm-svn: 248769	2015-09-29 00:30:19 +00:00
Fiona Glaser	f74cc40e34	Improve performance of SimplifyInstructionsInBlock 1. Use a worklist, not a recursive approach, to avoid needless revisitation and being repeatedly forced to jump back to the start of the BB if a handle is invalidated. 2. Only insert operands to the worklist if they become unused after a dead instruction is removed, so we don’t have to visit them again in most cases. 3. Use a SmallSetVector to track the worklist. 4. Instead of pre-initting the SmallSetVector like in DeadCodeEliminationPass, only put things into the worklist if they have to be revisited after the first run-through. This minimizes how much the actual SmallSetVector gets used, which saves a lot of time. llvm-svn: 248727	2015-09-28 18:56:07 +00:00
Joseph Tremoulet	09af67aba5	[EH] Create removeUnwindEdge utility Summary: Factor the code that rewrites invokes to calls and rewrites WinEH terminators to their "unwind to caller" equivalents into a helper in Utils/Local, and use it in the three places I'm aware of that need to do this. Reviewers: andrew.w.kaylor, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13152 llvm-svn: 248677	2015-09-27 01:47:46 +00:00
Michael Zolotukhin	d56ee06d1f	[Unroll] When completely unrolling the loop, replace conditional branches with unconditional. Nothing is expected to change, except we do less redundant work in clean-up. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12951 llvm-svn: 248444	2015-09-23 23:12:43 +00:00
Vedant Kumar	ff08e926ba	[Inline] Use AssumptionCache from the right Function This changes the behavior of AddAlignmentAssumptions to match its comment. I.e, prove the asserted alignment in the context of the caller, not the callee. Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko who also saw a fix for the issue. rdar://22521387 Differential Revision: http://reviews.llvm.org/D12997 llvm-svn: 248390	2015-09-23 15:49:08 +00:00
Sanjoy Das	2aacc0ecca	[SCEV] Introduce ScalarEvolution::getOne and getZero. Summary: It is fairly common to call SE->getConstant(Ty, 0) or SE->getConstant(Ty, 1); this change makes such uses a little bit briefer. I've refactored the call sites I could find easily to use getZero / getOne. Reviewers: hfinkel, majnemer, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12947 llvm-svn: 248362	2015-09-23 01:59:04 +00:00
James Molloy	50a4c27f97	[LoopUtils,LV] Propagate fast-math flags on generated FCmp instructions We're currently losing any fast-math flags when synthesizing fcmps for min/max reductions. In LV, make sure we copy over the scalar inst's flags. In LoopUtils, we know we only ever match patterns with hasUnsafeAlgebra, so apply that to any synthesized ops. llvm-svn: 248201	2015-09-21 19:41:19 +00:00
Sanjay Patel	815adacd22	don't repeat function names in comments; NFC llvm-svn: 247813	2015-09-16 16:21:08 +00:00
Sanjay Patel	f9b776350f	more space; NFC llvm-svn: 247699	2015-09-15 15:24:42 +00:00
David Blaikie	6614d8d230	[opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway llvm-svn: 247610	2015-09-14 20:29:26 +00:00
David Blaikie	16a2f3e302	Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space" This was a flawed change - it just caused the getElementType call to be deferred until later, when we really need to remove it. Now that the IR for GlobalAliases has been updated, the root cause is addressed that way instead and this change is no longer needed (and in fact gets in the way - because we want to pass the pointee type directly down further). Follow up patches to push this through GlobalValue, bitcode format, etc, will come along soon. This reverts commit 236160. llvm-svn: 247585	2015-09-14 18:01:59 +00:00
Filipe Cabecinhas	48b090a31f	Remove gcc warning when comparing an unsigned var for >= 0 llvm-svn: 247352	2015-09-10 22:34:39 +00:00
Matthew Simpson	29dc0f7075	[LV] Relax Small Size Reduction Type Requirement This patch enables small size reductions in which the source types are smaller than the reduction type (e.g., computing an i16 sum from the values in an i8 array). The previous behavior was to only allow small size reductions if the source types and reduction type were the same. The change accounts for the fact that the existing sign- and zero-extend instructions in these cases should still be included in the cost model. Differential Revision: http://reviews.llvm.org/D12770 llvm-svn: 247337	2015-09-10 21:12:57 +00:00
Philip Reames	053701399d	[SimplifyCFG] Use known bits to eliminate dead switch defaults This is a follow up to http://reviews.llvm.org/D11995 implementing the suggestion by Hans. If we know some of the bits of the value being switched on, we know that the maximum number of unique cases covers the unknown bits. This allows us to eliminate switch defaults for large integers (i32) when most bits in the value are known. Note that I had to make the transform contingent on not having any dead cases. This is conservatively correct with the old code, but required for the new code since we might have a dead case which varies one of the known bits. Counting that towards our number of covering cases would be bad. If we do have dead cases, we'll eliminate them first, then revisit the possibly dead default. Differential Revision: http://reviews.llvm.org/D12497 llvm-svn: 247309	2015-09-10 17:44:47 +00:00
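The counting argument in the message above can be sketched in a few lines. This is a simplified Python model, not the SimplifyCFG implementation; the function name `default_is_dead` and its parameters (`known_mask`, `known_value`) are hypothetical and chosen only for illustration.

```python
def default_is_dead(bit_width, known_mask, case_values, known_value=0):
    """Return True if the switch default can be proven unreachable.

    known_mask has a 1 for every condition bit whose value is known;
    known_value carries the values of those known bits.
    """
    unknown_bits = bit_width - bin(known_mask).count("1")

    # Stay conservative if any case is dead, i.e. contradicts a known bit.
    # Such a case must not be counted towards the covering cases.
    for value in case_values:
        if (value & known_mask) != (known_value & known_mask):
            return False

    # Case values are unique, so 2**unknown_bits live cases cover every
    # value the condition can actually take.
    return len(set(case_values)) == 2 ** unknown_bits
```

For an i32 condition whose top 30 bits are known to be zero, the four cases 0 to 3 cover everything, so the default is dead. With only three cases, or with a case such as 4 that contradicts the known bits, the sketch reports that nothing can be concluded.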
Sanjay Patel	9361d35525	80-cols; NFC llvm-svn: 247295	2015-09-10 16:31:19 +00:00
Sanjay Patel	f4b34b76d4	use range-based for loop; NFCI llvm-svn: 247294	2015-09-10 16:25:38 +00:00
Sanjay Patel	5e7bd91891	use range-based for loop; NFCI llvm-svn: 247293	2015-09-10 16:15:21 +00:00
Sanjay Patel	59661459f1	fix typo; NFC llvm-svn: 247287	2015-09-10 15:14:34 +00:00
Chandler Carruth	7b560d40bd	[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are within a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. 
This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exceptions to this rule are LoopPass instances, which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loopholes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious.
To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167	2015-09-09 17:55:00 +00:00
NAKAMURA Takumi	0d72539d5a	Prune utf8 chars in comments. llvm-svn: 246953	2015-09-07 00:26:54 +00:00
Craig Topper	02a55d701d	Fix build warning. llvm-svn: 246908	2015-09-05 04:49:44 +00:00
Andrew Kaylor	2a9a6d8c38	Fix build warning llvm-svn: 246903	2015-09-05 01:00:51 +00:00
Andrew Kaylor	a212aba680	Fix build warning llvm-svn: 246899	2015-09-04 23:58:32 +00:00
Andrew Kaylor	50e4e86c26	[WinEH] Teach SimplifyCFG to eliminate empty cleanup pads. Differential Revision: http://reviews.llvm.org/D12434 llvm-svn: 246896	2015-09-04 23:39:40 +00:00
Joseph Tremoulet	9ce71f76b9	[WinEH] Add cleanupendpad instruction Summary: Add a `cleanupendpad` instruction, used to mark exceptional exits out of cleanups (for languages/targets that can abort a cleanup with another exception). The `cleanupendpad` instruction is similar to the `catchendpad` instruction in that it is an EH pad which is the target of unwind edges in the handler and which itself has an unwind edge to the next EH action. The `cleanupendpad` instruction, similar to `cleanupret`, has a `cleanuppad` argument indicating which cleanup it exits. The unwind successors of a `cleanuppad`'s `cleanupendpad`s must agree with each other and with its `cleanupret`s. Update WinEHPrepare (and docs/tests) to accommodate `cleanupendpad`. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12433 llvm-svn: 246751	2015-09-03 09:09:43 +00:00
Piotr Padlewski	28ffcbe1cc	Constant propagation after hitting assume(cmp) bugfix Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246696	2015-09-02 19:59:59 +00:00
Benjamin Kramer	f175e04435	[RemoveDuplicatePHINodes] Start over after removing a PHI. This makes RemoveDuplicatePHINodes more effective and fixes an assertion failure. Triggering the assertions requires a DenseSet reallocation so this change only contains a constructive test. I'll explain the issue with a small example. In the following function there's a duplicate PHI, %4 and %5 are identical. When this is found the DenseSet in RemoveDuplicatePHINodes contains %2, %3 and %4. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %5, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] %5 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } after RemoveDuplicatePHINodes runs the function looks like this. %3 has changed and is now identical to %2, but RemoveDuplicatePHINodes never saw this. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %4, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } If the DenseSet does a reallocation now it will reinsert all keys and stumble over %3 now having a different hash value than it had when inserted into the map for the first time. This change clears the set whenever a PHI is deleted and starts the progress from the beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet state. This potentially has a negative performance impact because it rescans all PHIs, but I don't think that this ever makes a difference in practice. llvm-svn: 246694	2015-09-02 19:52:23 +00:00
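The restart strategy described in the message above can be modeled as follows. This is a minimal Python sketch, not the LLVM code: PHIs are represented as a hypothetical mapping from a name to a tuple of incoming values, and a plain dictionary stands in for the DenseSet.

```python
def remove_duplicate_phis(phis):
    """Remove duplicate PHIs, restarting the scan after every deletion.

    phis maps a PHI name to its tuple of incoming values; an incoming
    value may itself be the name of another PHI.
    """
    phis = dict(phis)
    changed = True
    while changed:
        changed = False
        seen = {}  # incoming tuple -> name of the first PHI with that tuple
        for name, incoming in list(phis.items()):
            if incoming in seen:
                keep = seen[incoming]
                del phis[name]
                # Replace every use of the deleted PHI with the kept one.
                # This can change the "hash" of PHIs already scanned ...
                for other, ops in list(phis.items()):
                    phis[other] = tuple(keep if op == name else op for op in ops)
                # ... so discard the set and start over from the beginning.
                changed = True
                break
            seen[incoming] = name
    return phis
```

Run on the function from the message, the first pass deletes `%5` and rewrites `%3` to use `%4`. The restarted pass then sees that `%3` is identical to `%2` and deletes it as well, leaving only `%2` and `%4`.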
Chad Rosier	dc65532fd9	Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y. http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313	2015-08-28 18:30:18 +00:00
Steven Wu	61db34d12e	Revert r246244 and r246243 These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279	2015-08-28 06:52:00 +00:00
Piotr Padlewski	3f81ec1e38	Constant propagation after hitting assume(cmp) bugfix Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246244	2015-08-28 01:02:00 +00:00
Chad Rosier	c94f8e2906	[LoopVectorize] Add Support for Small Size Reductions. Unlike scalar operations, we can perform vector operations on element types that are smaller than the native integer types. We type-promote scalar operations if they are smaller than a native type (e.g., i8 arithmetic is promoted to i32 arithmetic on Arm targets). This patch detects and removes type-promotions within the reduction detection framework, enabling the vectorization of small size reductions. In the legality phase, we look through the ANDs and extensions that InstCombine creates during promotion, keeping track of the smaller type. In the profitability phase, we use the smaller type and ignore the ANDs and extensions in the cost model. Finally, in the code generation phase, we truncate the result of the reduction to allow InstCombine to rewrite the entire expression in the smaller type. This fixes PR21369. http://reviews.llvm.org/D12202 Patch by Matt Simpson <mssimpso@codeaurora.org>! llvm-svn: 246149	2015-08-27 14:12:17 +00:00
James Molloy	1bbf15c57c	[LoopVectorize] Extract InductionInfo into a helper class... ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145	2015-08-27 09:53:00 +00:00
Alex Rosenberg	a0a19c1c91	Whoops, remove trailing whitespace. llvm-svn: 246141	2015-08-27 05:37:12 +00:00
Philip Reames	98a2dabc08	[SimplifyCFG] Prune code from a provably unreachable switch default As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine. Note: Duplicate case values are disallowed by the LangRef and verifier. Differential Revision: http://reviews.llvm.org/D11995 llvm-svn: 246125	2015-08-26 23:56:46 +00:00
David Majnemer	3354fe473f	[SimplifyLibCalls] Fix a typo cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046	2015-08-26 18:30:16 +00:00
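The arithmetic behind the corrected comment is just exponent multiplication: (x^(1/2))^(1/3) = x^(1/6), while (x^(1/3))^(1/3) = x^(1/9). A quick numeric check, using a hypothetical `cbrt` helper defined here for illustration:

```python
import math

def cbrt(x):
    """Cube root of a non-negative float (illustrative helper)."""
    return x ** (1.0 / 3.0)

def sixth_root_via_nesting(x):
    """cbrt(sqrt(x)) has exponent 1/2 * 1/3 = 1/6."""
    return cbrt(math.sqrt(x))

def ninth_root_via_nesting(x):
    """cbrt(cbrt(x)) has exponent 1/3 * 1/3 = 1/9."""
    return cbrt(cbrt(x))
```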
Alex Rosenberg	81cfed21ca	Modernize with range-based for loops. llvm-svn: 246018	2015-08-26 06:11:41 +00:00
Alex Rosenberg	99805ed45a	Reduce code duplication. llvm-svn: 246017	2015-08-26 06:11:38 +00:00
Alex Rosenberg	5b3404a03e	Trailing whitespace llvm-svn: 246016	2015-08-26 06:11:36 +00:00
Joseph Tremoulet	8220bcc570	[WinEH] Require token linkage in EH pad/ret signatures Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797	2015-08-23 00:26:33 +00:00
David Blaikie	88208840b5	[opaque pointer type]: Pass explicit pointee type when building a constant GEP. Gets a bit tricky in the ValueMapper, of course - not sure if we should just expose a list of explicit types for each Value so that the ValueMapper can be neutral to these special cases (it's OK for things like load, where the explicit type is the result type - but when that's not the case, it means plumbing through another "special" type... ) llvm-svn: 245728	2015-08-21 20:16:51 +00:00
Peter Collingbourne	1dc6a8d179	TransformUtils: Introduce module splitter. The module splitter splits a module into linkable partitions. It will be used to implement parallel LTO code generation. This initial version of the splitter does not attempt to deal with the somewhat subtle symbol visibility issues around module splitting. These will be dealt with in a future change. Differential Revision: http://reviews.llvm.org/D12132 llvm-svn: 245662	2015-08-21 02:48:20 +00:00
Adrian Prantl	cbdfdb74d3	Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata() and make it always preserve debug locations, since all callers wanted this behavior anyway. This is addressing a post-commit review feedback for r245589. NFC (inside the LLVM tree). llvm-svn: 245622	2015-08-20 22:00:30 +00:00
Adrian Prantl	baf90fc265	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnownIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adam Nemet	e48134093d	[LVer] Fix FIXME: hide addPHINodes, NFC Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this up. Now clients that don't compute DefsUsedOutsideOfLoop can just call versionLoop() and computing DefsUsedOutsideOfLoop will happen implicitly. With that there is no reason to expose addPHINodes anymore. Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and addPHINodes in LVerLICM and things should just work. llvm-svn: 245579	2015-08-20 17:22:29 +00:00
David Majnemer	ba275f9947	Replace some calls to isa<LandingPadInst> with isEHPad() No functionality change is intended. llvm-svn: 245487	2015-08-19 19:54:02 +00:00
Ashutosh Nema	c5b7b55589	Exposed findDefsUsedOutsideOfLoop as a loop utility function Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils. Reviewed By: anemet llvm-svn: 245416	2015-08-19 05:40:42 +00:00
Chandler Carruth	7adc3a2b0e	[PM/AA] Remove the last relics of the separate IPA library from LLVM, folding the code into the main Analysis library. There already wasn't much of a distinction between Analysis and IPA. A number of the passes in Analysis are actually IPA passes, and there doesn't seem to be any advantage to separating them. Moreover, it makes it hard to have interactions between analyses that are both local and interprocedural. In trying to make the Alias Analysis infrastructure work with the new pass manager, it becomes particularly awkward to navigate this split. I've tried to find all the places where we referenced this, but I may have missed some. I have also adjusted the C API to continue to be equivalently functional after this change. Differential Revision: http://reviews.llvm.org/D12075 llvm-svn: 245318	2015-08-18 17:51:53 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can do so uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then look up a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle.
It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Benjamin Kramer	bb70d751de	[SimplifyLibCalls] Drop default template args. No functional change. llvm-svn: 245189	2015-08-16 21:16:37 +00:00
Sanjay Patel	57fd1dc5db	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
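The "if we can ignore NaNs" precondition matters because the libcall and the compare-and-select form only disagree on NaN inputs. The sketch below models both in Python. The `a < b` predicate used for the select is an assumption for illustration, not necessarily the comparison the transform emits.

```python
import math

def fmin_libcall(a, b):
    """C fmin semantics: if exactly one argument is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b

def fmin_select(a, b):
    """Compare-and-select form: (a < b) ? a : b."""
    return a if a < b else b
```

For ordinary numbers the two agree. With `b` set to NaN, `fmin_libcall(1.0, nan)` yields `1.0`, whereas the comparison in `fmin_select` is false and the NaN is selected, so the rewrite is only valid when NaNs can be ruled out.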
David Majnemer	0bc0eef71c	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
Adam Nemet	06ccf0145f	[LVer] Remove unused Pass parameter from versionLoop, NFC llvm-svn: 245032	2015-08-14 06:30:26 +00:00
David Majnemer	b611e3f50e	[IR] Add token types This introduces the basic functionality to support "token types". The motivation stems from the need to perform operations on a Value whose provenance cannot be obscured. There are several applications for such a type but my immediate motivation stems from WinEH. Our personality routine enforces a single-entry - single-exit regime for cleanups. After several rounds of optimizations, we may be left with a terminator whose "cleanup-entry block" is not entirely clear because control flow has merged two cleanups together. We have experimented with using labels as operands inside of instructions which are not terminators to indicate where we came from but found that LLVM does not expect such exotic uses of BasicBlocks. Instead, we can use this new type to clearly associate the "entry point" and "exit point" of our cleanup. This is done by having the cleanuppad yield a Token and consuming it at the cleanupret. The token type makes it impossible to obscure or otherwise hide the Value, making it trivial to track the relationship between the two points. What is the burden to the optimizer? Well, it turns out we have already paid down this cost by accepting that there are certain calls that we are not permitted to duplicate, optimizations have to watch out for such instructions anyway. There are additional places in the optimizer that we will probably have to update but early examination has given me the impression that this will not be heroic. Differential Revision: http://reviews.llvm.org/D11861 llvm-svn: 245029	2015-08-14 05:09:07 +00:00
Davide Italiano	a195386ca1	[SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 llvm-svn: 244947	2015-08-13 20:34:26 +00:00
Sanjay Patel	e24c60eb54	fix typo; NFC llvm-svn: 244805	2015-08-12 20:36:18 +00:00
Adam Nemet	dfaeb33ec7	[LoopVer] Optionally allow using memchecks from LAA r243382 changed the behavior to always require a set of memchecks to be passed to LoopVer. This change restores the prior behavior as an alternative to the new behavior. This allows the checks to be implicitly taken from the LAA object. Patch by Ashutosh Nema! llvm-svn: 244763	2015-08-12 16:51:19 +00:00
Chen Li	0786bc9fe8	[LowerSwitch] Skip dead blocks for processSwitchInst() Summary: This patch adds a check for dead blocks and skips them in processSwitchInst(). This will help reduce compilation time. Reviewers: reames, hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11953 llvm-svn: 244656	2015-08-11 20:16:17 +00:00
Chen Li	10f01bd4d3	[LowerSwitch] Fix a bug when LowerSwitch deletes the default block Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block is deleted, it contains no instruction or terminator, and it should not be traversed anymore. However, since the iterator is advanced before the processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks in a list and deleting them after all processSwitchInst() calls have been made. It is still possible to visit dead default blocks and waste time processing them. But that is a compile-time issue, and I plan to have another patch to add support for skipping dead blocks. Reviewers: kariddi, resistor, hans, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11852 llvm-svn: 244642	2015-08-11 18:12:26 +00:00
David Majnemer	fd9f47756a	[WinEHPrepare] Add rudimentary support for the new EH instructions This adds somewhat basic preparation functionality including: - Formation of funclets via coloring basic blocks. - Cloning of polychromatic blocks to ensure that funclets have unique program counters. - Demotion of values used between different funclets. - Some amount of cleanup once we have removed predecessors from basic blocks. - Verification that we are left with a CFG that makes some amount of sense. N.B. Arguments and numbering still need to be done. Differential Revision: http://reviews.llvm.org/D11750 llvm-svn: 244558	2015-08-11 01:15:26 +00:00
Adam Nemet	5b0a479541	[LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC This was requested by Hal in D11205. llvm-svn: 244540	2015-08-11 00:09:37 +00:00
Adam Nemet	0bc068728e	[LoopVer] Remove unused pointer partition argument, NFC. llvm-svn: 244527	2015-08-10 23:05:31 +00:00
Tyler Nowicki	c1a86f5866	Late evaluation of the fast-math vectorization requirement. This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options that would allow floating-point commutativity. Specifically, those are enabling fast-math or specifying a loop hint. llvm-svn: 244489	2015-08-10 19:51:46 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Matt Arsenault	b130076469	Remove unnecessary includes llvm-svn: 244382	2015-08-08 00:41:53 +00:00
Chen Li	eafbc9dc47	[ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion. Reviewers: sanjoy, weimingz, chenli Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11841 llvm-svn: 244348	2015-08-07 19:30:12 +00:00
Duncan P. N. Exon Smith	8c9dcace0d	ValueMapper: Resolve uniquing cycles more aggressively As a follow-up to r244181, resolve uniquing cycles underneath distinct nodes on the fly. This prevents uniquing cycles in early operands from affecting later operands. It also removes an iteration through distinct nodes' operands. No real functional change here, just more prompt resolution of temporary nodes. llvm-svn: 244302	2015-08-07 00:44:55 +00:00
Duncan P. N. Exon Smith	c9fdbdb78d	ValueMapper: Pull out helper to resolve cycles, NFC Pull out a helper for resolving uniquing cycles of `Metadata` to remove the boiler-plate of downcasting to `MDNode`. llvm-svn: 244301	2015-08-07 00:39:26 +00:00
David Majnemer	09e1fdb3f4	Revert accidentally committed WinEHPrepare changes This reverts commit r244272, r244273, r244274, and r244275. llvm-svn: 244278	2015-08-06 21:13:51 +00:00
David Majnemer	ac6b298850	Handle PHI nodes prefacing EH pads too llvm-svn: 244274	2015-08-06 21:08:32 +00:00
Sanjoy Das	c18115db9c	[IndVars] Improved logging under DEBUG(); NFC. Before this, we'd print the modified comparison in the "Simplified comparison" case. That looked misleading. llvm-svn: 244264	2015-08-06 20:43:28 +00:00
Pete Cooper	ebcd748927	Convert a bunch of loops to foreach. NFC. After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260	2015-08-06 20:22:46 +00:00
Duncan P. N. Exon Smith	3115f75bf8	ValueMapper: Rotate distinct node remapping algorithm Rotate the algorithm for remapping distinct nodes in order to simplify how uniquing cycles get resolved. This removes some of the recursion, and, most importantly, exposes all uniquing cycles at the top-level. Besides being a little more efficient -- temporary MDNodes won't live as long -- the clearer logic should help protect against bugs like those fixed in r243961 and r243976. What are uniquing cycles? Why do they present challenges when remapping metadata? !0 = !{!1} !1 = !{!0} !0 and !1 form a simple uniquing cycle. When remapping from one metadata graph to another, every uniquing cycle gets "duplicated" through a dance: !0-temp = !{!1?} ; map(!0): clone !0, VM[!0] = !0-temp !1-temp = !{!0?} ; ..map(!1): clone !1, VM[!1] = !1-temp !1-temp = !{!0-temp} ; ..map(!1): remap !1's operands !2 = !{!0-temp} ; ..map(!1): uniquify: !1-temp => !2 !0-temp = !{!2} ; map(!0): remap !0's operands !3 = !{!2} ; map(!0): uniquify: !0-temp => !3 ; Result !2 = !{!3} !3 = !{!2} (In the two "uniquify" steps above, the operands of !X-temp are compared to the operands of !X. If they're the same, then !X-temp gets RAUW'ed to !X; if they're different, then !X-temp is promoted to a new unique node. The latter case always hits in for uniquing cycles, so we duplicate all the nodes involved.) Why is this a problem? Uniquable Metadata nodes that have temporary node as transitive operands keep RAUW support until the temporary nodes get finalized. With non-cycles, this happens automatically: when a uniquable node's count of unresolved operands drops to zero, it immediately sheds its own RAUW support (possibly triggering the same in any node that references it). However, uniquing cycles create a reference cycle, and uniqued nodes that transitively reference a uniquing cycle are "stuck" in an unresolved state until someone calls `MDNode::resolveCycles()` on a node in the unresolved subgraph. 
Distinct nodes should help here (and mostly do): since they aren't uniqued anywhere, they are guaranteed not to be RAUW'ed. They effectively form a barrier between uniqued nodes, breaking some uniquing cycles, and shielding uniqued nodes from uniquing cycles. Unfortunately, with this barrier in place, the unresolved subgraph(s) can be disjoint from the top-level node. The mapping algorithm needs to find at least one representative from each disjoint subgraph. But which nodes are stuck, and which will get resolved automatically? And which nodes are in the unresolved subgraph? The old logic was conservative. This commit rotates the logic for distinct nodes, so that we have access to unresolved nodes at the top-level call to `llvm::MapMetadata()`. Each time we return to the top-level, we know that all temporaries have been RAUW'ed away. Here, it's safe (and necessary) to call `resolveCycles()` immediately on unresolved operands. This should also perform better than the old algorithm. The recursion stack is shorter, temporary nodes don't live as long, and there are fewer tracking references to unresolved nodes. As the debug info graph introduces more 'distinct' nodes, remapping should incrementally get cheaper and cheaper. Aside from possible performance improvements (and reduced cruft in the `LLVMContext`), there should be no functionality change here. llvm-svn: 244181	2015-08-05 23:52:42 +00:00
Duncan P. N. Exon Smith	2705097e47	ValueMapper: Simplify remap() helper function, NFC Rename `remap()` to `remapOperands()`, and restrict its contract to remapping operands. Previously, it also called `mapToMetadata()`, but this logic is hard to reason about externally. In particular, this refactors `mapUniquedNode()` to avoid redundant mapping calls, taking advantage of the RAUWs that are already in place. llvm-svn: 244168	2015-08-05 23:22:34 +00:00
Duncan P. N. Exon Smith	1de9ccb472	Fix 80-column llvm-svn: 243977	2015-08-04 13:24:26 +00:00
Duncan P. N. Exon Smith	5ed90c0278	Linker: Fix ASan failure from r243961 r243883 and r243961 made a use-after-free far more likely: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/6041/steps/check-llvm%20asan/logs/stdio Unresolved nodes get inserted into the `Cycles` array. If they later get resolved through RAUW, we need to update the reference. It's interesting that this never hit before (maybe an asan-ified clang bootstrap with `-flto -g` would have hit it, but I admit I haven't tried anything quite that crazy). llvm-svn: 243976	2015-08-04 13:23:30 +00:00
David Majnemer	eb518bd5d8	Drive-by fixes for LandingPad -> EHPad This change was done as an audit and is by inspection. The new EH system is still very much a work in progress. NFC for the landingpad case. llvm-svn: 243965	2015-08-04 08:21:40 +00:00
Duncan P. N. Exon Smith	706f37e8df	Linker: Fix references to uniqued nodes after r243883 r243883 started moving 'distinct' nodes instead of duplicating them in lib/Linker. This had the side-effect of sometimes not cloning uniqued nodes that reference them. I missed a corner case: !named = !{!0} !0 = !{!1} !1 = distinct !{!0} !0 is the entry point for "remapping", and a temporary clone (say, !0-temp) is created and mapped in case we need to model a uniquing cycle. Recursive descent into !1. !1 is distinct, so we leave it alone, but update its operand to !0-temp. Pop back out to !0. Its only operand, !1, hasn't changed, so we don't need to use !0-temp. !0-temp goes out of scope, and we're finished remapping, but we're left with: !named = !{!0} !0 = !{!1} !1 = distinct !{null} ; uh oh... Previously, if !0 and !0-temp ended up with identical operands, then !0-temp couldn't have been referenced at all. Now that distinct nodes don't get duplicated, that assumption is invalid. We need to !0-temp->replaceAllUsesWith(!0) before freeing !0-temp. I found this while running an internal `-flto -g` bootstrap. Strangely, there was no case of this in the open source bootstrap I'd done before commit... llvm-svn: 243961	2015-08-04 06:42:31 +00:00
Adam Nemet	6b6082dc42	[LoopVer] Remove unused needsRuntimeChecks(), NFC The previous commits moved this functionality into the client. Also remove the now unused member variable. llvm-svn: 243920	2015-08-03 23:32:57 +00:00
Duncan P. N. Exon Smith	4fb46cb818	Linker: Move distinct MDNodes instead of cloning Instead of cloning distinct `MDNode`s when linking in a module, just move them over. The module linker destroys the source module, so the old node would otherwise just be leaked on the context. Create the new node in place. This also reduces the number of cloned uniqued nodes (since it's less likely their operands have changed). This mapping strategy is only correct when we're discarding the source, so the linker turns it on via a ValueMapper flag, `RF_MoveDistinctMDs`. There's nothing observable in terms of `llvm-link` output here: the linked module should be semantically identical. I'll be adding more 'distinct' nodes to the debug info metadata graph in order to break uniquing cycles, so the benefits of this will partly come in future commits. However, we should get some gains immediately, since we have a fair number of 'distinct' `DILocation`s being linked in. llvm-svn: 243883	2015-08-03 17:09:38 +00:00
Duncan P. N. Exon Smith	50f8969e52	ValueMapper: Only check for cycles if operands change This is a minor optimization to only check for unresolved operands inside `mapDistinctNode()` if the operands have actually changed. This shouldn't really cause any change in behaviour. I didn't actually see a slowdown in a profile, I was just poking around nearby and saw the opportunity. llvm-svn: 243866	2015-08-03 03:45:32 +00:00
Duncan P. N. Exon Smith	e08bcbff8f	ValueMapper: Use a range-based for, NFC llvm-svn: 243865	2015-08-03 03:27:12 +00:00
Duncan P. N. Exon Smith	0880014d48	ValueMapper: Reuse local variable, NFC llvm-svn: 243864	2015-08-03 03:24:28 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
David Majnemer	654e130b6e	New EH representation for MSVC compatibility This introduces new instructions necessary to implement MSVC-compatible exception handling support. Most of the middle-end and all of the back-end have not yet been audited or updated to take them into account. Differential Revision: http://reviews.llvm.org/D11097 llvm-svn: 243766	2015-07-31 17:58:14 +00:00
Adam Nemet	252d529b6c	[LoopVer] Add missing std::move The reason I was passing this vector by value in the constructor was so that I wouldn't have to copy when initializing the corresponding member, but then I forgot the std::move. The use-case is LoopDistribution, which filters the checks and then std::moves them to LoopVersioning's constructor. With this interface we can avoid any copies. llvm-svn: 243616	2015-07-30 04:21:13 +00:00
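The sink-parameter idiom that the entry above describes can be sketched as follows. This is only an illustration: `Check`, `Versioner`, and `makeVersioner` are hypothetical stand-ins, not LoopVersioning's actual interface.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a runtime check; not an LLVM type.
using Check = std::string;

class Versioner {
public:
  // Take the vector by value so callers can move into the parameter,
  // then move the parameter into the member. Without std::move here,
  // the member would be copy-constructed from the named parameter.
  explicit Versioner(std::vector<Check> Checks) : Checks(std::move(Checks)) {}

  const std::vector<Check> &checks() const { return Checks; }

private:
  std::vector<Check> Checks;
};

// Builds a list of checks and hands it to Versioner without copying it.
inline Versioner makeVersioner() {
  std::vector<Check> Filtered = {"A vs B", "A vs C"};
  const Check *Buffer = Filtered.data();
  Versioner V(std::move(Filtered));
  // Moving a std::vector hands over its buffer instead of duplicating it.
  assert(V.checks().data() == Buffer);
  return V;
}
```

A caller that still needs its own vector afterwards can pass it without `std::move`, paying for exactly one copy at the call site.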
Adam Nemet	0a674401bf	[LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC Before the patch, the checks were generated internally in addRuntimeCheck. Now, we use the new overloaded version of addRuntimeCheck that takes the ready-made set of checks as a parameter. The checks are now generated by the client (LoopDistribution) with the new RuntimePointerChecking::generateChecks API. Also the new printChecks API is used to print out the checks for debugging. This is to continue the transition over to the new model whereby clients will get the full set of checks from LAA, filter it and then pass it to LoopVersioning and in turn to addRuntimeCheck. llvm-svn: 243382	2015-07-28 05:01:53 +00:00
Sanjoy Das	5dab205ced	[IndVars] Make loop varying predicates loop invariant. Summary: Was D9784: "Remove loop variant range check when induction variable is strictly increasing" This change re-implements D9784 with the two differences: 1. It does not use SCEVExpander and does not generate new instructions. Instead, it does a quick local search for existing `llvm::Value`s that it needs when modifying the `icmp` instruction. 2. It is more general -- it deals with both increasing and decreasing induction variables. I've added all of the tests included with D9784, and two more. As an example on what this change does (copied from D9784): Given C code: ``` for (int i = M; i < N; i++) // i is known not to overflow if (i < 0) break; a[i] = 0; } ``` This transformation produces: ``` for (int i = M; i < N; i++) if (M < 0) break; a[i] = 0; } ``` Which can be unswitched into: ``` if (!(M < 0)) for (int i = M; i < N; i++) a[i] = 0; } ``` I went back and forth on whether the top level logic should live in `SimplifyIndvar::eliminateIVComparison` or be put into its own routine. Right now I've put it under `eliminateIVComparison` because even though the `icmp` is not eliminated, it no longer is an IV comparison. I'm open to putting it in its own helper routine if you think that is better. Reviewers: reames, nicholas, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11278 llvm-svn: 243331	2015-07-27 21:42:49 +00:00
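The three loop forms quoted in the entry above can be written out as compilable functions to check that they behave identically. This is a sketch under stated assumptions: the function names are hypothetical, the array is a `std::vector<int>`, and `N` is assumed not to exceed its size.

```cpp
#include <cassert>
#include <vector>

// Original form: the exit test `I < 0` varies with the induction variable.
inline void loopVariant(int M, int N, std::vector<int> &A) {
  assert(N <= static_cast<int>(A.size()));
  for (int I = M; I < N; I++) { // I is known not to overflow
    if (I < 0)
      break;
    A[I] = 0;
  }
}

// After the transformation: I starts at M and only increases, so `I < 0`
// can only be true on the first iteration, where I == M. Testing `M < 0`
// therefore gives the same result and is loop invariant.
inline void loopInvariant(int M, int N, std::vector<int> &A) {
  assert(N <= static_cast<int>(A.size()));
  for (int I = M; I < N; I++) {
    if (M < 0)
      break;
    A[I] = 0;
  }
}

// After unswitching: the invariant test is hoisted out of the loop.
inline void unswitched(int M, int N, std::vector<int> &A) {
  assert(N <= static_cast<int>(A.size()));
  if (!(M < 0))
    for (int I = M; I < N; I++)
      A[I] = 0;
}
```

With a negative `M` all three versions leave the array untouched; with a non-negative `M` all three zero the elements in `[M, N)`.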
Pete Cooper	7679afda82	Use make_range(rbegin(), rend()) to allow foreach loops. NFC. Instead of the pattern for (auto I = x.rbegin(), E = x.rend(); I != E; ++I) we can use make_range to construct the reverse range and iterate using that instead. llvm-svn: 243163	2015-07-24 21:13:43 +00:00
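The reverse-range pattern from the entry above can be sketched with the standard library alone. `IterRange` and `makeRange` are hypothetical, minimal helpers written for this illustration; LLVM's own helper is `llvm::make_range` from `llvm/ADT/iterator_range.h`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal range wrapper: it only stores a begin/end pair so
// that a range-based for loop can iterate over it.
template <typename IterT> class IterRange {
  IterT Begin, End;

public:
  IterRange(IterT B, IterT E) : Begin(B), End(E) {}
  IterT begin() const { return Begin; }
  IterT end() const { return End; }
};

template <typename IterT> IterRange<IterT> makeRange(IterT B, IterT E) {
  return IterRange<IterT>(B, E);
}

// Collects the elements of V back to front with a range-based for loop
// instead of an explicit reverse-iterator loop.
inline std::vector<int> reversedCopy(const std::vector<int> &V) {
  std::vector<int> Out;
  for (int X : makeRange(V.rbegin(), V.rend()))
    Out.push_back(X);
  assert(Out.size() == V.size());
  return Out;
}
```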
Kuba Brecka	45dbffdc3d	[asan] Rename the ABI versioning symbol to '__asan_version_mismatch_check' instead of abusing '__asan_init' We currently version `__asan_init` and when the ABI version doesn't match, the linker gives a `undefined reference to '__asan_init_v5'` message. From this, it might not be obvious that it's actually a version mismatch error. This patch makes the error message much clearer by changing the name of the undefined symbol to be `__asan_version_mismatch_check_xxx` (followed by the version string). We obviously don't want the initializer to be named like that, so it's a separate symbol that is used only for the purpose of version checking. Reviewed at http://reviews.llvm.org/D11004 llvm-svn: 243003	2015-07-23 10:54:06 +00:00
Chandler Carruth	194f59ca5d	[PM/AA] Extract the ModRef enums from the AliasAnalysis class in preparation for de-coupling the AA implementations. In order to do this, they had to become fake-scoped using the traditional LLVM pattern of a leading initialism. These can't be actual scoped enumerations because they're bitfields and thus inherently we use them as integers. I've also renamed the behavior enums that are specific to reasoning about the mod/ref behavior of functions when called. This makes it more clear that they have a very narrow domain of applicability. I think there is a significantly cleaner API for all of this, but I don't want to try to do really substantive changes for now, I just want to refactor the things away from analysis groups so I'm preserving the exact original design and just cleaning up the names, style, and lifting out of the class. Differential Revision: http://reviews.llvm.org/D10564 llvm-svn: 242963	2015-07-22 23:15:57 +00:00
Michael Kuperstein	d72403636c	Fix mem2reg to correctly handle allocas only used in a single block Currently, a load from an alloca that is used in a single block and is not preceded by a store is replaced by undef. This is not always correct if the single block is inside a loop. Fix the logic so that: 1) If there are no stores in the block, replace the load with an undef, as before. 2) If there is a store (regardless of where it is in the block w.r.t the load), bail out, and let the rest of mem2reg handle this alloca. Patch by: gil.rapaport@intel.com Differential Revision: http://reviews.llvm.org/D11355 llvm-svn: 242884	2015-07-22 10:29:29 +00:00
Chandler Carruth	96ada25bf3	[PM/AA] Remove all of the dead AliasAnalysis pointers being threaded through APIs that are no longer necessary now that the update API has been removed. This will make changes to the AA interfaces significantly less disruptive (I hope). Either way, it seems like a really nice cleanup. llvm-svn: 242882	2015-07-22 09:52:54 +00:00
Chandler Carruth	a1032a0f7c	[PM/AA] Remove the last of the legacy update API from AliasAnalysis as part of simplifying its interface and usage in preparation for porting to work with the new pass manager. Note that this will likely expose that we have dead arguments, members, and maybe even pass requirements for AA. I'll be cleaning those up in separate patches. This just zaps the actual update API. Differential Revision: http://reviews.llvm.org/D11325 llvm-svn: 242881	2015-07-22 09:49:59 +00:00
Adam Nemet	7cdebac0c8	[LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC I am planning to add more nested classes inside RuntimePointerCheck so all these triple-nesting would be hard to follow. Also rename it to RuntimePointerChecking (i.e. append 'ing'). llvm-svn: 242218	2015-07-14 22:32:44 +00:00
Reid Kleckner	486fa3977a	Update enforceKnownAlignment after the isWeakForLinker semantic change Previously we would refrain from attempting to increase the linkage of available_externally globals because they were considered weak for the linker. Now they are treated more like a declaration instead of a weak definition. This was causing SSE alignment faults in Chromium, when some code assumed it could increase the alignment of a dllimported global that it didn't control. http://crbug.com/509256 llvm-svn: 242091	2015-07-14 00:11:08 +00:00
Chandler Carruth	00ebdbcc47	[PM/AA] Completely remove the AliasAnalysis::copyValue interface. No in-tree alias analysis used this facility, and it was not called in any particularly rigorous way, so it seems unlikely to be correct. Note that one of the only stateful AA implementations in-tree, GlobalsModRef is completely broken currently (and any AA passes like it are equally broken) because Module AA passes are not effectively invalidated when a function pass that fails to update the AA stack runs. Ultimately, it doesn't seem like we know how we want to build stateful AA, and until then trying to support and maintain correctness for an untested API is essentially impossible. To that end, I'm planning to rip out all of the update API. It can return if and when we need it and know how to build it on top of the new pass manager and as part of tested stateful AA implementations in the tree. Differential Revision: http://reviews.llvm.org/D10889 llvm-svn: 241975	2015-07-11 04:39:00 +00:00
Adam Nemet	215746b45a	[LoopDist/LoopVer] Move LoopVersioning to a new module, NFC Summary: The class will obviously need improvement down the road. For one, there is no reason that addPHINodes would have to be exposed like that. I will make this and other improvements in follow-up patches. The main goal is to be able to share this functionality. The LoopLoadElimination pass I am working on needs it too. Later we can move other clients as well (LV and Ashutosh's LICMVer). Reviewers: hfinkel, ashutosh.nema Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10577 llvm-svn: 241932	2015-07-10 18:55:13 +00:00
Adam Nemet	1a689188c4	[LoopDist] Move loop-versioning helper functions to Cloning, NFC Summary: This makes them available to the LoopVersioning class as that is moved to its own module in the next patch. Reviewers: ashutosh.nema, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10576 llvm-svn: 241931	2015-07-10 18:55:09 +00:00
David Majnemer	db82d2f338	Revert the new EH instructions This reverts commits r241888-r241891, I didn't mean to commit them. llvm-svn: 241893	2015-07-10 07:15:17 +00:00
David Majnemer	ae2ffc8a8c	New EH representation for MSVC compatibility Summary: This introduces new instructions necessary to implement MSVC-compatible exception handling support. Most of the middle-end and all of the back-end have not yet been audited or updated to take them into account. Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11041 llvm-svn: 241888	2015-07-10 07:00:44 +00:00
David Majnemer	453f7a1480	[LoopUnroll] Use undef for phis with no value live We would create a phi node with a zero initialized operand instead of undef in the case where no value was originally available. This was problematic for x86_mmx which has no null value. llvm-svn: 241143	2015-07-01 05:38:07 +00:00
David Majnemer	cda8688f61	[Cloning] Teach CloneModule about personality functions CloneModule didn't take into account that it needed to remap the value using values in the module. This fixes PR23992. llvm-svn: 241122	2015-06-30 22:14:01 +00:00
Alexey Samsonov	b7724b95d8	[LoopSimplify] Set proper debug location in loop backedge blocks. Set debug location for terminator instruction in loop backedge block (which is an unconditional jump to loop header). We can't copy debug location from original backedges, as there can be several of them, with different debug info locations. So, we follow the approach of SplitBlockPredecessors, and copy the debug info from first non-PHI instruction in the header (i.e. destination block). This is yet another change for PR23837. llvm-svn: 240999	2015-06-29 21:30:14 +00:00
David Blaikie	b447ac6435	Move VectorUtils from Transforms to Analysis to correct layering violation llvm-svn: 240804	2015-06-26 18:02:52 +00:00
David Blaikie	1213dbf1fd	Fix ODR violation waiting to happen by making static function definitions in VectorUtils.h non-static and defined out of line Patch by Ashutosh Nema Differential Revision: http://reviews.llvm.org/D10682 llvm-svn: 240794	2015-06-26 16:57:30 +00:00
Sanjay Patel	09159b8f47	don't repeat function names in comments; NFC llvm-svn: 240591	2015-06-24 20:40:57 +00:00
Sanjay Patel	adb110c372	fix typos; NFC llvm-svn: 240585	2015-06-24 20:07:50 +00:00
Alexey Samsonov	19ffcb900f	Let llvm::ReplaceInstWithInst copy debug location from old to new instruction. Currently some users of this function do this explicitly, and all the rest forget to do this. ThreadSanitizer was one of such users, and had missing debug locations for calls into TSan runtime handling atomic operations, eventually leading to poorly symbolized stack traces and malfunctioning suppressions. This is another change relevant to PR23837. llvm-svn: 240460	2015-06-23 21:00:08 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Benjamin Kramer	00a477f279	[SwitchLowering] Remove quadratic vector removal. This can be triggered with giant switches. No functionality change intended. llvm-svn: 240221	2015-06-20 15:59:34 +00:00
Justin Bogner	e46d3796fc	LowerSwitch: Avoid some undefined behaviour When a case of INT64_MIN was followed by a case that was greater than zero, we were overflowing a signed integer here. Since we've sorted the cases here anyway (and thus currentValue must be greater than nextValue) it's simple enough to avoid this by using addition rather than subtraction. Found by UBSAN on existing tests. llvm-svn: 240201	2015-06-20 00:28:25 +00:00
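The overflow described in the entry above can be illustrated with a small adjacency test. This is a sketch, not the LowerSwitch code: `isAdjacent` is a hypothetical helper, and it assumes the cases are sorted in ascending order, so the earlier value is strictly smaller than the later one.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when Next is the integer immediately after Current.
// Precondition (an assumption of this sketch): Current < Next.
inline bool isAdjacent(std::int64_t Current, std::int64_t Next) {
  assert(Current < Next && "cases must be sorted and distinct");
  // Writing `Next - Current == 1` would overflow a signed 64-bit integer
  // when Current is INT64_MIN and Next is greater than zero. Because
  // Current < Next, Current is below INT64_MAX and Current + 1 is safe.
  return Current + 1 == Next;
}
```

The addition form relies entirely on the ordering precondition, which is why the helper asserts it before comparing.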
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Eric Christopher	572e03a396	Fix "the the" in comments. llvm-svn: 240112	2015-06-19 01:53:21 +00:00
Benjamin Kramer	2b2cdd7799	[EliminateDuplicatePHINodes] Replace custom hash map with DenseSet. While there use hash_combine instead of hand-rolled hashing. No functionality change intended. llvm-svn: 240023	2015-06-18 16:01:00 +00:00
David Majnemer	7fddeccb8b	Move the personality function from LandingPadInst to Function The personality routine currently lives in the LandingPadInst. This isn't desirable because: - All LandingPadInsts in the same function must have the same personality routine. This means that each LandingPadInst beyond the first has an operand which produces no additional information. - There is ongoing work to introduce EH IR constructs other than LandingPadInst. Moving the personality routine off of any one particular Instruction and onto the parent function seems a lot better than having N different places a personality function can sneak onto an exceptional function. Differential Revision: http://reviews.llvm.org/D10429 llvm-svn: 239940	2015-06-17 20:52:32 +00:00
Tyler Nowicki	27b2c39eb3	Refactor RecurrenceInstDesc Moved RecurrenceInstDesc into RecurrenceDescriptor to simplify the namespaces. llvm-svn: 239862	2015-06-16 22:59:45 +00:00
Tyler Nowicki	0a91310c7f	Rename Reduction variables/structures to Recurrence. A reduction is a special kind of recurrence. In the loop vectorizer we currently identify basic reductions. Future patches will extend this to identifying basic recurrences. llvm-svn: 239835	2015-06-16 18:07:34 +00:00
Alexey Samsonov	ea20199b48	[LoopUnroll] Use IRBuilder to create branch instructions. Use IRBuilder::Create(Cond)?Br instead of constructing instructions manually with BranchInst::Create(). It's consistent with other uses of IRBuilder in this pass, and has an additional important benefit: Using IRBuilder will ensure that new branch instruction will get the same debug location as original terminator instruction it will eventually replace. For now I'm not adding a testcase, as currently original terminator instruction also lack debug location due to missing debug location propagation in BasicBlock::splitBasicBlock. That is, the testcase will accompany the fix for the latter I'm going to mail soon. llvm-svn: 239550	2015-06-11 18:25:44 +00:00
Alexey Samsonov	b7f02d371f	[BasicBlockUtils] Set debug locations for instructions created in SplitBlockPredecessors. Test Plan: regression test suite Reviewers: eugenis, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10343 llvm-svn: 239438	2015-06-09 22:10:29 +00:00
David Majnemer	b58f32f7a8	[LoopVectorize] Don't crash on zero-sized types in isInductionPHI isInductionPHI wants to calculate the stride based on the pointee size. However, this is not possible when the pointee is zero sized. This fixes PR23763. llvm-svn: 239143	2015-06-05 10:52:40 +00:00
David Blaikie	f5147ef0b9	[opaque pointer type] Explicitly store the pointee type of the result of a GEP Alternatively, this type could be derived on-demand whenever getResultElementType is called - if someone thinks that's the better choice (simple time/space tradeoff), I'm happy to give it a go. llvm-svn: 238716	2015-06-01 03:09:34 +00:00
Benjamin Kramer	f5e2fc474d	Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types If the type isn't trivially moveable emplace can skip a potentially expensive move. It also saves a couple of characters. Call sites were found with the ASTMatcher + some semi-automated cleanup. memberCallExpr( argumentCountIs(1), callee(methodDecl(hasName("push_back"))), on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))), hasArgument(0, bindTemporaryExpr( hasType(recordDecl(hasNonTrivialDestructor())), has(constructExpr()))), unless(isInTemplateInstantiation())) No functional change intended. llvm-svn: 238602	2015-05-29 19:43:39 +00:00
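The replacement described in the entry above looks like this on a small example. `Label` and `buildLabels` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type with a non-trivial member.
struct Label {
  std::string Name;
  explicit Label(std::string N) : Name(std::move(N)) {}
};

inline std::vector<Label> buildLabels() {
  std::vector<Label> Labels;
  // Before: construct a temporary Label, then move it into the vector.
  Labels.push_back(Label("entry"));
  // After: forward the argument so the Label is constructed in place.
  Labels.emplace_back("exit");
  assert(Labels.size() == 2);
  return Labels;
}
```

Both calls leave the vector in the same state; the `emplace_back` form just skips the intermediate temporary.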
Philip Reames	7c78ef7dd9	Extend EarlyCSE to handle basic cases from JumpThreading and CVP This patch extends EarlyCSE to take advantage of the information that a controlling branch gives us about the value of a Value within this and dominated basic blocks. If the current block has a single predecessor with a controlling branch, we can infer what the branch condition must have been to execute this block. The actual change to support this is downright simple because EarlyCSE's existing scoped hash table logic deals with most of the complexity around merging. The patch actually implements two optimizations. 1) The first is analogous to JumpThreading in that it enables EarlyCSE's CSE handling to fold branches which are exactly redundant due to a previous branch to branches on constants. (It doesn't actually replace the branch or change the CFG.) This is pretty clearly a win since it enables substantial CFG simplification before we start trying to inline. 2) The second is analogous to CVP in that it exploits the knowledge gained to replace dominated uses of the original value. EarlyCSE does not otherwise reason about specific uses, so this is the more arguable one. It does enable further simplification and constant folding within the rest of the visit by EarlyCSE. In both cases, the added code only handles the easy dominance based case of each optimization. The general case is deferred to the existing passes. Differential Revision: http://reviews.llvm.org/D9763 llvm-svn: 238071	2015-05-22 23:53:24 +00:00
Pete Cooper	9e1d335697	Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC. Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method. This updates all users which were either using an unsigned to store it, or had a now unnecessary cast. llvm-svn: 237810	2015-05-20 17:16:39 +00:00
David Blaikie	ff6409d096	Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only llvm-svn: 237624	2015-05-18 22:13:54 +00:00
Andrew Trick	018e55a187	SimplifyIV comments and dead argument cleanup. Remove crufty comments. IVUsers hasn't been used here for a long time. llvm-svn: 237586	2015-05-18 16:49:31 +00:00
Pete Cooper	41e0ee3074	Change LoadAndStorePromoter to take ArrayRef instead of SmallVectorImpl&. The array passed to LoadAndStorePromoter's constructor was a constant reference to a SmallVectorImpl, which is just the same as passing an ArrayRef. Also, the data in the array can be 'const Instruction' instead of 'Instruction'. It's not possible to convert a SmallVectorImpl<T> to SmallVectorImpl<const T>, but ArrayRef does provide such a method. Currently this adds calls to makeArrayRef which should be a nop, but I'm going to kick off a discussion about improving ArrayRef to not need these. llvm-svn: 237226	2015-05-13 01:12:16 +00:00
Pete Cooper	833f34d837	Convert PHI getIncomingValue() to foreach over incoming_values(). NFC. We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it. Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB. llvm-svn: 237169	2015-05-12 20:05:31 +00:00
Ismail Pazarbasi	56ccf1c9d5	Implement `createSanitizerCtor`, common helper function for all sanitizers Summary: This helper function creates a ctor function, which calls sanitizer's init function with given arguments. This constructor is then expected to be added to module's ctors. The patch helps unifying how sanitizer constructor functions are created, and how init functions are called across all sanitizers. Reviewers: kcc, samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8777 llvm-svn: 236627	2015-05-06 18:48:22 +00:00
David Blaikie	73cf872adb	[opaque pointer type] Track explicit GEP pointee type through in-memory IR llvm-svn: 236510	2015-05-05 18:03:48 +00:00
David Blaikie	bf0a42ac09	[opaque pointer type] Store the value type of an alloca llvm-svn: 236175	2015-04-29 23:00:35 +00:00
David Blaikie	f64246be72	[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space Many of the callers already have the pointer type anyway, and for the couple of callers that don't it's pretty easy to call PointerType::get on the pointee type and address space. This avoids LLParser from using PointerType::getElementType when parsing GlobalAliases from IR. llvm-svn: 236160	2015-04-29 21:22:39 +00:00
Duncan P. N. Exon Smith	a9308c49ef	IR: Give 'DI' prefix to debug info metadata Finish off PR23080 by renaming the debug info IR constructs from `MD` to `DI`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the previous commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to this commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120	2015-04-29 16:38:44 +00:00
Hans Wennborg	86ac630585	SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes When using bit tests for hole checks, we call AddPredecessorToBlock to give the phi node a value from the bit test block. This would break if we've previously called removePredecessor on the default destination because the switch is fully covered. Test case by Mark Lacey. llvm-svn: 235771	2015-04-24 20:57:56 +00:00
Aaron Ballman	5e90906c0d	Removing dead code; NFC. This code was triggering a C4718 warning (recursive call has no side effects, deleting) with MSVC. llvm-svn: 235717	2015-04-24 12:51:45 +00:00
David Blaikie	348de69a30	Recommit r235458: [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst (reverted in r235533) Original commit message: "Calls to llvm::Value::mutateType are becoming extra-sensitive now that instructions have extra type information that will not be derived from operands or result type (alloca, gep, load, call/invoke, etc... ). The special-handling for mutateType will get more complicated as this work continues - it might be worth making mutateType virtual & pushing the complexity down into the classes that need special handling. But with only two significant uses of mutateType (vectorization and linking) this seems OK for now. Totally open to ideas/suggestions/improvements, of course. With this, and a bunch of exceptions, we can roundtrip an indirect call site through bitcode and IR. (a direct call site is actually trickier... I haven't figured out how to deal with the IR deserializer's lazy construction of Function/GlobalVariable decl's based on the type of the entity which means looking through the "pointer to T" type referring to the global)" The remapping done in ValueMapper for LTO was insufficient as the types weren't correctly mapped (though I was using the post-mapped operands, some of those operands might not have been mapped yet so the type wouldn't be post-mapped yet). Instead use the pre-mapped type and explicitly map all the types. llvm-svn: 235651	2015-04-23 21:36:23 +00:00
Karthik Bhat	24e6cc2de4	Move common loop utility function isInductionPHI into LoopUtils.cpp This patch refactors the definition of common utility function "isInductionPHI" to LoopUtils.cpp. This fixes compilation error when configured with -DBUILD_SHARED_LIBS=ON llvm-svn: 235577	2015-04-23 08:29:20 +00:00
David Blaikie	d2db881e85	Revert "[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst" This reverts commit r235458. It looks like this might be breaking something LTO-ish. Looking into it & will recommit with a fix/test case/etc once I've got more to go on. llvm-svn: 235533	2015-04-22 18:16:49 +00:00
David Blaikie	506993636e	[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst Calls to llvm::Value::mutateType are becoming extra-sensitive now that instructions have extra type information that will not be derived from operands or result type (alloca, gep, load, call/invoke, etc... ). The special-handling for mutateType will get more complicated as this work continues - it might be worth making mutateType virtual & pushing the complexity down into the classes that need special handling. But with only two significant uses of mutateType (vectorization and linking) this seems OK for now. Totally open to ideas/suggestions/improvements, of course. With this, and a bunch of exceptions, we can roundtrip an indirect call site through bitcode and IR. (a direct call site is actually trickier... I haven't figured out how to deal with the IR deserializer's lazy construction of Function/GlobalVariable decl's based on the type of the entity which means looking through the "pointer to T" type referring to the global) llvm-svn: 235458	2015-04-21 23:26:57 +00:00
Daniel Berlin	b4e7a4a40c	Revamp PredIteratorCache interface to be cleaner. Summary: This lets us use range based for loops. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9169 llvm-svn: 235416	2015-04-21 21:11:50 +00:00
Daniel Berlin	2372a193ba	Move IDF Calculation to a separate file, expose an interface to it. Summary: MemorySSA uses this algorithm as well, and this enables us to reuse the code in both places. There are no actual algorithm or datastructure changes in here, just code movement. Reviewers: qcolombet, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9118 llvm-svn: 235406	2015-04-21 19:13:02 +00:00
Duncan P. N. Exon Smith	60635e39b6	DebugInfo: Drop rest of DIDescriptor subclasses Delete the remaining subclasses of (the already deleted) `DIDescriptor`. Part of PR23080. llvm-svn: 235404	2015-04-21 18:44:06 +00:00
Duncan P. N. Exon Smith	d4a19a396d	DebugInfo: Assert dbg.declare/value insts are valid Remove early returns for when `getVariable()` is null, and just assert that it never happens. The Verifier already confirms that there's a valid variable on these intrinsics, so we should assume the debug info isn't broken. I also updated a check for a `!dbg` attachment, which the Verifier similarly guarantees. llvm-svn: 235400	2015-04-21 18:24:23 +00:00
Duncan P. N. Exon Smith	2fbe13540a	DebugInfo: Delete subclasses of DIScope Delete subclasses of (the already defunct) `DIScope`, updating users to use the raw pointers from the `Metadata` hierarchy directly. llvm-svn: 235356	2015-04-20 22:10:08 +00:00
Akira Hatanaka	2cc2b63f53	[InlineFunction] Don't add lifetime markers for zero-sized allocas. This commit fixes the code which adds lifetime markers in InlineFunction to skip zero-sized allocas instead of asserting on them. rdar://problem/20531155 llvm-svn: 235312	2015-04-20 16:11:05 +00:00
Karthik Bhat	76aa662cf0	[NFC] Refactor identification of reductions as common utility function. This patch refactors reduction identification code out of LoopVectorizer and exposes them as common utilities. No functional change. Review: http://reviews.llvm.org/D9046 llvm-svn: 235284	2015-04-20 04:38:33 +00:00
Aaron Ballman	a2f9943cf6	Silencing a -Wunused-but-set-variable warning; NFC. llvm-svn: 235094	2015-04-16 13:29:36 +00:00
Duncan P. N. Exon Smith	b273d06b63	DebugInfo: Gut DIScope, DIEnumerator and DISubrange The only class that still has API left is `DIDescriptor` itself. llvm-svn: 235067	2015-04-16 01:37:00 +00:00
Duncan P. N. Exon Smith	35ef22cf53	DebugInfo: Gut DICompileUnit and DIFile Continuing gutting `DIDescriptor` subclasses; this edition, `DICompileUnit` and `DIFile`. In the name of PR23080. llvm-svn: 235055	2015-04-15 23:19:27 +00:00
Duncan P. N. Exon Smith	62e0f454a0	DebugInfo: Remove 'inlinedAt:' field from MDLocalVariable Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory (variables with it seem to be the single largest `Metadata` contributor to memory usage right now in -g -flto builds), this stops optimization and backend passes from having to change local variables. The 'inlinedAt:' field was used by the backend in two ways: 1. To tell the backend whether and into what a variable was inlined. 2. To create a unique id for each inlined variable. Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg` attachment, and change the DWARF backend to use a typedef called `InlinedVariable` which is `std::pair<MDLocalVariable, MDLocation>`. This `DebugLoc` is already passed reliably through the backend (as verified by r234021). This commit removes the check from r234021, but I added a new check (that will survive) in r235048, and changed the `DIBuilder` API in r235041 to require a `!dbg` attachment whose 'scope:' is in the same `MDSubprogram` as the variable's. If this breaks your out-of-tree testcases, perhaps the script I used (mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778 in a moment. llvm-svn: 235050	2015-04-15 22:29:27 +00:00
Duncan P. N. Exon Smith	cd1aecfe36	DebugInfo: Require a DebugLoc in DIBuilder::insertDeclare() Change `DIBuilder::insertDeclare()` and `insertDbgValueIntrinsic()` to take an `MDLocation*`/`DebugLoc` parameter which it attaches to the created intrinsic. Assert at creation time that the `scope:` field's subprogram matches the variable's. There's a matching `clang` commit to use the API. The context for this is PR22778, which is removing the `inlinedAt:` field from `MDLocalVariable`, instead deferring to the `!dbg` location attached to the debug info intrinsic. The best way to ensure we always have a `!dbg` attachment is to require one at creation time. I'll be adding verifier checks next, but this API change is the best way to shake out frontend bugs. Note: I added an `llvm_unreachable()` in `bindings/go` and passed in `nullptr` for the `DebugLoc`. The `llgo` folks will eventually need to pass a valid `DebugLoc` here. llvm-svn: 235041	2015-04-15 21:18:07 +00:00
Duncan P. N. Exon Smith	acdee690c8	DebugInfo: Update signature of DICompileUnit::replace*() Change `DICompileUnit::replaceSubprograms()` and `DICompileUnit::replaceGlobalVariables()` to match the `MDCompileUnit` equivalents that they're wrapping. llvm-svn: 234852	2015-04-14 03:51:36 +00:00
Duncan P. N. Exon Smith	537b4a8159	DebugInfo: Gut DISubprogram and DILexicalBlock* Gut the `DIDescriptor` wrappers around `MDLocalScope` subclasses. Note that `DILexicalBlock` wraps `MDLexicalBlockBase`, not `MDLexicalBlock`. llvm-svn: 234850	2015-04-14 03:40:37 +00:00
Sanjoy Das	e178f46965	[LoopUnrollRuntime] Avoid high-cost trip count computation. Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846	2015-04-14 03:20:38 +00:00
Duncan P. N. Exon Smith	b7e221ba55	DebugInfo: Gut DILocation This is along the same lines as r234832, but for `DILocation`. Clean out all accessors from `DILocation`. Any callers should be using `MDLocation` directly (e.g., via `operator->()`). llvm-svn: 234835	2015-04-14 01:35:55 +00:00
Duncan P. N. Exon Smith	6a0320a991	DebugInfo: Gut DIExpression Completely gut `DIExpression`, turning it into a simple wrapper around `MDExpression `. There are two bits of magic left: - It's constructed from `const MDExpression` but convertible to `MDExpression*`. - It's default-constructed to `nullptr`. Otherwise, it should behave quite like a raw pointer. Once I've done the same to the rest of the `DIDescriptor` subclasses, I'll come back to delete them entirely (and update call sites as necessary to deal with the missing magic). llvm-svn: 234832	2015-04-14 01:12:42 +00:00
Duncan P. N. Exon Smith	843237f573	DebugInfo: Move DILocation::computeNewDiscriminators() As documented in PR23200 (and the FIXMEs I've added to the code here), this logic is fairly broken: it modifies the `LLVMContext` in a way that affects other modules and cannot be serialized to assembly/bitcode. For now, move it over to `MDLocation::computeNewDiscriminators()` anyway. llvm-svn: 234825	2015-04-14 00:35:42 +00:00
Duncan P. N. Exon Smith	4fd839b0da	AddDiscriminators: Create new MDLocation directly I don't see a reason to add the `copyWithNewScope()` API over to `MDLocation` -- it seems to be a holdover from when creating locations required knowing details of operand layout -- so change `AddDiscriminators` to call `MDLocation::get()` directly. Should be no functionality change here. llvm-svn: 234824	2015-04-14 00:34:30 +00:00
Mark Lacey	274f48b5a8	Fix typo. llvm-svn: 234706	2015-04-12 18:18:51 +00:00
Sanjoy Das	71190feca5	[LoopUnrollRuntime] Clean up a predicate. Clean up a predicate I added in r229731, fix the relevant comment and add a test case. The earlier version is confusing to read and was also buggy (probably not a coincidence) till Alexey fixed it in r233881. llvm-svn: 234701	2015-04-12 01:24:01 +00:00
Duncan P. N. Exon Smith	63ffa21d90	DebugInfo: Rewrite atSameLineAs() as MDLocation::canDiscriminate() Rewrite `DILocation::atSameLineAs()` as `MDLocation::canDiscriminate()` with a doxygen comment explaining its purpose. I've added a few FIXMEs where I think this check is too weak; fixing that is tracked by PR23199. llvm-svn: 234674	2015-04-11 01:00:47 +00:00
Reid Kleckner	6e48a826e8	[WinEH] Try to make outlining invokes work a little better WinEH currently turns invokes into calls. Long term, we will reconsider this, but for now, make sure we remap the operands and clone the successors of the new terminator. llvm-svn: 234608	2015-04-10 16:26:42 +00:00
Benjamin Kramer	3a09ef64ee	[CallSite] Make construction from Value* (or Instruction) explicit. CallSite roughly behaves as a common base CallInst and InvokeInst. Bring the behavior closer to that model by making upcasts explicit. Downcasts remain implicit and work as before. Following dyn_cast as a mental model checking whether a Value V isa CallSite now looks like this: if (auto CS = CallSite(V)) // think dyn_cast instead of: if (CallSite CS = V) This is an extra token but I think it is slightly clearer. Making the ctor explicit has the advantage of not accidentally creating nullptr CallSites, e.g. when you pass a Value * to a function taking a CallSite argument. llvm-svn: 234601	2015-04-10 14:50:08 +00:00
Cameron Zwarich	b282ef0111	Eliminate O(n^2) worst-case behavior in SSA construction The code uses a priority queue and a worklist, which share the same visited set, but the visited set is only updated when inserting into the priority queue. Instead, switch to using separate visited sets for the priority queue and worklist. llvm-svn: 234425	2015-04-08 18:26:20 +00:00
Duncan P. N. Exon Smith	000fa2c646	DebugInfo: Remove DITypedArray<>, replace with typedefs Replace all uses of `DITypedArray<>` with `MDTupleTypedArrayWrapper<>` and `MDTypeRefArray`. The APIs are completely different, but the provided functionality is the same: treat an `MDTuple` as if it's an array of a particular element type. To simplify this patch a bit, I've temporarily typedef'ed `DebugNodeArray` to `DIArray` and `MDTypeRefArray` to `DITypeArray`. I've also temporarily conditionalized the accessors to check for null -- eventually these should be changed to asserts and the callers should check for null themselves. There's a tiny accompanying patch to clang. llvm-svn: 234290	2015-04-07 04:14:33 +00:00
Duncan P. N. Exon Smith	6186fb2cd0	Transforms: Stop using DIDescriptor::is*() and auto-casting Same as r234255, but for lib/Analysis and lib/Transforms. llvm-svn: 234257	2015-04-06 23:27:00 +00:00
Ismail Pazarbasi	198d6d53e2	Move `checkInterfaceFunction` to ModuleUtils Summary: Instead of making a local copy of `checkInterfaceFunction` for each sanitizer, move the function in a common place. Reviewers: kcc, samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8775 llvm-svn: 234220	2015-04-06 21:09:08 +00:00
David Blaikie	aa41cd57e0	[opaque pointer type] More GEP IRBuilder API migrations... llvm-svn: 234058	2015-04-03 21:33:42 +00:00
David Blaikie	65fab6d896	Use early returns to reduce indentation. llvm-svn: 234057	2015-04-03 21:32:06 +00:00
Alexey Samsonov	f96cde9f68	Fix a bug indicated by -fsanitize=shift-exponent. llvm-svn: 233881	2015-04-02 01:30:10 +00:00
Ahmed Bougacha	408d010a7c	[SimplifyLibCalls] Ignore nobuiltin/unavailable fortified libcalls. We used to do this before refactorings around r225640. Some clang users checked for _chk libcall availability using: __has_builtin(__builtin___memcpy_chk) When compiling with -fno-builtin, this is always true. When passing -ffreestanding/-mkernel, which both imply -fno-builtin, we end up with fortified libcalls, which isn't acceptable in a freestanding environment which only provides their non-fortified counterparts. Until we change clang and/or teach external users to check for availability differently, disregard the "nobuiltin" attribute and TLI::has. Workaround for PR23093. llvm-svn: 233776	2015-04-01 00:45:09 +00:00
David Blaikie	3909da7f4b	[opaque pointer type] More IRBuilder::createGEP (non-inbounds) migrations: CodeGenPrepare and SimplifyLibCalls llvm-svn: 233596	2015-03-30 20:42:56 +00:00
Duncan P. N. Exon Smith	ec819c096b	Transforms: Use the new DebugLoc API, NFC Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API. llvm-svn: 233587	2015-03-30 19:49:49 +00:00
Philip Reames	2b969d7010	Merge empty landing pads in SimplifyCFG This patch tries to merge duplicate landing pads when they branch to a common shared target. Given IR that looks like this: lpad1: %exn = landingpad {i8, i32} personality i32 (...) @__gxx_personality_v0 cleanup br label %shared_resume lpad2: %exn2 = landingpad {i8, i32} personality i32 (...) @__gxx_personality_v0 cleanup br label %shared_resume shared_resume: call void @fn() ret void } We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well. Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block. Differential Revision: http://reviews.llvm.org/D8297 llvm-svn: 233125	2015-03-24 22:28:45 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
Benjamin Kramer	1f7c328bf2	[ctorutils] Update and sort includes. NFC. llvm-svn: 232995	2015-03-23 19:06:17 +00:00
Benjamin Kramer	16132e6faa	Purge unused includes throughout libSupport. NFC. llvm-svn: 232976	2015-03-23 18:07:13 +00:00
Benjamin Kramer	d6aa0ec737	[SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform. llvm-svn: 232903	2015-03-21 22:04:26 +00:00
Benjamin Kramer	7857d723f1	[SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. strchr("123!", C) != nullptr is a common pattern to check if C is one of 1, 2, 3 or !. If the largest element of the string is smaller than the target's register size we can easily create a bitfield and just do a simple test for set membership. int foo(char C) { return strchr("123!", C) != nullptr; } now becomes cmpl $64, %edi ## range check sbbb %al, %al movabsq $0xE000200000001, %rcx btq %rdi, %rcx ## bit test sbbb %cl, %cl andb %al, %cl ## and the two conditions andb $1, %cl movzbl %cl, %eax ## returning an int ret (imho the backend should expand this into a series of branches, but that's a different story) The code is currently limited to bit fields that fit in a register, so usually 64 or 32 bits. Sadly, this misses anything using alpha chars or {}. This could be fixed by just emitting a i128 bit field, but that can generate really ugly code so we have to find a better way. To some degree this is also recreating switch lowering logic, but we can't simply emit a switch instruction and thus change the CFG within instcombine. llvm-svn: 232902	2015-03-21 21:09:33 +00:00
Benjamin Kramer	691363e7f2	SimplifyLibCalls: Add basic optimization of memchr calls. This is just memchr(x, y, 0) -> nullptr and constant folding. llvm-svn: 232896	2015-03-21 15:36:21 +00:00
Andrew Kaylor	3170e5620e	Fixing a bug with WinEH PHI handling llvm-svn: 232851	2015-03-20 21:42:54 +00:00
Sanjoy Das	7182d36f66	[ConstantRange] Split makeICmpRegion in two. Summary: This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and `makeSatisfyingICmpRegion` with slightly different contracts. The first one is useful for determining what values some expression //may// take, given that a certain `icmp` evaluates to true. The second one is useful for determining what values are guaranteed to //satisfy// a given `icmp`. Reviewers: nlewycky Reviewed By: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8345 llvm-svn: 232575	2015-03-18 00:41:24 +00:00
Michael Liao	24fcae8fa0	[SwitchLowering] Remove incoming values in the reverse order - To prevent invalidating successive indices. llvm-svn: 232510	2015-03-17 18:03:10 +00:00
Duncan P. N. Exon Smith	170c26d75e	MapMetadata: Allow unresolved metadata if it won't change Allow unresolved nodes through the `MapMetadata()` if `RF_NoModuleLevelChanges`, since there's no remapping to do anyway. This fixes PR22929. I'll add a clang test as a follow-up. llvm-svn: 232449	2015-03-17 01:14:40 +00:00
David Blaikie	741c8f81e4	[opaque pointer type] Start migrating GEP creation to explicitly specify the pointee type I'm just going to migrate these in a pretty ad-hoc & incremental way - providing the backwards compatible API for now, then locally removing it, fixing a few callers, adding it back in and committing those callers. Rinse, repeat. The assertions should ensure that if I get this wrong we'll find out about it and not just have one giant patch to revert, recommit, revert, recommit, etc. llvm-svn: 232240	2015-03-14 01:53:18 +00:00
Andrew Kaylor	6b67d42773	Extended support for native Windows C++ EH outlining Differential Review: http://reviews.llvm.org/D7886 llvm-svn: 231981	2015-03-11 23:22:06 +00:00
Sanjay Patel	c04b6f242c	Inliner should not add callgraph edges for intrinsic calls (PR22857) The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics. This patch makes sure that the Inliner does not add such an edge to the callgraph. Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857 Differential Revision: http://reviews.llvm.org/D8231 llvm-svn: 231927	2015-03-11 15:12:32 +00:00
Sanjay Patel	0fdb437b25	remove function names from comments; NFC llvm-svn: 231826	2015-03-10 19:42:57 +00:00
Sanjay Patel	abf7023c63	remove names from comments; NFC llvm-svn: 231813	2015-03-10 18:41:22 +00:00
Sanjay Patel	51bd9421ac	fix typos; NFC llvm-svn: 231812	2015-03-10 18:37:05 +00:00
Mehdi Amini	a28d91d81b	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointers to DataLayout into references as I could; this helped in figuring out all the places where a nullptr could come up. I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740	2015-03-10 02:37:25 +00:00
Benjamin Kramer	fd3bc74460	SymbolRewriter: Hide implementation details NFC. llvm-svn: 231660	2015-03-09 15:50:47 +00:00
Kevin Qin	65b07b8e1b	Revert r231630 - Run LICM pass after loop unrolling pass. As it broke llvm bootstrap. llvm-svn: 231635	2015-03-09 07:26:37 +00:00
Kevin Qin	a998735def	Run LICM pass after loop unrolling pass. Runtime unrolling will introduce a runtime check in the loop prologue. If the unrolled loop is an inner loop, then the prologue will be inside the outer loop. The LICM pass can help to promote the runtime check out if the checked value is loop invariant. llvm-svn: 231630	2015-03-09 06:14:07 +00:00
Sanjoy Das	a5397c0198	[IndVarSimplify] use the "canonical" way to infer no-wrap. Summary: rL225282 introduced an ad-hoc way to promote some additions to nuw or nsw. Since then SCEV has become smarter in directly proving no-wrap; and using the canonical "ext(A op B) == ext(A) op ext(B)" method of proving no-wrap is just as powerful now. Rip out the existing complexity in favor of getting SCEV to do all the heavy lifting internally. This change does not add any unit tests because it is supposed to be a non-functional change. Tests added in rL225282 and rL226075 are valid tests for this change. Reviewers: atrick, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7981 llvm-svn: 231306	2015-03-04 22:24:23 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Andrew Kaylor	f22fe4ae18	Remap frame variables for native Windows exception handling. Differential Revision: http://reviews.llvm.org/D7770 llvm-svn: 230249	2015-02-23 20:01:56 +00:00
Chad Rosier	543900539f	Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity. This patch adds the isProfitableToHoist API. For AArch64, we want to prevent a fmul from being hoisted in cases where it is more profitable to form a fmsub/fmadd. Phabricator Review: http://reviews.llvm.org/D7299 Patch by Lawrence Hu <lawrence@codeaurora.org> llvm-svn: 230241	2015-02-23 19:15:16 +00:00
Benjamin Kramer	dfedfeb298	SSAUpdater: Use range-based for. NFC. llvm-svn: 229908	2015-02-19 20:04:02 +00:00
Benjamin Kramer	ea68a944a1	Demote vectors to arrays. No functionality change. llvm-svn: 229861	2015-02-19 15:26:17 +00:00
Sanjoy Das	11b279a832	Partial fix for bug 22589 Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue loop and the extra check for overflowing trip-count. Differential Revision: http://reviews.llvm.org/D7715 llvm-svn: 229731	2015-02-18 19:32:25 +00:00
Andrew Kaylor	527c5dc68d	Adding implementation to outline C++ catch handlers for native Windows 64 exception handling. Differential Revision: http://reviews.llvm.org/D7363 llvm-svn: 229715	2015-02-18 18:31:51 +00:00
Benjamin Kramer	6cd780ff21	Prefer SmallVector::append/insert over push_back loops. Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500	2015-02-17 15:29:18 +00:00
Evgeniy Stepanov	292acab847	[asan] Reuse a common function. Do not reimplement RoundUpToAlignment. llvm-svn: 229397	2015-02-16 14:49:37 +00:00
Aaron Ballman	f9a1897c72	Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition. llvm-svn: 229340	2015-02-15 22:54:22 +00:00
James Molloy	1b6207e6eb	[SimplifyCFG] Be more aggressive Up the phi node folding threshold from a cheap "1" to a meagre "2". Update tests for extra added selects and slight code churn. llvm-svn: 229099	2015-02-13 10:48:30 +00:00
Chandler Carruth	30d69c2e36	[PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094	2015-02-13 10:01:29 +00:00
James Molloy	7c336576a5	[SimplifyCFG] Swap to using TargetTransformInfo for cost analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn. Test changes: * combine-comparisons-by-cse.ll: Removed unneeded branch check. * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq. * coalesce-subregs.ll: Superfluous block check. * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv. * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present. * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI. llvm-svn: 228826	2015-02-11 12:15:41 +00:00
Zachary Turner	3bd47cee78	Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. This allows IDEs to recognize the entire set of header files for each of the core LLVM projects. Differential Revision: http://reviews.llvm.org/D7526 Reviewed By: Chris Bieneman llvm-svn: 228798	2015-02-11 03:28:02 +00:00
Reid Kleckner	96d011315a	Don't promote asynch EH invokes of nounwind functions to calls If the landingpad of the invoke is using a personality function that catches asynch exceptions, then it can catch a trap. Also add some landingpads to invalid LLVM IR test cases that lack them. Over-the-shoulder reviewed by David Majnemer. llvm-svn: 228782	2015-02-11 01:23:16 +00:00
Duncan P. N. Exon Smith	bd75ad4d0c	IR: Take uint64_t in DIBuilder::createExpression() `DIExpression` deals with `uint64_t`, so it doesn't make sense that `createExpression()` is created from `int64_t`. Switch to `uint64_t` to unify them. I've temporarily left in the `int64_t` version, which forwards to the `uint64_t` version. I'll delete it once I've updated the callers. llvm-svn: 228619	2015-02-09 22:13:27 +00:00
Akira Hatanaka	8d3cb829ce	Fix a bug in DemoteRegToStack where a reload instruction was inserted into the wrong basic block. This would happen when the result of an invoke was used by a phi instruction in the invoke's normal destination block. An instruction to reload the invoke's value would get inserted before the critical edge was split and a new basic block (which is the correct insertion point for the reload) was created. This commit fixes the bug by splitting the critical edge before all the reload instructions are inserted. Also, hoist up the code which computes the insertion point to the only place that need that computation. rdar://problem/15978721 llvm-svn: 228566	2015-02-09 06:38:23 +00:00
Bjorn Steinbrink	5ec7522771	Correctly combine alias.scope metadata by a union instead of intersecting Summary: The alias.scope metadata represents sets of things an instruction might alias with. When generically combining the metadata from two instructions the result must be the union of the original sets, because the new instruction might alias with anything any of the original instructions aliased with. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7490 llvm-svn: 228525	2015-02-08 17:07:14 +00:00
Hans Wennborg	8b4dbdf15d	LowerSwitch: Use ConstantInt for CaseRange::{Low,High} Case values are always ConstantInt. This allows us to remove a bunch of casts. NFC. llvm-svn: 228312	2015-02-05 16:58:10 +00:00
Hans Wennborg	8c82fbcb73	LowerSwitch: remove default args from CaseRange ctor; NFC llvm-svn: 228311	2015-02-05 16:50:27 +00:00
Duncan P. N. Exon Smith	920df5c1bb	Utils: Resolve cycles under distinct MDNodes Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`, and resolve them at the end. Previously, these cycles wouldn't get resolved. llvm-svn: 228180	2015-02-04 19:44:34 +00:00
Jingyue Wu	49a766e468	Resurrect the assertion removed by r227717 Summary: MSVC can compile "LoopID->getOperand(0) == LoopID" when LoopID is MDNode*. Test Plan: no regression Reviewers: mkuper Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7327 llvm-svn: 227853	2015-02-02 20:41:11 +00:00
Michael Kuperstein	a691f3e921	Removed assert that doesn't typecheck and breaks debug MSVC build. llvm-svn: 227717	2015-02-01 08:46:20 +00:00
Jingyue Wu	0220df0dfd	[NVPTX] Emit .pragma "nounroll" for loops marked with nounroll Summary: The CUDA driver can unroll loops when jit-compiling PTX. To prevent the CUDA driver from unrolling a loop marked with llvm.loop.unroll.disable, we need to emit .pragma "nounroll" at the header of that loop. This patch also extracts getting unroll metadata from loop ID metadata into a shared helper function. Test Plan: test/CodeGen/NVPTX/nounroll.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7041 llvm-svn: 227703	2015-02-01 02:27:45 +00:00
Adrian Prantl	133e102b8f	Remove a redundant dyn_cast. llvm-svn: 227605	2015-01-30 19:42:59 +00:00
Adrian Prantl	3e2659eb92	Inliner: Use replaceDbgDeclareForAlloca() instead of splicing the instruction and generalize it to optionally dereference the variable. Follow-up to r227544. llvm-svn: 227604	2015-01-30 19:37:48 +00:00
Adrian Prantl	4d365250ec	Fix PR22386. The inliner moves static allocas to the entry basic block so we need to move the dbg.declare intrinsics that describe them, too. llvm-svn: 227544	2015-01-30 01:55:25 +00:00
Philip Reames	9198b33b48	Teach SplitBlockPredecessors how to handle landingpad blocks. Patch by: Igor Laevsky <igor@azulsystems.com> "Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among its users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors." Differential Revision: http://reviews.llvm.org/D7157 llvm-svn: 227390	2015-01-28 23:06:47 +00:00
Chandler Carruth	b81dfa6378	[LPM] Stop using the string based preservation API. It is an abomination. For starters, this API is incredibly slow. In order to lookup the name of a pass it must take a memory fence to acquire a pointer to the managed static pass registry, and then potentially acquire locks while it consults this registry for information about what passes exist by that name. This stops the world of LLVMs in your process no matter how little they cared about the result. To make this more joyful, you'll note that we are preserving many passes which do not exist any more, or are not even analyses which one might wish to have be preserved. This means we do all the work only to say "nope" with no error to the user. String-based APIs are a bad idea. String-based APIs that cannot produce any meaningful error are an even worse idea. =/ I have a patch that simply removes this API completely, but I'm hesitant to commit it as I don't really want to perniciously break out-of-tree users of the old pass manager. I'd rather they just have to migrate to the new one at some point. If others disagree and would like me to kill it with fire, just say the word. =] llvm-svn: 227294	2015-01-28 04:57:56 +00:00
Saleem Abdulrasool	c44d71b8df	SymbolRewriter: allow rewriting with comdats COMDATs must be identically named to the symbol. When support for COMDATs was introduced, the symbol rewriter was not updated, resulting in rewriting failing for symbols which were placed into COMDATs. This corrects the behaviour and adds test cases for this. llvm-svn: 227261	2015-01-27 22:57:39 +00:00
Saleem Abdulrasool	9769b18cba	SymbolRewriter: prevent unnecessary rewrite The rewrite for the pattern based rewrite is unnecessary if the existing name matches the pattern. llvm-svn: 227260	2015-01-27 22:57:35 +00:00
Ahmed Bougacha	1ac9356524	[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk. This was introduced in a faulty refactoring (r225640, mea culpa): the tests weren't testing the return values, so, for both __strcpy_chk and __stpcpy_chk, we would return the end of the buffer (matching stpcpy) instead of the beginning (for strcpy). The root cause was the prefix "__" being ignored when comparing, which made us always pick LibFunc::stpcpy_chk. Pass the LibFunc::Func directly to avoid this kind of error. Also, make the testcases as explicit as possible to prevent this. The now-useful testcases expose another, entangled, stpcpy problem, with the further simplification. This was introduced in a refactoring (r225640) to match the original behavior. However, this leads to problems when successive simplifications generate several similar instructions, none of which are removed by the custom replaceAllUsesWith. For instance, InstCombine (the main user) doesn't erase the instruction in its custom RAUW. When trying to simplify say __stpcpy_chk: - first, an stpcpy is created (fortified simplifier), - second, a memcpy is created (normal simplifier), but the stpcpy call isn't removed. - third, InstCombine later revisits the instructions, and simplifies the first stpcpy to a memcpy. We now have two memcpys. llvm-svn: 227250	2015-01-27 21:52:16 +00:00
Hans Wennborg	b64cb271dc	SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable The range check would get optimized away later, but we might as well not emit them in the first place. http://reviews.llvm.org/D6471 llvm-svn: 227126	2015-01-26 19:52:34 +00:00
Hans Wennborg	6800008f04	SimplifyCFG: don't remove unreachable default switch destinations An unreachable default destination can be exploited by other optimizations and allows for more efficient lowering. Both the SDag switch lowering and LowerSwitch can exploit unreachable defaults. Also make TurnSwitchRangeICmp handle switches with unreachable default. This is kind of a separate change, but it cannot be tested without the change above, and I don't want to land the change above without this since that would regress other tests. Differential Revision: http://reviews.llvm.org/D6471 llvm-svn: 227125	2015-01-26 19:52:32 +00:00
Hans Wennborg	90b827cae2	Make ConstantFoldTerminator() handle switches with unreachable default. Tested by Transforms/SimplifyCFG/switch-to-br.ll's @unreachable function. Differential Revision: http://reviews.llvm.org/D6471 llvm-svn: 227124	2015-01-26 19:52:24 +00:00
Chandler Carruth	72793727cc	[PM] Move the LowerExpectIntrinsic pass to the Scalar library. It was already in the Scalar header and referenced extensively as being in this library, the source file was just in the utils directory for some reason. No actual functionality changed. I noticed as it didn't make sense to add a pass header to the utils headers. llvm-svn: 226991	2015-01-24 10:18:47 +00:00
Hans Wennborg	ae9c971a2f	LowerSwitch: replace unreachable default with popular case destination SimplifyCFG currently does this transformation, but I'm planning to remove that to allow other passes, such as this one, to exploit the unreachable default. This patch takes care to keep track of what case values are unreachable even after the transformation, allowing for more efficient lowering. Differential Revision: http://reviews.llvm.org/D6697 llvm-svn: 226934	2015-01-23 20:43:51 +00:00
Reid Kleckner	f12b33454f	Revert "Don't remove a landing pad if the invoke requires a table entry." This reverts commit r176827. Björn Steinbrink pointed out that this didn't actually fix the bug (PR15555) it was attempting to fix. With this reverted, we can now remove landingpad cleanups that immediately resume unwinding, converting the invoke to a call. llvm-svn: 226850	2015-01-22 19:29:46 +00:00
David Blaikie	df706288fb	DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined calls being coalesced When two calls from the same MDLocation are inlined they currently get treated as one inlined function call (creating difficulty debugging, duplicate variables, etc). Clang worked around this by including column information on inline calls which doesn't address LTO inlining or calls to the same function from the same line and column (such as through a macro). It also didn't address ctor and member function calls. By making the inlinedAt locations distinct, every call site has an explicitly distinct location that cannot be coalesced with any other call. This can produce linearly (2x in the worst case where every call is inlined and the call instruction has a non-call instruction at the same location) more debug locations. Any increase beyond that is in cases where the Clang workaround was insufficient and the new scheme is creating necessary distinct nodes that were being erroneously coalesced previously. After this change to LLVM, the incomplete workarounds in Clang can be removed. That should reduce the number of debug locations (in a build without column info, the default on Darwin, not the default on Linux) by not creating pseudo-distinct locations for every call to an inline function. (oh, and I made the inlined-at chain rebuilding iterative instead of recursive because I was having trouble wrapping my head around it the way it was - open to discussion on the right design for that function (including going back to a recursive solution)) llvm-svn: 226736	2015-01-21 22:57:29 +00:00
Chandler Carruth	9280382ac6	[PM] Replace an abuse of inheritance to override a single function with a more direct approach: a type-erased glorified function pointer. Now we can pass a function pointer into this for the easy case and we can even pass a lambda into it in the interesting case in the instruction combiner. I'll be using this shortly to simplify the interfaces to InstCombiner, but this helps pave the way and seems like a better design for the libcall simplifier utility. llvm-svn: 226640	2015-01-21 02:11:59 +00:00
Duncan P. N. Exon Smith	03e0583a2d	IR: Move MDNode clone() methods from ValueMapper to MDNode, NFC Now that the clone methods used by `MapMetadata()` don't do any remapping (and return a temporary), they make more sense as member functions on `MDNode` (and subclasses). llvm-svn: 226541	2015-01-20 02:56:57 +00:00
Chandler Carruth	10f28f26fd	[PM] Replace the Pass argument in MergeBasicBlockIntoOnlyPred with a DominatorTree argument as that is the analysis that it wants to update. This removes the last non-loop utility function in Utils/ which accepts a raw Pass argument. llvm-svn: 226537	2015-01-20 01:37:09 +00:00
Duncan P. N. Exon Smith	fed199a758	IR: Introduce GenericDwarfNode As part of PR22235, introduce `DwarfNode` and `GenericDwarfNode`. The former is a metadata node with a DWARF tag. The latter matches our current (generic) schema of a header with string (and stringified integer) data and an arbitrary number of operands. This doesn't move it into place yet; that change will require a large number of testcase updates. llvm-svn: 226529	2015-01-20 00:01:43 +00:00
Duncan P. N. Exon Smith	2bc00f4a38	IR: Merge UniquableMDNode back into MDNode, NFC As pointed out in r226501, the distinction between `MDNode` and `UniquableMDNode` is confusing. When we need subclasses of `MDNode` that don't use all its functionality it might make sense to break it apart again, but until then this makes the code clearer. llvm-svn: 226520	2015-01-19 23:13:14 +00:00
Duncan P. N. Exon Smith	6dc22bf27b	Utils: Simplify MapMetadata(), NFC Extract out the operand remapping loops, which are now very similar. llvm-svn: 226515	2015-01-19 22:44:32 +00:00
Duncan P. N. Exon Smith	9fa10658ce	Skip upcast, NFC llvm-svn: 226514	2015-01-19 22:41:14 +00:00
Duncan P. N. Exon Smith	c862be860d	Fix whitespace, NFC llvm-svn: 226512	2015-01-19 22:40:25 +00:00
Duncan P. N. Exon Smith	0dcffe2cdc	Utils: Simplify MapMetadata(), NFC Take advantage of the new ability of temporary nodes to mutate to distinct and uniqued nodes to greatly simplify the `MapMetadata()` helper functions. llvm-svn: 226511	2015-01-19 22:39:07 +00:00
Duncan P. N. Exon Smith	422e5c7acc	Cleanup whitespace, NFC llvm-svn: 226507	2015-01-19 22:16:01 +00:00
Duncan P. N. Exon Smith	7d82313bcd	IR: Return unique_ptr from MDNode::getTemporary() Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and clean up call sites. (For now, `DIBuilder` call sites just call `release()` immediately.) There's an accompanying change in each of clang and polly to use the new API. llvm-svn: 226504	2015-01-19 21:30:18 +00:00
Duncan P. N. Exon Smith	946fdcc50c	IR: Remove MDNodeFwdDecl Remove `MDNodeFwdDecl` (as promised in r226481). Aside from API changes, there's no real functionality change here. `MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`, which returns a tuple with `isTemporary()` equal to true. The main point is that we can now add temporaries of other `MDNode` subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the first place because I didn't recognize this need, and thought they were only needed to handle forward references). A few things left out of (or highlighted by) this commit: - I've had to remove the (few) uses of `std::unique_ptr<>` to deal with temporaries, since the destructor is no longer public. `getTemporary()` should probably return the equivalent of `std::unique_ptr<T, MDNode::deleteTemporary>`. - `MDLocation::getTemporary()` doesn't exist yet (worse, it actually does exist, but does the wrong thing: `MDNode::getTemporary()` is inherited and returns an `MDTuple`). - `MDNode` now only has one subclass, `UniquableMDNode`, and the distinction between them is actually somewhat confusing. I'll fix those up next. llvm-svn: 226501	2015-01-19 20:36:39 +00:00
Duncan P. N. Exon Smith	de03a8b38d	IR: Add isUniqued() and isTemporary() Change `MDNode::isDistinct()` to only apply to 'distinct' nodes (not temporaries), and introduce `MDNode::isUniqued()` and `MDNode::isTemporary()` for the other two possibilities. llvm-svn: 226482	2015-01-19 18:45:35 +00:00
Chandler Carruth	d450056c78	[PM] Replace the Pass argument to SplitEdge with specific analyses used and updated. This may appear to remove handling for things like alias analysis when splitting critical edges here, but in fact no callers of SplitEdge relied on this. Similarly, all of them wanted to preserve LCSSA if there was any update of the loop info. That makes the interface much simpler. With this, all of BasicBlockUtils.h is free of Pass arguments and prepared for the new pass manager. This is the majority of utilities that relied on pass arguments. llvm-svn: 226459	2015-01-19 12:36:53 +00:00
Chandler Carruth	37df2cfbf8	[PM] Remove the Pass argument from all of the critical edge splitting APIs and replace it and numerous booleans with an option struct. The critical edge splitting API has a really large surface of flags and so it seems worth burning a small option struct / builder. This struct can be constructed with the various preserved analyses and then flags can be flipped in a builder style. The various users are now responsible for directly passing along their analysis information. This should be enough for the critical edge splitting to work cleanly with the new pass manager as well. This API is still pretty crufty and could be cleaned up a lot, but I've focused on this change just threading an option struct rather than a pass through the API. llvm-svn: 226456	2015-01-19 12:09:11 +00:00
Chandler Carruth	ad34d91343	[PM] Relax asserts and always try to reconstruct loop simplify form when we can while splitting critical edges. The only code which called this and didn't require simplified loops to be preserved is polly, and the code behaves correctly there anyways. Without this change, it becomes really hard to share this code with the new pass manager where things like preserving loop simplify form don't make any sense. If anyone discovers this code behaving incorrectly, what it should be testing for is whether the loops it needs to be in simplified form are in fact in that form. It should always be trying to preserve that form when it exists. llvm-svn: 226443	2015-01-19 10:23:00 +00:00
Chandler Carruth	0eae112009	[PM] Lift the analyses into the interface for SplitLandingPadPredecessors and remove the Pass argument from its interface. Another step to the utilities being usable with both old and new pass managers. llvm-svn: 226426	2015-01-19 03:03:39 +00:00
Chandler Carruth	b5797b659f	[PM] Pull the analyses used for another utility routine into its API rather than relying on the pass object. This one is a bit annoying, but will pay off. First, supporting this one will make the next one much easier, and for utilities like LoopSimplify, this is moving them (slowly) closer to not having to pass the pass object around throughout their APIs. llvm-svn: 226396	2015-01-18 09:21:15 +00:00
Chandler Carruth	32c52c7e04	[PM] Sink the specific analyses preserved by SplitBlock into its interface, removing Pass from its interface. This also makes those analyses optional so that passes which don't even preserve these (or use them) can skip the logic entirely. llvm-svn: 226394	2015-01-18 02:39:37 +00:00
Chandler Carruth	b5c115357c	[PM] Replace another Pass argument with specific analyses that are optionally updated by MergeBlockIntoPredecessors. No functionality changed, just refactoring to clear the way for the new pass manager. llvm-svn: 226392	2015-01-18 02:11:23 +00:00
Chandler Carruth	5eee895ccf	[PM] Lift the actual analyses used into the interface rather than accepting a Pass and querying it for analyses. This is necessary to allow the utilities to work both with the old and new pass managers, and I also think this makes the interface much more clear and helps the reader know what analyses the utility can actually handle. I plan to repeat this process iteratively to clean up all the pass utilities. llvm-svn: 226386	2015-01-18 01:45:07 +00:00
Chandler Carruth	691addc25f	[PM] Now that LoopInfo isn't in the Pass type hierarchy, it is much cleaner to derive from the generic base. This removes a ton of boiler plate code and somewhat strange and pointless indirections. It also removes a bunch of the previously needed friend declarations. To fully remove these, I also lifted the verify logic into the generic LoopInfoBase, which seems good anyways -- it is generic and useful logic even for the machine side. llvm-svn: 226385	2015-01-18 01:25:51 +00:00
Chandler Carruth	24fd029a60	[PM] Remove a dead field. This was dead even before I refactored how we initialized it, but my refactoring made it trivially dead and it is now caught by a Clang warning. This fixes the warning and should clean up the -Werror bot failures (sorry!). llvm-svn: 226376	2015-01-17 14:31:35 +00:00
Chandler Carruth	4f8f307c77	[PM] Split the LoopInfo object apart from the legacy pass, creating a LoopInfoWrapperPass to wire the object up to the legacy pass manager. This switches all the clients of LoopInfo over and paves the way to port LoopInfo to the new pass manager. No functionality change is intended with this iteration. llvm-svn: 226373	2015-01-17 14:16:18 +00:00
Chandler Carruth	b98f63dbdb	[PM] Separate the TargetLibraryInfo object from the immutable pass. The pass is really just a means of accessing a cached instance of the TargetLibraryInfo object, and this way we can re-use that object for the new pass manager as its result. Lots of delta, but nothing interesting happening here. This is the common pattern that is developing to allow analyses to live in both the old and new pass manager -- a wrapper pass in the old pass manager emulates the separation intrinsic to the new pass manager between the result and pass for analyses. llvm-svn: 226157	2015-01-15 10:41:28 +00:00
David Majnemer	f0982d0ac6	SimplifyIndVar: Remove unused variable OtherOperandIdx is not used anymore, remove it to silence warnings. llvm-svn: 226138	2015-01-15 07:11:23 +00:00
NAKAMURA Takumi	24ebfcb619	Update libdeps since TLI was moved from Target to Analysis in r226078. llvm-svn: 226126	2015-01-15 05:21:00 +00:00
Chandler Carruth	62d4215baa	[PM] Move TargetLibraryInfo into the Analysis library. While the term "Target" is in the name, it doesn't really have to do with the LLVM Target library -- this isn't an abstraction which LLVM targets generally need to implement or extend. It has much more to do with modeling the various runtime libraries on different OSes and with different runtime environments. The "target" in this sense is the more general sense of a target of cross compilation. This is in preparation for porting this analysis to the new pass manager. No functionality changed, and updates inbound for Clang and Polly. llvm-svn: 226078	2015-01-15 02:16:27 +00:00
Sanjoy Das	8c252bde36	Fix PR22222 The bug was introduced in r225282. r225282 assumed that sub X, Y is the same as add X, -Y. This is not correct if we are going to upgrade the sub to sub nuw. This change fixes the issue by making the optimization ignore sub instructions. Differential Revision: http://reviews.llvm.org/D6979 llvm-svn: 226075	2015-01-15 01:46:09 +00:00
Duncan P. N. Exon Smith	e65b0663e6	Remove trailing slash from r225924 llvm-svn: 225929	2015-01-14 01:42:43 +00:00
Duncan P. N. Exon Smith	e54cd9a6f3	Utils: Remove unreachable break, NFC llvm-svn: 225924	2015-01-14 01:31:34 +00:00
Duncan P. N. Exon Smith	a5a0f5766a	Utils: Handle remapping distinct MDLocations Part of PR21433. llvm-svn: 225921	2015-01-14 01:29:32 +00:00
Duncan P. N. Exon Smith	b84840c04e	Utils: Thread distinct-ness through the cloneMD*() functions, NFC The new logic isn't actually reachable yet, so no functionality change. llvm-svn: 225918	2015-01-14 01:24:38 +00:00
Duncan P. N. Exon Smith	7c69c1ebda	Utils: Extract cloneMDNode(), NFC llvm-svn: 225917	2015-01-14 01:22:47 +00:00
Duncan P. N. Exon Smith	b6515d6a71	Utils: Move cloneMD*() up, NFC llvm-svn: 225915	2015-01-14 01:21:24 +00:00
Duncan P. N. Exon Smith	47d82981d6	Utils: Add mapping for uniqued MDLocations Still doesn't handle distinct ones. Part of PR21433. llvm-svn: 225914	2015-01-14 01:20:27 +00:00
Duncan P. N. Exon Smith	4766e01250	Utils: Extract cloneMDTuple(), NFC llvm-svn: 225912	2015-01-14 01:12:14 +00:00
Duncan P. N. Exon Smith	fb9d128ab1	Utils: Extract shouldRemapUniquedNode(), NFC llvm-svn: 225911	2015-01-14 01:08:47 +00:00
Duncan P. N. Exon Smith	637e765907	Utils: Simplify code, NFC llvm-svn: 225906	2015-01-14 01:07:03 +00:00
Duncan P. N. Exon Smith	b557989a40	Utils: Extract mapUniquedNode(), NFC llvm-svn: 225905	2015-01-14 01:06:21 +00:00
Duncan P. N. Exon Smith	8725ca8c60	Utils: MDNode => UniquableMDNode, NFC Although this makes the `cast<>` assert more often, the `assert(Node->isResolved())` on the following line would assert in all those cases. So, no functionality change here. llvm-svn: 225903	2015-01-14 01:05:17 +00:00
Duncan P. N. Exon Smith	14cc94c1c6	Utils: Separate out mapDistinctNode(), NFC llvm-svn: 225902	2015-01-14 01:03:05 +00:00
Duncan P. N. Exon Smith	3956a85e6e	Utils: Use helper function directly, NFC llvm-svn: 225901	2015-01-14 01:02:17 +00:00
Duncan P. N. Exon Smith	077affdbb9	Utils: Extract helper function, NFC llvm-svn: 225897	2015-01-14 01:01:19 +00:00
Duncan P. N. Exon Smith	34651ee2f6	Utils: Use MDTuple::get() directly, NFC Working towards supporting `MDLocation` in `MapMetadata()`. llvm-svn: 225896	2015-01-14 00:59:57 +00:00
Ahmed Bougacha	71d7b18e3d	[SimplifyLibCalls] Don't try to simplify indirect calls. It turns out, all callsites of the simplifier are guarded by a check for CallInst::getCalledFunction (i.e., to make sure the callee is direct). This check wasn't done when trying to further optimize a simplified fortified libcall, introduced by a refactoring in r225640. Fix that, add a testcase, and document the requirement. llvm-svn: 225895	2015-01-14 00:55:05 +00:00
Ramkumar Ramachandra	181233b2b7	fix {typo, build failure} in r225760 llvm-svn: 225762	2015-01-13 04:17:47 +00:00
Ramkumar Ramachandra	40c3e03e27	Standardize {pred,succ,use,user}_empty() The functions {pred,succ,use,user}_{begin,end} exist, but many users have to check _begin() with _end() by hand to determine if the BasicBlock or User is empty. Fix this with a standard *_empty(), demonstrating a few usecases. llvm-svn: 225760	2015-01-13 03:46:47 +00:00
Duncan P. N. Exon Smith	118632dbf6	IR: Split GenericMDNode into MDTuple and UniquableMDNode Split `GenericMDNode` into two classes (with more descriptive names). - `UniquableMDNode` will be a common subclass for `MDNode`s that are sometimes uniqued like constants, and sometimes 'distinct'. This class gets the (short-lived) RAUW support and related API. - `MDTuple` is the basic tuple that has always been returned by `MDNode::get()`. This is as opposed to more specific nodes to be added soon, which have additional fields, custom assembly syntax, and extra semantics. This class gets the hash-related logic, since other subclasses of `UniquableMDNode` may need to hash based on other fields. To keep this diff from getting too big, I've added casts to `MDTuple` that won't really scale as new subclasses of `UniquableMDNode` are added, but I'll clean those up incrementally. (No functionality change intended.) llvm-svn: 225682	2015-01-12 20:09:34 +00:00
Ahmed Bougacha	e03bef7543	[SimplifyLibCalls] Factor out fortified libcall handling. This lets us remove CGP duplicate. Differential Revision: http://reviews.llvm.org/D6541 llvm-svn: 225640	2015-01-12 17:22:43 +00:00
Ahmed Bougacha	6722f5e5b3	[SimplifyLibCalls] Factor out str/mem libcall optimizations. Put them in a separate function, so we can reuse them to further simplify fortified libcalls as well. Differential Revision: http://reviews.llvm.org/D6540 llvm-svn: 225639	2015-01-12 17:20:06 +00:00
Ahmed Bougacha	b7d8afb6c5	[SimplifyLibCalls] Factor out signature checks for fortifiable libcalls. The checks are the same for fortified counterparts to the libcalls, so we might as well do them in a single place. Differential Revision: http://reviews.llvm.org/D6539 llvm-svn: 225638	2015-01-12 17:18:19 +00:00
Hans Wennborg	dcc6e5bc03	SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210) The previous code assumed that such instructions could not have any uses outside CaseDest, with the motivation that the instruction could not dominate CommonDest because CommonDest has phi nodes in it. That simply isn't true; e.g., CommonDest could have an edge back to itself. llvm-svn: 225552	2015-01-09 22:13:31 +00:00
Duncan P. N. Exon Smith	953e1a48f0	Utils: Keep distinct MDNodes distinct in MapMetadata() Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. llvm-svn: 225476	2015-01-08 22:42:30 +00:00
Sanjoy Das	7c0ce26614	This patch teaches IndVarSimplify to add nuw and nsw to certain kinds of operations that provably don't overflow. For example, we can prove %civ.inc below does not sign-overflow. With this change, IndVarSimplify changes %civ.inc to an add nsw. define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) { entry: %length = load i32* %length_ptr, !range !0 %len.sub.1 = sub i32 %length, 1 %upper = icmp slt i32 %init, %len.sub.1 br i1 %upper, label %loop, label %exit loop: %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ] %civ.inc = add i32 %civ, 1 %cmp = icmp slt i32 %civ.inc, %length br i1 %cmp, label %latch, label %break latch: store i32 0, i32* %array %check = icmp slt i32 %civ.inc, %len.sub.1 br i1 %check, label %loop, label %break break: ret i32 %civ.inc exit: ret i32 42 } Differential Revision: http://reviews.llvm.org/D6748 llvm-svn: 225282	2015-01-06 19:02:56 +00:00
Saleem Abdulrasool	150a1dc5c2	SymbolRewriter: use iplist::splice The swap implementation for iplist is currently unsupported. Simply splice the old list into place, which achieves the same purpose. This is needed in order to thread the -frewrite-map-file frontend option correctly. NFC. llvm-svn: 225186	2015-01-05 17:56:32 +00:00
Saleem Abdulrasool	d37ce30888	SymbolRewriter: 80-column Wrap a couple of lines. NFC. llvm-svn: 225185	2015-01-05 17:56:29 +00:00
Chandler Carruth	66b3130cda	[PM] Split the AssumptionTracker immutable pass into two separate APIs: a cache of assumptions for a single function, and an immutable pass that manages those caches. The motivation for this change is two fold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the many utility functions that deal in the assumptions to live in both pass manager worlds by creating a separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics. Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator. For now, this adds boiler plate to the various passes, but this kind of boiler plate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably. llvm-svn: 225131	2015-01-04 12:03:27 +00:00
Michael Liao	5313da3263	[SimplifyCFG] Revise common code sinking - Fix the case where more than 1 common instructions derived from the same operand cannot be sunk. When a pair of value has more than 1 derived values in both branches, only 1 derived value could be sunk. - Replace BB1 -> (BB2, PN) map with joint value map, i.e. map of (BB1, BB2) -> PN, which is more accurate to track common ops. llvm-svn: 224757	2014-12-23 08:26:55 +00:00
Michael Kuperstein	0bf33ffde4	Remove a bad cast in CloneModule() A cast that was introduced in r209007 was accidentally left in after the changes made to GlobalAlias rules in r210062. This crashes if the aliasee is a now-legal ConstantExpr. llvm-svn: 224756	2014-12-23 08:23:45 +00:00
Bruno Cardoso Lopes	bad65c3b70	[LCSSA] Handle PHI insertion in disjoint loops Take two disjoint Loops L1 and L2. LoopSimplify fails to simplify some loops (e.g. when indirect branches are involved). In such situations, it can happen that an exit for L1 is the header of L2. Thus, when we create PHIs in one of such exits we are also inserting PHIs in L2 header. This could break LCSSA form for L2 because these inserted PHIs can also have uses in L2 exits, which are never handled in the current implementation. Provide a fix for this corner case and test that we don't assert/crash on that. Differential Revision: http://reviews.llvm.org/D6624 rdar://problem/19166231 llvm-svn: 224740	2014-12-22 22:35:46 +00:00
Duncan P. N. Exon Smith	46d7af5729	Rename MapValue(Metadata*) to MapMetadata() Instead of reusing the name `MapValue()` when mapping `Metadata`, use `MapMetadata()`. The old name doesn't make much sense after the `Metadata`/`Value` split. llvm-svn: 224566	2014-12-19 06:06:18 +00:00
Michael Kuperstein	fffb6996c9	The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value. Patch by Amjad Aboud Differential Revision: http://reviews.llvm.org/D6525 llvm-svn: 224015	2014-12-11 12:41:10 +00:00
Kaelyn Takata	22324f378a	Rename static function "map" to be more descriptive and to avoid potential confusion with the std::map type. llvm-svn: 223853	2014-12-09 23:32:46 +00:00
Frederic Riss	35f0a9aeba	Remove unneeded curly braces. llvm-svn: 223809	2014-12-09 18:57:39 +00:00
Frederic Riss	ff58fd207e	Reorder the code to avoid inserting at the beginning of a vector. As per dblaikie's suggestion, thanks! llvm-svn: 223808	2014-12-09 18:57:34 +00:00
Duncan P. N. Exon Smith	5bf8fef580	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that.
Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa <ConstantInt>(N->getOperand(0))); baz(cast <ConstantInt>(N->getOperand(1))); bak(cast_or_null <ConstantInt>(N->getOperand(2))); bat(dyn_cast <ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa <ConstantInt>(N->getOperand(0))); baz(mdconst::extract <ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa <MDInt>(N->getOperand(0))); baz(cast <MDInt>(N->getOperand(1))); bak(cast_or_null <MDInt>(N->getOperand(2))); bat(dyn_cast <MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`.
`MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
Frederic Riss	7c78db5065	Correctly handle complex locations expressions in replaceDbgDeclareForAlloca() replaceDbgDeclareForAlloca() replaces an alloca by a value storing the address of what was the alloca. If there is a dbg.declare corresponding to that alloca, we need to lower it to a dbg.value describing the additional dereference operation to be performed to get to the underlying variable. This is done by adding a DW_OP_deref to the complex location part of the location description. This deref was added to the end of the operation list, which is wrong. The expression applies to what is described by the dbg.{declare,value}, and as we are changing this, we need to apply the DW_OP_deref as the first operation in the list. Part of the fix for rdar://19162268. llvm-svn: 223799	2014-12-09 17:55:48 +00:00
Juergen Ributzka	194350a936	Revert "Move function to obtain branch weights into the BranchInst class. NFC." This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare. llvm-svn: 223795	2014-12-09 17:32:12 +00:00
Juergen Ributzka	e2aa3aa38a	Move function to obtain branch weights into the BranchInst class. NFC. Make this function available to other parts of LLVM. llvm-svn: 223784	2014-12-09 16:36:06 +00:00
Duncan P. N. Exon Smith	b236211c4c	Utils: Style cleanups, NFC llvm-svn: 223556	2014-12-06 00:48:17 +00:00
Duncan P. N. Exon Smith	b13f7d2e36	Utils: Avoid RAUW on metadata in CloneFunction() llvm-svn: 223555	2014-12-06 00:48:13 +00:00
Matthias Braun	395a82f6cc	correct spelling, NFC llvm-svn: 223274	2014-12-03 22:10:39 +00:00
Matthias Braun	d34e4d2354	[SimplifyLibCalls] Improve double->float shrinking to consider constants This allows cases like float x; fmin(1.0, x); to be optimized to fminf(1.0f, x); rdar://19049359 Differential Revision: http://reviews.llvm.org/D6496 llvm-svn: 223270	2014-12-03 21:46:33 +00:00
Matthias Braun	892c923c46	[SimplifyLibCalls] Enable double to float shrinking for copysign rdar://19049359 Differential Revision: http://reviews.llvm.org/D6495 llvm-svn: 223269	2014-12-03 21:46:29 +00:00
Bruno Cardoso Lopes	15520db9ad	[SwitchLowering] Handle destinations on multiple phi instructions Follow up from r222926. Also handle multiple destinations from merged cases on multiple and subsequent phi instructions. rdar://problem/19106978 llvm-svn: 223135	2014-12-02 18:31:53 +00:00
Hans Wennborg	5bef5b522b	Revert r223049, r223050 and r223051 while investigating test failures. I didn't foresee affecting the Clang test suite :/ llvm-svn: 223054	2014-12-01 17:36:43 +00:00
Hans Wennborg	269ebb612e	SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable They would get optimized away later, but we might as well not emit them. llvm-svn: 223051	2014-12-01 17:08:38 +00:00
Hans Wennborg	5a1e5c05d8	SimplifyCFG: don't remove unreachable default switch destinations An unreachable default destination can be exploited by other optimizations, and SDag lowering is now prepared to handle them efficiently. For example, branches to the unreachable destination will be optimized away, such as in the case of range checks for switch lookup tables. On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and Chromium by 30 kB). llvm-svn: 223050	2014-12-01 17:08:35 +00:00
Bruno Cardoso Lopes	bc7ba2c766	[SwitchLowering] Handle multiple destinations on condensed case stmts Switch cases statements with sequential values that branch to the same destination BB may often be handled together in a single new source BB. In this scenario we need to remove remaining incoming values from PHI instructions in the destination BB, as to match the number of source branches. Differential Revision: http://reviews.llvm.org/D6415 rdar://problem/19040894 llvm-svn: 222926	2014-11-28 19:47:33 +00:00
Erik Eckstein	0d86c7623f	reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. Fixed missing dominance check. Original commit message: This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... Jump threading will then eliminate the second if(cond). llvm-svn: 222891	2014-11-27 15:13:14 +00:00
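The compare-reuse idea in the entry above can be sketched in plain C++. This is only an illustration under stated assumptions: `kTable`, `kDefault`, and both function names are hypothetical, and the rewrite is only valid because the table is assumed not to contain the default value.

```cpp
#include <array>
#include <cstddef>

// Hypothetical lookup table and default value. The rewrite below is only
// valid because kTable does not contain kDefault.
constexpr std::array<int, 4> kTable{10, 20, 30, 40};
constexpr int kDefault = -1;

// Shape before the peephole: a second comparison against the default value.
bool takesBranchOriginal(std::size_t idx) {
    const int r = (idx < kTable.size()) ? kTable[idx] : kDefault;
    return r != kDefault;
}

// Shape after the peephole: the guarding comparison is reused directly.
bool takesBranchReusingCompare(std::size_t idx) {
    const bool cond = idx < kTable.size();
    const int r = cond ? kTable[idx] : kDefault;
    (void)r;  // r would still feed later uses; only the branch condition changes
    return cond;
}
```

Both functions agree for every index, which is what lets a later pass such as jump threading drop the second test.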
Erik Eckstein	2190cd9ffa	Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible." It is breaking the clang bootstrap. llvm-svn: 222877	2014-11-27 10:59:08 +00:00
Erik Eckstein	e73e308ab9	Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... Jump threading will then eliminate the second if(cond). llvm-svn: 222872	2014-11-27 08:33:51 +00:00
Mehdi Amini	ffd0100618	SimplifyCFG: Refactor GatherConstantCompares() result in a struct Code seems cleaner and easier to understand this way This is basically r222416, after fixes for MSVC lack of standard support, and a few cleaning (got rid of a warning). Thanks Nakamura Takumi and Nico Weber for the MSVC fixes. llvm-svn: 222472	2014-11-20 22:40:25 +00:00
Michael Zolotukhin	0dcae71449	Fix a trip-count overflow issue in LoopUnroll. Currently LoopUnroll generates a prologue loop before the main loop body to execute first N%UnrollFactor iterations. Also, this loop is used if trip-count can overflow - it's determined by a runtime check. However, we've been mistakenly optimizing this loop to a linear code for UnrollFactor = 2, not taking into account that it also serves as a safe version of the loop if its trip-count overflows. llvm-svn: 222451	2014-11-20 20:19:55 +00:00
Timur Iskhodzhanov	71526a3eda	Revert r222416, r222422, r222426: the former revision had problems and fixing them introduced bugs llvm-svn: 222428	2014-11-20 12:36:43 +00:00
Timur Iskhodzhanov	a0bffc0c11	Fix a typo llvm-svn: 222426	2014-11-20 11:48:58 +00:00
NAKAMURA Takumi	5a83192570	SimplifyCFG.cpp: Tweak to make it msc17-compliant. - Use LLVM_DELETED_FUNCTION. - Don't use member initializers. - Don't use initializer list. llvm-svn: 222422	2014-11-20 08:59:02 +00:00
Mehdi Amini	65253e76ed	SimplifyCFG: Refactor GatherConstantCompares() result in a struct Code seems cleaner and easier to understand this way llvm-svn: 222416	2014-11-20 06:51:02 +00:00
Nico Weber	06839a536f	Try to fix MSVS build after r222384. No intended behavior change. llvm-svn: 222386	2014-11-19 21:16:11 +00:00
Mehdi Amini	9a25cb8806	SimplifyCFG: turn recursive GatherConstantCompares into iterative A long sequence of || or && could lead to a stack explosion. llvm-svn: 222384	2014-11-19 20:09:11 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Kostya Serebryany	e5ea424a77	Introduce llvm::SplitAllCriticalEdges Summary: move the code from BreakCriticalEdges::runOnFunction() into a separate utility function llvm::SplitAllCriticalEdges() so that it can be used independently. No functionality change intended. Test Plan: check-llvm Reviewers: nlewycky Reviewed By: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6313 llvm-svn: 222288	2014-11-19 00:17:31 +00:00
Hans Wennborg	a6a11a969a	SimplifyCFG: Range'ify some for-loops. No functional change. llvm-svn: 222215	2014-11-18 02:37:11 +00:00
Juergen Ributzka	c9591e9bdb	[SimplifyCFG] Make the value type of the hole check bitmask a power-of-2. When converting a switch to a lookup table we might have to generate a bitmask to encode and check for holes in the original switch statement. The type of this mask depends on the number of switch statements, which can result in illegal types for pretty much all architectures. To avoid unnecessary type legalization and help FastISel this commit increases the size of the bitmask to next power-of-2 value when necessary. This fixes rdar://problem/18984639. llvm-svn: 222168	2014-11-17 19:39:56 +00:00
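The hole-check bitmask described in the entry above can be illustrated with a small C++ sketch. The helper names `roundUpToPowerOf2`, `buildHoleMask`, and `isCasePresent` are hypothetical and not taken from the commit, and the sketch assumes at most 64 table slots so that the mask fits in a `std::uint64_t`.

```cpp
#include <cstdint>
#include <vector>

// Round a bit width up to the next power of two (clamped to 64 in this
// sketch), so a 5-bit mask would be stored in an 8-bit integer.
unsigned roundUpToPowerOf2(unsigned bits) {
    unsigned width = 1;
    while (width < bits && width < 64) {
        width <<= 1;
    }
    return width;
}

// Build a mask with bit i set when table slot i corresponds to a real case.
std::uint64_t buildHoleMask(const std::vector<unsigned>& presentCases) {
    std::uint64_t mask = 0;
    for (unsigned index : presentCases) {
        if (index < 64) {
            mask |= (std::uint64_t{1} << index);
        }
    }
    return mask;
}

// A cleared bit marks a hole: the lookup must fall back to the default.
bool isCasePresent(std::uint64_t mask, unsigned index) {
    return index < 64 && ((mask >> index) & std::uint64_t{1}) != 0;
}
```

For a switch with cases 0, 1, and 3, slot 2 is a hole, and a 4-slot table already needs a power-of-two mask width of 4 bits.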
Erik Eckstein	105374fe5e	Optimize switch lookup tables with linear mapping. This is a simple optimization for switch table lookup: It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output. Example: int f1(int x) { switch (x) { case 0: return 10; case 1: return 11; case 2: return 12; case 3: return 13; } return 0; } generates: define i32 @f1(i32 %x) #0 { entry: %0 = icmp ult i32 %x, 4 br i1 %0, label %switch.lookup, label %return switch.lookup: %switch.offset = add i32 %x, 10 ret i32 %switch.offset return: ret i32 0 } llvm-svn: 222121	2014-11-17 09:13:57 +00:00
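A minimal C++ sketch of the linear-mapping check described in the entry above. `LinearMap`, `findLinearMapping`, and `f1Lowered` are hypothetical names for illustration; the sketch assumes table values small enough that the subtractions do not overflow, and it is not the pass's actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type: table[i] == multiplier * i + offset.
struct LinearMap {
    std::int64_t multiplier;
    std::int64_t offset;
};

// Returns the mapping if consecutive table entries differ by a constant.
std::optional<LinearMap> findLinearMapping(const std::vector<std::int64_t>& table) {
    if (table.size() < 2) {
        return std::nullopt;
    }
    const std::int64_t offset = table[0];
    const std::int64_t multiplier = table[1] - table[0];
    for (std::size_t i = 2; i < table.size(); ++i) {
        if (table[i] - table[i - 1] != multiplier) {
            return std::nullopt;
        }
    }
    return LinearMap{multiplier, offset};
}

// Hand-written equivalent of the IR shown for f1: a range check, then an add.
int f1Lowered(int x) {
    if (static_cast<unsigned>(x) < 4u) {
        return x + 10;  // multiplier is 1, so the mul is omitted
    }
    return 0;
}
```

For the table {10, 11, 12, 13} from the example, the sketch finds multiplier 1 and offset 10, matching the single `add` in the generated IR.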
David Blaikie	711cd9c53c	Remove redundant virtual on overridden functions. llvm-svn: 222023	2014-11-14 19:06:36 +00:00
Reid Kleckner	971c3ea67b	Use nullptr instead of NULL for variadic sentinels Windows defines NULL to 0, which when used as an argument to a variadic function, is not a null pointer constant. As a result, Clang's -Wsentinel fires on this code. Using '0' would be wrong on most 64-bit platforms, but both MSVC and Clang make it work on Windows. Sidestep the issue with nullptr. llvm-svn: 221940	2014-11-13 22:55:19 +00:00
Ahmed Bougacha	55a333d89b	Add fortified (__*_chk) library functions to TLI (NFC) One of them (__memcpy_chk) was already there, the others were checked by comparing function names. Note that the fortified libfuncs are now part of TLI, but are always available, because they aren't generated, only optimized into the non-checking versions. Differential Revision: http://reviews.llvm.org/D6179 llvm-svn: 221817	2014-11-12 21:23:34 +00:00
Sanjay Patel	7777b50eaf	remove function names from comments; NFC llvm-svn: 221798	2014-11-12 18:07:42 +00:00
Duncan P. N. Exon Smith	de36e8040f	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711	2014-11-11 21:30:22 +00:00
Juergen Ributzka	d441725d3d	[SwitchLowering] Fix the "fixPhis" function. Switch statements may have more than one incoming edge into the same BB if they all have the same value. When the switch statement is converted these incoming edges are now coming from multiple BBs. Updating all incoming values to be from a single BB is incorrect and would generate invalid LLVM IR. The fix is to only update the first occurrence of an incoming value. Switch lowering will perform subsequent calls to this helper function for each incoming edge with a new basic block - updating all edges in the process. This fixes rdar://problem/18916275. llvm-svn: 221627	2014-11-10 21:05:27 +00:00
Vasileios Kalintiris	ccde2a9a1e	Fix extra semicolon warning. NFC. llvm-svn: 221613	2014-11-10 17:37:53 +00:00
Saleem Abdulrasool	d2c5d7f6da	Transforms: address some late comments We already use the llvm namespace. Remove the unnecessary prefix. Use the StringRef::equals method to compare with C strings rather than instantiating std::strings. Addresses late review comments from David Majnemer. llvm-svn: 221564	2014-11-08 00:00:50 +00:00
Saleem Abdulrasool	92b13aac04	Transforms: sort source files in build Sort target sources. NFC. llvm-svn: 221563	2014-11-08 00:00:47 +00:00
Saleem Abdulrasool	89c5ad4cda	Transforms: use typedef rather than using aliases Visual Studio 2012 apparently does not support using alias declarations. Use the more traditional typedef approach. This should let the Windows buildbots pass. NFC. llvm-svn: 221554	2014-11-07 22:09:52 +00:00
Saleem Abdulrasool	5898e09057	Transform: add SymbolRewriter pass This introduces the symbol rewriter. This is an IR->IR transformation that is implemented as a CodeGenPrepare pass. This allows for the transparent adjustment of the symbols during compilation. It provides a clean, simple, elegant solution for symbol inter-positioning. This technique is often used, such as in the various sanitizers and performance analysis. The control of this is via a custom YAML syntax map file that indicates source to destination mapping, so as to avoid having the compiler to know the exact details of the source to destination transformations. llvm-svn: 221548	2014-11-07 21:32:08 +00:00
Michael Ilseman	a7202bdbed	Fix heap-use-after-free bug in expandSDiv when the operands are constants, as discovered by ASAN. Patch by Mehdi Amini! llvm-svn: 221401	2014-11-05 21:28:24 +00:00
Mark Heffernan	2d393ea6ef	Revert earlier change removing setPreservesCFG from instcombine (r221223) and change LoopSimplifyPass to be !isCFGOnly. The motivation for the earlier patch (r221223) was that LoopSimplify is not preserved by instcombine though setPreservesCFG indicates that it is. This change fixes the issue by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less invasive. llvm-svn: 221311	2014-11-04 23:02:09 +00:00
Reid Kleckner	dd3f3edafa	Revert "Transforms: reapply SVN r219899" This reverts commit r220811 and r220839. It made an incorrect change to musttail handling. llvm-svn: 221226	2014-11-04 02:02:14 +00:00
Duncan P. N. Exon Smith	3d5a02f677	IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc() Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433. llvm-svn: 221167	2014-11-03 18:13:57 +00:00
Duncan P. N. Exon Smith	4abd1a0808	IR: MDNode => Value: Instruction::getAllMetadata() Change `Instruction::getAllMetadata()` to modify a vector of `Value` instead of `MDNode` and update call sites. This is part of PR21433. llvm-svn: 221027	2014-11-01 00:26:42 +00:00
Duncan P. N. Exon Smith	3872d0084c	IR: MDNode => Value: Instruction::getMetadata() Change `Instruction::getMetadata()` to return `Value` as part of PR21433. Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`. llvm-svn: 221024	2014-11-01 00:10:31 +00:00
Saleem Abdulrasool	d178ada55e	Transforms: reapply SVN r219899 This restores the commit from SVN r219899 with an additional change to ensure that the CodeGen is correct for the case that was identified as being incorrect (originally PR7272). In the case that during inlining we need to synthesize a value on the stack (i.e. for passing a value byval), then any function involving that alloca must be stripped of its tailness as the restriction that it does not access the parent's stack no longer holds. Unfortunately, a single alloca can cause a rippling effect throughout the inlining as the value may be aliased or may be mutated through an escaped external call. As such, we simply track if an alloca has been introduced in the frame during inlining, and strip any tail calls. llvm-svn: 220811	2014-10-28 18:27:37 +00:00
NAKAMURA Takumi	335a7bcf1e	Untabify and whitespace cleanups. llvm-svn: 220771	2014-10-28 11:53:30 +00:00
Sanjay Patel	848309da7c	Handle sqrt() shrinking in SimplifyLibCalls like any other call This patch removes a chunk of special case logic for folding (float)sqrt((double)x) -> sqrtf(x) in InstCombineCasts and handles it in the mainstream path of SimplifyLibCalls. No functional change intended, but I loosened the restriction on the existing sqrt testcases to allow for this optimization even without unsafe-fp-math because that's the existing behavior. I also added a missing test case for not shrinking the llvm.sqrt.f64 intrinsic in case the result is used as a double. Differential Revision: http://reviews.llvm.org/D5919 llvm-svn: 220514	2014-10-23 21:52:45 +00:00
Philip Reames	d92c2a7592	Preserving 'nonnull' metadata in SimplifyCFG When we hoist two loads above an if, we can preserve the nonnull metadata. We could also do the same for sinking them, but we appear to not handle metadata at all in that case. Thanks to Hal for the review. Differential Revision: http://reviews.llvm.org/D5910 llvm-svn: 220392	2014-10-22 16:37:13 +00:00
Sanjay Patel	a92fa44740	Shrinkify libcalls: use float versions of double libm functions with fast-math (bug 17850) When a call to a double-precision libm function has fast-math semantics (via function attribute for now because there is no IR-level FMF on calls), we can avoid fpext/fptrunc operations and use the float version of the call if the input and output are both float. We already do this optimization using a command-line option; this patch just adds the ability for fast-math to use the existing functionality. I moved the cl::opt from InstructionCombining into SimplifyLibCalls because it's only ever used internally to that class. Modified the existing test cases to use the unsafe-fp-math attribute rather than repeating all tests. This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850 Differential Revision: http://reviews.llvm.org/D5893 llvm-svn: 220390	2014-10-22 15:29:23 +00:00
Philip Reames	d7c21364a9	Teach combineMetadata how to merge 'nonnull' metadata. combineMetadata is used when merging two instructions into one. This change teaches it how to merge 'nonnull' - i.e. only preserve it on the new instruction if it's set on both sources. This isn't actually used yet since I haven't adjusted any of the call sites to pass in nonnull as a 'known metadata'. llvm-svn: 220325	2014-10-21 21:02:19 +00:00
Paul Robinson	f60e0a160f	Do not attribute static allocas to the call site's DebugLoc. When functions are inlined, instructions without debug information are attributed to the call site's DebugLoc. After inlining, inlined static allocas are moved to the caller's entry block, adjacent to the caller's original static alloca instructions. By retaining the call site's DebugLoc, these instructions could cause instructions that were subsequently inserted at the entry block to pick up the same DebugLoc. Patch by Wolfgang Pieb! llvm-svn: 220255	2014-10-21 01:00:55 +00:00
Sanjay Patel	c699a6117b	fold: sqrt(x * x * y) -> fabs(x) * sqrt(y) If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 llvm-svn: 219944	2014-10-16 18:48:17 +00:00
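The fold in the entry above can be checked numerically with a small C++ sketch. The function names are hypothetical. The two forms agree mathematically for y >= 0 but are not guaranteed to be bit-identical in floating point, which is consistent with the commit requiring reassociation to be allowed.

```cpp
#include <cmath>

// Original form: the repeated factor stays inside the square root.
double sqrtOfProduct(double x, double y) {
    return std::sqrt(x * x * y);
}

// Folded form: the repeated factor is hoisted out as fabs(x).
double sqrtFolded(double x, double y) {
    return std::fabs(x) * std::sqrt(y);
}
```

With x = -3 and y = 4 both forms evaluate to 6; the `fabs` is what keeps the result non-negative when x is negative.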
Hal Finkel	68dc3c7ab2	Preserve non-byval pointer alignment attributes using @llvm.assume when inlining For pointer-typed function arguments, enhanced alignment can be asserted using the 'align' attribute. When inlining, if this enhanced alignment information is not otherwise available, preserve it using @llvm.assume-based alignment assumptions. llvm-svn: 219876	2014-10-15 23:44:41 +00:00
Sanjay Patel	0ca42bb5a8	Optimize away fabs() calls when input is squared (known positive). Eliminate library calls and intrinsic calls to fabs when the input is a squared value. Note that no unsafe-math / fast-math assumptions are needed for this optimization. Differential Revision: http://reviews.llvm.org/D5777 llvm-svn: 219717	2014-10-14 20:43:11 +00:00
Marcello Maggioni	5bbe3df63f	Switch to select optimization for two-case switches This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block and a test to check that this condition is handled. llvm-svn: 219656	2014-10-14 01:58:26 +00:00
Joerg Sonnenberger	5ca10d0edb	Revert r219223, it creates invalid PHI nodes. llvm-svn: 219587	2014-10-12 17:16:04 +00:00
Arnold Schwaighofer	d7d010eb2a	SimplifyCFG: Don't convert phis into selects if we could remove undef behavior instead We used to transform this: define void @test6(i1 %cond, i8* %ptr) { entry: br i1 %cond, label %bb1, label %bb2 bb1: br label %bb2 bb2: %ptr.2 = phi i8* [ %ptr, %entry ], [ null, %bb1 ] store i8 2, i8* %ptr.2, align 8 ret void } into this: define void @test6(i1 %cond, i8* %ptr) { %ptr.2 = select i1 %cond, i8* null, i8* %ptr store i8 2, i8* %ptr.2, align 8 ret void } because the simplifycfg transformation into selects would happen to run before the simplifycfg transformation that removes unreachable control flow (We have 'unreachable control flow' due to the store to null which is undefined behavior). The existing transformation that removes unreachable control flow in simplifycfg is: /// If BB has an incoming value that will always trigger undefined behavior /// (eg. null pointer dereference), remove the branch leading here. static bool removeUndefIntroducingPredecessor(BasicBlock *BB) Now we generate: define void @test6(i1 %cond, i8* %ptr) { store i8 2, i8* %ptr.2, align 8 ret void } I did not see any impact on the test-suite + externals. rdar://18596215 llvm-svn: 219462	2014-10-10 01:27:02 +00:00
Duncan P. N. Exon Smith	c46cfcbbc6	LoopUnroll: Create sub-loops in LoopInfo `LoopUnrollPass` says that it preserves `LoopInfo` -- make it so. In particular, tell `LoopInfo` about copies of inner loops when unrolling the outer loop. Conservatively, also tell `ScalarEvolution` to forget about the original versions of these loops, since their inputs may have changed. Fixes PR20987. llvm-svn: 219241	2014-10-07 21:19:00 +00:00
Duncan P. N. Exon Smith	9b4d37e8f5	LoopUnroll: Only check for ScalarEvolution analysis once, NFC A follow-up commit will add use to a tight loop. We might as well just find it once anyway. llvm-svn: 219239	2014-10-07 21:12:44 +00:00
Marcello Maggioni	963bc87dbd	Two case switch to select optimization This optimization tries to convert switch instructions that are used to select a value with only 2 unique cases + default block to a select or a couple of selects (depending if the default block is reachable or not). The typical case this optimization wants to be able to optimize is this one: Example: switch (a) { case 10: %0 = icmp eq i32 %a, 10 return 10; %1 = select i1 %0, i32 10, i32 4 case 20: ----> %2 = icmp eq i32 %a, 20 return 2; %3 = select i1 %2, i32 2, i32 %1 default: return 4; } It also sets the base for further optimizations that are planned and being reviewed. llvm-svn: 219223	2014-10-07 18:16:44 +00:00
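The switch-to-select example in the entry above can be mirrored in C++ as a rough sketch. The function names are hypothetical; the ternary expressions stand in for the `select` instructions shown in the commit message.

```cpp
// Source-level shape: two unique cases plus a reachable default.
int valueViaSwitch(int a) {
    switch (a) {
    case 10:
        return 10;
    case 20:
        return 2;
    default:
        return 4;
    }
}

// Branch-free shape: each ternary corresponds to one select.
int valueViaSelects(int a) {
    const int firstSelect = (a == 10) ? 10 : 4;  // %1 = select i1 %0, i32 10, i32 4
    return (a == 20) ? 2 : firstSelect;          // %3 = select i1 %2, i32 2, i32 %1
}
```

When the default block is unreachable, one comparison suffices, which is the "a select or a couple of selects" distinction the message mentions.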
Duncan P. N. Exon Smith	e5d7d9797b	LoopUnroll: Change code order of changes to new basic blocks Add new basic blocks to `LoopInfo` earlier. No functionality change intended (simplifies upcoming bugfix patch). llvm-svn: 219150	2014-10-06 22:05:02 +00:00
Duncan P. N. Exon Smith	0bbf5418c6	Sink comment, NFC llvm-svn: 219149	2014-10-06 22:04:59 +00:00
Duncan P. N. Exon Smith	611afb229c	DIBuilder: Encapsulate DIExpression's element type `DIExpression`'s elements are 64-bit integers that are stored as `ConstantInt`. The accessors already encapsulate the storage. This commit updates the `DIBuilder` API to also encapsulate that. llvm-svn: 218797	2014-10-01 20:26:08 +00:00
Adrian Prantl	87b7eb9d0f	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! Note: I accidentally committed a bogus older version of this patch previously. llvm-svn: 218787	2014-10-01 18:55:02 +00:00
Adrian Prantl	b458dc2eee	Revert r218778 while investigating buildbot breakage. "Move the complex address expression out of DIVariable and into an extra" llvm-svn: 218782	2014-10-01 18:10:54 +00:00
Adrian Prantl	25a7174e7a	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! llvm-svn: 218778	2014-10-01 17:55:39 +00:00
Tom Stellard	0a4e9a3b25	C API: Add LLVMCloneModule() llvm-svn: 218775	2014-10-01 17:14:57 +00:00
Jingyue Wu	fc0296704c	[SimplifyCFG] threshold for folding branches with common destination Summary: This patch adds a threshold that controls the number of bonus instructions allowed for folding branches with common destination. The original code allows at most one bonus instruction. With this patch, users can customize the threshold to allow multiple bonus instructions. The default threshold is still 1, so that the code behaves the same as before when users do not specify this threshold. The motivation of this change is that tuning this threshold significantly (up to 25%) improves the performance of some CUDA programs in our internal code base. In general, branch instructions are very expensive for GPU programs. Therefore, it is sometimes worth trading more arithmetic computation for a more straightened control flow. Here's a reduced example: __global__ void foo(int a, int b, int c, int d, int e, int n, const int *input, int *output) { int sum = 0; for (int i = 0; i < n; ++i) sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i]; *output = sum; } The select statement in the loop body translates to two branch instructions "if ((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination. With the default threshold, SimplifyCFG is unable to fold them, because computing the condition of the second branch "(i | c) ^ d > e" requires two bonus instructions. With the threshold increased, SimplifyCFG can fold the two branches so that the loop body contains only one branch, making the code conceptually look like: sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i]; Increasing the threshold significantly improves the performance of this particular example. In the configuration where both conditions are guaranteed to be true, increasing the threshold from 1 to 2 improves the performance by 18.24%. 
Even in the configuration where the first condition is false and the second condition is true, which favors shortcuts, increasing the threshold from 1 to 2 still improves the performance by 4.35%. We are still looking for a good threshold and maybe a better cost model than just counting the number of bonus instructions. However, according to the above numbers, we think it is at least worth adding a threshold to enable more experiments and tuning. Let me know what you think. Thanks! Test Plan: Added one test case to check the threshold is in effect Reviewers: nadav, eliben, meheff, resistor, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D5529 llvm-svn: 218711	2014-09-30 22:23:38 +00:00
Kevin Qin	fc02e3c363	Use a loop to simplify the runtime unrolling prologue. Runtime unrolling creates a prologue to execute the extra iterations left over when the trip count is not divisible by the unroll factor. It generates an if-then-else sequence to jump into a loop body unrolled factor-1 times, like extraiters = tripcount % loopfactor if (extraiters == 0) jump Loop: if (extraiters == loopfactor) jump L1 if (extraiters == loopfactor-1) jump L2 ... L1: LoopBody; L2: LoopBody; ... if tripcount < loopfactor jump End Loop: ... End: This means that if the unroll factor is 4, the loop body is unrolled 7 times: 3 copies in the loop prologue and 4 in the loop. This commit uses a loop to execute the extra iterations in the prologue, like extraiters = tripcount % loopfactor if (extraiters == 0) jump Loop: else jump Prol Prol: LoopBody; extraiters -= 1 // Omitted if unroll factor is 2. if (extraiters != 0) jump Prol // Omitted if unroll factor is 2. if (tripcount < loopfactor) jump End Loop: ... End: Then when the unroll factor is 4, the loop body is copied only 5 times: 1 in the prologue loop and 4 in the original loop. And if the unroll factor is 2, no new loop is created, just as in the original solution. llvm-svn: 218604	2014-09-29 11:15:00 +00:00
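The prologue-loop scheme in the entry above can be sketched at source level in C++ for an unroll factor of 4. This is an illustration only: `sumUnrolledBy4` is a hypothetical function, and a simple summation stands in for the loop body.

```cpp
#include <cstddef>
#include <vector>

// Sums the input with the loop body unrolled by 4. The prologue loop runs
// tripCount % 4 iterations so the main loop only sees a multiple of 4.
long sumUnrolledBy4(const std::vector<int>& data) {
    constexpr std::size_t factor = 4;
    const std::size_t tripCount = data.size();
    std::size_t i = 0;
    long sum = 0;

    // Prol: one copy of the body, executed (tripCount % factor) times.
    for (std::size_t extraIters = tripCount % factor; extraIters != 0; --extraIters) {
        sum += data[i++];
    }

    // Loop: four copies of the body; skipped entirely if tripCount < factor.
    while (i < tripCount) {
        sum += data[i];
        sum += data[i + 1];
        sum += data[i + 2];
        sum += data[i + 3];
        i += factor;
    }
    return sum;
}
```

The body appears five times in total (one in the prologue, four in the main loop), which matches the count given in the commit message for an unroll factor of 4.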
Reid Kleckner	78927e884b	GlobalOpt: Preserve comdats of unoptimized initializers Rather than slurping in and splatting out the whole ctor list, preserve the existing array entries without trying to understand them. Only remove the entries that we know we can optimize away. This way we don't need to wire through priority and comdats or anything else we might add. Fixes a linker issue where the .init_array or .ctors entry would point to discarded initialization code if the comdat group from the TU with the faulty global_ctors entry was dropped. llvm-svn: 218337	2014-09-23 22:33:01 +00:00
Chris Bieneman	cf93cbb7a4	Fixing a build error. llvm-svn: 217983	2014-09-17 21:06:59 +00:00
Chris Bieneman	ad070d0588	Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code. Summary: This eliminates ~200 lines of code mostly file scoped struct definitions that were unnecessary. Reviewers: chandlerc, resistor Reviewed By: resistor Subscribers: morisset, resistor, llvm-commits Differential Revision: http://reviews.llvm.org/D5364 llvm-svn: 217982	2014-09-17 20:55:46 +00:00
Jingyue Wu	b67140b812	Remove dead code in SimplifyCFG Summary: UsedByBranch is always true according to how BonusInst is defined. Test Plan: Passes check-all, and also verified if (BonusInst && !UsedByBranch) { ... } is never entered during check-all. Reviewers: resistor, nadav, jingyue Reviewed By: jingyue Subscribers: llvm-commits, eliben, meheff Differential Revision: http://reviews.llvm.org/D5324 llvm-svn: 217824	2014-09-15 20:48:13 +00:00
Benjamin Kramer	0bd147da17	Simplify code. No functionality change. llvm-svn: 217726	2014-09-13 12:38:49 +00:00
Hal Finkel	60db05896a	Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. 
In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342	2014-09-07 18:57:58 +00:00
Hal Finkel	74c2f355d2	Add an Assumption-Tracking Pass This adds an immutable pass, AssumptionTracker, which keeps a cache of @llvm.assume call instructions within a module. It uses callback value handles to keep stale functions and intrinsics out of the map, and it relies on any code that creates new @llvm.assume calls to notify it of the new instructions. The benefit is that code needing to find @llvm.assume intrinsics can do so directly, without scanning the function, thus allowing the cost of @llvm.assume handling to be negligible when none are present. The current design is intended to be lightweight. We don't keep track of anything until we need a list of assumptions in some function. The first time this happens, we scan the function. After that, we add/remove @llvm.assume calls from the cache in response to registration calls and ValueHandle callbacks. There are no new direct test cases for this pass, but because it calls its validation function upon module finalization, we'll pick up detectable inconsistencies from the other tests that touch @llvm.assume calls. This pass will be used by follow-up commits that make use of @llvm.assume. llvm-svn: 217334	2014-09-07 12:44:26 +00:00
James Molloy	6b95d8ed36	Enable noalias metadata by default and swap the order of the SLP and Loop vectorizers by default. After some time maturing, hopefully the flags themselves will be removed. llvm-svn: 217144	2014-09-04 13:23:08 +00:00
Hal Finkel	0c083024f0	Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata This feeds AA through the IFI structure into the inliner so that AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which functions only access their arguments (instead of just hard-coding some knowledge of memory intrinsics). Most of the information is only available from BasicAA; this is important for preserving alias scoping information for target-specific intrinsics when doing the noalias parameter attribute to metadata conversion. llvm-svn: 216866	2014-09-01 09:01:39 +00:00
Hal Finkel	cbb85f249e	Fix AddAliasScopeMetadata again - alias.scope must be a complete description I thought that I had fixed this problem in r216818, but I did not do a very good job. The underlying issue is that when we add alias.scope metadata we are asserting that this metadata completely describes the aliasing relationships within the current aliasing scope domain, and so in the context of translating noalias argument attributes, the pointers must all be based on noalias arguments (as underlying objects) and have no other kind of underlying object. In r216818 excluding appropriate accesses from getting alias.scope metadata is done by looking for underlying objects that are not identified function-local objects -- but that's wrong because allocas, etc. are also function-local objects and we need to explicitly check that all underlying objects are the noalias arguments for which we're adding metadata aliasing scopes. This fixes the underlying-object check for adding alias.scope metadata, and does some refactoring of the related capture-checking eligibility logic (and adds more comments; hopefully making everything a bit clearer). Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the feature is still disabled by default). llvm-svn: 216863	2014-09-01 04:26:40 +00:00
Hal Finkel	a3708df41a	Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers The previous implementation of AddAliasScopeMetadata, which adds noalias metadata to preserve noalias parameter attribute information when inlining had a flaw: it would add alias.scope metadata to accesses which might have been derived from pointers other than noalias function parameters. This was incorrect because even some access known not to alias with all noalias function parameters could easily alias with an access derived from some other pointer. Instead, when deriving from some unknown pointer, we cannot add alias.scope metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4. Furthermore, we cannot add alias.scope to functions unless we know they access only argument-derived pointers (currently, we know this only for memory intrinsics). Also, we fix a theoretical problem with using the NoCapture attribute to skip the capture check. This is incorrect (as explained in the comment added), but would not matter in any code generated by Clang because we get only inferred nocapture attributes in Clang-generated IR. This functionality is not yet enabled by default. llvm-svn: 216818	2014-08-30 12:48:33 +00:00
Hal Finkel	2d3d6da44b	Fix a typo in AddAliasScopeMetadata llvm-svn: 216741	2014-08-29 16:33:41 +00:00
Craig Topper	e1d1294853	Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. llvm-svn: 216525	2014-08-27 05:25:25 +00:00
Bruno Cardoso Lopes	e2a1fa35df	Remove dangling initializers in GlobalDCE GlobalDCE deletes global vars and updates their initializers to nullptr while leaving underlying constants to be cleaned up later by its uses. The clean up may never happen, fix this by forcing it every time it's safe to destroy constants. Final patch by Rafael Espindola http://reviews.llvm.org/D4931 <rdar://problem/17523868> llvm-svn: 216390	2014-08-25 17:51:14 +00:00
Craig Topper	4627679cec	Use range based for loops to avoid needing to re-mention SmallPtrSet size. llvm-svn: 216351	2014-08-24 23:23:06 +00:00
David Blaikie	2f3f76fdb1	Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks. Somewhat unnoticed in the original implementation of discriminators, but it could cause instructions to end up in new, small, DW_TAG_lexical_blocks due to the use of DILexicalBlock to track discriminator changes. Instead, use DILexicalBlockFile which we already use to track file changes without introducing new scopes, so it works well to track discriminator changes in the same way. llvm-svn: 216239	2014-08-21 22:45:21 +00:00
Craig Topper	71b7b68b74	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
Craig Topper	6230691c91	Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	5229cfd163	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Rafael Espindola	ea46c32f81	Introduce a helper to combine instruction metadata. Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use it. Patch by Björn Steinbrink! llvm-svn: 215723	2014-08-15 15:46:38 +00:00
Hal Finkel	61c386126b	Copy noalias metadata from call sites to inlined instructions When a call site with noalias metadata is inlined, that metadata can be propagated directly to the inlined instructions (only those that might access memory because it is not useful on the others). Prior to inlining, the noalias metadata could express that a call would not alias with some other memory access, which implies that no instruction within that called function would alias. By propagating the metadata to the inlined instructions, we preserve that knowledge. This should complete the enhancements requested in PR20500. llvm-svn: 215676	2014-08-14 21:09:37 +00:00
Hal Finkel	d2dee16c27	Add noalias metadata for general calls (not just memory intrinsics) during inlining When preserving noalias function parameter attributes by adding noalias metadata in the inliner, we should do this for general function calls (not just memory intrinsics). The logic is very similar to what already existed (except that we want to add this metadata even for functions taking no relevant parameters). This metadata can be used by ModRef queries in the caller after inlining. This addresses the first part of PR20500. Adding noalias metadata during inlining is still turned off by default. llvm-svn: 215657	2014-08-14 16:44:03 +00:00
Jan Vesely	0cd3ec6cfa	utils: Fix segfault in flattencfg v2: continue iterating through the rest of the bb use for loop v3: initialize FlattenCFG pass in ScalarOps add test v4: split off initializing flattencfg to a separate patch add comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215574	2014-08-13 20:31:53 +00:00
Reid Kleckner	e31acf239a	Move helper for getting a terminating musttail call to BasicBlock No functional change. To be used in future commits that need to look for such instructions. Reviewed By: rafael Differential Revision: http://reviews.llvm.org/D4504 llvm-svn: 215413	2014-08-12 00:05:15 +00:00
Manman Ren	062f58d550	[SimplifyCFG] fix accessing deleted PHINodes in switch-to-table conversion. When we have a covered lookup table, make sure we don't delete PHINodes that are cached in PHIs. rdar://17887153 llvm-svn: 214642	2014-08-02 23:41:54 +00:00
Rafael Espindola	d07cf400ab	SimplifyCFG: Avoid miscompilations due to removed lifetime intrinsics. The lifetime intrinsics need some work in order to make it clear which optimizations are or are not valid. For now dropping this optimization avoids a miscompilation. Patch by Björn Steinbrink. llvm-svn: 214336	2014-07-30 21:04:00 +00:00
Hal Finkel	930469107d	Add @llvm.assume, lowering, and some basic properties This is the first commit in a series that add an @llvm.assume intrinsic which can be used to provide the optimizer with a condition it may assume to be true (when the control flow would hit the intrinsic call). Some basic properties are added here: - llvm.invariant(true) is dead. - llvm.invariant(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef). The intrinsic is tagged as writing arbitrarily, in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query so that we don't unnecessarily block code motion. llvm-svn: 213973	2014-07-25 21:13:35 +00:00
Hal Finkel	ff0bcb60c9	Convert noalias parameter attributes into noalias metadata during inlining This functionality is currently turned off by default. Part of the motivation for introducing scoped-noalias metadata is to enable the preservation of noalias parameter attribute information after inlining. Sometimes this can be inferred from the code in the caller after inlining, but often we simply lose valuable information. The overall process is fairly simple: 1. Create a new unique scope domain. 2. For each (used) noalias parameter, create a new alias scope. 3. For each pointer, collect the underlying objects. Add a noalias scope for each noalias parameter from which we're not derived (and has not been captured prior to that point). 4. Add an alias.scope for each noalias parameter from which we might be derived (or has been captured before that point). Note that the capture checks apply only if one of the underlying objects is not an identified function-local object. llvm-svn: 213949	2014-07-25 15:50:08 +00:00
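Steps 3 and 4 of the message above amount to splitting the noalias parameters into two sets for each memory access. The following Python sketch illustrates only that split. It is a simplification and not the inliner's code: the function name and the string labels are hypothetical, the capture checks are deliberately ignored, and later messages in this log (r216818, r216863) tighten the conditions under which alias.scope may be added.

```python
def scopes_for_access(underlying_objects, noalias_params):
    """Return (alias_scope, noalias) parameter lists for one memory access.

    underlying_objects: names of the objects the accessed pointer may be
        based on.
    noalias_params: names of the function's (used) noalias parameters.
    Capture checks are NOT modeled here.
    """
    params = set(noalias_params)
    derived = set(underlying_objects) & params
    # Step 4: alias.scope for each parameter we might be derived from.
    alias_scope = sorted(derived)
    # Step 3: noalias scope for each parameter we are not derived from.
    noalias = sorted(params - derived)
    return alias_scope, noalias
```

For a function with noalias parameters a and b, an access based only on a would get an alias.scope for a and a noalias scope for b under this simplified reading.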
Manman Ren	4d189fb9a6	Feedback from Hans on r213815. No functionality change. llvm-svn: 213895	2014-07-24 21:13:20 +00:00
Hal Finkel	9414665a3b	Add scoped-noalias metadata This commit adds scoped noalias metadata. The primary motivations for this feature are: 1. To preserve noalias function attribute information when inlining 2. To provide the ability to model block-scope C99 restrict pointers Neither of these two abilities is added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit. What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes: !scope0 = metadata !{ metadata !"scope of foo()" } !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 } !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 } !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 } !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 } Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope: ... = load %ptr1, !alias.scope !{ !scope1 } ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 } When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias. Note that if the first element of the scope metadata is a string, then it can be combined across functions and translation units. The string can be replaced by a self-reference to create globally unique scope identifiers. 
[Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.] Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function: foo(noalias a, noalias b) { a = b; } that gets inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functions have the metadata imply that a1 does not alias with b2. llvm-svn: 213864	2014-07-24 14:25:39 +00:00
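The aliasing rule stated in the message above can be modeled in a few lines. This Python sketch follows the rule only as it is worded in this commit message, using the stylized scope names from the example; it is not the LLVM implementation, and the Access type and function names are hypothetical.

```python
from dataclasses import dataclass

# Scope hierarchy from the message: each scope maps to its parent.
PARENT = {
    "scope0": None,
    "scope1": "scope0",
    "scope2": "scope0",
    "scope3": "scope2",
    "scope4": "scope2",
}


def is_same_or_descendant(scope, ancestor):
    # Walk up the hierarchy from `scope` looking for `ancestor`.
    while scope is not None:
        if scope == ancestor:
            return True
        scope = PARENT[scope]
    return False


@dataclass(frozen=True)
class Access:
    alias_scopes: frozenset = frozenset()  # contents of !alias.scope
    noalias: frozenset = frozenset()       # contents of !noalias


def one_way_no_alias(a, b):
    # True if some alias.scope of `a` equals, or descends from,
    # some noalias scope of `b`.
    return any(
        is_same_or_descendant(s, n)
        for s in a.alias_scopes
        for n in b.noalias
    )


def assumed_no_alias(a, b):
    return one_way_no_alias(a, b) or one_way_no_alias(b, a)


# The two loads from the message.
load1 = Access(alias_scopes=frozenset({"scope1"}))
load2 = Access(alias_scopes=frozenset({"scope1", "scope2"}),
               noalias=frozenset({"scope1"}))
```

Here assumed_no_alias(load1, load2) holds because load1 carries scope1 and load2 lists scope1 in its noalias set. An access tagged with scope3 would likewise be assumed not to alias an access whose noalias set contains scope2, since scope3 descends from scope2.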
Aaron Ballman	99e0ea0aa8	Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended. llvm-svn: 213863	2014-07-24 14:24:59 +00:00
Manman Ren	edc60376ed	SimplifyCFG: fix a bug in switch to table conversion We use gep to access the global array "switch.table", and the table index should be treated as unsigned. When the highest bit is 1, this commit zero-extends the index to an integer type with larger size. For a switch on i2, we used to generate: %switch.tableidx = sub i2 %0, -2 getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate %switch.tableidx = sub i2 %0, -2 %switch.tableidx.zext = zext i2 %switch.tableidx to i3 getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext rdar://17735071 llvm-svn: 213815	2014-07-23 23:13:23 +00:00
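The i2 case in the message above can be worked through numerically. The sketch below is plain Python arithmetic, not LLVM code; the helper names are hypothetical. It reinterprets a bit pattern as a signed integer, which is how the table index is read when it is not treated as unsigned.

```python
def as_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):          # top bit set: negative
        return value - (1 << bits)
    return value


def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low `from_bits` bits, new high bits are zero."""
    return value & ((1 << from_bits) - 1)


# As an i2, indices 2 and 3 have their highest bit set and read as negative.
bad = [as_signed(i, 2) for i in range(4)]
# After zero-extending to i3, every index keeps its intended value.
good = [as_signed(zext(i, 2), 3) for i in range(4)]
```

With these definitions, bad is [0, 1, -2, -1] and good is [0, 1, 2, 3], which is why the fix widens the index by one bit before the getelementptr.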
Duncan P. N. Exon Smith	6c99015fe2	Revert "[C++11] Add predecessors(BasicBlock ) / successors(BasicBlock ) iterator ranges." This reverts commit r213474 (and r213475), which causes a miscompile on a stage2 LTO build. I'll reply on the list in a moment. llvm-svn: 213562	2014-07-21 17:06:51 +00:00
Manuel Jacob	d11beffef4	[C++11] Add predecessors(BasicBlock ) / successors(BasicBlock ) iterator ranges. Summary: This patch introduces two new iterator ranges and updates existing code to use it. No functional change intended. Test Plan: All tests (make check-all) still pass. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4481 llvm-svn: 213474	2014-07-20 09:10:11 +00:00
Peter Collingbourne	818f5c4837	Give SplitBlockAndInsertIfThen the ability to update a domtree. llvm-svn: 213045	2014-07-15 04:40:27 +00:00
Owen Anderson	a8d1c3e74e	Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did not properly handle the case where the predecessor block was the entry block to the function. The only in-tree client of this is JumpThreading, which worked around the issue in its own code. This patch moves the solution into the helper so that JumpThreading (and other clients) do not have to replicate the same fix everywhere. llvm-svn: 212875	2014-07-12 07:12:47 +00:00

3409 Commits