llvm-project

Commit Graph

Author	SHA1	Message	Date
Nadav Rotem	3072baeb9c	Add a micro optimization to catch cases where the PtrA equals PtrB. llvm-svn: 186531	2013-07-17 19:52:25 +00:00
Hal Finkel	ec7cd26968	Fix comparisons of alloca alignment in inliner merging Duncan pointed out a mistake in my fix in r186425 when only one of the allocas being compared had the target-default alignment. This is essentially his suggested solution. Thanks! llvm-svn: 186510	2013-07-17 14:32:41 +00:00
Craig Topper	24048c9440	Mark a method 'const' and another 'static'. llvm-svn: 186485	2013-07-17 03:54:53 +00:00
Craig Topper	1c4d667ca5	Make a few more static string pointers constant. llvm-svn: 186484	2013-07-17 03:43:10 +00:00
Nadav Rotem	2202317fce	SLPVectorizer: Accelerate the isConsecutive check by replacing the subtraction of the two values with a simple SCEV expression that adds the offset to one of the pointers that we compare. llvm-svn: 186479	2013-07-17 00:48:31 +00:00
Nadav Rotem	d2e8c4cdea	flip the scev minus direction to simplify the code. llvm-svn: 186466	2013-07-16 22:57:06 +00:00
Nadav Rotem	8f924f3891	SLPVectorizer: Improve the compile time of isConsecutive by adding a simple constant-gep check before using SCEV. This check does not always work because not all of the GEPs use a constant offset, but it happens often enough to reduce the number of times we use SCEV. llvm-svn: 186465	2013-07-16 22:51:07 +00:00
Rafael Espindola	6d35481c94	Add a wrapper for open. This centralizes the handling of O_BINARY and opens the way for hiding more differences (like how open behaves with directories). llvm-svn: 186447	2013-07-16 19:44:17 +00:00
Peter Collingbourne	8b77f18da0	Make SpecialCaseList match full strings, as documented, using anchors. Differential Revision: http://llvm-reviews.chandlerc.com/D1149 llvm-svn: 186431	2013-07-16 17:56:07 +00:00
Hal Finkel	9caa8f7ba7	When the inliner merges allocas, it must keep the larger alignment For safety, the inliner cannot decrease the allignment on an alloca when merging it with another. I've included two variants of the test case for this: one with DataLayout available, and one without. When DataLayout is not available, if only one of the allocas uses the default alignment (getAlignment() == 0), then they cannot be safely merged. llvm-svn: 186425	2013-07-16 17:10:55 +00:00
Nadav Rotem	26bf9a0c75	SLPVectorizer: Reduce the compile time of the consecutive store lookup. Process groups of stores in chunks of 16. llvm-svn: 186420	2013-07-16 15:25:17 +00:00
Craig Topper	d3a34f81f8	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Nadav Rotem	1c1d6c1666	PR16628: Fix a bug in the code that merges compares. Compares return i1 but they compare different types. llvm-svn: 186359	2013-07-15 22:52:48 +00:00
Stephen Lin	837bba1c51	Remove trailing whitespace llvm-svn: 186333	2013-07-15 17:55:02 +00:00
Chandler Carruth	e3899f2c2c	Revert r186316 while I track down an ASan failure and an assert from a bot. This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way. llvm-svn: 186332	2013-07-15 17:36:21 +00:00
Chandler Carruth	e74ff4c643	Reimplement SROA yet again. Same fundamental principle, but a totally different core implementation strategy. Previously, SROA would build a relatively elaborate partitioning of an alloca, associate uses with each partition, and then rewrite the uses of each partition in an attempt to break apart the alloca into chunks that could be promoted. This was very wasteful in terms of memory and compile time because regardless of how complex the alloca or how much we're able to do in breaking it up, all of the datastructure work to analyze the partitioning was done up front. The new implementation attempts to form partitions of the alloca lazily and on the fly, rewriting the uses that make up that partition as it goes. This has a few significant effects: 1) Much simpler data structures are used throughout. 2) No more double walk of the recursive use graph of the alloca, only walk it once. 3) No more complex algorithms for associating a particular use with a particular partition. 4) PHI and Select speculation is simplified and happens lazily. 5) More precise information is available about a specific use of the alloca, removing the need for some side datastructures. Ultimately, I think this is a much better implementation. It removes about 300 lines of code, but arguably removes more like 500 considering that some code grew in the process of being factored apart and cleaned up for this all to work. I've re-used as much of the old implementation as possible, which includes the lion's share of code in the form of the rewriting logic. The interesting new logic centers around how the uses of a partition are sorted, and split into actual partitions. Each instruction using a pointer derived from the alloca gets a 'Partition' entry. This name is totally wrong, but I'll do a rename in a follow-up commit as there is already enough churn here. The entry describes the offset range accessed and the nature of the access. Once we have all of these entries we sort them in a very specific way: increasing order of begin offset, followed by whether they are splittable uses (memcpy, etc), followed by the end offset or whatever. Sorting by splittability is important as it simplifies the collection of uses into a partition. Once we have these uses sorted, we walk from the beginning to the end building up a range of uses that form a partition of the alloca. Overlapping unsplittable uses are merged into a single partition while splittable uses are broken apart and carried from one partition to the next. A partition is also introduced to bridge splittable uses between the unsplittable regions when necessary. I've looked at the performance PRs fairly closely. PR15471 no longer will even load (the module is invalid). Not sure what is up there. PR15412 improves by between 5% and 10%, however it is nearly impossible to know what is holding it up as SROA (the entire pass) takes less time than reading the IR for that test case. The analysis takes the same time as running mem2reg on the final allocas. I suspect (without much evidence) that the new implementation will scale much better however, and it is just the small nature of the test cases that makes the changes small and noisy. Either way, it is still simpler and cleaner I think. llvm-svn: 186316	2013-07-15 10:30:19 +00:00
Craig Topper	06b3b6651e	Add 'const' qualifier to some arrays. llvm-svn: 186312	2013-07-15 08:02:13 +00:00
Craig Topper	5871321e49	Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). llvm-svn: 186301	2013-07-15 04:27:47 +00:00
Nadav Rotem	d9f3f4548e	SLPVectorizer: change the order in which we search for vectorization candidates. Do stores first and PHIs second. llvm-svn: 186277	2013-07-14 06:15:46 +00:00
Craig Topper	b94011fd28	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Arnold Schwaighofer	a92eeebde8	LoopVectorizer: Disallow reductions whose header phi is used outside the loop If an outside loop user of the reduction value uses the header phi node we cannot just reduce the vectorized phi value in the vector code epilog because we would loose VF-1 reductions. lp: p = phi (0, lv) lv = lv + 1 ... brcond , lp, outside outside: usr = add 0, p (Say the loop iterates two times, the value of p coming out of the loop is one). We cannot just transform this to: vlp: p = phi (<0,0>, lv) lv = lv + <1,1> .. brcond , lp, outside outside: p_reduced = p[0] + [1]; usr = add 0, p_reduced (Because the original loop iterated two times the vectorized loop would iterate one time, but p_reduced ends up being zero instead of one). We would have to execute VF-1 iterations in the scalar remainder loop in such cases. For now, just disable vectorization. PR16522 llvm-svn: 186256	2013-07-13 19:09:29 +00:00
Andrew Trick	0ae8c94f8f	LoopVectorize fix: LoopInfo must be valid when invoking utils like SCEVExpander. In general, one should always complete CFG modifications first, update CFG-based analyses, like Dominatores and LoopInfo, then generate instruction sequences. LoopVectorizer was creating a new loop, calling SCEVExpander to generate checks, then updating LoopInfo. I just changed the order. llvm-svn: 186241	2013-07-13 06:20:06 +00:00
Nick Lewycky	7459be6dc7	Add a microoptimization for urem. llvm-svn: 186235	2013-07-13 01:16:47 +00:00
Joey Gouly	a3250f22c2	Fix a crash in EvaluateInDifferentElementOrder where it would generate an undef vector of the wrong type. LGTM'd by Nick Lewycky on IRC. llvm-svn: 186224	2013-07-12 23:08:06 +00:00
Andrew Trick	a1e4118a46	LFTR improvement to avoid truncation. This is a reimplemntation of the patch originally in r186107. llvm-svn: 186215	2013-07-12 22:08:48 +00:00
Andrew Trick	2b71848ffe	Cleanup LFTR logic. llvm-svn: 186214	2013-07-12 22:08:44 +00:00
Andrew Trick	466555e50d	Cleanup: rename a variable to make the logic easier to follow. llvm-svn: 186213	2013-07-12 22:08:41 +00:00
Arnold Schwaighofer	9da9a43af8	TargetTransformInfo: address calculation parameter for gather/scather Address calculation for gather/scather in vectorized code can incur a significant cost making vectorization unbeneficial. Add infrastructure to add cost. Tests and cost model for targets will be in follow-up commits. radar://14351991 llvm-svn: 186187	2013-07-12 19:16:02 +00:00
Chandler Carruth	cf3715cadd	Revert "indvars: Improve LFTR by eliminating truncation when comparing against a constant." This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended: #include <stdio.h> int main() { unsigned char first = 0, last = 255; do { printf("%d\n", first); } while (first++ != last); } Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit. llvm-svn: 186152	2013-07-12 11:18:55 +00:00
Nadav Rotem	89c41bf06a	SLPVectorizer: Sink and enable CSE for ExtractElements. llvm-svn: 186145	2013-07-12 06:09:24 +00:00
Nadav Rotem	fa3c2db211	SLPVectorize: Replace the code that checks for vectorization candidates in successor blocks with code that scans PHINodes. Before we could vectorize PHINodes scanning successors was a good way of finding candidates. Now we can vectorize the phinodes which is simpler. llvm-svn: 186139	2013-07-12 00:04:18 +00:00
Nadav Rotem	db06b139fd	Remove an argument that we dont use anymore. llvm-svn: 186116	2013-07-11 20:56:13 +00:00
Andrew Trick	3095993d6f	indvars: Improve LFTR by eliminating truncation when comparing against a constant. Patch by Michele Scandale! Adds a special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match correctly the operand types, it allows to convert the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order have an equivalent comparison between the induction variable and the new constant. llvm-svn: 186107	2013-07-11 17:08:59 +00:00
Benjamin Kramer	fc3ea6f4bc	Don't use a potentially expensive shift if all we want is one set bit. No functionality change. llvm-svn: 186095	2013-07-11 16:05:50 +00:00
Arnold Schwaighofer	e97c71b8fd	LoopVectorize: Vectorize all accesses in address space zero with unit stride We can vectorize them because in the case where we wrap in the address space the unvectorized code would have had to access a pointer value of zero which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you Duncan, for pointing this out to me). Fixes PR16592. llvm-svn: 186088	2013-07-11 15:21:55 +00:00
Duncan Sands	e773c08021	TryToSimplifyUncondBranchFromEmptyBlock was checking that any common predecessors of the two blocks it is attempting to merge supply the same incoming values to any phi in the successor block. This change allows merging in the case where there is one or more incoming values that are undef. The undef values are rewritten to match the non-undef value that flows from the other edge. Patch by Mark Lacey. llvm-svn: 186069	2013-07-11 08:28:20 +00:00
Nadav Rotem	08efb262a9	Fix a warning. llvm-svn: 186064	2013-07-11 05:39:02 +00:00
Nadav Rotem	b8dd66f655	SLPVectorizer: refactor the code that places extracts. Place the code that decides where to put extracts in the build-tree phase. This allows us to take the cost of the extracts into account. llvm-svn: 186058	2013-07-11 04:54:05 +00:00
Michael Gottesman	b40db26eae	Teach TailRecursionElimination to handle certain cases of nocapture escaping allocas. Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE or mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute. Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (NOTE B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since: 1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments. 2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A. With the changes introduced by this patch: 1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early. 2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments: i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm. ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. NOTE This excludes functions with escaping nocapture allocas. 3. If we do not see any escaping allocas at all (whether captured or not): i. If we do not have usage of setjmp, mark all callsites with the tail marker. ii. If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function. Based off of a patch by Nick Lewycky. rdar://14324281. llvm-svn: 186057	2013-07-11 04:40:01 +00:00
Michael Gottesman	6eb95dc2f7	[objc-arc] Changed 'mode: c++' => 'C++' at Nick Lewycky's suggestion. Also removed unnecessary mode: c++ lines from .cpp files. llvm-svn: 186026	2013-07-10 18:49:00 +00:00
Peter Collingbourne	49062a97cf	Implement categories for special case lists. A special case list can now specify categories for specific globals, which can be used to instruct an instrumentation pass to treat certain functions or global variables in a specific way, such as by omitting certain aspects of instrumentation while keeping others, or informing the instrumentation pass that a specific uninstrumentable function has certain semantics, thus allowing the pass to instrument callers according to those semantics. For example, AddressSanitizer now uses the "init" category instead of global-init prefixes for globals whose initializers should not be instrumented, but which in all other respects should be instrumented. The motivating use case is DataFlowSanitizer, which will have a number of different categories for uninstrumentable functions, such as "functional" which specifies that a function has pure functional semantics, or "discard" which indicates that a function's return value should not be labelled. Differential Revision: http://llvm-reviews.chandlerc.com/D1092 llvm-svn: 185978	2013-07-09 22:03:17 +00:00
Peter Collingbourne	2eb048d230	Introduce a SpecialCaseList ctor which takes a MemoryBuffer to make it more unit testable, and fix memory leak in the other ctor. Differential Revision: http://llvm-reviews.chandlerc.com/D1090 llvm-svn: 185976	2013-07-09 22:03:09 +00:00
Peter Collingbourne	015370e23a	Rename BlackList class to SpecialCaseList and move it to Transforms/Utils. Differential Revision: http://llvm-reviews.chandlerc.com/D1089 llvm-svn: 185975	2013-07-09 22:02:49 +00:00
Nadav Rotem	d7b574e5b3	Fix PR16571, which is a bug in the code that checks that all of the types in the bundle are uniform. llvm-svn: 185970	2013-07-09 21:38:08 +00:00
Nadav Rotem	861bef7dd0	Set the default insert point to the first instruction, and not to end() llvm-svn: 185953	2013-07-09 17:55:36 +00:00
David Majnemer	eeed73b981	InstCombine: Fix typo in comment for visitICmpInstWithInstAndIntCst llvm-svn: 185916	2013-07-09 09:24:35 +00:00
David Majnemer	72d76275ac	InstCombine: variations on 0xffffffff - x >= 4 The following transforms are valid if -C is a power of 2: (icmp ugt (xor X, C), ~C) -> (icmp ult X, C) (icmp ult (xor X, C), -C) -> (icmp uge X, C) These are nice, they get rid of the xor. llvm-svn: 185915	2013-07-09 09:20:58 +00:00
David Majnemer	414d4e58aa	InstCombine: X & -C != -C -> X <= u ~C Tests were added in r185910 somehow. llvm-svn: 185912	2013-07-09 08:09:32 +00:00
David Majnemer	bafa537eb7	Commit r185909 was a misapplied patch, fix it llvm-svn: 185910	2013-07-09 07:58:32 +00:00
David Majnemer	f2a9a513c7	InstCombine: add more transforms C1-X <u C2 -> (X\|(C2-1)) == C1 C1-X >u C2 -> (X\|C2) == C1 X-C1 <u C2 -> (X & -C2) == C1 X-C1 >u C2 -> (X & ~C2) == C1 llvm-svn: 185909	2013-07-09 07:50:59 +00:00
Eli Bendersky	07b0e451ca	Fix comment llvm-svn: 185888	2013-07-08 23:57:07 +00:00
Nadav Rotem	c9c57518ab	This patch changes the saved IRBuilder insert point from BasicBlock::iterator to AssertingVH. Commit 185883 fixes a bug in the IRBuilder that should fix the ASan bot. AssertingVH can help in exposing some RAUW problems. Thanks Ben and Alexey! llvm-svn: 185886	2013-07-08 23:31:13 +00:00
Michael Gottesman	c1b648f6c0	[objc-arc] Fix assertion in EraseInstruction so that noop on null calls when passed null do not trigger the assert. The specific case of interest is when objc_retainBlock is passed null. llvm-svn: 185885	2013-07-08 23:30:23 +00:00
David Majnemer	fa90a0b325	InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1 Back in r179493 we determined that two transforms collided with each other. The fix back then was to reorder the transforms so that the preferred transform would give it a try and then we would try the secondary transform. However, it was noted that the best approach would canonicalize one transform into the other, removing the collision and allowing us to optimize IR given to us in that form. llvm-svn: 185808	2013-07-08 11:53:08 +00:00
Nadav Rotem	2ee35771a8	Clear the builder insert point between tree-vectorization phases. llvm-svn: 185777	2013-07-07 14:57:18 +00:00
Nadav Rotem	2041b742d4	SLPVectorizer: Implement DCE as part of vectorization. This is a complete re-write if the bottom-up vectorization class. Before this commit we scanned the instruction tree 3 times. First in search of merge points for the trees. Second, for estimating the cost. And finally for vectorization. There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design. In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree. llvm-svn: 185774	2013-07-07 06:57:07 +00:00
Michael Gottesman	618df456e2	[objc-arc] Remove the alias analysis part of r185764. Upon further reflection, the alias analysis part of r185764 is not a safe change. llvm-svn: 185770	2013-07-07 04:18:03 +00:00
Michael Gottesman	a72630d453	[objc-arc] Teach the ARC optimizer that objc_sync_enter/objc_sync_exit do not modify the ref count of an objc object and additionally are inert for modref purposes. llvm-svn: 185769	2013-07-07 01:52:55 +00:00
Michael Gottesman	e557da26db	[objc-arc] When we initialize ARCRuntimeEntryPoints, make sure we reset all references to entrypoint declarations as well. llvm-svn: 185764	2013-07-06 18:43:05 +00:00
Benjamin Kramer	3d90a8f4f9	Reassociate: Remove unnecessary default operator=. llvm-svn: 185757	2013-07-06 15:10:13 +00:00
Michael Gottesman	4d9439c73f	[objc-arc] Performed some small cleanups in ARCRuntimeEntryPoints and added an llvm_unreachable after the switch to quiet -Wreturn_type errors. llvm-svn: 185746	2013-07-06 02:18:56 +00:00
Michael Gottesman	574d521c85	[objc-arc] Renamed Module => TheModule in ARCRuntimeEntryPoints. Also did some small cleanups. This fixes an issue that came up due to -fpermissive on the bots. llvm-svn: 185744	2013-07-06 01:57:32 +00:00
Michael Gottesman	01df45056e	Removed trailing whitespace. llvm-svn: 185743	2013-07-06 01:41:35 +00:00
Michael Gottesman	0b912b2673	[objc-arc] Updated ObjCARCContract to use ARCRuntimeEntryPoints. llvm-svn: 185742	2013-07-06 01:39:26 +00:00
Michael Gottesman	14acfacc48	[objc-arc] Updated ObjCARCOpts to use ARCRuntimeEntryPoints. llvm-svn: 185741	2013-07-06 01:39:23 +00:00
Michael Gottesman	a94186a4c0	[objc-arc] Refactor runtime entrypoint declaration entrypoint creation. This is the first patch in a series of 3 patches which clean up how we create runtime function declarations in the ARC optimizer when they do not exist already in the IR. Currently we have a bunch of duplicated code in ObjCARCOpts, ObjCARCContract that does this. This patch refactors that code into a separate class called ARCRuntimeEntryPoints which lazily creates the declarations for said entrypoints. The next two patches will consist of the work of refactoring ObjCARCContract/ObjCARCOpts to use this new code. llvm-svn: 185740	2013-07-06 01:39:18 +00:00
Nick Lewycky	cff2cf8e3a	Fix annotation of unlink. Should fix builder. llvm-svn: 185738	2013-07-06 00:59:28 +00:00
Nick Lewycky	c2ec0725ce	Extend 'readonly' and 'readnone' to work on function arguments as well as functions. Make the function attributes pass add it to known library functions and when it can deduce it. llvm-svn: 185735	2013-07-06 00:29:58 +00:00
Rafael Espindola	155cf0f3a6	Use sys::fs::createTemporaryFile. llvm-svn: 185719	2013-07-05 20:14:52 +00:00
Sylvestre Ledru	751447a3ac	Remove a useless declarations (found by scan-build) llvm-svn: 185709	2013-07-05 15:58:12 +00:00
David Majnemer	c2a990bc00	InstCombine: (icmp eq B, 0) \| (icmp ult A, B) -> (icmp ule A, B-1) This transform allows us to turn IR that looks like: %1 = icmp eq i64 %b, 0 %2 = icmp ult i64 %a, %b %3 = or i1 %1, %2 ret i1 %3 into: %0 = add i64 %b, -1 %1 = icmp uge i64 %0, %a ret i1 %1 which means we go from lowering: cmpq %rsi, %rdi setb %cl testq %rsi, %rsi sete %al orb %cl, %al ret to lowering: decq %rsi cmpq %rdi, %rsi setae %al ret llvm-svn: 185677	2013-07-05 00:31:17 +00:00
David Majnemer	37f8f445de	InstCombine: Reimplementation of visitUDivOperand This transform was originally added in r185257 but later removed in r185415. The original transform would create instructions speculatively and then discard them if the speculation was proved incorrect. This has been replaced with a scheme that splits the transform into two parts: preflight and fold. While we preflight, we build up fold actions that inform the folding stage on how to act. llvm-svn: 185667	2013-07-04 21:17:49 +00:00
Benjamin Kramer	371722288c	SimplifyCFG: Teach switch generation some patterns that instcombine forms. This allows us to create switches even if instcombine has munged two of the incombing compares into one and some bit twiddling. This was motivated by enum compares that are common in clang. llvm-svn: 185632	2013-07-04 14:22:02 +00:00
Nick Lewycky	a833c667e2	Tabs to spaces. No functionality change. llvm-svn: 185612	2013-07-04 03:51:53 +00:00
Craig Topper	af0dea1347	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Craig Topper	31ee5866de	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185540	2013-07-03 15:07:05 +00:00
Evgeniy Stepanov	dc6d7eb860	[msan] Unpoison stack allocations and undef values in blacklisted functions. This changes behavior of -msan-poison-stack=0 flag from not poisoning stack allocations to actively unpoisoning them. llvm-svn: 185538	2013-07-03 14:39:14 +00:00
Michael Gottesman	2db11161a8	Added support in FunctionAttrs for adding relevant function/argument attributes for the posix call gettimeofday. This implies annotating it as nounwind and its arguments as nocapture. To be conservative, we do not annotate the arguments with noalias since some platforms do not have restrict on the declaration for gettimeofday. llvm-svn: 185502	2013-07-03 04:00:54 +00:00
Manman Ren	d0e67aa1ce	Debug Info: cleanup llvm-svn: 185456	2013-07-02 18:37:35 +00:00
Hal Finkel	fdbe161b1a	Revert r185257 (InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms) I'm reverting this commit because: 1. As discussed during review, it needs to be rewritten (to avoid creating and then deleting instructions). 2. This is causing optimizer crashes. Specifically, I'm seeing things like this: While deleting: i1 % Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1 opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed. I'd guess that these will go away once we're no longer creating/deleting instructions here, but just in case, I'm adding a regression test. Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message: InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can to simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185415	2013-07-02 05:21:11 +00:00
Nick Lewycky	26fcc51f15	Add missing break statements. Noticed by inspection. llvm-svn: 185414	2013-07-02 05:02:56 +00:00
Manman Ren	74c188f026	Debug Info: clean up usage of Verify. No functionality change. It should suffice to check the type of a debug info metadata, instead of calling Verify. llvm-svn: 185383	2013-07-01 21:02:01 +00:00
Arnold Schwaighofer	ef51cf202b	LoopVectorize: Math functions only read rounding mode Math functions are mark as readonly because they read the floating point rounding mode. Because we don't vectorize loops that would contain function calls that set the rounding mode it is safe to ignore this memory read. llvm-svn: 185299	2013-07-01 00:54:44 +00:00
Stephen Lin	2e551adcd9	DeadArgumentElimination: keep return value on functions that have a live argument with the 'returned' attribute (rather than generate invalid IR); however, if both can be eliminated, both will be llvm-svn: 185290	2013-06-30 20:26:21 +00:00
Benjamin Kramer	4093f29366	InstCombine: Also turn selects fed by an and into arithmetic when the types don't match. Inserting a zext or trunc is sufficient. This pattern is somewhat common in LLVM's pointer mangling code. llvm-svn: 185270	2013-06-29 21:17:04 +00:00
Benjamin Kramer	4ab72f9b9a	LoopVectorizer: Pack MemAccessInfo pairs. llvm-svn: 185263	2013-06-29 17:52:08 +00:00
Benjamin Kramer	53545693d7	Move helper classes into anonymous namespaces. llvm-svn: 185262	2013-06-29 17:02:06 +00:00
David Majnemer	5953d3712a	InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison Changing the sign when comparing the base pointer would introduce all sorts of unexpected things like: %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0 %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0 %cmp.i = icmp ult i8* %gep.i, %gep2.i %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = icmp ne i1 %cmp.i, %cmp.i1 ret i1 %cmp into: %cmp.i = icmp slt [1 x i8]* %a, %b %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = xor i1 %cmp.i, %cmp.i1 ret i1 %cmp By preserving the original sign, we now get: ret i1 false This fixes PR16483. llvm-svn: 185259	2013-06-29 10:28:04 +00:00
David Majnemer	92a8a7d45a	InstCombine: Small whitespace cleanup in FoldGEPICmp llvm-svn: 185258	2013-06-29 09:45:35 +00:00
David Majnemer	797227eea6	InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can to simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185257	2013-06-29 08:40:07 +00:00
Nadav Rotem	0a25727f31	We preserve the CFG and some of the analysis passes. llvm-svn: 185251	2013-06-29 05:38:15 +00:00
Nadav Rotem	e00343446c	Update docs. llvm-svn: 185250	2013-06-29 05:37:19 +00:00
David Majnemer	b889e405eb	InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2) We may, after other optimizations, find ourselves with IR that looks like: %shl = shl i32 1, %y %cmp = icmp ult i32 %shl, 32 Instead, we should just compare the shift count: %cmp = icmp ult i32 %y, 5 llvm-svn: 185242	2013-06-28 23:42:03 +00:00
Nadav Rotem	060be733a5	SLP Vectorizer: Add support for trees with external users. To support this we have to insert 'extractelement' instructions to pick the right lane. We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated. llvm-svn: 185230	2013-06-28 22:07:09 +00:00
Nadav Rotem	9ce3fedcdd	LoopVectorizer: Refactor the code that checks if it is safe to predicate blocks. In this code we keep track of pointers that we are allowed to read from, if they are accessed by non-predicated blocks. We use this list to allow vectorization of conditional loads in predicated blocks because we know that these addresses don't segfault. llvm-svn: 185214	2013-06-28 20:46:27 +00:00
Daniel Malea	b17b1cd6f5	Remove needless include (unistd.h) in DebugIR pass - should unbreak Windows builds llvm-svn: 185198	2013-06-28 19:19:44 +00:00
Daniel Malea	0673464a92	Add missing header for DebugIR - missed svn add... llvm-svn: 185194	2013-06-28 19:07:59 +00:00
Daniel Malea	31321fa53d	Remove limitation on DebugIR that made it require existing debug metadata. - Build debug metadata for 'bare' Modules using DIBuilder - DebugIR can be constructed to generate an IR file (to be seen by a debugger) or not in cases where the user already has an IR file on disk. llvm-svn: 185193	2013-06-28 19:05:23 +00:00
Arnold Schwaighofer	ce2c766f61	LoopVectorize: Pull dyn_cast into setDebugLocFromInst llvm-svn: 185168	2013-06-28 17:14:48 +00:00
Arnold Schwaighofer	3b27b992ca	LoopVectorize: Use static function instead of DebugLocSetter class I used the class to safely reset the state of the builder's debug location. I think I have caught all places where we need to set the debug location to a new one. Therefore, we can replace the class by a function that just sets the debug location. llvm-svn: 185165	2013-06-28 16:26:54 +00:00

1 2 3 4 5 ...

10577 Commits