9448 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen c5c4e96f3e Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x"
This fixes PR5997.

These transforms were disabled because codegen couldn't deal with other
uses of trunc(x). This is now handled by the peephole pass.

This causes no regressions on x86-64.

llvm-svn: 159003
2012-06-22 16:36:43 +00:00
Stepan Dyatkovskiy a6c8cc307b Fixed r158979.
Original message:
Performance optimizations:
- SwitchInst: case values are stored separately from the operands list. This allows faster access to individual case value numbers or ranges.
- Optimized IntItem, added APInt value caching.
- Optimized IntegersSubsetGeneric: added optimizations for cases when the subset is a single number or consists of single numbers only.

llvm-svn: 158997
2012-06-22 14:53:30 +00:00
Nuno Lopes 0b60ebbf79 fix whitespace in my last commit.
sorry for the churn :S  enough for today; going to sleep.

llvm-svn: 158953
2012-06-22 00:29:58 +00:00
Nuno Lopes 9792d68381 remove extractMallocCallFromBitCast, since it was tailor-made for its sole user. Update GlobalOpt accordingly.
llvm-svn: 158952
2012-06-22 00:25:01 +00:00
Nuno Lopes 771e7bd4ba instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here).
Fixes the crashes with the attached test case

llvm-svn: 158951
2012-06-21 23:52:14 +00:00
Evan Cheng 32c7cc8ec9 Look past zext to strength reduce a udiv. Patch by David Majnemer. rdar://11721329
llvm-svn: 158946
2012-06-21 22:52:49 +00:00
Nuno Lopes dc6085e52d Add support for invoke to the MemoryBuiltin analysis.
Update comments accordingly.

Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached).

llvm-svn: 158937
2012-06-21 21:25:05 +00:00
Nuno Lopes 0e967e0186 port the BoundsChecking patch to the new MemoryBuiltin API (i.e., remove most of the code from here).
Remove the alloc_size.ll test until we settle on a metadata format that makes everyone happy..

llvm-svn: 158920
2012-06-21 15:59:53 +00:00
Nuno Lopes 55fff83422 refactor the MemoryBuiltin analysis:
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
 - provide an API to compute the size and offset of an object pointed by

Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.

Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.

llvm-svn: 158919
2012-06-21 15:45:28 +00:00
Nadav Rotem 4e9012c2b1 Add a number of threshold arguments to the SRA pass.
A patch by Tom Stellard with minor changes.

llvm-svn: 158918
2012-06-21 13:44:31 +00:00
Nuno Lopes 3fa32f2452 replace usage of EmitGEPOffset() with TargetData::getIndexedOffset() when the GEP offset is known to be constant.
With this change, we avoid relying on the IR Builder to constant fold the operations.

No functionality change intended.

llvm-svn: 158829
2012-06-20 17:30:51 +00:00
Chandler Carruth c60fbe6b58 Fix two rather subtle internal vs. external linker issues.
I'll admit I'm not entirely satisfied with this change, but it seemed
the cleanest option. Other suggestions are quite welcome.

The issue is that the traits specializations have static methods which
return the typedef'ed PHI_iterator type. In both the IR and MI layers
this is typedef'ed to a custom iterator class defined in an anonymous
namespace giving the types and the functions returning them internal
linkage. However, because the traits specialization is defined in the
'llvm' namespace (where it has to be, specialized template lives there),
and is in turn used in the templated implementation of the SSAUpdater.
This led to the linkage conflict that Clang now warns about.

The simplest solution to me was just to define the PHI_iterator as
a nested class inside the trait specialization. That way it still
doesn't get scoped widely, it can't be accidentally reused somewhere,
etc. This is a little gross just because nested class definitions are
a little gross, but the alternatives seem more ad-hoc.

llvm-svn: 158799
2012-06-20 08:39:30 +00:00
Pete Cooper 33ee6c9bf1 Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts
llvm-svn: 158623
2012-06-17 03:58:26 +00:00
Hal Finkel fa103d3fc7 Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions.
The present implementation handles only TBAA and FP metadata, discarding everything else.
For debug metadata, the current behavior is maintained (the debug metadata associated with
one of the instructions will be kept, discarding that attached to the other).

This should address PR 13040.

llvm-svn: 158606
2012-06-16 20:34:06 +00:00
Hal Finkel 16ddd4b66b Move the Metadata merging methods from GVN and make them public in MDNode.
There are other passes, BBVectorize specifically, that also need some of
this functionality.

llvm-svn: 158605
2012-06-16 20:33:37 +00:00
Evan Cheng 773b2cd63c It's not deterministic to iterate over SmallPtrSet. Replace it with SmallSetVector. Patch by Daniel Reynaud. rdar://11671029
llvm-svn: 158594
2012-06-16 04:28:11 +00:00
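The SmallPtrSet change above swaps a hash-ordered set for one that iterates in insertion order. A minimal sketch of that idea, assuming a hypothetical `OrderedSet` built from standard containers (this is not LLVM's SmallSetVector, only an illustration of why iteration becomes deterministic):

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set; not LLVM's SmallSetVector.
template <typename T>
class OrderedSet {
  std::vector<T> order_;        // elements in first-insertion order
  std::unordered_set<T> seen_;  // used for membership tests only, never iterated

public:
  // Returns true if the value was newly inserted.
  bool insert(const T &v) {
    if (!seen_.insert(v).second)
      return false;
    order_.push_back(v);
    return true;
  }
  std::size_t size() const { return order_.size(); }
  typename std::vector<T>::const_iterator begin() const { return order_.begin(); }
  typename std::vector<T>::const_iterator end() const { return order_.end(); }
};

// Inserts the values and returns them in the order iteration visits them.
inline std::vector<int> iterationOrder(const std::vector<int> &inserts) {
  OrderedSet<int> s;
  for (int v : inserts)
    s.insert(v);
  assert(s.size() <= inserts.size());
  return std::vector<int>(s.begin(), s.end());
}
```

Because iteration walks the vector rather than the hash table, the visiting order depends only on the insertion sequence, not on pointer values or hashing.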
Pete Cooper 818e9f4a26 Fix crash from r158529 on Bullet.
Dynamic GEPs created by SROA needed to insert extra "i32 0"
operands to index through structs and arrays to get to the
vector being indexed.

llvm-svn: 158590
2012-06-16 01:43:26 +00:00
Andrew Trick 8370c7c38f LSR: fix expansion of scaled reg in non-address type formulae.
For non-address users, Base and Scaled registers are not specially
associated to fit an address mode, so SCEVExpander should apply normal
expansion rules. Otherwise we may sink computation into inner loops
that have already been optimized.

llvm-svn: 158537
2012-06-15 20:07:29 +00:00
Andrew Trick aca8fb3c45 LSR fix: "Special" users are just like "Basic" users but allow -1 scale.
llvm-svn: 158536
2012-06-15 20:07:26 +00:00
Pete Cooper e24d6a19e3 Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed
llvm-svn: 158529
2012-06-15 18:07:29 +00:00
Rafael Espindola 1821c6c3b0 Some optimizations done by globalopt are safe only for internal linkage, not
linkonce linkage. For example, it is not valid to add unnamed_addr.

This also fixes a crash in g++.dg/opt/static5.C.

llvm-svn: 158528
2012-06-15 18:00:24 +00:00
Duncan Sands 7838603ffc Fix issues (infinite loop and/or crash) with self-referential instructions, for
example degenerate phi nodes and binops that use themselves in unreachable code.
Thanks to Charles Davis for the testcase that uncovered this can of worms.

llvm-svn: 158508
2012-06-15 08:37:50 +00:00
Pete Cooper 1d1fa72837 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158479
2012-06-14 23:53:53 +00:00
Rafael Espindola def1b09be2 Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and
globaldce. Globaldce was already removing linkonce globals, but globalopt was
not.

llvm-svn: 158476
2012-06-14 22:48:13 +00:00
Pete Cooper 5d19452f3f Revert r158454: Allow SROA to look at a vector type... It's breaking the vectorise buildbot
This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c.

llvm-svn: 158462
2012-06-14 18:32:52 +00:00
Pete Cooper a7e6d58a87 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158454
2012-06-14 16:38:13 +00:00
Manman Ren c2bc2d106b InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.

llvm-svn: 158441
2012-06-14 05:57:42 +00:00
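To see why `uno && ueq` should fold to `uno` in the InstCombine fix above, it helps to treat each fcmp predicate as a 4-bit mask. The sketch below uses the numeric predicate values from LLVM's `CmpInst::Predicate` (bit 0 = equal, bit 1 = greater, bit 2 = less, bit 3 = unordered); the helper `andOfFCmps` is a hypothetical name for illustration and is not the code InstCombine actually uses:

```cpp
#include <cassert>
#include <cstdint>

// Subset of fcmp predicate codes; each bit names one outcome the predicate accepts.
enum FCmpCode : std::uint8_t {
  FCMP_FALSE = 0,  // accepts nothing
  FCMP_OEQ = 1,    // equal
  FCMP_ORD = 7,    // equal, greater, or less (no NaN operand)
  FCMP_UNO = 8,    // unordered (some operand is NaN)
  FCMP_UEQ = 9,    // unordered or equal
  FCMP_TRUE = 15   // accepts everything
};

// Hypothetical helper: the AND of two fcmps on the same operands accepts
// exactly the outcomes that both predicates accept.
inline FCmpCode andOfFCmps(FCmpCode lhs, FCmpCode rhs) {
  return static_cast<FCmpCode>(lhs & rhs);
}
```

With this encoding, `uno & ueq` is `8 & 9 = 8`, which is `uno`, and the result does not depend on operand order.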
Pete Cooper e2fe809772 Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access"
This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f.

llvm-svn: 158408
2012-06-13 17:55:22 +00:00
Pete Cooper e1d4e8b563 Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access
llvm-svn: 158407
2012-06-13 17:30:34 +00:00
Duncan Sands 409d8ae165 It is possible for several constants which aren't individually absorbing to
combine to the absorbing element.  Thanks to nbjoerg on IRC for pointing this 
out.

llvm-svn: 158399
2012-06-13 12:15:56 +00:00
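A small illustration of the point in the commit above: with wrap-around arithmetic, constants that are not zero on their own can still multiply to zero, the absorbing element for multiplication. The function name `foldMul8` and the choice of 8-bit values are assumptions made for this sketch; Reassociate itself works on arbitrary-width values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Multiplies 8-bit constants modulo 256 and stops as soon as the running
// product becomes zero, since nothing can change the result after that.
inline std::uint8_t foldMul8(const std::vector<std::uint8_t> &constants) {
  std::uint8_t acc = 1;
  for (std::uint8_t c : constants) {
    acc = static_cast<std::uint8_t>(acc * c);  // wraps modulo 256
    if (acc == 0)
      return 0;  // absorbing element reached
  }
  return acc;
}
```

For example, 16 is not absorbing, but 16 * 16 = 256 wraps to 0 in 8 bits, so checking each constant individually for zero is not enough.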
Duncan Sands 318a89ddac When linearizing a multiplication, return at once if we see a factor of zero,
since then the entire expression must equal zero (similarly for other operations
with an absorbing element).  With this in place a bunch of reassociate code for
handling constants is dead since it is all taken care of when linearizing.  No
intended functionality change.

llvm-svn: 158398
2012-06-13 09:42:13 +00:00
Manman Ren d33f4efbfd SimplifyCFG: fold unconditional branch to its predecessor if profitable.
This patch extends FoldBranchToCommonDest to fold unconditional branches.
For unconditional branches, we fold them if it is easy to update the phi nodes 
in the common successors.

rdar://10554090

llvm-svn: 158392
2012-06-13 05:43:29 +00:00
Duncan Sands 72aea01b6e Use DenseMap as SmallMap workaround rather than std::map, at Chandler's request.
llvm-svn: 158371
2012-06-12 20:26:43 +00:00
Duncan Sands 67cd591989 Use std::map rather than SmallMap because SmallMap assumes that the value has
POD type, causing memory corruption when mapping to APInts with bitwidth > 64.
Merge another crash testcase into crash.ll while there.

llvm-svn: 158369
2012-06-12 20:16:51 +00:00
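The SmallMap problem described above comes from treating a value type as plain data when it owns heap storage. A hedged sketch of how such an assumption can be checked at compile time, using a hypothetical `WideInt` as a stand-in for an APInt wider than 64 bits (the real APInt layout differs):

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical wide integer that owns heap storage.
struct WideInt {
  std::vector<std::uint64_t> words;
};

// True only for types whose bytes may be copied directly.
template <typename T>
constexpr bool safeToMemcpy() {
  return std::is_trivially_copyable<T>::value;
}

static_assert(safeToMemcpy<std::uint64_t>(), "plain integers can be byte-copied");
static_assert(!safeToMemcpy<WideInt>(), "heap-owning values need real copy semantics");
```

A container that byte-copies its values would duplicate the internal pointer of `WideInt` without duplicating the storage, which is the kind of corruption the commit works around.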
Duncan Sands d7aeefebd6 Now that Reassociate's LinearizeExprTree can look through arbitrary expression
topologies, it is quite possible for a leaf node to have huge multiplicity, for
example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x
raised to a vast power (the multiplicity, or weight, of x).  This patch fixes
the computation of weights by correctly computing them no matter how big they
are, rather than just overflowing and getting a wrong value.  It turns out that
the weight for a value never needs more bits to represent than the value itself,
so it is enough to represent weights as APInts of the same bitwidth and do the
right overflow-avoiding dance steps when computing weights.  As a side-effect it
reduces the number of multiplies needed in some cases of large powers.  While
there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree
static, pushing the rank computation out into users.  This is progress towards
fixing PR13021.

llvm-svn: 158358
2012-06-12 14:33:56 +00:00
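The overflow that the commit above fixes is easy to reproduce: each squaring doubles the multiplicity of x, so a fixed-width counter wraps after a few dozen steps. The sketch below only demonstrates the problem with a hypothetical `weightAfterSquarings` helper; the actual fix represents weights as APInts and is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Weight (multiplicity) of x after `squarings` rounds of x_{i+1} = x_i * x_i,
// tracked in an unsigned counter of type T.
template <typename T>
T weightAfterSquarings(unsigned squarings) {
  T w = 1;
  for (unsigned i = 0; i < squarings; ++i)
    w = static_cast<T>(w * 2);  // each squaring doubles the multiplicity
  return w;
}
```

After 32 squarings a 32-bit counter has wrapped around to zero, while a 64-bit counter still holds the true weight 2^32.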
Benjamin Kramer 2150145ae4 InstCombine: factor code better.
No functionality change.

llvm-svn: 158301
2012-06-11 08:01:25 +00:00
Benjamin Kramer 8b8a76974f InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.

stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us

where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.

llvm-svn: 158297
2012-06-10 20:35:00 +00:00
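The narrowing transform in the commit above can be checked by brute force for a concrete width. This sketch assumes A is 8 bits wide, B is 32 bits wide, and X = 8; the function names are hypothetical and the loop samples B rather than covering all 2^32 values.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (zext A) == (B & ((1 << 8) - 1)), compared at 32 bits.
inline bool wideCompare(std::uint8_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>(a) == (b & ((1u << 8) - 1u));
}

// Narrowed form: A == (trunc B), compared at 8 bits.
inline bool narrowCompare(std::uint8_t a, std::uint32_t b) {
  return a == static_cast<std::uint8_t>(b);
}

// Checks that both forms agree for every A and a sample of B values,
// with and without high bits set.
inline bool formsAgree() {
  const std::uint32_t highs[] = {0u, 0xFFFF0000u};
  for (unsigned a = 0; a < 256; ++a)
    for (std::uint32_t low = 0; low < 1024; ++low)
      for (std::uint32_t high : highs) {
        const std::uint8_t a8 = static_cast<std::uint8_t>(a);
        if (wideCompare(a8, high | low) != narrowCompare(a8, high | low))
          return false;
      }
  return true;
}
```

The mask keeps exactly the bits that the truncation keeps, so the zero-extended compare and the truncated compare look at the same information.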
Dmitri Gribenko dbeafa773a Convert comments to proper Doxygen comments.
llvm-svn: 158248
2012-06-09 00:01:45 +00:00
Nuno Lopes 2710f1b049 canonicalize:
-%a + 42
into
42 - %a

previously we were emitting:
-(%a + 42)

This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next

llvm-svn: 158237
2012-06-08 22:30:05 +00:00
Duncan Sands 3293f460e7 Reapply commit 158073 with a fix (the testcase was already committed). The
problem was that by moving instructions around inside the function, the pass
could accidentally move the iterator being used to advance over the function
too.  Fix this by only processing the instruction equal to the iterator, and
leaving processing of instructions that might not be equal to the iterator
to later (later = after traversing the basic block; it could also wait until
after traversing the entire function, but this might make the sets quite big).
Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158226
2012-06-08 20:15:33 +00:00
Nuno Lopes 4b68c1da54 BoundsChecking: add support for ConstantPointerNull. fixes a bunch of instrumentation failures in loops with reallocs
llvm-svn: 158210
2012-06-08 16:31:42 +00:00
Duncan Sands 9a5cf92250 Revert commit 158073 while waiting for a fix. The issue is that reassociate
can move instructions within the instruction list.  If the instruction just
happens to be the one the basic block iterator is pointing to, and it is
moved to a different basic block, then we get into an infinite loop due to
the iterator running off the end of the basic block (for some reason this
doesn't fire any assertions).  Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158199
2012-06-08 13:37:30 +00:00
Nadav Rotem 4e50efead6 Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type.
llvm-svn: 158166
2012-06-07 20:28:57 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Chad Rosier faa3894628 Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't
matter.
rdar://11579835

llvm-svn: 158084
2012-06-06 17:22:40 +00:00
Duncan Sands 763da45e9e Grab-bag of reassociate tweaks. Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158073
2012-06-06 14:53:10 +00:00
Andrew Trick a6fb910fad LoopUnroll: always check for NULL LoopPassManager
llvm-svn: 158007
2012-06-05 17:51:05 +00:00
Rafael Espindola 47d988c54c When gvn decides to replace an instruction with another, we have to patch the
replacement to make it at least as generic as the instruction being replaced.
This includes:
* dropping nsw/nuw flags
* getting the least restrictive tbaa and fpmath metadata
* merging ranges

Fixes PR12979.

llvm-svn: 157958
2012-06-04 22:44:21 +00:00
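The GVN rule above can be pictured as keeping only the facts that hold for both instructions. The sketch below is an assumption-laden simplification: `InstFacts` is a hypothetical record, a flag survives only if both instructions carry it, and ranges are merged by taking the smallest single interval covering both, which is coarser than real range metadata merging.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical summary of the optimization hints attached to one instruction.
struct InstFacts {
  bool nsw;              // no signed wrap
  bool nuw;              // no unsigned wrap
  std::int64_t rangeLo;  // half-open value range [rangeLo, rangeHi)
  std::int64_t rangeHi;
};

// Returns facts for the replacement that are valid at both original sites.
inline InstFacts makeAtLeastAsGeneric(const InstFacts &replacement,
                                      const InstFacts &replaced) {
  assert(replacement.rangeLo < replacement.rangeHi);
  assert(replaced.rangeLo < replaced.rangeHi);
  InstFacts out;
  out.nsw = replacement.nsw && replaced.nsw;  // drop a flag unless both have it
  out.nuw = replacement.nuw && replaced.nuw;
  out.rangeLo = std::min(replacement.rangeLo, replaced.rangeLo);
  out.rangeHi = std::max(replacement.rangeHi, replaced.rangeHi);
  return out;
}
```

The result never claims more than either input did, which is the property the commit needs when one instruction stands in for another.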
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy 0e46d8a08c PR1255: case ranges.
IntRange converted from struct to class. So the main change everywhere is the replacement of ".Low/High" with ".getLow/getHigh()"

llvm-svn: 157884
2012-06-02 09:42:43 +00:00
Bill Wendling e85f34969e Register the gcov "writeout" at init time. Don't list this as a d'tor. Instead,
inject some code in that will run via the "__mod_init_func" method that
registers the gcov "writeout" function to execute at exit time.

The problem is that the "__mod_term_func" method of specifying d'tors is
deprecated on Darwin. And it can lead to some ambiguities when dealing with
multiple libraries.
<rdar://problem/11110106>

llvm-svn: 157852
2012-06-01 23:14:32 +00:00
Nuno Lopes adf1c859dd BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache
add regression tests for this problem.

Can already compile & run: PHP, PCRE, and ICU  (i.e., all the software I tried)

llvm-svn: 157822
2012-06-01 17:43:31 +00:00
Nuno Lopes 288e86ff6b add -bounds-checking-multiple-traps option to make one trap BB per check
disabled by default for now; we can discuss the default value (& name) later

llvm-svn: 157777
2012-05-31 22:58:48 +00:00
Nuno Lopes 7d00061d87 revamp BoundsChecking considerably:
- compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic)
 - use APInt throughout to handle wrap-arounds
 - add support for PHI instrumentation
 - add a cache (required for recursive PHIs anyway)
 - remove hoisting support for now, since it was wrong in a few cases

sorry for the churn here.. tests will follow soon.

llvm-svn: 157775
2012-05-31 22:45:40 +00:00
Duncan Sands 339bb61e32 Enhance the sinking code to handle diamond patterns. Patch by
Carlo Alberto Ferraris.

llvm-svn: 157736
2012-05-31 08:09:49 +00:00
Kostya Serebryany 9024160439 [asan] instrument cmpxchg and atomicrmw
llvm-svn: 157683
2012-05-30 09:04:06 +00:00
Nuno Lopes 8bd45f8ecd bounds checking:
- hoist checks out of loops where SCEV is smart enough
 - add additional statistics to measure how much we lose for not supporting interprocedural and pointers loaded from memory

llvm-svn: 157649
2012-05-29 22:32:51 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Benjamin Kramer 9d5849f51d Fix suspicious hasOneUse() check, found by PVS Studio (PR12357).
llvm-svn: 157592
2012-05-28 20:52:48 +00:00
Benjamin Kramer b8743a9150 InstCombine: Fix infinite loop when encountering switch on trivial icmp.
The test case feeds the following into InstCombine's visitSelect:
%tobool8 = icmp ne i32 0, 0
%phitmp = select i1 %tobool8, i32 3, i32 0
Then instcombine replaces the right side of the switch with 0, doesn't notice
that nothing changes and tries again indefinitely.

This fixes PR12897.

llvm-svn: 157587
2012-05-28 19:18:16 +00:00
Stepan Dyatkovskiy e3e19cbb13 PR1255: Case Ranges
Implemented IntItem, a wrapper around APInt. Why not use APInt directly right now?
1. It would be very difficult to implement case ranges as a series of small patches. We would get several large and heavy patches, each about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst, you need to change all Readers, Writers and absolutely all passes that use SwitchInst at the same time.
2. We can implement an APInt pool inside and save memory. E.g. we use several switches that work with 256-bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
3. IntItem can easily be replaced with APInt.
4. Currently we can interpret IntItem both as ConstantInt and as APInt. This allows SwitchInst to provide methods that work with ConstantInt for non-updated passes.

Why do I need it right now? Currently I need to update the SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so pieces of code like
ConstantInt *V = ...;
if (V->getValue().ugt(AnotherV->getValue())) {
  ...
}
will look awful. This way is much better:
IntItem V = ConstantIntVal->getValue();
if (AnotherV < V) {
}

Of course any reviews are welcome.

P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows mapping individual subsets of integers to the BasicBlocks).
Since in future these classes will be founded on APInt, it will be possible to use them in more generic ways.

llvm-svn: 157576
2012-05-28 12:39:09 +00:00
Bill Wendling 1560517ec3 Implement the indirect counter increment code in a better way. Instead of
replicating the code for every place it's needed, we instead generate a function
that does that for us. This function is local to the executable, so there
shouldn't be any writing violations.

llvm-svn: 157564
2012-05-28 06:10:56 +00:00
Chris Lattner 3cb6f83ebb switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients.
llvm-svn: 157556
2012-05-28 01:47:44 +00:00
Benjamin Kramer 152f106e5f PR12967: Don't crash when trying to fold a shift that's larger than the type's size.
llvm-svn: 157548
2012-05-27 22:03:32 +00:00
Chris Lattner 144b619684 Reimplement the intrinsic verifier to use the same table as Intrinsic::getDefinition,
making it stronger and more sane.

Delete the code from tblgen that produced the old code.

Besides being a path forward in intrinsic sanity, this also eliminates a bunch of
machine generated code that was compiled into Function.o

llvm-svn: 157545
2012-05-27 19:37:05 +00:00
Duncan Sands 3c05cd3ea8 Since commit 157467, if reassociate isn't actually going to change an expression
then it doesn't alter the instructions composing it, however it would continue
to move the instructions to just before the expression root.  Ensure it doesn't
move them either, so now it really does nothing if there is nothing to do.  That
commit also ensured that nsw etc flags weren't cleared if the expression was not
being changed.  Tweak this a bit so that it doesn't clear flags on the initial
part of a computation either if that part didn't change but later bits did.

llvm-svn: 157518
2012-05-26 16:42:52 +00:00
Benjamin Kramer 58abf4f193 SimplifyCFG: Turn the ad-hoc std::pair that represents switch cases into an explicit struct.
llvm-svn: 157516
2012-05-26 14:29:37 +00:00
Benjamin Kramer 65e75666ff Add support for branch weight metadata to MDBuilder and use it in various places.
llvm-svn: 157515
2012-05-26 13:59:43 +00:00
Duncan Sands c94ac6fdf6 Move this debug statement earlier so it is easy to see the order in
which operands come flying out of the linearization stage.

llvm-svn: 157512
2012-05-26 07:47:48 +00:00
Bill Wendling 8ed0749a34 The llvm_gcda_increment_indirect_counter function writes to the arguments that
are passed in. However, those arguments may be in a write-protected area, as far
as the runtime library is concerned. For instance, the data could be placed into
a 'linkedit' section, which isn't writable. Emit the code from
llvm_gcda_increment_indirect_counter directly into the function instead.

Note: The code for this is ugly, and can lead to bloat. We should look into
simplifying this code instead of having all of these branches.

<rdar://problem/11181370>

llvm-svn: 157505
2012-05-25 23:55:00 +00:00
Nuno Lopes e9b0bdf804 bounds checking: add support for byval arguments
llvm-svn: 157498
2012-05-25 21:15:17 +00:00
Nuno Lopes a6da3ff896 boundschecking:
add support for select
add experimental support for alloc_size metadata

llvm-svn: 157481
2012-05-25 16:54:04 +00:00
Duncan Sands bddfb2f96b Make the reassociation pass more powerful so that it can handle expressions
with arbitrary topologies (previously it would give up when hitting a diamond
in the use graph for example).  The testcase from PR12764 is now reduced from
a pile of additions to the optimal 1617*%x0+208.  In doing this I changed the
previous strategy of dropping all uses for expression leaves to one of dropping
all but one use.  This works out more neatly (but required a bunch of tweaks)
and is also safer: some recently fixed bugs during recursive linearization were
because the linearization code thinks it completely owns a node if it has no uses
outside the expression it is linearizing.  But if the node was also in another
expression that had been linearized (and thus all uses of the node from that
expression dropped) then the conclusion that it is completely owned by the
expression currently being linearized is wrong.  Keeping one use from within each
linearized expression avoids this kind of mistake.

llvm-svn: 157467
2012-05-25 12:03:02 +00:00
Stepan Dyatkovskiy 183d18aa5a PR1255 related changes (case ranges):
LowerSwitch::Clusterify : main functionality was replaced with CRSBuilder::optimize, so a big part of Clusterify's code was reduced.
test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage.

llvm-svn: 157384
2012-05-24 09:33:20 +00:00
Nuno Lopes 10287d839f BoundsChecking: add a couple of simple tests and fix a bug in branch emission
llvm-svn: 157329
2012-05-23 16:24:52 +00:00
Patrik Hägglund 8a1e316c15 Fix the inliner so that the optsize function attribute doesn't alter the
inline threshold if the global inline threshold is lower (as for -Oz).

Reviewed by Chandler Carruth and Bill Wendling.

llvm-svn: 157323
2012-05-23 13:42:57 +00:00
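One way to read the inliner fix above is that the optsize attribute may only lower the threshold, never raise it. A minimal sketch under that reading; the function name, parameter names, and the sample numbers in the usage note are assumptions for illustration and are not taken from the LLVM sources:

```cpp
#include <algorithm>
#include <cassert>

// Picks the inline threshold for one function.
// globalThreshold: threshold selected by the optimization level.
// optSizeThreshold: threshold normally applied to optsize functions.
// hasOptSize: whether the function carries the optsize attribute.
inline int effectiveInlineThreshold(int globalThreshold, int optSizeThreshold,
                                    bool hasOptSize) {
  assert(globalThreshold >= 0 && optSizeThreshold >= 0);
  if (!hasOptSize)
    return globalThreshold;
  // optsize must not raise a global threshold that is already lower.
  return std::min(globalThreshold, optSizeThreshold);
}
```

With a hypothetical global threshold of 25 and an optsize threshold of 75, an optsize function keeps 25 instead of being bumped up to 75.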
Evgeniy Stepanov 617232f32b Use zero-based shadow by default on Android.
llvm-svn: 157317
2012-05-23 11:52:12 +00:00
Stepan Dyatkovskiy 7a50155227 PR1255(case ranges) related changes in Local Transformations.
llvm-svn: 157315
2012-05-23 08:18:26 +00:00
Nuno Lopes 59e9df773a address some of John Criswell's comments
teach computeAllocSize about realloc, reallocf, and valloc

llvm-svn: 157298
2012-05-22 22:02:19 +00:00
Nuno Lopes eee43e1bc7 hopefully fix the CMake build. sorry for breakage
llvm-svn: 157264
2012-05-22 17:40:46 +00:00
Nuno Lopes a2f6cecb6d add a new pass to instrument loads and stores for run-time bounds checking
move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h

(a draft of this) patch reviewed by Andrew, thanks.

llvm-svn: 157261
2012-05-22 17:19:09 +00:00
Nuno Lopes ad40c0a425 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Duncan Sands 4df5e96d3a Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the
leader table.  That's because it wasn't expecting instructions to turn up as
leader for a value number that is not its own, but equality propagation could
create this situation.  One solution is to have the leader table use a WeakVH
but this slows down GVN by about 5%.  Instead just have equality propagation not
add instructions to the leader table, only constants and arguments.  In theory
this might cause GVN to run more (each time it changes something it runs again)
but it doesn't seem to occur enough to cause a slow down.

llvm-svn: 157251
2012-05-22 14:17:53 +00:00
Dan Gohman 9c97eea0fd Mark an unreachable region of code with llvm_unreachable.
llvm-svn: 157197
2012-05-21 17:41:28 +00:00
Peter Collingbourne 9a03c73297 Do not pass an invalid domtree to SimplifyInstruction from
LoopUnswitch.  Fixes PR12887.

llvm-svn: 157140
2012-05-20 01:32:09 +00:00
Peter Collingbourne 97b1076435 Do not eliminate allocas whose alignment exceeds that of the
copied-in constant, as a subsequent user may rely on over alignment.
Fixes PR12885.

llvm-svn: 157134
2012-05-19 22:52:10 +00:00
Dan Gohman 14862c3141 Fix replacing all the users of objc weak runtime routines
when deleting them. rdar://11434915.

llvm-svn: 157080
2012-05-18 22:17:29 +00:00
David Majnemer a9330fe553 Teach SimplifyLibCalls about stpcpy.
llvm-svn: 156815
2012-05-15 11:46:21 +00:00
Chad Rosier a968caf8e0 Move the capture analysis from MemoryDependencyAnalysis to a more general place
so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686

llvm-svn: 156776
2012-05-14 20:35:04 +00:00
Jay Foad ca0c499609 Teach Function::hasAddressTaken that BlockAddress doesn't really take
the address of a function.

llvm-svn: 156703
2012-05-12 08:30:16 +00:00
Nuno Lopes e2cfd3ce95 objectsize: add a few more tests and fix a bug
llvm-svn: 156625
2012-05-11 18:25:29 +00:00
Eli Friedman e0a64d83fc Fix a minor logic mistake transforming compares in instcombine. PR12514.
llvm-svn: 156600
2012-05-11 01:32:59 +00:00
Nuno Lopes f573030391 objectsize: add support for GEPs with non-constant indexes
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag

llvm-svn: 156585
2012-05-10 23:17:35 +00:00
Dan Gohman ed7c24e2d9 Teach DeadStoreElimination to eliminate exit-block stores with phi addresses.
llvm-svn: 156558
2012-05-10 18:57:38 +00:00
Nuno Lopes 300d629924 teach DSE and isInstructionTriviallyDead() about calloc
llvm-svn: 156553
2012-05-10 17:14:00 +00:00
Dan Gohman f8b19d09ba Fix the objc_storeStrong recognizer to stop before walking off the
end of a basic block if there's no store.

llvm-svn: 156520
2012-05-09 23:08:33 +00:00
Nuno Lopes 7100f463b0 objectsize:
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)

llvm-svn: 156515
2012-05-09 21:30:57 +00:00
Craig Topper 28540adfcf Remove unused variable to get rid of warning.
llvm-svn: 156466
2012-05-09 07:08:58 +00:00
Dan Gohman 41375a3545 Miscellaneous accumulated cleanups.
llvm-svn: 156445
2012-05-08 23:39:44 +00:00
Dan Gohman 61708d37d6 Fix objc_storeStrong pattern matching to catch a potential use of the
old value after the store but before it is released.
This fixes rdar://11116986.

llvm-svn: 156442
2012-05-08 23:34:08 +00:00
Duncan Sands 3bbb1d50df Calling ReassociateExpression recursively is extremely dangerous since it will
replace the operands of expressions with only one use with undef and generate
a new expression for the original without using RAUW to update the original.
Thus any copies of the original expression held in a vector may end up
referring to some bogus value - and using a ValueHandle won't help since there
is no RAUW.  There is already a mechanism for getting the effect of recursion
non-recursively: adding the value to be recursed on to RedoInsts.  But it wasn't
being used systematically.  Have various places where recursion had snuck in at
some point use the RedoInsts mechanism instead.  Fixes PR12169.

llvm-svn: 156379
2012-05-08 12:16:05 +00:00
Andrew Trick d29cd732d4 Allow NULL LoopPassManager argument in UnrollLoop. PR12734.
llvm-svn: 156358
2012-05-08 02:52:09 +00:00
Owen Anderson f4f80e1f39 Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities.
llvm-svn: 156323
2012-05-07 20:47:23 +00:00
Benjamin Kramer 3d38c17b59 Switch the select to branch transformation on by default.
The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!

llvm-svn: 156258
2012-05-06 14:25:16 +00:00
Jakub Staszak cfc46f82ff Remove trailing spaces.
llvm-svn: 156257
2012-05-06 13:52:31 +00:00
Benjamin Kramer 047d7ca0b1 CodeGenPrepare: Add a transform to turn selects into branches in some cases.
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic: it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-order architectures are
unlikely to benefit either; those are covered by the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which performs this transform in the
opposite direction.


Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case, and to him and Evan Cheng for
providing me with comments and test-suite numbers that were more stable than mine :)

llvm-svn: 156234
2012-05-05 12:49:22 +00:00
Stepan Dyatkovskiy cb2a1a34e2 Small fix in InstCombineCasts.cpp. Restored the "alloca + bitcast" reduction for the case where the alloca's size is calculated within an "add/sub/... nsw".
Also added fix to 2011-06-13-nsw-alloca.ll test.

llvm-svn: 156231
2012-05-05 07:09:40 +00:00
Chandler Carruth 6781821c01 Teach the code extractor how to extract a sequence of blocks from
RegionInfo's RegionNode. This mirrors the logic for automating the
extraction from a Loop.

llvm-svn: 156208
2012-05-04 21:33:30 +00:00
Chandler Carruth 14316fcf7d Factor the computation of input and output sets into a public interface
of the CodeExtractor utility. This allows speculatively computing input
and output sets to measure the likely size impact of the code
extraction.

These sets cannot be reused sadly -- we mutate the function prior to
forming the final sets used by the actual extraction.

The interface has been revamped slightly to make it easier to use
correctly by making the interface const and sinking the computation of
the number of exit blocks into the full extraction function and away
from the rest of this logic which just computed two output parameters.

llvm-svn: 156168
2012-05-04 11:20:27 +00:00
Chandler Carruth 44e13911bc Rather than trying to gracefully handle input sequences with repeated
blocks, assert that this doesn't happen. We don't want to bother trying
to support this call pattern as it isn't necessary.

llvm-svn: 156167
2012-05-04 11:17:06 +00:00
Chandler Carruth 0a570552d1 Fix a goof with my previous commit by completely returning when we
detect an ineligible block rather than just breaking out of the loop.

llvm-svn: 156166
2012-05-04 11:14:19 +00:00
Chandler Carruth 2f5d0191f7 Hoist a safety assert from the extraction method into the construction
of the extractor itself.

llvm-svn: 156164
2012-05-04 10:26:45 +00:00
Chandler Carruth 0fde00150d Move the CodeExtractor utility to a dedicated header file / source file,
and expose it as a utility class rather than as free function wrappers.

The simple free-function interface works well for the bugpoint-specific
pass's uses of code extraction, but in an upcoming patch for more
advanced code extraction, they simply don't expose a rich enough
interface. I need to expose various stages of the process of doing the
code extraction and query information to decide whether or not to
actually complete the extraction or give up.

Rather than build up a new predicate model and pass that into these
functions, just take the class that was actually implementing the
functions and lift it up into a proper interface that can be used to
perform code extraction. The interface is cleaned up and re-documented
to work better in a header. It is also now set up to accept the blocks to
be extracted in the constructor rather than in a method.

In passing this essentially reverts my previous commit here exposing
a block-level query for eligibility of extraction. That is no longer
necessary with the richer interface as clients can query the
extraction object for eligibility directly. This will reduce the number
of walks of the input basic block sequence by quite a bit which is
useful if this enters the normal optimization pipeline.

llvm-svn: 156163
2012-05-04 10:18:49 +00:00
Bill Wendling fa0ebcd1b0 Add 'landingpad' instructions to the list of instructions to ignore.
Also combine the code in the 'assert' statement.

llvm-svn: 156155
2012-05-04 04:22:32 +00:00
Chandler Carruth da7513a834 A pile of long-overdue refactorings here. There are some very, *very*
minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:

- Make 'callIsSmall' actually accept a callsite so it can handle
  intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
  lurking in InlineCost.h... No idea how this got missed.
- Teach the 'isInstructionFree' about the various more intelligent
  'free' heuristics that got added to the inline cost analysis during
  review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
  were essentially computing 'is this instruction free?' to use the code
  metrics routine instead. This way we won't keep duplicating logic.

All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.

llvm-svn: 156140
2012-05-04 00:58:03 +00:00
Chandler Carruth a46e62424b Factor the logic for testing whether a basic block is viable for code
extraction into a public interface. Also clean it up and apply it more
consistently such that we check for landing pads *anywhere* in the
extracted code, not just in single-block extraction.

This will be used to guide decisions in passes that are planning to
eventually perform a round of code extraction.

llvm-svn: 156114
2012-05-03 22:26:53 +00:00
Nuno Lopes d4cf35d775 remove calls to calloc if the allocated memory is not used (it was already being done for malloc)
fix a few typos found by Chad in my previous commit

llvm-svn: 156110
2012-05-03 22:08:19 +00:00
Nuno Lopes d2b71e7fa9 add support for calloc to objectsize lowering
llvm-svn: 156102
2012-05-03 21:19:58 +00:00
Nuno Lopes 22f6f3b055 replace 'break's with 'return 0' in visitCallInst code for objectsize, since there is no need to fallback to visitCallSite.
This gives a 0.9% improvement in a test case

llvm-svn: 156069
2012-05-03 16:06:07 +00:00
Bill Wendling c94d86c4ad Whitespace cleanup.
llvm-svn: 156034
2012-05-02 23:43:23 +00:00
Kostya Serebryany ae7188d9b9 [tsan] typo and style (thanks to Nick Lewycky)
llvm-svn: 155986
2012-05-02 13:12:19 +00:00
Bill Wendling 274ba89d77 The value held in the vector may be RAUW'ed by some of the canonicalization
methods. Use a weak value handle to keep up with this.
PR12245

llvm-svn: 155984
2012-05-02 09:59:45 +00:00
Nick Lewycky 78ee67e814 An instruction in a loop is not guaranteed to be executed just because the loop
has no exit blocks. Fixes PR12706!

llvm-svn: 155884
2012-05-01 04:03:01 +00:00
Lang Hames 3a90fabd85 Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes
<rdar://problem/11291436>.

This is a second attempt at a fix for this, the first was r155468. Thanks
to Chandler, Bob and others for the feedback that helped me improve this.

llvm-svn: 155866
2012-05-01 00:20:38 +00:00
Bill Wendling bf4b9afbeb Second attempt at PR12573:
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If
the pass is *sure* that it thinks it knows what it's doing, then it may go ahead
and specify that the landing pad can have its critical edge split. The loop
unswitch pass is one of these passes. It will split the critical edges of all
edges coming from a loop to a landing pad not within the loop. Doing so will
retain important loop analysis information, such as loop simplify.

llvm-svn: 155817
2012-04-30 10:44:54 +00:00
Bill Wendling 325e6cd9cb Use an ArrayRef instead of explicit vector type.
llvm-svn: 155816
2012-04-30 10:25:51 +00:00
Bill Wendling 712d85a8c0 Remove hack from r154987. The problem persists even with it, so it's not even a good hack.
llvm-svn: 155813
2012-04-30 09:23:48 +00:00
Rafael Espindola dd48931461 Make sure HoistInsertPosition finds a position that is dominated by all
inputs.

llvm-svn: 155809
2012-04-30 03:53:06 +00:00
Hal Finkel 27c3246169 Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.).
Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

llvm-svn: 155729
2012-04-27 19:34:00 +00:00
David Blaikie 84e4b39995 Change recurse depth limit to uint32 to fix warning.
llvm-svn: 155727
2012-04-27 19:30:32 +00:00
Dan Gohman dae3349ac2 Miscellaneous accumulated cleanups.
llvm-svn: 155725
2012-04-27 18:56:31 +00:00
Mon P Wang 6120cfb8cd Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks.
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.

llvm-svn: 155722
2012-04-27 18:09:28 +00:00
Kostya Serebryany 5a464f03d3 [asan] small optimization: do not emit "x+0" instructions
llvm-svn: 155701
2012-04-27 10:04:53 +00:00
Kostya Serebryany a1259778b4 [tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov
llvm-svn: 155698
2012-04-27 07:31:53 +00:00
Jakob Stoklund Olesen c90abc8956 Break up getProfitableChainIncrement().
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().

Also cache the ExprBase in IVChain to avoid frequent recomputations.

No functional change intended.

llvm-svn: 155676
2012-04-26 23:33:11 +00:00
Jakob Stoklund Olesen a0337d7bd9 Turn IVChain into a struct.
No functional change intended.

llvm-svn: 155675
2012-04-26 23:33:09 +00:00
Chad Rosier 7813dcee30 Add instcombine patterns for the following transformations:
  (x & y) | (x ^ y) -> x | y
  (x & y) + (x ^ y) -> x | y

Patch by Manman Ren.
rdar://10770603

llvm-svn: 155674
2012-04-26 23:29:14 +00:00
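
Both rewrites in r155674 above rest on the fact that `x & y` and `x ^ y` never share a set bit, so adding them cannot carry and gives the same result as OR-ing them. The following is a minimal sketch that checks this exhaustively. The 8-bit width and the helper names are assumptions made for illustration; they are not taken from the patch.

```python
# Exhaustive check of the two identities from the commit message.
# BITS = 8 is an assumption chosen to keep the search small.
BITS = 8
MASK = (1 << BITS) - 1


def or_form(x, y):
    """(x & y) | (x ^ y)"""
    return (x & y) | (x ^ y)


def add_form(x, y):
    """(x & y) + (x ^ y), wrapping modulo 2**BITS like a fixed-width add."""
    return ((x & y) + (x ^ y)) & MASK


def holds_for_all_pairs():
    """True if both forms equal x | y for every pair of BITS-bit values."""
    return all(
        or_form(x, y) == (x | y) and add_form(x, y) == (x | y)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )


print(holds_for_all_pairs())
```

The check covers only one bit width, but the no-shared-bits argument does not depend on the width.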
Chandler Carruth 739ef80fd7 Teach the reassociate pass to fold chains of multiplies with repeated
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to do at least a decent job on a very diverse range of
inputs.

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

llvm-svn: 155616
2012-04-26 05:30:30 +00:00
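
To illustrate why r155616 above calls its multiply chains "binary exponentiation-style", here is a sketch of textbook exponentiation by squaring that also counts the multiplies it performs. This is not the heuristic from the patch, which handles products of several distinct repeated factors. The function name and its return shape are assumptions made for illustration.

```python
def pow_by_squaring(x, n):
    """Return (x**n, number_of_multiplies) for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be >= 1")
    result = None      # nothing accumulated yet, so the first factor is free
    base = x           # holds x**(2**k) on the k-th iteration
    multiplies = 0
    while n:
        if n & 1:
            if result is None:
                result = base
            else:
                result = result * base
                multiplies += 1
        n >>= 1
        if n:          # only square if a higher bit still needs it
            base = base * base
            multiplies += 1
    return result, multiplies


value, count = pow_by_squaring(3, 8)
print(value, count)
```

A naive chain for `x**8` needs seven multiplies; the squaring chain needs three. For some exponents a shorter chain than the one this scheme finds exists, which is consistent with the commit message describing its result as near-optimal rather than optimal.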
Jakob Stoklund Olesen 293673d788 Print IV chain numbers while collecting them.
llvm-svn: 155567
2012-04-25 18:01:32 +00:00
Lang Hames 2fd0c69125 Reverting r155468. Chris and Chandler have convinced me that it's dangerous and
in poor taste.

Talking through some alternate solutions with Chandler.

llvm-svn: 155530
2012-04-25 02:16:54 +00:00
Dan Gohman 62079b43cc Simplify the known retain count tracking; use a boolean state instead
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.

llvm-svn: 155513
2012-04-25 00:50:46 +00:00
Dan Gohman c24c66f21c Build custom predecessor and successor lists for each basic block.
These lists exclude invoke unwind edges and loop backedges which
are being ignored. This makes it easier to ignore them
consistently.

llvm-svn: 155500
2012-04-24 22:53:18 +00:00
Lang Hames 84531c2b5f Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes
<rdar://problem/11291436>.

llvm-svn: 155468
2012-04-24 18:58:36 +00:00
Jakob Stoklund Olesen 43bcb970e5 Reapply r155136 after fixing PR12599.
Original commit message:

Defer some shl transforms to DAGCombine.

The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constant multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155362
2012-04-23 17:39:52 +00:00
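
The three deferred rewrites listed in r155362 above can be checked by brute force. The sketch below assumes an 8-bit integer width and reads `>>?` as "either logical or arithmetic shift right"; the helper names are hypothetical and model fixed-width arithmetic in plain Python rather than any LLVM API.

```python
BITS = 8                      # assumed width, small enough to enumerate
MASK = (1 << BITS) - 1


def lshr(x, c):
    """Logical shift right of a BITS-bit value."""
    return (x & MASK) >> c


def ashr(x, c):
    """Arithmetic shift right: treat x as signed, shift, wrap back."""
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return (signed >> c) & MASK


def shl(x, c):
    """Shift left, discarding bits that leave the BITS-bit value."""
    return (x << c) & MASK


def all_ones_shl(c):
    """(-1 << c) in BITS-bit arithmetic."""
    return (MASK << c) & MASK


def check_deferred_transforms(shr):
    """Compare (X >>? C1) << C2 with the rewritten form for every input."""
    for x in range(1 << BITS):
        for c1 in range(BITS):
            for c2 in range(BITS):
                lhs = shl(shr(x, c1), c2)
                if c1 == c2:
                    rhs = x & all_ones_shl(c1)
                elif c2 > c1:
                    rhs = shl(x, c2 - c1) & all_ones_shl(c2)
                else:
                    rhs = shr(x, c1 - c2) & all_ones_shl(c2)
                if lhs != rhs:
                    return False
    return True


print(check_deferred_transforms(lshr), check_deferred_transforms(ashr))
```

The rewritten forms are equivalent to the originals, so the commit defers them for a different reason: keeping the `shl` visible lets later analyses still recognize a multiplication by a power of two.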
Alexander Potapenko 056e27ea49 Fix issue 67 by checking that the interface functions weren't redefined in the compiled source file.
llvm-svn: 155346
2012-04-23 10:47:31 +00:00
Kostya Serebryany 5a4b7a232c [tsan] use llvm/ADT/Statistic.h for tsan stats
llvm-svn: 155341
2012-04-23 08:44:59 +00:00
Jakob Stoklund Olesen 205ee3b389 Revert r155136 "Defer some shl transforms to DAGCombine."
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.

I'll put the patch back in after fixing the SelectionDAG problem.

llvm-svn: 155181
2012-04-20 00:38:45 +00:00
Bill Wendling 9f97595201 Put this expensive check below the less expensive ones.
llvm-svn: 155166
2012-04-19 23:31:07 +00:00
Dan Gohman 26aa827461 Avoid a bug in the path count computation, preventing an infinite
loop repeatedly making the same change. This is for rdar://11256239.

llvm-svn: 155160
2012-04-19 21:50:46 +00:00
Jakob Stoklund Olesen 6b6c81e6b2 Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constant multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155136
2012-04-19 16:46:26 +00:00
Dan Gohman 22fbe8d709 Don't crash on code where the user put __attribute__((constructor)) on
a function with arguments. This fixes rdar://11265785.

llvm-svn: 155073
2012-04-18 22:24:33 +00:00
Bill Wendling 4d4d025751 Use a heavy hammer to fix PR12573.
If the loop contains invoke instructions, whose unwind edge escapes the loop,
then don't try to unswitch the loop. Doing so may cause the unwind edge to be
split, which not only is non-trivial but doesn't preserve loop simplify
information.

Fixes PR12573

llvm-svn: 154987
2012-04-18 06:00:09 +00:00
Andrew Trick 19f80c1e7e loop-reduce: Add an early bailout to catch extremely large loops.
This introduces a threshold of 200 IV Users, which is very
conservative but should be sufficient to avoid serious compile time
sink or stack overflow. The llvm test-suite with LTO never exceeds 190
users per loop.

The bug doesn't relate to a specific type of loop. Checking in an
arbitrary giant loop as a unit test would be silly.

Fixes rdar://11262507.

llvm-svn: 154983
2012-04-18 04:00:10 +00:00
Joe Groff a81bcbb9bb fix pr12559: mark unavailable win32 math libcalls
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

llvm-svn: 154960
2012-04-17 23:05:54 +00:00
Hal Finkel 52ba49f399 Fix style violation in BBVectorize (pointed out by Bill Wendling)
llvm-svn: 154810
2012-04-16 12:39:17 +00:00
Bill Wendling 82b90a3804 Add a Fixme.
llvm-svn: 154793
2012-04-16 04:23:52 +00:00
Hal Finkel 8ee309d9b7 Simplify checking for pointer types in BBVectorize (this change was suggested by Duncan).
llvm-svn: 154787
2012-04-16 03:49:42 +00:00
Hal Finkel 83c9796033 Fix an error in BBVectorize important for vectorizing pointer types.
When vectorizing pointer types it is important to realize that potential
pairs cannot be connected via the address pointer argument of a load or store.
This is because even after vectorization, the address is still a scalar because
the address of the higher half of the pair is implicit from the address of the
lower half (it need not be, and should not be, explicitly computed).

llvm-svn: 154735
2012-04-14 07:32:50 +00:00
Hal Finkel f589519a67 Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs.
llvm-svn: 154734
2012-04-14 07:32:43 +00:00
Hal Finkel b2336a79f9 Add support to BBVectorize for vectorizing selects.
llvm-svn: 154700
2012-04-13 20:45:45 +00:00
Dan Gohman 670f93744b Add some comments, and fix a few places that missed setting Changed.
llvm-svn: 154687
2012-04-13 18:57:48 +00:00
Dan Gohman e1e352af2b Consider ObjC runtime calls objc_storeWeak and others which make a copy of
their argument as "escape" points for objc_retainBlock optimization.
This fixes rdar://11229925.

llvm-svn: 154682
2012-04-13 18:28:58 +00:00
Hal Finkel 204bf5352a By default, use Early-CSE instead of GVN for vectorization cleanup.
As has been suggested by Duncan and others, Early-CSE and GVN should
do similar redundancy elimination, but Early-CSE is much less expensive.
Most of my autovectorization benchmarks show a performance regression, but
all of these are < 0.1%, and so I think that it is still worth using
the less expensive pass.

llvm-svn: 154673
2012-04-13 17:15:33 +00:00
Dan Gohman de8d2c446b Use the new Use-aware dominates method to apply the objc runtime
library return value optimization for phi uses. Even when the
phi itself is not dominated, the specific use may be dominated.

llvm-svn: 154647
2012-04-13 01:08:28 +00:00
Bill Wendling 585583c8dd Code-gen may inject code into the IR before it emits the ASM. The linker
obviously cannot know that this code is present, let alone used. So prevent the
internalize pass from internalizing those global values which code-gen may
insert.

llvm-svn: 154645
2012-04-13 01:06:27 +00:00
Dan Gohman 8478d76d64 Don't move objc_autorelease calls past autorelease pool boundaries when
optimizing autorelease calls on phi nodes with null operands.
This fixes rdar://11207070.

llvm-svn: 154642
2012-04-13 00:59:57 +00:00
Chad Rosier cc899f3b6d Typo.
llvm-svn: 154522
2012-04-11 19:21:58 +00:00
Chandler Carruth 7ae90d4d2d Add two statistics to help track how we are computing the inline cost.
Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome.

llvm-svn: 154492
2012-04-11 10:15:10 +00:00
Kostya Serebryany 5ba61ac651 [tsan] two more compile-time optimizations:
- don't instrument reads from constant globals.
Saves ~1.5% of instrumented instructions on CPU2006
(counting static instructions, not their execution).
- don't instrument reads from vtable (which is a global constant too).
Saves ~5%.

I did not measure the run-time impact of this,
but it is certainly non-negative.

llvm-svn: 154444
2012-04-10 22:29:17 +00:00
Kostya Serebryany bf2de80be6 [tsan] compile-time instrumentation: do not instrument a read if
a write to the same temp follows in the same BB.
Also add stats printing.

On Spec CPU2006 this optimization saves roughly 4% of instrumented reads
(which is 3% of all instrumented accesses):
Writes            : 161216
Reads             : 446458
Reads-before-write: 18295

llvm-svn: 154418
2012-04-10 18:18:56 +00:00
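
The percentages quoted in r154418 above follow from the three counters it prints. The short calculation below reproduces them, assuming that "Reads" is the total number of instrumented reads and that "all instrumented accesses" means reads plus writes; the commit message does not spell out either point.

```python
# Counters copied from the commit message.
writes = 161216
reads = 446458
reads_before_write = 18295

# Assumption: the skipped reads are counted as a share of all reads,
# and of reads + writes for the "all accesses" figure.
share_of_reads = reads_before_write / reads
share_of_all = reads_before_write / (reads + writes)

print(f"{share_of_reads:.1%} of reads, {share_of_all:.1%} of all accesses")
```

Under those assumptions the two shares come out at about 4.1% and 3.0%, matching the "roughly 4%" and "3%" in the message.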
Andrew Trick 4442bfe559 Fix 12513: Loop unrolling breaks with indirect branches.
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

llvm-svn: 154386
2012-04-10 05:14:42 +00:00
Andrew Trick 4104ed9c76 whitespace
llvm-svn: 154385
2012-04-10 05:14:37 +00:00
Chandler Carruth f82b0e2d29 Teach InstCombine to nuke a common alloca pattern -- an alloca which has
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).

llvm-svn: 154285
2012-04-08 14:36:56 +00:00
Hongbin Zheng 5758f495da Refactor: Use positive field names in VectorizeConfig.
llvm-svn: 154249
2012-04-07 03:56:23 +00:00
Chandler Carruth 49da93396e Sink the collection of return instructions until after *all*
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.

llvm-svn: 154179
2012-04-06 17:21:31 +00:00
Duncan Sands d12b18f820 Make GVN's propagateEquality non-recursive. No intended functionality change.
The modifications are a lot more trivial than they appear to be in the diff!

llvm-svn: 154174
2012-04-06 15:31:09 +00:00
Chandler Carruth e41f6f4189 Sink the return instruction collection until after we're done deleting
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus pointer to a return instruction that blows up
much further down the road.

It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.

Thanks to John Regehr's numerous test cases for teasing this out.

llvm-svn: 154157
2012-04-06 01:11:52 +00:00
Dan Gohman cc64bbca81 Fix accidentally inverted logic from r152803, and make the
testcase slightly less trivial. This fixes rdar://11171718.

llvm-svn: 154118
2012-04-05 20:27:21 +00:00
Hongbin Zheng 31d33b8318 BBVectorize: Add the const modifier to the VectorizeConfig because we won't
modify it.

llvm-svn: 154098
2012-04-05 16:07:49 +00:00
Hongbin Zheng d6825173d3 Introduce the VectorizeConfig class, with which we can control the behavior
of the BBVectorizePass without using command-line options. As pointed out
by Hal, we can ask the TargetLoweringInfo for the architecture-specific
VectorizeConfig to perform vectorization with architecture-specific
information.

llvm-svn: 154096
2012-04-05 15:46:55 +00:00
Hongbin Zheng 6edbc39bd7 Add the function "vectorizeBasicBlock", which allows users to vectorize a
BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the
loop unroll pass right after the loop is unrolled.

llvm-svn: 154089
2012-04-05 08:05:16 +00:00
Jakob Stoklund Olesen f2390e8303 Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

llvm-svn: 154079
2012-04-05 03:10:56 +00:00
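
The sign issue fixed in r154079 above can be made concrete with a toy model. Each ICmpZero form compares the expression against zero, so the immediate that lands in the final compare is `-Offset` in the first form and `Offset` in the second. The legality rule below is a hypothetical, deliberately asymmetric stand-in for a target hook; it is not the real ARM rule, and all names are assumptions.

```python
def is_legal_icmp_immediate(imm):
    """Hypothetical asymmetric target rule: only 0..255 can be encoded."""
    return 0 <= imm <= 255


def icmp_immediate(kind, offset):
    """Immediate that ends up in the compare for each ICmpZero form."""
    if kind == "base_plus_offset":         # BaseReg + Offset == 0
        return -offset                     # ICmp BaseReg, -Offset
    if kind == "neg_scale_plus_offset":    # -1*ScaleReg + Offset == 0
        return offset                      # ICmp ScaleReg, Offset
    raise ValueError(f"unknown form: {kind}")


def can_fold(kind, offset):
    """Ask the legality rule about the immediate that is actually emitted."""
    return is_legal_icmp_immediate(icmp_immediate(kind, offset))


print(can_fold("base_plus_offset", -10), can_fold("neg_scale_plus_offset", -10))
```

With a symmetric rule the sign would not matter; with an asymmetric one, querying the wrong sign accepts immediates that cannot be encoded or rejects ones that can.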
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Hongbin Zheng b21b865fe8 LoopUnrollPass: Use variable "Threshold" instead of "CurrentThreshold" when
reducing unroll count, otherwise the reduced unroll count does not take
the "OptimizeForSize" attribute into account.

llvm-svn: 154007
2012-04-04 11:44:08 +00:00
Bill Wendling 932b992888 Add an option to turn off the expensive GVN load PRE part of GVN.
llvm-svn: 153902
2012-04-02 22:16:50 +00:00
Stepan Dyatkovskiy f62ffeca88 Fast fix for PR12343:
http://llvm.org/bugs/show_bug.cgi?id=12343

There is no trivial way to split edges that come from an indirect branch. It can be done with some tricks, but that needs further discussion, and it is still dangerous because indirect branches are difficult to control.

The fix forbids unswitching in this case.

llvm-svn: 153879
2012-04-02 17:16:45 +00:00
Chandler Carruth 45ae88f5fc Belatedly address some code review from Chris.
As a side note, I really dislike array_pod_sort... Do we really still
care about any STL implementations that get this so wrong? Does libc++?

llvm-svn: 153834
2012-04-01 10:41:24 +00:00
Chandler Carruth c5bfb3c0f5 Fix a pretty scary bug I introduced into the always inliner with
a single missing character. Somehow, this had gone untested. I've added
tests for returns-twice logic specifically with the always-inliner that
would have caught this, and fixed the bug.

Thanks to Matt for the careful review and spotting this!!! =D

llvm-svn: 153832
2012-04-01 10:21:05 +00:00
Chandler Carruth a88a0faaa3 Give the always-inliner its own custom filter. It shouldn't have to pay
the very high overhead of the complex inline cost analysis when all it
wants to do is detect three patterns which must not be inlined. Comment
the code, clean it up, and leave some hints about possible performance
improvements if this ever shows up on a profile.

Moving this off of the (now more expensive) inline cost analysis is
particularly important because we have to run this inliner even at -O0.

llvm-svn: 153814
2012-03-31 13:17:18 +00:00
Chandler Carruth edd2826f3e Remove a bunch of empty, dead, and no-op methods from all of these
interfaces. These methods were used in the old inline cost system where
there was a persistent cache that had to be updated, invalidated, and
cleared. We're now doing more direct computations that don't require
this intricate dance. Even if we resume some level of caching, it would
almost certainly have a simpler and more narrow interface than this.

llvm-svn: 153813
2012-03-31 12:48:08 +00:00
Chandler Carruth 0539c071ea Initial commit for the rewrite of the inline cost analysis to operate
on a per-callsite walk of the called function's instructions, in
breadth-first order over the potentially reachable set of basic blocks.

This is a major shift in how inline cost analysis works to improve the
accuracy and rationality of inlining decisions. A brief outline of the
algorithm this moves to:

- Build a simplification mapping based on the callsite arguments to the
  function arguments.
- Push the entry block onto a worklist of potentially-live basic blocks.
- Pop the first block off of the *front* of the worklist (for
  breadth-first ordering) and walk its instructions using a custom
  InstVisitor.
- For each instruction's operands, re-map them based on the
  simplification mappings available for the given callsite.
- Compute any simplification possible of the instruction after
  re-mapping, and store that back into the simplification mapping.
- Compute any bonuses, costs, or other impacts of the instruction on the
  cost metric.
- When the terminator is reached, replace any conditional value in the
  terminator with any simplifications from the mapping we have, and add
  any successors which are not proven to be dead from these
  simplifications to the worklist.
- Pop the next block off of the front of the worklist, and repeat.
- As soon as the cost of inlining exceeds the threshold for the
  callsite, stop analyzing the function in order to bound cost.

The primary goal of this algorithm is to perfectly handle dead code
paths. We do not want any code in trivially dead code paths to impact
inlining decisions. The previous metric was *extremely* flawed here, and
would always subtract the average cost of two successors of
a conditional branch when it was proven to become an unconditional
branch at the callsite. There was no handling of wildly different costs
between the two successors, which would cause inlining when the path
actually taken was too large, and no inlining when the path actually
taken was trivially simple. There was also no handling of the code
*path*, only the immediate successors. These problems vanish completely
now. See the added regression tests for the shiny new features -- we
skip recursive function calls, SROA-killing instructions, and high cost
complex CFG structures when dead at the callsite being analyzed.

Switching to this algorithm required refactoring the inline cost
interface to accept the actual threshold rather than simply returning
a single cost. The resulting interface is pretty bad, and I'm planning
to do lots of interface cleanup after this patch.

Several other refactorings fell out of this, but I've tried to minimize
them for this patch. =/ There is still more cleanup that can be done
here. Please point out anything that you see in review.

I've worked really hard to try to mirror at least the spirit of all of
the previous heuristics in the new model. It's not clear that they are
all correct any more, but I wanted to minimize the change in this single
patch, it's already a bit ridiculous. One heuristic that is *not* yet
mirrored is to allow inlining of functions with a dynamic alloca *if*
the caller has a dynamic alloca. I will add this back, but I think the
most reasonable way requires changes to the inliner itself rather than
just the cost metric, and so I've deferred this for a subsequent patch.
The test case is XFAIL-ed until then.

As mentioned in the review mail, this seems to make Clang run about 1%
to 2% faster in -O0, but makes its binary size grow by just under 4%.
I've looked into the 4% growth, and it can be fixed, but requires
changes to other parts of the inliner.

llvm-svn: 153812
2012-03-31 12:42:41 +00:00
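
The worklist walk outlined in r153812 above can be sketched on a toy control-flow graph. This is a simplified model under stated assumptions, not the InlineCost implementation: blocks carry a single precomputed cost, the only "simplification" is folding a branch on an argument whose value is known at the call site, and the dictionary layout and all names are hypothetical.

```python
from collections import deque


def analyze_call_site(blocks, entry, known_args, threshold):
    """Return (cost, within_threshold) for a toy CFG.

    blocks maps a block name to a dict with:
      "cost":  cost of the block's instructions (hypothetical units)
      "cond":  optional argument name the terminator branches on
      "succs": successors; for a conditional branch, [if_true, if_false]
    known_args maps argument names to constants known at the call site.
    """
    worklist = deque([entry])
    seen = {entry}
    cost = 0
    while worklist:
        name = worklist.popleft()      # pop from the front: breadth-first
        block = blocks[name]
        cost += block["cost"]
        if cost > threshold:
            return cost, False         # stop early to bound analysis time
        succs = block["succs"]
        cond = block.get("cond")
        if cond is not None and cond in known_args:
            # The branch folds to a single live successor at this call site.
            succs = [succs[0] if known_args[cond] else succs[1]]
        for succ in succs:
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return cost, True


EXAMPLE_BLOCKS = {
    "entry": {"cost": 1, "cond": "flag", "succs": ["cheap", "huge"]},
    "cheap": {"cost": 2, "succs": ["exit"]},
    "huge": {"cost": 500, "succs": ["exit"]},
    "exit": {"cost": 1, "succs": []},
}

print(analyze_call_site(EXAMPLE_BLOCKS, "entry", {"flag": True}, 100))
print(analyze_call_site(EXAMPLE_BLOCKS, "entry", {}, 100))
```

When `flag` is known to be true, the expensive block is never visited and contributes nothing to the cost. When nothing is known, the walk reaches it and stops as soon as the threshold is exceeded. That is the dead-path behavior the commit message sets as its primary goal.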
Benjamin Kramer 53dc873342 Internalize: Remove reference of @llvm.noinline, it was replaced with the noinline attribute a long time ago.
llvm-svn: 153806
2012-03-31 11:03:47 +00:00
Hal Finkel 5cad8742cc Correctly vectorize powi.
The powi intrinsic requires special handling because it always takes a single
integer power regardless of the result type. As a result, we can vectorize
only if the powers are equal. Fixes PR12364.
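
As a rough illustration of the legality rule above, here is a minimal sketch using toy types rather than LLVM's own classes. The names PowiCall and canVectorizePowi are hypothetical and exist only for this example.

```cpp
#include <vector>

// Toy stand-in for one scalar call powi(Base, Power).
// Hypothetical type for illustration; not an LLVM class.
struct PowiCall {
  double Base;
  int Power; // powi always takes a single scalar integer power
};

// A group of scalar powi calls can be fused into one vector powi only if
// every call uses the same power, because the vector form still takes one
// integer exponent shared by all lanes.
bool canVectorizePowi(const std::vector<PowiCall> &Calls) {
  if (Calls.empty())
    return false;
  for (const PowiCall &C : Calls)
    if (C.Power != Calls.front().Power)
      return false;
  return true;
}
```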

llvm-svn: 153797
2012-03-31 03:38:40 +00:00
Jakob Stoklund Olesen 4e55044ff5 Don't PRE compares.
CodeGenPrepare sinks compare instructions down to their uses to prevent
live flags and predicate registers across basic blocks.

PRE of a compare instruction prevents that, forcing the i1 compare
result into a general purpose register.  That is usually more expensive
than the redundant compare PRE was trying to eliminate in the first
place.

llvm-svn: 153657
2012-03-29 17:22:39 +00:00
Benjamin Kramer aa9e4a5e59 GlobalOpt: If we have an inbounds GEP from a ConstantAggregateZero global that we just determined to be constant, replace all loads from it with a zero value.
llvm-svn: 153576
2012-03-28 14:50:09 +00:00
Chandler Carruth 772c88b887 Switch to WeakVHs in the value mapper, and aggressively prune dead basic
blocks in the function cloner. This removes the last case of trivially
dead code that I've been seeing in the wild getting inlined, analyzed,
re-inlined, optimized, only to be deleted. Nukes a FIXME from the
cleanup tests.

llvm-svn: 153572
2012-03-28 08:38:27 +00:00
Chad Rosier bb2a6da440 Fix 80-column violation.
llvm-svn: 153556
2012-03-28 00:35:33 +00:00
Chandler Carruth b9e35fbc1e Make a seemingly tiny change to the inliner and fix the generated code
size bloat. Unfortunately, I expect this to disable the majority of the
benefit from r152737. I'm hopeful at least that it will fix PR12345. To
explain this requires... quite a bit of backstory I'm afraid.

TL;DR: The change in r152737 actually did The Wrong Thing for
linkonce-odr functions. This change makes it do the right thing. The
benefits we saw were simple luck, not any actual strategy. Benchmark
numbers after a mini-blog-post so that I've written down my thoughts on
why all of this works and doesn't work...

To understand what's going on here, you have to understand how the
"bottom-up" inliner actually works. There are two fundamental modes to
the inliner:

1) Standard fixed-cost bottom-up inlining. This is the mode we usually
   think about. It walks from the bottom of the call graph up to the top,
   looking at callsites, taking information about the callsite and the
   called function and computing the expected cost of inlining into that
   callsite. If the cost is under a fixed threshold, it inlines. It's
   a touch more complicated than that due to all the bonuses, weights,
   etc. Inlining the last callsite to an internal function gets higher
   weight, etc. But essentially, this is the mode of operation.

2) Deferred bottom-up inlining (a term I just made up). This is the
   interesting mode for this patch and r152737. Initially, this works
   just like mode #1, but once we have the cost of inlining into the
   callsite, we don't just compare it with a fixed threshold. First, we
   check something else. Let's give some names to the entities at this
   point, or we'll end up hopelessly confused. We're considering
   inlining a function 'A' into its callsite within a function 'B'. We
   want to check whether 'B' has any callers, and whether it might be
   inlined into those callers. If so, we also check whether inlining 'A'
   into 'B' would block any of the opportunities for inlining 'B' into
   its callers. We take the sum of the costs of inlining 'B' into its
   callers where that inlining would be blocked by inlining 'A' into
   'B', and if that cost is less than the cost of inlining 'A' into 'B',
   then we skip inlining 'A' into 'B'.
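
The two modes can be sketched roughly as follows. This is a toy model under stated assumptions: the function names and the plain integer costs are hypothetical, and the real cost analysis applies many bonuses and weights that are left out here.

```cpp
#include <numeric>
#include <vector>

// Mode #1: a plain fixed-threshold check.
bool shouldInlineFixed(int Cost, int Threshold) { return Cost < Threshold; }

// Mode #2: decide whether to skip inlining 'A' into 'B' for now.
//
// CostAIntoB: cost of inlining callee 'A' into its caller 'B'.
// BlockedCallerCosts: for each caller of 'B' whose inlining of 'B' would be
// blocked by first inlining 'A' into 'B', the cost of inlining 'B' there.
bool shouldDeferInlining(int CostAIntoB,
                         const std::vector<int> &BlockedCallerCosts) {
  if (BlockedCallerCosts.empty())
    return false; // nothing is blocked, so there is no reason to defer
  int TotalBlocked =
      std::accumulate(BlockedCallerCosts.begin(), BlockedCallerCosts.end(), 0);
  // Skip inlining 'A' into 'B' when inlining 'B' into its callers is cheaper.
  return TotalBlocked < CostAIntoB;
}
```

With these definitions, shouldDeferInlining(100, {30, 40}) defers because the blocked inlining of 'B' sums to 70, which is less than 100.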

Now, in order for #2 to make sense, we have to have some confidence that
we will actually have the opportunity to inline 'B' into its callers
when cheaper, *and* that we'll be able to revisit the decision and
inline 'A' into 'B' if that ever becomes the correct tradeoff. This
often isn't true for external functions -- we can see very few of their
callers, and we won't be able to re-consider inlining 'A' into 'B' if
'B' is external when we finally see more callers of 'B'. There are two
cases where we believe this to be true for C/C++ code: functions local
to a translation unit, and functions with an inline definition in every
translation unit which uses them. These are represented as internal
linkage and linkonce-odr (resp.) in LLVM. I enabled this logic for
linkonce-odr in r152737.

Unfortunately, when I did that, I also introduced a subtle bug. There
was an implicit assumption that the last caller of the function within
the TU was the last caller of the function in the program. We want to
bonus the last caller of the function in the program by a huge amount
for inlining because inlining that callsite has very little cost.
Unfortunately, the last caller in the TU of a linkonce-odr function is
*not* the last caller in the program, and so we don't want to apply this
bonus. If we do, we can apply it to one callsite *per-TU*. Because of
the way deferred inlining works, when it sees this bonus applied to one
callsite in the TU for 'B', it decides that inlining 'B' is of the
*utmost* importance just so we can get that final bonus. It then
proceeds to essentially force deferred inlining regardless of the actual
cost tradeoff.
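
A minimal sketch of the distinction, assuming a simplified three-way linkage enum. The names below are hypothetical and do not mirror the inliner's actual code.

```cpp
// Hypothetical linkage kinds, named after the LLVM linkages discussed above.
enum class Linkage { Internal, LinkOnceODR, External };

// The large "last caller" bonus is only justified when the last call site
// visible in this translation unit is the last call site in the program.
// That holds for internal linkage but not for linkonce-odr, where other
// TUs may carry their own copy of the function and their own callers.
bool qualifiesForLastCallerBonus(Linkage L, unsigned NumCallSitesInTU) {
  return L == Linkage::Internal && NumCallSitesInTU == 1;
}

// Deferred inlining, by contrast, is reasonable for both linkages, because
// every caller of such a function can see its definition.
bool qualifiesForDeferredInlining(Linkage L) {
  return L == Linkage::Internal || L == Linkage::LinkOnceODR;
}
```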

The result? PR12345: code bloat, code bloat, code bloat. Another result
is getting *damn* lucky on a few benchmarks, and the over-inlining
exposing critically important optimizations. I would very much like
a list of benchmarks that regress after this change goes in, with
bitcode before and after. This will help me greatly understand what
opportunities the current cost analysis is missing.

Initial benchmark numbers look very good. WebKit files that exhibited
the worst of PR12345 went from growing to shrinking compared to Clang
with r152737 reverted.

- Bootstrapped Clang is 3% smaller with this change.
- Bootstrapped Clang -O0 over a single-source-file of lib/Lex is 4%
  faster with this change.

Please let me know about any other performance impact you see. Thanks to
Nico for reporting this and urging me to actually fix it, and to Richard
Smith, Duncan Sands, Manuel Klimek, and Benjamin Kramer for talking
through the issues today.

llvm-svn: 153506
2012-03-27 10:48:28 +00:00
Nadav Rotem a8f3562e8f 153465 was incorrect. In this code we wanted to check that the pointer operand is of pointer type (and not vector type).
llvm-svn: 153468
2012-03-26 21:00:53 +00:00
Nadav Rotem e63e59cc44 PR12357: The pointer was used before it was checked.
llvm-svn: 153465
2012-03-26 20:39:18 +00:00
Andrew Trick 14779cc49e LSR ivchain bug fix: corner case with ConstantExpr.
Fixes PR11950.

llvm-svn: 153463
2012-03-26 20:28:37 +00:00
Andrew Trick 356a896394 comment typo
llvm-svn: 153462
2012-03-26 20:28:35 +00:00
Chris Lattner b1e2e1e091 eliminate an unneeded branch, part of PR12357
llvm-svn: 153458
2012-03-26 19:13:57 +00:00
Eric Christopher 2b40fdf3ae Tidy.
llvm-svn: 153456
2012-03-26 19:09:40 +00:00
Eric Christopher f16bee8682 Tidy.
llvm-svn: 153455
2012-03-26 19:09:38 +00:00
Andrew Trick e51feea79c LSR cleanup: potential bug caught by PVS-Studio.
Thanks Andrey.

llvm-svn: 153451
2012-03-26 18:03:16 +00:00
Kostya Serebryany 6f8a776041 [tsan] treat vtable pointer updates in a special way (requires tbaa); fix a bug (forgot to return true after instrumenting); make sure the tsan tests are run
llvm-svn: 153448
2012-03-26 17:35:03 +00:00
Craig Topper 6e80c28017 Prune some includes and forward declarations.
llvm-svn: 153429
2012-03-26 06:58:25 +00:00
Chandler Carruth ef82cf5b1e Teach the function cloner (and thus the inliner) to simplify PHINodes
aggressively. There are lots of dire warnings about this being expensive
that seem to predate switching to the TrackingVH-based value remapper
that is automatically updated on RAUW. This makes it easy to not just
prune single-entry PHIs, but to fully simplify PHIs, and to recursively
simplify the newly inlined code to propagate PHINode simplifications.
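
One common PHI simplification can be sketched on a toy model where values are plain integer ids. This is an illustration only: simplifyPhi and NoValue are hypothetical names, and the real simplifier handles more cases (undef operands, for instance) than this does.

```cpp
#include <vector>

// Sentinel meaning "the phi cannot be simplified".
const int NoValue = -1;

// If every incoming value of the phi is the same value (ignoring incoming
// edges that refer back to the phi itself), the phi can be replaced by that
// value. A single-entry phi is the simplest instance of this rule.
int simplifyPhi(int PhiId, const std::vector<int> &Incoming) {
  int Common = NoValue;
  for (int V : Incoming) {
    if (V == PhiId)
      continue; // a self-reference carries no new value
    if (Common != NoValue && V != Common)
      return NoValue; // two distinct incoming values: keep the phi
    Common = V;
  }
  return Common;
}
```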

This introduces a bit of a thorny problem though. We may end up
simplifying a branch condition to a constant when we fold PHINodes, and
we would like to nuke any dead blocks resulting from this so that time
isn't wasted continually analyzing them, but this isn't easy. Deleting
basic blocks *after* they are fully cloned and mapped into the new
function currently requires manually updating the value map. The last
piece of the simplification-during-inlining puzzle will require either
switching to WeakVH mappings or some other piece of refactoring. I've
left a FIXME in the testcase about this.

llvm-svn: 153410
2012-03-25 10:34:54 +00:00
Chandler Carruth 2121199241 Move the instruction simplification of callsite arguments in the inliner
to instead rely on much more generic and powerful instruction
simplification in the function cloner (and thus inliner).

This teaches the pruning function cloner to use instsimplify rather than
just the constant folder to fold values during cloning. This can
simplify a large number of things that constant folding alone cannot
begin to touch. For example, it will realize that 'or' and 'and'
instructions with certain constant operands actually become constants
regardless of what their other operand is. It also can thread back
through the caller to perform simplifications that are only possible by
looking up a few levels. In particular, GEPs and pointer testing tend to
fold much more heavily with this change.
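
The 'or' and 'and' case can be sketched as follows. This is a toy model, not the instsimplify API: Operand, constant, unknown, and simplifyBinary are hypothetical names, and values are fixed at 32 bits.

```cpp
#include <cstdint>

// Toy operand: either a known 32-bit constant or an unknown value.
struct Operand {
  bool IsConstant;
  std::uint32_t Value; // meaningful only when IsConstant is true
};

inline Operand constant(std::uint32_t V) { return {true, V}; }
inline Operand unknown() { return {false, 0}; }

enum class Op { And, Or };

// Returns a constant operand if the result is known, else unknown().
Operand simplifyBinary(Op O, Operand LHS, Operand RHS) {
  // Plain constant folding: both operands are known.
  if (LHS.IsConstant && RHS.IsConstant)
    return constant(O == Op::And ? (LHS.Value & RHS.Value)
                                 : (LHS.Value | RHS.Value));
  // Beyond constant folding: one absorbing constant decides the result
  // regardless of the other operand (x & 0 == 0, x | ~0 == ~0).
  const std::uint32_t Absorbing = (O == Op::And) ? 0u : 0xFFFFFFFFu;
  if ((LHS.IsConstant && LHS.Value == Absorbing) ||
      (RHS.IsConstant && RHS.Value == Absorbing))
    return constant(Absorbing);
  return unknown();
}
```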

This should (in some cases) have a positive impact on compile times with
optimizations on because the inliner itself will simply avoid cloning
a great deal of code. It already attempted to prune proven-dead code,
but now it will use the stronger simplifications to prove more code
dead.

llvm-svn: 153403
2012-03-25 04:03:40 +00:00
Chandler Carruth 0c72e3f469 Add an asserting ValueHandle to the block simplification code which will
fire if anything ever invalidates the assumption of a terminator
instruction being unchanged throughout the routine.

I've convinced myself that the current definition of simplification
precludes such a transformation, so I think getting some asserts
coverage that we don't violate this agreement is sufficient to make this
code safe for the foreseeable future.

Comments to the contrary or other suggestions are of course welcome. =]
The bots are now happy with this code though, so it appears the bug here
has indeed been fixed.

llvm-svn: 153401
2012-03-25 03:29:25 +00:00
Chandler Carruth 17fc6ef234 Don't form a WeakVH around the sentinel node in the instructions BB
list. This is a bad idea. ;] I'm hopeful this is the bug that's showing
up with the MSVC bots, but we'll see.

It is definitely unnecessary. InstSimplify won't do anything to
a terminator instruction, we don't need to even include it in the
iteration range. We can also skip the now dead terminator check,
although I've made it an assert to help document that this is an
important invariant.

I'm still a bit queasy about this because there is an implicit
assumption that the terminator instruction cannot be RAUW'ed by the
simplification code. While that appears to be true at the moment, I see
no guarantee that would ensure it remains true in the future. I'm
looking at the cleanest way to solve that...

llvm-svn: 153399
2012-03-24 23:03:27 +00:00
Chandler Carruth cf1b585f60 Refactor the interface to recursively simplifying instructions to be a tad
bit simpler by handling a common case explicitly.

Also, refactor the implementation to use a worklist based walk of the
recursive users, rather than trying to use value handles to detect and
recover from RAUWs during the recursive descent. This fixes a very
subtle bug in the previous implementation where degenerate control flow
structures could cause mutually recursive instructions (PHI nodes) to
collapse in just such a way that From became equal to To after some
amount of recursion. At that point, we hit the inf-loop that the assert
at the top attempted to guard against. This problem is defined away when
not using value handles in this manner. There are lots of comments
claiming that the WeakVH will protect against just this sort of error,
but they're not accurate about the actual implementation of WeakVHs,
which do still track RAUWs.

I don't have any test case for the bug this fixes because it requires
running the recursive simplification on unreachable phi nodes. I've no
way to either run this or easily write an input that triggers it. It was
found when using instruction simplification inside the inliner when
running over the nightly test-suite.

llvm-svn: 153393
2012-03-24 21:11:24 +00:00
Francois Pichet 4b9ab74690 Fix the MSVC build.
llvm-svn: 153366
2012-03-24 01:36:37 +00:00
Andrew Trick 25553ab5fe More IndVarSimplify cleanup.
llvm-svn: 153362
2012-03-24 00:51:17 +00:00
Kostya Serebryany e505a5abe9 add EP_OptimizerLast extension point
llvm-svn: 153353
2012-03-23 23:22:59 +00:00
Dan Gohman e3ed2b0699 Don't convert objc_retainAutoreleasedReturnValue to objc_retain if it
is retaining the return value of an invoke that it immediately follows.

llvm-svn: 153344
2012-03-23 18:09:00 +00:00
Dan Gohman 5c70fadc17 It's not possible to insert code immediately after an invoke in the
same basic block, and it's not safe to insert code in the successor
blocks if the edges are critical edges. Splitting those edges is
possible, but undesirable, especially on the unwind side. Instead,
make the bottom-up code motion to consider invokes to be part of
their successor blocks, rather than part of their parent blocks, so
that it doesn't push code past them and onto the edges. This fixes
PR12307.

llvm-svn: 153343
2012-03-23 17:47:54 +00:00
Duncan Sands a11ef6e4ea When propagating equalities, eg replacing A with B in every basic block
dominated by Root, check that B is available throughout the scope.  This
is obviously true (famous last words?) given the current logic, but the
check may be helpful if more complicated reasoning is added one day.

llvm-svn: 153323
2012-03-23 08:45:52 +00:00
Duncan Sands 8f897dc88b Indentation.
llvm-svn: 153322
2012-03-23 08:29:04 +00:00
Andrew Trick e3502cb204 Remove -enable-lsr-retry in time for 3.1.
llvm-svn: 153287
2012-03-22 22:42:51 +00:00
Andrew Trick d97b83e320 Remove -enable-lsr-nested in time for 3.1.
Test cases have been removed but attached to open PR12330.

llvm-svn: 153286
2012-03-22 22:42:45 +00:00
Dan Gohman 817a7c6fdf Refactor the code for visiting instructions out into helper functions.
llvm-svn: 153267
2012-03-22 18:24:56 +00:00
Andrew Trick 0654989062 Remove unused simplifyIVUsers
llvm-svn: 153262
2012-03-22 17:47:30 +00:00
Andrew Trick f47d0af551 Remove -enable-iv-rewrite, which has been unsupported since 3.0.
llvm-svn: 153260
2012-03-22 17:10:11 +00:00
Chris Lattner 7d7dba3c92 don't use "signed", just something I noticed in patches flying by.
llvm-svn: 153237
2012-03-22 03:46:58 +00:00
Kostya Serebryany 84a7f2e8e9 [asan] fix one more bug related to long double
llvm-svn: 153189
2012-03-21 15:28:50 +00:00
Eric Christopher 7d522f161d Zap some dead code pointed out by Chandler.
llvm-svn: 153150
2012-03-20 23:28:58 +00:00
Andrew Trick f7711010e1 LoopSimplify bug fix. Handle indirect loop back edges.
Do not call SplitBlockPredecessors on a loop preheader when one of the
predecessors is an indirectbr. Otherwise, you will hit this assert:
!isa<IndirectBrInst>(Preds[i]->getTerminator()) && "Cannot split an edge from an IndirectBrInst"

llvm-svn: 153134
2012-03-20 21:24:52 +00:00
Andrew Trick bb01cbb312 whitespace
llvm-svn: 153133
2012-03-20 21:24:47 +00:00
Kostya Serebryany c58dc9fcd2 [asan] don't emit __asan_mapping_offset/__asan_mapping_scale by default -- they are currently used only for experiments
llvm-svn: 153040
2012-03-19 16:40:35 +00:00
Bill Wendling 55b6b2b6a9 Revert r152907.
llvm-svn: 152935
2012-03-16 18:20:54 +00:00
Bill Wendling a2a26b546c The pointer operand of the store instruction may have an
alignment. If that's the case, then we want to make sure that we don't increase
the alignment of the store instruction, because if we increase it to be "more
aligned" than the pointer, code-gen may use instructions which require a greater
alignment than the pointer guarantees.
<rdar://problem/11043589>

llvm-svn: 152907
2012-03-16 07:40:08 +00:00
Chandler Carruth b37fc13a36 Rip out support for 'llvm.noinline'. This thing has a strange history...
It was added in 2007 as the first cut at supporting no-inline
attributes, but we didn't have function attributes of any form at the
time. However, it was added without any mention in the LangRef or other
documentation.

Later on, in 2008, Devang added function notes for 'inline=never' and
then turned them into proper function attributes. From that point
onward, as far as I can tell, the world moved on, and no one has touched
'llvm.noinline' in any meaningful way since.

Its time has now come. We have had better mechanisms for doing this for
a long time, all the frontends I'm aware of use them, and this is just
holding back progress. Given that it was never a documented feature of
the IR, I've provided no auto-upgrade support. If people know of real,
in-the-wild bitcode that relies on this, yell at me and I'll add it, but
I *seriously* doubt anyone cares.

llvm-svn: 152904
2012-03-16 06:10:15 +00:00
Chandler Carruth d7a5f2adb0 Start removing the use of an ad-hoc 'never inline' set and instead
directly query the function information which this set was representing.
This simplifies the interface of the inline cost analysis, and makes the
always-inline pass significantly more efficient.

Previously, always-inline would first make a single set of every
function in the module *except* those marked with the always-inline
attribute. It would then query this set at every call site to see if the
function was a member of the set, and if so, refuse to inline it. This
is quite wasteful. Instead, simply check the function attribute directly
when looking at the callsite.

The normal inliner also had similar redundancy. It added every function
in the module with the noinline attribute to its set to ignore, even
though inside the cost analysis function we *already tested* the
noinline attribute and produced the same result.

The only tricky part of removing this is that we have to be able to
correctly remove only the functions inlined by the always-inline pass
when finalizing, which requires a bit of a hack. Still, much less of
a hack than the set of all non-always-inline functions was. While I was
touching this function, I switched a heavy-weight set to a vector with
sort+unique. The algorithm already had a two-phase insert and removal
pattern, we were just needlessly paying the uniquing cost on every
insert.
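
The vector with sort+unique pattern mentioned above looks roughly like this. It is a generic sketch; sortAndUnique and collectUniqueIds are hypothetical names, not the helpers used in the patch.

```cpp
#include <algorithm>
#include <vector>

// Sort once, then drop adjacent duplicates.
template <typename T> void sortAndUnique(std::vector<T> &V) {
  std::sort(V.begin(), V.end());
  V.erase(std::unique(V.begin(), V.end()), V.end());
}

// Two-phase use: append freely (duplicates allowed, no per-insert uniquing
// cost), then unique the whole vector once before the removal phase.
std::vector<int> collectUniqueIds(const std::vector<int> &Seen) {
  std::vector<int> Ids;
  Ids.reserve(Seen.size());
  for (int Id : Seen)
    Ids.push_back(Id); // cheap insert phase
  sortAndUnique(Ids);  // single uniquing pass
  return Ids;
}
```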

This probably speeds up some compiles by a small amount (-O0 compiles
with lots of always-inline, so potentially heavy libc++ users), but I've
not tried to measure it.

I believe there is no functional change here, but yell if you spot one.
None are intended.

Finally, the direction this is going in is to greatly simplify the
inline cost query interface so that we can replace its implementation
with a much more clever one. Along the way, all the APIs get simplified,
so it seems incrementally good.

llvm-svn: 152903
2012-03-16 06:10:13 +00:00
Andrew Trick 070e540a3e LSR fix: Add isSimplifiedLoopNest to IVUsers analysis.
Only record IVUsers that are dominated by simplified loop
headers. Otherwise SCEVExpander will crash while looking for a
preheader.

I previously tried to work around this in LSR itself, but that was
insufficient. This way, LSR can continue to run if some uses are not
in simple loops, as long as we don't attempt to analyze those users.

Fixes <rdar://problem/11049788> Segmentation fault: 11 in LoopStrengthReduce

llvm-svn: 152892
2012-03-16 03:16:56 +00:00
Eli Friedman e06535b2f6 In InstCombiner::visitOr, make sure we reverse the operand swap used for checking for or-of-xor operations after those checks; a later check expects that any constant will be in Op1. PR12234.
llvm-svn: 152884
2012-03-16 00:52:42 +00:00
Rafael Espindola f58927855b Short term fix for pr12270 before we change dominates to handle unreachable
code.
While here, reduce indentation.

llvm-svn: 152803
2012-03-15 15:52:59 +00:00
Bill Wendling 7fa1be77cc Use an iterator instead of calling .size() on the worklist every time, which is wasteful.
llvm-svn: 152794
2012-03-15 11:19:41 +00:00
Chandler Carruth be2ccf01b7 Remove the basic inliner. This was added in 2007, and hasn't really
changed since. No one was using it. It is yet another consumer of the
InlineCost interface that I'd like to change.

llvm-svn: 152769
2012-03-15 01:37:56 +00:00
Chandler Carruth 3904590ba8 This pass didn't want the inline cost per-se, it just wants generic code
metrics.

llvm-svn: 152760
2012-03-15 00:29:10 +00:00
Aaron Ballman a733297fa6 Fixed a transform crash when setting a negative size value for memset. Fixes PR12202.
llvm-svn: 152756
2012-03-15 00:05:31 +00:00
Kostya Serebryany abad002d55 [tsan] use FunctionBlackList
llvm-svn: 152755
2012-03-14 23:33:24 +00:00
Kostya Serebryany 01401cec00 [asan] rename class BlackList to FunctionBlackList and move it into a separate file -- we will need the same functionality in ThreadSanitizer
llvm-svn: 152753
2012-03-14 23:22:10 +00:00
Dan Gohman 532fb8131b When an invoke is marked with metadata indicating its unwind edge
should be ignored by ARC optimization, don't insert new ARC runtime
calls in the unwind destination.

llvm-svn: 152748
2012-03-14 23:05:06 +00:00
Chandler Carruth 30b8416d2c Change where we enable the heuristic that delays inlining into functions
which are small enough to themselves be inlined. Delaying in this manner
can be harmful if the function is ineligible for inlining in some (or
many) contexts as it pessimizes the code of the function itself in the
event that inlining does not eventually happen.

Previously the check was written to only do this delaying of inlining
for static functions in the hope that they could be entirely deleted and
in the knowledge that all callers of static functions will have the
opportunity to inline if it is in fact profitable. However, with C++ we
get two other important sources of functions where the definition is
always available for inlining: inline functions and templated functions.
This patch generalizes the inliner to allow linkonce-ODR (the linkage
such C++ routines receive) to also qualify for this delay-based
inlining.

Benchmarking across a range of large real-world applications shows
roughly 2% size increase across the board, but an average speedup of
about 0.5%. Some benchmarks improved over 2%, and the 'clang' binary
itself (when bootstrapped with this feature) shows a 1% -O0 performance
improvement when run over all Sema, Lex, and Parse source code smashed
into a single file. A clean re-build of Clang+LLVM with a bootstrapped
Clang shows approximately 2% improvement, but that measurement is often
noisy.

llvm-svn: 152737
2012-03-14 20:16:41 +00:00
Pete Cooper 615fd897e0 Target override to allow CodeGenPrepare to sink address operands to intrinsics in the same way it currently does for loads and stores
llvm-svn: 152666
2012-03-13 20:59:56 +00:00
Chris Lattner 87fa77bd8a enhance jump threading to preserve TBAA information when PRE'ing loads,
fixing rdar://11039258, an issue that came up when inspecting clang's 
bootstrapped codegen.

llvm-svn: 152635
2012-03-13 18:07:41 +00:00
Dan Gohman eab06fa3c9 Teach globalopt how to evaluate an invoke with a non-void return type.
llvm-svn: 152634
2012-03-13 18:01:37 +00:00
Chandler Carruth 595fda8466 When inlining a function and adding its inner call sites to the
candidate set for subsequent inlining, try to simplify the arguments to
the inner call site now that inlining has been performed.

The goal here is to propagate and fold constants through deeply nested
call chains. Without doing this, we loose the inliner bonus that should
be applied because the arguments don't match the exact pattern the cost
estimator uses.

Reviewed on IRC by Benjamin Kramer.

llvm-svn: 152556
2012-03-12 11:19:33 +00:00
Stepan Dyatkovskiy 97b02fc1b3 llvm::SwitchInst
Renamed methods caseBegin, caseEnd and caseDefault to case_begin, case_end, and case_default.
Added some notes related to case iterators.

llvm-svn: 152532
2012-03-11 06:09:17 +00:00
Duncan Sands 14eb175836 Add statistics on removed switch cases, and fix the phi statistic
to count the number of phis changed, not the number visited.

llvm-svn: 152425
2012-03-09 19:21:15 +00:00
Dan Gohman 500b598c5c When identifying exit nodes for the reverse-CFG reverse-post-order
traversal, consider nodes for which the only successors are backedges
which the traversal is ignoring to be exit nodes. This fixes a problem
where the bottom-up traversal was failing to visit split blocks along
split loop backedges. This fixes rdar://10989035.

llvm-svn: 152421
2012-03-09 18:50:52 +00:00
Duncan Sands cca89124a2 Eliminate switch cases that can never match, for example remove all
negative switch cases if the branch condition is known to be positive.
Inspired by a recent improvement to GCC's VRP.
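
A minimal sketch of the idea, assuming the known facts about the condition are summarized as an inclusive range. KnownRange and pruneDeadCases are hypothetical names; the actual pass derives and represents such facts differently.

```cpp
#include <cstdint>
#include <vector>

// Inclusive range of values the switch condition is known to take.
struct KnownRange {
  std::int64_t Min;
  std::int64_t Max;
};

// Keep only the case values that could actually match the condition.
std::vector<std::int64_t> pruneDeadCases(const std::vector<std::int64_t> &Cases,
                                         KnownRange R) {
  std::vector<std::int64_t> Live;
  for (std::int64_t C : Cases)
    if (C >= R.Min && C <= R.Max)
      Live.push_back(C);
  return Live;
}
```

For a condition known to be positive, a range starting at 1 drops every negative (and zero) case value.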

llvm-svn: 152405
2012-03-09 13:45:18 +00:00
Stepan Dyatkovskiy 5b648afb4d Took into account Duncan's comments on r149481, dated 2nd Feb 2012:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html

Implemented CaseIterator, which solves almost all of the described issues: we don't need to mix operand/case/successor indexing anymore. The base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".

ConstCaseIt is just a read-only iterator.
CaseIt is a read-write iterator; it allows changing the case successor and case value.

Using the iterator allows us to remove the resolveXXXX methods entirely. All indexing conversions are done automatically inside the iterator's getters.

The main way of using the iterator looks like this:
SwitchInst *SI = ... // initialize it somehow

for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
  BasicBlock *BB = i.getCaseSuccessor();
  ConstantInt *V = i.getCaseValue();
  // Do something.
}

If you want to convert a case number to a TerminatorInst successor index, just use the iterator's getSuccessorIndex method.
If you want to initialize an iterator from a TerminatorInst successor index, use the CaseIt::fromSuccessorIndex(...) method.

There are also related changes in llvm-clients: klee and clang.

llvm-svn: 152297
2012-03-08 07:06:20 +00:00
Sebastian Pop 5ce71b18cb fix typos
llvm-svn: 152035
2012-03-05 17:39:47 +00:00
Sebastian Pop 8844e224b8 remove spaces on empty lines
llvm-svn: 152034
2012-03-05 17:39:45 +00:00
Duncan Sands 3eb328574e This is not a common case, in fact it never happens!
llvm-svn: 152027
2012-03-05 12:23:00 +00:00
Chandler Carruth d95357a18e Switch mem2reg to use the new hashing infrastructure.
llvm-svn: 152026
2012-03-05 11:29:56 +00:00
Chandler Carruth e134d1a336 Replace the ad-hoc hashing in GVN with the new hashing infrastructure.
This implicitly fixes a nasty bug in the GVN hashing (that thankfully
could only manifest as a performance bug): actually include the opcode
in the hash. The old code started the hash off with the opcode, but then
overwrote it with the type pointer.
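
The shape of the bug can be reproduced with a toy hash. This sketch deliberately does not use LLVM's hashing infrastructure: ExprKey, combine, and the integer TypeId standing in for the type pointer are all assumptions made for illustration.

```cpp
#include <cstdint>

// Toy expression key; TypeId stands in for the type pointer.
struct ExprKey {
  unsigned Opcode;
  unsigned TypeId;
};

// Very simple combiner, enough to show that both inputs matter.
inline std::uint64_t combine(std::uint64_t Seed, std::uint64_t V) {
  return Seed * 31 + V;
}

// Buggy pattern: the hash is seeded with the opcode and then overwritten.
std::uint64_t hashOverwritingOpcode(const ExprKey &E) {
  std::uint64_t H = E.Opcode; // seeded with the opcode...
  H = E.TypeId;               // ...then overwritten, so the opcode is lost
  return H;
}

// Fixed pattern: every field is mixed into the result.
std::uint64_t hashIncludingOpcode(const ExprKey &E) {
  return combine(combine(17, E.Opcode), E.TypeId);
}
```

Two expressions of the same type with different opcodes always collide under the first function, which costs performance but not correctness, as noted above.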

Since this is likely to be pretty hot (GVN being already pretty
expensive) I've included a micro-optimization to just not bother with
the varargs hashing if they aren't present. I can't measure any change
in GVN performance due to this, even with a big test case like Duncan's
sqlite one. Everything I see is in the noise floor. That said, this
closes a loop hole for a potential scaling problem due to collisions if
the opcode were the differentiating aspect of the expression.

llvm-svn: 152025
2012-03-05 11:29:54 +00:00
Duncan Sands 4d928e7dff Nick pointed out on IRC that GVN's propagateEquality wasn't propagating
equalities into phi node operands for which the equality is known to
hold in the incoming basic block.  That's because replaceAllDominatedUsesWith
wasn't handling phi nodes correctly in general (that this didn't give wrong
results was just luck: the specific way GVN uses replaceAllDominatedUsesWith
precluded wrong changes to phi nodes).

llvm-svn: 152006
2012-03-04 13:25:19 +00:00
Bill Wendling 97b9359623 Do trivial DCE of dead BBs during codegen preparation.
Some BBs can become dead after codegen preparation. If we delete them here, it
could help enable tail-call optimizations later on.
<rdar://problem/10256573>

llvm-svn: 152002
2012-03-04 10:46:01 +00:00
Evgeniy Stepanov d33e3d8c6e ASan: use getTypeAllocSize instead of getTypeStoreSize.
This change replaces getTypeStoreSize with getTypeAllocSize in AddressSanitizer
instrumentation for stack allocations.

One case where old behaviour produced undesired results is an optimization in
InstCombine pass (PromoteCastOfAllocation), which can replace alloca(T) with
alloca(S), where S has the same AllocSize, but a smaller StoreSize. Another
case is memcpy(long double => long double), where ASan will poison bytes 10-15
of a stack-allocated long double (StoreSize 10, AllocSize 16,
sizeof(long double) = 16).
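
The relationship between the two sizes can be sketched as below. This is a simplified model, with TypeLayout, alignTo, and allocSize as hypothetical stand-ins: the alloc size is taken to be the store size rounded up to the type's ABI alignment.

```cpp
#include <cstdint>

// Round Size up to the next multiple of Align (Align must be non-zero).
constexpr std::uint64_t alignTo(std::uint64_t Size, std::uint64_t Align) {
  return (Size + Align - 1) / Align * Align;
}

// The store size is the number of bytes a store writes; the alloc size
// additionally includes tail padding up to the ABI alignment.
struct TypeLayout {
  std::uint64_t StoreSize;
  std::uint64_t ABIAlign;
};

constexpr std::uint64_t allocSize(TypeLayout T) {
  return alignTo(T.StoreSize, T.ABIAlign);
}

// The long double case described above: 10 bytes stored, 16 allocated.
// Instrumentation sized by StoreSize would treat bytes 10-15 as outside
// the object even though they belong to the allocation.
constexpr TypeLayout X86LongDouble = {10, 16};
static_assert(allocSize(X86LongDouble) == 16, "bytes 10-15 are padding");
```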

See http://llvm.org/bugs/show_bug.cgi?id=12047 for more context.

llvm-svn: 151887
2012-03-02 10:41:08 +00:00
Dan Gohman 362eb69f24 Fix an iterator invalidation problem. operator[] on a DenseMap
can insert a new element, invalidating iterators. Use find
instead, and handle the case where the key is not found explicitly.
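
A sketch of the find-based lookup follows. It assumes std::unordered_map as a stand-in for llvm::DenseMap so the example stays self-contained; the two share the relevant behaviour, since operator[] inserts a default value for a missing key and an insertion may rehash the table and invalidate outstanding iterators. The helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

using Counts = std::unordered_map<std::string, int>;

// Lookup that never modifies the map; a missing key is handled explicitly.
int lookupOrDefault(const Counts &M, const std::string &Key, int Default) {
  Counts::const_iterator It = M.find(Key);
  return It == M.end() ? Default : It->second;
}

// For contrast: operator[] grows the map when the key is absent. The map is
// taken by value so the caller's copy is untouched; the new size is returned.
std::size_t sizeAfterSubscript(Counts M, const std::string &Key) {
  (void)M[Key];
  return M.size();
}
```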

llvm-svn: 151871
2012-03-02 01:26:46 +00:00
Dan Gohman 55b067427b Misc micro-optimizations.
llvm-svn: 151869
2012-03-02 01:13:53 +00:00
Duncan Sands bb2fe65542 Have GVN also do condition propagation when the right-hand side is not
a constant.  This fixes PR1768.

llvm-svn: 151713
2012-02-29 11:12:03 +00:00
Bill Wendling f2c78f344e Restrict this transformation to equality conditions.
This transformation is not correct for not-equal conditions:

(trunc x) != C1 & (and x, CA) != C2 -> (and x, CA|CMAX) != C1|C2

Let
  C1 == 0
  C2 == 0
  CA == 0xFF0000
  CMAX == 0xFF
and truncating to i8.

The original truth table:

    x   | A: trunc x != 0 | B: x & 0xFF0000 != 0 | A & B != 0
--------------------------------------------------------------
0x00000 |        0        |          0           |     0
0x00001 |        1        |          0           |     0
0x10000 |        0        |          1           |     0
0x10001 |        1        |          1           |     1

The truth table of the replacement:

    x   | x & 0xFF00FF != 0
----------------------------
0x00000 |        0
0x00001 |        1
0x10000 |        1
0x10001 |        1

So they are different.
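
The counterexample can also be checked mechanically. The sketch below hard-codes the constants from the example (C1 == 0, C2 == 0, truncation to i8); the function names are hypothetical.

```cpp
#include <cstdint>

// Constants from the example above.
constexpr std::uint32_t CA = 0xFF0000;
constexpr std::uint32_t CMAX = 0xFF;

// Original: (trunc x) != 0 & (and x, CA) != 0
constexpr bool original(std::uint32_t X) {
  return (static_cast<std::uint8_t>(X) != 0) && ((X & CA) != 0);
}

// Replacement: (and x, CA|CMAX) != 0
constexpr bool replacement(std::uint32_t X) {
  return (X & (CA | CMAX)) != 0;
}

// The two middle rows of the truth tables are where the results differ.
static_assert(!original(0x00001) && replacement(0x00001), "row 2 differs");
static_assert(!original(0x10000) && replacement(0x10000), "row 3 differs");
// The first and last rows agree.
static_assert(original(0x00000) == replacement(0x00000), "row 1 agrees");
static_assert(original(0x10001) == replacement(0x10001), "row 4 agrees");
```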

llvm-svn: 151691
2012-02-29 01:46:50 +00:00
Pete Cooper 39b5255df4 Reverted r151620 - DSE: Shorten memset when a later store overwrites the start of it. There were all sorts of buildbot issues
llvm-svn: 151621
2012-02-28 05:06:24 +00:00
Pete Cooper f3862f91de DSE: Shorten memset when a later store overwrites the start of it
llvm-svn: 151620
2012-02-28 04:27:10 +00:00
Benjamin Kramer 93887631d9 Plug a memleak in GlobalOpt.
Found by valgrind.

llvm-svn: 151525
2012-02-27 12:48:24 +00:00
Duncan Sands 9edea84420 Micro-optimization, no functionality change.
llvm-svn: 151524
2012-02-27 12:11:41 +00:00
Duncan Sands 1be25a78f7 The value numbering function is recursive, so it is possible for multiple new
value numbers to be assigned when calculating any particular value number.
Enhance the logic that detects new value numbers to take this into account,
for a tiny compile time speedup.  Fix a comment typo while there.

llvm-svn: 151522
2012-02-27 09:54:35 +00:00
Duncan Sands 27f459519d When performing a conditional branch depending on the value of a comparison
%cmp (eg: A==B) we already replace %cmp with "true" under the true edge, and
with "false" under the false edge.  This change enhances this to replace the
negated compare (A!=B) with "false" under the true edge and "true" under the
false edge.  Reported to improve perlbench results by 1%.
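
The rule can be sketched for equality predicates as follows. This is a toy model with hypothetical names, limited to EQ and NE on the same pair of operands.

```cpp
enum class Pred { EQ, NE };

// The branch condition is "A BranchPred B". Returns the constant that a
// compare "A QueryPred B" on the same operands is known to have on the
// given edge of the branch.
bool knownValueOnEdge(Pred BranchPred, Pred QueryPred, bool OnTrueEdge) {
  // The branch compare itself is true on the true edge and false on the
  // false edge; the negated compare takes the opposite value on each edge.
  const bool SamePredicate = (BranchPred == QueryPred);
  return SamePredicate ? OnTrueEdge : !OnTrueEdge;
}
```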

llvm-svn: 151517
2012-02-27 08:14:30 +00:00
Chad Rosier 50e0b81ea9 Add comment.
llvm-svn: 151431
2012-02-25 03:07:57 +00:00
Chad Rosier 07d37bc1ed Add support for disabling llvm.lifetime intrinsics in the AlwaysInliner. These
are optimization hints, but at -O0 we're not optimizing.  This becomes a problem
when the alwaysinline attribute is abused.
rdar://10921594

llvm-svn: 151429
2012-02-25 02:56:01 +00:00
Chad Rosier e48e5d2945 Fix indentation.
llvm-svn: 151420
2012-02-25 01:10:59 +00:00
Duncan Sands 926d101640 Teach GVN that x+y is the same as y+x and that x<y is the same as y>x.
llvm-svn: 151365
2012-02-24 15:16:31 +00:00
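One common way to get the effect described in r151365 is to canonicalize operand order before computing a value-number key. The sketch below assumes that approach; operands are plain strings and all names are hypothetical.

```python
COMMUTATIVE = {"add", "mul", "and", "or", "xor"}
SWAPPED_PREDICATE = {"<": ">", ">": "<", "<=": ">=", ">=": "<=", "==": "==", "!=": "!="}


def expression_key(op, lhs, rhs):
    """Return a key that is identical for equivalent operand orders."""
    if op in COMMUTATIVE and lhs > rhs:
        # x+y and y+x: sort the operands.
        lhs, rhs = rhs, lhs
    elif op in SWAPPED_PREDICATE and lhs > rhs:
        # x<y and y>x: sort the operands and swap the predicate to match.
        op, lhs, rhs = SWAPPED_PREDICATE[op], rhs, lhs
    return (op, lhs, rhs)
```

Non-commutative operations such as "sub" keep their operand order, so x-y and y-x still get different keys.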
Benjamin Kramer 077e55252a Reflow code, no functionality change.
llvm-svn: 151262
2012-02-23 17:42:19 +00:00
Duncan Sands 4730cb9c7c GCC fails to understand that NextBB is always initialized if EvaluateBlock
returns 'true' and emits a warning.  Help it out.

llvm-svn: 151242
2012-02-23 08:23:06 +00:00
Nick Lewycky 9d0da18597 Use the target-aware constant folder on expressions to improve the chance
they'll be simple enough to simulate, and to reduce the chance we'll encounter
equal but different simple pointer constants.

This removes the symptoms from PR11352 but is not a full fix. A proper fix would
either require a guarantee that two constant objects we simulate are folded
when equal, or a different way of handling equal pointers (ie., trying a
constantexpr icmp on them to see whether we know they're equal or non-equal or
unsure).

llvm-svn: 151093
2012-02-21 22:08:06 +00:00
Benjamin Kramer c7a22fe76b Fix unsigned off-by-one in comment.
llvm-svn: 151056
2012-02-21 13:40:06 +00:00
Benjamin Kramer 6ee8690aa5 InstCombine: Don't transform a signed icmp of two GEPs into a signed compare of the indices.
This transformation is not safe in some pathological cases (signed icmp of pointers should be an
extremely rare thing, but it's valid IR!). Add an explanatory comment.

Kudos to Duncan for pointing out this edge case (and not giving up explaining it until I finally got it).

llvm-svn: 151055
2012-02-21 13:31:09 +00:00
Nick Lewycky 519561f418 Check for the correct size in the invariant marker.
llvm-svn: 151003
2012-02-20 23:32:26 +00:00
Chad Rosier 47eeddde24 Fix 80-column violation.
llvm-svn: 150998
2012-02-20 23:13:17 +00:00
Benjamin Kramer ac8ecc4e7e InstCombine: Removing the base from the address calculation is only safe when the GEPs are inbounds.
llvm-svn: 150978
2012-02-20 18:45:10 +00:00
Benjamin Kramer 7adb189538 InstCombine: When comparing two GEPs that were derived from the same base pointer but use different types, expand the offset calculation and do the compare on the offset if profitable.
This came up in SmallVector code.

llvm-svn: 150962
2012-02-20 15:07:47 +00:00
Benjamin Kramer 7746eb62fb InstCombine: Make OptimizePointerDifference more aggressive.
- Ignore pointer casts.
- Also expand GEPs that aren't constantexprs when they have one use or only constant indices.

- We now compile "&foo[i] - &foo[j]" into "i - j".

llvm-svn: 150961
2012-02-20 14:34:57 +00:00
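The "&foo[i] - &foo[j]" to "i - j" fold in r150961 rests on simple address arithmetic. The sketch below models addresses as integers, with a hypothetical base address and element size.

```python
def element_address(base, index, elem_size):
    """Byte address of foo[index] for an array starting at base."""
    return base + index * elem_size


def pointer_difference(base, i, j, elem_size):
    """&foo[i] - &foo[j] computed the long way, via byte addresses."""
    byte_diff = element_address(base, i, elem_size) - element_address(base, j, elem_size)
    # The byte difference is an exact multiple of elem_size, so this is exact.
    return byte_diff // elem_size
```

The base cancels and the element size divides out, which is why the result depends only on i and j.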
Nick Lewycky 60829a587a Rename class Evaluate to Evaluator and put it in an anonymous namespace.
llvm-svn: 150947
2012-02-20 03:25:59 +00:00
Nick Lewycky 73be5e31a6 Move EvaluateFunction and EvaluateBlock into a class, and make the class store
the information that they pass around between them. No functionality change!

llvm-svn: 150939
2012-02-19 23:26:27 +00:00
Ahmed Charles 636a3d618c Remove dead code. Improve llvm_unreachable text. Simplify some control flow.
llvm-svn: 150918
2012-02-19 11:37:01 +00:00
Dan Gohman 0155f30a9c Calls and invokes with the new clang.arc.no_objc_arc_exceptions
metadata may still unwind, but only in ways that the ARC
optimizer doesn't need to consider. This permits more
aggressive optimization.

llvm-svn: 150829
2012-02-17 18:59:53 +00:00
Nick Lewycky 68f9f9d9c8 Add support for invariant.start inside the static constructor evaluator. This is
useful to represent a variable that is const in the source but can't be constant
in the IR because of a non-trivial constructor. If globalopt evaluates the
constructor, and there was an invariant.start with no matching invariant.end
possible, it will mark the global constant afterwards.

llvm-svn: 150794
2012-02-17 06:59:21 +00:00
Bill Wendling aa9a3eae79 Remove redundant comment. Use a more efficient datatype.
llvm-svn: 150780
2012-02-17 02:12:54 +00:00
Bill Wendling 0a8fec2762 Fix some grammar-os and formatting.
llvm-svn: 150779
2012-02-17 02:09:28 +00:00
Eli Friedman c458885c58 loop-rotate shouldn't hoist alloca instructions out of a loop. Patch by Patrik Hägglund, with slightly modified test. Issue reported by Patrik Hägglund on llvmdev.
llvm-svn: 150642
2012-02-16 00:41:10 +00:00
Kostya Serebryany a8531eeb64 [tsan] fix compiler warnings
llvm-svn: 150449
2012-02-14 00:52:07 +00:00
Andrew Trick 10cc45336d Add simplifyLoopLatch to LoopRotate pass.
This folds a simple loop tail into a loop latch. It covers the common (in fortran) case of postincrement loops. It's a "free" way to expose this type of loop to downstream loop optimizations that bail out on non-canonical loops (getLoopLatch is a heavily used check).

llvm-svn: 150439
2012-02-14 00:00:23 +00:00
Andrew Trick a20f198747 whitespace
llvm-svn: 150438
2012-02-14 00:00:19 +00:00
Devang Patel 698452bc7e Check against umin while converting fcmp into an icmp.
llvm-svn: 150425
2012-02-13 23:05:18 +00:00
Dan Gohman eb6e01533a Just like in regular escape analysis, loads and stores through
(but not of) a block pointer do not cause the block pointer to
escape. This fixes rdar://10803830.

llvm-svn: 150424
2012-02-13 22:57:02 +00:00
Kostya Serebryany e2a0e4163a ThreadSanitizer, a race detector. First LLVM commit.
Clang patch (flags) will follow shortly.
The run-time library will also follow, but not immediately.

llvm-svn: 150423
2012-02-13 22:50:51 +00:00
Ahmed Charles 32e983e4fc Fix various issues (or do cleanups) found by enabling certain MSVC warnings.
- Use unsigned literals when the desired result is unsigned. This mostly allows unsigned/signed mismatch warnings to be less noisy even if they aren't on by default.
- Remove misplaced llvm_unreachable.
- Add static to a declaration of a function on MSVC x86 only.
- Change some instances of calling a static function through a variable to simply calling that function while removing the unused variable.

llvm-svn: 150364
2012-02-13 06:30:56 +00:00
Nick Lewycky c1572e4c90 Handle InvokeInst in EvaluateBlock. Don't try to support exceptions, it's just
that no optz'ns have run yet to convert invokes to calls.

llvm-svn: 150326
2012-02-12 05:09:35 +00:00
Nick Lewycky f285256f72 false is totally null!
llvm-svn: 150324
2012-02-12 02:17:18 +00:00
Nick Lewycky 4b273cb7ea Remove redundant getAnalysis<> calls in GlobalOpt. Add a few Itanium ABI calls
to TargetLibraryInfo and use one of them in GlobalOpt.

llvm-svn: 150323
2012-02-12 02:15:20 +00:00
Nick Lewycky cf6aae686d Pass TargetData and TargetLibraryInfo through to the constant folder. Fixes a
few fixme's when TLI was added.

llvm-svn: 150322
2012-02-12 01:13:18 +00:00
Nick Lewycky 1480f1d3f9 Fix function name in comment to match actual name. Fix comments that are using
doxy-style on local variables to not do so. Fix one 80-col violation.

llvm-svn: 150320
2012-02-12 00:52:26 +00:00
Nick Lewycky 4231c41c64 Don't traverse the PHI nodes twice. No functionality change!
llvm-svn: 150319
2012-02-12 00:47:24 +00:00
Hal Finkel 1bde3f86d1 Update BBVectorize to use aliasesUnknownInst.
This allows BBVectorize to check the "unknown instruction" list in the
alias sets. This is important to prevent instruction fusing from reordering
function calls. Resolves PR11920.

llvm-svn: 150250
2012-02-10 15:52:40 +00:00
Benjamin Kramer 1a4695a091 Tweak comment readability and grammar.
llvm-svn: 150183
2012-02-09 16:28:15 +00:00
Benjamin Kramer 487a3962c7 GlobalOpt: Be more aggressive about eliminating side-effect free static dtors.
GlobalOpt runs early in the pipeline (before inlining) and complex class
hierarchies often introduce bitcasts or GEPs which weren't optimized away.
Teach it to ignore side-effect free instructions instead of depending on
other passes to remove them.

llvm-svn: 150174
2012-02-09 14:26:06 +00:00
Kostya Serebryany 154a54d972 [asan] unpoison the stack before every noreturn call. Fixes asan issue 37. llvm part
llvm-svn: 150102
2012-02-08 21:36:17 +00:00
Duncan Sands 0920308a7e Use Use::set rather than finding the operand number of the use
and setting that.

llvm-svn: 150074
2012-02-08 14:10:53 +00:00
Craig Topper a2886c21d9 Convert assert(0) to llvm_unreachable
llvm-svn: 149967
2012-02-07 05:05:23 +00:00
Chris Lattner 8213c8af29 Remove some dead code and tidy things up now that vectors use ConstantDataVector
instead of always using ConstantVector.

llvm-svn: 149912
2012-02-06 21:56:39 +00:00
Bill Wendling 0aef16afd5 [unwind removal] Remove all of the code for the dead 'unwind' instruction. There
were no 'unwind' instructions being generated before this, so this is in effect
a no-op.

llvm-svn: 149906
2012-02-06 21:44:22 +00:00
Bill Wendling d5d95b0b51 [unwind removal] We no longer have 'unwind' instructions being generated, so
remove the code that handles them.

llvm-svn: 149901
2012-02-06 21:16:41 +00:00
Benjamin Kramer baba1aa001 Make helper static.
llvm-svn: 149865
2012-02-06 11:28:19 +00:00
Nick Lewycky 239fdf0f61 Split part of EvaluateFunction into a new EvaluateBlock method. No functionality
change.

llvm-svn: 149861
2012-02-06 08:24:44 +00:00
Sebastian Pop 662beed828 fix indentation
llvm-svn: 149857
2012-02-06 05:29:32 +00:00
Nick Lewycky 52da72b12a Teach GlobalOpt to handle atomic accesses to globals.
 * Most of the transforms come through intact by having each transformed load or
store copy the ordering and synchronization scope of the original.
 * The transform that turns a global only accessed in main() into an alloca
(since main is non-recursive) with a store of the initial value uses an
unordered store, since it's guaranteed to be the first thing to happen in main.
(Threads may have started before main (!) but they can't have the address of a
function local before the point in the entry block we insert our code.)
 * The heap-SRoA transforms are disabled in the face of atomic operations. This
can probably be improved; it seems odd to have atomic accesses to an alloca
that doesn't have its address taken.

AnalyzeGlobal keeps track of the strongest ordering found in any use of the
global. This is more information than we need right now, but it's cheap to
compute and likely to be useful.

llvm-svn: 149847
2012-02-05 19:56:38 +00:00
Nick Lewycky bbd1156b95 Clean up some whitespace and comments. No functionality change.
llvm-svn: 149845
2012-02-05 19:48:37 +00:00
Duncan Sands 9066fb5c43 Neaten up this method. Check that if there is only one
predecessor then it's Src.

llvm-svn: 149843
2012-02-05 19:43:37 +00:00
Duncan Sands 12efb16b01 Fix a thinko pointed out by Eli and the buildbots.
llvm-svn: 149839
2012-02-05 18:56:50 +00:00
Duncan Sands 4b613497f0 Reduce the number of dom queries made by GVN's conditional propagation
logic by half: isOnlyReachableViaThisEdge was trying to be clever and
handle the case of a branch to a basic block which is contained in a
loop.  This costs a domtree lookup and is completely useless due to
GVN's position in the pass pipeline: all loops have preheaders at this
point, which means it is enough for isOnlyReachableViaThisEdge to check
that Dst has only one predecessor.  (I checked this theoretical argument
by running over the entire nightly testsuite, and indeed it is so!).

llvm-svn: 149838
2012-02-05 18:25:50 +00:00
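The simplified check from r149838, together with the follow-up in r149843 that the single predecessor must be Src, can be sketched on a toy CFG. The predecessor map and the function name below are illustrative, not the LLVM API.

```python
def is_only_reachable_via_this_edge(src, dst, predecessors):
    """True if dst has exactly one predecessor and that predecessor is src.

    predecessors maps each block name to the list of its predecessor blocks.
    The check relies on the premise stated above: every loop has a preheader,
    so no dominator-tree query is needed.
    """
    preds = predecessors.get(dst, [])
    return len(preds) == 1 and preds[0] == src
```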
Duncan Sands 268903955c Reduce the number of non-trivial domtree queries by about 1% when
compiling sqlite3, by only doing dom queries after the cheap check
rather than interleaved with it.

llvm-svn: 149836
2012-02-05 15:50:43 +00:00
David Blaikie f9c1291fde Simplify contains tests using 'count'.
llvm-svn: 149813
2012-02-05 06:35:36 +00:00
NAKAMURA Takumi 32c48634db BBVectorize.cpp: Get rid of comparison to bool to fix a warning.
llvm-svn: 149810
2012-02-05 05:47:51 +00:00
Chris Lattner cf9e8f6968 reapply the patches reverted in r149470 that reenable ConstantDataArray,
but with a critical fix to the SelectionDAG code that optimizes copies
from strings into immediate stores: the previous code was stopping reading
string data at the first nul.  Address this by adding a new argument to
llvm::getConstantStringInfo, preserving the behavior before the patch.

llvm-svn: 149800
2012-02-05 02:29:43 +00:00
Hal Finkel 135cac922c Boost the effective chain depth of loads and stores.
By default, boost the chain depth contribution of loads and stores. This will allow a load/store pair to vectorize even when it would not otherwise be long enough to satisfy the chain depth requirement.

llvm-svn: 149761
2012-02-04 04:14:04 +00:00
Jim Grosbach 1df8cdc588 Narrow test further. Make bot and test happy.
llvm-svn: 149650
2012-02-03 00:26:07 +00:00
Jim Grosbach 7815f56b22 Tidy up. Trailing whitespace.
llvm-svn: 149649
2012-02-03 00:07:04 +00:00
Jim Grosbach e84ae7bfa0 Restrict InstCombine from converting varargs to or from fixed args.
More targeted fix replacing d0e277d272d517ca1cda368267d199f0da7cad95.

llvm-svn: 149648
2012-02-03 00:00:55 +00:00
Jim Grosbach 0ab54184d7 Revert "Disable InstCombine unsafe folding bitcasts of calls w/ varargs."
This reverts commit d0e277d272d517ca1cda368267d199f0da7cad95.

llvm-svn: 149647
2012-02-03 00:00:50 +00:00
Benjamin Kramer f61f60d97a BBVectorize: Simplify code, no functionality change.
Also silences warnings about bodyless for loops.

llvm-svn: 149612
2012-02-02 18:52:15 +00:00
Hal Finkel 8cf51b871c Minor changes from review.
As suggested by Nick Lewycky, the tree traversal queues have been changed to SmallVectors and the associated loops have been rotated. Also, an 80-col violation was fixed.

llvm-svn: 149607
2012-02-02 17:29:39 +00:00
Hal Finkel 0f3298e8d4 Vectorize long blocks in groups.
Long basic blocks with many candidate pairs (such as in the SHA implementation in Perl 5.14; thanks to Roman Divacky for the example) used to take an unacceptably-long time to compile. Instead, break long blocks into groups so that no group has too many candidate pairs.

llvm-svn: 149595
2012-02-02 06:14:56 +00:00
Stepan Dyatkovskiy 513aaa5691 SwitchInst refactoring.
The purpose of this refactoring is to hide operand roles from users of SwitchInst (programmers). If you want to manipulate operands directly, you will probably need lower-level methods than the SwitchInst ones (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index inside the getCaseValue method:
getCaseValue(0) means "get first case", not the condition. Use getCondition() if you want to resolve the condition. I propose not mixing SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) returns the actual case number. 0 means the first case, not the default. If there is no case with the given value, ErrorIndex will be returned.
3. Added the getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor if you want to resolve a case successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to show how case successors are actually mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" is for when you need to go down a level from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for the given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
llvm-svn: 149481
2012-02-01 07:49:51 +00:00
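The indexing scheme described in r149481 can be illustrated with a toy model. It assumes the default destination occupies successor slot 0 and uses a hypothetical sentinel for ErrorIndex, so it shows the idea rather than the real SwitchInst class.

```python
ERROR_INDEX = -1  # hypothetical sentinel; the message does not give the real ErrorIndex value


class ToySwitch:
    """Toy switch: case indices start at 0 and never refer to the condition or default."""

    def __init__(self, condition, default_dest, cases):
        self.condition = condition
        # Assumed low-level layout: successor 0 is the default, cases follow.
        self._successors = [default_dest] + [dest for _, dest in cases]
        self._values = [value for value, _ in cases]

    def get_condition(self):
        return self.condition

    def get_case_value(self, case_index):
        return self._values[case_index]

    def find_case_value(self, value):
        """Case index for value, or ERROR_INDEX if no case matches."""
        return self._values.index(value) if value in self._values else ERROR_INDEX

    def resolve_successor_index(self, case_index):
        """Case index -> low-level successor index."""
        return case_index + 1

    def resolve_case_index(self, successor_index):
        """Low-level successor index -> case index."""
        return successor_index - 1

    def get_case_successor(self, case_index):
        return self._successors[self.resolve_successor_index(case_index)]
```

Callers that use only the case-level methods keep working if the internal layout changes, which is the stated goal of the refactoring.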
NAKAMURA Takumi e1d61f666b BBVectorize.cpp: Try to fix MSVC build. map::iterator and multimap::iterator are incompatible.
llvm-svn: 149475
2012-02-01 06:11:58 +00:00
Hal Finkel 8a3aebe5e0 A few of the changes suggested in code review (by Nick Lewycky)
llvm-svn: 149472
2012-02-01 05:51:45 +00:00
Argyrios Kyrtzidis 17c981a45b Revert Chris' commits up to r149348 that started causing VMCoreTests unit test to fail.
These are:

r149348
r149351
r149352
r149354
r149356
r149357
r149361
r149362
r149364
r149365

llvm-svn: 149470
2012-02-01 04:51:17 +00:00
Hal Finkel c34e51132c Add a basic-block autovectorization pass.
This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).

llvm-svn: 149468
2012-02-01 03:51:43 +00:00
Jim Grosbach 9fa0481569 Disable InstCombine unsafe folding bitcasts of calls w/ varargs.
Changing arguments from being passed as fixed to varargs is unsafe, as
the ABI may require they be handled differently (stack vs. register, for
example).

Remove two tests which rely on the bitcast being folded into the direct
call, which is exactly the transformation that's unsafe.

llvm-svn: 149457
2012-02-01 00:08:17 +00:00
Lenny Maiorani 8d670b8f93 bz11794 : EarlyCSE stack overflow on long functions.
Make the EarlyCSE optimizer not use recursion to do a depth first iteration.

llvm-svn: 149445
2012-01-31 23:14:41 +00:00
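The general technique behind r149445, replacing recursion with an explicit work stack, can be sketched as follows. This is a generic preorder walk over a tree given as a dict, not the EarlyCSE implementation, which also has to manage per-scope state.

```python
def preorder_iterative(tree, root):
    """Depth-first preorder walk using an explicit stack instead of recursion.

    tree maps each node to the list of its children.
    """
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(tree.get(node, [])))
    return order
```

Because the pending nodes live on the heap, the depth of the tree no longer bounds the call stack, so very long chains of blocks do not overflow it.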
Bill Wendling e5f4a6d904 Increase the initial vector size to be equivalent to the size of the Deps
vector. This potentially saves a resizing.

llvm-svn: 149369
2012-01-31 07:04:52 +00:00
Bill Wendling 8a33312948 Cache the size of the vector instead of calling .size() all over the place.
llvm-svn: 149368
2012-01-31 06:57:53 +00:00
Chris Lattner f1179025ae eliminate the "string" form of ConstantArray::get, using
ConstantDataArray::getString instead.

llvm-svn: 149365
2012-01-31 06:18:43 +00:00
Chris Lattner 9e4b8726f8 eliminate the last uses of GetConstantStringInfo from this file, I didn't realize I was that close...
llvm-svn: 149354
2012-01-31 04:54:27 +00:00
Chris Lattner 8193b06e44 start moving SimplifyLibcalls over to getConstantStringInfo, which is
dramatically more efficient than GetConstantStringInfo.

llvm-svn: 149352
2012-01-31 04:43:11 +00:00
Chris Lattner fe741769dd enhance logic to support ConstantDataArray.
llvm-svn: 149340
2012-01-31 02:55:06 +00:00
Bill Wendling 3fd879dde2 s/getInnerUnwindDest/getInnerResumeDest/g
llvm-svn: 149328
2012-01-31 01:48:40 +00:00
Bill Wendling ea6e935e95 Remove ivar which is identical to another ivar.
llvm-svn: 149323
2012-01-31 01:25:54 +00:00
Bill Wendling 0c2d82b942 Remove unused ivars and s/getOuterUnwindDest/getOuterResumeDest/g.
llvm-svn: 149322
2012-01-31 01:22:03 +00:00
Bill Wendling 7778e6d818 Remove more dead functions.
llvm-svn: 149318
2012-01-31 01:18:21 +00:00
Bill Wendling 803d6b1b0c s/getInnerUnwindDestNewEH/getInnerUnwindDest/g
llvm-svn: 149317
2012-01-31 01:15:59 +00:00
Bill Wendling 621699de22 Remove some unused, old-EH methods.
llvm-svn: 149316
2012-01-31 01:14:49 +00:00
Bill Wendling 518a205d0a Get rid of references to dead intrinsics.
The eh.selector and eh.resume intrinsics aren't used anymore. Get rid of some
calls to them.

llvm-svn: 149314
2012-01-31 01:05:20 +00:00
Bill Wendling ce0c229234 Formatting cleanups. No functionality change.
llvm-svn: 149312
2012-01-31 01:01:16 +00:00
Bill Wendling f3cae51490 Remove no-longer-useful dyn_casts and pals.
llvm-svn: 149307
2012-01-31 00:56:53 +00:00
Kostya Serebryany 22ddcfd2df [asan] fix the ObjC support (asan Issue #33)
llvm-svn: 149300
2012-01-30 23:50:10 +00:00
Chad Rosier 6a0baa8f09 Typo.
llvm-svn: 149289
2012-01-30 22:44:13 +00:00
Chad Rosier 41003f819c Typo.
llvm-svn: 149275
2012-01-30 21:13:22 +00:00
Alexander Potapenko 7a36f9d399 Fix compilation of ASan tests on OS X Lion (see http://code.google.com/p/address-sanitizer/issues/detail?id=32)
The redzones emitted by AddressSanitizer for CFString instances confuse the linker and are of little use, so we shouldn't add them. 

llvm-svn: 149243
2012-01-30 10:40:22 +00:00
Nick Lewycky 1b3167edec Fix typo.
llvm-svn: 149185
2012-01-28 23:33:44 +00:00
Kostya Serebryany 7471d1303d [asan] correctly use ConstantExpr::getGetElementPtr. Caught by NAKAMURA Takumi
llvm-svn: 149172
2012-01-28 04:27:16 +00:00
Chris Lattner 0256be96f2 continue making the world safe for ConstantDataVector. At this point,
we should (theoretically) optimize and codegen ConstantDataVector as well
as ConstantVector.

llvm-svn: 149116
2012-01-27 03:08:05 +00:00
Chris Lattner fa77500d96 Continue improving support for ConstantDataAggregate, and use the
new methods recently added to (sometimes greatly!) simplify code.

llvm-svn: 149024
2012-01-26 02:32:04 +00:00
Chris Lattner 8326bd8e10 some general cleanup, using new methods and tidying up old code.
llvm-svn: 149006
2012-01-26 00:42:34 +00:00
Nick Lewycky 3c3feaf40c Gracefully degrade precision in branch probability numbers.
llvm-svn: 148946
2012-01-25 09:43:14 +00:00
Chris Lattner 6705883ad8 use Constant::getAggregateElement to simplify a bunch of code.
llvm-svn: 148934
2012-01-25 06:48:06 +00:00
Chris Lattner 47a86bdbe2 use ConstantVector::getSplat in a few places.
llvm-svn: 148929
2012-01-25 06:02:56 +00:00
Kostya Serebryany c11d1dd133 [asan] enable asan only for the functions that have Attribute::AddressSafety
llvm-svn: 148846
2012-01-24 19:34:43 +00:00
Chris Lattner a0d01ff567 basic instcombine support for CDS.
llvm-svn: 148806
2012-01-24 14:31:22 +00:00
Alexander Potapenko c94cf8faf6 Implemented AddressSanitizer::getPassName()
llvm-svn: 148697
2012-01-23 11:22:43 +00:00
David Blaikie 46a9f016c5 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Andrew Trick b9c822ab0b Handle a corner case with IV chain collection with bailout instead of assert.
Fixes PR11783: bad cast to AddRecExpr.

llvm-svn: 148572
2012-01-20 21:23:40 +00:00
Kostya Serebryany a5054ad2f3 Extend Attributes to 64 bits
Problem: LLVM needs more function attributes than currently available (32 bits).
One such proposed attribute is "address_safety", which shows that a function is being checked for address safety (by AddressSanitizer, SAFECode, etc).

Solution:
- extend the Attributes from 32 bits to 64-bits
- wrap the object into a class so that unsigned is never erroneously used instead
- change "unsigned" to "Attributes" throughout the code, including one place in clang.
- the class has no "operator uint64 ()", but it has "uint64_t Raw() " to support packing/unpacking.
- the class has "safe operator bool()" to support the common idiom:  if (Attributes attr = getAttrs()) useAttrs(attr);
- The CTOR from uint64_t is marked explicit, so I had to add a few explicit CTOR calls
- Add the new attribute "address_safety". Doing it in the same commit to check that attributes beyond first 32 bits actually work.
- Some of the functions from the Attribute namespace are worth moving inside the class, but I'd prefer to have it as a separate commit.

Tested:
"make check" on Linux (32-bit and 64-bit) and Mac (10.6)
built/run spec CPU 2006 on Linux with clang -O2.


This change will break clang build in lib/CodeGen/CGCall.cpp.
The following patch will fix it.

llvm-svn: 148553
2012-01-20 17:56:17 +00:00
Andrew Trick c908b43d9f SCEVExpander fixes. Affects LSR and indvars.
LSR has gradually been improved to more aggressively reuse existing code, particularly existing phi cycles. This exposed problems with the SCEVExpander's sloppy treatment of its insertion point. I applied some rigor to the insertion point problem that will hopefully avoid an endless bug cycle in this area. Changes:

- Always use properlyDominates to check safe code hoisting.

- The insertion point provided to SCEV is now considered a lower bound. This is usually a block terminator or the use itself. Under no circumstances may SCEVExpander insert below this point.

- LSR is responsible for finding a "canonical" insertion point across expansion of different expressions.

- Robust logic to determine whether IV increments are in "expanded" form and/or can be safely hoisted above some insertion point.

Fixes PR11783: SCEVExpander assert.

llvm-svn: 148535
2012-01-20 07:41:13 +00:00
Dan Gohman 8ee108bf98 Set the "tail" flag on pattern-matched objc_storeStrong calls.
rdar://10531041.

llvm-svn: 148490
2012-01-19 19:14:36 +00:00
Nick Lewycky 219e6bcb71 Actually, this code handles wrapped sets just fine. Noticed by inspection.
llvm-svn: 148487
2012-01-19 18:19:42 +00:00
Dan Gohman 8f12faeb14 Add a depth limit to avoid runaway recursion.
llvm-svn: 148419
2012-01-18 21:24:45 +00:00
Dan Gohman 82041c2e60 Use llvm.global_ctors to locate global constructors instead
of recognizing them by name.

llvm-svn: 148416
2012-01-18 21:19:38 +00:00
Jakub Staszak 632a355a01 Remove trailing spaces and unneeded includes.
llvm-svn: 148415
2012-01-18 21:16:33 +00:00
Dan Gohman e7a243fea5 Add a new ObjC ARC optimization pass to eliminate unneeded
autorelease push+pop pairs.

llvm-svn: 148330
2012-01-17 20:52:24 +00:00
Dan Gohman b9936296d3 Add a new PassManagerBuilder customization point,
EP_ModuleOptimizerEarly, to allow passes to be added before the
main ModulePass optimizers.

llvm-svn: 148329
2012-01-17 20:51:32 +00:00
Andrew Trick 12728f04ca LSR fix: broaden the check for loop preheaders.
It's becoming clear that LoopSimplify needs to unconditionally create loop preheaders. But that is a bigger fix. For now, continuing to hack LSR.
Fixes rdar://10701050 "Cannot split an edge from an IndirectBrInst" assert.

llvm-svn: 148288
2012-01-17 06:45:52 +00:00
David Blaikie b48ed1a4cb Remove unreachable code. (replace with llvm_unreachable to help GCC where necessary)
llvm-svn: 148284
2012-01-17 04:43:56 +00:00
Stepan Dyatkovskiy 2931a59ec5 Fixed comment in loop-unswitch.
llvm-svn: 148252
2012-01-16 20:48:04 +00:00
Stepan Dyatkovskiy 7ec12e431a Cosmetic patch for r148215.
llvm-svn: 148216
2012-01-15 09:45:11 +00:00
Stepan Dyatkovskiy cb2adbacf8 Fixup for r148132. Type replacement for LoopsProperties: from DenseMap to std::map, since we need to keep a valid pointer to properties of current loop.
Message for r148132:
LoopUnswitch: All helper data that is collected during loop-unswitch iterations was moved to a separate class (LUAnalysisCache).

llvm-svn: 148215
2012-01-15 09:44:07 +00:00
Dan Gohman 4cf362acc1 Fix an unused variable warning that Chad noticed.
llvm-svn: 148164
2012-01-14 00:47:44 +00:00
Eli Friedman d476fdc392 Speculatively revert r148132+r148133 to try and fix a buildbot failure.
llvm-svn: 148149
2012-01-13 22:34:39 +00:00
Stepan Dyatkovskiy 0a920fa210 Cosmetic patch for r148132.
llvm-svn: 148133
2012-01-13 19:27:22 +00:00
Stepan Dyatkovskiy cbcbdb237f LoopUnswitch: All helper data that is collected during loop-unswitch iterations was moved to a separate class (LUAnalysisCache).
llvm-svn: 148132
2012-01-13 19:13:54 +00:00
Dan Gohman 728db4997a Implement proper ObjC ARC objc_retainBlock "escape" analysis, so that
the optimizer doesn't eliminate objc_retainBlock calls which are needed
for their side effect of copying blocks onto the heap.
This implements rdar://10361249.

llvm-svn: 148076
2012-01-13 00:39:07 +00:00
Eli Friedman b31c627be1 Re-fix the issue Bill fixed in r147899 in a slightly different way, which doesn't abuse the semantics of linker_private. We don't really want to merge any string constant with a weak_odr global.
llvm-svn: 147971
2012-01-11 22:06:46 +00:00
Kostya Serebryany 687d078192 [asan] extend the workaround for http://llvm.org/bugs/show_bug.cgi?id=11395: don't instrument the function at all on x86_32 if it has a large asm blob
llvm-svn: 147953
2012-01-11 18:15:23 +00:00
Stepan Dyatkovskiy 8216569812 Improved compile time:
1. Size heuristics changed. Now we calculate the number of unswitching
branches only once per loop.
2. Some checks were moved from UnswitchIfProfitable to
processCurrentLoop, since they do not change during a processCurrentLoop
iteration. This allows us to skip some loops at an early stage.
Extended statistics:
- Added total number of instructions analyzed.

llvm-svn: 147935
2012-01-11 08:40:51 +00:00
Bill Wendling c79155192d If the global variable is removed by the linker, then don't constant merge it
with other symbols.

An object in the __cfstring section is supposed to be filled with CFString
objects, which have a pointer to ___CFConstantStringClassReference followed by a
pointer to a __cstring. If we allow the object in the __cstring section to be
merged with another global, then it could end up in any section. Because the
linker is going to remove these symbols in the final executable, we shouldn't
bother to merge them.
<rdar://problem/10564621>

llvm-svn: 147899
2012-01-11 00:13:08 +00:00
Andrew Trick d5d2db9af9 Enable LSR IV Chains with sufficient heuristics.
These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more target
specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.

Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.

Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.

On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.

llvm-svn: 147826
2012-01-10 01:45:08 +00:00
Andrew Trick 248d410e3e Adding IV chain generation to LSR.
After collecting chains, check if any should be materialized. If so,
hide the chained IV users from the LSR solver. LSR will only solve for
the head of the chain. GenerateIVChains will then materialize the
chained IV users by computing the IV relative to its previous value in
the chain.

In theory, chained IV users could be exposed to LSR's solver. This
would be considerably complicated to implement and I'm not aware of a
case where we need it. In practice it's more important to
intelligently prune the search space of nontrivial loops before
running the solver, otherwise the solver is often forced to prune the
most optimal solutions. Hiding the chained users does this well, so
that LSR is more likely to find the best IV for the chain as a whole.

llvm-svn: 147801
2012-01-09 21:18:52 +00:00
Andrew Trick 29fe5f03d7 Adding collection of IV chains to LSR.
This collects a set of IV uses within the loop whose values can be
computed relative to each other in a sequence. Following checkins will
make use of this information.

llvm-svn: 147797
2012-01-09 19:50:34 +00:00
Andrew Trick 4dc3eff5ae "Minor LSR debugging stuff"
llvm-svn: 147785
2012-01-09 18:58:16 +00:00
Benjamin Kramer f7fe24f40a Move assert to the right place.
llvm-svn: 147779
2012-01-09 17:36:29 +00:00
Benjamin Kramer f9d0cc0160 InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests.
This subsumes several other transforms while enabling us to catch more cases.

llvm-svn: 147777
2012-01-09 17:23:27 +00:00
Benjamin Kramer 6609f741b9 Tweak my last commit to be less conservative about uses.
We still save an instruction when just the "and" part is replaced.
Also change the code to match comments more closely.

llvm-svn: 147753
2012-01-08 21:12:51 +00:00
Benjamin Kramer da37e15345 InstCombine: If we have a bit test and a sign test anded/ored together, merge the sign bit into the bit test.
This is common in bit field code, e.g. checking if the first or the last bit of a bit field is set.

llvm-svn: 147749
2012-01-08 18:32:24 +00:00
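The merge described in da37e15345 can be illustrated outside of LLVM. The following is a minimal C++ sketch, not the InstCombine code itself; the function names and the 32-bit width are assumptions chosen for illustration. It shows that a bit test ORed with a sign test gives the same answer as a single bit test whose mask also includes the sign bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; a 32-bit value is assumed for illustration.
constexpr uint32_t kSignBit = 0x80000000u;

// Original shape: a bit test ORed with a sign test (two compares).
constexpr bool twoTests(int32_t x, uint32_t mask) {
  return (static_cast<uint32_t>(x) & mask) != 0 || x < 0;
}

// Merged shape: the sign bit is folded into the mask (one compare).
constexpr bool mergedTest(int32_t x, uint32_t mask) {
  return (static_cast<uint32_t>(x) & (mask | kSignBit)) != 0;
}

// Spot checks: masked bit set, nothing set, only the sign bit set.
void checkMergedSignTest() {
  assert(twoTests(5, 4u) == mergedTest(5, 4u));
  assert(twoTests(2, 4u) == mergedTest(2, 4u));
  assert(twoTests(-8, 1u) == mergedTest(-8, 1u));
}
```

The ANDed variant follows by negating both sides: "no masked bit set and not negative" becomes a single compare of the widened mask against zero.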
Andrew Trick 06f6c05d08 Enable redundant phi elimination after LSR.
This will be more important as we extend the LSR pass in ways that don't rely on the formula solver. In particular, we need it for constructing IV chains.

llvm-svn: 147724
2012-01-07 07:08:17 +00:00
Andrew Trick 732ad80dbb LSR: Don't optimize loops if an outer loop has no preheader.
LoopSimplify may not run on some outer loops, e.g. because of indirect
branches. SCEVExpander simply cannot handle outer loops with no preheaders.
Fixes rdar://10655343 SCEVExpander segfault.

llvm-svn: 147718
2012-01-07 03:16:50 +00:00
Andrew Trick 2ec61a896b LSR: run DeleteDeadPhis before replaceCongruentPhis.
llvm-svn: 147711
2012-01-07 01:36:44 +00:00
Andrew Trick 5adedf5d47 Extended replaceCongruentPhis to handle mixed phi types.
llvm-svn: 147707
2012-01-07 01:12:09 +00:00
Kostya Serebryany 3411f2ea68 [asan] cleanup: remove the SIGILL-related code (compiler part)
llvm-svn: 147667
2012-01-06 18:09:21 +00:00
Dan Gohman 5ab9c0a927 Fix SpeculativelyExecuteBB to either speculate all or none of the phis
present in the bottom of the CFG triangle, as the transformation isn't
ever valuable if the branch can't be eliminated.

Also, unify some heuristics between SimplifyCFG's multiple
if-converters, for consistency.

This fixes rdar://10627242.

llvm-svn: 147630
2012-01-05 23:58:56 +00:00
Eli Friedman 55fa49f32d PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation.
llvm-svn: 147625
2012-01-05 23:03:32 +00:00
Dan Gohman 5267211899 Revert r56315. When the instruction to speculate is a load, this
code can incorrectly move the load across a store. This never
happens in practice today, but only because the current
heuristics accidentally preclude it.

llvm-svn: 147623
2012-01-05 22:54:35 +00:00
Nick Lewycky f740db31e2 SCCCaptured is trivially false on entry to this loop and not modified inside it.
Eliminate the dead test for it on each loop iteration. No functionality change.

llvm-svn: 147616
2012-01-05 22:21:45 +00:00
Nick Lewycky 6d1d4bb6a1 Remove pointless asserts.
llvm-svn: 147529
2012-01-04 09:42:30 +00:00
Nick Lewycky 0c48afa0ed Teach instcombine all sorts of great stuff about shifts that have exact, nuw or
nsw bits on them.

llvm-svn: 147528
2012-01-04 09:28:29 +00:00
Nick Lewycky b59008c694 Make use of the exact bit when optimizing '(X >>exact 3) << 1' to eliminate the
'and' that would zero out the trailing bits, and to produce an exact shift
ourselves.

llvm-svn: 147391
2011-12-31 21:30:22 +00:00
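To see why the exact bit in b59008c694 removes the need for the 'and', here is a small C++ sketch. It is a model on unsigned 32-bit integers with hypothetical function names, not the InstCombine implementation. An exact right shift by 3 promises that the low three bits of X are zero, so the masking step becomes a no-op.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modeling the three forms on unsigned 32-bit values.
constexpr uint32_t viaTwoShifts(uint32_t x) { return (x >> 3) << 1; }
// General fold: one shift plus an 'and' that clears the trailing bit.
constexpr uint32_t generalFold(uint32_t x) { return (x >> 2) & ~1u; }
// Fold that is valid only when the low three bits of x are known zero.
constexpr uint32_t exactFold(uint32_t x) { return x >> 2; }

void checkExactShiftFold() {
  // 40 = 0b101000: low three bits are zero, so the shift by 3 is exact.
  assert(viaTwoShifts(40u) == exactFold(40u));
  // 44 = 0b101100: not exact, so only the masked form is equivalent.
  assert(viaTwoShifts(44u) == generalFold(44u));
  assert(viaTwoShifts(44u) != exactFold(44u));
}
```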
Nick Lewycky 4c378a4453 Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.

Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.

llvm-svn: 147327
2011-12-28 23:24:21 +00:00
Nick Lewycky 8640fdf0b7 Demystify this comment.
llvm-svn: 147307
2011-12-28 06:57:32 +00:00
Nick Lewycky 398255e70c Use false not zero, as a bool.
llvm-svn: 147292
2011-12-27 18:27:22 +00:00
Nick Lewycky a8e84fb56b Turn cos(-x) into cos(x). Patch by Alexander Malyshev!
llvm-svn: 147291
2011-12-27 18:25:50 +00:00
Nick Lewycky c554a9b58e Teach simplifycfg to recompute branch weights when merging some branches, and
to discard weights when appropriate. Still more to do (and a new TODO), but
it's a start!

llvm-svn: 147286
2011-12-27 04:31:52 +00:00
Rafael Espindola 2b14b80b60 Fix warning.
llvm-svn: 147284
2011-12-26 23:12:42 +00:00
Nick Lewycky 8d302df4a4 Update the branch weight metadata when reversing the order of a branch.
llvm-svn: 147280
2011-12-26 20:54:14 +00:00
Nick Lewycky e87d54c817 Sort includes, canonicalize whitespace, fix typos. No functionality change.
llvm-svn: 147279
2011-12-26 20:37:40 +00:00
Benjamin Kramer b16bd77bd2 InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add.
This was intended to undo the sub canonicalization in cases where it's not profitable, but it also
finds some cases on its own.

llvm-svn: 147256
2011-12-24 17:31:53 +00:00
Benjamin Kramer 010337c838 InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n.
This has the obvious advantage of being commutable and is always a win on x86 because
const - x wastes a register there. On less weird architectures this may lead to
a regression because other arithmetic doesn't fuse with it anymore. I'll address that
problem in a followup.

llvm-svn: 147254
2011-12-24 17:31:38 +00:00
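The identity behind 010337c838 (and undone selectively by b16bd77bd2) can be checked with a short C++ sketch. The names below are hypothetical and the 32-bit width is an assumption. When x is smaller than 2^n, subtracting it from an all-ones mask of n bits never borrows, so the subtraction and the xor flip exactly the same bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; n must be less than 32 in this sketch.
constexpr uint32_t lowMask(unsigned n) { return (1u << n) - 1u; }
constexpr uint32_t subForm(uint32_t x, unsigned n) { return lowMask(n) - x; }
constexpr uint32_t xorForm(uint32_t x, unsigned n) { return lowMask(n) ^ x; }

// Exhaustively compares both forms for every x below 2^n.
bool agreesBelowPowerOfTwo(unsigned n) {
  for (uint32_t x = 0; x <= lowMask(n); ++x)
    if (subForm(x, n) != xorForm(x, n))
      return false;
  return true;
}

void checkSubXorIdentity() {
  assert(agreesBelowPowerOfTwo(8));
  // Outside the precondition (x >= 2^n) the two forms differ.
  assert(subForm(256u, 8) != xorForm(256u, 8));
}
```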
Nick Lewycky d9d1de4f69 Fix typo "infinte".
llvm-svn: 147226
2011-12-23 23:49:25 +00:00
Mon P Wang 5d44a4332a When not destroying the source, the linker is not remapping the types. Added support
to CloneFunctionInto to allow remapping for this case.

llvm-svn: 147217
2011-12-23 02:18:32 +00:00
Chad Rosier 3ba90a1655 Add the actual code for r147175.
llvm-svn: 147176
2011-12-22 21:10:46 +00:00
Chad Rosier 1b7e2baf47 Speculatively revert r146578 to determine if it is the cause of a number of
performance regressions (both execution-time and compile-time) on our
nightly testers.

Original commit message:
Fix for bug #11429: Wrong behaviour for switches. Small improvement for code
size heuristics.

llvm-svn: 147131
2011-12-22 02:40:57 +00:00
Dan Gohman 51c81685a8 Fix a copy+pasto. No testcase, because the symptoms of dereferencing
an invalid iterator aren't reproducible.  rdar://10614085.

llvm-svn: 147098
2011-12-21 21:43:50 +00:00
Nick Lewycky b4039f633c Make some intrinsics safe to speculatively execute.
llvm-svn: 147036
2011-12-21 05:52:02 +00:00
David Blaikie a379b18173 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Jakub Staszak 1b1d523d9e - Use getExitingBlock instead of getExitingBlocks.
- Remove trailing spaces.

llvm-svn: 146854
2011-12-18 21:52:30 +00:00
Kevin Enderby 8b3deabd2d Revert r146822 at Pete Cooper's request as it broke clang self hosting.
Hope I did this correctly :)

llvm-svn: 146834
2011-12-17 19:48:52 +00:00
Pete Cooper eadf124d2b SimplifyCFG now predicts some conditional branches to true or false depending on previous branch on same comparison operands.
For example, 

if (a == b) {
    if (a > b) // this is false
    
Fixes some of the issues on <rdar://problem/10554090>

llvm-svn: 146822
2011-12-17 06:32:38 +00:00
Pete Cooper ebf98c1304 Refactor code used in InstCombine::FoldAndOfICmps to new file.
This will be used by SimplifyCfg in a later commit.

llvm-svn: 146803
2011-12-17 01:20:32 +00:00
Dan Gohman 518cda42b9 The powers that be have decided that LLVM IR should now support 16-bit
"half precision" floating-point with a first-class type.

This patch adds basic IR support (but not codegen support).

llvm-svn: 146786
2011-12-17 00:04:22 +00:00
Andrew Trick ca3417e932 Avoid a confusing assert for silly options: -unroll-runtime -unroll-count=1.
No need for an explicit test case for an unsupported combination of options.

llvm-svn: 146721
2011-12-16 02:03:48 +00:00
Kostya Serebryany 7a9eb49a47 [asan] add the name of the module to the description of a global variable. This improves the readability of global-buffer-overflow reports.
llvm-svn: 146698
2011-12-15 22:55:55 +00:00
Kostya Serebryany cd1aba8b4d [asan] fix a bug (issue 19) where dlclose and the following mmap caused a false positive. compiler part.
llvm-svn: 146688
2011-12-15 21:59:03 +00:00
Pete Cooper b33c297f14 Added InstCombine for "select cond, ~cond, x" type patterns
These can be reduced to "~cond & x" or "~cond | x"

llvm-svn: 146624
2011-12-15 00:56:45 +00:00
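The reductions named in b33c297f14 can be verified over all boolean inputs. This is a C++ sketch with hypothetical function names. The commit only says the patterns reduce to "~cond & x" or "~cond | x"; pairing the second form with the operand order "select cond, x, ~cond" is an assumption made here for illustration.

```cpp
#include <cassert>

// select cond, ~cond, x  (hypothetical model on bool)
constexpr bool selectNotCondThenX(bool c, bool x) { return c ? !c : x; }
constexpr bool asAnd(bool c, bool x) { return !c && x; }

// select cond, x, ~cond  (operand order assumed for the 'or' form)
constexpr bool selectXThenNotCond(bool c, bool x) { return c ? x : !c; }
constexpr bool asOr(bool c, bool x) { return !c || x; }

// Checks both reductions for all four input combinations.
bool foldsHoldForAllInputs() {
  for (int c = 0; c < 2; ++c)
    for (int x = 0; x < 2; ++x) {
      if (selectNotCondThenX(c != 0, x != 0) != asAnd(c != 0, x != 0))
        return false;
      if (selectXThenNotCond(c != 0, x != 0) != asOr(c != 0, x != 0))
        return false;
    }
  return true;
}

void checkSelectFolds() { assert(foldsHoldForAllInputs()); }
```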
Eli Friedman 16ad2905a3 Make loop preheader insertion in LoopSimplify handle the case where the loop header is a landing pad correctly (by splitting the landingpad out of the loop header). Make some adjustments to the rest of LoopSimplify to make it clear that the rest of LoopSimplify isn't making bad assumptions about the presence of landing pads. PR11575.
llvm-svn: 146621
2011-12-15 00:50:34 +00:00
Dan Gohman 75d7d5e988 Move Instruction::isSafeToSpeculativelyExecute out of VMCore and
into Analysis as a standalone function, since there's no need for
it to be in VMCore. Also, update it to use isKnownNonZero and
other goodies available in Analysis, making it more precise,
enabling more aggressive optimization.

llvm-svn: 146610
2011-12-14 23:49:11 +00:00
Stepan Dyatkovskiy d7b2bb3bdd Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics.
llvm-svn: 146578
2011-12-14 19:19:17 +00:00
Dan Gohman bd944b4153 It turns out that clang does use pointer-to-function types to
point to ARC-managed pointers sometimes. This fixes rdar://10551239.

llvm-svn: 146577
2011-12-14 19:10:53 +00:00
Kostya Serebryany ac6ae7302d [asan] remove .preinit_array from the compiler module (it breaks .so builds). This should be done in the run-time.
llvm-svn: 146527
2011-12-14 00:01:51 +00:00
Kostya Serebryany 21dc2be97a [asan] report an error if blacklist file contains a malformed regex. fixes asan issue 17
llvm-svn: 146503
2011-12-13 19:34:53 +00:00
Andrew Trick dc18e383b7 Cleanup. Clarify LSRInstance public methods.
llvm-svn: 146459
2011-12-13 00:55:33 +00:00
Andrew Trick dbe2bdf9e7 Indvars: guard against exponential behavior in isHighCostExpansion.
This should always be done as a matter of principle. I don't have a
case that exposes the problem. I just noticed this recently while
scanning the code and realized I meant to fix it long ago.

llvm-svn: 146438
2011-12-12 22:46:16 +00:00
Daniel Dunbar 8889bb08b8 LLVMBuild: Introduce a common section which currently has a list of the
subdirectories to traverse into.
 - Originally I wanted to avoid this and just autoscan, but this has one key
   flaw in that new subdirectories can not automatically trigger a rerun of the
   llvm-build tool. This is particularly a pain when switching back and forth
   between trees where one has added a subdirectory, as the dependencies will
   tend to be wrong. This will also eliminate a FIXME implicitly.

llvm-svn: 146436
2011-12-12 22:45:54 +00:00
Joerg Sonnenberger 45c4164166 Only replace fwrite with fputc, if the return value is unused.
llvm-svn: 146411
2011-12-12 20:18:31 +00:00
Daniel Dunbar 27a7489a03 LLVMBuild: Remove trailing newline, which irked me.
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Dan Gohman a53a12ce03 When computing reverse-CFG reverse-post-order, skip backedges, as
detected in the forward-CFG DFS. This prevents the reverse-CFG from
visiting blocks inside loops after blocks that dominate them in the
case where loops have multiple exits.

No testcase, because this fixes a bug which in practice only shows
up in a full optimizer run, due to the use-list order.

This fixes rdar://10422791 and others.

llvm-svn: 146408
2011-12-12 19:42:25 +00:00
Dan Gohman 766a54bde5 Add a TODO comment.
llvm-svn: 146389
2011-12-12 18:30:26 +00:00
Dan Gohman 20db059d06 Fix a copy+pasto in a comment.
llvm-svn: 146385
2011-12-12 18:20:00 +00:00
Dan Gohman 09b272bb2b Use getArgOperand instead of getOperand on a call.
llvm-svn: 146384
2011-12-12 18:19:12 +00:00
Dan Gohman 843044b75b Inline SetSeqToRelease into its only caller, since it's more clear that way.
llvm-svn: 146383
2011-12-12 18:16:56 +00:00
Dan Gohman 0444370645 Fix omitted break statements in a switch.
llvm-svn: 146380
2011-12-12 18:13:53 +00:00
Kostya Serebryany acb42b5919 [asan] use .preinit_array only on linux
llvm-svn: 146379
2011-12-12 18:01:46 +00:00
Chandler Carruth 58a71ed339 Switch llvm.cttz and llvm.ctlz to accept a second i1 parameter which
indicates whether the intrinsic has a defined result for a first
argument equal to zero. This will eventually allow these intrinsics to
accurately model the semantics of GCC's __builtin_ctz and __builtin_clz
and the X86 instructions (prior to AVX) which implement them.

This patch merely sets the stage by extending the signature of these
intrinsics and establishing auto-upgrade logic so that the old spelling
still works both in IR and in bitcode. The upgrade logic preserves the
existing (inefficient) semantics. This patch should not change any
behavior. CodeGen isn't updated because it can use the existing
semantics regardless of the flag's value.

Note that this will be followed by API updates to Clang and DragonEgg.

Reviewed by Nick Lewycky!

llvm-svn: 146357
2011-12-12 04:26:04 +00:00
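The two behaviors selected by the new i1 parameter in 58a71ed339 can be modeled in portable C++. This is only a sketch with hypothetical function names, and it assumes a 32-bit operand; it is not how LLVM lowers the intrinsics. One function keeps the existing semantics, where a zero input yields the bit width. The other treats a zero input as a broken precondition, which is the case that matches __builtin_ctz.

```cpp
#include <cassert>
#include <cstdint>

// Flag set: the caller promises a nonzero input (hypothetical model).
unsigned cttzNonZero(uint32_t x) {
  assert(x != 0 && "result is not defined for zero in this mode");
  unsigned n = 0;
  while ((x & 1u) == 0) {  // count zero bits from the least significant end
    x >>= 1;
    ++n;
  }
  return n;
}

// Flag clear: zero is a defined input and yields the bit width (32 here).
unsigned cttzDefinedAtZero(uint32_t x) {
  return x == 0 ? 32u : cttzNonZero(x);
}
```

The extra comparison against zero in the second function is the cost that the flag lets a front end avoid when the input is known to be nonzero.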
Andrew Trick e8b4f409b2 LSR: ignore strides in outer loops.
Since we're not rewriting IVs in other loops, there's not much reason
to consider their stride when generating formulae.
This should reduce the number of useless formulas considered by LSR.

llvm-svn: 146302
2011-12-10 00:25:00 +00:00
Kostya Serebryany 3563f8cd41 [asan] call __asan_init from .preinit_array. This simplifies __asan_init vs malloc chicken-and-egg situation on Android and probably on other flavours of Linux. Patch by eugenis@google.com.
llvm-svn: 146284
2011-12-09 22:09:32 +00:00
Jakub Staszak f5b32e52db SplitBlockPredecessors uses ArrayRef instead of Data and Size.
llvm-svn: 146277
2011-12-09 21:19:53 +00:00
Andrew Trick d04d152998 Add -unroll-runtime for unrolling loops with run-time trip counts.
Patch by Brendon Cahoon!

This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.

llvm-svn: 146245
2011-12-09 06:19:40 +00:00
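The shape of the code that d04d152998 describes can be sketched by hand in C++. This is an illustration only: the function names are hypothetical, the unroll factor is 4 rather than the default of 8 to keep it short, and the real pass works on LLVM IR rather than source. A chain of conditionals runs the leftover iterations first, after which the remaining trip count is a multiple of the unroll factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference loop with a run-time trip count.
long sumRolled(const int *a, std::size_t n) {
  long s = 0;
  for (std::size_t i = 0; i < n; ++i) s += a[i];
  return s;
}

// Hand-unrolled by 4, with the extra iterations executed before the loop.
long sumUnrolledBy4(const int *a, std::size_t n) {
  long s = 0;
  std::size_t i = 0;
  const std::size_t extra = n % 4;  // power-of-two factor, so this is n & 3
  if (extra >= 1) s += a[i++];
  if (extra >= 2) s += a[i++];
  if (extra >= 3) s += a[i++];
  assert((n - i) % 4 == 0);  // the unrolled loop needs no remainder check
  for (; i < n; i += 4)
    s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  return s;
}

// Compares both versions on arrays 1..n for every length up to maxLen.
bool agreeOnLengthsUpTo(std::size_t maxLen) {
  std::vector<int> data;
  for (std::size_t n = 0; n <= maxLen; ++n) {
    if (sumRolled(data.data(), n) != sumUnrolledBy4(data.data(), n))
      return false;
    data.push_back(static_cast<int>(n) + 1);
  }
  return true;
}
```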
Nick Lewycky fe970725cc Fix infinite loop in DSE when deleting a free in a reachable loop that's also
trivially infinite.

llvm-svn: 146197
2011-12-08 22:36:35 +00:00
Duncan Sands 8fa0b6927d Remove unused include.
llvm-svn: 146037
2011-12-07 17:18:31 +00:00
Benjamin Kramer b5188f163a Simplify common predecessor finding.
- Walking over pred_begin/pred_end is an expensive operation.
- PHINodes contain a value for each predecessor anyway.
- While it may look like we used to save a few iterations with the set,
  be aware that getIncomingValueForBlock does a linear search on
  the values of the phi node.
- Another -5% on ARMDisassembler.cpp (Release build). This was the last
  entry in the profile that was obviously wasting time.

llvm-svn: 145937
2011-12-06 16:14:29 +00:00
Benjamin Kramer b3bd019cd7 Push StringRefs through the metadata interface.
llvm-svn: 145934
2011-12-06 11:50:26 +00:00
Andrew Trick 5df9096584 LSR: prune undesirable formulae early.
It's always good to prune early, but formulae that are unsatisfactory
in their own right need to be removed before running any other pruning
heuristics. We easily avoid generating such formulae, but we need them
as an intermediate basis for forming other good formulae.

llvm-svn: 145906
2011-12-06 03:13:31 +00:00
Nick Lewycky 72d4d32cd6 Expose a switch for the new gcov format.
llvm-svn: 145880
2011-12-06 00:29:13 +00:00
Chad Rosier 3277557741 Update comment.
llvm-svn: 145866
2011-12-05 22:53:09 +00:00
Chad Rosier 19446a07a7 Make the MemCpyOptimizer a bit more aggressive. I can't think of a scenario
where this would be bad as the backend shouldn't have a problem inlining small
memcpys.
rdar://10510150

llvm-svn: 145865
2011-12-05 22:37:00 +00:00
Benjamin Kramer 13231037f0 Add a little heuristic to Value::isUsedInBasicBlock to speed it up for small basic blocks.
- Calling getUser in a loop is much more expensive than iterating over a few instructions.
- Use it instead of the open-coded loop in AddrModeMatcher.
- 5% speedup on ARMDisassembler.cpp Release builds.

llvm-svn: 145810
2011-12-05 17:23:27 +00:00
Nadav Rotem 3924cb0267 Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Pete Cooper e03fe83d98 Fixed deadstoreelimination bug where negative indices were incorrectly causing the optimisation to occur
Turns out long long + unsigned long long is unsigned.  Doh!

Fixes http://llvm.org/bugs/show_bug.cgi?id=11455

llvm-svn: 145731
2011-12-03 00:04:30 +00:00
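The language rule behind e03fe83d98 is easy to reproduce. The sketch below uses hypothetical function names and does not show the actual DeadStoreElimination code. Under the usual arithmetic conversions, adding a long long to an unsigned long long converts the signed operand to unsigned, so a negative offset turns into a very large positive value.

```cpp
#include <cassert>
#include <type_traits>

// The mixed sum has unsigned type.
static_assert(
    std::is_same<decltype(1LL + 1ULL), unsigned long long>::value,
    "long long + unsigned long long is unsigned long long");

// Mixed arithmetic: offset is converted to unsigned, so -8 + 4 wraps around.
constexpr unsigned long long endMixed(long long offset,
                                      unsigned long long size) {
  return offset + size;
}

// Signed arithmetic: convert the size first, then add.
constexpr long long endSigned(long long offset, unsigned long long size) {
  return offset + static_cast<long long>(size);
}

void checkMixedSignArithmetic() {
  assert(endSigned(-8, 4) == -4);
  assert(endMixed(-8, 4) == ~0ULL - 3);  // wrapped to a huge positive value
}
```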
Benjamin Kramer 4d2b871cda Fix quadratic behavior in InlineFunction by fetching the personality function of the callee once and not for every invoke in the caller.
The callee is usually smaller than the caller, too. This reduces the compile
time of ARMDisassembler.cpp by 32% (Release build). It still takes ages to
compile though.

llvm-svn: 145690
2011-12-02 18:37:31 +00:00
Chad Rosier 43a33066b4 Fix a few more places where TargetData/TargetLibraryInfo is not being passed.
Add FIXMEs to places that are non-trivial to fix.

llvm-svn: 145661
2011-12-02 01:26:24 +00:00
Chad Rosier e6de63dfc5 Last bit of TargetLibraryInfo propagation. Also fixed a case for TargetData
where it appeared beneficial to pass.
More of rdar://10500969

llvm-svn: 145630
2011-12-01 21:29:16 +00:00
Pete Cooper fdddc27143 Improved fix for abs(val) != 0 to check other similar case. Also fixed style issues and confusing comment
llvm-svn: 145618
2011-12-01 19:13:26 +00:00
Kostya Serebryany d594bac68b [asan] two minor fixes: use UnreachableInst after the neverreturn function call; use report_fatal_error when blacklist file can not be found
llvm-svn: 145611
2011-12-01 18:54:53 +00:00
Pete Cooper bc5c524b71 Added instcombine pattern to spot comparing -val or val against 0.
(val != 0) == (-val != 0) so "abs(val) != 0" becomes "val != 0"

Fixes <rdar://problem/10482509>

llvm-svn: 145563
2011-12-01 03:58:40 +00:00
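The equivalence quoted in bc5c524b71 can be checked directly. This C++ sketch uses hypothetical names and performs the negation on unsigned values so that the most negative input does not overflow; it is a model of the reasoning, not the InstCombine pattern matcher. Since negation maps zero to zero and nothing else to zero, comparing val or -val against zero gives the same result, and abs(val) selects one of those two.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation on unsigned bits (well defined for all inputs).
constexpr uint32_t negateBits(uint32_t v) { return 0u - v; }

constexpr bool isNonZero(int32_t v) { return v != 0; }
constexpr bool negIsNonZero(int32_t v) {
  return negateBits(static_cast<uint32_t>(v)) != 0;
}

void checkNegatedCompare() {
  assert(isNonZero(0) == negIsNonZero(0));
  assert(isNonZero(7) == negIsNonZero(7));
  assert(isNonZero(-7) == negIsNonZero(-7));
  assert(isNonZero(INT32_MIN) == negIsNonZero(INT32_MIN));
}
```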
Chad Rosier c24b86ffbe Propagate TargetLibraryInfo throughout ConstantFolding.cpp and
InstructionSimplify.cpp.  Other fixups as needed.
Part of rdar://10500969

llvm-svn: 145559
2011-12-01 03:08:23 +00:00
Kostya Serebryany dc436f95d2 make asan work at -O0, llvm part. Patch by glider@google.com
llvm-svn: 145530
2011-11-30 22:19:26 +00:00
Eli Friedman 6cff9df298 Make GlobalMerge honor the preferred alignment on globals without an explicitly specified alignment.
<rdar://problem/10497732>.

llvm-svn: 145523
2011-11-30 21:54:15 +00:00
Chad Rosier 385d9f6c24 Whitespace.
llvm-svn: 145470
2011-11-30 01:59:59 +00:00
Chad Rosier 82e1bd8e94 Add support for sqrt, sqrtl, and sqrtf in TargetLibraryInfo. Disable
(fptrunc (sqrt (fpext x))) -> (sqrtf x) transformation if -fno-builtin is 
specified.
rdar://10466410

llvm-svn: 145460
2011-11-29 23:57:10 +00:00
Stepan Dyatkovskiy 31798ef3c0 Potential bug in RewriteLoopBodyWithConditionConstant: use iterator should not be changed inside the uses enumeration loop.
llvm-svn: 145432
2011-11-29 20:34:39 +00:00
Daniel Dunbar 539d0a8a09 build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Duncan Sands ca6f8ddbf8 Fix a theoretical problem (not seen in the wild): if different instances of a
weak variable are compiled by different compilers, such as GCC and LLVM, while
LLVM may increase the alignment to the preferred alignment there is no reason to
think that GCC will use anything more than the ABI alignment.  Since it is the
GCC version that might end up in the final program (as the linkage is weak), it
is wrong to increase the alignment of loads from the global up to the preferred
alignment as the alignment might only be the ABI alignment.

Increasing alignment up to the ABI alignment might be OK, but I'm not totally
convinced that it is.  It seems better to just leave the alignment of weak
globals alone.

llvm-svn: 145413
2011-11-29 18:26:38 +00:00
Andrew Trick d25089f8e0 SCEV fix. In general, Add/Mul expressions should not inherit NSW/NUW.
This reverts r139450, fixes r139453, and adds much needed comments and a
unit test.

llvm-svn: 145367
2011-11-29 02:16:38 +00:00
Eli Friedman 7534b46884 Zap some completely ridiculous code. There's probably a miscompile here, but I don't really want to try to write a testcase involving an invoke returning a pointer to a varargs function...
llvm-svn: 145347
2011-11-29 01:18:23 +00:00
Eli Friedman b3f9b0676a Add a missing safety check to ProcessUGT_ADDCST_ADD. Fixes PR11438.
llvm-svn: 145316
2011-11-28 23:32:19 +00:00
Andrew Trick a8bdb7cbf1 Remove the temporary flag -disable-unroll-scev and dead code.
SCEV should now be used for trip count analysis, not LoopInfo.

llvm-svn: 145262
2011-11-28 19:22:09 +00:00
Nick Lewycky 6404d97a99 Place the "cfg checksum" around a test. This was recently added in April 2011 to
gcc, though I thought it was older (my gcc 4.4 has it as a local patch. Whoops!)
This fixes PR10589.

Also add some debugging statements.

Remove GcnoFiles, the mapping from CompilationUnit to raw_ostream. Now that we
start by iterating over each CU and descending into them, there's no need to
maintain a mapping.

llvm-svn: 145208
2011-11-27 23:22:20 +00:00
Benjamin Kramer 7ba71be392 Move code into anonymous namespaces.
llvm-svn: 145154
2011-11-26 23:01:57 +00:00
Kostya Serebryany 8b5c7a56a3 [asan] do not instrument threadlocal globals, this is buggy
llvm-svn: 145092
2011-11-23 02:10:54 +00:00
Nick Lewycky 612d70b19d Refactor code to use new attribute getters on CallSite for NoCapture and ByVal.
Suggested in code review by Eli.

That code in InstCombine looks kinda suspicious.

llvm-svn: 145013
2011-11-20 19:09:04 +00:00
Kostya Serebryany 1cdc6e9567 [asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler
llvm-svn: 144962
2011-11-18 01:41:06 +00:00
Kostya Serebryany a6edf4c21f quick fix: remove GlobalVariable::GlobalVariable mistakenly committed at r144933. For some reason this compiles on linux
llvm-svn: 144936
2011-11-17 23:37:53 +00:00
Andrew Trick 949045864d Fix an overly general check in SimplifyIndvar to handle useless phi cycles.
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.

Fixes PR11350: runaway iteration assertion.

llvm-svn: 144935
2011-11-17 23:36:35 +00:00
Kostya Serebryany 65e2211b95 fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr
llvm-svn: 144933
2011-11-17 23:14:59 +00:00
Eli Friedman 489c0ff4a4 Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.

Fixes <rdar://problem/9815881>.

llvm-svn: 144876
2011-11-17 01:27:36 +00:00
Nick Lewycky c7f1e7993c Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when
looking at the size of the pointee. Fixes PR11390!

llvm-svn: 144773
2011-11-16 03:49:48 +00:00
Kostya Serebryany 6e6b03ec46 AddressSanitizer, first commit (compiler module only)
llvm-svn: 144758
2011-11-16 01:35:23 +00:00
Kostya Serebryany db999c01f2 test commit to verify that commit access works (added blank line)
llvm-svn: 144748
2011-11-16 01:14:38 +00:00
Nadav Rotem 51f71054b6 Fix MSVC warnings by adding a cast.
llvm-svn: 144721
2011-11-15 22:54:21 +00:00
Benjamin Kramer b106bcc536 StringRefize and simplify.
llvm-svn: 144675
2011-11-15 19:12:09 +00:00
Benjamin Kramer 1f97a5a671 Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer d00e94e882 Make headers standalone, move a virtual method out of line.
llvm-svn: 144536
2011-11-14 17:22:45 +00:00
Daniel Dunbar 52823cc91c build: Attempt to rectify inconsistencies between CMake and LLVMBuild versions of explicit dependencies.
- The hope is that we have a tool/test to verify these are accurate (and tight) soon.

llvm-svn: 144444
2011-11-12 02:10:57 +00:00
Eli Friedman ecb453805d Make sure scalarrepl picks the correct alloca when it rewrites a bitcast. Fixes PR11353.
llvm-svn: 144442
2011-11-12 02:07:50 +00:00
Daniel Dunbar 2f39f72703 LLVMBuild: Alphabetize required_libraries lists.
llvm-svn: 144416
2011-11-11 22:59:23 +00:00
Eli Friedman 0a309292c4 Get rid of an optimization in SCCP which appears to have many issues. Specifically, it doesn't handle many cases involving undef correctly, and it is missing other checks which
lead to it trying to re-mark a value marked as a constant with a different value.  It also appears to trigger very rarely.

Fixes PR11357.

llvm-svn: 144352
2011-11-11 01:16:15 +00:00
Pete Cooper a4237c380e Fixed bug in DeadStoreElimination commit r144239
Size of data being pointed to wasn't always being checked so some small writes were killing big writes

Fixes <rdar://problem/10426753>

llvm-svn: 144312
2011-11-10 20:22:08 +00:00
Pete Cooper 856977cb15 DeadStoreElimination can now trim the size of a store if the end of the store is dead.
Currently checks alignment and killing stores on a power of 2 boundary as this is likely
to trim the size of the earlier store without breaking large vector stores into scalar ones.

Fixes <rdar://problem/10140300>

llvm-svn: 144239
2011-11-09 23:07:35 +00:00
Pete Cooper 9ee220915b LICM pass now understands invariant load metadata. Nothing generates this yet so it will currently never get used in real tests
llvm-svn: 144107
2011-11-08 19:30:00 +00:00
Pete Cooper 7a4be01ac8 InstCombine now optimizes vector udiv by power of 2 to shifts
Fixes r8429

llvm-svn: 144036
2011-11-07 23:04:49 +00:00
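The arithmetic behind 7a4be01ac8 is the same per lane as for scalars. The C++ sketch below models a four-lane vector with std::array; the type alias, the sample data, and the function names are assumptions for illustration, and the actual transform operates on LLVM vector types. For unsigned values, dividing by 2^k and shifting right by k produce the same result in every lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;  // hypothetical four-lane vector

const Vec4 kSample = {{7u, 8u, 1000u, 0xFFFFFFFFu}};

// Lane-wise unsigned divide by 2^k (k must be less than 32).
Vec4 udivByPow2(const Vec4 &v, unsigned k) {
  Vec4 r{};
  for (std::size_t i = 0; i < v.size(); ++i) r[i] = v[i] / (1u << k);
  return r;
}

// Lane-wise logical shift right by k.
Vec4 lshrBy(const Vec4 &v, unsigned k) {
  Vec4 r{};
  for (std::size_t i = 0; i < v.size(); ++i) r[i] = v[i] >> k;
  return r;
}

bool sameResult(const Vec4 &v, unsigned k) {
  return udivByPow2(v, k) == lshrBy(v, k);
}

void checkVectorUdivFold() { assert(sameResult(kSample, 3)); }
```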
Bill Wendling 7496461f44 Make sure we don't insert instructions before a landingpad instruction.
<rdar://problem/10405911>

llvm-svn: 144000
2011-11-07 19:38:34 +00:00
Nick Lewycky f2905afe62 Do simple cross-block DSE when we encounter a free statement. Fixes PR11240.
llvm-svn: 143808
2011-11-05 10:48:42 +00:00
Daniel Dunbar e6d40de414 Speculatively revert "DeadStoreElimination can now trim the size of a store if
the end of it is dead.", which appears to break bootstrapping LLVM.

llvm-svn: 143668
2011-11-04 00:48:26 +00:00
Daniel Dunbar bf9bba47a1 build: Add initial cut at LLVMBuild.txt files.
llvm-svn: 143634
2011-11-03 18:53:17 +00:00
Pete Cooper 8a95aedb5d DeadStoreElimination can now trim the size of a store if the end of it is dead.
Only currently done if the later store is writing to a power of 2 address or 
has the same alignment as the earlier store as then it's likely to not break up
large stores into smaller ones

Fixes <rdar://problem/10140300>

llvm-svn: 143630
2011-11-03 18:01:56 +00:00
Andrew Trick c2c79c90f2 Rewrite LinearFunctionTestReplace to handle pointer-type IVs.
We've been hitting asserts in this code due to the many supported
combinations of modes (iv-rewrite/no-iv-rewrite) and IV types. This
second rewrite of the code attempts to deal with these cases systematically.

llvm-svn: 143546
2011-11-02 17:19:57 +00:00
Chandler Carruth 9dba8af074 Add parentheses to disambiguate the precedence of these operations and
silence -Wparentheses.

llvm-svn: 143534
2011-11-02 05:43:44 +00:00
Andrew Trick 0dae890346 Broaden an assert to handle enable-iv-rewrite=true following r143183.
Narrowest possible fix for PR11279.

llvm-svn: 143522
2011-11-02 00:02:45 +00:00
Eli Friedman a49b828f8f Make sure we use the right insertion point when instcombine replaces a PHI with another instruction. (Specifically, don't insert an arbitrary instruction before a PHI.) Fixes PR11275.
llvm-svn: 143437
2011-11-01 04:49:29 +00:00
Devang Patel f4af8c65aa Add utility to append a function to the list of global constructors.
Patch by Kostya Serebryany.

llvm-svn: 143405
2011-10-31 23:58:51 +00:00
Benjamin Kramer 594ee77964 SimplifyLibCalls: Use IRBuilder.CreateGlobalString when creating a string for printf->puts, which correctly sets the unnamed_addr bit on the resulting GlobalVariable.
Fixes PR11264.

llvm-svn: 143289
2011-10-29 19:43:31 +00:00
Andrew Trick effdca9441 LFTR should avoid a type mismatch with null pointer IVs.
Fixes rdar://10359193 Indvar LinearFunctionTestReplace assertion

llvm-svn: 143183
2011-10-28 03:45:11 +00:00
Eli Friedman 73beaf7bbc It is not safe to sink an alloca into a stacksave/stackrestore pair, so don't do that. <rdar://problem/10352360>
llvm-svn: 143093
2011-10-27 01:33:51 +00:00
Nick Lewycky dd1d3df524 A dead malloc, a free(NULL) and a free(undef) are all trivially dead
instructions.

This doesn't introduce any optimizations we weren't doing before (except
potentially due to pass ordering issues), now passes will eliminate them sooner
as part of their own cleanups.

llvm-svn: 142787
2011-10-24 04:35:36 +00:00
Cameron Zwarich 057fbb1a10 The element insertion code in scalar replacement doesn't handle incorrect
element types, even though the element extraction code does. It is surprising
that this bug has been here for so long. Fixes <rdar://problem/10318778>.

llvm-svn: 142740
2011-10-23 07:02:10 +00:00
Nick Lewycky 32f8051d66 A non-escaping malloc in the entry block is not unlike an alloca. Do dead-store
elimination on them too.

llvm-svn: 142735
2011-10-22 21:59:35 +00:00
Eli Friedman 688db1d6d0 Remap blockaddress correctly when inlining a function. Fixes PR10162.
llvm-svn: 142684
2011-10-21 20:45:19 +00:00
Eli Friedman 303c81c773 Minor simplification: use ShuffleVectorInst::getMaskValue instead of a more expensive helper.
llvm-svn: 142672
2011-10-21 19:11:34 +00:00
Eli Friedman ce818277fc Extend instcombine's shufflevector simplification to handle more cases where the input and output vectors have different sizes. Patch by Xiaoyi Guo.
llvm-svn: 142671
2011-10-21 19:06:29 +00:00
Eli Friedman 1923a330e6 Refactor code from inlining and globalopt that checks whether a function definition is unused, and enhance it so it can tell that functions which are only used by a blockaddress are in fact dead. This probably doesn't happen much on most code, but the Linux kernel's _THIS_IP_ can trigger this issue with blockaddress. (GlobalDCE can also handle the given testcase, but we only run that at -O3.) Found while looking at PR11180.
llvm-svn: 142572
2011-10-20 05:23:42 +00:00
Devang Patel 88b4fa21c8 Initialize ScalarEvolution dependency.
Patch by Pranav Bhandarkar!

llvm-svn: 142556
2011-10-19 23:56:07 +00:00
Dan Gohman a7107f992e Teach the ARC optimizer about the !clang.arc.copy_on_escape metadata
tag on objc_retainBlock calls, which indicates that they may be
optimized away. rdar://10211286.

llvm-svn: 142298
2011-10-17 22:53:25 +00:00
Bill Wendling c68c8cb8d4 Add support for the Objective-C personality function to the instruction
combining of the landingpad instruction. The ObjC personality function acts
almost identically to the C++ personality function. In particular, it uses
"null" as a "catch-all" value.

llvm-svn: 142256
2011-10-17 21:20:24 +00:00
Dan Gohman 1736c14b85 Suppress partial retain+release elimination when there's a
possibility that it will span multiple CFG diamonds/triangles which
could have different controlling predicates.  rdar://10282956

llvm-svn: 142222
2011-10-17 18:48:25 +00:00
Bill Wendling 63a4ea1859 Correct over-zealous removal of hack.
Some code wants to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.

llvm-svn: 142221
2011-10-17 18:43:40 +00:00
Bill Wendling 2a83a71c2a Now that we have the ReturnsTwice function attribute, this method is
obsolete. Check the attribute instead.
<rdar://problem/8031714>

llvm-svn: 142212
2011-10-17 18:22:52 +00:00
Michael J. Spencer 0050f59665 Fix CMake build.
llvm-svn: 142204
2011-10-17 17:50:39 +00:00
Devang Patel 76c8563239 svn mv Target/ARM/ARMGlobalMerge.cpp Transforms/Scalar/GlobalMerge.cpp
There is no reason to have simple IR level pass in lib/Target.

llvm-svn: 142200
2011-10-17 17:17:43 +00:00
Chandler Carruth 3e8aa65bc2 Add a routine to swap branch instruction operands, and update any
profile metadata at the same time. Use it to preserve metadata attached
to a branch when re-writing it in InstCombine.

Add metadata to the canonicalize_branch InstCombine test, and check that
it is transformed correctly.

Reviewed by Nick Lewycky!

llvm-svn: 142168
2011-10-17 01:11:57 +00:00