Commit Graph

11073 Commits

Author SHA1 Message Date
Nadav Rotem 2ee35771a8 Clear the builder insert point between tree-vectorization phases.
llvm-svn: 185777
2013-07-07 14:57:18 +00:00
Nadav Rotem 2041b742d4 SLPVectorizer: Implement DCE as part of vectorization.
This is a complete re-write if the bottom-up vectorization class.
Before this commit we scanned the instruction tree 3 times. First in search of merge points for the trees. Second, for estimating the cost. And finally for vectorization.
There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design.
In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree.

llvm-svn: 185774
2013-07-07 06:57:07 +00:00
Michael Gottesman 618df456e2 [objc-arc] Remove the alias analysis part of r185764.
Upon further reflection, the alias analysis part of r185764 is not a safe
change.

llvm-svn: 185770
2013-07-07 04:18:03 +00:00
Michael Gottesman a72630d453 [objc-arc] Teach the ARC optimizer that objc_sync_enter/objc_sync_exit do not modify the ref count of an objc object and additionally are inert for modref purposes.
llvm-svn: 185769
2013-07-07 01:52:55 +00:00
Michael Gottesman e557da26db [objc-arc] When we initialize ARCRuntimeEntryPoints, make sure we reset all references to entrypoint declarations as well.
llvm-svn: 185764
2013-07-06 18:43:05 +00:00
Benjamin Kramer 3d90a8f4f9 Reassociate: Remove unnecessary default operator=.
llvm-svn: 185757
2013-07-06 15:10:13 +00:00
Michael Gottesman 4d9439c73f [objc-arc] Performed some small cleanups in ARCRuntimeEntryPoints and added an llvm_unreachable after the switch to quiet -Wreturn_type errors.
llvm-svn: 185746
2013-07-06 02:18:56 +00:00
Michael Gottesman 574d521c85 [objc-arc] Renamed Module => TheModule in ARCRuntimeEntryPoints. Also did some small cleanups.
This fixes an issue that came up due to -fpermissive on the bots.

llvm-svn: 185744
2013-07-06 01:57:32 +00:00
Michael Gottesman 01df45056e Removed trailing whitespace.
llvm-svn: 185743
2013-07-06 01:41:35 +00:00
Michael Gottesman 0b912b2673 [objc-arc] Updated ObjCARCContract to use ARCRuntimeEntryPoints.
llvm-svn: 185742
2013-07-06 01:39:26 +00:00
Michael Gottesman 14acfacc48 [objc-arc] Updated ObjCARCOpts to use ARCRuntimeEntryPoints.
llvm-svn: 185741
2013-07-06 01:39:23 +00:00
Michael Gottesman a94186a4c0 [objc-arc] Refactor runtime entrypoint declaration entrypoint creation.
This is the first patch in a series of 3 patches which clean up how we create
runtime function declarations in the ARC optimizer when they do not exist
already in the IR.

Currently we have a bunch of duplicated code in ObjCARCOpts, ObjCARCContract
that does this. This patch refactors that code into a separate class called
ARCRuntimeEntryPoints which lazily creates the declarations for said
entrypoints.

The next two patches will consist of the work of refactoring
ObjCARCContract/ObjCARCOpts to use this new code.

llvm-svn: 185740
2013-07-06 01:39:18 +00:00
Nick Lewycky cff2cf8e3a Fix annotation of unlink. Should fix builder.
llvm-svn: 185738
2013-07-06 00:59:28 +00:00
Nick Lewycky c2ec0725ce Extend 'readonly' and 'readnone' to work on function arguments as well as
functions. Make the function attributes pass add it to known library functions
and when it can deduce it.

llvm-svn: 185735
2013-07-06 00:29:58 +00:00
Rafael Espindola 155cf0f3a6 Use sys::fs::createTemporaryFile.
llvm-svn: 185719
2013-07-05 20:14:52 +00:00
Sylvestre Ledru 751447a3ac Remove a useless declarations (found by scan-build)
llvm-svn: 185709
2013-07-05 15:58:12 +00:00
David Majnemer c2a990bc00 InstCombine: (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1)
This transform allows us to turn IR that looks like:
  %1 = icmp eq i64 %b, 0
  %2 = icmp ult i64 %a, %b
  %3 = or i1 %1, %2
  ret i1 %3

into:
  %0 = add i64 %b, -1
  %1 = icmp uge i64 %0, %a
  ret i1 %1

which means we go from lowering:
        cmpq    %rsi, %rdi
        setb    %cl
        testq   %rsi, %rsi
        sete    %al
        orb     %cl, %al
        ret

to lowering:
        decq    %rsi
        cmpq    %rdi, %rsi
        setae   %al
        ret

llvm-svn: 185677
2013-07-05 00:31:17 +00:00
David Majnemer 37f8f445de InstCombine: Reimplementation of visitUDivOperand
This transform was originally added in r185257 but later removed in
r185415.  The original transform would create instructions speculatively
and then discard them if the speculation was proved incorrect.  This has
been replaced with a scheme that splits the transform into two parts:
preflight and fold.  While we preflight, we build up fold actions that
inform the folding stage on how to act.

llvm-svn: 185667
2013-07-04 21:17:49 +00:00
Benjamin Kramer 371722288c SimplifyCFG: Teach switch generation some patterns that instcombine forms.
This allows us to create switches even if instcombine has munged two of the
incombing compares into one and some bit twiddling. This was motivated by enum
compares that are common in clang.

llvm-svn: 185632
2013-07-04 14:22:02 +00:00
Nick Lewycky a833c667e2 Tabs to spaces. No functionality change.
llvm-svn: 185612
2013-07-04 03:51:53 +00:00
Craig Topper af0dea1347 Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.
llvm-svn: 185606
2013-07-04 01:31:24 +00:00
Craig Topper 31ee5866de Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.
llvm-svn: 185540
2013-07-03 15:07:05 +00:00
Evgeniy Stepanov dc6d7eb860 [msan] Unpoison stack allocations and undef values in blacklisted functions.
This changes behavior of -msan-poison-stack=0 flag from not poisoning stack
allocations to actively unpoisoning them.

llvm-svn: 185538
2013-07-03 14:39:14 +00:00
Michael Gottesman 2db11161a8 Added support in FunctionAttrs for adding relevant function/argument attributes for the posix call gettimeofday.
This implies annotating it as nounwind and its arguments as nocapture. To be
conservative, we do not annotate the arguments with noalias since some platforms
do not have restrict on the declaration for gettimeofday.

llvm-svn: 185502
2013-07-03 04:00:54 +00:00
Manman Ren d0e67aa1ce Debug Info: cleanup
llvm-svn: 185456
2013-07-02 18:37:35 +00:00
Hal Finkel fdbe161b1a Revert r185257 (InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms)
I'm reverting this commit because:

 1. As discussed during review, it needs to be rewritten (to avoid creating and
then deleting instructions).

 2. This is causing optimizer crashes. Specifically, I'm seeing things like
this:

    While deleting: i1 %
    Use still stuck around after Def is destroyed:  <badref> = select i1 <badref>, i32 0, i32 1
    opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.

   I'd guess that these will go away once we're no longer creating/deleting
instructions here, but just in case, I'm adding a regression test.

Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message:

	InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms

	Real world code sometimes has the denominator of a 'udiv' be a
	'select'.  LLVM can handle such cases but only when the 'select'
	operands are symmetric in structure (both select operands are a constant
	power of two or a left shift, etc.).  This falls apart if we are dealt a
	'udiv' where the code is not symetric or if the select operands lead us
	to more select instructions.

	Instead, we should treat the LHS and each select operand as a distinct
	divide operation and try to optimize them independently.  If we can
	to simplify each operation, then we can replace the 'udiv' with, say, a
	'lshr' that has a new select with a bunch of new operands for the
	select.

llvm-svn: 185415
2013-07-02 05:21:11 +00:00
Nick Lewycky 26fcc51f15 Add missing break statements. Noticed by inspection.
llvm-svn: 185414
2013-07-02 05:02:56 +00:00
Manman Ren 74c188f026 Debug Info: clean up usage of Verify.
No functionality change. It should suffice to check the type of a debug info
metadata, instead of calling Verify.

llvm-svn: 185383
2013-07-01 21:02:01 +00:00
Arnold Schwaighofer ef51cf202b LoopVectorize: Math functions only read rounding mode
Math functions are mark as readonly because they read the floating point
rounding mode. Because we don't vectorize loops that would contain function
calls that set the rounding mode it is safe to ignore this memory read.

llvm-svn: 185299
2013-07-01 00:54:44 +00:00
Stephen Lin 2e551adcd9 DeadArgumentElimination: keep return value on functions that have a live argument with the 'returned' attribute (rather than generate invalid IR); however, if both can be eliminated, both will be
llvm-svn: 185290
2013-06-30 20:26:21 +00:00
Benjamin Kramer 4093f29366 InstCombine: Also turn selects fed by an and into arithmetic when the types don't match.
Inserting a zext or trunc is sufficient. This pattern is somewhat common in
LLVM's pointer mangling code.

llvm-svn: 185270
2013-06-29 21:17:04 +00:00
Benjamin Kramer 4ab72f9b9a LoopVectorizer: Pack MemAccessInfo pairs.
llvm-svn: 185263
2013-06-29 17:52:08 +00:00
Benjamin Kramer 53545693d7 Move helper classes into anonymous namespaces.
llvm-svn: 185262
2013-06-29 17:02:06 +00:00
David Majnemer 5953d3712a InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison
Changing the sign when comparing the base pointer would introduce all
sorts of unexpected things like:
  %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0
  %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0
  %cmp.i = icmp ult i8* %gep.i, %gep2.i
  %cmp.i1 = icmp ult [1 x i8]* %a, %b
  %cmp = icmp ne i1 %cmp.i, %cmp.i1
  ret i1 %cmp

into:
  %cmp.i = icmp slt [1 x i8]* %a, %b
  %cmp.i1 = icmp ult [1 x i8]* %a, %b
  %cmp = xor i1 %cmp.i, %cmp.i1
  ret i1 %cmp

By preserving the original sign, we now get:
  ret i1 false

This fixes PR16483.

llvm-svn: 185259
2013-06-29 10:28:04 +00:00
David Majnemer 92a8a7d45a InstCombine: Small whitespace cleanup in FoldGEPICmp
llvm-svn: 185258
2013-06-29 09:45:35 +00:00
David Majnemer 797227eea6 InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms
Real world code sometimes has the denominator of a 'udiv' be a
'select'.  LLVM can handle such cases but only when the 'select'
operands are symmetric in structure (both select operands are a constant
power of two or a left shift, etc.).  This falls apart if we are dealt a
'udiv' where the code is not symetric or if the select operands lead us
to more select instructions.

Instead, we should treat the LHS and each select operand as a distinct
divide operation and try to optimize them independently.  If we can
to simplify each operation, then we can replace the 'udiv' with, say, a
'lshr' that has a new select with a bunch of new operands for the
select.

llvm-svn: 185257
2013-06-29 08:40:07 +00:00
Nadav Rotem 0a25727f31 We preserve the CFG and some of the analysis passes.
llvm-svn: 185251
2013-06-29 05:38:15 +00:00
Nadav Rotem e00343446c Update docs.
llvm-svn: 185250
2013-06-29 05:37:19 +00:00
David Majnemer b889e405eb InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2)
We may, after other optimizations, find ourselves with IR that looks
like:

  %shl = shl i32 1, %y
  %cmp = icmp ult i32 %shl, 32

Instead, we should just compare the shift count:

  %cmp = icmp ult i32 %y, 5

llvm-svn: 185242
2013-06-28 23:42:03 +00:00
Nadav Rotem 060be733a5 SLP Vectorizer: Add support for trees with external users.
To support this we have to insert 'extractelement' instructions to pick the right lane.
We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated.

llvm-svn: 185230
2013-06-28 22:07:09 +00:00
Nadav Rotem 9ce3fedcdd LoopVectorizer: Refactor the code that checks if it is safe to predicate blocks.
In this code we keep track of pointers that we are allowed to read from, if they are accessed by non-predicated blocks.
We use this list to allow vectorization of conditional loads in predicated blocks because we know that these addresses don't segfault.

llvm-svn: 185214
2013-06-28 20:46:27 +00:00
Daniel Malea b17b1cd6f5 Remove needless include (unistd.h) in DebugIR pass
- should unbreak Windows builds

llvm-svn: 185198
2013-06-28 19:19:44 +00:00
Daniel Malea 0673464a92 Add missing header for DebugIR
- missed svn add...

llvm-svn: 185194
2013-06-28 19:07:59 +00:00
Daniel Malea 31321fa53d Remove limitation on DebugIR that made it require existing debug metadata.
- Build debug metadata for 'bare' Modules using DIBuilder
- DebugIR can be constructed to generate an IR file (to be seen by a debugger)
  or not in cases where the user already has an IR file on disk.

llvm-svn: 185193
2013-06-28 19:05:23 +00:00
Arnold Schwaighofer ce2c766f61 LoopVectorize: Pull dyn_cast into setDebugLocFromInst
llvm-svn: 185168
2013-06-28 17:14:48 +00:00
Arnold Schwaighofer 3b27b992ca LoopVectorize: Use static function instead of DebugLocSetter class
I used the class to safely reset the state of the builder's debug location.  I
think I have caught all places where we need to set the debug location to a new
one. Therefore, we can replace the class by a function that just sets the debug
location.

llvm-svn: 185165
2013-06-28 16:26:54 +00:00
Manman Ren 983a16c08a Debug Info: clean up usage of Verify.
No functionality change.
It should suffice to check the type of a debug info metadata, instead of
calling Verify. For cases where we know the type of a DI metadata, use
assert.

Also update testing cases to make them conform to the format of DI classes.

llvm-svn: 185135
2013-06-28 05:43:10 +00:00
Arnold Schwaighofer 12ecb331af LoopVectorize: Preserve debug location info
radar://14169017

llvm-svn: 185122
2013-06-28 00:38:54 +00:00
Matt Arsenault 5d2e85f6d7 Fix using arg_end() - arg_begin() instead of arg_size()
llvm-svn: 185121
2013-06-28 00:25:40 +00:00
Michael Gottesman 79b0967548 Revert "Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float.""
This reverts commit r185099.

Looks like both the ppc-64 and mips bots are still failing after I reverted this
change.

Since:

1. The mips bot always performs a clean build,
2. The ppc64-bot failed again after a clean build (I asked the ppc-64
maintainers to clean the bot which they did... Thanks Will!),

I think it is safe to assume that this change was not the cause of the failures
that said builders were seeing. Thus I am recomitting.

llvm-svn: 185111
2013-06-27 21:58:19 +00:00
Michael Gottesman ccaf3321f1 Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float."
This reverts commit r185095. This is causing a FileCheck failure on
the 3dnow intrinsics on at least the mips/ppc bots but not on the x86
bots.

Reverting while I figure out what is going on.

llvm-svn: 185099
2013-06-27 20:40:11 +00:00
Arnold Schwaighofer 38de7cd464 LoopVectorize: Cache edge masks created during if-conversion
Otherwise, we end up with an exponential IR blowup.
Fixes PR16472.

llvm-svn: 185097
2013-06-27 20:31:06 +00:00
Michael Gottesman 03255a1675 [APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float.
The category which an APFloat belongs to should be dependent on the
actual value that the APFloat has, not be arbitrarily passed in by the
user. This will prevent inconsistency bugs where the category and the
actual value in APFloat differ.

I also fixed up all of the references to this constructor (which were
only in LLVM).

llvm-svn: 185095
2013-06-27 19:50:52 +00:00
Arnold Schwaighofer a2dd195fb3 LoopVectorize: Use vectorized loop invariant gep index anchored in loop
Use vectorized instruction instead of original instruction anchored in the
original loop.

Fixes PR16452 and t2075.c of PR16455.

llvm-svn: 185081
2013-06-27 15:11:55 +00:00
Arnold Schwaighofer ccd6c9929b LoopVectorize: Don't store a reversed value in the vectorized value map
When we store values for reversed induction stores we must not store the
reversed value in the vectorized value map. Another instruction might use this
value.

This fixes 3 test cases of PR16455.

llvm-svn: 185051
2013-06-27 00:45:41 +00:00
Michael Gottesman 41748d7c86 Added support for the Builtin attribute.
The Builtin attribute is an attribute that can be placed on function call site that signal that even though a function is declared as being a builtin,

rdar://problem/13727199

llvm-svn: 185049
2013-06-27 00:25:01 +00:00
Nadav Rotem 8edefb3665 No need to use a Set when a vector would do.
llvm-svn: 185047
2013-06-27 00:14:13 +00:00
Nadav Rotem 93f880fb77 SLP: When searching for vectorization opportunities scan the blocks in post-order because we grow chains upwards.
llvm-svn: 185041
2013-06-26 23:44:45 +00:00
Nadav Rotem 7f0d6d7975 SLP: Dont erase instructions during vectorization because it prevents the outerloops from iterating over the instructions.
llvm-svn: 185040
2013-06-26 23:43:23 +00:00
Michael Gottesman c2af8d6273 In InstCombine{AddSub,MulDivRem} convert APFloat.isFiniteNonZero() && !APFloat.isDenormal => APFloat.isNormal.
llvm-svn: 185037
2013-06-26 23:17:31 +00:00
Eric Christopher b8c608ea39 Revert "Debug Info: clean up usage of Verify." as it's breaking bots.
This reverts commit r185020

llvm-svn: 185032
2013-06-26 22:44:57 +00:00
Manman Ren aa00ce0e8f Debug Info: clean up usage of Verify.
No functionality change.
It should suffice to check the type of a debug info metadata, instead of
calling Verify.

llvm-svn: 185020
2013-06-26 21:26:10 +00:00
Nadav Rotem 4c5b2d1de6 Erase all of the instructions that we RAUWed
llvm-svn: 184969
2013-06-26 17:16:09 +00:00
Nadav Rotem f4ca3994b8 Do not add cse-ed instructions into the visited map because we dont want to consider them as a candidate for replacement of instructions to be visited.
llvm-svn: 184966
2013-06-26 16:54:53 +00:00
Kostya Serebryany 5e276f9dbc [asan] workaround for PR16277: don't instrument AllocaInstr with alignment more than the redzone size
llvm-svn: 184928
2013-06-26 09:49:52 +00:00
Kostya Serebryany 9f5213f20f [asan] add option -asan-keep-uninstrumented-functions
llvm-svn: 184927
2013-06-26 09:18:17 +00:00
Nick Lewycky 5cd9538b90 dbgs() << Instruction doesn't print a newline on the end any more. Update these
debug statements to add a missing newline. Also canonicalize to '\n' instead of
"\n"; the latter calls a function with a loop the former does not.

llvm-svn: 184897
2013-06-26 00:30:18 +00:00
Nadav Rotem 0794acc1da SLPVectorizer: support slp-vectorization of PHINodes between basic blocks
llvm-svn: 184888
2013-06-25 23:04:09 +00:00
Bob Wilson acfc01dedf Fix SROA to avoid unnecessary scalar conversions for 1-element vectors.
When a 1-element vector alloca is promoted, a store instruction can often be
rewritten without converting the value to a scalar and using an insertelement
instruction to stuff it into the new alloca.  This patch just adds a check
to skip that conversion when it is unnecessary.  This turns out to be really
important for some ARM Neon operations where <1 x i64> is used to get around
the fact that i64 is not a legal type.

llvm-svn: 184870
2013-06-25 19:09:50 +00:00
Nadav Rotem 3de032a3b6 Fix a typo in the code that collected the costs recursively.
llvm-svn: 184827
2013-06-25 05:30:56 +00:00
Nadav Rotem 9c7c997a7e Rename the variable to fix a warning. Thanks Andy Gibbs.
llvm-svn: 184749
2013-06-24 15:59:47 +00:00
Arnold Schwaighofer b252c11ccc Reapply 184685 after the SetVector iteration order fix.
This should hopefully have fixed the stage2/stage3 miscompare on the dragonegg
testers.

"LoopVectorize: Use the dependence test utility class

We now no longer need alias analysis - the cases that alias analysis would
handle are now handled as accesses with a large dependence distance.

We can now vectorize loops with simple constant dependence distances.

  for (i = 8; i < 256; ++i) {
    a[i] = a[i+4] * a[i+8];
  }

  for (i = 8; i < 256; ++i) {
    a[i] = a[i-4] * a[i-8];
  }

We would be able to vectorize about 200 more loops (in many cases the cost model
instructs us no to) in the test suite now. Results on x86-64 are a wash.

I have seen one degradation in ammp. Interestingly, the function in which we
now vectorize a loop is never executed so we probably see some instruction
cache effects. There is a 2% improvement in h264ref. There is one or the other
TSCV loop kernel that speeds up.

radar://13681598"

llvm-svn: 184724
2013-06-24 12:09:15 +00:00
Arnold Schwaighofer 91472fa4fc LoopVectorize: Use SetVector for the access set
We are creating the runtime checks using this set so we need a deterministic
iteration order.

llvm-svn: 184723
2013-06-24 12:09:12 +00:00
Chandler Carruth 08e1b8742b Add a flag to defer vectorization into a phase after the inliner and its
CGSCC pass manager. This should insulate the inlining decisions from the
vectorization decisions, however it may have both compile time and code
size problems so it is just an experimental option right now.

Adding this based on a discussion with Arnold and it seems at least
worth having this flag for us to both run some experiments to see if
this strategy is workable. It may solve some of the regressions seen
with the loop vectorizer.

llvm-svn: 184698
2013-06-24 07:21:47 +00:00
Arnold Schwaighofer 58ca945f38 Revert "LoopVectorize: Use the dependence test utility class"
This reverts commit cbfa1ca993363ca5c4dbf6c913abc957c584cbac.

We are seeing a stage2 and stage3 miscompare on some dragonegg bots.

llvm-svn: 184690
2013-06-24 06:10:41 +00:00
Arnold Schwaighofer b914a7e2ef LoopVectorize: Use the dependence test utility class
We now no longer need alias analysis - the cases that alias analysis would
handle are now handled as accesses with a large dependence distance.

We can now vectorize loops with simple constant dependence distances.

  for (i = 8; i < 256; ++i) {
    a[i] = a[i+4] * a[i+8];
  }

  for (i = 8; i < 256; ++i) {
    a[i] = a[i-4] * a[i-8];
  }

We would be able to vectorize about 200 more loops (in many cases the cost model
instructs us no to) in the test suite now. Results on x86-64 are a wash.

I have seen one degradation in ammp. Interestingly, the function in which we
now vectorize a loop is never executed so we probably see some instruction
cache effects. There is a 2% improvement in h264ref. There is one or the other
TSCV loop kernel that speeds up.

radar://13681598

llvm-svn: 184685
2013-06-24 03:55:48 +00:00
Arnold Schwaighofer d517976758 LoopVectorize: Add utility class for checking dependency among accesses
This class checks dependences by subtracting two Scalar Evolution access
functions allowing us to catch very simple linear dependences.

The checker assumes source order in determining whether vectorization is safe.
We currently don't reorder accesses.
Positive true dependencies need to be a multiple of VF otherwise we impede
store-load forwarding.

llvm-svn: 184684
2013-06-24 03:55:45 +00:00
Arnold Schwaighofer d57419696d LoopVectorize: Add utility class for building sets of dependent accesses
Sets of dependent accesses are built by unioning sets based on underlying
objects. This class will be used by the upcoming dependence checker.

llvm-svn: 184683
2013-06-24 03:55:44 +00:00
Nadav Rotem 210e86d7c4 SLP Vectorizer: Add support for vectorizing parts of the tree.
Untill now we detected the vectorizable tree and evaluated the cost of the
entire tree.  With this patch we can decide to trim-out branches of the tree
that are not profitable to vectorizer.

Also, increase the max depth from 6 to 12. In the worse possible case where all
of the code is made of diamond-shaped graph this can bring the cost to 2**10,
but diamonds are not very common.

llvm-svn: 184681
2013-06-24 02:52:43 +00:00
Nadav Rotem 0323925d51 SLP Vectorizer: Fix a bug in the code that does CSE on the generated gather sequences.
Make sure that we don't replace and RAUW two sequences if one does not dominate the other.

llvm-svn: 184674
2013-06-23 21:57:27 +00:00
Nadav Rotem 78428401e9 SLP Vectorizer: Erase instructions outside the vectorizeTree method.
The RAII builder location guard is saving a reference to instructions, so we can't erase instructions during vectorization.

llvm-svn: 184671
2013-06-23 19:38:56 +00:00
Nadav Rotem eb65e67eea SLP Vectorizer: Implement a simple CSE optimization for the gather sequences.
llvm-svn: 184660
2013-06-23 06:15:46 +00:00
Nadav Rotem 80de0a28f1 SLP Vectorizer: Implement multi-block slp-vectorization.
Rewrote the SLP-vectorization as a whole-function vectorization pass. It is now able to vectorize chains across multiple basic blocks.
It still does not vectorize PHIs, but this should be easy to do now that we scan the entire function.
I removed the support for extracting values from trees.
We are now able to vectorize more programs, but there are some serious regressions in many workloads (such as flops-6 and mandel-2).

llvm-svn: 184647
2013-06-22 21:34:10 +00:00
Benjamin Kramer 40d7f354b5 Revert "FunctionAttrs: Merge attributes once instead of doing it for every argument."
It doesn't work as I intended it to.  This reverts commit r184638.

llvm-svn: 184641
2013-06-22 16:56:32 +00:00
Benjamin Kramer 76b7bd0e75 FunctionAttrs: Merge attributes once instead of doing it for every argument.
It has become an expensive operation. No functionality change.

llvm-svn: 184638
2013-06-22 15:51:19 +00:00
Michael Gottesman 9799cf7fb3 [objc-arc-opts] Make IsTrackingImpreciseReleases a const method.
Thanks to Bill Wendling for pointing this out!

llvm-svn: 184593
2013-06-21 20:52:49 +00:00
Michael Gottesman e3943d0554 [objc-arc-opts] Now that PtrState.RRI is encapsulated in PtrState, make PtrState.RRI private and delete the TODO.
llvm-svn: 184587
2013-06-21 19:44:30 +00:00
Michael Gottesman 4f6ef11763 [objc-arc-opts] Encapsulated PtrState.RRI.{Calls,ReverseInsertPts} into several methods on PtrState.
llvm-svn: 184586
2013-06-21 19:44:27 +00:00
Michael Gottesman f040118167 [objcarcopts] Encapsulated PtrState.RRI.IsTrackingImpreciseRelease() => PtrState.IsTrackingImpreciseRelease().
llvm-svn: 184583
2013-06-21 19:12:38 +00:00
Michael Gottesman 2f2945973a [objcarcopts] Encapsulate PtrState.RRI.CFGHazardAfflicted via methods PtrState.{IsCFGHazardAfflicted,SetCFGHazardAfflicted}.
llvm-svn: 184582
2013-06-21 19:12:36 +00:00
Michael Gottesman f701d3f864 [objcarcopts] Encapsulate PtrState.RRI.ReleaseMetadata into the methods PtrState.GetReleaseMetadata() and PtrState.SetReleaseMetadata().
llvm-svn: 184534
2013-06-21 07:03:07 +00:00
Michael Gottesman b82a179606 [objcarcopts] Encapsulate PtrState.RRI.IsTailCallRelease into the method PtrState.IsTailCallRelease() and PtrState.SetTailCallRelease().
llvm-svn: 184533
2013-06-21 07:00:44 +00:00
Michael Gottesman 9313225e72 [obcjarcopts] Encapsulate PtrState.RRI.KnownSafe in the methods PtrState.IsKnownSafe and PtrState.SetKnownSafe.
This is apart of a series of patches to encapsulate PtrState.RRI and
make PtrState.RRI a private field of PtrState.

*NOTE* This is actually the second commit in the patch stream. I should
have put this note on the first such commit r184528.

llvm-svn: 184532
2013-06-21 06:59:02 +00:00
Michael Gottesman b7deb4cd79 [objcarcopts] Some more minor code cleanups/comment additions.
llvm-svn: 184531
2013-06-21 06:54:31 +00:00
Michael Gottesman 4773a10cfb [objcarcopts] Refactor out the RRInfo merging code from PtrState into RRInfo::Merge.
I also added some comments and performed minor code cleanups.

llvm-svn: 184528
2013-06-21 05:42:08 +00:00
Nadav Rotem e1713e5fcf SLP Vectorizer: do not search for store-chains that are wider than the vector-register size.
llvm-svn: 184527
2013-06-21 04:18:13 +00:00
Meador Inge dfb08a2cb8 Remove the simplify-libcalls pass (finally)
This commit completely removes what is left of the simplify-libcalls
pass.  All of the functionality has now been migrated to the instcombine
and functionattrs passes.  The following C API functions are now NOPs:

  1. LLVMAddSimplifyLibCallsPass
  2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls

llvm-svn: 184459
2013-06-20 19:48:07 +00:00
Nadav Rotem b488beefeb Clang-format the SLP vectorizer. No functionality change.
llvm-svn: 184446
2013-06-20 17:54:36 +00:00
Nadav Rotem 14a89c5428 SLPVectorization: Add a basic support for cross-basic block slp vectorization.
We collect gather sequences when we vectorize basic blocks. Gather sequences are excellent
hints for vectorization of other basic blocks.

llvm-svn: 184444
2013-06-20 17:41:45 +00:00
Nadav Rotem c41028a013 Change the debug type to match the debug type that is used by vecutils.cpp.
This change makes it easier to filter debug messages.

llvm-svn: 184440
2013-06-20 16:38:05 +00:00
Michael Gottesman 3cb77ab98a [APFloat] Converted all references to APFloat::isNormal => APFloat::isFiniteNonZero.
Turns out all the references were in llvm and not in clang.

llvm-svn: 184356
2013-06-19 21:23:18 +00:00
Bill Wendling 7a639ea2a4 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184352
2013-06-19 21:07:11 +00:00
Matt Arsenault d46fce1141 Move StructurizeCFG out of R600 to generic Transforms.
Register it with PassManager

llvm-svn: 184343
2013-06-19 20:18:24 +00:00
Quentin Colombet 145eb97d3a LSR: Fix the parameters used to compute the scaling factor cost.
Prior to this change, the considered addressing modes may be invalid since the
maximum and minimum offsets were not taking into account.
This was causing an assertion failure.

The added test case exercices that behavior.

<rdar://problem/14199725> Assertion failed: (CurScaleCost >= 0 && "Legal
addressing mode has an illegal cost!")

llvm-svn: 184341
2013-06-19 19:59:41 +00:00
Nadav Rotem 1e9668ea81 SLPVectorizer: handle scalars that are extracted from vectors (using ExtractElementInst).
llvm-svn: 184325
2013-06-19 17:33:16 +00:00
Nadav Rotem 86e848c849 SLPVectorizer: start constructing chains at stores that are not power of two.
The type <3 x i8> is a common in graphics and we want to be able to vectorize it.

This changes accelerates bullet by 12% and 471_omnetpp by 5%.

llvm-svn: 184317
2013-06-19 15:57:29 +00:00
Nadav Rotem e98da7f548 SLPVectorizer: vectorize compares and selects.
llvm-svn: 184282
2013-06-19 05:49:52 +00:00
Nadav Rotem 4f3224f3ed Document the return value and fix a typo.
llvm-svn: 184281
2013-06-19 05:47:33 +00:00
Nadav Rotem 1f96427da0 Scan the successor blocks and use the PHI nodes as a hint for possible chain roots.
llvm-svn: 184201
2013-06-18 15:58:05 +00:00
Nadav Rotem 3349feac4e Add a return value to make this function more useful.
llvm-svn: 184200
2013-06-18 15:57:12 +00:00
Nick Lewycky 0fdd01965e Fix nondeterminism in .gcno file generation.
llvm-svn: 184174
2013-06-18 06:38:21 +00:00
Pekka Jaaskelainen eb90fd1c3b Fix for a regression caused by the LoopVectorizer when
vectorizing loops with memory accesses to non-zero address spaces. It
simply dropped the AS info. Fixes PR16306.

llvm-svn: 184103
2013-06-17 18:49:06 +00:00
Nadav Rotem cde24ef389 Disable vectorization for -Oz.
llvm-svn: 184089
2013-06-17 17:22:40 +00:00
Nadav Rotem 7dd8210b71 Enable the loop vectorizer by default for -Os and -O2.
llvm-svn: 184084
2013-06-17 16:23:34 +00:00
Jakub Staszak 4898e62ac0 Use 0 instead of NULL.
llvm-svn: 184044
2013-06-15 12:20:44 +00:00
Benjamin Kramer 9ddfaf2be6 PruneEH: Only merge attribute sets when used. No functionality change.
llvm-svn: 184041
2013-06-15 10:55:39 +00:00
Derek Schuff ec9dc01b33 Fix DeleteDeadVarargs not to crash on functions referenced by BlockAddresses
This pass was assuming that if hasAddressTaken() returns false for a
function, the function's only uses are call sites.  That's not true
because there can be references by BlockAddresses too.

Fix the pass to handle this case.  Fix
BlockAddress::replaceUsesOfWithOnConstant() to allow a function's type
to be changed by RAUW'ing the function with a bitcast of the recreated
function.

Patch by Mark Seaborn.

llvm-svn: 183933
2013-06-13 19:51:17 +00:00
Rafael Espindola 8d30480344 Always remove an alias when we rename the target.
Should fix the dragonegg build bots.

llvm-svn: 183845
2013-06-12 16:45:47 +00:00
Rafael Espindola 3bc8e71909 Move PathV2.h to Path.h
Most clients have already been moved from Path V1 to V2. The ones using V1
now include PathV1.h explicitly.

llvm-svn: 183801
2013-06-11 22:21:28 +00:00
Rafael Espindola a82555c0f8 Change how globalopt handles aliases in llvm.used.
Instead of a custom implementation of replaceAllUsesWith, we just call
replaceAllUsesWith and recreate llvm.used and llvm.compiler-used.

This change is particularity interesting because it makes llvm see
through what clang is doing with static used functions in extern "C"
contexts. With this change, running clang -O2 in

extern "C" {
  __attribute__((used)) static void foo() {}
}

produces

@llvm.used = appending global [1 x i8*] [i8* bitcast (void ()* @foo to
i8*)], section "llvm.metadata"
define internal void @foo() #0 {
entry:
  ret void
}

llvm-svn: 183756
2013-06-11 17:48:06 +00:00
Tim Northover 64280fbba1 Make DeadArgumentElimination more conservative on variadic functions
Variadic functions are particularly fragile in the face of ABI changes, so this
limits how much the pass changes them

llvm-svn: 183625
2013-06-09 02:17:27 +00:00
Shuxin Yang 140d592d84 Fix a potential bug in r183584.
r183584 tries to derive some info from the code *AFTER* a call and apply
these derived info to the code *BEFORE* the call, which is not always safe
as the call in question may never return, and in this case, the derived
info is invalid.
  
  Thank Duncan for pointing out this potential bug.

rdar://14073661 

llvm-svn: 183606
2013-06-08 04:56:05 +00:00
Shuxin Yang bd254f2601 Fix an assertion in MemCpyOpt pass.
The MemCpyOpt pass is capable of optimizing:
      callee(&S); copy N bytes from S to D.
    into:
      callee(&D);
subject to some legality constraints. 

  Assertion is triggered when the compiler tries to evalute "sizeof(typeof(D))",
while D is an opaque-typed, 'sret' formal argument of function being compiled.
i.e. the signature of the func being compiled is something like this:
  T caller(...,%opaque* noalias nocapture sret %D, ...)

  The fix is that when come across such situation, instead of calling some
utility functions to get the size of D's type (which will crash), we simply
assume D has at least N bytes as implified by the copy-instruction.

rdar://14073661 

llvm-svn: 183584
2013-06-07 22:45:21 +00:00
Michael Gottesman 9e7261c874 [objc-arc] Ensure that the cfg path count does not overflow when we multiply TopDownPathCount/BottomUpPathCount.
rdar://12480535

llvm-svn: 183489
2013-06-07 06:16:49 +00:00
Jakub Staszak 96ff4d6d3b Simplify code. No functionality change.
llvm-svn: 183461
2013-06-06 23:34:59 +00:00
Nadav Rotem 99e529ea3c Jeffrey Yasskin volunteered to benchmark the vectorizer on -O2 or -Os when compiling chrome. This patch adds a new flag to enable vectorization on all levels and not only on -O3. It should go away once we make a decision.
llvm-svn: 183456
2013-06-06 22:35:47 +00:00
Jakub Staszak bddea11bc5 Re-apply "Use IRBuilder instead of ConstantInt methods." with the fixed issues.
llvm-svn: 183439
2013-06-06 20:18:46 +00:00
Rafael Espindola a7bbc0b740 Revert "Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit."
This reverts commit 183328. It caused pr16244 and broke the bots.

llvm-svn: 183422
2013-06-06 17:03:05 +00:00
Jakub Staszak 9de494e0ee Remove unneeded cast<>.
llvm-svn: 183363
2013-06-06 00:49:57 +00:00
Jakub Staszak 461d1fe6fc Use IRBuilder instead of ConstantInt methods.
llvm-svn: 183360
2013-06-06 00:37:23 +00:00
Jakub Staszak 2f390b755a Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit.
llvm-svn: 183328
2013-06-05 18:27:02 +00:00
David Majnemer 29130c5e8d IndVarSimplify: check if loop invariant expansion can trap
IndVarSimplify is willing to move divide instructions outside of their
loop bodies if they are invariant of the loop.  However, it may not be
safe to expand them if we do not know if they can trap.

Instead, check to see if it is not safe to expand the instruction and
skip the expansion.

This fixes PR16041.

Testcase by Rafael Ávila de Espíndola.

llvm-svn: 183239
2013-06-04 17:51:58 +00:00
Rafael Espindola a5e536ab0e Second part of pr16069
The problem this time seems to be a thinko. We were assuming that in the CFG

A
| \
|  B
| /
C

speculating the basic block B would cause only the phi value for the B->C edge
to be speculated. That is not true, the phi's are semantically in the edges, so
if the A->B->C path is taken, any code needed for A->C is not executed and we
have to consider it too when deciding to speculate B.

llvm-svn: 183226
2013-06-04 14:11:59 +00:00
Hans Wennborg 5cf30be6e4 Typo: s/caes/cases/ in SimplifyCFG
llvm-svn: 183219
2013-06-04 11:22:30 +00:00
Nick Lewycky 688d668e5c Delete dead safety check.
llvm-svn: 183167
2013-06-03 23:15:20 +00:00
David Majnemer c82f27af2a SimplifyCFG: Do not transform PHI to select if doing so would be unsafe
PR16069 is an interesting case where an incoming value to a PHI is a
trap value while also being a 'ConstantExpr'.

We do not consider this case when performing the 'HoistThenElseCodeToIf'
optimization.

Instead, make our modifications more conservative if we detect that we
cannot transform the PHI to a select.

llvm-svn: 183152
2013-06-03 20:43:12 +00:00
David Majnemer 8e7dd2f628 SimplifyCFG: Small cleanup, use ICmpInst::isEquality()
llvm-svn: 183151
2013-06-03 20:39:50 +00:00
Kostya Serebryany 9e62b301e6 [asan] ASan Linux MIPS32 support (llvm part), patch by Jyun-Yan Y
llvm-svn: 183104
2013-06-03 14:46:56 +00:00
Nick Lewycky 3f715e260a When determining the new index for an insertelement, we may not assume that an
index greater than the size of the vector is invalid. The shuffle may be
shrinking the size of the vector. Fixes a crash!

Also drop the maximum recursion depth of the safety check for this
optimization to five.

llvm-svn: 183080
2013-06-01 20:51:31 +00:00
David Majnemer 91142c485e SimplifyCFG: Fix typo in comment for ComputeSpeculationCost
llvm-svn: 183078
2013-06-01 19:43:23 +00:00
Benjamin Kramer 7c275640e7 Move getRealLinkageName to a common place and remove all the duplicates of it.
Also simplify code a bit while there. No functionality change.

llvm-svn: 183076
2013-06-01 17:51:14 +00:00
Arnold Schwaighofer 7b1b4db35e LoopVectorize: Change API call to get the backedge taken count
Use ScalarEvolution's getBackedgeTakenCount API instead of getExitCount since
that is really what we want to know. Using the more specific getExitCount was
safe because we made sure that there is only one exiting block.

No functionality change.

llvm-svn: 183047
2013-05-31 21:48:56 +00:00
Quentin Colombet bf490d4a32 Loop Strength Reduce: Scaling factor cost.
Account for the cost of scaling factor in Loop Strength Reduce when rating the
formulae. This uses a target hook.

The default implementation of the hook is: if the addressing mode is legal, the
scaling factor is free.

<rdar://problem/13806271>

llvm-svn: 183045
2013-05-31 21:29:03 +00:00
Arnold Schwaighofer 70a9be5297 LoopVectorize: PHIs with only outside users should prevent vectorization
We check that instructions in the loop don't have outside users (except if
they are reduction values). Unfortunately, we skipped this check for
if-convertable PHIs.

Fixes PR16184.

llvm-svn: 183035
2013-05-31 19:53:50 +00:00
Quentin Colombet 8aa7abe2ae Modify how the formulae are rated in Loop Strength Reduce.
Namely, check if the target allows to fold more that one register in the
addressing mode and if yes, adjust the cost accordingly.

Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred
to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2
needs a temporary register for the computation, whereas it was correctly
estimated for reg1 + scale * reg2.

<rdar://problem/13973908>

llvm-svn: 183021
2013-05-31 17:20:29 +00:00
Rafael Espindola 65281bf36e Simplify multiplications by vectors whose elements are powers of 2.
Patch by Andrea Di Biagio.

llvm-svn: 183005
2013-05-31 14:27:15 +00:00
Evgeniy Stepanov 888385e40f [msan] Handle mixed track-origins and keep-going settings (llvm part).
Before this change, each module defined a weak_odr global __msan_track_origins 
with a value of 1 if origin tracking is enabled, 0 if disabled. If there are 
modules with different values, any of them may win. If 0 wins, and there is at 
least one module with 1, the program will most likely crash.

With this change, __msan_track_origins is only emitted if origin tracking is 
on. Then runtime library detects if there is at least one module with origin 
tracking, and enables runtime support for it.

llvm-svn: 182997
2013-05-31 12:04:29 +00:00
Nick Lewycky a2b7720618 Reapply with r182909 with a fix to the calculation of the new indices for
insertelement instructions.

llvm-svn: 182976
2013-05-31 00:59:42 +00:00
Evgeniy Stepanov 2c14269883 Revert r182909.
PR/16177

llvm-svn: 182919
2013-05-30 09:40:17 +00:00
Nick Lewycky d7f27094c0 Swizzle vector inputs if it helps us eliminate shuffles.
llvm-svn: 182909
2013-05-30 04:33:38 +00:00
NAKAMURA Takumi d11b42aaad LoopVectorize.cpp: Fix abuse of StringRef on Twine. Twine captures the pointer of StringRef.
llvm-svn: 182820
2013-05-29 03:13:47 +00:00
NAKAMURA Takumi d57ea87080 Whitespace.
llvm-svn: 182819
2013-05-29 03:13:41 +00:00
Paul Redmond 5fdf836ba4 Add support for llvm.vectorizer metadata
- llvm.loop.parallel metadata has been renamed to llvm.loop to be more generic
  by making the root of additional loop metadata.
  - Loop::isAnnotatedParallel now looks for llvm.loop and associated
    llvm.mem.parallel_loop_access
  - document llvm.loop and update llvm.mem.parallel_loop_access
- add support for llvm.vectorizer.width and llvm.vectorizer.unroll
  - document llvm.vectorizer.* metadata
  - add utility class LoopVectorizerHints for getting/setting loop metadata
  - use llvm.vectorizer.width=1 to indicate already vectorized instead of
    already_vectorized
- update existing tests that used llvm.loop.parallel and
  llvm.vectorizer.already_vectorized

Reviewed by: Nadav Rotem

llvm-svn: 182802
2013-05-28 20:00:34 +00:00
James Molloy f6f121e277 Extend RemapInstruction and friends to take an optional new parameter, a ValueMaterializer.
Extend LinkModules to pass a ValueMaterializer to RemapInstruction and friends to lazily create Functions for lazily linked globals. This is a big win when linking small modules with large (mostly unused) library modules.

llvm-svn: 182776
2013-05-28 15:17:05 +00:00
Evgeniy Stepanov fca012334b [msan] Fix argument shadow alignment.
llvm-svn: 182771
2013-05-28 13:07:43 +00:00
Michael J. Spencer df1ecbd734 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Michael Gottesman e67f40c514 [objc-arc] KnownSafe does not imply that it is safe to perform code motion across CFG edges since even if it is safe to remove RR pairs, we may still be able to move a retain/release into a loop.
rdar://13949644

llvm-svn: 182670
2013-05-24 20:44:05 +00:00
Michael Gottesman 5a91bbf33a [objc-arc] Make sure that multiple owners is propogated correctly through the pass via the usage of a global data structure.
rdar://13750319

llvm-svn: 182669
2013-05-24 20:44:02 +00:00
Benjamin Kramer 6ac1e62377 LoopVectorize: LoopSimplify can't canonicalize loops with an indirectbr in it, don't assert on those cases.
Fixes PR16139.

llvm-svn: 182656
2013-05-24 18:05:35 +00:00
Joey Gouly b34294d0e4 Run clang-format over the scalarizePHI function.
llvm-svn: 182640
2013-05-24 12:33:28 +00:00
Joey Gouly 83699284be scalarizePHI needs to insert the next ExtractElement in the same block
as the BinaryOperator, *not* in the block where the IRBuilder is currently
inserting into. Fixes a bug where scalarizePHI would create instructions
that would not dominate all uses.

llvm-svn: 182639
2013-05-24 12:29:54 +00:00
Daniel Malea fddddbeab0 Re-implement DebugIR in a way that does not subclass AssemblyWriter:
- move AsmWriter.h from public headers into lib
- marked all AssemblyWriter functions as non-virtual; no need to override them
- DebugIR now "plugs into" AssemblyWriter with an AssemblyAnnotationWriter helper
- exposed flags to control hiding of a) debug metadata b) debug intrinsic calls

C/R: Paul Redmond

llvm-svn: 182617
2013-05-23 22:34:33 +00:00
Benjamin Kramer ad5c24f161 More symbols that should be static.
llvm-svn: 182590
2013-05-23 16:09:15 +00:00
Michael Gottesman 740db977f6 [objc-arc] Fixed number of prefixing slashes in some comments in a function from 3 to 2 to match the rest of ObjCARCOpts.
llvm-svn: 182557
2013-05-23 02:35:21 +00:00
Nadav Rotem 9e00eb38a2 SLPVectorizer: Change the order in which new instructions are added to the function.
We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leafs).
* Imroved the numbering API.
* Changed the placement of new instructions to the last root.
* Fixed a bug with external tree users with non-zero lane.
* Fixed a bug in the placement of in-tree users.

llvm-svn: 182508
2013-05-22 19:47:32 +00:00
Jean-Luc Duprat 0dda6f168c This is an update to a previous commit (r181216).
The earlier change list introduced the following inst combines:
B * (uitofp i1 C) —> select C, B, 0
A * (1 - uitofp i1 C) —> select C, 0, A
select C, 0, B + select C, A, 0 —> select C, A, B

Together these 3 changes would simplify :
A * (1 - uitofp i1 C) + B * uitofp i1 C 
down to :
select C, B, A

In practice we found that the first two substitutions can have a
negative effect on performance, because they reduce opportunities to
use FMA contractions; between the two options FMAs are often the
better choice.  This change list amends the previous one to enable
just these inst combines:

select C, B, 0 + select C, 0, A —> select C, B, A
A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A

llvm-svn: 182499
2013-05-22 18:29:31 +00:00
Arnold Schwaighofer 12b0d1cda0 LoopVectorize: Make Value pointers that could be RAUW'ed a VH
The Value pointers we store in the induction variable list can be RAUW'ed by a
call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing
in some other places where we store pointers that could potentially be RAUW'ed.

Fixes PR16073.

llvm-svn: 182485
2013-05-22 16:54:56 +00:00
Evgeniy Stepanov ebd7f8e7ef [msan] A no-op implementation of VarArg handling.
This stuff is used on platforms where MSan does not have a proper VarArg
implementation (anything other than x86_64 at the moment).

llvm-svn: 182375
2013-05-21 12:27:47 +00:00
Bill Wendling 5f4740390e Remove unused #include.
llvm-svn: 182315
2013-05-20 20:59:12 +00:00
Hal Finkel a969df84ab Rename LoopSimplify.h to LoopUtils.h
As discussed, LoopUtils.h is a better name.

llvm-svn: 182314
2013-05-20 20:46:30 +00:00
Hal Finkel a12d82b421 Expose InsertPreheaderForLoop from LoopSimplify to other passes
Other passes, PPC counter-loop formation for example, also need to add loop
preheaders outside of the regular loop simplification pass. This makes
InsertPreheaderForLoop a global function so that it can be used by other
passes.

No functionality change intended.

llvm-svn: 182299
2013-05-20 16:47:07 +00:00
Arnold Schwaighofer 693a1ca628 LoopVectorize: Handle single edge PHIs
We might encouter single edge PHIs - handle them with an identity select.

Fixes PR15990.

llvm-svn: 182199
2013-05-18 18:38:34 +00:00
Matt Arsenault 52ddb7bcdd Add missing -*- C++ -*- to headers
llvm-svn: 182164
2013-05-17 21:43:39 +00:00
Benjamin Kramer d84a63398e LoopVectorize: Simplify code. No functionality change.
llvm-svn: 182100
2013-05-17 14:48:17 +00:00
Evgeniy Stepanov 1e7643243d [msan] Switch TLS globals to initial-exec model.
They are always defined in the main executable.

llvm-svn: 181994
2013-05-16 09:14:05 +00:00
Arnold Schwaighofer 88e7fddc8c LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvert
We only want to check this once, not for every conditional block in the loop.

No functionality change (except that we don't perform a check redudantly
anymore).

llvm-svn: 181942
2013-05-15 22:38:14 +00:00
Michael Gottesman b4e7f4d841 [objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods.
llvm-svn: 181901
2013-05-15 17:43:03 +00:00
Arnold Schwaighofer 09cee97270 LoopVectorize: Fix comments
No functionality change.

llvm-svn: 181862
2013-05-15 02:02:45 +00:00
Arnold Schwaighofer 2d920477a4 LoopVectorize: Hoist conditional loads if possible
InstCombine can be uncooperative to vectorization and sink loads into
conditional blocks. This prevents vectorization.

Undo this optimization if there are unconditional memory accesses to the same
addresses in the loop.

radar://13815763

llvm-svn: 181860
2013-05-15 01:44:30 +00:00
Sylvestre Ledru 149e281aa8 Fix two typo
llvm-svn: 181848
2013-05-14 23:36:24 +00:00
Manman Ren b3c52fb45b GlobalOpt: fix an issue where CXAAtExitFn points to a deleted function.
CXAAtExitFn was set outside a loop and before optimizations where functions
can be deleted. This patch will set CXAAtExitFn inside the loop and after
optimizations.

Seg fault when running LTO because of accesses to a deleted function.
rdar://problem/13838828

llvm-svn: 181838
2013-05-14 21:52:44 +00:00
Michael Gottesman 0c8b562851 Removed trailing whitespace.
llvm-svn: 181760
2013-05-14 06:40:10 +00:00
Arnold Schwaighofer 2e7a922a15 LoopVectorize: Handle loops with multiple forward inductions
We used to give up if we saw two integer inductions. After this patch, we base
further induction variables on the chosen one like we do in the reverse
induction and pointer induction case.

Fixes PR15720.

radar://13851975

llvm-svn: 181746
2013-05-14 00:21:18 +00:00
Michael Gottesman f3f9e3b10a [objc-arc-opts] Added debug statements when we set and unset whether a pointer is known positive.
llvm-svn: 181745
2013-05-14 00:08:09 +00:00
Michael Gottesman a76143eeee [objc-arc-opts] In the presense of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or.
In the presense of a block being initialized, the frontend will emit the
objc_retain on the original pointer and the release on the pointer loaded from
the alloca. The optimizer will through the provenance analysis realize that the
two are related (albiet different), but since we only require KnownSafe in one
direction, will match the inner retain on the original pointer with the guard
release on the original pointer. This is fixed by ensuring that in the presense
of allocas we only unconditionally remove pointers if both our retain and our
release are KnownSafe (i.e. we are KnownSafe in both directions) since we must
deal with the possibility that the frontend will emit what (to the optimizer)
appears to be unbalanced retain/releases.

An example of the miscompile is:

  %A = alloca
  retain(%x)
  retain(%x) <--- Inner Retain
  store %x, %A
  %y = load %A
  ... DO STUFF ...
  release(%y)
  call void @use(%x)
  release(%x) <--- Guarding Release

getting optimized to:

  %A = alloca
  retain(%x)
  store %x, %A
  %y = load %A
  ... DO STUFF ...
  release(%y)
  call void @use(%x)

rdar://13750319

llvm-svn: 181743
2013-05-13 23:49:42 +00:00
Matt Beaumont-Gay e55d9492e3 Move a couple more statistics inside '#ifndef NDEBUG'.
Suppresses an unused-variable warning in -Asserts builds.

llvm-svn: 181733
2013-05-13 21:10:49 +00:00
Michael Gottesman 993fbf704a [objc-arc-opts] Add comment to BBState making it clear that get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg.
llvm-svn: 181726
2013-05-13 19:40:39 +00:00
Michael Gottesman 9fc50b82a4 [objc-arc] Move the before optimization statistics gathering phase out of OptimizeIndividualCalls.
This makes the statistics gathering completely independent of the actual
optimization occuring, preventing any sort of bleeding over from occuring.

Additionally, it simplifies a switch statement in the non-statistic gathering case.

llvm-svn: 181719
2013-05-13 18:29:07 +00:00
Duncan Sands 0480b9b54e Suppress GCC compiler warnings in release builds about variables that are only
read in asserts.

llvm-svn: 181689
2013-05-13 07:50:47 +00:00
Nadav Rotem 33dcf0a70f SLPVectorizer: Swap LHS and RHS. No functionality change.
llvm-svn: 181684
2013-05-13 05:13:13 +00:00
Nadav Rotem ce42cc6d4d SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users.
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract.

llvm-svn: 181674
2013-05-12 22:58:45 +00:00
Nadav Rotem cbf6d24d50 SLPVectorizer: Clear the map that maps between scalars to vectors after each round of vectorization.
Testcase in the next commit.

llvm-svn: 181673
2013-05-12 22:55:57 +00:00
David Majnemer 6c30f49af3 InstCombine: Flip the order of two urem transforms
There are two transforms in visitUrem that conflict with each other.

*) One, if a divisor is a power of two, subtracts one from the divisor
   and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
   instructions.

Flipping the order allows the subtraction to go beneath the sign
extension.

llvm-svn: 181668
2013-05-12 00:07:05 +00:00
Arnold Schwaighofer f2305e4467 LoopVectorize: Use the widest induction variable type
Use the widest induction type encountered for the cannonical induction variable.

We used to turn the following loop into an empty loop because we used i8 as
induction variable type and truncated 1024 to 0 as trip count.

int a[1024];
void fail() {
  int reverse_induction = 1023;
  unsigned char forward_induction = 0;
  while ((reverse_induction) >= 0) {
    forward_induction++;
    a[reverse_induction] = forward_induction;
    --reverse_induction;
  }
}

radar://13862901

llvm-svn: 181667
2013-05-11 23:04:28 +00:00
Arnold Schwaighofer a544fefa32 LoopVectorize: Use variable instead of repeated function call
No functionality change intended.

llvm-svn: 181666
2013-05-11 23:04:26 +00:00
Arnold Schwaighofer 1ba84df437 LoopVectorize: Use IRBuilder interface in more places
No functionality change intended.

llvm-svn: 181665
2013-05-11 23:04:24 +00:00
David Majnemer 470b077bca InstCombine: Turn urem to bitwise-and more often
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.

llvm-svn: 181661
2013-05-11 09:01:28 +00:00
Nadav Rotem cdfb48d2fe SLPVectorizer: Add support for trees with external users.
For example:
bar() {
  int a = A[i];
  int b = A[i+1];
  B[i] = a;
  B[i+1] = b;
  foo(a);  <--- a is used outside the vectorized expression.
}

llvm-svn: 181648
2013-05-10 22:59:33 +00:00
Nadav Rotem 0686e5cb05 Add a debug print
llvm-svn: 181647
2013-05-10 22:56:18 +00:00
Benjamin Kramer 14e915f7b4 InstCombine: Don't claim to be able to evaluate any shl in a zexted type.
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.

PR15959.

llvm-svn: 181604
2013-05-10 16:26:37 +00:00
Benjamin Kramer a6645e8b8f InstCombine: Verify the type before transforming uitofp into select.
PR15952.

llvm-svn: 181586
2013-05-10 09:16:52 +00:00
Dmitri Gribenko 9bf66a5fd0 Fix a documentation warning: \bried -> \brief
llvm-svn: 181551
2013-05-09 21:16:18 +00:00
Shuxin Yang 1d8d7e4d38 [GVN] Split critical-edge on the fly, instead of postpone edge-splitting to next
iteration.
  
  This on step toward non-iterative GVN. My local hack suggests that getting rid
of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++).
I cannot explain why not 2x or more at this moment.

llvm-svn: 181532
2013-05-09 18:34:27 +00:00
Rafael Espindola 007521673b Don't replace an alias in llvm.used with its target.
When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).

llvm-svn: 181524
2013-05-09 17:22:59 +00:00
Benjamin Kramer 21b972ae94 InstCombine: Don't just copy known bits from the first operand of an srem.
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.

llvm-svn: 181518
2013-05-09 16:32:32 +00:00
Arnold Schwaighofer 2e8c69cf97 LoopVectorizer: Don't assert on the absence of induction variables
A computable loop exit count does not imply the presence of an induction
variable. Scalar evolution can return a value for an infinite loop.

Fixes PR15926.

llvm-svn: 181495
2013-05-09 00:32:18 +00:00
Daniel Malea 3c5bed1670 Add DebugIR pass -- emits IR file and replace source lines with IR lines in MD
- requires existing debug information to be present
- fixes up file name and line number information in metadata
- emits a "<orig_filename>-debug.ll" succinct IR file (without !dbg metadata
  or debug intrinsics) that can be read by a debugger
- initialize pass in opt tool to enable the "-debug-ir" flag
- lit tests to follow

llvm-svn: 181467
2013-05-08 20:44:14 +00:00
Nick Lewycky 5fb1963f2a Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst
by switching to a ValueMap. Patch by Andrea DiBiagio!

llvm-svn: 181397
2013-05-08 09:00:10 +00:00
Arnold Schwaighofer 3610139ac5 LoopVectorizer: Improve reduction variable identification
The two nested loops were confusing and also conservative in identifying
reduction variables. This patch replaces them by a worklist based approach.

llvm-svn: 181369
2013-05-07 21:55:37 +00:00
Arnold Schwaighofer e78b76fbed LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.

Should fix PR15882.

llvm-svn: 181286
2013-05-07 04:37:05 +00:00
David Majnemer 70f286d95f InstCombine: (X ^ signbit) + C -> X + (signbit ^ C)
llvm-svn: 181249
2013-05-06 21:21:31 +00:00
Andrew Trick 9c72b071fe Rotate multi-exit loops even if the latch was simplified.
Test case by Michele Scandale!

Fixes PR10293: Load not hoisted out of loop with multiple exits.

There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.

This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.

I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.

(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:

- We need a strong guarantee that we won't endlessly rotate, so the
  analysis would need to be precise in order to avoid the
  SimplifiedLoopLatch precondition.

- Analysis like this are usually based on SCEV, which we don't want to
  rely on.

(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.

llvm-svn: 181230
2013-05-06 17:58:18 +00:00
Jean-Luc Duprat 3e4fc3ef24 Provide InstCombines for the following 3 cases:
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A

These come up in code that has been hand-optimized from a select to a linear blend, 
on platforms where that may have mattered. We want to undo such changes 
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B

llvm-svn: 181216
2013-05-06 16:55:50 +00:00
Nadav Rotem 632b25b743 Update the comment to mention that we use TTI.
llvm-svn: 181178
2013-05-06 03:06:36 +00:00
Nadav Rotem c70ef4e93c Revert r164763 because it introduces new shuffles.
Thanks Nick Lewycky for pointing this out.

llvm-svn: 181177
2013-05-06 02:39:09 +00:00
Rafael Espindola c229a4fff4 Fix const merging when an alias of a const is llvm.used.
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.

llvm-svn: 181175
2013-05-06 01:48:55 +00:00
Benjamin Kramer 3e3f2a4b8d LoopVectorize: Print values instead of pointers in debug output.
llvm-svn: 181157
2013-05-05 14:54:52 +00:00
Arnold Schwaighofer d96e427eac LoopVectorize: Add support for floating point min/max reductions
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.

radar://13723044

llvm-svn: 181144
2013-05-05 01:54:48 +00:00
Arnold Schwaighofer f5183729db LoopVectorizer: Cleanup of miminimum/maximum pattern match code
No need for setting the operands. The pointers are going to be bound by the
matcher.

radar://13723044

llvm-svn: 181142
2013-05-05 01:54:44 +00:00
Arnold Schwaighofer a670a0a3aa LoopVectorize: We don't need an identity element for min/max reductions
We can just use the initial element that feeds the reduction.

  max(max(x, y), z) == max(max(x,y), max(x,z))

radar://13723044

llvm-svn: 181141
2013-05-05 01:54:42 +00:00
Dmitri Gribenko 3238fb7595 Add ArrayRef constructor from None, and do the cleanups that this constructor enables
Patch by Robert Wilhelm.

llvm-svn: 181138
2013-05-05 00:40:33 +00:00
Nick Lewycky 881e9d62e2 Tabs to spaces. No functionality change.
llvm-svn: 181082
2013-05-04 01:08:15 +00:00
Shuxin Yang 637b9bebd4 Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No function change.
This function consists of following steps:
   1. Collect dependent memory accesses.
   2. Analyze availability.
   3. Perform fully redundancy elimination, or 
   4. Perform PRE, depending on the availability

 Step 2, 3 and 4 are now moved to three helper routines.

llvm-svn: 181047
2013-05-03 19:17:26 +00:00
Nadav Rotem 4ce060b3da LoopVectorizer: Add support for if-conversion of PHINodes with 3+ incoming values.
By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements.

We can now vectorize this loop:

int foo(int *A, int *B, int n) {
  for (int i=0; i < n; i++) {
    int x = 9;
    if (A[i] > B[i]) {
      if (A[i] > 19) {
        x = 3;
      } else if (B[i] < 4 ) {
        x = 4;
      } else {
        x = 5;
      }
    }
    A[i] = x;
  }
}

llvm-svn: 181037
2013-05-03 17:42:55 +00:00
Shuxin Yang af2c3ddf0d [GV] Remove dead code which is really difficult to decipher.
Actually it took me couple of hours trying to make sense of them and
only to find they are dead code.  I guess the original author used
"allSingleSucc" to indicate if there are any critial edge emanating
from some blocks, and tried to perform code motion (actually speculation)
in the presence of these critical edges; but later on he/she changed mind
and decided to perform edge-splitting first.

llvm-svn: 180951
2013-05-02 21:14:31 +00:00
Filip Pizlo dec20e43c0 This patch breaks up Wrap.h so that it does not have to include all of
the things, and renames it to CBindingWrapping.h.  I also moved 
CBindingWrapping.h into Support/.

This new file just contains the macros for defining different wrap/unwrap 
methods.

The calls to those macros, as well as any custom wrap/unwrap definitions 
(like for array of Values for example), are put into corresponding C++ 
headers.

Doing this required some #include surgery, since some .cpp files relied 
on the fact that including Wrap.h implicitly caused the inclusion of a 
bunch of other things.

This also now means that the C++ headers will include their corresponding 
C API headers; for example Value.h must include llvm-c/Core.h.  I think 
this is harmless, since the C API headers contain just external function 
declarations and some C types, so I don't believe there should be any 
nasty dependency issues here.

llvm-svn: 180881
2013-05-01 20:59:00 +00:00
Nadav Rotem 1e211913b5 SROA: Generate selects instead of shuffles when blending values because this is the cannonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.

llvm-svn: 180875
2013-05-01 19:53:30 +00:00
Jim Grosbach d11584a7f7 Revert "InstCombine: Fold more shuffles of shuffles."
This reverts commit r180802

There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.

llvm-svn: 180834
2013-05-01 00:25:27 +00:00
Richard Trieu 624c2ebcbb Fix a use after free. RI is freed before the call to getDebugLoc(). To
prevent this, capture the location before RI is freed.

llvm-svn: 180824
2013-04-30 22:45:10 +00:00
Nadav Rotem 9feda6071a Fix a typo
llvm-svn: 180806
2013-04-30 21:04:51 +00:00
Jim Grosbach 0b914fe839 InstCombine: Fold more shuffles of shuffles.
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.

rdar://13402653
PR15866

llvm-svn: 180802
2013-04-30 20:43:52 +00:00
Adrian Prantl 8beccf9e6d Spelling. Thanks, Eric.
llvm-svn: 180794
2013-04-30 17:33:32 +00:00
Adrian Prantl 0941638a1b Set debug locations for branch instructions created during inlining, even
the inlined function has multiple returns.

rdar://problem/12415623

llvm-svn: 180793
2013-04-30 17:08:16 +00:00
David Majnemer d73f37bb83 Fix a bug in foldSelectICmpAndOr.
Differences in bitwidth between X and Y could exist even if C1 and C2 have
the same Log2 representation.

llvm-svn: 180779
2013-04-30 10:36:33 +00:00
David Majnemer 8d048d0482 Fix "Combine bit test + conditional or into simple math"
This fixes the optimization introduced in r179748 and reverted in r179750.

While the optimization was sound, it did not properly respect differences in
bit-width.

llvm-svn: 180777
2013-04-30 08:57:58 +00:00
Arnold Schwaighofer 474df6d3ed SimplifyCFG: If convert single conditional stores
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:

This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

 a[i] =
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

llvm-svn: 180731
2013-04-29 21:28:24 +00:00
Michael Gottesman 03cf3c8966 Add in some conditional compilation in order to silence an unused variable warning.
llvm-svn: 180700
2013-04-29 07:29:08 +00:00
Michael Gottesman 214ca90f8e [objc-arc] Apply the RV optimization to retains next to calls in ObjCARCContract instead of ObjCARCOpts.
Turning retains into retainRV calls disrupts the data flow analysis in
ObjCARCOpts. Thus we move it as late as we can by moving it into
ObjCARCContract.

We leave in the conversion from retainRV -> retain in ObjCARCOpt since
it enables the dataflow analysis.

rdar://10813093

llvm-svn: 180698
2013-04-29 06:53:53 +00:00
Michael Gottesman 9c11815978 Added statistics to count the number of retains/releases before/after optimization.
llvm-svn: 180697
2013-04-29 06:16:57 +00:00
Michael Gottesman 8005ad3f3e Removed trailing whitespace.
llvm-svn: 180696
2013-04-29 06:16:55 +00:00
Michael Gottesman 3e3977c49f Fix for r180693. = /.
llvm-svn: 180694
2013-04-29 05:25:39 +00:00
Michael Gottesman a87bb8f50b [objc-arc-annotations] Moved the disabling of call movement to ConnectTDBUTraversals so that I can prevent Changed = true from being set. This prevents an infinite loop.
llvm-svn: 180693
2013-04-29 05:13:13 +00:00
Shuxin Yang 04a4fd43aa Fix a XOR reassociation bug.
When Reassociator optimize "(x | C1)" ^ "(X & C2)", it may swap the two
subexpressions, however, it forgot to swap cached constants (of C1 and C2)
accordingly.

rdar://13739160

llvm-svn: 180676
2013-04-27 18:02:12 +00:00
Adrian Prantl d00333a4b2 fix a typo that due to cu&paste quadrupled itself
rdar://problem/13056109

llvm-svn: 180618
2013-04-26 18:10:50 +00:00
Adrian Prantl 29b9de7bf1 Bugfix for the debug intrinsic handling in InstCombiner:
Since we can't guarantee that the original dbg.declare instrinsic
is removed by LowerDbgDeclare(), we need to make sure that we are
not inserting the same dbg.value intrinsic over and over.
This removes tons of redundant DIEs when compiling optimized code.

rdar://problem/13056109

llvm-svn: 180615
2013-04-26 17:48:33 +00:00
Nadav Rotem 13306816fc LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes.
llvm-svn: 180593
2013-04-26 05:08:59 +00:00
Michael Gottesman 47cf8a4c12 Revert "[objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that were once autoreleaseRV instructions."
This reverts commit r180222.

I think this might tie in with a different problem which will require a
different approach potentially. I am reverting this in the case I need to go
down that second path.

My apologies for the noise. = /.

llvm-svn: 180590
2013-04-26 01:12:18 +00:00
Nadav Rotem f43cbeee15 LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers.
llvm-svn: 180570
2013-04-25 19:55:03 +00:00
Michael Gottesman fdb497a9b2 [objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that were once autoreleaseRV instructions.
Due to the semantics of ARC, we must be extremely conservative with autorelease
calls inserted by the frontend since ARC gaurantees that said object will be in
the autorelease pool after that point, an optimization invariant that the
optimizer must respect.

On the other hand, we are allowed significantly more flexibility with
autoreleaseRV instructions.

Often times though this flexibility is disrupted by early transformations which
transform objc_autoreleaseRV => objc_autorelease if said instruction is no
longer being used as part of an RV pair (generally due to inlining). Since we
can not tell the difference in between an autorelease put into place by the
frontend and one created through said ``strength reduction'' we can not perform
these optimizations.

The addition of this set gets around said issues by allowing us to differentiate
in between said two cases.

rdar://problem/13697741.

llvm-svn: 180222
2013-04-24 22:18:18 +00:00
Michael Gottesman cd5b02701c Fixed comment typo.
llvm-svn: 180221
2013-04-24 22:18:15 +00:00
Arnold Schwaighofer 3fa801fbc2 LoopVectorizer: Change variable name Stride to ConsecutiveStride
This makes it easier to read the code.

No functionality change.

llvm-svn: 180197
2013-04-24 16:16:03 +00:00
Arnold Schwaighofer a6578f7056 LoopVectorize: Scalarize padded types
This patch disables memory-instruction vectorization for types that need padding
bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on
x86_64. Because the load/store vectorization is performed by the bit casting to
a packed vector, which has incompatible memory layout due to the lack of padding
bytes, the present vectorizer produces inconsistent result for memory
instructions of those types.
This patch checks an equality of the AllocSize of a scalar type and allocated
size for each vector element, to ensure that there is no padding bytes and the
array can be read/written using vector operations.

Patch by Daisuke Takahashi!

Fixes PR15758.

llvm-svn: 180196
2013-04-24 16:16:01 +00:00
Arnold Schwaighofer 23a0589bce LoopVectorizer: Bail out if we don't have datalayout we need it
llvm-svn: 180195
2013-04-24 16:15:58 +00:00
Adrian Prantl 15db52bf6d Make sure the instruction right after an inlined function has a
debug location. This solves a problem where range of an inlined
subroutine is emitted wrongly.
Patch by Manman Ren.

Fixes rdar://problem/12415623

llvm-svn: 180140
2013-04-23 19:56:03 +00:00
Nadav Rotem 71c9d6d333 LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order.
This fixes a miscompilation in FreeBSD's regex library.

llvm-svn: 180121
2013-04-23 17:12:42 +00:00
Pekka Jaaskelainen d3c90e132a Call the potentially costly isAnnotatedParallel() only once.
Made the uniform write test's checks a bit stricter.

llvm-svn: 180119
2013-04-23 16:44:43 +00:00
Pekka Jaaskelainen 6f2f66b63f Refuse to (even try to) vectorize loops which have uniform writes,
even if erroneously annotated with the parallel loop metadata.

Fixes Bug 15794: 
"Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata"

llvm-svn: 180081
2013-04-23 08:08:51 +00:00
Eric Christopher 04d4e9312c Move C++ code out of the C headers and into either C++ headers
or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

llvm-svn: 180063
2013-04-22 22:47:22 +00:00
Anat Shemer 10260a75e3 Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users.
llvm-svn: 180045
2013-04-22 20:51:10 +00:00
Rafael Espindola 74f2e46eef Clarify that llvm.used can contain aliases.
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.

llvm-svn: 180019
2013-04-22 14:58:02 +00:00
Benjamin Kramer 0212dc27ed SROA: Don't crash on a select with two identical operands.
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.

llvm-svn: 179982
2013-04-21 17:48:39 +00:00
Arnold Schwaighofer 6eb32b31bd Revert "SimplifyCFG: If convert single conditional stores"
There is the temptation to make this tranform dependent on target information as
it is not going to be beneficial on all (sub)targets. Therefore, we should
probably do this in MI Early-Ifconversion.

This reverts commit r179957. Original commit message:

"SimplifyCFG: If convert single conditional stores

This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

a[i] =
may-alias with a[i] load
if (cond)
    a[i] = Y
into an unconditional store.

a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up."

llvm-svn: 179980
2013-04-21 13:09:04 +00:00
Nadav Rotem c57af326a4 SLPVectorize: Add support for vectorization of casts.
llvm-svn: 179975
2013-04-21 08:05:59 +00:00
Nadav Rotem 98ad5f0f4c SLPVectorizer: Fix a bug in the code that scans the tree in search of nodes with multiple users.
We did not terminate the switch case and we executed the search routine twice.

llvm-svn: 179974
2013-04-21 07:37:56 +00:00
Michael Gottesman 3eab2e43d2 When we strength reduce an objc_retainBlock call to objc_retain, increment NumPeeps and make sure that Changed is set to true.
llvm-svn: 179968
2013-04-21 00:50:27 +00:00
Michael Gottesman 1e43004295 Fixed comment typo.
llvm-svn: 179967
2013-04-21 00:44:46 +00:00
Michael Gottesman df110ac9ec [objc-arc] Fixed typo in debug message.
llvm-svn: 179966
2013-04-21 00:30:50 +00:00
Michael Gottesman cdb7c15ce8 [objc-arc] Fixed comment typo.
llvm-svn: 179965
2013-04-21 00:25:04 +00:00
Michael Gottesman fb9ece9a7c [objc-arc] Refactored OptimizeReturns so that it uses continue instead of a large multi-level nested if statement.
llvm-svn: 179964
2013-04-21 00:25:01 +00:00
Michael Gottesman 01338a442a [objc-arc] Added debug statement saying when we are resetting a sequence's progress.
This will make it clearer when we are actually resetting a sequence's progress
vs just changing state. This is an important distinction because the former case
clears any pointers that we are tracking while the later does not.

llvm-svn: 179963
2013-04-20 23:36:57 +00:00
Nadav Rotem 8aca44a623 Fix PR15800. Do not try to vectorize vectors and structs.
llvm-svn: 179960
2013-04-20 22:29:43 +00:00
Arnold Schwaighofer 3546ccf465 SimplifyCFG: If convert single conditional stores
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:

 a[i] =
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up.

llvm-svn: 179957
2013-04-20 21:42:09 +00:00
Benjamin Kramer 519b2e3087 VecUtils: Clean up uses of dyn_cast.
llvm-svn: 179936
2013-04-20 10:36:17 +00:00
Benjamin Kramer 4600bcc337 SLPVectorizer: Strength reduce SmallVectors to ArrayRefs.
Avoids a couple of copies and allows more flexibility in the clients.

llvm-svn: 179935
2013-04-20 09:49:10 +00:00
Nadav Rotem ce2660d639 SLPVectorizer: Reduce the compile time by eliminating the search for some of the more expensive patterns. After this change will only check basic arithmetic trees that start at cmpinstr.
llvm-svn: 179933
2013-04-20 07:29:34 +00:00
Nadav Rotem 998e035cae refactor tryToVectorizePair to a new method that supports vectorization of lists.
llvm-svn: 179932
2013-04-20 07:22:58 +00:00
Nadav Rotem 890387289e Fix an unused variable warning.
llvm-svn: 179931
2013-04-20 06:40:28 +00:00
Nadav Rotem 83c7c41bc2 SLPVectorizer: Improve the cost model for loop invariant broadcast values.
llvm-svn: 179930
2013-04-20 06:13:47 +00:00
Nadav Rotem dfe1c93ca4 Report the number of stores that were found in the debug message.
llvm-svn: 179929
2013-04-20 05:23:11 +00:00
Nadav Rotem dfd8fcbb00 Fix the header comment.
llvm-svn: 179928
2013-04-20 05:18:51 +00:00
Nadav Rotem 5ed99674e9 Use 64bit arithmetic for calculating distance between pointers.
llvm-svn: 179927
2013-04-20 05:17:47 +00:00
Benjamin Kramer 630e6e1422 MergeFunc: Make pointer and integer types generate the same hash.
The logic that actually compares the types considers pointers and integers the
same if they are of the same size. This created a strange mismatch between hash
and reality and made the test case for this fail on some platforms (yay,
test cases).

llvm-svn: 179905
2013-04-19 23:06:44 +00:00
Arnold Schwaighofer 5146940316 LoopVectorizer: Use matcher from PatternMatch.h for the min/max patterns
Also make some static function class functions to avoid having to mention the
class namespace for enums all the time.

No functionality change intended.

llvm-svn: 179886
2013-04-19 21:03:36 +00:00
Jakub Staszak 99317268e2 Keep coding stanard. Don't use "else if" after "return".
llvm-svn: 179826
2013-04-19 01:18:04 +00:00
Bill Wendling 3b21eb69fb Implement a better fix for PR15185.
If the return type is a pointer and the call returns an integer, then do the
inttoptr convertions. And vice versa.

llvm-svn: 179817
2013-04-18 23:34:17 +00:00
Dmitri Gribenko d29ea04446 Fix a -Wdocumentation warning
llvm-svn: 179789
2013-04-18 20:13:04 +00:00
Anat Shemer 5570318f43 In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use.
llvm-svn: 179786
2013-04-18 19:56:44 +00:00
Anat Shemer 0c95efad7e Added a function scalarizePHI() that sclarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location.
llvm-svn: 179783
2013-04-18 19:35:39 +00:00
Chris Lattner 8cf09416ea Fix a comment, PR15777.
llvm-svn: 179775
2013-04-18 17:42:14 +00:00
Arnold Schwaighofer 4cd6aa110c LoopVectorizer: Recognize min/max reductions
A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y)
sequence in LLVM. If we see such a sequence we can treat it just as any other
commutative binary instruction and reduce it.

This appears to help bzip2 by about 1.5% on an imac12,2.

radar://12960601

llvm-svn: 179773
2013-04-18 17:22:34 +00:00
Benjamin Kramer 8df2cfb858 LoopVectorize: Use a set to avoid longer cycles in the reduction chain too.
Fixes PR15748.

llvm-svn: 179757
2013-04-18 14:29:13 +00:00
David Majnemer 81af06e003 Revert "Combine bit test + conditional or into simple math"
It is causing stage2 builds to fail, let's get them running again.

llvm-svn: 179750
2013-04-18 08:42:33 +00:00
David Majnemer bdf0caf6b1 Combine bit test + conditional or into simple math
Simplify:
(select (icmp eq (and X, C1), 0), Y, (or Y, C2))

Into:
(or (shl (and X, C1), C3), y)

Where:
C3 = Log(C2) - Log(C1)

If:
C1 and C2 are both powers of two

llvm-svn: 179748
2013-04-18 07:30:07 +00:00
Michael Gottesman 323964ca9e [objc-arc] Do not mismatch up retains inside a for loop with releases outside said for loop in the presense of differing provenance caused by escaping blocks.
This occurs due to an alloca representing a separate ownership from the
original pointer. Thus consider the following pseudo-IR:

  objc_retain(%a)
  for (...) {
    objc_retain(%a)
    %block <- %a
    F(%block)
    objc_release(%block)
  }
  objc_release(%a)

From the perspective of the optimizer, the %block is a separate
provenance from the original %a. Thus the optimizer pairs up the inner
retain for %a and the outer release from %a, resulting in segfaults.

This is fixed by noting that the signature of a mismatch of
retain/releases inside the for loop is a Use/CanRelease top down with an
None bottom up (since bottom up the Retain-CanRelease-Use-Release
sequence is completed by the inner objc_retain, but top down due to the
differing provenance from the objc_release said sequence is not
completed). In said case in CheckForCFGHazards, we now clear the state
of %a implying that no pairing will occur.

Additionally a test case is included.

rdar://12969722

llvm-svn: 179747
2013-04-18 05:39:45 +00:00
Michael Gottesman 9e5181393a Removed trailing whitespace.
llvm-svn: 179746
2013-04-18 04:34:11 +00:00
Michael Gottesman 4e88ce68ae [objc-arc] Added annotation option to only emit annotations for a specific ssa identifier.
llvm-svn: 179729
2013-04-17 21:59:41 +00:00
Michael Gottesman adb921affa Fixed typo.
llvm-svn: 179721
2013-04-17 21:03:53 +00:00
Michael Gottesman 6806b51ad2 [objc-arc] Added descriptions for EnableARCAnnotations, EnableCheckForCFGHazards, EnableARCOptimizations.
llvm-svn: 179718
2013-04-17 20:48:03 +00:00
Michael Gottesman ffef24f964 [objc-arc] Added an option to arc-annotations for turning off CheckForCFGHazard.
llvm-svn: 179717
2013-04-17 20:48:01 +00:00
Peter Collingbourne 37ae72b508 Do not optimise fprintf() calls if its return value is used.
Differential Revision: http://llvm-reviews.chandlerc.com/D620

llvm-svn: 179661
2013-04-17 02:01:10 +00:00
Hans Wennborg c9e1d99279 simplifycfg: Fix integer overflow converting switch into icmp.
If a switch instruction has a case for every possible value of its type,
with the same successor, SimplifyCFG would replace it with an icmp ult,
but the computation of the bound overflows in that case, which inverts
the test.

Patch by Jed Davis!

llvm-svn: 179587
2013-04-16 08:35:36 +00:00
Bill Wendling 3789171972 We are not able to bitcast a pointer to an integral value.
Two return types are not equivalent if one is a pointer and the other is an
integral. This is because we cannot bitcast a pointer to an integral value.
PR15185

llvm-svn: 179569
2013-04-15 22:33:50 +00:00
Nadav Rotem b9116e6966 SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops.
llvm-svn: 179562
2013-04-15 22:00:26 +00:00
Jim Grosbach 0f38c1e3a7 Fix a typo in comment.
llvm-svn: 179542
2013-04-15 17:40:48 +00:00
Nadav Rotem d4dcc003df Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer.
llvm-svn: 179508
2013-04-15 05:39:58 +00:00
Nadav Rotem a1e5e44eb3 Rename the slp-vectorizer clang/llvm flags. No functionality change.
llvm-svn: 179505
2013-04-15 04:54:42 +00:00
Nadav Rotem 5d393c416f SLPVectorizer: Add support for vectorizing trees that start at compare instructions.
llvm-svn: 179504
2013-04-15 04:25:27 +00:00
David Majnemer 1fae195557 Reorders two transforms that collide with each other
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224

llvm-svn: 179493
2013-04-14 21:15:43 +00:00
Benjamin Kramer 7d62ea86e5 Miscellaneous cleanups for VecUtils.h
llvm-svn: 179483
2013-04-14 09:33:08 +00:00
Nadav Rotem 3403c11529 SLP: Document the scalarization cost method.
llvm-svn: 179479
2013-04-14 07:22:22 +00:00
Nadav Rotem 54b413d157 SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree.
llvm-svn: 179475
2013-04-14 05:15:53 +00:00
Nadav Rotem 0b9cf8567b SLPVectorizer: add initial support for reduction variable vectorization.
llvm-svn: 179470
2013-04-14 03:22:20 +00:00
Benjamin Kramer adc1727c39 GlobalDCE: Fix an oversight in my last commit that could lead to crashes.
There is a Constant with non-constant operands: blockaddress.

llvm-svn: 179460
2013-04-13 16:11:14 +00:00
Benjamin Kramer 89ca4bc6d4 Fix a scalability issue with complex ConstantExprs.
This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExprs multiple times.

For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here, it may consist of thousands of
nodes.

The testcase exercises this by creating an insanely large ConstantExprs out of
a loop. It's questionable if the optimizer should ever create those, but this
can be triggered with real C code. Fixes PR15714.

llvm-svn: 179458
2013-04-13 12:53:18 +00:00
Benjamin Kramer e89c705030 InstCombine: Check the operand types before merging fcmp ord & fcmp ord.
Fixes PR15737.

llvm-svn: 179417
2013-04-12 21:56:23 +00:00
Nadav Rotem 8543ba3e52 SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from.
llvm-svn: 179414
2013-04-12 21:16:54 +00:00
Nadav Rotem 4da0ab1d68 Add debug prints.
llvm-svn: 179412
2013-04-12 21:11:14 +00:00
David Majnemer 1a08accbb7 Simplify (A & ~B) in icmp if A is a power of 2
The transform will execute like so:
(A & ~B) == 0 --> (A & B) != 0
(A & ~B) != 0 --> (A & B) == 0

llvm-svn: 179386
2013-04-12 17:25:07 +00:00
Arnold Schwaighofer f9cea17f75 LoopVectorizer: integer division is not a reduction operation
Don't classify idiv/udiv as a reduction operation. Integer division is lossy.
For example : (1 / 2) * 4 != 4/2.

Example:

int a[] = { 2, 5, 2, 2}
int x = 80;

for()
  x /= a[i];

Scalar:
  x /= 2 // = 40
  x /= 5 // = 8
  x /= 2 // = 4
  x /= 2 // = 2

Vectorized:

 <80, 1> / <2,5> //= <40,0>
 <40, 0> / <2,2> //= <20,0>

 20*0 = 0

radar://13640654

llvm-svn: 179381
2013-04-12 15:15:19 +00:00
David Majnemer b81cd63c4b Optimize icmp involving addition better
Allows LLVM to optimize sequences like the following:

%add = add nsw i32 %x, 1
%cmp = icmp sgt i32 %add, %y

into:

%cmp = icmp sge i32 %x, %y

as well as:

%add1 = add nsw i32 %x, 20
%add2 = add nsw i32 %y, 57
%cmp = icmp sge i32 %add1, %add2

into:

%add = add nsw i32 %y, 37
%cmp = icmp sle i32 %cmp, %x

llvm-svn: 179316
2013-04-11 20:05:46 +00:00
Benjamin Kramer a95f87494a Fix for wrong instcombine on vector insert/extract
When trying to collapse sequences of insertelement/extractelement
instructions into single shuffle instructions, there is one specific
case where the Instruction Combiner wrongly updates the resulting
Mask of shuffle indexes.

The problem is in function CollectShuffleElments.

If we have a sequence of insert/extract element instructions
like the one below:

  %tmp1 = extractelement <4 x float> %LHS, i32 0
  %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
  %tmp3 = extractelement <4 x float> %RHS, i32 2
  %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3

Where:
  . %RHS will have a mask of [4,5,6,7]
  . %LHS will have a mask of [0,1,2,3]

The Mask of shuffle indexes is wrongly computed to [4,1,6,7]
instead of [4,0,6,7].
When analyzing %tmp2 in order to compute the Mask for the
resulting shuffle instruction, the algorithm forgets to update
the mask index at position 1 with the index associated to the
element extracted from %LHS by instruction %tmp1.

Patch by Andrea DiBiagio!

llvm-svn: 179291
2013-04-11 15:10:09 +00:00
Alexey Samsonov a28f36c2e2 [ASan] Allow disabling init-order checks for globals by source file name.
llvm-svn: 179280
2013-04-11 13:20:00 +00:00
Benjamin Kramer c86fdf12e8 Rename the C function to create a SLPVectorizerPass to something sane and expose it in the header file.
llvm-svn: 179272
2013-04-11 11:36:36 +00:00
Nadav Rotem 73dffa4184 Make the SLP store-merger less paranoid about function calls. We check for function calls when we check if it is safe to sink instructions.
llvm-svn: 179207
2013-04-10 19:41:36 +00:00
Nadav Rotem 88dd5f7a38 We require DataLayout for analyzing the size of stores.
llvm-svn: 179206
2013-04-10 18:57:27 +00:00
Joey Gouly 81259294be Change CloneFunctionInto to always clone Argument attributes induvidually,
rather than checking if the source and destination have the same number of
arguments and copying the attributes over directly.

llvm-svn: 179169
2013-04-10 10:37:38 +00:00
Bob Wilson 798a7709fc Fix some comment typos.
llvm-svn: 179132
2013-04-09 22:15:51 +00:00
Nadav Rotem 2d9dec322e Add support for bottom-up SLP vectorization infrastructure.
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations.
The infrastructure has three potential users:

  1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).

  2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.

  3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.

This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code:

void SAXPY(int *x, int *y, int a, int i) {
  x[i]   = a * x[i]   + y[i];
  x[i+1] = a * x[i+1] + y[i+1];
  x[i+2] = a * x[i+2] + y[i+2];
  x[i+3] = a * x[i+3] + y[i+3];
}

llvm-svn: 179117
2013-04-09 19:44:35 +00:00
Shuxin Yang 331f01dcb4 Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate.
I brazenly think this change is slightly simpler than r178793 because: 
  - no "state" in functor
  - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" 

  While I can reproduce the probelm in Valgrind, it is rather difficult to come up
a standalone testing case. The reason is that when an iterator is invalidated,
the stale invalidated elements are not yet clobbered by nonsense data, so the
optimizer can still proceed successfully. 

  Thank Benjamin for fixing this bug and generously providing the test case.

llvm-svn: 179062
2013-04-08 22:00:43 +00:00
Chandler Carruth 0e8a52d18f Fix PR15674 (and PR15603): a SROA think-o.
The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.

All of the existing tests continue to pass, and I've added a test from
the PR.

llvm-svn: 178974
2013-04-07 11:47:54 +00:00
Michael Gottesman 7924997c36 Removed trailing whitespace.
llvm-svn: 178932
2013-04-05 23:46:45 +00:00
Michael Gottesman 31ba23aa56 An objc_retain can serve as a use for a different pointer.
This is the counterpart to commit r160637, except it performs the action
in the bottomup portion of the data flow analysis.

llvm-svn: 178922
2013-04-05 22:54:32 +00:00
Michael Gottesman 1d8d25777d Properly model precise lifetime when given an incomplete dataflow sequence.
The normal dataflow sequence in the ARC optimizer consists of the following
states:

    Retain -> CanRelease -> Use -> Release

The optimizer before this patch stored the uses that determine the lifetime of
the retainable object pointer when it bottom up hits a retain or when top down
it hits a release. This is correct for an imprecise lifetime scenario since what
we are trying to do is remove retains/releases while making sure that no
``CanRelease'' (which is usually a call) deallocates the given pointer before we
get to the ``Use'' (since that would cause a segfault).

If we are considering the precise lifetime scenario though, this is not
correct. In such a situation, we *DO* care about the previous sequence, but
additionally, we wish to track the uses resulting from the following incomplete
sequences:

  Retain -> CanRelease -> Release   (TopDown)
  Retain <- Use <- Release          (BottomUp)

*NOTE* This patch looks large but the most of it consists of updating
test cases. Additionally this fix exposed an additional bug. I removed
the test case that expressed said bug and will recommit it with the fix
in a little bit.

llvm-svn: 178921
2013-04-05 22:54:28 +00:00
Jim Grosbach bdbd73460c Tidy up a bit. No functional change.
llvm-svn: 178915
2013-04-05 21:20:12 +00:00
Shuxin Yang 95adf5258f Disable the optimization about promoting vector-element-access with symbolic index.
This optimization is unstable at this moment; it 
  1) block us on a very important application
  2) PR15200
  3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
     (the CHECK command compare the output against wrong result)

   I personally believe this optimization should not have any impact on the
autovectorized code, as auto-vectorizer is supposed to put gather/scatter
in a "right" way.  Although in theory downstream optimizaters might reveal 
some gather/scatter optimization opportunities, the chance is quite slim.

   For the hand-crafted vectorizing code, in term of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result,
but in much simpler way. On the other hand, these optimizers are able to 
improve the code in a incremental way; in contrast, SROA is sort of all-or-none
approach. However, SROA might slighly win in stack size, as it tries to figure 
out a stretch of memory tightenly cover the area accessed by the dynamic index.

 rdar://13174884
 PR15200

llvm-svn: 178912
2013-04-05 21:07:08 +00:00
Michael Gottesman bab49e976b Added two debug logging messages to VisitInstructionsTopDown to match VisitInstructionsBottomUp.
llvm-svn: 178895
2013-04-05 18:26:08 +00:00
Michael Gottesman 89279f8383 Cleaned up whitespace and made debug logging less verbose.
llvm-svn: 178893
2013-04-05 18:10:41 +00:00
Arnold Schwaighofer df6f67ed87 LoopVectorizer: Pass OperandValueKind information to the cost model
Pass down the fact that an operand is going to be a vector of constants.

This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86
back. It had degraded to scalar performance due to my pervious shift cost change
that made all shifts expensive on x86.

radar://13576547

llvm-svn: 178809
2013-04-04 23:26:27 +00:00
Benjamin Kramer dd67654af6 Reassociate: Avoid iterator invalidation.
OpndPtrs stored pointers into the Opnd vector that became invalid when the
vector grows. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.

llvm-svn: 178793
2013-04-04 21:15:42 +00:00
Michael Gottesman 21a4ed3227 Refactored out the helper method FindPredecessorAutoreleaseWithSafePath from ObjCARCOpt::OptimizeReturns.
Now ObjCARCOpt::OptimizeReturns is easy to read and reason about.

llvm-svn: 178715
2013-04-03 23:39:14 +00:00
Michael Gottesman 6908db148b Refactored out the helper function FindPredecessorRetainWithSafePath from ObjCARCOpt::OptimizeReturns.
llvm-svn: 178714
2013-04-03 23:16:05 +00:00
Michael Gottesman c2d5bf5c53 Small cleanups.
Cleaned up trailing whitespace and added extra slashes in front of a
function level comment so that it follow the convention of having 3
slashes.

llvm-svn: 178712
2013-04-03 23:07:45 +00:00
Michael Gottesman 54dc7fdefb Refactored out a part of ObjCARCOpt::OptimizeReturns into its own method HasSafePathToPredecessorCall.
llvm-svn: 178710
2013-04-03 23:04:28 +00:00
Michael Gottesman 0a1748bb8c Removed an old comment.
llvm-svn: 178709
2013-04-03 23:04:24 +00:00
Michael Gottesman 43e7e00a68 Clean up arc annotations by moving the top/bottom BB annotations into conditional macros that no-op in Release mode instead of #ifdef sections of the code.
This is to follow the example of the DEBUG macro.

llvm-svn: 178705
2013-04-03 22:41:59 +00:00
Michael Gottesman b8c8836594 Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue.
The semantics of ARC implies that a pointer passed into an objc_autorelease
must live until some point (potentially down the stack) where an
autorelease pool is popped. On the other hand, an
objc_autoreleaseReturnValue just signifies that the object must live
until the end of the given function at least.

Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in
terms of the semantics of ARC* implying that performing the given
strength reduction without any knowledge of how this relates to
the autorelease pool pop that is further up the stack violates the
semantics of ARC.

*Even though objc_autoreleaseReturnValue if you know that no RV
optimization will occur is more computationally expensive.

llvm-svn: 178612
2013-04-03 02:57:24 +00:00
Michael Gottesman 624243914f Improved comment. No functionality change.
llvm-svn: 178605
2013-04-03 01:57:16 +00:00
Bill Wendling 88d06c3b2d Use a worklist to avoid a sneaky iterator invalidation.
The iterator could be invalidated when it's recursively deleting a whole bunch
of constant expressions in a constant initializer.

Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was
run on a `.ll' file, it wouldn't crash. This is why the test first pushes the
`.ll' file through `llvm-as' before feeding it to `opt'.

PR15440

llvm-svn: 178531
2013-04-02 08:16:45 +00:00
Shuxin Yang 6662fd0f15 Correct assertion condition
llvm-svn: 178484
2013-04-01 18:13:05 +00:00
Shuxin Yang 7b0c94e207 Implement XOR reassociation. It is based on following rules:
rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2),
     only useful when c1=c2
  rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
  rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
  rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2

 It reduces an application's size (in terms of # of instructions) by 8.9%.
 Reviwed by Pete Cooper. Thanks a lot!

 rdar://13212115  

llvm-svn: 178409
2013-03-30 02:15:01 +00:00
Michael Gottesman 3b8f877860 Add clang.arc.used to ModuleHasARC so ARC always runs if said call is present in a module.
clang.arc.used is an interesting call for ARC since ObjCARCContract
needs to run to remove said intrinsic to avoid a linker error (since the
call does not exist).

llvm-svn: 178369
2013-03-29 21:15:23 +00:00
Michael Gottesman 60f6b28c58 Removed trailing whitespace.
llvm-svn: 178329
2013-03-29 05:13:07 +00:00
Michael Gottesman ba64859e6e Removed dead code from ObjCARCOpts relating to tracking objc_retainBlocks through the ARC Dataflow analysis. By the time we get to the ARC dataflow analysis, any objc_retainBlock calls are not optimizable.
llvm-svn: 178306
2013-03-28 23:08:44 +00:00
Bill Wendling 85722f48da Minor simplification.
Go ahead and use the full path for both the .gcno and .gcda files.

llvm-svn: 178302
2013-03-28 22:40:08 +00:00
Michael Gottesman 49f9885a2a Non optimizable objc_retainBlock calls are not forwarding.
Since we handle optimizable objc_retainBlocks through strength reduction
in OptimizableIndividualCalls, we know that all code after that point
will only see non-optimizable objc_retainBlock calls. IsForwarding is
only called by functions after that point, so it is ok to just classify
objc_retainBlock as non-forwarding.

<rdar://problem/13249661>.

llvm-svn: 178285
2013-03-28 20:11:30 +00:00
Michael Gottesman 158fdf699e [ObjCARC] Strength reduce objc_retainBlock -> objc_retain if the objc_retainBlock is optimizable.
If an objc_retainBlock has the copy_on_escape metadata attached to it
AND if the block pointer argument only escapes down the stack, we are
allowed to strength reduce the objc_retainBlock to to an objc_retain and
thus optimize it.

Current there is logic in the ARC data flow analysis to handle
this case which is complicated and involved making distinctions in
between objc_retainBlock and objc_retain in certain places and
considering them the same in others.

This patch simplifies said code by:

1. Performing the strength reduction in the initial ARC peephole
analysis (ObjCARCOpts::OptimizeIndividualCalls).

2. Changes the ARC dataflow analysis (which runs after the peephole
analysis) to consider all objc_retainBlock calls to not be optimizable
(since if the call was optimizable, we would have strength reduced it
already).

This patch leaves in the infrastructure in the ARC dataflow analysis to
handle this case, which due to 2 will just be dead code. I am doing this
on purpose to separate the removal of the old code from the testing of
the new code.

<rdar://problem/13249661>.

llvm-svn: 178284
2013-03-28 20:11:19 +00:00
Kostya Serebryany 463aa81418 [tsan] make sure memset/memcpy/memmove are not inlined in tsan mode
llvm-svn: 178230
2013-03-28 11:21:13 +00:00
Akira Hatanaka 99866dd535 Check if Type is a vector before calling function Type::getVectorNumElements.
llvm-svn: 178208
2013-03-28 01:28:02 +00:00
Bill Wendling 5aa82397e8 Use the full path when outputting the `.gcda' file.
If we compile a single source program, the `.gcda' file will be generated where
the program was executed. This isn't desirable, because that place may be at an
unpredictable place (the program could call `chdir' for instance).

Instead, we will output the `.gcda' file in the same place we output the `.gcno'
file. I.e., the directory where the executable was generated. This matches GCC's
behavior.

<rdar://problem/13061072> & PR11809

llvm-svn: 178084
2013-03-26 22:47:50 +00:00
Ulrich Weigand 8a51d8ea95 Make InstCombineCasts.cpp:OptimizeIntToFloatBitCast endian safe.
The OptimizeIntToFloatBitCast converts shift-truncate sequences
into extractelement operations.  The computation of the element
index to be used in the resulting operation is currently only
correct for little-endian targets.

This commit fixes the element index computation to be correct
for big-endian targets as well.  If the target byte order is
unknown, the optimization cannot be performed at all.

llvm-svn: 178031
2013-03-26 15:36:14 +00:00
Alexey Samsonov e1e26bf158 [ASan] Change the ABI of __asan_before_dynamic_init function: now it takes pointer to private string with module name. This string serves as a unique module ID in ASan runtime. LLVM part
llvm-svn: 178013
2013-03-26 13:05:41 +00:00
Michael Gottesman cd4de0f9bb [ObjCARC Annotations] Added support for displaying the state of pointers at the bottom/top of BBs of the ARC dataflow analysis for both bottomup and topdown analyses.
This will allow for verification and analysis of the merge function of
the data flow analyses in the ARC optimizer.

The actual implementation of this feature is by introducing calls to
the functions llvm.arc.annotation.{bottomup,topdown}.{bbstart,bbend}
which are only declared. Each such call takes in a pointer to a global
with the same name as the pointer whose provenance is being tracked and
a pointer whose name is one of our Sequence states and points to a
string that contains the same name.

To ensure that the optimizer does not consider these annotations in any
way, I made it so that the annotations are considered to be of IC_None
type.

A test case is included for this commit and the previous
ObjCARCAnnotation commit.

llvm-svn: 177952
2013-03-26 00:42:09 +00:00
Michael Gottesman 81b1d43783 [ObjCARC Annotations] Implemented ARC annotation metadata to expose the ARC data flow analysis state in the IR via metadata.
Previously the inner works of the data flow analysis in ObjCARCOpts was hard to
get out of the optimizer for analysis of bugs or testing. All of the current ARC
unit tests are based off of testing the effect of the data flow
analysis (i.e. what statements are removed or moved, etc.). This creates
weakness in the current unit testing regimem since we are not actually testing
what effects various instructions have on the modeled pointer state.
Additionally in order to analyze a bug in the optimizer, one would need to track
by hand what the optimizer was actually doing either through use of DEBUG
statements or through the usage of a debugger, both yielding large loses in
developer productivity.

This patch deals with these two issues by providing ARC annotation
metadata that annotates instructions with the state changes that they cause in
various pointers as well as provides metadata to annotate provenance sources.

Specifically, we introduce the following metadata types:

1. llvm.arc.annotation.bottomup.
2. llvm.arc.annotation.topdown.
3. llvm.arc.annotation.provenancesource.

llvm.arc.annotation.{bottomup,topdown}: These annotations describes a state
change in a pointer when we are visiting instructions bottomup/topdown
respectively. The output format for both is the same:

  !1 = metadata !{metadata !"(test,%x)", metadata !"S_Release", metadata !"S_Use"}

The first element is a string tuple with the following format:

  (function,variable name)

The second two elements of the metadata show the previous state of the
pointer (in this case S_Release) and the new state of the pointer (S_Use). We
write the metadata in such a manner to ensure that it is easy for outside tools
to parse. This is important since I am currently working on a tool for taking
this information and pretty printing it besides the IR and that can be used for
LIT style testing via the generation of an index.

llvm.arc.annotation.provenancesource: This metadata is used to annotate
instructions which act as provenance sources, i.e. ones that introduce a
new (from the optimizer's perspective) non-argument pointer to track. This
enables cross-referencing in between provenance sources and the state changes
that occur to them.

This is still a work in progress. Additionally I plan on committing
later today additions to the annotations that annotate at the top/bottom
of basic blocks the state of the various pointers being tracked.

*NOTE* The metadata support is conditionally compiled into libObjCARCOpts only
when we are producing a debug build of llvm/clang and even so are
disabled by default. To enable the annotation metadata, pass in
-enable-objc-arc-annotations to opt.

llvm-svn: 177951
2013-03-26 00:42:04 +00:00
Shuxin Yang 389ed4b8f7 Fix a bug in fast-math fadd/fsub simplification.
The problem is that the code mistakenly took for granted that following constructor 
is able to create an APFloat from a *SIGNED* integer:
   
  APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value)

rdar://13486998

llvm-svn: 177906
2013-03-25 20:43:41 +00:00
Arnaud A. de Grandmaison 3ee88e8a77 Address issues found by Duncan during post-commit review of r177856.
llvm-svn: 177863
2013-03-25 11:47:38 +00:00
Arnaud A. de Grandmaison 9c383d68cf InstCombine: simplify comparisons to zero of (shl %x, Cst) or (mul %x, Cst)
This simplification happens at 2 places :
 - using the nsw attribute when the shl / mul is used by a sign test
 - when the shl / mul is compared for (in)equality to zero

llvm-svn: 177856
2013-03-25 09:48:49 +00:00
Michael Gottesman 65c2481d09 Changed isNullOrUndef => IsNullOrUndef and isNoopInstruction => IsNoopInstruction so that all helper functions are named similarly in ObjCARC.h.
llvm-svn: 177855
2013-03-25 09:27:43 +00:00
Jakub Staszak 4f9d1e85d0 Minor cleanups. No functionality change.
llvm-svn: 177837
2013-03-24 09:56:28 +00:00
Jakub Staszak f6df1e3def Use dyn_cast instead of isa && cast.
No functionality change.

llvm-svn: 177836
2013-03-24 09:25:47 +00:00
Michael Gottesman 764b1cfced Change method name ClearRefCount => ClearKnownPositiveRefCount to match the name of the member that it is modifying.
llvm-svn: 177818
2013-03-23 05:46:19 +00:00
Michael Gottesman 07beea47b8 Changed the method name PtrState.IsKnownIncremented() to PtrState.HasKnownPositiveRefCount().
Now said method matches namewise every other method which refers to
the member KnownPositiveRefCount of the class PtrState.

llvm-svn: 177816
2013-03-23 05:31:01 +00:00
John McCall 20182ac0c7 Kill every call to @clang.arc.use in the ARC contract phase.
llvm-svn: 177769
2013-03-22 21:38:36 +00:00
Bill Wendling 56f15bf490 Add all clauses when merging the landing pads. Duplicates will be handled later on.
llvm-svn: 177757
2013-03-22 20:31:05 +00:00
Bill Wendling a397c017bb Don't use the removed API.
llvm-svn: 177749
2013-03-22 18:49:53 +00:00
Kostya Serebryany cdd35a9050 [asan] Change the way we report the alloca frame on stack-buff-overflow.
Before: the function name was stored by the compiler as a constant string
and the run-time was printing it.
Now: the PC is stored instead and the run-time prints the full symbolized frame.
This adds a couple of instructions into every function with non-empty stack frame,
but also reduces the binary size because we store less strings (I saw 2% size reduction).
This change bumps the asan ABI version to v3.

llvm part.

Example of report (now):
==31711==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffa77cf1c5 at pc 0x41feb0 bp 0x7fffa77cefb0 sp 0x7fffa77cefa8
READ of size 1 at 0x7fffa77cf1c5 thread T0
    #0 0x41feaf in Frame0(int, char*, char*, char*) stack-oob-frames.cc:20
    #1 0x41f7ff in Frame1(int, char*, char*) stack-oob-frames.cc:24
    #2 0x41f477 in Frame2(int, char*) stack-oob-frames.cc:28
    #3 0x41f194 in Frame3(int) stack-oob-frames.cc:32
    #4 0x41eee0 in main stack-oob-frames.cc:38
    #5 0x7f0c5566f76c (/lib/x86_64-linux-gnu/libc.so.6+0x2176c)
    #6 0x41eb1c (/usr/local/google/kcc/llvm_cmake/a.out+0x41eb1c)
Address 0x7fffa77cf1c5 is located in stack of thread T0 at offset 293 in frame
    #0 0x41f87f in Frame0(int, char*, char*, char*) stack-oob-frames.cc:12  <<<<<<<<<<<<<< this is new
  This frame has 6 object(s):
    [32, 36) 'frame.addr'
    [96, 104) 'a.addr'
    [160, 168) 'b.addr'
    [224, 232) 'c.addr'
    [288, 292) 's'
    [352, 360) 'd'

llvm-svn: 177724
2013-03-22 10:37:20 +00:00
Dmitry Vyukov 55e63ef454 tsan: handle vptr loads specially
This is required to determine ctor/dtor vs virtual call races.
http://llvm-reviews.chandlerc.com/D566

llvm-svn: 177717
2013-03-22 08:51:22 +00:00
Evgeniy Stepanov 2a066afce5 Fix llvm::removeUnreachableBlocks to handle unreachable loops.
llvm-svn: 177713
2013-03-22 08:43:04 +00:00
Arnaud A. de Grandmaison f364bc63e7 InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test.
The original code used i32, and i64 if legal. This introduced unneeded
casts when they aren't legal, or when the index variable i has another
type. In order of preference: try to use i's type; use the smallest
fitting legal type (using an added DataLayout method); default to i32.
A testcase checks that this works when the index gep operand is i16.

Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com>
Reviewed by : Duncan

llvm-svn: 177712
2013-03-22 08:25:01 +00:00
Bill Wendling 173c71ff3d Always forward 'resume' instructions to the outter landing pad.
How did this ever work?

Basically, if you have a function that's inlined into the caller, it may not
have any 'call' instructions, but any 'resume' instructions it may have should
still be forwarded to the outer (caller's) landing pad. This requires that all
of the 'landingpad' instructions in the callee have their clauses merged with
the caller's outer 'landingpad' instruction (hence the bit of ugly code in the
`forwardResume' method).

Testcase in a follow commit to the test-suite repository.

<rdar://problem/13360379> & PR15555

llvm-svn: 177680
2013-03-21 23:30:12 +00:00
Chandler Carruth 34f0c7fcaf [SROA] Prefix names using a custom IRBuilder inserter.
The key part of this is ensuring that name prefixes remain in a Twine
form until we get to a point where we can nuke them under NDEBUG. This
is tricky using the old APIs as they played fast and loose with Twine,
which is prone to serious error. The inserter is much cleaner as it is
actually in the call stack leading to the setName call, and so has
a good opportunity to prepend the prefix.

This matters more than you might imagine because most runs over an
alloca find a single partition, and rewrite 3 or 4 instructions
referring to it. As a consequence doing this lazily and exclusively with
Twine allows the optimizer to delete more of it and shaves another 2% to
3% off of the release build's SROA run time for PR15412. I also think
the APIs are cleaner, and the use of Twine is more reliable, so
I consider it a win-win despite the churn required to reach this state.

llvm-svn: 177631
2013-03-21 09:52:18 +00:00
Evgeniy Stepanov a9a962cad8 [msan] Add an option to disable poisoning of shadow for undef values.
llvm-svn: 177630
2013-03-21 09:38:26 +00:00
Meador Inge cf691565ed simplify-libcalls: Removed unused variable
The 'Modified' variable should have been removed from SimplifyLibCalls
in r177619, but was missed.  This commit removes it.

llvm-svn: 177622
2013-03-21 02:44:07 +00:00
Meador Inge 6b6a161ccf Move library call prototype attribute inference to functionattrs
The simplify-libcalls pass implemented a doInitialization hook to infer
function prototype attributes for well-known functions.  Given that the
simplify-libcalls pass is going away *and* that the functionattrs pass
is already in place to deduce function attributes, I am moving this logic
to the functionattrs pass.  This approach was discussed during patch
review:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html.

llvm-svn: 177619
2013-03-21 00:55:59 +00:00
Bill Wendling c77e9440cf Call the new llvm_gcov_init function to register the environment.
Use the new `llvm_gcov_init' function to register the writeout and flush
functions. The initialization function will also call `atexit' for some cleanups
and final writout calls. But it does this only once. This is better than
checking for the `main' function, because in a library that function may not
exist.
<rdar://problem/12439551>

llvm-svn: 177579
2013-03-20 21:13:59 +00:00
Chandler Carruth 0fad17527b Fix a silly search-and-replace goof with r177495 that only broke
non-release builds.

llvm-svn: 177498
2013-03-20 07:40:56 +00:00
Chandler Carruth d177f86124 [SROA] Don't preserve the IR names in release builds.
This is espcially important because the new SROA pass goes to great
lengths to provide helpful names for debugging, and as a consequence
they can become very slow to render.

Good for between 5% and 15% of the SROA runtime on some slow test cases
such as the one in PR15412.

llvm-svn: 177495
2013-03-20 07:30:36 +00:00
Chandler Carruth 0941b66283 Move the endif to the correct line so we don't have warnings about
unused statistics variables.

llvm-svn: 177494
2013-03-20 06:47:00 +00:00
Chandler Carruth 5f5b616344 Introduce some new statistics to help track the exact behavior of the
new SROA pass.

llvm-svn: 177493
2013-03-20 06:30:46 +00:00
Quentin Colombet 2393cb92b8 Update global merge pass according to Duncan's advices:
- Remove useless includes
- Change misleading comments
- Move code into doFinalization

llvm-svn: 177445
2013-03-19 21:46:49 +00:00
Bill Wendling 04d57c7b2c Register the GCOV writeout functions so that they're emitted serially.
We don't want to write out >1000 files at the same time. That could make things
prohibitively expensive. Instead, register the "writeout" function so that it's
emitted serially.
<rdar://problem/12439551>

llvm-svn: 177437
2013-03-19 21:03:22 +00:00
Arnaud A. de Grandmaison 87c473f0d1 IndVarSimplify: do not recompute an IV value outside of the loop if :
- it is trivially known to be used inside the loop in a way that can not be optimized away
- there is no use outside of the loop which can take advantage of the computation hoisting

llvm-svn: 177432
2013-03-19 20:00:22 +00:00
Andrew Trick f3a2544dba Revert "Cleanup some SCEV logic a bit."
This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32.

Just add a comment instead!

llvm-svn: 177377
2013-03-19 05:10:27 +00:00
Andrew Trick de78866594 Cleanup some SCEV logic a bit.
Make the code more obvious to scan-build and humans.

llvm-svn: 177375
2013-03-19 04:14:59 +00:00
Andrew Trick a1c01ba8c7 Tighten up an internal LSR API that should check for NULL.
No test case, but should fix a scan_build warning.

llvm-svn: 177374
2013-03-19 04:14:57 +00:00
Nick Lewycky d67186337a Emit the linkage name instead of the function name, when available. This means
that we'll prefer to emit the mangled C++ name (pending a clang change).

llvm-svn: 177371
2013-03-19 01:37:55 +00:00
Jakub Staszak bc421efddf Make method private. Keep coding standard.
llvm-svn: 177348
2013-03-18 23:31:30 +00:00
Bill Wendling c3cab816bb Register the flush function for each compile unit.
For each compile unit, we want to register a function that will flush that
compile unit. Otherwise, __gcov_flush() would only flush the counters within the
current compile unit, and not any outside of it.

PR15191 & <rdar://problem/13167507>

llvm-svn: 177340
2013-03-18 23:04:39 +00:00
Quentin Colombet 8fc340976d Extend global merge pass to optionally consider global constant variables.
Also add some checks to not merge globals used within landing pad instructions or marked as "used".

llvm-svn: 177331
2013-03-18 22:30:07 +00:00
Kostya Serebryany 10cc12f2b7 [asan] when creating string constants, set unnamed_attr and align 1 so that equal strings are merged by the linker. Observed up to 1% binary size reduction. Thanks to Anton Korobeynikov for the suggestion
llvm-svn: 177264
2013-03-18 09:38:39 +00:00
Chandler Carruth f74654d274 Mark internal classes as POD-like to get better behavior out of
SmallVector and DenseMap.

This speeds up SROA by 25% on PR15412.

llvm-svn: 177259
2013-03-18 08:36:46 +00:00
Kostya Serebryany bd016bb614 [asan] while generating the description of a global variable, emit the module name in a separate field, thus not duplicating this information if every description. This decreases the binary size (observed up to 3%). https://code.google.com/p/address-sanitizer/issues/detail?id=168 . This changes the asan API version. llvm-part
llvm-svn: 177254
2013-03-18 08:05:29 +00:00
Kostya Serebryany 6b5b58deeb [asan] don't instrument functions with available_externally linkage. This saves a bit of compile time and reduces the number of redundant global strings generated by asan (https://code.google.com/p/address-sanitizer/issues/detail?id=167)
llvm-svn: 177250
2013-03-18 07:33:49 +00:00
Arnold Schwaighofer c63cf3a0ae LoopVectorize: Invert case when we use a vector cmp value to query select cost
We generate a select with a vectorized condition argument when the condition is
NOT loop invariant. Not the other way around.

llvm-svn: 177098
2013-03-14 18:54:36 +00:00
Shuxin Yang 2eca602f8b Perform factorization as a last resort of unsafe fadd/fsub simplification.
Rules include:
  1)1 x*y +/- x*z => x*(y +/- z) 
    (the order of operands dosen't matter)

  2) y/x +/- z/x => (y +/- z)/x 

 The transformation is disabled if the new add/sub expr "y +/- z" is a 
denormal/naz/inifinity.

rdar://12911472

llvm-svn: 177088
2013-03-14 18:08:26 +00:00
Alexey Samsonov 819eddc3ce [ASan] emit instrumentation for initialization order checking by default
llvm-svn: 177063
2013-03-14 12:38:58 +00:00
Chandler Carruth a1c54bbe34 PR14972: SROA vs. GVN exposed a really bad bug in SROA.
The fundamental problem is that SROA didn't allow for overly wide loads
where the bits past the end of the alloca were masked away and the load
was sufficiently aligned to ensure there is no risk of page fault, or
other trapping behavior. With such widened loads, SROA would delete the
load entirely rather than clamping it to the size of the alloca in order
to allow mem2reg to fire. This was exposed by a test case that neatly
arranged for GVN to run first, widening certain loads, followed by an
inline step, and then SROA which miscompiles the code. However, I see no
reason why this hasn't been plaguing us in other contexts. It seems
deeply broken.

Diagnosing all of the above took all of 10 minutes of debugging. The
really annoying aspect is that fixing this completely breaks the pass.
;] There was an implicit reliance on the fact that no loads or stores
extended past the alloca once we decided to rewrite them in the final
stage of SROA. This was used to encode information about whether the
loads and stores had been split across multiple partitions of the
original alloca. That required threading explicit tracking of whether
a *use* of a partition is split across multiple partitions.

Once that was done, another problem arose: we allowed splitting of
integer loads and stores iff they were loads and stores to the entire
alloca. This is a really arbitrary limitation, and splitting at least
some integer loads and stores is crucial to maximize promotion
opportunities. My first attempt was to start removing the restriction
entirely, but currently that does Very Bad Things by causing *many*
common alloca patterns to be fully decomposed into i8 operations and
lots of or-ing together to produce larger integers on demand. The code
bloat is terrifying. That is still the right end-goal, but substantial
work must be done to either merge partitions or ensure that small i8
values are eagerly merged in some other pass. Sadly, figuring all this
out took essentially all the time and effort here.

So the end result is that we allow splitting only when the load or store
at least covers the alloca. That ensures widened loads and stores don't
hurt SROA, and that we don't rampantly decompose operations more than we
have previously.

All of this was already fairly well tested, and so I've just updated the
tests to cover the wide load behavior. I can add a test that crafts the
pass ordering magic which caused the original PR, but that seems really
brittle and to provide little benefit. The fundamental problem is that
widened loads should Just Work.

llvm-svn: 177055
2013-03-14 11:32:24 +00:00
Nick Lewycky 307a1d03b5 Remove accidentally committed debug line.
llvm-svn: 177005
2013-03-14 05:19:12 +00:00
Nick Lewycky fdfed3e9c9 Refactor GCOV's six constructor arguments into a struct with a getter that
constructs default arguments. It can now take default arguments from
cl::opt'ions. Add a new -default-gcov-version=... option, and actually test it!

Sink the reverse-order of the version into GCOVProfiling, hiding it from our
users.

llvm-svn: 177002
2013-03-14 05:13:26 +00:00
Nick Lewycky ad145509eb No functionality change. Rename emitGCNO() to the more sensible
emitProfileNotes(), similar to emitProfileArcs(). Also update its comment.

Also add a comment on Version[4] (there will be another comment in clang later),
and compress lines that exceeded 80 columns.

llvm-svn: 176994
2013-03-13 22:55:42 +00:00
Arnaud A. de Grandmaison 7153305b92 Fix a performance regression when combining to smaller types in icmp (shl %v, C1), C2 :
Only combine when the shl is only used by the icmp

llvm-svn: 176950
2013-03-13 14:40:37 +00:00
Dan Gohman 00253592c7 Change the order of the operands in patchAndReplaceAllUsesWith so
that they're more consistent with Value::replaceAllUsesWith.

llvm-svn: 176872
2013-03-12 16:22:56 +00:00
Meador Inge 20255ef24d LibCallSimplifier: optimize speed for short-lived instances
Nadav reported a performance regression due to the work I did to
merge the library call simplifier into instcombine [1].  The issue
is that a new LibCallSimplifier object is being created whenever
InstCombiner::runOnFunction is called.  Every time a LibCallSimplifier
object is used to optimize a call it creates a hash table to map from
a function name to an object that optimizes functions of that name.
For short-lived LibCallSimplifier instances this is quite inefficient.
Especially for cases where no calls are actually simplified.

This patch fixes the issue by dropping the hash table and implementing
an explicit lookup function to correlate the function name to the object
that optimizes functions of that name.  This avoids the cost of always
building and destroying the hash table in cases where the LibCallSimplifier
object is short-lived and avoids the cost of building the table when no
simplifications are actually preformed.

On a benchmark containing 100,000 calls where none of them are simplified
I noticed a 30% speedup.  On a benchmark containing 100,000 calls where
all of them are simplified I noticed an 8% speedup.

[1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130304/167639.html

llvm-svn: 176840
2013-03-12 00:08:29 +00:00
Bill Wendling 9534d8885f Don't remove a landing pad if the invoke requires a table entry.
An invoke may require a table entry. For instance, when the function it calls
is expected to throw.
<rdar://problem/13360379>

llvm-svn: 176827
2013-03-11 20:53:00 +00:00
Nick Lewycky 5f50854186 Use LLVMBool instead of 'bool' in the C API. Based on a patch by Peter Zotov!
llvm-svn: 176793
2013-03-10 21:58:22 +00:00
Hal Finkel f610be9f36 BBVectorize: Fixup debugging statements
After the recent data-structure improvements, a couple of debugging statements
were broken (printing pointer values).

llvm-svn: 176791
2013-03-10 20:57:42 +00:00
Benjamin Kramer 6eda79f69a Remove a source of nondeterminism from the LoopVectorizer.
This made us emit runtime checks in a random order. Hopefully bootstrap
miscompares will go away now.

llvm-svn: 176775
2013-03-09 19:22:40 +00:00
Arnold Schwaighofer 8b3dc09400 LoopVectorizer: Ignore all dbg intrinisic
Ignore all DbgIntriniscInfo instructions instead of just DbgValueInst.

llvm-svn: 176769
2013-03-09 16:27:27 +00:00
Arnold Schwaighofer 4090b61ac3 LoopVectorizer: Ignore dbg.value instructions
We want vectorization to happen at -g. Ignore calls to the dbg.value intrinsic
and don't transfer them to the vectorized code.

radar://13378964

llvm-svn: 176768
2013-03-09 15:56:34 +00:00
Jakub Staszak 2ef36b633b Simplify code. No functionality change.
llvm-svn: 176765
2013-03-09 11:18:59 +00:00
Nick Lewycky 291df6ec42 Use the correct index variable. This is the meat of what was supposed to be in
r176751. Also, learn a lesson about applying patches by hand/eyeball.

llvm-svn: 176764
2013-03-09 10:13:26 +00:00
Nick Lewycky 03aed11cdb Fix bug introduced in r176616 when making function identifier numbers stable.
Count the subprograms, not the compile units.

llvm-svn: 176751
2013-03-09 02:06:37 +00:00
Nick Lewycky 88f1d0d64e Don't emit the extra checksum into the .gcda file if the user hasn't asked for
it. Fortunately, versions of gcov that predate the extra checksum also ignore
any extra data, so this isn't a problem. There will be a matching commit in
compiler-rt.

llvm-svn: 176745
2013-03-09 01:33:06 +00:00
Benjamin Kramer 37c2d65c5a Insert the reduction start value into the first bypass block to preserve domination.
Fixes PR15344.

llvm-svn: 176701
2013-03-08 16:58:37 +00:00
Jakub Staszak fd56611b49 Keep coding stanard.
llvm-svn: 176661
2013-03-07 22:20:06 +00:00
Jakub Staszak db4579d796 Don't create IRBuilder if we can return from the method earlier.
llvm-svn: 176660
2013-03-07 22:10:33 +00:00
Pekka Jaaskelainen 093cf41e86 Fixed a crash when cloning a function into a function with
different size argument list and without attributes in the
arguments.

llvm-svn: 176632
2013-03-07 16:46:43 +00:00
Nick Lewycky 492afe8127 Switch from a version 4.2/4.4 switch to a four-byte version string to be put
into the actual gcov file.

Instead of using the bottom 4 bytes as the function identifier, use a counter.
This makes the identifier numbers stable across multiple runs.

llvm-svn: 176616
2013-03-07 08:28:49 +00:00
Andrew Trick a0a5ca06b9 SimplifyCFG fix for volatile load/store.
Fixes rdar:13349374.

Volatile loads and stores need to be preserved even if the language
standard says they are undefined. "volatile" in this context means "get
out of the way compiler, let my platform handle it".

Additionally, this is the only way I know of with llvm to write to the
first page (when hardware allows) without dropping to assembly.

llvm-svn: 176599
2013-03-07 01:03:35 +00:00
Andrew Trick fcb37243f9 Generalize my previous fix for -print-options.
Always print options that differ from their implicit default. At least
for simple option types.

llvm-svn: 176572
2013-03-06 19:04:56 +00:00
Andrew Trick 946c2b32e6 Give -loop-vectorize an explicit default.
This way, clang -mllvm -print-options shows that the driver is overriding it.

llvm-svn: 176569
2013-03-06 18:22:22 +00:00
Jim Grosbach 95d2eb95c3 InstCombine: Don't shrink allocas when combining with a bitcast.
When considering folding a bitcast of an alloca into the alloca itself,
make sure we don't shrink the amount of memory being allocated, or
things rapidly go sideways.

rdar://13324424

llvm-svn: 176547
2013-03-06 05:44:53 +00:00
Lang Hames 30be8a30cc Check isDiscardableIfUnused, rather than hasLocalLinkage, when bumping
GlobalValue linkage up to ExternalLinkage in the ExtractGV pass. This
prevents linkonce and linkonce_odr symbols from being DCE'd.

llvm-svn: 176459
2013-03-04 22:40:44 +00:00
Preston Gurd 485296d1e8 Bypass Slow Divides
* Only apply divide bypass optimization when not optimizing for size. 
* Fixed bug caused by constant for 0 value of type Int32,
  used dividend type to generate the constant instead.
* For atom x86-64 apply the divide bypass to use 16-bit divides instead of
  64-bit divides when operand values are small enough.
* Added lit tests for 64-bit divide bypass.

Patch by Tyler Nowicki!

llvm-svn: 176442
2013-03-04 18:13:57 +00:00
Nadav Rotem 739e37a0d2 PR14448 - prevent the loop vectorizer from vectorizing the same loop twice.
The LoopVectorizer often runs multiple times on the same function due to inlining.
When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches.
With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time.

PR14448.

llvm-svn: 176399
2013-03-02 01:33:49 +00:00
Peter Collingbourne 1b97a9c82a Modify {Call,Invoke}Inst::addAttribute to take an AttrKind.
llvm-svn: 176397
2013-03-02 01:20:18 +00:00
Benjamin Kramer 12f98fae98 LoopVectorize: Don't hang forever if a PHI only has skipped PHI uses.
Fixes PR15384.

llvm-svn: 176366
2013-03-01 19:07:31 +00:00
Quentin Colombet e684a6d4aa Fix a bug in instcombine for fmul in fast math mode.
The instcombine recognized pattern looks like:
a = b * c
d = a +/- Cst
or
a = b * c
d = Cst +/- a

When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0).

The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1.

llvm-svn: 176300
2013-02-28 21:12:40 +00:00
Evgeniy Stepanov 00062b4498 [msan] Implement sanitize_memory attribute.
Shadow checks are disabled and memory loads always produce fully initialized
values in functions that don't have a sanitize_memory attribute. Value and
argument shadow is propagated as usual.

This change also updates blacklist behaviour to match the above.

llvm-svn: 176247
2013-02-28 11:25:14 +00:00
Evgeniy Stepanov 4c9300e630 Remove unused leftover declarations.
llvm-svn: 176240
2013-02-28 08:42:11 +00:00
Benjamin Kramer dc145816fd LoopVectorize: Vectorize math builtin calls.
This properly asks TargetLibraryInfo if a call is available and if it is, it
can be translated into the corresponding LLVM builtin. We don't vectorize sqrt()
yet because I'm not sure about the semantics for negative numbers. The other
intrinsic should be exact equivalents to the libm functions.

Differential Revision: http://llvm-reviews.chandlerc.com/D465

llvm-svn: 176188
2013-02-27 15:24:19 +00:00
Nick Lewycky 6fd43e4071 In GCC 4.7, function names are now forbidden from .gcda files. Support this by
passing a null pointer to the function name in to GCDAProfiling, and add another
switch onto GCOVProfiling.

llvm-svn: 176173
2013-02-27 06:22:56 +00:00
Nick Lewycky 625f395663 Doh, fix behaviour change introduced in r176168 which is tested in clang,
not llvm.

llvm-svn: 176172
2013-02-27 06:21:30 +00:00
Nadav Rotem 464e807d41 For each function that we optimize we initialize a new list of lib functions. For each function name we malloc memory. This patch changes the Libcall map to use BumpPtrAllocator. Now we malloc only once. This speeds up instcombine by a few % on a large c++ program.
llvm-svn: 176170
2013-02-27 05:53:43 +00:00
Nick Lewycky 8e94d80aab IRBuilder has grown all sorts of useful utility functions. Make use of them to
clean up this code a tiny bit. No functionality change.

llvm-svn: 176168
2013-02-27 05:46:30 +00:00
Pedro Artigas e40467b589 Enhance integer division emulation support to handle types smaller than 32 bits,
enhancement done the trivial way; by extending inputs and truncating outputs 
which is addequate for targets with little or no support for integer arithmetic
on integer types less than 32 bits.

llvm-svn: 176139
2013-02-26 23:33:20 +00:00
Kostya Serebryany cf880b9443 Unify clang/llvm attributes for asan/tsan/msan (LLVM part)
These are two related changes (one in llvm, one in clang).
LLVM: 
- rename address_safety => sanitize_address (the enum value is the same, so we preserve binary compatibility with old bitcode)
- rename thread_safety => sanitize_thread
- rename no_uninitialized_checks -> sanitize_memory

CLANG: 
- add __attribute__((no_sanitize_address)) as a synonym for __attribute__((no_address_safety_analysis))
- add __attribute__((no_sanitize_thread))
- add __attribute__((no_sanitize_memory))

for S in address thread memory
If -fsanitize=S is present and __attribute__((no_sanitize_S)) is not
set llvm attribute sanitize_S

llvm-svn: 176075
2013-02-26 06:58:09 +00:00
Benjamin Kramer ee40b9a2d4 CVP: If we have a PHI with an incoming select, try to skip the select.
This is a common pattern with dyn_cast and similar constructs, when the
PHI no longer depends on the select it can often be turned into a simpler
construct or even get hoisted out of the loop.

PR15340.

llvm-svn: 175995
2013-02-24 15:34:43 +00:00
Michael Gottesman f4b7761ed7 Fixed a careless mistake.
rdar://13273675.

llvm-svn: 175939
2013-02-23 00:31:32 +00:00
Bill Wendling 09bd1f71ee Implement the NoBuiltin attribute.
The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should
not treat the callee function as a built-in function. I.e., it shouldn't try to
replace that function with different code.

llvm-svn: 175835
2013-02-22 00:12:35 +00:00
Renato Golin cf928cb53f Allow GlobalValues to vectorize with AliasAnalysis
Storing the load/store instructions with the values
and inspect them using Alias Analysis to make sure
they don't alias, since the GEP pointer operand doesn't
take the offset into account.

Trying hard to not add any extra cost to loads and stores
that don't overlap on global values, AA is *only* calculated
if all of the previous attempts failed.

Using biggest vector register size as the stride for the
vectorization access, as we're being conservative and
the cost model (which calculates the real vectorization
factor) is only run after the legalization phase.

We might re-think this relationship in the future, but
for now, I'd rather be safe than sorry.

llvm-svn: 175818
2013-02-21 22:39:03 +00:00
Chad Rosier 9b7f9c3e9e Remove dead code and whitespace.
llvm-svn: 175804
2013-02-21 21:40:51 +00:00
Chad Rosier 4d87d45a05 Update a comment that looks to have been accidentally deleted many moons ago.
llvm-svn: 175658
2013-02-20 20:15:55 +00:00
Kostya Serebryany 699ac28aa5 [asan] instrument invoke insns with noreturn attribute (as well as call insns)
llvm-svn: 175617
2013-02-20 12:35:15 +00:00
Jakub Staszak ae2fd9c97d Remove unused variable.
llvm-svn: 175568
2013-02-19 22:17:58 +00:00
Jakub Staszak 3c6583a1b1 Minor cleanups. No functionality change.
llvm-svn: 175567
2013-02-19 22:14:45 +00:00
Jakub Staszak 90fbe91c58 Remove unneeded #includes.
llvm-svn: 175565
2013-02-19 22:06:38 +00:00
Jakub Staszak 086f6cde5d Fix typos.
llvm-svn: 175562
2013-02-19 22:02:21 +00:00
Kostya Serebryany 3ece9beaf1 [asan] instrument memory accesses with unusual sizes
This patch makes asan instrument memory accesses with unusual sizes (e.g. 5 bytes or 10 bytes), e.g. long double or
packed structures.
Instrumentation is done with two 1-byte checks
(first and last bytes) and if the error is found
__asan_report_load_n(addr, real_size) or
__asan_report_store_n(addr, real_size)
is called.

Also, call these two new functions in memset/memcpy
instrumentation.

asan-rt part will follow.

llvm-svn: 175507
2013-02-19 11:29:21 +00:00
Bill Wendling c98e4fef1a Temporarily revert r175470 for more review.
llvm-svn: 175476
2013-02-19 00:52:45 +00:00
Bill Wendling 66651e4c2f Check to see if the 'no-builtin' attribute is set before simplifying a library call.
llvm-svn: 175470
2013-02-18 23:17:16 +00:00
Kostya Serebryany 7ca384bc1a [asan] revert r175266 as it breaks code with packed structures. supporting long double will require a more general solution
llvm-svn: 175442
2013-02-18 13:47:02 +00:00
Hal Finkel 76e65e4542 BBVectorize: Fix an invalid reference bug
This fixes PR15289. This bug was introduced (recently) in r175215; collecting
all std::vector references for candidate pairs to delete at once is invalid
because subsequent lookups in the owning DenseMap could invalidate the
references.

bugpoint was able to reduce a useful test case. Unfortunately, because whether
or not this asserts depends on memory layout, this test case will sometimes
appear to produce valid output. Nevertheless, running under valgrind will
reveal the error.

llvm-svn: 175397
2013-02-17 15:59:26 +00:00
Bill Wendling 23242098e7 The transform is:
(or (bool?A:B),(bool?C:D)) --> (bool?(or A,C):(or B,D))

By the time the OR is visited, both the SELECTs have been visited and not
optimized and the OR itself hasn't been transformed so we do this transform in
the hopes that the new ORs will be optimized.

The transform is explicitly disabled for vector-selects until "codegen matures
to handle them better".

Patch by Muhammad Tauqir!

llvm-svn: 175380
2013-02-16 23:41:36 +00:00
Jakub Staszak 11bd83551c Reduce indents in LSRInstance::NarrowSearchSpaceByCollapsingUnrolledCode method.
No functionality change.

llvm-svn: 175364
2013-02-16 16:08:15 +00:00
Hal Finkel 89909397a1 BBVectorize: Call a DAG and DAG instead of a tree
Several functions and variable names used the term 'tree' to refer
to what is actually a DAG. Correcting this mistake will, hopefully,
prevent confusion in the future.

No functionality change intended.

llvm-svn: 175278
2013-02-15 17:20:54 +00:00
Arnaud A. de Grandmaison 1fd843eee7 Fix refactoring mistake in "Teach InstCombine to work with smaller legal types..."
llvm-svn: 175273
2013-02-15 15:18:17 +00:00
Arnaud A. de Grandmaison 61c167c62b Teach InstCombine to work with smaller legal types in icmp (shl %v, C1), C2
It enables to work with a smaller constant, which is target friendly for those which can compare to immediates.
It also avoids inserting a shift in favor of a trunc, which can be free on some targets.

This used to work until LLVM-3.1, but regressed with the 3.2 release.

llvm-svn: 175270
2013-02-15 14:35:47 +00:00
Kostya Serebryany a968568165 [asan] support long double on 64-bit. See https://code.google.com/p/address-sanitizer/issues/detail?id=151
llvm-svn: 175266
2013-02-15 12:46:06 +00:00
Benjamin Kramer 6ecb1e78a9 Make helpers static. Add missing include so LLVMInitializeObjCARCOpts gets C linkage.
llvm-svn: 175264
2013-02-15 12:30:38 +00:00
Hal Finkel 283f4f0e66 BBVectorize: Cap the number of candidate pairs in each instruction group
For some basic blocks, it is possible to generate many candidate pairs for
relatively few pairable instructions. When many (tens of thousands) of these pairs
are generated for a single instruction group, the time taken to generate and
rank the different vectorization plans can become quite large. As a result, we now
cap the number of candidate pairs within each instruction group. This is done by
closing out the group once the threshold is reached (set now at 3000 pairs).

Although this will limit the overall compile-time impact, this may not be the best
way to achieve this result. It might be better, for example, to prune excessive
candidate pairs after the fact the prevent the generation of short, but highly-connected
groups. We can experiment with this in the future.

This change reduces the overall compile-time slowdown of the csa.ll test case in
PR15222 to ~5x. If 5x is still considered too large, a lower limit can be
used as the default.

This represents a functionality change, but only for very large inputs
(thus, there is no regression test).

llvm-svn: 175251
2013-02-15 04:28:42 +00:00
Hal Finkel e7a1ef422b BBVectorize: Remove the remaining instances of std::multimap
All instances of std::multimap have now been replaced by
DenseMap<K, std::vector<V> >, and this yields a speedup of 5% on the
csa.ll test case from PR15222.

No functionality change intended.

llvm-svn: 175216
2013-02-14 22:38:04 +00:00
Hal Finkel c3a4425c34 BBVectorize: Don't store candidate pairs in a std::multimap
This is another commit on the road to removing std::multimap from
BBVectorize. This gives an ~1% speedup on the csa.ll test case
in PR15222.

No functionality change intended.

llvm-svn: 175215
2013-02-14 22:37:09 +00:00
Bill Wendling 7297b864a4 Retain the name of the new internal global that's been shrunk.
It's possible (e.g. after an LTO build) that an internal global may be used for
debugging purposes. If that's the case appending a '.b' to it makes it hard to
find that variable. Steal the name from the old GV before deleting it so that
they can find that variable again.

llvm-svn: 175104
2013-02-13 23:00:51 +00:00
Benjamin Kramer 0aa2ad6104 LoopVectorize: Simplify code for clarity.
No functionality change.

llvm-svn: 175076
2013-02-13 21:12:29 +00:00
Pekka Jaaskelainen 0d23725a8d Metadata for annotating loops as parallel. The first consumer for this
metadata is the loop vectorizer.

See the documentation update for more info.

llvm-svn: 175060
2013-02-13 18:08:57 +00:00
Kostya Serebryany caf11af9d3 [asan] fix confusing indentation
llvm-svn: 175033
2013-02-13 05:14:12 +00:00
Arnaud A. de Grandmaison 2e4df4f7c2 Fix comment
visitSExt is an adapted copy of the related visitZExt method, so adapt the comment accordingly.

llvm-svn: 175019
2013-02-13 00:19:19 +00:00
Michael Gottesman 27029f4642 Changed isStoredObjCPointer => IsStoredObjCPointer. No functionality change.
llvm-svn: 175017
2013-02-12 23:35:08 +00:00
Dan Gohman a6307574d6 Actually delete this code, since it's really not clear what it's
trying to do.

llvm-svn: 175014
2013-02-12 22:26:41 +00:00
Dan Gohman f377160d2f Record PRE predecessors with a SmallVector instead of a DenseMap, and
avoid a second pred_iterator traversal.

llvm-svn: 175001
2013-02-12 19:49:10 +00:00
Dan Gohman 2001cd8f9e When disabling PRE for a value is directly redundant with itself
(through a loop), don't continue to iterate through the reamining
predecessors.

llvm-svn: 174994
2013-02-12 19:05:10 +00:00
Dan Gohman fd41de0b10 Check that pointers are removed from maps before calling delete on the pointers,
for tidiness' sake.

llvm-svn: 174988
2013-02-12 18:44:43 +00:00
Dan Gohman f60667020a Minor code simplification.
llvm-svn: 174985
2013-02-12 18:38:36 +00:00
Alexander Potapenko 259e8127ad [ASan] Do not use kDefaultShort64bitShadowOffset on Mac, where the binaries may get mapped at 0x100000000+ and thus may interleave with the shadow.
llvm-svn: 174964
2013-02-12 12:41:12 +00:00
Kostya Serebryany be73337ad2 [asan] change the default mapping offset on x86_64 to 0x7fff8000. This gives roughly 5% speedup. Since this is an ABI change, bump the asan ABI version by renaming __asan_init to __asan_init_v1. llvm part, compiler-rt part will follow
llvm-svn: 174957
2013-02-12 11:11:02 +00:00
Hal Finkel 6ae564b4a0 BBVectorize: Don't over-search when building the dependency map
When building the pairable-instruction dependency map, don't search
past the last pairable instruction. For large blocks that have been
divided into multiple instruction groups, searching past the last
instruction in each group is very wasteful. This gives a 32% speedup
on the csa.ll test case from PR15222 (when using 50 instructions
in each group).

No functionality change intended.

llvm-svn: 174915
2013-02-11 23:02:17 +00:00
Hal Finkel 39a95032d2 BBVectorize: Omit unnecessary entries in PairableInstUsers
This map is queried only for instructions in pairs of pairable
instructions; so make sure that only pairs of pairable
instructions are added to the map. This gives a 3.5% speedup
on the csa.ll test case from PR15222.

No functionality change intended.

llvm-svn: 174914
2013-02-11 23:02:09 +00:00
Michael Ilseman 74a6da963b Optimization: bitcast (<1 x ...> insertelement ..., X, ...) to ... ==> bitcast X to ...
llvm-svn: 174905
2013-02-11 21:41:44 +00:00
Hal Finkel 0b8ae895b4 BBVectorize: Eliminate one more restricted linear search
This eliminates one more linear search over a range of
std::multimap entries. This gives a 22% speedup on the
csa.ll test case from PR15222.

No functionality change intended.

llvm-svn: 174893
2013-02-11 17:19:34 +00:00
Kostya Serebryany c5f44bc62d [asan] added a flag -mllvm asan-short-64bit-mapping-offset=1 (0 by default)
This flag makes asan use a small (<2G) offset for 64-bit asan shadow mapping.
On x86_64 this saves us a register, thus achieving ~2/3 of the
zero-base-offset's benefits in both performance and code size.

Thanks Jakub Jelinek for the idea.

llvm-svn: 174886
2013-02-11 14:36:01 +00:00
Hal Finkel cb268f7995 BBVectorize: Remove the linear searches from pair connection searching
This removes the last of the linear searches over ranges of std::multimap
iterators, giving a 7% speedup on the doduc.bc input from PR15222.

No functionality change intended.

llvm-svn: 174859
2013-02-11 05:29:51 +00:00
Hal Finkel fee38f9754 BBVectorize: Avoid linear searches within the load-move set
This is another cleanup aimed at eliminating linear searches
in ranges of std::multimap.

No functionality change intended.

llvm-svn: 174858
2013-02-11 05:29:49 +00:00
Hal Finkel dd4bc66593 BBVectorize: isa/cast cleanup in getInstructionTypes
Profiling suggests that getInstructionTypes is performance-sensitive,
this cleans up some double-casting in that function in favor of
using dyn_cast.

No functionality change intended.

llvm-svn: 174857
2013-02-11 05:29:48 +00:00
Hal Finkel c1cc166948 BBVectorize: Make the bookkeeping to support full cycle checking less expensive
By itself, this does not have much of an effect, but only because in the default
configuration the full cycle checks are used only for small problem sizes.
This is part of a general cleanup of uses of iteration over std::multimap
ranges only for the purpose of checking membership.

No functionality change intended.

llvm-svn: 174856
2013-02-11 05:29:41 +00:00
Andrew Trick bc7059032b LSR IVChain improvement.
Handle chains in which the same offset is used for both loads and
stores to the same array.

Fixes rdar://11410078.

llvm-svn: 174789
2013-02-09 01:11:01 +00:00
Jakub Staszak f23980aba5 Remove #includes from the commonly used LoopInfo.h.
llvm-svn: 174786
2013-02-09 01:04:28 +00:00
Bob Wilson bfb44ef9cb Revert "Add LLVMContext::emitWarning methods and use them. <rdar://problem/12867368>"
This reverts r171041. This was a nice idea that didn't work out well.
Clang warnings need to be associated with warning groups so that they can
be selectively disabled, promoted to errors, etc. This simplistic patch didn't
allow for that. Enhancing it to provide some way for the backend to specify
a front-end warning type seems like overkill for the few uses of this, at
least for now.

llvm-svn: 174748
2013-02-08 21:48:29 +00:00
Hal Finkel dd2721842d BBVectorize: Use TTI->getAddressComputationCost
This is a follow-up to the cost-model change in r174713 which splits
the cost of a memory operation between the address computation and the
actual memory access. In r174713, this cost is always added to the
memory operation cost, and so BBVectorize will do the same.

Currently, this new cost function is used only by ARM, and I don't
have any ARM test cases for BBVectorize. Assistance in generating some
good ARM test cases for BBVectorize would be greatly appreciated!

llvm-svn: 174743
2013-02-08 21:13:39 +00:00
Chad Rosier 22d275f7b8 [SimplifyLibCalls] Library call simplification doen't work if the call site
isn't using the default calling convention.  However, if the transformation is
from a call to inline IR, then the calling convention doesn't matter.
rdar://13157990

llvm-svn: 174724
2013-02-08 18:00:14 +00:00
Jakob Stoklund Olesen 479e5a9313 Typos.
llvm-svn: 174723
2013-02-08 17:43:32 +00:00
Arnold Schwaighofer 594fa2dc2b ARM cost model: Address computation in vector mem ops not free
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.

radar://13097204

llvm-svn: 174713
2013-02-08 14:50:48 +00:00
Michael Kuperstein f63b77be7f Test Commit
llvm-svn: 174709
2013-02-08 12:58:29 +00:00
Andrew Trick 1bd53c3675 Revert "Have InstCombine call SipmlifyCall when handling calls. Test case included."
This reverts commit 3854a5d90fee52af1065edbed34521fff6cdc18d.

This causes a clang unit test to hang: vtable-available-externally.cpp.

llvm-svn: 174692
2013-02-08 01:55:39 +00:00
Michael Ilseman 6092dc5455 Have InstCombine call SipmlifyCall when handling calls. Test case included.
llvm-svn: 174675
2013-02-07 23:01:35 +00:00
Nadav Rotem a9100f3609 fix 80-col violation and fix the docs.
llvm-svn: 174671
2013-02-07 22:34:07 +00:00
Arnold Schwaighofer 3476fc8c82 Loop Vectorizer: Refactor Memory Cost Computation
We don't want too many classes in a pass and the classes obscure the details. I
was going a little overboard with object modeling here. Replace classes by
generic code that handles both loads and stores.

No functionality change intended.

llvm-svn: 174646
2013-02-07 19:05:21 +00:00
Michael Gottesman 697d8b9a26 Moved some comments due to the recent refactoring of ObjCARC.
1. Moved a comment from ObjCARCOpts.cpp -> ObjCARCContract.cpp.
2. Removed a comment from ObjCARCOpts.cpp that was already moved to
ObjCARCAliasAnalysis.h/.cpp.

llvm-svn: 174581
2013-02-07 04:12:57 +00:00
Michael Ilseman 1dd6f2a5ba Preserve fast-math flags after reassociation and commutation. Update test cases
llvm-svn: 174571
2013-02-07 01:40:15 +00:00
Benjamin Kramer 944e0abf04 InstCombine: Fix and simplify the inttoptr side too.
llvm-svn: 174438
2013-02-05 20:22:40 +00:00
Michael Gottesman 415ddd7e13 Removed explicit inline as per the LLVM style guide.
llvm-svn: 174432
2013-02-05 19:32:18 +00:00
Benjamin Kramer e477875873 InstCombine: Harden code to work with vectors of pointers and simplify it a bit.
Found by running instcombine on a fabricated test case for the constant folder.

llvm-svn: 174430
2013-02-05 19:21:56 +00:00
Arnold Schwaighofer 3be40b56c5 Loop Vectorizer: Refactor code to compute vectorized memory instruction cost
Introduce a helper class that computes the cost of memory access instructions.
No functionality change intended.

llvm-svn: 174422
2013-02-05 18:46:41 +00:00
Chad Rosier 92a54f6d4c [SjLj Prepare] When demoting an invoke instructions to the stack, if the normal
edge is critical, then split it so we can insert the store.
rdar://13126179

llvm-svn: 174418
2013-02-05 18:23:10 +00:00
Arnold Schwaighofer 22174f5d5a Loop Vectorizer: Handle pointer stores/loads in getWidestType()
In the loop vectorizer cost model, we used to ignore stores/loads of a pointer
type when computing the widest type within a loop. This meant that if we had
only stores/loads of pointers in a loop we would return a widest type of 8bits
(instead of 32 or 64 bit) and therefore a vector factor that was too big.

Now, if we see a consecutive store/load of pointers we use the size of a pointer
(from data layout).

This problem occured in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced
test case is the first test in vector_ptr_load_store.ll).

radar://13139343

llvm-svn: 174377
2013-02-05 15:08:02 +00:00
Nick Lewycky 535d97cc86 Revert accidental commit (ran svn commit from wrong directory).
llvm-svn: 174241
2013-02-02 00:25:26 +00:00
Nick Lewycky a8c77e4266 This patch makes "&Cls::purevfn" not an odr use. This isn't what the standard
says, but that's a defect (to be filed). "Cls::purevfn()" is still an odr use.

Also fixes a bug in the previous patch that caused us to not mark the function
referenced just because we didn't want to mark it odr used.

llvm-svn: 174240
2013-02-02 00:22:37 +00:00
Preston Gurd 25c3b6acc0 This patch aims to improve compile time performance by increasing
the SCEV vector size in LoopStrengthReduce. It is observed that
the BaseRegs vector size is 4 in most cases,
and elements are frequently copied when it is initialized as
SmallVector<const SCEV *, 2> BaseRegs.
Our benchmark results show that the compilation time performance
improved by ~0.5%.

Patch by Wan Xiaofei.

llvm-svn: 174219
2013-02-01 20:41:27 +00:00
Nadav Rotem 4349f6963e Revert r174152. The shift amount may overflow and in that case this transformation is illegal.
llvm-svn: 174156
2013-02-01 07:59:33 +00:00
Nadav Rotem 1d584029ae Optimize shift lefts of a constant by a value plus constant into a single shift.
llvm-svn: 174152
2013-02-01 06:45:40 +00:00
Manman Ren aec2ce7db4 Linker: correctly link in dbg.declare
This is a re-worked version of r174048.
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate 
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29
!27 = metadata !{null}

With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28

Looking up %argc.addr in ValueMap will return null, since %argc.addr is already
correctly set up, we can use identity mapping.

rdar://problem/13089880

llvm-svn: 174093
2013-01-31 21:19:18 +00:00
Alexey Samsonov 5234a8ed9f Revert r173946. This breaks compilation of googletest with Clang
llvm-svn: 174048
2013-01-31 08:02:11 +00:00
Dan Gohman 20a2ae9df5 Change GetPointerBaseWithConstantOffset's DataLayout argument from a
reference to a pointer, so that it can handle the case where DataLayout
is not available and behave conservatively.

llvm-svn: 174024
2013-01-31 02:00:45 +00:00
Bill Wendling 785afdf3a4 Remove addRetAttributes and addFnAttributes, which aren't useful abstractions.
llvm-svn: 173992
2013-01-30 23:40:31 +00:00
Bill Wendling d219675c2a Convert typeIncompatible to return an AttributeSet.
There are still places which treat the Attribute object as a collection of
attributes. I'm systematically removing them.

llvm-svn: 173990
2013-01-30 23:07:40 +00:00
Manman Ren 81dcc62805 Linker: correctly link in dbg.declare
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate 
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29
!27 = metadata !{null}

With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28

Looking up %argc.addr in ValueMap will return null, since %argc.addr is already
correctly set up, we can use identity mapping.

llvm-svn: 173946
2013-01-30 17:42:15 +00:00
Nadav Rotem 513bd8a73c InstCombine: canonicalize sext-and --> select
sext-not-and --> select.

Patch by Muhammad Tauqir Ahmad.

llvm-svn: 173901
2013-01-30 06:35:22 +00:00
Michael Gottesman e52dec1695 Made certain small functions in PtrState inlined.
llvm-svn: 173842
2013-01-29 22:29:59 +00:00
Pekka Jaaskelainen f50ab84bb1 LoopVectorize: convert TinyTripCountVectorThreshold constant
to a command line switch.

llvm-svn: 173837
2013-01-29 21:42:08 +00:00
Michael Gottesman 9bdab2bf6b Removed trailing comma in last element of enum declaration.
llvm-svn: 173836
2013-01-29 21:41:44 +00:00
Michael Gottesman 386241ce5b Moved S_Stop back to its previous position in the sequence order.
llvm-svn: 173834
2013-01-29 21:39:02 +00:00
Michael Gottesman 23cda0cd39 Fixed a few debug messages and some 80+ violations.
llvm-svn: 173832
2013-01-29 21:07:53 +00:00
Michael Gottesman 53fd20bdbd Added some periods to some comments and added an overload for operator<< for type Sequence so I can print out Sequences in debug statements.
llvm-svn: 173831
2013-01-29 21:07:51 +00:00
Michael Gottesman 774d2c014e Changed DoesObjCBlockEscape => DoesRetainableObjPtrEscape so I can use it to perform escape analysis of other retainable object pointers in other locations.
llvm-svn: 173829
2013-01-29 21:00:52 +00:00
Edwin Vane 82f80d4967 Fixing warnings revealed by gcc release build
Fixed set-but-not-used warnings.

Reviewer: gribozavr
llvm-svn: 173810
2013-01-29 17:42:24 +00:00
Benjamin Kramer cf406756ce LoopVectorize: Clean up ValueMap a bit and avoid double lookups.
No intended functionality change.

llvm-svn: 173809
2013-01-29 17:31:33 +00:00
Timur Iskhodzhanov 5d7ff00456 Hopefully fix the Windows build failure introduced in r173769
llvm-svn: 173781
2013-01-29 09:09:27 +00:00
Michael Gottesman 1e29ca1501 Fixed 2 more header comments...
llvm-svn: 173774
2013-01-29 05:07:18 +00:00
Michael Gottesman 5a8f9e7c54 Fixed header comment.
llvm-svn: 173773
2013-01-29 05:05:17 +00:00
Michael Gottesman 23a1ee5f5b Fixed some whitespace/80+ violations. Also added a space after a namespace declaration.
llvm-svn: 173772
2013-01-29 04:58:30 +00:00
Michael Gottesman 7bf48af498 Added missing dashes from header declaration comment.
llvm-svn: 173770
2013-01-29 04:53:55 +00:00
Michael Gottesman 13a5f1a8b7 Juggled Debug.h from ObjCARC.h to only the including cpp files that
actually have DEBUG statements. Also changed raw_ostream in said header
to be a forward declaration (removing an include).

llvm-svn: 173769
2013-01-29 04:51:59 +00:00
Michael Gottesman 278266faa8 Sorted includes using utils/sort_includes.
llvm-svn: 173767
2013-01-29 04:20:52 +00:00
Michael Gottesman f823dd2ef7 Added two missing headers from ObjCARCAliasAnalysis.h.
This was missed since whenever I was including ObjCARCAliasAnalysis.h, I
was including ObjCARC.h before it which included these includes
(resulting in no compilation breakage).

llvm-svn: 173764
2013-01-29 04:09:24 +00:00
Michael Gottesman 7f387ae6e3 Removed InstCombine/Targets as library dependencies for libObjCARCOpts since they are unnecessary.
llvm-svn: 173763
2013-01-29 04:05:17 +00:00
Michael Gottesman 778138e960 Extracted ObjCARCContract from ObjCARCOpts into its own file.
This also required adding 2x headers Dependency Analysis.h/Provenance Analysis.h
and a .cpp file DependencyAnalysis.cpp to unentangle the dependencies inbetween
ObjCARCContract and ObjCARCOpts.

llvm-svn: 173760
2013-01-29 03:03:03 +00:00
Michael Gottesman 50a622f120 Removed some cruft from ObjCARCAliasAnalysis.cpp.
llvm-svn: 173759
2013-01-29 03:02:59 +00:00
Hal Finkel bf4db4fe11 Unroll again after running BBVectorize
Because BBVectorize may significantly shorten a loop body, unroll
again after vectorization. This is especially important when using
runtime or partial unrolling.

llvm-svn: 173730
2013-01-29 00:22:49 +00:00
Renato Golin 1258519674 Vectorization Factor clarification
llvm-svn: 173691
2013-01-28 16:02:45 +00:00
Evgeniy Stepanov 6f85ef300d [msan] Mostly disable msan-handle-icmp-exact.
It is way too slow. Change the default option value to 0.
Always do exact shadow propagation for unsigned ICmp with constants, it is
cheap (under 1% cpu time) and required for correctness.

llvm-svn: 173682
2013-01-28 11:42:28 +00:00
Evgeniy Stepanov 52c7b1b98f Revert r173678.
Broken tests.

llvm-svn: 173679
2013-01-28 09:18:40 +00:00
Evgeniy Stepanov 5ec2ff57e9 [msan] Make msan-handle-icmp-exact=0 by default.
50% slowdown on one of the specs.

llvm-svn: 173678
2013-01-28 09:15:15 +00:00
Michael Gottesman 5ed40afe17 Created ObjCARCUtil.cpp for functions which in my humble opinion are too large to static inline and place in a header file such as ObjCARC.h.
llvm-svn: 173666
2013-01-28 06:39:31 +00:00
Michael Gottesman 9bfcf28d88 Cleaned up includes in various ObjCARC files and removed some whitespace violations.
llvm-svn: 173663
2013-01-28 05:51:58 +00:00
Michael Gottesman 294e7daaac Refactor ObjCARCAliasAnalysis into its own file.
llvm-svn: 173662
2013-01-28 05:51:54 +00:00
Michael Gottesman fa0939f790 Refactored out pass ObjCARCAPElim from ObjCARCOpts.cpp => ObjCARCAPElim.cpp.
llvm-svn: 173654
2013-01-28 04:12:07 +00:00
Michael Gottesman 283e079fa6 Fixed case insensitive issue.
llvm-svn: 173653
2013-01-28 03:35:20 +00:00
Michael Gottesman 0d90b12acc Removed extraneous doxygen end module statement.
llvm-svn: 173652
2013-01-28 03:30:34 +00:00
Michael Gottesman 08904e3ba4 Extracted pass ObjCARCExpand from ObjCARC.cpp => ObjCARCExpand.cpp.
I also added the local header ObjCARC.h for common functions used by the
various passes.

llvm-svn: 173651
2013-01-28 03:28:38 +00:00
Michael Gottesman 79d8d81226 Extracted ObjCARC.cpp into its own library libLLVMObjCARCOpts in preparation for refactoring the ARC Optimizer.
llvm-svn: 173647
2013-01-28 01:35:51 +00:00
Hal Finkel 293a41d14f BBVectorize: Better use of TTI->getShuffleCost
When flipping the pair of subvectors that form a vector, if the
vector length is 2, we can use the SK_Reverse shuffle kind to get
more-accurate cost information. Also we can use the SK_ExtractSubvector
shuffle kind to get accurate subvector extraction costs.

The current cost model implementations don't yet seem complex enough
for this to make a difference (thus, there are no test cases with this
commit), but it should help in future.

Depending on how the various targets optimize and combine shuffles in
practice, we might be able to get more-accurate costs by combining the
costs of multiple shuffle kinds. For example, the cost of flipping the
subvector pairs could be modeled as two extractions and two subvector
insertions. These changes, however, should probably be motivated
by specific test cases.

llvm-svn: 173621
2013-01-27 20:07:01 +00:00
Chandler Carruth 329b590e6e Re-revert r173342, without losing the compile time improvements, flat
out bug fixes, or functionality preserving refactorings.

llvm-svn: 173610
2013-01-27 06:42:03 +00:00
Michael Gottesman 5300cdd8f2 Renamed function IsPotentialUse to IsPotentialRetainableObjPtr.
This name change does the following:

1. Causes the function name to use proper ARC terminology.
2. Makes it clear what the function truly does.

llvm-svn: 173609
2013-01-27 06:19:48 +00:00
Bill Wendling 3575c8c6d6 Use the AttributeSet instead of AttributeWithIndex.
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.

llvm-svn: 173602
2013-01-27 02:08:22 +00:00
Bill Wendling 37a52df920 Use the AttributeSet instead of AttributeWithIndex.
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.

llvm-svn: 173601
2013-01-27 01:57:28 +00:00
Bill Wendling 6eaab61bb5 Use the AttributeSet instead of AttributeWithIndex.
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.

llvm-svn: 173600
2013-01-27 01:44:34 +00:00
Hal Finkel 2d443e94b4 BBVectorize: Add a additional comment about the cost computation
llvm-svn: 173580
2013-01-26 16:49:04 +00:00
Hal Finkel 351a75b6d7 BBVectorize: Fix anomalous capital letter in comment
llvm-svn: 173579
2013-01-26 16:49:03 +00:00
Bill Wendling 201d7b2545 Convert BuildLibCalls.cpp to using the AttributeSet methods instead of AttributeWithIndex.
llvm-svn: 173536
2013-01-26 00:03:11 +00:00
Bill Wendling 57625a4966 Remove some introspection functions.
The 'getSlot' function and its ilk allow introspection into the AttributeSet
class. However, that class should be opaque. Allow access through accessor
methods instead.

llvm-svn: 173522
2013-01-25 23:09:36 +00:00
Nadav Rotem 69a040d3eb LoopVectorize: Refactor the code that vectorizes loads/stores to remove duplication.
llvm-svn: 173500
2013-01-25 21:47:42 +00:00
Bill Wendling 8649283e75 Use the new 'getSlotIndex' method to retrieve the attribute's slot index.
llvm-svn: 173499
2013-01-25 21:46:52 +00:00
Benjamin Kramer 21e8da5990 LoopVectorize: Simplify code. No functionality change.
llvm-svn: 173475
2013-01-25 19:43:15 +00:00
Pedro Artigas b95c98faa2 added ability to dynamically change the ExportList of an already
created InternalizePass (useful for pass reuse)

llvm-svn: 173474
2013-01-25 19:41:03 +00:00
Nadav Rotem 8e9ca2f8cb LoopVectorizer: Refactor more code to use the IRBuilder.
llvm-svn: 173471
2013-01-25 19:26:23 +00:00
Nadav Rotem c8adf3ff6e Refactor some code to use the IRBuilder.
llvm-svn: 173467
2013-01-25 18:34:09 +00:00
Evgeniy Stepanov 2cb0fa10c2 [msan] A comment on ICmp handling logic.
llvm-svn: 173453
2013-01-25 15:35:29 +00:00
Evgeniy Stepanov fac8403249 [msan] Implement exact shadow propagation for relational ICmp.
Only for integers, pointers, and vectors of those. No floats.
Instrumentation seems very heavy, and may need to be replaced
with some approximation in the future.

llvm-svn: 173452
2013-01-25 15:31:10 +00:00
Chandler Carruth ceff222dea Switch this code away from Value::isUsedInBasicBlock. That code either
loops over instructions in the basic block or the use-def list of the
value, neither of which are really efficient when repeatedly querying
about values in the same basic block.

What's more, we already know that the CondBB is small, and so we can do
a much more efficient test by counting the uses in CondBB, and seeing if
those account for all of the uses.

Finally, we shouldn't blanket fail on any such instruction, instead we
should conservatively assume that those instructions are part of the
cost.

Note that this actually fixes a bug in the pass because
isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my
next commit, but the fix for it would make this code suddenly take the
compile time hit I thought it already was taking, so I wanted to go
ahead and migrate this code to a faster & better pattern.

The bug in isUsedInBasicBlock was also causing other tests to test the
wrong thing entirely: for example we weren't actually disabling
speculation for floating point operations as intended (and tested), but
the test passed because we failed to speculate them due to the
isUsedInBasicBlock failure.

llvm-svn: 173417
2013-01-25 05:40:09 +00:00
Michael Gottesman 12780c2d97 Added comment to ObjCARC elaborating what is meant by the term 'Provenance' in 'Provenance Analysis'.
llvm-svn: 173374
2013-01-24 21:35:00 +00:00
Benjamin Kramer 1c4e323fdd Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed.
Original commit message:
Plug TTI into the speculation logic, giving it a real cost interface
that can be specialized by targets.

The goal here is not to be more aggressive, but to just be more accurate
with very obvious cases. There are instructions which are known to be
truly free and which were not being modeled as such in this code -- see
the regression test which is distilled from an inner loop of zlib.

Everywhere the TTI cost model is insufficiently conservative I've added
explicit checks with FIXME comments to go add proper modelling of these
cost factors.

If this causes regressions, the likely solution is to make TTI even more
conservative in its cost estimates, but test cases will help here.

llvm-svn: 173357
2013-01-24 16:44:25 +00:00
Chandler Carruth 321c6a7c50 Revert r173342 temporarily. It appears to cause a very late miscompile
of stage2 in a bootstrap. Still investigating....

llvm-svn: 173343
2013-01-24 13:24:24 +00:00
Chandler Carruth 5f4519309f Plug TTI into the speculation logic, giving it a real cost interface
that can be specialized by targets.

The goal here is not to be more aggressive, but to just be more accurate
with very obvious cases. There are instructions which are known to be
truly free and which were not being modeled as such in this code -- see
the regression test which is distilled from an inner loop of zlib.

Everywhere the TTI cost model is insufficiently conservative I've added
explicit checks with FIXME comments to go add proper modelling of these
cost factors.

If this causes regressions, the likely solution is to make TTI even more
conservative in its cost estimates, but test cases will help here.

llvm-svn: 173342
2013-01-24 12:39:29 +00:00
Chandler Carruth 01bffaad03 Address a large chunk of this FIXME by accumulating the cost for
unfolded constant expressions rather than checking each one
independently.

llvm-svn: 173341
2013-01-24 12:05:17 +00:00
Chandler Carruth 8a21005cca Switch the constant expression speculation cost evaluation away from
a cost fuction that seems both a bit ad-hoc and also poorly suited to
evaluating constant expressions.

Notably, it is missing any support for trivial expressions such as
'inttoptr'. I could fix this routine, but it isn't clear to me all of
the constraints its other users are operating under.

The core protection that seems relevant here is avoiding the formation
of a select instruction wich a further chain of select operations in
a constant expression operand. Just explicitly encode that constraint.

Also, update the comments and organization here to make it clear where
this needs to go -- this should be driven off of real cost measurements
which take into account the number of constants expressions and the
depth of the constant expression tree.

llvm-svn: 173340
2013-01-24 11:53:01 +00:00
Chandler Carruth 7481ca8ff5 Rephrase the speculating scan of the conditional BB to be phrased in
terms of cost rather than hoisting a single instruction.

This does *not* change the cost model! We still set the cost threshold
at 1 here, it's just that we track it by accumulating cost rather than
by storing an instruction.

The primary advantage is that we no longer leave no-op intrinsics in the
basic block. For example, this will now move both debug info intrinsics
and a single instruction, instead of only moving the instruction and
leaving a basic block with nothing bug debug info intrinsics in it, and
those intrinsics now no longer ordered correctly with the hoisted value.

Instead, we now splice the entire conditional basic block's instruction
sequence.

This also places the code for checking the safety of hoisting next to
the code computing the cost.

Currently, the only observable side-effect of this change is that debug
info intrinsics are no longer abandoned. I'm not sure how to craft
a test case for this, and my real goal was the refactoring, but I'll
talk to Dave or Eric about how to add a test case for this.

llvm-svn: 173339
2013-01-24 11:52:58 +00:00
Kostya Serebryany e35d59a8d0 [asan] fix 32-bit builds
llvm-svn: 173338
2013-01-24 10:43:50 +00:00
Chandler Carruth 76aacbd874 Simplify the PHI node operand rewriting.
Previously, the code would scan the PHI nodes and build up a small
setvector of candidate value pairs in phi nodes to go and rewrite. Once
certain the rewrite could be performed, the code walks the set, and for
each one re-scans the entire PHI node list looking for nodes to rewrite
operands.

Instead, scan the PHI nodes once to check for hazards, and then scan it
a second time to rewrite the operands to selects. No set vector, and
a max of two scans.

The only downside is that we might form identical selects, but
instcombine or anything else should fold those easily, and it seems
unlikely to happen often.

llvm-svn: 173337
2013-01-24 10:40:51 +00:00
Kostya Serebryany 87191f6221 [asan] adaptive redzones for globals (the larger the global the larger is the redzone)
llvm-svn: 173335
2013-01-24 10:35:40 +00:00
Chandler Carruth e2a779f3a7 Give the basic block variables here names based on the if-then-end
structure being analyzed. No functionality changed.

llvm-svn: 173334
2013-01-24 09:59:39 +00:00
Chandler Carruth 1d20c02f55 Lift a cheap early exit test above loops and other complex early exit
tests. No need to pay the high cost when we're never going to do
anything.

No functionality changed.

llvm-svn: 173331
2013-01-24 08:22:40 +00:00
Chandler Carruth 8a4a16618f Spiff up the comment on this method, making the example a bit more
pretty in doxygen, adding some of the details actually present in
a classic example where this matters (a loop from gzip and many other
compression algorithms), and a cautionary note about the risks inherent
in the transform. This has come up on the mailing lists recently, and
I suspect folks reading this code could benefit from going and looking
at the MI pass that can really deal with these issues.

llvm-svn: 173329
2013-01-24 08:05:06 +00:00
Craig Topper 3529aa5fc2 Remove trailing whitespace.
llvm-svn: 173322
2013-01-24 05:22:40 +00:00
Benjamin Kramer e4c46fec73 Revert "InstCombine: Clean up weird code that talks about a modulus that's long gone."
This causes crashes during the build of compiler-rt during selfhost. Add a
testcase for coverage.

llvm-svn: 173279
2013-01-23 17:52:29 +00:00
Benjamin Kramer cd86115d8a InstCombine: Clean up weird code that talks about a modulus that's long gone.
This does the right thing unless the multiplication overflows, but the old code
didn't handle that case either.

llvm-svn: 173276
2013-01-23 17:16:22 +00:00
Anton Korobeynikov 4ec3ae78b3 Make sure metarenamer won't rename special stuff (intrinsics and explicitly renamed stuff).
Otherwise this might hide the problems.

llvm-svn: 173265
2013-01-23 15:03:08 +00:00
Kostya Serebryany 4766fe6f10 [asan] use ADD instead of OR when applying shadow offset of PowerPC. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55975 for details
llvm-svn: 173258
2013-01-23 12:54:55 +00:00
Duncan Sands 5924545c0c Initialize the components of this class. Otherwise GCC thinks that Array may be
used uninitialized, since it fails to understand that Array is only used when
SingleValue is not, and outputs a warning.  It also seems generally safer given
that the constructor is non-trivial and has plenty of early exits.

llvm-svn: 173242
2013-01-23 09:09:50 +00:00
Bill Wendling d154e283f2 Add the IR attribute 'sspstrong'.
SSPStrong applies a heuristic to insert stack protectors in these situations:

* A Protector is required for functions which contain an array, regardless of
  type or length.

* A Protector is required for functions which contain a structure/union which
  contains an array, regardless of type or length.  Note, there is no limit to
  the depth of nesting.

* A protector is required when the address of a local variable (i.e., stack
  based variable) is exposed. (E.g., such as through a local whose address is
  taken as part of the RHS of an assignment or a local whose address is taken as
  part of a function argument.)

This patch implements the SSPString attribute to be equivalent to
SSPRequired. This will change in a subsequent patch.

llvm-svn: 173230
2013-01-23 06:41:41 +00:00
Bill Wendling 49bc76cbb3 Remove the last of uses that use the Attribute object as a collection of attributes.
Collections of attributes are handled via the AttributeSet class now. This
finally frees us up to make significant changes to how attributes are structured.

llvm-svn: 173228
2013-01-23 06:14:59 +00:00
Nadav Rotem ab3e698ee9 Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards.
For example, this is the hot loop in BZIP:

  do {
    m = *--p;
    *p = ( ... );
  } while (--n);

llvm-svn: 173219
2013-01-23 01:35:00 +00:00
Bill Wendling 430fa9bfb3 Use the AttributeSet when removing multiple attributes. Use Attribute::AttrKind
when removing one attribute. This further encapsulates the use of the attributes.

llvm-svn: 173214
2013-01-23 00:45:55 +00:00
Bill Wendling c0e2a1f457 Use the AttributeSet when adding multiple attributes and an Attribute::AttrKind
when adding a single attribute to the function.

llvm-svn: 173210
2013-01-23 00:20:53 +00:00
Michael Gottesman 8b5515fa1b Fixed typo.
llvm-svn: 173202
2013-01-22 21:53:43 +00:00
Michael Gottesman 9de6f96ad5 [ObjCARC] Refactored out the inner most 2-loops from PerformCodePlacement into the method ConnectTDBUTraversals.
The method PerformCodePlacement was doing too much (i.e. 3x loops, lots of
different checking). This refactoring separates the analysis section of the
method into a separate function while leaving the actual code placement and
analysis preparation in PerformCodePlacement.

*NOTE* Really this part of ObjCARC should be refactored out of the main pass
class into its own seperate class/struct. But, it is not time to make that
change yet though (don't want to make such an invasive change without fixing all
of the bugs first).

llvm-svn: 173201
2013-01-22 21:49:00 +00:00
Bill Wendling 09175b39f2 More encapsulation work.
Use the AttributeSet when we're talking about more than one attribute. Add a
function that adds a single attribute. No functionality change intended.

llvm-svn: 173196
2013-01-22 21:15:51 +00:00
Evgeniy Stepanov dcf6bcb904 [msan] Export the value of msan-keep-going flag for the runtime.
llvm-svn: 173156
2013-01-22 13:26:53 +00:00
Evgeniy Stepanov c4415591ed [msan] Do not insert check on volatile store.
Volatile bitfields can cause valid stores of uninitialized bits.

llvm-svn: 173153
2013-01-22 12:30:52 +00:00
Chandler Carruth 0ba8db45c6 Begin fleshing out an interface in TTI for modelling the costs of
generic function calls and intrinsics. This is somewhat overlapping with
an existing intrinsic cost method, but that one seems targetted at
vector intrinsics. I'll merge them or separate their names and use cases
in a separate commit.

This sinks the test of 'callIsSmall' down into TTI where targets can
control it. The whole thing feels very hack-ish to me though. I've left
a FIXME comment about the fundamental design problem this presents. It
isn't yet clear to me what the users of this function *really* care
about. I'll have to do more analysis to figure that out. Putting this
here at least provides it access to proper analysis pass tools and other
such. It also allows us to more cleanly implement the baseline cost
interfaces in TTI.

With this commit, it is now theoretically possible to simplify much of
the inline cost analysis's handling of calls by calling through to this
interface. That conversion will have to happen in subsequent commits as
it requires more extensive restructuring of the inline cost analysis.

The CodeMetrics class is now really only in the business of running over
a block of code and aggregating the metrics on that block of code, with
the actual cost evaluation done entirely in terms of TTI.

llvm-svn: 173148
2013-01-22 11:26:02 +00:00
Bill Wendling c90d9f89b7 Have AttributeSet::getRetAttributes() return an AttributeSet instead of Attribute.
This further restricts the use of the Attribute class to the Attribute family of
classes.

llvm-svn: 173098
2013-01-21 22:44:49 +00:00
Bill Wendling bd4ea16bf3 Make AttributeSet::getFnAttributes() return an AttributeSet instead of an Attribute.
This is more code to isolate the use of the Attribute class to that of just
holding one attribute instead of a collection of attributes.

llvm-svn: 173094
2013-01-21 21:57:28 +00:00
Paul Redmond 9d86a4a3b6 Transform (sub 0, (zext bool to A)) to (sext bool to A) and
(sub 0, (sext bool to A)) to (zext bool to A).

Patch by Muhammad Ahmad
Reviewed by Duncan Sands

llvm-svn: 173093
2013-01-21 21:57:20 +00:00
Nadav Rotem b2e7e7a0b6 Fix a comment. Induction vars dont need to start at zero.
llvm-svn: 173061
2013-01-21 17:59:18 +00:00
Chandler Carruth bb9caa9241 Switch CodeMetrics itself over to use TTI to determine if an instruction
is free. The whole CodeMetrics API should probably be reworked more, but
this is enough to allow deleting the duplicate code there for computing
whether an instruction is free.

All of the passes using this have been updated to pull in TTI and hand
it to the CodeMetrics stuff. Further, a dead CodeMetrics API
(analyzeFunction) is nuked for lack of users.

llvm-svn: 173036
2013-01-21 13:04:33 +00:00
Chandler Carruth 4319e2948d Make the inline cost a proper analysis pass. This remains essentially
a dynamic analysis done on each call to the routine. However, now it can
use the standard pass infrastructure to reference other analyses,
instead of a silly setter method. This will become more interesting as
I teach it about more analysis passes.

This updates the two inliner passes to use the inline cost analysis.
Doing so highlights how utterly redundant these two passes are. Either
we should find a cheaper way to do always inlining, or we should merge
the two and just fiddle with the thresholds to get the desired behavior.
I'm leaning increasingly toward the latter as it would also remove the
Inliner sub-class split.

llvm-svn: 173030
2013-01-21 11:39:18 +00:00
Chandler Carruth 7e88e74447 Formatting and comment fixes to the always inliner.
Formatting fixes brought to you by clang-format.

llvm-svn: 173029
2013-01-21 11:39:16 +00:00
Chandler Carruth 0df3e5310c Clean up the formatting and doxygen for the simple inliner a bit. No
functionality changed.

llvm-svn: 173028
2013-01-21 11:39:14 +00:00
Benjamin Kramer a6e2e2a0a7 LoopVectorize: Fix a C++11 incompatibility.
llvm-svn: 172990
2013-01-20 20:29:52 +00:00
Nadav Rotem da9f2adffd Fix a build error.
llvm-svn: 172971
2013-01-20 09:39:17 +00:00
Nadav Rotem c42f90b1f4 LoopVectorizer: Implement a new heuristics for selecting the unroll factor.
We ignore the cpu frontend and focus on pipeline utilization. We do this because we
don't have a good way to estimate the loop body size at the IR level.

llvm-svn: 172964
2013-01-20 05:24:29 +00:00
Benjamin Kramer d455ed85d1 LoopVectorizer: Emit memory checks into their own basic block.
This separates the check for "too few elements to run the vector loop" from the
"memory overlap" check, giving a lot nicer code and allowing to skip the memory
checks when we're not going to execute the vector code anyways. We still leave
the decision of whether to emit the memory checks as branches or setccs, but it
seems to be doing a good job. If ugly code pops up we may want to emit them as
separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso.

Most of this is legwork to allow multiple bypass blocks while updating PHIs,
dominators and loop info.

llvm-svn: 172902
2013-01-19 13:57:58 +00:00
Chandler Carruth 1fe21fc0b5 Sort all of the includes. Several files got checked in with mis-sorted
includes.

llvm-svn: 172891
2013-01-19 08:03:47 +00:00
Michael Gottesman 87db357547 Improved comment.
llvm-svn: 172864
2013-01-18 23:02:45 +00:00
Michael Gottesman 9854e0c6a2 Fixed typo in comment.
llvm-svn: 172863
2013-01-18 23:00:33 +00:00
Bill Wendling 658d24d211 Use AttributeSet accessor methods instead of Attribute accessor methods.
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.

llvm-svn: 172853
2013-01-18 21:53:16 +00:00
Bill Wendling 7754389526 Push some more methods down to hide the use of the Attribute class.
Because the Attribute class is going to stop representing a collection of
attributes, limit the use of it as an aggregate in favor of using AttributeSet.
This replaces some of the uses for querying the function attributes.

llvm-svn: 172844
2013-01-18 21:11:39 +00:00
Benjamin Kramer 0eba5775f3 Silence GCC warning about dropping off a non-void function.
llvm-svn: 172839
2013-01-18 19:45:22 +00:00
Alexey Samsonov 46c5a5549e 80 columns
llvm-svn: 172813
2013-01-18 12:49:06 +00:00
Will Dietz b9eb34e100 Move Blacklist.h to include/ to enable use from clang.
llvm-svn: 172806
2013-01-18 11:29:21 +00:00
Craig Topper 45d9f4b569 Check for less than 0 in shuffle mask instead of -1. It's more consistent with other code related to shuffles and easier to implement in compiled code.
llvm-svn: 172788
2013-01-18 05:30:07 +00:00
Craig Topper 2ea22b0b84 Remove trailing whitespace. Remove new lines between closing brace and 'else'
llvm-svn: 172784
2013-01-18 05:09:16 +00:00
Michael Gottesman d359e06245 Fixed 80+ violation.
llvm-svn: 172782
2013-01-18 03:08:39 +00:00
Michael Gottesman 1d777513e5 Added missing const from my last commit.
llvm-svn: 172736
2013-01-17 18:36:17 +00:00
Michael Gottesman 782e34474a [ObjCARC] Implemented operator<< for InstructionClass and changed a ``Visited'' Debug message to use it.
llvm-svn: 172735
2013-01-17 18:32:34 +00:00
Alexey Samsonov 347bcd3c5c ASan: add optional 'zero-based shadow' option to ASan passes. Always tell the values of shadow scale and offset to the runtime
llvm-svn: 172709
2013-01-17 11:12:32 +00:00
Alexey Samsonov 1345d35e40 ASan: wrap mapping scale and offset in a struct and make it a member of ASan passes. Add test for non-default mapping scale and offset. No functionality change
llvm-svn: 172610
2013-01-16 13:23:28 +00:00
Michael Gottesman 6a9355f8d7 [ObjCARC] Turn off ignoring unwind edges in ObjCARC when -fno-objc-arc-exception is enabled due to it's affect on correctness.
Specifically according to the semantics of ARC -fno-objc-arc-exception simply
states that it is expected that the unwind path out of a call *MAY* not release
objects. Thus we can have the situation where a release gets moved into a catch
block which we ignore when we remove a retain/release pair resulting in (even
though we assume the program is exiting anyways) the cleanup code path
potentially blowing up before program exit.

llvm-svn: 172599
2013-01-16 06:32:39 +00:00
Nadav Rotem 7df850924d Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero.
llvm-svn: 172576
2013-01-15 23:43:14 +00:00
Shuxin Yang e822745202 1. Hoist minus sign as high as possible in an attempt to reveal
some optimization opportunities (in the enclosing supper-expressions).

   rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "-0.0 - X" has only one reference.

   rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "0.0 - X" has only one reference, and
        the instruction is marked "noSignedZero".

2. Eliminate negation (The compiler was already able to handle these
    opt if the 0.0s are replaced with -0.0.)

   rule 3: (0.0 - X) * (0.0 - Y) => X * Y
   rule 4: (0.0 - X) * C => X * -C
   if the expr is flagged "noSignedZero".

3. 
  Rule 5: (X*Y) * X => (X*X) * Y
   if X!=Y and the expression is flagged with "UnsafeAlgebra".

   The purpose of this transformation is two-fold:
    a) to form a power expression (of X).
    b) potentially shorten the critical path: After transformation, the
       latency of the instruction Y is amortized by the expression of X*X,
       and therefore Y is in a "less critical" position compared to what it
      was before the transformation. 

4. Remove the InstCombine code about simplifiying "X * select".
   
   The reasons are following:
    a) The "select" is somewhat architecture-dependent, therefore the
       higher level optimizers are not able to precisely predict if
       the simplification really yields any performance improvement
       or not.

    b) The "select" operator is bit complicate, and tends to obscure
       optimization opportunities. It is btter to keep it as low as
       possible in expr tree, and let CodeGen to tackle the optimization.

llvm-svn: 172551
2013-01-15 21:09:32 +00:00
Nadav Rotem d33ce6f100 LoopVectorizer cost model. Honor the user command line flag that selects the vectorization factor even if the target machine does not have any vector registers.
llvm-svn: 172544
2013-01-15 18:25:16 +00:00
Evgeniy Stepanov d14e47b146 [msan] Fix handling of equality comparison of pointer vectors.
Also improve test coveration of the handling of relational comparisons.

llvm-svn: 172539
2013-01-15 16:44:52 +00:00
Jakub Staszak 190db2f25c Remove trailing spaces.
llvm-svn: 172489
2013-01-14 23:16:36 +00:00
Shuxin Yang 320f52a4b0 This change is to implement following rules under the condition C_A and/or C_R
---------------------------------------------------------------------------
 C_A: reassociation is allowed
 C_R: reciprocal of a constant C is appropriate, which means 
    - 1/C is exact, or 
    - reciprocal is allowed and 1/C is neither a special value nor a denormal.
 -----------------------------------------------------------------------------

 rule1:  (X/C1) / C2 => X / (C2*C1)  (if C_A)
                     => X * (1/(C2*C1))  (if C_A && C_R)
 rule 2:  X*C1 / C2 => X * (C1/C2)  if C_A
 rule 3: (X/Y)/Z = > X/(Y*Z)  (if C_A && at least one of Y and Z is symbolic value)
 rule 4: Z/(X/Y) = > (Z*Y)/X  (similar to rule3)

 rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
 rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
 rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)

llvm-svn: 172488
2013-01-14 22:48:41 +00:00
David Greene 530430be98 Fix Casting Bug
Add a const version of getFpValPtr to avoid a cast-away-const warning.

llvm-svn: 172467
2013-01-14 21:04:40 +00:00
Nick Lewycky 80ea003c6c Fix typo in comment.
llvm-svn: 172460
2013-01-14 20:56:10 +00:00
Michael Gottesman e9145d3846 Changed SmallPtrSet.count guard + SmallPtrSet.insert to just SmallPtrSet.insert.
llvm-svn: 172452
2013-01-14 19:18:39 +00:00
Michael Gottesman 4385edf5cb Fixed some 80+ violations.
llvm-svn: 172374
2013-01-14 01:47:53 +00:00
Michael Gottesman 97e3df087d Updated the documentation in ObjCARC.cpp to fit the style guide better (i.e. use doxygen). Still some work to do though.
llvm-svn: 172371
2013-01-14 00:35:14 +00:00
Michael Gottesman f15c0bb495 Fixed an infinite loop in the block escape in analysis in ObjCARC caused by 2x blocks each assigned a value via a phi-node causing each to depend on the other.
A test case is provided as well.

llvm-svn: 172368
2013-01-13 22:12:06 +00:00
Dmitri Gribenko 226fea5bd6 Remove redundant 'llvm::' qualifications
llvm-svn: 172358
2013-01-13 16:01:15 +00:00
Nadav Rotem 40e45eeae2 Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 and i16).
llvm-svn: 172348
2013-01-13 07:56:29 +00:00