Commit Graph

10696 Commits

Author SHA1 Message Date
Nadav Rotem c60d7d96f5 LoopVectorizer: When we vectorizer and widen loops we process many elements at once. This is a good thing, except for
small loops. On small loops post-loop that handles scalars (and runs slower) can take more time to execute than the
rest of the loop. This patch disables widening of loops with a small static trip count.

llvm-svn: 171798
2013-01-07 21:54:51 +00:00
Shuxin Yang df0e61e793 This change is to implement following rules:
o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal)
  o. X/C1 * C2 -> X/(C1/C2)   (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp)

     Let MDC denote multiplication or dividion with one & only one operand being a constant
  o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
     (so long as the constant-folding doesn't yield any denormal or special value)

llvm-svn: 171793
2013-01-07 21:39:23 +00:00
Michael Gottesman 10426b571e Fixed EOL whitespace.
llvm-svn: 171791
2013-01-07 21:26:07 +00:00
Quentin Colombet 3b2db0bcd3 When code size is the priority (Oz, MinSize attribute), help llvm
turning a code like this:

if (foo)
   free(foo)

into that:
free(foo)

Move a call to free from basic block FB into FB's predecessor, P,
when the path from P to FB is taken only if the argument of free is
not equal to NULL.

Some restrictions apply on P and FB to be sure that this code motion
is profitable. Namely:
1. FB must have only one predecessor P.
2. FB must contain only the call to free plus an unconditional
   branch to S.
3. P's successors are FB and S.

Because of 1., we will not increase the code size when moving the call
to free from FB to P.
Because of 2., FB will be empty after the move.
Because of 2. and 3., P's branch instruction becomes useless, so as FB
(simplifycfg will do the job).

llvm-svn: 171762
2013-01-07 18:37:41 +00:00
Chandler Carruth dcb603feef Move TypeFinder.h into the IR tree, it clearly belongs with the IR library.
llvm-svn: 171749
2013-01-07 15:43:51 +00:00
Chandler Carruth 839a98e687 Move CallGraphSCCPass.h into the Analysis tree; that's where the
implementation lives already.

llvm-svn: 171746
2013-01-07 15:26:48 +00:00
Chandler Carruth 683ff2d7f9 Remove the long defunct 'DefaultPasses' header. We have a pass manager
builder these days, and this thing hasn't seen updates for a very long
time.

llvm-svn: 171741
2013-01-07 15:16:50 +00:00
Chandler Carruth 95f83e0155 Sink AddrMode back into TargetLowering, removing one of the most
peculiar headers under include/llvm.

This struct still doesn't make a lot of sense, but it makes more sense
down in TargetLowering than it did before.

llvm-svn: 171739
2013-01-07 15:14:13 +00:00
Chandler Carruth 6e479322aa Remove LSR's use of the random AddrMode struct. These variables were
already in a class, just inline the four of them. I suspect that this
class could be simplified some to not always keep distinct variables for
these things, but it wasn't clear to me how given the usage so I opted
for a trivial and mechanical translation.

This removes one of the two remaining users of a header in include/llvm
which does nothing more than define a 4 member struct.

llvm-svn: 171738
2013-01-07 15:04:40 +00:00
Chandler Carruth 26c59fa870 Switch the SCEV expander and LoopStrengthReduce to use
TargetTransformInfo rather than TargetLowering, removing one of the
primary instances of the layering violation of Transforms depending
directly on Target.

This is a really big deal because LSR used to be a "special" pass that
could only be tested fully using llc and by looking at the full output
of it. It also couldn't run with any other loop passes because it had to
be created by the backend. No longer is this true. LSR is now just
a normal pass and we should probably lift the creation of LSR out of
lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done
this, or updated all of the tests to use opt and a triple, because
I suspect someone more familiar with LSR would do a better job. This
change should be essentially without functional impact for normal
compilations, and only change behvaior of targetless compilations.

The conversion required changing all of the LSR code to refer to the TTI
interfaces, which fortunately are very similar to TargetLowering's
interfaces. However, it also allowed us to *always* expect to have some
implementation around. I've pushed that simplification through the pass,
and leveraged it to simplify code somewhat. It required some test
updates for one of two things: either we used to skip some checks
altogether but now we get the default "no" answer for them, or we used
to have no information about the target and now we do have some.

I've also started the process of removing AddrMode, as the TTI interface
doesn't use it any longer. In some cases this simplifies code, and in
others it adds some complexity, but I think it's not a bad tradeoff even
there. Subsequent patches will try to clean this up even further and use
other (more appropriate) abstractions.

Yet again, almost all of the formatting changes brought to you by
clang-format. =]

llvm-svn: 171735
2013-01-07 14:41:08 +00:00
Silviu Baranga a055aab506 Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same the trasnformation as before to merge them.
llvm-svn: 171730
2013-01-07 12:31:25 +00:00
Chandler Carruth b348328b5d Simplify LoopVectorize to require target transform info and rely on it
being present. Make a member of one of the helper classes a reference as
part of this.

Reformatting goodness brought to you by clang-format.

llvm-svn: 171726
2013-01-07 11:12:29 +00:00
Chandler Carruth b7e60f6844 Merge the unused header file for LoopVectorizer into the source file.
This makes the loop vectorizer match the pattern followed by roughly all
other passses. =]

Notably, this header file was braken in several regards: it contained
a using namespace directive, global #define's that aren't globaly
appropriate, and global constants defined directly in the header file.

As a side benefit, lots of the types in this file become internal, which
will cause the optimizer to chew on this pass more effectively.

llvm-svn: 171723
2013-01-07 10:44:06 +00:00
Chandler Carruth 7383bfd67e Switch BBVectorize to directly depend on having a TTI analysis.
This could be simplified further, but Hal has a specific feature for
ignoring TTI, and so I preserved that.

Also, I needed to use it because a number of tests fail when switching
from a null TTI to the NoTTI nonce implementation. That seems suspicious
to me and so may be something that you need to look into Hal. I worked
it by preserving the old behavior for these tests with the flag that
ignores all target info.

llvm-svn: 171722
2013-01-07 10:22:36 +00:00
Chandler Carruth 04ece8623e Fix a slew of indentation and parameter naming style issues. This 80% of
this patch brought to you by the tool clang-format.

I wanted to fix up the names of constructor parameters because they
followed a bit of an anti-pattern by naming initialisms with CamelCase:
'Tti', 'Se', etc. This appears to have been in an attempt to not overlap
with the names of member variables 'TTI', 'SE', etc. However,
constructor arguments can very safely alias members, and in fact that's
the conventional way to pass in members. I've fixed all of these I saw,
along with making some strang abbreviations such as 'Lp' be simpler 'L',
or 'Lgl' be the word 'Legal'.

However, the code I was touching had indentation and formatting somewhat
all over the map. So I ran clang-format and fixed them.

I also fixed a few other formatting or doxygen formatting issues such as
using ///< on trailing comments so they are associated with the correct
entry.

There is still a lot of room for improvement of the formating and
cleanliness of this code. ;] At least a few parts of the coding
standards or common practices in LLVM's code aren't followed, the enum
naming rules jumped out at me. I may mix some of these while I'm here,
but not all of them.

llvm-svn: 171719
2013-01-07 09:57:00 +00:00
Chandler Carruth 342cc255d0 Switch LoopIdiom pass to directly require target transform information.
I'm sorry for duplicating bad style here, but I wanted to keep
consistency. I've pinged the code review thread where this style was
reviewed and changes were requested.

llvm-svn: 171714
2013-01-07 09:17:41 +00:00
Chandler Carruth 0b4ef9cedc Make SimplifyCFG simply depend upon TargetTransformInfo and pass it
through as a reference rather than a pointer. There is always *some*
implementation of this available, so this simplifies code by not having
to test for whether it is available or not.

Further, it turns out there were piles of places where SimplifyCFG was
recursing and not passing down either TD or TTI. These are fixed to be
more pedantically consistent even though I don't have any particular
cases where it would matter.

llvm-svn: 171691
2013-01-07 03:53:25 +00:00
Chandler Carruth 2109f47d97 Fix the enumerator names for ShuffleKind to match tho coding standards,
and make its comments doxygen comments.

llvm-svn: 171688
2013-01-07 03:20:02 +00:00
Chandler Carruth 50a36cd148 Make the popcnt support enums and methods have more clear names and
follow the conding conventions regarding enumerating a set of "kinds" of
things.

llvm-svn: 171687
2013-01-07 03:16:03 +00:00
Chandler Carruth d3e73556d6 Move TargetTransformInfo to live under the Analysis library. This no
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
2013-01-07 03:08:10 +00:00
Michael Gottesman add0847459 [ObjCARC Debug Message] - Added debug message when fuse a retain/autorelease pair in ObjCARCContract::ContractAutorelease.
llvm-svn: 171679
2013-01-07 00:31:26 +00:00
Michael Gottesman d61a3b2707 [ObjCARC Debug Message] - Added debug message when we zap a matching retain/autorelease pair in ObjCARCOpt::OptimizeReturns.
llvm-svn: 171678
2013-01-07 00:04:56 +00:00
Michael Gottesman 5b970e14e6 [ObjCARC Debug Message] - Added debug message when we erase ARC calls with null since they are no-ops.
llvm-svn: 171677
2013-01-07 00:04:52 +00:00
Michael Gottesman 8800a51ac1 [ObjCARC Debug Message] - Added debug message when we add a nounwind keyword to a function which can not throw.
llvm-svn: 171676
2013-01-06 23:39:13 +00:00
Michael Gottesman 2d76331f86 [ObjCARC Debug Message] - Added debug message when we add a tail keyword to a function which can never be passed stack args.
llvm-svn: 171675
2013-01-06 23:39:09 +00:00
Michael Gottesman 4bf6e7516e [ObjCARC Debug Messages] - Added missing newline.
llvm-svn: 171674
2013-01-06 22:56:54 +00:00
Michael Gottesman a6a1dadeab Added debug statement to ObjCARC when we replace objc_autorelease(x) with objc_release(x) when x is otherwise unused.
llvm-svn: 171673
2013-01-06 22:56:50 +00:00
Michael Gottesman fec61c018d Added 2x Debug statements to ObjCARC that log when we handle the two undefined pointer-to-weak-pointer is NULL cases by replacing the given call inst with an undefined value.
The reason that there are two cases is that the first case handles the unary cases and the second the binary cases.

llvm-svn: 171672
2013-01-06 21:54:30 +00:00
Michael Gottesman dc042f0089 Added debug message in ObjCARC when we remove a no-op cast which has only special semantic meaning in the frontend and thus in the optimizer can be deleted.
llvm-svn: 171670
2013-01-06 21:07:15 +00:00
Michael Gottesman 1bf6908867 Added debug message to ObjCARC when we transform an objc_autoreleaseReturnValue => objc_autorelease due to its operand not being used as a return value.
llvm-svn: 171669
2013-01-06 21:07:11 +00:00
Andrew Trick f950ce8e38 Fix a crash in LSR replaceCongruentIVs.
Indirect branch in the preheader crashes replaceCongruentIVs.
Fixes rdar://12910141.

llvm-svn: 171653
2013-01-06 05:59:39 +00:00
Michael Gottesman def07bba3e Added debug message to ObjCARC when we transform objc_retainAutorelasedReturnValue => objc_retain since the operand to said function is not a return value.
llvm-svn: 171629
2013-01-05 17:55:42 +00:00
Michael Gottesman 5c32ce9d3e Added debug message for ObjCARC when we zap an objc_autoreleaseReturnValue/objc_retainAutoreleasedValue pair.
llvm-svn: 171628
2013-01-05 17:55:35 +00:00
Chris Lattner 473988cf54 switch from pointer equality comparison to MDNode::getMostGenericTBAA
when merging two TBAA tags, pointed out by Nuno.

llvm-svn: 171627
2013-01-05 16:44:07 +00:00
Chandler Carruth 21b3c586ab Switch the loop vectorizer from VTTI to just use TTI directly.
llvm-svn: 171620
2013-01-05 10:16:02 +00:00
Chandler Carruth 7c4f91dea5 Switch the BB vectorizer from the VTTI interface to the simple TTI
interface.

llvm-svn: 171618
2013-01-05 10:05:28 +00:00
Chandler Carruth 6db43e6ca3 Switch SimplifyCFG over to the TargetTransformInfo interface rather than
the ScalarTargetTransformInfo interface.

llvm-svn: 171617
2013-01-05 10:05:26 +00:00
Chandler Carruth 6fe147fb3a Switch LoopIdiomRecognize to directly use the TargetTransformInfo
interface rather than the ScalarTargetTransformInterface.

llvm-svn: 171616
2013-01-05 10:00:09 +00:00
Chandler Carruth c892591596 Sink the AddressingModeMatcher helper class into an anonymous namespace
next to its only user. This helper relies on TargetLowering information
that shouldn't be generally used throughout the Transfoms library, and
so it made little sense as a generic utility.

This also consolidates the file where we need to remove the remaining
uses of TargetLowering in favor of the IR-layer abstract interface in
TargetTransformInfo.

llvm-svn: 171590
2013-01-05 02:09:22 +00:00
Nadav Rotem e9f5bfd5e9 iLoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS.
PR14803.

llvm-svn: 171583
2013-01-05 01:15:47 +00:00
Paul Redmond 874f01e956 Do not vectorize loops with subtraction reductions
Since subtraction does not commute the loop vectorizer incorrectly vectorizes
reductions such as x = A[i] - x.

Disabling for now.

llvm-svn: 171537
2013-01-04 22:10:16 +00:00
Michael Gottesman 1e00ac6256 Added DEBUG message to ObjCARC when we optimize objc_retain => objc_retainAutorelasedReturnValue.
llvm-svn: 171535
2013-01-04 21:30:38 +00:00
Michael Gottesman 9f848aeddd Fixed up some DEBUG messages where I was putting in the text of a message the method where it was being called when I should have just prefixed the actual message with Pass::Method.
Additionally I fixed some whitespace issues.

llvm-svn: 171534
2013-01-04 21:29:57 +00:00
Nadav Rotem 93bd30be9b Fix a warning
llvm-svn: 171525
2013-01-04 21:08:44 +00:00
Nadav Rotem be6570d429 Move the loop vectorizer from O2 to O3. It looks like the increase in code size actually hurts the performance on many programs.
llvm-svn: 171471
2013-01-04 17:57:44 +00:00
Nadav Rotem e1d5c4b8b9 LoopVectorizer:
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.

llvm-svn: 171469
2013-01-04 17:48:25 +00:00
Michael Gottesman 50ae5b28e9 Changed two debug statements that state that a queue had finished being processed when said queue was really a list to state a list had finished being processed.
llvm-svn: 171465
2013-01-03 08:09:27 +00:00
Michael Gottesman ef682c5430 Added DEBUG message for ObjCARC when we zap a push/pop pair in ObjCARCAPElim::OptimizeBB.
llvm-svn: 171464
2013-01-03 08:09:17 +00:00
Michael Gottesman 416dc00cad Added DEBUG message to ObjCARC when we transform objc_initWeak(p, null) => *p = null.
llvm-svn: 171463
2013-01-03 07:32:53 +00:00
Michael Gottesman 00d1f966b4 Added DEBUG message for ObjCARC when an inline asm marker is inserted for architectures where this is required to perform a retainAutoreleasedReturnValue optimization.
llvm-svn: 171462
2013-01-03 07:32:41 +00:00
Nadav Rotem 72f984b596 LoopVectorizer: Add support for loop-unrolling during vectorization for increasing the ILP. At the moment this feature is disabled by default and this commit should not cause any functional changes.
llvm-svn: 171436
2013-01-03 00:52:27 +00:00
Nadav Rotem 4897392360 Avoid vectorization when the function has the "noimplicitflot" attribute.
llvm-svn: 171429
2013-01-02 23:54:43 +00:00
Shuxin Yang 98c844fd89 - Add comment to two functions which might be considered as dead code.
- Fix a typo

llvm-svn: 171399
2013-01-02 18:26:31 +00:00
Chandler Carruth db25c6cf8e Actually update the CMake and Makefile builds correctly, and update the
code that includes Intrinsics.gen directly.

This never showed up in my testing because the old Intrinsics.gen was
still kicking around in the make build system and was correct there. =[
Thankfully, some of the bots to clean rebuilds and that caught this.

llvm-svn: 171373
2013-01-02 12:09:16 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth be81023d74 Resort the #include lines in include/... and lib/... with the
utils/sort_includes.py script.

Most of these are updating the new R600 target and fixing up a few
regressions that have creeped in since the last time I sorted the
includes.

llvm-svn: 171362
2013-01-02 10:22:59 +00:00
Benjamin Kramer 614b5e85b9 Add IRBuilder::CreateVectorSplat and use it to simplify code.
llvm-svn: 171349
2013-01-01 19:55:16 +00:00
Benjamin Kramer c003a4521b SROA: Clean up unused assignment warnings from clang's analyzer.
No functionality change.

llvm-svn: 171348
2013-01-01 16:13:35 +00:00
Michael Gottesman c8a11df33b Added DEBUG message when ObjCARC replaces a call which returns its argument verbatim with its argument to temporarily undo an optimization.
Specifically these calls return their argument verbatim, as a low-level
optimization. However, this makes high-level optimizations
harder. We undo any uses of this optimization that the front-end
emitted. We redo them later in the contract pass.

llvm-svn: 171346
2013-01-01 16:05:54 +00:00
Michael Gottesman 3f146e204e Added DEBUG messages to the top of several processing loops in ObjCARC.cpp that emit what instructions are being visited.
This is a part of a larger effort of adding DEBUG messages to the ARC
Optimizer Backend.

llvm-svn: 171345
2013-01-01 16:05:48 +00:00
Jakub Staszak c48bbe7170 Add extra CHECK to make sure that 'or' instruction was replaced.
Also add an assert to avoid confusion in the code where is known that C1 <= C2.

llvm-svn: 171310
2012-12-31 18:26:42 +00:00
Chris Lattner f5cca68c2c Fix LICM's memory promotion optimization to preserve TBAA tags when
promoting a store in a loop.  This was noticed when working on PR14753,
but isn't directly related.

llvm-svn: 171281
2012-12-31 08:37:17 +00:00
Chris Lattner eeefe1bc07 teach instcombine to preserve TBAA tag when merging two stores, part of
PR14753

llvm-svn: 171279
2012-12-31 08:10:58 +00:00
Jakub Staszak f584977df2 Grammo.
llvm-svn: 171272
2012-12-31 01:40:44 +00:00
Bill Wendling 6e95ae803a Remove the getAttributesAtIndex and getNumAttrs methods in favor of using the getAttrSomewhere predicate. This prevents the uses of 'Attribute' as a collection of attributes.
llvm-svn: 171271
2012-12-31 00:49:59 +00:00
Jakub Staszak ea2b9b9d67 Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1
if C1 and C2 differ only with one bit.
Fixes PR14708.

llvm-svn: 171270
2012-12-31 00:34:55 +00:00
Nuno Lopes b6ad98224a convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset().
The later API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares).
There are a few more places left with duplicated code, which I'll remove soon.

llvm-svn: 171259
2012-12-30 16:25:48 +00:00
Bill Wendling 94dcaf8e2b Remove Function::getParamAttributes and use the AttributeSet accessor methods instead.
llvm-svn: 171255
2012-12-30 12:45:13 +00:00
Bill Wendling 698e84fc4f Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
2012-12-30 10:32:01 +00:00
Nadav Rotem 0b37f14371 LoopVectorizer: Fix a bug in the code that updates the loop exiting block.
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.

PR14725

llvm-svn: 171251
2012-12-30 07:47:00 +00:00
Alexey Samsonov 3efc87e92d Add proper support for -fsanitize-blacklist= flag for TSan and MSan. LLVM part.
llvm-svn: 171183
2012-12-28 09:30:44 +00:00
Chandler Carruth e40e60eed5 Make this parameter be named consistently with most other
getAnalysisUsage implementations.

llvm-svn: 171157
2012-12-27 11:17:15 +00:00
Alexey Samsonov 29dd7f2090 [ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it describes one of 'interesting' allocas. Assume that allocas can go through casts and phi-nodes before apperaring as llvm.lifetime arguments
llvm-svn: 171153
2012-12-27 08:50:58 +00:00
Nadav Rotem 5350cd314b If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified.
PR14719.

llvm-svn: 171124
2012-12-26 23:30:53 +00:00
Nick Lewycky 90053a1214 Remove mid-optimizer warning. This situation should be handled differently,
such as by a compiler warning, a check in clang -fsanitizer=undefined, being
optimized to unreachable, or a combination of the above. PR14722.

llvm-svn: 171119
2012-12-26 22:00:35 +00:00
Nadav Rotem 3f7c4f36ba LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1
llvm-svn: 171114
2012-12-26 19:08:17 +00:00
Evgeniy Stepanov 5eb5bf8b46 [msan] Raise alignment of origin stores/loads when possible.
Origin alignment is as high as the alignment of the corresponding application
location, but never less than 4.

llvm-svn: 171110
2012-12-26 11:55:09 +00:00
Evgeniy Stepanov d8be0c510c [msan] Expand the file comment with track-origins info.
llvm-svn: 171109
2012-12-26 10:59:00 +00:00
Hal Finkel 30e95a8ebb BBVectorize: Use VTTI to compute costs for intrinsics vectorization
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.

Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.

llvm-svn: 171079
2012-12-26 01:36:57 +00:00
Hal Finkel b44f890133 LoopVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171076
2012-12-25 23:21:29 +00:00
Hal Finkel 2a456112ec BBVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171075
2012-12-25 22:36:08 +00:00
Evgeniy Stepanov f19c086d1e [msan] Fix handling of vectors of pointers.
VectorType::getInteger() can not be used with them, because pointer size
depends on the target.

llvm-svn: 171070
2012-12-25 16:04:38 +00:00
Evgeniy Stepanov ec8371283b [msan] Fix handling of select with vector condition.
llvm-svn: 171069
2012-12-25 14:56:21 +00:00
Alexey Samsonov 788381b8ac ASan: initialize callbacks from ASan module pass in a separate function for consistency
llvm-svn: 171061
2012-12-25 12:28:20 +00:00
Alexey Samsonov 1e3f7ba8f7 ASan: move stack poisoning logic into FunctionStackPoisoner struct
llvm-svn: 171060
2012-12-25 12:04:36 +00:00
Bob Wilson 4ed23578da Add LLVMContext::emitWarning methods and use them. <rdar://problem/12867368>
When the backend is used from clang, it should produce proper diagnostics
instead of just printing messages to errs(). Other clients may also want to
register their own error handlers with the LLVMContext, and the same handler
should work for warnings in the same way as the existing emitError methods.

llvm-svn: 171041
2012-12-24 18:15:21 +00:00
Nadav Rotem 5f7c12cfbd LoopVectorizer: When checking for vectorizable types, also check
the StoreInst operands.

PR14705.

llvm-svn: 171023
2012-12-24 09:14:18 +00:00
Alexey Samsonov 098842b401 Fix typo in comments
llvm-svn: 171021
2012-12-24 08:52:53 +00:00
Nadav Rotem bd5d1d832a LoopVectorizer: Fix an endless loop in the code that looks for reductions.
The bug was in the code that detects PHIs in if-then-else block sequence.

PR14701.

llvm-svn: 171008
2012-12-24 01:22:06 +00:00
Benjamin Kramer 28691400dd LoopVectorize: Fix accidentaly inverted condition.
llvm-svn: 171001
2012-12-23 13:21:41 +00:00
Benjamin Kramer 855ba03408 LoopVectorize: For scalars and void types there is no need to compute vector insert/extract costs.
Fixes an assert during the build of oggenc in the test suite.

llvm-svn: 171000
2012-12-23 13:19:18 +00:00
Nadav Rotem 2cade68025 Loop Vectorizer: Update the cost model of scatter/gather operations and make
them more expensive.

llvm-svn: 170995
2012-12-23 07:23:55 +00:00
Craig Topper 4c94775198 Remove trailing whitespace
llvm-svn: 170990
2012-12-22 18:09:02 +00:00
Bill Wendling c79e42c5ce Change 'AttrVal' to 'AttrKind' to better reflect that it's a kind of attribute instead of the value of the attribute.
llvm-svn: 170972
2012-12-22 00:37:52 +00:00
Roman Divacky a229186a82 Remove duplicate includes.
llvm-svn: 170902
2012-12-21 17:06:44 +00:00
Evgeniy Stepanov 4fbc0d08bf [msan] Remove unreachable blocks before instrumenting a function.
llvm-svn: 170883
2012-12-21 11:18:49 +00:00
Nadav Rotem 3b850b70b3 Enable if-conversion.
llvm-svn: 170841
2012-12-21 04:47:54 +00:00
Evan Cheng 99cafb1db2 Every pass deserves a name, even codegenprep.
llvm-svn: 170831
2012-12-21 01:48:14 +00:00
Nadav Rotem a4b53f20a3 BB-Vectorizer: Check the cost of the store pointer type
and not the return type, which is void. A number of test
cases fail after adding the assertion in TTImpl.

llvm-svn: 170828
2012-12-21 01:24:36 +00:00
Nadav Rotem e7785686a5 Fix a bug in the code that checks if we can vectorize loops while using dynamic
memory bound checks.  Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:

for ( k=1 ; k<n ; k++ )
  x[k] = x[k-1] + y[k];

llvm-svn: 170811
2012-12-21 00:07:35 +00:00
Nadav Rotem 2ababf68d7 LoopVectorize: Fix a bug in the scalarization of instructions.
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.

This fixes External/SPEC/CINT95/099_go/099_go

llvm-svn: 170756
2012-12-20 20:24:40 +00:00
Nadav Rotem 8b20c0a814 Loop Vectorizer: turn-off if-conversion.
llvm-svn: 170708
2012-12-20 17:42:53 +00:00
James Molloy 4f6fb953a7 Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call.
Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage).

llvm-svn: 170704
2012-12-20 16:04:27 +00:00
Craig Topper ae48cb2e5a Formatting fixes. Remove some unnecessary 'else' after 'return'. No functional change.
llvm-svn: 170676
2012-12-20 07:15:54 +00:00
Craig Topper 9d4171afed Removing trailing whitespace
llvm-svn: 170675
2012-12-20 07:09:41 +00:00
Nadav Rotem 7bdc45b570 Loop Vectorizer: Enable if-conversion.
llvm-svn: 170632
2012-12-20 02:00:02 +00:00
Nadav Rotem 28408a20c9 whitespace
llvm-svn: 170626
2012-12-20 00:49:56 +00:00
Paul Redmond 5917f4c715 Transform (x&C)>V into (x&C)!=0 where possible
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel

llvm-svn: 170576
2012-12-19 19:47:13 +00:00
Evgeniy Stepanov abeae5c7d5 [msan] Add track-origins argument to the pass constructor.
llvm-svn: 170544
2012-12-19 13:55:51 +00:00
Evgeniy Stepanov d7571cd4bc [msan] Heuristically instrument unknown intrinsics.
This changes adds shadow and origin propagation for unknown intrinsics
by examining the arguments and ModRef behaviour. For now, only 3 classes
of intrinsics are handled:
- those that look like simple SIMD store
- those that look like simple SIMD load
- those that don't have memory effects and look like arithmetic/logic/whatever
  operation on simple types.

llvm-svn: 170530
2012-12-19 11:22:04 +00:00
Benjamin Kramer e300004bd5 LoopVectorize: Make iteration over induction variables not depend on pointer values.
MapVector is a bit heavyweight, but I don't see a simpler way. Also the
InductionList is unlikely to be large. This should help 3-stage selfhost
compares (PR14647).

llvm-svn: 170528
2012-12-19 11:09:15 +00:00
Bill Wendling d97b75d816 Inline the 'hasIncompatibleWithVarArgsAttrs' method into its only uses. And some minor comment reformatting.
llvm-svn: 170516
2012-12-19 08:57:40 +00:00
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Shuxin Yang 5b841c4a64 Make sure the buffer, which containas an instance of APFloat, has proper alignment.
llvm-svn: 170486
2012-12-19 01:10:17 +00:00
Shuxin Yang 37a1efe1c6 rdar://12801297
InstCombine for unsafe floating-point add/sub.

llvm-svn: 170471
2012-12-18 23:10:12 +00:00
Nadav Rotem 9aee065e3c Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang.
llvm-svn: 170470
2012-12-18 23:09:44 +00:00
Benjamin Kramer f0e5d2f032 LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations.
For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
	movdqa	%xmm0, %xmm1
	movhlps	%xmm1, %xmm1            ## xmm1 = xmm1[1,1]
	paddw	%xmm0, %xmm1
	pshufd	$1, %xmm1, %xmm0        ## xmm0 = xmm1[1,0,0,0]
	paddw	%xmm1, %xmm0
	phaddw	%xmm0, %xmm0
	pextrb	$0, %xmm0, %edx

instead of
	pextrb	$2, %xmm0, %esi
	pextrb	$0, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$4, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$6, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$8, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$10, %xmm0, %edi
	pextrb	$14, %xmm0, %edx
	addb	%sil, %dil
	pextrb	$12, %xmm0, %esi
	addb	%dil, %sil
	addb	%sil, %dl

llvm-svn: 170439
2012-12-18 18:40:20 +00:00
Nadav Rotem c0699854dd Enable the loop vectorizer.
llvm-svn: 170416
2012-12-18 06:37:12 +00:00
Nadav Rotem a5024fc3e1 SROA: Replace calls to getScalarSizeInBits to DataLayout's API because
getScalarSizeInBits could not handle vectors of pointers.

llvm-svn: 170412
2012-12-18 05:23:31 +00:00
Rafael Espindola 46b9c8a2cd Initialize NoRedZone and remove unused default values.
llvm-svn: 170404
2012-12-18 03:35:05 +00:00
Chandler Carruth e3f4119b06 Fix another SROA crasher, PR14601.
This was a silly oversight, we weren't pruning allocas which were used
by variable-length memory intrinsics from the set that could be widened
and promoted as integers. Fix that.

llvm-svn: 170353
2012-12-17 18:48:07 +00:00
Evgeniy Stepanov 88b8dceddf [msan] Fix lint warning.
llvm-svn: 170347
2012-12-17 16:30:05 +00:00
Chandler Carruth 21eb4e96c2 Teach the rewriting of memcpy calls to support subvector copies.
This also cleans up a bit of the memcpy call rewriting by sinking some
irrelevant code further down and making the call-emitting code a bit
more concrete.

Previously, memcpy of a subvector would actually miscompile (!!!) the
copy into a single vector element copy. I have no idea how this ever
worked. =/ This is the memcpy half of PR14478 which we probably weren't
noticing previously because it didn't actually assert.

The rewrite relies on the newly refactored insert- and extractVector
functions to do the heavy lifting, and those are the same as used for
loads and stores which makes the test coverage a bit more meaningful
here.

llvm-svn: 170338
2012-12-17 14:51:24 +00:00
Evgeniy Stepanov 95a80abead Optimize tree walking in markAliveBlocks.
Check whether a BB is known as reachable before adding it to the worklist.
This way BB's with multiple predecessors are added to the list no more than
once.

llvm-svn: 170335
2012-12-17 14:28:00 +00:00
Chandler Carruth cacda256a1 Fix a secondary bug I introduced while fixing the first part of PR14478.
The first half of fixing this bug was actually in r170328, but was
entirely coincidental. It did however get me to realize the nature of
the bug, and adapt the test case to test more interesting behavior. In
turn, that uncovered the rest of the bug which I've fixed here.

This should fix two new asserts that showed up in the vectorize nightly
tester.

llvm-svn: 170333
2012-12-17 14:03:01 +00:00
Chandler Carruth 95e1fb8a42 Hoist a convertValue call to the two paths where it is needed.
I noticed this while looking at r170328. We only ever do a vector
rewrite when the alloca *is* the vector type, so it's good to not paper
over bugs here by doing a convertValue that isn't needed.

llvm-svn: 170331
2012-12-17 13:51:03 +00:00
Chandler Carruth ce4562bdcb Hoist the insertVector helper to be a static helper.
This will allow its use inside of memcpy rewriting as well. This routine
is more complex than extractVector, and some of its uses are not 100%
where I want them to be so there is still some work to do here.

While this can technically change the output in some cases, it shouldn't
be a change that matters -- IE, it can leave some dead code lying around
that prior versions did not, etc.

Yet another step in the refactorings leading up to the solution to the
last component of PR14478.

llvm-svn: 170328
2012-12-17 13:41:21 +00:00
Chandler Carruth b6bc8749e8 Lift the extractVector helper all the way out to a static helper function.
The method helpers all implicitly act upon the alloca, and what we
really want is a fully generic helper. Doing memcpy rewrites is more
special than all other rewrites because we are at times rewriting
instructions which touch pointers *other* than the alloca. As
a consequence all of the helpers needed by memcpy rewriting of
sub-vector copies will need to be generalized fully.

Note that all of these helpers ({insert,extract}{Integer,Vector}) are
woefully uncommented. I'm going to go back through and document them
once I get the factoring correct.

No functionality changed.

llvm-svn: 170325
2012-12-17 13:07:30 +00:00
Chandler Carruth 769445ef03 Factor the vector load rewriting into a more generic form.
This makes it suitable for use in rewriting memcpy in the presence of
subvector memcpy intrinsics.

No functionality changed.

llvm-svn: 170324
2012-12-17 12:50:21 +00:00
Chandler Carruth ccca504f3a Fix the first part of PR14478: memset now works.
PR14478 highlights a serious problem in SROA that simply wasn't being
exercised due to a lack of vector input code mixed with C-library
function calls. Part of SROA was written carefully to handle subvector
accesses via memset and memcpy, but the rewriter never grew support for
this. Fixing it required refactoring the subvector access code in other
parts of SROA so it could be shared, and then fixing the splat formation
logic and using subvector insertion (this patch).

The PR isn't quite fixed yet, as memcpy is still broken in the same way.
I'm starting on that series of patches now.

Hopefully this will be enough to bring the bullet benchmark back to life
with the bb-vectorizer enabled, but that may require fixing memcpy as
well.

llvm-svn: 170301
2012-12-17 04:07:37 +00:00
Chandler Carruth eae65a5629 Extract the logic for inserting a subvector into a vector alloca.
No functionality changed. Another step of refactoring toward solving
PR14487.

llvm-svn: 170300
2012-12-17 04:07:35 +00:00
Chandler Carruth 514f34f9c4 Lift the integer splat computation into a helper function.
No functionality changed. Refactoring leading up to the fix for PR14478
which requires some significant changes to the memset and memcpy
rewriting.

llvm-svn: 170299
2012-12-17 04:07:30 +00:00
Chandler Carruth 067edd342f Relax an overly aggressive assert to fix PR14572.
The alloca width is based on the alloc size, not the type size.

llvm-svn: 170270
2012-12-15 09:26:06 +00:00
NAKAMURA Takumi 8f45b6c709 Revert r170246, "Enable the loop vectorizer by default."
llvm-svn: 170267
2012-12-15 06:11:13 +00:00
Michael Ilseman e2754dc887 Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself
llvm-svn: 170248
2012-12-14 22:08:26 +00:00
Nadav Rotem acde77481d Enable the loop vectorizer by default.
llvm-svn: 170246
2012-12-14 21:30:23 +00:00
Shuxin Yang f8e9a5a061 rdar://12753946
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0"

llvm-svn: 170226
2012-12-14 18:46:06 +00:00
Evgeniy Stepanov 9b72e991c6 Fix lint warnings in MemorySanitizer.cpp.
llvm-svn: 170203
2012-12-14 13:48:31 +00:00
Evgeniy Stepanov 49175b237d [msan] Origin stores and loads do not need explicit alignment.
Origin address is always 4 byte aligned, and the access type is always i32.

llvm-svn: 170199
2012-12-14 13:43:11 +00:00
Evgeniy Stepanov f18e3af11f [msan] Refactor default shadow propagation and origin tracking.
This change moves the code for default shadow propagaition (handleShadowOr)
and origin tracking (setOriginForNaryOp) into a new builder-like class. Also
gets rid of handleShadowOrBinary.

llvm-svn: 170192
2012-12-14 12:54:18 +00:00
Nadav Rotem d3a3c9fdd5 revert r170166 - disable the loop vectorizer.
llvm-svn: 170172
2012-12-14 01:57:00 +00:00
Nadav Rotem 3b606d6fd5 Enable the loop vectorizer.
llvm-svn: 170166
2012-12-14 00:30:34 +00:00
Nadav Rotem b4ea4b3751 Disable the loop vectorizer.
llvm-svn: 170162
2012-12-14 00:02:07 +00:00
Nadav Rotem e5e28b48c8 Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today.
llvm-svn: 170157
2012-12-13 23:11:54 +00:00
NAKAMURA Takumi 38d2b2442f Revert r170020, "Simplify negated bit test", for now.
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.

This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.

llvm-svn: 170128
2012-12-13 14:28:16 +00:00
Eric Christopher a1bbeeca72 Revert "Restore the PHI optimization I accidently removed" temporarily since
it seems to be breaking self-host for a few people and is PR14592.

This reverts commit r170024.

llvm-svn: 170106
2012-12-13 06:48:05 +00:00
Rafael Espindola a2c107e661 Missed these calls from the previous rename somehow.
llvm-svn: 170094
2012-12-13 03:42:31 +00:00
Rafael Espindola 319f74cd11 Rename isPowerOfTwo to isKnownToBeAPowerOfTwo.
In a previous thread it was pointed out that isPowerOfTwo is not a very precise
name since it can return false for powers of two if it is unable to show that
they are powers of two.

llvm-svn: 170093
2012-12-13 03:37:24 +00:00
Michael Ilseman 536cc32ba0 Pattern matching code for intrinsics.
Provides m_Argument that allows matching against a CallSite's specified argument. Provides m_Intrinsic pattern that can be templatized over the intrinsic id and bind/match arguments similarly to other pattern matchers. Implementations provided for 0 to 4 arguments, though it's very simple to extend for more. Also provides example template specialization for bswap (m_BSwap) and example of code cleanup for its use.

llvm-svn: 170091
2012-12-13 03:13:36 +00:00
Quentin Colombet c0dba2035a Take into account minimize size attribute in the inliner.
Better controls the inlining of functions when the caller function has MinSize attribute.
Basically, when the caller function has this attribute, we do not "force" the inlining
of callee functions carrying the InlineHint attribute (i.e., functions defined with
inline keyword)

llvm-svn: 170065
2012-12-13 01:05:25 +00:00
Nadav Rotem 36510f7194 Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
llvm-svn: 170051
2012-12-13 00:21:03 +00:00
Chad Rosier e28ae30a8e Typo.
llvm-svn: 170050
2012-12-13 00:18:46 +00:00
Michael Ilseman 3c814128cd Restore the PHI optimization I accidently removed
llvm-svn: 170024
2012-12-12 20:59:36 +00:00
Michael Ilseman 9fc0f258fa Remove trailing whitespace
llvm-svn: 170022
2012-12-12 20:57:53 +00:00
David Majnemer 5226aa94ce Simplify negated bit test
llvm-svn: 170020
2012-12-12 20:48:54 +00:00
Nadav Rotem 6027bdf898 Fix indentation.
llvm-svn: 170005
2012-12-12 19:39:36 +00:00
Nadav Rotem d0bb22bba3 LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
llvm-svn: 170004
2012-12-12 19:29:45 +00:00
Rafael Espindola e40238069e The TargetData is not used for the isPowerOfTwo determination. It has never
been used in the first place.  It simply was passed to the function and to the
recursive invocations.  Simply drop the parameter and update the callers for the
new signature.

Patch by Saleem Abdulrasool!

llvm-svn: 169988
2012-12-12 16:52:40 +00:00
Alexey Samsonov 3d43b63a6e Improve debug info generated with enabled AddressSanitizer.
When ASan replaces <alloca instruction> with
<offset into a common large alloca>, it should also patch
llvm.dbg.declare calls and replace debug info descriptors to mark
that we've replaced alloca with a value that stores an address
of the user variable, not the user variable itself.

See PR11818 for more context.

llvm-svn: 169984
2012-12-12 14:31:53 +00:00
Nadav Rotem 6798a04b15 Fix the ascii drawing that was ruined when I split the H and CPP
llvm-svn: 169955
2012-12-12 01:33:47 +00:00
Nadav Rotem 4fa2e3d5af fix a typo.
llvm-svn: 169953
2012-12-12 01:31:10 +00:00
Nadav Rotem aeb17df802 LoopVectorizer: When -Os is used, vectorize only loops that dont require a tail loop. There is no testcase because I dont know of a way to initialize the loop vectorizer pass without adding an additional hidden flag.
llvm-svn: 169950
2012-12-12 01:11:46 +00:00
Shuxin Yang 81b3678564 - Fix a problematic way in creating all-the-1 APInt.
- Propagate "exact" bit of [l|a]shr instruction.

llvm-svn: 169942
2012-12-12 00:29:03 +00:00
Michael Ilseman d5787be5ba Remove redunant optimizations from InstCombine, instead call the appropriate functions from SimplifyInstruction
llvm-svn: 169941
2012-12-12 00:28:32 +00:00
Nadav Rotem f707bf4ca3 PR14574. Fix a bug in the code that calculates the mask the converted PHIs in if-conversion.
llvm-svn: 169916
2012-12-11 21:30:14 +00:00
Nadav Rotem e266efb70b Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars.
llvm-svn: 169904
2012-12-11 18:58:10 +00:00
Rafael Espindola a92da5b34f Use an ArrayRef instead of a std::vector&.
llvm-svn: 169881
2012-12-11 16:36:02 +00:00
Evgeniy Stepanov d2bd319adc [msan] Use explicitely aligned stores and loads with function argument shadow.
Use explicitely aligned store and load instructions to deal with argument and
retval shadow. This matters when an argument's alignment is higher than
__msan_param_tls alignment (which is the case with __m128i).

llvm-svn: 169859
2012-12-11 12:34:09 +00:00
Patrik Hagglund e98b7a0389 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
llvm-svn: 169854
2012-12-11 11:14:33 +00:00
Patrik Hagglund cbc9d4d0f9 Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT.
llvm-svn: 169840
2012-12-11 09:39:09 +00:00
Nadav Rotem dbb3328194 Fix PR14565. Don't if-convert loops that have switch statements in them.
llvm-svn: 169813
2012-12-11 04:55:10 +00:00
Nadav Rotem 36cdd82627 Enable the loop vectorizer only on O2 and above. (Still disabled by default)
llvm-svn: 169774
2012-12-10 21:45:01 +00:00
Nadav Rotem 07df5ac1a1 Split the LoopVectorizer into H and CPP.
llvm-svn: 169771
2012-12-10 21:39:02 +00:00
Bill Wendling 74f334e476 Don't use a red zone for code coverage if the user specified `-mno-red-zone'.
The `-mno-red-zone' flag wasn't being propagated to the functions that code
coverage generates. This allowed some of them to use the red zone when that
wasn't allowed.
<rdar://problem/12843084>

llvm-svn: 169754
2012-12-10 19:46:49 +00:00
Nadav Rotem 7b5b55c195 Add support for reverse induction variables. For example:
while (i--)
 sum+=A[i];

llvm-svn: 169752
2012-12-10 19:25:06 +00:00
Chandler Carruth e41e7b7901 Add a new visitor for walking the uses of a pointer value.
This visitor provides infrastructure for recursively traversing the
use-graph of a pointer-producing instruction like an alloca or a malloc.
It maintains a worklist of uses to visit, so it can handle very deep
recursions. It automatically looks through instructions which simply
translate one pointer to another (bitcasts and GEPs). It tracks the
offset relative to the original pointer as long as that offset remains
constant and exposes it during the visit as an APInt offset. Finally, it
performs conservative escape analysis.

However, currently it has some limitations that should be addressed
going forward:
1) It doesn't handle vectors of pointers.
2) It doesn't provide a cheaper visitor when the constant offset
   tracking isn't needed.
3) It doesn't support non-instruction pointer values.

The current functionality is exactly what is required to implement the
SROA pointer-use visitors in terms of this one, rather than in terms of
their own ad-hoc base visitor, which was always very poorly specified.
SROA has been converted to use this, and the code there deleted which
this utility now provides.

Technically speaking, using this new visitor allows SROA to handle a few
more cases than it previously did. It is now more aggressive in ignoring
chains of instructions which look like they would defeat SROA, but in
fact do not because they never result in a read or write of memory.
While this is "neat", it shouldn't be interesting for real programs as
any such chains should have been removed by others passes long before we
get to SROA. As a consequence, I've not added any tests for these
features -- it shouldn't be part of SROA's contract to perform such
heroics.

The goal is to extend the functionality of this visitor going forward,
and re-use it from passes like ASan that can benefit from doing
a detailed walk of the uses of a pointer.

Thanks to Ben Kramer for the code review rounds and lots of help
reviewing and debugging this patch.

llvm-svn: 169728
2012-12-10 08:28:39 +00:00
Chandler Carruth e45f4658a3 Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores.
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.

In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.

The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.

llvm-svn: 169719
2012-12-10 00:54:45 +00:00
Paul Redmond 2adb13c100 LoopVectorize: support vectorizing intrinsic calls
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav

llvm-svn: 169711
2012-12-09 20:42:17 +00:00
Paul Redmond f7cd6b391a test commit.
llvm-svn: 169709
2012-12-09 19:46:31 +00:00
Jakub Staszak 8432185ec2 Use m_OneUse pattern instead of hasOneUse() method.
No functionality change.

llvm-svn: 169703
2012-12-09 16:06:44 +00:00
Jakub Staszak 538e3861e3 Remove trailing spaces.
llvm-svn: 169701
2012-12-09 15:37:46 +00:00
Chandler Carruth 93ff2447ec Switch SROA to pop Uses off the back of its visitors' queues.
This will more closely match the behavior of the new PtrUseVisitor that
I am adding. Hopefully this will not change the actual behavior in any
way, but by making the processing order more similar help in debugging.

llvm-svn: 169697
2012-12-09 11:56:01 +00:00
Shuxin Yang 95de7c37e2 - Re-enable population count loop idiom recognization
- fix a bug which cause sigfault.
- add two testing cases which was causing crash

llvm-svn: 169687
2012-12-09 03:12:46 +00:00
Chandler Carruth 91e47532fe Revert the patches adding a popcount loop idiom recognition pass.
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.

This reverts the commits:
  r169671: Fix a logic error.
  r169604: Move the popcnt tests to an X86 subdirectory.
  r168931: Initial commit adding the pass.

llvm-svn: 169683
2012-12-08 22:18:29 +00:00
Shuxin Yang 9c5c97647f Fix an inadvertent typo error.
llvm-svn: 169671
2012-12-08 05:00:59 +00:00
Bill Wendling e94d843e43 s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future.
llvm-svn: 169651
2012-12-07 23:16:57 +00:00
Evgeniy Stepanov 383b61e791 [msan] Remove readonly/readnone attributes from all called functions.
MSan uses a TLS slot to pass shadow for function arguments and return values.
This makes all instrumented functions not readonly, and at the same time
requires that all callees of an instrumented function that may be
MSan-instrumented do not have readonly attribute (otherwise some of the
instrumentation may be optimized out).

llvm-svn: 169591
2012-12-07 09:08:32 +00:00
Jakub Staszak 772b893e5d Remove unused field.
llvm-svn: 169551
2012-12-06 22:08:59 +00:00
Jakub Staszak 9525a77bf5 Remove trailing spaces.
llvm-svn: 169550
2012-12-06 21:57:16 +00:00
NAKAMURA Takumi e0b1b4645a MemorySanitizer.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 169504
2012-12-06 13:38:00 +00:00
Evgeniy Stepanov 47ac9ba9cc [msan] Fix a typo in a comment.
llvm-svn: 169491
2012-12-06 11:58:59 +00:00
Evgeniy Stepanov 4f220d96c5 [msan] Do not store origin for clean values.
Instead of unconditionally storing origin with every application store,
only do this when the shadow of the stored value is != 0.

This change also delays instrumentation of stores until after the walk over
function's instructions, because adding new basic blocks confuses InstVisitor.

We only keep 1 origin value per 4 bytes of application memory. This change
fixes the bug when a store of a single clean byte wiped the origin for the
whole 4-byte area.

Since stores of uninitialized values are relatively uncommon, this change
improves performance of track-origins mode by 5% median and by up to 47% on
specs.

llvm-svn: 169490
2012-12-06 11:41:03 +00:00
Bill Wendling ab417b644c Set the 'MadeChange' variable if we are deleting blocks.
llvm-svn: 169455
2012-12-06 00:30:20 +00:00
Evgeniy Stepanov 8b51bab495 [msan] Instrument bswap intrinsic.
llvm-svn: 169383
2012-12-05 14:39:55 +00:00
Evgeniy Stepanov 94b257df3c [msan] Initialize callbacks in runOnFunction as opposed to doInitialization.
This mirrors the change in ASan & TSan done in r168864.

llvm-svn: 169378
2012-12-05 13:14:33 +00:00
Evgeniy Stepanov 474cb3b3b5 [msan] Change linkage type of __msan_track_origins.
LinkOnceODRLinkage globals may be removed in GlobalOpt if not used in the
current module.

llvm-svn: 169377
2012-12-05 12:49:41 +00:00
Nadav Rotem a8f026e2d4 LoopVectorizer: Increase the number of pointers that can be tested at runtime. If we cant prove statically that the pointers are disjoint then we add the runtime check.
llvm-svn: 169334
2012-12-04 23:25:24 +00:00
Nadav Rotem 87fc988c5d Enable if-conversion during vectorization.
llvm-svn: 169331
2012-12-04 22:59:52 +00:00
Nadav Rotem 93fa5ef957 Fix a bug in vectorization of if-converted reduction variables. If the
reduction variable is not used outside the loop then we ran into an
endless loop. This change checks if we found the original PHI.

llvm-svn: 169324
2012-12-04 22:40:22 +00:00
Shuxin Yang 73285933c9 For rdar://12329730, last piece.
This change attempts to simplify (X^Y) -> X or Y in the user's context if we know that
only bits from X or Y are demanded.

  A minimized case is provided bellow. This change will simplify "t>>16" into "var1 >>16".

  =============================================================
  unsigned foo (unsigned val1, unsigned val2) {
    unsigned t = val1 ^ 1234;
    return (t >> 16) | t; // NOTE: t is used more than once.
  }
  =============================================================

  Note that if the "t" were used only once, the expression would be finally optimized as well.
However, with with this change, the optimization will take place earlier.

  Reviewed by Nadav, Thanks a lot!

llvm-svn: 169317
2012-12-04 22:15:32 +00:00
Nadav Rotem a10b311aec Add support for reduction variables when IF-conversion is enabled.
llvm-svn: 169288
2012-12-04 18:17:33 +00:00
Chandler Carruth 802d755533 Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

llvm-svn: 169224
2012-12-04 07:12:27 +00:00
Nadav Rotem 07674cb566 Give scalar if-converted blocks half the score because they are not always executed due to CF.
llvm-svn: 169223
2012-12-04 07:11:52 +00:00
Nadav Rotem 628c2dba60 Add the last part that is needed for vectorization of if-converted code.
Added the code that actually performs the if-conversion during vectorization.

We can now vectorize this code:

for (int i=0; i<n; ++i) {
  unsigned k = 0;

  if (a[i] > b[i])   <------ IF inside the loop.
    k = k * 5 + 3;

  a[i] = k;          <---- K is a phi node that becomes vector-select.
}

llvm-svn: 169217
2012-12-04 06:15:11 +00:00
Kostya Serebryany 9b65726d24 [asan] add experimental -asan-realign-stack option (true by default, which does not change the current behavior)
llvm-svn: 169216
2012-12-04 06:14:01 +00:00
Matt Beaumont-Gay abfc446063 Add 'using' declarations to suppress -Woverloaded-virtual warnings.
llvm-svn: 169214
2012-12-04 05:41:27 +00:00
Shuxin Yang 86c0e232b7 rdar://12329730 (2nd part, revised)
The type of shirt-right (logical or arithemetic) should remain unchanged 
when transforming  "X << C1 >> C2" into "X << (C1-C2)"

llvm-svn: 169209
2012-12-04 03:28:32 +00:00
Alexey Samsonov 261177a1e1 ASan: add initial support for handling llvm.lifetime intrinsics in ASan - emit calls into runtime library that poison memory for local variables when their lifetime is over and unpoison memory when their lifetime begins.
llvm-svn: 169200
2012-12-04 01:34:23 +00:00
NAKAMURA Takumi f99b535fdb LoopVectorize.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 169195
2012-12-04 00:49:34 +00:00
NAKAMURA Takumi 8b07bc579b Fix whitespace.
llvm-svn: 169194
2012-12-04 00:49:28 +00:00
Shuxin Yang 63e999edbf rdar://12329730 (2nd part)
This change tries to simmplify E1 = " X >> C1 << C2" into :
  - E2 = "X << (C2 - C1)" if C2 > C1, or
  - E2 = "X >> (C1 - C2)" if C1 > C2, or
  - E2 = X if C1 == C2.

 Reviewed by Nadav. Thanks!

llvm-svn: 169182
2012-12-04 00:04:54 +00:00
Nadav Rotem d479a57f68 minor renaming, documentation and cleanups.
llvm-svn: 169175
2012-12-03 22:57:09 +00:00
Nadav Rotem fad16be973 IF-conversion: teach the cost-model how to grade if-converted loops.
llvm-svn: 169171
2012-12-03 22:46:31 +00:00
Nadav Rotem eee203d885 Now that we have a basic if-conversion infrastructure we can rename the
"single basic block loop vectorizer" to "innermost loop vectorizer".

llvm-svn: 169158
2012-12-03 21:33:08 +00:00
Nadav Rotem a30aba7a01 Add initial support for IF-conversion. This patch implements the first 1/3,
which is the legality of the if-conversion transformation. The next step is to
implement the cost-model for the if-converted code as well as the
vectorization itself.

llvm-svn: 169152
2012-12-03 21:06:35 +00:00
Alexey Samsonov ef51c3ff81 ASan: add blacklist file to ASan pass options. Clang patch for this will follow.
llvm-svn: 169143
2012-12-03 19:09:26 +00:00
Nadav Rotem 2349531def Teach the jump threading optimization to stop scanning the basic block when calculating the cost after passing the threshold.
llvm-svn: 169135
2012-12-03 17:34:44 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Chandler Carruth f02b8bf11b Remove some buggy and apparantly unnecessary code from SROA.
The partitioning logic attempted to handle uses of an alloca with an
offset starting before the alloca so long as the use had some overlap
with the alloca itself. However, there was a bug where we tested
'(uint64_t)Offset >= AllocSize' without first checking whether 'Offset'
was positive. As a consequence, essentially every negative offset (that
is, starting *before* the alloca does) would be thrown out, even if it
was overlapping. The subsequent code to throw out negative offsets which
were actually non-overlapping was essentially dead. The code to *handle*
overlapping negative offsets was actually dead!

I've just removed all of this, and taught SROA to discard any uses which
start prior to the alloca from the beginning. It has the lovely property
of simplifying the code. =] All the tests still pass, and in fact no new
tests are needed as this is already covered by our testsuite. Fixing the
code so that negative offsets work the way the comments indicate they
were supposed to work causes regressions. That's how I found this.

Anyways, this is all progress in the correct direction -- tightening up
SROA to be maximally aggressive. Some day, I really hope to turn
out-of-bounds accesses to an alloca into 'unreachable'.

llvm-svn: 169120
2012-12-03 10:59:55 +00:00
Nuno Lopes 5eec2679df fix stats for added checks
llvm-svn: 169119
2012-12-03 10:15:03 +00:00
Benjamin Kramer 47534c7440 SROA: Avoid struct and array types early to avoid creating an overly large integer type.
Fixes PR14465.

Differential Revision: http://llvm-reviews.chandlerc.com/D148

llvm-svn: 169084
2012-12-01 11:53:32 +00:00
Zhou Sheng 8e6d64a71d Revert previous check in r168581, r169079 as they are still in code review status.
llvm-svn: 169083
2012-12-01 10:54:28 +00:00
Zhou Sheng 13fb1ca44a The patch is to improve the memory footprint of pass GlobalOpt.
Also check in a case to repeat the issue, on which 'opt -globalopt' consumes 1.6GB memory.
The big memory footprint cause is that current GlobalOpt one by one hoists and stores the leaf element constant into the global array, in each iteration, it recreates the global array initializer constant and leave the old initializer alone. This may result in many obsolete constants left.
For example:  we have global array @rom = global [16 x i32] zeroinitializer
After the first element value is hoisted and installed:   @rom = global [16 x i32] [ 1, 0, 0, ... ]
After the second element value is installed:  @rom = global [16 x 32] [ 1, 2, 0, 0, ... ]        // here the previous initializer is obsolete
...
When the transform is done, we have 15 obsolete initializers left useless.

llvm-svn: 169079
2012-12-01 04:38:53 +00:00
Pedro Artigas 00b83c9b8d reversed the logic of the log2 detection routine to reduce the number of nested ifs
llvm-svn: 169049
2012-11-30 22:47:15 +00:00
Nadav Rotem 3ae24ee08a minor cleanups
llvm-svn: 169048
2012-11-30 22:37:11 +00:00
Bill Wendling c786b31233 Replace r168930 with a more reasonable patch.
The original patch removed a bunch of code that the SjLjEHPrepare pass placed
into the entry block if all of the landing pads were removed during the
CodeGenPrepare class. The more natural way of doing things is to run the CGP
*before* we run the SjLjEHPrepare pass.

Make it so!

llvm-svn: 169044
2012-11-30 22:08:55 +00:00
Pedro Artigas 993acd0c54 Addresses many style issues with prior checkin (r169025)
llvm-svn: 169043
2012-11-30 22:07:05 +00:00
Pedro Artigas d8795040de Add fast math inst combine X*log2(Y*0.5)-->X*log2(Y)-X
reviewed by Michael Ilseman <milseman@apple.com>

llvm-svn: 169025
2012-11-30 19:09:41 +00:00
Nadav Rotem 6b494be886 Remove the use of LPPassManager. We can remove LPM because we dont need to run any additional loop passes on the new vector loop.
llvm-svn: 169016
2012-11-30 17:27:53 +00:00
Kostya Serebryany 817b60af38 [asan] simplify the code around doesNotReturn call. It now magically works.
llvm-svn: 168995
2012-11-30 11:08:59 +00:00
Chandler Carruth d9ef81e133 Fix non-determinism introduced in r168970 and pointed out by Duncan.
We're iterating over a non-deterministically ordered container looking
for two saturating flags. To do this correctly, we have to saturate
both, and only stop looping if both saturate to their final value.
Otherwise, which flag we see first changes the result.

This is also a micro-optimization of the previous version as now we
don't go into the (possibly expensive) test logic once the first
violation of either constraint is detected.

llvm-svn: 168989
2012-11-30 09:34:29 +00:00
Chandler Carruth 77d433dafe Rearrange the comments, control flow, and variable names; no
functionality changed.

Evan's commit r168970 moved the code that the primary comment in this
function referred to to the other end of the function without moving the
comment, and there has been a steady creep of "boolean" logic in it that
is simpler if handled via early exit. That way each special case can
have its own comments. I've also made the variable name a bit more
explanatory than "AllFit". This is in preparation to fix the
non-deterministic output of this function.

llvm-svn: 168988
2012-11-30 09:26:25 +00:00
Meador Inge e3f2b26bfa Move library call simplification statistic to instcombine
The simplify-libcalls pass maintained a statistic to count the number
of library calls that have been simplified.  Now that library call
simplification is being carried out in instcombine the statistic should
be moved to there.

llvm-svn: 168975
2012-11-30 04:05:06 +00:00
Chandler Carruth dbd6958183 Move the InstVisitor utility into VMCore where it belongs. It heavily
depends on the IR infrastructure, there is no sense in it being off in
Support land.

This is in preparation to start working to expand InstVisitor into more
special-purpose visitors that are still generic and can be re-used
across different passes. The expansion will go into the Analylis tree
though as nothing in VMCore needs it.

llvm-svn: 168972
2012-11-30 03:08:41 +00:00
Evan Cheng 65df808f62 Fix logic to determine whether to turn a switch into a lookup table. When
the tables cannot fit in registers (i.e. bitmap), do not emit the table
if it's using an illegal type.

rdar://12779436

llvm-svn: 168970
2012-11-30 02:02:42 +00:00
Shuxin Yang abcc370423 rdar://12100355 (part 1)
This revision attempts to recognize following population-count pattern:

 while(a) { c++; ... ; a &= a - 1; ... },
  where <c> and <a>could be used multiple times in the loop body.

 TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to a efficent 
instruction sequence, which need to be improved in the following commits.

Reviewed by Nadav, really appreciate!

llvm-svn: 168931
2012-11-29 19:38:54 +00:00
Bill Wendling a4a77edf2e Handle the situation where CodeGenPrepare removes a reference to a BB that has
the last invoke instruction in the function. This also removes the last landing
pad in an function. This is fine, but with SjLj EH code, we've already placed a
bunch of code in the 'entry' block, which expects the landing pad to stick
around.

When we get to the situation where CGP has removed the last landing pad, go
ahead and nuke the SjLj instructions from the 'entry' block.
<rdar://problem/12721258>

llvm-svn: 168930
2012-11-29 19:38:06 +00:00
Nadav Rotem ec739205cc No need to run LICM after loop vectorization because we dont generate invariant code any more.
llvm-svn: 168928
2012-11-29 19:28:29 +00:00
Nadav Rotem 8dd6ee8df5 When broadcasting invariant scalars into vectors, place the broadcast code in the preheader.
llvm-svn: 168927
2012-11-29 19:25:41 +00:00
Meador Inge 75798bb7fe instcombine: Migrate puts optimizations
This patch migrates the puts optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

All the simplifiers from simplify-libcalls have now been migrated to
instcombine.  Yay!  Just a few other bits to migrate (prototype attribute
inference and a few statistics) and simplify-libcalls can finally be put
to rest.

llvm-svn: 168925
2012-11-29 19:15:17 +00:00
Alexey Samsonov 9a956e8cd2 [ASan] Simplify check added in r168861. Bail out from module pass early if the module is blacklisted.
llvm-svn: 168913
2012-11-29 18:27:01 +00:00
Matt Beaumont-Gay c76536f886 Apply Takumi's patch to suppress unused-variable warnings in -Asserts builds.
llvm-svn: 168911
2012-11-29 18:15:49 +00:00
Alexey Samsonov df6245233c Add options to AddressSanitizer passes to make them configurable by frontend.
llvm-svn: 168910
2012-11-29 18:14:24 +00:00
Meador Inge f8e725081c instcombine: Migrate fputs optimizations
This patch migrates the fputs optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168893
2012-11-29 15:45:43 +00:00
Meador Inge bc84d1a4f5 instcombine: Migrate fwrite optimizations
This patch migrates the fwrite optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168892
2012-11-29 15:45:39 +00:00
Meador Inge 1009cecca0 instcombine: Migrate fprintf optimizations
This patch migrates the fprintf optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168891
2012-11-29 15:45:33 +00:00
Evgeniy Stepanov 30484fc704 [msan] Handle vector manipulation instructions.
Handle insertelement, extractelement, shufflevector.

llvm-svn: 168889
2012-11-29 15:22:06 +00:00
Evgeniy Stepanov f433cecf96 [msan] Fix getOriginForNaryOp.
The old version failed on a 3-arg instruction with (-1, 0, 0) shadows (it would
pick the 3rd operand origin irrespective of its shadow).
    
The new version always picks the origin of the rightmost poisoned operand.

llvm-svn: 168887
2012-11-29 14:44:00 +00:00
Evgeniy Stepanov 7ad7e83031 [msan] Basic handling of inline asm.
llvm-svn: 168884
2012-11-29 14:32:03 +00:00
Evgeniy Stepanov 857d9d2a59 [msan] Propagate shadow through (x<0) and (x>=0) comparisons.
This is a special case of signed relational comparison where result
only depends on the sign of x.

llvm-svn: 168881
2012-11-29 14:25:47 +00:00
Evgeniy Stepanov eeb8b7c391 [msan] Fix shadow & origin store & load alignment.
This change ensures that shadow memory accesses have the same alignment
as corresponding app memory accesses.

llvm-svn: 168880
2012-11-29 14:05:53 +00:00
Evgeniy Stepanov 62ba611828 [msan] Optimize getOriginPtr.
Rewrite getOriginPtr in a way that lets subsequent optimizations factor out
the common part of Shadow and Origin address calculation. Improves perf by
up to 5%.

llvm-svn: 168879
2012-11-29 13:43:05 +00:00
Evgeniy Stepanov da0072b676 [msan] Fix a few compilation warnings.
llvm-svn: 168878
2012-11-29 13:12:03 +00:00
Evgeniy Stepanov 62b5db9361 [msan] Transform memcpy and memset to library calls.
This was already done for memmove, where it is required for correctness.
This change improves performance by avoiding copyingthe same memory twice.
Also, the library functions are given __msan_ prefix to prevent instcombine
pass from converting them back to intrinsics.

llvm-svn: 168876
2012-11-29 12:49:04 +00:00
Evgeniy Stepanov 1d2da65bf8 [msan] Make sure that report callbacks do not get merged.
llvm-svn: 168873
2012-11-29 12:30:18 +00:00
Evgeniy Stepanov d4bd7b73e3 Initial commit of MemorySanitizer.
Compiler pass only.

llvm-svn: 168866
2012-11-29 09:57:20 +00:00
Kostya Serebryany 4b929dae93 [asan/tsan] initialize the asan/tsan callbacks in runOnFunction as opposed to doInitialization. This is required to allow the upcoming changes in PassManager behavior
llvm-svn: 168864
2012-11-29 09:54:21 +00:00
Kostya Serebryany 633bf93fb8 [asan] when checking the noreturn attribute on the call, also check it on the callee
llvm-svn: 168861
2012-11-29 08:57:20 +00:00
Nick Lewycky 1af94eb075 Issue a fatal error if the line doesn't have a regular expression.
Also a couple not-user-visible changes; using empty() instead of size(), and
make inSection() not insert NULL Regex*'s into StringMap when doing a lookup.

llvm-svn: 168833
2012-11-29 00:01:38 +00:00
Bill Wendling f3614fd8e2 When we delete a dead basic block, see if any of its successors are dead and
delete those as well.

llvm-svn: 168829
2012-11-28 23:23:48 +00:00
Kostya Serebryany dfe9e7933e [asan] Split AddressSanitizer into two passes (FunctionPass, ModulePass), LLVM part. This requires a clang part which will follow.
llvm-svn: 168781
2012-11-28 10:31:36 +00:00
Hal Finkel 88ee6b0082 BBVectorize: Correctly merge SubclassOptionalData
When two instructions are combined into a vector instruction,
the resulting instruction must have the most-conservative flags.

llvm-svn: 168765
2012-11-28 03:04:10 +00:00
Meador Inge f1bc9e7431 instcombine: Don't replace all uses for instructions with no uses
My commit to migrate the printf simplifiers from the simplify-libcalls
in r168604 introduced a regression reported by Duncan [1].  The problem
is that in some cases the library call simplifier can return a new value
that has no uses and the new value's type is different than the old value's
type (which is fine because there are no uses).  The specific case that
triggered the bug looked something like:

   declare void @printf(i8*, ...)
   ...
   call void (i8*, ...)* @printf(i8* %fmt)

Which we want to optimized into:

   call i32 @putchar(i32 104)

However, the code was attempting to replace all uses of the printf with
the putchar and the types differ, hence a crash.  This is fixed by *just*
deleting the original instruction when there are no uses.  The old
simplify-libcalls pass is already doing something similar.

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-November/056338.html

llvm-svn: 168716
2012-11-27 18:52:49 +00:00
Bill Wendling ee5984df39 Remove the dependent libraries feature.
The dependent libraries feature was never used and has bit-rotted. Remove it.

llvm-svn: 168694
2012-11-27 09:55:56 +00:00
Dmitry Vyukov a878e74351 tsan: instrument atomic nand operation
llvm-svn: 168684
2012-11-27 08:09:25 +00:00
Meador Inge 25c9b3b6e4 instcombine: Migrate sprintf optimizations
This patch migrates the sprintf optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168677
2012-11-27 05:57:54 +00:00
Eli Friedman b14873c4f1 Get rid of the getPointeeAlignment helper function from
InstCombineLoadStoreAlloca.cpp, which had many issues.
(At least two bugs were noted on llvm-commits, and it was overly conservative.)
Instead, use getOrEnforceKnownAlignment.

llvm-svn: 168629
2012-11-26 23:04:53 +00:00
Shuxin Yang 6ea79e864d rdar://12329730 (defect 2)
Enhancement to InstCombine. Try to catch this opportunity:
  
 ---------------------------------------------------------------
 ((X^C1) >> C2) ^ C3  => (X>>C2) ^ ((C1>>C2)^C3)
  where the subexpression "X ^ C1" has more than one uses, and
  "(X^C1) >> C2" has single use. 
 ---------------------------------------------------------------- 

 Reviewed by Nadav (with minor change per his request).

llvm-svn: 168615
2012-11-26 21:44:25 +00:00
Meador Inge efe2393113 Fix a comment bug in toascii simplifier
When I migrated the toascii simplifier in r168580 Benjamin Kramer noticed
a bug in one of the comments that I migrated.

llvm-svn: 168605
2012-11-26 20:37:23 +00:00
Meador Inge 08ca115abd instcombine: Migrate printf optimizations
This patch migrates the printf optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168604
2012-11-26 20:37:20 +00:00
Nadav Rotem caf5acfd14 Move the code that uses SCEVs prior to creating the new loops.
llvm-svn: 168601
2012-11-26 19:51:46 +00:00
Matt Beaumont-Gay 000a3958e9 Remove stray trailing backslash
llvm-svn: 168592
2012-11-26 16:27:22 +00:00
Dmitry Vyukov 93e67b6c06 tsan: fix lint warnings
llvm-svn: 168590
2012-11-26 14:55:26 +00:00
Dmitry Vyukov 12b5cb9a0a [tsan] add fail order to compare_exchange
llvm-svn: 168586
2012-11-26 11:36:19 +00:00
Meador Inge 604937d1cc instcombine: Migrate toascii optimizations
This patch migrates the toascii optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168580
2012-11-26 03:38:52 +00:00
Meador Inge a62a39e0e9 instcombine: Migrate isascii optimizations
This patch migrates the isascii optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168579
2012-11-26 03:10:07 +00:00
Meador Inge 9a59ab6133 instcombine: Migrate isdigit optimizations
This patch migrates the isdigit optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168578
2012-11-26 02:31:59 +00:00
Meador Inge a0b6d87879 instcombine: Migrate *abs optimizations
This patch migrates the *abs optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168574
2012-11-26 00:24:07 +00:00
Meador Inge 7415f8403d instcombine: Migrate ffs* optimizations
This patch migrates the ffs* optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168571
2012-11-25 20:45:27 +00:00
Nadav Rotem ee7ede76f4 Move the max vector width to a constant parameter. No functionality change.
llvm-svn: 168570
2012-11-25 16:48:08 +00:00
Nadav Rotem ef33b5076c Fix the document style.
llvm-svn: 168569
2012-11-25 16:39:01 +00:00
Nadav Rotem 12192f19eb Refactor the ptr runtime check generation code. No functionality change.
llvm-svn: 168568
2012-11-25 16:27:16 +00:00
Nadav Rotem b15d9fe24d Rename method. No functionality change.
llvm-svn: 168560
2012-11-25 09:13:57 +00:00
Nadav Rotem bf5173460f The induction-pointer work is inspired by a research paper. This commit adds a reference.
llvm-svn: 168559
2012-11-25 09:09:26 +00:00
Nadav Rotem ea3824f160 Add support for pointer induction variables even when there is no integer induction variable.
llvm-svn: 168558
2012-11-25 08:41:35 +00:00
Benjamin Kramer 455fa35e51 CodeGenPrepare: Move ret duplication out of the instruction iteration loop.
It can delete the block, and the loop continues on free'd memory.
No change in output. Found by valgrind.

llvm-svn: 168525
2012-11-23 19:17:06 +00:00
Joey Gouly 9519823271 Remove unused parameter Penalty from the BoundsChecking pass.
llvm-svn: 168511
2012-11-23 10:47:35 +00:00
NAKAMURA Takumi 6b8b2a9b98 llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp: Prune AddressSanitizerCreateGlobalRedzonesPass::ID. [-Wunused-variable]
llvm-svn: 168499
2012-11-22 14:18:25 +00:00
Kostya Serebryany 20a79970e0 [asan] rip off the creation of global redzones from the main AddressSanitizer class into a separate class. The intent is to make it a separate ModulePass in the following commmits
llvm-svn: 168484
2012-11-22 03:18:50 +00:00
Chandler Carruth 845b73c06f PR14055: Implement support for sub-vector operations in SROA.
Now if we can transform an alloca into a single vector value, but it has
subvector, non-element accesses, we form the appropriate shufflevectors
to allow SROA to proceed. This fixes PR14055 which pointed out a very
common pattern that SROA couldn't handle -- mixed vec3 and vec4
operations on a single alloca.

llvm-svn: 168418
2012-11-21 08:16:30 +00:00
Kostya Serebryany 139a937903 [asan] use names of globals instead of an external set to distinguish the globals generated by asan
llvm-svn: 168368
2012-11-20 14:16:08 +00:00
Kostya Serebryany dc4cb2b736 [asan] don't instrument linker-initialized globals even with external linkage in -asan-initialization-order mode
llvm-svn: 168367
2012-11-20 13:11:32 +00:00
Kostya Serebryany b3bd605ffa [asan] make sure that linker-initialized globals (non-extern) are not instrumented even in -asan-initialization-order mode. This time with a test
llvm-svn: 168366
2012-11-20 13:00:01 +00:00
Chandler Carruth b7915f7fbf Use LLVM_ENABLE_DUMP for the variables used in printing as well as the
printing functions themselves.

Part of PR14324 (which should have just been a patch to the list, but
hey...)

llvm-svn: 168362
2012-11-20 10:23:07 +00:00
Chandler Carruth 3e994a26e2 Fix PR14132 and handle OOB loads speculated throuh PHI nodes.
The issue is that we may end up with newly OOB loads when speculating
a load into the predecessors of a PHI node, and this confuses the new
integer splitting logic in some cases, triggering an assertion failure.
In fact, the branch in question must be dead code as it loads from
a too-narrow alloca. Add code to handle this gracefully and leave the
requisite FIXMEs for both optimizing more aggressively and doing more to
aid sanitizing invalid code which triggers these patterns.

llvm-svn: 168361
2012-11-20 10:02:19 +00:00
Bill Wendling f86efb9b98 Make the AttrListPtr object a part of the LLVMContext.
When code deletes the context, the AttributeImpls that the AttrListPtr points to
are now invalid. Therefore, instead of keeping a separate managed static for the
AttrListPtrs that's reference counted, move it into the LLVMContext and delete
it when deleting the AttributeImpls.

llvm-svn: 168354
2012-11-20 05:09:20 +00:00
Chandler Carruth 9324c96064 Add a comment to associate a FIXME with a PR where it is matters.
llvm-svn: 168347
2012-11-20 01:27:48 +00:00
Chandler Carruth 18db795b05 Rework the rewriting of loads and stores for vector and integer allocas
to properly handle the combinations of these with split integer loads
and stores. This essentially replaces Evan's r168227 by refactoring the
code in a different way, and trynig to mirror that refactoring in both
the load and store sides of the rewriting.

Generally speaking there was some really problematic duplicated code
here that led to poorly founded assumptions and then subtle bugs. Now
much of the code actually flows through and follows a more consistent
style and logical path. There is still a tiny bit of duplication on the
store side of things, but it is much less bad.

This also changes the logic to never re-use a load or store instruction
as that was simply too error prone in practice.

I've added a few tests (one a reduction of the one in Evan's original
patch, which happened to be the same as the report in PR14349). I'm
going to look at adding a few more tests for things I found and fixed in
passing (such as the volatile tests in the vectorizable predicate).

This patch has survived bootstrap, and modulo one bugfix survived
Duncan's test suite, but let me know if anything else explodes.

llvm-svn: 168346
2012-11-20 01:12:50 +00:00
Bob Wilson a5b0dc8884 Clean up handling of always-inline functions in the inliner.
This patch moves the isInlineViable function from the InlineAlways pass into
the InlineCostAnalyzer and then changes the InlineCost computation to use that
simple check for always-inline functions. All the special-case checks for
AlwaysInline in the CallAnalyzer can then go away.

llvm-svn: 168300
2012-11-19 07:04:35 +00:00
Duncan Sands 79cf530d56 Remove the last bit of constant folding from LinearizeExprTree (most of it was
removed in commit 168035, but I missed this bit).

llvm-svn: 168292
2012-11-18 20:15:36 +00:00
Duncan Sands 20bd7fa0f7 Fix PR14060, an infinite loop in reassociate. The problem was that one of the
operands of the expression being written was wrongly thought to be reusable as
an inner node of the expression resulting in it turning up as both an inner node
*and* a leaf, creating a cycle in the def-use graph.  This would have caused the
verifier to blow up if things had gotten that far, however it managed to provoke
an infinite loop first.

llvm-svn: 168291
2012-11-18 19:27:01 +00:00
Nick Lewycky 3d35b45f8e Don't try to calculate the alignment of an unsigned type. Fixes PR14371!
llvm-svn: 168280
2012-11-18 05:39:39 +00:00
Benjamin Kramer 96e1e39642 Plug a memory leak in the GCOV profiling emitter, which never released the edge table memory.
llvm-svn: 168259
2012-11-17 13:49:37 +00:00
Nadav Rotem c3c07e62e8 LoopVectorizer: Add initial support for pointer induction variables (for example: *dst++ = *src++).
At the moment we still require to have an integer induction variable (for example: i++).

llvm-svn: 168231
2012-11-17 00:27:03 +00:00
Evan Cheng f1b6177b62 Teach SROA rewriteVectorizedStoreInst to handle cases when the loaded value is narrower than the stored value. rdar://12713675
llvm-svn: 168227
2012-11-17 00:05:06 +00:00
Duncan Sands d7d8c09b93 Make this easier to understand, as suggested by Chandler.
llvm-svn: 168196
2012-11-16 20:53:08 +00:00
Duncan Sands 1d3acddf0e Fix PR14361: wrong simplification of A+B==B+A. You may think that the old logic
replaced by this patch is equivalent to the new logic, but you'd be wrong, and
that's exactly where the bug was.  There's a similar bug in instsimplify which
manifests itself as instsimplify failing to simplify this, rather than doing it
wrong, see next commit.

llvm-svn: 168181
2012-11-16 18:55:49 +00:00
Hans Wennborg 7b8af0ea05 SimplifyCFG: Don't assume non-null ScalarTargetTransformInfo.
Patch by Pekka Jääskeläinen!

llvm-svn: 168176
2012-11-16 18:22:08 +00:00
Nadav Rotem 0565b5a279 LoopVectorize: Division reductions generate incorrect code. Remove the part of the code that deals with divs.
Thanks to Paul Redmond for catching this while reviewing the code.

llvm-svn: 168142
2012-11-16 06:51:17 +00:00
Andrew Trick 7656f6dbf7 misspell
llvm-svn: 168058
2012-11-15 18:40:31 +00:00
Andrew Trick 90f5029118 whitespace
llvm-svn: 168057
2012-11-15 18:40:29 +00:00
Dmitri Gribenko 0011bbf985 Use empty parens for empty function parameter list instead of '(void)'.
llvm-svn: 168049
2012-11-15 16:51:49 +00:00
Hans Wennborg 709e015cf1 Make GlobalOpt be conservative with TLS variables (PR14309)
For global variables that get the same value stored into them
everywhere, GlobalOpt will replace them with a constant. The problem is
that a thread-local GlobalVariable looks like one value (the address of
the TLS var), but is different between threads.

This patch introduces Constant::isThreadDependent() which returns true
for thread-local variables and constants which depend on them (e.g. a GEP
into a thread-local array), and teaches GlobalOpt not to track such
values.

llvm-svn: 168037
2012-11-15 11:40:00 +00:00
Duncan Sands ac852c742f Fix a crash observed by Shuxin Yang. The issue here is that LinearizeExprTree,
the utility for extracting a chain of operations from the IR, thought that it
might as well combine any constants it came across (rather than just returning
them along with everything else).  On the other hand, the factorization code
would like to see the individual constants (this is quite reasonable: it is
much easier to pull a factor of 3 out of 2*3 than it is to pull it out of 6;
you may think 6/3 isn't so hard, but due to overflow it's not as easy to undo
multiplications of constants as it may at first appear).  This patch therefore
makes LinearizeExprTree stupider: it now leaves optimizing to the optimization
part of reassociate, and sticks to just analysing the IR.

llvm-svn: 168035
2012-11-15 09:58:38 +00:00
NAKAMURA Takumi 00d2a107fb InstCombineAndOrXor.cpp: Escape bracket in doxygen description. [-Wdocumentation]
llvm-svn: 168013
2012-11-15 00:35:50 +00:00
Hal Finkel e9740a4692 Replace std::vector -> SmallVector in BBVectorize
For now, this uses 8 on-stack elements. I'll need to do some profiling
to see if this is the best number.

Pointed out by Jakob in post-commit review.

llvm-svn: 167966
2012-11-14 19:53:27 +00:00
Hal Finkel 1b7f0aba48 Fix the largest offender of determinism in BBVectorize
Iterating over the children of each node in the potential vectorization
plan must happen in a deterministic order (because it affects which children
are erased when two children conflict). There was no need for this data
structure to be a map in the first place, so replacing it with a vector
is a small change.

I believe that this was the last remaining instance if iterating over the
elements of a Dense* container where the iteration order could matter.
There are some remaining iterations over std::*map containers where the order
might matter, but so long as the Value* for instructions in a block increase
with the order of the instructions in the block (or decrease) monotonically,
then this will appear to be deterministic.

llvm-svn: 167942
2012-11-14 18:38:11 +00:00
Alexey Samsonov 2b27170fdc [TSan] fix indentation
llvm-svn: 167928
2012-11-14 14:33:59 +00:00
Nadav Rotem a43bcddc8d use the getSplat API. Patch by Paul Redmond.
llvm-svn: 167892
2012-11-14 00:02:13 +00:00
Alexey Samsonov cfd662f279 Figure out <size> argument of llvm.lifetime intrinsics at the moment they are created (during function inlining)
llvm-svn: 167821
2012-11-13 07:15:32 +00:00
Hal Finkel b51bdd20d3 BBVectorize: Remove temporary assert used for debugging
llvm-svn: 167817
2012-11-13 05:54:54 +00:00
Meador Inge 193e035b9c instcombine: Migrate math library call simplifications
This patch migrates the math library call simplifications from the
simplify-libcalls pass into the instcombine library call simplifier.

I have typically migrated just one simplifier at a time, but the math
simplifiers are interdependent because:

   1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt.
   2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on
      the option -enable-double-float-shrink.

These two factors made migrating each of these simplifiers individually
more of a pain than it would be worth.  So, I migrated them all together.

llvm-svn: 167815
2012-11-13 04:16:17 +00:00
Hal Finkel 2a1df367d4 BBVectorize: Don't vectorize vector-manipulation chains
Don't choose a vectorization plan containing only shuffles and
vector inserts/extracts. Due to inperfections in the cost model,
these can lead to infinite recusion.

llvm-svn: 167811
2012-11-13 03:12:40 +00:00
Shuxin Yang c94c3bb5d0 revert r167740
llvm-svn: 167787
2012-11-13 00:08:49 +00:00
Hal Finkel 3b79f55c5f BBVectorize: Only some insert element operand pairs are free.
This fixes another infinite recursion case when using target costs.
We can only replace insert element input chains that are pure (end
with inserting into an undef).

llvm-svn: 167784
2012-11-12 23:55:36 +00:00
Hal Finkel 9cf3372931 BBVectorize: Use a more sophisticated check for input cost
The old checking code, which assumed that input shuffles and insert-elements
could always be folded (and thus were free) is too simple.
This can only happen in special circumstances.
Using the simple check caused infinite recursion.

llvm-svn: 167750
2012-11-12 21:21:02 +00:00
Hal Finkel f8326b6052 BBVectorize: Check the types of compare instructions
The pass would previously assert when trying to compute the cost of
compare instructions with illegal vector types (like struct pointers).

llvm-svn: 167743
2012-11-12 19:41:38 +00:00
Shuxin Yang 1c442f5ec6 This change is to fix rdar://12571717 which is about assertion in Reassociate pass.
The assertion is trigged when the Reassociater tries to transform expression
     ... + 2 * n * 3 + 2 * m + ...
  into:
     ... + 2 * (n*3 + m).

In the process of the transformation, a helper routine folds the constant 2*3 into 6,
confusing optimizer which is trying the to eliminate the common factor 2, and cannot
find 2 any more. 

Review is pending. But I'd like commit first in order to help those who are waiting 
for this fix. 

llvm-svn: 167740
2012-11-12 19:34:11 +00:00
Hal Finkel ef53df0f9f BBVectorize: Check the input types of shuffles for legality
This fixes a bug where shuffles were being fused such that the
resulting input types were not legal on the target. This would
occur only when both inputs and dependencies were also foldable
operations (such as other shuffles) and there were other connected
pairs in the same block.

llvm-svn: 167731
2012-11-12 14:50:59 +00:00
Alexey Samsonov afc550d948 [ASan] fixup for r167725: Don't fetch name of StructType if it is literal
llvm-svn: 167729
2012-11-12 14:47:00 +00:00
Meador Inge b3e91f6ae0 Normalize memcmp constant folding results.
The library call simplifier folds memcmp calls with all constant arguments
to a constant.  For example:

  memcmp("foo", "foo", 3) ->  0
  memcmp("hel", "foo", 3) ->  1
  memcmp("foo", "hel", 3) -> -1

The folding is implemented in terms of the system memcmp that LLVM gets
linked with.  It currently just blindly uses the value returned from
the system memcmp as the folded constant.

This patch normalizes the values returned from the system memcmp to
(-1, 0, 1) so that we get consistent results across multiple platforms.
The test cases were adjusted accordingly.

llvm-svn: 167726
2012-11-12 14:00:45 +00:00
Alexey Samsonov 582d7de709 [ASan]: Add minimalistic support for turning off initialization-order checking for globals of specified types. Tests for this behavior will go to ASan test suite in compiler-rt.
llvm-svn: 167725
2012-11-12 14:00:01 +00:00
Meador Inge f963a8ffcc Delete a stale comment. No functional change.
llvm-svn: 167698
2012-11-12 00:28:15 +00:00
Meador Inge d4825780ed instcombine: Migrate memset optimizations
This patch migrates the memset optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167689
2012-11-11 06:49:03 +00:00
Meador Inge 9cf328b526 instcombine: Migrate memmove optimizations
This patch migrates the memmove optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167687
2012-11-11 06:22:40 +00:00
Meador Inge dd9234a10a instcombine: Migrate memcpy optimizations
This patch migrates the memcpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167686
2012-11-11 05:54:34 +00:00
Nadav Rotem 12930749ab Fix a comment typo and add comments.
llvm-svn: 167684
2012-11-11 05:15:00 +00:00
Meador Inge 4d2827c10d instcombine: Migrate memcmp optimizations
This patch migrates the memcmp optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167683
2012-11-11 05:11:20 +00:00
Meador Inge 56edbc9323 instcombine: Migrate strstr optimizations
This patch migrates the strstr optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167682
2012-11-11 03:51:48 +00:00
Meador Inge 76fc1a479a Add method for replacing instructions to LibCallSimplifier
In some cases the library call simplifier may need to replace instructions
other than the library call being simplified.  In those cases it may be
necessary for clients of the simplifier to override how the replacements
are actually done.  As such, a new overrideable method for replacing
instructions was added to LibCallSimplifier.

A new subclass of LibCallSimplifier is also defined which overrides
the instruction replacement method.  This is because the instruction
combiner defines its own replacement method which updates the worklist
when instructions are replaced.

llvm-svn: 167681
2012-11-11 03:51:43 +00:00
Meador Inge bcd88ef764 instcombine: Migrate strcspn optimizations
This patch migrates the strcspn optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167675
2012-11-10 15:16:48 +00:00
Meador Inge 03be256db9 instcombine: Query target library information to gate libcall simplifications
Several of the simplifiers migrated from the simplify-libcalls pass to
the instcombine pass were not correctly checking the target library
information to gate the simplifications.  This patch ensures that the
check is made.

llvm-svn: 167660
2012-11-10 03:11:10 +00:00
Dmitry Vyukov 0044e386e9 tsan: switch to new memory_order constants (ABI compatible)
llvm-svn: 167615
2012-11-09 14:12:16 +00:00
Dmitry Vyukov 92b9e1dbfd tsan: instrument all atomics (including fetch_add, exchange, cas, etc)
llvm-svn: 167612
2012-11-09 12:55:36 +00:00
Nadav Rotem 1cfef3e9ee Add support for memory runtime check. When we can, we calculate array bounds.
If the arrays are found to be disjoint then we run the vectorized version of
the loop. If they are not, we run the scalar code.

llvm-svn: 167608
2012-11-09 07:09:44 +00:00
Meador Inge 489b5d645f instcombine: Migrate strspn optimizations
This patch migrates the strspn optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167568
2012-11-08 01:33:50 +00:00
Hans Wennborg c3c8d95c51 Only do switch-to-lookup table transformation when TargetTransformInfo
is available.

llvm-svn: 167552
2012-11-07 21:35:12 +00:00
Kostya Serebryany 157a515376 [asan] fix bug 14277 (asan needs to fail with fata error if an __asan interface function is being redefined. Before this fix asan asserts)
llvm-svn: 167529
2012-11-07 12:42:18 +00:00
Duncan Sands a318ef6fa6 Generalize the transform that boosts GEP indices to the size of a pointer to
also do it for vectors of pointers.

llvm-svn: 167354
2012-11-03 11:44:17 +00:00
Alexey Samsonov 9bdb63ae0d Fix whitespaces
llvm-svn: 167295
2012-11-02 12:20:34 +00:00
Chandler Carruth 099f5cb031 Revert the switch of loop-idiom to use the new dependence analysis.
The new analysis is not yet ready for prime time. It has a *critical*
flawed assumption, and some troubling shortages of testing. Until it's
been hammered into better shape, let's stick with the working code. This
should be easy to revert itself when the analysis is ready.

Fixes PR14241, a miscompile of any memcpy-able loop which uses a pointer
as the induction mechanism. If you have been seeing miscompiles in this
revision range, you really want to test with this backed out. The
results of this miscompile are a bit subtle as they can lead to
downstream passes concluding things are impossible which are in fact
possible.

Thanks to David Blaikie for the majority of the reduction of this
miscompile. I'll be checking in the test case in a non-revert commit.

Revesions reverted here:

r167045: LoopIdiom: Fix a serious missed optimization: we only turned
         top-level loops into memmove.
r166877: LoopIdiom: Add checks to avoid turning memmove into an infinite
         loop.
r166875: LoopIdiom: Recognize memmove loops.
r166874: LoopIdiom: Replace custom dependence analysis with
         DependenceAnalysis.
llvm-svn: 167286
2012-11-02 08:33:25 +00:00
Duncan Sands a17bb1419f Fix an obvious typo that causes an assertion failure when running
test/Transforms/GVN/rle.ll if the (currently disabled) check for a
pointer type in getIntPtrType is turned on.

llvm-svn: 167285
2012-11-02 07:49:32 +00:00
Chandler Carruth acc748b2b5 Fix sign compare warning. Patch by Mahesha HS.
llvm-svn: 167282
2012-11-02 05:24:00 +00:00
Hal Finkel 560545b85f BBVectorize: Use target costs for incoming and outgoing values instead of the depth heuristic.
When target cost information is available, compute explicit costs of inserting and
extracting values from vectors. At this point, all costs are estimated using the
target information, and the chain-depth heuristic is not needed. As a result, it is now, by
default, disabled when using target costs.

llvm-svn: 167256
2012-11-01 21:50:12 +00:00
Kostya Serebryany 28d0694c27 [asan] don't instrument globals that we've created ourselves (reduces the binary size a bit)
llvm-svn: 167230
2012-11-01 13:42:40 +00:00
Chandler Carruth 5da3f0512e Revert the majority of the next patch in the address space series:
r165941: Resubmit the changes to llvm core to update the functions to
         support different pointer sizes on a per address space basis.

Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.

However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.

In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.

In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.

This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.

llvm-svn: 167222
2012-11-01 09:14:31 +00:00
Chandler Carruth 7ec5085e01 Revert the series of commits starting with r166578 which introduced the
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.

These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.

Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)

After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.

Summary of reverted revisions:

r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221
2012-11-01 08:07:29 +00:00
Hal Finkel c89e75e93e BBVectorize: Account for internal shuffle costs
When target costs are available, use them to account for the costs of
shuffles on internal edges of the DAG of candidate pairs.

Because the shuffle costs here are currently for only the internal edges,
the current target cost model is trivial, and the chain depth requirement
is still in place, I don't yet have an easy test
case. Nevertheless, by looking at the debug output, it does seem to do the right
think to the effective "size" of each DAG of candidate pairs.

llvm-svn: 167217
2012-11-01 06:26:34 +00:00
Jakub Staszak 4e45abf0ae Don't insert and erase load instruction. Simply create (new) and delete it.
llvm-svn: 167196
2012-11-01 01:10:43 +00:00
Nadav Rotem 4cb8cdab5e LoopVectorize: Preserve NSW, NUW and IsExact flags.
llvm-svn: 167174
2012-10-31 21:40:39 +00:00
Benjamin Kramer ede2fe3bfd LCSSA: Try to recover compile time regressions due to SCEV updates.
- Use value handle tricks to communicate use replacements instead of forgetLoop, this is a lot faster.
- Move the "big hammer" out of the main loop so it's not called for every instruction.

This should recover most (if not all) compile time regressions introduced by this code.

llvm-svn: 167136
2012-10-31 16:30:03 +00:00
Nadav Rotem ec3ab49dda Put the threshold magic number in a variable.
llvm-svn: 167134
2012-10-31 16:22:16 +00:00
Hans Wennborg b71f72aa82 Remove fixme about unreachable cases from SwitchToLookupTable
SimplifyCFG will have removed those cases for us.

llvm-svn: 167132
2012-10-31 16:15:25 +00:00
Nadav Rotem 1265ea8f8d Remove enum values since they are not used anymore.
llvm-svn: 167131
2012-10-31 16:14:06 +00:00
Hans Wennborg 4fef2fec3d Address Duncan's comments on r167121.
llvm-svn: 167130
2012-10-31 15:31:09 +00:00
Hal Finkel 842ad0b621 BBVectorize: Choose pair ordering to minimize shuffles
BBVectorize would, except for loads and stores, always fuse instructions
so that the first instruction (in the current source order) would always
represent the low part of the input vectors and the second instruction
would always represent the high part. This lead to too many shuffles
being produced because sometimes the opposite order produces fewer of them.

With this change, BBVectorize tracks the kind of pair connections that form
the DAG of candidate pairs, and uses that information to reorder the pairs to
avoid excess shuffles. Using this information, a future commit will be able
to add VTTI-based shuffle costs to the pair selection procedure. Importantly,
the number of remaining shuffles can now be estimated during pair selection.

There are some trivial instruction reorderings in the test cases, and one
simple additional test where we certainly want to do a reordering to
avoid an unnecessary shuffle.

llvm-svn: 167122
2012-10-31 15:17:07 +00:00
Hans Wennborg 09acdb9a16 Address Duncan's comments on r167115
- Use 0 instead of NULL
 - Helper function for "dyn_cast, else lookup in the constant pool".

llvm-svn: 167121
2012-10-31 15:14:39 +00:00
Meador Inge 05a625a0ed instcombine: Migrate strto* optimizations
This patch migrates the strto* optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167119
2012-10-31 14:58:26 +00:00
Hans Wennborg 793b342dcf Fix false -> NULL conversion from r167115 spotted by Benjamin Kramer.
llvm-svn: 167117
2012-10-31 14:36:48 +00:00
Benjamin Kramer 1559127f6f Replace some instances of UniqueVector with SetVector, which is slightly cheaper.
No functionality change.

llvm-svn: 167116
2012-10-31 13:45:49 +00:00
Hans Wennborg 9e74dd97b8 Do simple constant propagation in lookup table formation for switches
By propagating the value for the switch condition, LLVM can now build
lookup tables for code such as:

  switch (x) {
    case 1: return 5;
    case 2: return 42;
    case 3: case 4: case 5:
      return x - 123;
    default:
      return 123;
  }

Given that x is known for each case, "x - 123" becomes a constant for
cases 3, 4, and 5.

llvm-svn: 167115
2012-10-31 13:42:45 +00:00
Benjamin Kramer 8682ac1a77 LCSSA: Add a workaround for another nasty SCEV cache invalidation issue.
I'm not entirely happy with this solution, but I don't see a smarter way currently.
Fixes PR14214.

llvm-svn: 167112
2012-10-31 10:01:29 +00:00
Meador Inge 6f8e01121a instcombine: Migrate strpbrk optimizations
This patch migrates the strpbrk optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167105
2012-10-31 04:29:58 +00:00
Meador Inge d589ac621b instcombine: Migrate strlen optimizations
This patch migrates the strlen optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167103
2012-10-31 03:33:06 +00:00
Meador Inge 067294b3ac instcombine: Migrate strncpy optimizations
This patch migrates the strncpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167102
2012-10-31 03:33:00 +00:00
Nadav Rotem ce77ab0c24 LoopVectorize: Do not vectorize loops with tiny constant trip counts.
llvm-svn: 167101
2012-10-31 03:31:07 +00:00
Nadav Rotem ff7889196b Add support for loops that don't start with Zero.
This is important for loops in the LAPACK test-suite.
These loops start at 1 because they are auto-converted from fortran.

llvm-svn: 167084
2012-10-31 00:45:26 +00:00
Meador Inge 9a6a190562 instcombine: Migrate stpcpy optimizations
This patch migrates the stpcpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.  Note that the
__stpcpy_chk simplifications were migrated in a previous commit.

llvm-svn: 167083
2012-10-31 00:20:56 +00:00
Meador Inge cdb2ca54ae instcombine: Split out the __stpcpy_chk simplifications from StrCpyChkOpt
r166198 migrated the strcpy optimization to instcombine.  The strcpy
simplifier that was migrated from Transforms/Scalar/SimplifyLibCalls.cpp
was also doing some __strcpy_chk simplifications.  Those fortified
simplifications were migrated as well, but introduced a bug in the
__stpcpy_chk simplifier in the process.  This happened because the
__strcpy_chk and __stpcpy_chk simplifiers were both mapped to StrCpyChkOpt
which was updated with simplifications that worked for __strcpy_chk, but
not __stpcpy_chk.

This patch fixes the problem by adding proper test coverage and creating a
new simplifier for __stpcpy_chk (instead of sharing one with __strcpy_chk).

llvm-svn: 167082
2012-10-31 00:20:51 +00:00
Nadav Rotem 47a299dcc9 Add documentation.
llvm-svn: 167055
2012-10-30 22:06:26 +00:00
Chandler Carruth 1296b59522 Fix PR14212: For some strange reason I treated vectors differently from
integers in that the code to handle split alloca-wide integer loads or
stores doesn't come first. It should, for the same reasons as with
integers, and the PR attests to that. Also had to fix a busted assert in
that this test case also covers.

llvm-svn: 167051
2012-10-30 20:52:40 +00:00
Hal Finkel 08f34ac9dd BBVectorize: Cache fixed-order pairs instead of recomputing pointer info.
Instead of recomputing relative pointer information just prior to fusing,
cache this information (which also needs to be computed during the
candidate-pair selection process). This cuts down on the total number of
SE queries made, and also is a necessary intermediate step on the road toward
including shuffle costs in the pair selection procedure.

No functionality change is intended.

llvm-svn: 167049
2012-10-30 20:17:37 +00:00
Benjamin Kramer 48a6478242 LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove.
Thanks to Preston Briggs for catching this!

llvm-svn: 167045
2012-10-30 19:49:39 +00:00
Hal Finkel 2eaadd1a2d BBVectorize: Fix a small bug introduced in r167042.
We need to make sure that we take the correct load/store alignment
when the inputs are flipped.

llvm-svn: 167044
2012-10-30 19:47:37 +00:00
Hal Finkel f384890961 BBVectorize: Simplify how input swapping is handled.
Stop propagating the FlipMemInputs variable into the routines that
create the replacement instructions. Instead, just flip the arguments
of those routines. This allows for some associated cleanup (not all
of which is done here). No functionality change is intended.

llvm-svn: 167042
2012-10-30 19:35:29 +00:00
Hal Finkel eac2887143 BBVectorize: Don't make calls to SE when the result is unused.
SE was being called during the instruction-fusion process (when the result
is unreliable, and thus ignored). No functionality change is intended.

llvm-svn: 167037
2012-10-30 18:55:49 +00:00
Nadav Rotem d3df665140 80-col
llvm-svn: 167036
2012-10-30 18:37:43 +00:00
Nadav Rotem bc21aceb19 LoopVectorize: Add support for write-only loops when the write destination is a single pointer.
Speedup SciMark by 1%

llvm-svn: 167035
2012-10-30 18:36:45 +00:00
Nadav Rotem b3e8e688da LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-one
while XOR, and OR need to start at zero.

llvm-svn: 167032
2012-10-30 18:12:36 +00:00
Duncan Sands e2395dc27b Fix isEliminableCastPair to work correctly in the presence of pointers
with different sizes.

llvm-svn: 167018
2012-10-30 16:03:32 +00:00
Ulrich Weigand 6a9bb51a8d Enable some additional constant folding for PPCDoubleDouble.
This fixes Clang :: CodeGen/complex-builtints.c on PowerPC.

llvm-svn: 167013
2012-10-30 12:33:18 +00:00
Hans Wennborg f3254838e4 Use TargetTransformInfo to control switch-to-lookup table transformation
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.

This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.

llvm-svn: 167011
2012-10-30 11:23:25 +00:00
Nadav Rotem 73ddcfe03f LoopVectorizer: change debug prints: Print the module identifier when deciding to vectorize. When deciding not to vectorize do not print the called function name because it can be null.
llvm-svn: 166989
2012-10-30 00:40:39 +00:00
Nadav Rotem 5ad045a8c5 LoopVectorize: Update and preserve the dominator tree info.
llvm-svn: 166970
2012-10-29 21:52:38 +00:00
Ulrich Weigand 3abb34389d In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.

llvm-svn: 166958
2012-10-29 18:35:49 +00:00
Nadav Rotem 39aab03be3 Rename the BB-vectorize flag to match the dragonegg name
llvm-svn: 166948
2012-10-29 18:01:14 +00:00
Duncan Sands 5bdd9dda48 Remove a wrapper around getIntPtrType added to GVN by Hal in commit 166624 (the
wrapper returns a vector of integers when passed a vector of pointers) by having
getIntPtrType itself return a vector of integers in this case.  Outside of this
wrapper, I didn't find anywhere in the codebase that was relying on the old
behaviour for vectors of pointers, so give this a whirl through the buildbots.

llvm-svn: 166939
2012-10-29 17:31:46 +00:00
Nadav Rotem c59ae207ef Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we dont want to share the same flag as the bb-vectorizer.
llvm-svn: 166937
2012-10-29 16:36:25 +00:00
Rafael Espindola 56183fbe78 llvm-extract changes linkages so that functions on both sides of the
split module can see each other. If it is keeping a symbol that already has
a non local linkage, it doesn't need to change it.

llvm-svn: 166908
2012-10-29 01:59:03 +00:00
Rafael Espindola 9d30d0fc67 llvm-extract was unable to handle aliases. It would leave a copy on the
output of both

llvm-extract foo.ll -func=bar
and
llvm-extract foo.ll -func=bar -delete

so the two new files could not be linked together anymore. With this change
alias are handled almost like functions and global variables. Almost because
with alias we cannot just clear the initializer/body, we have to create a new
declaration and replace the alias with it.

The net result is that now the output of the above commands can be linked
even if foo.ll has aliases.

llvm-svn: 166907
2012-10-29 00:27:55 +00:00
Benjamin Kramer 8d2ee55a0c LoopIdiom: Add checks to avoid turning memmove into an infinite loop.
I don't think this is possible with the current implementation but that may change eventually.

llvm-svn: 166877
2012-10-27 15:18:28 +00:00
Benjamin Kramer 1c9e5186c0 LoopIdiom: Recognize memmove loops.
This turns loops like
  for (unsigned i = 0; i != n; ++i)
    p[i] = p[i+1];
into memmove, which has a highly optimized implementation in most libcs.

This was really easy with the new DependenceAnalysis :)

llvm-svn: 166875
2012-10-27 14:25:51 +00:00
Benjamin Kramer d5c9be8247 LoopIdiom: Replace custom dependence analysis with DependenceAnalysis.
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481.

Compile time performance seems to be slightly worse, but this is mostly due
to an extra LCSSA run scheduled by the PassManager and should be fixed there.

llvm-svn: 166874
2012-10-27 14:25:44 +00:00
Hal Finkel bad10bb2f3 Update BBVectorize to use the new VTTI instr. cost interfaces.
The monolithic interface for instruction costs has been split into
several functions. This is the corresponding change. No functionality
change is intended.

llvm-svn: 166865
2012-10-27 04:33:48 +00:00
Nadav Rotem 859366f93f 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result.
2. Change the maximum vectorization width from 4 to 8.
3. A test for both.

llvm-svn: 166864
2012-10-27 04:11:32 +00:00
Nadav Rotem afae78edab Refactor the VectorTargetTransformInfo interface.
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.

llvm-svn: 166836
2012-10-26 23:49:28 +00:00
Rafael Espindola 4253bd8faf Change the internalize pass to internalize all symbols when given an empty
list of externals. This makes sense since a shared library with no symbols
can still be useful if it has static constructors.

llvm-svn: 166795
2012-10-26 18:47:48 +00:00
Benjamin Kramer 7736085894 LoopSimplify: Preserve DependenceAnalysis.
This is currently true, but may change when DA grows more aggressive caching.
Without this setting it's impossible to use DA from a LoopPass because DA is a
function pass and cannot be properly scheduled in between LoopPasses. The
LoopManager reacts to this with an infinite loop which made this really annoying
to debug.

llvm-svn: 166788
2012-10-26 17:40:50 +00:00
Benjamin Kramer e3d821a466 Fix SCEV cache invalidation in LCSSA and LoopSimplify.
The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable
to analyzable but the LCSSA bug is very nasty. It only comes into play with a
specific order of the LoopPassManager worklist and can cause actual
miscompilations, when a SCEV refers to a value that has been replaced with PHI
node. SCEVExpander may then insert code into the wrong place, either violating
domination or randomly miscompiling stuff.

Comes with an extensive test case reduced from the test-suite with
bugpoint+SCEVValidator.

llvm-svn: 166787
2012-10-26 17:31:43 +00:00
Hal Finkel 4863448dca Use VTTI->getNumberOfParts in BBVectorize.
This change reflects VTTI refactoring; no functionality change intended.

llvm-svn: 166752
2012-10-26 04:28:06 +00:00
Hal Finkel 41a6ded4a0 Disable generation of pointer vectors by BBVectorize.
Once vector-of-pointer support works, then this can be reverted.

llvm-svn: 166741
2012-10-26 00:05:26 +00:00
Hal Finkel 20a49d6f2c BBVectorize, when using VTTI, should not form types that will be split.
This is needed so that perl's SHA can be compiled (otherwise
BBVectorize takes far too long to find its fixed point).

I'll try to come up with a reduced test case.

llvm-svn: 166738
2012-10-25 23:47:16 +00:00
Hal Finkel cbf9365f4c Begin incorporating target information into BBVectorize.
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

 1. Target information is used to determine if it is profitable to fuse two
    instructions. This means that the cost of the vector operation must not
    be more expensive than the cost of the two original operations. Pairs that
    are not profitable are no longer considered (because current cost information
    is incomplete, for intrinsics for example, equal-cost pairs are still
    considered).

 2. The 'cost savings' computed for the profitability check are also used to
    rank the DAGs that represent the potential vectorization plans. Specifically,
    for nodes of non-trivial depth, the cost savings is used as the node
    weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

llvm-svn: 166716
2012-10-25 21:12:23 +00:00
Nadav Rotem 579042f71b LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1.
llvm-svn: 166715
2012-10-25 21:03:48 +00:00
Jakob Stoklund Olesen 977f41a1fa Also optimize large switch statements.
The isValueEqualityComparison() guard at the top of SimplifySwitch()
only applies to some of the possible transformations.

The newer transformations work just fine on large switches, and the
check on predecessor count is nonsensical.

llvm-svn: 166710
2012-10-25 18:51:15 +00:00
Chandler Carruth 58d0556765 Teach SROA how to split whole-alloca integer loads and stores into
smaller integer loads and stores.

The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.

Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists soley
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine however, so
this shouldn't actually result in missed opportunities.

If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.

Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.

llvm-svn: 166662
2012-10-25 04:37:07 +00:00
Nadav Rotem 5ffb049a55 Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649
2012-10-25 00:08:41 +00:00
Nadav Rotem 086ea5c1f5 revert accidental change
llvm-svn: 166643
2012-10-24 23:48:57 +00:00
Nadav Rotem 4a87683a41 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00
Micah Villmow f07b962801 Fix a compiler warning with an unused variable.
llvm-svn: 166634
2012-10-24 22:32:26 +00:00
Hal Finkel 69b07a2c3a Update GVN to support vectors of pointers.
GVN will now generate ptrtoint instructions for vectors of pointers.
Fixes PR14166.

llvm-svn: 166624
2012-10-24 21:22:30 +00:00
Nadav Rotem e4f491e7ee whitespace
llvm-svn: 166622
2012-10-24 20:58:40 +00:00
Nadav Rotem a721b21c64 LoopVectorizer: Add a basic cost model which uses the VTTI interface.
llvm-svn: 166620
2012-10-24 20:36:32 +00:00
Micah Villmow bf3eeb2dfc Add some cleanup to the DataLayout changes requested by Chandler.
llvm-svn: 166607
2012-10-24 18:36:13 +00:00
Micah Villmow 51e7246cb4 Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!
llvm-svn: 166596
2012-10-24 17:25:11 +00:00
Micah Villmow 6a8f3f9e20 Delete a directory that wasn't supposed to be checked in yet.
llvm-svn: 166591
2012-10-24 17:20:04 +00:00
Micah Villmow 12d9127833 Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Nadav Rotem 5bed7b4fad Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news.
PR14158.

llvm-svn: 166491
2012-10-23 18:44:18 +00:00
Duncan Sands 5ed3900d77 Fix typo that somehow escaped both testing and code inspection.
llvm-svn: 166475
2012-10-23 09:07:02 +00:00
Duncan Sands 533c8ae79f Transform code like this
%V = mul i64 %N, 4
 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
 %t1 = getelementptr i32* %arr, i32 %N
 %t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
  int foo(int *A, int N) {
    return A[N];
  }
because gcc turns this into byte pointer arithmetic before it hits the plugin:
  D.1590_2 = (long unsigned int) N_1(D);
  D.1591_3 = D.1590_2 * 4;
  D.1592_5 = A_4(D) + D.1591_3;
  D.1589_6 = *D.1592_5;
  return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.

An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.

llvm-svn: 166474
2012-10-23 08:28:26 +00:00
Richard Smith 6289a4e85e Per the C++ standard, we need to include the definition of llvm::Calculate in
every TU where it's implicitly instantiated, even if there's an implicit
instantiation for the same types available in another TU.

llvm-svn: 166470
2012-10-23 06:19:46 +00:00
Julien Lerouge a302b6d95e Fix typo.
llvm-svn: 166456
2012-10-23 00:38:15 +00:00
Julien Lerouge d7fa5e420d Explain why DenseMap is still used here instead of MapVector.
llvm-svn: 166454
2012-10-23 00:23:46 +00:00
Julien Lerouge 8cf84fa4e2 Iterating over a DenseMap<std::pair<BasicBlock*, unsigned>, PHINode*> is not
deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>,
PHINode*> (we already have a map from BasicBlock to unsigned).

<rdar://problem/12541389>

llvm-svn: 166435
2012-10-22 19:43:56 +00:00
Nadav Rotem 1c7fc71e69 Don't crash if the load/store pointer is not a GEP.
Fix by Shivarama Rao <Shivarama.Rao@amd.com>

llvm-svn: 166427
2012-10-22 18:27:56 +00:00
Argyrios Kyrtzidis 54ff5e81a1 Revert r166407 because it caused analyzer tests to crash and broke self-host bots.
llvm-svn: 166424
2012-10-22 18:16:14 +00:00
Hal Finkel 931c52b84c BBVectorize should ignore unreachable blocks.
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.

llvm-svn: 166423
2012-10-22 18:00:55 +00:00
Nadav Rotem f17cd27362 Rename a variable.
llvm-svn: 166410
2012-10-22 04:53:05 +00:00
Nadav Rotem 03011f1393 Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector).
llvm-svn: 166409
2012-10-22 04:38:00 +00:00
Nadav Rotem c9741887c3 Update the loop vectorizer docs.
llvm-svn: 166408
2012-10-22 03:52:53 +00:00
Nick Lewycky 8b67e1e0b9 Reapply r166405, teaching tailcallelim to be smarter about nocapture, with a
very small but very important bugfix:
  bool shouldExplore(Use *U) {
    Value *V = U->get();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
    [...]
should have read:
  bool shouldExplore(Use *U) {
    Value *V = U->getUser();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
Fixes PR14143!

llvm-svn: 166407
2012-10-22 03:03:52 +00:00
NAKAMURA Takumi 60d56d2eea Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when deciding whether"
It broke selfhosting stage2 in several builders.

llvm-svn: 166406
2012-10-22 00:48:51 +00:00
Nick Lewycky 2d28f2bf83 Teach TailRecursionElimination to consider 'nocapture' when deciding whether
calls can be marked tail.

llvm-svn: 166405
2012-10-21 23:51:22 +00:00
Benjamin Kramer f77f224df9 Revert r166390 "LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis."
It passes all tests, produces better results than the old code but uses the
wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it
still in tree?", you might ask. The answer is obviously: "To confuse developers."

Just swapping in the new dependency pass sends the pass manager into an infinte
loop, I'll try to figure out why tomorrow.

llvm-svn: 166399
2012-10-21 19:31:16 +00:00
Anders Carlsson 7d8991c778 Avoid an extra hash lookup when inserting a value into the widen map.
llvm-svn: 166395
2012-10-21 16:26:35 +00:00
Jakub Staszak baa063bd03 Simplify code. No functionality change.
llvm-svn: 166393
2012-10-21 15:36:03 +00:00
Jakub Staszak 9694ab8ffa Simplify code. No functionality change.
llvm-svn: 166392
2012-10-21 15:29:19 +00:00
Benjamin Kramer 3ae8bc68af LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis.
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481. I'm not entirely
sure that all cases are handled that the old checks handled but LDA will
certainly become smarter in the future.

llvm-svn: 166390
2012-10-21 15:03:07 +00:00
Nadav Rotem fe88c67161 Fix a bug in the vectorization of wide load/store operations.
We used a SCEV to detect that A[X] is consecutive. We assumed that X was
the induction variable. But X can be any expression that uses the induction
for example: X = i + 2;

llvm-svn: 166388
2012-10-21 06:49:10 +00:00
Nadav Rotem c1679a95b6 Add support for reduction variables that do not start at zero.
This is important for nested-loop reductions such as :

In the innermost loop, the induction variable does not start with zero:

for (i = 0 .. n)
 for (j = 0 .. m)
  sum += ...

llvm-svn: 166387
2012-10-21 05:52:51 +00:00
Nadav Rotem 364bd30641 Document change. Describe the pass and some papers that inspired the design of the pass.
llvm-svn: 166386
2012-10-21 04:04:25 +00:00
Nadav Rotem 7e1084d36c Vectorizer: fix a bug in the classification of induction/reduction phis.
llvm-svn: 166384
2012-10-21 02:38:01 +00:00
Nadav Rotem e5dc57d4fb Fix an infinite loop in the loop-vectorizer.
PR14134.

llvm-svn: 166379
2012-10-20 20:45:01 +00:00
Benjamin Kramer 7ddd70527c SROA: Simplify code. No functionality change.
llvm-svn: 166375
2012-10-20 12:04:57 +00:00
Benjamin Kramer f55b592cc8 InstCombine: Fix an edge case where constant icmps could sneak into ConstantFoldInstOperands and crash.
Have to refactor the ConstantFolder interface one day to define bugs like this away. Fixes PR14131.

llvm-svn: 166374
2012-10-20 08:43:52 +00:00
Nadav Rotem d189b82a9b Vectorize: teach cavVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x.
If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then
it is unsafe to vectorize it because we may hit an ordering issue.

llvm-svn: 166371
2012-10-20 08:26:33 +00:00
Nadav Rotem 3940bafb54 Fix a typo
llvm-svn: 166367
2012-10-20 05:03:27 +00:00
Nadav Rotem f70ca3ceed Vectorizer: refactor the memory checks to a new function. No functionality change.
llvm-svn: 166366
2012-10-20 04:59:06 +00:00
Nadav Rotem 550f7f7e19 LoopVectorize: Keep the IRBuilder on the stack.
llvm-svn: 166354
2012-10-19 23:27:19 +00:00
Nadav Rotem 4f7f72702b Vectorizer: Add support for loop reductions.
For example:

  for (i=0; i<n; i++)
   sum += A[i] +  B[i] + i;

llvm-svn: 166351
2012-10-19 23:05:40 +00:00
Nadav Rotem 4dc976fbcb revert r166264 because the LTO build is still failing
llvm-svn: 166340
2012-10-19 21:28:43 +00:00
Benjamin Kramer 317d6c621d SimplifyLibcalls: The return value of ffsll is always i32, even when the input is zero.
Fixes PR13028.

llvm-svn: 166313
2012-10-19 20:43:44 +00:00
Benjamin Kramer f1088a37cb Indvars: Don't recursively delete instruction during BB iteration.
This can invalidate the iterators leading to use after frees and crashes.
Fixes PR12536.

llvm-svn: 166291
2012-10-19 17:53:54 +00:00
Alexey Samsonov 8418442ff1 [ASan] Support comments in ASan/TSan blacklist file as lines starting with #
llvm-svn: 166283
2012-10-19 15:24:46 +00:00
Evgeniy Stepanov 8eb77d847e Move SplitBlockAndInsertIfThen to BasicBlockUtils.
llvm-svn: 166278
2012-10-19 10:48:31 +00:00
Benjamin Kramer 319cb771b2 LoopVectorize: Keep the IRBuilder on the stack.
No functionality change.

llvm-svn: 166274
2012-10-19 08:42:02 +00:00
Kostya Serebryany 0995994989 [asan] make sure asan erases old unused allocas after it created a new one. This became important after the recent move from ModulePass to FunctionPass because no cleanup is happening after asan pass any more.
llvm-svn: 166267
2012-10-19 06:20:53 +00:00
Nadav Rotem 4985ddc5e0 recommit the patch that makes LSR and LowerInvoke use the TargetTransform interface.
llvm-svn: 166264
2012-10-19 04:27:49 +00:00
Nadav Rotem ced93f3a05 vectorizer: Add support for reading and writing from the same memory location.
llvm-svn: 166255
2012-10-19 01:24:18 +00:00
Nadav Rotem 1667324f22 cleanup the comment.
llvm-svn: 166247
2012-10-18 23:21:01 +00:00
Nadav Rotem d45a6b93df fix a naming typo
llvm-svn: 166232
2012-10-18 21:45:31 +00:00
Nadav Rotem f8a1396882 Avoid reconstructing the pointer set when searching for duplicated read/write pointers.
llvm-svn: 166205
2012-10-18 18:34:50 +00:00
Meador Inge 2332615f53 Cosmetic change -- move two simplifiers to the right commented statement group.
llvm-svn: 166199
2012-10-18 18:12:43 +00:00
Meador Inge 000dbccfc6 instcombine: Migrate strcpy optimizations
This patch migrates the strcpy optimizations from the simplify-libcalls pass
into the instcombine library call simplifier.  Note also that StrCpyChkOpt
has been updated with a few simplifications that were being done in the
simplify-libcalls version of StrCpyOpt, but not in the migrated implementation
of StrCpyOpt.  There is no reason to overload StrCpyOpt with fortified and
regular simplifications in the new model since there is already a dedicated
simplifier for __strcpy_chk.

llvm-svn: 166198
2012-10-18 18:12:40 +00:00
Nadav Rotem a031c57417 When looking for a vector representation of a scalar, do a single lookup. Also, cache the result of the broadcast instruction.
No functionality change.

llvm-svn: 166191
2012-10-18 17:31:49 +00:00
Chandler Carruth 59ff93afe6 Refactor insert and extract of sub-integers into static helpers that
operate purely on values. Sink the alloca loading and storing logic into
the rewrite routines that are specific to alloca-integer-rewrite
driving. This is just a refactoring here, but the subsequent step will
be to reuse the insertion and extraction logic when rewriting integer
loads and stores that have been split and decomposed into narrower loads
and stores.

No functionality changed other than different names for instructions.

llvm-svn: 166176
2012-10-18 09:56:08 +00:00
Chandler Carruth e793a50f45 This FIXME was fixed some time ago. =]
llvm-svn: 166175
2012-10-18 09:56:06 +00:00
Chandler Carruth e8479e15f5 Introduce a BarrierNoop pass, a hack designed to allow *some* control
over the implicitly-formed-and-nesting CGSCC pass manager and function
pass managers, especially when using them on the opt commandline or
using extension points in the module builder. The '-barrier' opt flag
(or the pass itself) will create a no-op module pass in the pipeline,
resetting the pass manager stack, and allowing the creation of a new
pipeline of function passes or CGSCC passes to be created that is
independent from any previous pipelines.

For example, this can be used to test running two CGSCC passes in
independent CGSCC pass managers as opposed to in the same CGSCC pass
manager. It also allows us to introduce a further hack into the
PassManagerBuilder to separate the O0 pipeline extension passes from the
always-inliner's CGSCC pass manager, which they likely do not want to
participate in... At the very least none of the Sanitizer passes want
this behavior.

This fixes a bug with ASan at O0 currently, and I'll commit the ASan
test which covers this pass. I'm happy to add a test case that this pass
exists and works, but not sure how much time folks would like me to
spend adding test cases for the details of its behavior of partition
pass managers.... The whole thing is just vile, and mostly intended to
unblock ASan, so I'm hoping to rip this all out in a brave new pass
manager world.

llvm-svn: 166172
2012-10-18 08:05:46 +00:00
Nadav Rotem 7a1728094c remove unused variable to fix a warning.
llvm-svn: 166170
2012-10-18 06:09:21 +00:00
Bob Wilson d6d9ccca38 Temporarily revert the TargetTransform changes.
The TargetTransform changes are breaking LTO bootstraps of clang.  I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.

This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741

llvm-svn: 166168
2012-10-18 05:43:52 +00:00
Nadav Rotem 642efbcdd8 Remove the use of dominators and AA.
llvm-svn: 166167
2012-10-18 05:33:02 +00:00
Nadav Rotem b52f717411 Vectorizer: Add support for loops with an unknown count. For example:
for (i=0; i<n; i++){
        a[i] = b[i+1] + c[i+3];
     }

llvm-svn: 166165
2012-10-18 05:29:12 +00:00
NAKAMURA Takumi 7857415785 LoopVectorize.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 166153
2012-10-17 23:40:15 +00:00
Jakub Staszak 68e5dfddcb Remove redundant SetInsertPoint call.
llvm-svn: 166138
2012-10-17 23:06:37 +00:00
Roman Divacky 4955ec317c Fix some typos and wrong indenting.
llvm-svn: 166128
2012-10-17 21:07:35 +00:00
Nadav Rotem 6b94c2a09b Add a loop vectorizer.
llvm-svn: 166112
2012-10-17 18:25:06 +00:00
Kostya Serebryany 20343351be [asan] better debug diagnostics in asan compiler module
llvm-svn: 166102
2012-10-17 13:40:06 +00:00
Chandler Carruth 6fab42aa39 This just in, it is a *bad idea* to use 'udiv' on an offset of
a pointer. A very bad idea. Let's not do that. Fixes PR14105.

Note that this wasn't *that* glaring of an oversight. Originally, these
routines were only called on offsets within an alloca, which are
intrinsically positive. But over the evolution of the pass, they ended
up being called for arbitrary offsets, and things went downhill...

llvm-svn: 166095
2012-10-17 09:23:48 +00:00
Chandler Carruth 40617f593e Fix a really annoying "bug" introduced in r165941. The change from that
revision makes no sense. We cannot use the address space of the *post
indexed* type to conclude anything about a *pre indexed* pointer type's
size. More importantly, this index can never be over a pointer. We are
indexing over arrays and vectors here.

Of course, I have no test case here. Neither did the original patch. =/

llvm-svn: 166091
2012-10-17 07:22:16 +00:00
Michael Gottesman 02a1141e5a [InstCombine] Teach InstCombine how to handle an obfuscated splat.
An obfuscated splat is where the frontend poorly generates code for a splat
using several different shuffles to create the splat, i.e.,

  %A = load <4 x float>* %in_ptr, align 16
  %B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef>
  %C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef>
  %D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4>

llvm-svn: 166061
2012-10-16 21:29:38 +00:00
Jakub Staszak 8f46e914fb Simplify code. No functionality change.
llvm-svn: 166053
2012-10-16 19:52:32 +00:00
Jakub Staszak 25dcab1eaa 80-col fixup.
llvm-svn: 166050
2012-10-16 19:39:40 +00:00
Jakub Staszak ba34fdb0e4 Simplify potentially quadratic behavior while erasing elements from std::vector.
llvm-svn: 166045
2012-10-16 19:32:31 +00:00
Bill Wendling c6a15cf519 Use the Attributes::get method which takes an AttrVal value directly to simplify the code a bit. No functionality change.
llvm-svn: 166009
2012-10-16 05:23:31 +00:00
Craig Topper c74b600afb Fix filename in file header.
llvm-svn: 166004
2012-10-16 02:21:30 +00:00
Bill Wendling 50d27849f6 Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change.
llvm-svn: 165960
2012-10-15 20:35:56 +00:00
Micah Villmow 4bb926d91d Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
llvm-svn: 165941
2012-10-15 16:24:29 +00:00
Kostya Serebryany b0e2506d97 [asan] make AddressSanitizer to be a FunctionPass instead of ModulePass. This will simplify chaining other FunctionPasses with asan. Also some minor cleanup
llvm-svn: 165936
2012-10-15 14:20:06 +00:00
Chandler Carruth 49c8eea3c0 Update the memcpy rewriting to fully support widened int rewriting. This
includes extracting ints for copying elsewhere and inserting ints when
copying into the alloca. This should fix the CanSROA assertion coming
out of Clang's regression test suite.

llvm-svn: 165931
2012-10-15 10:24:43 +00:00
Chandler Carruth 9d966a2002 Follow-up fix to r165928: handle memset rewriting for widened integers,
and generally clean up the memset handling. It had rotted a bit as the
other rewriting logic got polished more.

llvm-svn: 165930
2012-10-15 10:24:40 +00:00
Chandler Carruth 435c4e0792 First major step toward addressing PR14059. This teaches SROA to handle
cases where we have partial integer loads and stores to an otherwise
promotable alloca to widen[1] those loads and stores to cover the entire
alloca and bitcast them into the appropriate type such that promotion
can proceed.

These partial loads and stores stem from an annoying confluence of ARM's
calling convention and ABI lowering and the FCA pre-splitting which
takes place in SROA. Clang lowers a { double, double } in-register
function argument as a [4 x i32] function argument to ensure it is
placed into integer 32-bit registers (a really unnerving implicit
contract between Clang and the ARM backend I would add). This results in
a FCA load of [4 x i32]* from the { double, double } alloca, and SROA
decomposes this into a sequence of i32 loads and stores. Inlining
proceeds, code gets folded, but at the end of the day, we still have i32
stores to the low and high halves of a double alloca. Widening these to
be i64 operations, and bitcasting them to double prior to loading or
storing allows promotion to proceed for these allocas.

I looked quite a bit changing the IR which Clang produces for this case
to be more friendly, but small changes seem unlikely to help. I think
the best representation we could use currently would be to pass 4 i32
arguments thereby avoiding any FCAs, but that would still require this
fix. It seems like it might eventually be nice to somehow encode the ABI
register selection choices outside of the parameter type system so that
the parameter can be a { double, double }, but the CC register
annotations indicate that this should be passed via 4 integer registers.

This patch does not address the second problem in PR14059, which is the
reverse: when a struct alloca is loaded as a *larger* single integer.

This patch also does not address some of the code quality issues with
the FCA-splitting. Those don't actually impede any optimizations really,
but they're on my list to clean up.

[1]: Pedantic footnote: for those concerned about memory model issues
here, this is safe. For the alloca to be promotable, it cannot escape or
have any use of its address that could allow these loads or stores to be
racing. Thus, widening is always safe.

llvm-svn: 165928
2012-10-15 08:40:30 +00:00
Chandler Carruth aa6afbb831 Hoist the canConvertValue predicate and the convertValue transform out
into static helper functions. They're really quite generic and are going
to be needed elsewhere shortly.

llvm-svn: 165927
2012-10-15 08:40:22 +00:00
Bill Wendling fbd38fe2e3 Add an enum for the return and function indexes into the AttrListPtr object. This gets rid of some magic numbers.
llvm-svn: 165924
2012-10-15 07:29:08 +00:00
Bill Wendling d079a446d7 Attributes Rewrite
Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.

llvm-svn: 165917
2012-10-15 04:46:55 +00:00
Meador Inge 40b6fac36c instcombine: Migrate strcmp and strncmp optimizations
This patch migrates the strcmp and strncmp optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

llvm-svn: 165915
2012-10-15 03:47:37 +00:00
Benjamin Kramer c5b0678cf8 Simplify code. No functionality change.
llvm-svn: 165904
2012-10-14 11:15:42 +00:00
Benjamin Kramer 650b1dbd56 Unquadratize SetVector removal loops in DSE.
Erasing from the beginning or middle of the vector is expensive, remove_if can
do it in linear time even though it's a bit ugly without lambdas.

No functionality change.

llvm-svn: 165903
2012-10-14 10:21:31 +00:00
Bill Wendling 76d2cd2f60 Remove operator cast method in favor of querying with the correct method.
llvm-svn: 165899
2012-10-14 08:54:26 +00:00
Bill Wendling 2a3c1cca7d Remove the bitwise AND operators from the Attributes class. Replace it with the equivalent from the builder class.
llvm-svn: 165896
2012-10-14 07:52:48 +00:00
Bill Wendling 722b26c0f2 Remove the bitwise assignment OR operator from the Attributes class. Replace it with the equivalent from the builder class.
llvm-svn: 165895
2012-10-14 07:35:59 +00:00
Bill Wendling a05b043c4a Remove the bitwise XOR operator from the Attributes class. Replace it with the equivalent from the builder class.
llvm-svn: 165893
2012-10-14 06:56:13 +00:00
Bill Wendling 85a64c217f Remove the bitwise NOT operator from the Attributes class. Replace it with the equivalent from the builder class.
llvm-svn: 165892
2012-10-14 06:39:53 +00:00
Benjamin Kramer 44e58f9eb1 Remove unused private field.
llvm-svn: 165881
2012-10-13 18:03:34 +00:00
Meador Inge 174185084c instcombine: Migrate strchr and strrchr optimizations
This patch migrates the strchr and strrchr optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

llvm-svn: 165875
2012-10-13 16:45:37 +00:00
Meador Inge 7fb2f7378b instcombine: Migrate strcat and strncat optimizations
This patch migrates the strcat and strncat optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

llvm-svn: 165874
2012-10-13 16:45:32 +00:00
Meador Inge df796f893f Implement new LibCallSimplifier class
This patch implements the new LibCallSimplifier class as outlined in [1].
In addition to providing the new base library simplification infrastructure,
all the fortified library call simplifications were moved over to the new
infrastructure.  The rest of the library simplification optimizations will
be moved over with follow up patches.

NOTE: The original fortified library call simplifier located in the
SimplifyFortifiedLibCalls class was not removed because it is still
used by CodeGenPrepare.  This class will eventually go away too.

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052283.html

llvm-svn: 165873
2012-10-13 16:45:24 +00:00
Chandler Carruth ba9319925e Teach SROA to cope with wrapper aggregates. These show up a lot in ABI
type coercion code, especially when targetting ARM. Things like [1
x i32] instead of i32 are very common there.

The goal of this logic is to ensure that when we are picking an alloca
type, we look through such wrapper aggregates and across any zero-length
aggregate elements to find the simplest type possible to form a type
partition.

This logic should (generally speaking) rarely fire. It only ends up
kicking in when an alloca is accessed using two different types (for
instance, i32 and float), and the underlying alloca type has wrapper
aggregates around it. I noticed a significant amount of this occurring
looking at stepanov_abstraction generated code for arm, and suspect it
happens elsewhere as well.

Note that this doesn't yet address truly heinous IR productions such as
PR14059 is concerning. Those result in mismatched *sizes* of types in
addition to mismatched access and alloca types.

llvm-svn: 165870
2012-10-13 10:49:33 +00:00
Chandler Carruth 482c61787c Speculatively harden the conversion logic. I have no idea if this will
help the dragonegg builders, and no test case at this point, but this
was one dimly plausible case I spotted by inspection. Hopefully will get
a testcase from those bots soon-ish, and will tidy this up with proper
testing.

llvm-svn: 165869
2012-10-13 10:49:30 +00:00
Chandler Carruth 0fb8a7787e Silence a warning in -assert builds.
llvm-svn: 165867
2012-10-13 05:09:27 +00:00
Chandler Carruth 891fec0b56 Clean up how we rewrite loads and stores to the whole alloca. When these
are single value types, the load and store should be directly based upon
the alloca and then bitcasting can fix the type as needed afterward.
This might in theory improve some of the IR coming out of SROA, but
I don't expect big changes yet and don't have any test cases on hand.
This is really just a cleanup/refactoring patch. The next patch will
cause this code path to be hit a lot more, actually get SROA to promote
more allocas and include several more test cases.

llvm-svn: 165864
2012-10-13 02:41:05 +00:00
Manman Ren 97c1876256 PGO: create metadata for switch only if it has more than one targets.
When all cases of a switch statement are dead, the weights vector only has one
element, and we will get an ssertion failure when calling createBranchWeights.

llvm-svn: 165759
2012-10-11 22:28:34 +00:00
Micah Villmow 0c61134d8d Revert 165732 for further review.
llvm-svn: 165747
2012-10-11 21:27:41 +00:00
Micah Villmow 083189730e Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.
llvm-svn: 165726
2012-10-11 17:21:41 +00:00
Nick Lewycky 49ac81ac68 Don't crash when !tbaa.struct contents is invalid.
llvm-svn: 165693
2012-10-11 02:05:23 +00:00
Nadav Rotem e10328737d Add a new interface to allow IR-level passes to access codegen-specific information.
llvm-svn: 165665
2012-10-10 22:04:55 +00:00
Bill Wendling bbcdf4e2a5 Remove the final bits of Attributes being declared in the Attribute
namespace. Use the attribute's enum value instead. No functionality change
intended.

llvm-svn: 165610
2012-10-10 07:36:45 +00:00
Bill Wendling ed42e799dc Pass into the AttributeWithIndex::get method an ArrayRef of attribute
enums. These are then created via the correct Attributes creation method.

llvm-svn: 165607
2012-10-10 06:13:42 +00:00
Bill Wendling f319e9905f Have 'addFnAttr' take the attribute enum value. Then have it build the attribute object and add it appropriately. No functionality change.
llvm-svn: 165595
2012-10-10 03:12:49 +00:00
Bill Wendling 8ccd6ca199 Use the attribute enums to query if a parameter has an attribute.
llvm-svn: 165550
2012-10-09 21:38:14 +00:00
Michael Ilseman 336cb79fdf Update EarlyCSE's SimpleValues to use Hashing.h for their hashes. Expanded the hashing and equality to allow for equality modulo commutativity for binary ops, and comparisons with swapping of predicates.
llvm-svn: 165509
2012-10-09 16:57:38 +00:00
Alexey Samsonov 2747e22051 Fixup for r165490: Use DenseMap instead of std::map. Simplify the loop in CollectFunctionDIs.
llvm-svn: 165498
2012-10-09 10:34:52 +00:00
Bill Wendling 93f70b78fd Use the enum value of the attributes when adding them to the attributes builder.
llvm-svn: 165494
2012-10-09 09:11:20 +00:00
Alexey Samsonov 3b861ec989 Fix PR14016.
DeadArgumentElimination pass can replace one LLVM function with another,
invalidating a pointer stored in debug info metadata entry for this function.
To fix this, we collect debug info descriptors for functions before
running a DeadArgumentElimination pass and "patch" pointers in metadata nodes
if we replace a function.

llvm-svn: 165490
2012-10-09 08:13:15 +00:00
Bill Wendling c9b22d735a Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488
2012-10-09 07:45:08 +00:00
Chandler Carruth 503eb2bb49 Fix PR14034, an infloop / heap corruption / crash bug in the new SROA.
Thanks to Benjamin for the raw test case. This one took about 50 times
longer to reduce than to fix. =/

llvm-svn: 165476
2012-10-09 01:58:35 +00:00
Bill Wendling f1c60d6d04 Fix. Apply the no capture attribute to the correct parameter.
llvm-svn: 165469
2012-10-09 00:51:40 +00:00
Bill Wendling c1e8e74cbd Convert to using the Attributes::Builder class to create attributes.
llvm-svn: 165468
2012-10-09 00:47:36 +00:00
Bill Wendling 70f3917b0e Convert to using the Attributes::Builder interface.
llvm-svn: 165465
2012-10-09 00:01:21 +00:00
Nadav Rotem 35315fea70 Refactor the AddrMode class out of TLI to its own header file.
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.

llvm-svn: 165455
2012-10-08 23:06:34 +00:00
Nick Lewycky 7c3b5d9444 Give CaptureTracker::shouldExplore a base implementation. Most users want to do
the same thing. No functionality change.

llvm-svn: 165435
2012-10-08 22:12:48 +00:00
Micah Villmow cdfe20b97f Move TargetData to DataLayout.
llvm-svn: 165402
2012-10-08 16:38:25 +00:00
NAKAMURA Takumi 605fe78aca SROA.cpp: Fix a warning, [-Wunused-variable]
llvm-svn: 165309
2012-10-05 13:56:23 +00:00
Duncan Sands 933db779a2 Move this test a bit later, after the point at which we know that we either
have an alloca or a parameter, since then the alloca test should make sense
to readers, while before it probably appears too specific.  No functionality
change.

llvm-svn: 165306
2012-10-05 07:29:46 +00:00
Chandler Carruth e5b7a2ccd2 Teach the new SROA a new trick. Now we zap any memcpy or memmoves which
are in fact identity operations. We detect these and kill their
partitions so that even splitting is unaffected by them. This is
particularly important because Clang relies on emitting identity memcpy
operations for struct copies, and these fold away to constants very
often after inlining.

Fixes the last big performance FIXME I have on my plate.

llvm-svn: 165285
2012-10-05 01:29:09 +00:00
Chandler Carruth 90c4a3ae20 Lift the speculation visitor above all the helpers that are targeted at
the rewrite visitor to make the fact that the speculation is completely
independent a bit more clear.

I promise that this is just a cut/paste of the one visitor and adding
the annonymous namespace wrappings. The diff may look completely
preposterous, it does in git for some reason.

llvm-svn: 165284
2012-10-05 01:29:06 +00:00
Preston Gurd 0d67f5106c This patch corrects commit 165126 by using an integer bit width instead of
a pointer to a type, in order to remove the uses of getGlobalContext().

Patch by Tyler Nowicki.

llvm-svn: 165255
2012-10-04 21:33:40 +00:00
Jakub Staszak e076cac097 Add a comment to the commit r165187.
llvm-svn: 165238
2012-10-04 19:08:30 +00:00
Benjamin Kramer d12e82e523 SimplifyCFG: Enhance the "remove CFG edge that leads to null pointer dereference" optimization to also handle instructions with multiple uses.
We conservatively only check the first use to avoid walking long use chains.
This catches the common case of having both a load and a store to a pointer
supplied by a PHI node.

llvm-svn: 165232
2012-10-04 16:11:49 +00:00
Duncan Sands a6d20010fe In my recent change to avoid use of underaligned memory I didn't notice that
cpyDest can be mutated in some cases, which would then cause a crash later if
indeed the memory was underaligned.  This brought down several buildbots, so
I guess the underaligned case is much more common than I thought!

llvm-svn: 165228
2012-10-04 13:53:21 +00:00
Chandler Carruth ac8317fd36 Fix PR13969, a mini-phase-ordering issue with the new SROA pass.
Currently, we re-visit allocas when something changes about the way they
might be *split* to allow better scalarization to take place. However,
we weren't handling the case when the *promotion* is what would change
the behavior of SROA. When an address derived from an alloca is stored
into another alloca, we consider the first to have escaped. If the
second is ever promoted to an SSA value, we will suddenly be able to run
the SROA pass on the first alloca.

This patch adds explicit support for this form if iteration. When we
detect a store of a pointer derived from an alloca, we flag the
underlying alloca for reprocessing after promotion. The logic works hard
to only do this when there is definitely going to be promotion and it
might remove impediments to the analysis of the alloca.

Thanks to Nick for the great test case and Benjamin for some sanity
check review.

llvm-svn: 165223
2012-10-04 12:33:50 +00:00
Duncan Sands c6ada69a14 The memcpy optimizer was happily doing call slot forwarding when the new memory
was less aligned than the old.  In the testcase this results in an overaligned
memset: the memset alignment was correct for the original memory but is too much
for the new memory.  Fix this by either increasing the alignment of the new
memory or bailing out if that isn't possible.  Should fix the gcc-4.7 self-host
buildbot failure.

llvm-svn: 165220
2012-10-04 10:54:40 +00:00
Chandler Carruth 43c8b46deb Teach the integer-promotion rewrite strategy to be endianness aware.
Sorry for this being broken so long. =/

As part of this, switch all of the existing tests to be Little Endian,
which is the behavior I was asserting in them anyways! Add in a new
big-endian test that checks the interesting behavior there.

Another part of this is to tighten the rules abotu when we perform the
full-integer promotion. This logic now rejects cases where there fully
promoted integer is a non-multiple-of-8 bitwidth or cases where the
loads or stores touch bits which are in the allocated space of the
alloca but are not loaded or stored when accessing the integer. Sadly,
these aren't really observable today as the rest of the pass will
already ensure the invariants hold. However, the latter situation is
likely to become a potential concern in the future.

Thanks to Benjamin and Duncan for early review of this patch. I'm still
looking into whether there are further endianness issues, please let me
know if anyone sees BE failures persisting past this.

llvm-svn: 165219
2012-10-04 10:39:28 +00:00
Bill Wendling e8619aa1c1 Use method to query for attributes.
llvm-svn: 165209
2012-10-04 06:58:52 +00:00
Bill Wendling d777398ee4 Add method to query for 'NoAlias' attribute on call/invoke instructions.
llvm-svn: 165208
2012-10-04 06:52:09 +00:00
Bill Wendling d0935f7069 Use method to query for attributes.
llvm-svn: 165207
2012-10-04 06:49:41 +00:00
Bill Wendling 9cae65918c Query for attributes via the correct method call.
llvm-svn: 165206
2012-10-04 06:48:57 +00:00
Kostya Serebryany d23b18fe7f [tsan] add 3 internal flags for fine-grain control of what is instrumented and what is not.
llvm-svn: 165204
2012-10-04 05:28:50 +00:00
Jakub Staszak f8a8129513 Fix PR13967.
llvm-svn: 165187
2012-10-03 23:59:47 +00:00
Preston Gurd 5509e3d727 This Patch corrects a problem whereby the optimization to use a faster divide
instruction (for Intel Atom) was not being done by Clang, because
the type context used by Clang is not the default context.

It fixes the problem by getting the global context types for each div/rem
instruction in order to compare them against the types in the BypassTypeMap.

Tests for this will be done as a separate patch to Clang.

Patch by Tyler Nowicki.

llvm-svn: 165126
2012-10-03 16:11:44 +00:00
Dmitry Vyukov f4cb22121a tsan: prepare for migration to new memory_order enum values (ABI compatible)
llvm-svn: 165107
2012-10-03 13:00:57 +00:00
Chandler Carruth 08e5f49f90 Fix an issue where we failed to adjust the alignment constraint on
a memcpy to reflect that '0' has a different meaning when applied to
a load or store. Now we correctly use underaligned loads and stores for
the test case added.

llvm-svn: 165101
2012-10-03 08:26:28 +00:00
Chandler Carruth 4b2b38d398 Try to use a better set of abstractions for computing the alignment
necessary during rewriting. As part of this, fix a real think-o here
where we might have left off an alignment specification when the address
is in fact underaligned. I haven't come up with any way to trigger this,
as there is always some other factor that reduces the alignment, but it
certainly might have been an observable bug in some way I can't think
of. This also slightly changes the strategy for placing explicit
alignments on loads and stores to only do so when the alignment does not
match that required by the ABI. This causes a few redundant alignments
to go away from test cases.

I've also added a couple of tests that really push on the alignment that
we end up with on loads and stores. More to come here as I try to fix an
underlying bug I have conjectured and produced test cases for, although
it's not clear if this bug is the one currently hitting dragonegg's
gcc47 bootstrap.

llvm-svn: 165100
2012-10-03 08:14:02 +00:00
Chandler Carruth 3f57b82979 Switch the SetVector::remove_if implementation to use partition which
preserves the values of the relocated entries, unlikely remove_if. This
allows walking them and erasing them.

Also flesh out the predicate we are using for this to support the
various constraints actually imposed on a UnaryPredicate -- without this
we can't compose it with std::not1.

Thanks to Sean Silva for the review here and noticing the issue with
std::remove_if.

llvm-svn: 165073
2012-10-03 00:03:00 +00:00
Chandler Carruth b09f0a3c75 Teach the new SROA to handle cases where an alloca that has already been
scheduled for processing on the worklist eventually gets deleted while
we are processing another alloca, fixing the original test case in
PR13990.

To facilitate this, add a remove_if helper to the SetVector abstraction.
It's not easy to use the standard abstractions for this because of the
specifics of SetVectors types and implementation.

Finally, a nice small test case is included. Thanks to Benjamin for the
fantastic reduced test case here! All I had to do was delete some empty
basic blocks!

llvm-svn: 165065
2012-10-02 22:46:45 +00:00
Chandler Carruth 6c3890b680 Fix another crasher in SROA, reported by Joel.
We require that the indices into the use lists are stable in order to
build fast lookup tables to locate a particular partition use from an
operand of a PHI or select. This is (obviously in hind sight)
incompatible with erasing elements from the array. Really, we don't want
to erase anyways. It is expensive, and a rare operation. Instead, simply
weaken the contract of the PartitionUse structure to allow null Use
pointers to represent dead uses. Now we can clear out the pointer to
mark things as dead, and all it requires is adding some 'continue'
checks to the various loops.

I'm still reducing a test case for this, as the test case I have is
huge. I think this one I can get a nice test case for though, as it was
much more deterministic.

llvm-svn: 165032
2012-10-02 18:57:13 +00:00
Chandler Carruth 3903e05244 Fix a silly coding error on my part. The whole point of the speculator
being separate was that it can grow the use list. As a consequence, we
can't use the iterator-pair interface, we need an index based interface.
Expose such an interface from the AllocaPartitioning, and use it in the
speculator.

This should at least fix a use-after-free bug found by Duncan, and may
fix some of the other crashers.

I don't have a nice deterministic test case yet, but if I get a good
one, I'll add it.

llvm-svn: 165027
2012-10-02 17:49:47 +00:00
Chandler Carruth 4e4359935b Turn the new SROA pass back on. Let's see if it sticks this time. =]
Again, let me know if anything breaks due to this!

llvm-svn: 164986
2012-10-02 04:24:01 +00:00
Chandler Carruth d71ef3a02a Make this plural. Spotted by Duncan in review (and a very old typo, this
is the second time I've moved this comment around...)

llvm-svn: 164939
2012-10-01 12:24:42 +00:00
Chandler Carruth d325f8021b Prune some unnecessary includes.
llvm-svn: 164938
2012-10-01 12:21:54 +00:00
Chandler Carruth 176ca71a82 Fix several issues with alignment. We weren't always accounting for type
alignment requirements of the new alloca. As one consequence which was
reported as a bug by Duncan, we overaligned memcpy calls to ranges of
allocas after they were rewritten to types with lower alignment
requirements. Other consquences are possible, but I don't have any test
cases for them.

llvm-svn: 164937
2012-10-01 12:16:54 +00:00
Benjamin Kramer 9fc3dc7781 SimplifyCFG: Don't crash when forming a switch bitmap with an undef default value.
Fixes PR13985.

llvm-svn: 164934
2012-10-01 11:31:48 +00:00
Chandler Carruth 82a57543d6 Factor the PHI and select speculation into a separate rewriter. This
could probably be factored still further to hoist this logic into
a generic helper, but currently I don't have particularly clean ideas
about how to handle that.

This at least allows us to drop custom load rewriting from the
speculation logic, which in turn allows the existing load rewriting
logic to fire. In theory, this could enable vector promotion or other
tricks after speculation occurs, but I've not dug into such issues. This
is primarily just cleaning up the factoring of the code and the
resulting logic.

llvm-svn: 164933
2012-10-01 10:54:05 +00:00
Chandler Carruth 54e8f0b4cf Refactor the PartitionUse structure to actually use the Use* instead of
a pair of instructions, one for the used pointer and the second for the
user. This simplifies the representation and also makes it more dense.

This was noticed because of the miscompile in PR13926. In that case, we
were running up against a fundamental "bad idea" in the speculation of
PHI and select instructions: the speculation and rewriting are
interleaved, which requires phi speculation to also perform load
rewriting! This is bad, and causes us to miss opportunities to do (for
example) vector rewriting only exposed after PHI speculation, etc etc.
It also, in the old system, required us to insert *new* load uses into
the current partition's use list, which would then be ignored during
rewriting because we had already extracted an end iterator for the use
list. The appending behavior (and much of the other oddities) stem from
the strange de-duplication strategy in the PartitionUse builder.
Amusingly, all this went without notice for so long because it could
only be triggered by having *different* GEPs into the same partition of
the same alloca, where both different GEPs were operands of a single
PHI, and where the GEP which was not encountered first also had multiple
uses within that same PHI node... Hence the insane steps required to
reproduce.

So, step one in fixing this fundamental bad idea is to make the
PartitionUse actually contain a Use*, and to make the builder do proper
deduplication instead of funky de-duplication. This is enough to remove
the appending behavior, and fix the miscompile in PR13926, but there is
more work to be done here. Subsequent commits will lift the speculation
into its own visitor. It'll be a useful step toward potentially
extracting all of the speculation logic into a generic utility
transform.

The existing PHI test case for repeated operands has been made more
extreme to catch even these issues. This test case, run through the old
pass, will exactly reproduce the miscompile from PR13926. ;] We were so
close here!

llvm-svn: 164925
2012-10-01 01:49:22 +00:00
Benjamin Kramer f064b65a94 SimplifyCFG: Enumerating all predecessors of a BB can be expensive (switches), avoid it if possible.
No functionality change.

llvm-svn: 164923
2012-09-30 21:03:56 +00:00
Benjamin Kramer 8403625123 ArgumentPromotion: Remove ancient workaround for a bug in the C backend.
Fun fact: The CBE learned how to deal with this situation before it was removed.

llvm-svn: 164918
2012-09-30 17:31:56 +00:00
Chandler Carruth 903790eff5 Fix a somewhat surprising miscompile where code relying on an ABI
alignment could lose it due to the alloca type moving down to a much
smaller alignment guarantee.

Now SROA will actively compute a proper alignment, factoring the target
data, any explicit alignment, and the offset within the struct. This
will in some cases lower the alignment requirements, but when we lower
them below those of the type, we drop the alignment entirely to give
freedom to the code generator to align it however is convenient.

Thanks to Duncan for the lovely test case that pinned this down. =]

llvm-svn: 164891
2012-09-29 10:41:21 +00:00
Evan Cheng 64a223aed8 Do not delete BBs if their addresses are taken. rdar://12396696
llvm-svn: 164866
2012-09-28 23:58:57 +00:00
Evan Cheng 8c6b06d4a0 GlobalDCE should be run at -O2 / -Os to eliminate unused dtor, etc. rdar://9142819
llvm-svn: 164850
2012-09-28 21:23:26 +00:00
Benjamin Kramer 255dea4b90 CorrelatedPropagation: BasicBlock::removePredecessor can simplify PHI nodes. If the it's the condition of a SwitchInst, reload it.
Fixes PR13972.

llvm-svn: 164818
2012-09-28 10:42:50 +00:00
Benjamin Kramer ed84360a45 GlobalOpt: non-constexpr bitcasts or GEPs can occur even if the global value is only stored once.
Fixes PR13968.

llvm-svn: 164815
2012-09-28 10:01:27 +00:00
Nick Lewycky 156999f8b9 Surprisingly, we missed a trivial case here. Fix that!
llvm-svn: 164814
2012-09-28 09:33:53 +00:00
Benjamin Kramer c2081d1c19 Fix a integer overflow in SimplifyCFG's look up table formation logic.
If the width is very large it gets truncated from uint64_t to uint32_t when
passed to TD->fitsInLegalInteger. The truncated value can fit in a register.
This manifested in massive memory usage or crashes (PR13946).

llvm-svn: 164784
2012-09-27 18:29:58 +00:00
Sylvestre Ledru 91ce36c986 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768
2012-09-27 10:14:43 +00:00
Sylvestre Ledru 721cffd53a Fix a typo 'iff' => 'if'
llvm-svn: 164767
2012-09-27 09:59:43 +00:00
Nick Lewycky 7b4cd228aa Prefer shuffles to selects. Backends love shuffles!
llvm-svn: 164763
2012-09-27 08:33:56 +00:00
Nick Lewycky 2e646236fb Disable the new SROA pass to get the tree back in working order. We don't yet
have testcases for the current problems.

llvm-svn: 164731
2012-09-26 22:43:04 +00:00
Bill Wendling 863bab689a Remove the `hasFnAttr' method from Function.
The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.

llvm-svn: 164725
2012-09-26 21:48:26 +00:00
Hans Wennborg cd3a11f725 Address Duncan's comments on r164684:
- Put statistics in alphabetical order
- Don't use getZextValue when building TableInt, just use APInts
- Introduce Create{Z,S}ExtOrTrunc in IRBuilder.

llvm-svn: 164696
2012-09-26 14:01:53 +00:00
Hans Wennborg f2e2c108dd Address Duncan's comments on r164682:
- Finish assert messages with exclamation mark
- Move overflow checking into ShouldBuildLookupTable.

llvm-svn: 164692
2012-09-26 11:07:37 +00:00
Chandler Carruth 208124f5a2 Analogous fix to memset and memcpy rewriting. Don't have a test case
contrived for these yet, as I spotted them by inspection and the test
cases are a bit more tricky to phrase.

llvm-svn: 164691
2012-09-26 10:59:22 +00:00
Chandler Carruth 3e4273dd0c When rewriting the pointer operand to a load or store which has
alignment guarantees attached, re-compute the alignment so that we
consider offsets which impact alignment.

llvm-svn: 164690
2012-09-26 10:45:28 +00:00
Chandler Carruth 871ba7249c Teach all of the loads, stores, memsets and memcpys created by the
rewriter in SROA to carry a proper alignment. This involves
interrogating various sources of alignment, etc. This is a more complete
and principled fix to PR13920 as well as related bugs pointed out by Eli
in review and by inspection in the area.

Also by inspection fix the integer and vector promotion paths to create
aligned loads and stores. I still need to work up test cases for
these... Sorry for the delay, they were found purely by inspection.

llvm-svn: 164689
2012-09-26 10:27:46 +00:00
Hans Wennborg 39583b88a0 SimplifyCFG: Make the switch-to-lookup table transformation store the
tables in bitmaps when they fit in a target-legal register.

This saves some space, and it also allows for building tables that would
otherwise be deemed too sparse.

One interesting case that this hits is example 7 from
http://blog.regehr.org/archives/320. We currently generate good code
for this when lowering the switch to the selection DAG: we build a
bitmask to decide whether to jump to one block or the other. My patch
will result in the same bitmask, but it removes the need for the jump,
as the return value can just be retrieved from the mask.

llvm-svn: 164684
2012-09-26 09:44:49 +00:00
Hans Wennborg 776d7126b7 SimplifyCFG: Refactor the switch-to-lookup table transformation by
breaking out the building of lookup tables into a separate class.

llvm-svn: 164682
2012-09-26 09:34:53 +00:00
Chandler Carruth 4bd8f66ed9 Revert the business end of r164636 and try again. I'll come in again. ;]
This should really, really fix PR13916. For real this time. The
underlying bug is... a bit more subtle than I had imagined.

The setup is a code pattern that leads to an @llvm.memcpy call with two
equal pointers to an alloca in the source and dest. Now, not any pattern
will do. The alloca needs to be formed just so, and both pointers should
be wrapped in different bitcasts etc. When this precise pattern hits,
a funny sequence of events transpires. First, we correctly detect the
potential for overlap, and correctly optimize the memcpy. The first
time. However, we do simplify the set of users of the alloca, and that
causes us to run the alloca back through the SROA pass in case there are
knock-on simplifications. At this point, a curious thing has happened.
If we happen to have an i8 alloca, we have direct i8 pointer values. So
we don't bother creating a cast, we rewrite the arguments to the memcpy
to dircetly refer to the alloca.

Now, in an unrelated area of the pass, we have clever logic which
ensures that when visiting each User of a particular pointer derived
from an alloca, we only visit that User once, and directly inspect all
of its operands which refer to that particular pointer value. However,
the mechanism used to detect memcpy's with the potential to overlap
relied upon getting visited once per *Use*, not once per *User*. This is
always true *unless* the same exact value is both source and dest. It
turns out that almost nothing actually produces that pattern though.

We can hand craft test cases that more directly test this behavior of
course, and those are included. Also, note that there is a significant
missed optimization here -- we prove in many cases that there is
a non-volatile memcpy call with identical source and dest addresses. We
shouldn't prevent splitting the alloca in that case, and in fact we
should just remove such memcpy calls eagerly. I'll address that in
a subsequent commit.

llvm-svn: 164669
2012-09-26 07:41:40 +00:00
Craig Topper 2a6a08b1cd Rename virtual table anchors from Anchor() to anchor() for consistency with the rest of the tree.
llvm-svn: 164666
2012-09-26 06:36:36 +00:00
Michael Ilseman a398d4cfaf Expansions for u/srem, using the udiv expansion. More unit tests for udiv and u/srem.
Fixed issue with Release build.

llvm-svn: 164654
2012-09-26 01:55:01 +00:00
Nick Lewycky d9f7910671 Don't drop the alignment on a memcpy intrinsic when producing a store. This is
only a missed optimization opportunity if the store is over-aligned, but a
miscompile if the store's new type has a higher natural alignment than the
memcpy did. Fixes PR13920!

llvm-svn: 164641
2012-09-25 22:46:21 +00:00
Nick Lewycky a0c16aee0a Revert the business end of r164634, and replace it with a different fix. The
reason we were getting two of the same alloca is because of a memmove/memcpy
which had the same alloca in both the src and dest. Now we detect that case
directly. This has the same testcase as before, but fixes a clang test
CodeGenObjC/exceptions.m which runs clang -O2.

llvm-svn: 164636
2012-09-25 21:50:37 +00:00
Nick Lewycky 9f19349846 Don't try to promote the same alloca twice. Fixes PR13916!
Chandler, it's not obvious that it's okay that this alloca gets into the list
twice to begin with. Please review and see whether this is the fix you really
want, but I wanted to get a fix checked in quickly.

llvm-svn: 164634
2012-09-25 21:15:50 +00:00
Bill Wendling eb33723ace Move Attribute::typeIncompatible inside of the Attributes class.
llvm-svn: 164629
2012-09-25 20:38:59 +00:00
Chad Rosier 88387f5332 Revert r164614 to appease the buildbots.
llvm-svn: 164627
2012-09-25 19:57:20 +00:00
Michael Ilseman 506150a071 Expansions for u/srem, using the udiv expansion. More unit tests for udiv and u/srem.
llvm-svn: 164614
2012-09-25 17:56:47 +00:00
Chandler Carruth 8b907e8acb Fix a case where SROA did not correctly detect dead PHI or selects due
to chains or cycles between PHIs and/or selects. Also add a couple of
really nice test cases reduced from Kostya's reports in PR13905 and
PR13906. Both are fixed by this patch.

llvm-svn: 164596
2012-09-25 10:03:40 +00:00
Chandler Carruth 2603a18769 Fix a crash in SROA. This was reported independently by Takumi and
David (I think), but I would appreciate folks verifying that this fixes
the big crasher.

I'm still working on a reduced test case, but because this was causing
problems I wanted to get the fix checked in quickly.

llvm-svn: 164585
2012-09-25 02:42:03 +00:00
Nick Lewycky 42bca056e0 Don't forget that strcpy and friends return a pointer to the destination, so
it's not a dead store if that pointer is used. Whoops!

llvm-svn: 164583
2012-09-25 01:55:59 +00:00
Nick Lewycky 627d217727 Remove unused name of variable to quiet a warning. Also canonicalize a
declaration to use the same form as in the rest of the file. No functionality
change.

llvm-svn: 164576
2012-09-24 23:47:23 +00:00
Nick Lewycky 9f4729d331 Teach DSE that strcpy, strncpy, strcat and strncat are all stores which may be
dead.

llvm-svn: 164561
2012-09-24 22:09:10 +00:00
Nick Lewycky 135ac9ac89 Move all the calls to AA.getTargetLibraryInfo() to using a TLI member variable.
No functionality change.

llvm-svn: 164560
2012-09-24 22:07:09 +00:00
Richard Osborne 2fd29bfb90 Add missing check for presence of target data.
This avoids a crash in visitAllocaInst when target data isn't available.

llvm-svn: 164539
2012-09-24 17:10:03 +00:00
Chandler Carruth 8232bf53c6 Enable the new SROA pass by default.
Queue the fallout. ;]

llvm-svn: 164480
2012-09-24 01:10:25 +00:00
Chandler Carruth 92924fd28f Address one of the original FIXMEs for the new SROA pass by implementing
integer promotion analogous to vector promotion. When there is an
integer alloca being accessed both as its integer type and as a narrower
integer type, promote the narrower access to "insert" and "extract" the
smaller integer from the larger one, and make the integer alloca
a candidate for promotion.

In the new formulation, we don't care about target legal integer or use
thresholds to control things. Instead, we only perform this promotion to
an integer type which the frontend has already emitted a load or store
for. This bounds the scope and prevents optimization passes from
coalescing larger and larger entities into a single integer.

llvm-svn: 164479
2012-09-24 00:34:20 +00:00
Chandler Carruth e7a1ba5e8b Switch to a signed representation for the dynamic offsets while walking
across the uses of the alloca. It's entirely possible for negative
numbers to come up here, and in some rare cases simply doing the 2's
complement arithmetic isn't the correct decision. Notably, we can't zext
the index of the GEP. The definition of GEP is that these offsets are
sign extended or truncated to the size of the pointer, and then wrapping
2's complement arithmetic used.

This patch fixes an issue that comes up with *no* input from the
buildbots or bootstrap afaict. The only place where it manifested,
disturbingly, is Clang's own regression test suite. A reduced and
targeted collection of tests are added to cope with this. Note that I've
tried to pin down the potential cases of overflow, but may have missed
some cases. I've tried to add a few cases to test this, but its hard
because LLVM has quite limited support for >64bit constructs.

llvm-svn: 164475
2012-09-23 11:43:14 +00:00
Chandler Carruth 225d4bdb07 Fix a case where the new SROA pass failed to zap dead operands to
selects with a constant condition. This resulted in the operands
remaining live through the SROA rewriter. Most of the time, this just
caused some dead allocas to persist and get zapped by later passes, but
in one case found by Joerg, it caused a crash when we tried to *promote*
the alloca despite it having this dead use. We already have the
mechanisms in place to handle this, just wire select up to them.

llvm-svn: 164427
2012-09-21 23:36:40 +00:00
Benjamin Kramer eba9aca5cd LoopIdiom: Give up when the loop is not in canonical form.
We rely on it when doing the transforms. This can happen when there is an
indirectbr in  the loop.

Fixes PR13892.

llvm-svn: 164383
2012-09-21 17:27:23 +00:00
Benjamin Kramer efb4d34bcf InstCombine: Make sure we use the pre-zext type when creating a constant of a value that is zext'd.
Fixes PR13250.

llvm-svn: 164377
2012-09-21 16:26:41 +00:00
Manman Ren 93ab64916f SimplifyCFG: sink common codes from IF, ELSE blocks down to END block.
We already have HoistThenElseCodeToIf, this patch implements
SinkThenElseCodeToEnd. When END block has only two predecessors and each
predecessor terminates with unconditional branches, we compare instructions in
IF and ELSE blocks backwards and check whether we can sink the common
instructions down.

rdar://12191395

llvm-svn: 164325
2012-09-20 22:37:36 +00:00
Michael Ilseman 5117db54ff Renaming functions to match coding style guidelines
llvm-svn: 164238
2012-09-19 18:14:45 +00:00
Michael Ilseman 370a1a1c94 Doxygen-ify comments
llvm-svn: 164235
2012-09-19 16:25:57 +00:00
Michael Ilseman 1db690d15e Put the * and & next to the variable, rather than the type.
llvm-svn: 164232
2012-09-19 16:17:20 +00:00
Hans Wennborg f744fa917d SimplifyCFG: Don't generate invalid code for switch used to initialize
two variables where the first variable is returned and the second
ignored.

I don't think this occurs in practice (other passes should have cleaned
up the unused phi node), but it should still be handled correctly.

Also make the logic for determining if we should return early less
sketchy.

llvm-svn: 164225
2012-09-19 14:24:21 +00:00
Benjamin Kramer 47196e6cd5 IntegerDivision: Style cleanups, avoid warning about mixing || and && without parens.
llvm-svn: 164216
2012-09-19 13:03:07 +00:00
Hans Wennborg 02fbc71647 CodeGenPrep: turn lookup tables into switches for some targets.
This is a follow-up from r163302, which added a transformation to
SimplifyCFG that turns some switches into loads from lookup tables.

It was pointed out that some targets, such as GPUs and deeply embedded
targets, might not find this appropriate, but SimplifyCFG doesn't have
enough information about the target to decide this.

This patch adds the reverse transformation to CodeGenPrep: it turns
loads from lookup tables back into switches for targets where we do not
build jump tables (assuming these are also the targets where lookup
tables are inappropriate).

Hopefully we will eventually get to have target information in
SimplifyCFG, and then this CodeGenPrep transformation can be removed.

llvm-svn: 164206
2012-09-19 07:48:16 +00:00
Chandler Carruth 3f882d4cf5 Fix the last crasher I've gotten a reproduction for in SROA. This one
from the dragonegg build bots when we turned on the full version of the
pass. Included a much reduced test case for this pesky bug, despite
bugpoint's uncooperative behavior.

Also, I audited all the similar code I could find and didn't spot any
other cases where this mistake cropped up.

llvm-svn: 164178
2012-09-18 22:37:19 +00:00
Michael Ilseman 52059da858 New utility for expanding integer division for targets that don't support it.
Implementation derived from compiler-rt's implementation of signed and unsigned integer division.

llvm-svn: 164173
2012-09-18 22:02:40 +00:00
Andrew Trick 402edbbe39 LSR critical edge splitting fix for PR13756.
llvm-svn: 164147
2012-09-18 17:51:33 +00:00
Chandler Carruth d356fd02a9 Fix getCommonType in a different way from the way I fixed it when
working on FCA splitting. Instead of refusing to form a common type when
there are uses of a subsection of the alloca as well as a use of the
entire alloca, just skip the subsection uses and continue looking for
a whole-alloca use with a type that we can use.

This produces slightly prettier IR I think, and also fixes the other
failure in the test.

llvm-svn: 164146
2012-09-18 17:49:37 +00:00
Benjamin Kramer a59ef5795d Fix build for compilers that don't understand injected class names properly.
llvm-svn: 164142
2012-09-18 17:11:47 +00:00
Benjamin Kramer 73a9e4a1f9 SROA: Use CRTP for OpSplitter to get rid of virtual dispatch and the virtual-dtor warnings that come with it.
llvm-svn: 164140
2012-09-18 17:06:32 +00:00
Benjamin Kramer 65f8c88242 SROA: Replace the member function template contraption for recursively splitting aggregates into a real class.
No intended functionality change.

llvm-svn: 164135
2012-09-18 16:20:46 +00:00
NAKAMURA Takumi eb2c8f0fc6 SROA.cpp: Appease msvc.
...I don't know why this could appease msvc...baad.

llvm-svn: 164130
2012-09-18 15:29:02 +00:00
Benjamin Kramer 9bc3efc81c LNT builders have picked up new SROA, disable it to get the remaining builders green again.
llvm-svn: 164124
2012-09-18 13:43:00 +00:00
Chandler Carruth a34f3567e0 Fix a warning in release builds and a test case I forgot to update with
a fix to getCommonType in the previous patch.

llvm-svn: 164120
2012-09-18 13:02:06 +00:00
Chandler Carruth 42cb9cb14f Add a major missing piece to the new SROA pass: aggressive splitting of
FCAs. This is essential in order to promote allocas that are used in
struct returns by frontends like Clang. The FCA load would block the
rest of the pass from firing, resulting is significant regressions with
the bullet benchmark in the nightly test suite.

Thanks to Duncan for repeated discussions about how best to do this, and
to both him and Benjamin for review.

This appears to have blocked many places where the pass tries to fire,
and so I'm expect somewhat different results with this fix added.

As with the last big patch, I'm including a change to enable the SROA by
default *temporarily*. Ben is going to remove this as soon as the LNT
bots pick up the patch. I'm just trying to get a round of LNT numbers
from the stable machines in the lab.

NOTE: Four clang tests are expected to fail in the brief window where
this is enabled. Sorry for the noise!

llvm-svn: 164119
2012-09-18 12:57:43 +00:00
Richard Osborne b68053e266 Fix instcombine to obey requested alignment when merging allocas.
llvm-svn: 164117
2012-09-18 09:31:44 +00:00
Craig Topper b1d83e8c72 Mark unimplemented copy constructors and copy assignment operators as LLVM_DELETED_FUNCTION.
llvm-svn: 164090
2012-09-18 02:01:41 +00:00
Manman Ren 5657555357 PGO: preserve branch-weight metadata when simplifying Switch to a sub, an icmp
and a conditional branch; also when removing dead cases from a switch.

llvm-svn: 164084
2012-09-18 00:47:33 +00:00
Manman Ren ce48ea7e25 PGO: preserve branch-weight metadata when simplifying Switch
Hanlde the case when we split the default edge if the default target has "icmp"
and unconditinal branch.

llvm-svn: 164076
2012-09-17 23:07:43 +00:00
Manman Ren 774246a3a9 PGO: preserve branch-weight metadata when simplifying SwitchOnSelect.
llvm-svn: 164068
2012-09-17 22:28:55 +00:00
Manman Ren 2d4c10fc49 PGO: preserve branch-weight metadata when simplifying two branches with a common
destination in SimplifyCondBranchToCondBranch.

llvm-svn: 164054
2012-09-17 21:30:40 +00:00
Bill Wendling 636f1a1d99 s/__llvm_gcov_flush/__gcov_flush/g
llvm-svn: 164040
2012-09-17 17:57:05 +00:00
Benjamin Kramer 02a4dff492 NewSROA: Provide a full set of operator< for ByteRanges.
MSVC8 won't compile lower_bound if one is missing.

llvm-svn: 164035
2012-09-17 16:42:36 +00:00
Axel Naumann 4a1270691e Fix a few vars that can end up being used without initialization.
The cases where no initialization happens should still be checked for logic flaws.

llvm-svn: 164032
2012-09-17 14:20:57 +00:00
Chandler Carruth 9712117a07 Refactor the SROA visitors for partitioning an alloca and building
partition use lists a bit. No functionality changed.

These visitors are actually visiting a tuple of a Use and an offset into
the alloca. However, we use the InstVisitor to handle the dispatch over
the users, and so the Use and Offset are stored in class member
variables and set just before each call to visit(). This is fairly
awkward and makes the functions a bit harder to read, but its the only
real option we have until InstVisitor can be rewritten to use variadic
templates.

However, this pattern shouldn't be followed on the helper member
functions where there is no interface constraint from the visitor. We
already were passing the instruction as a normal parameter rather than
use the Use to get at it, start passing the offset as well. This will
become more important in subsequent patches as the offset will in some
cases change while visiting a single instruction.

llvm-svn: 164003
2012-09-16 19:39:50 +00:00
Craig Topper a60c0f1163 Use LLVM_DELETED_FUNCTION in place of 'DO NOT IMPLEMENT' comments.
llvm-svn: 163974
2012-09-15 17:09:36 +00:00
Benjamin Kramer ed11e35e57 Disable new sroa now that all buildbots have tested it.
What we have so far:
- Some clang test failures (these were known already)

- Perf results are mixed, some big regressions
  http://llvm.org/perf/db_default/v4/nts/3844
  http://llvm.org/perf/db_default/v4/nts/3845

  bullet suffers a lot. matmul is interesting: slower scalar code, faster with -vectorize.

- Some dragonegg selfhost bots crash in SROA during selfhost now
  http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.6-self-host-checks/builds/1632
  http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.5-self-host/builds/1891

llvm-svn: 163968
2012-09-15 15:11:10 +00:00
Chandler Carruth 70b44c5ccf Port the SSAUpdater-based promotion logic from the old SROA pass to the
new one, and add support for running the new pass in that mode and in
that slot of the pass manager. With this the new pass can completely
replace the old one within the pipeline.

The strategy for enabling or disabling the SSAUpdater logic is to do it
by making the requirement of the domtree analysis optional. By default,
it is required and we get the standard mem2reg approach. This is usually
the desired strategy when run in stand-alone situations. Within the
CGSCC pass manager, we disable requiring of the domtree analysis and
consequentially trigger fallback to the SSAUpdater promotion.

In theory this would allow the pass to re-use a domtree if one happened
to be available even when run in a mode that doesn't require it. In
practice, it lets us have a single pass rather than two which was
simpler for me to wrap my head around.

There is a hidden flag to force the use of the SSAUpdater code path for
the purpose of testing. The primary testing strategy is just to run the
existing tests through that path. One notable difference is that it has
custom code to handle lifetime markers, and one of the tests has been
enhanced to exercise that code.

This has survived a bootstrap and the test suite without serious
correctness issues, however my run of the test suite produced *very*
alarming performance numbers. I don't entirely understand or trust them
though, so more investigation is on-going.

To aid my understanding of the performance impact of the new SROA now
that it runs throughout the optimization pipeline, I'm enabling it by
default in this commit, and will disable it again once the LNT bots have
picked up one iteration with it. I want to get those bots (which are
much more stable) to evaluate the impact of the change before I jump to
any conclusions.

NOTE: Several Clang tests will fail because they run -O3 and check the
result's order of output. They'll go back to passing once I disable it
again.

llvm-svn: 163965
2012-09-15 11:43:14 +00:00
Manman Ren bfb9d435e4 PGO: preserve branch-weight metadata when simplifying two branches with a common
destination.

Updated previous implementation to fix a case not covered:
// PBI: br i1 %x, TrueDest, BB
// BI:  br i1 %y, TrueDest, FalseDest
The other case was handled correctly.
// PBI: br i1 %x, BB, FalseDest
// BI:  br i1 %y, TrueDest, FalseDest

Also tried to use 64-bit arithmetic instead of APInt with scale to simplify the
computation. Let me know if you have other opinions about this.

llvm-svn: 163954
2012-09-15 00:39:57 +00:00
Bill Wendling 8d26bc38f5 Remove comment.
llvm-svn: 163945
2012-09-14 22:35:49 +00:00
Manman Ren 8691e5220b PGO: preserve branch-weight metadata when simplifying a switch with a single
case to a conditional branch and when removing dead cases.

llvm-svn: 163942
2012-09-14 21:53:06 +00:00
Evan Cheng 71be12b35b Stylistic and 80-col fixes
llvm-svn: 163940
2012-09-14 21:25:34 +00:00
Alex Rosenberg af2808cb72 Review feedback from Duncan Sands. Alphabetize includes and simplify
lit config.

llvm-svn: 163928
2012-09-14 19:19:57 +00:00
Manman Ren 5e5049d9a6 Try to fix the bots by detecting inconsistant branch-weight metadata.
llvm-svn: 163926
2012-09-14 19:05:19 +00:00
Manman Ren d81b8e88e3 PGO: preserve branch-weight metadata when merging two switches where
the default target of the first switch is not the basic block the second switch
is in (PredDefault != BB).

llvm-svn: 163916
2012-09-14 17:29:56 +00:00
Dmitri Gribenko 5485acd440 Fix Doxygen issues:
* wrap code blocks in \code ... \endcode;
* refer to parameter names in paragraphs correctly (\arg is not what most
  people want -- it starts a new paragraph);
* use \param instead of \arg to document parameters in order to be consistent
  with the rest of the codebase.

llvm-svn: 163902
2012-09-14 14:57:36 +00:00
Benjamin Kramer 4622cd7edd SROA: Silence unused variable warnings in Release builds.
The NDEBUG hack is ugly, but I see no better solution.

llvm-svn: 163900
2012-09-14 13:08:09 +00:00
Chandler Carruth 054a40a4ff Rework the computation of a sub-structure natural type. There were
pointless checks in here, bad asserts, and just confusing code. I've
also added a bit more to the comment to clarify what this function is
really trying to do as it was not obvious to Duncan when studying it.

Thanks to Duncan for helping me dig through the issue.

No real functionality changed here in practical cases, and certainly no
test case. This is just cleanup spotted by inspection.

llvm-svn: 163897
2012-09-14 11:08:31 +00:00
Chandler Carruth 0cc59250d5 Rely on the recursive check for pointer types rather than adding an
explicit check before recursing. A simplification requested by Duncan
during review.

llvm-svn: 163896
2012-09-14 10:30:44 +00:00
Chandler Carruth cabd96cbaa Be a bit more aggressive in bailing out of this routine. Spotted by
inspection by Duncan during review. My suspicion is that we would still
have returned 0 anyways in this case, but doing it sooner is better.

llvm-svn: 163895
2012-09-14 10:30:42 +00:00
Chandler Carruth dd3cea898f Add some comments clarifying that the GEP analysis for vector GEPs is
deeply suspicious and likely to go away eventually. Also fix a bogus
comment about one of the checks in the vector GEP analysis. Based on
review from Duncan.

llvm-svn: 163894
2012-09-14 10:30:40 +00:00
Chandler Carruth 19450da9e6 Move an instance variable to a local variable based on review by Duncan.
Originally I had anticipated needing to thread this through more bits of
the SROA pass itself, but that ended up not happening. In the end, this
is a much simpler way to manange the variable.

llvm-svn: 163893
2012-09-14 10:26:38 +00:00
Chandler Carruth 4b40e008bd Add a comment about debug intrinsics that I *really* don't want to
forget from Duncan's review as a FIXME.

llvm-svn: 163892
2012-09-14 10:26:36 +00:00
Chandler Carruth b0de6ddbe0 Add two asserts that Duncan thought would help ensure things don't rot
unexpectedly in the future. More fixes from his code review.

llvm-svn: 163891
2012-09-14 10:26:34 +00:00
Chandler Carruth 6ba9824c2b Actually keep the flag default-off for now. =/ That's what I get for
being busy testing this...

llvm-svn: 163890
2012-09-14 10:18:54 +00:00
Chandler Carruth 796de48459 Remove some dead, commented out code Duncan spotted in review.
llvm-svn: 163889
2012-09-14 10:18:53 +00:00
Chandler Carruth 25fb23d687 Wrap the dumping and printing routines in NDEBUG and LLVM_ENABLE_DUMP macros.
llvm-svn: 163888
2012-09-14 10:18:51 +00:00
Chandler Carruth 93a21e7aaf Lots of comment fixes and cleanups from Duncan's review.
llvm-svn: 163887
2012-09-14 10:18:49 +00:00
NAKAMURA Takumi 4bbca0bb6c SROA.cpp: Unbreak gcc, sorry!
llvm-svn: 163886
2012-09-14 10:06:10 +00:00
NAKAMURA Takumi f4619d169d SROA.cpp: Appease msvc. LLVM_ATTRIBUTE(s) should come front of "const".
llvm-svn: 163885
2012-09-14 09:55:22 +00:00
Chandler Carruth 9a447db9fc Speculative change to try to fix older GCC versions that can't handle
the injected class name of a dependent base class here.

llvm-svn: 163884
2012-09-14 09:30:33 +00:00
Chandler Carruth 1b398ae0ae Introduce a new SROA implementation.
This is essentially a ground up re-think of the SROA pass in LLVM. It
was initially inspired by a few problems with the existing pass:
- It is subject to the bane of my existence in optimizations: arbitrary
  thresholds.
- It is overly conservative about which constructs can be split and
  promoted.
- The vector value replacement aspect is separated from the splitting
  logic, missing many opportunities where splitting and vector value
  formation can work together.
- The splitting is entirely based around the underlying type of the
  alloca, despite this type often having little to do with the reality
  of how that memory is used. This is especially prevelant with unions
  and base classes where we tail-pack derived members.
- When splitting fails (often due to the thresholds), the vector value
  replacement (again because it is separate) can kick in for
  preposterous cases where we simply should have split the value. This
  results in forming i1024 and i2048 integer "bit vectors" that
  tremendously slow down subsequnet IR optimizations (due to large
  APInts) and impede the backend's lowering.

The new design takes an approach that fundamentally is not susceptible
to many of these problems. It is the result of a discusison between
myself and Duncan Sands over IRC about how to premptively avoid these
types of problems and how to do SROA in a more principled way. Since
then, it has evolved and grown, but this remains an important aspect: it
fixes real world problems with the SROA process today.

First, the transform of SROA actually has little to do with replacement.
It has more to do with splitting. The goal is to take an aggregate
alloca and form a composition of scalar allocas which can replace it and
will be most suitable to the eventual replacement by scalar SSA values.
The actual replacement is performed by mem2reg (and in the future
SSAUpdater).

The splitting is divided into four phases. The first phase is an
analysis of the uses of the alloca. This phase recursively walks uses,
building up a dense datastructure representing the ranges of the
alloca's memory actually used and checking for uses which inhibit any
aspects of the transform such as the escape of a pointer.

Once we have a mapping of the ranges of the alloca used by individual
operations, we compute a partitioning of the used ranges. Some uses are
inherently splittable (such as memcpy and memset), while scalar uses are
not splittable. The goal is to build a partitioning that has the minimum
number of splits while placing each unsplittable use in its own
partition. Overlapping unsplittable uses belong to the same partition.
This is the target split of the aggregate alloca, and it maximizes the
number of scalar accesses which become accesses to their own alloca and
candidates for promotion.

Third, we re-walk the uses of the alloca and assign each specific memory
access to all the partitions touched so that we have dense use-lists for
each partition.

Finally, we build a new, smaller alloca for each partition and rewrite
each use of that partition to use the new alloca. During this phase the
pass will also work very hard to transform uses of an alloca into a form
suitable for promotion, including forming vector operations, speculating
loads throguh PHI nodes and selects, etc.

After splitting is complete, each newly refined alloca that is
a candidate for promotion to a scalar SSA value is run through mem2reg.

There are lots of reasonably detailed comments in the source code about
the design and algorithms, and I'm going to be trying to improve them in
subsequent commits to ensure this is well documented, as the new pass is
in many ways more complex than the old one.

Some of this is still a WIP, but the current state is reasonbly stable.
It has passed bootstrap, the nightly test suite, and Duncan has run it
successfully through the ACATS and DragonEgg test suites. That said, it
remains behind a default-off flag until the last few pieces are in
place, and full testing can be done.

Specific areas I'm looking at next:
- Improved comments and some code cleanup from reviews.
- SSAUpdater and enabling this pass inside the CGSCC pass manager.
- Some datastructure tuning and compile-time measurements.
- More aggressive FCA splitting and vector formation.

Many thanks to Duncan Sands for the thorough final review, as well as
Benjamin Kramer for lots of review during the process of writing this
pass, and Daniel Berlin for reviewing the data structures and algorithms
and general theory of the pass. Also, several other people on IRC, over
lunch tables, etc for lots of feedback and advice.

llvm-svn: 163883
2012-09-14 09:22:59 +00:00
Dan Gohman 3f553c21eb Handle the new !tbaa.struct metadata tags when converting a memcpy into scalar
loads and stores.

llvm-svn: 163844
2012-09-13 21:51:01 +00:00
Dan Gohman d0080c45f9 Extract code for reducing a type to a single value type into a helper function.
llvm-svn: 163817
2012-09-13 18:19:06 +00:00
Benjamin Kramer 15a257dadd MemCpyOpt: When forming a memset from stores also take GEP constexprs into account.
This is common when storing to global variables.

llvm-svn: 163809
2012-09-13 16:29:49 +00:00
Nadav Rotem 97d44349c9 Fix an 80 char line limit.
llvm-svn: 163808
2012-09-13 16:27:32 +00:00
Bill Wendling fb1f6681a3 Use Nick's suggestion of storing a large NULL into the GV instead of memset, which requires TargetData.
llvm-svn: 163799
2012-09-13 14:32:30 +00:00
Dmitri Gribenko 2bc1d483fe Fix Doxygen issues:
* wrap code blocks in \code ... \endcode;
* refer to parameter names in paragraphs correctly (\arg is not what most
  people want -- it starts a new paragraph).

llvm-svn: 163790
2012-09-13 12:34:29 +00:00
Bill Wendling 2e6e86606a Introduce the __llvm_gcov_flush function.
This function writes out the current values of the counters and then resets
them. This can be used similarly to the __gcov_flush function to sync the
counters when need be. For instance, in a situation where the application
doesn't exit.
<rdar://problem/12185886>

llvm-svn: 163757
2012-09-13 00:09:55 +00:00
Dan Gohman 7c84dad80a Detect overflow in the path count computation. rdar://12277446.
llvm-svn: 163739
2012-09-12 20:45:17 +00:00
Manman Ren 49dbe255e6 PGO: preserve branch-weight metadata when removing a case which jumps
to the default target.

llvm-svn: 163724
2012-09-12 17:04:11 +00:00
Manman Ren 49d684e1e2 Release build: guard dump functions with
"#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"

No functional change. Update r163344.

llvm-svn: 163679
2012-09-12 05:06:18 +00:00
Manman Ren 571d9e4b80 SimplifyCFG: preserve branch-weight metadata when creating a new switch from
a pair of switch/branch where both depend on the value of the same variable and
the default case of the first switch/branch goes to the second switch/branch.

Code clean up and fixed a few issues:
1> handling the case where some cases of the 2nd switch are invalidated
2> correctly calculate the weight for the 2nd switch when it is a conditional eq

Testing case is modified from Alastair's original patch.

llvm-svn: 163635
2012-09-11 17:43:35 +00:00
NAKAMURA Takumi 7419c5fb45 llvm/lib/Transforms/Utils/CMakeLists.txt: Update.
llvm-svn: 163593
2012-09-11 02:55:37 +00:00
Alex Rosenberg 04b43aab43 Add a pass that renames everything with metasyntatic names. This works well after using bugpoint to reduce the confusion presented by the original names, which no longer mean what they used to.
llvm-svn: 163592
2012-09-11 02:46:18 +00:00
Benjamin Kramer 1f66f885e8 Move bypassSlowDivision into the llvm namespace.
llvm-svn: 163503
2012-09-10 11:52:08 +00:00
Hans Wennborg 7fd5c844af Fix style issues from r163302 pointed out by Evan.
llvm-svn: 163491
2012-09-10 07:44:22 +00:00
Nick Lewycky 12d825d9ca Move spaces to the right places. No functionality change.
llvm-svn: 163485
2012-09-09 23:41:11 +00:00
Benjamin Kramer 2b11eb0729 DSE: Poking holes into a SetVector is expensive, avoid it if possible.
llvm-svn: 163480
2012-09-09 16:44:05 +00:00
Andrew Trick d3b4d2cb76 Remove an incorrect assert during branch weight propagation.
Patch and test case by Alastair Murray!

llvm-svn: 163437
2012-09-08 00:07:26 +00:00
Hans Wennborg 08238adbbb SimplifyCFG: ValidLookupTableConstant should be static
llvm-svn: 163378
2012-09-07 08:22:57 +00:00
Manman Ren c3366ccecb Release build: guard dump functions with "ifndef NDEBUG"
No functional change.

llvm-svn: 163344
2012-09-06 19:55:56 +00:00
Hans Wennborg feb4d07d88 Fix switch_to_lookup_table.ll test from r163302.
The lookup tables did not get built in a deterministic order.
This makes them get built in the order that the corresponding phi nodes
were found.

llvm-svn: 163305
2012-09-06 10:10:35 +00:00
Hans Wennborg 8a62fc5294 Build lookup tables for switches (PR884)
This adds a transformation to SimplifyCFG that attemps to turn switch
instructions into loads from lookup tables. It works on switches that
are only used to initialize one or more phi nodes in a common successor
basic block, for example:

  int f(int x) {
    switch (x) {
    case 0: return 5;
    case 1: return 4;
    case 2: return -2;
    case 5: return 7;
    case 6: return 9;
    default: return 42;
  }

This speeds up the code by removing the hard-to-predict jump, and
reduces code size by removing the code for the jump targets.

llvm-svn: 163302
2012-09-06 09:43:28 +00:00
Jim Grosbach 30c4282f88 Update function names to conform to guidelines.
No functional change.

llvm-svn: 163279
2012-09-06 00:59:08 +00:00
Roman Divacky ad06cee239 Stop casting away const qualifier needlessly.
llvm-svn: 163258
2012-09-05 22:26:57 +00:00
Kostya Serebryany 5f5973df08 [asan] fix lint
llvm-svn: 163205
2012-09-05 09:00:18 +00:00
Kostya Serebryany 2fa38f8ce0 [asan] extend the blacklist functionality to handle global-init. Patch by Reid Watson
llvm-svn: 163199
2012-09-05 07:29:56 +00:00
Dan Gohman df476e5e93 Make provenance checking conservative in cases when
pointers-to-strong-pointers may be in play. These can lead to retains and
releases happening in unstructured ways, foiling the optimizer. This fixes
rdar://12150909.

llvm-svn: 163180
2012-09-04 23:16:20 +00:00
Jakub Staszak e535c1a12e BypassSlowDivision: Assign to reference, don't copy the object.
llvm-svn: 163179
2012-09-04 23:11:11 +00:00
Jakub Staszak 85a7787588 Fix my previous patch (r163164). It does now what it is supposed to do:
Doesn't set MadeChange to TRUE if BypassSlowDivision doesn't change anything.

llvm-svn: 163165
2012-09-04 21:16:59 +00:00
Jakub Staszak 46beca6364 Return false if BypassSlowDivision doesn't change anything.
Also a few minor changes:
- use pre-inc instead of post-inc
- use isa instead of dyn_cast
- 80 col
- trailing spaces

llvm-svn: 163164
2012-09-04 20:48:24 +00:00
Preston Gurd cdf540d5d6 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
Nadav Rotem 03dcd85b56 LICM may hoist an instruction with undefined behavior above a trap.
Scan the body of the loop and find instructions that may trap.
Use this information when deciding if it is safe to hoist or sink instructions.
Notice that we can optimize the search of instructions that may throw in the case of nested loops.

rdar://11518836

llvm-svn: 163132
2012-09-04 10:25:04 +00:00
Nadav Rotem 9d83202620 Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.

llvm-svn: 163093
2012-09-02 12:10:19 +00:00
Benjamin Kramer 599a4bb6ea LoopRotation: Make the brute force DomTree update more brute force.
We update until we hit a fixpoint. This is probably slow but also
slightly simplifies the code. It should also fix the occasional
invalid domtrees observed when building with expensive checking.

I couldn't find a case where this had a measurable slowdown, but
if someone finds a pathological case where it does we may have
to find a cleverer way of updating dominators here.

Thanks to Duncan for the test case.

llvm-svn: 163091
2012-09-02 11:57:22 +00:00
Logan Chien 9ab55b8d59 Rename ANDROIDEABI to Android.
Most of the code guarded with ANDROIDEABI are not
ARM-specific, and having no relation with arm-eabi.
Thus, it will be more natural to call this
environment "Android" instead of "ANDROIDEABI".

Note: We are not using ANDROID because several projects
are using "-DANDROID" as the conditional compilation
flag.

llvm-svn: 163087
2012-09-02 09:29:46 +00:00
Benjamin Kramer 3be6a480a4 LoopRotation: Check some invariants of the dominator updating code.
llvm-svn: 163058
2012-09-01 12:04:51 +00:00
Michael Ilseman 30c3e14e8e test
llvm-svn: 162914
2012-08-30 15:45:16 +00:00
Benjamin Kramer afdfdb5cff LoopRotate: Also rotate loops with multiple exits.
The old PHI updating code in loop-rotate was replaced with SSAUpdater a while
ago, it has no problems with comples PHIs. What had to be fixed is detecting
whether a loop was already rotated and updating dominators when multiple exits
were present.

This change increases overall code size a bit, mostly due to additional loop
unrolling opportunities. Passes test-suite and selfhost with -verify-dom-info.
Fixes PR7447.

Thanks to Andy for the input on the domtree updating code.

llvm-svn: 162912
2012-08-30 15:39:42 +00:00
Benjamin Kramer d4a64716ab InstCombine: Fix comment to reflect the code.
llvm-svn: 162911
2012-08-30 15:07:40 +00:00
Alexey Samsonov f54e3aaeaa Whitespace
llvm-svn: 162907
2012-08-30 13:47:13 +00:00
Nadav Rotem d5f5777b77 It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)),
because C always rounds towards zero.

Thanks Dirk and Ben.

llvm-svn: 162899
2012-08-30 11:23:20 +00:00
Bill Wendling 14c8a051ca Pass by pointer and not std::string.
llvm-svn: 162888
2012-08-30 01:32:31 +00:00
Bill Wendling 1f6f8c2cb7 Revert r162855 in favor of changing clang to emit the absolute coverage file path.
llvm-svn: 162883
2012-08-30 00:34:21 +00:00
Andrew Trick 3051aa1cb8 Preserve branch profile metadata during switch formation.
Patch by Michael Ilseman!
This fixes SimplifyCFGOpt::FoldValueComparisonIntoPredecessors to preserve metata when folding conditional branches into switches.

void foo(int x) {
  if (x == 0)
    bar(1);
  else if (__builtin_expect(x == 10, 1))
    bar(2);
  else if (x == 20)
    bar(3);
}

CFG:

B0
|  \
|   X0
B10
|  \
|   X10
B20
|  \
E   X20

Merge B0-B10:
w(B0-X0) = w(B0-X0)*sum-weights(B10) = w(B0-X0) * (w(B10-X10) + w(B10-B20))
w(B0-X10) = w(B0-B10) * w(B10-X10)
w(B0-B20) = w(B0-B10) * w(B10-B20)

B0 __
| \  \
| X10 X0
B20
|  \
E  X20

Merge B0-B20:
w(B0-X0) = w(B0-X0) * sum-weights(B20) = w(B0-X0) * (w(B20-E) + w(B20-X20))
w(B0-X10) = w(B0-X10) * sum-weights(B20) = ...
w(B0-X20) = w(B0-B20) * w(B20-X20)
w(B0-E) = w(B0-B20) * w(B20-E)

llvm-svn: 162868
2012-08-29 21:46:38 +00:00
Andrew Trick f3cf1932b3 whitespace
llvm-svn: 162867
2012-08-29 21:46:36 +00:00
Bill Wendling 11e61b9557 Use the full path to output the .gcda file.
This lets the user run the program from a different directory and still have the
.gcda files show up in the correct place.
<rdar://problem/12179524>

llvm-svn: 162855
2012-08-29 20:30:44 +00:00
Bill Wendling e8aee6b8a5 Use ArrayRef instead of SmallVector when passing vector into function.
llvm-svn: 162851
2012-08-29 18:45:41 +00:00
Benjamin Kramer 8bcc971174 Make MemoryBuiltins aware of TargetLibraryInfo.
This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.

Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.

Fixes PR13694 and probably others.

llvm-svn: 162841
2012-08-29 15:32:21 +00:00
Benjamin Kramer 1e1a1dedc6 InstCombine: Defensively avoid undefined shifts by limiting the amount to the bit width.
No test case, undefined shifts get folded early, but can occur when other
transforms generate a constant. Thanks to Duncan for bringing this up.

llvm-svn: 162755
2012-08-28 13:59:23 +00:00
Benjamin Kramer 9c0a807c27 InstCombine: Guard the transform introduced in r162743 against large ints and non-const shifts.
llvm-svn: 162751
2012-08-28 13:08:13 +00:00
Nadav Rotem d457787fed Make sure that we don't call getZExtValue on values > 64 bits.
Thanks Benjamin for noticing this.

llvm-svn: 162749
2012-08-28 12:23:22 +00:00
Nadav Rotem 11935b29f3 Teach InstCombine to canonicalize [SU]div+[AL]shl patterns.
For example:
  %1 = lshr i32 %x, 2
  %2 = udiv i32 %1, 100

rdar://12182093

llvm-svn: 162743
2012-08-28 10:01:43 +00:00
Dan Gohman 10c82cee04 Don't use for loops for code that is only intended to execute once. No
intended functionality change. Thanks to Ahmed Charles for spotting it.

llvm-svn: 162686
2012-08-27 18:31:36 +00:00
Kostya Serebryany 4cc511daf0 [asan/tsan] rename FunctionBlackList* to BlackList* as this class is not limited to functions any more
llvm-svn: 162566
2012-08-24 16:44:47 +00:00
Kostya Serebryany 36dfc5ceab [asan/tsan] extend the functionality of FunctionBlackList to globals and modules. Patch by Reid Watson.
llvm-svn: 162565
2012-08-24 16:40:11 +00:00
Benjamin Kramer dd62d6b6c8 GVN: Fix quadratic runtime on the number of switch cases.
No intended behavior change.  This was introduced in r162023.  With the fixed
algorithm a Release build of ARMInstPrinter.cpp goes from 16s to 10s on a
2011 MBP.

llvm-svn: 162559
2012-08-24 15:06:28 +00:00
Benjamin Kramer e07728b936 SimplifyLibCalls: Give all safely-shrinkable libcalls the same treatment.
llvm-svn: 162383
2012-08-22 19:39:15 +00:00
Chad Rosier 0122909d95 Add a few float shrinking optimizations to SimplifyLibCalls. Unsafe
optimizations are guarded by the -enable-double-float-shrink LLVM option.
Last bit of PR13574.  Patch by Weiming Zhao <weimingz@codeaurora.org>.

llvm-svn: 162368
2012-08-22 17:22:33 +00:00
Chad Rosier b2f5c1cdbb Add a new helper function, AddOpt(F1, F1, Opt), as part of PR13574. No
functional change intended.  Patch by Weiming Zhao <weimingz@codeaurora.org>.

llvm-svn: 162363
2012-08-22 16:52:57 +00:00
Richard Smith 976f8605e2 MaximumSpanningTree::EdgeWeightCompare: Make this comparator actually be a
strict weak ordering, and don't pass possibly-null pointers to dyn_cast.

llvm-svn: 162314
2012-08-21 21:03:40 +00:00
Richard Smith ad9c8e839e Don't bind a reference to a dereferenced null pointer (for return value of WeakVH::operator*).
llvm-svn: 162309
2012-08-21 20:35:14 +00:00
Chandler Carruth c908ca1766 Port the global copy optimization from the SROA pass to InstCombine.
This optimization is really just replacing allocas wholesale with
globals, there is no scalarization.

The underlying motivation for this patch is to simplify the SROA pass
and focus it on splitting and promoting allocas.

llvm-svn: 162271
2012-08-21 08:39:44 +00:00
Kostya Serebryany f4be019fba [asan] add code to detect global initialization fiasco in C/C++. The sub-pass is off by default for now. Patch by Reid Watson. Note: this patch changes the interface between LLVM and compiler-rt parts of asan. The corresponding patch to compiler-rt will follow.
llvm-svn: 162268
2012-08-21 08:24:25 +00:00
Michael Liao 6e12d12830 revise debug output to avoid dangling pointer
llvm-svn: 162256
2012-08-21 05:55:22 +00:00
Benjamin Kramer 9d03242fcf InstCombine: Fix a crasher when encountering a function pointer.
llvm-svn: 162180
2012-08-18 22:04:34 +00:00
Benjamin Kramer 9282aef86d Remove overly conservative hasOneUse check, this always expands into a single IR instruction.
llvm-svn: 162175
2012-08-18 20:24:19 +00:00
Benjamin Kramer 8c2a733c55 InstCombine: Add a couple of fabs identities for comparing with 0.0.
llvm-svn: 162174
2012-08-18 20:06:47 +00:00
Benjamin Kramer 000132454c SimplifyLibcalls: Add fabs and trunc to the list of libcalls that are safe to shrink from double to float.
llvm-svn: 162173
2012-08-18 19:27:32 +00:00
Richard Smith 257c5f2088 Fix undefined behavior (binding a reference to a dereferenced null pointer) if
SSAUpdater was created and destroyed without being initialized.

llvm-svn: 162137
2012-08-17 21:42:44 +00:00
Rafael Espindola cc80cdebb9 Teach GVN to reason about edges dominating uses. This allows it to handle cases
where some fact lake a=b dominates a use in a phi, but doesn't dominate the
basic block itself.

This feature could also be implemented by splitting critical edges, but at least
with the current algorithm reasoning about the dominance directly is faster.

The time for running "opt -O2" in the testcase in pr10584 is 1.003 times slower
and on gcc as a single file it is 1.0007 times faster.

llvm-svn: 162023
2012-08-16 15:09:43 +00:00
Bill Wendling 4d5150d978 Remove dead flag.
llvm-svn: 161990
2012-08-15 21:18:10 +00:00
Kostya Serebryany 1e575ab8b2 [asan] implement --asan-always-slow-path, which is a part of the improvement to handle unaligned partially OOB accesses. See http://code.google.com/p/address-sanitizer/issues/detail?id=100
llvm-svn: 161937
2012-08-15 08:58:58 +00:00
Michael Liao 69e172a6f0 fix infinite loop in instcombine with more than 4GB memcpy
- memcpy size is wrongly truncated into 32-bit and treat 8GB memcpy is
  0-sized memcpy
- as 0-sized memcpy/memset is already removed before SimplifyMemTransfer
  and SimplifyMemSet in visitCallInst, replace 0 checking with
  assertions.
- replace getZExtValue() with getLimitedValue() according to
  Eli Friedman

llvm-svn: 161923
2012-08-15 03:49:59 +00:00
Kostya Serebryany fda7a138f7 [asan] insert crash basic blocks inline as opposed to inserting them at the end of the function. This doesn't seem to fix or break anything, but is considered to be more friendly to downstream passes
llvm-svn: 161870
2012-08-14 14:04:51 +00:00
Craig Topper 2a40418a99 Change greater than to greater than or equal so that an identical sized store to the same offset is treated as completing overwriting.
llvm-svn: 161857
2012-08-14 07:32:05 +00:00
Nadav Rotem 70409991bc During the CodeGenPrepare we often lower intrinsics (such as objsize)
and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.

rdar://11973998

llvm-svn: 161852
2012-08-14 05:19:07 +00:00
Nadav Rotem 8d80452076 LICM uses AliasSet information to hoist and sink instructions. However, other passes, such as LoopRotate
may invalidate its AliasSet because SSAUpdater does not update the AliasSet properly.
This patch teaches SSAUpdater to notify AliasSet that it made changes.
The testcase in PR12901 is too big to be useful and I could not reduce it to a normal size. 

rdar://11872059 PR12901

llvm-svn: 161803
2012-08-13 23:06:54 +00:00
Kostya Serebryany 0f7a80d0c3 [asan] remove the code for --asan-merge-callbacks as it appears to be a bad idea. (partly related to Bug 13225)
llvm-svn: 161757
2012-08-13 14:08:46 +00:00
Rafael Espindola 64e7b5703e Constify some basic blocks, no functionality change.
llvm-svn: 161668
2012-08-10 15:55:25 +00:00
Pete Cooper 0deca6be79 Fix crash when when do lto on Bullet. Dynamic GEPs in SROA were incorrectly being applied to all accesses to an alloca, not just the ones which read from the GEP. Thanks to Evan for reducing the test. rdar://11861001
llvm-svn: 161654
2012-08-10 03:26:36 +00:00
Eli Friedman 08ec0a8122 isAllocLikeFn is allowed to return true for functions which read memory; make
sure we account for that correctly in DeadStoreElimination.  Fixes a regression
from r158919.  PR13547.

llvm-svn: 161468
2012-08-08 02:17:32 +00:00
Dan Gohman b948736002 Avoid recomputing the unique exit blocks and their insert points when doing
multiple scalar promotions on a single loop. This also has the effect of
preserving the order of stores sunk out of loops, which is aesthetically
pleasing, and it happens to fix the testcase in PR13542, though it doesn't
fix the underlying problem.

llvm-svn: 161459
2012-08-08 00:00:26 +00:00
Bob Wilson 61f3ad5759 Fix a serious typo in InstCombine's optimization of comparisons.
An unsigned value converted to floating-point will always be greater than
a negative constant.  Unfortunately InstCombine reversed the check so that
unsigned values were being optimized to always be greater than all positive
floating-point constants.  <rdar://problem/12029145>

llvm-svn: 161452
2012-08-07 22:35:16 +00:00
Bill Wendling 8555a37c04 Move the "findUsedStructTypes" functionality outside of the Module class.
The "findUsedStructTypes" method is very expensive to run. It needs to be
optimized so that LTO can run faster. Splitting this method out of the Module
class will help this occur. For instance, it can keep a list of seen objects so
that it doesn't process them over and over again.

llvm-svn: 161228
2012-08-03 00:30:35 +00:00
Nuno Lopes a9a8c62714 remove tabs from my previous commit.
Sorry, not used to this editor anymore.. XCode please come back; you're forgiven :)

llvm-svn: 161120
2012-08-01 17:13:28 +00:00
Nuno Lopes e7220312c2 (hopefuly) fix the remaining cases where null wasnt expected (PR13497).
I'll commit a test to the clang tree.

llvm-svn: 161118
2012-08-01 16:58:51 +00:00
Evan Cheng 249716e8ae Teach CodeGenPrep to look past bitcast when it's duplicating return instruction
into predecessor blocks to enable tail call optimization.

rdar://11958338

llvm-svn: 160894
2012-07-27 21:21:26 +00:00
Nuno Lopes 20c7eb3549 fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst.
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW

llvm-svn: 160874
2012-07-27 18:03:57 +00:00
Pete Cooper abc13af9c6 Simplify demanded bits of select sources where the condition is a constant vector
llvm-svn: 160835
2012-07-26 23:10:24 +00:00
Pete Cooper e807e45bff Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand
llvm-svn: 160823
2012-07-26 22:37:04 +00:00
Nuno Lopes 5940c4a15f do null checks for a few more Emit*() functions.
Thanks Eli for noticing.

llvm-svn: 160787
2012-07-26 17:10:46 +00:00
Duncan Sands 5651452076 Stop reassociate from looking through expressions of arbitrary complexity. This
is a temporary measure until my fix for PR13021 is ready.

llvm-svn: 160778
2012-07-26 09:26:40 +00:00
Nick Lewycky 7d0f110cb3 It's not safe to blindly remove invoke instructions. This happens when we
encounter an invoke of an allocation function. This should fix the dragonegg
bootstrap. Testcase to follow, later.

llvm-svn: 160757
2012-07-25 21:19:40 +00:00
Nuno Lopes f0626f2205 revert r160742: it's breaking CMake build
original commit msg:
MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings

llvm-svn: 160751
2012-07-25 18:49:28 +00:00
Nuno Lopes f0441e04bd MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings
llvm-svn: 160742
2012-07-25 17:29:22 +00:00
Nuno Lopes 7ba5b98720 add EmitStrNLen()
llvm-svn: 160741
2012-07-25 17:18:59 +00:00
Nuno Lopes 89702e94b5 make all Emit*() functions consult the TargetLibraryInfo information before creating a call to a library function.
Update all clients to pass the TLI information around.
Previous draft reviewed by Eli.

llvm-svn: 160733
2012-07-25 16:46:31 +00:00
Nick Lewycky 38be931223 Don't delete one more instruction than we're allowed to. This should fix the
Darwin bootstrap. Testcase exists but isn't fully reduced, I expect to commit
the testcase this evening.

llvm-svn: 160693
2012-07-24 21:33:00 +00:00
Nadav Rotem 465834c85f Clean whitespaces.
llvm-svn: 160668
2012-07-24 10:51:42 +00:00
Nick Lewycky faa9c3b035 Teach globalopt to not nuke all stores to globals. Keep them around of they
might be deliberate "one time" leaks, so that leak checkers can find them.
This is a reapply of r160602 with the fix that this time I'm committing the
code I thought I was committing last time; the I->eraseFromParent() goes
*after* the break out of the loop.

llvm-svn: 160664
2012-07-24 07:21:08 +00:00
Dan Gohman f64ff8ed3a An objc_retain can serve as a may-use for a different pointer.
rdar://11931823.

llvm-svn: 160637
2012-07-23 19:27:31 +00:00
Nadav Rotem 1088811c33 Suppress a warning.
llvm-svn: 160629
2012-07-23 13:44:15 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Chandler Carruth c8acd7c96b Move the initialization of the bounds checking pass. The pass itself
moved earlier. This fixes some layering issues.

llvm-svn: 160611
2012-07-22 05:19:32 +00:00
Nick Lewycky 9669c198ba Revert r160602.
llvm-svn: 160603
2012-07-21 09:03:15 +00:00
Nick Lewycky 72b83e5eaa Teach globalopt to play nice with leak checkers. This is a reapplication of
r160529 that was subsequently reverted. The fix was to not call
GV->eraseFromParent() right before the caller does the same. The existing
testcases already caught this bug if run under valgrind.

llvm-svn: 160602
2012-07-21 08:29:45 +00:00
Nuno Lopes 20ea62527a move the bounds checking pass to the instrumentation folder, where it belongs. I dunno why in the world I dropped it in the Scalar folder in the first place.
No functionality change.

llvm-svn: 160587
2012-07-20 22:39:33 +00:00
Richard Osborne 0ab2b0df82 Fix assertion in jump threading (PR13405).
GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't
true if the block ends in an indirect branch with no successors. Fix this by
bailing out earlier in this case.

llvm-svn: 160546
2012-07-20 10:36:17 +00:00
Kostya Serebryany f02c6069ac [asan] make sure that the crash callbacks do not get merged (Chandler's idea: insert an empty InlineAsm). Change the order in which the new BBs are inserted: the slow path BB is insert between old BBs, the crash BB is inserted at the end. Don't create an empty BB (introduced by recent commits). Update the test. The experimental code that does manual crash callback merge will most likely be deleted later.
llvm-svn: 160544
2012-07-20 09:54:50 +00:00
Nick Lewycky 7707e23429 Revert r160529 due to crashes.
llvm-svn: 160532
2012-07-19 23:59:21 +00:00
Nick Lewycky 0fa6a28141 Don't wipe out global variables that are probably storing pointers to heap
memory. This makes clang play nice with leak checkers.

llvm-svn: 160529
2012-07-19 22:35:28 +00:00
Benjamin Kramer f364a63c3e Replace some explicit compare loops with std::equal.
No functionality change.

llvm-svn: 160501
2012-07-19 10:46:05 +00:00
Bill Wendling ea6397f67b Remove tabs.
llvm-svn: 160477
2012-07-19 00:11:40 +00:00
Andrew Trick 0d07dfcd6f indvars: drive by heuristics fix.
Minor oversight noticed by inspection. Sorry no unit test.

llvm-svn: 160422
2012-07-18 04:35:13 +00:00
Andrew Trick c08726627c indvars: Linear function test replace should avoid reusing undef.
Fixes PR13371: indvars pass incorrectly substitutes 'undef' values.

I do not like this fix. It's needed until/unless the meaning of undef
changes. It attempts to be complete according to the IR spec, but I
don't have much confidence in the implementation given the difficulty
testing undefined behavior. Worse, this invalidates some of my
hard-fought work on indvars and LSR to optimize pointer induction
variables. It results benchmark regressions, which I'll track
internally. On x86_64 no LTO I see:

-3% huffbench
-3% 400.perlbench
-8% fhourstones

My only suggestion for recovering is to change the meaning of
undef. If we could trust an arbitrary instruction to produce a some
real value that can be manipulated (e.g. incremented) according to
non-undef rules, then this case could be easily handled with SCEV.

llvm-svn: 160421
2012-07-18 04:35:10 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Kostya Serebryany 986b8da500 [asan] more code to merge crash callbacks. Doesn't fully work yet, but allows to hold performance experiments
llvm-svn: 160361
2012-07-17 11:04:12 +00:00
Andrew Trick c803706c18 Reapply r160340. LSR: Limit CollectSubexprs.
Speculatively fix crashes by code inspection. Can't reproduce them yet.

llvm-svn: 160344
2012-07-17 05:30:37 +00:00
Andrew Trick e834cb465a Revert "LSR: try not to blow up solving combinatorial problems brute force."
Some units tests crashed on a different platform.

llvm-svn: 160341
2012-07-17 05:05:21 +00:00
Andrew Trick 7cd6d426b3 LSR: try not to blow up solving combinatorial problems brute force.
This places limits on CollectSubexprs to constrains the number of
reassociation possibilities. It limits the recursion depth and skips
over chains of nested recurrences outside the current loop.

Fixes PR13361. Although underlying SCEV behavior is still potentially bad.

llvm-svn: 160340
2012-07-17 05:00:56 +00:00
Nuno Lopes 482fb19fd5 fix PR13339 (remove the predecessor from the unwind BB when removing an invoke)
llvm-svn: 160325
2012-07-16 22:49:40 +00:00
Kostya Serebryany c4ce5dfe2d [asan] a bit more refactoring, addressed some of the style comments from chandlerc, partially implemented crash callback merging (under flag)
llvm-svn: 160290
2012-07-16 17:12:07 +00:00
Kostya Serebryany 874dae6119 [asan] refactor instrumentation to allow merging the crash callbacks (not fully implemented yet, no functionality change except the BB order)
llvm-svn: 160284
2012-07-16 16:15:40 +00:00
Kostya Serebryany 4273bb05d1 [asan] initialize asan error callbacks in runOnModule instead of doing that on-demand
llvm-svn: 160269
2012-07-16 14:09:42 +00:00
Chandler Carruth 8b540ab337 Revert r160254 temporarily.
It turns out that ASan relied on the at-the-end block insertion order to
(purely by happenstance) disable some LLVM optimizations, which in turn
start firing when the ordering is made more "normal". These
optimizations in turn merge many of the instrumentation reporting calls
which breaks the return address based error reporting in ASan.

We're looking at several different options for fixing this.

llvm-svn: 160256
2012-07-16 10:01:02 +00:00
Chandler Carruth 3dd6c81492 Teach AddressSanitizer to create basic blocks in a more natural order.
This is particularly useful to the backend code generators which try to
process things in the incoming function order.

Also, cleanup some uses of IRBuilder to be a bit simpler and more clear.

llvm-svn: 160254
2012-07-16 08:58:53 +00:00
Chandler Carruth 36e2ecf528 Move llvm/Support/TypeBuilder.h -> llvm/TypeBuilder.h. This completes
the move of *Builder classes into the Core library.

No uses of this builder in Clang or DragonEgg I could find.

If there is a desire to have an IR-building-support library that
contains all of these builders, that can be easily added, but currently
it seems likely that these add no real overhead to VMCore.

llvm-svn: 160243
2012-07-15 23:45:24 +00:00
Chandler Carruth ec7ad6561f Move llvm/Support/MDBuilder.h to llvm/MDBuilder.h, to live with
IRBuilder, DIBuilder, etc.

This is the proper layering as MDBuilder can't be used (or implemented)
without the Core Metadata representation.

Patches to Clang and Dragonegg coming up.

llvm-svn: 160237
2012-07-15 23:26:50 +00:00
Andrew Trick 653513b8dd LSR Fix: check SCEV expression safety before expansion.
All SCEV expressions used by LSR formulae must be safe to
expand. i.e. they may not contain UDiv unless we can prove nonzero
denominator.

Fixes PR11356: LSR hoists UDiv.

llvm-svn: 160205
2012-07-13 23:33:10 +00:00
Benjamin Kramer abbfe69356 Make helper functions static.
llvm-svn: 160173
2012-07-13 13:25:15 +00:00
Evan Cheng 493eb32ff4 Instcombine was transforming:
%shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, -1
  %and = and i64 %sub, %shr
  ret i64 %and

to:
  %shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, 2305843009213693951
  %and = and i64 %sub, %shr
  ret i64 %and

The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for negated constant to make sure it is actually reducing the width
of the constant.

rdar://11793464

llvm-svn: 160101
2012-07-12 01:45:35 +00:00
Nuno Lopes 95cc4f3cb5 instcombine: merge the functions that remove dead allocas and dead mallocs/callocs/...
This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :)
In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway

llvm-svn: 159952
2012-07-09 18:38:20 +00:00
Nuno Lopes fa0dffccee teach instcombine to remove allocated buffers even if there are stores, memcpy/memmove/memset, and objectsize users.
This means we can do cheap DSE for heap memory.
Nothing is done if the pointer excapes or has a load.

The churn in the tests is mostly due to objectsize, since we want to make sure we
don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0)

llvm-svn: 159876
2012-07-06 23:09:25 +00:00
Kostya Serebryany e36ae68803 [tsan] fix compile-time falilure found while building Chromium with tsan (tsan issue #3). A unit test will follow separately.
llvm-svn: 159736
2012-07-05 09:07:31 +00:00
Stepan Dyatkovskiy 7ff588f986 Reverted r156659, due to probable performance regressions, DenseMap should be used here:
IntegersSubsetMapping
  - Replaced type of Items field from std::list with std::map. In neares future I'll test it with DenseMap and do the correspond replacement
    if possible.

llvm-svn: 159703
2012-07-04 05:53:05 +00:00
Nuno Lopes 1e8dffdf27 BoundsChecking: optimize out the check for offset < 0 if size is known to be >= 0 (signed).
(LLVM optimizers cannot do this optimization by themselves)

llvm-svn: 159668
2012-07-03 17:30:18 +00:00
Stepan Dyatkovskiy 8b0c97e0dd Part of r159527. Splitted into series of patches and gone with fixed PR13256:
IntegersSubsetMapping
  - Replaced type of Items field from std::list with std::map. In neares future I'll test it with DenseMap and do the correspond replacement
    if possible.

llvm-svn: 159659
2012-07-03 13:46:45 +00:00
Eric Christopher b65acc61a5 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Duncan Sands e8ce94fcd7 GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection.
llvm-svn: 159546
2012-07-02 18:55:39 +00:00
Nuno Lopes d0bcfe4d9d fix the regression I introduced in r159385 (it's necessary to update PHI nodes in unwind BB
llvm-svn: 159534
2012-07-02 16:14:47 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ragnes compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Kostya Serebryany eeaf688c0f [asan] small code simplification
llvm-svn: 159522
2012-07-02 11:42:29 +00:00
Bill Wendling a164735baa Don't reinsert the 'atexit' function if it already exists.
llvm-svn: 159491
2012-06-30 20:21:19 +00:00
Nuno Lopes 7b12b87096 revert r159440. As Duncan pointed out, the test for invoke is not needed at this point
llvm-svn: 159471
2012-06-29 22:10:10 +00:00
Benjamin Kramer 396b3adc10 CodeGenPrepare: Don't crash when TLI is not available.
This happens when codegenprepare is invoked via opt.

llvm-svn: 159457
2012-06-29 19:58:21 +00:00
Duncan Sands 9838286d9e Rework this to clarify where the removal of nodes from the queue is
really happening.  No intended functionality change.

llvm-svn: 159451
2012-06-29 19:03:05 +00:00
Nuno Lopes b37ef71ce1 ignore 'invoke new' in isInstructionTriviallyDead, since most callers are not ready to handle invokes. instcombine will take care of this.
llvm-svn: 159440
2012-06-29 17:37:07 +00:00
Duncan Sands 369c6d270b Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to
the optimizers producing a multiply expression with more multiplications than
the original (!).

llvm-svn: 159426
2012-06-29 13:25:06 +00:00
Chandler Carruth aafe0918bc Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.

I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.

I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.

Thanks to Bill and Eric for giving the green light for this bit of cleanup.

llvm-svn: 159421
2012-06-29 12:38:19 +00:00
Bill Wendling f799efdedc The DIBuilder class is just a wrapper around debug info creation
(a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore
instead.

llvm-svn: 159414
2012-06-29 08:32:07 +00:00
Nuno Lopes b97a4e8bc2 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes 9ac4661afa make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Kostya Serebryany c387ca7bab [asan] set a hard limit on the number of instructions instrumented pear each BB. This is (hopefully temporary) workaround for PR13225
llvm-svn: 159344
2012-06-28 09:34:41 +00:00
Hal Finkel 918ca2b8b7 Precompute SCEV pointer analysis prior to instruction fusion in BBVectorize.
When both a load/store and its address computation are being vectorized, it can
happen that the address-computation vectorization destroys SCEV's ability
to analyize the relative pointer offsets. As a result (like with the aliasing
analysis info), we need to precompute the necessary information prior to
instruction fusing.

This was found during stress testing (running through the test suite with a very
low required chain length); unfortunately, I don't have a small test case.

llvm-svn: 159332
2012-06-28 05:42:45 +00:00
Hal Finkel 0873d73cbf Remove a useless check in BBVectorize.
A shuffle mask will always be a constant, but I did not realize that
when I originally wrote the code.

llvm-svn: 159331
2012-06-28 05:42:43 +00:00
Hal Finkel f2dcb9a9c4 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00
Hal Finkel 74e5225c92 Refactor operation equivalence checking in BBVectorize by extending Instruction::isSameOperationAs.
Maintaining this kind of checking in different places is dangerous, extending
Instruction::isSameOperationAs consolidates this logic into one place. Here
I've added an optional flags parameter and two flags that are important for
vectorization: CompareIgnoringAlignment and CompareUsingScalarTypes.

llvm-svn: 159329
2012-06-28 05:42:26 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Matt Beaumont-Gay a58862310c Revert r159136 due to PR13124.
Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159272
2012-06-27 17:10:33 +00:00
Duncan Sands 514db117bd Some reassociate optimizations create new instructions, which they insert just
before the expression root.  Any existing operators that are changed to use one
of them needs to be moved between it and the expression root, and recursively
for the operators using that one.  When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops.  Fix
this, resolving PR12963.

llvm-svn: 159265
2012-06-27 14:19:00 +00:00
Evan Cheng 319be53a1f Remove a instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
Duncan Sands 8bc764aeca Replacing zero-sized alloca's with a null pointer is too aggressive, instead
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite.  What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size".  It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly.  Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.

llvm-svn: 159202
2012-06-26 13:39:21 +00:00
Nuno Lopes 31b54a5379 revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias
llvm-svn: 159175
2012-06-25 23:26:10 +00:00
Nuno Lopes 75eaa72de9 do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway..
llvm-svn: 159173
2012-06-25 22:55:50 +00:00
Dan Gohman 5f725cd196 Fix the objc_autoreleasedReturnValue optimization code to locate
the call correctly even in the case where it is an invoke. This
fixes rdar://11714057.

llvm-svn: 159157
2012-06-25 19:47:37 +00:00
Nuno Lopes 07594cba7c improve optimization of invoke instructions:
- simplifycfg:  invoke undef/null -> unreachable
 - instcombine:  invoke new  -> invoke expect(0, 0)  (an arbitrary NOOP intrinsic;  only done if the allocated memory is unused, of course)
 - verifier:  allow invoke of intrinsics  (to make the previous step work)

llvm-svn: 159146
2012-06-25 17:11:47 +00:00
Rafael Espindola 540c3d23df If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159136
2012-06-25 14:30:31 +00:00
Eli Bendersky f0ad3606c7 The name (and comment describing) of llvm::GetFirstDebuigLocInBasicBlock no longer represents what the function does. Therefore, the function is removed and its functionality is folded into the only place in the code-base where it was being used.
llvm-svn: 159133
2012-06-25 10:13:14 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Hal Finkel 3099ce9489 Allow controlling vectorization of boolean values separately from other integer types.
These are used as the result of comparisons, and often handled differently from larger integer types.

llvm-svn: 159111
2012-06-24 13:28:01 +00:00
Nick Lewycky 0a045bbe4e Remove dyn_cast + dereference pattern by replacing it with a cast and changing
the safety check to look for the same type we're going to actually cast to.
Fixes PR13180!

llvm-svn: 159110
2012-06-24 10:15:42 +00:00
Nick Lewycky b74ae9c5b2 Tab to spaces. No functionality change.
llvm-svn: 159104
2012-06-24 04:07:14 +00:00
Nick Lewycky bfb07fb562 Remove a dangling reference to a deleted instruction. Fixes PR13185!
llvm-svn: 159096
2012-06-24 01:44:08 +00:00
Hal Finkel 4b06b1a0ee Allow BBVectorize to fuse compare instructions.
llvm-svn: 159088
2012-06-23 21:52:50 +00:00
Hans Wennborg cbe34b4cc9 Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

llvm-svn: 159077
2012-06-23 11:37:03 +00:00
Stepan Dyatkovskiy 8e00efeace Optimized usage of new SwitchInst case values (IntegersSubset type) in Local.cpp, Execution.cpp and BitcodeWriter.cpp.
I got about 1% of compile-time improvement on my machines (Ubuntu 11.10 i386 and Ubuntu 12.04 x64).

llvm-svn: 159076
2012-06-23 10:58:58 +00:00
Nuno Lopes de8c6fb24f BoundsChecking: attach debug info to traps to make my life a bit more sane
llvm-svn: 159055
2012-06-23 00:12:34 +00:00
Jakob Stoklund Olesen c5c4e96f3e Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x"
This fixes PR5997.

These transforms were disabled because codegen couldn't deal with other
uses of trunc(x). This is now handled by the peephole pass.

This causes no regressions on x86-64.

llvm-svn: 159003
2012-06-22 16:36:43 +00:00
Stepan Dyatkovskiy a6c8cc307b Fixed r158979.
Original message:
Performance optimizations:
- SwitchInst: case values stored separately from Operands List. It allows to make faster access to individual case value numbers or ranges.
- Optimized IntItem, added APInt value caching.
- Optimized IntegersSubsetGeneric: added optimizations for cases when subset is single number or when subset consists from single numbers only.

llvm-svn: 158997
2012-06-22 14:53:30 +00:00
Nuno Lopes 0b60ebbf79 fix whitespace in my last commit.
sorry for the churn :S  enough for today; going to sleep.

llvm-svn: 158953
2012-06-22 00:29:58 +00:00
Nuno Lopes 9792d68381 remove extractMallocCallFromBitCast, since it was tailor maded for its sole user. Update GlobalOpt accordingly.
llvm-svn: 158952
2012-06-22 00:25:01 +00:00
Nuno Lopes 771e7bd4ba instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here).
Fixes the crashes with the attached test case

llvm-svn: 158951
2012-06-21 23:52:14 +00:00
Evan Cheng 32c7cc8ec9 Look pass zext to strength reduce an udiv. Patch by David Majnemer. rdar://11721329
llvm-svn: 158946
2012-06-21 22:52:49 +00:00
Nuno Lopes dc6085e52d Add support for invoke to the MemoryBuiltin analysid.
Update comments accordingly.

Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached).

llvm-svn: 158937
2012-06-21 21:25:05 +00:00
Nuno Lopes 0e967e0186 port the BoundsChecking patch to the new MemoryBuiltin API (i.e., remove most of the code from here).
Remove the alloc_size.ll test until we settle on a metadata format that makes everyone happy..

llvm-svn: 158920
2012-06-21 15:59:53 +00:00
Nuno Lopes 55fff83422 refactor the MemoryBuiltin analysis:
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
 - provide an API to compute the size and offset of an object pointed by

Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.

Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.

llvm-svn: 158919
2012-06-21 15:45:28 +00:00
Nadav Rotem 4e9012c2b1 Add a number of threshold arguments to the SRA pass.
A patch by Tom Stellard with minor changes.

llvm-svn: 158918
2012-06-21 13:44:31 +00:00
Nuno Lopes 3fa32f2452 replace usage of EmitGEPOffset() with TargetData::getIndexedOffset() when the GEP offset is known to be constant.
With this change, we avoid relying on the IR Builder to constant fold the operations.

No functionality change intended.

llvm-svn: 158829
2012-06-20 17:30:51 +00:00
Chandler Carruth c60fbe6b58 Fix two rather subtle internal vs. external linker issues.
I'll admit I'm not entirely satisfied with this change, but it seemed
the cleanest option. Other suggestions quite welcome

The issue is that the traits specializations have static methods which
return the typedef'ed PHI_iterator type. In both the IR and MI layers
this is typedef'ed to a custom iterator class defined in an anonymous
namespace giving the types and the functions returning them internal
linkage. However, because the traits specialization is defined in the
'llvm' namespace (where it has to be, specialized template lives there),
and is in turn used in the templated implementation of the SSAUpdater.
This led to the linkage conflict that Clang now warns about.

The simplest solution to me was just to define the PHI_iterator as
a nested class inside the trait specialization. That way it still
doesn't get scoped widely, it can't be accidentally reused somewhere,
etc. This is a little gross just because nested class definitions are
a little gross, but the alternatives seem more ad-hoc.

llvm-svn: 158799
2012-06-20 08:39:30 +00:00
Pete Cooper 33ee6c9bf1 Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts
llvm-svn: 158623
2012-06-17 03:58:26 +00:00
Hal Finkel fa103d3fc7 Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions.
The present implementation handles only TBAA and FP metadata, discarding everything else.
For debug metadata, the current behavior is maintained (the debug metadata associated with
one of the instructions will be kept, discarding that attached to the other).

This should address PR 13040.

llvm-svn: 158606
2012-06-16 20:34:06 +00:00
Hal Finkel 16ddd4b66b Move the Metadata merging methods from GVN and make them public in MDNode.
There are other passes, BBVectorize specifically, that also need some of
this functionality.

llvm-svn: 158605
2012-06-16 20:33:37 +00:00
Evan Cheng 773b2cd63c It's not deterministic to iterate over SmallPtrSet. Replace it with SmallSetVector. Patch by Daniel Reynaud. rdar://11671029
llvm-svn: 158594
2012-06-16 04:28:11 +00:00
Pete Cooper 818e9f4a26 Fix crash from r158529 on Bullet.
Dynamic GEPs created by SROA needed to insert extra "i32 0"
operands to index through structs and arrays to get to the
vector being indexed.

llvm-svn: 158590
2012-06-16 01:43:26 +00:00
Andrew Trick 8370c7c38f LSR: fix expansion of scaled reg in non-address type formulae.
For non-address users, Base and Scaled registers are not specially
associated to fit an address mode, so SCEVExpander should apply normal
expansion rules. Otherwise we may sink computation into inner loops
that have already been optimized.

llvm-svn: 158537
2012-06-15 20:07:29 +00:00
Andrew Trick aca8fb3c45 LSR fix: "Special" users are just like "Basic" users but allow -1 scale.
llvm-svn: 158536
2012-06-15 20:07:26 +00:00
Pete Cooper e24d6a19e3 Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed
llvm-svn: 158529
2012-06-15 18:07:29 +00:00
Rafael Espindola 1821c6c3b0 Some optimizations done by globalopt are safe only for internal linkage, not
linkonce linkage. For example, it is not valid to add unnamed_addr.

This also fixes a crash in g++.dg/opt/static5.C.

llvm-svn: 158528
2012-06-15 18:00:24 +00:00
Duncan Sands 7838603ffc Fix issues (infinite loop and/or crash) with self-referential instructions, for
example degenerate phi nodes and binops that use themselves in unreachable code.
Thanks to Charles Davis for the testcase that uncovered this can of worms.

llvm-svn: 158508
2012-06-15 08:37:50 +00:00
Pete Cooper 1d1fa72837 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158479
2012-06-14 23:53:53 +00:00
Rafael Espindola def1b09be2 Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and
globaldce. Globaldce was already removing linkonce globals, but globalopt was
not.

llvm-svn: 158476
2012-06-14 22:48:13 +00:00
Pete Cooper 5d19452f3f Revert r158454: Allow SROA to look at a vector type... Its breaking the vectorise buildbot
This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c.

llvm-svn: 158462
2012-06-14 18:32:52 +00:00
Pete Cooper a7e6d58a87 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158454
2012-06-14 16:38:13 +00:00
Manman Ren c2bc2d106b InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.

llvm-svn: 158441
2012-06-14 05:57:42 +00:00
Pete Cooper e2fe809772 Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access"
This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f.

llvm-svn: 158408
2012-06-13 17:55:22 +00:00
Pete Cooper e1d4e8b563 Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access
llvm-svn: 158407
2012-06-13 17:30:34 +00:00
Duncan Sands 409d8ae165 It is possible for several constants which aren't individually absorbing to
combine to the absorbing element.  Thanks to nbjoerg on IRC for pointing this 
out.

llvm-svn: 158399
2012-06-13 12:15:56 +00:00
Duncan Sands 318a89ddac When linearizing a multiplication, return at once if we see a factor of zero,
since then the entire expression must equal zero (similarly for other operations
with an absorbing element).  With this in place a bunch of reassociate code for
handling constants is dead since it is all taken care of when linearizing.  No
intended functionality change.

llvm-svn: 158398
2012-06-13 09:42:13 +00:00
Manman Ren d33f4efbfd SimplifyCFG: fold unconditional branch to its predecessor if profitable.
This patch extends FoldBranchToCommonDest to fold unconditional branches.
For unconditional branches, we fold them if it is easy to update the phi nodes 
in the common successors.

rdar://10554090

llvm-svn: 158392
2012-06-13 05:43:29 +00:00
Duncan Sands 72aea01b6e Use DenseMap as SmallMap workaround rather than std::map, at Chandler's request.
llvm-svn: 158371
2012-06-12 20:26:43 +00:00
Duncan Sands 67cd591989 Use std::map rather than SmallMap because SmallMap assumes that the value has
POD type, causing memory corruption when mapping to APInts with bitwidth > 64.
Merge another crash testcase into crash.ll while there.

llvm-svn: 158369
2012-06-12 20:16:51 +00:00
Duncan Sands d7aeefebd6 Now that Reassociate's LinearizeExprTree can look through arbitrary expression
topologies, it is quite possible for a leaf node to have huge multiplicity, for
example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x
raised to a vast power (the multiplicity, or weight, of x).  This patch fixes
the computation of weights by correctly computing them no matter how big they
are, rather than just overflowing and getting a wrong value.  It turns out that
the weight for a value never needs more bits to represent than the value itself,
so it is enough to represent weights as APInts of the same bitwidth and do the
right overflow-avoiding dance steps when computing weights.  As a side-effect it
reduces the number of multiplies needed in some cases of large powers.  While
there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree
static, pushing the rank computation out into users.  This is progress towards
fixing PR13021.

llvm-svn: 158358
2012-06-12 14:33:56 +00:00
Benjamin Kramer 2150145ae4 InstCombine: factor code better.
No functionality change.

llvm-svn: 158301
2012-06-11 08:01:25 +00:00
Benjamin Kramer 8b8a76974f InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.

stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us

where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.

llvm-svn: 158297
2012-06-10 20:35:00 +00:00
Dmitri Gribenko dbeafa773a Convert comments to proper Doxygen comments.
llvm-svn: 158248
2012-06-09 00:01:45 +00:00
Nuno Lopes 2710f1b049 canonicalize:
-%a + 42
into
42 - %a

previously we were emitting:
-(%a + 42)

This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next

llvm-svn: 158237
2012-06-08 22:30:05 +00:00
Duncan Sands 3293f460e7 Reapply commit 158073 with a fix (the testcase was already committed). The
problem was that by moving instructions around inside the function, the pass
could accidentally move the iterator being used to advance over the function
too.  Fix this by only processing the instruction equal to the iterator, and
leaving processing of instructions that might not be equal to the iterator
to later (later = after traversing the basic block; it could also wait until
after traversing the entire function, but this might make the sets quite big).
Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158226
2012-06-08 20:15:33 +00:00
Nuno Lopes 4b68c1da54 BoundsChecking: add support for ConstantPointerNull. fixes a bunch of instrumentation failures in loops with reallocs
llvm-svn: 158210
2012-06-08 16:31:42 +00:00
Duncan Sands 9a5cf92250 Revert commit 158073 while waiting for a fix. The issue is that reassociate
can move instructions within the instruction list.  If the instruction just
happens to be the one the basic block iterator is pointing to, and it is
moved to a different basic block, then we get into an infinite loop due to
the iterator running off the end of the basic block (for some reason this
doesn't fire any assertions).  Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158199
2012-06-08 13:37:30 +00:00
Nadav Rotem 4e50efead6 Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type.
llvm-svn: 158166
2012-06-07 20:28:57 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Chad Rosier faa3894628 Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't
matter.
rdar://11579835

llvm-svn: 158084
2012-06-06 17:22:40 +00:00
Duncan Sands 763da45e9e Grab-bag of reassociate tweaks. Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158073
2012-06-06 14:53:10 +00:00
Andrew Trick a6fb910fad LoopUnroll: always check for NULL LoopPassManager
llvm-svn: 158007
2012-06-05 17:51:05 +00:00
Rafael Espindola 47d988c54c When gvn decides to replace an instruction with another, we have to patch the
replacement to make it at least as generic as the instruction being replaced.
This includes:
* dropping nsw/nuw flags
* getting the least restrictive tbaa and fpmath metadata
* merging ranges

Fixes PR12979.

llvm-svn: 157958
2012-06-04 22:44:21 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy 0e46d8a08c PR1255: case ranges.
IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()"

llvm-svn: 157884
2012-06-02 09:42:43 +00:00
Bill Wendling e85f34969e Register the gcov "writeout" at init time. Don't list this as a d'tor. Instead,
inject some code in that will run via the "__mod_init_func" method that
registers the gcov "writeout" function to execute at exit time.

The problem is that the "__mod_term_func" method of specifying d'tors is
deprecated on Darwin. And it can lead to some ambiguities when dealing with
multiple libraries.
<rdar://problem/11110106>

llvm-svn: 157852
2012-06-01 23:14:32 +00:00
Nuno Lopes adf1c859dd BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache
add regression tests for this problem.

Can already compile & run: PHP, PCRE, and ICU  (i.e., all the software I tried)

llvm-svn: 157822
2012-06-01 17:43:31 +00:00
Nuno Lopes 288e86ff6b add -bounds-checking-multiple-traps option to make one trap BB per check
disabled by default for now; we can discusse the default value (& name) later

llvm-svn: 157777
2012-05-31 22:58:48 +00:00
Nuno Lopes 7d00061d87 revamp BoundsChecking considerably:
- compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic)
 - use APInt throughout to handle wrap-arounds
 - add support for PHI instrumentation
 - add a cache (required for recursive PHIs anyway)
 - remove hoisting support for now, since it was wrong in a few cases

sorry for the churn here.. tests will follow soon.

llvm-svn: 157775
2012-05-31 22:45:40 +00:00
Duncan Sands 339bb61e32 Enhance the sinking code to handle diamond patterns. Patch by
Carlo Alberto Ferraris.

llvm-svn: 157736
2012-05-31 08:09:49 +00:00
Kostya Serebryany 9024160439 [asan] instrument cmpxchg and atomicrmw
llvm-svn: 157683
2012-05-30 09:04:06 +00:00
Nuno Lopes 8bd45f8ecd bounds checking:
- hoist checks out of loops where SCEV is smart enough
 - add additional statistics to measure how much we loose for not supporting interprocedural and pointers loaded from memory

llvm-svn: 157649
2012-05-29 22:32:51 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Benjamin Kramer 9d5849f51d Fix suspicous hasOneUse() check, found by PVS Studio (PR12357).
llvm-svn: 157592
2012-05-28 20:52:48 +00:00
Benjamin Kramer b8743a9150 InstCombine: Fix infinite loop when encountering switch on trivial icmp.
The test case feeds the following into InstCombine's visitSelect:
%tobool8 = icmp ne i32 0, 0
%phitmp = select i1 %tobool8, i32 3, i32 0
Then instcombine replaces the right side of the switch with 0, doesn't notice
that nothing changes and tries again indefinitely.

This fixes PR12897.

llvm-svn: 157587
2012-05-28 19:18:16 +00:00
Stepan Dyatkovskiy e3e19cbb13 PR1255: Case Ranges
Implemented IntItem - the wrapper around APInt. Why not to use APInt item directly right now?
1. It will very difficult to implement case ranges as series of small patches. We got several large and heavy patches. Each patch will about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst you will need to changes at the same time all Readers,Writers and absolutely all passes that uses SwitchInst.
2. We can implement APInt pool inside and save memory space. E.g. we use several switches that works with 256 bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
3. IntItem can be easyly easily replaced with APInt.
4. Currenly we can interpret IntItem both as ConstantInt and as APInt. It allows to provide SwitchInst methods that works with ConstantInt for non-updated passes.

Why I need it right now? Currently I need to update SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so peaces of code
ConstantInt *V = ...;
if (V->getValue().ugt(AnotherV->getValue()) {
  ...
}
will look awful. Much more better this way:
IntItem V = ConstantIntVal->getValue();
if (AnotherV < V) {
}

Of course any reviews are welcome.

P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows to map individual subsets of integers to the BasicBlocks).
Since in future these classes will founded on APInt, it will possible to use them in more generic ways.

llvm-svn: 157576
2012-05-28 12:39:09 +00:00
Bill Wendling 1560517ec3 Implement the indirect counter increment code in a better way. Instead of
replicating the code for every place it's needed, we instead generate a function
that does that for us. This function is local to the executable, so there
shouldn't be any writing violations.

llvm-svn: 157564
2012-05-28 06:10:56 +00:00
Chris Lattner 3cb6f83ebb switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients.
llvm-svn: 157556
2012-05-28 01:47:44 +00:00
Benjamin Kramer 152f106e5f PR12967: Don't crash when trying to fold a shift that's larger than the type's size.
llvm-svn: 157548
2012-05-27 22:03:32 +00:00
Chris Lattner 144b619684 Reimplement the intrinsic verifier to use the same table as Intrinsic::getDefinition,
making it stronger and more sane.

Delete the code from tblgen that produced the old code.

Besides being a path forward in intrinsic sanity, this also eliminates a bunch of
machine generated code that was compiled into Function.o

llvm-svn: 157545
2012-05-27 19:37:05 +00:00
Duncan Sands 3c05cd3ea8 Since commit 157467, if reassociate isn't actually going to change an expression
then it doesn't alter the instructions composing it, however it would continue
to move the instructions to just before the expression root.  Ensure it doesn't
move them either, so now it really does nothing if there is nothing to do.  That
commit also ensured that nsw etc flags weren't cleared if the expression was not
being changed.  Tweak this a bit so that it doesn't clear flags on the initial
part of a computation either if that part didn't change but later bits did.

llvm-svn: 157518
2012-05-26 16:42:52 +00:00
Benjamin Kramer 58abf4f193 SimplifyCFG: Turn the ad-hoc std::pair that represents switch cases into an explicit struct.
llvm-svn: 157516
2012-05-26 14:29:37 +00:00
Benjamin Kramer 65e75666ff Add support for branch weight metadata to MDBuilder and use it in various places.
llvm-svn: 157515
2012-05-26 13:59:43 +00:00
Duncan Sands c94ac6fdf6 Move this debug statement earlier so it is easy to see the order in
which operands come flying out of the linearization stage.

llvm-svn: 157512
2012-05-26 07:47:48 +00:00
Bill Wendling 8ed0749a34 The llvm_gcda_increment_indirect_counter function writes to the arguments that
are passed in. However, those arguments may be in a write-protected area, as far
as the runtime library is concerned. For instance, the data could be placed into
a 'linkedit' section, which isn't writable. Emit the code from
llvm_gcda_increment_indirect_counter directly into the function instead.

Note: The code for this is ugly, and can lead to bloat. We should look into
simplifying this code instead of having all of these branches.

<rdar://problem/11181370>

llvm-svn: 157505
2012-05-25 23:55:00 +00:00
Nuno Lopes e9b0bdf804 bounds checking: add support for byval arguments
llvm-svn: 157498
2012-05-25 21:15:17 +00:00
Nuno Lopes a6da3ff896 boundschecking:
add support for select
add experimental support for alloc_size metadata

llvm-svn: 157481
2012-05-25 16:54:04 +00:00
Duncan Sands bddfb2f96b Make the reassociation pass more powerful so that it can handle expressions
with arbitrary topologies (previously it would give up when hitting a diamond
in the use graph for example).  The testcase from PR12764 is now reduced from
a pile of additions to the optimal 1617*%x0+208.  In doing this I changed the
previous strategy of dropping all uses for expression leaves to one of dropping
all but one use.  This works out more neatly (but required a bunch of tweaks)
and is also safer: some recently fixed bugs during recursive linearization were
because the linearization code thinks it completely owns a node if it has no uses
outside the expression it is linearizing.  But if the node was also in another
expression that had been linearized (and thus all uses of the node from that
expression dropped) then the conclusion that it is completely owned by the
expression currently being linearized is wrong.  Keeping one use from within each
linearized expression avoids this kind of mistake.

llvm-svn: 157467
2012-05-25 12:03:02 +00:00
Stepan Dyatkovskiy 183d18aa5a PR1255 related changes (case ranges):
LowerSwitch::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.
test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage.

llvm-svn: 157384
2012-05-24 09:33:20 +00:00
Nuno Lopes 10287d839f BoundsChecking: add a couple of simple tests and fix a bug in branch emition
llvm-svn: 157329
2012-05-23 16:24:52 +00:00
Patrik Hägglund 8a1e316c15 Fix the inliner so that the optsize function attribute don't alter the
inline threshold if the global inline threshold is lower (as for -Oz).

Reviewed by Chandler Carruth and Bill Wendling.

llvm-svn: 157323
2012-05-23 13:42:57 +00:00
Evgeniy Stepanov 617232f32b Use zero-based shadow by default on Android.
llvm-svn: 157317
2012-05-23 11:52:12 +00:00
Stepan Dyatkovskiy 7a50155227 PR1255(case ranges) related changes in Local Transformations.
llvm-svn: 157315
2012-05-23 08:18:26 +00:00
Nuno Lopes 59e9df773a address some of John Criswell's comments
teach computeAllocSize about realloc, reallocf, and valloc

llvm-svn: 157298
2012-05-22 22:02:19 +00:00
Nuno Lopes eee43e1bc7 hopefully fix the CMake build. sorry for breakage
llvm-svn: 157264
2012-05-22 17:40:46 +00:00
Nuno Lopes a2f6cecb6d add a new pass to instrument loads and stores for run-time bounds checking
move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h

(a draft of this) patch reviewed by Andrew, thanks.

llvm-svn: 157261
2012-05-22 17:19:09 +00:00
Nuno Lopes ad40c0a425 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Duncan Sands 4df5e96d3a Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the
leader table.  That's because it wasn't expecting instructions to turn up as
leader for a value number that is not its own, but equality propagation could
create this situation.  One solution is to have the leader table use a WeakVH
but this slows down GVN by about 5%.  Instead just have equality propagation not
add instructions to the leader table, only constants and arguments.  In theory
this might cause GVN to run more (each time it changes something it runs again)
but it doesn't seem to occur enough to cause a slow down.

llvm-svn: 157251
2012-05-22 14:17:53 +00:00
Dan Gohman 9c97eea0fd Mark an unreachable region of code with llvm_unreachable.
llvm-svn: 157197
2012-05-21 17:41:28 +00:00
Peter Collingbourne 9a03c73297 Do not pass an invalid domtree to SimplifyInstruction from
LoopUnswitch.  Fixes PR12887.

llvm-svn: 157140
2012-05-20 01:32:09 +00:00
Peter Collingbourne 97b1076435 Do not eliminate allocas whose alignment exceeds that of the
copied-in constant, as a subsequent user may rely on over alignment.
Fixes PR12885.

llvm-svn: 157134
2012-05-19 22:52:10 +00:00
Dan Gohman 14862c3141 Fix replacing all the users of objc weak runtime routines
when deleting them. rdar://11434915.

llvm-svn: 157080
2012-05-18 22:17:29 +00:00
David Majnemer a9330fe553 Teach SimplifyLibCalls about stpcpy.
llvm-svn: 156815
2012-05-15 11:46:21 +00:00
Chad Rosier a968caf8e0 Move the capture analysis from MemoryDependencyAnalysis to a more general place
so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686

llvm-svn: 156776
2012-05-14 20:35:04 +00:00
Jay Foad ca0c499609 Teach Function::hasAddressTaken that BlockAddress doesn't really take
the address of a function.

llvm-svn: 156703
2012-05-12 08:30:16 +00:00
Nuno Lopes e2cfd3ce95 objectsize: add a few more tests and fix a bug
llvm-svn: 156625
2012-05-11 18:25:29 +00:00
Eli Friedman e0a64d83fc Fix a minor logic mistake transforming compares in instcombine. PR12514.
llvm-svn: 156600
2012-05-11 01:32:59 +00:00
Nuno Lopes f573030391 objectsize: add support for GEPs with non-constant indexes
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag

llvm-svn: 156585
2012-05-10 23:17:35 +00:00
Dan Gohman ed7c24e2d9 Teach DeadStoreElimination to eliminate exit-block stores with phi addresses.
llvm-svn: 156558
2012-05-10 18:57:38 +00:00
Nuno Lopes 300d629924 teach DSE and isInstructionTriviallyDead() about calloc
llvm-svn: 156553
2012-05-10 17:14:00 +00:00
Dan Gohman f8b19d09ba Fix the objc_storeStrong recognizer to stop before walking off the
end of a basic block if there's no store.

llvm-svn: 156520
2012-05-09 23:08:33 +00:00
Nuno Lopes 7100f463b0 objectsize:
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)

llvm-svn: 156515
2012-05-09 21:30:57 +00:00
Craig Topper 28540adfcf Remove unused variable to get rid of warning.
llvm-svn: 156466
2012-05-09 07:08:58 +00:00
Dan Gohman 41375a3545 Miscellaneous accumulated cleanups.
llvm-svn: 156445
2012-05-08 23:39:44 +00:00
Dan Gohman 61708d37d6 Fix objc_storeStrong pattern matching to catch a potential use of the
old value after the store but before it is released.
This fixes rdar:/11116986.

llvm-svn: 156442
2012-05-08 23:34:08 +00:00
Duncan Sands 3bbb1d50df Calling ReassociateExpression recursively is extremely dangerous since it will
replace the operands of expressions with only one use with undef and generate
a new expression for the original without using RAUW to update the original.
Thus any copies of the original expression held in a vector may end up
referring to some bogus value - and using a ValueHandle won't help since there
is no RAUW.  There is already a mechanism for getting the effect of recursion
non-recursively: adding the value to be recursed on to RedoInsts.  But it wasn't
being used systematically.  Have various places where recursion had snuck in at
some point use the RedoInsts mechanism instead.  Fixes PR12169.

llvm-svn: 156379
2012-05-08 12:16:05 +00:00
Andrew Trick d29cd732d4 Allow NULL LoopPassManager argument in UnrollLoop. PR12734.
llvm-svn: 156358
2012-05-08 02:52:09 +00:00