Commit Graph

10027 Commits

Author SHA1 Message Date
Nadav Rotem 0b37f14371 LoopVectorizer: Fix a bug in the code that updates the loop exiting block.
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.

PR14725

llvm-svn: 171251
2012-12-30 07:47:00 +00:00
Alexey Samsonov 3efc87e92d Add proper support for -fsanitize-blacklist= flag for TSan and MSan. LLVM part.
llvm-svn: 171183
2012-12-28 09:30:44 +00:00
Chandler Carruth e40e60eed5 Make this parameter be named consistently with most other
getAnalysisUsage implementations.

llvm-svn: 171157
2012-12-27 11:17:15 +00:00
Alexey Samsonov 29dd7f2090 [ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it describes one of 'interesting' allocas. Assume that allocas can go through casts and phi-nodes before apperaring as llvm.lifetime arguments
llvm-svn: 171153
2012-12-27 08:50:58 +00:00
Nadav Rotem 5350cd314b If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified.
PR14719.

llvm-svn: 171124
2012-12-26 23:30:53 +00:00
Nick Lewycky 90053a1214 Remove mid-optimizer warning. This situation should be handled differently,
such as by a compiler warning, a check in clang -fsanitizer=undefined, being
optimized to unreachable, or a combination of the above. PR14722.

llvm-svn: 171119
2012-12-26 22:00:35 +00:00
Nadav Rotem 3f7c4f36ba LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1
llvm-svn: 171114
2012-12-26 19:08:17 +00:00
Evgeniy Stepanov 5eb5bf8b46 [msan] Raise alignment of origin stores/loads when possible.
Origin alignment is as high as the alignment of the corresponding application
location, but never less than 4.

llvm-svn: 171110
2012-12-26 11:55:09 +00:00
Evgeniy Stepanov d8be0c510c [msan] Expand the file comment with track-origins info.
llvm-svn: 171109
2012-12-26 10:59:00 +00:00
Hal Finkel 30e95a8ebb BBVectorize: Use VTTI to compute costs for intrinsics vectorization
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.

Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.

llvm-svn: 171079
2012-12-26 01:36:57 +00:00
Hal Finkel b44f890133 LoopVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171076
2012-12-25 23:21:29 +00:00
Hal Finkel 2a456112ec BBVectorize: Enable vectorization of the fmuladd intrinsic
llvm-svn: 171075
2012-12-25 22:36:08 +00:00
Evgeniy Stepanov f19c086d1e [msan] Fix handling of vectors of pointers.
VectorType::getInteger() can not be used with them, because pointer size
depends on the target.

llvm-svn: 171070
2012-12-25 16:04:38 +00:00
Evgeniy Stepanov ec8371283b [msan] Fix handling of select with vector condition.
llvm-svn: 171069
2012-12-25 14:56:21 +00:00
Alexey Samsonov 788381b8ac ASan: initialize callbacks from ASan module pass in a separate function for consistency
llvm-svn: 171061
2012-12-25 12:28:20 +00:00
Alexey Samsonov 1e3f7ba8f7 ASan: move stack poisoning logic into FunctionStackPoisoner struct
llvm-svn: 171060
2012-12-25 12:04:36 +00:00
Bob Wilson 4ed23578da Add LLVMContext::emitWarning methods and use them. <rdar://problem/12867368>
When the backend is used from clang, it should produce proper diagnostics
instead of just printing messages to errs(). Other clients may also want to
register their own error handlers with the LLVMContext, and the same handler
should work for warnings in the same way as the existing emitError methods.

llvm-svn: 171041
2012-12-24 18:15:21 +00:00
Nadav Rotem 5f7c12cfbd LoopVectorizer: When checking for vectorizable types, also check
the StoreInst operands.

PR14705.

llvm-svn: 171023
2012-12-24 09:14:18 +00:00
Alexey Samsonov 098842b401 Fix typo in comments
llvm-svn: 171021
2012-12-24 08:52:53 +00:00
Nadav Rotem bd5d1d832a LoopVectorizer: Fix an endless loop in the code that looks for reductions.
The bug was in the code that detects PHIs in if-then-else block sequence.

PR14701.

llvm-svn: 171008
2012-12-24 01:22:06 +00:00
Benjamin Kramer 28691400dd LoopVectorize: Fix accidentaly inverted condition.
llvm-svn: 171001
2012-12-23 13:21:41 +00:00
Benjamin Kramer 855ba03408 LoopVectorize: For scalars and void types there is no need to compute vector insert/extract costs.
Fixes an assert during the build of oggenc in the test suite.

llvm-svn: 171000
2012-12-23 13:19:18 +00:00
Nadav Rotem 2cade68025 Loop Vectorizer: Update the cost model of scatter/gather operations and make
them more expensive.

llvm-svn: 170995
2012-12-23 07:23:55 +00:00
Craig Topper 4c94775198 Remove trailing whitespace
llvm-svn: 170990
2012-12-22 18:09:02 +00:00
Bill Wendling c79e42c5ce Change 'AttrVal' to 'AttrKind' to better reflect that it's a kind of attribute instead of the value of the attribute.
llvm-svn: 170972
2012-12-22 00:37:52 +00:00
Roman Divacky a229186a82 Remove duplicate includes.
llvm-svn: 170902
2012-12-21 17:06:44 +00:00
Evgeniy Stepanov 4fbc0d08bf [msan] Remove unreachable blocks before instrumenting a function.
llvm-svn: 170883
2012-12-21 11:18:49 +00:00
Nadav Rotem 3b850b70b3 Enable if-conversion.
llvm-svn: 170841
2012-12-21 04:47:54 +00:00
Evan Cheng 99cafb1db2 Every pass deserves a name, even codegenprep.
llvm-svn: 170831
2012-12-21 01:48:14 +00:00
Nadav Rotem a4b53f20a3 BB-Vectorizer: Check the cost of the store pointer type
and not the return type, which is void. A number of test
cases fail after adding the assertion in TTImpl.

llvm-svn: 170828
2012-12-21 01:24:36 +00:00
Nadav Rotem e7785686a5 Fix a bug in the code that checks if we can vectorize loops while using dynamic
memory bound checks.  Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:

for ( k=1 ; k<n ; k++ )
  x[k] = x[k-1] + y[k];

llvm-svn: 170811
2012-12-21 00:07:35 +00:00
Nadav Rotem 2ababf68d7 LoopVectorize: Fix a bug in the scalarization of instructions.
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.

This fixes External/SPEC/CINT95/099_go/099_go

llvm-svn: 170756
2012-12-20 20:24:40 +00:00
Nadav Rotem 8b20c0a814 Loop Vectorizer: turn-off if-conversion.
llvm-svn: 170708
2012-12-20 17:42:53 +00:00
James Molloy 4f6fb953a7 Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call.
Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage).

llvm-svn: 170704
2012-12-20 16:04:27 +00:00
Craig Topper ae48cb2e5a Formatting fixes. Remove some unnecessary 'else' after 'return'. No functional change.
llvm-svn: 170676
2012-12-20 07:15:54 +00:00
Craig Topper 9d4171afed Removing trailing whitespace
llvm-svn: 170675
2012-12-20 07:09:41 +00:00
Nadav Rotem 7bdc45b570 Loop Vectorizer: Enable if-conversion.
llvm-svn: 170632
2012-12-20 02:00:02 +00:00
Nadav Rotem 28408a20c9 whitespace
llvm-svn: 170626
2012-12-20 00:49:56 +00:00
Paul Redmond 5917f4c715 Transform (x&C)>V into (x&C)!=0 where possible
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel

llvm-svn: 170576
2012-12-19 19:47:13 +00:00
Evgeniy Stepanov abeae5c7d5 [msan] Add track-origins argument to the pass constructor.
llvm-svn: 170544
2012-12-19 13:55:51 +00:00
Evgeniy Stepanov d7571cd4bc [msan] Heuristically instrument unknown intrinsics.
This changes adds shadow and origin propagation for unknown intrinsics
by examining the arguments and ModRef behaviour. For now, only 3 classes
of intrinsics are handled:
- those that look like simple SIMD store
- those that look like simple SIMD load
- those that don't have memory effects and look like arithmetic/logic/whatever
  operation on simple types.

llvm-svn: 170530
2012-12-19 11:22:04 +00:00
Benjamin Kramer e300004bd5 LoopVectorize: Make iteration over induction variables not depend on pointer values.
MapVector is a bit heavyweight, but I don't see a simpler way. Also the
InductionList is unlikely to be large. This should help 3-stage selfhost
compares (PR14647).

llvm-svn: 170528
2012-12-19 11:09:15 +00:00
Bill Wendling d97b75d816 Inline the 'hasIncompatibleWithVarArgsAttrs' method into its only uses. And some minor comment reformatting.
llvm-svn: 170516
2012-12-19 08:57:40 +00:00
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Shuxin Yang 5b841c4a64 Make sure the buffer, which containas an instance of APFloat, has proper alignment.
llvm-svn: 170486
2012-12-19 01:10:17 +00:00
Shuxin Yang 37a1efe1c6 rdar://12801297
InstCombine for unsafe floating-point add/sub.

llvm-svn: 170471
2012-12-18 23:10:12 +00:00
Nadav Rotem 9aee065e3c Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang.
llvm-svn: 170470
2012-12-18 23:09:44 +00:00
Benjamin Kramer f0e5d2f032 LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations.
For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
	movdqa	%xmm0, %xmm1
	movhlps	%xmm1, %xmm1            ## xmm1 = xmm1[1,1]
	paddw	%xmm0, %xmm1
	pshufd	$1, %xmm1, %xmm0        ## xmm0 = xmm1[1,0,0,0]
	paddw	%xmm1, %xmm0
	phaddw	%xmm0, %xmm0
	pextrb	$0, %xmm0, %edx

instead of
	pextrb	$2, %xmm0, %esi
	pextrb	$0, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$4, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$6, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$8, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$10, %xmm0, %edi
	pextrb	$14, %xmm0, %edx
	addb	%sil, %dil
	pextrb	$12, %xmm0, %esi
	addb	%dil, %sil
	addb	%sil, %dl

llvm-svn: 170439
2012-12-18 18:40:20 +00:00
Nadav Rotem c0699854dd Enable the loop vectorizer.
llvm-svn: 170416
2012-12-18 06:37:12 +00:00
Nadav Rotem a5024fc3e1 SROA: Replace calls to getScalarSizeInBits to DataLayout's API because
getScalarSizeInBits could not handle vectors of pointers.

llvm-svn: 170412
2012-12-18 05:23:31 +00:00
Rafael Espindola 46b9c8a2cd Initialize NoRedZone and remove unused default values.
llvm-svn: 170404
2012-12-18 03:35:05 +00:00
Chandler Carruth e3f4119b06 Fix another SROA crasher, PR14601.
This was a silly oversight, we weren't pruning allocas which were used
by variable-length memory intrinsics from the set that could be widened
and promoted as integers. Fix that.

llvm-svn: 170353
2012-12-17 18:48:07 +00:00
Evgeniy Stepanov 88b8dceddf [msan] Fix lint warning.
llvm-svn: 170347
2012-12-17 16:30:05 +00:00
Chandler Carruth 21eb4e96c2 Teach the rewriting of memcpy calls to support subvector copies.
This also cleans up a bit of the memcpy call rewriting by sinking some
irrelevant code further down and making the call-emitting code a bit
more concrete.

Previously, memcpy of a subvector would actually miscompile (!!!) the
copy into a single vector element copy. I have no idea how this ever
worked. =/ This is the memcpy half of PR14478 which we probably weren't
noticing previously because it didn't actually assert.

The rewrite relies on the newly refactored insert- and extractVector
functions to do the heavy lifting, and those are the same as used for
loads and stores which makes the test coverage a bit more meaningful
here.

llvm-svn: 170338
2012-12-17 14:51:24 +00:00
Evgeniy Stepanov 95a80abead Optimize tree walking in markAliveBlocks.
Check whether a BB is known as reachable before adding it to the worklist.
This way BB's with multiple predecessors are added to the list no more than
once.

llvm-svn: 170335
2012-12-17 14:28:00 +00:00
Chandler Carruth cacda256a1 Fix a secondary bug I introduced while fixing the first part of PR14478.
The first half of fixing this bug was actually in r170328, but was
entirely coincidental. It did however get me to realize the nature of
the bug, and adapt the test case to test more interesting behavior. In
turn, that uncovered the rest of the bug which I've fixed here.

This should fix two new asserts that showed up in the vectorize nightly
tester.

llvm-svn: 170333
2012-12-17 14:03:01 +00:00
Chandler Carruth 95e1fb8a42 Hoist a convertValue call to the two paths where it is needed.
I noticed this while looking at r170328. We only ever do a vector
rewrite when the alloca *is* the vector type, so it's good to not paper
over bugs here by doing a convertValue that isn't needed.

llvm-svn: 170331
2012-12-17 13:51:03 +00:00
Chandler Carruth ce4562bdcb Hoist the insertVector helper to be a static helper.
This will allow its use inside of memcpy rewriting as well. This routine
is more complex than extractVector, and some of its uses are not 100%
where I want them to be so there is still some work to do here.

While this can technically change the output in some cases, it shouldn't
be a change that matters -- IE, it can leave some dead code lying around
that prior versions did not, etc.

Yet another step in the refactorings leading up to the solution to the
last component of PR14478.

llvm-svn: 170328
2012-12-17 13:41:21 +00:00
Chandler Carruth b6bc8749e8 Lift the extractVector helper all the way out to a static helper function.
The method helpers all implicitly act upon the alloca, and what we
really want is a fully generic helper. Doing memcpy rewrites is more
special than all other rewrites because we are at times rewriting
instructions which touch pointers *other* than the alloca. As
a consequence all of the helpers needed by memcpy rewriting of
sub-vector copies will need to be generalized fully.

Note that all of these helpers ({insert,extract}{Integer,Vector}) are
woefully uncommented. I'm going to go back through and document them
once I get the factoring correct.

No functionality changed.

llvm-svn: 170325
2012-12-17 13:07:30 +00:00
Chandler Carruth 769445ef03 Factor the vector load rewriting into a more generic form.
This makes it suitable for use in rewriting memcpy in the presence of
subvector memcpy intrinsics.

No functionality changed.

llvm-svn: 170324
2012-12-17 12:50:21 +00:00
Chandler Carruth ccca504f3a Fix the first part of PR14478: memset now works.
PR14478 highlights a serious problem in SROA that simply wasn't being
exercised due to a lack of vector input code mixed with C-library
function calls. Part of SROA was written carefully to handle subvector
accesses via memset and memcpy, but the rewriter never grew support for
this. Fixing it required refactoring the subvector access code in other
parts of SROA so it could be shared, and then fixing the splat formation
logic and using subvector insertion (this patch).

The PR isn't quite fixed yet, as memcpy is still broken in the same way.
I'm starting on that series of patches now.

Hopefully this will be enough to bring the bullet benchmark back to life
with the bb-vectorizer enabled, but that may require fixing memcpy as
well.

llvm-svn: 170301
2012-12-17 04:07:37 +00:00
Chandler Carruth eae65a5629 Extract the logic for inserting a subvector into a vector alloca.
No functionality changed. Another step of refactoring toward solving
PR14487.

llvm-svn: 170300
2012-12-17 04:07:35 +00:00
Chandler Carruth 514f34f9c4 Lift the integer splat computation into a helper function.
No functionality changed. Refactoring leading up to the fix for PR14478
which requires some significant changes to the memset and memcpy
rewriting.

llvm-svn: 170299
2012-12-17 04:07:30 +00:00
Chandler Carruth 067edd342f Relax an overly aggressive assert to fix PR14572.
The alloca width is based on the alloc size, not the type size.

llvm-svn: 170270
2012-12-15 09:26:06 +00:00
NAKAMURA Takumi 8f45b6c709 Revert r170246, "Enable the loop vectorizer by default."
llvm-svn: 170267
2012-12-15 06:11:13 +00:00
Michael Ilseman e2754dc887 Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself
llvm-svn: 170248
2012-12-14 22:08:26 +00:00
Nadav Rotem acde77481d Enable the loop vectorizer by default.
llvm-svn: 170246
2012-12-14 21:30:23 +00:00
Shuxin Yang f8e9a5a061 rdar://12753946
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0"

llvm-svn: 170226
2012-12-14 18:46:06 +00:00
Evgeniy Stepanov 9b72e991c6 Fix lint warnings in MemorySanitizer.cpp.
llvm-svn: 170203
2012-12-14 13:48:31 +00:00
Evgeniy Stepanov 49175b237d [msan] Origin stores and loads do not need explicit alignment.
Origin address is always 4 byte aligned, and the access type is always i32.

llvm-svn: 170199
2012-12-14 13:43:11 +00:00
Evgeniy Stepanov f18e3af11f [msan] Refactor default shadow propagation and origin tracking.
This change moves the code for default shadow propagaition (handleShadowOr)
and origin tracking (setOriginForNaryOp) into a new builder-like class. Also
gets rid of handleShadowOrBinary.

llvm-svn: 170192
2012-12-14 12:54:18 +00:00
Nadav Rotem d3a3c9fdd5 revert r170166 - disable the loop vectorizer.
llvm-svn: 170172
2012-12-14 01:57:00 +00:00
Nadav Rotem 3b606d6fd5 Enable the loop vectorizer.
llvm-svn: 170166
2012-12-14 00:30:34 +00:00
Nadav Rotem b4ea4b3751 Disable the loop vectorizer.
llvm-svn: 170162
2012-12-14 00:02:07 +00:00
Nadav Rotem e5e28b48c8 Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today.
llvm-svn: 170157
2012-12-13 23:11:54 +00:00
NAKAMURA Takumi 38d2b2442f Revert r170020, "Simplify negated bit test", for now.
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.

This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.

llvm-svn: 170128
2012-12-13 14:28:16 +00:00
Eric Christopher a1bbeeca72 Revert "Restore the PHI optimization I accidently removed" temporarily since
it seems to be breaking self-host for a few people and is PR14592.

This reverts commit r170024.

llvm-svn: 170106
2012-12-13 06:48:05 +00:00
Rafael Espindola a2c107e661 Missed these calls from the previous rename somehow.
llvm-svn: 170094
2012-12-13 03:42:31 +00:00
Rafael Espindola 319f74cd11 Rename isPowerOfTwo to isKnownToBeAPowerOfTwo.
In a previous thread it was pointed out that isPowerOfTwo is not a very precise
name since it can return false for powers of two if it is unable to show that
they are powers of two.

llvm-svn: 170093
2012-12-13 03:37:24 +00:00
Michael Ilseman 536cc32ba0 Pattern matching code for intrinsics.
Provides m_Argument that allows matching against a CallSite's specified argument. Provides m_Intrinsic pattern that can be templatized over the intrinsic id and bind/match arguments similarly to other pattern matchers. Implementations provided for 0 to 4 arguments, though it's very simple to extend for more. Also provides example template specialization for bswap (m_BSwap) and example of code cleanup for its use.

llvm-svn: 170091
2012-12-13 03:13:36 +00:00
Quentin Colombet c0dba2035a Take into account minimize size attribute in the inliner.
Better controls the inlining of functions when the caller function has MinSize attribute.
Basically, when the caller function has this attribute, we do not "force" the inlining
of callee functions carrying the InlineHint attribute (i.e., functions defined with
inline keyword)

llvm-svn: 170065
2012-12-13 01:05:25 +00:00
Nadav Rotem 36510f7194 Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
llvm-svn: 170051
2012-12-13 00:21:03 +00:00
Chad Rosier e28ae30a8e Typo.
llvm-svn: 170050
2012-12-13 00:18:46 +00:00
Michael Ilseman 3c814128cd Restore the PHI optimization I accidently removed
llvm-svn: 170024
2012-12-12 20:59:36 +00:00
Michael Ilseman 9fc0f258fa Remove trailing whitespace
llvm-svn: 170022
2012-12-12 20:57:53 +00:00
David Majnemer 5226aa94ce Simplify negated bit test
llvm-svn: 170020
2012-12-12 20:48:54 +00:00
Nadav Rotem 6027bdf898 Fix indentation.
llvm-svn: 170005
2012-12-12 19:39:36 +00:00
Nadav Rotem d0bb22bba3 LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
llvm-svn: 170004
2012-12-12 19:29:45 +00:00
Rafael Espindola e40238069e The TargetData is not used for the isPowerOfTwo determination. It has never
been used in the first place.  It simply was passed to the function and to the
recursive invocations.  Simply drop the parameter and update the callers for the
new signature.

Patch by Saleem Abdulrasool!

llvm-svn: 169988
2012-12-12 16:52:40 +00:00
Alexey Samsonov 3d43b63a6e Improve debug info generated with enabled AddressSanitizer.
When ASan replaces <alloca instruction> with
<offset into a common large alloca>, it should also patch
llvm.dbg.declare calls and replace debug info descriptors to mark
that we've replaced alloca with a value that stores an address
of the user variable, not the user variable itself.

See PR11818 for more context.

llvm-svn: 169984
2012-12-12 14:31:53 +00:00
Nadav Rotem 6798a04b15 Fix the ascii drawing that was ruined when I split the H and CPP
llvm-svn: 169955
2012-12-12 01:33:47 +00:00
Nadav Rotem 4fa2e3d5af fix a typo.
llvm-svn: 169953
2012-12-12 01:31:10 +00:00
Nadav Rotem aeb17df802 LoopVectorizer: When -Os is used, vectorize only loops that dont require a tail loop. There is no testcase because I dont know of a way to initialize the loop vectorizer pass without adding an additional hidden flag.
llvm-svn: 169950
2012-12-12 01:11:46 +00:00
Shuxin Yang 81b3678564 - Fix a problematic way in creating all-the-1 APInt.
- Propagate "exact" bit of [l|a]shr instruction.

llvm-svn: 169942
2012-12-12 00:29:03 +00:00
Michael Ilseman d5787be5ba Remove redunant optimizations from InstCombine, instead call the appropriate functions from SimplifyInstruction
llvm-svn: 169941
2012-12-12 00:28:32 +00:00
Nadav Rotem f707bf4ca3 PR14574. Fix a bug in the code that calculates the mask the converted PHIs in if-conversion.
llvm-svn: 169916
2012-12-11 21:30:14 +00:00
Nadav Rotem e266efb70b Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars.
llvm-svn: 169904
2012-12-11 18:58:10 +00:00
Rafael Espindola a92da5b34f Use an ArrayRef instead of a std::vector&.
llvm-svn: 169881
2012-12-11 16:36:02 +00:00
Evgeniy Stepanov d2bd319adc [msan] Use explicitely aligned stores and loads with function argument shadow.
Use explicitely aligned store and load instructions to deal with argument and
retval shadow. This matters when an argument's alignment is higher than
__msan_param_tls alignment (which is the case with __m128i).

llvm-svn: 169859
2012-12-11 12:34:09 +00:00
Patrik Hagglund e98b7a0389 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
llvm-svn: 169854
2012-12-11 11:14:33 +00:00
Patrik Hagglund cbc9d4d0f9 Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT.
llvm-svn: 169840
2012-12-11 09:39:09 +00:00
Nadav Rotem dbb3328194 Fix PR14565. Don't if-convert loops that have switch statements in them.
llvm-svn: 169813
2012-12-11 04:55:10 +00:00
Nadav Rotem 36cdd82627 Enable the loop vectorizer only on O2 and above. (Still disabled by default)
llvm-svn: 169774
2012-12-10 21:45:01 +00:00
Nadav Rotem 07df5ac1a1 Split the LoopVectorizer into H and CPP.
llvm-svn: 169771
2012-12-10 21:39:02 +00:00
Bill Wendling 74f334e476 Don't use a red zone for code coverage if the user specified `-mno-red-zone'.
The `-mno-red-zone' flag wasn't being propagated to the functions that code
coverage generates. This allowed some of them to use the red zone when that
wasn't allowed.
<rdar://problem/12843084>

llvm-svn: 169754
2012-12-10 19:46:49 +00:00
Nadav Rotem 7b5b55c195 Add support for reverse induction variables. For example:
while (i--)
 sum+=A[i];

llvm-svn: 169752
2012-12-10 19:25:06 +00:00
Chandler Carruth e41e7b7901 Add a new visitor for walking the uses of a pointer value.
This visitor provides infrastructure for recursively traversing the
use-graph of a pointer-producing instruction like an alloca or a malloc.
It maintains a worklist of uses to visit, so it can handle very deep
recursions. It automatically looks through instructions which simply
translate one pointer to another (bitcasts and GEPs). It tracks the
offset relative to the original pointer as long as that offset remains
constant and exposes it during the visit as an APInt offset. Finally, it
performs conservative escape analysis.

However, currently it has some limitations that should be addressed
going forward:
1) It doesn't handle vectors of pointers.
2) It doesn't provide a cheaper visitor when the constant offset
   tracking isn't needed.
3) It doesn't support non-instruction pointer values.

The current functionality is exactly what is required to implement the
SROA pointer-use visitors in terms of this one, rather than in terms of
their own ad-hoc base visitor, which was always very poorly specified.
SROA has been converted to use this, and the code there deleted which
this utility now provides.

Technically speaking, using this new visitor allows SROA to handle a few
more cases than it previously did. It is now more aggressive in ignoring
chains of instructions which look like they would defeat SROA, but in
fact do not because they never result in a read or write of memory.
While this is "neat", it shouldn't be interesting for real programs as
any such chains should have been removed by others passes long before we
get to SROA. As a consequence, I've not added any tests for these
features -- it shouldn't be part of SROA's contract to perform such
heroics.

The goal is to extend the functionality of this visitor going forward,
and re-use it from passes like ASan that can benefit from doing
a detailed walk of the uses of a pointer.

Thanks to Ben Kramer for the code review rounds and lots of help
reviewing and debugging this patch.

llvm-svn: 169728
2012-12-10 08:28:39 +00:00
Chandler Carruth e45f4658a3 Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores.
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.

In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.

The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.

llvm-svn: 169719
2012-12-10 00:54:45 +00:00
Paul Redmond 2adb13c100 LoopVectorize: support vectorizing intrinsic calls
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav

llvm-svn: 169711
2012-12-09 20:42:17 +00:00
Paul Redmond f7cd6b391a test commit.
llvm-svn: 169709
2012-12-09 19:46:31 +00:00
Jakub Staszak 8432185ec2 Use m_OneUse pattern instead of hasOneUse() method.
No functionality change.

llvm-svn: 169703
2012-12-09 16:06:44 +00:00
Jakub Staszak 538e3861e3 Remove trailing spaces.
llvm-svn: 169701
2012-12-09 15:37:46 +00:00
Chandler Carruth 93ff2447ec Switch SROA to pop Uses off the back of its visitors' queues.
This will more closely match the behavior of the new PtrUseVisitor that
I am adding. Hopefully this will not change the actual behavior in any
way, but by making the processing order more similar help in debugging.

llvm-svn: 169697
2012-12-09 11:56:01 +00:00
Shuxin Yang 95de7c37e2 - Re-enable population count loop idiom recognization
- fix a bug which cause sigfault.
- add two testing cases which was causing crash

llvm-svn: 169687
2012-12-09 03:12:46 +00:00
Chandler Carruth 91e47532fe Revert the patches adding a popcount loop idiom recognition pass.
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.

This reverts the commits:
  r169671: Fix a logic error.
  r169604: Move the popcnt tests to an X86 subdirectory.
  r168931: Initial commit adding the pass.

llvm-svn: 169683
2012-12-08 22:18:29 +00:00
Shuxin Yang 9c5c97647f Fix an inadvertent typo error.
llvm-svn: 169671
2012-12-08 05:00:59 +00:00
Bill Wendling e94d843e43 s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future.
llvm-svn: 169651
2012-12-07 23:16:57 +00:00
Evgeniy Stepanov 383b61e791 [msan] Remove readonly/readnone attributes from all called functions.
MSan uses a TLS slot to pass shadow for function arguments and return values.
This makes all instrumented functions not readonly, and at the same time
requires that all callees of an instrumented function that may be
MSan-instrumented do not have readonly attribute (otherwise some of the
instrumentation may be optimized out).

llvm-svn: 169591
2012-12-07 09:08:32 +00:00
Jakub Staszak 772b893e5d Remove unused field.
llvm-svn: 169551
2012-12-06 22:08:59 +00:00
Jakub Staszak 9525a77bf5 Remove trailing spaces.
llvm-svn: 169550
2012-12-06 21:57:16 +00:00
NAKAMURA Takumi e0b1b4645a MemorySanitizer.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 169504
2012-12-06 13:38:00 +00:00
Evgeniy Stepanov 47ac9ba9cc [msan] Fix a typo in a comment.
llvm-svn: 169491
2012-12-06 11:58:59 +00:00
Evgeniy Stepanov 4f220d96c5 [msan] Do not store origin for clean values.
Instead of unconditionally storing origin with every application store,
only do this when the shadow of the stored value is != 0.

This change also delays instrumentation of stores until after the walk over
function's instructions, because adding new basic blocks confuses InstVisitor.

We only keep 1 origin value per 4 bytes of application memory. This change
fixes the bug when a store of a single clean byte wiped the origin for the
whole 4-byte area.

Since stores of uninitialized values are relatively uncommon, this change
improves performance of track-origins mode by 5% median and by up to 47% on
specs.

llvm-svn: 169490
2012-12-06 11:41:03 +00:00
Bill Wendling ab417b644c Set the 'MadeChange' variable if we are deleting blocks.
llvm-svn: 169455
2012-12-06 00:30:20 +00:00
Evgeniy Stepanov 8b51bab495 [msan] Instrument bswap intrinsic.
llvm-svn: 169383
2012-12-05 14:39:55 +00:00
Evgeniy Stepanov 94b257df3c [msan] Initialize callbacks in runOnFunction as opposed to doInitialization.
This mirrors the change in ASan & TSan done in r168864.

llvm-svn: 169378
2012-12-05 13:14:33 +00:00
Evgeniy Stepanov 474cb3b3b5 [msan] Change linkage type of __msan_track_origins.
LinkOnceODRLinkage globals may be removed in GlobalOpt if not used in the
current module.

llvm-svn: 169377
2012-12-05 12:49:41 +00:00
Nadav Rotem a8f026e2d4 LoopVectorizer: Increase the number of pointers that can be tested at runtime. If we cant prove statically that the pointers are disjoint then we add the runtime check.
llvm-svn: 169334
2012-12-04 23:25:24 +00:00
Nadav Rotem 87fc988c5d Enable if-conversion during vectorization.
llvm-svn: 169331
2012-12-04 22:59:52 +00:00
Nadav Rotem 93fa5ef957 Fix a bug in vectorization of if-converted reduction variables. If the
reduction variable is not used outside the loop then we ran into an
endless loop. This change checks if we found the original PHI.

llvm-svn: 169324
2012-12-04 22:40:22 +00:00
Shuxin Yang 73285933c9 For rdar://12329730, last piece.
This change attempts to simplify (X^Y) -> X or Y in the user's context if we know that
only bits from X or Y are demanded.

  A minimized case is provided bellow. This change will simplify "t>>16" into "var1 >>16".

  =============================================================
  unsigned foo (unsigned val1, unsigned val2) {
    unsigned t = val1 ^ 1234;
    return (t >> 16) | t; // NOTE: t is used more than once.
  }
  =============================================================

  Note that if the "t" were used only once, the expression would be finally optimized as well.
However, with with this change, the optimization will take place earlier.

  Reviewed by Nadav, Thanks a lot!

llvm-svn: 169317
2012-12-04 22:15:32 +00:00
Nadav Rotem a10b311aec Add support for reduction variables when IF-conversion is enabled.
llvm-svn: 169288
2012-12-04 18:17:33 +00:00
Chandler Carruth 802d755533 Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

llvm-svn: 169224
2012-12-04 07:12:27 +00:00
Nadav Rotem 07674cb566 Give scalar if-converted blocks half the score because they are not always executed due to CF.
llvm-svn: 169223
2012-12-04 07:11:52 +00:00
Nadav Rotem 628c2dba60 Add the last part that is needed for vectorization of if-converted code.
Added the code that actually performs the if-conversion during vectorization.

We can now vectorize this code:

for (int i=0; i<n; ++i) {
  unsigned k = 0;

  if (a[i] > b[i])   <------ IF inside the loop.
    k = k * 5 + 3;

  a[i] = k;          <---- K is a phi node that becomes vector-select.
}

llvm-svn: 169217
2012-12-04 06:15:11 +00:00
Kostya Serebryany 9b65726d24 [asan] add experimental -asan-realign-stack option (true by default, which does not change the current behavior)
llvm-svn: 169216
2012-12-04 06:14:01 +00:00
Matt Beaumont-Gay abfc446063 Add 'using' declarations to suppress -Woverloaded-virtual warnings.
llvm-svn: 169214
2012-12-04 05:41:27 +00:00
Shuxin Yang 86c0e232b7 rdar://12329730 (2nd part, revised)
The type of shirt-right (logical or arithemetic) should remain unchanged 
when transforming  "X << C1 >> C2" into "X << (C1-C2)"

llvm-svn: 169209
2012-12-04 03:28:32 +00:00
Alexey Samsonov 261177a1e1 ASan: add initial support for handling llvm.lifetime intrinsics in ASan - emit calls into runtime library that poison memory for local variables when their lifetime is over and unpoison memory when their lifetime begins.
llvm-svn: 169200
2012-12-04 01:34:23 +00:00
NAKAMURA Takumi f99b535fdb LoopVectorize.cpp: Suppress a warning. [-Wunused-variable]
llvm-svn: 169195
2012-12-04 00:49:34 +00:00
NAKAMURA Takumi 8b07bc579b Fix whitespace.
llvm-svn: 169194
2012-12-04 00:49:28 +00:00
Shuxin Yang 63e999edbf rdar://12329730 (2nd part)
This change tries to simmplify E1 = " X >> C1 << C2" into :
  - E2 = "X << (C2 - C1)" if C2 > C1, or
  - E2 = "X >> (C1 - C2)" if C1 > C2, or
  - E2 = X if C1 == C2.

 Reviewed by Nadav. Thanks!

llvm-svn: 169182
2012-12-04 00:04:54 +00:00
Nadav Rotem d479a57f68 minor renaming, documentation and cleanups.
llvm-svn: 169175
2012-12-03 22:57:09 +00:00
Nadav Rotem fad16be973 IF-conversion: teach the cost-model how to grade if-converted loops.
llvm-svn: 169171
2012-12-03 22:46:31 +00:00
Nadav Rotem eee203d885 Now that we have a basic if-conversion infrastructure we can rename the
"single basic block loop vectorizer" to "innermost loop vectorizer".

llvm-svn: 169158
2012-12-03 21:33:08 +00:00
Nadav Rotem a30aba7a01 Add initial support for IF-conversion. This patch implements the first 1/3,
which is the legality of the if-conversion transformation. The next step is to
implement the cost-model for the if-converted code as well as the
vectorization itself.

llvm-svn: 169152
2012-12-03 21:06:35 +00:00
Alexey Samsonov ef51c3ff81 ASan: add blacklist file to ASan pass options. Clang patch for this will follow.
llvm-svn: 169143
2012-12-03 19:09:26 +00:00
Nadav Rotem 2349531def Teach the jump threading optimization to stop scanning the basic block when calculating the cost after passing the threshold.
llvm-svn: 169135
2012-12-03 17:34:44 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Chandler Carruth f02b8bf11b Remove some buggy and apparantly unnecessary code from SROA.
The partitioning logic attempted to handle uses of an alloca with an
offset starting before the alloca so long as the use had some overlap
with the alloca itself. However, there was a bug where we tested
'(uint64_t)Offset >= AllocSize' without first checking whether 'Offset'
was positive. As a consequence, essentially every negative offset (that
is, starting *before* the alloca does) would be thrown out, even if it
was overlapping. The subsequent code to throw out negative offsets which
were actually non-overlapping was essentially dead. The code to *handle*
overlapping negative offsets was actually dead!

I've just removed all of this, and taught SROA to discard any uses which
start prior to the alloca from the beginning. It has the lovely property
of simplifying the code. =] All the tests still pass, and in fact no new
tests are needed as this is already covered by our testsuite. Fixing the
code so that negative offsets work the way the comments indicate they
were supposed to work causes regressions. That's how I found this.

Anyways, this is all progress in the correct direction -- tightening up
SROA to be maximally aggressive. Some day, I really hope to turn
out-of-bounds accesses to an alloca into 'unreachable'.

llvm-svn: 169120
2012-12-03 10:59:55 +00:00
Nuno Lopes 5eec2679df fix stats for added checks
llvm-svn: 169119
2012-12-03 10:15:03 +00:00
Benjamin Kramer 47534c7440 SROA: Avoid struct and array types early to avoid creating an overly large integer type.
Fixes PR14465.

Differential Revision: http://llvm-reviews.chandlerc.com/D148

llvm-svn: 169084
2012-12-01 11:53:32 +00:00
Zhou Sheng 8e6d64a71d Revert previous check in r168581, r169079 as they are still in code review status.
llvm-svn: 169083
2012-12-01 10:54:28 +00:00
Zhou Sheng 13fb1ca44a The patch is to improve the memory footprint of pass GlobalOpt.
Also check in a case to repeat the issue, on which 'opt -globalopt' consumes 1.6GB memory.
The big memory footprint cause is that current GlobalOpt one by one hoists and stores the leaf element constant into the global array, in each iteration, it recreates the global array initializer constant and leave the old initializer alone. This may result in many obsolete constants left.
For example:  we have global array @rom = global [16 x i32] zeroinitializer
After the first element value is hoisted and installed:   @rom = global [16 x i32] [ 1, 0, 0, ... ]
After the second element value is installed:  @rom = global [16 x 32] [ 1, 2, 0, 0, ... ]        // here the previous initializer is obsolete
...
When the transform is done, we have 15 obsolete initializers left useless.

llvm-svn: 169079
2012-12-01 04:38:53 +00:00
Pedro Artigas 00b83c9b8d reversed the logic of the log2 detection routine to reduce the number of nested ifs
llvm-svn: 169049
2012-11-30 22:47:15 +00:00
Nadav Rotem 3ae24ee08a minor cleanups
llvm-svn: 169048
2012-11-30 22:37:11 +00:00
Bill Wendling c786b31233 Replace r168930 with a more reasonable patch.
The original patch removed a bunch of code that the SjLjEHPrepare pass placed
into the entry block if all of the landing pads were removed during the
CodeGenPrepare class. The more natural way of doing things is to run the CGP
*before* we run the SjLjEHPrepare pass.

Make it so!

llvm-svn: 169044
2012-11-30 22:08:55 +00:00
Pedro Artigas 993acd0c54 Addresses many style issues with prior checkin (r169025)
llvm-svn: 169043
2012-11-30 22:07:05 +00:00
Pedro Artigas d8795040de Add fast math inst combine X*log2(Y*0.5)-->X*log2(Y)-X
reviewed by Michael Ilseman <milseman@apple.com>

llvm-svn: 169025
2012-11-30 19:09:41 +00:00
Nadav Rotem 6b494be886 Remove the use of LPPassManager. We can remove LPM because we dont need to run any additional loop passes on the new vector loop.
llvm-svn: 169016
2012-11-30 17:27:53 +00:00
Kostya Serebryany 817b60af38 [asan] simplify the code around doesNotReturn call. It now magically works.
llvm-svn: 168995
2012-11-30 11:08:59 +00:00
Chandler Carruth d9ef81e133 Fix non-determinism introduced in r168970 and pointed out by Duncan.
We're iterating over a non-deterministically ordered container looking
for two saturating flags. To do this correctly, we have to saturate
both, and only stop looping if both saturate to their final value.
Otherwise, which flag we see first changes the result.

This is also a micro-optimization of the previous version as now we
don't go into the (possibly expensive) test logic once the first
violation of either constraint is detected.

llvm-svn: 168989
2012-11-30 09:34:29 +00:00
Chandler Carruth 77d433dafe Rearrange the comments, control flow, and variable names; no
functionality changed.

Evan's commit r168970 moved the code that the primary comment in this
function referred to to the other end of the function without moving the
comment, and there has been a steady creep of "boolean" logic in it that
is simpler if handled via early exit. That way each special case can
have its own comments. I've also made the variable name a bit more
explanatory than "AllFit". This is in preparation to fix the
non-deterministic output of this function.

llvm-svn: 168988
2012-11-30 09:26:25 +00:00
Meador Inge e3f2b26bfa Move library call simplification statistic to instcombine
The simplify-libcalls pass maintained a statistic to count the number
of library calls that have been simplified.  Now that library call
simplification is being carried out in instcombine the statistic should
be moved to there.

llvm-svn: 168975
2012-11-30 04:05:06 +00:00
Chandler Carruth dbd6958183 Move the InstVisitor utility into VMCore where it belongs. It heavily
depends on the IR infrastructure, there is no sense in it being off in
Support land.

This is in preparation to start working to expand InstVisitor into more
special-purpose visitors that are still generic and can be re-used
across different passes. The expansion will go into the Analylis tree
though as nothing in VMCore needs it.

llvm-svn: 168972
2012-11-30 03:08:41 +00:00
Evan Cheng 65df808f62 Fix logic to determine whether to turn a switch into a lookup table. When
the tables cannot fit in registers (i.e. bitmap), do not emit the table
if it's using an illegal type.

rdar://12779436

llvm-svn: 168970
2012-11-30 02:02:42 +00:00
Shuxin Yang abcc370423 rdar://12100355 (part 1)
This revision attempts to recognize following population-count pattern:

 while(a) { c++; ... ; a &= a - 1; ... },
  where <c> and <a>could be used multiple times in the loop body.

 TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to a efficent 
instruction sequence, which need to be improved in the following commits.

Reviewed by Nadav, really appreciate!

llvm-svn: 168931
2012-11-29 19:38:54 +00:00
Bill Wendling a4a77edf2e Handle the situation where CodeGenPrepare removes a reference to a BB that has
the last invoke instruction in the function. This also removes the last landing
pad in an function. This is fine, but with SjLj EH code, we've already placed a
bunch of code in the 'entry' block, which expects the landing pad to stick
around.

When we get to the situation where CGP has removed the last landing pad, go
ahead and nuke the SjLj instructions from the 'entry' block.
<rdar://problem/12721258>

llvm-svn: 168930
2012-11-29 19:38:06 +00:00
Nadav Rotem ec739205cc No need to run LICM after loop vectorization because we dont generate invariant code any more.
llvm-svn: 168928
2012-11-29 19:28:29 +00:00
Nadav Rotem 8dd6ee8df5 When broadcasting invariant scalars into vectors, place the broadcast code in the preheader.
llvm-svn: 168927
2012-11-29 19:25:41 +00:00
Meador Inge 75798bb7fe instcombine: Migrate puts optimizations
This patch migrates the puts optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

All the simplifiers from simplify-libcalls have now been migrated to
instcombine.  Yay!  Just a few other bits to migrate (prototype attribute
inference and a few statistics) and simplify-libcalls can finally be put
to rest.

llvm-svn: 168925
2012-11-29 19:15:17 +00:00
Alexey Samsonov 9a956e8cd2 [ASan] Simplify check added in r168861. Bail out from module pass early if the module is blacklisted.
llvm-svn: 168913
2012-11-29 18:27:01 +00:00
Matt Beaumont-Gay c76536f886 Apply Takumi's patch to suppress unused-variable warnings in -Asserts builds.
llvm-svn: 168911
2012-11-29 18:15:49 +00:00
Alexey Samsonov df6245233c Add options to AddressSanitizer passes to make them configurable by frontend.
llvm-svn: 168910
2012-11-29 18:14:24 +00:00
Meador Inge f8e725081c instcombine: Migrate fputs optimizations
This patch migrates the fputs optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168893
2012-11-29 15:45:43 +00:00
Meador Inge bc84d1a4f5 instcombine: Migrate fwrite optimizations
This patch migrates the fwrite optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168892
2012-11-29 15:45:39 +00:00
Meador Inge 1009cecca0 instcombine: Migrate fprintf optimizations
This patch migrates the fprintf optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168891
2012-11-29 15:45:33 +00:00
Evgeniy Stepanov 30484fc704 [msan] Handle vector manipulation instructions.
Handle insertelement, extractelement, shufflevector.

llvm-svn: 168889
2012-11-29 15:22:06 +00:00
Evgeniy Stepanov f433cecf96 [msan] Fix getOriginForNaryOp.
The old version failed on a 3-arg instruction with (-1, 0, 0) shadows (it would
pick the 3rd operand origin irrespective of its shadow).
    
The new version always picks the origin of the rightmost poisoned operand.

llvm-svn: 168887
2012-11-29 14:44:00 +00:00
Evgeniy Stepanov 7ad7e83031 [msan] Basic handling of inline asm.
llvm-svn: 168884
2012-11-29 14:32:03 +00:00
Evgeniy Stepanov 857d9d2a59 [msan] Propagate shadow through (x<0) and (x>=0) comparisons.
This is a special case of signed relational comparison where result
only depends on the sign of x.

llvm-svn: 168881
2012-11-29 14:25:47 +00:00
Evgeniy Stepanov eeb8b7c391 [msan] Fix shadow & origin store & load alignment.
This change ensures that shadow memory accesses have the same alignment
as corresponding app memory accesses.

llvm-svn: 168880
2012-11-29 14:05:53 +00:00
Evgeniy Stepanov 62ba611828 [msan] Optimize getOriginPtr.
Rewrite getOriginPtr in a way that lets subsequent optimizations factor out
the common part of Shadow and Origin address calculation. Improves perf by
up to 5%.

llvm-svn: 168879
2012-11-29 13:43:05 +00:00
Evgeniy Stepanov da0072b676 [msan] Fix a few compilation warnings.
llvm-svn: 168878
2012-11-29 13:12:03 +00:00
Evgeniy Stepanov 62b5db9361 [msan] Transform memcpy and memset to library calls.
This was already done for memmove, where it is required for correctness.
This change improves performance by avoiding copyingthe same memory twice.
Also, the library functions are given __msan_ prefix to prevent instcombine
pass from converting them back to intrinsics.

llvm-svn: 168876
2012-11-29 12:49:04 +00:00
Evgeniy Stepanov 1d2da65bf8 [msan] Make sure that report callbacks do not get merged.
llvm-svn: 168873
2012-11-29 12:30:18 +00:00
Evgeniy Stepanov d4bd7b73e3 Initial commit of MemorySanitizer.
Compiler pass only.

llvm-svn: 168866
2012-11-29 09:57:20 +00:00
Kostya Serebryany 4b929dae93 [asan/tsan] initialize the asan/tsan callbacks in runOnFunction as opposed to doInitialization. This is required to allow the upcoming changes in PassManager behavior
llvm-svn: 168864
2012-11-29 09:54:21 +00:00
Kostya Serebryany 633bf93fb8 [asan] when checking the noreturn attribute on the call, also check it on the callee
llvm-svn: 168861
2012-11-29 08:57:20 +00:00
Nick Lewycky 1af94eb075 Issue a fatal error if the line doesn't have a regular expression.
Also a couple not-user-visible changes; using empty() instead of size(), and
make inSection() not insert NULL Regex*'s into StringMap when doing a lookup.

llvm-svn: 168833
2012-11-29 00:01:38 +00:00
Bill Wendling f3614fd8e2 When we delete a dead basic block, see if any of its successors are dead and
delete those as well.

llvm-svn: 168829
2012-11-28 23:23:48 +00:00
Kostya Serebryany dfe9e7933e [asan] Split AddressSanitizer into two passes (FunctionPass, ModulePass), LLVM part. This requires a clang part which will follow.
llvm-svn: 168781
2012-11-28 10:31:36 +00:00
Hal Finkel 88ee6b0082 BBVectorize: Correctly merge SubclassOptionalData
When two instructions are combined into a vector instruction,
the resulting instruction must have the most-conservative flags.

llvm-svn: 168765
2012-11-28 03:04:10 +00:00
Meador Inge f1bc9e7431 instcombine: Don't replace all uses for instructions with no uses
My commit to migrate the printf simplifiers from the simplify-libcalls
in r168604 introduced a regression reported by Duncan [1].  The problem
is that in some cases the library call simplifier can return a new value
that has no uses and the new value's type is different than the old value's
type (which is fine because there are no uses).  The specific case that
triggered the bug looked something like:

   declare void @printf(i8*, ...)
   ...
   call void (i8*, ...)* @printf(i8* %fmt)

Which we want to optimized into:

   call i32 @putchar(i32 104)

However, the code was attempting to replace all uses of the printf with
the putchar and the types differ, hence a crash.  This is fixed by *just*
deleting the original instruction when there are no uses.  The old
simplify-libcalls pass is already doing something similar.

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-November/056338.html

llvm-svn: 168716
2012-11-27 18:52:49 +00:00
Bill Wendling ee5984df39 Remove the dependent libraries feature.
The dependent libraries feature was never used and has bit-rotted. Remove it.

llvm-svn: 168694
2012-11-27 09:55:56 +00:00
Dmitry Vyukov a878e74351 tsan: instrument atomic nand operation
llvm-svn: 168684
2012-11-27 08:09:25 +00:00
Meador Inge 25c9b3b6e4 instcombine: Migrate sprintf optimizations
This patch migrates the sprintf optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168677
2012-11-27 05:57:54 +00:00
Eli Friedman b14873c4f1 Get rid of the getPointeeAlignment helper function from
InstCombineLoadStoreAlloca.cpp, which had many issues.
(At least two bugs were noted on llvm-commits, and it was overly conservative.)
Instead, use getOrEnforceKnownAlignment.

llvm-svn: 168629
2012-11-26 23:04:53 +00:00
Shuxin Yang 6ea79e864d rdar://12329730 (defect 2)
Enhancement to InstCombine. Try to catch this opportunity:
  
 ---------------------------------------------------------------
 ((X^C1) >> C2) ^ C3  => (X>>C2) ^ ((C1>>C2)^C3)
  where the subexpression "X ^ C1" has more than one uses, and
  "(X^C1) >> C2" has single use. 
 ---------------------------------------------------------------- 

 Reviewed by Nadav (with minor change per his request).

llvm-svn: 168615
2012-11-26 21:44:25 +00:00
Meador Inge efe2393113 Fix a comment bug in toascii simplifier
When I migrated the toascii simplifier in r168580 Benjamin Kramer noticed
a bug in one of the comments that I migrated.

llvm-svn: 168605
2012-11-26 20:37:23 +00:00
Meador Inge 08ca115abd instcombine: Migrate printf optimizations
This patch migrates the printf optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168604
2012-11-26 20:37:20 +00:00
Nadav Rotem caf5acfd14 Move the code that uses SCEVs prior to creating the new loops.
llvm-svn: 168601
2012-11-26 19:51:46 +00:00
Matt Beaumont-Gay 000a3958e9 Remove stray trailing backslash
llvm-svn: 168592
2012-11-26 16:27:22 +00:00
Dmitry Vyukov 93e67b6c06 tsan: fix lint warnings
llvm-svn: 168590
2012-11-26 14:55:26 +00:00
Dmitry Vyukov 12b5cb9a0a [tsan] add fail order to compare_exchange
llvm-svn: 168586
2012-11-26 11:36:19 +00:00
Meador Inge 604937d1cc instcombine: Migrate toascii optimizations
This patch migrates the toascii optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168580
2012-11-26 03:38:52 +00:00
Meador Inge a62a39e0e9 instcombine: Migrate isascii optimizations
This patch migrates the isascii optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168579
2012-11-26 03:10:07 +00:00
Meador Inge 9a59ab6133 instcombine: Migrate isdigit optimizations
This patch migrates the isdigit optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168578
2012-11-26 02:31:59 +00:00
Meador Inge a0b6d87879 instcombine: Migrate *abs optimizations
This patch migrates the *abs optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168574
2012-11-26 00:24:07 +00:00
Meador Inge 7415f8403d instcombine: Migrate ffs* optimizations
This patch migrates the ffs* optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 168571
2012-11-25 20:45:27 +00:00
Nadav Rotem ee7ede76f4 Move the max vector width to a constant parameter. No functionality change.
llvm-svn: 168570
2012-11-25 16:48:08 +00:00
Nadav Rotem ef33b5076c Fix the document style.
llvm-svn: 168569
2012-11-25 16:39:01 +00:00
Nadav Rotem 12192f19eb Refactor the ptr runtime check generation code. No functionality change.
llvm-svn: 168568
2012-11-25 16:27:16 +00:00
Nadav Rotem b15d9fe24d Rename method. No functionality change.
llvm-svn: 168560
2012-11-25 09:13:57 +00:00
Nadav Rotem bf5173460f The induction-pointer work is inspired by a research paper. This commit adds a reference.
llvm-svn: 168559
2012-11-25 09:09:26 +00:00
Nadav Rotem ea3824f160 Add support for pointer induction variables even when there is no integer induction variable.
llvm-svn: 168558
2012-11-25 08:41:35 +00:00
Benjamin Kramer 455fa35e51 CodeGenPrepare: Move ret duplication out of the instruction iteration loop.
It can delete the block, and the loop continues on free'd memory.
No change in output. Found by valgrind.

llvm-svn: 168525
2012-11-23 19:17:06 +00:00
Joey Gouly 9519823271 Remove unused parameter Penalty from the BoundsChecking pass.
llvm-svn: 168511
2012-11-23 10:47:35 +00:00
NAKAMURA Takumi 6b8b2a9b98 llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp: Prune AddressSanitizerCreateGlobalRedzonesPass::ID. [-Wunused-variable]
llvm-svn: 168499
2012-11-22 14:18:25 +00:00
Kostya Serebryany 20a79970e0 [asan] rip off the creation of global redzones from the main AddressSanitizer class into a separate class. The intent is to make it a separate ModulePass in the following commmits
llvm-svn: 168484
2012-11-22 03:18:50 +00:00
Chandler Carruth 845b73c06f PR14055: Implement support for sub-vector operations in SROA.
Now if we can transform an alloca into a single vector value, but it has
subvector, non-element accesses, we form the appropriate shufflevectors
to allow SROA to proceed. This fixes PR14055 which pointed out a very
common pattern that SROA couldn't handle -- mixed vec3 and vec4
operations on a single alloca.

llvm-svn: 168418
2012-11-21 08:16:30 +00:00
Kostya Serebryany 139a937903 [asan] use names of globals instead of an external set to distinguish the globals generated by asan
llvm-svn: 168368
2012-11-20 14:16:08 +00:00
Kostya Serebryany dc4cb2b736 [asan] don't instrument linker-initialized globals even with external linkage in -asan-initialization-order mode
llvm-svn: 168367
2012-11-20 13:11:32 +00:00
Kostya Serebryany b3bd605ffa [asan] make sure that linker-initialized globals (non-extern) are not instrumented even in -asan-initialization-order mode. This time with a test
llvm-svn: 168366
2012-11-20 13:00:01 +00:00
Chandler Carruth b7915f7fbf Use LLVM_ENABLE_DUMP for the variables used in printing as well as the
printing functions themselves.

Part of PR14324 (which should have just been a patch to the list, but
hey...)

llvm-svn: 168362
2012-11-20 10:23:07 +00:00
Chandler Carruth 3e994a26e2 Fix PR14132 and handle OOB loads speculated throuh PHI nodes.
The issue is that we may end up with newly OOB loads when speculating
a load into the predecessors of a PHI node, and this confuses the new
integer splitting logic in some cases, triggering an assertion failure.
In fact, the branch in question must be dead code as it loads from
a too-narrow alloca. Add code to handle this gracefully and leave the
requisite FIXMEs for both optimizing more aggressively and doing more to
aid sanitizing invalid code which triggers these patterns.

llvm-svn: 168361
2012-11-20 10:02:19 +00:00
Bill Wendling f86efb9b98 Make the AttrListPtr object a part of the LLVMContext.
When code deletes the context, the AttributeImpls that the AttrListPtr points to
are now invalid. Therefore, instead of keeping a separate managed static for the
AttrListPtrs that's reference counted, move it into the LLVMContext and delete
it when deleting the AttributeImpls.

llvm-svn: 168354
2012-11-20 05:09:20 +00:00
Chandler Carruth 9324c96064 Add a comment to associate a FIXME with a PR where it is matters.
llvm-svn: 168347
2012-11-20 01:27:48 +00:00
Chandler Carruth 18db795b05 Rework the rewriting of loads and stores for vector and integer allocas
to properly handle the combinations of these with split integer loads
and stores. This essentially replaces Evan's r168227 by refactoring the
code in a different way, and trynig to mirror that refactoring in both
the load and store sides of the rewriting.

Generally speaking there was some really problematic duplicated code
here that led to poorly founded assumptions and then subtle bugs. Now
much of the code actually flows through and follows a more consistent
style and logical path. There is still a tiny bit of duplication on the
store side of things, but it is much less bad.

This also changes the logic to never re-use a load or store instruction
as that was simply too error prone in practice.

I've added a few tests (one a reduction of the one in Evan's original
patch, which happened to be the same as the report in PR14349). I'm
going to look at adding a few more tests for things I found and fixed in
passing (such as the volatile tests in the vectorizable predicate).

This patch has survived bootstrap, and modulo one bugfix survived
Duncan's test suite, but let me know if anything else explodes.

llvm-svn: 168346
2012-11-20 01:12:50 +00:00
Bob Wilson a5b0dc8884 Clean up handling of always-inline functions in the inliner.
This patch moves the isInlineViable function from the InlineAlways pass into
the InlineCostAnalyzer and then changes the InlineCost computation to use that
simple check for always-inline functions. All the special-case checks for
AlwaysInline in the CallAnalyzer can then go away.

llvm-svn: 168300
2012-11-19 07:04:35 +00:00
Duncan Sands 79cf530d56 Remove the last bit of constant folding from LinearizeExprTree (most of it was
removed in commit 168035, but I missed this bit).

llvm-svn: 168292
2012-11-18 20:15:36 +00:00
Duncan Sands 20bd7fa0f7 Fix PR14060, an infinite loop in reassociate. The problem was that one of the
operands of the expression being written was wrongly thought to be reusable as
an inner node of the expression resulting in it turning up as both an inner node
*and* a leaf, creating a cycle in the def-use graph.  This would have caused the
verifier to blow up if things had gotten that far, however it managed to provoke
an infinite loop first.

llvm-svn: 168291
2012-11-18 19:27:01 +00:00
Nick Lewycky 3d35b45f8e Don't try to calculate the alignment of an unsigned type. Fixes PR14371!
llvm-svn: 168280
2012-11-18 05:39:39 +00:00
Benjamin Kramer 96e1e39642 Plug a memory leak in the GCOV profiling emitter, which never released the edge table memory.
llvm-svn: 168259
2012-11-17 13:49:37 +00:00
Nadav Rotem c3c07e62e8 LoopVectorizer: Add initial support for pointer induction variables (for example: *dst++ = *src++).
At the moment we still require to have an integer induction variable (for example: i++).

llvm-svn: 168231
2012-11-17 00:27:03 +00:00
Evan Cheng f1b6177b62 Teach SROA rewriteVectorizedStoreInst to handle cases when the loaded value is narrower than the stored value. rdar://12713675
llvm-svn: 168227
2012-11-17 00:05:06 +00:00
Duncan Sands d7d8c09b93 Make this easier to understand, as suggested by Chandler.
llvm-svn: 168196
2012-11-16 20:53:08 +00:00
Duncan Sands 1d3acddf0e Fix PR14361: wrong simplification of A+B==B+A. You may think that the old logic
replaced by this patch is equivalent to the new logic, but you'd be wrong, and
that's exactly where the bug was.  There's a similar bug in instsimplify which
manifests itself as instsimplify failing to simplify this, rather than doing it
wrong, see next commit.

llvm-svn: 168181
2012-11-16 18:55:49 +00:00
Hans Wennborg 7b8af0ea05 SimplifyCFG: Don't assume non-null ScalarTargetTransformInfo.
Patch by Pekka Jääskeläinen!

llvm-svn: 168176
2012-11-16 18:22:08 +00:00
Nadav Rotem 0565b5a279 LoopVectorize: Division reductions generate incorrect code. Remove the part of the code that deals with divs.
Thanks to Paul Redmond for catching this while reviewing the code.

llvm-svn: 168142
2012-11-16 06:51:17 +00:00
Andrew Trick 7656f6dbf7 misspell
llvm-svn: 168058
2012-11-15 18:40:31 +00:00
Andrew Trick 90f5029118 whitespace
llvm-svn: 168057
2012-11-15 18:40:29 +00:00
Dmitri Gribenko 0011bbf985 Use empty parens for empty function parameter list instead of '(void)'.
llvm-svn: 168049
2012-11-15 16:51:49 +00:00
Hans Wennborg 709e015cf1 Make GlobalOpt be conservative with TLS variables (PR14309)
For global variables that get the same value stored into them
everywhere, GlobalOpt will replace them with a constant. The problem is
that a thread-local GlobalVariable looks like one value (the address of
the TLS var), but is different between threads.

This patch introduces Constant::isThreadDependent() which returns true
for thread-local variables and constants which depend on them (e.g. a GEP
into a thread-local array), and teaches GlobalOpt not to track such
values.

llvm-svn: 168037
2012-11-15 11:40:00 +00:00
Duncan Sands ac852c742f Fix a crash observed by Shuxin Yang. The issue here is that LinearizeExprTree,
the utility for extracting a chain of operations from the IR, thought that it
might as well combine any constants it came across (rather than just returning
them along with everything else).  On the other hand, the factorization code
would like to see the individual constants (this is quite reasonable: it is
much easier to pull a factor of 3 out of 2*3 than it is to pull it out of 6;
you may think 6/3 isn't so hard, but due to overflow it's not as easy to undo
multiplications of constants as it may at first appear).  This patch therefore
makes LinearizeExprTree stupider: it now leaves optimizing to the optimization
part of reassociate, and sticks to just analysing the IR.

llvm-svn: 168035
2012-11-15 09:58:38 +00:00
NAKAMURA Takumi 00d2a107fb InstCombineAndOrXor.cpp: Escape bracket in doxygen description. [-Wdocumentation]
llvm-svn: 168013
2012-11-15 00:35:50 +00:00
Hal Finkel e9740a4692 Replace std::vector -> SmallVector in BBVectorize
For now, this uses 8 on-stack elements. I'll need to do some profiling
to see if this is the best number.

Pointed out by Jakob in post-commit review.

llvm-svn: 167966
2012-11-14 19:53:27 +00:00
Hal Finkel 1b7f0aba48 Fix the largest offender of determinism in BBVectorize
Iterating over the children of each node in the potential vectorization
plan must happen in a deterministic order (because it affects which children
are erased when two children conflict). There was no need for this data
structure to be a map in the first place, so replacing it with a vector
is a small change.

I believe that this was the last remaining instance if iterating over the
elements of a Dense* container where the iteration order could matter.
There are some remaining iterations over std::*map containers where the order
might matter, but so long as the Value* for instructions in a block increase
with the order of the instructions in the block (or decrease) monotonically,
then this will appear to be deterministic.

llvm-svn: 167942
2012-11-14 18:38:11 +00:00
Alexey Samsonov 2b27170fdc [TSan] fix indentation
llvm-svn: 167928
2012-11-14 14:33:59 +00:00
Nadav Rotem a43bcddc8d use the getSplat API. Patch by Paul Redmond.
llvm-svn: 167892
2012-11-14 00:02:13 +00:00
Alexey Samsonov cfd662f279 Figure out <size> argument of llvm.lifetime intrinsics at the moment they are created (during function inlining)
llvm-svn: 167821
2012-11-13 07:15:32 +00:00
Hal Finkel b51bdd20d3 BBVectorize: Remove temporary assert used for debugging
llvm-svn: 167817
2012-11-13 05:54:54 +00:00
Meador Inge 193e035b9c instcombine: Migrate math library call simplifications
This patch migrates the math library call simplifications from the
simplify-libcalls pass into the instcombine library call simplifier.

I have typically migrated just one simplifier at a time, but the math
simplifiers are interdependent because:

   1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt.
   2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on
      the option -enable-double-float-shrink.

These two factors made migrating each of these simplifiers individually
more of a pain than it would be worth.  So, I migrated them all together.

llvm-svn: 167815
2012-11-13 04:16:17 +00:00
Hal Finkel 2a1df367d4 BBVectorize: Don't vectorize vector-manipulation chains
Don't choose a vectorization plan containing only shuffles and
vector inserts/extracts. Due to inperfections in the cost model,
these can lead to infinite recusion.

llvm-svn: 167811
2012-11-13 03:12:40 +00:00
Shuxin Yang c94c3bb5d0 revert r167740
llvm-svn: 167787
2012-11-13 00:08:49 +00:00
Hal Finkel 3b79f55c5f BBVectorize: Only some insert element operand pairs are free.
This fixes another infinite recursion case when using target costs.
We can only replace insert element input chains that are pure (end
with inserting into an undef).

llvm-svn: 167784
2012-11-12 23:55:36 +00:00
Hal Finkel 9cf3372931 BBVectorize: Use a more sophisticated check for input cost
The old checking code, which assumed that input shuffles and insert-elements
could always be folded (and thus were free) is too simple.
This can only happen in special circumstances.
Using the simple check caused infinite recursion.

llvm-svn: 167750
2012-11-12 21:21:02 +00:00
Hal Finkel f8326b6052 BBVectorize: Check the types of compare instructions
The pass would previously assert when trying to compute the cost of
compare instructions with illegal vector types (like struct pointers).

llvm-svn: 167743
2012-11-12 19:41:38 +00:00
Shuxin Yang 1c442f5ec6 This change is to fix rdar://12571717 which is about assertion in Reassociate pass.
The assertion is trigged when the Reassociater tries to transform expression
     ... + 2 * n * 3 + 2 * m + ...
  into:
     ... + 2 * (n*3 + m).

In the process of the transformation, a helper routine folds the constant 2*3 into 6,
confusing optimizer which is trying the to eliminate the common factor 2, and cannot
find 2 any more. 

Review is pending. But I'd like commit first in order to help those who are waiting 
for this fix. 

llvm-svn: 167740
2012-11-12 19:34:11 +00:00
Hal Finkel ef53df0f9f BBVectorize: Check the input types of shuffles for legality
This fixes a bug where shuffles were being fused such that the
resulting input types were not legal on the target. This would
occur only when both inputs and dependencies were also foldable
operations (such as other shuffles) and there were other connected
pairs in the same block.

llvm-svn: 167731
2012-11-12 14:50:59 +00:00
Alexey Samsonov afc550d948 [ASan] fixup for r167725: Don't fetch name of StructType if it is literal
llvm-svn: 167729
2012-11-12 14:47:00 +00:00
Meador Inge b3e91f6ae0 Normalize memcmp constant folding results.
The library call simplifier folds memcmp calls with all constant arguments
to a constant.  For example:

  memcmp("foo", "foo", 3) ->  0
  memcmp("hel", "foo", 3) ->  1
  memcmp("foo", "hel", 3) -> -1

The folding is implemented in terms of the system memcmp that LLVM gets
linked with.  It currently just blindly uses the value returned from
the system memcmp as the folded constant.

This patch normalizes the values returned from the system memcmp to
(-1, 0, 1) so that we get consistent results across multiple platforms.
The test cases were adjusted accordingly.

llvm-svn: 167726
2012-11-12 14:00:45 +00:00
Alexey Samsonov 582d7de709 [ASan]: Add minimalistic support for turning off initialization-order checking for globals of specified types. Tests for this behavior will go to ASan test suite in compiler-rt.
llvm-svn: 167725
2012-11-12 14:00:01 +00:00
Meador Inge f963a8ffcc Delete a stale comment. No functional change.
llvm-svn: 167698
2012-11-12 00:28:15 +00:00
Meador Inge d4825780ed instcombine: Migrate memset optimizations
This patch migrates the memset optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167689
2012-11-11 06:49:03 +00:00
Meador Inge 9cf328b526 instcombine: Migrate memmove optimizations
This patch migrates the memmove optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167687
2012-11-11 06:22:40 +00:00
Meador Inge dd9234a10a instcombine: Migrate memcpy optimizations
This patch migrates the memcpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167686
2012-11-11 05:54:34 +00:00
Nadav Rotem 12930749ab Fix a comment typo and add comments.
llvm-svn: 167684
2012-11-11 05:15:00 +00:00
Meador Inge 4d2827c10d instcombine: Migrate memcmp optimizations
This patch migrates the memcmp optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167683
2012-11-11 05:11:20 +00:00
Meador Inge 56edbc9323 instcombine: Migrate strstr optimizations
This patch migrates the strstr optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167682
2012-11-11 03:51:48 +00:00
Meador Inge 76fc1a479a Add method for replacing instructions to LibCallSimplifier
In some cases the library call simplifier may need to replace instructions
other than the library call being simplified.  In those cases it may be
necessary for clients of the simplifier to override how the replacements
are actually done.  As such, a new overrideable method for replacing
instructions was added to LibCallSimplifier.

A new subclass of LibCallSimplifier is also defined which overrides
the instruction replacement method.  This is because the instruction
combiner defines its own replacement method which updates the worklist
when instructions are replaced.

llvm-svn: 167681
2012-11-11 03:51:43 +00:00
Meador Inge bcd88ef764 instcombine: Migrate strcspn optimizations
This patch migrates the strcspn optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167675
2012-11-10 15:16:48 +00:00
Meador Inge 03be256db9 instcombine: Query target library information to gate libcall simplifications
Several of the simplifiers migrated from the simplify-libcalls pass to
the instcombine pass were not correctly checking the target library
information to gate the simplifications.  This patch ensures that the
check is made.

llvm-svn: 167660
2012-11-10 03:11:10 +00:00
Dmitry Vyukov 0044e386e9 tsan: switch to new memory_order constants (ABI compatible)
llvm-svn: 167615
2012-11-09 14:12:16 +00:00
Dmitry Vyukov 92b9e1dbfd tsan: instrument all atomics (including fetch_add, exchange, cas, etc)
llvm-svn: 167612
2012-11-09 12:55:36 +00:00
Nadav Rotem 1cfef3e9ee Add support for memory runtime check. When we can, we calculate array bounds.
If the arrays are found to be disjoint then we run the vectorized version of
the loop. If they are not, we run the scalar code.

llvm-svn: 167608
2012-11-09 07:09:44 +00:00
Meador Inge 489b5d645f instcombine: Migrate strspn optimizations
This patch migrates the strspn optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167568
2012-11-08 01:33:50 +00:00
Hans Wennborg c3c8d95c51 Only do switch-to-lookup table transformation when TargetTransformInfo
is available.

llvm-svn: 167552
2012-11-07 21:35:12 +00:00
Kostya Serebryany 157a515376 [asan] fix bug 14277 (asan needs to fail with fata error if an __asan interface function is being redefined. Before this fix asan asserts)
llvm-svn: 167529
2012-11-07 12:42:18 +00:00
Duncan Sands a318ef6fa6 Generalize the transform that boosts GEP indices to the size of a pointer to
also do it for vectors of pointers.

llvm-svn: 167354
2012-11-03 11:44:17 +00:00
Alexey Samsonov 9bdb63ae0d Fix whitespaces
llvm-svn: 167295
2012-11-02 12:20:34 +00:00
Chandler Carruth 099f5cb031 Revert the switch of loop-idiom to use the new dependence analysis.
The new analysis is not yet ready for prime time. It has a *critical*
flawed assumption, and some troubling shortages of testing. Until it's
been hammered into better shape, let's stick with the working code. This
should be easy to revert itself when the analysis is ready.

Fixes PR14241, a miscompile of any memcpy-able loop which uses a pointer
as the induction mechanism. If you have been seeing miscompiles in this
revision range, you really want to test with this backed out. The
results of this miscompile are a bit subtle as they can lead to
downstream passes concluding things are impossible which are in fact
possible.

Thanks to David Blaikie for the majority of the reduction of this
miscompile. I'll be checking in the test case in a non-revert commit.

Revesions reverted here:

r167045: LoopIdiom: Fix a serious missed optimization: we only turned
         top-level loops into memmove.
r166877: LoopIdiom: Add checks to avoid turning memmove into an infinite
         loop.
r166875: LoopIdiom: Recognize memmove loops.
r166874: LoopIdiom: Replace custom dependence analysis with
         DependenceAnalysis.
llvm-svn: 167286
2012-11-02 08:33:25 +00:00
Duncan Sands a17bb1419f Fix an obvious typo that causes an assertion failure when running
test/Transforms/GVN/rle.ll if the (currently disabled) check for a
pointer type in getIntPtrType is turned on.

llvm-svn: 167285
2012-11-02 07:49:32 +00:00
Chandler Carruth acc748b2b5 Fix sign compare warning. Patch by Mahesha HS.
llvm-svn: 167282
2012-11-02 05:24:00 +00:00
Hal Finkel 560545b85f BBVectorize: Use target costs for incoming and outgoing values instead of the depth heuristic.
When target cost information is available, compute explicit costs of inserting and
extracting values from vectors. At this point, all costs are estimated using the
target information, and the chain-depth heuristic is not needed. As a result, it is now, by
default, disabled when using target costs.

llvm-svn: 167256
2012-11-01 21:50:12 +00:00
Kostya Serebryany 28d0694c27 [asan] don't instrument globals that we've created ourselves (reduces the binary size a bit)
llvm-svn: 167230
2012-11-01 13:42:40 +00:00
Chandler Carruth 5da3f0512e Revert the majority of the next patch in the address space series:
r165941: Resubmit the changes to llvm core to update the functions to
         support different pointer sizes on a per address space basis.

Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.

However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.

In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.

In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.

This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.

llvm-svn: 167222
2012-11-01 09:14:31 +00:00
Chandler Carruth 7ec5085e01 Revert the series of commits starting with r166578 which introduced the
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.

These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.

Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)

After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.

Summary of reverted revisions:

r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221
2012-11-01 08:07:29 +00:00
Hal Finkel c89e75e93e BBVectorize: Account for internal shuffle costs
When target costs are available, use them to account for the costs of
shuffles on internal edges of the DAG of candidate pairs.

Because the shuffle costs here are currently for only the internal edges,
the current target cost model is trivial, and the chain depth requirement
is still in place, I don't yet have an easy test
case. Nevertheless, by looking at the debug output, it does seem to do the right
think to the effective "size" of each DAG of candidate pairs.

llvm-svn: 167217
2012-11-01 06:26:34 +00:00
Jakub Staszak 4e45abf0ae Don't insert and erase load instruction. Simply create (new) and delete it.
llvm-svn: 167196
2012-11-01 01:10:43 +00:00
Nadav Rotem 4cb8cdab5e LoopVectorize: Preserve NSW, NUW and IsExact flags.
llvm-svn: 167174
2012-10-31 21:40:39 +00:00
Benjamin Kramer ede2fe3bfd LCSSA: Try to recover compile time regressions due to SCEV updates.
- Use value handle tricks to communicate use replacements instead of forgetLoop, this is a lot faster.
- Move the "big hammer" out of the main loop so it's not called for every instruction.

This should recover most (if not all) compile time regressions introduced by this code.

llvm-svn: 167136
2012-10-31 16:30:03 +00:00
Nadav Rotem ec3ab49dda Put the threshold magic number in a variable.
llvm-svn: 167134
2012-10-31 16:22:16 +00:00
Hans Wennborg b71f72aa82 Remove fixme about unreachable cases from SwitchToLookupTable
SimplifyCFG will have removed those cases for us.

llvm-svn: 167132
2012-10-31 16:15:25 +00:00
Nadav Rotem 1265ea8f8d Remove enum values since they are not used anymore.
llvm-svn: 167131
2012-10-31 16:14:06 +00:00
Hans Wennborg 4fef2fec3d Address Duncan's comments on r167121.
llvm-svn: 167130
2012-10-31 15:31:09 +00:00
Hal Finkel 842ad0b621 BBVectorize: Choose pair ordering to minimize shuffles
BBVectorize would, except for loads and stores, always fuse instructions
so that the first instruction (in the current source order) would always
represent the low part of the input vectors and the second instruction
would always represent the high part. This lead to too many shuffles
being produced because sometimes the opposite order produces fewer of them.

With this change, BBVectorize tracks the kind of pair connections that form
the DAG of candidate pairs, and uses that information to reorder the pairs to
avoid excess shuffles. Using this information, a future commit will be able
to add VTTI-based shuffle costs to the pair selection procedure. Importantly,
the number of remaining shuffles can now be estimated during pair selection.

There are some trivial instruction reorderings in the test cases, and one
simple additional test where we certainly want to do a reordering to
avoid an unnecessary shuffle.

llvm-svn: 167122
2012-10-31 15:17:07 +00:00
Hans Wennborg 09acdb9a16 Address Duncan's comments on r167115
- Use 0 instead of NULL
 - Helper function for "dyn_cast, else lookup in the constant pool".

llvm-svn: 167121
2012-10-31 15:14:39 +00:00
Meador Inge 05a625a0ed instcombine: Migrate strto* optimizations
This patch migrates the strto* optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167119
2012-10-31 14:58:26 +00:00
Hans Wennborg 793b342dcf Fix false -> NULL conversion from r167115 spotted by Benjamin Kramer.
llvm-svn: 167117
2012-10-31 14:36:48 +00:00
Benjamin Kramer 1559127f6f Replace some instances of UniqueVector with SetVector, which is slightly cheaper.
No functionality change.

llvm-svn: 167116
2012-10-31 13:45:49 +00:00
Hans Wennborg 9e74dd97b8 Do simple constant propagation in lookup table formation for switches
By propagating the value for the switch condition, LLVM can now build
lookup tables for code such as:

  switch (x) {
    case 1: return 5;
    case 2: return 42;
    case 3: case 4: case 5:
      return x - 123;
    default:
      return 123;
  }

Given that x is known for each case, "x - 123" becomes a constant for
cases 3, 4, and 5.

llvm-svn: 167115
2012-10-31 13:42:45 +00:00
Benjamin Kramer 8682ac1a77 LCSSA: Add a workaround for another nasty SCEV cache invalidation issue.
I'm not entirely happy with this solution, but I don't see a smarter way currently.
Fixes PR14214.

llvm-svn: 167112
2012-10-31 10:01:29 +00:00
Meador Inge 6f8e01121a instcombine: Migrate strpbrk optimizations
This patch migrates the strpbrk optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167105
2012-10-31 04:29:58 +00:00
Meador Inge d589ac621b instcombine: Migrate strlen optimizations
This patch migrates the strlen optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167103
2012-10-31 03:33:06 +00:00
Meador Inge 067294b3ac instcombine: Migrate strncpy optimizations
This patch migrates the strncpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.

llvm-svn: 167102
2012-10-31 03:33:00 +00:00
Nadav Rotem ce77ab0c24 LoopVectorize: Do not vectorize loops with tiny constant trip counts.
llvm-svn: 167101
2012-10-31 03:31:07 +00:00
Nadav Rotem ff7889196b Add support for loops that don't start with Zero.
This is important for loops in the LAPACK test-suite.
These loops start at 1 because they are auto-converted from fortran.

llvm-svn: 167084
2012-10-31 00:45:26 +00:00
Meador Inge 9a6a190562 instcombine: Migrate stpcpy optimizations
This patch migrates the stpcpy optimizations from the simplify-libcalls
pass into the instcombine library call simplifier.  Note that the
__stpcpy_chk simplifications were migrated in a previous commit.

llvm-svn: 167083
2012-10-31 00:20:56 +00:00
Meador Inge cdb2ca54ae instcombine: Split out the __stpcpy_chk simplifications from StrCpyChkOpt
r166198 migrated the strcpy optimization to instcombine.  The strcpy
simplifier that was migrated from Transforms/Scalar/SimplifyLibCalls.cpp
was also doing some __strcpy_chk simplifications.  Those fortified
simplifications were migrated as well, but introduced a bug in the
__stpcpy_chk simplifier in the process.  This happened because the
__strcpy_chk and __stpcpy_chk simplifiers were both mapped to StrCpyChkOpt
which was updated with simplifications that worked for __strcpy_chk, but
not __stpcpy_chk.

This patch fixes the problem by adding proper test coverage and creating a
new simplifier for __stpcpy_chk (instead of sharing one with __strcpy_chk).

llvm-svn: 167082
2012-10-31 00:20:51 +00:00
Nadav Rotem 47a299dcc9 Add documentation.
llvm-svn: 167055
2012-10-30 22:06:26 +00:00
Chandler Carruth 1296b59522 Fix PR14212: For some strange reason I treated vectors differently from
integers in that the code to handle split alloca-wide integer loads or
stores doesn't come first. It should, for the same reasons as with
integers, and the PR attests to that. Also had to fix a busted assert in
that this test case also covers.

llvm-svn: 167051
2012-10-30 20:52:40 +00:00
Hal Finkel 08f34ac9dd BBVectorize: Cache fixed-order pairs instead of recomputing pointer info.
Instead of recomputing relative pointer information just prior to fusing,
cache this information (which also needs to be computed during the
candidate-pair selection process). This cuts down on the total number of
SE queries made, and also is a necessary intermediate step on the road toward
including shuffle costs in the pair selection procedure.

No functionality change is intended.

llvm-svn: 167049
2012-10-30 20:17:37 +00:00
Benjamin Kramer 48a6478242 LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove.
Thanks to Preston Briggs for catching this!

llvm-svn: 167045
2012-10-30 19:49:39 +00:00
Hal Finkel 2eaadd1a2d BBVectorize: Fix a small bug introduced in r167042.
We need to make sure that we take the correct load/store alignment
when the inputs are flipped.

llvm-svn: 167044
2012-10-30 19:47:37 +00:00
Hal Finkel f384890961 BBVectorize: Simplify how input swapping is handled.
Stop propagating the FlipMemInputs variable into the routines that
create the replacement instructions. Instead, just flip the arguments
of those routines. This allows for some associated cleanup (not all
of which is done here). No functionality change is intended.

llvm-svn: 167042
2012-10-30 19:35:29 +00:00
Hal Finkel eac2887143 BBVectorize: Don't make calls to SE when the result is unused.
SE was being called during the instruction-fusion process (when the result
is unreliable, and thus ignored). No functionality change is intended.

llvm-svn: 167037
2012-10-30 18:55:49 +00:00
Nadav Rotem d3df665140 80-col
llvm-svn: 167036
2012-10-30 18:37:43 +00:00
Nadav Rotem bc21aceb19 LoopVectorize: Add support for write-only loops when the write destination is a single pointer.
Speedup SciMark by 1%

llvm-svn: 167035
2012-10-30 18:36:45 +00:00
Nadav Rotem b3e8e688da LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-one
while XOR, and OR need to start at zero.

llvm-svn: 167032
2012-10-30 18:12:36 +00:00
Duncan Sands e2395dc27b Fix isEliminableCastPair to work correctly in the presence of pointers
with different sizes.

llvm-svn: 167018
2012-10-30 16:03:32 +00:00
Ulrich Weigand 6a9bb51a8d Enable some additional constant folding for PPCDoubleDouble.
This fixes Clang :: CodeGen/complex-builtints.c on PowerPC.

llvm-svn: 167013
2012-10-30 12:33:18 +00:00
Hans Wennborg f3254838e4 Use TargetTransformInfo to control switch-to-lookup table transformation
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.

This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.

llvm-svn: 167011
2012-10-30 11:23:25 +00:00
Nadav Rotem 73ddcfe03f LoopVectorizer: change debug prints: Print the module identifier when deciding to vectorize. When deciding not to vectorize do not print the called function name because it can be null.
llvm-svn: 166989
2012-10-30 00:40:39 +00:00
Nadav Rotem 5ad045a8c5 LoopVectorize: Update and preserve the dominator tree info.
llvm-svn: 166970
2012-10-29 21:52:38 +00:00
Ulrich Weigand 3abb34389d In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.

llvm-svn: 166958
2012-10-29 18:35:49 +00:00
Nadav Rotem 39aab03be3 Rename the BB-vectorize flag to match the dragonegg name
llvm-svn: 166948
2012-10-29 18:01:14 +00:00
Duncan Sands 5bdd9dda48 Remove a wrapper around getIntPtrType added to GVN by Hal in commit 166624 (the
wrapper returns a vector of integers when passed a vector of pointers) by having
getIntPtrType itself return a vector of integers in this case.  Outside of this
wrapper, I didn't find anywhere in the codebase that was relying on the old
behaviour for vectors of pointers, so give this a whirl through the buildbots.

llvm-svn: 166939
2012-10-29 17:31:46 +00:00
Nadav Rotem c59ae207ef Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we dont want to share the same flag as the bb-vectorizer.
llvm-svn: 166937
2012-10-29 16:36:25 +00:00
Rafael Espindola 56183fbe78 llvm-extract changes linkages so that functions on both sides of the
split module can see each other. If it is keeping a symbol that already has
a non local linkage, it doesn't need to change it.

llvm-svn: 166908
2012-10-29 01:59:03 +00:00
Rafael Espindola 9d30d0fc67 llvm-extract was unable to handle aliases. It would leave a copy on the
output of both

llvm-extract foo.ll -func=bar
and
llvm-extract foo.ll -func=bar -delete

so the two new files could not be linked together anymore. With this change
alias are handled almost like functions and global variables. Almost because
with alias we cannot just clear the initializer/body, we have to create a new
declaration and replace the alias with it.

The net result is that now the output of the above commands can be linked
even if foo.ll has aliases.

llvm-svn: 166907
2012-10-29 00:27:55 +00:00
Benjamin Kramer 8d2ee55a0c LoopIdiom: Add checks to avoid turning memmove into an infinite loop.
I don't think this is possible with the current implementation but that may change eventually.

llvm-svn: 166877
2012-10-27 15:18:28 +00:00
Benjamin Kramer 1c9e5186c0 LoopIdiom: Recognize memmove loops.
This turns loops like
  for (unsigned i = 0; i != n; ++i)
    p[i] = p[i+1];
into memmove, which has a highly optimized implementation in most libcs.

This was really easy with the new DependenceAnalysis :)

llvm-svn: 166875
2012-10-27 14:25:51 +00:00
Benjamin Kramer d5c9be8247 LoopIdiom: Replace custom dependence analysis with DependenceAnalysis.
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481.

Compile time performance seems to be slightly worse, but this is mostly due
to an extra LCSSA run scheduled by the PassManager and should be fixed there.

llvm-svn: 166874
2012-10-27 14:25:44 +00:00
Hal Finkel bad10bb2f3 Update BBVectorize to use the new VTTI instr. cost interfaces.
The monolithic interface for instruction costs has been split into
several functions. This is the corresponding change. No functionality
change is intended.

llvm-svn: 166865
2012-10-27 04:33:48 +00:00
Nadav Rotem 859366f93f 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result.
2. Change the maximum vectorization width from 4 to 8.
3. A test for both.

llvm-svn: 166864
2012-10-27 04:11:32 +00:00
Nadav Rotem afae78edab Refactor the VectorTargetTransformInfo interface.
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.

llvm-svn: 166836
2012-10-26 23:49:28 +00:00
Rafael Espindola 4253bd8faf Change the internalize pass to internalize all symbols when given an empty
list of externals. This makes sense since a shared library with no symbols
can still be useful if it has static constructors.

llvm-svn: 166795
2012-10-26 18:47:48 +00:00
Benjamin Kramer 7736085894 LoopSimplify: Preserve DependenceAnalysis.
This is currently true, but may change when DA grows more aggressive caching.
Without this setting it's impossible to use DA from a LoopPass because DA is a
function pass and cannot be properly scheduled in between LoopPasses. The
LoopManager reacts to this with an infinite loop which made this really annoying
to debug.

llvm-svn: 166788
2012-10-26 17:40:50 +00:00
Benjamin Kramer e3d821a466 Fix SCEV cache invalidation in LCSSA and LoopSimplify.
The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable
to analyzable but the LCSSA bug is very nasty. It only comes into play with a
specific order of the LoopPassManager worklist and can cause actual
miscompilations, when a SCEV refers to a value that has been replaced with PHI
node. SCEVExpander may then insert code into the wrong place, either violating
domination or randomly miscompiling stuff.

Comes with an extensive test case reduced from the test-suite with
bugpoint+SCEVValidator.

llvm-svn: 166787
2012-10-26 17:31:43 +00:00
Hal Finkel 4863448dca Use VTTI->getNumberOfParts in BBVectorize.
This change reflects VTTI refactoring; no functionality change intended.

llvm-svn: 166752
2012-10-26 04:28:06 +00:00
Hal Finkel 41a6ded4a0 Disable generation of pointer vectors by BBVectorize.
Once vector-of-pointer support works, then this can be reverted.

llvm-svn: 166741
2012-10-26 00:05:26 +00:00
Hal Finkel 20a49d6f2c BBVectorize, when using VTTI, should not form types that will be split.
This is needed so that perl's SHA can be compiled (otherwise
BBVectorize takes far too long to find its fixed point).

I'll try to come up with a reduced test case.

llvm-svn: 166738
2012-10-25 23:47:16 +00:00
Hal Finkel cbf9365f4c Begin incorporating target information into BBVectorize.
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

 1. Target information is used to determine if it is profitable to fuse two
    instructions. This means that the cost of the vector operation must not
    be more expensive than the cost of the two original operations. Pairs that
    are not profitable are no longer considered (because current cost information
    is incomplete, for intrinsics for example, equal-cost pairs are still
    considered).

 2. The 'cost savings' computed for the profitability check are also used to
    rank the DAGs that represent the potential vectorization plans. Specifically,
    for nodes of non-trivial depth, the cost savings is used as the node
    weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

llvm-svn: 166716
2012-10-25 21:12:23 +00:00
Nadav Rotem 579042f71b LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1.
llvm-svn: 166715
2012-10-25 21:03:48 +00:00
Jakob Stoklund Olesen 977f41a1fa Also optimize large switch statements.
The isValueEqualityComparison() guard at the top of SimplifySwitch()
only applies to some of the possible transformations.

The newer transformations work just fine on large switches, and the
check on predecessor count is nonsensical.

llvm-svn: 166710
2012-10-25 18:51:15 +00:00
Chandler Carruth 58d0556765 Teach SROA how to split whole-alloca integer loads and stores into
smaller integer loads and stores.

The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.

Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists soley
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine however, so
this shouldn't actually result in missed opportunities.

If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.

Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.

llvm-svn: 166662
2012-10-25 04:37:07 +00:00
Nadav Rotem 5ffb049a55 Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649
2012-10-25 00:08:41 +00:00
Nadav Rotem 086ea5c1f5 revert accidental change
llvm-svn: 166643
2012-10-24 23:48:57 +00:00
Nadav Rotem 4a87683a41 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00