Commit Graph

16649 Commits

Author SHA1 Message Date
Reid Kleckner 01660a3d2a [asan] Make ASan compatible with linker dead stripping on Windows
Summary:
This is similar to what was done for Darwin in rL264645 /
http://reviews.llvm.org/D16737, but it uses COFF COMDATs to achive the
same result instead of relying on new custom linker features.

As on MachO, this creates one metadata global per instrumented global.
The metadata global is placed in the custom .ASAN$GL section, which the
ASan runtime will iterate over during initialization. There are no other
references to the metadata, so normal linker dead stripping would
discard it. However, the metadata is put in a COMDAT group with the
instrumented global, so that it will be discarded if and only if the
instrumented global is discarded.

I didn't update the ASan ABI version check since this doesn't affect
non-Windows platforms, and the WinASan ABI isn't really stable yet.

Implementing this for ELF will require extending LLVM IR and MC a bit so
that we can use non-COMDAT section groups.

Reviewers: pcc, kcc, mehdi_amini, kubabrecka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26770

llvm-svn: 287576
2016-11-21 20:40:37 +00:00
Mandeep Singh Grang 73f0095d71 [MemorySSA] Fix for non-determinism in codegen
This patch fixes the non-determinism caused due to iterating SmallPtrSet's
which was uncovered due to the experimental "reverse iteration order " patch:
https://reviews.llvm.org/D26718

The following unit tests failed because of the undefined order of iteration.
LLVM :: Transforms/Util/MemorySSA/cyclicphi.ll
LLVM :: Transforms/Util/MemorySSA/many-dom-backedge.ll
LLVM :: Transforms/Util/MemorySSA/many-doms.ll
LLVM :: Transforms/Util/MemorySSA/phi-translation.ll

Reviewers: dberlin, mgrang

Subscribers: dberlin, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D26704

llvm-svn: 287563
2016-11-21 19:33:02 +00:00
Marcin Koscielnicki 1c2bd1e9f3 [InstrProfiling] Mark __llvm_profile_instrument_target last parameter as i32 zeroext if appropriate.
On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed
as i32 signext instead of plain i32.  Likewise, unsigned int may be passed
as i32, i32 signext, or i32 zeroext depending on the platform.  Mark
__llvm_profile_instrument_target properly (its last parameter is unsigned
int).

This (together with the clang change) makes compiler-rt profile testsuite pass
on s390x.

Differential Revision: http://reviews.llvm.org/D21736

llvm-svn: 287534
2016-11-21 11:57:19 +00:00
Davide Italiano 2ae76dd239 [GlobalSplit] Port to the new pass manager.
llvm-svn: 287511
2016-11-21 00:28:23 +00:00
Simon Pilgrim 7d18a70dac Fix spelling mistakes in Transforms comments. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287488
2016-11-20 13:19:49 +00:00
Benjamin Kramer ffd3715d16 Give some helper classes/functions internal linkage. NFC.
llvm-svn: 287462
2016-11-19 20:44:26 +00:00
Michael Zolotukhin 5020c9971b [LoopSimplify] Preserve LCSSA when removing edges from unreachable blocks.
This fixes PR30454.

llvm-svn: 287379
2016-11-18 21:01:12 +00:00
Florian Hahn 77382be56b [simplifycfg][loop-simplify] Preserve loop metadata in 2 transformations.
insertUniqueBackedgeBlock in lib/Transforms/Utils/LoopSimplify.cpp now
propagates existing llvm.loop metadata to newly the added backedge.

llvm::TryToSimplifyUncondBranchFromEmptyBlock in lib/Transforms/Utils/Local.cpp
now propagates existing llvm.loop metadata to the branch instructions in the
predecessor blocks of the empty block that is removed.

Differential Revision: https://reviews.llvm.org/D26495

llvm-svn: 287341
2016-11-18 13:12:07 +00:00
Craig Topper 1de753f7f5 [InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements.
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.

llvm-svn: 287316
2016-11-18 06:04:33 +00:00
Anna Zaks 9cd5ed1241 [asan] Turn on Mach-O global metadata liveness tracking by default
This patch turns on the metadata liveness tracking since all known issues
have been resolved. The future has been implemented in
https://reviews.llvm.org/D16737 and enables support of dead code stripping
option on Mach-O platforms.

As part of enabling the feature, I also plan on reverting the following
patch to compiler-rt:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160704/369910.html

Differential Revision: https://reviews.llvm.org/D26772

llvm-svn: 287235
2016-11-17 16:55:40 +00:00
Chris Bieneman 05c279fc4b [CMake] NFC. Updating CMake dependency specifications
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.

llvm-svn: 287206
2016-11-17 04:36:50 +00:00
Dehao Chen 41d72a8632 Use profile info to adjust loop unroll threshold.
Summary:
For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold.
For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling.

Reviewers: davidxl, mzolotukhin

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D26527

llvm-svn: 287186
2016-11-17 01:17:02 +00:00
Peter Collingbourne f72a8d4e08 Introduce GlobalSplit pass.
This pass splits globals into elements using inrange annotations on
getelementptr indices.

Differential Revision: https://reviews.llvm.org/D22295

llvm-svn: 287178
2016-11-16 23:40:26 +00:00
Sanjay Patel 80baf69cb5 [InstCombine] replace unreachable with assert and remove unreachable code; NFCI
llvm-svn: 287147
2016-11-16 20:40:02 +00:00
Sanjay Patel 1b9560ffd6 [InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC
llvm-svn: 287145
2016-11-16 20:18:34 +00:00
Mandeep Singh Grang 000ce9a686 [LoopVectorize] Fix for non-determinism in codegen
Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718

Reviewers: mssimpso

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26727

llvm-svn: 287135
2016-11-16 18:53:17 +00:00
Reid Kleckner 3a83e76811 [sancov] Name the global containing the main source file name
If the global name doesn't start with __sancov_gen, ASan will insert
unecessary red zones around it.

llvm-svn: 287117
2016-11-16 16:50:43 +00:00
Craig Topper 6910fa0ef4 [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.

Reviewers: zvi, delena, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26660

llvm-svn: 287083
2016-11-16 05:24:10 +00:00
Vyacheslav Klochkov b3dc774a99 Fixed the lost FastMathFlags for CALL operations in SLPVectorizer.
Reviewer: Michael Zolotukhin.
Differential Revision: https://reviews.llvm.org/D26575

llvm-svn: 287064
2016-11-16 00:55:50 +00:00
Justin Lebar 2860573529 [BypassSlowDivision] Handle division by constant numerators better.
Summary:
We don't do BypassSlowDivision when the denominator is a constant, but
we do do it when the numerator is a constant.

This patch makes two related changes to BypassSlowDivision when the
numerator is a constant:

 * If the numerator is too large to fit into the bypass width, don't
   bypass slow division (because we'll never run the smaller-width
   code).

 * If we bypass slow division where the numerator is a constant, don't
   OR together the numerator and denominator when determining whether
   both operands fit within the bypass width.  We need to check only the
   denominator.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D26699

llvm-svn: 287062
2016-11-16 00:44:47 +00:00
Justin Lebar 583b8687eb [BypassSlowDivision] Simplify partially-tautological if statement.
if (A || (B && A)) --> if (A).

llvm-svn: 287061
2016-11-16 00:44:43 +00:00
Filipe Cabecinhas ec350b71fa [AddressSanitizer] Add support for (constant-)masked loads and stores.
This patch adds support for instrumenting masked loads and stores under
ASan, if they have a constant mask.

isInterestingMemoryAccess now supports returning a mask to be applied to
the loads, and instrumentMop will use it to generate additional checks.

Added tests for v4i32 v8i32, and v4p0i32 (~v4i64) for both loads and
stores (as well as a test to verify we don't add checks to non-constant
masks).

Differential Revision: https://reviews.llvm.org/D26230

llvm-svn: 287047
2016-11-15 22:37:30 +00:00
Kostya Serebryany 9d6dc7b164 [sanitizer-coverage] make sure asan does not instrument coverage guards (reported in https://github.com/google/oss-fuzz/issues/84)
llvm-svn: 287030
2016-11-15 21:12:50 +00:00
Wei Mi 37c4aaaf52 Revert r286999 which caused buildbot test failures. Some testcases need to be made target specific.
llvm-svn: 287014
2016-11-15 19:42:05 +00:00
Wei Mi 7ccf7651c0 [LSR] Allow formula containing Reg for SCEVAddRecExpr related with outerloop.
In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr,
and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser
and dropped.

Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only
handle inner loop now so only %for.body2 will be handled.

Using the logic above, formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped
no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related
with outerloop. Only formula like
reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept
because the SCEVAddRecExpr related with outerloop is folded into the initial value of the
SCEVAddRecExpr related with current loop.

But in some cases, we do need to share the basic induction variable
reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction
variables used by LSR, so we don't want to drop the formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally.

From the existing comment, it tries to avoid considering multiple level loops at the same time.
However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other
than current loop, it is an invariant and will be simple to handle, and the formula doesn't have
to be dropped.

Differential Revision: https://reviews.llvm.org/D26429

llvm-svn: 286999
2016-11-15 18:35:53 +00:00
Wei Mi d2948cef70 [IndVars] Change the order to compute WidenAddRec in widenIVUse.
When both WidenIV::getWideRecurrence and WidenIV::getExtendedOperandRecurrence
return non-null but different WideAddRec, if getWideRecurrence is called
before getExtendedOperandRecurrence, we won't bother to call
getExtendedOperandRecurrence again. But As we know it is possible that after
SCEV folding, we cannot prove the legality using the SCEVAddRecExpr returned
by getWideRecurrence. Meanwhile if getExtendedOperandRecurrence returns non-null
WideAddRec, we know for sure that it is legal to do widening for current instruction.
So it is better to put getExtendedOperandRecurrence before getWideRecurrence, which
will increase the chance of successful widening.

Differential Revision: https://reviews.llvm.org/D26059

llvm-svn: 286987
2016-11-15 17:34:52 +00:00
Craig Topper c7486af9c9 [AVX-512] Add AVX-512 vector shift intrinsics to memory santitizer.
Just needed to add the intrinsics to the exist switch. The code is generic enough to support the wider vectors with no changes.

llvm-svn: 286980
2016-11-15 16:27:33 +00:00
Pablo Barrio 4f80c93a2e Revert "[JumpThreading] Unfold selects that depend on the same condition"
This reverts commit ac54d0066c478a09c7cd28d15d0f9ff8af984afc.

llvm-svn: 286976
2016-11-15 15:42:23 +00:00
Pablo Barrio 5f782bb048 Revert "[JumpThreading] Prevent non-deterministic use lists"
This reverts commit f2c2f5354070469dac253373c66527ca971ddc66.

llvm-svn: 286975
2016-11-15 15:42:17 +00:00
Robert Lougher b0905209dd [LoopVectorizer] When estimating reg usage, unused insts may "end" another use
The register usage algorithm incorrectly treats instructions whose value is
not used within the loop (e.g. those that do not produce a value).

The algorithm first calculates the usages within the loop.  It iterates over
the instructions in order, and records at which instruction index each use
ends (in fact, they're actually recorded against the next index, as this is
when we want to delete them from the open intervals).

The algorithm then iterates over the instructions again, adding each
instruction in turn to a list of open intervals.  Instructions are then
removed from the list of open intervals when they occur in the list of uses
ended at the current index.

The problem is, instructions which are not used in the loop are skipped.
However, although they aren't used, the last use of a value may have been
recorded against that instruction index.  In this case, the use is not deleted
from the open intervals, which may then bump up the estimated register usage.

This patch fixes the issue by simply moving the "is used" check after the loop
which erases the uses at the current index.

Differential Revision: https://reviews.llvm.org/D26554

llvm-svn: 286969
2016-11-15 14:27:33 +00:00
Florian Hahn 4b4dc172e7 Test commit, remove trailing space.
This commit is used to test commit access.

llvm-svn: 286957
2016-11-15 13:28:42 +00:00
Kuba Brecka ddfdba3b01 [tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part
This adds support for TSan C++ exception handling, where we need to add extra calls to __tsan_func_exit when a function is exitted via exception mechanisms. Otherwise the shadow stack gets corrupted (leaked). This patch moves and enhances the existing implementation of EscapeEnumerator that finds all possible function exit points, and adds extra EH cleanup blocks where needed.

Differential Revision: https://reviews.llvm.org/D26177

llvm-svn: 286893
2016-11-14 21:41:13 +00:00
Teresa Johnson 4fef68cb8d [ThinLTO] Only promote exported locals as marked in index
Summary:
We have always speculatively promoted all renamable local values
(except const non-address taken variables) for both the exporting
and importing module. We would then internalize them back based on
the ThinLink results if they weren't actually exported. This is
inefficient, and results in unnecessary renames. It also meant we
had to check the non-renamability of a value in the summary, which
was already checked during function importing analysis in the ThinLink.

Made renameModuleForThinLTO (which does the promotion/renaming) instead
use the index when exporting, to avoid unnecessary renames/promotions.
For importing modules, we can simply promoted all values as any local
we import by definition is exported and needs promotion.

This required changes to the method used by the FunctionImport pass
(only invoked from 'opt' for testing) and when invoked from llvm-link,
since neither does a ThinLink. We simply conservatively mark all locals
in the index as promoted, which preserves the current aggressive
promotion behavior.

I also needed to change an llvm-lto based test where we had previously
been aggressively promoting values that weren't importable (aliasees),
but now will not promote.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26467

llvm-svn: 286871
2016-11-14 19:21:41 +00:00
Teresa Johnson d5033a4576 [ThinLTO] Make inline assembly handling more efficient in summary
Summary:
The change in r285513 to prevent exporting of locals used in
inline asm added all locals in the llvm.used set to the reference
set of functions containing inline asm. Since these locals were marked
NoRename, this automatically prevented importing of the function.

Unfortunately, this caused an explosion in the summary reference lists
in some cases. In my particular example, it happened for a large protocol
buffer generated C++ file, where many of the generated functions
contained an inline asm call. It was exacerbated when doing a ThinLTO
PGO instrumentation build, where the PGO instrumentation included
thousands of private __profd_* values that were added to llvm.used.

We really only need to include a single llvm.used local (NoRename) value
in the reference list of a function containing inline asm to block it
being imported. However, it seems cleaner to add a flag to the summary
that explicitly describes this situation, which is what this patch does.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26402

llvm-svn: 286840
2016-11-14 16:40:19 +00:00
Simon Pilgrim 475b40dab8 Remove redundant condition (PR28352) NFCI.
We were already testing is the op was not a leaf, so need to then test if it was a leaf (added it to the assert instead).

llvm-svn: 286817
2016-11-14 12:00:46 +00:00
Pablo Barrio 7ce2c5ecaf [JumpThreading] Prevent non-deterministic use lists
Summary:
Unfolding selects was previously done with the help of a vector
of pointers that was then sorted to be able to remove duplicates.
As this sorting depends on the memory addresses, it was
non-deterministic. A SetVector is used now so that duplicates are
removed without the need of sorting first.

Reviewers: mgrang, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26450

llvm-svn: 286807
2016-11-14 10:24:26 +00:00
Craig Topper b4173a5a70 [InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics.
llvm-svn: 286755
2016-11-13 07:26:19 +00:00
Peter Collingbourne d9445c49ad Bitcode: Change module reader functions to return an llvm::Expected.
Differential Revision: https://reviews.llvm.org/D26562

llvm-svn: 286752
2016-11-13 07:00:17 +00:00
Peter Collingbourne 8dff03911c Analysis: Simplify the ScalarEvolution::getGEPExpr() interface. NFCI.
All existing callers were manually extracting information out of an existing
GEP instruction and passing it to getGEPExpr(). Simplify the interface by
changing it to take a GEPOperator instead.

llvm-svn: 286751
2016-11-13 06:59:50 +00:00
Craig Topper 8b831cbb2a [InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value.
This does not include support for the AVX-512 variable shifts. That will be coming in a future patch.

llvm-svn: 286739
2016-11-13 01:51:55 +00:00
Sanjay Patel 84ae943f0d [InstCombine] use dyn_cast rather isa+cast; NFC
Follow-up to r286664 cleanup as suggested by Eli. Thanks!

llvm-svn: 286671
2016-11-11 23:20:01 +00:00
Sanjay Patel cb2199b2f3 [InstCombine] clean up foldSelectOpOp(); NFC
llvm-svn: 286664
2016-11-11 23:01:20 +00:00
Anna Zaks 3c43737832 [tsan][llvm] Implement the function attribute to disable TSan checking at run time
This implements a function annotation that disables TSan checking for the
function at run time. The benefit over attribute((no_sanitize("thread")))
is that the accesses within the callees will also be suppressed.

The motivation for this attribute is a guarantee given by the objective C
language that the calls to the reference count decrement and object
deallocation will be synchronized. To model this properly, we would need
to intercept all ref count decrement calls (which are very common in ObjC
due to use of ARC) and also every single message send. Instead, we propose
to just ignore all accesses made from within dealloc at run time. The main
downside is that this still does not introduce any synchronization, which
means we might still report false positives if the code that relies on this
synchronization is not executed from within dealloc. However, we have not seen
this in practice so far and think these cases will be very rare.

Differential Revision: https://reviews.llvm.org/D25858

llvm-svn: 286663
2016-11-11 23:01:02 +00:00
Adam Nemet 9bfbf8bbdf [LV] Stop saying "use -Rpass-analysis=loop-vectorize"
This is PR28376.

Unfortunately given the current structure of optimization diagnostics we
lack the capability to tell whether the user has
passed -Rpass-analysis=loop-vectorize since this is local to the
front-end (BackendConsumer::OptimizationRemarkHandler).

So rather than printing this even if the user has already
passed -Rpass-analysis, this patch just punts and stops recommending
this option.  I don't think that getting this right is worth the
complexity.

Differential Revision: https://reviews.llvm.org/D26563

llvm-svn: 286662
2016-11-11 22:51:46 +00:00
Erik Eckstein c1d52e5c53 FunctionComparator: don't rely on argument evaluation order.
This is a follow-up on the recent refactoring of the FunctionMerge pass.
It should fix a fail of the new FunctionComparator unittest whe compiling with MSVC.

llvm-svn: 286648
2016-11-11 22:21:39 +00:00
Evgeniy Stepanov 1fe189d795 [cfi] Fix weak functions handling.
When a function pointer is replaced with a jumptable pointer, special
case is needed to preserve the semantics of extern_weak functions.
Since a jumptable entry can not be extern_weak, we emulate that
behaviour by replacing all references to F (the extern_weak function)
with the following expression: F != nullptr ? JumpTablePtr : nullptr.

Extra special care is needed for global initializers, since most (or
probably all) backends can not lower an initializer that includes
this kind of constant expression. Initializers like that are replaced
with a global constructor (i.e. a runtime initializer).

llvm-svn: 286636
2016-11-11 21:39:26 +00:00
Erik Eckstein 4d6fb72aa9 Make the FunctionComparator of the MergeFunctions pass a stand-alone utility.
This is pure refactoring. NFC.

This change moves the FunctionComparator (together with the GlobalNumberState
utility) in to a separate file so that it can be used by other passes.
For example, the SwiftMergeFunctions pass in the Swift compiler:
https://github.com/apple/swift/blob/master/lib/LLVMPasses/LLVMMergeFunctions.cpp

Details of the change:

*) The big part is just moving code out of MergeFunctions.cpp into FunctionComparator.h/cpp
*) Make FunctionComparator member functions protected (instead of private)
   so that a derived comparator class can use them.

Following refactoring helps to share code between the base FunctionComparator
class and a derived class:

*) Add a beginCompare() function
*) Move some basic function property comparisons into a separate function compareSignature()
*) Do the GEP comparison inside cmpOperations() which now has a new
   needToCmpOperands reference parameter

https://reviews.llvm.org/D25385

llvm-svn: 286632
2016-11-11 21:15:13 +00:00
Vyacheslav Klochkov f1a12fe0f5 Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer.
Reviewer: Michael Zolotukhin.
Differential Revision: https://reviews.llvm.org/D26543

llvm-svn: 286626
2016-11-11 19:55:29 +00:00
Peter Collingbourne 6de481a378 Bitcode: Change getModuleSummaryIndex() to return an llvm::Expected.
Differential Revision: https://reviews.llvm.org/D26539

llvm-svn: 286624
2016-11-11 19:50:39 +00:00
Reid Kleckner ec80354873 [sancov] Don't instrument MSVC CRT stdio config helpers
They get called before initialization, which is a problem for winasan.

Test coming in compiler-rt.

llvm-svn: 286615
2016-11-11 19:18:45 +00:00
Evgeniy Stepanov f48ffab554 [cfi] Implement cfi-icall using inline assembly.
The current implementation is emitting a global constant that happens
to evaluate to the same bytes + relocation as a jump instruction on
X86. This does not work for PIE executables and shared libraries
though, because we end up with a wrong relocation type. And it has no
chance of working on ARM/AArch64 which use different relocation types
for jump instructions (R_ARM_JUMP24) that is never generated for
data.

This change replaces the constant with module-level inline assembly
followed by a hidden declaration of the jump table. Works fine for
ARM/AArch64, but has some drawbacks.
* Extra symbols are added to the static symbol table, which inflate
the size of the unstripped binary a little. Stripped binaries are not
affected. This happens because jump table declarations must be
external (because their body is in the inline asm).
* Original functions that were anonymous are now named
<original name>.cfi, and it affects symbolization sometimes. This is
necessary because the only user of these functions is the (inline
asm) jump table, so they had to be added to @llvm.used, which does
not allow unnamed functions.

llvm-svn: 286611
2016-11-11 18:49:09 +00:00
Sanjay Patel d1bf4340ef [InstCombine] fix formatting of FoldOpIntoSelect(); NFCI
llvm-svn: 286604
2016-11-11 17:42:16 +00:00
Dehao Chen 5492f8646c Add comments about why we put LoopSink pass at the very late stage.
llvm-svn: 286480
2016-11-10 17:42:18 +00:00
Sanjay Patel 4e1b5a53c7 [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.

https://llvm.org/bugs/show_bug.cgi?id=30923

llvm-svn: 286423
2016-11-10 00:15:14 +00:00
Eli Friedman ddbf83ea14 Preserve assumption cache in loop-rotate.
No testcase included because I can't figure out how to reduce it.
(It's easy to write a testcase where rotation clones an assume,
but that doesn't actually seem to trigger the crash in opt on
its own; maybe an issue with the laziness?)

Differential Revision: https://reviews.llvm.org/D26434

llvm-svn: 286410
2016-11-09 23:05:01 +00:00
Evgeny Stupachenko c2698cd903 Minor unroll pass refacoring.
Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
 when "back edge" becomes "fall through" replaced with
 variable.
Some comments added.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D21719

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 286389
2016-11-09 19:56:39 +00:00
Peter Collingbourne 7f00d0a125 Bitcode: Change the materializer interface to return llvm::Error.
Differential Revision: https://reviews.llvm.org/D26439

llvm-svn: 286382
2016-11-09 17:49:19 +00:00
Pavel Labath c207bec388 Remove TimeValue usage from Scalar/SROA.cpp. NFC.
llvm-svn: 286361
2016-11-09 12:07:12 +00:00
Alexandros Lamprineas 0ee3ec2fe4 [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.

Differential Revision: https://reviews.llvm.org/D26185

llvm-svn: 286347
2016-11-09 08:53:07 +00:00
Dehao Chen 947dbe1254 Enable Loop Sink pass for functions that has profile.
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.

Reviewers: davidxl, hfinkel

Subscribers: mehdi_amini, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26155

llvm-svn: 286325
2016-11-09 00:58:19 +00:00
Sanjay Patel 4e9d6cd354 [InstCombine] fix profitability equation for max-of-nots transform
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.

This transform could also cause us to miss folds of min/max pairs.

llvm-svn: 286315
2016-11-09 00:13:11 +00:00
Sanjay Patel 99dc5feff1 [InstCombine] reduce indentation; NFC
llvm-svn: 286314
2016-11-08 23:49:15 +00:00
Kuba Brecka a49dcbb743 [asan] Speed up compilation of large C++ stringmaps (tons of allocas) with ASan
This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst.

Differential Revision: https://reviews.llvm.org/D26380

llvm-svn: 286296
2016-11-08 21:30:41 +00:00
Davide Italiano 11a871b227 [LoopDistribute] Preserve GlobalsAA also in the new Pass Manager.
Differential Revision:  https://reviews.llvm.org/D26408

llvm-svn: 286280
2016-11-08 19:52:32 +00:00
Davide Italiano 1e77aaca8a [LibcallsShrinkWrap] This pass doesn't preserve the CFG.
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.

Differential Revision:  https://reviews.llvm.org/D26381

llvm-svn: 286271
2016-11-08 19:18:20 +00:00
Chad Rosier fbc7b7d154 Fix typo in comment. NFC.
llvm-svn: 286270
2016-11-08 19:10:25 +00:00
Chad Rosier c244349b85 Remove unused include. NFC.
llvm-svn: 286250
2016-11-08 16:51:19 +00:00
Dehao Chen 2ca9be330b Use the last 7 bits to represent the discriminator to fit it in 1 byte ULEB128 (NFC).
From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte.

llvm-svn: 286245
2016-11-08 16:32:32 +00:00
Pablo Barrio 9f45254138 [JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.

Patch by James Molloy and Pablo Barrio.

Reviewers: rengolin, haicheng, sebpop

Subscribers: jojo, jmolloy, llvm-commits

Differential Revision: https://reviews.llvm.org/D26391

llvm-svn: 286236
2016-11-08 14:53:30 +00:00
Sanjoy Das 4aeb080db3 [TRE] Remove dead code
Address review by Eli Friedman on rL286147.

llvm-svn: 286165
2016-11-07 22:17:37 +00:00
Dehao Chen d74e1e161d Reset debug loc to OldInduction in InnerLoopVectorizer::createInductionVariable. (NFC)
This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator().

llvm-svn: 286159
2016-11-07 21:59:40 +00:00
Sanjoy Das e06ef141fc Avoid tail recursion elimination across calls with operand bundles
Summary:
In some specific scenarios with well understood operand bundle types
(like `"deopt"`) it may be possible to go ahead and convert recursion to
iteration, but TailRecursionElimination does not have that logic today
so avoid doing the right thing for now.

I need some input on whether `"funclet"` operand bundles should also
block tail recursion elimination.  If not, I'll allow TRE across calls
with `"funclet"` operand bundles and add a test case.

Reviewers: rnk, majnemer, nlewycky, ahatanak

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26270

llvm-svn: 286147
2016-11-07 21:01:49 +00:00
Evgeniy Stepanov cd729d6236 Use -fsanitize-recover instead of -mllvm -msan-keep-going.
Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26352

llvm-svn: 286145
2016-11-07 21:00:10 +00:00
Kuba Brecka 44e875ad5b [tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part
Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this.

Differential Revision: https://reviews.llvm.org/D26266

llvm-svn: 286135
2016-11-07 19:09:56 +00:00
Benjamin Kramer 1697d39eef [MemCpyOpt] Don't emit IR in an unspecified order
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.

llvm-svn: 286126
2016-11-07 17:47:28 +00:00
Chad Rosier 611b73b1f9 Fix 80-column violations. NFC.
llvm-svn: 286117
2016-11-07 16:28:04 +00:00
Sanjay Patel 86408a8048 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
Justin Lebar 54b0be048e [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.

It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range.  But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.

Fixes PR30914.

Reviewers: jeremyhu

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26323

llvm-svn: 286038
2016-11-05 16:47:25 +00:00
Chandler Carruth d9ef4b6601 Only log the visit of a return instruction if we in fact found a return
instruction.

This avoids dereferencing null in the debug logging if the instruction
was not in fact a return instruction. This potential bug was found by
PVS-Studio.

This actually fixes the last of the "dereferenced a pointer before
checking it for null" reports in the recent PVS-Studio run. However,
there are quite a few reports of this nature that I did not do anything
to fix because they are pretty glaring false positives. They usually
took the form of quite clear correlated checks or a check made in
a separate function. I've even added asserts anywhere this correlation
wasn't pretty obvious and fundamental to the code.

llvm-svn: 285988
2016-11-04 06:59:50 +00:00
Xinliang David Li f450b88191 Fix typo
llvm-svn: 285978
2016-11-04 03:00:52 +00:00
Chandler Carruth fca1ff0da2 Fix a bug found by inspection by PVS-Studio.
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.

We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101

llvm-svn: 285930
2016-11-03 16:39:25 +00:00
Teresa Johnson 0515fb8d4b [ThinLTO] Handle distributed backend case when doing renaming
Summary:
The recent change I made to consult the summary when deciding whether to
rename (to handle inline asm) in r285513 broke the distributed build
case. In a distributed backend we will only have a portion of the
combined index, specifically for imported modules we only have the
summaries for any imported definitions. When renaming on import we were
asserting because no summary entry was found for a local reference being
linked in (def wasn't imported).

We only need to consult the summary for a renaming decision for the
exporting module. For imports, we would have prevented importing any
references to NoRename values already.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26250

llvm-svn: 285871
2016-11-03 01:07:16 +00:00
Greg Bedwell 5fc6f94591 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Eli Friedman b6befc3bc4 DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Bjorn Pettersson 7424c8ccd1 [Reassociate] Skip analysis of dead code to avoid infinite loop.
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).

The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30818

Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini

Subscribers: dberlin

Differential Revision: https://reviews.llvm.org/D26154

llvm-svn: 285793
2016-11-02 08:55:19 +00:00
George Burgess IV 66837aba0a [MemorySSA] Tighten up types to make our API prettier. NFC.
Patch by bryant.

Differential Revision: https://reviews.llvm.org/D26126

llvm-svn: 285750
2016-11-01 21:17:46 +00:00
Sanjay Patel c3d89842ad [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel c0339c77ef [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Sanjay Patel 644d7c3b8a [InstCombine] clean up adjustMinMax(); NFCI
1. Change param names for readability
2. Change pointer param to ref
3. Early exit to reduce indent
4. Change switch to if/else

llvm-svn: 285718
2016-11-01 18:15:03 +00:00
Sanjay Patel 7ce658388b [InstCombine] add helper function for adjustMinMax(); NFCI
This is just a cut and paste; clean-up and enhancements to follow.

llvm-svn: 285715
2016-11-01 17:46:08 +00:00
Simon Pilgrim 6dd8fab443 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
Evgeniy Stepanov 1bd9fc7098 Fix a typo.
Found with PVS-Studio here: http://www.viva64.com/en/b/0446/

llvm-svn: 285652
2016-10-31 22:42:39 +00:00
Kuba Brecka a28c9e8f09 [asan] Move instrumented null-terminated strings to a special section, LLVM part
On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings.

Differential Revision: https://reviews.llvm.org/D25026

llvm-svn: 285619
2016-10-31 18:51:58 +00:00
Dorit Nuzman bf2c15b5dc Second attempt at r285517.
llvm-svn: 285568
2016-10-31 13:17:31 +00:00
Dorit Nuzman 06903d16af Revert r285517 due to build failures.
llvm-svn: 285518
2016-10-30 14:34:57 +00:00
Dorit Nuzman 3c1c658f24 [LoopVectorize] Make interleaved-accesses analysis less conservative about
possible pointer-wrap-around concerns, in some cases.

Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.

Differential Revision: https://reviews.llvm.org/D25276

llvm-svn: 285517
2016-10-30 12:23:26 +00:00
Teresa Johnson bf28c8fa45 [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm
Summary:
Instead of using the workaround of suppressing the entire index for
modules that call inline asm that may reference locals, use the
NoRename flag on the summary for any locals in the llvm.used set, and
add a reference edge from any functions containing inline asm.

This avoids issues from having no summaries despite the module defining
global values, which was preventing more aggressive index-based
optimization. It will be followed by a subsequent patch to make a
similar fix for local references in module level asm (to fix PR30610).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26121

llvm-svn: 285513
2016-10-30 05:40:44 +00:00
Teresa Johnson 38d4df714c [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)
Rename as suggested in code review for D26063.

llvm-svn: 285508
2016-10-29 21:52:23 +00:00
Teresa Johnson 1b9c2be8f4 [ThinLTO] Use NoPromote flag in summary during promotion
Summary:
Replace the check of whether a GV has a section with the flag check
in the summary. This is in preparation for using the NoPromote flag
to convey other situations when we can't promote (e.g. locals used in
inline asm).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26063

llvm-svn: 285507
2016-10-29 21:31:48 +00:00
Sanjay Patel 978f827d12 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Justin Lebar 0ede5fb1bb Don't leave unused divs/rems sitting around in BypassSlowDivision.
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.

This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection.  If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.

Specifically, in NVPTX, we want to compute rem from the output of div,
if available.  But if a div is not available, we want to leave the rem
alone.  This transformation is overeager if div is always available.

Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change.  But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.

Reviewers: tra

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26088

llvm-svn: 285460
2016-10-28 21:43:54 +00:00
Justin Lebar 468bf73209 Don't claim the udiv created in BypassSlowDivision is exact.
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.

  udiv exact i32 %a, %b

"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D26097

llvm-svn: 285459
2016-10-28 21:43:51 +00:00
Matt Arsenault ef00283425 SpeculativeExecution: Allow speculating more inst types
Partial step towards removing the whitelist and only
using TTI's cost.

llvm-svn: 285438
2016-10-28 20:00:33 +00:00
George Burgess IV 013fd7315f [MemorySSA] Add const to getClobberingMemoryAccess.
Thanks to bryant for the patch!

Differential Revision: https://reviews.llvm.org/D26086

llvm-svn: 285432
2016-10-28 19:22:46 +00:00
Igor Laevsky c3ccf5d77b [LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.

Differential Revision: https://reviews.llvm.org/D25873

llvm-svn: 285394
2016-10-28 12:57:20 +00:00
Davide Italiano 631cd27f29 [Reassociate] Removing instructions mutates the IR.
Fixes PR 30784. Discussed with Justin, who pointed out that
in the new PassManager infrastructure we can have more fine-grained
control on which analyses we want to preserve, but this is the
best we can do with the current infrastructure.

llvm-svn: 285380
2016-10-28 02:47:09 +00:00
Teresa Johnson 58fbc916a0 [ThinLTO] Rename HasSection to NoRename (NFC)
Summary:
This is in preparation for a change to utilize this flag for symbols
referenced/defined in either inline or module level assembly.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26048

llvm-svn: 285376
2016-10-28 02:24:59 +00:00
Sanjay Patel c0de9c9e40 [InstCombine] fix foldSPFofSPF() to handle vector splats
llvm-svn: 285345
2016-10-27 21:19:40 +00:00
Haicheng Wu 430b3e4893 [LoopUnroll] Check partial unrolling is enabled before initialization. NFC.
Differential Revision: https://reviews.llvm.org/D23891

llvm-svn: 285330
2016-10-27 18:40:02 +00:00
Sanjay Patel 611f9f92fc [InstCombine] handle simple vector integer constants in IsFreeToInvert
llvm-svn: 285318
2016-10-27 17:30:50 +00:00
Dehao Chen b94c09baa0 Add Loop Sink pass to reverse the LICM based of basic block frequency.
Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency.

Reviewers: hfinkel, davidxl, chandlerc

Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22778

llvm-svn: 285308
2016-10-27 16:30:08 +00:00
Alexey Bataev 46c0278e7d [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer.
After successfull horizontal reduction vectorization attempt for PHI node
vectorizer tries to update root binary op by combining vectorized tree
and the ReductionPHI node. But during vectorization this ReductionPHI
can be vectorized itself and replaced by the `undef` value, while the
instruction itself is marked for deletion. This 'marked for deletion'
PHI node then can be used in new binary operation, causing "Use still
stuck around after Def is destroyed" crash upon PHI node deletion.

Also the test is fixed to make it perform actual testing.

Differential Revision: https://reviews.llvm.org/D25671

llvm-svn: 285286
2016-10-27 12:02:28 +00:00
Dehao Chen e713000eb6 Introduce updateDiscriminator interface to DILocation to make it cleaner assigning discriminators.
Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later.

Reviewers: dnovillo, dblaikie, aprantl, echristo

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25959

llvm-svn: 285207
2016-10-26 15:48:45 +00:00
Sanjay Patel 8d7196bfde [InstCombine] clean up commonCastTransforms; NFC
1. Use 'auto' with dyn_cast.
2. Variables start with a capital letter.
3. Use proper punctuation in comments.

llvm-svn: 285200
2016-10-26 14:52:35 +00:00
Andrea Di Biagio 9bcb064f19 [IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly reuse the debug location of the original comparison.
When the loop exit condition is canonicalized as a != compaison, reuse the
debug location of the original (non canonical) comparison.

Before this patch, the debug location of the new icmp was obtained from the
loop latch terminator. This patch fixes the issue by correctly setting the
IRBuilder's "current debug location" to the location of the original compare.

Differential Revision: https://reviews.llvm.org/D25953

llvm-svn: 285185
2016-10-26 10:28:32 +00:00
Peter Collingbourne 7b7bac367c Cloning: Also clone global variable attached metadata.
llvm-svn: 285161
2016-10-26 02:57:33 +00:00
Evgeniy Stepanov ea6d49d3ee Utility functions for appending to llvm.used/llvm.compiler.used.
llvm-svn: 285143
2016-10-25 23:53:31 +00:00
Rong Xu 33308f92eb [PGO] Fix select instruction annotation
Summary:
Select instruction annotation in IR PGO uses the edge count to infer the
branch count. It's currently placed in setInstrumentedCounts() where
no all the BB counts have been computed. This leads to wrong branch weights.
Move the annotation after all BB counts are populated.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25961

llvm-svn: 285128
2016-10-25 21:47:24 +00:00
Guozhi Wei ae541f6a71 [InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996
The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996.

The problem is with following code

xB = load (type B); 
xA = load (type A); 
+yA = (A)xB; B -> A
+zAn = PHI[yA, xA]; PHI 
+zBn = (B)zAn; // A -> B
store zAn;
store zBn;

optimizeBitCastFromPhi generates

+zBn = (B)zAn; // A -> B

and expects it will be combined with the following store instruction to another

store zAn 

Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely.

optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions.

Differential Revision: https://reviews.llvm.org/D23896

llvm-svn: 285116
2016-10-25 20:43:42 +00:00
Sanjay Patel f3dda13bd2 [InstCombine] Ensure that truncated int types are legal.
Fixes the FIXMEs in D25952 and rL285075.

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25955

llvm-svn: 285108
2016-10-25 20:11:47 +00:00
Matthew Simpson c62266d680 [LV] Sink scalar operands of predicated instructions
When we predicate an instruction (div, rem, store) we place the instruction in
its own basic block within the vectorized loop. If a predicated instruction has
scalar operands, it's possible to recursively sink these scalar expressions
into the predicated block so that they might avoid execution. This patch sinks
as much scalar computation as possible into predicated blocks. We previously
were able to sink such operands only if they were extractelement instructions.

Differential Revision: https://reviews.llvm.org/D25632

llvm-svn: 285097
2016-10-25 18:59:45 +00:00
Michael Ilseman e542804343 Add -strip-nonlinetable-debuginfo capability
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.

The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches.  For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.

The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.

Thanks to Adrian Prantl for stewarding this patch!

llvm-svn: 285094
2016-10-25 18:44:13 +00:00
Michael Kuperstein cffedc4a94 Fix 80-char violations. NFC.
llvm-svn: 285092
2016-10-25 18:31:23 +00:00
Dehao Chen c1472b5092 Move discriminator assignment to where it is used. (NFC)
llvm-svn: 285084
2016-10-25 16:50:27 +00:00
Andrea Di Biagio 824cabd06d [IndVarSimplify][Dwarf] When widening the IV increment, correctly set the debug loc.
When indvars widened an induction variable, the debug location for the loop
increment computation was incorrectly set equal to the debug loc of the loop
latch terminator.

This patch fixes the issue by propagating the correct location from the
original loop increment instruction to the new widened increment.

Differential Revision: https://reviews.llvm.org/D25872

llvm-svn: 285083
2016-10-25 16:45:17 +00:00
Geoff Berry 91e9a5cc23 [EarlyCSE] Make MemorySSA memory dependency check more aggressive.
Now that MemorySSA keeps track of whether MemoryUses are optimized, use
getClobberingMemoryAccess() to check MemoryUse memory dependencies since
it should no longer be so expensive.

This is a follow-up change to https://reviews.llvm.org/D25881

llvm-svn: 285080
2016-10-25 16:18:47 +00:00
Sanjay Patel e3de152530 fix formatting; NFC
llvm-svn: 285078
2016-10-25 16:12:31 +00:00
Sanjay Patel d59f7f9047 [InstCombine] add test and code comment to show potentially misguided icmp trunc transform
llvm-svn: 285075
2016-10-25 15:16:39 +00:00
Peter Collingbourne 4f3b2df9bb GlobalDCE: Restore a statement accidentally removed in r285048.
llvm-svn: 285052
2016-10-25 02:57:27 +00:00
Peter Collingbourne 7695cb6da8 GlobalDCE: Deduplicate code. NFCI.
llvm-svn: 285048
2016-10-25 01:58:26 +00:00
Davide Italiano c3e0ce8f85 Merge two if conditions into one. NFCI.
llvm-svn: 285008
2016-10-24 19:41:47 +00:00
Adrian Prantl 28d2d281e7 add-discriminators: Fix handling of lexical scopes.
This fixes a bug in the handling of lexical scopes, when more than one
scope is defined on the same line or functions are inlined into call
sites that are on the same line as the function definition. This
situation can easily happen in macro expansions.

The problem is solved by introducing a SmallDenseMap<DIScope *,
DILexicalBlockFile *, 1> that keeps track of all the different lexical
scopes that share a line/file location.

Fixes PR30681.

llvm-svn: 284998
2016-10-24 18:23:51 +00:00
Rong Xu b05bac940d Check the number of Args in LibCallsShrinkWrap.
Some library fucntions can have no argument.

llvm-svn: 284989
2016-10-24 16:50:12 +00:00
Geoff Berry 6815468768 [EarlyCSE] Optimize MemoryPhis and reduce memory clobber queries w/ MemorySSA
Summary:
When using MemorySSA, re-optimize MemoryPhis when removing a store since
this may create MemoryPhis with all identical arguments.

Also, when using MemorySSA to check if two MemoryUses are reading from
the same version of the heap, use the defining access instead of calling
getClobberingAccess, since the latter can currently result in many more
AA calls.  Once the MemorySSA use optimization tracking changes are
done, we can remove this limitation, which should result in more loads
being CSE'd.

Reviewers: dberlin

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D25881

llvm-svn: 284984
2016-10-24 15:54:00 +00:00
Nico Weber b38d341106 Revert 284971.
It seems to break selfhost on some bots, see e.g.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22

llvm-svn: 284979
2016-10-24 14:52:04 +00:00
Pablo Barrio f9e0d0b7d0 [JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.

Patch by James Molloy and Pablo Barrio.

Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop

Subscribers: jojo, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D25477

llvm-svn: 284971
2016-10-24 13:04:45 +00:00
Daniel Berlin f5361139bb Now that VS2013 is gone, make a memoryssa structure an anonymous union again
llvm-svn: 284910
2016-10-22 04:15:41 +00:00
Davide Italiano 738837eed9 [CtorUtils] Modernize. No functional changes intended.
llvm-svn: 284904
2016-10-22 01:21:24 +00:00
Peter Collingbourne ecdd58f1d6 Analysis: Move llvm::getConstantRangeFromMetadata to IR library.
We're about to start using it there.

Differential Revision: https://reviews.llvm.org/D25877

llvm-svn: 284865
2016-10-21 19:59:26 +00:00
Anna Thomas 0860259434 [StripGCRelocates] New pass to remove gc.relocates added by RS4GC
Summary:
Utility pass to remove gc.relocates created by rewrite statepoints for GC.
With respect to safepoint verification, the IR generated would be incorrect, and cannot run
as such.

This would be a single transformation on the final optimized IR.
The benefit of the pass is for easy analysis when the IRs are 'polluted' by too
many gc.relocates.
Added tests.

test run: All RS4GC tests with -verify option. Local downstream tests on large
IR files. This also works when the pointer being gc.relocated is another
gc.relocate.

Reviewers: sanjoy, reames

Subscribers: beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D25096

llvm-svn: 284855
2016-10-21 18:43:16 +00:00
John Brawn 84b21835f1 [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818
2016-10-21 11:08:48 +00:00
Davide Italiano d15477b09d Revert "[GVN/PRE] Hoist global values outside of loops."
There's no agreement about this patch. I personally find the
PRE machinery of the current GVN hard enough to reason about
that I'm not sure I'll try to land this again, instead of working
on the rewrite).

llvm-svn: 284796
2016-10-21 01:37:02 +00:00
Daniel Berlin cd2deacac6 [MSSA] Avoid unnecessary use walks when calling getClobberingMemoryAccess
Summary:
This allows us to mark when uses have been optimized.
This lets us avoid rewalking (IE when people call getClobberingAccess on everything), and also
enables us to later relax the requirement of use optimization during updates with less cost.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25172

llvm-svn: 284771
2016-10-20 20:13:45 +00:00
Benjamin Kramer 26b2593b24 [GVN] Use defaulted members. No functional change.
llvm-svn: 284726
2016-10-20 13:09:12 +00:00
Benjamin Kramer 2a8bef8769 Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

llvm-svn: 284721
2016-10-20 12:20:28 +00:00
Artur Pilipenko 5c6ef75485 [IndVarSimplify] Teach calculatePostIncRange to take guards into account
Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D25739

llvm-svn: 284632
2016-10-19 19:43:54 +00:00
Matthew Simpson 41fa838f07 [LV] Avoid emitting trivially dead instructions
Some instructions from the original loop, when vectorized, can become trivially
dead. This happens because of the way we structure the new loop. For example,
we create new induction variables and induction variable "steps" in the new
loop. Thus, when we go to vectorize the original induction variable update, it
may no longer be needed due to the instructions we've already created. This
patch prevents us from creating these redundant instructions. This reduces code
size before simplification and allows greater flexibility in code generation
since we have fewer unnecessary instruction uses.

Differential Revision: https://reviews.llvm.org/D25631

llvm-svn: 284631
2016-10-19 19:22:02 +00:00
Artur Pilipenko f2d5dc5dc6 [IndVarSimplify] Use control-dependent range information to prove non-negativity
This change is motivated by the case when IndVarSimplify doesn't widen a comparison of IV increment because it can't prove IV increment being non-negative. We end up with a redundant trunc of the widened increment on this example.

for.body:
  %i = phi i32 [ %start, %for.body.lr.ph ], [ %i.inc, %for.inc ]
  %within_limits = icmp ult i32 %i, 64
  br i1 %within_limits, label %continue, label %for.end

continue:
  %i.i64 = zext i32 %i to i64
  %arrayidx = getelementptr inbounds i32, i32* %base, i64 %i.i64
  %val = load i32, i32* %arrayidx, align 4
  br label %for.inc

for.inc:
  %i.inc = add nsw nuw i32 %i, 1
  %cmp = icmp slt i32 %i.inc, %limit
  br i1 %cmp, label %for.body, label %for.end

There is a range check inside of the loop which guarantees the IV to be non-negative. NSW on the increment guarantees that the increment is also non-negative. Teach IndVarSimplify to use the range check to prove non-negativity of loop increments.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D25738

llvm-svn: 284629
2016-10-19 18:59:03 +00:00
Vitaly Buka 490fda3366 [asan] Replace std::to_string with llvm::to_string
llvm-svn: 284557
2016-10-19 00:16:56 +00:00
Vitaly Buka 5910a92560 [asan] Simplify calculation of stack frame layout extraction calculation of stack description into separate function.
Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25754

llvm-svn: 284547
2016-10-18 23:29:52 +00:00