Commit Graph

7641 Commits

Author SHA1 Message Date
Anna Thomas a2ca902033 [SCEV] Teach SCEV to find maxBECount when loop endbound is variant
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.

This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.

This provides more information to users of SCEV and can be used to
improve identification of finite loops.

Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38825

llvm-svn: 315683
2017-10-13 14:30:43 +00:00
Daniel Jasper 3344a21236 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

llvm-svn: 315680
2017-10-13 14:04:21 +00:00
Sanjoy Das e6b995f2b2 [SCEV] Maintain loop use lists, and use them in forgetLoop
Summary:
Currently we do not correctly invalidate memoized results for add recurrences
that were created directly (i.e. they were not created from a `Value`).  This
change fixes this by keeping loop use lists and using the loop use lists to
determine which SCEV expressions to invalidate.

Here are some statistics on the number of uses of in the use lists of all loops
on a clang bootstrap (config: release, no asserts):

     Count: 731310
       Min: 1
      Mean: 8.555150
50th %time: 4
95th %tile: 25
99th %tile: 53
       Max: 433

Reviewers: atrick, sunfish, mkazantsev

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38434

llvm-svn: 315672
2017-10-13 05:50:52 +00:00
Sanjay Patel e272be7c9a [ValueTracking] return zero when there's conflict in known bits of a shift (PR34838)
Poison allows us to return a better result than undef.

llvm-svn: 315595
2017-10-12 17:31:46 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Hiroshi Inoue b49b015bed [ScheduleDAGInstrs] fix behavior of getUnderlyingObjectsForCodeGen when no identifiable object found
This patch fixes the bug introduced in https://reviews.llvm.org/D35907; the bug is reported by http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171002/491452.html.

Before D35907, when GetUnderlyingObjects fails to find an identifiable object, allMMOsOkay lambda in getUnderlyingObjectsForInstr returns false and Objects vector is cleared. This behavior is unintentionally changed by D35907.

This patch makes the behavior for such case same as the previous behavior.
Since D35907 introduced a wrapper function getUnderlyingObjectsForCodeGen around GetUnderlyingObjects, getUnderlyingObjectsForCodeGen is modified to return a boolean value to ask the caller to clear the Objects vector.

Differential Revision: https://reviews.llvm.org/D38735

llvm-svn: 315565
2017-10-12 06:26:04 +00:00
Daniel Neilson 5acfd1dd78 [SCEV] Properly handle the case of a non-constant start with a zero accum in ScalarEvolution::createAddRecFromPHIWithCastsImpl
Summary:
 This patch fixes an error in the patch to ScalarEvolution::createAddRecFromPHIWithCastsImpl
made in D37265. In that patch we handle the cases where the either the start or accum values can be
zero after truncation. But, we assume that the start value must be a constant if the accum is
zero. This is clearly an erroneous assumption. This change removes that assumption.

Reviewers: sanjoy, dorit, mkazantsev

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38814

llvm-svn: 315491
2017-10-11 19:05:14 +00:00
Vivek Pandya 9590658fb8 [NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure
parameterized emit() calls

Summary: This is not functional change to adopt new emit() API added in r313691.

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38285

llvm-svn: 315476
2017-10-11 17:12:59 +00:00
Adam Nemet 0965da2055 Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249
2017-10-09 23:19:02 +00:00
Matthew Simpson 49ee814996 [SparsePropagation] Move member definitions to header (NFC)
AbstractLatticeFunction and SparseSolver are class templates parameterized by a
lattice value, so we need to move these member functions over to the header.

Differential Revision: https://reviews.llvm.org/D38561

llvm-svn: 314996
2017-10-05 18:03:30 +00:00
Sanjoy Das 005b88c0a6 Do not call Loop::getName on possibly dead loops
This fixes PR34832.

llvm-svn: 314938
2017-10-04 22:02:27 +00:00
Jun Bum Lim d40e03c2d8 Recommit : Use the basic cost if a GEP is not used as addressing mode
Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if an user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

llvm-svn: 314923
2017-10-04 18:33:52 +00:00
Adam Nemet 6c381b7a2e [OptRemark] Move YAML writing to IR
Before the patch this was in Analysis.  Moving it to IR and making it implicit
part of LLVMContext::diagnose allows the full opt-remark facility to be used
outside passes e.g. the pass manager.  Jessica is planning to use this to
report function size after each pass.  The same could be used for time
reports.

Tested with BUILD_SHARED_LIBS=On.

llvm-svn: 314909
2017-10-04 15:18:11 +00:00
Adam Nemet f31b1f310c Move verbosity check for remarks to the diag handler
Test needs some slight adjustment because we no longer check the existence of
BFI but rather that the actual hotness is set on the remark.  If entry_count
is not set getBlockProfileCount returns None.

llvm-svn: 314874
2017-10-04 04:26:23 +00:00
Hans Wennborg 9a9048e19f Revert r314806 "[SLP] Vectorize jumbled memory loads."
All the buildbots are red, e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask' of
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
>
> Reviewed By: Ayal
>
> Subscribers: hans, mzolotukhin
>
> Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 314824
2017-10-03 18:32:29 +00:00
Mohammad Shahid 1d5422f27f [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: hans, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 314806
2017-10-03 15:28:48 +00:00
Evgeny Astigeevich d3558b5d5e [InlineCost, NFC] Extract code dealing with inbounds GEPs from visitGetElementPtr into a function
The code responsible for analysis of inbounds GEPs is extracted into a separate
function: CallAnalyzer::canFoldInboundsGEP. With the patch SROA
enabling/disabling code is localized at one place instead of spreading across
the code of CallAnalyzer::visitGetElementPtr.

Differential Revision: https://reviews.llvm.org/D38233

llvm-svn: 314787
2017-10-03 12:00:40 +00:00
Mikael Holmen 6efe507e42 [Lint] Avoid failed assertion by fetching the proper pointer type
Summary:
When checking if a constant expression is a noop cast we fetched the
IntPtrType by doing DL->getIntPtrType(V->getType())). However, there can
be cases where V doesn't return a pointer, and then getIntPtrType()
triggers an assertion.

Now we pass DataLayout to isNoopCast so the method itself can determine
what the IntPtrType is.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D37894

llvm-svn: 314763
2017-10-03 06:03:49 +00:00
Daniel Berlin 67fbb351e3 SparseSolver: Rename getOrInitValueState to getValueState, matching what SCCP calls it
llvm-svn: 314744
2017-10-03 00:26:21 +00:00
Haicheng Wu 25f6c196d7 [InstSimplify] teach SimplifySelectInst() to fold more vector selects
Call ConstantFoldSelectInstruction() to fold cases like below

select <2 x i1><i1 true, i1 false>, <2 x i8> <i8 0, i8 1>, <2 x i8> <i8 2, i8 3>

All operands are constants and the condition has mixed true and false conditions.

Differential Revision: https://reviews.llvm.org/D38369

llvm-svn: 314741
2017-10-02 23:43:52 +00:00
Daniel Berlin 4d825bcf09 Template the sparse propagation solver instead of using void pointers
Summary:
This avoids using void * as the type of the lattice value and ugly casts needed to make that happen.
(If folks want to use references, etc, they can use a reference_wrapper).

Reviewers: davide, mssimpso

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38476

llvm-svn: 314734
2017-10-02 22:49:49 +00:00
Xin Tong c063c3f09d Revert "Fix typo [NFC]"
This reverts commit e60b5028619be1c81bd039d63a0627dac32d38f9.

Incorrectly include changes that are not typo fix.

llvm-svn: 314614
2017-10-01 00:09:53 +00:00
Xin Tong efec219e1b Fix typo [NFC]
llvm-svn: 314613
2017-10-01 00:07:24 +00:00
Alex Shlyapnikov e76aa3b0b2 Revert "Use the basic cost if a GEP is not used as addressing mode"
This reverts commit r314517.

This commit crashes sanitizer bots, for example:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167

Stack snippet:
...
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0
...

llvm-svn: 314560
2017-09-29 22:04:45 +00:00
Jun Bum Lim 0e16a59e83 Use the basic cost if a GEP is not used as addressing mode
Summary:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target.
However, since it doesn't check its actual users, it will return FREE even in cases
where the GEP cannot be folded away as a part of actual addressing mode.
For example, if an user of the GEP is a call instruction taking the GEP as a parameter,
then the GEP may not be folded in isel.

Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng

Reviewed By: hfinkel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38085

llvm-svn: 314517
2017-09-29 14:50:16 +00:00
Florian Hahn 8af01573a3 [LVI] Move LVILatticeVal class to separate header file (NFC).
Summary:
This allows sharing the lattice value code between LVI and SCCP (D36656). 

It also adds a `satisfiesPredicate` function, used by D36656.

Reviewers: davide, sanjoy, efriedma

Reviewed By: sanjoy

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D37591

llvm-svn: 314411
2017-09-28 11:09:22 +00:00
Sanjoy Das def1729dc4 Use a BumpPtrAllocator for Loop objects
Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38201

llvm-svn: 314375
2017-09-28 02:45:42 +00:00
Haicheng Wu 3ec848bc50 [InlineCost] add visitSelectInst()
InlineCost can understand Select IR now.  This patch finds free Select IRs and
continue the propagation of SimplifiedValues, ConstantOffsetPtrs, and
SROAArgValues.

Differential Revision: https://reviews.llvm.org/D37198

llvm-svn: 314307
2017-09-27 14:44:56 +00:00
Daniel Berlin 97f34e887f MemorySSAUpdater: Only add phis to insertedphis if we actually inserted them, not if we just found existing ones
llvm-svn: 314273
2017-09-27 05:35:19 +00:00
Matthias Braun cc603ee3d5 TargetLibraryInfo: Stop guessing wchar_t size
Usually the frontend communicates the size of wchar_t via metadata and
we can optimize wcslen (and possibly other calls in the future). In
cases without the wchar_size metadata we would previously try to guess
the correct size based on the target triple; however this is fragile to
keep up to date and may miss users manually changing the size via flags.
Better be safe and stop guessing and optimizing if the frontend didn't
communicate the size.

Differential Revision: https://reviews.llvm.org/D38106

llvm-svn: 314185
2017-09-26 02:36:57 +00:00
Michael Liao b30286d81c Remove trailing whitespaces.
llvm-svn: 314115
2017-09-25 16:21:21 +00:00
Clement Courbet 2807c0a442 [CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion.
Summary:
Right now there are two functions with the same name, one does the work
and the other one returns true if expansion is needed. Rename
TargetTransformInfo::expandMemCmp to make it more consistent with other
members of TargetTransformInfo.

Remove the unused Instruction* parameter.

Differential Revision: https://reviews.llvm.org/D38165

llvm-svn: 314096
2017-09-25 06:35:16 +00:00
Daniel Neilson 1341ac2ced [SCEV] Generalize folding of trunc(x)+n*trunc(y) into folding m*trunc(x)+n*trunc(y)
Summary:
A SCEV such as:
 {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>

can be folded into, simply, {%v2,+,0}. However, the current code in ::getAddExpr()
will not try to apply the simplification m*trunc(x)+n*trunc(y) -> trunc(trunc(m)*x+trunc(n)*y)
because it only keys off having a non-multiplied trunc as the first term in the simplification.

This patch generalizes this code to try to do a more generic fold of these trunc
expressions.

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37888

llvm-svn: 313988
2017-09-22 15:47:57 +00:00
Sanjoy Das 388b012f4e Rename markAsErased to erase, as pointed out in a previous review; NFC
llvm-svn: 313951
2017-09-22 01:47:41 +00:00
Hans Wennborg 57c3341ada Revert r313771 "[SLP] Vectorize jumbled memory loads."
This broke the buildbots, e.g.
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask'
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Subscribers: mzolotukhin
>
> Reviewed By: ayal
>
> Differential Revision: https://reviews.llvm.org/D36130
>
> Review comments updated accordingly
>
> Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0
>
> Added a TODO for sortLoadAccesses API
>
> Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58
>
> Modified the TODO for sortLoadAccesses API
>
> Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565
>
> Review comment update for using OpdNum to insert the mask in respective location
>
> Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce
>
> Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase
>
> Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

llvm-svn: 313781
2017-09-20 18:00:03 +00:00
Mohammad Shahid 2b281de576 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask'
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Subscribers: mzolotukhin

Reviewed By: ayal

Differential Revision: https://reviews.llvm.org/D36130

Review comments updated accordingly

Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0

Added a TODO for sortLoadAccesses API

Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58

Modified the TODO for sortLoadAccesses API

Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565

Review comment update for using OpdNum to insert the mask in respective location

Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce

Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase

Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b
llvm-svn: 313771
2017-09-20 17:19:57 +00:00
Alexander Kornienko 6a140234ed Revert r313736: "[SLP] Vectorize jumbled memory loads."
The revision breaks buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio

llvm-svn: 313758
2017-09-20 14:53:07 +00:00
Alexander Kornienko 7302344bdf Revert r313753: "Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp"
llvm-svn: 313757
2017-09-20 14:52:56 +00:00
Alexander Kornienko 6c629b5728 Fix a -Wsign-compare warning in LoopAccessAnalysis.cpp
llvm-svn: 313753
2017-09-20 12:18:22 +00:00
Mohammad Shahid f8db9bd857 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

Commit after rebase for patch D36130

Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab
llvm-svn: 313736
2017-09-20 08:18:28 +00:00
Sanjoy Das 09613b122e Tighten the invariants around LoopBase::invalidate
Summary:
With this change:
 - Methods in LoopBase trip an assert if the receiver has been invalidated
 - LoopBase::clear frees up the memory held the LoopBase instance

This change also shuffles things around as necessary to work with this stricter invariant.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38055

llvm-svn: 313708
2017-09-20 02:31:57 +00:00
Sanjoy Das 66a004ac0c Clang-format few files to make later diffs leaner; NFC
llvm-svn: 313705
2017-09-20 01:12:09 +00:00
Sanjoy Das 76ab23234c [LoopInfo] Make LoopBase and Loop destructors non-public
Summary:
See comment for why I think this is a good idea.

This change also:

 - Removes an SCEV test case.  The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop.
 - Updates the loop pass manager test case to deal with a private ~Loop::Loop.
 - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37996

llvm-svn: 313695
2017-09-19 23:19:00 +00:00
Sanjay Patel 0d4fd5b668 [InstSimplify] fold sdiv/srem based on compare of dividend and divisor
This should bring signed div/rem analysis up to the same level as unsigned. 
We use icmp simplification to determine when the divisor is known greater than the dividend.

Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits.
There are extra tests for the signed-min-value special cases.

Alive proofs:
http://rise4fun.com/Alive/WI5

Differential Revision: https://reviews.llvm.org/D37713

llvm-svn: 313264
2017-09-14 14:59:07 +00:00
Sanjay Patel cca8f7853f [InstSimplify] clean up div/rem handling; NFCI
The idea to make an 'isDivZero' helper was suggested for the signed case in D37713:
https://reviews.llvm.org/D37713

This clean-up makes it clear that D37713 is just filling the gap for signed div/rem,
removes unnecessary code, and allows us to remove a bit of duplicated code from the
planned improvement in D37713.

llvm-svn: 313261
2017-09-14 14:09:11 +00:00
Chandler Carruth 7376ae88eb [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
invalidated SCCs even when we do not have an updated SCC to redirect
towards.

This comes up in a fairly subtle and surprising circumstance: we need to
have a connected but internal node in the call graph which later becomes
a disconnected island, and then gets deleted. All of this needs to
happen mid-CGSCC walk. Because it is disconnected, we have no way of
computing a new "current" SCC when it gets deleted. Instead, we need to
explicitly check for a deleted "current" SCC and bail out of the current
CGSCC step. This will bubble all the way up to the post-order walk and
then resume correctly.

I've included minimal tests for this bug. The specific behavior
matches something we've seen in the wild with the new PM combined with
ThinLTO and sample PGO, but I've not yet confirmed whether this is the
only issue there.

llvm-svn: 313242
2017-09-14 08:33:57 +00:00
Alon Kom 682cfc1d4c [LV] Fix maximum legal VF calculation
This patch fixes pr34283, which exposed that the computation of
maximum legal width for vectorization was wrong, because it relied
on MaxInterleaveFactor to obtain the maximum stride used in the loop,
however not all strided accesses in the loop have an interleave-group
associated with them.
Instead of recording the maximum stride in the loop, which can be over
conservative (e.g. if the access with the maximum stride is not involved
in the dependence limitation), this patch tracks the actual maximum legal
width imposed by accesses that are involved in dependencies.

Differential Revision: https://reviews.llvm.org/D37507

llvm-svn: 313237
2017-09-14 07:40:02 +00:00
Easwaran Raman 4924bb002d [Inliner] Add another way to compute full inline cost.
Summary:
Full inline cost is computed when -inline-cost-full is true or ORE is
non-null. This patch adds another way to compute full inline cost by
adding a field to InlineParams. This will be used by SampleProfileLoader
to check legality of inlining a callee that it wants to inline.

Reviewers: danielcdh, haicheng

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37819

llvm-svn: 313185
2017-09-13 20:16:02 +00:00
Hiroshi Yamauchi a43913cfaf Add options to dump PGO counts in text.
Summary:
Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text.

This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37776

llvm-svn: 313159
2017-09-13 17:20:38 +00:00
Teresa Johnson cbdc5ff628 [ThinLTO] AliasSummary should not have any references
Summary: References should only be on the aliasee.

Reviewers: pcc

Subscribers: llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D37814

llvm-svn: 313158
2017-09-13 17:10:24 +00:00
Silviu Baranga ac920f7716 [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs
Summary:
LAA can only emit run-time alias checks for pointers with affine AddRec
SCEV expressions. However, non-AddRecExprs can be now be converted to
affine AddRecExprs using SCEV predicates.

This change tries to add the minimal set of SCEV predicates in order
to enable run-time alias checking.

Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel

Reviewed By: hfinkel

Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D17080

llvm-svn: 313012
2017-09-12 07:48:22 +00:00
Marcello Maggioni ce90060d1c [ScalarEvolution] Refactor forgetLoop() to improve performance
forgetLoop() has pretty bad performance because it goes over
the same instructions over and over again in particular when
nested loop are involved.
The refactoring changes the function to a not-recursive function
and reusing the allocation for data-structures and the Visited
set.

NFCI

Differential Revision: https://reviews.llvm.org/D37659

llvm-svn: 312920
2017-09-11 15:44:20 +00:00
Sanjay Patel fa877fd464 [InstSimplify] reorder methods; NFC
I'm trying to refactor some shared code for integer div/rem,
but I keep having to scroll through fdiv. The FP ops have
nothing in common with the integer ops, so I'm moving FP
below everything else. 

While here, improve a couple of comments and fix some formatting.

llvm-svn: 312913
2017-09-11 13:34:27 +00:00
Sanjay Patel 5876189ff1 [InstSimplify] refactor udiv/urem code and add tests; NFCI
This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.

llvm-svn: 312885
2017-09-10 17:55:08 +00:00
Nuno Lopes 404f106d71 Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null

Differential Revision: https://reviews.llvm.org/D37628

llvm-svn: 312869
2017-09-09 18:23:11 +00:00
Sanjay Patel 6fd4391ddd [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 

llvm-svn: 312862
2017-09-09 13:38:18 +00:00
Guozhi Wei 62d6414465 [TargetTransformInfo] Add a new public interface getInstructionCost
Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface:

  enum TargetCostKind {
    TCK_RecipThroughput, ///< Reciprocal throughput.
    TCK_Latency,         ///< The latency of instruction.
    TCK_CodeSize         ///< Instruction code size.
  };

  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;

All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model.

This patch also provides a simple default implementation of getInstructionLatency.

The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways:

   Add more detail into this function.
   Add getXXXLatency function and call it from here.
   Implement target specific getInstructionLatency function.

Differential Revision: https://reviews.llvm.org/D37170

llvm-svn: 312832
2017-09-08 22:29:17 +00:00
Alexey Bataev 6dd29fccb8 [SLP] Support for horizontal min/max reduction.
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.

Differential revision: https://reviews.llvm.org/D27846

llvm-svn: 312791
2017-09-08 13:49:36 +00:00
Peter Collingbourne 681fbb64a4 ModuleSummaryAnalysis: Correctly handle all function operand references.
The current code that handles personality functions when creating a
module summary does not correctly handle the case where a function's
personality function operand refers to the function indirectly
(e.g. via a bitcast). This patch handles such cases by treating
personality function references like any other reference, i.e. by
adding them to the function's reference list. This has the minor side
benefit of allowing personality functions to participate in early
dead stripping.

We do this by calling findRefEdges on the function itself. This way
we also end up handling other function operands (specifically prefix
data and prologue data) for free.

Differential Revision: https://reviews.llvm.org/D37553

llvm-svn: 312698
2017-09-07 05:35:35 +00:00
Matt Arsenault 3ced3d90c3 InstSimplify: canonicalize is idempotent
llvm-svn: 312685
2017-09-07 01:21:43 +00:00
Nuno Lopes ba1c9f7aee Fix PR33878: BasicAA incorrectly assumes different address spaces don't alias
Remove code that assumed that a nullptr of address space != 0 couldnt alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0.
This code was written before address spaces were introduced

Differential Revision: https://reviews.llvm.org/D37518

llvm-svn: 312648
2017-09-06 16:55:31 +00:00
Sanjay Patel 6840c5ff75 [ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants
This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite():
https://bugs.llvm.org/show_bug.cgi?id=27145

In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno
with a constant operand.

But while looking at those patterns, I realized we were missing a canonicalization for nonzero
constants. Rather than limiting to just folds for constants, we're adding a general value
tracking method for this based on an existing DAG helper.

By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps()
and pick up missing vector folds.

Differential Revision: https://reviews.llvm.org/D37427

llvm-svn: 312591
2017-09-05 23:13:13 +00:00
Daniel Neilson 3f0e4ad833 [SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values
Summary:
 When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible
for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If
this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert.

 Such a PHISCEV is possible if either the start value or the accumulator value is a constant value
that not equal to its truncated value, and if the truncated value is zero.

 This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking
whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is
equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator
value are constants then we check whether the P2 and/or P3 predicates are known false at compile
time; if either is, then we bail out of constructing the AddRec.

Reviewers: sanjoy, mkazantsev, silviu.baranga

Reviewed By: mkazantsev

Subscribers: mkazantsev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37265

llvm-svn: 312568
2017-09-05 19:54:03 +00:00
Eugene Zelenko 75075efe5e [Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 312383
2017-09-01 21:37:29 +00:00
Craig Topper 924f20262b [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.

This allows some code to be removed from InstSimplify that was doing this functionality.

This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.

There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.

Differential Revision: https://reviews.llvm.org/D37158

llvm-svn: 312382
2017-09-01 21:27:34 +00:00
Peter Collingbourne 5e8b94c137 ModuleSummaryAnalysis: Correctly handle refs from function inline asm to module inline asm.
If a function contains inline asm and the module-level inline asm
contains the definition of a local symbol, prevent the function from
being imported in case the function-level inline asm refers to a
symbol in the module-level inline asm.

Differential Revision: https://reviews.llvm.org/D37370

llvm-svn: 312332
2017-09-01 16:24:02 +00:00
Alexandre Isoard 405728fd47 [SCEV] Add URem support to SCEV
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to

    %q = udiv <ty> %t, %b
    %s = mul <ty> nuw %q, %b
    %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented
with minimal effort using that relation:

    %r --> (-%b * (%t /u %b)) + %t

We implement two special cases:

  - if %b is 1, the result is always 0
  - if %b is a power-of-two, we produce a zext/trunc based expression instead

That is, the following code:

    %r = urem i32 %t, 65536

Produces:

    %r --> (zext i16 (trunc i32 %a to i16) to i32)

Note that while this helps get a tighter bound on the range analysis and the
known-bits analysis, this exposes some normalization shortcoming of SCEVs:

    %div = udim i32 %a, 65536
    %mul = mul i32 %div, 65536
    %rem = urem i32 %a, 65536
    %add = add i32 %mul, %rem

Will usually not be reduced.

llvm-svn: 312329
2017-09-01 14:59:59 +00:00
Eugene Zelenko fa6434bebb [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC).
llvm-svn: 312289
2017-08-31 21:56:16 +00:00
Adam Nemet 4846e66fdd Remove an unnecessary const_cast.
I think that this is dating back to when emit used to take a const reference.

llvm-svn: 311948
2017-08-28 23:00:13 +00:00
Don Hinton a67e13129d [Dominators] Remove redundant explicit template instantiation.
Summary:
Remove redundant explicit template instantiation.

This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase.

Reviewers: kuhar, andrewrk, davide, hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37185

llvm-svn: 311835
2017-08-26 21:08:51 +00:00
Hiroshi Yamauchi 63e17ebf8b Add options to dump block frequency/branch probability info in text.
Summary:
Add options -print-bfi/-print-bpi that dump block frequency and branch
probability info like -view-block-freq-propagation-dags and
-view-machine-block-freq-propagation-dags do but in text.

This is useful when the graph is very large and complex (the dot command
crashes, lines/edges too close to tell apart, hard to navigate without textual
search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37165

llvm-svn: 311822
2017-08-26 00:31:00 +00:00
Haicheng Wu 61995364de [InlineCost] Small changes to early exit condition. NFC.
Change the early exit condition from Cost > Threshold to Cost >= Threshold
because the inline condition is Cost < Threshold.

Differential Revision: https://reviews.llvm.org/D37087

llvm-svn: 311791
2017-08-25 19:00:33 +00:00
Michael Kruse c0a6aab6b6 Normlize to LF line endings.
Commit r297442 introduced mixed CRLF/LF line endings to two files.
Normalize to to LF-only line endings.

llvm-svn: 311774
2017-08-25 12:38:53 +00:00
Dehao Chen f0e27e63e7 Move accurate-sample-profile into the function attribute.
Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO.

Reviewers: davidxl, rsmith

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37113

llvm-svn: 311706
2017-08-24 21:37:04 +00:00
Tobias Grosser d7eb619299 Model cache size and associativity in TargetTransformInfo
Summary:
We add the precise cache sizes and associativity for the following Intel
architectures:

  - Penry
  - Nehalem
  - Westmere
  - Sandy Bridge
  - Ivy Bridge
  - Haswell
  - Broadwell
  - Skylake
  - Kabylake

Polly uses since several months a performance model for BLAS computations that
derives optimal cache and register tile sizes from cache and latency
information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS", by Tze Meng Low published at TOMS 2016).
While bootstrapping this model, these target values have been kept in Polly.
However, as our implementation is now rather mature, it seems time to teach
LLVM itself about cache sizes.

Interestingly, L1 and L2 cache sizes are pretty constant across
micro-architectures, hence a set of architecture specific default values
seems like a good start. They can be expanded to more target specific values,
in case certain newer architectures require different values. For now a set
of Intel architectures are provided.

Just as a little teaser, for a simple gemm kernel this model allows us to
improve performance from 1.2s to 0.27s. For gemm kernels with less optimal
memory layouts even larger speedups can be reported.

Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb

Reviewed By: fhahn, asb

Subscribers: lsaba, asb, pollydev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37051

llvm-svn: 311647
2017-08-24 09:46:25 +00:00
Rong Xu 15848e5977 [PGO] Set edge weights for indirectbr instruction with profile counts
Current PGO only annotates the edge weight for branch and switch instructions
with profile counts. We should also annotate the indirectbr instruction as
all the information is there. This patch enables the annotating for indirectbr
instructions. Also uses this annotation in branch probability analysis.

Differential Revision: https://reviews.llvm.org/D37074

llvm-svn: 311604
2017-08-23 21:36:02 +00:00
George Rimar 1e94ca115d [lib/Analysis] - Mark personality functions as live.
This is PR33245.

Case I am fixing is next:
Imagine we have 2 BC files, one defines and uses personality routine,
second has only declaration and also uses it.

Previously algorithm computing dead symbols (llvm::computeDeadSymbols) did
not know about personality routines and leaved them dead even if function that
has routine was live.

As a result thinLTOInternalizeAndPromoteGUID() method changed binding for
such symbol to local. Later when LLD tried to link these objects it failed
because one object had undefined global symbol for routine and second
object contained local definition instead of global.

Patch set the live root flag on the corresponding FunctionSummary
for personality routines when we build the per-module summaries
during the compile step.

Differential revision: https://reviews.llvm.org/D36834

llvm-svn: 311432
2017-08-22 08:50:56 +00:00
Craig Topper 7227ebad9c [ValueTracking] Add assertions that the starting Depth in isKnownToBeAPowerOfTwo and ComputeNumSignBitsImpl is not above MaxDepth
The function does an equality check later to terminate the recursion, but that won't work if its starts out too high. Similar assert already exists in computeKnownBits.

llvm-svn: 311400
2017-08-21 22:56:12 +00:00
Haicheng Wu 0812c5bea3 [InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes.
Currently, the inline cost model will bail once the inline cost exceeds the
inline threshold in order to avoid unnecessary compile-time. However, when
debugging it is useful to compute the full cost, so this command line option
is added to override the default behavior.

I took over this work from Chad Rosier (mcrosier@codeaurora.org).

Differential Revision: https://reviews.llvm.org/D35850

llvm-svn: 311371
2017-08-21 20:00:09 +00:00
Chad Rosier 4eb18742ca [InlineCost] Add more debug during inline cost computation.
llvm-svn: 311370
2017-08-21 19:56:46 +00:00
Eugene Zelenko be709f2c19 [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311212
2017-08-18 23:51:26 +00:00
Amjad Aboud 88ffa3afe2 [InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.
Differential Revision: https://reviews.llvm.org/D36679

llvm-svn: 311206
2017-08-18 22:56:55 +00:00
Eugene Zelenko bb1b2d09cf [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311048
2017-08-16 22:07:40 +00:00
Sanjay Patel 042a53624c [DemandedBits] simplify call; NFC
llvm-svn: 311009
2017-08-16 14:28:23 +00:00
Craig Topper b1e4b1a070 [InstSimplify] Teach decomposeBitTestICmp to handle non-canonical compares
This adds support non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred.

Differential Revision: https://reviews.llvm.org/D36646

llvm-svn: 310893
2017-08-14 22:11:43 +00:00
Craig Topper 0aa3a19512 Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
This recommits r310869, with the moved files and no extra changes.

Original commit message:

This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310889
2017-08-14 21:39:51 +00:00
Chandler Carruth bba762a13f [InlineCost] Refactor the checks for different analyses to be a bit more
localized to the code that uses those analyses.

Technically, this can change behavior as we no longer require the
existence of the ProfileSummaryInfo analysis to use local profile
information via BFI. We didn't actually require the PSI to have an
interesting profile though, so this only really impacts the behavior in
non-default pass pipelines.

IMO, this makes it substantially less surprising how everything works --
before an analysis that wasn't actually used had to exist to trigger
*any* profile aware inlining. I think the new organization makes it more
obvious where various checks for profile signals happen.

Differential Revision: https://reviews.llvm.org/D36710

llvm-svn: 310888
2017-08-14 21:25:00 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Craig Topper 69fa8e0d99 Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.

llvm-svn: 310873
2017-08-14 19:09:32 +00:00
Craig Topper 9c7b881677 Revert r310870 "[InstCombine][InstSimplify] 'git add' two files that moved in r310869."
An extra change crept in here.

llvm-svn: 310872
2017-08-14 19:09:28 +00:00
Craig Topper 914c836842 [InstCombine][InstSimplify] 'git add' two files that moved in r310869.
llvm-svn: 310870
2017-08-14 19:01:32 +00:00
Craig Topper 2f0b450666 [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310869
2017-08-14 18:49:42 +00:00
Hal Finkel b03dd4be70 [ValueTracking] Don't delete assumes of side-effectful instructions
ValueTracking has to strike a balance when attempting to propagate information
backwards from assumes, because if the information is trivially propagated
backwards, it can appear to LLVM that the assumption is known to be true, and
therefore can be removed.

This is sound (because an assumption has no semantic effect except for causing
UB), but prevents the assume from allowing further optimizations.

The isEphemeralValueOf check exists to try and prevent this issue by not
removing the source of an assumption. This tries to make it a little bit more
general to handle the case of side-effectful instructions, such as in

  %0 = call i1 @get_val()
  %1 = xor i1 %0, true
  call void @llvm.assume(i1 %1)

Patch by Ariel Ben-Yehuda, thanks!

Differential Revision: https://reviews.llvm.org/D36590

llvm-svn: 310859
2017-08-14 17:11:43 +00:00
Chandler Carruth 37c7b08710 [ValueTracking] Revert r310583 which enabled functionality that still is
causing compile time issues.

Moreover, the patch *deleted* the flag in addition to changing the
default, and links to a code review that doesn't even discuss the flag
and just has an update to a Clang test case.

I've followed up on the commit thread to ask for numbers on compile time
at this point, leaving the flag in place until things stabilize, and
pointing at specific code that seems to exhibit excessive compile time
with this patch.

Original commit message for r310583:
"""
[ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2.

The original patch was an improvement to IR ValueTracking on
non-negative integers. It has been checked in to trunk (D18777,
r284022). But was disabled by default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
""""

llvm-svn: 310816
2017-08-14 07:03:24 +00:00
Eugene Zelenko 530851c2bc [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310766
2017-08-11 21:30:02 +00:00
Chandler Carruth 19913b22c0 [PM] Switch the CGSCC debug messages to use the standard LLVM debug
printing techniques with a DEBUG_TYPE controlling them.

It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).

Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.

llvm-svn: 310695
2017-08-11 05:47:13 +00:00
Nikolai Bozhenov d97136c182 [ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2.
The original patch was an improvement to IR ValueTracking on non-negative
integers. It has been checked in to trunk (D18777, r284022). But was disabled by
default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
 
Reviewers: reames, hfinkel
 
Differential Revision: https://reviews.llvm.org/D34101
 
Patch by: Olga Chupina <olga.chupina@intel.com>

llvm-svn: 310583
2017-08-10 11:24:57 +00:00
Chandler Carruth 9c161e894a [LCG] Fix an assert in a on-scope-exit lambda that checked the contents
of the returned value.

Checking the returned value from inside of a scoped exit isn't actually
valid. It happens to work when NRVO fires and the stars align, which
they reliably do with Clang but don't, for example, on MSVC builds.

llvm-svn: 310547
2017-08-10 03:05:21 +00:00
Hiroshi Yamauchi ccd412f48d [LVI] Fix LVI compile time regression around constantFoldUser()
Summary:
Avoid checking each operand and calling getValueFromCondition() before calling
constantFoldUser() when the instruction type isn't supported by
constantFoldUser().

This fixes a large compile time regression in an internal build.

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36552

llvm-svn: 310545
2017-08-10 02:23:14 +00:00
Craig Topper ba69187988 [InstSimplify] Add test cases that show that simplifySelectWithICmpCond doesn't work with non-canonical comparisons.
llvm-svn: 310542
2017-08-10 01:02:02 +00:00