Commit Graph

9646 Commits

Author SHA1 Message Date
Vitaly Buka 3d8149db3c [StackSafety,NFC] Don't rerun on LiveIn change 2020-06-19 21:29:31 -07:00
Vitaly Buka 0e1bdeafc9 [StackSafety,NFC] Fix comment 2020-06-19 03:11:13 -07:00
Vitaly Buka f224f3d0f2 [StackSafety] Add StackLifetime::isAliveAfter
This function is going to be added into StackSafety checks.
This patch uses function in ::print implementation to make sure
that it works as expected.
2020-06-19 02:32:17 -07:00
Vitaly Buka 306c257b00 [SafeStack,NFC] Print liveness for all instrunctions 2020-06-19 02:32:17 -07:00
Vitaly Buka 20b1094a04 [StackSafety,NFC] Replace map with vector
We don't need to lookup InstructionNumbering by number, so
we can use vector with index as assigned number.
2020-06-19 02:32:17 -07:00
Vitaly Buka 7b27c09f63 [StackSafety,NFC] Don't test terminators
Code does not track terminators and do not expose them through interface.
State there is just a state of the last instruction or entry.
So this information is just redundant and doesn't need to be tested.
2020-06-19 02:32:17 -07:00
Vitaly Buka fcd67665a8 [StackSafety] Add "Must Live" logic
Summary:
Extend StackLifetime with option to calculate liveliness
where alloca is only considered alive on basic block entry
if all non-dead predecessors had it alive at terminators.

Depends on D82043.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82124
2020-06-18 16:53:37 -07:00
Vitaly Buka f672791e08 [StackSafety] Add pass for StackLifetime testing
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82043
2020-06-18 16:34:18 -07:00
Michael Liao 2defe55722 [TTI] Expose isNoopAddrSpaceCast in TTI.
Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82025
2020-06-18 14:40:47 -04:00
Sameer Sahasrabuddhe 7aad220795 [DA] conservatively mark the join of every divergent branch
For a loop, a join block is a block that is reachable along multiple
disjoint paths from the exiting block of a loop. If the exit condition
of the loop is divergent, then such join blocks must also be marked
divergent. This currently fails in some cases because not all join
blocks are identified correctly.

The workaround is to conservatively mark every join block of any
branch (not necessarily the exiting block of a loop) as divergent.

https://bugs.llvm.org/show_bug.cgi?id=46372

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81806
2020-06-18 17:39:20 +05:30
Simon Pilgrim a5f1f9c9b8 ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC.
Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency.

Add implicit header dependency to source files where necessary.
2020-06-17 15:48:23 +01:00
Kirill Naumov ea844c7520 Revert "[InlineCost] InlineCostAnnotationWriterPass introduced"
This reverts commit 37e06e8f5c.
2020-06-17 14:02:34 +00:00
Kirill Naumov dcf2a9f2ee Revert "[InlineCost] PrinterPass prints constants to which instructions are simplified"
This reverts commit 52b0db22f8.
2020-06-17 14:02:29 +00:00
Kirill Naumov 39a4505e34 Revert "[InlineCost] GetElementPtr with constant operands"
This reverts commit 34fba68d80.
2020-06-17 14:02:18 +00:00
Kirill Naumov 34fba68d80 [InlineCost] GetElementPtr with constant operands
If the GEP instruction contanins only constants as its arguments,
then it should be recognized as a constant. For now, there was
also added a flag to turn off this simplification if it causes
any regressions ("disable-gep-const-evaluation") which is off
by default. Once I gather needed data of the effectiveness of
this simplification, the flag will be deleted.

Reviewers: apilipenko, davidxl, mtrofin

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81026
2020-06-17 13:40:19 +00:00
Kirill Naumov 52b0db22f8 [InlineCost] PrinterPass prints constants to which instructions are simplified
This patch enables printing of constants to see which instructions were
constant-folded. Needed for tests and better visiual analysis of
inliner's work.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D81024
2020-06-17 13:40:18 +00:00
Kirill Naumov 37e06e8f5c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-17 13:40:17 +00:00
Benjamin Kramer 547b6da73c [CallPrinter] Remove static constructor.
No need to have std::string here. NFC.
2020-06-17 13:02:58 +02:00
Sjoerd Meijer 20835cff27 [TTI] Refactor emitGetActiveLaneMask
Refactor TTI hook emitGetActiveLaneMask and remove the unused arguments
as suggested in D79100.
2020-06-17 09:53:58 +01:00
Kirill Bobyrev 3847737fa4
[CallPrinter] Handle freq = 0 case
Improvement of the following revision:
bbc629ebd6

This might still be problematic if freq = 0, so it's better to check for
that.
2020-06-17 10:52:18 +02:00
Kirill Bobyrev bbc629ebd6
[CallPrinter] Fix maxFreq = 0 case
llvm::getHeatColor becomes a problem when maxFreq = 0 -> freq = 0 =>
log2(double(freq)) / log2(maxFreq) -> log2(0.) / log2(0.) which
results in illegal instruction on some architectures.

Problematic revision: https://reviews.llvm.org/D77172
2020-06-17 10:44:28 +02:00
Florian Hahn e4b58ea8c1 [MemDep] Also remove load instructions from NonLocalDesCache.
Currently load instructions are added to the cache for invariant pointer
group dependencies, but only pointer values are removed currently. That
leads to dangling AssertingVHs in the test case below, where we delete a
load from an invariant pointer group. We should also remove the entries
from the cache.

Fixes PR46054.

Reviewers: efriedma, hfinkel, asbirlea

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81726
2020-06-17 09:36:53 +01:00
Vitaly Buka d812efb121 [SafeStack,NFC] Fix names after files move
Summary: Depends on D81831.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81832
2020-06-17 01:08:40 -07:00
Vitaly Buka 6754a0e2ed [SafeStack,NFC] Move SafeStackColoring code
Summary:
This code is going to be used in StackSafety.
This patch is file move with minimal changes. Identifiers
will be fixed in the followup patch.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81831
2020-06-17 01:07:47 -07:00
Sameer Sahasrabuddhe d3963b3a5f [DA] propagate loop live-out values that get used in a branch
Values that are uniform within a loop but appear divergent to uses
outside the loop are "tainted" so that such uses are marked
divergent. But if such a use is a branch, then it's divergence needs
to be propagated. The simplest way to do that is to put the branch
back in the main worklist so that it is processed appropriately.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D81822
2020-06-17 09:21:00 +05:30
Kirill Naumov 369d00df60 [CallPrinter] Adding heat coloring to CallPrinter
This patch introduces the heat coloring of the Call Printer which is based
on the relative "hotness" of each function. The patch is a part of sequence of
three patches, related to graphs Heat Coloring.
Another feature added is the flag similar to "-cfg-dot-filename-prefix",
which allows to write the graph into a named .pdf

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77172
2020-06-16 21:15:29 +00:00
Christopher Tetreault b265cad93e [NFC] Bail out for scalable vectors before calling getNumElements
Summary:
Move the bail out logic to before constructing the Result and Lane
vectors. This is both potentially faster, and avoids calling
getNumElements on a potentially scalable vector

Reviewers: efriedma, sunfish, chandlerc, c-rhodes, fpetrogalli

Reviewed By: fpetrogalli

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81619
2020-06-16 13:41:29 -07:00
Christopher Tetreault 747486991c [SVE] Fix bad FixedVectorType cast in simplifyDivRem
Summary:
simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it
only does so for FixedVectorType

Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin

Reviewed By: spatel, david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81856
2020-06-16 13:17:05 -07:00
Hiroshi Yamauchi 6bc2b042f4 [TLI] Add four C++17 delete variants.
Summary:
delete(void*, unsigned int, align_val_t)
delete(void*, unsigned long, align_val_t)
delete[](void*, unsigned int, align_val_t)
delete[](void*, unsigned long, align_val_t)

Differential Revision: https://reviews.llvm.org/D81853
2020-06-16 11:12:02 -07:00
Sam Parker 7158f285a8 [CostModel] Unify getCFInstrCost
Have TTI::getInstructionThroughput call getUserCost for Br, Ret and
PHI. This now means that eveything in getInstructionThroughput is
handled by getUserCost.

Differential Revision: https://reviews.llvm.org/D79849
2020-06-16 08:40:54 +01:00
Mircea Trofin 296e47734e [llvm][NFC] Fix license on InlineFeaturesAnalysis.{h|cpp}
Summary: Also fixed the InlineAdvisor.cpp license.

Reviewers: rriddle

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81896
2020-06-15 19:34:33 -07:00
Mircea Trofin e2cc854015 [llvm][NFC] Move content of ML subdirectory into Analysis
The initial intent was to organize ML stuff in its own directory, but
it turns out that conflicts with llvm component layering policies: it
is not a component, because subsequent changes want to rely on other
analyses, which would create a cycle; and we don't have a reliable,
cross-platform mechanism to compile files in a subdirectory, and fit in
the existing LLVM build structure.

This change moves the files into Analysis, and subsequent changes will
leverage conditional compilation for those that have optional
dependencies.
2020-06-15 14:35:33 -07:00
Mircea Trofin 29e5722949 Revert "[llvm] Added support for stand-alone cmake object libraries."
This reverts commit 695c7d6313.

Breaks windows (e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/16497)

Likely to cause problems with XCode.
2020-06-15 12:15:39 -07:00
Mircea Trofin 695c7d6313 [llvm] Added support for stand-alone cmake object libraries.
Summary:
Currently, add_llvm_library would create an OBJECT library alongside
of a STATIC / SHARED library, but losing the link interface (its
elements would become dependencies instead). To support scenarios
where linking an object library also brings in its usage
requirements, this patch adds support for 'stand-alone' OBJECT
libraries - i.e. without an accompanying SHARED/STATIC library, and
maintaining the link interface defined by the user.

The support is via a new option, OBJECT_ONLY, to avoid breaking changes
- since just specifying "OBJECT" would currently imply also STATIC or
SHARED, depending on BUILD_SHARED_LIBS.

This is useful for cases where, for example, we want to build a part
of a component separately. Using a STATIC target would incur the risk
that symbols not referenced in the consumer would be dropped (which may
be undesirable).

The current application is the ML part of Analysis. It should be part
of the Analysis component, so it may reference other analyses; and (in
upcoming changes) it has dependencies on optional libraries.

Reviewers: karies, davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81447
2020-06-15 12:01:43 -07:00
Rahul Joshi 72d20b9604 [LLVM] Change isa<> to a variadic function template
Change isa<> to a variadic function template, so that it can be used to test against one of multiple types as follows:
   isa<Type0, Type1, Type2>(Val)

Differential Revision: https://reviews.llvm.org/D81045
2020-06-15 18:46:57 +00:00
Sam Parker 321ebfd175 [NFCI][CostModel] Unify FNeg cost
Enable TTIImpl::getUserCost to handle FNeg so that
getInstructionThroughput can call that instead. This means we can
remove the code in the AMDGPU backend too.

Differential Revision: https://reviews.llvm.org/D81635
2020-06-15 08:33:04 +01:00
Sam Parker 51541c068a [CostModel] Unify ExtractElement cost.
Move the cost modelling, with the reduction pattern matching, from
getInstructionThroughput into generic TTIImpl::getUserCost. The
modelling in the AMDGPU backend can now be removed.

Differential Revision: https://reviews.llvm.org/D81643
2020-06-15 08:27:14 +01:00
Florian Hahn 6176f04436 [LAA] Do not set CanDoRT to false for AS that do not need RT checks.
Alternative approach to D80570.

canCheckPtrAtRT already contains checks the figure out for which alias
sets runtime checks are needed. But it currently sets CanDoRT to false
for alias sets for which we cannot do RT checks but also do not need
any.

If we know that we do not need RT checks based on the number of
reads/writes in the alias set, we can skip processing the AS.

This patch also adds an assertion to ensure that DepCands does not
contain more than one write from the alias set.

Reviewers: Ayal, anemet, hfinkel, dmgreen

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D80622
2020-06-14 20:55:59 +01:00
Nikita Popov 862db369f8 [LVI] Fix class indentation (NFC)
This class uses a mix of different indentation levels, normalize it.
2020-06-14 15:42:27 +02:00
Nikita Popov 83e7230e5a [LVI] Cache lookup of experimental.guard intrinsic (NFC)
When LVI is performing assume intersections, it also checks for
llvm.experimental.guard intrinsics. To avoid unnecessary block
scans, it first checks whether this intrinsic is declared in the
module at all. I've noticed that we end up spending quite a lot
of time looking up that function again and again...

Avoid this by only looking it up once when LazyValueInfo is
constructed. This of course assumes that we don't introduce new
guard intrinsics (which is the case for all existing uses of LVI --
and even if it weren't, it would not introduce miscompiles, just
potentially lose optimization power.)

Differential Revision: https://reviews.llvm.org/D81796
2020-06-14 15:32:30 +02:00
Nikita Popov f87b785abe Reapply [LVI] Restructure caching to fix non-determinism
This was reverted due to a reported memory usage increase. However,
a test case was never provided, and I wasn't able to reproduce it
myself.

Relative to the original patch, I have moved the block cache
structure behind a unique_ptr, to avoid storing a huge structure
inside a DenseMap.

---

Variant on D70103 to fix https://bugs.llvm.org/show_bug.cgi?id=43909.
The caching is switched to always use a BB to cache entry map, which
then contains per-value caches. A separate set contains value handles
with a deletion callback. This allows us to properly invalidate
overdefined values.

A possible alternative would be to always cache by value first and
have per-BB maps/sets in the each cache entry. In that case we could
use a ValueMap and would avoid the separate value handle set. I went
with the BB indexing at the top level to make it easier to integrate
D69914, but possibly that's not the right choice.

Differential Revision: https://reviews.llvm.org/D70376
2020-06-13 11:31:40 +02:00
Mehdi Amini 339e49e2ca Fix GCC5 build by renaming variable used in 'auto' deduction (NFC)
GCC5 errors out with:

llvm/lib/Analysis/StackSafetyAnalysis.cpp:935:21: error: use of 'KV' before deduction of 'auto'
     for (auto &KV : KV.second.Params) {
                     ^
2020-06-13 03:08:56 +00:00
Vitaly Buka c1e47b47f8 [StackSafety] Run ThinLTO
Summary:
ThinLTO linking runs dataflow processing on collected
function parameters. Then StackSafetyGlobalInfoWrapperPass
in ThinLTO backend will run as usual looking up to external
symbol in the summary if needed.

Depends on D80985.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81242
2020-06-12 18:11:29 -07:00
Vitaly Buka e6ce0dc5de [StackSafety,NFC] Extract addOverflowNever 2020-06-12 17:42:32 -07:00
Vitaly Buka 999307323a [StackSafety] Fix byval handling
We don't need process paramenters which marked as
byval as we are not going to pass interested allocas
without copying.

If we pass value into byval argument, we just handle that
as Load of corresponding type and stop that branch of analysis.
2020-06-11 20:58:36 -07:00
Vitaly Buka a10fc165f5 [StackSafety,NFC] Fix use of CallBase API
Code does not need iterate arguments and can get ArgNo from
CallBase::getArgOperandNo.
2020-06-11 16:11:30 -07:00
Kirill Naumov 1022b5eb5b [InlineCost] Preparational patch for creation of Printer pass.
- Renaming the printer class, flag
- Refactoring
- Changing some tests

This patch is a preparational stage for introducing a new printing pass and new
functionality to the existing Annotation Writer. I plan to extend
this functionality for this tool to be more useful when looking at the inline
process.
2020-06-11 22:29:03 +00:00
Mircea Trofin e82eff7a03 [llvm][NFC] Factor some common data in InlineAdvice
Summary:
Other derivations will all want to emit optimization remarks and, as
part of that, use debug info.

Additionally, drive-by const-ing.

Reviewers: davidxl, dblaikie

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81507
2020-06-11 08:01:00 -07:00
Vitaly Buka 5b1c70a48d [StackSafety] Pass summary into codegen
Summary:
The patch wraps ThinLTO index into immutable
pass which can be used by StackSafety analysis.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80985
2020-06-10 21:02:54 -07:00
Vitaly Buka 4666953ce2 [StackSafety] Add info into function summary
Summary:
This patch adds optional field into function summary,
implements asm and bitcode serialization. YAML
serialization is omitted and can be added later if
needed.

This patch includes this information into summary only
if module contains at least one sanitize_memtag function.
In a near future MTE is the user of the analysis.
Later if needed we can provede more direct control
on when information is included into summary.

Reviewers: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80908
2020-06-10 02:43:28 -07:00
Sam Parker 09d30cb977 [CostModel] Unify Shuffle and InsertElement Costs
Extract the existing code from getInstructionThroughput into
TTImpl::getUserCost. The duplicated code in the AMDGPU backend has
also been removed.

Differential Revision: https://reviews.llvm.org/D81448
2020-06-10 09:13:34 +01:00
Sam Parker fa8bff0cd1 [CostModel] Unify getArithmeticInstrCost
Add the remaining arithmetic opcodes into the generic implementation
of getUserCost and then call this from getInstructionThroughput. Most
of the backends have been modified to return the base implementation
for cost kinds other RecipThroughput. The outlier here is AMDGPU
which already uses getArithmeticInstrCost for all the cost kinds.
This change means that most of the opcodes can be removed from that
backends implementation of getUserCost.

Differential Revision: https://reviews.llvm.org/D80992
2020-06-10 09:08:45 +01:00
Sam Parker 37289615c0 [NFCI][CostModel] Unify getCmpSelInstrCost
Add cases for icmp, fcmp and select into the switch statement of the
generic getUserCost implementation with getInstructionThroughput then
calling into it. The BasicTTI and backend implementations have be set
to return a default value (1) when a cost other than throughput is
being queried.

Differential Revision: https://reviews.llvm.org/D80550
2020-06-09 07:41:22 +01:00
Benjamin Kramer 3badd17b69 SmallPtrSet::find -> SmallPtrSet::count
The latter is more readable and more efficient. While there clean up
some double lookups. NFCI.
2020-06-07 22:38:08 +02:00
Simon Pilgrim f6cb987d50 DomTreeUpdater.h - refine includes. NFC.
We don't need any of its defs or many of its includes inside PostDominators.h - so split it and reduce the frontend load.
2020-06-07 16:57:48 +01:00
Simon Pilgrim 3642d38823 DependenceAnalysis.h - reduce AliasAnalysis.h include to forward declaration. NFC.
This requires the replacement of legacy class AliasAnalysis usages with AAResults (which it typedefs to anyhow)
2020-06-07 12:47:37 +01:00
Simon Pilgrim 1e9d2f908e OrderedInstructions.h - reduce includes to forward declarations. NFC. 2020-06-07 11:44:43 +01:00
Simon Pilgrim e5e33f23c7 CFG.h - reduce includes to forward declarations. NFC.
Remove unnecessary includes from CFG.cpp.

Fix implicit include dependency in X86WinEHState.cpp.
2020-06-06 15:06:42 +01:00
Simon Pilgrim 5006e551d3 LoopAnalysisManager.h - reduce includes to forward declarations. NFC.
Move implicit include dependencies down to header/source files.
2020-06-06 14:06:46 +01:00
Roman Lebedev 1eda9bfd61
[SCEV] ScalarEvolution::createSCEV(): Instruction::Or: drop bogus no-wrap flag detection
Summary:
That's just really wrong. While sure, if LHS is AddRec, and we could
propagate it's no-wrap flags, that doesn't make, because as long as
the operands of `or` had no common bits set, then the `add`
of these operands will never overflow: http://volta.cs.utah.edu:8080/z/gmt7Sy
IOW we need no propagation/detection, we are free to just set NUW+NSW.

But as rG39e3683534c83573da5c8b70c8adfb43948f601f shows,
even when the old code failed to "deduce" flags,
we'd eventually re-deduce them somewhere, later.

So let's just set them.

Reviewers: mkazantsev, reames, sanjoy, efriedma

Reviewed By: efriedma

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81246
2020-06-06 13:02:07 +03:00
Roman Lebedev c868335e24
[SCEV] ScalarEvolution::createSCEV(): clarify no-wrap flag propagation for shift by bitwidth-1
Summary:
There was this comment here previously:
```
-        // It is currently not resolved how to interpret NSW for left
-        // shift by BitWidth - 1, so we avoid applying flags in that
-        // case. Remove this check (or this comment) once the situation
-        // is resolved. See
-        // http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html
-        // and http://reviews.llvm.org/D8890 .
```
But langref was fixed in rL286785, and the behavior is pretty obvious:
http://volta.cs.utah.edu:8080/z/MM4WZP
^ nuw can always be propagated. nsw can be propagated if
either nuw is specified, or the shift is by *less* than bitwidth-1.

This mimics similar D81189 Reassociate change, alive2 is happy about that one.

I'm not sure `NUW` isn't being printed, but that seems unrelated.

Reviewers: mkazantsev, reames, sanjoy, nlopes, craig.topper, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81243
2020-06-06 13:02:07 +03:00
Simon Pilgrim ea0880ddef TypeMetadataUtils.h - reduce Instructions.h include to forward declaration. NFC.
Move implicit include dependencies down to source file.
2020-06-05 17:40:33 +01:00
Simon Pilgrim 44d86982d2 MemorySSAUpdater.h - reduce unnecessary includes to forward declarations. NFC.
Remove unnecessary MemoryAccess forward declaration as its already included from MemorySSA.h

Move implicit include dependencies down to source files.
2020-06-05 10:45:59 +01:00
Sam Parker 9303546b42 [CostModel] Unify getMemoryOpCost
Use getMemoryOpCost from the generic implementation of getUserCost
and have getInstructionThroughput return the result of that for loads
and stores.

This also means that the X86 implementation of getUserCost can be
removed with the functionality folded into its getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D80984
2020-06-05 10:13:38 +01:00
Vitaly Buka 3c32af58f6 [StackSafety,NFC] Ignore callee declarations
It's going to fail FunctionInfo lookup anyway.
2020-06-04 20:55:50 -07:00
Hiroshi Yamauchi e52a38db07 [PGO] Enable the working set size scaling under the partial sample PGO.
Summary: Following up D79831.

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80939
2020-06-04 11:30:54 -07:00
Vitaly Buka af6e054730 [StackSafety] Rename testing opts 2020-06-04 02:39:16 -07:00
Vitaly Buka 81826c7ac6 [StackSafety,NFC] Remove SCEVRewriteVisitor
Summary: Depends on D80956.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80976
2020-06-04 02:32:36 -07:00
Yevgeny Rouban dcfa78a4cc Extend InvokeInst !prof branch_weights metadata to unwind branches
Allow InvokeInst to have the second optional prof branch weight for
its unwind branch. InvokeInst is a terminator with two successors.
It might have its unwind branch taken many times. If so
the BranchProbabilityInfo unwind branch heuristic can be inaccurate.
This patch allows a higher accuracy calculated with both branch
weights set.

Changes:
 - A new section about InvokeInst is added to
   the BranchWeightMetadata page. It states the old information that
   missed in the doc and adds new about the second branch weight.
 - Verifier is changed to allow either 1 or 2 branch weights
   for InvokeInst.
 - A new test is written for BranchProbabilityInfo to demonstrate
   the main improvement of the simple fix in calcMetadataWeights().
 - Several new testcases are created for Inliner. Those check that
    both weights are accounted for invoke instruction weight
    calculation.
 - PGOUseFunc::setBranchWeights() is fixed to be applicable to
   InvokeInst.

Reviewers: davidxl, reames, xur, yamauchi
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80618
2020-06-04 15:37:15 +07:00
Kazu Hirata 347a599e5f [Inlining] Introduce -enable-npm-pgo-inline-deferral
Summary:
Experiments show that inline deferral past pre-inlining slightly
pessimizes the performance.

This patch introduces an option to control inline deferral during PGO.
The option defaults to true for now (that is, NFC).

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80776
2020-06-04 00:40:58 -07:00
Vitaly Buka 291dabefde [StackSafety,NFC] Add statistic counters 2020-06-03 16:12:08 -07:00
Dorit Nuzman a9fe69c359 [InstSimplify] fix bug in matching or-with-not op (PR46083) 2020-06-03 13:44:29 -04:00
Vitaly Buka 264d435ee1 [NFC,StackSafety] Fix template arg name 2020-06-03 02:39:21 -07:00
Jay Foad 7c7941fb4b [AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics (fix)
Try to fix Windows buildbots.
2020-06-03 09:44:33 +01:00
Vitaly Buka 6e51a080f7 [StackSafety,NFC] Convert to template internal stuff
It's going to be usefull for ThinLTO.
2020-06-03 01:36:20 -07:00
Vitaly Buka a019579fe5 [StackSafety,NFC] Rename internal class 2020-06-03 01:36:20 -07:00
Jay Foad c823cfde21 [AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics
Differential Revision: https://reviews.llvm.org/D80702
2020-06-03 09:34:22 +01:00
Vitaly Buka d3b7f90d00 [StackSafety] Skip non-pointer parameters
Summary: Depends on D80908.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80956
2020-06-03 01:16:39 -07:00
Vitaly Buka e128f01be9 [NFC, StackSafety] Change type of internal container
Summary: Depends on D80771.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: mehdi_amini, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80847
2020-06-03 01:05:10 -07:00
Mehdi Amini f9bb101d39 Revert "[NFC, StackSafety] Change type of internal container"
This reverts commit f62813e7ea.
GCC 5.3 build is broken.
2020-06-03 03:02:28 +00:00
Vitaly Buka f62813e7ea [NFC, StackSafety] Change type of internal container
Summary: Depends on D80771.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80847
2020-06-02 18:27:22 -07:00
Vitaly Buka 232d348c6e [MTE] Convert StackSafety into analysis
This lets us to remove !stack-safe metadata and
better controll when to perform StackSafety
analysis.

Reviewers: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80771
2020-06-02 16:08:14 -07:00
Yevgeny Rouban 07239c736a [BrachProbablityInfo] Proportional distribution of reachable probabilities
When fixing probability of unreachable edges in
BranchProbabilityInfo::calcMetadataWeights() proportionally distribute
remainder probability over the reachable edges. The old implementation
distributes the remainder probability evenly.
See examples in the fixed tests.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80611
2020-06-02 12:06:52 +07:00
Yevgeny Rouban 3bb0d95fdc [BrachProbablityInfo] Rename loop variables. NFC 2020-06-02 10:55:27 +07:00
Mircea Trofin 999ea25a9e [llvm][NFC] Cache FAM in InlineAdvisor
Summary:
This simplifies the interface by storing the function analysis manager
with the InlineAdvisor, and, thus, not requiring it be passed each time
we inquire for an advice.

Reviewers: davidxl, asbirlea

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80405
2020-06-01 13:02:34 -07:00
Hiroshi Yamauchi 6c27c61d32 [PGO] Improve the working set size heuristics under the partial sample PGO.
Summary:
The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize)
under the partial sample PGO may not be accurate because the profile is partial
and the number of hot profile counters in the ProfileSummary may not reflect the
actual working set size of the program being compiled.

To improve this, the (approximated) ratio of the the number of profile counters
of the program being compiled to the number of profile counters in the partial
sample profile is computed (which is called the partial profile ratio) and the
working set size of the profile is scaled by this ratio to reflect the working
set size of the program being compiled and used for the working set size
heuristics.

The partial profile ratio is approximated based on the number of the basic
blocks in the program and the NumCounts field in the ProfileSummary and computed
through the thin LTO indexing. This means that there is the limitation that the
scaled working set size is available to the thin LTO post link passes only.

Reviewers: davidxl

Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79831
2020-06-01 10:29:23 -07:00
Florian Hahn d99a1848c4 [BasicAA] Use known lower bounds for index values for size based check.
Currently, BasicAA does not exploit information about value ranges of
indexes. For example, consider the 2 pointers %a = %base and
%b = %base + %stride below, assuming they are used to access 4 elements.

If we know that %stride >= 4, we know the accesses do not alias. If
%stride is a constant, BasicAA currently gets that. But if the >= 4
constraint is encoded using an assume, it misses the NoAlias.

This patch extends DecomposedGEP to include an additional MinOtherOffset
field, which tracks the constant offset similar to the existing
OtherOffset, which the difference that it also includes non-negative
lower bounds on the range of the index value. When checking if the
distance between 2 accesses exceeds the access size, we can use this
improved bound.

For now this is limited to using non-negative lower bounds for indices,
as this conveniently skips cases where we do not have a useful lower
bound (because it is not constrained). We potential miss out in cases
where the lower bound is constrained but negative, but that can be
exploited in the future.

Reviewers: sanjoy, hfinkel, reames, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D76194
2020-05-30 16:20:42 +01:00
David Green a01c0049b1 [ConstantFolding] Constant folding for integer vector reduce intrinsics
This add constant folding for all the integer vector reduce intrinsics,
providing that the argument is a constant vector. zeroinitializer always
produces 0 for all intrinsics, and other values can be handled with
APInt operators.

Differential Revision: https://reviews.llvm.org/D80516
2020-05-29 17:58:42 +01:00
Sjoerd Meijer 7480ccbfc9 [TTI] New target hook emitGetActiveLaneMask
This is split off from D79100 and adds a new target hook emitGetActiveLaneMask
that can be queried to check if the intrinsic @llvm.get.active.lane.mask() is
supported by the backend and if it should be emitted for a given loop.

See also commit rG7fb8a40e5220 and its commit message for more details/context
on this new intrinsic.

Differential Revision: https://reviews.llvm.org/D80597
2020-05-29 09:10:58 +01:00
Vitaly Buka 791c78f5e0 [NFC,StackSafety] Add test flag 2020-05-28 15:38:12 -07:00
Vitaly Buka 6eb5679402 [NFC,StackSafety] clang-tidy warning fixes 2020-05-28 14:29:55 -07:00
Christopher Tetreault 434d122e94 [SVE] Eliminate calls to default-false VectorType::get() from Analysis
Reviewers: efriedma, fpetrogalli, kmclaughlin, sunfish

Reviewed By: fpetrogalli

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80324
2020-05-28 14:21:32 -07:00
Vitaly Buka 0e6628d37f [StackSafety] Lazy calculations
We are going to convert this into pure analysis, so
processing will be delayed up to the first safety request.
2020-05-28 13:32:57 -07:00
Vitaly Buka 2622cfbcd5 [NFC,StackSafety] Move internal offset calculation 2020-05-28 13:32:57 -07:00
Vitaly Buka 892c71a5bb [StackSafety] Don't run datafow on allocas
We need to process only parameters. Allocas access can be calculated
afterwards.
Also don't create fake function for aliases and just resolve them on
initialization.
2020-05-28 13:32:57 -07:00
Vitaly Buka 2f430f7a51 [StackSafety] Remove SetMetadata parameter 2020-05-28 13:32:57 -07:00
Hiroshi Yamauchi a7fa35a629 [ThinLTO] Compute the basic block count across modules.
Summary:
Count the per-module number of basic blocks when the module summary is computed
and sum them up during Thin LTO indexing.

This is used to estimate the working set size under the partial sample PGO.

This is split off of D79831.

Reviewers: davidxl, espindola

Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80403
2020-05-28 10:33:05 -07:00
Matt Arsenault d6671ee90c InferAddressSpaces: Handle ptrmask intrinsic
This one is slightly odd since it counts as an address expression,
which previously could never fail. Allow the existing TTI hook to
return the value to use, and re-use it for handling how to handle
ptrmask.

Handles the no-op addrspacecasts for AMDGPU. We could probably do
something better based on analysis of the mask value based on the
address space, but leave that for now.
2020-05-28 10:04:02 -04:00
Vitaly Buka 12cd4a5164 [NFC,StackSafety] Add StackSafetyGlobalInfo class 2020-05-27 20:07:12 -07:00
Vitaly Buka a70edc2b16 [NFC,StackSafety] Cleanup alloca size calculation 2020-05-27 17:47:02 -07:00
Mircea Trofin d14ee1553e [llvm][NFC] ProfileSummaryInfo - const-ify APIs
Follow-up from https://reviews.llvm.org/D79920
2020-05-27 17:14:41 -07:00
Fangrui Song be6bffe729 [CMake] Revert cf86a234ba
It is unnecessary after 993bbaf6a3
2020-05-27 15:29:22 -07:00
Fangrui Song 993bbaf6a3 [MLPolicies] Fix dependency and -DBUILD_SHARED_LIBS=on builds after D80579 2020-05-27 15:26:13 -07:00
Mircea Trofin cf86a234ba Fix shared libs build break introduced in rG98ef93eabd76 2020-05-27 15:12:16 -07:00
Mircea Trofin 98ef93eabd [llvm] Add function feature extraction analysis
Summary:
This patch introduces an analysis pass to extract function features,
which will be needed by the ML InlineAdvisor.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, dblaikie, jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80579
2020-05-27 13:38:50 -07:00
Vitaly Buka 804a39a201 [NFC,StackSafety] Rename some variables 2020-05-27 13:33:28 -07:00
Vitaly Buka 14f3357586 [StackSafety] Bailout more aggressively
Many edge cases, e.g. wrapped ranges, can be processed
precisely without bailout. However it's very unlikely that
memory access with min/max integer offsets will be
classified as safe anyway.
Early bailout may help with ThinLTO where we can
drop unsafe parameters from summaries.
2020-05-27 13:33:28 -07:00
Mircea Trofin fa3b587196 [llvm]NFC] Simplify ProfileSummaryInfo state transitions
ProfileSummaryInfo is updated seldom, as result of very specific
triggers. This patch clearly demarcates state updates from read-only uses.
This, arguably, improves readability and maintainability.
2020-05-27 11:58:37 -07:00
Rithik Sharma eadf295956 [CodeMoverUtils] Use dominator tree level to decide the direction of
code motion

Summary: Currently isSafeToMoveBefore uses DFS numbering for determining
the relative position of instruction and insert point which is not
always correct. This PR proposes the use of Dominator Tree depth for the
same. If a node is at a higher level than the insert point then it is
safe to say that we want to move in the forward direction.
Authored By: RithikSharma
Reviewer: Whitney, nikic, bmahjour, etiotto, fhahn
Reviewed By: Whitney
Subscribers: fhahn, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80084
2020-05-27 18:02:06 +00:00
Paul Walker 495f18292b [VFABI] Fix parsing of uniform parameters that shouldn't expect step or positional data.
Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80575
2020-05-27 16:07:45 +00:00
Florian Hahn 9b507b2127 [LAA] We only need pointer checks if there are non-zero checks (NFC).
If it turns out that we can do runtime checks, but there are no
runtime-checks to generate, set RtCheck.Need to false.

This can happen if we can prove statically that the pointers passed in
to canCheckPtrAtRT do not alias. This should not change any results, but
allows us to skip some work and assert that runtime checks are
generated, if LAA indicates that runtime checks are required.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79969

Note: This is a recommit of 259abfc7cb,
with some suggested renaming.
2020-05-27 12:47:36 +01:00
Florian Hahn 2d0389821e Revert "[LAA] We only need pointer checks if there are non-zero checks (NFC)."
This reverts commit 259abfc7cb.

Reverting this, as I missed a case where we return without setting
RtCheck.Need.
2020-05-27 12:39:45 +01:00
Florian Hahn 259abfc7cb [LAA] We only need pointer checks if there are non-zero checks (NFC).
If it turns out that we can do runtime checks, but there are no
runtime-checks to generate, set RtCheck.Need to false.

This can happen if we can prove statically that the pointers passed in
to canCheckPtrAtRT do not alias. This should not change any results, but
allows us to skip some work and assert that runtime checks are
generated, if LAA indicates that runtime checks are required.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79969
2020-05-27 12:37:20 +01:00
Vitaly Buka f6383643d9 [StackSafety] Bailout on some function calls
Don't miss values used in calls outside regular argument list.
2020-05-27 02:48:42 -07:00
Vitaly Buka b101c6251a [StackSafety] Ignore some use of values
We should ignore value used in MemTransferInst
as other then src/dst argument.
2020-05-27 02:48:41 -07:00
Vitaly Buka 32a1f60d11 [StackSafety] Use SCEV to find mem operation length 2020-05-26 23:22:37 -07:00
Vitaly Buka d0f1f5adfa [StackSafety] Use getSignedRange for offsets 2020-05-26 23:22:36 -07:00
Vitaly Buka b5ae70046b [StackSafety] Simplify SCEVRewriteVisitor
Probably NFC.
2020-05-26 18:09:43 -07:00
Vitaly Buka 4320d4aa1c [NFC, StackSafety] Add some missing includes 2020-05-26 18:09:43 -07:00
Vitaly Buka 5afef79ff4 [NFC, StackSafety] Remove duplicate code 2020-05-26 18:09:43 -07:00
Vitaly Buka f20ace6f33 [NFC, StackSafety] Better names for internal stuff
Remove const from some parameters as upcoming changes in ScalarEvolution
calls will need non const pointers.
2020-05-26 18:09:43 -07:00
Vitaly Buka 9abb0e8d5b [NFC, StackSafety] Remove unnecessary data 2020-05-26 14:13:20 -07:00
Vitaly Buka ecb66f50ee [NFC, StackSafety] Move FunctionInfo into :: namespace 2020-05-26 14:13:20 -07:00
Simon Pilgrim 0165cf7011 ObjCARCAnalysisUtils.h - remove unused includes. NFC.
We just need to include Passes.h in ObjCARCAliasAnalysis.cpp to compensate
2020-05-26 19:22:15 +01:00
Sanne Wouda 5bd97eb28a Fix MemoryLocation.h use without Instructions.h
MemoryLocation.h was changed to only include Instruction.h.  However,
cast<> still needs the full definiton, so move MemoryLocation::getOrNone
to the cpp file.
2020-05-26 17:19:14 +01:00
Serge Pavlov 4d20e31f73 [FPEnv] Intrinsic llvm.roundeven
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.

Differential Revision: https://reviews.llvm.org/D75670
2020-05-26 19:24:58 +07:00
Sam Parker bd9dce8f9a [CostModel] getUserCost for intrinsic throughput
Last part of recommitting 'Unify Intrinsic Costs'
259eb619ff. This patch now uses
getUserCost from getInstructionThroughput.

Differential Revision: https://reviews.llvm.org/D80012
2020-05-26 12:23:37 +01:00
Sam Parker 8aaabadece [CostModel] Unify getCastInstrCost
Add the remaining cast instruction opcodes to the base implementation
of getUserCost and directly return the result. This allows
getInstructionThroughput to return getUserCost for the casts. This
has required changes to PPC and SystemZ because they implement
getUserCost and/or getCastInstrCost with adjustments for vector
operations. Adjusts have also been made in the remaining backends
that implement the method so that they still produce a cost of zero
or one for cost kinds other than throughput.

Differential Revision: https://reviews.llvm.org/D79848
2020-05-26 11:29:57 +01:00
Sam Parker 871556a494 [CostModel] Unify Intrinsic Costs.
Recommitting most of the remaining changes from
259eb619ff, but excluding the call to
getUserCost from getInstructionThroughput. Though there's still no
test changes, I doubt that this is an NFC...

With the two getIntrinsicInstrCosts folded into one, now fold in the
scalar/code-size orientated getIntrinsicCost. The remaining scalar
intrinsics were memcpy, cttz and ctlz which now have special handling
in the BasicTTI implementation.

This had required a change in the AMDGPU backend for fabs as it
should always be 'free'. I've also changed the X86 backend to return
the BaseT implementation when the CostKind isn't RecipThroughput.

Differential Revision: https://reviews.llvm.org/D80012
2020-05-26 09:48:26 +01:00
Kazu Hirata cec20db588 [Inlining] Set inline-deferral-scale to 2.
Summary:
This patch sets inline-deferral-scale to 2.

Both internal and SPEC benchmarking show that 2 is the best number
among -1, 2, 3, and 4.

inline-deferral-scale  SPECint2006
------------------------------------------------------------
                   -1  38.0 (the default without this patch)
                    2  38.5
                    3  38.1
                    4  38.1

With the new default number, shouldBeDeferred returns true if:

  TotalCost < IC.getCost() * 2

where

  TotalCost is TotalSecondaryCost + IC.getCost() * NumCallerUsers.

If TotalCost >= 0 and NumCallerUsers >= 2, then
TotalCost >= IC.getCost() * 2, so shouldBeDeferred returns true only
when NumCallerUsers is 1.

Now, if TotalSecondaryCost < 0, which can happen if
InlineConstants::LastCallToStaticBonus, a huge number, has been
subtracted from TotalSecondaryCost, then TotalCost may be negative.
In this case, shouldBeDeferred may return true even when
NumCallerUsers >= 2.

Reviewers: davidxl, nikic

Reviewed By: davidxl

Subscribers: xbolva00, hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80229
2020-05-25 15:44:20 -07:00
Sanjay Patel 7eed772a27 [PatternMatch] abbreviate vector inst matchers; NFC
Readability is not reduced with these opcodes/match lines,
so reduce odds of awkward wrapping from 80-col limit.
2020-05-24 09:19:47 -04:00
Florian Hahn 8d04181198 [ValueTracking] Use assumptions in computeConstantRange.
This patch updates computeConstantRange to optionally take an assumption
cache as argument and use the available assumptions to limit the range
of the result.

Currently this is limited to assumptions that are comparisons.

Reviewers: reames, nikic, spatel, jdoerfert, lebedev.ri

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76193
2020-05-23 20:07:52 +01:00
Denis Antrushin 5451289aba [SCEV] Constant fold MultExpr before applying depth limit.
Summary:
Users of SCEV reasonably assume that multiplication of two constant
SCEVs will in turn be constant.
However, that is not always the case:
First, we can get here with reached depth limit, and will create
MultExpr SCEV `C1 * C2` and cache it.
Then, we can get here with the same operands, but with small depth
level. But this time we will find existing MultExpr SCEV and return
it, instead of expected constant SCEV.

This patch changes getMultExpr to not apply depth limit to all constant
operands expression, allowing them to be folded.

Reviewers: reames, mkazantsev

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79893
2020-05-22 18:34:32 +03:00
Matt Arsenault 88c20fa3d2 InstCombine: Add constant folding/simplify for amdgcn.ldexp intrinsic
This really belongs in InstructionSimplify since it doesn't introduce
new instructions. Put it in instcombine to avoid increasing the number
of passes considering target intrinsics.

I also noticed that we seem to now be interpreting strictfp attributes
on call sites, so try to handle that.
2020-05-22 08:21:38 -04:00
Sam Parker 259eb619ff Revert "[CostModel] Unify Intrinsic Costs."
This reverts commit de71def3f5.

This is causing some very large changes, so I'm first going to break
this patch down and re-commit in parts.
2020-05-21 12:50:24 +01:00
Sam Parker de71def3f5 [CostModel] Unify Intrinsic Costs.
With the two getIntrinsicInstrCosts folded into one, now fold in the
scalar/code-size orientated getIntrinsicCost. This involved sinking
cost of the TTIImpl into the base implementation, as it performs no
target checks. The opcodes remaining were memcpy, cttz and ctlz which
now have special handling in the BasicTTI implementation.
getInstructionThroughput can now directly return the result of
getUserCost.

This had required a change in the AMDGPU backend for fabs and its
always 'free'. I've also changed the X86 backend to return '1' for
any intrinsic when the CostKind isn't RecipThroughput.

Though this intended to be a non-functional change, there are many
paths being combined here so I would be very surprised if this didn't
have an effect.

Differential Revision: https://reviews.llvm.org/D80012
2020-05-21 07:38:25 +01:00
Sam Parker fb3ba38021 [CostModel] Remove getExtCost
This has not been implemented by any backends which appear to cover
the functionality through getCastInstrCost. Sink what there is in the
default implementation into BasicTTI.

Differential Revision: https://reviews.llvm.org/D78922
2020-05-21 07:18:06 +01:00
Yevgeny Rouban 8138487468 [BrachProbablityInfo] Set edge probabilities at once and fix calcMetadataWeights()
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting individual edge probability is error prone. More over it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user finished setting all
the probabilities.

Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-21 12:52:37 +07:00
Eli Friedman f26bdb539e Make Value::getPointerAlignment() return an Align, not a MaybeAlign.
If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.

Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer.  This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated.  I
updated clang's handling of C++ references, and added a release note for
this.

Differential Revision: https://reviews.llvm.org/D80072
2020-05-20 16:37:20 -07:00
Sam Parker 8cc911fa5b [NFCI][CostModel] Refactor getIntrinsicInstrCost
Combine the two API calls into one by introducing a structure to hold
the relevant data. This has the added benefit of moving the boiler
plate code for arguments and flags, into the constructors. This is
intended to be a non-functional change, but the complicated web of
logic involved here makes it very hard to guarantee.

Differential Revision: https://reviews.llvm.org/D79941
2020-05-20 11:59:08 +01:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Nikita Popov 5fae613a4f [LVI] Don't require DominatorTree in LVI (NFC)
After D76797 the dominator tree is no longer used in LVI, so we
can remove it as a pass dependency, and also get rid of the
dominator tree enabling/disabling logic in JumpThreading.

Apart from cleaning up the code, this also clarifies LVI
cache consistency, in that the LVI cache can no longer
depend on whether the DT was or wasn't enabled due to
pending DT updates at any given time.

Differential Revision: https://reviews.llvm.org/D76985
2020-05-19 20:21:46 +02:00
Jay Foad c1ae72d03f [IR] Revert r119493
r119493 protected against PHINode::hasConstantValue returning the PHI
node itself, but a later fix in r159687 means that can never happen, so
the workarounds are no longer required.
2020-05-19 13:17:11 +01:00
Eli Friedman 27b4e6931d [NFC] Replace MaybeAlign with Align in TargetTransformInfo. 2020-05-18 19:25:49 -07:00
Nikita Popov 736db2f710 [Loads] Require Align in isSafeToLoadUnconditionally() (NFC)
Now that load/store have required alignment, accept Align here.
This also avoids uses of getPointerElementType(), which is
incompatible with opaque pointers.
2020-05-18 20:50:35 +02:00
Nikita Popov 52e98f620c [Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC)
Now that load/store alignment is required, we no longer need most
of them. Also switch the getLoadStoreAlignment() helper to return
Align instead of MaybeAlign.
2020-05-17 22:19:15 +02:00
Nikita Popov 39beeeff20 [LVI] Don't use dominator tree in isValidAssumeForContext()
LVI and its consumers currently have quite a bit of complexity
related to dominator tree management. However, it doesn't look
like it is actually needed...

The only use of the dominator tree is inside isValidAssumeForContext().
However, due to the way LVI queries work, it is not needed:
If we query a value for some block, we will first get the edge values
from all predecessor blocks, which also includes an intersection with
assumptions that apply to the terminator of the predecessor. As such,
we will already have processed all assumptions from predecessor blocks
(this is actually stronger than what isValidAssumeForContext() does
with a DT, because this is capable of combining non-dominating
assumptions). The only additional assumptions we need to take into
account are those in the block being queried. And we don't need a
dominator tree for that.

This patch only removes the use of DT, I will drop the machinery
around it in a followup.

Differential Revision: https://reviews.llvm.org/D76797
2020-05-17 21:39:35 +02:00
Nikita Popov d86fff6ae7 [ValueTracking] Fix computeKnownBits() with bitwidth-changing ptrtoint
computeKnownBitsFromAssume() currently asserts if m_V matches a
ptrtoint that changes the bitwidth. Because InstCombine
canonicalizes ptrtoint instructions to use explicit zext/trunc,
we never ran into the issue in practice. I'm adding unit tests,
as I don't know if this can be triggered via IR anywhere.

Fix this by calling anyextOrTrunc(BitWidth) on the computed
KnownBits. Note that we are going from the KnownBits of the
ptrtoint result to the KnownBits of the ptrtoint operand,
so we need to truncate if the ptrtoint zexted and anyext if
the ptrtoint truncated.

Differential Revision: https://reviews.llvm.org/D79234
2020-05-16 14:17:11 +02:00
Vitaly Buka 6512cc7735 [NFC,StackSafety] Rename local function 2020-05-15 13:39:07 -07:00
Mircea Trofin 08e2386dee Revert "Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs""
This reverts commit 454de99a6f.

The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.

Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

Original Differential Revision: https://reviews.llvm.org/D79917
2020-05-15 12:29:16 -07:00
Anna Thomas 7cc3769adb [VectorUtils] Expose vector-function-abi-variant mangling as a utility.
Summary:
This change exposes the vector name mangling with LLVM ISA (used as part
of vector-function-abi-variant) as a utility.
This can then be used by front-ends that add this attribute.
Note that all parameters passed in to the function will be mangled with
the "v" token to identify that they are of of vector type. So, it is the
responsibility of the caller to confirm that all parameters in the
vectorized variant is of vector type.

Added unit test to show vector name mangling.

Reviewed-By: fpetrogalli, simoll

Differential Revision: https://reviews.llvm.org/D79867
2020-05-15 11:42:20 -04:00
Mircea Trofin 454de99a6f Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs"
This reverts commit 767db5be67.
2020-05-14 22:32:44 -07:00
Mircea Trofin 767db5be67 [llvm][NFC] Cleanup uses of std::function in Inlining-related APIs
Summary:
Replacing uses of std::function pointers or refs, or Optional, to
function_ref, since the usage pattern allows that. If the function is
optional, using a default parameter value (nullptr). This led to a few
parameter reshufles, to push all optionals to the end of the parameter
list.

Reviewers: davidxl, dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79917
2020-05-14 22:13:53 -07:00
Mircea Trofin 8a2e2a6a2b [llvm] Fix refactoring bug introduced in D79042
Incorrectly copied over the GetAssumptionCache snippet.

This patch also renames a variable for clarity.
2020-05-14 15:59:43 -07:00
Mircea Trofin d6695e1876 [llvm] Add interface to drive inlining decision using ML model
Summary:

This change introduces InliningAdvisor (and related APIs), the interface
that abstracts decision making away from the inlining pass. We will use
this interface to delegate decision making to a trained ML model,
subsequently (see referenced RFC).

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, eraman, dblaikie

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79042
2020-05-13 13:27:29 -07:00
Alina Sbirlea bd541b217f [NewPassManager] Add assertions when getting statefull cached analysis.
Summary:
Analyses that are statefull should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.

Reviewers: chandlerc

Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72893
2020-05-13 12:38:38 -07:00
Reid Kleckner 1370757dd0 Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC."
This reverts commit eef95f2746.

The new assertion about branch propability sums does not hold.
2020-05-13 08:23:09 -07:00
Pierre-vh 2668775f66 [LSR][ARM] Add new TTI hook to mark some LSR chains as profitable
This patch adds a new TTI hook to allow targets to tell LSR that
a chain including some instruction is already profitable and
should not be optimized. This patch also adds an implementation
of this TTI hook for ARM so LSR doesn't optimize chains that include
the VCTP intrinsic.

Differential Revision: https://reviews.llvm.org/D79418
2020-05-13 14:18:28 +01:00
Yevgeny Rouban eef95f2746 [BrachProbablityInfo] Set edge probabilities at once. NFC.
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-13 13:55:36 +07:00
Juneyoung Lee d3eb51f062 [ValueTracking] Fix crash in isGuaranteedNotToBeUndefOrPoison when V is in an unreachable block
Summary:
This fixes PR45885 by fixing isGuaranteedNotToBeUndefOrPoison so it does not look into dominating
branch conditions of V when V is an instruction in an unreachable block.

Reviewers: spatel, nikic, lebedev.ri

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79790
2020-05-13 10:16:47 +09:00
Juneyoung Lee e5f602d82c [ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc.
Summary:
This patch makes propagatesPoison be more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
a19edc4d15 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
It becomes more accurate with this patch, so think this is okay.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy

Reviewed By: spatel, nikic

Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78615
2020-05-13 02:51:42 +09:00
Kazu Hirata 0205fabe5d [Inlining] Make shouldBeDeferred static (NFC)
Summary:
This patch makes shouldBeDeferred static because it is called only
from shouldInline in the same .cpp file.

Reviewers: davidxl, mtrofin

Reviewed By: mtrofin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79750
2020-05-11 17:43:31 -07:00
Mircea Trofin 48fa355ed4 [llvm][NFC] Move inlining decision-related APIs in InliningAdvisor.
Summary: Factoring out in preparation to https://reviews.llvm.org/D79042

Reviewers: dblaikie, davidxl

Subscribers: mgorny, eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79613
2020-05-11 09:00:59 -07:00
Tyker 78d85c2091 [AssumeBundles] fix crashes
Summary:
this patch fixe crash/asserts found in the test-suite.
the AssumeptionCache cannot be assumed to have all assumes contrary to what i tought.
prevent generation of information for terminators, because this can create broken IR in transfromation where we insert the new terminator before removing the old one.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79458
2020-05-11 11:52:21 +02:00
OCHyams da100de0a6 [NFC][DwarfDebug] Add test for variables with a single location which
don't span their entire scope.

The previous commit (6d1c40c171) is an older version of the test.

Reviewed By: aprantl, vsk

Differential Revision: https://reviews.llvm.org/D79573
2020-05-11 11:49:11 +02:00
Tyker 821a0f23d8 [AssumeBundles] Prevent generation of some redundant assumes
Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume bulider can also in some cases update an existing assume with better information.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78014
2020-05-10 19:23:59 +02:00
Florian Hahn 8528186b9b [LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC)
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78460
2020-05-10 17:39:26 +01:00
Simon Pilgrim d5a2870a6e CodeMetrics.cpp - remove unused includes. NFC. 2020-05-10 16:59:55 +01:00
Florian Hahn 96c63f544f Recommit "[LAA] Remove one addRuntimeChecks function (NFC)."
The failing assertion has been fixed and the problematic test case has
been added.

This reverts the revert commit fc44617f28.
2020-05-10 15:19:57 +01:00
Florian Hahn fc44617f28 Revert "[LAA] Remove one addRuntimeChecks function (NFC)."
This reverts commit c28114c8ff.

This causes some bots to fail:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/30596/steps/build%20android%2Faarch64/logs/stdio
2020-05-10 13:28:00 +01:00
Florian Hahn c28114c8ff [LAA] Remove one addRuntimeChecks function (NFC).
In order to reduce the API surface area (preparation for D78460), remove
a addRuntimeChecks() function and do the additional check in the single
caller.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79679
2020-05-10 12:48:55 +01:00
Florian Hahn 57fb56b30e [LAA] Remove unneeded PtrRtChecking argument (NFC).
The argument is not required and simplifies D78460 a bit.
2020-05-09 22:26:54 +01:00
Wei Mi aa2ddfc73d [SampleFDO] For functions without profiles, provide an option to put
them in a special text section.

For sampleFDO, because the optimized build uses profile generated from
previous release, previously we couldn't tell a function without profile
was truely cold or just newly created so we had to treat them conservatively
and put them in .text section instead of .text.unlikely. The result was when
we persuing the best performance by locking .text.hot and .text in memory,
we wasted a lot of memory to keep cold functions inside.

In https://reviews.llvm.org/D66374, we introduced profile symbol list to
discriminate functions being cold versus functions being newly added.
This mechanism works quite well for regular use cases in AutoFDO. However,
in some case, we can only have a partial profile when optimizing a target.
The partial profile may be an aggregated profile collected from many targets.
The profile symbol list method used for regular sampleFDO profile is not
applicable to partial profile use case because it may be too large and
introduce many false positives.

To solve the problem for partial profile use case, we provide an option called
--profile-unknown-in-special-section. For functions without profile, we will
still treat them conservatively in compiler optimizations -- for example,
treat them as warm instead of cold in inliner. When we use profile info to
add section prefix for functions, we will discriminate functions known to be
not cold versus functions without profile (being unknown), and we will put
functions being unknown in a special text section called .text.unknown.
Runtime system will have the flexibility to decide where to put the special
section in order to achieve a balance between performance and memory saving.

Differential Revision: https://reviews.llvm.org/D62540
2020-05-08 11:18:09 -07:00
Nikita Popov 5a2265647e Reapply [InstSimplify] Remove known bits constant folding
No changes relative to last time, but after a mitigation for
an AMDGPU regression landed.

---

If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-08 10:24:53 +02:00
Huihui Zhang 1ec0cc0f02 [InstCombine][SVE] Fix visitExtractElementInst for scalable type.
Summary:
This patch fix the following issues with visitExtractElementInst:

      1. Restrict VectorUtils::findScalarElement to fixed-length vector.
         For scalable type, the number of elements in shuffle mask is
         unknown at compile-time.
      2. Fix out-of-range calculation for fixed-length vector.
      3. Skip scalable type when analysis rely on fixed number of elements.
      4. Add unit tests to check functionality of extractelement for scalable type.

Reviewers: sdesmalen, efriedma, spatel, nikic

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78267
2020-05-07 13:03:52 -07:00
Hiroshi Yamauchi 1b4e3def03 [BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare.
Summary:
This helps detect some missed BFI updates during CodeGenPrepare.

This is debug build only and disabled behind a flag.

Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts().

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77417
2020-05-07 11:58:00 -07:00
Christopher Tetreault 782231ac79 [SVE] Fix invalid uses of VectorType::getNumElements() in ValueTracking
Summary:
Any function in this module that make use of DemandedElts laregely does
not work with scalable vectors. DemandedElts is used to define which
elements of the vector to look at. At best, for scalable vectors, we can
express the first N elements of the vector. However, in practice, most
code that uses these functions expect to be able to talk about the
entire vector. In principle, this module should be able to be extended
to work with scalable vectors. However, before we can do that, we should
ensure that it does not cause code with scalable vectors to miscompile.
All functions that use a DemandedElts will bail out if the vector is
scalable. Usages of getNumElements() are updated to go through
FixedVectorType pointers.

Reviewers: rengolin, efriedma, sdesmalen, c-rhodes, spatel

Reviewed By: efriedma

Subscribers: david-arm, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79053
2020-05-06 10:06:06 -07:00
Sanjay Patel a954b8a363 [ValueTracking] fix CannotBeNegativeZero() to disregard 'nsz' FMF
The 'nsz' flag is different than 'nnan' or 'ninf' in that it does not create poison.
Make that explicit in the LangRef and fix ValueTracking analysis that misinterpreted
the definition.

This manifests as bugs in InstSimplify shown in the test diffs and as discussed in
PR45778:
https://bugs.llvm.org/show_bug.cgi?id=45778

Differential Revision: https://reviews.llvm.org/D79422
2020-05-05 16:04:59 -04:00
Simon Pilgrim 4e3c005554 [TTI] getScalarizationOverhead - use explicit VectorType operand
getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions).

Followup to D78357

Reviewed By: @samparker

Differential Revision: https://reviews.llvm.org/D79341
2020-05-05 16:59:23 +01:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
Nikita Popov 46ee652c70 Revert "[InstSimplify] Remove known bits constant folding"
This reverts commit 08556afc54.

This breaks some AMDGPU tests.
2020-05-03 20:45:10 +02:00
Nikita Popov 08556afc54 [InstSimplify] Remove known bits constant folding
If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-03 20:26:58 +02:00
Nikita Popov 8148b11647 [ValueTracking] Short-circuit GEP known bits calculation (NFC)
Don't compute known bits of all GEP operands, if we already know
that we don't know anything.
2020-05-02 12:29:26 +02:00
Sanjay Patel 57f0eed98d [InstSimplify] allow insertelement-with-undef fold if poison-safe
The more general fold was not poison-safe, so it was removed:
rG5486e00
...but it is ok to have this transform if analysis can determine
the vector contains no poison. The test shows a simple example
of that: constant integer elements are not poison.
2020-05-01 10:34:29 -04:00
Sanjay Patel 5486e00dc3 [InstSimplify] remove poison-unsafe insertelement of undef value
PR45481:
https://bugs.llvm.org/show_bug.cgi?id=45481

SDAG has an identical transform to this, so there's little
chance of any real-world impact. OTOH, that means we are
effectively sweeping the bug out of sight because poison
exists in codegen too.
2020-05-01 09:22:05 -04:00
Arthur Eubanks a90948fd6e [NFC] Rename *ByValOrInalloca* to *PassPointeeByValue*
Summary: In preparation for preallocated.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79152
2020-04-30 09:42:13 -07:00
Kirill Naumov 0383253cdf [InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot
The assert checks that every instruction must be annotated by this point while it is not
necessary. If the inlining process was interrupted because the threshold was reached, the rest
of the instructions would not be annotated which triggers the assert.
The added test shows the situation in which it can happen.
This is a recommit as the original commit fail due to the absence of REQUIRES: assert in the test.

Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D79107
2020-04-30 15:38:36 +00:00
Evgeniy Brevnov bb0842a3f1 [BPI] Incorrect probability reported in case of mulptiple edges.
Summary:
By design 'BranchProbabilityInfo:: getEdgeProbability(const BasicBlock *Src, const BasicBlock *Dst) const' should return sum of probabilities over all edges from Src to Dst. Current implementation is buggy and returns 1/num_of_successors if probabilities are not explicitly set.

Note current implementation of BPI printing has an issue as well and annotates each edge with sum of probabilities over all ages from one basic block to another. That's why 30% probability reported (instead of 10%) in the lit test. This is not urgent issue since only printing is affected.
Note also current implementation assumes that either all or none edges have probabilities set. This is not the only place which uses such assumption. At least we should assert that in verifier. In addition we can think on a more robust API of BPI which would prevent situations.

Reviewers: skatkov, yrouban, taewookoh

Reviewed By: skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79071
2020-04-30 11:41:03 +07:00
Evgeniy Brevnov 3e68a66704 [BPI][NFC] Reuse post dominantor tree from analysis manager when available
Summary: Currenlty BPI unconditionally creates post dominator tree each time. While this is not incorrect we can save compile time by reusing existing post dominator tree (when it's valid) provided by analysis manager.

Reviewers: skatkov, taewookoh, yrouban

Reviewed By: skatkov

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78987
2020-04-30 11:31:03 +07:00
Kirill Naumov 0fa793e798 Revert "[InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot"
This reverts commit 66947d05fd.
2020-04-29 22:00:51 +00:00
Alina Sbirlea 161ccfe5ba [MemorySSA] Pass DT to the upward iterator for proper PhiTranslation.
Summary:
A valid DominatorTree is needed to do PhiTranslation.
Before this patch, a MemoryUse could be optimized to an access outside a loop, while the address it loads from is modified in the loop.
This can lead to a miscompile.

Reviewers: george.burgess.iv

Subscribers: Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79068
2020-04-29 14:28:31 -07:00
Kirill Naumov 055f58fcfc [CFG] Turning on Heat Colors for CFG by default
This option seems to be very useful, so let's turn it on by default

Reviewed-By: davidxl
Diff: https://reviews.llvm.org/D79110
2020-04-29 20:44:10 +00:00
Kirill Naumov 66947d05fd [InlineCost] Addressing a very strict assert check in CostAnnotationWriter::emitInstructionAnnot
The assert checks that every instruction must be annotated by this point while it is not
necessary. If the inlining process was interrupted because the threshold was reached, the rest
of the instructions would not be annotated which triggers the assert.
The added test shows the situation in which it can happen.

Reviewed-By: mtrofin
Diff: https://reviews.llvm.org/D79107
2020-04-29 20:44:10 +00:00
Simon Pilgrim 090cae8491 [TTI] Add DemandedElts to getScalarizationOverhead
The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the available of legal INSERT_VECTOR_ELT ops is more limited.

This patch does 2 things:
1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern.
2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs.

This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing.

A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well.

Reviewed By: @craig.topper

Differential Revision: https://reviews.llvm.org/D78216
2020-04-29 12:00:38 +01:00
Florian Hahn 616657b39c [LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.

Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78458
2020-04-28 21:47:31 +01:00
David Blaikie 1b56980845 MustBeExecutedContextPrinter::runOnModule: Use unique_ptr to simplify/clarify ownership 2020-04-28 11:30:53 -07:00
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Mircea Trofin 011a07c075 Fix missing namespace in API implementation. 2020-04-27 21:05:33 -07:00
Mircea Trofin cb56e9b923 [llvm][NFC] Use CallBase instead of Instruction in ProfileSummaryInfo
Summary:
getProfileCount requires the parameter be a valid CallBase, and its uses
reflect that.

Reviewers: dblaikie, craig.topper, wmi

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78940
2020-04-27 20:47:52 -07:00
Mircea Trofin 8a4013ed38 [llvm][NFC] Add an explicit 'ComputeFullInlineCost' API
Summary:
Added getInliningCostEstimate, which is essentially what getInlineCost
computes if passed default inlining params, and  non-null ORE or
InlineParams::ComputeFullInlineCost.

Reviewers: davidxl, eraman, jdoerfert

Subscribers: hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78730
2020-04-27 09:11:45 -07:00
Hongtao Yu 93efe25ab3 [ViewCFG] Allow printing edge weights in debuggers
Summary:
Extending the Function::viewCFG prototypes to allow for printing block probability info in form of .dot files during debugging.

Also avoiding an AV when no BFI/BPI available.

Reviewers: wenlei, davidxl, knaumov

Reviewed By: wenlei, davidxl

Subscribers: MaskRay, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77978
2020-04-26 13:18:29 -07:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Nikita Popov 2b2827552a [CaptureTracking] Make MaxUsesToExplore cheaper (NFC)
The change in D78624 had a noticeable negative compile-time impact.
It seems that going through a function call for the MaxUsesToExplore
default is fairly expensive, at least if LLVM is not built with LTO.

This patch makes MaxUsesToExpore default to 0 and assigns the actual
default in the implementation instead. This recovers most of the
regression.

Differential Revision: https://reviews.llvm.org/D78734
2020-04-26 09:54:15 +02:00
Juneyoung Lee f5677fe700 [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into more constants/instructions
Summary:
This patch helps isGuaranteedNotToBeUndefOrPoison look into more constants and instructions (bitcast/alloca/gep/fcmp).

To deal with bitcast, Depth is added to isGuaranteedNotToBeUndefOrPoison.

This patch is splitted from https://reviews.llvm.org/D75808.

Checked with Alive2

Reviewers: reames, jdoerfert

Reviewed By: jdoerfert

Subscribers: sanwou01, spatel, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76010
2020-04-25 23:29:54 +09:00
Florian Hahn 82ce334727 [ValueLattice] Merging unknown with empty CR is unknown.
Currently an unknown/undef value is marked as overdefined when merged
with an empty range. An empty range can occur in unreachable/dead code.
When merging the new unknown state (= no value known yet) with an empty
range, there still isn't any information about the value yet and we can
stay in unknown.

This gives a few nice improvements on the number of instructions removed
by IPSCCP:
Same hash: 170 (filtered out)
Remaining: 67
Metric: sccp.IPNumInstRemoved

Program                                        base     patch    diff
 test-suite...rks/FreeBench/mason/mason.test     3.00   6.00 100.0%
 test-suite...nchmarks/McCat/18-imp/imp.test     3.00   5.00 66.7%
 test-suite...C/CFP2000/179.art/179.art.test     2.00   3.00 50.0%
 test-suite...ijndael/security-rijndael.test     2.00   3.00 50.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    40.00  58.00 45.0%
 test-suite...ce/Applications/Burg/burg.test    26.00  37.00 42.3%
 test-suite...cCat/03-testtrie/testtrie.test     3.00   4.00 33.3%
 test-suite...Source/Benchmarks/sim/sim.test    29.00  36.00 24.1%
 test-suite.../Applications/spiff/spiff.test     9.00  11.00 22.2%
 test-suite...s/FreeBench/neural/neural.test     5.00   6.00 20.0%
 test-suite...pplications/treecc/treecc.test    66.00  79.00 19.7%
 test-suite...langs-C/football/football.test    85.00 101.00 18.8%
 test-suite...ce/Benchmarks/PAQ8p/paq8p.test    90.00 105.00 16.7%
 test-suite...oxyApps-C++/miniFE/miniFE.test    37.00  43.00 16.2%
 test-suite...rks/FreeBench/pifft/pifft.test    26.00  30.00 15.4%
 test-suite...lications/sqlite3/sqlite3.test   481.00  548.00  13.9%
 test-suite...marks/7zip/7zip-benchmark.test   4875.00 5522.00 13.3%
 test-suite.../CINT2000/176.gcc/176.gcc.test   1117.00 1197.00  7.2%
 test-suite...0.perlbench/400.perlbench.test   1618.00 1732.00  7.0%

Reviewers: efriedma, nikic, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D78667
2020-04-25 13:43:34 +01:00
Tyker e5f8a77c19 [AssumeBundles] Refactor asssume builder
Summary:
refactor assume bulider for the next patch.
the assume builder now generate only one assume per attribute kind and per value they are on. to do this it takes the highest. this is desirable because currently, for all attributes the higest value is the most valuable.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78013
2020-04-25 13:43:52 +02:00
Tyker 42431da895 [AssumeBundles] Use assume bundles in isKnownNonZero
Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero

Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1

Reviewed By: jdoerfert

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76149
2020-04-24 20:41:51 +02:00
Mircea Trofin b8960b5d81 [llvm][NFC][CallSite] Remove remaining {Immutable}CallSite uses
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78789
2020-04-23 22:19:39 -07:00
Mircea Trofin 2059a6e3ef [llvm][NFC][CallSite] Remove ImmutableCallSite from a few locations
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78783
2020-04-23 21:18:44 -07:00
Eli Friedman 3291efc2b3 [ValueTracking] Handle shufflevector constants in ComputeNumSignBits
Differential Revision: https://reviews.llvm.org/D78688
2020-04-23 17:47:37 -07:00
James Y Knight 248a5db3f2 Change callbr to only define its output SSA variable on the normal
path, not the indirect targets.

Fixes: PR45565.

Differential Revision: https://reviews.llvm.org/D78341
2020-04-23 19:36:44 -04:00
Craig Topper d6c5daf0bf [CallSite removal][ValueTracking] Replace CallSite with CallBase. NFC" 2020-04-23 15:25:19 -07:00
Christopher Tetreault 9174e0229f [SVE] Remove calls to VectorType::isScalable from analysis
Reviewers: efriedma, sdesmalen, chandlerc, sunfish

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77692
2020-04-23 12:44:22 -07:00
Mircea Trofin 201498c6f3 [llvm][NFC] Factor out cost-model independent inling decision
Summary:
llvm::getInlineCost starts off by determining whether inlining should
happen or not because of user directives or easily determinable
unviability. This CL refactors this functionality as a reusable API.

Reviewers: davidxl, eraman

Reviewed By: davidxl, eraman

Subscribers: hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73825
2020-04-23 10:58:43 -07:00
Mircea Trofin cea6f4d5f8 [llvm][NFC][CallSite] Remove CallSite from TypeMetadataUtils & related
Reviewers: craig.topper, dblaikie

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78666
2020-04-23 08:23:16 -07:00
Sanjay Patel e86eff0e82 [InstSimplify] fold and/or of compares with equality to min/max constant
I found 12 (6 if we compress the DeMorganized forms) patterns for logic-of-compares
with a min/max constant while looking at PR45510:
https://bugs.llvm.org/show_bug.cgi?id=45510

The variations on those forms multiply the test cases by 8 (unsigned/signed, swapped
compare operands, commuted logic operands).
We have partial logic to deal with these for the unsigned min (zero) case, but
missed everything else.

We are deferring the majority of these patterns to InstCombine to allow more general
handling (see D78582).

We could use ConstantRange instead of predicate+constant matching here. I don't
expect there's any noticeable compile-time impact for either form.

Here's an abuse of Alive2 to show the 12 basic signed variants of the patterns in
one function:
http://volta.cs.utah.edu:8080/z/5Vpiyg

declare void @use(i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1)
define void @src(i8 %x, i8 %y)  {
  %m1 = icmp eq i8 %x, 127
  %c1 = icmp slt i8 %x, %y
  %r1 = and i1 %m1, %c1   ; (X == MAX) && (X < Y) --> false

  %m2 = icmp ne i8 %x, 127
  %c2 = icmp sge i8 %x, %y
  %r2 = or i1 %m2, %c2    ; (X != MAX) || (X >= Y) --> true

  %m3 = icmp eq i8 %x, -128
  %c3 = icmp sgt i8 %x, %y
  %r3 = and i1 %m3, %c3   ; (X == MIN) && (X > Y) --> false

  %m4 = icmp ne i8 %x, -128
  %c4 = icmp sle i8 %x, %y
  %r4 = or i1 %m4, %c4    ; (X != MIN) || (X <= Y) --> true

  %m5 = icmp eq i8 %x, 127
  %c5 = icmp sge i8 %x, %y
  %r5 = and i1 %m5, %c5   ; (X == MAX) && (X >= Y) --> X == MAX

  %m6 = icmp ne i8 %x, 127
  %c6 = icmp slt i8 %x, %y
  %r6 = or i1 %m6, %c6   ; (X != MAX) || (X < Y) --> X != MAX

  %m7 = icmp eq i8 %x, -128
  %c7 = icmp sle i8 %x, %y
  %r7 = and i1 %m7, %c7   ; (X == MIN) && (X <= Y) --> X == MIN

  %m8 = icmp ne i8 %x, -128
  %c8 = icmp sgt i8 %x, %y
  %r8 = or i1 %m8, %c8   ; (X != MIN) || (X > Y) --> X != MIN

  %m9 = icmp ne i8 %x, 127
  %c9 = icmp slt i8 %x, %y
  %r9 = and i1 %m9, %c9    ; (X != MAX) && (X < Y) --> X < Y

  %m10 = icmp eq i8 %x, 127
  %c10 = icmp sge i8 %x, %y
  %r10 = or i1 %m10, %c10    ; (X == MAX) || (X >= Y) --> X >= Y

  %m11 = icmp ne i8 %x, -128
  %c11 = icmp sgt i8 %x, %y
  %r11 = and i1 %m11, %c11    ; (X != MIN) && (X > Y) --> X > Y

  %m12 = icmp eq i8 %x, -128
  %c12 = icmp sle i8 %x, %y
  %r12 = or i1 %m12, %c12    ; (X == MIN) || (X <= Y) --> X <= Y

  call void @use(i1 %r1, i1 %r2, i1 %r3, i1 %r4, i1 %r5, i1 %r6, i1 %r7, i1 %r8, i1 %r9, i1 %r10, i1 %r11, i1 %r12)
  ret void
}

define void @tgt(i8 %x, i8 %y)  {
  %m5 = icmp eq i8 %x, 127
  %m6 = icmp ne i8 %x, 127
  %m7 = icmp eq i8 %x, -128
  %m8 = icmp ne i8 %x, -128
  %c9 = icmp slt i8 %x, %y
  %c10 = icmp sge i8 %x, %y
  %c11 = icmp sgt i8 %x, %y
  %c12 = icmp sle i8 %x, %y
  call void @use(i1 0, i1 1, i1 0, i1 1, i1 %m5, i1 %m6, i1 %m7, i1 %m8, i1 %c9, i1 %c10, i1 %c11, i1 %c12)
  ret void
}

Differential Revision: https://reviews.llvm.org/D78430
2020-04-23 09:16:10 -04:00
Serguei Katkov c0d2bbb1d4 [CaptureTracking] Replace hardcoded constant to option. NFC.
The motivation is to be able to play with the option and change if it is required.

Reviewers: fedor.sergeev, apilipenko, rnk, jdoerfert
Reviewed By: fedor.sergeev
Subscribers: hiraditya, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D78624
2020-04-23 18:23:35 +07:00
Juneyoung Lee aca335955c [ValueTracking] Let analyses assume a value cannot be partially poison
Summary:
This is RFC for fixes in poison-related functions of ValueTracking.
These functions assume that a value can be poison bitwisely, but the semantics
of bitwise poison is not clear at the moment.
Allowing a value to have bitwise poison adds complexity to reasoning about
correctness of optimizations.

This patch makes the analysis functions simply assume that a value is
either fully poison or not, which has been used to understand the correctness
of a few previous optimizations.
The bitwise poison semantics seems to be only used by these functions as well.

In terms of implementation, using value-wise poison concept makes existing
functions do more precise analysis, which is what this patch contains.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr

Reviewed By: nikic

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78503
2020-04-23 08:08:53 +09:00
Juneyoung Lee 5ceef26350 Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison"
This reverts commit 80faa8c3af.
2020-04-23 08:07:09 +09:00
Juneyoung Lee 80faa8c3af RFC: [ValueTracking] Let analyses assume a value cannot be partially poison
Summary:
This is RFC for fixes in poison-related functions of ValueTracking.
These functions assume that a value can be poison bitwisely, but the semantics
of bitwise poison is not clear at the moment.
Allowing a value to have bitwise poison adds complexity to reasoning about
correctness of optimizations.

This patch makes the analysis functions simply assume that a value is
either fully poison or not, which has been used to understand the correctness
of a few previous optimizations.
The bitwise poison semantics seems to be only used by these functions as well.

In terms of implementation, using value-wise poison concept makes existing
functions do more precise analysis, which is what this patch contains.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr

Reviewed By: nikic

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78503
2020-04-23 07:57:12 +09:00
Craig Topper be04aba6fc [CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC
Differential Revision: https://reviews.llvm.org/D78613
2020-04-22 12:06:58 -07:00
Craig Topper 05a11974ae [CallSite removal] Remove unneeded includes of CallSite.h. NFC 2020-04-22 00:07:13 -07:00
Sanjay Patel cf30aafa2d [Analysis] recognize the 'null' pointer constant as not poison
Differential Revision: https://reviews.llvm.org/D78575
2020-04-21 14:23:06 -04:00
Simon Pilgrim 3caa03ec51 AliasAnalysisSummary.h - cleanup includes and forward declarations. NFC.
Push InstrTypes.h include down to AliasAnalysisSummary.cpp
2020-04-21 11:32:58 +01:00
Sam Parker ee959ddc5e [TTI] Remove getOperationCost
This API call has been used recently with, a very valid, expectation
that it would do something useful but it doesn't actually query any
backend information. So, remove this method and merge its
functionality into getUserCost. As well as that, also use
getCastInstrCost to get a proper cost from the backend for the
concerned instructions though we only currently return the answer if
it's considered free. The default implementation now also checks
int/ptr conversions too, as well as truncs and bitcasts.

Differential Revision: https://reviews.llvm.org/D76124
2020-04-21 09:15:34 +01:00
Mircea Trofin 1809949239 [llvm][NFC][CallSite] Remove CallSite from Lint.cpp
Summary: The CallSite arg iterator is really User::op_iterator.

Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78507
2020-04-20 12:29:11 -07:00
Nikita Popov 54d01cbc15 [IPT] Don't use OrderedInstructions (NFC)
Use Instruction::comesBefore() instead of OrderedInstructions
inside InstructionPrecedenceTracking. This also removes the
dominator tree dependency.

Differential Revision: https://reviews.llvm.org/D78461
2020-04-20 18:25:31 +02:00
Florian Hahn 2737362e7a [VectorUtils] Use early_inc_range instead of DelSet (NFC).
DelSet was used to avoid invalidating the current iterator while
modifying the map we are iterating over.

By using an early_inc_range, (which increments to iterator 'early',
allowing us to remove the current element), we can get rid of DelSet.

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78420
2020-04-20 16:36:26 +01:00
Sam Parker e3056ae9a0 [NFC][TTI] Explicit use of VectorType
The API for shuffles and reductions uses generic Type parameters,
instead of VectorType, and so assertions and casts are used a lot.
This patch makes those types explicit, which means that the clients
can't be lazy, but results in less ambiguity, and that can only be a
good thing.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562

Differential Revision: https://reviews.llvm.org/D78357
2020-04-20 09:16:52 +01:00
Craig Topper 252873879e [CallSite removal][Analysis] Replace CallSite with CallBase in MemoryBuiltins. NFC
Differential Revision: https://reviews.llvm.org/D78449
2020-04-19 18:32:48 -07:00
Craig Topper dff18c79f2 [CallSite removal][Lint] Replace visitCallSite with visitCallBase. NFC
Drop visitCallInst/visitInvokeInst since the generic implementation
will delegate to visitCallBase. They appear to be from a time
before visitCallSite existed in the InstVisitor system.

Differential Revision: https://reviews.llvm.org/D78448
2020-04-19 18:32:40 -07:00
Nikita Popov f52e050757 [LVI] Use Optional instead of out parameter (NFC)
As suggested on D76788, this switches the LVI implementation to
return Optional<ValueLatticeElement> from various methods, instead
of passing in a ValueLatticeElement reference and returning a boolean.

Differential Revision: https://reviews.llvm.org/D78383
2020-04-19 21:17:43 +02:00
Mircea Trofin ec73ae11a3 [llvm][NFC][CallSite] Remove CallSite from ProfileSummary
Summary: Depends on D78395.

Reviewers: craig.topper, dblaikie, wmi, davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78414
2020-04-18 12:03:14 -07:00
Simon Pilgrim 9b95186c30 HeatUtils.h - remove unnecessary includes. NFC.
Replace with BlockFrequencyInfo/Function forward declarations
Move BlockFrequencyInfo.h include to HeatUtils.cpp
2020-04-18 13:37:06 +01:00
Florian Hahn 4ee45ab60f [LV] Invalidate cost model decisions along with interleave groups.
Cost-modeling decisions are tied to the compute interleave groups
(widening decisions, scalar and uniform values). When invalidating the
interleave groups, those decisions also need to be invalidated.

Otherwise there is a mis-match during VPlan construction.
VPWidenMemoryRecipes created initially are left around w/o converting them
into VPInterleave recipes. Such a conversion indeed should not take place,
and these gather/scatter recipes may in fact be right. The crux is leaving around
obsolete CM_Interleave (and dependent) markings of instructions along with
their costs, instead of recalculating decisions, costs, and recipes.

Alternatively to forcing a complete recompute later on, we could try
to selectively invalidate the decisions connected to the interleave
groups. But we would likely need to run the uniform/scalar value
detection parts again anyways and the extra complexity is probably not
worth it.

Fixes PR45572.

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78298
2020-04-18 10:23:49 +01:00
Nikita Popov f715eda604 [LVI] Cleanup/unify cache access
This patch combines the "has" and "get" parts of the cache access.
getCachedValueInfo() now both sets the BBLV return argument, and
returns whether the value was found.

Additionally, the management of the work stack is now integrated
into getBlockValue(). If the value is not cached yet, we try to
push to the stack (and return false, indicating that we need to
solve first), or return overdefined in case of a cycle.

These changes a) avoid a duplicate cache lookup for has & get and
b) ensure that the logic is uniform everywhere. For this reason
this change is also not quite NFC, because previously overdefined
values from the cache, and overdefined values from a cycle received
different treatment when it came to assumption intersection.

Differential Revision: https://reviews.llvm.org/D76788
2020-04-17 18:44:38 +02:00
Benjamin Kramer 166467e822 [VectorUtils] Create shufflevector masks as int vectors instead of Constants
No functionality change intended.
2020-04-17 15:28:00 +02:00
Max Kazantsev 72c13446ce [NFC] Add missing 'const' notion to LCSSA-related functions
These functions don't really do any changes to loop info or
dominator tree. We should state this explicitly using 'const'.
2020-04-17 17:49:34 +07:00
Johannes Doerfert df675890b7 [CallGraphUpdater][NFC] Minor updates to D77855
I uploaded the old version accidentally instead of the one with these
minor adjustments requested by the reviewers.

Differential Revision: https://reviews.llvm.org/D77855
2020-04-15 21:26:35 -05:00
Johannes Doerfert 937025757c [CallGraphUpdater] Remove nodes from their SCC (old PM)
Summary:
We can and should remove deleted nodes from their respective SCCs. We
did not do this before and this was a potential problem even though I
couldn't locally trigger an issue. Since the `DeleteNode` would assert
if the node was not in the SCC, we know we only remove nodes from their
SCC and only once (when run on all the Attributor tests).

Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku

Subscribers: hiraditya, bollu, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77855
2020-04-15 18:38:50 -05:00
Johannes Doerfert 1b34b84ddd [CallGraphUpdater] Update the ExternalCallingNode for node replacements
Summary:
While it is uncommon that the ExternalCallingNode needs to be updated,
it can happen. It is uncommon because most functions listed as callees
have external linkage, modifying them is usually not allowed. That said,
there are also internal functions that have, or better had, their
"address taken" at construction time. We conservatively assume various
uses cause the address "to be taken". Furthermore, the user might have
become dead at some point. As a consequence, transformations, e.g., the
Attributor, might be able to replace a function that is listed
as callee of the ExternalCallingNode.

Since there is no function corresponding to the ExternalCallingNode, we
did just remove the node from the callee list if we replaced it (so
far). Now it would be preferable to replace it if needed and remove it
otherwise. However, removing the node has implications on the CGSCC
iteration. Locally, that caused some other nodes to be never visited
but it is for sure possible other (bad) side effects can occur. As it
seems conservatively safe to keep the new node in the callee list we
will do that for now.

Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku

Subscribers: hiraditya, bollu, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77854
2020-04-15 18:38:50 -05:00
Simon Moll 2a0a26bd98 [nfc] clang-format TargetTransformInfo.cpp 2020-04-15 14:43:26 +02:00
Mircea Trofin 447e2c3067 [llvm][NFC][CallSite] Remove Implementation uses of CallSite
Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78142
2020-04-14 14:49:47 -07:00
Juneyoung Lee 994543abc9 [ValueTracking] Implement canCreatePoison
Summary:
This PR adds `canCreatePoison(Instruction *I)` which returns true if `I` can generate poison from non-poison
operands.

Reviewers: spatel, nikic, lebedev.ri

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits, regehr, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77890
2020-04-15 05:58:06 +09:00
Aaron Puchert e833e58300 [ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC
Reviewed By: fhahn, echristo

Differential Revision: https://reviews.llvm.org/D78061
2020-04-14 13:32:53 +02:00
Georgii Rymar 1647ff6e27 [ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers.
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.

Differential revision: https://reviews.llvm.org/D78016
2020-04-14 14:11:02 +03:00
Mehdi Amini 384ca190ae Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object"
This reverts commit 10df1563d6.

Some buildbots are broken.
2020-04-14 00:27:08 +00:00
Mehdi Amini 10df1563d6 Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.

Differential Revision: https://reviews.llvm.org/D77994
2020-04-13 23:12:11 +00:00
Vedant Kumar 122a6bfb07 [Debugify] Strip added metadata in the -debugify-each pipeline
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.

As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77915
2020-04-13 10:55:17 -07:00
Simon Pilgrim ad57286232 CodeMetrics.h - include and forward declaration cleanup. NFC.
Remove SmallPtrSet include, replace with forward declaration and include SmallPtrSet.h in CodeMetrics.cpp directly.
Remove unused llvm::DataLayout/Instruction forward declarations.
2020-04-13 13:09:39 +01:00
Tyker 813f438baa [AssumeBundles] adapt Assumption cache to assume bundles
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.

Reviewers: jdoerfert, hfinkel

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77402
2020-04-13 12:04:51 +02:00
Huihui Zhang 4bde7c5986 [NFC] Use VectorType::isScalable to align with ongoing VectorType refactor. 2020-04-12 15:39:13 -07:00
Sanjay Patel c23cbefd9d [VectorUtils] add IR-level analysis for widening of shuffle mask
This is similar to the recent move/addition of "scaleShuffleMask" (D76508),
but there are a couple of differences:

1. The existing x86 helper (canWidenShuffleElements) always tries to
   divide-by-2, so it gets called iteratively and wouldn't handle the
   general case of non-pow-2 length.
2. The existing x86 code handles "SM_SentinelZero" - we don't have
   that in IR, but this code should be safe to use with that or other
   special (negative) values.

The motivation is to enable shuffle folds in instcombine/vector-combine
that are similar to D76844 and D76727, but in the reverse-bitcast direction.
Those patterns are visible in the tests for D40633.

Differential Revision: https://reviews.llvm.org/D77881
2020-04-12 10:14:19 -04:00
Sanjay Patel 1318ddbc14 [VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.

While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
2020-04-11 10:05:49 -04:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Huihui Zhang 6c989d0248 [BasicAA] Fix aliasGEP/DecomposeGEPExpression for scalable type.
Summary:
Don't attempt to analyze the decomposed GEP for scalable type.
GEP index scale is not compile-time constant for scalable type.
Be conservative, return MayAlias.

Explicitly call TypeSize::getFixedSize() to assert on places where
scalable type doesn't make sense.

Add unit tests to check functionality of -basicaa for scalable type.

This patch is needed for D76944.

Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77828
2020-04-10 16:58:26 -07:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
Christopher Tetreault b96558f5e5 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sunfish, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77273
2020-04-09 12:41:28 -07:00
Mircea Trofin b4924f01a4 [llvm][nfc] InstructionCostDetail encapsulation
Ensured initialized fields; encapsulad delta calulations and evaluation
of threshold having had changed; assertion for CostThresholdMap
dereference, to indicate design intent.

Differential Revision: https://reviews.llvm.org/D77762
2020-04-09 08:21:18 -07:00
Jay Foad c63aed890e [KnownBits] Move AND, OR and XOR logic into KnownBits
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.

This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74060
2020-04-09 10:10:37 +01:00
Jay Foad 94cc9eccf6 [ValueTracking] Simplify KnownBits construction
Use the simpler BitWidth constructor instead of the copy constructor to
make it clear when we don't actually need to copy an existing KnownBits
value. Split out from D74539. NFC.
2020-04-09 09:27:22 +01:00
Serge Pavlov c7ff5b38f2 [FPEnv] Use single enum to represent rounding mode
Now compiler defines 5 sets of constants to represent rounding mode.
These are:

1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.

2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.

3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.

4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.

5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.

The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.

This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.

Differential Revision: https://reviews.llvm.org/D77379
2020-04-09 13:26:47 +07:00
Kirill Naumov 8b67853a83 [CFGPrinter] Adding heat coloring to CFGPrinter
This patch introduces the heat coloring of the Control Flow Graph which is based
on the relative "hotness" of each BB. The patch is a part of sequence of three
patches, related to graphs Heat Coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77161
2020-04-08 19:59:51 +00:00
Nikita Popov fe8abbf442 [BPI] Clear handles when releasing memory (NFC)
This reduces max-rss of sqlite compilation by 2.5%.
2020-04-07 22:51:01 +02:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Kirill Naumov 3f995ce8b5 [CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo
The patch introduces the system to distinctively store the information
needed for the Control Flow Graph as well as the instrumentary needed for
the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo.
The patch is a part of sequence of three patches, related to graphs Heat Coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D76820
2020-04-06 17:42:54 +00:00
Sanjay Patel fbb1b43f13 [ValueTracking] enhance matching of umin/umax with 'not' operands
The cmyk test is based on the known regression that resulted from:
rGf2fbdf76d8d0

This improves on the equivalent signed min/max change:
rG867f0c3c4d8c

The underlying icmp equivalence is:
  ~X pred ~Y --> Y pred X

For an icmp with constant, canonicalization results in a swapped pred:
  ~X < C -->  X > ~C
2020-04-06 11:51:59 -04:00
Sanjay Patel 867f0c3c4d [ValueTracking] enhance matching of smin/smax with 'not' operands
The cmyk tests are based on the known regression that resulted from:
rGf2fbdf76d8d0

So this improvement in analysis might be enough to restore that commit.
2020-04-05 08:54:12 -04:00
Florian Hahn 47ee404075 [ValueTracking] Use Inst::comesBefore in isValidAssumeForCtx (NFC).
D51664 added Instruction::comesBefore which should provide better
performance than the manual check.

Reviewers: rnk, nikic, spatel

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76228
2020-04-05 12:38:04 +01:00
Nikita Popov a5eb1236e3 [IVDescriptors] Remove unnecessary DemandedBits.h include; NFC
Forward declare DemandedBits in IVDescriptors, and move include
into the cpp file. Also drop the include from LoopUtils, which
does not need it at all.
2020-04-04 12:07:57 +02:00
Alina Sbirlea 688450c7f0 [GraphDiff] Extend GraphDiff to track a list of updates.
Summary:
This patch includes two extensions:
1. It extends the GraphDiff to also keep the original list of updates
after legalization, not just the deletes/insert vectors.
It also provides an API to pop the first update (the updates are store
in reverse, such that the first update is at the end of the list)
2. It adds a bool to mark whether the given updates should be applied as
given, or applied in reverse. This moves the task of reversing the
updates (when the caller needs this) to a functionality inside
GraphDiff, versus having the caller do this.

The two changes could be split into two patches, but they seemed
reasonably small to be reviewed together.

Reviewers: kuhar, dblaikie

Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77167
2020-04-03 12:10:36 -07:00
Tyker c00cb76274 [NFC] Split Knowledge retention and place it more appropriatly
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77171
2020-04-02 15:01:41 +02:00
Jonas Paulsson 36d4421f50 [LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop.
This patch adds

- New arguments to getMinPrefetchStride() to let the target decide on a
  per-loop basis if software prefetching should be done even with a stride
  within the limit of the hw prefetcher.

- New TTI hook enableWritePrefetching() to let a target do write prefetching
  by default (defaults to false).

- In LoopDataPrefetch:

  - A search through the whole loop to gather information before emitting any
    prefetches. This way the target can get information via new arguments to
    getMinPrefetchStride() and emit prefetches more selectively. Collected
    information includes: Does the loop have a call, how many memory
    accesses, how many of them are strided, how many prefetches will cover
    them. This is NFC to before as long as the target does not change its
    definition of getMinPrefetchStride().

  - If a previous access to the same exact address was 'read', and the
    current one is 'write', make it a 'write' prefetch.

  - If two accesses that are covered by the same prefetch do not dominate
    each other, put the prefetch in a block that dominates both of them.

  - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.

- A SystemZ implementation of getMinPrefetchStride().

Review: Ulrich Weigand, Michael Kruse

Differential Revision: https://reviews.llvm.org/D70228
2020-04-02 14:57:46 +02:00
shchenz e344f8b9db Revert "[LSR] re-add testcase for wrongly phi node elimination - NFC"
This reverts commit f25a1b4f58.
ARM and hexagon fail at the new added case.
2020-04-01 12:58:06 +00:00
shchenz f25a1b4f58 [LSR] re-add testcase for wrongly phi node elimination - NFC
Retest the case on X86/SystemZ/AArch64/PowerPC
2020-04-01 11:11:17 +00:00
shchenz 8b8cd150a4 Revert "[LSR] add testcase for wrongly phi node elimination - NFC"
This reverts commit dbf5e4f6c7.
The testcase has different behaviour on PowerPC and X86.
2020-04-01 10:28:43 +00:00
shchenz dbf5e4f6c7 [LSR] add testcase for wrongly phi node elimination - NFC 2020-04-01 09:58:58 +00:00
Sam Parker 2641a19981 [TTI] Remove getCallCost
getCallCost is only used within the different layers of TTI, with no
backend implementing it so fold the base implementation into
getUserCost. I think this is an NFC.

Differential Revision: https://reviews.llvm.org/D77050
2020-04-01 09:05:25 +01:00
Craig Topper f92563f907 [VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files
Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template

Reviewers: spatel, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77183
2020-04-01 00:46:48 -07:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Florian Hahn b37543750c [ValueLattice] Distinguish between constant ranges with/without undef.
This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.

A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below

define i32 @f(i32 %a, i1 %c) {
  br i1 %c, label %true, label %false
true:
  %a.255 = and i32 %a, 255
  br label %exit
false:
  br label %exit
exit:
  %p = phi i32 [ %a.255, %true ], [ undef, %false ]
  %f.1 = icmp eq i32 %p, 300
  call void @use(i1 %f.1)
  %res = and i32 %p, 255
  ret i32 %res
}

In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the  second use might not be
< 256.

Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.

Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76931
2020-03-31 12:50:20 +01:00
Denis Antrushin 06c58f11a9 [SCEV] Use backedge SCEV of PHI only if its input is loop invariant
For the PHI node

      %1 = phi [%A, %entry], [%X, %latch]

it is incorrect to use SCEV of backedge val %X as an exit value
of PHI unless %X is loop invariant.
This is because exit value of %1 is value of %X at one-before-last
iteration of the loop.

Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D73181
2020-03-31 18:39:24 +07:00
Thomas Raoux 3ea0774b13 [ConstantFold][NFC] Compile time optimization for large vectors
Optimize the common case of splat vector constant. For large vector
going through all elements is expensive. For splatr/broadcast cases we
can skip going through all elements.

Differential Revision: https://reviews.llvm.org/D76664
2020-03-30 11:27:09 -07:00
Max Kazantsev 4e0d9925d6 [NFC] Remove obsolete checks followed by fix of isGuaranteedToTransferExecutionToSuccessor
In past, isGuaranteedToTransferExecutionToSuccessor contained some weird logic
for volatile loads/stores that was ultimately removed by patch D65375. It's time to
remove a piece of dependent logic that used to be a workaround for the code which
is now deleted.

Reviewed By: uenoku
Differential Revision: https://reviews.llvm.org/D76918
2020-03-30 12:24:41 +07:00
Uday Bondhugula c0955edfd6 Introduce support for lib function aligned_alloc in TLI / memory builtins
Aligned_alloc is a standard lib function and has been in glibc since
2.16 and in the C11 standard. It has semantics similar to malloc/calloc
for several analyses/transforms. This patch introduces aligned_alloc
in target library info and memory builtins. Subsequent ones will
make other passes aware and fix https://bugs.llvm.org/show_bug.cgi?id=44062

This change will also be useful to LLVM generators that need to allocate
buffers of vector elements larger than 16 bytes (for eg. 256-bit ones),
element boundary alignment for which is not typically provided by glibc malloc.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76970
2020-03-29 23:36:24 +05:30
Serge Pavlov f398739152 [FEnv] Constfold some unary constrained operations
This change implements constant folding to constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Differential Revision: https://reviews.llvm.org/D72930
2020-03-28 12:28:33 +07:00
Francesco Petrogalli c66d1f38f6 [llvm][Support] Add isZero method for TypeSize. [NFC]
Summary:
The method is used where TypeSize is implicitly cast to integer for
being checked against 0.

Reviewers: sdesmalen, efriedma

Reviewed By: sdesmalen, efriedma

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76748
2020-03-27 21:03:44 +00:00
Alina Sbirlea 3abcbf9903 [CFG/BasicBlock] Rename succ_const to const_succ. [NFC]
Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.

Reviewers: nicholas, dblaikie, nlewycky

Subscribers: hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75952
2020-03-25 12:40:55 -07:00
Nikita Popov ec184dd548 [LVI] Convert some checks to assertions; NFC
solveBlockValue() should only be called if the value isn't cached
yet. Similarly, it does not make sense to "solve" a constant.
2020-03-24 23:11:13 +01:00
Sanjay Patel 88b493a838 [ValueTracking] improve undef/poison analysis for constant vectors
Differential Revision: https://reviews.llvm.org/D76702
2020-03-24 13:35:47 -04:00
Yaxun (Sam) Liu 314deab9af Add Triple::isAMDGPU
Differential Revision: https://reviews.llvm.org/D57707
2020-03-22 14:20:28 -04:00
Bjorn Pettersson d077d678d3 [ValueTracking] Avoid blind cast from Operator to Instruction
Summary:
Avoid blind cast from Operator to ExtractElementInst in
computeKnownBitsFromOperator. This resulted in some crashes
in downstream fuzzy testing. Instead we use getOperand directly
on the Operator when accessing the vector/index operands.

Haven't seen any problems with InsertElement and ShuffleVector,
but I believe those could be used in constant expressions as well.
So the same kind of fix as for ExtractElement was also applied for
InsertElement.

When it comes to ShuffleVector we now simply bail out if a dynamic
cast of the Operator to ShuffleVectorInst fails. I've got no
reproducer indicating problems for ShuffleVector, and a fix would be
slightly more complicated as getShuffleDemandedElts is involved.

Reviewers: RKSimon, nikic, spatel, efriedma

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76564
2020-03-22 14:45:31 +01:00
Nikita Popov dbf78ae128 [LVI] Use SmallDenseMap for getValueFromCondition(); NFC
For the common case, we will only insert one value into this map.
2020-03-22 10:38:44 +01:00
Nikita Popov 7a62ea3889 [ValueTracking] Short-circuit computeKnownBitsAddSub(); NFCI
If one operand is unknown (and we don't have nowrap), don't compute
the second operand.

Also don't create an unnecessary extra KnownBits variable, it's
okay to reuse KnownOut.

This reduces instructions on libclamav_md5.c by 40%.
2020-03-21 13:42:10 +01:00
Huihui Zhang 4f5af9d70d [ValueTracking] Fix usage of DataLayout::getTypeStoreSize()
Summary:
DataLayout::getTypeStoreSize() returns TypeSize.

For cases where it can not be scalable vector (e.g., GlobalVariable),
explicitly call TypeSize::getFixedSize().

For cases where scalable property doesn't matter, (e.g., check for
zero-sized type), use TypeSize::isNonZero().

Reviewers: sdesmalen, efriedma, apazos, reames

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76454
2020-03-20 16:52:15 -07:00
Huihui Zhang 1993f95f2b [ValueTracking][SVE] Fix getOffsetFromIndex for scalable vector.
Summary:
Return None if GEP index type is scalable vector. Size of scalable vectors
are multiplied by a runtime constant.

Avoid transforming:
  %a = bitcast i8* %p to <vscale x 16 x i8>*
  %tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0
  store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp0
  %tmp1 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 1
  store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp1

into:
  %a = bitcast i8* %p to <vscale x 16 x i8>*
  %tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0
  %1 = bitcast <vscale x 16 x i8>* %tmp0 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 16 %1, i8 0, i64 32, i1 false)

Reviewers: sdesmalen, efriedma, apazos, reames

Reviewed By: sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, arphaman, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76464
2020-03-20 14:48:29 -07:00
Nikita Popov 417d69595f [InstSimplify] Reorder checks to be more efficient; NFC
First check whether the RHS is a null pointer, and only then
perform a potentially expensive non-zero query.
2020-03-20 22:05:38 +01:00
Simon Pilgrim 34659de5fd [InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by scalar amounts to generic shifts (PR40391)
The sll/srl/sra scalar vector shifts can be replaced with generic shifts if the shift amount is known to be in range.

This also required public DemandedElts variants of llvm::computeKnownBits to be exposed (PR36319).
2020-03-20 15:48:06 +00:00
Simon Pilgrim 7f764fa18f [ValueTracking] Add some initial isKnownNonZero DemandedElts support (PR36319) 2020-03-20 13:29:00 +00:00
Simon Pilgrim c1efdbcbe0 [ValueTracking] Add computeKnownBits DemandedElts support to shift instructions (PR36319) 2020-03-20 11:08:08 +00:00
Simon Pilgrim 0b458d4dca [ValueTracking] Add computeKnownBits DemandedElts support to ADD/SUB/MUL instructions (PR36319) 2020-03-19 12:41:29 +00:00
Simon Pilgrim 99336bf95a [ValueTracking] Add computeKnownBits DemandedElts support to masked add instructions (PR36319) 2020-03-18 21:50:56 +00:00
Simon Pilgrim 9d40292a64 [ValueTracking] Add computeKnownBits DemandedElts support to XOR instructions (PR36319) 2020-03-18 20:24:14 +00:00
Eli Friedman ebec984e14 [AliasAnalysis] Misc fixes for checking aliasing with scalable types.
This is fixing up various places that use the implicit
TypeSize->uint64_t conversion.

The new overloads in MemoryLocation.h are already used in various places
that construct a MemoryLocation from a TypeSize, including MemorySSA.
(They were using the implicit conversion before.)

Differential Revision: https://reviews.llvm.org/D76249
2020-03-18 12:28:47 -07:00
Simon Pilgrim 1010c44b4c [ValueTracking] Add computeKnownBits DemandedElts support to EXTRACTELEMENT/OR/BSWAP/BITREVERSE instructions (PR36319)
These are all covered by the bswap/bitreverse vector tests.
2020-03-18 18:49:58 +00:00
Simon Pilgrim 06150e8356 [ValueTracking] Add computeKnownBits DemandedElts support to AND instructions (PR36319) 2020-03-18 15:38:15 +00:00
Roman Lebedev 85334b030a
[NFCI][SCEV] Avoid recursion in SCEVExpander::isHighCostExpansion*()
Summary:
As noted in [[ https://bugs.llvm.org/show_bug.cgi?id=45201 | PR45201 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=10090 | PR10090 ]] SCEV doesn't
always avoid recursive algorithms, and that causes issues with
large expression depths and/or smaller stack sizes.

In `SCEVExpander::isHighCostExpansion*()` case, the refactoring to avoid
recursion is rather idiomatic. We simply need to place the root expr
into a vector, and iterate over vector elements accounting for the cost
of each one, adding new exprs at the end of the vector,
thus achieving recursion-less traversal.

The order in which we will visit exprs doesn't matter here,
so we will be fine with the most basic approach of using SmallVector
and inserting/extracting from the back, which accidentally is the same
depth-first traversal that we were doing previously recursively.

Reviewers: mkazantsev, reames, wmi, ekatz

Reviewed By: mkazantsev

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76273
2020-03-18 17:10:54 +03:00
Huihui Zhang 1bf0c99375 [ValueTracking][SVE] Fix isGEPKnownNonNull for scalable vector.
Summary:
DataLayout::getTypeAllocSize() return TypeSize. For cases where the
scalable property doesn't matter, we should explicitly call getKnownMinSize()
to avoid implicit type conversion to uint64_t, which is not valid for scalable
vector type.

Reviewers: sdesmalen, efriedma, apazos, reames

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76260
2020-03-17 11:31:30 -07:00
Evgenii Stepanov 2a3723ef11 [memtag] Plug in stack safety analysis.
Summary:
Run StackSafetyAnalysis at the end of the IR pipeline and annotate
proven safe allocas with !stack-safe metadata. Do not instrument such
allocas in the AArch64StackTagging pass.

Reviewers: pcc, vitalybuka, ostannard

Reviewed By: vitalybuka

Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73513
2020-03-16 16:35:25 -07:00
Nico Weber 623cb95eb3 Revert "[InstSimplify] Simplify calls with "returned" attribute"
This reverts commit 45555c3819.
Causes clang crashes in some causes, see comments on
https://reviews.llvm.org/D75815 for details (including
repro steps).
2020-03-16 15:21:30 -04:00
Huihui Zhang 0616e9964b [InstSimplify][SVE] Fix SimplifyGEPInst for scalable vector.
Summary:
Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable
vector, only minimal type alloc size is known at compile-time.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75892
2020-03-16 11:46:12 -07:00
Matt Arsenault 05e7d8d6ce TTI: Add addrspace parameters to memcpy lowering functions 2020-03-16 14:34:29 -04:00
Florian Hahn 650f363bd7 [ValueLattice] Add singlecrfromundef lattice value.
This patch adds a new singlecrfromundef lattice value, indicating a
single element constant range which was merge with undef at some point.
Merging it with another constant range results in overdefined, as we
won't be able to replace all users with a single value.

This patch uses a ConstantRange instead of a Constant*, because regular
integer constants are represented as single element constant ranges as
well and this allows the existing code working without additional
changes.

Reviewers: efriedma, nikic, reames, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D75845
2020-03-15 11:23:46 +00:00
Florian Hahn 4878aa36d4 [ValueLattice] Add new state for undef constants.
This patch adds a new undef lattice state, which is used to represent
UndefValue constants or instructions producing undef.

The main difference to the unknown state is that merging undef values
with constants (or single element constant ranges) produces  the
constant/constant range, assuming all uses of the merge result will be
replaced by the found constant.

Contrary, merging non-single element ranges with undef needs to go to
overdefined. Using unknown for UndefValues currently causes mis-compiles
in CVP/LVI (PR44949) and will become problematic once we use
ValueLatticeElement for SCCP.

Reviewers: efriedma, reames, davide, nikic

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D75120
2020-03-14 17:19:59 +00:00
Eli Friedman 65fc706ddf [SCEV] Add support for GEPs over scalable vectors.
Because we have to use a ConstantExpr at some point, the canonical form
isn't set in stone, but this seems reasonable.

The pretty sizeof(<vscale x 4 x i32>) dumping is a relic of ancient
LLVM; I didn't have to touch that code. :)

Differential Revision: https://reviews.llvm.org/D75887
2020-03-13 16:12:45 -07:00
Ehud Katz 18eae33122 [SCEV] Fix usage of invalid IP with FoldingSet
Fix the use of invalid Insertion Point pointer with the UniqueSCEVs FoldingSet,
which caused memory corruption.
2020-03-13 18:36:58 +02:00
omarahmed1111 b285b333dc [Attributor] Detect possibly unbounded cycles in functions
This patch add mayContainUnboundedCycle helper function which checks whether a function has any cycle which we don't know if it is bounded or not.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

Reviewed By: jdoerfert, uenoku, baziotis

Differential Revision: https://reviews.llvm.org/D74691
2020-03-13 11:17:33 -05:00
Ehud Katz fcc2238b8b [SCEV] Add missing cache queries
Calculating SCEVs can be cumbersome, and may take very long time (even
hours, for very long expressions). To prevent recalculating expressions
over and over again, we cache them.
This change add cache queries to key positions, to prevent recalculation
of the expressions.

Fix PR43571.

Differential Revision: https://reviews.llvm.org/D70097
2020-03-13 15:32:43 +02:00
Huihui Zhang 118abf2017 [SVE] Update API ConstantVector::getSplat() to use ElementCount.
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.

This change is needed for D73753.

Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74386
2020-03-12 13:22:41 -07:00
Sanjay Patel a66dc755db [InstSimplify] simplify FP ops harder with FMF (part 2)
This is part of the IR sibling for:
D75576

Related transform committed with:
rG8ec71585719d
2020-03-12 09:53:20 -04:00
Sanjay Patel 8ec7158571 [InstSimplify] simplify FP ops harder with FMF
This is part of the IR sibling for:
D75576

(I'm splitting part of the transform as a separate commit
to reduce risk. I don't know of any bugs that might be
exposed by this improved folding, but it's hard to see
those in advance...)
2020-03-12 09:13:28 -04:00
Sanjay Patel dea2b93a7b [InstSimplify] reduce code for FP undef/nan folding; NFC 2020-03-12 08:46:15 -04:00
Roman Lebedev 8737dc2d32
[SCEV] isHighCostExpansionHelper(): use correct TTI hooks
Summary:
Cost modelling strikes again.
In PR44668 <https://bugs.llvm.org/show_bug.cgi?id=44668> patch series,
i've made the same mistake of always using generic `getOperationCost()`
that i missed in reviewing D73480/D74495 which was later fixed
in 62dd44d76d.

We should be using more specific hooks instead - `getCastInstrCost()`,
`getArithmeticInstrCost()`, `getCmpSelInstrCost()`.

Evidently, this does not have an effect on the existing testcases,
with unchanged default cost budget. But if it *does* have an effect
on some target, we'll have to segregate tests that use this function
per-target, much like we already do with other TTI-aware transform tests.

There's also an issue that @samparker has brought up in post-commit-review:
>>! In D73501#1905171, @samparker wrote:
> Hi,
> Did you get performance numbers for these patches? We track the performance
> of our (Arm) open source DSP library and the cost model fixes were generally
> a notable improvement, so many thanks for that! But the final patch
> for rewriting exit values has generally been bad, especially considering
> the gains from the modelling improvements. I need to look into it further,
> but on my current test case I'm seeing +30% increase in stack accesses
> with a similar decrease in performance.
> I'm just wondering if you observed any negative effects yourself?

I don't know if this addresses that, or we need D66450 for that.

Reviewers: samparker, spatel, mkazantsev, reames, wmi

Reviewed By: reames

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, samparker

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75908
2020-03-12 11:33:38 +03:00
Huihui Zhang 8f52573962 [InstSimplify][SVE] Fix SimplifyInsert/ExtractElementInst for scalable vector.
Summary:
For scalable vector, index out-of-bound can not be determined at compile-time.
The same apply for VectorUtil findScalarElement().

Add test cases to check the functionality of SimplifyInsert/ExtractElementInst for scalable vector.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: cameron.mcinally, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75782
2020-03-11 15:09:56 -07:00
Anna Welker a6d3bec83f [TTI][ARM][MVE] Refine gather/scatter cost model
Refines the gather/scatter cost model, but also changes the TTI
function getIntrinsicInstrCost to accept an additional parameter
which is needed for the gather/scatter cost evaluation.
This did require trivial changes in some non-ARM backends to
adopt the new parameter.
Extending gathers and truncating scatters are now priced cheaper.

Differential Revision: https://reviews.llvm.org/D75525
2020-03-11 10:23:41 +00:00
Nikita Popov 45555c3819 [InstSimplify] Simplify calls with "returned" attribute
If a call argument has the "returned" attribute, we can simplify
the call to the value of that argument. The "-inst-simplify" pass
already handled this for the constant integer argument case via
known bits, which is invoked in SimplifyInstruction. However,
non-constant (or non-int) arguments are not handled at all right now.

This addresses one of the regressions from D75801.

Differential Revision: https://reviews.llvm.org/D75815
2020-03-09 18:53:47 +01:00
Nikita Popov 829d377a98 [InstSimplify] Don't simplify musttail calls
As pointed out by jdoerfert on D75815, we must be careful when
simplifying musttail calls: We can only replace the return value
if we can eliminate the call entirely. As we can't make this
guarantee for all consumers of InstSimplify, this patch disables
simplification of musttail calls. Without this patch, musttail
simplification currently results in module verification errors.

Differential Revision: https://reviews.llvm.org/D75824
2020-03-09 18:46:56 +01:00
Andrew Monshizadeh c5a06019d2 Extend TimeTrace to LLVM's new pass manager
With the addition of the LLD time tracing it made sense to include coverage
for LLVM's various passes. Doing so ensures that ThinLTO is also covered
with a time trace.

Before:
{F11333974}

After:
{F11333928}

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D74516
2020-03-06 14:45:19 -08:00
Jay Foad 596446623b [AMDGPU][ConstantFolding] Fold llvm.amdgcn.cube* intrinsics
Summary:
This folds the following family of intrinsics:
llvm.amdgcn.cubeid (face id)
llvm.amdgcn.cubema (major axis)
llvm.amdgcn.cubesc (S coordinate)
llvm.amdgcn.cubetc (T coordinate)

Reviewers: nhaehnle, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75187
2020-03-06 16:42:53 +00:00
Jay Foad 11d1573bb6 [APFloat] Make use of new overloaded comparison operators. NFC.
Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75744
2020-03-06 16:42:53 +00:00
Juneyoung Lee d7267ee194 [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

This is a resubmission of 952ad47 with crash fix of llvm/test/Transforms/LoopRotate/freeze-crash.ll.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-06 01:08:35 +09:00
Daniil Suchkov 3db48f9324 Revert "[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators"
That commit causes SIGSEGV on some simple tests.
This reverts commit 952ad4701c.
2020-03-05 16:32:36 +07:00
Nikita Popov c6ff3c9bad [InstSimplify] Constant fold icmp of gep
InstSimplify can fold icmps of gep where the base pointers are the
same and the offsets are constant. It does so by constructing a
constant expression icmp and assumes that it gets folded -- but
this doesn't actually happen, because GEP expressions can usually
only be folded by the target-dependent constant folding layer.
As such, we need to explicitly invoke it here.

Differential Revision: https://reviews.llvm.org/D75407
2020-03-04 23:16:52 +01:00
Nikita Popov 0e890cd4d4 [ConstantFolding] Always return something from ConstantFoldConstant
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.

This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.

Differential Revision: https://reviews.llvm.org/D75543
2020-03-04 18:24:47 +01:00
Evgeniy Brevnov 5a63813dc7 [DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses(PR42151).
Summary:
This is second attempt to fix the problem with incorrect dependencies reported in presence of invariant load. Initial fix (https://reviews.llvm.org/D64405) was reverted due to a regression reported in https://reviews.llvm.org/D70516.

The original fix changed caching behavior for invariant loads. Namely such loads are not put into the second level cache (NonLocalDepInfo). The problem with that fix is the first level cache  (CachedNonLocalPointerInfo) still works as if invariant loads were in the second level cache. The solution is in addition to not putting dependence results into the second level cache avoid putting info about invariant loads into the first level cache as well.

Reviewers: jdoerfert, reames, hfinkel, efriedma

Reviewed By: jdoerfert

Subscribers: DaniilSuchkov, hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73027
2020-03-04 18:40:02 +07:00
Juneyoung Lee 952ad4701c [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-04 11:43:31 +09:00
Whitney Tsang c84532a70a [LoopNest]: Analysis to discover properties of a loop nest.
Summary: This patch adds an analysis pass to collect loop nests and
summarize properties of the nest (e.g the nest depth, whether the nest
is perfect, what's the innermost loop, etc...).

The motivation for this patch was discussed at the latest meeting of the
LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we
discussed
the unimodular loop transformation framework ( “A Loop Transformation
Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and
Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework
provides a convenient way to unify legality checking and code generation
for several loop nest transformations (e.g. loop reversal, loop
interchange, loop skewing) and their compositions. Given that the
unimodular framework is applicable to perfect loop nests this is one
property of interest we expose in this analysis. Several other utility
functions are also provided. In the future other properties of interest
can be added in a centralized place.
Authored By: etiotto
Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn,
reames, hfinkel, jdoerfert, ppc-slack
Reviewed By: Meinersbur
Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D68789
2020-03-03 18:25:19 +00:00
Whitney Tsang 613f791131 Revert "[LoopNest]: Analysis to discover properties of a loop nest."
This reverts commit 3a063d68e3.

Broke the build with modules enabled:
http://green.lab.llvm.org/green/job/lldb-cmake/10655/console .
2020-03-03 14:07:49 +00:00
Whitney Tsang 3a063d68e3 [LoopNest]: Analysis to discover properties of a loop nest.
Summary: This patch adds an analysis pass to collect loop nests and
summarize properties of the nest (e.g the nest depth, whether the nest
is perfect, what's the innermost loop, etc...).

The motivation for this patch was discussed at the latest meeting of the
LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we
discussed
the unimodular loop transformation framework ( “A Loop Transformation
Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and
Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework
provides a convenient way to unify legality checking and code generation
for several loop nest transformations (e.g. loop reversal, loop
interchange, loop skewing) and their compositions. Given that the
unimodular framework is applicable to perfect loop nests this is one
property of interest we expose in this analysis. Several other utility
functions are also provided. In the future other properties of interest
can be added in a centralized place.
Authored By: etiotto
Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn,
reames, hfinkel, jdoerfert, ppc-slack
Reviewed By: Meinersbur
Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D68789
2020-03-03 13:25:28 +00:00
Hiroshi Yamauchi 4d6f3ee2ba [PSI] Add the isCold query support with a given percentile value.
Summary: This follows up D67377 that added the isHot side.

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75283
2020-03-02 12:50:15 -08:00
Juneyoung Lee 644e747681 [ValueTracking] Let getGuaranteedNonFullPoisonOp consider assume, remove mentioning about br
Summary:
This patch helps getGuaranteedNonFullPoisonOp handle llvm.assume call.
Also, a comment about the semantics of branch is removed to prevent confusion.
As llvm.assume does, branching on poison directly raises UB (as LangRef says), and this allows transformations such as introduction of llvm.assume on branch condition at each successor, or freely replacing values after conditional branch (such as at loop exit).
Handling br is not addressed in this patch. It makes SCEV more accurate, causing existing LoopVectorize/IndVar/etc tests to fail.

Reviewers: spatel, lebedev.ri, nlopes

Reviewed By: nlopes

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75397
2020-03-01 07:45:44 +09:00
Juneyoung Lee 282ec40504 [ValueTracking] A value is never undef or poison if it must raise UB
Summary:
This patch helps isGuaranteedNotToBeUndefOrPoison return true if the value
makes the program always undefined.

According to value tracking functions' comments, it is not still in consensus
whether a poison value can be bitwise or not, so conservatively only the case with
i1 is considered.

Reviewers: spatel, lebedev.ri, reames, nlopes, regehr

Reviewed By: nlopes

Subscribers: uenoku, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75396
2020-03-01 07:35:58 +09:00
Teresa Johnson f9ca75f19b [Inliner] Inlining should honor nobuiltin attributes
Summary:
Final patch in series to fix inlining between functions with different
nobuiltin attributes/options, which was specifically an issue in LTO.
See discussion on D61634 for background.

The prior patch in this series (D67923) enabled per-Function TLI
construction that identified the nobuiltin attributes.

Here I have allowed inlining to proceed if the callee's nobuiltins are a
subset of the caller's nobuiltins, but not in the reverse case, which
should be conservatively correct. This is controlled by a new option,
-inline-caller-superset-nobuiltin, which is enabled by default.

Reviewers: hfinkel, gchatelet, chandlerc, davidxl

Subscribers: arsenm, jvesely, nhaehnle, mehdi_amini, eraman, hiraditya, haicheng, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74162
2020-02-28 07:34:14 -08:00
serge-sans-paille 6d15c4deab No longer generate calls to *_finite
According to Joseph Myers, a libm maintainer

> They were only ever an ABI (selected by use of -ffinite-math-only or
> options implying it, which resulted in the headers using "asm" to redirect
> calls to some libm functions), not an API. The change means that ABI has
> turned into compat symbols (only available for existing binaries, not for
> anything newly linked, not included in static libm at all, not included in
> shared libm for future glibc ports such as RV32), so, yes, in any case
> where tools generate direct calls to those functions (rather than just
> following the "asm" annotations on function declarations in the headers),
> they need to stop doing so.

As a consequence, we should no longer assume these symbols are available on the
target system.

Still keep the TargetLibraryInfo for constant folding.

Differential Revision: https://reviews.llvm.org/D74712
2020-02-28 10:07:37 +01:00
Bardia Mahjour 1b811ff8a9 [DA] Delinearization of fixed-size multi-dimensional arrays
Summary:
Currently the dependence analysis in LLVM is unable to compute accurate
dependence vectors for multi-dimensional fixed size arrays.
This is mainly because the delinearization algorithm in scalar evolution
relies on parametric terms to be present in the access functions. In the
case of fixed size arrays such parametric terms are not present, but we
can use the indexes from GEP instructions to recover the subscripts for
each dimension of the arrays. This patch adds this ability under the
existing option `-da-disable-delinearization-checks`.

Authored By: bmahjour

Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu

Reviewed By: Meinersbur

Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72178
2020-02-27 10:29:01 -05:00
Jay Foad 5900d3f2e9 [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic
Reviewers: nhaehnle, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75179
2020-02-27 14:37:53 +00:00
Kirill Naumov c965fd942f Cost Annotation Writer for InlineCost
Add extra diagnostics for the inline cost analysis under
-print-instruction-deltas cl option. When enabled along with
-debug-only=inline-cost it prints the IR of inline candidate
annotated with cost and threshold change per every instruction.

Reviewed By: apilipenko, davidxl, mtrofin

Differential Revision: https://reviews.llvm.org/D71501
2020-02-26 17:03:52 -08:00
Roman Lebedev 400ceda425
[SCEV][IndVars] Always provide insertion point to the SCEVExpander::isHighCostExpansion()
Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728

Reviewers: reames, mkazantsev, wmi, sanjoy

Reviewed By: mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73777
2020-02-25 23:05:59 +03:00
Roman Lebedev d6f47aeb51
[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model min/max (PR44668)
Summary:
Previosly we simply always said that `SCEVMinMaxExpr` is too costly to expand.
But this isn't really true, it expands into just a comparison+swap pair.
And again much like with add/mul, there will be one less such pair
than the number of operands. And we need to count the cost of operands themselves.

This does change a number of testcases, and as far as i can tell,
all of these changes are improvements, in the sense that
we fixed up more latches to do the [in]equality comparison.

This concludes cost-modelling changes, no other SCEV expressions exist as of now.

This is a part of addressing [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]].

Reviewers: reames, mkazantsev, wmi, sanjoy

Reviewed By: mkazantsev

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73744
2020-02-25 23:05:59 +03:00