Commit Graph

139349 Commits

Author SHA1 Message Date
Roman Lebedev b38d897e80
[ConstantRange] binaryXor(): special-case binary complement case - the result is precise
Use the fact that `~X` is equivalent to `-1 - X`, which gives us
fully-precise answer, and we only need to special-handle the wrapped case.

This fires ~16k times for vanilla llvm test-suite + RawSpeed.
2020-09-22 21:37:29 +03:00
Roman Lebedev 4eeeb356fc
[CVP] Enhance SRem -> URem fold to work not just on non-negative operands
This is a continuation of 8d487668d0,
the logic is pretty much identical for SRem:

Name: pos pos
Pre: C0 >= 0 && C1 >= 0
%r = srem i8 C0, C1
  =>
%r = urem i8 C0, C1

Name: pos neg
Pre: C0 >= 0 && C1 <= 0
%r = srem i8 C0, C1
  =>
%r = urem i8 C0, -C1

Name: neg pos
Pre: C0 <= 0 && C1 >= 0
%r = srem i8 C0, C1
  =>
%t0 = urem i8 -C0, C1
%r = sub i8 0, %t0

Name: neg neg
Pre: C0 <= 0 && C1 <= 0
%r = srem i8 C0, C1
  =>
%t0 = urem i8 -C0, -C1
%r = sub i8 0, %t0

https://rise4fun.com/Alive/Vd6

Now, this new logic does not result in any new catches
as of vanilla llvm test-suite + RawSpeed.
but it should be virtually compile-time free,
and it may be important to be consistent in their handling,
because if we had a pair of sdiv-srem, and only converted one of them,
-divrempairs will no longer see them as a pair,
and thus not "merge" them.
2020-09-22 21:37:28 +03:00
Hubert Tong 6801950192 [InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case
The current code for handling pow(x, y) where y is an integer plus 0.5
is not explicitly guarded against attempting to transform the case where
abs(y) is exactly 0.5.

The latter case is meant to be handled by `replacePowWithSqrt`. Indeed,
if the pow(x, integer+0.5) case proceeds past a certain point, it will
hit an assertion by attempting to form pow(x, 0) using `getPow`.

This patch adds an explicit check to prevent attempting the
pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during
the review of D87877. This has the effect of retaining the shrinking of
`pow` to `powf` when the `sqrt` libcall cannot be formed.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D88066
2020-09-22 14:23:32 -04:00
Hubert Tong b0f58aa116 [NFC] Replace tabs with spaces in PPCInstrPrefix.td 2020-09-22 14:23:32 -04:00
Paul C. Anagnostopoulos 848d66fafd Version 0.5 of the new "TableGen Backend Developer's Guide."
Files modified to take comments into account.
MLIR documentation updated for new TableGen documentation files.
2020-09-22 14:01:52 -04:00
Mircea Trofin d1e0f9f3cf [NFC][regalloc] Simplify/conform to style guide indvars in Greedy
Differential Revision: https://reviews.llvm.org/D88055
2020-09-22 10:55:52 -07:00
Amy Kwan 079757b551 [PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM
This patch implements the vector string isolate (predicate and non-predicate
versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG.

Differential Revision: https://reviews.llvm.org/D87671
2020-09-22 11:31:44 -05:00
Amy Kwan b3147058de [PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM
This patch implements the 128-bit vector divide extended builtins in Clang/LLVM.
These builtins map to the vdivesq and vdiveuq instructions respectively.

Differential Revision: https://reviews.llvm.org/D87729
2020-09-22 11:31:44 -05:00
Simon Pilgrim 4dada8d617 [DAG] Remove DAGTypeLegalizer::GenWidenVectorTruncStores (PR42046)
Just scalarize trunc stores - GenWidenVectorTruncStores does the same thing but is flawed (PR42046) and unused.

Differential Revision: https://reviews.llvm.org/D87708
2020-09-22 17:24:45 +01:00
Alexandre Ganea 723fea2307 Silence 'warning: unused variable' when compiling with Clang 10.0 2020-09-22 12:17:40 -04:00
Hamilton Tobon Mosquera bd31abc1d0 [OpenMPOpt] Refactored "issue" and "wait" declarations for data map runtime call.
Refactored __tgt_target_data_begin_mapper_<issue|wait> to receive the handle as an input/output argument.
This given the compiler warning of returning the handle as copy.

Differential Revision: https://reviews.llvm.org/D88029
2020-09-22 10:50:17 -05:00
Alexandre Ganea 6537004913 [ThinLTO] Re-order modules for optimal multi-threaded processing
Re-use an optimizition from the old LTO API (used by ld64).
This sorts modules in ascending order, based on bitcode size, so that larger modules are processed first. This allows for smaller modules to be process last, and better fill free threads 'slots', and thusly allow for better multi-thread load balancing.

In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & `-flto=thin`, `/opt:lldltojobs=all`, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc.

Before patch: 102 sec
After patch: 85 sec

Inspired by the work done by David Callahan in D60495.

Differential Revision: https://reviews.llvm.org/D87966
2020-09-22 11:25:59 -04:00
Arthur Eubanks a031ef6f3a [GVNSink][NewPM] Add GVNSinkPass to PassRegistry.def 2020-09-22 08:24:09 -07:00
Florian Hahn c671e34bf2 [VPlan] Add dump() helper to VPValue & VPRecipeBase.
This provides a convenient way to print VPValues and recipes in a
debugger. In particular it saves the user from instantiating
VPSlotTracker to print recipes or values.
2020-09-22 15:55:16 +01:00
Michael Liao 534f6e1718 [PeepholeOptimizer] Enhance the redundant COPY elimination.
- Eliminate redundant COPYs from the same register & subregister pair.

Differential Revision: https://reviews.llvm.org/D87939
2020-09-22 10:11:37 -04:00
Simon Pilgrim 0793b45660 [X86] Add missing namespace closure comments. NFCI.
Fixes some clang-tidy llvm-namespace-comment warnings.
2020-09-22 15:06:59 +01:00
Simon Pilgrim af71298648 [X86] Cleanup/add namespace closure comments. NFCI.
Fixes some clang-tidy llvm-namespace-comment warnings.
2020-09-22 15:06:58 +01:00
Stefan Pintilie 7e78d89052 [PowerPC] Fix for compiler side issue in PCRelative Local Exec
Stop combining loads and stores with PPCISD::ADD_TLS before we can merge the
node with with TLS_LOCAL_EXEC_MAT_ADDR. The issue is that
TLS_LOCAL_EXEC_MAT_ADDR cannot be selected by itself and requires the previous
ADD_TLS node that goes with it. However, we sometimes try to combine ADD_TLS
with loads and stores that come after it. If this happens then the ADD_TLS is
removed and TLS_LOCAL_EXEC_MAT_ADDR cannot be selected.

While this bug fix will address the issue it my not be ideal from a performance
perspective as we may be able to add patterns to combine TLS_LOCAL_EXEC_MAT_ADDR
with ADD_TLS with the load and store that comes after it all in one. However,
this is beyond the scope of this patch.

Reviewed By: NeHuang

Differential Revision: https://reviews.llvm.org/D88030
2020-09-22 08:28:06 -05:00
Sanjay Patel 0c3bfbe4bc [SLP] reduce code duplication for checking parent block; NFC 2020-09-22 09:21:20 -04:00
Sanjay Patel bbd49a0266 [SLP] move misplaced code comments; NFC 2020-09-22 09:21:20 -04:00
Sanjay Patel 062276c691 [SLP] clean up code in gather(); NFC
1. Use range for-loop to avoid repeatedly accessing end index.
2. Better variable names.
2020-09-22 09:21:20 -04:00
Simon Pilgrim d682a36ef9 [SLP] Merge null and dyn_cast<> checks into dyn_cast_or_null<>. NFCI. 2020-09-22 14:01:47 +01:00
Sam Parker 94c799fecf [ARM] Trying to fix asan buildbot 2020-09-22 13:43:23 +01:00
Max Kazantsev e2703c021d [SCEV] Handle `less` predicates for FoundPred = NE
Currently these predicates are ignored, yet their handling is
pretty simple. I could not find a single test where it would
actually change something, but it's only because isImpliedCondOperands
is not smart enough to prove it further on. Yet the situation when
we come there with `less` predicate is pretty common.

Differential Revision: https://reviews.llvm.org/D87890
Reviewed By: fhahn
2020-09-22 18:56:35 +07:00
Meera Nakrani a3d0dce260 [ARM][TTI] Prevents constants in a min(max) or max(min) pattern from being hoisted when in a loop
Changes TTI function getIntImmCostInst to take an additional Instruction parameter,
which enables us to be able to check it is part of a min(max())/max(min()) pattern that will match SSAT.
We can then mark the constant used as free to prevent it being hoisted so SSAT can still be generated.
Required minor changes in some non-ARM backends to allow for the optional parameter to be included.

Differential Revision: https://reviews.llvm.org/D87457
2020-09-22 11:54:10 +00:00
Simon Pilgrim a15b42146c Revert rGf835779160ec303 "[APFloat] multiplySignificand - always pass IEEEFloat as const reference. NFCI."
This reverts commit f835779160 while I investigate some buildbot failures
2020-09-22 12:15:23 +01:00
Simon Pilgrim f835779160 [APFloat] multiplySignificand - always pass IEEEFloat as const reference. NFCI.
We do this in all other cases.
2020-09-22 11:29:29 +01:00
Max Kazantsev 16fde88dbd [SCEV] Support unsigned predicates in isKnownPredicateViaNoOverflow
SCEV should be able to prove facts like `x <u x+1<nuw>`.

Differential Revision: https://reviews.llvm.org/D88015
Reviewed By: lebedev.ri
2020-09-22 17:14:05 +07:00
Jay Foad 892ef2e3c0 [AMDGPU] More codegen patterns for v2i16/v2f16 build_vector
It's simpler to do this at codegen time than to do ad-hoc constant
folding of machine instructions in SIFoldOperands.

Differential Revision: https://reviews.llvm.org/D88028
2020-09-22 10:41:38 +01:00
Sam Parker b4fa884a73 [ARM] Improve VPT predicate tracking
The VPTBlock has been modified to track the 'global' state of the
VPR, as well as the state for each block. Each object now just holds
a list of instructions that makeup the block, while static structures
hold the predicate information. This enables global access for
querying how both a VPT block and individual instructions are
predicated. These changes now allow us, again, to handle more
complicated cases where multiple instructions build a predicate
and/or where the same predicate in used in multiple blocks.

It doesn't, however, get us back to before the tracking was 'fixed'
as some extra logic will be required to properly handle VPT
instructions. Currently a VPT could be effectively predicated because
of it's inputs, but the existing logic will not detect that and so
will refuse to perform the transformation. This can be seen in
remat-vctp.ll test where we still don't perform the transform.

Differential Revision: https://reviews.llvm.org/D87681
2020-09-22 10:40:27 +01:00
Muhammad Omair Javaid 73a6a164b8 Revert "Reapply Revert "RegAllocFast: Rewrite and improve""
This reverts commit 55f9f87da2.

Breaks following buildbots:
http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4306
http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/9154
2020-09-22 14:40:06 +05:00
Sam Parker a0c1dcc318 [ARM] Remove MVEDomain from VLDR/STR of P0
Remove the domain from the instructions and create a shouldInspect
helper for LowOverheadLoops which queries it or a vpr operand.

Differential Revision: https://reviews.llvm.org/D87900
2020-09-22 09:05:50 +01:00
Sam Parker e461921d6c [ARM] VPT validForTailPredication
Mark all VPT instructions as valid.

Differential Revision: https://reviews.llvm.org/D87759
2020-09-22 08:58:37 +01:00
Martin Storsjö 3fec6ddc27 Reapply: [clang-cl] Always interpret the LIB env var as separated with semicolons
When cross compiling with clang-cl, clang splits the INCLUDE env
variable around semicolons (clang/lib/Driver/ToolChains/MSVC.cpp,
MSVCToolChain::AddClangSystemIncludeArgs) and lld splits the
LIB variable similarly (lld/COFF/Driver.cpp,
LinkerDriver::addLibSearchPaths). Therefore, the consensus for
cross compilation with clang-cl and lld-link seems to be to use
semicolons, despite path lists normally being separated by colons
on unix and EnvPathSeparator being set to that.

Therefore, handle the LIB variable similarly in Clang, when
handling lib file arguments when driving linking via Clang.

This fixes commands like "clang-cl test.c -Fetest.exe kernel32.lib" in
a cross compilation setting. Normally, most users call (lld-)link
directly, but meson happens to use this command syntax for
has_function() tests.

Reapply: Change Program.h to define procid_t as ::pid_t. When included
in lldb/unittests/Host/NativeProcessProtocolTest.cpp, it is included
after an lldb namespace containing an lldb::pid_t typedef, followed
later by a "using namespace lldb;". Previously, Program.h wasn't
included in this translation unit, but now it ends up included
transitively from Process.h.

Differential Revision: https://reviews.llvm.org/D88002
2020-09-22 10:51:25 +03:00
Arthur Eubanks 3bf703fb6d [AlwaysInliner] Emit optimization remarks
To match the normal inliner in preparation for https://reviews.llvm.org/D86988.

Also change a FIXME to an assert.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88067
2020-09-21 22:09:28 -07:00
Dominic Chen 9c7b58080e
[WebAssembly][MC] Fix computation of relative symbol offset
For relative symbols, add its offset when computing relocation value.
Also, warn on unsupported absolute symbols.

Differential Revision: https://reviews.llvm.org/D87407
2020-09-22 00:53:23 -04:00
Serguei Katkov 5502cfa091 [LoopUnswitch] Trivial simplification: remove trivial dead condition after unswitch
Non trivial loop unswitch can keep the dead condition instruction.
CL adds trivial dead code elimination for unused condition.

Reviewers: asbirlea, aqjune, fhahn, DaniilSuchkov, reames
Reviewed By: asbirlea
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D88014
2020-09-22 09:04:59 +07:00
Arthur Eubanks 9db0c572c1 [Delinearization][NewPM] Port delinearization to NPM
Also make tests in Analysis/Delinearization work under NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87741
2020-09-21 17:59:08 -07:00
Fangrui Song 8fdac7cb7a Revert D71539 "Recommit "[SCEV] Look through single value PHIs.""
This reverts commit 11dccf8d3a.

A bootstrapped clang crashes (due to ArrayRef::front called on an empty
ArrayRef) when compiling some files.  Very strangely, this only reproduces with
modules.

```
13 0x0000564d3349e968 llvm::ArrayRef<llvm::BasicBlock*>::front() const /proc/self/cwd/llvm/include/llvm/ADT/ArrayRef.h:160:7
14 0x0000564d3349e896 llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getHeader() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfo.h:104:50
15 0x0000564d3349fd9d llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getLoopLatch() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfoImpl.h:210:11
16 0x0000564d33593c8a llvm::ScalarEvolution::computeBackedgeTakenCount(llvm::Loop const*, bool) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6933:15
17 0x0000564d33592ebc llvm::ScalarEvolution::getBackedgeTakenInfo(llvm::Loop const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:0:30
18 0x0000564d33593a54 llvm::ScalarEvolution::getBackedgeTakenCount(llvm::Loop const*, llvm::ScalarEvolution::ExitCountKind) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6487:36
19 0x0000564d32be2402 llvm::ScalarEvolution::getConstantMaxBackedgeTakenCount(llvm::Loop const*) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:768:5
20 0x0000564d33590807 llvm::ScalarEvolution::getRangeRef(llvm::SCEV const*, llvm::ScalarEvolution::RangeSignHint) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:5495:19
21 0x0000564d320abab7 llvm::ScalarEvolution::getSignedRange(llvm::SCEV const*) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:840:12
22 0x0000564d335a03aa llvm::ScalarEvolution::isKnownPredicateViaConstantRanges(llvm::CmpInst::Predicate, llvm::SCEV const*, llvm::SCEV const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:9239:60
23 0x0000564d33586a80 llvm::ScalarEvolution::isKnownViaNonRecursiveReasoning(llvm::CmpInst::Predicate, llvm::SCEV const*, llvm::SCEV const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:10284:60
```
2020-09-21 17:21:43 -07:00
Krzysztof Parzyszek ae3f54c1e9 [EarlyCSE] Handle masked loads and stores
Extend the handling of memory intrinsics to also include non-
target-specific intrinsics, in particular masked loads and stores.

Invent "isHandledNonTargetIntrinsic" to distinguish between intrin-
sics that should be handled natively from intrinsics that can be
passed to TTI.

Add code that handles masked loads and stores and update the
testcase to reflect the results.

Differential Revision: https://reviews.llvm.org/D87340
2020-09-21 18:47:10 -05:00
Arthur Eubanks 1747f77764 [SimplifyCFG] Override options in default constructor
SimplifyCFG's options should always be overridden by command line flags,
but they mistakenly weren't in the default constructor.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87718
2020-09-21 16:33:01 -07:00
Evandro Menezes 394d020167 [RISCV] Do not mandate scheduling for CSR instructions
Scheduling information is of little value when they may disrupt the
pipeline.  This patch allows omitting the scheduling information for CSR
instructions while still setting `SchedMachineModel::CompleteModel`.  For
specific cases, any scheduling information added will be used by the
scheduler.

Differential revision: https://reviews.llvm.org/D85366
2020-09-21 18:24:53 -05:00
Amara Emerson e3f5046e44 [AArch64][GlobalISel] Merge selection of vector-vector G_ASHR/G_LSHR and support more cases.
The vector-immediate cases are handled elsewhere in an earlier commit.
2020-09-21 16:04:52 -07:00
Amara Emerson a513fdec90 [AArch64][GlobalISel] Add a post-legalize combine for lowering vector-immediate G_ASHR/G_LSHR.
In order to select the immediate forms using the imported patterns, we need to
lower them into new G_VASHR/G_VLSHR target generic ops. Add a combine to do this
matching build_vector of constant operands.

With this, we get selection for free.
2020-09-21 16:04:52 -07:00
Amara Emerson 825203daae [AArch64][GlobalISel] Make <4 x s16> G_ASHR and G_LSHR legal.
Selection support for these is coming up.
2020-09-21 15:32:48 -07:00
Mircea Trofin 6a6b06f526 [NFC][regalloc] Use reverse iterator ranges for improved readability
Differential Revision: https://reviews.llvm.org/D88047
2020-09-21 14:58:37 -07:00
Martin Storsjö 8c3ef08f8a Revert "[clang-cl] Always interpret the LIB env var as separated with semicolons"
This reverts commit 4d85444b31.

This commit broke building lldb's NativeProcessProtocolTest.cpp,
with errors like these:

In file included from include/llvm/Support/Process.h:32:0,
                 from tools/lldb/unittests/Host/NativeProcessProtocolTest.cpp:12:
include/llvm/Support/Program.h:39:11: error: reference to ‘pid_t’ is ambiguous
   typedef pid_t procid_t;

/usr/include/sched.h:38:17: note: candidates are: typedef __pid_t pid_t
 typedef __pid_t pid_t;

tools/lldb/include/lldb/lldb-types.h:85:18: note: typedef uint64_t lldb::pid_t
 typedef uint64_t pid_t;
2020-09-22 00:14:45 +03:00
Krzysztof Parzyszek 2c768c7d6c [EarlyCSE] Small refactoring changes, NFC
1. Store intrinsic ID in ParseMemoryInst instead of a boolean flag
   "IsTargetMemInst". This will make it easier to add support for
   target-independent intrinsics.
2. Extract the complex multiline conditions from EarlyCSE::processNode
   into a new function "getMatchingValue".

Differential Revision: https://reviews.llvm.org/D87691
2020-09-21 16:11:06 -05:00
Baptiste Saleil bb82135538 [PowerPC] Remove unnecessary patterns and types
These patterns and type uses were added by mistake by commit
1372e23c7d
2020-09-21 16:08:54 -05:00
Pengxuan Zheng e5fea37f1a [Hexagon] Make HexagonVLCR compatibile with New PM
The patch modifies HexagonVectorLoopCarriedReuse pass to make it compatible with both Legacy Pass Manager through HexagonVectorLoopCarriedReuseLegacyPass and with New Pass Manager through HexagonVectorLoopCarriedReusePass.

Reviewed By: pzheng

Differential Revision: https://reviews.llvm.org/D86955
2020-09-21 13:45:12 -07:00
Martin Storsjö 36c64af9d7 [CodeGen] [WinException] Only produce handler data at the end of the function if needed
If we are going to write handler data (that is written as variable
length data following after the unwind info in .xdata), we need to
emit the handler data immediately, but for cases where no such
info is going to be written, skip emitting it right away. (Unwind
info for all remaining functions that hasn't gotten it emitted
directly is emitted at the end.)

This does slightly change the ordering of sections (triggering a
bunch of updates to DebugInfo/COFF tests), but the change should be
benign.

This also matches GCC's assembly output, which doesn't output
.seh_handlerdata unless it actually is needed.

For ARM64, the unwind info can be packed into the runtime function
entry itself (leaving no data in the .xdata section at all), but
that can only be done if there's no follow-on data in the .xdata
section. If emission of the unwind info is triggered via
EmitWinEHHandlerData (or the .seh_handlerdata directive), which
implicitly switches to the .xdata section, there's a chance of the
caller wanting to pass further data there, so the packed format
can't be used in that case.

Differential Revision: https://reviews.llvm.org/D87448
2020-09-21 23:42:59 +03:00
Martin Storsjö 4d85444b31 [clang-cl] Always interpret the LIB env var as separated with semicolons
When cross compiling with clang-cl, clang splits the INCLUDE env
variable around semicolons (clang/lib/Driver/ToolChains/MSVC.cpp,
MSVCToolChain::AddClangSystemIncludeArgs) and lld splits the
LIB variable similarly (lld/COFF/Driver.cpp,
LinkerDriver::addLibSearchPaths). Therefore, the consensus for
cross compilation with clang-cl and lld-link seems to be to use
semicolons, despite path lists normally being separated by colons
on unix and EnvPathSeparator being set to that.

Therefore, handle the LIB variable similarly in Clang, when
handling lib file arguments when driving linking via Clang.

This fixes commands like "clang-cl test.c -Fetest.exe kernel32.lib" in
a cross compilation setting. Normally, most users call (lld-)link
directly, but meson happens to use this command syntax for
has_function() tests.

Differential Revision: https://reviews.llvm.org/D88002
2020-09-21 23:42:59 +03:00
Sanjay Patel 7451bf0b0b [SLP] use std::distance/find to reduce code; NFC
We were already using this code pattern right after
the loop, so this makes it consistent.
2020-09-21 16:22:55 -04:00
Matt Arsenault 6daddc213f AMDGPU: Don't add frame register to frame pseudos
We no longer treat the frame register like a function argument, so the
problem this avoided is no longer relevant.
2020-09-21 16:18:47 -04:00
Matt Arsenault 55f9f87da2 Reapply Revert "RegAllocFast: Rewrite and improve"
This reverts commit dbd53a1f0c.

Needed lldb test updates
2020-09-21 15:45:27 -04:00
Zequan Wu 9caa3fbe03 [Coverage] Add empty line regions to SkippedRegions
Differential Revision: https://reviews.llvm.org/D84988
2020-09-21 12:42:53 -07:00
Sanjay Patel 6bad3caeb0 [InstCombine] use unary shuffle creator to reduce code duplication; NFC 2020-09-21 15:34:24 -04:00
Sanjay Patel be93505986 [LoopVectorize] use unary shuffle creator to reduce code duplication; NFC 2020-09-21 15:34:24 -04:00
Arthur Eubanks f4f7df037e [DIE] Remove DeadInstEliminationPass
This pass is like DeadCodeEliminationPass, but only does one pass
through a function instead of iterating on users of eliminated
instructions.

DeadCodeEliminationPass should be used in all cases.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87933
2020-09-21 12:12:25 -07:00
Roman Lebedev 0ab99bb314
[NFC][SCEV] Cleanup lowering of @llvm.uadd.sat, (-1 - V) is just ~V 2020-09-21 22:10:59 +03:00
Arthur Eubanks 746a2c3775 [ObjCARC] Initialize return value
Mistakenly removed initialization of `Changed` in https://reviews.llvm.org/D87806.
2020-09-21 11:03:44 -07:00
Sanjay Patel a44238cb44 [SLP] use unary shuffle creator to reduce code duplication; NFC 2020-09-21 13:54:06 -04:00
Sanjay Patel 1e6b240d7d [IRBuilder][VectorCombine] make and use a convenience function for unary shuffle; NFC
This reduces code duplication for common construct.
Follow-ups can use this in SLP, LoopVectorizer, and other passes.
2020-09-21 13:47:01 -04:00
Roman Lebedev 64e2cb7e96
[SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))`
----------------------------------------
define i32 @src(i32 %x, i32 %y) {
%0:
  %r = uadd_sat i32 %x, %y
  ret i32 %r
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
  %t0 = sub nsw nuw i32 4294967295, %y
  %t1 = umin i32 %x, %t0
  %r = add nuw i32 %t1, %y
  ret i32 %r
}
Transformation seems to be correct!

The alternative, naive, lowering could be the following,
although i don't think it's better,
thought it will likely be needed for sadd/ssub/*shl:

----------------------------------------
define i32 @src(i32 %x, i32 %y) {
%0:
  %r = uadd_sat i32 %x, %y
  ret i32 %r
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
  %t0 = zext i32 %x to i33
  %t1 = zext i32 %y to i33
  %t2 = add nuw i33 %t0, %t1
  %t3 = zext i32 4294967295 to i33
  %t4 = umin i33 %t2, %t3
  %r = trunc i33 %t4 to i32
  ret i32 %r
}
Transformation seems to be correct!
2020-09-21 20:25:54 +03:00
Roman Lebedev fedc9549d5
[SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)`
----------------------------------------
define i32 @src(i32 %x, i32 %y) {
%0:
  %r = usub_sat i32 %x, %y
  ret i32 %r
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
  %t0 = umin i32 %x, %y
  %r = sub nuw i32 %x, %t0
  ret i32 %r
}
Transformation seems to be correct!
2020-09-21 20:25:54 +03:00
Roman Lebedev 1bb7ab8c4a
[SCEV] Recognize @llvm.abs as smax(x, -x)
As per alive2 (ignoring undef):

----------------------------------------
define i32 @src(i32 %x, i1 %y) {
%0:
  %r = abs i32 %x, 0
  ret i32 %r
}
=>
define i32 @tgt(i32 %x, i1 %y) {
%0:
  %neg_x = mul i32 %x, 4294967295
  %r = smax i32 %x, %neg_x
  ret i32 %r
}
Transformation seems to be correct!

----------------------------------------
define i32 @src(i32 %x, i1 %y) {
%0:
  %r = abs i32 %x, 1
  ret i32 %r
}
=>
define i32 @tgt(i32 %x, i1 %y) {
%0:
  %neg_x = mul nsw i32 %x, 4294967295
  %r = smax i32 %x, %neg_x
  ret i32 %r
}
Transformation seems to be correct!
2020-09-21 20:25:53 +03:00
Simon Pilgrim 005f826a05 [SLP] Use for-range loops across ValueLists. NFCI.
Also rename some existing loops that used a 'j' iterator to consistently use 'V'.
2020-09-21 18:24:23 +01:00
Sanjay Patel 46075e0b78 [SLP] simplify interface for gather(); NFC
The implementation of gather() should be reduced too,
but this change by itself makes things a little clearer:
we don't try to gather to a different type or
number-of-values than whatever is passed in as the value
list itself.
2020-09-21 12:57:28 -04:00
Simon Pilgrim 6a0ed57a22 ImplicitNullChecks.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI. 2020-09-21 17:42:57 +01:00
Arthur Eubanks 024979b7b6 [ObjCARC][NewPM] Port objc-arc-contract to NPM
Similar to https://reviews.llvm.org/D86178.

This is a module pass instead of a function pass since
ARCRuntimeEntryPoints can lazily add function declarations.

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D87806
2020-09-21 09:40:14 -07:00
Momchil Velikov 742250bf62 [ARM][CMSE] Issue an error if passing arguments through memory across
security boundary

It was never supported and that part was accidentally omitted when
upstreaming D76518.

Differential Revision: https://reviews.llvm.org/D86478

Change-Id: If6ba9506eb0431c87a1d42a38aa60e47ce263039
2020-09-21 17:26:10 +01:00
Simon Pilgrim 3ae07b2a33 TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI. 2020-09-21 17:17:11 +01:00
Simon Pilgrim 3ddecfd220 SLPVectorizer.cpp - fix include ordering. NFCI. 2020-09-21 17:17:11 +01:00
Simon Pilgrim ce294ff8cd MachineCSE.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI. 2020-09-21 16:54:26 +01:00
Simon Pilgrim 53f1748c13 ProfileSummary.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI. 2020-09-21 16:54:26 +01:00
David Sherwood 96e52c1364 [SVE][CodeGen] Mark ptrue/pfalse instructions as rematerializable 2020-09-21 16:44:32 +01:00
Baptiste Saleil 1372e23c7d [PowerPC] Add vector pair load/store instructions and vector pair register class
This patch adds support for the lxvp, lxvpx, plxvp, stxvp, stxvpx and pstxvp
instructions in the PowerPC backend. These instructions allow loading and
storing VSX register pairs. This patch also adds the VSRp register class
definition needed for these instructions.

Differential Revision: https://reviews.llvm.org/D84359
2020-09-21 10:27:47 -05:00
Arthur Eubanks 5249e6f248 [LoopSimplifyCFG][NewPM] Rename simplify-cfg -> loop-simplifycfg
This matches the legacy PM name and makes all tests in
Transforms/LoopSimplifyCFG pass under NPM.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D87948
2020-09-21 08:27:19 -07:00
Alexey Bataev 3ff07fcd54 [SLP] Allow reordering of vectorization trees with reused instructions.
If some leaves have the same instructions to be vectorized, we may
incorrectly evaluate the best order for the root node (it is built for the
vector of instructions without repeated instructions and, thus, has less
elements than the root node). In this case we just can not try to reorder
the tree + we may calculate the wrong number of nodes that requre the
same reordering.
For example, if the root node is \<a+b, a+c, a+d, f+e\>, then the leaves
are \<a, a, a, f\> and \<b, c, d, e\>. When we try to vectorize the first
leaf, it will be shrink to \<a, b\>. If instructions in this leaf should
be reordered, the best order will be \<1, 0\>. We need to extend this
order for the root node. For the root node this order should look like
\<3, 0, 1, 2\>. This patch allows extension of the orders of the nodes
with the reused instructions.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D45263
2020-09-21 10:51:03 -04:00
Simon Pilgrim 2ef2abdec2 DWARFEmitter.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI. 2020-09-21 15:33:09 +01:00
Paul C. Anagnostopoulos bd55d5b2a1 Change comments about order of classes in superclass list. 2020-09-21 10:25:44 -04:00
Simon Pilgrim 82042a2c9b DWARFYAML::emitDebugSections - remove unnecessary cantFail(success) call. NFCI.
As mentioned on rG6bb912336804.
2020-09-21 14:07:11 +01:00
Denis Antrushin ee86688b81 [Statepoints][ISEL] gc.relocate uniquification should be based on SDValue, not IR Value.
When exporting statepoint results to virtual registers we try to avoid
generating exports for duplicated inputs. But we erroneously use
IR Value* to check if inputs are duplicated. Instead, we should use
SDValue, because even different IR values can get lowered to the same
SDValue.
I'm adding a (degenerate) test case which emphasizes importance of this
feature for invoke statepoints.
If we fail to export only unique values we will end up with something
like that:

  %0 = STATEPOINT
  %1 = COPY %0

landing_pad:
  <use of %1>

And when exceptional path is taken, %1 is left uninitialized (COPY is never
execute).

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D87695
2020-09-21 19:44:46 +07:00
Paul Walker f3fa954b5b [SVE] Change definition of reduction ISD nodes to have an SVE vector result type.
The current nodes, AArch64::SMAXV_PRED for example, are defined to
return a NEON vector result.  This is incorrect because they modify
the complete SVE register and are thus changed to represent such.

This patch also adds nodes for UADDV_PRED and SADDV_PRED, which
unifies the handling of all SVE reductions.

NOTE: Floating-point reductions are already implemented correctly,
so this patch is essentially making everything consistent with those.

Differential Revision: https://reviews.llvm.org/D87843
2020-09-21 13:16:28 +01:00
Paul Walker 6457455248 [SVE] Use NEON for extract_vector_elt when the index is in range.
Patch also adds missing patterns for unpacked vector types and
extracts of element zero.

Differential Revision: https://reviews.llvm.org/D87842
2020-09-21 13:12:28 +01:00
Alexander Belyaev 17dc729bd4 Revert "[NFC][ScheduleDAG] Remove unused EntrySU SUnit"
This reverts commit 0345d88de6.

Google internal backend uses EntrySU, we are looking into removing
dependency on it.

Differential Revision: https://reviews.llvm.org/D88018
2020-09-21 13:33:05 +02:00
Florian Hahn 11dccf8d3a Recommit "[SCEV] Look through single value PHIs."
This commit was originally because it was suspected to cause a crash,
but a reproducer did not surface.

A crash that was exposed by this change was fixed in 1d8f2e5292.

This reverts the revert commit 0581c0b0ee.
2020-09-21 11:59:50 +01:00
David Green f4c5cadbcb [ARM] Select f32 constants with vmov.f16
This adds lowering for f32 values using the vmov.f16, which zeroes the
top bits whilst setting the lower bits to a pattern. This range of
values does not often come up, except where a f16 constant value has
been converted to a f32.

Differential Revision: https://reviews.llvm.org/D87790
2020-09-21 11:10:47 +01:00
Sjoerd Meijer 4b8ade837e [AArch64] Cortex-A55 scheduler model
This is an initial commit adding the A55 model, but it isn't used/enabled yet.
We will follow up on this to improve the model, then flip the switch.

The optimisation guide describing Cortex-A55 micro-architecture in more detail
can be found here:

https://static.docs.arm.com/epm128372/20/arm_cortex_a55_software_optimization_guide_v2.pdf

Original patch by Javed Absar.

Differential Revision: https://reviews.llvm.org/D46884
2020-09-21 10:54:32 +01:00
Alex Richardson 8cf6778d30 [RISC-V] Implement RISCVInstrInfo::isCopyInstrImpl()
This does not result in changes for any of the current tests, but it might
improve debug information in some cases.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D86522
2020-09-21 10:21:11 +01:00
Lucas Prates 53d238a961 [CodeGen] Fixing inconsistent ABI mangling of vlaues in SelectionDAGBuilder
SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type convertions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
inspite of, the regular Calling Convention Lowering.

The issue could be observed in a scenario such as:

```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper convertion had taken place when lowering the return
; value of the first call.
```

This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contanined in the Calling Convention Lowering.

This fixes Bugzilla #47454.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87844
2020-09-21 10:05:34 +01:00
Florian Hahn 57ae9bb932 [LSR] Preserve MSSA when using SplitCriticalEdge.
LSR claims to MemorySSA, but we also have to make sure it is preserved
when splitting critical edges. This can be done by passing MSSAU to
SplitCriticalEdge.

Fixes PR47557.
2020-09-21 09:51:26 +01:00
Fangrui Song dbc616e982 [EHStreamer] Fix a "Continue to action" -fverbose-asm comment when multi-byte LEB128 encoding is needed
This only happens with more than 64 action records and it is difficult to construct a test.
2020-09-20 21:41:48 -07:00
Qiu Chaofan 1d782c2987 [PowerPC] Pass nofpexcept flag to custom lowered constrained ops
This is a follow-up of D86605. For strict DAG FP node, if its FP
exception behavior metadata is ignore, it should have nofpexcept flag.
But during custom lowering, this flag isn't passed down.

This is also seen on X86 target.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87390
2020-09-21 10:44:25 +08:00
Fangrui Song d06485685d [XRay] Change mips to use version 2 sled (PC-relative address)
Follow-up to D78590. All targets use PC-relative addresses now.

Reviewed By: atanasyan, dberris

Differential Revision: https://reviews.llvm.org/D87977
2020-09-20 17:59:57 -07:00
Craig Topper a74b1faba2 [X86] Make reduceMaskedLoadToScalarLoad/reduceMaskedStoreToScalarStore work for avx512 after type legalization.
The scalar elements of the vXi1 build_vector will have been type legalized to i8 by padding with 0s. So we can't check for all ones. Instead we should just look at bit 0 of the constant.

Differential Revision: https://reviews.llvm.org/D87863
2020-09-20 13:54:20 -07:00
Craig Topper 4e8c028158 [X86] Stop reduceMaskedLoadToScalarLoad/reduceMaskedStoreToScalarStore from creating scalar i64 load/stores in 32-bit mode
If we emit a scalar i64 load/store it will get type legalized to two i32 load/stores.

Differential Revision: https://reviews.llvm.org/D87862
2020-09-20 13:46:59 -07:00
David Green 29bd8ea110 [ARM] Constant fold VMOVrh
This adds simple constant folding for VMOVrh, to constant fold fp16
constants to integer values. It can help especially with soft calling
conventions, but some of the results are not optimal as we end up
loading using a vldr. This will be improved in a follow up patch.

Differential Revision: https://reviews.llvm.org/D87789
2020-09-20 21:32:51 +01:00
Nikita Popov 445db89b53 [LVI] Get value range from mask comparison
InstCombine likes to canonicalize comparisons of the form
X == C || X == C+1 into (X & -2) == C'. Make sure LVI can still
recover the value range from this. Can of course also be useful
for proper mask comparisons.

For the sake of clarity, the implementation goes through KnownBits
to compute the range.
2020-09-20 21:13:57 +02:00
Nikita Popov f94bbe19b6 [LVI] Refactor getValueFromICmpCondition (NFC)
Rewrite this in a way where the core logic is in a separate
function, that is invoked with swapped operands. This makes it
easier to add handling for additional icmp patterns.
2020-09-20 21:13:57 +02:00