llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	b38d897e80	[ConstantRange] binaryXor(): special-case binary complement case - the result is precise Use the fact that `~X` is equivalent to `-1 - X`, which gives us fully-precise answer, and we only need to special-handle the wrapped case. This fires ~16k times for vanilla llvm test-suite + RawSpeed.	2020-09-22 21:37:29 +03:00
Roman Lebedev	4eeeb356fc	[CVP] Enhance SRem -> URem fold to work not just on non-negative operands This is a continuation of `8d487668d0`, the logic is pretty much identical for SRem: Name: pos pos Pre: C0 >= 0 && C1 >= 0 %r = srem i8 C0, C1 => %r = urem i8 C0, C1 Name: pos neg Pre: C0 >= 0 && C1 <= 0 %r = srem i8 C0, C1 => %r = urem i8 C0, -C1 Name: neg pos Pre: C0 <= 0 && C1 >= 0 %r = srem i8 C0, C1 => %t0 = urem i8 -C0, C1 %r = sub i8 0, %t0 Name: neg neg Pre: C0 <= 0 && C1 <= 0 %r = srem i8 C0, C1 => %t0 = urem i8 -C0, -C1 %r = sub i8 0, %t0 https://rise4fun.com/Alive/Vd6 Now, this new logic does not result in any new catches as of vanilla llvm test-suite + RawSpeed, but it should be virtually compile-time free, and it may be important to be consistent in their handling, because if we had a pair of sdiv-srem, and only converted one of them, -divrempairs will no longer see them as a pair, and thus not "merge" them.	2020-09-22 21:37:28 +03:00
Hubert Tong	6801950192	[InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case The current code for handling pow(x, y) where y is an integer plus 0.5 is not explicitly guarded against attempting to transform the case where abs(y) is exactly 0.5. The latter case is meant to be handled by `replacePowWithSqrt`. Indeed, if the pow(x, integer+0.5) case proceeds past a certain point, it will hit an assertion by attempting to form pow(x, 0) using `getPow`. This patch adds an explicit check to prevent attempting the pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during the review of D87877. This has the effect of retaining the shrinking of `pow` to `powf` when the `sqrt` libcall cannot be formed. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88066	2020-09-22 14:23:32 -04:00
Hubert Tong	b0f58aa116	[NFC] Replace tabs with spaces in PPCInstrPrefix.td	2020-09-22 14:23:32 -04:00
Paul C. Anagnostopoulos	848d66fafd	Version 0.5 of the new "TableGen Backend Developer's Guide." Files modified to take comments into account. MLIR documentation updated for new TableGen documentation files.	2020-09-22 14:01:52 -04:00
Mircea Trofin	d1e0f9f3cf	[NFC][regalloc] Simplify/conform to style guide indvars in Greedy Differential Revision: https://reviews.llvm.org/D88055	2020-09-22 10:55:52 -07:00
Amy Kwan	079757b551	[PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM This patch implements the vector string isolate (predicate and non-predicate versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG. Differential Revision: https://reviews.llvm.org/D87671	2020-09-22 11:31:44 -05:00
Amy Kwan	b3147058de	[PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM This patch implements the 128-bit vector divide extended builtins in Clang/LLVM. These builtins map to the vdivesq and vdiveuq instructions respectively. Differential Revision: https://reviews.llvm.org/D87729	2020-09-22 11:31:44 -05:00
Simon Pilgrim	4dada8d617	[DAG] Remove DAGTypeLegalizer::GenWidenVectorTruncStores (PR42046) Just scalarize trunc stores - GenWidenVectorTruncStores does the same thing but is flawed (PR42046) and unused. Differential Revision: https://reviews.llvm.org/D87708	2020-09-22 17:24:45 +01:00
Alexandre Ganea	723fea2307	Silence 'warning: unused variable' when compiling with Clang 10.0	2020-09-22 12:17:40 -04:00
Hamilton Tobon Mosquera	bd31abc1d0	[OpenMPOpt] Refactored "issue" and "wait" declarations for data map runtime call. Refactored __tgt_target_data_begin_mapper_<issue|wait> to receive the handle as an input/output argument. This addresses the compiler warning about returning the handle as a copy. Differential Revision: https://reviews.llvm.org/D88029	2020-09-22 10:50:17 -05:00
Alexandre Ganea	6537004913	[ThinLTO] Re-order modules for optimal multi-threaded processing Re-use an optimization from the old LTO API (used by ld64). This sorts modules in descending order of bitcode size, so that larger modules are processed first. This allows smaller modules to be processed last, where they better fill free thread 'slots' and thus give better multi-thread load balancing. In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & `-flto=thin`, `/opt:lldltojobs=all`, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc. Before patch: 102 sec After patch: 85 sec Inspired by the work done by David Callahan in D60495. Differential Revision: https://reviews.llvm.org/D87966	2020-09-22 11:25:59 -04:00
Arthur Eubanks	a031ef6f3a	[GVNSink][NewPM] Add GVNSinkPass to PassRegistry.def	2020-09-22 08:24:09 -07:00
Florian Hahn	c671e34bf2	[VPlan] Add dump() helper to VPValue & VPRecipeBase. This provides a convenient way to print VPValues and recipes in a debugger. In particular it saves the user from instantiating VPSlotTracker to print recipes or values.	2020-09-22 15:55:16 +01:00
Michael Liao	534f6e1718	[PeepholeOptimizer] Enhance the redundant COPY elimination. - Eliminate redundant COPYs from the same register & subregister pair. Differential Revision: https://reviews.llvm.org/D87939	2020-09-22 10:11:37 -04:00
Simon Pilgrim	0793b45660	[X86] Add missing namespace closure comments. NFCI. Fixes some clang-tidy llvm-namespace-comment warnings.	2020-09-22 15:06:59 +01:00
Simon Pilgrim	af71298648	[X86] Cleanup/add namespace closure comments. NFCI. Fixes some clang-tidy llvm-namespace-comment warnings.	2020-09-22 15:06:58 +01:00
Stefan Pintilie	7e78d89052	[PowerPC] Fix for compiler side issue in PCRelative Local Exec Stop combining loads and stores with PPCISD::ADD_TLS before we can merge the node with TLS_LOCAL_EXEC_MAT_ADDR. The issue is that TLS_LOCAL_EXEC_MAT_ADDR cannot be selected by itself and requires the previous ADD_TLS node that goes with it. However, we sometimes try to combine ADD_TLS with loads and stores that come after it. If this happens then the ADD_TLS is removed and TLS_LOCAL_EXEC_MAT_ADDR cannot be selected. While this bug fix will address the issue, it may not be ideal from a performance perspective, as we may be able to add patterns to combine TLS_LOCAL_EXEC_MAT_ADDR with ADD_TLS with the load and store that comes after it all in one. However, this is beyond the scope of this patch. Reviewed By: NeHuang Differential Revision: https://reviews.llvm.org/D88030	2020-09-22 08:28:06 -05:00
Sanjay Patel	0c3bfbe4bc	[SLP] reduce code duplication for checking parent block; NFC	2020-09-22 09:21:20 -04:00
Sanjay Patel	bbd49a0266	[SLP] move misplaced code comments; NFC	2020-09-22 09:21:20 -04:00
Sanjay Patel	062276c691	[SLP] clean up code in gather(); NFC 1. Use range for-loop to avoid repeatedly accessing end index. 2. Better variable names.	2020-09-22 09:21:20 -04:00
Simon Pilgrim	d682a36ef9	[SLP] Merge null and dyn_cast<> checks into dyn_cast_or_null<>. NFCI.	2020-09-22 14:01:47 +01:00
Sam Parker	94c799fecf	[ARM] Trying to fix asan buildbot	2020-09-22 13:43:23 +01:00
Max Kazantsev	e2703c021d	[SCEV] Handle `less` predicates for FoundPred = NE Currently these predicates are ignored, yet their handling is pretty simple. I could not find a single test where it would actually change something, but it's only because isImpliedCondOperands is not smart enough to prove it further on. Yet the situation when we come there with `less` predicate is pretty common. Differential Revision: https://reviews.llvm.org/D87890 Reviewed By: fhahn	2020-09-22 18:56:35 +07:00
Meera Nakrani	a3d0dce260	[ARM][TTI] Prevents constants in a min(max) or max(min) pattern from being hoisted when in a loop Changes TTI function getIntImmCostInst to take an additional Instruction parameter, which enables us to be able to check it is part of a min(max())/max(min()) pattern that will match SSAT. We can then mark the constant used as free to prevent it being hoisted so SSAT can still be generated. Required minor changes in some non-ARM backends to allow for the optional parameter to be included. Differential Revision: https://reviews.llvm.org/D87457	2020-09-22 11:54:10 +00:00
Simon Pilgrim	a15b42146c	Revert rGf835779160ec303 "[APFloat] multiplySignificand - always pass IEEEFloat as const reference. NFCI." This reverts commit `f835779160` while I investigate some buildbot failures	2020-09-22 12:15:23 +01:00
Simon Pilgrim	f835779160	[APFloat] multiplySignificand - always pass IEEEFloat as const reference. NFCI. We do this in all other cases.	2020-09-22 11:29:29 +01:00
Max Kazantsev	16fde88dbd	[SCEV] Support unsigned predicates in isKnownPredicateViaNoOverflow SCEV should be able to prove facts like `x <u x+1<nuw>`. Differential Revision: https://reviews.llvm.org/D88015 Reviewed By: lebedev.ri	2020-09-22 17:14:05 +07:00
Jay Foad	892ef2e3c0	[AMDGPU] More codegen patterns for v2i16/v2f16 build_vector It's simpler to do this at codegen time than to do ad-hoc constant folding of machine instructions in SIFoldOperands. Differential Revision: https://reviews.llvm.org/D88028	2020-09-22 10:41:38 +01:00
Sam Parker	b4fa884a73	[ARM] Improve VPT predicate tracking The VPTBlock has been modified to track the 'global' state of the VPR, as well as the state for each block. Each object now just holds a list of instructions that make up the block, while static structures hold the predicate information. This enables global access for querying how both a VPT block and individual instructions are predicated. These changes now allow us, again, to handle more complicated cases where multiple instructions build a predicate and/or where the same predicate is used in multiple blocks. It doesn't, however, get us back to before the tracking was 'fixed' as some extra logic will be required to properly handle VPT instructions. Currently a VPT could be effectively predicated because of its inputs, but the existing logic will not detect that and so will refuse to perform the transformation. This can be seen in remat-vctp.ll test where we still don't perform the transform. Differential Revision: https://reviews.llvm.org/D87681	2020-09-22 10:40:27 +01:00
Muhammad Omair Javaid	73a6a164b8	Revert "Reapply Revert "RegAllocFast: Rewrite and improve"" This reverts commit `55f9f87da2`. Breaks following buildbots: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4306 http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/9154	2020-09-22 14:40:06 +05:00
Sam Parker	a0c1dcc318	[ARM] Remove MVEDomain from VLDR/STR of P0 Remove the domain from the instructions and create a shouldInspect helper for LowOverheadLoops which queries it or a vpr operand. Differential Revision: https://reviews.llvm.org/D87900	2020-09-22 09:05:50 +01:00
Sam Parker	e461921d6c	[ARM] VPT validForTailPredication Mark all VPT instructions as valid. Differential Revision: https://reviews.llvm.org/D87759	2020-09-22 08:58:37 +01:00
Martin Storsjö	3fec6ddc27	Reapply: [clang-cl] Always interpret the LIB env var as separated with semicolons When cross compiling with clang-cl, clang splits the INCLUDE env variable around semicolons (clang/lib/Driver/ToolChains/MSVC.cpp, MSVCToolChain::AddClangSystemIncludeArgs) and lld splits the LIB variable similarly (lld/COFF/Driver.cpp, LinkerDriver::addLibSearchPaths). Therefore, the consensus for cross compilation with clang-cl and lld-link seems to be to use semicolons, despite path lists normally being separated by colons on unix and EnvPathSeparator being set to that. Therefore, handle the LIB variable similarly in Clang, when handling lib file arguments when driving linking via Clang. This fixes commands like "clang-cl test.c -Fetest.exe kernel32.lib" in a cross compilation setting. Normally, most users call (lld-)link directly, but meson happens to use this command syntax for has_function() tests. Reapply: Change Program.h to define procid_t as ::pid_t. When included in lldb/unittests/Host/NativeProcessProtocolTest.cpp, it is included after an lldb namespace containing an lldb::pid_t typedef, followed later by a "using namespace lldb;". Previously, Program.h wasn't included in this translation unit, but now it ends up included transitively from Process.h. Differential Revision: https://reviews.llvm.org/D88002	2020-09-22 10:51:25 +03:00
Arthur Eubanks	3bf703fb6d	[AlwaysInliner] Emit optimization remarks To match the normal inliner in preparation for https://reviews.llvm.org/D86988. Also change a FIXME to an assert. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88067	2020-09-21 22:09:28 -07:00
Dominic Chen	9c7b58080e	[WebAssembly][MC] Fix computation of relative symbol offset For relative symbols, add its offset when computing relocation value. Also, warn on unsupported absolute symbols. Differential Revision: https://reviews.llvm.org/D87407	2020-09-22 00:53:23 -04:00
Serguei Katkov	5502cfa091	[LoopUnswitch] Trivial simplification: remove trivial dead condition after unswitch Non trivial loop unswitch can keep the dead condition instruction. CL adds trivial dead code elimination for unused condition. Reviewers: asbirlea, aqjune, fhahn, DaniilSuchkov, reames Reviewed By: asbirlea Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D88014	2020-09-22 09:04:59 +07:00
Arthur Eubanks	9db0c572c1	[Delinearization][NewPM] Port delinearization to NPM Also make tests in Analysis/Delinearization work under NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87741	2020-09-21 17:59:08 -07:00
Fangrui Song	8fdac7cb7a	Revert D71539 "Recommit "[SCEV] Look through single value PHIs."" This reverts commit `11dccf8d3a`. A bootstrapped clang crashes (due to ArrayRef::front called on an empty ArrayRef) when compiling some files. Very strangely, this only reproduces with modules. ``` 13 0x0000564d3349e968 llvm::ArrayRef<llvm::BasicBlock>::front() const /proc/self/cwd/llvm/include/llvm/ADT/ArrayRef.h:160:7 14 0x0000564d3349e896 llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getHeader() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfo.h:104:50 15 0x0000564d3349fd9d llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getLoopLatch() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfoImpl.h:210:11 16 0x0000564d33593c8a llvm::ScalarEvolution::computeBackedgeTakenCount(llvm::Loop const, bool) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6933:15 17 0x0000564d33592ebc llvm::ScalarEvolution::getBackedgeTakenInfo(llvm::Loop const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:0:30 18 0x0000564d33593a54 llvm::ScalarEvolution::getBackedgeTakenCount(llvm::Loop const, llvm::ScalarEvolution::ExitCountKind) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6487:36 19 0x0000564d32be2402 llvm::ScalarEvolution::getConstantMaxBackedgeTakenCount(llvm::Loop const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:768:5 20 0x0000564d33590807 llvm::ScalarEvolution::getRangeRef(llvm::SCEV const, llvm::ScalarEvolution::RangeSignHint) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:5495:19 21 0x0000564d320abab7 llvm::ScalarEvolution::getSignedRange(llvm::SCEV const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:840:12 22 0x0000564d335a03aa llvm::ScalarEvolution::isKnownPredicateViaConstantRanges(llvm::CmpInst::Predicate, llvm::SCEV const, llvm::SCEV const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:9239:60 23 0x0000564d33586a80 llvm::ScalarEvolution::isKnownViaNonRecursiveReasoning(llvm::CmpInst::Predicate, llvm::SCEV const, llvm::SCEV const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:10284:60 ```	2020-09-21 17:21:43 -07:00
Krzysztof Parzyszek	ae3f54c1e9	[EarlyCSE] Handle masked loads and stores Extend the handling of memory intrinsics to also include non-target-specific intrinsics, in particular masked loads and stores. Invent "isHandledNonTargetIntrinsic" to distinguish between intrinsics that should be handled natively from intrinsics that can be passed to TTI. Add code that handles masked loads and stores and update the testcase to reflect the results. Differential Revision: https://reviews.llvm.org/D87340	2020-09-21 18:47:10 -05:00
Arthur Eubanks	1747f77764	[SimplifyCFG] Override options in default constructor SimplifyCFG's options should always be overridden by command line flags, but they mistakenly weren't in the default constructor. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87718	2020-09-21 16:33:01 -07:00
Evandro Menezes	394d020167	[RISCV] Do not mandate scheduling for CSR instructions Scheduling information is of little value when they may disrupt the pipeline. This patch allows omitting the scheduling information for CSR instructions while still setting `SchedMachineModel::CompleteModel`. For specific cases, any scheduling information added will be used by the scheduler. Differential revision: https://reviews.llvm.org/D85366	2020-09-21 18:24:53 -05:00
Amara Emerson	e3f5046e44	[AArch64][GlobalISel] Merge selection of vector-vector G_ASHR/G_LSHR and support more cases. The vector-immediate cases are handled elsewhere in an earlier commit.	2020-09-21 16:04:52 -07:00
Amara Emerson	a513fdec90	[AArch64][GlobalISel] Add a post-legalize combine for lowering vector-immediate G_ASHR/G_LSHR. In order to select the immediate forms using the imported patterns, we need to lower them into new G_VASHR/G_VLSHR target generic ops. Add a combine to do this matching build_vector of constant operands. With this, we get selection for free.	2020-09-21 16:04:52 -07:00
Amara Emerson	825203daae	[AArch64][GlobalISel] Make <4 x s16> G_ASHR and G_LSHR legal. Selection support for these is coming up.	2020-09-21 15:32:48 -07:00
Mircea Trofin	6a6b06f526	[NFC][regalloc] Use reverse iterator ranges for improved readability Differential Revision: https://reviews.llvm.org/D88047	2020-09-21 14:58:37 -07:00
Martin Storsjö	8c3ef08f8a	Revert "[clang-cl] Always interpret the LIB env var as separated with semicolons" This reverts commit `4d85444b31`. This commit broke building lldb's NativeProcessProtocolTest.cpp, with errors like these: In file included from include/llvm/Support/Process.h:32:0, from tools/lldb/unittests/Host/NativeProcessProtocolTest.cpp:12: include/llvm/Support/Program.h:39:11: error: reference to ‘pid_t’ is ambiguous typedef pid_t procid_t; /usr/include/sched.h:38:17: note: candidates are: typedef __pid_t pid_t typedef __pid_t pid_t; tools/lldb/include/lldb/lldb-types.h:85:18: note: typedef uint64_t lldb::pid_t typedef uint64_t pid_t;	2020-09-22 00:14:45 +03:00
Krzysztof Parzyszek	2c768c7d6c	[EarlyCSE] Small refactoring changes, NFC 1. Store intrinsic ID in ParseMemoryInst instead of a boolean flag "IsTargetMemInst". This will make it easier to add support for target-independent intrinsics. 2. Extract the complex multiline conditions from EarlyCSE::processNode into a new function "getMatchingValue". Differential Revision: https://reviews.llvm.org/D87691	2020-09-21 16:11:06 -05:00
Baptiste Saleil	bb82135538	[PowerPC] Remove unnecessary patterns and types These patterns and type uses were added by mistake by commit `1372e23c7d`	2020-09-21 16:08:54 -05:00
Pengxuan Zheng	e5fea37f1a	[Hexagon] Make HexagonVLCR compatible with New PM The patch modifies HexagonVectorLoopCarriedReuse pass to make it compatible with both Legacy Pass Manager through HexagonVectorLoopCarriedReuseLegacyPass and with New Pass Manager through HexagonVectorLoopCarriedReusePass. Reviewed By: pzheng Differential Revision: https://reviews.llvm.org/D86955	2020-09-21 13:45:12 -07:00
Martin Storsjö	36c64af9d7	[CodeGen] [WinException] Only produce handler data at the end of the function if needed If we are going to write handler data (that is written as variable length data following after the unwind info in .xdata), we need to emit the handler data immediately, but for cases where no such info is going to be written, skip emitting it right away. (Unwind info for all remaining functions that haven't gotten it emitted directly is emitted at the end.) This does slightly change the ordering of sections (triggering a bunch of updates to DebugInfo/COFF tests), but the change should be benign. This also matches GCC's assembly output, which doesn't output .seh_handlerdata unless it actually is needed. For ARM64, the unwind info can be packed into the runtime function entry itself (leaving no data in the .xdata section at all), but that can only be done if there's no follow-on data in the .xdata section. If emission of the unwind info is triggered via EmitWinEHHandlerData (or the .seh_handlerdata directive), which implicitly switches to the .xdata section, there's a chance of the caller wanting to pass further data there, so the packed format can't be used in that case. Differential Revision: https://reviews.llvm.org/D87448	2020-09-21 23:42:59 +03:00
Martin Storsjö	4d85444b31	[clang-cl] Always interpret the LIB env var as separated with semicolons When cross compiling with clang-cl, clang splits the INCLUDE env variable around semicolons (clang/lib/Driver/ToolChains/MSVC.cpp, MSVCToolChain::AddClangSystemIncludeArgs) and lld splits the LIB variable similarly (lld/COFF/Driver.cpp, LinkerDriver::addLibSearchPaths). Therefore, the consensus for cross compilation with clang-cl and lld-link seems to be to use semicolons, despite path lists normally being separated by colons on unix and EnvPathSeparator being set to that. Therefore, handle the LIB variable similarly in Clang, when handling lib file arguments when driving linking via Clang. This fixes commands like "clang-cl test.c -Fetest.exe kernel32.lib" in a cross compilation setting. Normally, most users call (lld-)link directly, but meson happens to use this command syntax for has_function() tests. Differential Revision: https://reviews.llvm.org/D88002	2020-09-21 23:42:59 +03:00
Sanjay Patel	7451bf0b0b	[SLP] use std::distance/find to reduce code; NFC We were already using this code pattern right after the loop, so this makes it consistent.	2020-09-21 16:22:55 -04:00
Matt Arsenault	6daddc213f	AMDGPU: Don't add frame register to frame pseudos We no longer treat the frame register like a function argument, so the problem this avoided is no longer relevant.	2020-09-21 16:18:47 -04:00
Matt Arsenault	55f9f87da2	Reapply Revert "RegAllocFast: Rewrite and improve" This reverts commit `dbd53a1f0c`. Needed lldb test updates	2020-09-21 15:45:27 -04:00
Zequan Wu	9caa3fbe03	[Coverage] Add empty line regions to SkippedRegions Differential Revision: https://reviews.llvm.org/D84988	2020-09-21 12:42:53 -07:00
Sanjay Patel	6bad3caeb0	[InstCombine] use unary shuffle creator to reduce code duplication; NFC	2020-09-21 15:34:24 -04:00
Sanjay Patel	be93505986	[LoopVectorize] use unary shuffle creator to reduce code duplication; NFC	2020-09-21 15:34:24 -04:00
Arthur Eubanks	f4f7df037e	[DIE] Remove DeadInstEliminationPass This pass is like DeadCodeEliminationPass, but only does one pass through a function instead of iterating on users of eliminated instructions. DeadCodeEliminationPass should be used in all cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87933	2020-09-21 12:12:25 -07:00
Roman Lebedev	0ab99bb314	[NFC][SCEV] Cleanup lowering of @llvm.uadd.sat, (-1 - V) is just ~V	2020-09-21 22:10:59 +03:00
Arthur Eubanks	746a2c3775	[ObjCARC] Initialize return value Mistakenly removed initialization of `Changed` in https://reviews.llvm.org/D87806.	2020-09-21 11:03:44 -07:00
Sanjay Patel	a44238cb44	[SLP] use unary shuffle creator to reduce code duplication; NFC	2020-09-21 13:54:06 -04:00
Sanjay Patel	1e6b240d7d	[IRBuilder][VectorCombine] make and use a convenience function for unary shuffle; NFC This reduces code duplication for common construct. Follow-ups can use this in SLP, LoopVectorizer, and other passes.	2020-09-21 13:47:01 -04:00
Roman Lebedev	64e2cb7e96	[SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = sub nsw nuw i32 4294967295, %y %t1 = umin i32 %x, %t0 %r = add nuw i32 %t1, %y ret i32 %r } Transformation seems to be correct! The alternative, naive, lowering could be the following, although I don't think it's better, though it will likely be needed for sadd/ssub/*shl: ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = zext i32 %x to i33 %t1 = zext i32 %y to i33 %t2 = add nuw i33 %t0, %t1 %t3 = zext i32 4294967295 to i33 %t4 = umin i33 %t2, %t3 %r = trunc i33 %t4 to i32 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	fedc9549d5	[SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = usub_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = umin i32 %x, %y %r = sub nuw i32 %x, %t0 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	1bb7ab8c4a	[SCEV] Recognize @llvm.abs as smax(x, -x) As per alive2 (ignoring undef): ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 0 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct! ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 1 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul nsw i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:53 +03:00
Simon Pilgrim	005f826a05	[SLP] Use for-range loops across ValueLists. NFCI. Also rename some existing loops that used a 'j' iterator to consistently use 'V'.	2020-09-21 18:24:23 +01:00
Sanjay Patel	46075e0b78	[SLP] simplify interface for gather(); NFC The implementation of gather() should be reduced too, but this change by itself makes things a little clearer: we don't try to gather to a different type or number-of-values than whatever is passed in as the value list itself.	2020-09-21 12:57:28 -04:00
Simon Pilgrim	6a0ed57a22	ImplicitNullChecks.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI.	2020-09-21 17:42:57 +01:00
Arthur Eubanks	024979b7b6	[ObjCARC][NewPM] Port objc-arc-contract to NPM Similar to https://reviews.llvm.org/D86178. This is a module pass instead of a function pass since ARCRuntimeEntryPoints can lazily add function declarations. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D87806	2020-09-21 09:40:14 -07:00
Momchil Velikov	742250bf62	[ARM][CMSE] Issue an error if passing arguments through memory across security boundary It was never supported and that part was accidentally omitted when upstreaming D76518. Differential Revision: https://reviews.llvm.org/D86478 Change-Id: If6ba9506eb0431c87a1d42a38aa60e47ce263039	2020-09-21 17:26:10 +01:00
Simon Pilgrim	3ae07b2a33	TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 17:17:11 +01:00
Simon Pilgrim	3ddecfd220	SLPVectorizer.cpp - fix include ordering. NFCI.	2020-09-21 17:17:11 +01:00
Simon Pilgrim	ce294ff8cd	MachineCSE.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 16:54:26 +01:00
Simon Pilgrim	53f1748c13	ProfileSummary.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 16:54:26 +01:00
David Sherwood	96e52c1364	[SVE][CodeGen] Mark ptrue/pfalse instructions as rematerializable	2020-09-21 16:44:32 +01:00
Baptiste Saleil	1372e23c7d	[PowerPC] Add vector pair load/store instructions and vector pair register class This patch adds support for the lxvp, lxvpx, plxvp, stxvp, stxvpx and pstxvp instructions in the PowerPC backend. These instructions allow loading and storing VSX register pairs. This patch also adds the VSRp register class definition needed for these instructions. Differential Revision: https://reviews.llvm.org/D84359	2020-09-21 10:27:47 -05:00
Arthur Eubanks	5249e6f248	[LoopSimplifyCFG][NewPM] Rename simplify-cfg -> loop-simplifycfg This matches the legacy PM name and makes all tests in Transforms/LoopSimplifyCFG pass under NPM. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87948	2020-09-21 08:27:19 -07:00
Alexey Bataev	3ff07fcd54	[SLP] Allow reordering of vectorization trees with reused instructions. If some leaves have the same instructions to be vectorized, we may incorrectly evaluate the best order for the root node (it is built for the vector of instructions without repeated instructions and, thus, has fewer elements than the root node). In this case we simply cannot try to reorder the tree, and we may calculate the wrong number of nodes that require the same reordering. For example, if the root node is <a+b, a+c, a+d, f+e>, then the leaves are <a, a, a, f> and <b, c, d, e>. When we try to vectorize the first leaf, it will be shrunk to <a, b>. If instructions in this leaf should be reordered, the best order will be <1, 0>. We need to extend this order for the root node. For the root node this order should look like <3, 0, 1, 2>. This patch allows extension of the orders of the nodes with the reused instructions. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D45263	2020-09-21 10:51:03 -04:00
Simon Pilgrim	2ef2abdec2	DWARFEmitter.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI.	2020-09-21 15:33:09 +01:00
Paul C. Anagnostopoulos	bd55d5b2a1	Change comments about order of classes in superclass list.	2020-09-21 10:25:44 -04:00
Simon Pilgrim	82042a2c9b	DWARFYAML::emitDebugSections - remove unnecessary cantFail(success) call. NFCI. As mentioned on rG6bb912336804.	2020-09-21 14:07:11 +01:00
Denis Antrushin	ee86688b81	[Statepoints][ISEL] gc.relocate uniquification should be based on SDValue, not IR Value. When exporting statepoint results to virtual registers we try to avoid generating exports for duplicated inputs. But we erroneously use IR Value* to check if inputs are duplicated. Instead, we should use SDValue, because even different IR values can get lowered to the same SDValue. I'm adding a (degenerate) test case which emphasizes the importance of this feature for invoke statepoints. If we fail to export only unique values we will end up with something like this: %0 = STATEPOINT %1 = COPY %0 landing_pad: <use of %1> And when the exceptional path is taken, %1 is left uninitialized (COPY is never executed). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87695	2020-09-21 19:44:46 +07:00
Paul Walker	f3fa954b5b	[SVE] Change definition of reduction ISD nodes to have an SVE vector result type. The current nodes, AArch64::SMAXV_PRED for example, are defined to return a NEON vector result. This is incorrect because they modify the complete SVE register and are thus changed to represent such. This patch also adds nodes for UADDV_PRED and SADDV_PRED, which unifies the handling of all SVE reductions. NOTE: Floating-point reductions are already implemented correctly, so this patch is essentially making everything consistent with those. Differential Revision: https://reviews.llvm.org/D87843	2020-09-21 13:16:28 +01:00
Paul Walker	6457455248	[SVE] Use NEON for extract_vector_elt when the index is in range. Patch also adds missing patterns for unpacked vector types and extracts of element zero. Differential Revision: https://reviews.llvm.org/D87842	2020-09-21 13:12:28 +01:00
Alexander Belyaev	17dc729bd4	Revert "[NFC][ScheduleDAG] Remove unused EntrySU SUnit" This reverts commit `0345d88de6`. Google internal backend uses EntrySU, we are looking into removing dependency on it. Differential Revision: https://reviews.llvm.org/D88018	2020-09-21 13:33:05 +02:00
Florian Hahn	11dccf8d3a	Recommit "[SCEV] Look through single value PHIs." This commit was originally reverted because it was suspected to cause a crash, but a reproducer did not surface. A crash that was exposed by this change was fixed in `1d8f2e5292`. This reverts the revert commit `0581c0b0ee`.	2020-09-21 11:59:50 +01:00
David Green	f4c5cadbcb	[ARM] Select f32 constants with vmov.f16 This adds lowering for f32 values using the vmov.f16, which zeroes the top bits whilst setting the lower bits to a pattern. This range of values does not often come up, except where a f16 constant value has been converted to a f32. Differential Revision: https://reviews.llvm.org/D87790	2020-09-21 11:10:47 +01:00
Sjoerd Meijer	4b8ade837e	[AArch64] Cortex-A55 scheduler model This is an initial commit adding the A55 model, but it isn't used/enabled yet. We will follow up on this to improve the model, then flip the switch. The optimisation guide describing Cortex-A55 micro-architecture in more detail can be found here: https://static.docs.arm.com/epm128372/20/arm_cortex_a55_software_optimization_guide_v2.pdf Original patch by Javed Absar. Differential Revision: https://reviews.llvm.org/D46884	2020-09-21 10:54:32 +01:00
Alex Richardson	8cf6778d30	[RISC-V] Implement RISCVInstrInfo::isCopyInstrImpl() This does not result in changes for any of the current tests, but it might improve debug information in some cases. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D86522	2020-09-21 10:21:11 +01:00
Lucas Prates	53d238a961	[CodeGen] Fixing inconsistent ABI mangling of values in SelectionDAGBuilder SelectionDAGBuilder was inconsistently mangling values based on ABI Calling Conventions when getting them through copyFromRegs in SelectionDAGBuilder, causing duplicate value type conversions for function arguments. The checking for the mangling requirement was based on the value's originating instruction and was performed outside of, and in spite of, the regular Calling Convention Lowering. The issue could be observed in a scenario such as: ``` %arg1 = load half, half* %const, align 2 %arg2 = call fastcc half @someFunc() call fastcc void @otherFunc(half %arg1, half %arg2) ; Here, %arg2 was incorrectly mangled twice, as the CallConv data from ; the call to @someFunc() was taken into consideration for the check ; when getting the value for processing the call to @otherFunc(...), ; after the proper conversion had taken place when lowering the return ; value of the first call. ``` This patch fixes the issue by disregarding the Calling Convention information for such copyFromRegs, making sure the ABI mangling is properly contained in the Calling Convention Lowering. This fixes Bugzilla #47454. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87844	2020-09-21 10:05:34 +01:00
Florian Hahn	57ae9bb932	[LSR] Preserve MSSA when using SplitCriticalEdge. LSR claims to preserve MemorySSA, but we also have to make sure it is preserved when splitting critical edges. This can be done by passing MSSAU to SplitCriticalEdge. Fixes PR47557.	2020-09-21 09:51:26 +01:00
Fangrui Song	dbc616e982	[EHStreamer] Fix a "Continue to action" -fverbose-asm comment when multi-byte LEB128 encoding is needed This only happens with more than 64 action records and it is difficult to construct a test.	2020-09-20 21:41:48 -07:00
Qiu Chaofan	1d782c2987	[PowerPC] Pass nofpexcept flag to custom lowered constrained ops This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down. This is also seen on X86 target. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87390	2020-09-21 10:44:25 +08:00
Fangrui Song	d06485685d	[XRay] Change mips to use version 2 sled (PC-relative address) Follow-up to D78590. All targets use PC-relative addresses now. Reviewed By: atanasyan, dberris Differential Revision: https://reviews.llvm.org/D87977	2020-09-20 17:59:57 -07:00
Craig Topper	a74b1faba2	[X86] Make reduceMaskedLoadToScalarLoad/reduceMaskedStoreToScalarStore work for avx512 after type legalization. The scalar elements of the vXi1 build_vector will have been type legalized to i8 by padding with 0s. So we can't check for all ones. Instead we should just look at bit 0 of the constant. Differential Revision: https://reviews.llvm.org/D87863	2020-09-20 13:54:20 -07:00
Craig Topper	4e8c028158	[X86] Stop reduceMaskedLoadToScalarLoad/reduceMaskedStoreToScalarStore from creating scalar i64 load/stores in 32-bit mode If we emit a scalar i64 load/store it will get type legalized to two i32 load/stores. Differential Revision: https://reviews.llvm.org/D87862	2020-09-20 13:46:59 -07:00
David Green	29bd8ea110	[ARM] Constant fold VMOVrh This adds simple constant folding for VMOVrh, to constant fold fp16 constants to integer values. It can help especially with soft calling conventions, but some of the results are not optimal as we end up loading using a vldr. This will be improved in a follow up patch. Differential Revision: https://reviews.llvm.org/D87789	2020-09-20 21:32:51 +01:00
Nikita Popov	445db89b53	[LVI] Get value range from mask comparison InstCombine likes to canonicalize comparisons of the form X == C || X == C+1 into (X & -2) == C'. Make sure LVI can still recover the value range from this. Can of course also be useful for proper mask comparisons. For the sake of clarity, the implementation goes through KnownBits to compute the range.	2020-09-20 21:13:57 +02:00
Nikita Popov	f94bbe19b6	[LVI] Refactor getValueFromICmpCondition (NFC) Rewrite this in a way where the core logic is in a separate function, that is invoked with swapped operands. This makes it easier to add handling for additional icmp patterns.	2020-09-20 21:13:57 +02:00
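
The mask-comparison idea in the `[LVI] Get value range from mask comparison` entry can be illustrated with a small sketch. This is not the LLVM implementation; it is a standalone Python model, assuming an 8-bit unsigned value and a hypothetical helper name `range_from_mask_compare`. Given `(X & mask) == c`, the bits selected by `mask` are known to equal the bits of `c`, and the remaining bits are unknown, so the smallest possible value sets every unknown bit to 0 and the largest sets every unknown bit to 1.

```python
BITS = 8                     # assumed bit width for this illustration
FULL = (1 << BITS) - 1       # all-ones value for that width


def range_from_mask_compare(mask, c):
    """Return unsigned (lo, hi) bounds for X given (X & mask) == c.

    Hypothetical helper for illustration only. Returns None when the
    condition can never hold (c has a bit set outside the mask).
    """
    mask &= FULL
    c &= FULL
    unknown = ~mask & FULL   # bits the comparison says nothing about
    if c & unknown:
        return None          # (X & mask) can never have these bits set
    lo = c                   # every unknown bit is 0
    hi = c | unknown         # every unknown bit is 1
    return lo, hi


# (X & -2) == 4 is the canonical form of X == 4 || X == 5.
print(range_from_mask_compare(-2, 4))
```

The bounds are conservative: for a mask with non-contiguous unknown bits, some values between `lo` and `hi` may not satisfy the comparison, but every value that does satisfy it lies inside the range.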

139349 Commits