llvm-project

Commit Graph

Author	SHA1	Message	Date
Amara Emerson	520ab710fb	Revert "Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")" This reverts commit `8693ddc743`. Re-committing with the test requiring asserts.	2020-09-01 14:29:04 -07:00
Jordan Rupprecht	8693ddc743	Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.") This reverts commit `8ad8f484b6`. It causes crashes when running `ninja check-llvm-codegen-aarch64-globalisel`, e.g. http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132/steps/test-stage1-compiler/logs/stdio. Note that the crash does not seem to reproduce in debug builds. `5ded444252` depends on this, so revert that too.	2020-09-01 13:31:57 -07:00
Florian Hahn	0d966ae4b2	[Loads] Add canReplacePointersIfEqual helper. This patch adds an initial, incomeplete and unsound implementation of canReplacePointersIfEqual to check if a pointer value A can be replaced by another pointer value B, that are deemed to be equivalent through some means (e.g. information from conditions). Note that is in general not sound to blindly replace pointers based on equality, for example if they are based on different underlying objects. LLVM's memory model is not completely settled as of now; see https://bugs.llvm.org/show_bug.cgi?id=34548 for a more detailed discussion. The initial version of canReplacePointersIfEqual only rejects a very specific case: replacing a pointer with a constant expression that is not dereferenceable. Such a replacement is problematic and can be restricted relatively easily without impacting most code. Using it to limit replacements in GVN/SCCP/CVP only results in small differences in 7 programs out of MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto. This patch is supposed to be an initial step to improve the current situation and the helper should be made stricter in the future. But this will require careful analysis of the impact on performance. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D85524	2020-09-01 20:57:41 +01:00
Arthur Eubanks	96f0b57568	[Bindings] Add LLVMAddInstructionSimplifyPass Reviewed By: sroland Differential Revision: https://reviews.llvm.org/D86764	2020-09-01 12:38:49 -07:00
Amara Emerson	8ad8f484b6	[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _) This is needed for an upcoming change to how we translate conditional branches which might generate these. Differential Revision: https://reviews.llvm.org/D86383	2020-09-01 10:57:17 -07:00
Matt Arsenault	32a8a10b42	GlobalISel: Implement computeNumSignBits for G_SELECT	2020-09-01 12:50:19 -04:00
Matt Arsenault	759482ddaa	GlobalISel: Implement computeKnownBits for G_BSWAP and G_BITREVERSE	2020-09-01 12:49:57 -04:00
Volkan Keles	061182b7ba	GlobalISel: Add combines for extend operations https://reviews.llvm.org/D86516	2020-09-01 08:50:06 -07:00
Matt Arsenault	92090e8bd8	GlobalISel: Implement computeKnownBits for G_UNMERGE_VALUES	2020-09-01 11:19:27 -04:00
Matt Arsenault	18bbd9f15e	GlobalISel: Artifact combine unmerge of unmerge Unmerges have the same fundamental problem as G_TRUNC, and G_TRUNC could be implemented in terms of G_UNMERGE_VALUES. Reducing the number of elements in unmerge results ends up producing the original unmerge type profile, so the artifact combiner needs to eliminate the intermediate illegal registers. This avoids infinite looping in the legalizer in a future change. Assuming an unmerge has each result unmerged the same way, this ends up producing a new unmerge of the source for every definition. I'm not sure if the artifact combiner should either insert temporary merges here and erase the original merge, or if the combiner should look at uses from defs rather than defs from uses for unmerges. In a few cases this regresses from using 16-bit shifts for 8-bit values to using 32-bit shifts, but I think these can be legalized later (the other legalization rules don't try very hard to use 16-bit shifts either).	2020-09-01 11:01:33 -04:00
Anh Tuyen Tran	68717acb24	[LoopIdiomRecognizePass] Options to disable part or the entire Loop Idiom Recognize Pass Loop Idiom Recognize Pass (LIRP) attempts to transform loops with subscripted arrays into memcpy/memset function calls. In some particular situation, this transformation introduces negative impacts. For example: https://bugs.llvm.org/show_bug.cgi?id=47300 This patch will enable users to disable a particular part of the transformation, while he/she can still enjoy the benefit brought about by the rest of LIRP. The default behavior stays unchanged: no part of LIRP is disabled by default. Reviewed By: etiotto (Ettore Tiotto) Differential Revision: https://reviews.llvm.org/D86262	2020-09-01 13:59:24 +00:00
Raphael Isemann	5ffd940ac0	Reland [FileCheck] Move FileCheck implementation out of LLVMSupport into its own library This relands `e9a3d1a401` which was originally missing linking LLVMSupport into LLMVFileCheck which broke the SHARED_LIBS build. Original summary: The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a good reason for having FileCheck implemented there as it has a very specific use while LLVMSupport is a dependency of pretty much every LLVM tool there is. In fact, the only use of FileCheck I could find (outside the FileCheck tool and the FileCheck unit test) is a single call in GISelMITest.h. This moves the FileCheck logic to its own LLVMFileCheck library. This way only FileCheck and the GlobalISelTests now have a dependency on this code. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86344	2020-09-01 14:59:28 +02:00
Raphael Isemann	7c80f2da81	Revert "[lldb] Add reproducer verifier" This reverts commit `297f69afac`. It broke the Fedora 33 x86-64 bot. See the review for more info.	2020-09-01 12:21:44 +02:00
David Sherwood	9fbb113247	[SVE][CodeGen] Fix TypeSize/ElementCount related warnings in sve-split-load.ll I have fixed up a number of warnings resulting from TypeSize -> uint64_t casts and calling getVectorNumElements() on scalable vector types. I think most of the changes are fairly trivial except for those in DAGTypeLegalizer::SplitVecRes_MLOAD I've tried to ensure we create the MachineMemoryOperands in a sensible way for scalable vectors. I have added a CHECK line to the following test: CodeGen/AArch64/sve-split-load.ll that ensures no new warnings are added. Differential Revision: https://reviews.llvm.org/D86697	2020-09-01 07:47:59 +01:00
Petr Hosek	3c7bfbd683	[CMake] Use find_library for ncurses Currently it is hard to avoid having LLVM link to the system install of ncurses, since it uses check_library_exists to find e.g. libtinfo and not find_library or find_package. With this change the ncurses lib is found with find_library, which also considers CMAKE_PREFIX_PATH. This solves an issue for the spack package manager, where we want to use the zlib installed by spack, and spack provides the CMAKE_PREFIX_PATH for it. This is a similar change as https://reviews.llvm.org/D79219, which just landed in master. Patch By: haampie Differential Revision: https://reviews.llvm.org/D85820	2020-08-31 20:06:21 -07:00
Xing GUO	428b2ffad4	[DWARFYAML] Make the debug_str section optional. This patch makes the debug_str section optional. When the debug_str section exists but doesn't contain anything, yaml2obj will emit a section header for it. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D86860	2020-09-01 10:02:09 +08:00
Jonas Devlieghere	297f69afac	[lldb] Add reproducer verifier Add a reproducer verifier that catches: - Missing or invalid home directory - Missing or invalid working directory - Missing or invalid module/symbol paths - Missing files from the VFS The verifier is enabled by default during replay, but can be skipped by passing --reproducer-no-verify. Differential revision: https://reviews.llvm.org/D86497	2020-08-31 15:14:18 -07:00
Christopher Tetreault	867de151a5	[SVE] Mark VectorType::getNumElements() deprecated getNumElements() is being removed from base VectorType in order to eliminate the class of bugs in which a scalable vector is accidentally treated like a fixed length vector. Clients of this function should either call getElementCount(), and handle the case where getElementCount().isScalable() is true, or they can cast to FixedVectorType and call getNumElements() if they are sure that the vector has fixed width. Deprecated VectorType functions will be removed after the LLVM 12 branch. See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D78127	2020-08-31 15:13:04 -07:00
Sanjay Patel	e25449ff57	[IR][GVN] allow intrinsics in Instruction's isCommutative query (2nd try) The 1st try was reverted because I missed an assert that needed softening. As discussed in D86798 / rG09652721 , we were potentially returning a different result for whether an Instruction is commutable depending on if we call the base class or derived class method. This requires relaxing asserts in GVN, but that pass seems to be working otherwise. NewGVN requires more work because it uses different code paths for numbering binops and calls.	2020-08-31 16:01:19 -04:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Raphael Isemann	ed89eb3571	Revert "[FileCheck] Move FileCheck implementation out of LLVMSupport into its own library" This reverts commit `e9a3d1a401`. Seems the new FileCheck library doesn't link on some bots. Reverting for now.	2020-08-31 11:38:40 +02:00
Raphael Isemann	e9a3d1a401	[FileCheck] Move FileCheck implementation out of LLVMSupport into its own library The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a good reason for having FileCheck implemented there as it has a very specific use while LLVMSupport is a dependency of pretty much every LLVM tool there is. In fact, the only use of FileCheck I could find (outside the FileCheck tool and the FileCheck unit test) is a single call in GISelMITest.h. This moves the FileCheck logic to its own LLVMFileCheck library. This way only FileCheck and the GlobalISelTests now have a dependency on this code. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86344	2020-08-31 11:24:41 +02:00
Sanjay Patel	badd7264e1	Revert "[IR][GVN] allow intrinsics in Instruction's isCommutative query" This reverts commit `25597f7783`. It is causing crashing on bots such as: http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10523/steps/ninja-build/logs/stdio	2020-08-30 17:02:51 -04:00
Sanjay Patel	25597f7783	[IR][GVN] allow intrinsics in Instruction's isCommutative query As discussed in D86798 / rG09652721 , we were potentially returning a different result for whether an Instruction is commutable depending on if we call the base class or derived class method. This requires relaxing an assert in GVN, but that pass seems to be working otherwise. NewGVN requires more work because it uses different code paths for numbering binops and calls.	2020-08-30 16:49:22 -04:00
Sanjay Patel	2d3e12818e	[FastISel] update to use intrinsic's isCommutative(); NFC This requires adding a missing 'const' to the definition because the callers are using const args, but there should be no change in behavior. The intrinsic method was added with D86798 / rG096527214033	2020-08-30 11:36:41 -04:00
David Green	543c5425f1	[LV] Add some const to RecurrenceDescriptor. NFC	2020-08-30 12:27:51 +01:00
sstefan1	8d8ce85b23	[Attributor] Introduce module slice. Summary: The module slice describes which functions we can analyze and transform while working on an SCC as part of the Attributor-CGSCC pass. So far we simply restricted it to the SCC. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D86319	2020-08-30 10:30:44 +02:00
Lang Hames	e1d5f7d003	[ORC] Add getDFSLinkOrder / getReverseDFSLinkOrder methods to JITDylib. DFS and Reverse-DFS linkage orders are used to order execution of deinitializers and initializers respectively. This patch replaces uses of special purpose DFS order functions in MachOPlatform and LLJIT with uses of the new methods.	2020-08-29 15:17:06 -07:00
Benjamin Kramer	8e5b1557e5	[IR] Inline AttrBuilder::addAttribute. It just sets 1 bit. NFC.	2020-08-29 19:13:49 +02:00
Sanjay Patel	0965272140	[EarlyCSE] fold commutable intrinsics Handling the new min/max intrinsics is the motivation, but it turns out that we have a bunch of other intrinsics with this missing bit of analysis too. The FP min/max tests show that we are intersecting FMF, so that part should be safe too. As noted in https://llvm.org/PR46897 , there is a commutative property specifier for intrinsics, but no corresponding function attribute, and so apparently no uses of that bit. We may want to remove that next. Follow-up patches should wire up the Instruction::isCommutative() to this IntrinsicInst specialization. That requires updating callers to be aware of the more general commutative property (not just binops). Differential Revision: https://reviews.llvm.org/D86798	2020-08-29 12:11:01 -04:00
Martin Storsjö	20f7773bb4	[MC] [Win64EH] Fill in FuncletOrFuncEnd if missing This can happen e.g. for code that declare .seh_proc/.seh_endproc in assembly, or for code that use .seh_handlerdata (which triggers the unwind info to be emitted before the end of the function). The TextSection field must be made non-const to be able to use it with Streamer.SwitchSection(). Differential Revision: https://reviews.llvm.org/D86528	2020-08-29 15:15:22 +03:00
Roman Lebedev	c1b3e32118	[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable See https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html and https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html InstSimply is not allowed to perform simplifications to instructions that are not def-reachable from the original instruction.	2020-08-29 09:58:08 +03:00
Roman Lebedev	08669fbb43	[NFC][STLExtras] Add make_first_range(), similar to existing make_second_range() Having just one of the two seens weird. I wanted to use it a few times, but it wasn't there.	2020-08-29 09:58:07 +03:00
Xing GUO	12e832cbcb	[DWARFYAML] Make the debug_abbrev_offset field optional. This patch helps make the debug_abbrev_offset field optional. We don't need to calculate the value of this field in the future. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86614	2020-08-29 14:54:52 +08:00
Fangrui Song	b5ef137c11	[gcov] Increment counters with atomicrmw if -fsanitize=thread Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously because non-atomic counter increments can be detected as data races.	2020-08-28 16:32:35 -07:00
Matt Arsenault	1b201914b5	GlobalISel: Combine out redundant sext_inreg The scalar tests don't work yet, since computeNumSignBits apparently doesn't handle sextload yet, and sext folds into the load first.	2020-08-28 17:57:31 -04:00
Craig Topper	aab90384a3	[Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None) There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None. So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline. This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations. Differential Revision: https://reviews.llvm.org/D86744	2020-08-28 13:23:45 -07:00
Arthur Eubanks	cfde93e5d6	[ObjCARCOpt] Port objc-arc to NPM Since doInitialization() in the legacy pass modifies the module, the NPM pass is a Module pass. Reviewed By: ahatanak, ychen Differential Revision: https://reviews.llvm.org/D86178	2020-08-28 12:59:33 -07:00
Tyker	6d3657417e	[SROA] Improve handleling of assumes bundles by SROA This patch fixes this crash https://gcc.godbolt.org/z/Ps8d1e And gives SROA the ability to remove assumes if it allows promoting an alloca to register Without removing assumes when it can't promote to register. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86570	2020-08-28 21:55:45 +02:00
Snehasish Kumar	94faadaca4	[llvm][CodeGen] Machine Function Splitter We introduce a codegen optimization pass which splits functions into hot and cold parts. This pass leverages the basic block sections feature recently introduced in LLVM from the Propeller project. The pass targets functions with profile coverage, identifies cold blocks and moves them to a separate section. The linker groups all cold blocks across functions together, decreasing fragmentation and improving icache and itlb utilization. We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017. For clang bootstrap we observe a mean 2.33% runtime improvement with a ~32% reduction in itlb and stlb misses. Additionally, L1 icache misses reduced by 9.5% while L2 instruction misses reduced by 20%. For SPECInt we report the change in IntRate the C/C++ benchmarks. All benchmarks apart from mcf and x264 improve, on average by 0.6% with the max for deepsjeng at 1.6%. Benchmark % Change 500.perlbench_r 0.78 502.gcc_r 0.82 505.mcf_r -0.30 520.omnetpp_r 0.18 523.xalancbmk_r 0.37 525.x264_r -0.46 531.deepsjeng_r 1.61 541.leela_r 0.83 557.xz_r 0.15 Differential Revision: https://reviews.llvm.org/D85368	2020-08-28 11:10:14 -07:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Sam Parker	b30adfb529	[ARM][LowOverheadLoops] Liveouts and reductions Remove the code that tried to look for reduction patterns, since the vectorizer and isel can now produce predicated arithmetic instructios within the loop body. This has required some reorganisation and fixes around live-out and predication checks, as well as looking for cases where an input/output is initialised to zero. Differential Revision: https://reviews.llvm.org/D86613	2020-08-28 13:56:16 +01:00
Martin Storsjö	37ef743cbf	[MC] [Win64EH] Avoid producing malformed xdata records If there's no unwinding opcodes, omit writing the xdata/pdata records. Previously, this generated truncated xdata records, and llvm-readobj would error out when trying to print them. If writing of an xdata record is forced via the .seh_handlerdata directive, skip it if there's no info to make a sensible unwind info structure out of, and clearly error out if such info appeared later in the process. Differential Revision: https://reviews.llvm.org/D86527	2020-08-28 09:05:36 +03:00
serge-sans-paille	b1f4e5979b	(Expensive) Check for Loop, SCC and Region pass return status This generalizes the logic introduced in https://reviews.llvm.org/D80916 to other passes. It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report their status. Differential Revision: https://reviews.llvm.org/D86589	2020-08-28 07:56:35 +02:00
Valentin Clement	4df2a5f782	[flang][openacc] Add check for tile clause restriction The tile clause in OpenACC 3.0 imposes some restriction. Element in the tile size list are either * or a constant positive integer expression. If there are n tile sizes in the list, the loop construct must be immediately followed by n tightly-nested loops. This patch implement these restrictions and add some tests. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D86655	2020-08-27 22:13:46 -04:00
Harmen Stoppels	cdcb9ab10e	Revert "Use find_library for ncurses" The introduction of find_library for ncurses caused more issues than it solved problems. The current open issue is it makes the static build of LLVM fail. It is better to revert for now, and get back to it later. Revert "[CMake] Fix an issue where get_system_libname creates an empty regex capture on windows" This reverts commit `1ed1e16ab8`. Revert "Fix msan build" This reverts commit `34fe9613dd`. Revert "[CMake] Always mark terminfo as unavailable on Windows" This reverts commit `76bf26236f`. Revert "[CMake] Fix OCaml build failure because of absolute path in system libs" This reverts commit `8e4acb82f7`. Revert "[CMake] Don't look for terminfo libs when LLVM_ENABLE_TERMINFO=OFF" This reverts commit `495f91fd33`. Revert "Use find_library for ncurses" This reverts commit `a52173a3e5`. Differential revision: https://reviews.llvm.org/D86521	2020-08-27 17:57:26 -07:00
Matt Arsenault	abc99ab572	GlobalISel: Implement known bits for min/max	2020-08-27 16:56:17 -04:00
Matt Arsenault	ee679638d7	MIR: Infer not-SSA for subregister defs It's possible to have a single virtual register def with a subreg index that would pass the previous check, but it's not possible to have a subregister def in SSA. This is in preparation for adding stricter checks for SSA MIR.	2020-08-27 16:56:16 -04:00
Vitaly Buka	a6927c8621	[NFC][ValueTracking] Add OffsetZero into findAllocaForValue For StackLifetime after finding alloca we need to check that values ponting to the begining of alloca. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D86692	2020-08-27 13:46:22 -07:00
Matt Arsenault	201f770f16	GlobalISel: Add and_trivial_mask to all_combines Also make up a new category of combines.	2020-08-27 16:42:09 -04:00
Eli Friedman	8d21985a75	[RegisterScavenging] Delete dead function unprocess().	2020-08-27 13:19:32 -07:00
Shinji Okumura	58d257b290	[Attributor] Do not add AA to dependency graph after the update stage If an AA is registered to the dependency graph in the manifest stage, Attributor aborts in `::manifestAttributes()`. This patch prevents such termination. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86734	2020-08-28 05:16:18 +09:00
Shinji Okumura	c5e6872ec6	[Attributor] Guarantee getAAFor not to update AA in the manifestation stage If we query an AA with `Attributor::getAAFor` in `AbstractAttribute::manifest`, the AA may be updated. This patch makes use of the phase flag in Attributor, and handle `getAAFor` behavior according to the flag. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86635	2020-08-28 04:07:42 +09:00
Christopher Tetreault	5a55e2781c	[SVE] Remove calls to VectorType::getNumElements from IR Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D81500	2020-08-27 11:16:10 -07:00
Matt Arsenault	531f7063ba	GlobalISel: Implement known bits for G_MERGE_VALUES	2020-08-27 14:07:18 -04:00
Mikhail Maltsev	ae1396c7d4	[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics This patch adjusts the following ARM/AArch64 LLVM IR intrinsics: - neon_bfmmla - neon_bfmlalb - neon_bfmlalt so that they take and return bf16 and float types. Previously these intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from implementation lacking bf16 IR type). The neon_vbfdot[q] intrinsics are adjusted similarly. This change required some additional selection patterns for vbfdot itself and also for vector shuffles (in a previous patch) because of SelectionDAG transformations kicking in and mangling the original code. This patch makes the generated IR cleaner (less useless bitcasts are produced), but it does not affect the final assembly. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D86146	2020-08-27 18:43:16 +01:00
Aditya Nandakumar	db464a3dbf	[GISel] Add new GISel combiners for G_SELECT https://reviews.llvm.org/D83833 Patch adds two new GICombinerRules for G_SELECT. The rules include: combining selects with undef comparisons into their first selectee value, and to combine away selects with constant comparisons. Patch additionally adds a new combiner test for the AArch64 target to test these new G_SELECT combiner rules and the existing select_same_val combiner rule. Patch by mkitzan	2020-08-27 09:40:15 -07:00
Shinji Okumura	7a68f0f1e0	[Attributor] Add a phase flag to Attributor Add a new flag that indicates which stage in the process we are in. This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86678	2020-08-28 01:16:38 +09:00
Aditya Nandakumar	5c2db1655b	[GISel]: Fix one more CSE Non determinism https://reviews.llvm.org/D86676 Sometimes we can have the following code x:gpr(s32) = G_OP Say we build G_OP2 to the same x and then delete the previous instruction. Using something like Register X = ...; auto NewMIB = CSEBuilder.buildOp2(X, ... args); Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (ie it doesn't consider register bank/register class along with type).Unify the profiling by refactoring and calling the common method. This was found by turning on the CSEInfo::verify in at the end of each of our GISel passes which turns inconsistent state/non determinism in CSEing into crashes which likely usually indicates missing calls to Observer on mutations (the most common case). Here non determinism usually means not cseing sometimes, but almost never about producing incorrect code. Also this patch adds this verification at the end of the combiners as well.	2020-08-27 09:06:21 -07:00
Teresa Johnson	7ed8124d46	[HeapProf] Clang and LLVM support for heap profiling instrumentation See RFC for background: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html Note that the runtime changes will be sent separately (hopefully this week, need to add some tests). This patch includes the LLVM pass to instrument memory accesses with either inline sequences to increment the access count in the shadow location, or alternatively to call into the runtime. It also changes calls to memset/memcpy/memmove to the equivalent runtime version. The pass is modeled on the address sanitizer pass. The clang changes add the driver option to invoke the new pass, and to link with the upcoming heap profiling runtime libraries. Currently there is no attempt to optimize the instrumentation, e.g. to aggregate updates to the same memory allocation. That will be implemented as follow on work. Differential Revision: https://reviews.llvm.org/D85948	2020-08-27 08:50:35 -07:00
diggerlin	6923b0a76e	Revert "[AIX][XCOFF] emit symbol visibility for xcoff object file." This reverts commit `a081868921`. Based on the Hubert Tong'comment https://reviews.llvm.org/D84265#inline-799085	2020-08-27 11:07:58 -04:00
OCHyams	0b5a8050ea	[DwarfDebug] Improve single location detection in validThroughout (2/4) With this patch we're now accounting for two more cases which should be considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where RangeEnd comes before ScopeEnd when including meta instructions, but are both preceded by the same non-meta instruction. CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds. `llvm-locstats` (using D85636) shows a very small variable location coverage change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines up with my expectations. I've added a test which checks both of these new cases. The first check in the test isn't strictly necessary for this patch. But I'm not sure that it is explicitly tested anywhere else, and is useful for the final patch in the series. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D86151	2020-08-27 11:52:29 +01:00
Shinji Okumura	6c25eca614	[Attributor] Add flag for undef value to the state of AAPotentialValues Currently, an undef value is reduced to 0 when it is added to a set of potential values. This patch introduces a flag for under values. By this, for example, we can merge two states `{undef}`, `{1}` to `{1}` (because we can reduce the undef to 1). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85592	2020-08-27 16:30:29 +09:00
Jianzhou Zhao	df182eb2d5	Fix an overflow issue at BackpatchWord This happens when generating a huge file by LTO, for example, with -gmlt. When BitNo is > 2^35, ByteNo is overflowed, and an incorrect output offset is overwritten. This generates ill-formed bitcodes. Reviewed-by: tejohnson, vitalybuka Differential Revision: https://reviews.llvm.org/D86645	2020-08-27 04:46:19 +00:00
Amy Kwan	76b0f99ea8	[PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang This patch implements the function prototypes vec_mulh and vec_dive in order to utilize the vector multiply high (vmulh[s\|u][w\|d]) and vector divide extended (vdive[s\|u][w\|d]) instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82609	2020-08-26 23:14:34 -05:00
Matt Arsenault	0b7f6cc71a	GlobalISel: Add generic instructions for memory intrinsics AArch64, X86 and Mips currently directly consumes these and custom lowering to produce a libcall, but really these should follow the normal legalization process through the libcall/lower action.	2020-08-26 20:08:45 -04:00
Lang Hames	605df8112c	[ORC][JITLink] Switch to unique ownership for EHFrameRegistrars. This will make stateful registrars (e.g. a future TargetProcessControl based registrar) easier to deal with.	2020-08-26 16:59:45 -07:00
Arthur Eubanks	486ed88533	[ConstProp] Remove ConstantPropagation As discussed in http://lists.llvm.org/pipermail/llvm-dev/2020-July/143801.html. Currently no users outside of unit tests. Replace all instances in tests of -constprop with -instsimplify. Notable changes in tests: * vscale.ll - @llvm.sadd.sat.nxv16i8 is evaluated by instsimplify, use a fake intrinsic instead * InsertElement.ll - insertelement undef is removed by instsimplify in @insertelement_undef llvm/test/Transforms/ConstProp moved to llvm/test/Transforms/InstSimplify/ConstProp Reviewed By: lattner, nikic Differential Revision: https://reviews.llvm.org/D85159	2020-08-26 15:51:30 -07:00
Juneyoung Lee	0c55889d80	[IR] Remove noundef from masked store/load/gather/scatter's pointer operands As discussed in D86576, noundef attribute is removed from masked store/load/gather/scatter's pointer operands. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86656	2020-08-27 06:41:43 +09:00
Wei Mi	c67ccf5faf	[SampleFDO] Enhance profile remapping support for searching inline instance and indirect call promotion candidate. Profile remapping is a feature to match a function in the module with its profile in sample profile if the function name and the name in profile look different but are equivalent using given remapping rules. This is a useful feature to keep the performance stable by specifying some remapping rules when sampleFDO targets are going through some large scale function signature change. However, currently profile remapping support is only valid for outline function profile in SampleFDO. It cannot match a callee with an inline instance profile if they have different but equivalent names. We found that without the support for inline instance profile, remapping is less effective for some large scale change. To add that support, before any remapping lookup happens, all the names in the profile will be inserted into remapper and the Key to the name mapping will be recorded in a map called NameMap in the remapper. During name lookup, a Key will be returned for the given name and it will be used to extract an equivalent name in the profile from NameMap. So with the help of the NameMap, we can translate any given name to an equivalent name in the profile if it exists. Whenever we try to match a name in the module to a name in the profile, we will try the match with the original name first, and if it doesn't match, we will use the equivalent name got from remapper to try the match for another time. In this way, the patch can enhance the profile remapping support for searching inline instance and searching indirect call promotion candidate. In a planned large scale change of int64 type (long long) to int64_t (long), we found the performance of a google internal benchmark degraded by 2% if nothing was done. If existing profile remapping was enabled, the performance degradation dropped to 1.2%. If the profile remapping with the current patch was enabled, the performance degradation further dropped to 0.14% (Note the experiment was done before searching indirect call promotion candidate was added. We hope with the remapping support of searching indirect call promotion candidate, the degradation can drop to 0% in the end. It will be evaluated post commit). Differential Revision: https://reviews.llvm.org/D86332	2020-08-26 11:07:35 -07:00
Juneyoung Lee	684b43c0cf	[IR] Add NoUndef attribute to Intrinsics.td This patch adds NoUndef to Intrinsics.td. The attribute is attached to llvm.assume's operand, because llvm.assume(undef) is UB. It is attached to pointer operands of several memory accessing intrinsics as well. This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check unnecessary, so it is removed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86576	2020-08-27 02:54:48 +09:00
Roman Lebedev	95848ea101	[Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks As FIXME said, they really should be checking for a single user, not use, so let's do that. It is not that unusual to have the same value as incoming value in a PHI node, not unlike how a PHI may have the same incoming basic block more than once. There isn't a nice way to do that, Value::users() isn't uniqified, and Value only tracks it's uses, not Users, so the check is potentially costly since it does indeed potentially involes traversing the entire use list of a value.	2020-08-26 20:20:41 +03:00
Roman Lebedev	c07a430bd3	[NFC][Value] Fixup comments, "N users" is NOT the same as "N uses". In those cases, it really means "N uses".	2020-08-26 20:20:41 +03:00
Kai Nacke	ed07e1fe0f	[SystemZ/ZOS] Add header file to encapsulate use of <sysexits.h> The non-standard header file `<sysexits.h>` provides some return values. `EX_IOERR` is used to as a special value to signal a broken pipe to the clang driver. On z/OS Unix System Services, this header file does not exists. This patch - adds a check for `<sysexits.h>`, removing the dependency on `LLVM_ON_UNIX` - adds a new header file `llvm/Support/ExitCodes`, which either includes `<sysexits.h>` or defines `EX_IOERR` - updates the users of `EX_IOERR` to include the new header file Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D83472	2020-08-26 12:44:30 -04:00
Sjoerd Meijer	bda8fbe2d2	[LV] Fallback strategies if tail-folding fails This implements 2 different vectorisation fallback strategies if tail-folding fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue. This can be controlled with option -prefer-predicate-over-epilogue, that has been changed to take a numeric value corresponding to the tail-folding preference and preferred fallback. Patch by: Pierre van Houtryve, Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D79783	2020-08-26 16:55:25 +01:00
Dibya Ranjan Mishra	a7da7e421c	[Support] Allow printing the stack trace only for a given depth Differential Revision: https://reviews.llvm.org/D85458	2020-08-26 09:27:42 -04:00
Matt Arsenault	eb074088c9	GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD This produces less work for addressing mode matching. I think this is safe since I don't think machine IR is supposed to give the same aliasing properties as getelementptr in the IR.	2020-08-26 08:57:15 -04:00
Xing GUO	8daa3264a3	[DWARFYAML] Make the unit_length and header_length fields optional. This patch makes the unit_length and header_length fields of line tables optional. yaml2obj is able to infer them for us. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86590	2020-08-26 20:35:10 +08:00
QingShan Zhang	ebf3b188c6	[Scheduling] Implement a new way to cluster loads/stores Before calling target hook to determine if two loads/stores are clusterable, we put them into different groups to avoid fake cluster due to dependency. For now, we are putting the loads/stores into the same group if they have the same predecessor. We assume that, if two loads/stores have the same predecessor, it is likely that, they didn't have dependency for each other. However, one SUnit might have several predecessors and for now, we just pick up the first predecessor that has non-data/non-artificial dependency, which is too arbitrary. And we are struggling to fix it. So, I am proposing some better implementation. 1. Collect all the loads/stores that has memory info first to reduce the complexity. 2. Sort these loads/stores so that we can stop the seeking as early as possible. 3. For each load/store, seeking for the first non-dependency instruction with the sorted order, and check if they can cluster or not. Reviewed By: Jay Foad Differential Revision: https://reviews.llvm.org/D85517	2020-08-26 12:33:59 +00:00
Georgii Rymar	92c527e5a2	[llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should. Currently, `dyn_cast<XCOFFObjectFile>` always does cast and returns a pointer, even when we pass `ELF`/`Wasm`/`Mach-O` or `COFF` instead of `XCOFF`. It happens because `XCOFFObjectFile` class does not implement `classof`. I've fixed it and added a unit test. Differential revision: https://reviews.llvm.org/D86542	2020-08-26 15:09:55 +03:00
Gabriel Hjort Åkerlund	5e23dc5b47	[GlobalISel] Fix and tidy up documentation for ValueMapping class (NFC) The documentation was missing a '/' in '/<2x32-bit> vadd {0, 64, VPR}', and the example code are now aligned to improve readability. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86201	2020-08-26 12:09:01 +02:00
sstefan1	99d18f7964	Reland [IR] Intrinsics default attributes and opt-out flag Intrinsic properties can now be set to default and applied to all intrinsics. If the attributes are not needed, the user can opt-out by setting the DisableDefaultAttributes flag to true. Differential Revision: https://reviews.llvm.org/D70365	2020-08-26 11:37:59 +02:00
Jan Kratochvil	b20a4e293c	[Support] Speedup llvm-dwarfdump 3.9x Currently `strace llvm-dwarfdump x.debug >/tmp/file`: ioctl(1, TCGETS, 0x7ffd64d7f340) = -1 ENOTTY (Inappropriate ioctl for device) write(1, " DW_AT_decl_line\t(89)\n"..., 4096) = 4096 ioctl(1, TCGETS, 0x7ffd64d7f400) = -1 ENOTTY (Inappropriate ioctl for device) ioctl(1, TCGETS, 0x7ffd64d7f410) = -1 ENOTTY (Inappropriate ioctl for device) ioctl(1, TCGETS, 0x7ffd64d7f400) = -1 ENOTTY (Inappropriate ioctl for device) After this patch: write(1, "0000000000001102 \"strlen\")\n "..., 4096) = 4096 write(1, "site\n DW_AT_low"..., 4096) = 4096 write(1, "d53)\n\n0x000e4d4d: DW_TAG_G"..., 4096) = 4096 The same speedup can be achieved by `--color=0` but that is not much convenient. This implementation has been suggested by Joerg Sonnenberger. Differential Revision: https://reviews.llvm.org/D86406	2020-08-26 10:29:46 +02:00
Shinji Okumura	3050713798	[Attributor] Provide an edge-based interface in AAIsDead This patch produces an edge-based interface in AAIsDead. By this, we can query a set of basic blocks that are directly reachable from a given basic block. This is specifically useful for implementation of AAReachability. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85547	2020-08-26 16:57:52 +09:00
Adrien Guinet	c6f7ac0071	[llvm-lipo] Add support for bitcode files A Mach-O universal binary may contain bitcode as a slice. This diff adds proper handling of such binaries to llvm-lipo. Test plan: make check-all Differential revision: https://reviews.llvm.org/D85740	2020-08-25 21:11:18 -07:00
Mircea Trofin	7cfcecece0	[MLInliner] Simplify TFUTILS_SUPPORTED_TYPES We only need the C++ type and the corresponding TF Enum. The other parameter was used for the output spec json file, but we can just standardize on the C++ type name there. Differential Revision: https://reviews.llvm.org/D86549	2020-08-25 14:19:39 -07:00
Krzysztof Parzyszek	514d6e9a8d	[SDAG] Improve MemSDNode::getBasePtr It returned getOperand(1), except for STORE for which it returned getOperand(2). Handle MSTORE, MGATHER, and MSCATTER as well.	2020-08-25 15:19:52 -05:00
Juneyoung Lee	f753f5b050	[ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands. Instead of special-casing llvm.assume, I think it is also a viable option to add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86477	2020-08-26 04:40:21 +09:00
Lang Hames	594107d488	[ORC] Fix an endif comment.	2020-08-25 11:51:20 -07:00
Jeremy Morse	121a49d839	[LiveDebugValues] Add switches for using instr-ref variable locations This patch adds the -Xclang option "-fexperimental-debug-variable-locations" and same LLVM CodeGen option, to pick which variable location tracking solution to use. Right now all the switch does is pick which LiveDebugValues implementation to use, the normal VarLoc one or the instruction referencing one in rGae6f78824031. Over time, the aim is to add fragments of support in aid of the value-tracking RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html also controlled by this command line switch. That will slowly move variable locations to be defined by an instruction calculating a value, and a DBG_INSTR_REF instruction referring to that value. Thus, this is going to grow into a "use the new kind of variable locations" switch, rather than just "use the new LiveDebugValues implementation". Differential Revision: https://reviews.llvm.org/D83048	2020-08-25 14:58:48 +01:00
David Sherwood	7b64765cd1	[SVE] Fix TypeSize related warnings with IR truncates of scalable vectors In getCastInstrCost when the instruction is a truncate we were relying upon the implicit TypeSize -> uint64_t cast when asking if a given type has the same size as a legal integer. I've changed the code to only ask the question if the type is fixed length. I have also changed InstCombinerImpl::SimplifyDemandedUseBits to bail out for now if the type is a scalable vector. I've added the following new tests: Analysis/CostModel/AArch64/sve-trunc.ll Transforms/InstCombine/AArch64/sve-trunc.ll for both of these fixes. Differential revision: https://reviews.llvm.org/D86432	2020-08-25 09:17:56 +01:00
Freddy Ye	e02d081f2b	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compare with Icelake Server, it includes 14 more new features. They are amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Fangrui Song	44ee9d070a	Revert D85812 "[coroutine] should disable inline before calling coro split" This reverts commit `2e43acfed8`. LLVMCoroutines (the library which contains Coroutines.h) depends on LLVMipo (the library which contains SampleProfile.cpp). It is inappropriate for SampleProfile.cpp to depent on Coroutines.h (circular dependency). The test inverted dependencies as well: llvm/test/Transforms/Coroutines/coro-inline.ll uses -sample-profile.	2020-08-24 11:41:05 -07:00
Matt Arsenault	116affb18d	TableGen/GlobalISel: Allow inst matcher to check multiple opcodes This is to initially handleg immAllOnesV, which should match G_BUILD_VECTOR or G_BUILD_VECTOR_TRUNC. In the future, it could be used for other patterns cases that map to multiple G_* instructions, such as G_ADD and G_PTR_ADD.	2020-08-24 13:48:51 -04:00
dongAxis	2e43acfed8	[coroutine] should disable inline before calling coro split summary: When callee coroutine function is inlined into caller coroutine function before coro-split pass, llvm will emits "coroutine should have exactly one defining @llvm.coro.begin". It seems that coro-early pass can not handle this quiet well. So we believe that unsplited coroutine function should not be inlined. This patch fix such issue by not inlining function if it has attribute "coroutine.presplit" (it means the function has not been splited) to fix this issue TestPlan: check-llvm Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D85812	2020-08-24 22:22:08 +08:00
Francesco Petrogalli	5a34b3ab95	[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] Changes: * Change `ToVectorTy` to deal directly with `ElementCount` instances. * `VF == 1` replaced with `VF.isScalar()`. * `VF > 1` and `VF >=2` replaced with `VF.isVector()`. * `VF <=1` is replaced with `VF.isZero() \|\| VF.isScalar()`. * Replaced the uses of `llvm::SmallSet<ElementCount, ...>` with `llvm::SmallSetVector<ElementCount, ...>`. This avoids the need of an ordering function for the `ElementCount` class. * Bits and pieces around printing the `ElementCount` to string streams. To guarantee that this change is a NFC, `VF.Min` and asserts are used in the following places: 1. When it doesn't make sense to deal with the scalable property, for example: a. When computing unrolling factors. b. When shuffle masks are built for fixed width vector types In this cases, an assert(!VF.Scalable && "<mgs>") has been added to make sure we don't enter coepaths that don't make sense for scalable vectors. 2. When there is a conscious decision to use `FixedVectorType`. These uses of `FixedVectorType` will likely be removed in favour of `VectorType` once the vectorizer is generic enough to deal with both fixed vector types and scalable vector types. 3. When dealing with building constants out of the value of VF, for example when computing the vectorization `step`, or building vectors of indices. These operation _make sense_ for scalable vectors too, but changing the code in these places to be generic and make it work for scalable vectors is to be submitted in a separate patch, as it is a functional change. 4. When building the potential VFs in VPlan. Making the VPlan generic enough to handle scalable vectorization factors is a functional change that needs a separate patch. See for example `void LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned MaxVF)`. 5. The class `IntrinsicCostAttribute`: this class still uses `unsigned VF` as updating the field to use `ElementCount` woudl require changes that could result in changing the behavior of the compiler. Will be done in a separate patch. 7. When dealing with user input for forcing the vectorization factor. In this case, adding support for scalable vectorization is a functional change that migh require changes at command line. Note that in some places the idiom ``` unsigned VF = ... auto VTy = FixedVectorType::get(ScalarTy, VF) ``` has been replaced with ``` ElementCount VF = ... assert(!VF.Scalable && ...); auto VTy = VectorType::get(ScalarTy, VF) ``` The assertion guarantees that the new code is (at least in debug mode) functionally equivalent to the old version. Notice that this change had been possible because none of the methods that are specific to `FixedVectorType` were used after the instantiation of `VTy`. Reviewed By: rengolin, ctetreau Differential Revision: https://reviews.llvm.org/D85794	2020-08-24 13:54:03 +00:00
Francesco Petrogalli	bad7d6b373	Revert "[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]" Reverting because the commit message doesn't reflect the one agreed on phabricator at https://reviews.llvm.org/D85794. This reverts commit `c8d2b065b9`.	2020-08-24 13:50:55 +00:00
Matt Arsenault	e1644a3779	GlobalISel: Reduce G_SHL width if source is extension shl ([sza]ext x, y) => zext (shl x, y). Turns expensive 64 bit shifts into 32 bit if it does not overflow the source type: This is a port of an AMDGPU DAG combine added in `5fa289f0d8`. InstCombine does this already, but we need to do it again here to apply it to shifts introduced for lowered getelementptrs. This will help matching addressing modes that use 32-bit offsets in a future patch. TableGen annoyingly assumes only a single match data operand, so introduce a reusable struct. However, this still requires defining a separate GIMatchData for every combine which is still annoying. Adds a morally equivalent function to the existing getShiftAmountTy. Without this, we would have to do try to repeatedly query the legalizer info and guess at what type to use for the shift.	2020-08-24 09:42:40 -04:00
Francesco Petrogalli	c8d2b065b9	[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] Changes: * Change `ToVectorTy` to deal directly with `ElementCount` instances. * `VF == 1` replaced with `VF.isScalar()`. * `VF > 1` and `VF >=2` replaced with `VF.isVector()`. * `VF <=1` is replaced with `VF.isZero() \|\| VF.isScalar()`. * Add `<` operator to `ElementCount` to be able to use `llvm::SmallSetVector<ElementCount, ...>`. * Bits and pieces around printing the ElementCount to string streams. * Added a static method to `ElementCount` to represent a scalar. To guarantee that this change is a NFC, `VF.Min` and asserts are used in the following places: 1. When it doesn't make sense to deal with the scalable property, for example: a. When computing unrolling factors. b. When shuffle masks are built for fixed width vector types In this cases, an assert(!VF.Scalable && "<mgs>") has been added to make sure we don't enter coepaths that don't make sense for scalable vectors. 2. When there is a conscious decision to use `FixedVectorType`. These uses of `FixedVectorType` will likely be removed in favour of `VectorType` once the vectorizer is generic enough to deal with both fixed vector types and scalable vector types. 3. When dealing with building constants out of the value of VF, for example when computing the vectorization `step`, or building vectors of indices. These operation _make sense_ for scalable vectors too, but changing the code in these places to be generic and make it work for scalable vectors is to be submitted in a separate patch, as it is a functional change. 4. When building the potential VFs in VPlan. Making the VPlan generic enough to handle scalable vectorization factors is a functional change that needs a separate patch. See for example `void LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned MaxVF)`. 5. The class `IntrinsicCostAttribute`: this class still uses `unsigned VF` as updating the field to use `ElementCount` woudl require changes that could result in changing the behavior of the compiler. Will be done in a separate patch. 7. When dealing with user input for forcing the vectorization factor. In this case, adding support for scalable vectorization is a functional change that migh require changes at command line. Differential Revision: https://reviews.llvm.org/D85794	2020-08-24 13:39:42 +00:00
Sam Parker	4ce176bed2	[SCEV] Still (again) trying to fix buildbots	2020-08-24 11:24:30 +01:00
Sam Parker	2e194fe73b	[SCEV] Still trying to fix windows buildbots	2020-08-24 10:26:48 +01:00
Sam Parker	e286c600e1	[SCEV] Attempt to fix windows buildbots	2020-08-24 08:29:22 +01:00
Sam Parker	b999400a4f	[SCEV] Add operand methods to Cast and UDiv Add methods to access operands in a similar manner to NAryExpr. Differential Revision: https://reviews.llvm.org/D86083	2020-08-24 06:57:07 +01:00
Craig Topper	cc7bf9bcbf	[X86] Allow 32-bit mode only CPUs with -mtune on 64-bit targets gcc errors on this, but I'm nervous that since -mtune has been ignored by clang for so long that there may be code bases out there that pass 32-bit cpus to clang.	2020-08-22 16:38:05 -07:00
Matt Arsenault	901e3317fe	GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS This switches from using G_EXTRACT in odd cases to widen with undef and unmerge.	2020-08-22 10:25:53 -04:00
Florian Hahn	5e7e2162d4	[DSE,MemorySSA] Use BatchAA for AA queries. We can use BatchAA to avoid some repeated AA queries. We only remove stores, so I think we will get away with using a single BatchAA instance for the complete run. The changes in AliasAnalysis.h mirror the changes in D85583. The change improves compile-time by roughly 1%. http://llvm-compile-time-tracker.com/compare.php?from=67ad786353dfcc7633c65de11601d7823746378e&to=10529e5b43809808e8c198f88fffd8f756554e45&stat=instructions This is part of the patches to bring down compile-time to the level referenced in http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86275	2020-08-22 08:36:35 +01:00
Sourabh Singh Tomar	f91d18eaa9	[DebugInfo][flang]Added support for representing Fortran assumed length strings This patch adds support for representing Fortran `character(n)`. Primarily patch is based out of D54114 with appropriate modifications. Test case IR is generated using our downstream classic-flang. We're in process of upstreaming flang PR's but classic-flang has dependencies on llvm, so this has to get in first. Patch includes functional test case for both IR and corresponding dwarf, furthermore it has been manually tested as well using GDB. Source snippet: ``` program assumedLength call sub('Hello') call sub('Goodbye') contains subroutine sub(string) implicit none character(len=), intent(in) :: string print , string end subroutine sub end program assumedLength ``` GDB: ``` (gdb) ptype string type = character (5) (gdb) p string $1 = 'Hello' ``` Reviewed By: aprantl, schweitz Differential Revision: https://reviews.llvm.org/D86305	2020-08-22 10:13:40 +05:30
Alina Sbirlea	f55ad3973d	[DomTree] Extend update API to allow a post CFG view. Extend the `applyUpdates` in DominatorTree to allow a post CFG view, different from the current CFG. This patch implements the functionality of updating an already up to date DT, to the desired PostCFGView. Combining a set of updates towards an up to date DT and a PostCFGView is not yet supported. Differential Revision: https://reviews.llvm.org/D85472	2020-08-21 17:23:08 -07:00
Roman Lebedev	503deec218	Temporairly revert "[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline" As disscussed in post-commit review starting with https://reviews.llvm.org/D84108#2227365 while this appears to be mostly a win overall, especially code-size-wise, this appears to shake //certain// code pattens in a way that is extremely unfavorable for performance (+30% runtime regression) on certain CPU's (i personally can't reproduce). So until the behaviour is better understood, and a path forward is mapped, let's back this out for now. This reverts commit `1d51dc38d8`.	2020-08-22 00:33:22 +03:00
Nicolai Hähnle	b37db11d95	MachineSSAUpdater: Allow initialization with just a register class The register class is required for inserting PHIs, but the "current virtual register" isn't actually used for anything, so let's remove it while we're at it. Differential Revision: https://reviews.llvm.org/D85602 Change-Id: I1e647f31570ef21a7ea8e20db3454178e98a6a8b	2020-08-21 23:04:35 +02:00
Alina Sbirlea	7ea0ee3058	[DomTree] Avoid creating an empty GD to reduce compile time.	2020-08-21 13:45:00 -07:00
Serguei Katkov	9e362bb0eb	[InstCombine] Remove unused entries in gc-live bundle of statepoint If some of gc live value are not used in gc.relocate we can remove them from gc-live bundle of statepoint instruction. Also the CL removes duplicated Values in gc-live bundle. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D85959	2020-08-22 01:36:22 +07:00
Kamau Bridgeman	365f861c45	[PowerPC][PCRelative] Thread Local Storage Support for Initial Exec This patch is the initial support for the Intial Exec Thread Local Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D81947	2020-08-21 10:13:11 -05:00
diggerlin	a081868921	[AIX][XCOFF] emit symbol visibility for xcoff object file. SUMMARY: Reviewers: Jason liu Differential Revision: https://reviews.llvm.org/D84265	2020-08-21 11:00:56 -04:00
Florian Hahn	8eded24bf4	Recommit "[SCEVExpander] Add helper to clean up instrs inserted while expanding." Recommit the patch after fixing an issue reported caused by the fact that re-used values are also added to InsertedValues. Additional tests have been added in `88818491b9` This reverts the revert commit `38884641f2`.	2020-08-21 15:04:17 +01:00
Xing GUO	f5643dc3dc	Recommit: [DWARFYAML] Add support for referencing different abbrev tables. The original commit (7ff0ace96db9164dcde232c36cab6519ea4fce8) was causing build failure and was reverted in `6d242a7326` ==================== Original Commit Message ==================== This patch adds support for referencing different abbrev tables. We use 'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly assign an abbrev table to compilation units. The syntax is: ``` debug_abbrev: - ID: 0 Table: ... - ID: 1 Table: ... debug_info: - ... AbbrevTableID: 1 ## Reference the second abbrev table. - ... AbbrevTableID: 0 ## Reference the first abbrev table. ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D83116	2020-08-21 19:02:10 +08:00
Roman Lebedev	5d7c5a5e99	[NFC] Port InstCount pass to new pass manager	2020-08-21 12:39:42 +03:00
Yevgeny Rouban	18bc400f97	[NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks Both AfterPass and AfterPassInvalidated pass instrumentation callbacks get additional parameter of type PreservedAnalyses. This patch was created by @fedor.sergeev. I have just slightly changed it. Reviewers: fedor.sergeev Differential Revision: https://reviews.llvm.org/D81555	2020-08-21 16:10:42 +07:00
David Green	2b69efded0	[ARM][LV] Add a preferPredicatedReductionSelect target hook As part of D84741, this adds a target hook for the preferPredicatedReductionSelect option and makes use of it under MVE, allowing us to tail predicate most reduction loops. Differential Revision: https://reviews.llvm.org/D85980	2020-08-21 08:48:12 +01:00
Qiu Chaofan	91039784b3	[PowerPC] Add readflm/setflm intrinsics to Clang Commit `dbcfbffc` adds ppc.readflm and ppc.setflm intrinsics to read or write FPSCR register. This patch adds them to Clang. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D85874	2020-08-21 15:12:19 +08:00
Xing GUO	6d242a7326	Revert "[DWARFYAML] Add support for referencing different abbrev tables." This reverts commit `f7ff0ace96`. This change is causing build failure. http://lab.llvm.org:8011/builders/clang-cmake-armv7-global-isel/builds/10400	2020-08-21 12:15:54 +08:00
Yevgeny Rouban	7d9a16241f	[ADT] Allow IsSizeLessThanThresholdT for incomplete types. NFC If the type T is incomplete then sizeof(T) results in C++ compilation error at line: static constexpr bool value = sizeof(T) <= (2 * sizeof(void *)); This patch allows incomplete types in parameters of function. Example: using SomeFunc = void(SomeIncompleteType &); llvm::unique_function<SomeFuncType> SomeFunc; Reviewers: DaniilSuchkov, vvereschaka Differential Revision: https://reviews.llvm.org/D81554	2020-08-21 11:01:57 +07:00
Xing GUO	f7ff0ace96	[DWARFYAML] Add support for referencing different abbrev tables. This patch adds support for referencing different abbrev tables. We use 'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly assign an abbrev table to compilation units. The syntax is: ``` debug_abbrev: - ID: 0 Table: ... - ID: 1 Table: ... debug_info: - ... AbbrevTableID: 1 ## Reference the second abbrev table. - ... AbbrevTableID: 0 ## Reference the first abbrev table. ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D83116	2020-08-21 11:44:25 +08:00
Xing GUO	290e399f96	[DWARFYAML] Add support for emitting multiple abbrev tables. This patch adds support for emitting multiple abbrev tables. Currently, compilation units will always reference the first abbrev table. Reviewed By: jhenderson, labath Differential Revision: https://reviews.llvm.org/D86194	2020-08-21 10:12:08 +08:00
Michael Liao	5257a60ee0	[amdgpu] Add codegen support for HIP dynamic shared memory. Summary: - HIP uses an unsized extern array `extern __shared__ T s[]` to declare the dynamic shared memory, which size is not known at the compile time. Reviewers: arsenm, yaxunl, kpyzhov, b-sumner Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82496	2020-08-20 21:29:18 -04:00
Jon Roelofs	74ca5275e9	Fix a couple of typos. NFC	2020-08-20 14:56:57 -06:00
Kamau Bridgeman	b74b80bb2d	[PowerPC][PCRelative] Thread Local Storage Support for General Dynamic This patch is the initial support for the General Dynamic Thread Local Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: NeHuang Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D82315	2020-08-20 15:08:13 -05:00
Simon Pilgrim	0ee23b286a	Fix Wdocumentation unknown parameter warning. NFC.	2020-08-20 12:41:34 +01:00
Vitaly Buka	ebdc886b5f	[APInt] Allow self-assignment with libstdc++ http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/8256/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Athinlto-function-summary-paramaccess.ll Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D86053	2020-08-20 04:14:40 -07:00
Sebastian Neubauer	b8d1994778	[AMDGPU] Add A16/G16 to InstCombine When sampling from images with coordinates that only have 16 bit accuracy, convert the image intrinsic call to use a16 or g16. This does only happen if the target hardware supports it. An alternative would be to always apply this combination, independent of the target hardware and extend 16 bit arguments to 32 bit arguments during legalization. To me, this sounds like an unnecessary roundtrip that could prevent some further InstCombine optimizations. Differential Revision: https://reviews.llvm.org/D85887	2020-08-20 10:51:49 +02:00
Georgii Rymar	a6436b0b3a	[yaml2obj] - Make the 'Machine' key optional. Currently we have to set 'Machine' to something in our YAML descriptions. Usually we use 'EM_X86_64' for 64-bit targets and 'EM_386' for 32-bit targets. At the same time, in fact, in most cases our tests do not need a machine type and we can use 'EM_NONE'. This is cleaner, because avoids the need of using a particular machine. In this patch I've made the 'Machine' key optional (the default value, when it is not specified is `EM_NONE`) and removed it (where possible) from yaml2obj, obj2yaml and llvm-readobj tests. There are few tests left where I decided not to remove it, because I didn't want to touch CHECK lines or doing anything more complex than a removing a "Machine: *" line and formatting lines around. Differential revision: https://reviews.llvm.org/D86202	2020-08-20 11:40:51 +03:00
Bevin Hansson	f03b10f57e	[IR] Add FixedPointBuilder. This patch adds a convenience class for using FixedPointSemantics to build fixed-point operations in IR. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D85314	2020-08-20 10:29:57 +02:00
Bevin Hansson	1a995a0af3	[ADT] Move FixedPoint.h from Clang to LLVM. This patch moves FixedPointSemantics and APFixedPoint from Clang to LLVM ADT. This will make it easier to use the fixed-point classes in LLVM for constructing an IR builder for fixed-point and for reusing the APFixedPoint class for constant evaluation purposes. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html Reviewed By: leonardchan, rjmccall Differential Revision: https://reviews.llvm.org/D85312	2020-08-20 10:29:45 +02:00
Johannes Doerfert	2f38c755ba	Revert "[IR] Intrinsics default attributes and opt-out flag" This commit introduced a non-trivial compile time regression that needs to be addressed: https://reviews.llvm.org/D70365#2227627 Given that it is unclear how long that will take, I'll revert it for now. This reverts commit `eedf18fc1f`.	2020-08-20 00:25:32 -05:00
Matt Arsenault	31adc28d24	GlobalISel: Implement fewerElementsVector for G_CONCAT_VECTORS sources This fixes <6 x s16> = G_CONCAT_VECTORS from <3 x s16> handling.	2020-08-19 18:53:24 -04:00
Francesco Petrogalli	dac0b1d330	[llvm] Add default constructor of `llvm::ElementCount`. This patch prevents failures like those reported in http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/34173. We have enabled the default constructor for `llvm::ElementCount` to make sure the code compiles on Windows. Reviewed By: ormris Differential Revision: https://reviews.llvm.org/D86240	2020-08-19 21:39:24 +00:00
Sanjay Patel	6f3511a01a	[ValueTracking] define/use max recursion depth in header There's a potential motivating case to increase this limit in PR47191: http://bugs.llvm.org/PR47191 But first we should make it less hacky. The limit in InstCombine is directly tied to this value because an increase there can cause asserts in the underlying value tracking calls if not changed together. The usage in VectorUtils is independent, but the comment suggests that we should use the same value unless there's a known reason to diverge. There are similar limits in codegen analysis, but I think we should leave those independent in case we intentionally want the optimization power/cost to be different there. Differential Revision: https://reviews.llvm.org/D86113	2020-08-19 16:56:59 -04:00
Matt Arsenault	adbcc8e733	GlobalISel: Add TargetLowering member to LegalizerHelper	2020-08-19 14:50:35 -04:00
Matt Arsenault	e95c08432a	GlobalISel: Use Register	2020-08-19 13:45:31 -04:00
Mehdi Amini	a407ec9b6d	Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private."" Was reverted because MLIR/Flang builds were broken, these APIs have been fixed in the meantime.	2020-08-19 17:26:36 +00:00
Mehdi Amini	4fc56d70aa	Revert "[NFC][llvm] Make the contructors of `ElementCount` private." This reverts commit `264afb9e6a`. (and dependent `6b742cc48` and `fc53bd610f`) MLIR/Flang are broken.	2020-08-19 17:21:37 +00:00
Jessica Paquette	d25b12bdc3	[GlobalISel] Add combine for (x & mask) -> x when (x & mask) == x If we have a mask, and a value x, where (x & mask) == x, we can drop the AND and just use x. This is about a 0.4% geomean code size improvement on CTMark at -O3 for AArch64. In AArch64, this is most useful post-legalization. Patterns like this often show up when legalizing s1s, which must be extended to larger types. e.g. ``` %cmp:_(s32) = G_ICMP ... %and:_(s32) = G_AND %cmp, 1 ``` Since G_ICMP only produces a single bit, there's no reason to mask it with the G_AND. Differential Revision: https://reviews.llvm.org/D85463	2020-08-19 10:20:57 -07:00
Francesco Petrogalli	264afb9e6a	[NFC][llvm] Make the contructors of `ElementCount` private. Differential Revision: https://reviews.llvm.org/D86120	2020-08-19 16:26:44 +00:00
Bjorn Pettersson	54105d635d	[GlobalISel] Untabify InstructionSelectorImpl.h. NFC	2020-08-19 12:00:00 +02:00
sstefan1	eedf18fc1f	[IR] Intrinsics default attributes and opt-out flag Intrinsic properties can now be set to default and applied to all intrinsics. If the attributes are not needed, the user can opt-out by setting the DisableDefaultAttributes flag to true. Differential Revision: https://reviews.llvm.org/D70365	2020-08-19 10:50:46 +02:00
Ronak Chauhan	fdf71d486c	Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors" This reverts commit `cacfb02d28`. Reverting due to buildbot failures.	2020-08-19 13:12:29 +05:30
David Sherwood	3f36561f69	[SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorLoads In DAGTypeLegalizer::GenWidenVectorLoads the algorithm assumes it only ever deals with fixed width types, hence the offsets for each individual store never take 'vscale' into account. I've changed the code in that function to use TypeSize instead of unsigned for tracking the remaining load amount. In addition, I've changed the load loop to use the new IncrementPointer helper function for updating the addresses in each iteration, since this handles scalable vector types. Also, I've added report_fatal_errors in GenWidenVectorExtLoads, TargetLowering::scalarizeVectorLoad and TargetLowering::scalarizeVectorStores, since these functions currently use a sequence of element-by-element scalar loads/stores. In a similar vein, I've also added a fatal error report in FindMemType for the case when we decide to return the element type for a scalable vector type. I've added new tests in CodeGen/AArch64/sve-split-load.ll CodeGen/AArch64/sve-ld-addressing-mode-reg-imm.ll for the changes in GenWidenVectorLoads. Differential Revision: https://reviews.llvm.org/D85909	2020-08-19 07:54:32 +01:00
Yaxun (Sam) Liu	7546b29e76	[HIP] Support target id by --offload-arch This patch introduces support of target id by -offload-arch. Differential Revision: https://reviews.llvm.org/D60620	2020-08-18 23:43:53 -04:00
Ronak Chauhan	cacfb02d28	[AMDGPU] Support disassembly for AMDGPU kernel descriptors Decode AMDGPU Kernel descriptors as assembler directives. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D80713	2020-08-19 08:49:07 +05:30
Elliott Hughes	a7d0b7a786	ld128 demangle: allow space for 'L' suffix. Summary: Caught by HWASAN on arm64 Android (which uses ld128 for long double). This was running the existing fuzzer. The specific minimized fuzz input to reproduce this is: __cxa_demangle("1\006ILeeeEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", 0, 0, 0); Reviewers: eugenis, srhines, #libc_abi! Subscribers: kristof.beyls, danielkiss, libcxx-commits Tags: #libc_abi Differential Revision: https://reviews.llvm.org/D77924	2020-08-18 16:14:05 -07:00
Jessica Paquette	bf36e90295	[GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList It's annoying to have to maintain multiple, nearly identical chains of if statements which all set the same attributes. Add a helper function, `addFlagsUsingAttrFn` which performs the attribute setting. Then, use wrappers for that function in `lowerCall` and `setArgFlags`. (Note that the flag-setting code in `setArgFlags` was missing the returned attribute. There's no selection for this yet, so no test. It's an example of the kind of thing this lets us avoid, though.) Differential Revision: https://reviews.llvm.org/D86159	2020-08-18 11:07:33 -07:00
Matt Arsenault	5a15f6628e	GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT Add unit tests since AMDGPU will only trigger this for gigantic vectors, and won't use the annoying odd sized breakdown case.	2020-08-18 13:51:19 -04:00
David Blaikie	f7a49d2aa6	[WIP][DebugInfo] Lazily parse debug_loclist offsets Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing down memory usage of symbolizers that don't need to parse this data at all. There's not much benefit to caching these anyway - since they are O(1) lookup and reading once you know where the offset list starts (and can do bounds checking with the offset list size too). In general, I think it might be time to start paying down some of the technical debt of loc/loclist/range/rnglist parsing to try to unify it a bit more. eg: * Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection, LocSectionBase, LocTable, RngListTable, LoclistTableHeader (be nice if these were all wrapped up in two variables - one for loclists, one for rnglists) * rnglists and loclists are handled differently (see: LoclistTableHeader, but no RnglistTableHeader) * maybe all these types could be less stateful - lazily parse what they need to, even reparsing rather than caching because it doesn't seem too expensive, for instance. (though admittedly so long as it's constantcost/overead per compilatiton that's probably adequate) * Maybe implementing and using a DWARFDataExtractor that can be sub-ranged (so we could slice it up to just the single contribution) - though maybe that's not so useful because loc/ranges need to refer to it by absolute, not contribution-relative mechanisms Differential Revision: https://reviews.llvm.org/D86110	2020-08-18 10:49:39 -07:00
Amara Emerson	04a6ea5d77	[GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x This is restricted to single use loads, which if we fold to sextloads we can find more optimal addressing modes on AArch64. This also fixes an overload the MachineFunction::getMachineMemOperand() method which was incorrectly using the MF alignment instead of the MMO alignment. Differential Revision: https://reviews.llvm.org/D85966	2020-08-18 10:42:15 -07:00
Amara Emerson	40e269ea6d	[GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c' By detecting this sign extend pattern early, we can uncover opportunities for more optimizations. Differential Revision: https://reviews.llvm.org/D85965	2020-08-18 10:42:15 -07:00
Jessica Paquette	224a8c639e	[GlobalISel][CallLowering] Look through call parameters for flags We weren't looking through the parameters on calls at all. E.g., say you had ``` declare i32 @zext(i32 zeroext %x) ... %y = call i32 @zext(i32 %something) ... ``` At the point of the call, we wouldn't know that the %something should have the zeroext attribute. This sets flags in about the same way as TargetLoweringBase::ArgListEntry::setAttributes. Differential Revision: https://reviews.llvm.org/D86125	2020-08-18 08:48:56 -07:00
Ronak Chauhan	7b777ee730	[ELF] Hide target specific methods as private Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86136	2020-08-18 18:26:08 +05:30
Ronak Chauhan	e760e85680	[llvm-objdump][AMDGPU] Detect CPU string AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly. However, the AMDGPU target CPU is stored in e_flags in the ELF object. This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D84519	2020-08-18 17:43:16 +05:30
Shinji Okumura	5e361e2aa4	[Attributor] Deduce noundef attribute This patch introduces a new abstract attribute `AANoUndef` which corresponds to `noundef` IR attribute and deduce them. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85184	2020-08-18 18:05:54 +09:00
Johannes Doerfert	858c75f7d1	[Attributor][NFC] Directly return proper type to avoid casts	2020-08-17 23:36:36 -05:00
Harmen Stoppels	a52173a3e5	Use find_library for ncurses Currently it is hard to avoid having LLVM link to the system install of ncurses, since it uses check_library_exists to find e.g. libtinfo and not find_library or find_package. With this change the ncurses lib is found with find_library, which also considers CMAKE_PREFIX_PATH. This solves an issue for the spack package manager, where we want to use the zlib installed by spack, and spack provides the CMAKE_PREFIX_PATH for it. This is a similar change as https://reviews.llvm.org/D79219, which just landed in master. Differential revision: https://reviews.llvm.org/D85820	2020-08-17 19:52:52 -07:00
Amy Kwan	c7ec3a7e33	[PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang This patch implements the vec_extractm function prototypes in altivec.h in order to utilize the vector extract with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82675	2020-08-17 21:14:17 -05:00
Hamilton Tobon Mosquera	496f8e5b36	[OpenMPOpt][HideMemTransfersLatency] Split __tgt_target_data_begin_mapper into its "issue" and "wait" counterparts. WIP that tries to hide the latency of runtime calls that involve host to device memory transfers by splitting them into their "issue" and "wait" versions. The "issue" is moved upwards as much as possible. The "wait" is moved downards as much as possible. The "issue" issues the memory transfer asynchronously, returning a handle. The "wait" waits in the returned handle for the memory transfer to finish. We still lack of the movement.	2020-08-17 20:56:10 -05:00
Hongtao Yu	819b2d9c79	[llvm-objdump] Symbolize binary addresses for low-noisy asm diff. When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise. In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand. So far only the X86 disassemblers are supported. Test Plan: llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 mov eax, dword ptr [rsp] cmp eax, dword ptr [rip + 4112] # 202182 <g> jge 0x20117e <_start+0x25> call 0x201158 <foo> inc dword ptr [rsp] jmp 0x201169 <_start+0x10> xor eax, eax pop rcx ret ``` llvm-objdump -d --symbolize-operands --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 <L1>: mov eax, dword ptr [rsp] cmp eax, dword ptr <g> jge <L0> call <foo> inc dword ptr [rsp] jmp <L1> <L0>: xor eax, eax pop rcx ret ``` Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This will be a problem when diffing the disassembly from two optimizers where there are unnecessary false positives due to such branch target address changes. With `--symbolize-operand`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassemble of PC-relative global variable references is also prone to instruction insertion/deletion. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D84191	2020-08-17 16:55:12 -07:00
Matt Arsenault	a128292b90	GlobalISel: Make type for lower action more consistently optional Some of the lower implementations were relying on this, however the type was not set depending on which form .lower* helper form you were using. For instance, if you used an unconditonal lower(), the type was never set. Most of the lower actions do not benefit from a type parameter, and just expand in terms of the original operation's types. However, some lowerings could benefit from an additional type hint to combine a promotion and an expansion. An example of this is for add/sub sat. The DAG integer legalization tries to use smarter expansions directly when promoting the integer type, and doesn't always produce the same instruction with a wider type. Treat this as an optional hint argument, that only means something for specific lower actions. It may be useful to generalize this mechanism to pass a full list of type indexes and desired types, but I haven't run into a case like that yet.	2020-08-17 16:24:55 -04:00
diggerlin	2f0d755d81	[AIX][XCOFF][Patch1] Provide decoding trace back table information API for xcoff object file for llvm-objdump -d SUMMARY: 1. This patch provided API for decoding the traceback table info and unit test for the these API. 2. Another patchs will do the following things: 2.1 added a new option --traceback-table to decode the trace back table information for xcoff object file when using llvm-objdump to disassemble the xcoff objfile. 2.2 print out the traceback table information for llvm-objdump. Reviewers: Jason liu, Hubert Tong, James Henderson Differential Revision: https://reviews.llvm.org/D81585	2020-08-17 16:23:47 -04:00
Dávid Bolvanský	0f14b2e6cb	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `50c743fa71`. Patch will be split to smaller ones.	2020-08-17 20:44:33 +02:00
Valentin Clement	3060894bbb	[flang][directives] Use TableGen to generate clause unparsing Use the TableGen directive back-end to generate code for the clauses unparsing. Reviewed By: sscalpone, kiranchandramohan Differential Revision: https://reviews.llvm.org/D85851	2020-08-17 14:22:25 -04:00
Matt Arsenault	5ca7c6386f	GlobalISel: Fix parameter name in doxygen comment	2020-08-17 13:57:10 -04:00
Matt Arsenault	fe171908e9	GlobalISel: Revisit users of other merge opcodes in artifact combiner The artifact combiner searches for the uses of G_MERGE_VALUES for unmerge/trunc that need further combining. This also needs to handle the vector merge opcodes the same way. This fixes leaving behind some pairs I expected to be removed, that were if the legalizer is run a second time.	2020-08-17 13:56:53 -04:00
Matt Arsenault	924f31bc3c	GlobalISel: Remove unnecessary check for copy type COPY isn't allowed to change the type, but can mix no type with type.	2020-08-17 09:19:25 -04:00
Alex Zinenko	874aef875d	[llvm] support graceful failure of DataLayout parsing Existing implementation always aborts on syntax errors in a DataLayout description. While this is meaningful for consuming textual IR modules, it is inconvenient for users that may need fine-grained control over the layout from, e.g., command-line options. Propagate errors through the parsing functions and only abort in the top-level parsing function instead. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D85650	2020-08-17 15:10:37 +02:00
Simon Pilgrim	c1f6ce0c73	[DemandedBits] Improve accuracy of Add propagator The current demand propagator for addition will mark all input bits at and right of the alive output bit as alive. But carry won't propagate beyond a bit for which both operands are zero (or one/zero in the case of subtraction) so a more accurate answer is possible given known bits. I derived a propagator by working through truth tables and using a bit-reversed addition to make demand ripple to the right, but I'm not sure how to make a convincing argument for its correctness in the comments yet. Nevertheless, here's a minimal implementation and test to get feedback. This would help in a situation where, for example, four bytes (<128) packed into an int are added with four others SIMD-style but only one of the four results is actually read. Known A: 0_______0_______0_______0_______ Known B: 0_______0_______0_______0_______ AOut: 00000000001000000000000000000000 AB, current: 00000000001111111111111111111111 AB, patch: 00000000001111111000000000000000 Committed on behalf of: @rrika (Erika) Differential Revision: https://reviews.llvm.org/D72423	2020-08-17 12:54:09 +01:00
Vitaly Buka	e10e7829bf	[StackSafety] Skip ambiguous lifetime analysis If we can't identify alloca used in lifetime marker we need to assume to worst case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-16 18:05:52 -07:00
Fady Ghanim	aaa93a681b	[OpenMP][OMPBuilder] Adding support for `omp single` This adds support for generating `omp single`, and necessary calls for `copyprivate` clause. Differential Revision: https://reviews.llvm.org/D85617	2020-08-16 01:15:16 -04:00
Wenlei He	577e58bcc7	[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context. A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose. This is a resubmit of https://reviews.llvm.org/D83743	2020-08-15 20:17:21 -07:00
Aditya Kumar	49a944af7f	[NFC] Fix typo and variable names	2020-08-15 09:06:22 -07:00
Philip Reames	6b2105456a	[Statepoint] Remove code related to inline operand bundles This code becomes dead for valid IR after `48f4312` and `a96fc46`. The reason for the test change is that the verifier reports the first verification error encountered, in some non-specified visit order. By removing the verification code in gc.relocates for a statepoint with inline gc operands, I change the error the verifier reports. And in one case, the checked for error is no longer possible with the bundle representation, so I simply delete the file.	2020-08-14 20:29:41 -07:00
Arthur Eubanks	e6ea8779c2	[NewPM][optnone] Mark various passes as required This was done by turning on -enable-npm-optnone and fixing failures. That will be enabled in a follow-up change for ease of reverting. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D85457	2020-08-14 15:51:59 -07:00
Craig Topper	c7a0b2684f	[X86][MC][Target] Initial backend support a tune CPU to support -mtune This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line. This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned. One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU. I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning. Differential Revision: https://reviews.llvm.org/D85165	2020-08-14 15:31:50 -07:00
Jordan Rupprecht	38884641f2	Temporarily revert "[SCEVExpander] Add helper to clean up instrs inserted while expanding." This reverts commit `7829c33084`. The assertion is triggering on some internal code. A reduced test case is in progress.	2020-08-14 14:52:37 -07:00
Vitaly Buka	fc4fd89852	[StackSafety] Use ValueInfo in ParamAccess::Call This avoid GUID lookup in Index.findSummaryInModule. Follow up for D81242. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D85269	2020-08-14 12:42:44 -07:00
Greg McGary	eef41efe00	[MachO] Add skeletal support for DriverKit platform Define the platform ID = 10, and simple mappings between platform ID & name. Reviewed By: MaskRay, cishida Differential Revision: https://reviews.llvm.org/D85594	2020-08-14 12:36:43 -07:00
Matt Arsenault	5c5e6d951e	TableGen/GlobalISel: Partially handle immAllOnesV/immAllZerosV These should really match either G_BUILD_VECTOR or G_BUILD_VECTOR_TRUNC, but there doesn't seem to be an existing mechanism for matching alternative opcodes. There is GIM_SwitchOpcode, but it seems to assume it's oly only used for matcher optimization. I could also omit any opcode check and rely on the matcher directly checking the opcode, but the table optimizer currently assumes there has to be an opcode check. Also doesn't try to handle undef elements like the DAG version.	2020-08-14 13:55:30 -04:00
Bjorn Pettersson	b395d67a88	[Orc] Fix werror for unused variable in noasserts build	2020-08-14 15:58:04 +02:00
Stefan Gränitz	28e1015e32	[ORC] Fix missing include in OrcRemoteTargetClient.h	2020-08-14 12:00:18 +02:00
Stefan Gränitz	6bf74a924f	[ORC] In LLLazyJIT provide public access to the CompileOnDemandLayer This is analog to how LLJIT provides public access to all its layers. Differential Revision: https://reviews.llvm.org/D85921	2020-08-14 11:34:44 +02:00
Stefan Gränitz	30c4561e36	[ORC] Add JITLink-compatible remote memory-manager and LLJITWithChildProcess example This adds RemoteJITLinkMemoryManager is a new subclass of OrcRemoteTargetClient. It implements jitlink::JITLinkMemoryManager and targets the OrcRemoteTargetRPCAPI. Behavior should be very similar to RemoteRTDyldMemoryManager. The essential differnce with JITLink is that allocations work in isolation from its memory manager. Thus, the RemoteJITLinkMemoryManager might be seen as "JITLink allocation factory". RPCMMAlloc is another subclass of OrcRemoteTargetClient and implements the actual functionality. It allocates working memory on the host and target memory on the remote target. Upon finalization working memory is copied over to the tagrte address space. Finalization can be asynchronous for JITLink allocations, but I don't see that it makes a difference here. Differential Revision: https://reviews.llvm.org/D85919	2020-08-14 11:34:44 +02:00
Yuanfang Chen	a5ed20b549	[NewPM][CodeGen] Add machine code verification callback D83608 need this. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D85916	2020-08-13 16:13:01 -07:00
Stefan Gränitz	f12db8cf75	[ORC] cloneToNewContext() can work with a const-ref to ThreadSafeModule	2020-08-13 21:01:21 +02:00
Stefan Gränitz	34a5669ccd	[ORC] Fix SymbolLookupSet::containsDuplicates()	2020-08-13 21:01:21 +02:00
Haowei Wu	d650cbc349	[elfabi] Move llvm-elfabi related code to InterfaceStub library This change moves elfabi related code to llvm/InterfaceStub library so it can be shared by multiple llvm tools without causing cyclic dependencies. Differential Revision: https://reviews.llvm.org/D85678	2020-08-13 11:51:44 -07:00
Sameer Arora	8d58eb11f9	[llvm-libtool-darwin] Refactor ArchiveWriter Refactoring function `writeArchive` in ArchiveWriter. Added a new function `writeArchiveBuffer` that returns the archive in a memory buffer instead of writing it out to the disk. This refactor is necessary so as to allow `llvm-libtool-darwin` to write universal files containing archives. Reviewed by jhenderson, MaskRay, smeenai Differential Revision: https://reviews.llvm.org/D84858	2020-08-13 10:56:30 -07:00
Dávid Bolvanský	50c743fa71	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 19:54:27 +02:00
Fangrui Song	7f8c49b016	[llvm-objdump] Change symbol name/PLT decoding errors to warnings If the referenced symbol of a J[U]MP_SLOT is invalid (e.g. symbol index 0), llvm-objdump -d will bail out: ``` error: 'a': st_name (0x326600) is past the end of the string table of size 0x7 ``` where 0x326600 is the st_name field of the first entry past the end of .symtab Change it to a warning to continue dumping. `X86/plt.test` uses a prebuilt executable, so I pick `ELF/AArch64/plt.test` which has a YAML input and can be easily modified. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85623	2020-08-13 08:13:42 -07:00
David Stenberg	e8ebebb0bd	[InstCombine] Fix incorrect Modified status When removing instructions from unreachable blocks, and only debug info intrinsics were removed, InstCombine could incorrectly return a false Modified status. This is fixed by making removeAllNonTerminatorAndEHPadInstructions() also return how many debug info intrinsics that were removed, and take that into account. This was caught using the check introduced by D80916. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D85839	2020-08-13 15:10:41 +02:00
Dávid Bolvanský	f9264995a6	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `44587e2f7e`. Sanitizer tests need to be updated.	2020-08-13 14:37:40 +02:00
Dávid Bolvanský	44587e2f7e	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 14:23:58 +02:00
Dávid Bolvanský	a0485421d2	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit `385c9d673f`.	2020-08-13 12:59:15 +02:00
Dávid Bolvanský	385c9d673f	[BPI] Improve static heuristics for integer comparisons Similarly as for pointers, even for integers a == b is usually false. GCC also uses this heuristic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D85781	2020-08-13 12:45:40 +02:00
Xing GUO	b7d5d1ec64	[DWARFYAML] Replace InitialLength with Format and Length. NFC. This change replaces the InitialLength of pub-tables with Format and Length. All the InitialLength fields have been removed. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85880	2020-08-13 18:39:03 +08:00
David Sherwood	6af1677161	[SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorStores In DAGTypeLegalizer::GenWidenVectorStores the algorithm assumes it only ever deals with fixed width types, hence the offsets for each individual store never take 'vscale' into account. I've changed the main loop in that function to use TypeSize instead of unsigned for tracking the remaining store amount and offset increment. In addition, I've changed the loop to use the new IncrementPointer helper function for updating the addresses in each iteration, since this handles scalable vector types. Whilst fixing this function I also fixed a minor issue in IncrementPointer whereby we were not adding the no-unsigned-wrap flag for the add instruction in the same way as the fixed width case does. Also, I've added a report_fatal_error in GenWidenVectorTruncStores, since this code currently uses a sequence of element-by-element scalar stores. I've added new tests in CodeGen/AArch64/sve-intrinsics-stores.ll CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll for the changes in GenWidenVectorStores. Differential Revision: https://reviews.llvm.org/D84937	2020-08-13 11:07:17 +01:00
David Sherwood	3ec3fcb97a	[CodeGen] In narrowExtractedVectorLoad bail out for scalable vectors In narrowExtractedVectorLoad there is an optimisation that tries to combine extract_subvector with a narrowing vector load. At the moment this produces warnings due to the incorrect calls to getVectorNumElements() for scalable vector types. I've got this working for scalable vectors too when the extract subvector index is a multiple of the minimum number of elements. I have added a new variant of the function: MachineFunction::getMachineMemOperand that copies an existing MachineMemOperand, but replaces the pointer info with a null version since we cannot currently represent scaled offsets. I've added a new test for this particular case in: CodeGen/AArch64/sve-extract-subvector.ll Differential Revision: https://reviews.llvm.org/D83950	2020-08-13 10:46:18 +01:00
Amara Emerson	2ff14957e8	[GlobalISel] Implement bit-test switch table optimization. This is mostly a straight port from SelectionDAG. We re-use the actual bit-test analysis part from SwitchLoweringUtils, which was factored out earlier to support jump-tables. Differential Revision: https://reviews.llvm.org/D85233	2020-08-12 11:31:39 -07:00
Christopher Tetreault	1da09b7214	[SVE] Remove default-false VectorType::get Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84212	2020-08-12 10:37:05 -07:00
David Green	f07f17ac7c	[Scheduler] Fix typo in comments. NFC	2020-08-12 18:36:05 +01:00
Xing GUO	e891b6a75d	[DWARFYAML] Make the address size of compilation units optional. This patch makes the 'AddrSize' field optional. If the address size is missing, yaml2obj will infer it from the object file. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85805	2020-08-12 21:47:32 +08:00
Igor Kudrin	9ceb192e14	[llvm-dwarfdump] Avoid crashing if an abbreviation offset is invalid. Note that DWARFUnit::getAbbreviations() returns nullptr if the abbreviations could not be read, but callers used the returned pointer without checking. Differential Revision: https://reviews.llvm.org/D85738	2020-08-12 16:01:53 +07:00
David Sherwood	88bbd30736	[SVE][CodeGen] Fix issues with EXTRACT_SUBVECTOR when using scalable FP vectors In this patch I have fixed two issues: 1. Our SVE tuple get/set intrinsics were using the wrong constant type for the index passed to EXTRACT_SUBVECTOR. I have fixed this by using the function SelectionDAG::getVectorIdxConstant to create the value. Also, I have updated the documentation for EXTRACT_SUBVECTOR describing what type the constant index should be and we now enforce this when creating the node. 2. The AArch64 backend was missing the appropriate patterns for extracting certain subvectors (nxv4f16 and nxv2f32) from legal SVE types. I have added them as part of this patch. The only way that I could find to test the new patterns was to use the SVE tuple get intrinsics, although I realise it looks a bit unusual. Tests added here: test/CodeGen/AArch64/sve-extract-subvector.ll Differential Revision: https://reviews.llvm.org/D85516	2020-08-12 08:35:46 +01:00
Kiran Chandramohan	e6c5e6efd0	[MLIR,OpenMP] Lowering of parallel operation: proc_bind clause 2/n This patch adds the translation of the proc_bind clause in a parallel operation. The values that can be specified for the proc_bind clause are specified in the OMP.td tablegen file in the llvm/Frontend/OpenMP directory. From this single source of truth enumeration for proc_bind is generated in llvm and mlir (used in specification of the parallel Operation in the OpenMP dialect). A function to return the enum value from the string representation is also generated. A new header file (DirectiveEmitter.h) containing definitions of classes directive, clause, clauseval etc is created so that it can be used in mlir as well. Reviewers: clementval, jdoerfert, DavidTruby Differential Revision: https://reviews.llvm.org/D84347	2020-08-12 08:03:13 +01:00
Petr Hosek	31e5f7120b	[CMake] Simplify CMake handling for zlib Rather than handling zlib handling manually, use find_package from CMake to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB, HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is set to YES, which requires the distributor to explicitly select whether zlib is enabled or not. This simplifies the CMake handling and usage in the rest of the tooling. This is a reland of `abb0075` with all followup changes and fixes that should address issues that were reported in PR44780. Differential Revision: https://reviews.llvm.org/D79219	2020-08-11 20:22:11 -07:00
Vedant Kumar	30c1633386	Revert "[Instruction] Add updateLocationAfterHoist helper" This reverts commit `4a646ca9e2`. This is causing some bots to fail with "!dbg attachment points at wrong subprogram for function", like: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/67958/steps/stage%201%20check/logs/stdio	2020-08-11 14:54:09 -07:00
Vedant Kumar	4a646ca9e2	[Instruction] Add updateLocationAfterHoist helper Introduce a helper on Instruction which can be used to update the debug location after hoisting. Use this in GVN and LICM, where we were mistakenly introducing new line 0 locations after hoisting (the docs recommend dropping the location in this case). For more context, see the discussion in https://reviews.llvm.org/D60913. Differential Revision: https://reviews.llvm.org/D85670	2020-08-11 14:05:20 -07:00
Nikita Popov	06d567059e	[InstSimplify] Respect CanUseUndef in more places Similar to what we do in IIQ, add an isUndefValue() helper that checks for undef values while respective CanUseUndef. This makes it much easier to search for places that don't respect the flag yet.	2020-08-11 21:53:33 +02:00
diggerlin	e9ac1495e2	[AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations SUMMARY: 1. in the patch , remove setting storageclass in function .getXCOFFSection and construct function of class MCSectionXCOFF there are XCOFF::StorageMappingClass MappingClass; XCOFF::SymbolType Type; XCOFF::StorageClass StorageClass; in the MCSectionXCOFF class, these attribute only used in the XCOFFObjectWriter, (asm path do not need the StorageClass) we need get the value of StorageClass, Type,MappingClass before we invoke the getXCOFFSection every time. actually , we can get the StorageClass of the MCSectionXCOFF from it's delegated symbol. 2. we also change the oprand of branch instruction from symbol name to qualify symbol name. for example change bl .foo extern .foo to bl .foo[PR] extern .foo[PR] 3. and if there is reference indirect call a function bar. we also add extern .bar[PR] Reviewers: Jason liu, Xiangling Liao Differential Revision: https://reviews.llvm.org/D84765	2020-08-11 15:26:19 -04:00
Jessica Paquette	bebe6a6449	[GlobalISel] Combine (logic_op (op x...), (op y...)) -> (op (logic_op x, y)) This implements ``` (logic_op (op x...), (op y...)) -> (op (logic_op x, y)) ``` when `op` is an extend, a shift, or an and. This is similar to `DAGCombiner::hoistLogicOpWithSameOpcodeHands` (with a bunch of missing cases, e.g. G_TRUNC, G_BITCAST, etc.) This is implemented so it works both pre and post-legalization. This also adds a general way to add a series of instructions in a combine. (`applyBuildInstructionSteps`). Differential Revision: https://reviews.llvm.org/D85050	2020-08-11 10:40:06 -07:00
Matt Arsenault	8dd2eb10bb	GlobalISel: Fix typo	2020-08-11 13:08:56 -04:00
Lang Hames	eed19c8c7e	[ORC] Move file-descriptor based raw byte channel into a public header. This will enable re-use in other llvm tools.	2020-08-11 09:50:58 -07:00
Nikita Popov	d110d4aaff	[InstSimplify] Forbid undef folds in expandBinOp This is the replacement for D84250 based on D84792. As we recursively fold with the same value twice, we need to disable undef folds, to prevent an undef from being folded to two different values. Reverting rG00f3579aea6e3d4a4b7464c3db47294f71cef9e4 and using the test case from https://reviews.llvm.org/D83360#2145793, it no longer performs the incorrect fold. Differential Revision: https://reviews.llvm.org/D85684	2020-08-11 18:39:24 +02:00
Jay Foad	fa2b836ea3	[GlobalISel] Add G_ABS This is equivalent to the new llvm.abs intrinsic added by D84125 with is_int_min_poison=0. Differential Revision: https://reviews.llvm.org/D85718	2020-08-11 16:34:37 +01:00
Sanjay Patel	1470ce4a76	[InstSimplify] fold min/max with matching min/max operands I think this is the last remaining translation of an existing instcombine transform for the corresponding cmp+sel idiom. This interpretation is more general though - we can remove mismatched signed/unsigned combinations in addition to the more obvious cases. min/max(X, Y) must produce X or Y as the result, so this is just another clause in the existing transform that was already matching a min/max of min/max.	2020-08-11 11:23:15 -04:00
Valentin Clement	16c1d251c4	[flang][directives] Use TableGen information for clause classes in parse-tree This patch takes advantage of the directive information and tablegen generation to replace the clauses class parse tree and in the dump parse tree sections. Reviewed By: sscalpone Differential Revision: https://reviews.llvm.org/D85549	2020-08-11 10:44:14 -04:00
Matt Arsenault	e2f1b48f86	GlobalISel: Implement bitcast action for G_INSERT_VECTOR_ELT This mirrors the support for the equivalent extracts. This also creates a huge mess that would be greatly improved if we had any bit operation combines.	2020-08-11 10:39:14 -04:00
Matt Arsenault	53f21e0fb7	TableGen/GlobalISel: Hack the operand order for atomic_store ISD::ATOMIC_STORE arbitrarily has the operands in the opposite order from regular ISD::STORE, which always introduced an annoying duplication of patterns to handle both cases. Since in GlobalISel there's just the one G_STORE, we need to swap the operands to correctly emit the type check for the pointer operand. Some work started in `20aafa3156` to migrate SelectionDAG to use ISD::STORE for atomics, but that work seems to have stalled. Since this is the pretty much the last operation which matters which isn't supported for AMDGPU, use this compatibility hack to unblock declaring it functionally complete. Not sure what's going on with the pending_phis AArch64 test. It seems it didn't always use atomics, and I'm not sure what it was originally testing matters anymore.	2020-08-11 10:22:44 -04:00
clementval	3b3dc1dbff	Revert "[flang][directives] Use TableGen information for clause classes in parse-tree" This reverts commit `bf93edc475`. Buildbot failure	2020-08-11 09:54:04 -04:00
Valentin Clement	bf93edc475	[flang][directives] Use TableGen information for clause classes in parse-tree This patch takes advantage of the directive information and tablegen generation to replace the clauses class parse tree and in the dump parse tree sections. Reviewed By: sscalpone Differential Revision: https://reviews.llvm.org/D85549	2020-08-11 09:43:11 -04:00
David Stenberg	91bd9db2cd	[DebugInfo] Allow GNU macro extension to be read Allow the GNU .debug_macro extension to be parsed and printed by llvm-dwarfdump. In an upcoming patch support will be added for emitting that format also. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D82974	2020-08-11 13:30:52 +02:00
David Stenberg	2892ed6d0f	[DebugInfo] Introduce GNU macro extension entry encodings This is a preparatory patch for allowing the GNU .debug_macro extension, which is a precursor to the DWARF 5 format, to be emitted by LLVM for earlier DWARF versions. The entries share the same encoding and behavior as in DWARF5; there are just more entries in the DWARF 5 format. Therefore, we could have used those existing DWARF 5 entries, but I think that explicitly referring to the GNU macro variants makes the code more clear. The defines that this patch introduces can be found in GCC in the dwarf2.h header: https://gcc.gnu.org/git/?p=gcc.git;a=blob; f=include/dwarf2.h; h=0b6facfd4cf4c02320c7328114231b128ab42d5e; hb=dccbf1e2a6e544f71b4a5795f0c79015db019fc3#l425 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D82972	2020-08-11 13:30:52 +02:00
Kerry McLaughlin	85c7e89f3b	[CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize Changes the Offset arguments to both functions from int64_t to TypeSize & updates all uses of the functions to create the offset using TypeSize::Fixed() Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85220	2020-08-11 12:17:10 +01:00
Kai Nacke	b3aece0531	[SystemZ/ZOS] Add binary format goff and operating system zos to the triple Adds the binary format goff and the operating system zos to the triple class. goff is selected as default binary format if zos is choosen as operating system. No further functionality is added. Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay Reviewed By: efriedma, tahonermann, hubert.reinterpertcast Differential Revision: https://reviews.llvm.org/D82081	2020-08-11 05:26:26 -04:00
Florian Hahn	7829c33084	[SCEVExpander] Add helper to clean up instrs inserted while expanding. SCEVExpander already tracks which instructions have been inserted n InsertedValues/InsertedPostIncValues. This patch adds an additional vector to collect the instructions in insertion order. This can then be used to remove exactly the instructions inserted by the expander. This replaces ExpandedValuesCleaner, which in some cases might remove values not inserted by the expander (e.g. if a value was dead before insertion and is then used during expansion). Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D84327	2020-08-11 09:30:31 +01:00
Shinji Okumura	06eee8748f	[Attributor][NFC] Connect AAPotentialValues with AAValueSimplify This patch enables `AAValueSimplify` to use information from `AAPotentialValues` Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85668	2020-08-11 15:52:02 +09:00
Haowei Wu	db91320a89	Revert "Move ELFObjHandler to TextAPI library" This reverts commit `e6f8ba12e6` due to build failures.	2020-08-10 21:31:29 -07:00
Haowei Wu	e6f8ba12e6	Move ELFObjHandler to TextAPI library This change moves ELFObjHandler to llvm/TextAPI library so it can be used by different llvm tools.	2020-08-10 21:23:39 -07:00
Wang, Pengfei	9512525947	[X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics. When we use mask compare intrinsics under strict FP option, the masked elements shouldn't raise any exception. So, we cann't replace the intrinsic with a full compare + "and" operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D85385	2020-08-11 10:28:41 +08:00
Yuanfang Chen	d04f3e028d	[CodeGen] Make MMI immutable NPM pass	2020-08-10 17:52:42 -07:00
Lang Hames	6fd30f0669	[llvm-jitlink] Update llvm-jitlink to use TargetProcessControl.	2020-08-10 17:19:48 -07:00
Johannes Doerfert	fa5d22a045	[OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang Replace the `ident_t` handling in Clang with the methods offered by the OMPIRBuilder. This cuts down on the clang code as well as the differences between the two, making further transitions easier. Tests have changed but there should not be a real functional change. The most interesting difference is probably that we stop generating local ident_t allocations for now and just use globals. Given that this happens only with debug info, the location part of the `ident_t` is probably bigger than the test anyway. As the location part is already a global, we can avoid the allocation, memcpy, and store in favor of a constant global that is slightly bigger. This can be revisited if there are complications. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D80735	2020-08-10 17:13:26 -05:00
jasonliu	20abff0481	[XCOFF][AIX] Use TE storage mapping class when large code model is enabled Summary: Use TE SMC instead of TC SMC in large code model mode, so that large code model TOC entries could get placed after all the small code model TOC entries, which reduces the chance of TOC overflow. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D85455	2020-08-10 19:52:10 +00:00
Craig Topper	96dfc783b2	[BreakFalseDeps][X86] Move operand loop out of X86's getUndefRegClearance and put in the pass. X86 is the only user of this interface in tree. Previously the X86 pass would loop over operands looking for one undef operand for the pass to fix. But there could theoretically be multiple operands to fix. So it makes more sense for the pass to do the looping and ask the target if an operand needs to be fixed.	2020-08-10 10:32:29 -07:00
Xiangling Liao	6ef801aa6b	[AIX] Static init frontend recovery and backend support On the frontend side, this patch recovers AIX static init implementation to use the linkage type and function names Clang chooses for sinit related function. On the backend side, this patch sets correct linkage and function names on aliases created for sinit/sterm functions. Differential Revision: https://reviews.llvm.org/D84534	2020-08-10 10:10:49 -04:00
James Henderson	ca05601cd2	[DebugInfo] Don't error for zero-length arange entries Although the DWARF specification states that .debug_aranges entries can't have length zero, these can occur in the wild. There's no particular reason to enforce this part of the spec, since functionally they have no impact. The patch removes the error and introduces a new warning for premature terminator entries which does not stop parsing. This is a relanding of `cb3a598c87`, adding the missing obj2yaml part that was needed. Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also https://reviews.llvm.org/D71932 which originally introduced the error. Reviewed by: ikudrin, dblaikie, Higuoxing Differential Revision: https://reviews.llvm.org/D85313	2020-08-10 14:57:52 +01:00
Matt Arsenault	f9c279b057	PeepholeOptimizer: Use Register	2020-08-10 08:49:36 -04:00
Sanjay Patel	3d5118b75c	[InstCombine] auto-generate test checks; NFC	2020-08-10 08:27:38 -04:00
Nico Weber	bc5d68dd8a	Revert "[DebugInfo] Don't error for zero-length arange entries" This reverts commit `cb3a598c87`. Breaks build of check-llvm dep obj2yaml everywhere.	2020-08-10 08:20:35 -04:00
James Henderson	cb3a598c87	[DebugInfo] Don't error for zero-length arange entries Although the DWARF specification states that .debug_aranges entries can't have length zero, these can occur in the wild. There's no particular reason to enforce this part of the spec, since functionally they have no impact. The patch removes the error and introduces a new warning for premature terminator entries which does not stop parsing. Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also https://reviews.llvm.org/D71932 which originally introduced the error. Reviewed by: ikudrin, dblaikie Differential Revision: https://reviews.llvm.org/D85313	2020-08-10 12:48:31 +01:00
Qiu Chaofan	dbcfbffc7a	[PowerPC] Add intrinsic to read or set FPSCR register This patch introduces two intrinsics: llvm.ppc.setflm and llvm.ppc.readflm. They read from or write to FPSCR register (floating-point status & control) which contains rounding mode and exception status. To ensure correctness of program, we need to prevent FP operations from being moved across these intrinsics (mffs/mtfsf instruction), so here I set them as scheduling boundaries. We can relax such restriction if FPSCR is modeled well in the future. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84914	2020-08-10 18:27:45 +08:00
Petar Avramovic	0d58d9e8fb	AMDGPU/GlobalISel: Lower G_FREM Add custom lower for G_FREM. Differential Revision: https://reviews.llvm.org/D84324	2020-08-10 10:10:46 +02:00
Vitaly Buka	fbd33baa27	[NFC][Attributor] Add missing override	2020-08-09 23:30:42 -07:00
Shinji Okumura	ff1002aab0	[Attributor][NFC][AAPotentialValues] Change interface of PotentialValuesState Previously `PotentialValuesState` inherited `BooleanState`. We have to add `getAssumed` to the state in order to use `clampStateAndIndicateChange` (which will be used in `AAPotentialValuesArgument`). However `BooleanState::getAssumed` is not a virtual function and we cannot override it. Therefore, I changed the state not to inherit `BooleanState` and add `getAssumed` to it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85610	2020-08-10 09:18:10 +09:00
Florian Hahn	5a0d6cdbd1	[InstSimplify] Make sure CanUseUndef is initialized in all cases. This should fix a bunch of buildbot failures.	2020-08-09 19:47:16 +01:00
Florian Hahn	d236e1c7b6	[InstSimplify/NewGVN] Add option to control the use of undef. Making use of undef is not safe if the simplification result is not used to replace all uses of the result. This leads to problems in NewGVN, which does not replace all uses in the IR directly. See PR33165 for more details. This patch adds an option to SimplifyQuery to disable the use of undef. Note that I've only guarded uses if isa<UndefValue>/m_Undef where SimplifyQuery is currently available. If we agree on the general direction, I'll update the remaining uses. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84792	2020-08-09 19:16:56 +01:00
Florian Hahn	c70f0b9d4a	[SCEVExpander] Avoid re-using existing casts if it means updating users. Currently the SCEVExpander tries to re-use existing casts, even if they are not exactly at the insertion point it was asked to create the cast. To do so in some case, it creates a new cast at the insertion point and updates all users to use the new cast. This behavior is problematic, because it changes the IR outside of the instructions created during the expansion. Therefore we cannot completely undo all changes made during expansion. This re-use should be only an extra optimization, so only using the new cast in the expanded instructions should not be a correctness issue. There are many cases equivalent instructions are created during expansion. This patch also adjusts findInsertPointAfter to skip instructions inserted during expansion. This enables re-using existing casts without the renaming any uses, by picking a better insertion point. Reviewed By: efriedma, lebedev.ri Differential Revision: https://reviews.llvm.org/D84399	2020-08-09 13:25:17 +01:00
Petr Hosek	a4d78d23c5	Revert "[CMake] Simplify CMake handling for zlib" This reverts commit `ccbc1485b5` which is still failing on the Windows MLIR bots.	2020-08-08 17:08:23 -07:00
Petr Hosek	ccbc1485b5	[CMake] Simplify CMake handling for zlib Rather than handling zlib handling manually, use find_package from CMake to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB, HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is set to YES, which requires the distributor to explicitly select whether zlib is enabled or not. This simplifies the CMake handling and usage in the rest of the tooling. This is a reland of `abb0075` with all followup changes and fixes that should address issues that were reported in PR44780. Differential Revision: https://reviews.llvm.org/D79219	2020-08-08 16:44:08 -07:00
Yuanfang Chen	f5b5ccf2a6	Reland "Revert "[NewPM][CodeGen] Introduce machine pass and machine pass manager"" This relands commit `320eab2d55`. The test failed because it was looking for x86-linux target unconditionally. Now it gets the default target.	2020-08-07 16:40:49 -07:00
Arthur Eubanks	7abef41674	[NewPM] Print 'Skipping pass' as pass instrumentation If OptNoneInstrumentation prints it instead, 'Skipping pass' will print for even required passes. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85493	2020-08-07 15:02:02 -07:00
Sameer Arora	645de3664a	[llvm-libtool-darwin] Add constant CPU_SUBTYPE_ARM64_V8 Add support for constant MachO::CPU_SUBTYPE_ARM64_V8. This constant is needed so as to match `llvm-libtool-darwin`'s behavior to that of cctools' libtool when `-arch_only` flag is passed in on command line. Reviewed by jhenderson, alexshap, smeenai Differential Revision: https://reviews.llvm.org/D85041	2020-08-07 14:09:27 -07:00
Vitaly Buka	7547508b7a	Revert "[StackSafety] Skip ambiguous lifetime analysis" This reverts commit `0b2616a804`. Crashes with safe-stack.	2020-08-07 14:02:50 -07:00
Matt Arsenault	5a0b1472c0	GlobalISel: Handle zext(sext x) in artifact combiner This eliminates the illegal intermediate s8 value in the added test.	2020-08-07 16:37:46 -04:00
Yuanfang Chen	320eab2d55	Revert "[NewPM][CodeGen] Introduce machine pass and machine pass manager" This reverts commit `911565d108`. Broke some non-Linux bots.	2020-08-07 11:59:58 -07:00
Yuanfang Chen	911565d108	[NewPM][CodeGen] Introduce machine pass and machine pass manager machine pass could define four methods: - `PreservedAnalyses run(MachineFunction &, MachineFunctionAnalysisManager &)` - `Error doInitialization(Module &, MachineFunctionAnalysisManager &)` - `Error doFinalization(Module &, MachineFunctionAnalysisManager &)` - `Error run(Module &, MachineFunctionAnalysisManager &)` machine pass manger: - MachineFunctionAnalysisManager: Basically an AnalysisManager<MachineFunction> augmented with the ability to register and query IR analyses - MachineFunctionPassManager: support only two methods, `addPass` and `run` Reviewed By: arsenm, asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D67687	2020-08-07 11:00:31 -07:00
Yuanfang Chen	954bd9c861	[NewPM] Only verify loop for nonskipped user loop pass No verification for pass mangers since it is not needed. No verification for skipped loop pass since the asserted condition is not used. Add a BeforeNonSkippedPass callback for this. The callback needs more inputs than its parameters to work so the callback is added on-the-fly. Reviewed By: aeubanks, asbirlea Differential Revision: https://reviews.llvm.org/D84977	2020-08-07 11:00:31 -07:00
Bevin Hansson	5de6c56f7e	[Intrinsic] Add sshl.sat/ushl.sat, saturated shift intrinsics. Summary: This patch adds two intrinsics, llvm.sshl.sat and llvm.ushl.sat, which perform signed and unsigned saturating left shift, respectively. These are useful for implementing the Embedded-C fixed point support in Clang, originally discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-August/125433.html and http://lists.llvm.org/pipermail/cfe-dev/2018-May/058019.html Reviewers: leonardchan, craig.topper, bjope, jdoerfert Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83216	2020-08-07 15:09:24 +02:00
Igor Kudrin	b6b0ff18a3	[DebugInfo] Clean up DIEUnit. NFC. This removes members of the DIEUnit class which were used only in unit tests. Note also that child classes shadowed some of these methods, namely, getDwarfVersion() was overridden in DwartfUnit and getLength() was overridden in DwarfCompileUnit. Differential Revision: https://reviews.llvm.org/D85436	2020-08-07 15:55:44 +07:00
Shinji Okumura	c575ba28de	[Attributor] AAPotentialValues Interface This is a split patch of D80991. This patch introduces AAPotentialValues and its interface only. For more detail of AAPotentialValues abstract attribute, see the original patch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D83283	2020-08-07 17:35:12 +09:00
Christian Kühnel	f3cc4df51d	Revert "[CMake] Simplify CMake handling for zlib" This reverts commit `1adc494bce`. This patch broke the Windows compilation on buildbot and pre-merge testing: http://lab.llvm.org:8011/builders/mlir-windows/builds/5945 https://buildkite.com/llvm-project/llvm-master-build/builds/780	2020-08-07 09:36:49 +02:00
biplmish	cce1b0e891	[PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D84622	2020-08-07 01:02:29 -05:00
QingShan Zhang	55de46f3b2	[PowerPC] Support constrained fp operation for setcc The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281. This patch is trying to add the support for PowerPC backend. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D81727	2020-08-07 05:16:36 +00:00
Vitaly Buka	0b2616a804	[StackSafety] Skip ambiguous lifetime analysis If we can't identify alloca used in lifetime marker we need to assume to worst case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-06 19:10:33 -07:00
Vitaly Buka	5c6d9b2bbf	[LTO,NFC] Skip generateParamAccessSummary when empty addGlobalValueSummary can check newly added FunctionSummary and set HasParamAccess to mark that generateParamAccessSummary is needed. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D85182	2020-08-06 19:01:19 -07:00
Arthur Eubanks	72c95b2213	[NewPM] Add callback for skipped passes Parallel to https://reviews.llvm.org/D84772. Will use this for printing when a pass is skipped. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85478	2020-08-06 18:58:59 -07:00
Matt Arsenault	1ad051dd8c	GlobalISel: Implement lower for G_INSERT_VECTOR_ELT	2020-08-06 19:29:17 -04:00
Snehasish Kumar	8d943a928d	[NFC] Rename BBSectionsPrepare -> BasicBlockSections. Rename the BBSectionsPrepare pass as suggested by the review comment in https://reviews.llvm.org/D85368. Differential Revision: https://reviews.llvm.org/D85380	2020-08-06 13:12:06 -07:00
Matt Arsenault	e00201539f	GlobalISel: Implement fewerElementsVector for G_EXTRACT_VECTOR_ELT Use the same basic strategy as LegalizeVectorTypes. Try to index into smaller pieces if there's a constant index, and otherwise fall back to a stack temporary.	2020-08-06 14:33:16 -04:00
Matt Arsenault	1a0c0944c6	AMDGPU: Define raw/struct variants of buffer atomic fadd Somehow the new FP atomic buffer intrinsics ended up using the legacy style for buffer intrinsics.	2020-08-06 13:36:19 -04:00
Simon Pilgrim	b7b1a38d41	PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI. We already need to include raw_ostream.h, also add missing StringRef.h implicit dependency.	2020-08-06 16:31:56 +01:00
Sanjay Patel	60f2c6a94c	[PatternMatch] allow intrinsic form of min/max with existing matchers I skimmed the existing users of these matchers and don't see any problems (eg, the caller assumes the matched value was a select instruction without checking). So I think we can generalize the matching to allow the new intrinsics or the cmp+select idioms. I did not find any unit tests for the matchers, so added some basics there. The instsimplify tests are adapted from existing tests for the cmp+select pattern and cover the folds in simplifyICmpWithMinMax(). Differential Revision: https://reviews.llvm.org/D85230	2020-08-06 10:50:24 -04:00
Raphael Isemann	1de43bd6df	Revert "PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI." This reverts commit `87c5437afd`. The commit includes several headers in the middle of a function, which breaks pretty much everything.	2020-08-06 15:15:43 +02:00
Simon Pilgrim	3d10050e37	BitstreamRemarkParser.h - remove unnecessary includes. NFCI. Remove unused includes, moving to the lib header or cpp file as necessary.	2020-08-06 13:17:53 +01:00
Simon Pilgrim	55ead5bfff	Fix include sorting order. NFC	2020-08-06 11:46:53 +01:00
Simon Pilgrim	87c5437afd	PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI. We already need to include raw_ostream.h, also add missing StringRef.h and cstdint implicit dependencies. Remove unnecessary includes from PDBExtras.cpp	2020-08-06 11:28:42 +01:00
David Green	745bf6cf44	[LoopVectorizer] Inloop vector reductions Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them together and sum the result into a 32bit general purpose register. So taking 16 i8's as inputs, they can multiply and accumulate the result into a single i32 without any rounding/truncating along the way. There are also reduction instructions for plain integer add and min/max, and operations that sum into a pair of 32bit registers together treated as a 64bit integer (even though MVE does not have a plain 64bit addition instruction). So giving the vectorizer the ability to use these instructions both enables us to vectorize at higher bitwidths, and to vectorize things we previously could not. In order to do that we need a way to represent that the reduction operation, specified with a llvm.experimental.vector.reduce when vectorizing for Arm, occurs inside the loop not after it like most reductions. This patch attempts to do that, teaching the vectorizer about in-loop reductions. It does this through a vplan recipe representing the reductions that the original chain of reduction operations is replaced by. Cost modelling is currently just done through a prefersInloopReduction TTI hook (which follows in a later patch). Differential Revision: https://reviews.llvm.org/D75069	2020-08-06 10:10:50 +01:00
Roman Lebedev	5060f5682b	[InstCombine] (-NSW x) s> x --> x s< 0 (PR39480) Name: (-x) s> x --> x s< 0 %neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN %r = icmp sgt i8 %neg_x, %x => %r = icmp slt i8 %x, 0 https://rise4fun.com/Alive/ZslD https://bugs.llvm.org/show_bug.cgi?id=39480	2020-08-06 11:50:34 +03:00
Xing GUO	4357986b41	[DWARFYAML][debug_info] Pull out dwarf::FormParams from DWARFYAML::Unit. Unit.Format, Unit.Version and Unit.AddrSize are replaced with dwarf::FormParams in D84496 to get rid of unnecessary functions getOffsetSize() and getRefSize(). However, that change makes it difficult to make AddrSize optional (Optional<uint8_t>). This change pulls out dwarf::FormParams from DWARFYAML::Unit and use it as a helper struct in DWARFYAML::emitDebugInfo(). Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D85296	2020-08-06 16:39:00 +08:00
Craig Topper	504a197fe5	[X86] Rename X86::getImpliedFeatures to X86::updateImpliedFeatures and pass clang's StringMap directly to it. No point in building a vector of StringRefs for clang to apply to the StringMap. Just pass the StringMap and modify it directly.	2020-08-06 00:20:46 -07:00
Petr Hosek	1adc494bce	[CMake] Simplify CMake handling for zlib Rather than handling zlib handling manually, use find_package from CMake to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB, HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is set to YES, which requires the distributor to explicitly select whether zlib is enabled or not. This simplifies the CMake handling and usage in the rest of the tooling. This is a reland of `abb0075` with all followup changes and fixes that should address issues that were reported in PR44780. Differential Revision: https://reviews.llvm.org/D79219	2020-08-05 16:07:11 -07:00
Greg Clayton	e1de85f9f4	Add verification for DW_AT_decl_file and DW_AT_call_file. LTO builds have been creating invalid DWARF and one of the errors was a file index that was out of bounds. "llvm-dwarfdump --verify" will check all file indexes for line tables already, but there are no checks for the validity of file indexes in attributes. The verification will verify if there is a DW_AT_decl_file/DW_AT_call_file that: - there is a line table for the compile unit - the file index is valid - the encoding is appropriate Tests are added that test all of the above conditions. Differential Revision: https://reviews.llvm.org/D84817	2020-08-05 15:30:13 -07:00
Rahman Lavaee	20a568c29d	[Propeller]: Use a descriptive temporary symbol name for the end of the basic block. This patch changes the functionality of AsmPrinter to name the basic block end labels as LBB_END${i}_${j}, with ${i} being the identifier for the function and ${j} being the identifier for the basic block. The new naming scheme is consistent with how basic block labels are named (.LBB${i}_{j}), and how function end symbol are named (.Lfunc_end${i}) and helps to write stronger tests for the upcoming patch for BB-Info section (as proposed in https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). The end label is used with basicblock-labels (BB-Info section in future) and basicblock-sections to compute the size of basic blocks and basic block sections, respectively. For BB sections, the section containing the entry basic block will not have a BB end label since it already gets the function end-label. This label is cached for every basic block (CachedEndMCSymbol) like the label for the basic block (CachedMCSymbol). Differential Revision: https://reviews.llvm.org/D83885	2020-08-05 13:17:19 -07:00
Stanislav Mekhanoshin	ea7d0e2996	[AMDGPU] gfx1031 target Differential Revision: https://reviews.llvm.org/D85337	2020-08-05 12:36:26 -07:00
Evgenii Stepanov	f2c0423995	[msan] Remove readnone and friends from call sites. MSan removes readnone/readonly and similar attributes from callees, because after MSan instrumentation those attributes no longer apply. This change removes the attributes from call sites, as well. Failing to do this may cause DSE of paramTLS stores before calls to readonly/readnone functions. Differential Revision: https://reviews.llvm.org/D85259	2020-08-05 10:34:45 -07:00
Jordan Rupprecht	3c39db0c44	Revert "[LoopVectorizer] Inloop vector reductions" This reverts commit `e9761688e4`. It breaks the build: ``` ~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>' return ReductionOperations; ```	2020-08-05 10:24:15 -07:00
Mircea Trofin	b18c41c66f	[TFUtils] Expose untyped accessor to evaluation result tensors These were implementation detail, but become necessary for generic data copying. Also added const variations to them, and move assignment, since we had a move ctor (and the move assignment helps in a subsequent patch). Differential Revision: https://reviews.llvm.org/D85262	2020-08-05 10:22:45 -07:00
David Green	e9761688e4	[LoopVectorizer] Inloop vector reductions Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them together and sum the result into a 32bit general purpose register. So taking 16 i8's as inputs, they can multiply and accumulate the result into a single i32 without any rounding/truncating along the way. There are also reduction instructions for plain integer add and min/max, and operations that sum into a pair of 32bit registers together treated as a 64bit integer (even though MVE does not have a plain 64bit addition instruction). So giving the vectorizer the ability to use these instructions both enables us to vectorize at higher bitwidths, and to vectorize things we previously could not. In order to do that we need a way to represent that the reduction operation, specified with a llvm.experimental.vector.reduce when vectorizing for Arm, occurs inside the loop not after it like most reductions. This patch attempts to do that, teaching the vectorizer about in-loop reductions. It does this through a vplan recipe representing the reductions that the original chain of reduction operations is replaced by. Cost modelling is currently just done through a prefersInloopReduction TTI hook (which follows in a later patch). Differential Revision: https://reviews.llvm.org/D75069	2020-08-05 18:14:05 +01:00
Georgii Rymar	6ae5b9e405	[llvm-readobj] - Make decode_relrs() don't return Expected<>. NFCI. The `decode_relrs` helper is declared as: `Expected<std::vector<Elf_Rel>> decode_relrs(Elf_Relr_Range relrs) const;` it never returns an error though and hence can be simplified to return a vector. Differential revision: https://reviews.llvm.org/D85302	2020-08-05 17:05:47 +03:00
Simon Pilgrim	7b993903e0	DWARFVerifier.h - remove unnecessary forward declarations and includes. NFCI.	2020-08-05 12:42:44 +01:00
Simon Pilgrim	315e1daf7f	GISelWorkList.h - remove unnecessary includes. NFCI.	2020-08-05 12:00:28 +01:00
Simon Pilgrim	e3d3657b9b	CallLowering.h - remove unnecessary CCState forward declaration. NFCI. Already defined in CallingConvLower.h	2020-08-05 12:00:28 +01:00
Hans Wennborg	3ab01550b6	Revert "[CMake] Simplify CMake handling for zlib" This quietly disabled use of zlib on Windows even when building with -DLLVM_ENABLE_ZLIB=FORCE_ON. > Rather than handling zlib handling manually, use find_package from CMake > to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB, > HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is > set to YES, which requires the distributor to explicitly select whether > zlib is enabled or not. This simplifies the CMake handling and usage in > the rest of the tooling. > > This is a reland of `abb0075` with all followup changes and fixes that > should address issues that were reported in PR44780. > > Differential Revision: https://reviews.llvm.org/D79219 This reverts commit `10b1b4a231` and follow-ups `64d99cc6ab` and `f9fec0447e`.	2020-08-05 12:31:44 +02:00
Georgii Rymar	f97019ad6e	[llvm-readobj/elf] - Add a testing for --stackmap and refine the implementation. Currently, we only test the `--stackmap` option here: https://github.com/llvm/llvm-project/blob/master/llvm/test/Object/stackmap-dump.test it uses a precompiled MachO binary currently and I've found no tests for this option for ELF. The implementation also has issues. For example, it might assert on a wrong version of the .llvm-stackmaps section. Or it might crash on an empty or truncated section. This patch introduces a new tools/llvm-readobj/ELF test file as well as implements a few basic checks to catch simple crashes/issues It also eliminates `unwrapOrError` calls in `printStackMap()`. Differential revision: https://reviews.llvm.org/D85208	2020-08-05 13:09:04 +03:00

... 4 5 6 7 8 ...

42427 Commits