llvm-project

Commit Graph

Author	SHA1	Message	Date
Sumanth Gundapaneni	a04ab2ec08	[Pipeliner] Fix the bug in pragma that disables the pipeliner. Differential Revision: https://reviews.llvm.org/D76303.	2020-04-10 12:52:16 -05:00
Matt Morehouse	bef187c750	Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang Summary: This commit adds two command-line options to clang. These options let the user decide which functions will receive SanitizerCoverage instrumentation. This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing. Patch by Yannis Juglaret of DGA-MI, Rennes, France libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy. Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions. The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. 
The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`. Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists: ``` # (a) src:* fun:* # (b) src:SRC/* fun:* # (c) src:SRC/src/woff2_dec.cc fun:* ``` Running the built fuzzers shows how many instrumentation points the compiler adds: the fuzzer will output "XXX PCs". Whitelist (a) is the instrument-everything whitelist; it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points. For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s. We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. 
This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction. The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise. Reviewers: kcc, morehouse, vitalybuka Reviewed By: morehouse, vitalybuka Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D63616	2020-04-10 10:44:03 -07:00
Fangrui Song	a7aaaf7016	[MC][RISCV] Make .reloc support arbitrary relocation types Similar to D76746 (ARM), D76754 (AArch64) and llvmorg-11-init-6967-g152d14da64c (x86) Differential Revision: https://reviews.llvm.org/D77018	2020-04-10 10:43:53 -07:00
Matt Arsenault	4593e4131a	AMDGPU: Teach toolchain to link rocm device libs Currently the library is linked separately, but that approach cannot implement fast math flags correctly. Each module should get the version of the library appropriate for its combination of fast math and related flags, with the attributes propagated into its functions and internalized. HIP already maintains the list of libraries, but this is not used for OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument, despite both languages using the same bitcode library. Eventually these two searches need to be merged. An additional problem is that there are 3 different locations where the libraries are installed, depending on which build is used. This also needs to be consolidated (or at least the search logic needs to deal with this unnecessary complexity).	2020-04-10 13:37:32 -04:00
Craig Topper	a6732069ee	[CallSite removal][X86] Remove unneeded use of CallSite. NFC We already have a CallInst, we can just get the calling convention from it.	2020-04-10 10:27:21 -07:00
Kevin P. Neal	7f38812d5b	[FPEnv][AArch64] Platform-specific builtin constrained FP enablement When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that. Neon is part of this patch, so ARM is affected as well. Differential Revision: https://reviews.llvm.org/D77074	2020-04-10 13:02:00 -04:00
Fangrui Song	7f36cb1f1a	[AArch64InstPrinter] Change printAlignedLabel to print the target address in hexadecimal form Similar to D76580 (x86) and D76591 (PPC). ``` // llvm-objdump -d output (before) 10000: 08 00 00 94 bl #32 10004: 08 00 00 94 bl #32 // llvm-objdump -d output (after) 10000: 08 00 00 94 bl 0x10020 10004: 08 00 00 94 bl 0x10024 // GNU objdump -d. The lack of 0x is not ideal due to ambiguity. 10000: 94000008 bl 10020 <bar+0x18> 10004: 94000008 bl 10024 <bar+0x1c> ``` The new output makes it easier to find the jump target. Differential Revision: https://reviews.llvm.org/D77853	2020-04-10 09:21:09 -07:00
Simon Pilgrim	1824ae0f42	[X86] Remove defunct EmitLoweredAtomicFP declaration. NFC.	2020-04-10 17:05:07 +01:00
Simon Pilgrim	dd84a2f77a	[X86] Remove defunct emitFMA3Instr declaration. NFC.	2020-04-10 17:05:06 +01:00
Christopher Tetreault	65b8b643b4	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, efriedma, jonpa Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77265	2020-04-10 08:43:32 -07:00
Simon Pilgrim	a88cc20456	ProfileSummaryInfo.h - remove unnecessary includes. NFC Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration. This exposed a couple of source files that were implicitly relying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.	2020-04-10 16:25:48 +01:00
Stanislav Mekhanoshin	44920e8566	[AMDGPU] Disable sub-dword scalar loads IR widening These will be widened in the DAG. In the meantime, early widening prevents otherwise possible vectorization of such loads. Differential Revision: https://reviews.llvm.org/D77835	2020-04-10 08:20:49 -07:00
Mircea Trofin	f62335b534	[llvm][NFC] Style fixes in Inliner.cpp Summary: Function names: camel case, lower case first letter. Variable names: start with upper letter. For iterators that were 'i', renamed with a descriptive name, as 'I' is 'Instruction&'. Lambda captures simplification. Opportunistic boolean return simplification. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77837	2020-04-10 08:04:39 -07:00
Ilya Leoshkevich	3bc439bdff	[MSan] Add instrumentation for SystemZ Summary: This patch establishes memory layout and adds instrumentation. It does not add runtime support and does not enable MSan, which will be done separately. Memory layout is based on PPC64, with the exception that XorMask is not used - low and high memory addresses are chosen in a way that applying AndMask to low and high memory produces non-overlapping results. VarArgHelper is based on AMD64. It might be tempting to share some code between the two implementations, but we need to keep in mind that all the ABI similarities are coincidental, and therefore any such sharing might backfire. copyRegSaveArea() indiscriminately copies the entire register save area shadow, however, fragments thereof not filled by the corresponding visitCallSite() invocation contain irrelevant data. Whether or not this can lead to practical problems is unclear, hence a simple TODO comment. Note that the behavior of the related copyOverflowArea() is correct: it copies only the vararg-related fragment of the overflow area shadow. VarArgHelper test is based on the AArch64 one. s390x ABI requires that arguments are zero-extended to 64 bits. This is particularly important for __msan_maybe_warning_() and __msan_maybe_store_origin_() shadow and origin arguments, since non zeroed upper parts thereof confuse these functions. Therefore, add ZExt attribute to the corresponding parameters. Add ZExt attribute checks to msan-basic.ll. Since with -msan-instrumentation-with-call-threshold=0 instrumentation looks quite different, introduce the new CHECK-CALLS check prefix. Reviewers: eugenis, vitalybuka, uweigand, jonpa Reviewed By: eugenis Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, stefansf, Andreas-Krebbel Tags: #llvm Differential Revision: https://reviews.llvm.org/D76624	2020-04-10 16:53:49 +02:00
Christopher Tetreault	3bebf02861	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77262	2020-04-10 07:47:19 -07:00
Jessica Clarke	49e20c4c9e	[RISCV] Consume error from parsing attributes section Summary: We don't consume the error from getBuildAttributes, so an assertions build crashes with "Program aborted due to an unhandled Error:". Explicitly consume it like the ARM version in that case. Reviewers: asb, jhenderson, MaskRay, HsiangKai Reviewed By: MaskRay Subscribers: kristof.beyls, hiraditya, simoncook, kito-cheng, shiva0217, rogfer01, rkruppe, psnobl, benna, Jim, lenary, s.egerton, sameer.abuasal, luismarques, evandro, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77841	2020-04-10 15:05:53 +01:00
Simon Pilgrim	91bc50c0d7	[CostModel][X86] Improve InsertElement costs for sub-128bit vectors If we're inserting into v2i8/v4i8/v8i8/v2i16/v4i16 style sub-128bit vectors ensure we don't use the SK_PermuteTwoSrc cost of the legalized value type - this is a followup to rG12c629ec6c59 which added equivalent sub-128bit shuffle costs	2020-04-10 14:55:46 +01:00
Florian Hahn	1a02aaeaa4	[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef. For non-integer constants/expressions and overdefined, I think we can just use SimplifyBinOp to do common folds. By just passing a context with the DL, SimplifyBinOp should not try to get additional information from looking at definitions. For overdefined values, it should be enough to just pass the original operand. Note: The comment before the `if (isconstant(V1State)...` was wrong originally: isConstant() also matches integer ranges with a single element. It is correct now. Reviewers: efriedma, davide, mssimpso, aartbik Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76459	2020-04-10 11:02:57 +01:00
Mehdi Amini	bbeeb35c1f	Revert "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff." This reverts commit `0445c64998`. MLIR Build is broken by this change at the moment.	2020-04-10 07:44:06 +00:00
Alina Sbirlea	0445c64998	[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff. This replaces the ChildrenGetter inside the DominatorTree with GraphTraits over a GraphDiff object, an object which encapsulated the view of the previous CFG. This also simplifies the extensions in clang which use DominatorTree, as GraphDiff also filters nullptrs. Re-land `a90374988e` after moving CFGDiff.h to Support. Differential Revision: https://reviews.llvm.org/D77341	2020-04-10 07:38:53 +00:00
Michael Liao	b54b4ecac3	Fix `-Wextra` warning. NFC.	2020-04-10 03:22:02 -04:00
Mehdi Amini	57d2d48399	Revert "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff." This reverts commit `a90374988e` and `5da1671bf8`. A new dependency is introduced here from Support to IR which seems like a layering violation. It also breaks the MLIR build at the moment.	2020-04-10 06:27:59 +00:00
Kai Luo	b7d5229d78	[PowerPC] Update alignment for ReuseLoadInfo in LowerFP_TO_INTForReuse In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is set for the `MachineMemOperand`, but RLI(ReuseLoadInfo)'s alignment is not updated for following loads. It's related to failed alignment check reported in https://bugs.llvm.org/show_bug.cgi?id=45297 Differential Revision: https://reviews.llvm.org/D77624	2020-04-10 05:49:19 +00:00
John McCall	8423a6f363	Rename OptimalLayout to OptimizedStructLayout at Chris's request.	2020-04-10 00:14:20 -04:00
David Blaikie	e0fd87cc64	llvm-dwarfdump: Return non-zero on error Makes it easier to test "this doesn't produce an error" (& indeed makes that the implied default so we don't accidentally write tests that have silent/sneaky errors as well as the positive behavior we're testing for) Though the support for applying relocations is patchy enough that a bunch of tests treat lack of relocation application as more of a warning than an error - so rather than me trying to figure out how to add support for a bunch of relocation types, let's degrade that to a warning to match the usage (& indeed, it's sort of more of a tool warning anyway - it's not that the DWARF is wrong, just that the tool can't fully cope with it - and it's not like the tool won't dump the DWARF, it just won't follow/render certain relocations - I guess in the most general case it might try to render an unrelocated value & instead render something bogus... but mostly seems to be about interesting relocations used in eh_frame (& honestly it might be nice if we were lazier about doing this relocation resolution anyway - if you're not dumping eh_frame, should we really be erroring about the relocations in it?))	2020-04-09 20:53:58 -07:00
Nemanja Ivanovic	7f3787c0f2	[PowerPC] Bail out of redundant LI elimination on an implicit kill The transformation currently does not differentiate between explicit and implicit kills. However, it is not valid to later simply clear an implicit kill flag since the kill could be due to a call or return. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45374	2020-04-09 22:17:29 -05:00
Serguei Katkov	4275eb1331	Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This requires runtime support to know how to take a value from a register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77797	2020-04-10 10:13:39 +07:00
Max Kazantsev	4e87823026	[LoopLoadElim] Fix crash by always checking simplify form Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775	2020-04-10 09:23:28 +07:00
Heejin Ahn	b647de9925	[WebAssembly] Use dummy debug info in Emscripten SjLj Summary: D74269 added debug info to newly created instructions, including calls to `malloc` and `free`, by taking debug info from existing instructions around, whose debug info may or may not be empty. But there are cases where debug info is required by the IR verifier: when both the caller and the callee functions have DISubprograms, meaning we already have declarations to `malloc` or `free` with a DISubprogram attached, newly created calls to `malloc` and `free` should have non-empty debug info. This patch creates a non-empty dummy debug info in this case to those calls to make the IR verifier pass. Fixes https://bugs.llvm.org/show_bug.cgi?id=45461. Reviewers: dschuff Subscribers: aprantl, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77784	2020-04-09 18:44:50 -07:00
Wenlei He	60c642e74b	[TLI] Per-function fveclib for math library used for vectorization Summary: Encode `-fveclib` setting as per-function attribute so it can be threaded through to LTO backends. Accordingly per-function TLI now reads the attributes and selects the available vector function list based on that. Now we also populate function list for all supported vector libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without duplicating the vector function lists. Inlining between incompatible veclib attributes is also prohibited now. Subscribers: hiraditya, dexonsmith, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77632	2020-04-09 18:26:38 -07:00
Stefan Pintilie	5b18b6e9a8	[PowerPC][Future] Fix for `6c4b40def7` This is a fix for the previous patch `6c4b40def7`. In some cases it may be possible to have the compiler produce st_other=1 without the compiler using mcpu=future which should not be the case. This patch adds a guard to make sure that if we are using st_other=1 then we are also compiling for future CPU.	2020-04-10 01:12:11 +00:00
Alina Sbirlea	a90374988e	[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff. Summary: This replaces the ChildrenGetter inside the DominatorTree with GraphTraits over a GraphDiff object, an object which encapsulated the view of the previous CFG. This also simplifies the extensions in clang which use DominatorTree, as GraphDiff also filters nullptrs. Reviewers: kuhar, dblaikie, NutshellySima Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77341	2020-04-09 18:08:39 -07:00
Lang Hames	37bcf2df01	[ORC] Require JITDylib to be specified when adding IR and objects in the C API.	2020-04-09 17:59:26 -07:00
Craig Topper	5625e6ab37	[X86] Improve min/max reduction costs. This is similar to what I recently did for getArithmeticReductionCost. I'm trying to account for the narrowing from 512->256->128 as we go. I've also added a new helper method getMinMaxCost that tries to handle the cases where we have native min/max instructions and fall back to cmp+select when we don't. Differential Revision: https://reviews.llvm.org/D76634	2020-04-09 17:28:50 -07:00
Nemanja Ivanovic	5fe2809447	[PowerPC] Don't assert on SELECT_CC with i1 type When we try to select a SELECT_CC on Power9, we check if it can be matched to a SETB instruction. In that function, we assert that the output type is i32/i64. This is unnecessary as it is perfectly reasonable to have an i1 SELECT_CC. Change that from an assert to an early exit condition. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45448	2020-04-09 19:27:32 -05:00
Amara Emerson	e99169f1c2	[AArch64][GlobalISel] CallLowering: Don't generate new copies each time we need to store to a stack location for outgoing args. During call arg lowering we shouldn't be modifying SP so cache the SP copy vreg for subsequent uses. Gives a 0.2% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D77838	2020-04-09 17:08:56 -07:00
Francesco Petrogalli	c846d2682b	[llvm][Codegen] Make `getVectorTypeBreakdownMVT` work with scalable types. Reviewers: efriedma, andwar, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77434	2020-04-10 00:48:27 +01:00
Christopher Tetreault	9f87d951fc	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: mcrosier, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77269	2020-04-09 16:43:29 -07:00
Lang Hames	0d5f15f700	[ORC] Add C API support for adding object files to an LLJIT instance.	2020-04-09 16:18:46 -07:00
Lang Hames	1cd8493e69	[ORC] Expand the OrcV2 C API bindings. Adds basic support for LLJITBuilder and DynamicLibrarySearchGenerator. This allows C API clients to configure LLJIT to expose process symbols to JIT'd code. An example of this is added in llvm/examples/OrcV2CBindingsReflectProcessSymbols.	2020-04-09 16:18:46 -07:00
Daniel Sanders	a79b2fc44b	Add pass to strip debug info from MIR Summary: Removes: * All LLVM-IR level debug info using StripDebugInfo() * All debugify metadata * 'Debug Info Version' module flag * All (valid*) DEBUG_VALUE MachineInstrs * All DebugLocs from MachineInstrs This is a more complete solution than the previous MIRPrinter option that just causes it to neglect to print debug-locations. (*) The qualifier 'valid' is used here because AArch64 emits an invalid one and tests depend on it Reviewers: vsk, aprantl, bogner Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77747	2020-04-09 15:44:38 -07:00
Mircea Trofin	655aa1ae4a	[llvm][NFC] Replace CallSite with CallBase in Inliner Summary: Almost all uses are replaced. Left FIXMEs for the two sites that require refactoring outside of Inliner, to scope this patch. Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77817	2020-04-09 15:01:58 -07:00
Christopher Tetreault	19cc9b9ded	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: efriedma, sdesmalen, rriddle Reviewed By: sdesmalen Subscribers: hiraditya, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77261	2020-04-09 14:59:14 -07:00
Christopher Tetreault	00a1032412	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: rriddle, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77260	2020-04-09 13:35:41 -07:00
James Y Knight	5e7b98fe75	Fix an unused-variable warning in Release mode.	2020-04-09 16:34:55 -04:00
Christopher Tetreault	e634f482ea	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: arsenm, efriedma, sdesmalen Reviewed By: arsenm Subscribers: wdng, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77268	2020-04-09 13:11:37 -07:00
Zequan Wu	eccfa35d53	Fix lifetime call in landingpad blocking Simplifycfg pass A lifetime call in a landingpad blocks simplifycfg from removing the landingpad; fix that. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D77188	2020-04-09 13:07:32 -07:00
Christopher Tetreault	e1e131ea5e	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: grosbach, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77271	2020-04-09 12:52:44 -07:00
Christopher Tetreault	b96558f5e5	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sunfish, sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77273	2020-04-09 12:41:28 -07:00
Stefan Pintilie	64868cbfcf	[PowerPC][Future] Fix for `75828ef615` Used unsigned long where uint64_t should have been used by mistake. Fixed in this patch.	2020-04-09 19:33:12 +00:00
Simon Pilgrim	12c629ec6c	[CostModel][X86] Add shuffle costs for some common sub-128bit vectors v2i8/v4i8/v8i8 + v2i16/v4i16 all show up in vectorizer code and by just using the legalized types (v16i8/v8i16) we're highly exaggerating the actual cost of the shuffle.	2020-04-09 19:57:06 +01:00
Paolo Savini	fae40bd5a1	[RISCV] Add MC layer support for proposed Bit Manipulation extension (version 0.92) This adds the instruction encoding and mnemonics for the proposed RISC-V Bit Manipulation extension (version 0.92). It is implemented with each category of instruction as its own target feature, with the 'b' extension feature enabling all options. Since this extension is not yet ratified, all target features are prefixed with 'experimental-' to note their status. Differential Revision: https://reviews.llvm.org/D65649	2020-04-09 18:04:22 +01:00
jasonliu	085689d44c	[PPC][AIX] Implement variadic function handling in LowerFormalArguments_AIX Summary: This patch adds support for handling of variadic functions for AIX. This includes ensuring that we use and consume the correct type of va_list (char *va_list) for AIX. Authored by: ZarkoCA Reviewers: cebowleratibm, sfertile, jasonliu Reviewed by: jasonliu Differential Revision: https://reviews.llvm.org/D76130	2020-04-09 16:49:44 +00:00
Stefan Pintilie	75828ef615	[PowerPC][Future] Initial support for PCRel addressing for constant pool loads Add initial support for PC Relative addressing for constant pool loads. This includes adding a new relocation for @pcrel and adding a new PowerPC flag to identify PC relative addressing. Differential Revision: https://reviews.llvm.org/D74486	2020-04-09 11:17:23 -05:00
Kazushi (Jam) Marukawa	015dee1ac8	[VE] Support (m)0 and (m)1 operands Summary: VE has special operands to represent 0b000...000111...111 (`(m)0`) and 0b111...111000...000 (`(m)1`) bit sequences. This patch supports those operands not only in machine instructions but also in DAG lowering. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D77769	2020-04-09 18:09:00 +02:00
Craig Topper	5a55363dc4	[X86] Remove redundant VMOVDDUPZ128rmk/VMOVDDUPZ128rmkz isel patterns. These patterns are identical to the pattern for the instruction.	2020-04-09 09:06:58 -07:00
Gil Rapaport	e2a1867880	[LV] Add VPValue operands to VPBlendRecipe (NFCI) InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations' ability to modify def-use relations and still have ILV generate the vector code. This commit introduces VPValues for VPBlendRecipe to use as the values to blend. The recipe is generated with VPValues wrapping the phi's incoming values of the scalar phi. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential Revision: https://reviews.llvm.org/D77539	2020-04-09 18:48:33 +03:00
Mircea Trofin	b4924f01a4	[llvm][nfc] InstructionCostDetail encapsulation Ensured initialized fields; encapsulated delta calculations and evaluation of whether the threshold has changed; assertion for CostThresholdMap dereference, to indicate design intent. Differential Revision: https://reviews.llvm.org/D77762	2020-04-09 08:21:18 -07:00
Ayal Zaks	1678489234	[LV] FoldTail w/o Primary Induction Introduce a new VPWidenCanonicalIVRecipe to generate a canonical vector induction for use in fold-tail-with-masking, if a primary induction is absent. The canonical scalar IV having start = 0 and step = VF*UF, created during code-gen to control the vector loop, is widened into a canonical vector IV having start = {<Part*VF, Part*VF+1, ..., Part*VF+VF-1> for 0 <= Part < UF} and step = <VF*UF, VF*UF, ..., VF*UF>. Differential Revision: https://reviews.llvm.org/D77635	2020-04-09 17:45:23 +03:00
Simon Cook	2df6a02fd7	[RISCV] Implement evaluateBranch This implements the instruction analysis required to print branch targets as part of llvm-objdump's disassembly. Note, this only handles those branches which can be analyzed in a single instruction, a future patch will handle multiple-instruction patterns, such as AUIPC/LUI+JALR instruction pairs. Differential Revision: https://reviews.llvm.org/D77567	2020-04-09 15:11:55 +01:00
Shengchen Kan	2477cec2ac	[NFC][X86] Refine code in X86AsmBackend Summary: Move code to a better place, rename function, etc Tags: #llvm Differential Revision: https://reviews.llvm.org/D77778	2020-04-09 21:31:52 +08:00
Sanjay Patel	812970edda	[InstCombine] replace undef in vector constant for safe shift transform (PR45447) As noted in PR45447, we have a vector-constant-with-undef-element transform bug: https://bugs.llvm.org/show_bug.cgi?id=45447 We replace undefs with a safe constant (0 or -1) based on the (non-)negative predicate constraint. So this is correct: http://volta.cs.utah.edu:8080/z/WZE36H ...but this is not: http://volta.cs.utah.edu:8080/z/boj8gJ Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800). But that's making an assumption about what qualifies as "safe", and that assumption may not always hold. Differential Revision: https://reviews.llvm.org/D77739	2020-04-09 08:00:46 -04:00
Anton Bikineev	9e1ccec8d5	tsan: don't instrument __attribute__((naked)) functions Naked functions are required to not have compiler generated prologues/epilogues, hence no instrumentation is needed for them. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45400 Differential Revision: https://reviews.llvm.org/D77477	2020-04-09 13:47:47 +02:00
Pavel Labath	b761a6484d	[DWARF] Detect extraction errors in DWARFFormValue::extractValue Summary: Although the function had a bool return value, it was always returning true. Presumably this is because the main type of errors one can encounter here is running off the end of the stream, and until very recently, the DataExtractor class made it very difficult to detect that. The situation has changed now, and we can easily detect errors here, which this patch does. Reviewers: dblaikie, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77308	2020-04-09 13:41:02 +02:00
Serguei Katkov	44f0d7f136	Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values." This reverts commit `a0275705bb`. It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.	2020-04-09 18:24:47 +07:00
Florian Hahn	a7efe06af0	[LV] Assert no DbgInfoIntrinsic calls are passed to widening (NFC). When building a VPlan, BasicBlock::instructionsWithoutDebug() is used to iterate over the instructions in a block. This means that no recipes should be created for debug info intrinsics already and we can turn the early exit into an assertion. Reviewers: Ayal, gilr, rengolin, aprantl Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D77636	2020-04-09 11:37:32 +01:00
Serguei Katkov	a0275705bb	[Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This requires runtime support to know how to take a value from a register. By default this usage is off and can be switched on by an option. The change also introduces an additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes the statepoint to use spill slots instead of caller saved registers. Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77371	2020-04-09 16:57:35 +07:00
Jay Foad	bf730e1686	[CodeGen] Fix a simple FIXME. NFC.	2020-04-09 10:54:03 +01:00
Jay Foad	9c7bd94ce8	Fix typo in comment	2020-04-09 10:36:00 +01:00
Jay Foad	4970a1deca	[AMDGPU] Remove outdated comment	2020-04-09 10:36:00 +01:00
Florian Hahn	9997ee23ed	[VPlan] Add & use VPValue operands for VPWidenCallRecipe (NFC). This patch adds VPValue versions for the arguments of the call to VPWidenCallRecipe and uses them during code-generation. Similar to D76373 this reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77655	2020-04-09 10:23:26 +01:00
Jay Foad	c63aed890e	[KnownBits] Move AND, OR and XOR logic into KnownBits Summary: There are at least three clients for KnownBits calculations: ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the common logic should be moved out of these clients and into KnownBits itself. This patch does this for AND, OR and XOR calculations by implementing and using appropriate operator overloads KnownBits::operator& etc. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74060	2020-04-09 10:10:37 +01:00
Jay Foad	94cc9eccf6	[ValueTracking] Simplify KnownBits construction Use the simpler BitWidth constructor instead of the copy constructor to make it clear when we don't actually need to copy an existing KnownBits value. Split out from D74539. NFC.	2020-04-09 09:27:22 +01:00
Serge Pavlov	c7ff5b38f2	[FPEnv] Use single enum to represent rounding mode Currently the compiler defines 5 sets of constants to represent rounding mode. These are: 1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes defined by IEEE-754 and is used in the `APFloat` implementation. 2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of the 5 IEEE-754 rounding modes and a special value for dynamic rounding mode. It is used in the clang frontend. 3. `llvm::fp::RoundingMode`. Defines the same values as `clang::LangOptions::FPRoundingModeKind` but in a different order. It is used to specify rounding mode in IR and in functions that operate on IR. 4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7). Besides constants for rounding mode it also uses a special value to indicate error. It is convenient to use in intrinsic functions, as it is a platform-independent representation of rounding mode. In this role it is used in some pending patches. 5. Values like `FE_DOWNWARD` and others, which specify rounding mode in the library calls `fesetround` and `fegetround`. Often they represent bits of some control register, so they are target-dependent. The same names (not values) and a special name `FE_DYNAMIC` are used in `#pragma STDC FENV_ROUND`. The first 4 sets of constants are target independent and could have the same numerical representation. It would simplify conversion between the representations. Also, `clang::LangOptions::FPRoundingModeKind` and `llvm::fp::RoundingMode` currently do not contain the value for the IEEE-754 rounding direction `roundTiesToAway`, although it is supported natively on some targets. This change defines all the rounding mode types via one `llvm::RoundingMode`, which also contains the rounding mode for the IEEE rounding direction `roundTiesToAway`. Differential Revision: https://reviews.llvm.org/D77379	2020-04-09 13:26:47 +07:00
Pratyai Mazumder	e8d1c6529b	[SanitizerCoverage] sancov/inline-bool-flag instrumentation. Summary: New SanitizerCoverage feature `inline-bool-flag` which inserts an atomic store of `1` to a boolean (which is an 8bit integer in practice) flag on every instrumented edge. Implementation-wise it's very similar to the `inline-8bit-counters` feature, so much of the wiring and tests just follow the same pattern. Reviewers: kcc, vitalybuka Reviewed By: vitalybuka Subscribers: llvm-commits, hiraditya, jfb, cfe-commits, #sanitizers Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D77244	2020-04-08 22:43:52 -07:00
Vitaly Buka	8b1a6c0a57	[NFC][SanitizerCoverage] Simplify alignment calculation This reverts commit e42f2a0cd8b8007c816d0e63f5000c444e29105e.	2020-04-08 22:43:52 -07:00
WangTianQing	a3dc949000	[X86] Add TSXLDTRK instructions. Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77205	2020-04-09 13:17:29 +08:00
Johannes Doerfert	cb0ecc5c33	[CallGraphUpdater] Remove dead constants before replacing a function Dead constants might be left when a function is replaced; we can gracefully handle this case and avoid complexity for the users who would see an assertion otherwise.	2020-04-08 22:52:46 -05:00
Lang Hames	5877d6f5f4	[ORC] Make mangling convenience methods part of the public API of LLJIT. This saves clients from having to manually construct a MangleAndInterner.	2020-04-08 20:20:13 -07:00
Matt Arsenault	0aa0d70067	MIR: Use Register	2020-04-08 22:07:26 -04:00
Sam Clegg	7baad0c53c	[WebAssembly][MC] Use StringRef over std::string pointer This is a follow-up based on feedback on `5be42f36f5`. See: https://reviews.llvm.org/D77627. Differential Revision: https://reviews.llvm.org/D77674	2020-04-08 18:28:08 -07:00
Craig Topper	f3d3cec648	[InstCombine] Avoid a call to deprecated version of CreateCall. Passing a Value * to CreateCall has to call getPointerElementType to find the type of the pointer. In this case we can rely on the fact that Intrinsic::getDeclaration returns a Function * and use that version of CreateCall.	2020-04-08 17:41:16 -07:00
Johannes Doerfert	0985554b70	[Attributor][NFC] Split AbstractAttributes out of Attributor.cpp Attributor.cpp became quite big and we need to start providing structure. The Attributor code is now in Attributor.cpp and the classes derived from AbstractAttribute are in AttributorAttributes.cpp. Minor changes were required but no functional changes are intended. We also minimized includes as part of this. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76873	2020-04-08 19:02:14 -05:00
Christopher Tetreault	fe69eb1196	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: espindola, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77275	2020-04-08 16:29:36 -07:00
Christopher Tetreault	49fd24fe9e	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: hfinkel, efriedma, sdesmalen Reviewed By: efriedma Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77266	2020-04-08 16:10:55 -07:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Amara Emerson	befc788cfa	GlobalISel: Add a setInstrAndDebugLoc(MachineInstr&) convenience helper to MachineIRBuilder. NFC. This saves doing two separate calls to set the Instr and DebugLoc from an existing MI.	2020-04-08 14:38:33 -07:00
Matt Arsenault	e49e33b610	CodeGen: Use Register in MachineInstrBuilder	2020-04-08 17:03:53 -04:00
Kirill Naumov	8b67853a83	[CFGPrinter] Adding heat coloring to CFGPrinter This patch introduces heat coloring of the Control Flow Graph, based on the relative "hotness" of each BB. The patch is part of a sequence of three patches related to graph heat coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D77161	2020-04-08 19:59:51 +00:00
Matt Arsenault	c42cc7fd24	CodeGen: Use Register in MachineSSAUpdater	2020-04-08 14:29:01 -04:00
Artem Belevich	a9627b7ea7	[CUDA] Add partial support for recent CUDA versions. Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11. None of the new features of CUDA-10.2+ have been implemented yet, so using these versions will still produce a warning. Differential Revision: https://reviews.llvm.org/D77670	2020-04-08 11:19:44 -07:00
Vedant Kumar	48e65fc630	MachineFunction: Copy call site info when duplicating insts Summary: Preserve call site info for duplicated instructions. We copy over the call site info in CloneMachineInstrBundle to avoid repeated calls to copyCallSiteInfo in CloneMachineInstr. (Alternatively, we could copy call site info higher up the stack, e.g. into TargetInstrInfo::duplicate, or even into individual backend passes. However, I don't see how that would be safer or more general than the current approach.) Reviewers: aprantl, djtodoro, dstenb Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77685	2020-04-08 11:06:14 -07:00
Matt Arsenault	586769cce2	DAG: Use Register	2020-04-08 13:44:31 -04:00
Sean Fertile	d0b57b41f4	[PowerPC][AIX][NFC] Replace deprecated getByValAlign call. Replace call to deprecated 'getByValAlign()' with 'getNonZeroByValAlign()'.	2020-04-08 13:27:39 -04:00
Matt Arsenault	dcce3ef1d2	FastISel: Partially use Register Doesn't try to convert the cases that depend on generated code.	2020-04-08 12:10:58 -04:00
Matt Arsenault	7a46e36d51	CodeGen: Use Register more in CallLowering Some of these MCPhysReg uses should probably be MCRegister, but right now this would require more invasive changes.	2020-04-08 12:10:58 -04:00
Matt Arsenault	ca0ace7298	CodeGen: Use Register in MachineBasicBlock	2020-04-08 12:10:58 -04:00
Matt Arsenault	84aa58cbe2	CodeGen: Use Register in TargetLowering	2020-04-08 12:10:58 -04:00
Kirill Naumov	0125db9ab2	[TimePasses] Small fix in "-time-passes" flag that makes it more stable Adds StringMap for TimingData. Differential Revision: https://reviews.llvm.org/D76946 Reviewed By: fedor.sergeev	2020-04-08 15:59:45 +00:00
Sean Fertile	8abfd2c3bb	[PowerPC][AIX] Enable passing byval formal arguments in multiple registers. Any or all the argument registers can be used to pass a byval formal argument, with the limitation that the argument must fit in the available registers (ie: is not split between registers and stack). Differential Revision: https://reviews.llvm.org/D76902	2020-04-08 11:16:33 -04:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Alexey Lapshin	0ed2170dc4	[DWARFLinker][dsymutil] followup for `88c2137b6d` This patch is a follow-up for "Move DwarfStreamer into DWARFLinker". It fixes the build with LLVM_LINK_LLVM_DYLIB.	2020-04-08 16:46:52 +03:00
Stefan Pintilie	6c4b40def7	[PowerPC][Future] Add Support For Functions That Do Not Use A TOC. On PowerPC most functions require a valid TOC pointer. This is the case because either the function itself needs to use this pointer to access the TOC or because other functions that are called from that function expect a valid TOC pointer in the register R2. The main exception to this is leaf functions that do not access the TOC since they are guaranteed not to need a valid TOC pointer. This patch introduces a feature that will allow more functions to not require a valid TOC pointer in R2. Differential Revision: https://reviews.llvm.org/D73664	2020-04-08 08:07:35 -05:00
Sanjay Patel	a1c05fe20f	[InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold Based on the post-commit comments for rG0f56bbc, there might be a problem with this transform: (bitcast (fpext/fptrunc X) to iX) < 0 --> (bitcast X to iY) < 0 ...and the ppc_fp128 data type, so conservatively bypass if we are bitcasting a ppc_fp128. We might be able to account for endian or other differences to enable this for PowerPC again if that is useful. Differential Revision: https://reviews.llvm.org/D77642	2020-04-08 08:56:19 -04:00
Simon Pilgrim	66c18c729d	[X86][SSE] Combine PTEST(AND(X,Y),AND(X,Y)) -> PTEST(X,Y) and ANDN equivalents Tests derived from PR42035 examples	2020-04-08 12:42:22 +01:00
Jeremy Morse	c77887e4d1	[DebugInfo][NFC] Early-exit when analyzing for single-location variables This is a performance patch that hoists two conditions in DwarfDebug's validThroughout to avoid a linear scan of all instructions in a block. We now exit early if validThroughout will never return true for the variable location. The first added clause filters for the two circumstances where validThroughout will return true. The second added clause should be identical to the one that's deleted from after the linear scan. Differential Revision: https://reviews.llvm.org/D77639	2020-04-08 12:27:11 +01:00
Shengchen Kan	916044d819	[X86][MC] Support enhanced relaxation for branch align Summary: Since D75300 has landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxation" means we allow an instruction that could not traditionally be relaxed to be emitted into RelaxableFragment so that we can increase its length by adding prefixes for optimization. The motivation is straightforward: RelaxFragment is mostly for relative jumps, and we cannot increase the length of jumps when we need to align them, so if we want to achieve D75300's purpose (reducing the bytes of nops) when we need to align jumps, we have to make more instructions "relaxable". Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76286	2020-04-08 19:08:19 +08:00
Mikael Holmen	893df2032d	[IfConversion] Disallow TrueBB == FalseBB for valid diamonds Summary: This fixes PR45302. Previously the case BB1 / \ | | TBB FBB | | \ / BB2 was treated as a valid diamond even when TBB and FBB were the same basic block. This then led to a failed assertion in IfConvertDiamond. Since TBB == FBB is quite a degenerate case of a diamond, we no longer treat it as a valid diamond, and thus avoid the trouble of making IfConvertDiamond handle it correctly. Reviewers: efriedma, kparzysz Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77651	2020-04-08 12:50:36 +02:00
Anna Welker	89e1248d7b	[ARM][MVE] Optimise offset addresses of gathers/scatters This patch adds an analysis of the offset addresses used by gathers and scatters to the MVEGatherScatterLowering pass to find multiplications and additions that are loop invariant and thus can be moved into the loop preheader, so that they are not executed on every iteration. Differential Revision: https://reviews.llvm.org/D76681	2020-04-08 11:46:57 +01:00
Max Kazantsev	7adb9e06fd	[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM	2020-04-08 17:32:03 +07:00
Dominik Montada	35950fea8d	[GlobalISel] support narrow G_IMPLICIT_DEF for DstSize % NarrowSize != 0 Summary: When narrowing G_IMPLICIT_DEF where the original size is not a multiple of the narrow size, emit a smaller G_IMPLICIT_DEF and use G_ANYEXT. To prevent a potential endless loop in the legalizer, the condition to combine G_ANYEXT(G_IMPLICIT_DEF) is changed from isInstUnsupported to !isInstLegal, since in this case the combine is only valid if consequent legalization of the newly combined G_IMPLICIT_DEF does not introduce G_ANYEXT due to narrowing. Although this legalization for G_IMPLICIT_DEF would also be valid for the general case, it actually caused a lot of code regressions when tried due to superfluous COPYs and combines not getting hit anymore. Reviewers: dsanders, aemerson, volkan, arsenm, aditya_nandakumar Reviewed By: arsenm Subscribers: jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76598	2020-04-08 11:00:07 +02:00
Kazushi (Jam) Marukawa	aa034867f1	[VE] Simplify definitions of uimm6 and simm7 Summary: To prepare continuous changes, simplify uimm6 and simm7 operands. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D77700	2020-04-08 09:53:42 +02:00
Igor Kudrin	af11c556db	[DebugInfo] Fix reading DWARFv5 type units in DWP. In DWARFv5, type units are stored in .debug_info sections, along with compilation units, and they are distinguished by the unit_type field in the header, not by the name of the section. It is impossible to associate the correct index section of a DWP file with the unit before the unit's header is read. This patch fixes reading DWARFv5 type units by parsing the header first and then applying the index entry according to the actual unit type. Differential Revision: https://reviews.llvm.org/D77552	2020-04-08 12:50:58 +07:00
Stanislav Mekhanoshin	f96810ff34	[AMDGPU] Expand vector trunc stores from i16 to i8 Differential Revision: https://reviews.llvm.org/D77693	2020-04-07 21:47:45 -07:00
Johannes Doerfert	a19eb1de72	[OpenMP] Add match_{all,any,none} declare variant selector extensions. By default, all traits in the OpenMP context selector have to match for it to be acceptable. Though, we sometimes want a single property out of multiple to match (=any) or no match at all (=none). We offer these choices as extensions via `implementation={extension(match_{all,any,none})}` to the user. The choice will affect the entire context selector not only the traits following the match property. The first user will be D75788. There we can replace ``` #pragma omp begin declare variant match(device={arch(nvptx64)}) #define __CUDA__ #include <__clang_cuda_cmath.h> // TODO: Hack until we support an extension to the match clause that allows "or". #undef __CLANG_CUDA_CMATH_H__ #undef __CUDA__ #pragma omp end declare variant #pragma omp begin declare variant match(device={arch(nvptx)}) #define __CUDA__ #include <__clang_cuda_cmath.h> #undef __CUDA__ #pragma omp end declare variant ``` with the much simpler ``` #pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)}) #define __CUDA__ #include <__clang_cuda_cmath.h> #undef __CUDA__ #pragma omp end declare variant ``` Reviewed By: mikerice Differential Revision: https://reviews.llvm.org/D77414	2020-04-07 23:33:24 -05:00
Kazu Hirata	91eb442fde	[JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl Summary: ComputeValueKnownInPredecessorsImpl is the main folding mechanism in JumpThreading.cpp. To avoid potential infinite recursion while chasing use-def chains, it uses: DenseSet<std::pair<Value *, BasicBlock *>> &RecursionSet to keep track of Value-BB pairs that we've processed. Now, when ComputeValueKnownInPredecessorsImpl recursively calls itself, it always passes BB as is, so the second element is always BB. This patch simplifies the function by dropping "BasicBlock *" from RecursionSet. Reviewers: wmi, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77699	2020-04-07 18:37:36 -07:00
Eli Friedman	565b56a72c	[NFC] Clean up uses of LoadInst constructor.	2020-04-07 16:28:53 -07:00
Daniel Sanders	1adeeabb79	Add MIR-level debugify with only locations support for now Summary: Re-used the IR-level debugify for the most part. The MIR-level code then adds locations to the MachineInstrs afterwards based on the LLVM-IR debug info. It's worth mentioning that the resulting locations make little sense as the range of line numbers used in a Function at the MIR level exceeds that of the equivalent IR level function. As such, MachineInstrs can appear to originate from outside the subprogram scope (and from other subprogram scopes). However, it doesn't seem worth worrying about as the source is imaginary anyway. There are a few high level goals this pass works towards: * We should be able to debugify our .ll/.mir in the lit tests without changing the checks and still pass them. I.e. Debug info should not change codegen. Combining this with a strip-debug pass should enable this. The main issue I ran into without the strip-debug pass was instructions with MMO's and checks on both the instruction and the MMO as the debug-location is between them. I currently have a simple hack in the MIRPrinter to resolve that but the more general solution is a proper strip-debug pass. * We should be able to test that GlobalISel does not lose debug info. I recently found that the legalizer can be unexpectedly lossy in seemingly simple cases (e.g. expanding one instr into many). I have a verifier (will be posted separately) that can be integrated with passes that use the observer interface and will catch location loss (it does not verify correctness, just that there's zero lossage). It is a little conservative as the line-0 locations that arise from conflicts do not track the conflicting locations but it can still catch a fair bit. Depends on D77439, D77438 Reviewers: aprantl, bogner, vsk Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77446	2020-04-07 16:25:13 -07:00
Fangrui Song	624654fd64	[VE] Migrate to the getMachineMemOperand overload using llvm::Align Just delete the deprecated overload because nothing uses it.	2020-04-07 16:04:54 -07:00
Matt Arsenault	6011627f51	CodeGen: More conversions to use Register	2020-04-07 18:54:36 -04:00
Fangrui Song	d2ef8c1f2c	[ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker() dso_local leads to direct access even if the definition is not within this compilation unit (it is still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link. If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no direct access will be generated. The current behavior is benign, because -fpic does not assume dso_local (clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal). If we do that for -fno-semantic-interposition (D73865), there will be an R_X86_64_PC32 linker error without this patch. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74751	2020-04-07 15:46:01 -07:00
Fangrui Song	2f8fb4d1cd	[VE] Adapt `aa26dd9858` and `2481f26ac3`	2020-04-07 15:45:19 -07:00
Wei Mi	b49eac71ad	Recommit [SampleFDO] Add flag for partial profile. Fix the error of show-prof-info.test on some platforms without zlib. The common profile usage is to collect a profile from a target and then use the profile to guide the optimized build for the same target. There are some cases where no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile apart from the full profile, so the compiler can use a different strategy for each. Differential Revision: https://reviews.llvm.org/D77426	2020-04-07 14:28:25 -07:00
Stanislav Mekhanoshin	96e51ed005	[AMDGPU] Implement copyPhysReg for 16 bit subregs Differential Revision: https://reviews.llvm.org/D74937	2020-04-07 14:22:46 -07:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00
Nikita Popov	fe8abbf442	[BPI] Clear handles when releasing memory (NFC) This reduces max-rss of sqlite compilation by 2.5%.	2020-04-07 22:51:01 +02:00
Matt Arsenault	aa26dd9858	CodeGen: Use Register in more places	2020-04-07 15:59:40 -04:00
Wei Mi	c5da949ae8	Revert "[SampleFDO] Add flag for partial profile." show-prof-info.test breaks on some platforms. This reverts commit `e3ba652a14`.	2020-04-07 12:54:51 -07:00
Wei Mi	e3ba652a14	[SampleFDO] Add flag for partial profile. The common profile usage is to collect a profile from a target and then use the profile to guide the optimized build for the same target. There are some cases where no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile apart from the full profile, so the compiler can use a different strategy for each. Differential Revision: https://reviews.llvm.org/D77426	2020-04-07 12:17:56 -07:00
Nemanja Ivanovic	ecd8435483	[NFC][PowerPC] Fix register class for patterns using XXPERMDIs There are a few patterns where we use a superclass for inputs to this instruction rather than the correct class. This can sometimes lead to unnecessary copies.	2020-04-07 14:06:08 -05:00
Graham Sellers	a19a56f6a1	[AMDGPU] Extend constant folding for logical operations This patch extends existing constant folding in logical operations to handle S_XNOR, S_NAND, S_NOR, S_ANDN2, S_ORN2, V_LSHL_ADD_U32 and V_AND_OR_B32. Also added a couple of tests for existing folds.	2020-04-07 14:37:16 -04:00
Craig Topper	c41685b16f	[SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector. This removes a call to getScalarType from a bunch of call sites. It also makes the behavior consistent with SIGN_EXTEND_INREG. Differential Revision: https://reviews.llvm.org/D77631	2020-04-07 11:34:08 -07:00
Alexey Lapshin	88c2137b6d	[DWARFLinker][dsymutil][NFC] Move DwarfStreamer into DWARFLinker. For implementing "remove obsolete debug info in lld", it is necessary to have a DWARF generation code implementation. dsymutil uses DwarfStreamer for that purpose. DwarfStreamer uses AsmPrinter. It is considered OK to use AsmPrinter-based code in lld (D74169). This patch moves the DwarfStreamer implementation into DWARFLinker, so that it can be reused from lld. Generally, a better place for such common DWARF generation code would be not DWARFLinker but an additional separate library. Such a library could contain a single version of the DWARF generation routines and could also be independent of AsmPrinter. At the moment, DwarfStreamer does not pretend to be such a general implementation of DWARF generation. So I decided to put it into DWARFLinker since it is the only user of DwarfStreamer. Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewed By: JDevlieghere Differential revision: https://reviews.llvm.org/D77169	2020-04-07 21:21:54 +03:00
Eli Friedman	e9ac757f79	[AArch64] Don't expand memcmp in strict align mode. `7aecf232` fixed the bug where we would miscompile, but we still generate a crazy amount of code. Turn off the expansion until someone implements an appropriate heuristic. Differential Revision: https://reviews.llvm.org/D77599	2020-04-07 10:53:36 -07:00
Matt Arsenault	f596ab4066	AMDGPU: Use early return	2020-04-07 13:48:00 -04:00
Sam Clegg	5be42f36f5	[WebAssembly][MC] Fix leak of std::string members in MCSymbolWasm Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=45452 Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77627	2020-04-07 10:38:43 -07:00
Stanislav Mekhanoshin	12a324393d	[AMDGPU] Limit endcf-collapse to simple if We can only collapse adjacent SI_END_CF if the outer statement belongs to a simple SI_IF; otherwise the correct mask is not in the register we expect, but is an argument of an S_XOR instruction. Even if SI_IF is simple it might be lowered using S_XOR because lowering is dependent on the basic block layout. It is not considered simple if the instruction consuming its output is not an SI_END_CF. Since that SI_END_CF might have already been lowered to an S_OR, the isSimpleIf() check may return false. This situation is an opportunity for a further optimization of SI_IF lowering, but that is a separate optimization. In the meantime, move the SI_END_CF collapse after the lowering, when we already know how the rest of the CFG was lowered, since a non-simple SI_IF case still needs to be handled. Differential Revision: https://reviews.llvm.org/D77610	2020-04-07 10:27:23 -07:00
Matt Arsenault	b281138a1b	DAG: Use the correct getPointerTy in a few places These should not be assuming address space 0. Calling getPointerTy is generally the wrong thing to do, since you should already know the type from the incoming IR.	2020-04-07 12:45:41 -04:00
Nikita Popov	259649a519	[RDA] Avoid full reprocessing of blocks in loops (NFCI) RDA sometimes needs to visit blocks twice, to take into account reaching defs coming in along loop back edges. Currently it handles repeated visitation the same way as usual, which means that it will scan through all instructions and their reg unit defs again. Not only is this very inefficient, it also means that all reaching defs in loops are going to be inserted twice. We can do much better than this. The only thing we need to handle is a new reaching def from a predecessor, which either needs to be prepended to the reaching definitions (if there was no reaching def from a predecessor), or needs to replace an existing predecessor reaching def, if it is more recent. Since D77508 we only store the most recent predecessor reaching def, so that's the only one that may need updating. This also has the nice side-effect that reaching definitions are now automatically sorted and unique, so drop the llvm::sort() call in favor of an assertion. Differential Revision: https://reviews.llvm.org/D77511	2020-04-07 17:55:37 +02:00
Nikita Popov	76e987b372	[RDA] Don't pass down TraversedMBB (NFC) Only pass the MachineBasicBlock itself down to helper methods, they don't need to know about traversal. Move the debug print into the main method.	2020-04-07 17:53:04 +02:00
Nikita Popov	361c29d7ba	[RDA] Avoid inserting duplicate reaching defs (NFCI) An instruction may define the same reg unit multiple times, avoid inserting the same reaching def multiple times in that case. Also print the reg unit, rather than the super-register, in the debug code.	2020-04-07 17:50:38 +02:00
David Tenty	b9245f14b7	[NFC][PowerPC] Cleanup 64-bit and Darwin CalleeSavedRegs Summary: - Remove the no longer used Darwin CalleeSavedRegs - Combine the SVR464 callee saved regs and AIX64 since the two are (and should be) identical into PPC64 - Update tests for 64-bit CSR change Reviewers: sfertile, ZarkoCA, cebowleratibm, jasonliu, #powerpc Reviewed By: sfertile Subscribers: wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77235	2020-04-07 11:49:10 -04:00
Simon Pilgrim	e3b6059776	[X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443) Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets). If we have a full-zeroable vector then we should just return a zero version of the root type, otherwise if the type isn't valid we should bail. Fixes PR45443	2020-04-07 14:45:29 +01:00
Keith Walker	01dc10774e	[ARM] unwinding .pad instructions missing in execute-only prologue If the stack pointer is altered for local variables and we are generating Thumb2 execute-only code the .pad directive is missing. Usually the size of the adjustment is stored in a PC-relative location and loaded into a register which is then added to the stack pointer. However when we are generating execute-only code the size of the adjustment is instead generated using the MOVW/MOVT instruction pair. As a by-product of handling the execute-only case this also fixes an existing issue that in the non-execute-only case the .pad directive was generated against the load of the constant to a register instruction, instead of the instruction which adds the register to the stack pointer. Differential Revision: https://reviews.llvm.org/D76849	2020-04-07 11:51:59 +01:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% 
test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00
Serguei Katkov	b7e3759e17	[DAG] Consolidate require spill slot logic in lambda. NFC. Move the logic whether lowering of deopt value requires a spill slot in a separate lambda. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77629	2020-04-07 16:43:47 +07:00
Peter Smith	14c1e98754	[ARM] Remove condition that could never be true From Arm v8 Architecture Reference Manual F5.1.84 LDREXD The ldrexd instruction in Arm state has the following conditions: t = UInt(Rt); t2 = t + 1; n = UInt(Rn); if Rt<0> == '1' \|\| t2 == 15 \|\| n == 15 then UNPREDICTABLE; That is, when Rt is odd or if Rt is 14 (making t2 15). In the implementation when the pair is the UNPREDICTABLE R14_R15 we would ideally return SOFT_FAIL. We can't because there is no R14_R15 value for us to return so we fail early returning FAIL. The early return for registers outside the bounds of the table means the check for Rt == 14 (0xE) is redundant, which causes a static analyzer to flag the condition as never being true. To fix the warning I've removed the check and replaced it with a comment explaining the difference with the specification. Fixes pr41660 Differential Revision: https://reviews.llvm.org/D77463	2020-04-07 09:50:56 +01:00
Simon Tatham	aab9e9de4d	[Support,Windows] Tolerate failure of CryptGenRandom Summary: In `Unix/Process.inc`, we seed a random number generator from `/dev/urandom` if possible, but if not, we're happy to fall back to ordinary pseudorandom strategies, like the current time and PID. The corresponding function on Windows calls `CryptGenRandom`, but it //doesn't// have a fallback if that strategy fails. But `CryptGenRandom` //can// fail, if a cryptography provider isn't properly initialized, or occasionally (by our observation) simply intermittently. If it's reasonable on Unix to implement traditional pseudorandom-number seeding as a fallback, then it's surely reasonable to do the same on Windows. So this patch adds a last-ditch use of ordinary rand(), using much the same strategy as the Unix fallback code. Reviewers: hans, sammccall Reviewed By: hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77553	2020-04-07 09:18:12 +01:00
Pierre-vh	4fc59a468f	Revert "[CodeGen][SelectionDAG] Flip Booleans More Often" This reverts commit `23342bdcc8`.	2020-04-07 09:09:10 +01:00
Pierre-vh	23342bdcc8	[CodeGen][SelectionDAG] Flip Booleans More Often Differential Revision: https://reviews.llvm.org/D77201	2020-04-07 08:19:57 +01:00
Sam Clegg	f0bbf3d086	[WebAssembly] EmscriptenEHSjLj: Mark more functions as imported These should have been part of https://reviews.llvm.org/D77192 Differential Revision: https://reviews.llvm.org/D77358	2020-04-06 21:27:31 -07:00
Xiang1 Zhang	01a32f2bd3	Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology) Do not commit llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it causes build bot failures (it is not suitable for 32-bit Windows targets). Summary: This patch comes from H.J.'s `2bd54ce7fa`. It fixes the LLVM unit tests that fail when run on a CET machine (e.g. ExecutionEngine/MCJIT/MCJITTests). The main reason we enable IBT when the JIT is compiled with CET is that the JIT does not know whether its caller program is CET-enabled. If the JIT's caller program is non-CET, it does not matter whether the JIT generates CET code. But if the JIT's caller program is CET-enabled, the JIT must generate CET code or it will cause control protection exceptions. I have tested the patch with llvm-unit-test and llvm-test-suite on a CET machine, and it passed. H.J. also tested it by building and running VNCserver (Virtual Network Console), and it works too. (Without this patch, VNCserver crashes on a CET machine.) Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei Reviewed By: LuoYuanke Subscribers: tstellar, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76900	2020-04-07 09:48:47 +08:00
Jun Ma	46bff786bc	[Coroutines] Remove alignment check in shouldBeMustTail Differential Revision: https://reviews.llvm.org/D77362	2020-04-07 09:07:34 +08:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type hierarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Davide Italiano	8115e08b05	[MachineCSE] Don't carry the wrong location when hoisting PR: 45425 <rdar://problem/61359768> Differential Revision: https://reviews.llvm.org/D77604	2020-04-06 16:36:22 -07:00
Daniel Sanders	f27cea721e	Add way to omit debug-location from MIR output Summary: In lieu of a proper pass that strips debug info, add a way to omit debug-locations from the MIR output so that instructions with MMO's continue to match CHECK's when mir-debugify is used Reviewers: aprantl, bogner, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77575	2020-04-06 16:22:01 -07:00
Nick Desaulniers	41ba80182c	[CallSite Removal] a CallBase is never an IndirectCall for isInlineAsm Summary: Thanks to Bill Wendling (void) for the report and steps to reproduce. It looks like this was missed during r350508's cleanup of the CallSite split into CallBase, CallInst, and CallBrInst. This was exposed by running pgo on a callbr, which was creating a ptrtoint to the inline asm thinking it was an indirect call. The relevant callchain looks like: IndirectCallPromotionPlugin::run() -> PGOIndirectCallVisitor::findIndirectCalls() -> PGOIndirectCallVisitor::visitCallBase() -> CallBase::isIndirectCall() Reviewers: void, chandlerc Reviewed By: void Subscribers: hiraditya, llvm-commits, craig.topper, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D77600	2020-04-06 16:14:46 -07:00
Vedant Kumar	5f185a8999	[AddressSanitizer] Fix for wrong argument values appearing in backtraces Summary: In some cases, ASan may insert instrumentation before function arguments have been stored into their allocas. This causes two issues: 1) The argument value must be spilled until it can be stored into the reserved alloca, wasting a stack slot. 2) Until the store occurs in a later basic block, the debug location will point to the wrong frame offset, and backtraces will show an uninitialized value. The proposed solution is to move instructions which initialize allocas for arguments up into the entry block, before the position where ASan starts inserting its instrumentation. For the motivating test case, before the patch we see: ``` \| 0033: movq %rdi, 0x68(%rbx) \| \| DW_TAG_formal_parameter \| \| ... \| \| DW_AT_name ("a") \| \| 00d1: movq 0x68(%rbx), %rsi \| \| DW_AT_location (RBX+0x90) \| \| 00d5: movq %rsi, 0x90(%rbx) \| \| ^ not correct ... \| ``` and after the patch we see: ``` \| 002f: movq %rdi, 0x70(%rbx) \| \| DW_TAG_formal_parameter \| \| \| \| DW_AT_name ("a") \| \| \| \| DW_AT_location (RBX+0x70) \| ``` rdar://61122691 Reviewers: aprantl, eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77182	2020-04-06 15:59:25 -07:00
Daniel Sanders	35b7b0851b	Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify) Summary: To debugify MIR, we need to be able to create metadata and to do that, we need a non-const Module. However, MachineFunction only had a const reference to the Function preventing this. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77439	2020-04-06 15:19:21 -07:00
Daniel Sanders	15f7bc7857	Add option to limit Debugify to locations (omitting variables) Summary: It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE around. In particular, because DEBUG_VALUE has the potential to change CodeGen behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally don't. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77438	2020-04-06 15:04:55 -07:00
David Blaikie	5aead592f0	X86ISelLowering: Minor refactor to avoid redundant initialization while ensuring compiler warnings can hopefully still prove initialization Based on post-commit review/discussion in fabe52a7412b	2020-04-06 14:25:52 -07:00
Konstantin Pyzhov	72e8754916	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 09:05:58 -04:00
Leonard Chan	a0222ac1f9	[AsmPrinter] Do not define local aliases for global objects in a comdat A global symbol that is defined in a comdat should not generate an alias since call sites that would've referred to that symbol will refer to their own independent local aliases rather than the surviving global comdat one. This could result in something that looks like: ``` ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o) >>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o) >>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169) >>> minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a ``` We ran into this when experimenting with a new C++ ABI for fuchsia (refer to D72959) which takes relative offsets between comdat'd functions which is why the normal C++ user wouldn't run into this. Differential Revision: https://reviews.llvm.org/D77429	2020-04-06 13:48:05 -07:00
Nick Desaulniers	5bc291be71	[SelectionDAG] fix predecessor list for INLINEASM_BRs' parent Summary: A bug report mentioned that LLVM was producing jumps off the end of a function when using "asm goto with outputs". Further digging pointed to MachineBasicBlocks that had their address taken and were indirect targets of INLINEASM_BR being removed by BranchFolder, because their predecessor list was empty, so they appeared to have no entry. This was a cascading failure caused earlier, during Pre-RA instruction scheduling. We have a few special cases in Pre-RA instruction scheduling where we split a MachineBasicBlock in two. This requires careful handling of predecessor and successor lists for a MachineBasicBlock that was split, and careful handling of PHI MachineInstrs that referred to the MachineBasicBlock before it was split. The clue that led to this fix was the observation that many callers of MachineBasicBlock::splice() frequently call MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI nodes after a splice. We don't want to reuse that method, as we have custom successor transferring logic for this block split. This patch fixes 2 pre-existing bugs, and adds tests. The first bug was that MachineBasicBlock::splice() correctly handles updating most successors and predecessors; we don't need to do anything more than removing the previous fallthrough block from the first half of the split block post splice. Previously, we were updating the successor list incorrectly (updating successors updates predecessors). The second bug was that PHI nodes that needed registers from the first half of the split block were not having entries populated. The register live out information was correct, and the FuncInfo->PHINodesToUpdate was correct. 
Specifically, the check in SelectionDAGISel::FinishBasicBlock: for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) { MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first); if (!FuncInfo->MBB->isSuccessor(PHI->getParent())) continue; PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB); was `continue`ing because FuncInfo->MBB tracks the second half of the post-split block; no one was updating PHI entries for the first half of the post-split block. SelectionDAGBuilder::UpdateSplitBlock() already expects to perform special handling for MachineBasicBlocks that were split post calls to ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the current MachineBasicBlock being processed), and perform special handling for this in SelectionDAGBuilder::UpdateSplitBlock(). Reviewers: void, craig.topper, efriedma Reviewed By: void, efriedma Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D76961	2020-04-06 13:46:39 -07:00
Matt Arsenault	869f05c834	AMDGPU: Remove dead paths for requiresUniformRegister The extracts from control flow intrinsics are already properly handled by divergence analysis. The inline asm case isn't dead, but has also never really worked correctly so leave it as-is for now.	2020-04-06 16:15:10 -04:00
Francesco Petrogalli	53b7abdd23	[llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`. Reviewers: sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77317	2020-04-06 19:46:11 +01:00
Masoud Ataei jaliseh	9ed0612cca	Add InjectTLIMappings pass to new pass manager This pass was created in `d6de5f12d4` and tested for the new and legacy pass managers but never added to the new pass manager pipeline. I am adding it to the new pass manager pipeline. This pass is used by the Vector Function Database (VFDatabase), and without it in the new pass manager pipeline, none of the vector libraries work with the new pass manager. Related passes: `66c120f025` https://reviews.llvm.org/D74944 Differential revision: https://reviews.llvm.org/D75354	2020-04-06 13:16:48 -05:00
Craig Topper	07ed1fb597	[SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types. The previous code used the type of the first field for the VT passed to getNode for every field. I've based the implementation here off what is done in visitSelect as it removes the need to special case aggregates. Differential Revision: https://reviews.llvm.org/D77093	2020-04-06 11:03:40 -07:00
Konstantin Pyzhov	51dc028314	Revert `e1730cfeb3`	2020-04-06 05:56:11 -04:00
Kirill Naumov	3f995ce8b5	[CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo The patch introduces a distinct structure to store the information needed for the Control Flow Graph, as well as the infrastructure needed for the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo. The patch is part of a sequence of three patches related to graph heat coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D76820	2020-04-06 17:42:54 +00:00
Konstantin Pyzhov	e1730cfeb3	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 05:10:37 -04:00
Fangrui Song	a5d375e0cb	[AArch64] Allow logical immediates to have all-1 in top bits So that constant expressions like the following are permitted: and w0, w0, #~(0xfe<<24) and w1, w1, #~(0xff<<24) The behavior matches GNU as (opcodes/aarch64-opc.c:aarch64_logical_immediate_p). Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D75885	2020-04-06 09:56:04 -07:00
Florian Hahn	7aba6a0333	[LV] Fix value that could be read uninitialized. This should fix http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569	2020-04-06 17:54:50 +01:00
Nikita Popov	e8b83f7ddc	[RDA] Only store most recent reaching def from predecessors (NFCI) When entering a basic block, RDA inserts reaching definitions coming from predecessor blocks (which will be negative numbers) in a rather peculiar way. If you have incoming reaching definitions -4, -3, -2, -1, it will insert those. If you have incoming reaching definitions -1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken at each step. That's probably not what was intended... However, RDA only actually cares about the most recent reaching definition from a predecessor (to calculate clearance), so this ends up working fine as far as behavior is concerned. It does waste memory on unnecessary reaching definitions though. This patch changes the implementation to first compute the most recent reaching definition in one loop, and then insert only that one in a separate loop. Differential Revision: https://reviews.llvm.org/D77508	2020-04-06 18:39:09 +02:00
Nikita Popov	8d75df1438	[RDA] Don't adjust ReachingDefDefaultVal (NFCI) At the end of a basic block, RDA adjusts all the reaching defs it found to be relative to the end of the basic block, rather than the start of it. However, it also does this to registers which don't have a reaching def, indicated by ReachingDefDefaultVal. This means that code checking against ReachingDefDefaultVal will not skip them, and may insert them into the reaching definition list. This is ultimately harmless, but causes unnecessary work and is logically not right. Differential Revision: https://reviews.llvm.org/D77506	2020-04-06 18:36:29 +02:00
Sanjay Patel	fbb1b43f13	[ValueTracking] enhance matching of umin/umax with 'not' operands The cmyk test is based on the known regression that resulted from: rGf2fbdf76d8d0 This improves on the equivalent signed min/max change: rG867f0c3c4d8c The underlying icmp equivalence is: ~X pred ~Y --> Y pred X For an icmp with constant, canonicalization results in a swapped pred: ~X < C --> X > ~C	2020-04-06 11:51:59 -04:00
Matt Arsenault	8a5f0dafd4	AMDGPU/GlobalISel: Select llvm.amdgcn.div.scale	2020-04-06 11:50:19 -04:00
Matt Arsenault	e87ec66762	AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll	2020-04-06 11:50:16 -04:00
Jay Foad	ddd2f4b96f	[AMDGPU] Fix inaccurate comments	2020-04-06 16:44:08 +01:00
Florian Hahn	90be3c24a7	[VPlan] Introduce new VPWidenCallRecipe (NFC). This patch moves calls to their own recipe, to simplify the transition to VPUser for operands of VPWidenRecipe, as discussed in D76992. Subsequently additional information can be added to the recipe rather than computing it during the execute step. Reviewers: rengolin, Ayal, gilr, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77467	2020-04-06 16:07:37 +01:00
Chris Bowler	d6ea82d11c	[AIX][PPC] Implement by-val caller arguments in multiple registers Differential Revision: https://reviews.llvm.org/D76380	2020-04-06 11:06:51 -04:00
Guillaume Chatelet	808286342a	[Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set. Summary: In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined. This allows us to work backward here and infer that `getNewAlignment` does not need to return `0` in case of failure. Returning `1` also works since it needs to be greater than the Load/Store alignment, which is at least `1`. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77538	2020-04-06 14:54:57 +00:00
diggerlin	a26a441b99	[llvm-objdump][XCOFF] Use symbol index+symbol name + storage mapping class as label for -D SUMMARY: For the llvm-objdump -D, the symbol name is used as a label in the disassembly for the specific address (when a symbol address is equal to the virtual address in the dump). In XCOFF, multiple symbols may have the same name, being differentiated by their storage mapping class. It is helpful to print the QualName and not just the name when forming the output label for a csect symbol. The symbol index further removes any ambiguity caused by duplicate names. To maintain compatibility with the binutils objdump, the XCOFF-specific --symbol-description option is added to enable the enhanced format. Reviewers: hubert.reinterpretcast, James Henderson, Jason Liu ,daltenty Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D72973	2020-04-06 10:10:10 -04:00
Benjamin Kramer	880ec421dd	[MC] Use a byte_swap in emitIntValue instead of doing it in a loop. NFCI.	2020-04-06 15:51:24 +02:00
Florian Hahn	6babae74c7	[Matrix] Update load/storeMatrix to take indices as Value* (NFC). This allows the functions to be used with loop-dependent indices.	2020-04-06 14:48:48 +01:00
Matt Arsenault	cbf719b568	AMDGPU: Use DAG patterns for div_fmas	2020-04-06 09:28:30 -04:00
Matt Arsenault	79b29d6df7	AMDGPU: Remove DisableInst feature I'm not sure why these were bothering to check the instruction profile, since those profiles should only be used with these instruction classes.	2020-04-06 09:27:44 -04:00
Matt Arsenault	70726cec5b	DAG: Combine extract_vector_elt of concat_vectors Fixes extra canonicalize regressions when legalizing vector fminnum/fmaxnum.	2020-04-06 09:26:29 -04:00
Hans Wennborg	64c2312750	Revert `43f031d312` "Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)" ExecutionEngine/MCJIT/cet-code-model-lager.ll is failing on 32-bit windows, see llvm-commits thread for `fef2dab`. This reverts commit `43f031d312` and the follow-ups `fef2dab100` and `6a800f6f62`.	2020-04-06 15:05:25 +02:00
Sourabh Singh Tomar	5d7e9adce2	[DWARF5] Added support for emission of debug_macro section. Summary: This patch adds support for emission of following DWARFv5 macro forms in .debug_macro section. 1. DW_MACRO_start_file 2. DW_MACRO_end_file 3. DW_MACRO_define_strp 4. DW_MACRO_undef_strp. Reviewed By: dblaikie, ikudrin Differential Revision: https://reviews.llvm.org/D72828	2020-04-06 17:45:10 +05:30
Pavel Labath	9154a6398e	[llvm/Support] Make more DataExtractor methods error-aware Summary: This patch adds the optional Error argument, and the Cursor variants to more DataExtractor methods. The functions now behave the same way as other error-aware functions (they set the error when they fail, and don't do anything if the error is already set). I have merged the LEB128 implementations via a template (similarly to how fixed-size functions are handled) to reduce code duplication. Depends on D77304. Reviewers: dblaikie, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77306	2020-04-06 14:14:11 +02:00
Pavel Labath	a16fffa3f6	[Support] Make DataExtractor string functions error-aware Summary: This patch adds an optional Error argument to DataExtractor functions for string extraction, and makes them behave like other DataExtractor functions (set the error if extraction fails, don't do anything if the error is already set). I have merged the StringRef and C string versions of the functions to reduce code duplication. Reviewers: dblaikie, MaskRay Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77307	2020-04-06 14:14:11 +02:00
Guillaume Chatelet	ff858d7781	[Alignment][NFC] Add DebugStr and operator* Summary: This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately) Differences from D77394: - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)` - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll) - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum) Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77537	2020-04-06 12:09:45 +00:00
Guillaume Chatelet	39cfba9e33	[Alignment][NFC] Remove deprecated functions introduced in 10.0.0 Summary: 24 March 2020: LLVM 10.0.0 is out. I gathered all deprecated functions introduced between 9 and 10 and cleaned them up so they will be removed from 11. > git log -p -S LLVM_ATTRIBUTE_DEPRECATED llvmorg-9.0.0..llvmorg-10.0.0 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77409	2020-04-06 12:07:18 +00:00
Simon Pilgrim	9bc5b1a489	[X86][SSE] combineVectorSignBitsTruncation - remove minimum vector length limitations truncateVectorWithPACK has its own vector length controls, so we can rely on those directly. This helps some existing truncation to subvector tests, which were being combined later during shuffle lowering at which point the sign/zero bit detection had become obscured preventing lowerShuffleWithPACK working as well as it could.	2020-04-06 12:45:23 +01:00
Benjamin Kramer	232eff55f6	[LTO] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:23:27 +02:00
Benjamin Kramer	e64e516790	[RuntimeDyld] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:22:53 +02:00
Benjamin Kramer	9a9bc23672	[llvm-bcanalyzer] Simplify code. NFCI.	2020-04-06 12:50:50 +02:00
Kazushi (Jam) Marukawa	e981a46a77	[VE] Update lea/load/store instructions Summary: Modify lea/load/store instructions to accept the `disp(index, base)` style addressing mode (called ASX format). Also, make the DAG nodes for these ASX format instructions uniformly take 3 operands, and update the selectADDR functions to lower to the appropriate MI. Reviewers: arsenm, simoll, k-ishizaka Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D76822	2020-04-06 11:49:46 +02:00
Oliver Stannard	a294d9eb21	Revert "[IPRA][ARM] Spill extra registers at -Oz" Reverting because this is causing failures on bots with expensive checks enabled. This reverts commit `73cea83a6f`.	2020-04-06 10:34:59 +01:00
Kerry McLaughlin	944e322f88	[AArch64][SVE] Add SVE intrinsics for saturating add & subtract Summary: Adds the following intrinsics: - @llvm.aarch64.sve.[s\|u]qadd.x - @llvm.aarch64.sve.[s\|u]qsub.x Reviewers: sdesmalen, c-rhodes, dancgr, efriedma, cameron.mcinally, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77054	2020-04-06 10:07:08 +01:00
Florian Hahn	39f2d9aa81	[Matrix] Add option to use row-major matrix layout as default. This patch adds a -matrix-default-layout option which can be used to set the default matrix layout to row-major or column-major (default). The initial patch updates codegen for loads, stores, binary operators and matrix multiply. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76325	2020-04-06 10:00:56 +01:00
Florian Hahn	d1fed7081d	[Matrix] Add initial tiling for load/multiply/store chains. This patch adds initial fusion for load/multiply/store chains of matrix operations. The patch contains roughly two parts: 1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused). First, we ensure that both loads of the multiply operands do not alias the store. If they do, we create new non-aliasing copies of the operands. Note that this may introduce new basic blocks. Finally, we process TileSize x TileSize blocks. That is: load tiles from the input operands, multiply and store them. 2. Identify fusion candidates & matrix instructions. As a first step, collect all instructions with shape info and fusion candidates (currently @llvm.matrix.multiply calls). Next, try to fuse candidates and collect instructions eliminated by fusion. Finally, iterate over all matrix instructions, skip the ones eliminated by fusion and lower the rest as usual. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75566	2020-04-06 09:28:15 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Igor Kudrin	35819ff3cf	[DebugInfo] Fix reading range lists of v5 units in DWP. In package files, the base offset provided by index sections should be used to find the contribution of a unit. The patch adds that base offset when reading range list tables. Differential revision: https://reviews.llvm.org/D77401	2020-04-06 13:28:06 +07:00
Igor Kudrin	a93b77b97f	[DebugInfo] Fix reading location tables headers of v5 units in DWP. This fixes the reading of location lists headers for compilation units in package files by adjusting the reading offset according to the corresponding record in the unit index. This is required for DW_FORM_loclistx to work. Differential revision: https://reviews.llvm.org/D77146	2020-04-06 13:28:06 +07:00
Igor Kudrin	49737df767	[DebugInfo] Fix reading location tables of v5 units in DWP. Without the patch, all version 5 compile units in a DWP file read location tables from the beginning of a .debug_loclists.dwo section. The patch fixes that by adjusting the reading offset the same way as for pre-v5 units. The section identifier to find the contribution entry corresponds to the version of the unit. Differential revision: https://reviews.llvm.org/D77145	2020-04-06 13:28:06 +07:00
Igor Kudrin	714324b79a	[DebugInfo] Support DWARFv5 index sections. DWARFv5 defines index sections in package files in a slightly different way than the pre-standard GNU proposal, see Section 7.3.5 in the DWARF standard and https://gcc.gnu.org/wiki/DebugFissionDWP for GNU proposal. The main concern here is values for section identifiers, which are partially overlapped with changed meanings. The patch adds support for v5 index sections and resolves that difficulty by defining a set of identifiers for internal use which can represent and distinguish values of both standards. Differential Revision: https://reviews.llvm.org/D75929	2020-04-06 13:28:06 +07:00
Igor Kudrin	a0249fe91c	[DebugInfo] Rename section identifiers which are deprecated in DWARFv5. NFC. This is a preparation for an upcoming patch which adds support for DWARFv5 unit index sections. The patch adds tag "_EXT_" to identifiers which reference sections that are deprecated in the DWARFv5 standard. See D75929 for the discussion. Differential Revision: https://reviews.llvm.org/D77141	2020-04-06 13:28:06 +07:00
Craig Topper	97e57f3b24	[DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine. We're ANDing with 1 right after which will cause the SIGN_EXTEND to be combined to ANY_EXTEND later. Might as well just start with an ANY_EXTEND. While there, create the AND using the getZeroExtendInReg helper to remove the need to explicitly create the VecOnes constant.	2020-04-05 22:44:45 -07:00
Johannes Doerfert	931c0cd713	[OpenMP][NFC] Move and simplify directive -> allowed clause mapping Move the listing of allowed clauses per OpenMP directive to the new macro file in `llvm/Frontend/OpenMP`. Also, use a single generic macro that specifies the directive and one allowed clause explicitly instead of a dedicated macro per directive. We save 800 loc and boilerplate for all new directives/clauses with no functional change. We also need to include the macro file only once and not once per directive. Depends on D77112. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D77113	2020-04-06 00:04:08 -05:00
Craig Topper	586c051a27	[DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition it's trying to protect. This code is replacing a shift with a new shift on an extended type. If the shift amount type can't represent the maximum shift amount for the new type, the amount needs to be extended to a type that can. Previously, the code just hardcoded a check for 256 bits which seems to have been an assumption that the original shift amount was MVT::i8. But that seems more catered to a specific target like X86 that uses i8 as its legal shift amount type. Other targets may use different types. This commit changes the code to look at the real type of the shift amount and makes sure it has enough bits for the Log2 of the new type. There are similar checks to this in SelectionDAGBuilder and LegalizeIntegerTypes.	2020-04-05 20:35:57 -07:00
Johannes Doerfert	419a559c5a	[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP` This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D77112	2020-04-05 22:30:29 -05:00
Tarindu Jayatilaka	b43b59fcc0	Expose `attributor-disable` to the new and old pass managers The new and old pass managers (PassManagerBuilder.cpp and PassBuilder.cpp) are exposed to an `extern` declaration of `attributor-disable` option which will guard the addition of the attributor passes to the pass pipelines. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76871	2020-04-05 22:29:34 -05:00
Lang Hames	1b39c6f62c	[ORC] Add MachO universal binary support to StaticLibraryDefinitionGenerator. Add a new overload of StaticLibraryDefinitionGenerator::Load that takes a triple argument and supports loading archives from MachO universal binaries in addition to regular archives. The LLI tool is updated to use this overload.	2020-04-05 20:21:05 -07:00
Simon Pilgrim	a43e233606	Remove unused function 'isInRange'. NFCI.	2020-04-05 23:11:24 +01:00
Simon Pilgrim	4431a29c60	[X86][SSE] Combine unary shuffle(HORIZOP,HORIZOP) -> HORIZOP We had previously limited the shuffle(HORIZOP,HORIZOP) combine to binary shuffles, but we can often merge unary shuffles just as well, folding in UNDEF/ZERO values into the 64-bit half lanes. For the (P)HADD/HSUB cases this is limited to fast-horizontal cases but PACKSS/PACKUS combines under all cases.	2020-04-05 22:49:46 +01:00
Anna Thomas	1d0f757904	[InlineFunction] Update metadata on loads that are return values This patch builds upon D76140 by updating metadata on pointer typed loads in inlined functions, when the load is the return value, and the callsite contains return attributes which can be updated as metadata on the load. Added test cases show this for nonnull, dereferenceable, dereferenceable_or_null Reviewed-By: jdoerfert Differential Revision: https://reviews.llvm.org/D76792	2020-04-05 14:50:10 -04:00
Sourabh Singh Tomar	0d71782f4e	[DebugInfo]: Allow DwarfCompileUnit to have line table symbol Previously the line table symbol was represented as `DIE::value_iterator` inside `DwarfCompileUnit`, and the function `initStmtList` was used to create a local `MCSymbol` to initialize it. This patch removes `DIE::value_iterator` from `DwarfCompileUnit` and introduces an `MCSymbol` to represent this unit's symbol for the `debug_line` section. As a result, `applyStmtList` is also modified to use it. Furthermore, a helper function `getLineTableStartSym` is introduced to get this symbol; it will be used by clients which need to access this line table, i.e. `debug_macro`. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D77489	2020-04-06 00:14:29 +05:30
Zuojian Lin	a58c8a7866	Remove the additional constant which requires an extra register for statepoint lowering. The newly-created constant zero will need an extra register to hold it in the current statepoint lowering implementation. Remove it if there exists one.	2020-04-05 11:22:09 -04:00
Apelete Seketeli	8aadb442d1	[scan-build] fix dead store warnings emitted on LLVM AMDGPU code base This fixes dead store warnings of the type "dead assignment" reported by Clang Static Analyzer.	2020-04-05 11:19:03 -04:00
Oliver Stannard	cb6aeb2239	[ARM] Add data gathering hint instruction Summary: This patch upstreams support for the optional ARMv8.0 Data Gathering Hint (DGH) extension, which adds the Data Gathering Hint instruction to the hint space. See ARMv8.0-DGH in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, danielkiss, samparker Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77097	2020-04-05 15:21:00 +01:00
Oliver Stannard	6f60eb4a3c	[ARM] Add enhanced counter virtualization system registers Summary: This patch upstreams support for the ARMv8.6A Enhanced Counter Virtualization (ECV) extension, which adds 6 new system registers. See ARMv8.6-ECV in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, pcc, ab, chill Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77094	2020-04-05 15:18:35 +01:00
Sanjay Patel	538a8f0227	[InstCombine] convert bitcast-shuffle to vector trunc As discussed in D76983, that patch can turn a chain of insert/extract with scalar trunc ops into bitcast+extract and existing instcombine vector transforms end up creating a shuffle out of that (see the PhaseOrdering test for an example). Currently, that process requires at least this sequence: -instcombine -early-cse -instcombine. Before D76983, the sequence of insert/extract would reach the SLP vectorizer and become a vector trunc there. Based on a small sampling of public targets/types, converting the shuffle to a trunc is better for codegen in most cases (and a regression of that form is the reason this was noticed). The trunc is clearly better for IR-level analysis as well. This means that we can induce "spontaneous vectorization" without invoking any explicit vectorizer passes (at least a vector cast op may be created out of scalar casts), but that seems to be the right choice given that we started with a chain of insert/extract, and the backend would expand back to that chain if a target does not support the op. Differential Revision: https://reviews.llvm.org/D77299	2020-04-05 09:48:02 -04:00
Oliver Stannard	9e1455dc23	[ARM] Add ARMv8.6 Fine Grain Traps system registers Summary: This patch upstreams support for the ARMv8.6A Fine Grain Traps (FGT) extension, which adds 5 new system registers. See ARMv8.6-FGT in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, momchil.velikov Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76991	2020-04-05 14:28:18 +01:00
Sanjay Patel	4036a0af24	[InstCombine] enhance freelyNegateValue() by handling 'not' This patch extends D77230. If we have a 'not' instruction inside a negated expression, we can ignore extra uses of that op because the negation has a one-to-one replacement: negate becomes increment. Alive2 examples of the test cases: http://volta.cs.utah.edu:8080/z/T5-u9P http://volta.cs.utah.edu:8080/z/eT89L6 Differential Revision: https://reviews.llvm.org/D77459	2020-04-05 09:16:19 -04:00
Sanjay Patel	867f0c3c4d	[ValueTracking] enhance matching of smin/smax with 'not' operands The cmyk tests are based on the known regression that resulted from: rGf2fbdf76d8d0 So this improvement in analysis might be enough to restore that commit.	2020-04-05 08:54:12 -04:00
Diogo Sampaio	59d10dc703	[ARM] add ARMv8.6-A Activity monitors virtualization extension Summary: This patch upstreams v8.6A activity monitors virtualization assembler support, which consists of 32 new system registers (two groups, each with 16 numbered registers). See ARMv8.6-AMU in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, john.brawn, ostannard Reviewed By: ostannard Subscribers: LukeGeeson, dnsampaio, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76998	2020-04-05 13:31:06 +01:00
Benjamin Kramer	ff889df356	[X86] Roll some loops. NFCI.	2020-04-05 13:59:50 +02:00
Florian Hahn	47ee404075	[ValueTracking] Use Inst::comesBefore in isValidAssumeForCtx (NFC). D51664 added Instruction::comesBefore which should provide better performance than the manual check. Reviewers: rnk, nikic, spatel Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D76228	2020-04-05 12:38:04 +01:00
Simon Pilgrim	3079e51858	[X86][SSE] Generalize shuffle(HORIZOP,HORIZOP) -> HORIZOP combine Our existing combine allows merging the shuffle of 2 similar 64-bit wide 'horizontal ops' (HADD/PACK/etc.) if the shuffle was a UNPCK/MOVSD. This patch generalizes this to decode any target shuffle mask that can be widened to a 128-bit repeating v2*64 mask, which helps us catch PBLENDW/PBLENDD cases.	2020-04-05 12:09:19 +01:00
Simon Pilgrim	a17de6b91c	[X86][SSE] truncateVectorWithPACK - upper undef for 128->64 packing If we're packing from 128-bits to 64-bits then we don't need the RHS argument. This helps with register allocation, especially as we avoid repeating a use of the input value.	2020-04-05 11:47:36 +01:00
Matt Arsenault	6bfe28e92f	AMDGPU: Fix annotate kernel features through casted calls I thought I was testing this before, but the workitem id x case isn't great since it's mandatory in the parent kernel.	2020-04-04 20:44:44 -04:00
Matt Arsenault	221890d709	AMDGPU: Add feature for fast f32 denormals	2020-04-04 20:01:24 -04:00
Stefanos Baziotis	f3dd3a66d3	[Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions. Query AAValueSimplify on pointers in memory accessing instructions to take advantage of the constant propagation (or any other value simplification) of such values.	2020-04-05 02:46:26 +03:00
Jonathan Roelofs	3ce77142a6	Revert "[DAG] Fix PR45049: LegalizeTypes crash" This reverts commit `17673ae0b2`.	2020-04-04 13:47:22 -06:00
Jonathan Roelofs	17673ae0b2	[DAG] Fix PR45049: LegalizeTypes crash Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG does, leading to accidental SDValue removal before its reference count was truly zero. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049 https://reviews.llvm.org/D76994	2020-04-04 13:36:22 -06:00
Florian Hahn	a2b18c5a08	[LV] Simplify tryToWiden as recipes are not re-used (NFC). After `49d00824bb`, VPWidenRecipe only stores a single instruction. tryToWiden can simply return the widen recipe, like other helpers in VPRecipeBuilder.	2020-04-04 18:30:50 +01:00
Heejin Ahn	fc5d8b672b	[WebAssembly] Fix a sanitizer error in WasmEHPrepare Summary: D77423 started using a dominator tree in WasmEHPrepare, but we deleted BBs in `prepareThrows` before we used the domtree in `prepareEHPads`, and those CFG changes were not reflected in the domtree. This uses `DomTreeUpdater` to make sure we update the domtree every time we delete BBs from the CFG. This fixes ubsan/msan/expensive_check errors caught in LLVM buildbots. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77465	2020-04-04 09:57:07 -07:00
Nikita Popov	4ede730096	[InstCombine] Don't limit uses in eraseInstFromFunction() eraseInstFromFunction() adds the operands of the erased instructions, as those might now be dead as well. However, this is limited to instructions with less than 8 operands. This check doesn't make a lot of sense to me. As the instruction gets removed afterwards, I don't see a potential for anything overly pathological happening here (as we can only add those operands to the worklist once). The impact on CTMark is in the noise. We also have the same code in instruction sinking and don't limit the operand count there. Differential Revision: https://reviews.llvm.org/D77325	2020-04-04 18:37:30 +02:00
Luofan Chen	eec6d87626	[Attributor] Deduce attributes for non-exact functions This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amenable. See also [this github issue](https://github.com/llvm/llvm-project/issues/172). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76404	2020-04-04 11:34:58 -05:00
Heejin Ahn	2e9839729d	[WebAssembly] Fix wasm.lsda() optimization in WasmEHPrepare Summary: When we insert a call to the personality function wrapper (`_Unwind_CallPersonality`) for a catch pad, we store some necessary info in `__wasm_lpad_context` struct and pass it. One piece of that info is the LSDA address for the function. For this, we insert a call to `wasm.lsda()`, which will be lowered down to the address of LSDA, and store it in a field in `__wasm_lpad_context`. There are exceptions to this personality call insertion: catchpads for `catch (...)` and cleanuppads (for destructors) don't need personality function calls, because we don't need to figure out whether the current exception should be caught or not. (They always should.) There was a little optimization to `wasm.lsda()` call insertion. Because the LSDA address is the same throughout a function, we don't need to insert a store of `wasm.lsda()` return value in every catchpad. For example: ``` try { foo(); } catch (int) { // wasm.lsda() call and a store are inserted here, like, in // pseudocode, // %lsda = wasm.lsda(); // store %lsda to a field in __wasm_lpad_context try { foo(); } catch (int) { // We don't need to insert the wasm.lsda() and store again, because // to arrive here, we have already stored the LSDA address to // __wasm_lpad_context in the outer catch. } } ``` So the previous algorithm checked whether the current catch had a parent EH pad and, if so, did not insert a call to `wasm.lsda()` and its store. But this was incorrect, because what if the outer catch is `catch (...)` or a cleanuppad? ``` try { foo(); } catch (...) { // wasm.lsda() call and a store are NOT inserted here try { foo(); } catch (int) { // We need wasm.lsda() here! } } ``` In this case we need to insert `wasm.lsda()` in the inner catchpad, because the outer catchpad does not have one. 
To minimize the number of inserted `wasm.lsda()` calls and stores, we need a way to figure out whether we have encountered a `wasm.lsda()` call in any of the EH pads that dominate the current EH pad. To figure that out, we now visit EH pads in BFS order in the dominator tree so that we visit parent BBs before their child BBs in the domtree. We keep a set named `ExecutedLSDA`, which basically means "Do we have `wasm.lsda()` either in the current EH pad or any of its parent EH pads in the dominator tree?". This is to prevent scanning the domtree up to the root in the worst case every time we examine an EH pad: each EH pad only needs to examine its immediate parent EH pad. - If any of its parent EH pads in the domtree has `wasm.lsda()`, this means we don't need `wasm.lsda()` in the current EH pad. We also insert the current EH pad in `ExecutedLSDA` set. - If none of its parent EH pads has `wasm.lsda()` - If the current EH pad is a `catch (...)` or a cleanuppad, done. - If the current EH pad is neither a `catch (...)` nor a cleanuppad, add `wasm.lsda()` and the store in the current EH pad, and add the current EH pad to `ExecutedLSDA` set. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77423	2020-04-04 07:02:50 -07:00
Simon Pilgrim	e5e719d885	[X86][SSE] lowerV8I16Shuffle - lower compaction shuffles using PACKUSDW(PBLENDW,PBLENDW) on SSE41+ Similar to the lowerV16I8Shuffle implementation, for binary compaction v8i16 shuffles we can avoid the PUNPCKLDQ(PSHUFB,PSHUFB) pattern on SSE41+ targets by using PACKUSDW and PBLENDW. Before SSE41 we would need to use PACKSSDW but that requires sign extension that seems to destroy any gains, even on targets without PSHUFB. This is a bigger gain on AMD than Intel targets but should never be a regression, and avoiding the shuffle mask load(s) is always useful. Noticed in codegen while dealing with PR31443.	2020-04-04 13:08:25 +01:00
Nikita Popov	b90ea4f341	[IRBuilder] Move some code into the cpp file; NFC Since D73835 we no longer need to define the whole IRBuilder implementation in the header. This patch moves some of the larger methods out of line, into the C++ file. Differential Revision: https://reviews.llvm.org/D77332	2020-04-04 12:52:56 +02:00
Nikita Popov	6896d559f3	[VNCoercion] Use IRBuilderBase; NFC And remove include from header.	2020-04-04 12:44:50 +02:00
Nikita Popov	ebd5a1b049	[Reassociate] Use IRBuilderBase; NFC And remove now unnecessary IRBuilder.h include in header.	2020-04-04 12:34:16 +02:00
Nikita Popov	1055e9e3c8	[IVDescriptors] Remove IRBuilder.h include; NFC IVDescriptors.h itself does not reference IRBuilder at all. Move the include into transformation passes that do.	2020-04-04 12:07:57 +02:00
Nikita Popov	a5eb1236e3	[IVDescriptors] Remove unnecessary DemandedBits.h include; NFC Forward declare DemandedBits in IVDescriptors, and move include into the cpp file. Also drop the include from LoopUtils, which does not need it at all.	2020-04-04 12:07:57 +02:00
Craig Topper	1d42c0db9a	Revert "[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets" This reverts commit `c74dd640fd`. Reverting to address coding standard issues raised in post-commit review.	2020-04-03 16:56:08 -07:00
Craig Topper	a505ad58cf	Revert "[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI)" This reverts commit `62c42e29ba` Reverting to address coding standard issues raised in post-commit review.	2020-04-03 16:55:53 -07:00
Scott Constable	62c42e29ba	[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI) After finding all such gadgets in a given function, the pass minimally inserts LFENCE instructions in such a manner that the following property is satisfied: for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at least one LFENCE instruction. The algorithm that implements this minimal insertion is influenced by an academic paper that minimally inserts memory fences for high-performance concurrent programs: http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf The algorithm implemented in this pass is as follows: 1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components: -SOURCE instructions (also includes function arguments) -SINK instructions -Basic block entry points -Basic block terminators -LFENCE instructions 2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6. 3. Use a heuristic or plugin to approximate minimal LFENCE insertion. 4. Insert one LFENCE along each CFG edge that was cut in step 3. 5. Go to step 2. 6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified. By default, the heuristic used in Step 3 is a greedy heuristic that avoids inserting LFENCEs into loops unless absolutely necessary. There is also a CLI option to load a plugin that can provide even better optimization, inserting fewer fences, while still mitigating all of the LVI gadgets. The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin, and a description of the pass's behavior with the plugin can be found here: https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection. Differential Revision: https://reviews.llvm.org/D75937	2020-04-03 13:45:50 -07:00
Scott Constable	c74dd640fd	[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK). Also adds a new target feature to X86: +lvi-load-hardening The feature can be added via the clang CLI using -mlvi-hardening. Differential Revision: https://reviews.llvm.org/D75936	2020-04-03 13:02:04 -07:00
Alina Sbirlea	688450c7f0	[GraphDiff] Extend GraphDiff to track a list of updates. Summary: This patch includes two extensions: 1. It extends the GraphDiff to also keep the original list of updates after legalization, not just the deletes/insert vectors. It also provides an API to pop the first update (the updates are stored in reverse, such that the first update is at the end of the list) 2. It adds a bool to mark whether the given updates should be applied as given, or applied in reverse. This moves the task of reversing the updates (when the caller needs this) to a functionality inside GraphDiff, versus having the caller do this. The two changes could be split into two patches, but they seemed reasonably small to be reviewed together. Reviewers: kuhar, dblaikie Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77167	2020-04-03 12:10:36 -07:00
Scott Constable	f95a67d8b8	[X86] Add RET-hardening Support to mitigate Load Value Injection (LVI) Adding a pass that replaces every ret instruction with the sequence: pop <scratch-reg> lfence jmp *<scratch-reg> where <scratch-reg> is some available scratch register, according to the calling convention of the function being mitigated. Differential Revision: https://reviews.llvm.org/D75935	2020-04-03 12:08:34 -07:00
Matt Arsenault	30ebafaa56	CodeGen: Convert some TII hooks to use Register	2020-04-03 14:52:54 -04:00
Matt Arsenault	178050c3ba	AMDGPU: Use Register in more places	2020-04-03 14:52:54 -04:00
Matt Arsenault	e8dcb6d05e	AMDGPU: Remove redundant virtual	2020-04-03 14:52:53 -04:00
Christopher Tetreault	b600809688	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: kparzysz, sdesmalen, efriedma Reviewed By: kparzysz Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77267	2020-04-03 11:26:51 -07:00
Stanislav Mekhanoshin	0462795095	[AMDGPU] Propagate AGPR RC from PHI to its PHI operands We can fix the register class of a PHI based on all of its AGPR uses. That leaves behind all PHIs which were already processed earlier. Propagate RC back to PHI operands of a PHI. Differential Revision: https://reviews.llvm.org/D77344	2020-04-03 11:23:02 -07:00
Simon Pilgrim	2225797567	[YAMLParser] Scanner::setError - ensure we use the StringRef::iterator argument (PR45043) As detailed on PR45043, static analysis was warning that the StringRef::iterator Position argument was being ignored and the function was hardwired to use the Current iterator. This patch ensures we use the provided iterator and removes the (barely necessary) setError wrapper that always used Current. Differential Revision: https://reviews.llvm.org/D76512	2020-04-03 18:55:38 +01:00
Sanjay Patel	ce97ce3a5d	[VectorCombine] try to form a better extractelement Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms. Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement. But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet). In the motivating case from PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 The combination of subsequent instcombine and codegen transforms gets us this improvement: vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm4 vmovshdup %xmm1, %xmm3 ## xmm3 = xmm1[1,1,3,3] vaddps %xmm0, %xmm2, %xmm0 vaddps %xmm1, %xmm3, %xmm1 vshufps $200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3] vinsertps $177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2] --> vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm1 vaddps %xmm0, %xmm2, %xmm0 vshufps $200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3] Differential Revision: https://reviews.llvm.org/D76623	2020-04-03 13:55:13 -04:00
Sylvain Audi	e4ae0a2e97	[Support/Path] sys::path::replace_path_prefix fix and simplifications Added unit tests for 2 scenarios that were failing. Made replace_path_prefix back to 3 parameters instead of 5, simplifying the implementation. The other 2 were always used with the default value. This commit is intended to be the first of 3: 1) simplify/fix replace_path_prefix. 2) use it in the context of -fdebug-prefix-map and -fmacro-prefix-map (see D76869). 3) Make Windows version of replace_path_prefix insensitive to both case and separators (slash vs backslash). Differential Revision: https://reviews.llvm.org/D77223	2020-04-03 13:50:23 -04:00
Simon Pilgrim	34a497b765	[X86][SSE] lowerShuffleWithPACK - extend to use chained PACKs for larger truncations Extend lowerShuffleWithPACK/matchShuffleWithPACK/createPackShuffleMask to handle compaction style shuffle masks that can be lowered to chains of PACKSS/PACKUS if their inputs are suitably sign/zero extended. This helps avoid PSHUFB (and its mask load) for short shuffle chains, shuffle combining will still replace with a PSHUFB if we have enough shuffles as getFauxShuffleMask should recognise the PACKSS/PACKUS chains.	2020-04-03 18:26:10 +01:00
Roman Lebedev	7d572ef2dd	Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)" As discussed in post-commit review in https://reviews.llvm.org/D73501 if the goal of this is to help vectorizer, then we should actually be teaching vectorizer to do this, because right now this rewrite is still budget-limited, which isn't what we'd want. Additionally, while the rest of the patch series was universally profitable, this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171) exposing cost-modeling issues on ARM. So let's just back this particular patch out. Once there's an undo transform, this could be considered for reintegration. This reverts commit `44edc6fd2c`.	2020-04-03 20:15:04 +03:00
John Brawn	4ad9ca0f9e	[ARM] Fix incorrect handling of big-endian vmov.i64 Currently when the target is big-endian vmov.i64 reverses the order of the two words of the vector. This is correct only when the underlying element type is 32-bit, as actually what it should be doing is considering it a vector of the underlying type and reversing the elements of that. Differential Revision: https://reviews.llvm.org/D76515	2020-04-03 17:36:50 +01:00
John Brawn	cd58fb6325	[ARM] Avoid pointless vrev of element-wise vmov If an element-wise vmov immediate instruction is followed by a vrev with width greater than or equal to the vmov element width, then that vrev won't do anything. Add a DAG combine to convert bitcasts that would become such vrevs into vector_reg_casts instead. Differential Revision: https://reviews.llvm.org/D76514	2020-04-03 17:36:50 +01:00
Matt Arsenault	57a55313c3	InstCombine: Reduce minnum/maxnum if inputs are casted	2020-04-03 11:57:25 -04:00
jasonliu	d65557d15d	[NFC][XCOFF][AIX] Refactor get/setContainingCsect Summary: With the current architecture, we always require setContainingCsect to be called on every MCSymbol that gets used in an XCOFF context. This is very hard to achieve because symbols get created everywhere and other MCSymbol types (ELF, COFF) do not have similar rules. It's very easy to miss setting the containing csect, and we would need to add a lot of XCOFF-specialized code around some common code areas. This patch intends to: 1. Rely on getFragment().getParent() to get the csect from labels. 2. Only use get/setRepresentedCsect (was get/setContainingCsect) if the symbol itself represents a csect. Reviewers: DiggerLin, hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D77080	2020-04-03 13:33:12 +00:00
Guillaume Chatelet	9068bccbae	[Alignment][NFC] Deprecate InstrTypes getRetAlignment/getParamAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77312	2020-04-03 13:21:58 +00:00
Guillaume Chatelet	1a584a8d50	[Alignment][NFC] Remove unused private functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77297	2020-04-03 09:16:20 +00:00
Guillaume Chatelet	ca11c480e7	[Alignment][NFC] Convert MachineIRBuilder::buildDynStackAlloc to Align Summary: The change in IRTranslator is not trivial but is NFC as far as I can tell. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77292	2020-04-03 09:05:19 +00:00
OCHyams	9b56cc9361	[DebugInfo] Salvage debug info when sinking loop invariant instructions Reviewed By: vsk, aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D77318	2020-04-03 09:19:26 +01:00
Guillaume Chatelet	9f5c786876	[NFC] G_DYN_STACKALLOC realign iff align > 1, update documentation Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77372	2020-04-03 08:12:39 +00:00
scentini	6825920b18	Silence -Wpessimizing-move warning	2020-04-03 09:37:39 +02:00
Scott Constable	5b519cf1fc	[X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) This pass replaces each indirect call/jump with a direct call to a thunk that looks like: lfence jmpq *%r11 This ensures that if the value in register %r11 was loaded from memory, then the value in %r11 is (architecturally) correct prior to the jump. Also adds a new target feature to X86: +lvi-cfi ("cfi" meaning control-flow integrity) The feature can be added via clang CLI using -mlvi-cfi. This is an alternate implementation to https://reviews.llvm.org/D75934 that merges the thunk insertion functionality with the existing X86 retpoline code. Differential Revision: https://reviews.llvm.org/D76812	2020-04-03 00:34:39 -07:00
scentini	0a3845b70f	Silence -Wpessimizing-move warning	2020-04-03 09:24:26 +02:00
Igor Kudrin	f13ce15d44	[DebugInfo] Rename getOffset() to getContribution(). NFC. The old name was a bit misleading because the functions actually return contributions to the corresponding sections. Differential revision: https://reviews.llvm.org/D77302	2020-04-03 14:15:53 +07:00
Sourabh Singh Tomar	69c8fb1c65	[DWARF5] Added support for debug_macro section parsing and dumping in llvm-dwarfdump. Summary: This patch adds parsing and dumping DWARFv5 .debug_macro section in llvm-dwarfdump, it does not introduce any new switch. Existing switch "--debug-macro" should be used to dump macinfo or macro section. Reviewed By: dblaikie, ikudrin, jhenderson Differential Revision: https://reviews.llvm.org/D73086	2020-04-03 12:23:51 +05:30
Serguei Katkov	bd1d70bf0e	[DAG] Change isGCValue detection for statepoint lowering isGCValue should detect whether the deopt value is a GC pointer. Currently it checks by finding the value in SI.Bases and SI.Ptrs. However, these data structures contain only those values which have a corresponding gc.relocate call. So we can miss a GC value if it does not have a gc.relocate call (dead after the call). Ask the GC strategy whether the pointer is a GC pointer, or otherwise conservatively consider any pointer to be a GC pointer. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77130	2020-04-03 12:36:13 +07:00
Scott Constable	b1d581019f	[X86] Refactor X86IndirectThunks.cpp to Accommodate Mitigations other than Retpoline Introduce a ThunkInserter CRTP base class from which new thunk types can inherit, e.g., thunks to mitigate https://software.intel.com/security-software-guidance/software-guidance/load-value-injection. Differential Revision: https://reviews.llvm.org/D76811	2020-04-02 22:09:54 -07:00
Scott Constable	71e8021d82	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810	2020-04-02 21:55:13 -07:00
Hongtao Yu	88da019977	Fix a bug in the inliner that causes subsequent double inlining Summary: A recent change in the instruction simplifier enables a call to a function that just returns one of its parameters to be simplified to simply loading the parameter. This exposes a bug in the inliner where double inlining may occur, which in turn may cause a compiler ICE when an already-inlined callsite is reused for further inlining. To put it simply, in a C program like the following, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner's internal data structure to register the first() callsite for the call edge representing the third() call, which therefore incurs a double inlining when both call edges are considered inline candidates. I'm making a fix to stop the inliner from reusing a callsite for new call edges. ``` void top() { int t = first(); second(t); } void second(int t) { t = third(t); fourth(t); } int third(int t) { return t; } ``` The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The legacy inliner processes each SCC in bottom-up order. That means in reality function first may already be inlined into top, or function third is either inlined into second or folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up processing.
We also need to tweak the test code so that when the inlining of top happens, the function body of second is not yet that optimized, by delaying the function attribute deduction pass (i.e., the pass that tells us function third has no side effects and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round, right after the first function attribute deduction pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of inlined callsites is a bit different, but in theory the issue could happen there too. Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can be compensated for by subsequent inlining or folding, as shown in the attached unit test. The ideal fix would be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification. Reviewers: wenlei, davidxl, tejohnson Reviewed By: wenlei, davidxl Subscribers: eraman, nikic, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76248	2020-04-02 21:08:05 -07:00
Xiang1 Zhang	43f031d312	Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology) Summary: This patch comes from H.J.'s `2bd54ce7fa`. It fixes the llvm unit tests that fail when run on a CET machine (e.g. ExecutionEngine/MCJIT/MCJITTests). The main reason we enable IBT when the JIT is compiled with CET is that the JIT doesn't know whether its caller program is CET-enabled or not. If the JIT's caller program is non-CET, it does not matter whether the JIT generates CET code or not. But if the JIT's caller program is CET-enabled, the JIT must generate CET code or it will cause control protection exceptions. I have tested the patch with llvm-unit-test and llvm-test-suite on a CET machine, and it passed. H.J. also tested it by building and running VNCserver (Virtual Network Console), and it works too. (Without this patch, VNCserver crashes on a CET machine.) Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei Subscribers: tstellar, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76900	2020-04-03 11:44:07 +08:00
Jessica Paquette	71947ed927	[AArch64][GlobalISel] Constrain reg operands in selectBrJT This was causing a machine verifier failure on the test suite. Make sure that we don't end up with a weird register class here. Failure for reference: * Bad machine code: Illegal virtual register for instruction * - function: check_constrain - basic block: %bb.1 (0x7f8b70839f80) - instruction: early-clobber %6:gpr64, early-clobber %7:gpr64sp = JumpTableDest32 %5:gpr64, %1:gpr64sp, %jump-table.0 - operand 3: %1:gpr64sp Expected a GPR64 register, but got a GPR64sp register Differential Revision: https://reviews.llvm.org/D77349	2020-04-02 20:34:11 -07:00
Wenju He	fe8ac0fe51	[x86] Fix Intel OpenCL builtin CalleeSavedRegs on skx Summary: Align with AVX512 builtins implementations, some of which don't preserve rdi. Reviewers: yubing, tianqing, craig.topper Reviewed By: craig.topper Subscribers: yaxunl, Anastasia, hiraditya Differential Revision: https://reviews.llvm.org/D77032	2020-04-03 11:27:40 +08:00
Qiu Chaofan	71f1ab5354	[PowerPC] Remove unnecessary XSRSP instruction MI peephole will remove unnecessary FRSP instructions. This patch removes such unnecessary XSRSP. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D77208	2020-04-03 11:05:14 +08:00
Jun Ma	9c6f32a0ff	[Coroutines] Simplify implementation using removePredecessor Differential Revision: https://reviews.llvm.org/D77035	2020-04-03 09:20:07 +08:00
Austin Kerbow	30f18ed387	[AMDGPU] Handle SMRD signed offset immediate Summary: This fixes a few issues related to SMRD offsets. On gfx9 and gfx10 we have a signed byte offset immediate, however we can overflow into a negative since we treat it as unsigned. Also, the SMRD SOFFSET sgpr is an unsigned offset on all subtargets. We sometimes tried to use negative values here. Third, S_BUFFER instructions should never use a signed offset immediate. Differential Revision: https://reviews.llvm.org/D77082	2020-04-02 17:41:52 -07:00
Adrian Prantl	93fe58c9cf	Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.label intrinsic. Debug info for labels is not generated at -gline-tables-only, so this pass should remove them. Differential Revision: https://reviews.llvm.org/D77345	2020-04-02 17:39:33 -07:00
Adrian Prantl	c024f3ebdc	Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.addr intrinsic. This patch also strips llvm.dbg.addr intrinsics when downgrading debug info to linetables-only. Differential Revision: https://reviews.llvm.org/D77343	2020-04-02 17:39:33 -07:00
Lang Hames	05598441de	Re-apply `0071eaaf08`, "[ORC] Export __cxa_atexit ...", with fixes. Forgot to include part of the testcase. Thanks to Nico for spotting that and reverting!	2020-04-02 16:03:35 -07:00
Matt Arsenault	f68cc2a7ed	AMDGPU: Use 128-bit DS operations by default	2020-04-02 17:17:47 -04:00
Matt Arsenault	5660bb6bc9	AMDGPU: Remove denormal subtarget features Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.	2020-04-02 17:17:12 -04:00
Matt Arsenault	75cf30918f	AMDGPU: Assume f32 denormals are enabled by default This will likely introduce catastrophic performance regressions on older subtargets, but should be correct. A follow up change will remove the old fp32-denormals subtarget features, and switch to using the new denormal-fp-math/denormal-fp-math-f32 attributes. Frontends should be making sure to add the denormal-fp-math-f32 attribute when appropriate to avoid performance regressions.	2020-04-02 17:17:12 -04:00
Cyndy Ishida	fd4d07517b	[llvm][TextAPI] adding inlining reexported libraries support Summary: This patch adds reader/writer support for MachO tbd files. The use case is to represent reexported libraries in a top-level library that won't need to exist for linker indirection because all of the needed content will be inlined in the same document. Reviewers: ributzka, steven_wu, jhenderson Reviewed By: ributzka Subscribers: JDevlieghere, hiraditya, mgrang, dexonsmith, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67646	2020-04-02 13:05:08 -07:00
Craig Topper	4fdb63bbf0	[X86] Enable combineExtSetcc for vectors larger than 256 bits when we've disabled 512 bit vectors. The compares are going to be type legalized to 256 bits so we might as well fold the extend.	2020-04-02 12:44:27 -07:00
Anna Thomas	bf7a16a768	[InlineFunction] Update valid return attributes at callsite within callee body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate valid attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. Also, this is valid only for attributes which are a property of a callsite and not those that are not dependent on the ABI, or a property of the call itself. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-04-02 14:13:12 -04:00
Matt Arsenault	c3d3c22a58	AMDGPU: Hack out noinline on functions using LDS globals This is a workaround for clang adding noinline to all functions at -O0. Previously, we would just add alwaysinline, and the verifier would complain about having both noinline and alwaysinline. We currently can't truly codegen this case as a freestanding function, so override the user forcing noinline.	2020-04-02 14:12:07 -04:00
Sanjay Patel	f4448063cc	[InstCombine] try to reduce shuffle with bitcasted operand shuf (bitcast X), undef, Mask --> bitcast X' The 'inverse shuffles' test (shuf_bitcast_operand) is a pattern in the motivating examples from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 (see also D76727) We can deal with this class of patterns in generic instcombine because we are not creating any new shuffles, just a bitcast. Alive2 proof: http://volta.cs.utah.edu:8080/z/mwDUZf Differential Revision: https://reviews.llvm.org/D76844	2020-04-02 13:44:50 -04:00
Sanjay Patel	b6050ca181	[VectorCombine] transform bitcasted shuffle to narrower elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' We do not attempt this in InstCombine because we do not want to change types and create new shuffle ops that are potentially not lowered as well as the original code. Here, we can check the cost model to see if it is worthwhile. I've aggressively enabled this transform even if the types are the same size and/or equal cost because moving the bitcast allows InstCombine to make further simplifications. In the motivating cases from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 ...this is enough to let instcombine and the backend eliminate the redundant shuffles, but we probably want to extend VectorCombine to handle the inverse pattern (shuffle-of-bitcast) to get that simplification directly in IR. Differential Revision: https://reviews.llvm.org/D76727	2020-04-02 13:30:22 -04:00
Stanislav Mekhanoshin	f2334a7ef2	[AMDGPU] Fix crash in SILoadStoreOptimizer SILoadStoreOptimizer::checkAndPrepareMerge() expects the base and paired instructions to come in order and scans the MBB from the base to the paired instruction. The original order can change if there was a dependent instruction in between and the base instruction was moved. Fixed by bailing out of the optimization. In theory it might still be possible to perform a merge by swapping instructions, but in practice it bails anyway because it finds a dependency on the same instruction that caused the base move. Differential Revision: https://reviews.llvm.org/D77245	2020-04-02 10:26:47 -07:00
Sanjay Patel	12fcbcecff	[InstCombine] add tests for cmyk benchmark; NFC These are versions of a function that regressed with: rGf2fbdf76d8d0 That particular problem occurs with an instcombine-simplifycfg-instcombine sequence, but we can show that it exists within instcombine only with other variations of the pattern.	2020-04-02 13:00:46 -04:00
Benjamin Kramer	de8831934a	[LoopDataPrefetch] Remove unused include that's a layering violation	2020-04-02 17:46:10 +02:00
Benjamin Kramer	dffc503187	Revert "[SimplifyLibCalls] Erase replaced instructions" This reverts commit `2a77544ad5`. This introduces a use-after-free in Transforms/InstCombine/sincospi.ll. Found by asan.	2020-04-02 17:30:47 +02:00
Jonas Paulsson	7e02da7db5	[SystemZ] Add isCommutable flag on vector instructions. This does not change much in code generation, but in rare cases MachineCSE can figure out that an instruction is redundant after commuting it. Review: Ulrich Weigand	2020-04-02 16:06:15 +02:00
Sanjay Patel	1008435f3d	Revert "[InstCombine] do not exclude min/max from icmp with casted operand fold" This reverts commit `f2fbdf76d8`. As noted in the post-commit thread: https://reviews.llvm.org/rGf2fbdf76d8d0 ...this can obscure a min/max pattern where the components have extra uses. We can show that the problem is independent of this change with a slightly modified source example, so this revert just delays/reduces the need to fix the real problem. We need to improve our analysis of negation or -- more generally -- subtraction using patches like D77230 or D68408.	2020-04-02 09:15:23 -04:00
Tyker	c00cb76274	[NFC] Split Knowledge retention and place it more appropriately Summary: Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils allows Queries and Transform/Utils to use Analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77171	2020-04-02 15:01:41 +02:00
Jonas Paulsson	36d4421f50	[LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop. This patch adds - New arguments to getMinPrefetchStride() to let the target decide on a per-loop basis if software prefetching should be done even with a stride within the limit of the hw prefetcher. - New TTI hook enableWritePrefetching() to let a target do write prefetching by default (defaults to false). - In LoopDataPrefetch: - A search through the whole loop to gather information before emitting any prefetches. This way the target can get information via new arguments to getMinPrefetchStride() and emit prefetches more selectively. Collected information includes: Does the loop have a call, how many memory accesses, how many of them are strided, how many prefetches will cover them. This is NFC to before as long as the target does not change its definition of getMinPrefetchStride(). - If a previous access to the same exact address was 'read', and the current one is 'write', make it a 'write' prefetch. - If two accesses that are covered by the same prefetch do not dominate each other, put the prefetch in a block that dominates both of them. - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop. - A SystemZ implementation of getMinPrefetchStride(). Review: Ulrich Weigand, Michael Kruse Differential Revision: https://reviews.llvm.org/D70228	2020-04-02 14:57:46 +02:00
Simon Pilgrim	b02c7a8152	Fix "result of 32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI. The shift of 1 by an amount that is never more than 31 means that the warning is a false positive but is safe and fixes Werror builds.	2020-04-02 12:02:04 +01:00
David Green	fbd53ffc3a	[ARM] MVE VMULL patterns This adds MVE vmull patterns, which are conceptually the same as mul(vmovl, vmovl), and so the tablegen patterns follow the same structure. For i8 and i16 this is simple enough, but in the i32 version the multiply (in 64bits) is illegal, meaning we need to catch the pattern earlier in a dag fold. Because bitcasts are involved in the zext versions and the patterns are a little different in little and big endian, I have only added little endian support in this patch. Differential Revision: https://reviews.llvm.org/D76740	2020-04-02 10:57:40 +01:00
David Green	c697dd9ffd	[ARM] Make remaining MVE instruction predictable The unpredictable/hasSideEffects flag is usually inferred by tablegen from whether the instruction has a tablegen pattern (and that pattern only has a single output instruction). Now that the MVE intrinsics are all committed and producing code, the remaining instructions still marked as unpredictable need to be specially handled. This adds the flag directly to instructions that need it, notably the V*MLAL instructions and some of the MOV's. Differential Revision: https://reviews.llvm.org/D76910	2020-04-02 10:57:40 +01:00
Guillaume Chatelet	96cae168fa	[NFC] Preparatory work for D77292	2020-04-02 09:30:33 +00:00
Clement Courbet	fb4aa30f27	[ExpandMemCmp] Allow overlapping loads in the zero-relational case. Summary: This allows doing `memcmp(p, q, 7)` with 2 loads instead of a call to memcmp. This fixes part of PR45147. Reviewers: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76133	2020-04-02 11:20:47 +02:00
Florian Hahn	a63b5c9e53	[CallSiteSplitting] Simplify isPredicateOnPHI & continue checking PHIs. As pointed out by @thakis, currently CallSiteSplitting bails out after checking the first PHI node. We should check all PHI nodes, until we find one where call site splitting is beneficial. This patch also slightly simplifies the code using BasicBlock::phis(). Reviewers: davidxl, junbuml, thakis Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D77089	2020-04-02 10:11:27 +01:00
Guillaume Chatelet	189d2e215f	[Alignment][NFC] Use more Align versions of various functions Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, arsenm, sdardis, jvesely, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77291	2020-04-02 09:00:53 +00:00
OCHyams	550ab58bc1	[NFC] Fix performance issue in LiveDebugVariables When compiling AMDGPUDisassembler.cpp in a stage 1 trunk build with CMAKE_BUILD_TYPE=RelWithDebInfo LLVM_USE_SANITIZER=Address LiveDebugVariables accounts for 21.5% wall clock time. This fix reduces that to 1.2% by switching out a linked list lookup with a map lookup. Note that the linked list is still used to group UserValues by vreg. The vreg lookups don't cause any problems in this pathological case. This is the same idea as D68816, which was reverted, except that it is a less intrusive fix. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D77226	2020-04-02 09:39:33 +01:00
Djordje Todorovic	29d253c4c6	[Object] Add the method for checking if a section is a debug section Different file formats have different naming style for the debug sections. The method is implemented for ELF, COFF and Mach-O formats. Differential Revision: https://reviews.llvm.org/D76276	2020-04-02 10:56:00 +02:00
WangTianQing	d08fadd662	[X86] Add SERIALIZE instruction. Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77193	2020-04-02 16:19:23 +08:00
Shengchen Kan	9f92d4612f	Revert "[NFC][X86] Refine code in X86AsmBackend" This reverts commit `a157cde0ac`.	2020-04-02 15:57:06 +08:00
Shengchen Kan	a157cde0ac	[NFC][X86] Refine code in X86AsmBackend Replace pattern getContents().size with universe function call	2020-04-02 15:41:10 +08:00
Johannes Doerfert	1858f4b50d	Revert "[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`" This reverts commit `c18d55998b`. Bots have reported uses that need changing, e.g., clang-tools-extra/clang-tidy/openmp/UseDefaultNoneCheck.cp as reported by http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/46591	2020-04-02 02:23:22 -05:00
Johannes Doerfert	c18d55998b	[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP` This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Differential Revision: https://reviews.llvm.org/D77112	2020-04-02 01:39:07 -05:00
Fangrui Song	cbd3969e8c	[PPCInstPrinter] Delete an unneeded overload of printBranchOperand. NFC It was added by D76591 for migration purposes (not all printBranchOperand users have migrated to the overload with `uint64_t Address`). Now that all have been migrated, the parameter can go away.	2020-04-01 22:45:25 -07:00
Fangrui Song	85adce3d73	[PPCInstPrinter] Change B to print the target address in hexadecimal form Follow-up of D76591 and D76907	2020-04-01 22:38:24 -07:00
Johannes Doerfert	bcd8009369	[Attributor] Use the proper context instruction in genericValueTraversal There was a TODO in genericValueTraversal to provide the context instruction and due to the lack of it users that wanted one just used something available. Unfortunately, using a fixed instruction is wrong in the presence of PHIs so we need to update the context instruction properly. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D76870	2020-04-01 22:20:47 -05:00
Johannes Doerfert	ac96c8fd85	[Attributor][FIX] Do not compute ranges for arguments of declarations This cannot be triggered right now, as far as I know, but it doesn't make sense to deduce a constant range on arguments of declarations. Exposed during testing of AAValueSimplify extensions.	2020-04-01 22:05:30 -05:00
Johannes Doerfert	54d6a608bf	[Attributor][NFC] Predetermine the module It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.	2020-04-01 21:56:17 -05:00
Johannes Doerfert	9e19693994	[Attributor] Derive better alignment for accessed pointers Use DL & ABI information for better alignment deduction, e.g., if a type is accessed and the ABI specifies an alignment requirement for such an access we can use it. This is based on a patch by @lebedev.ri and inspired by getBaseAlign in Loads.cpp. Depends on D76673. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D76674	2020-04-01 21:49:57 -05:00
Nico Weber	5bac8d427d	Revert "[ORC] Export __cxa_atexit from the main JITDylib in LLJIT." This reverts commit `0071eaaf08`. Inputs/noop-main.ll wasn't checked in, so this breaks check-llvm everywhere.	2020-04-01 22:49:38 -04:00
Johannes Doerfert	b1c788d051	[Attributor][FIX] Prevent alignment breakage wrt. must-tail calls If we have a must-tail call the callee and caller need to have matching ABIs. Part of that is alignment which we might modify when we deduce alignment of arguments of either. Since we would need to keep them in sync, which is not as simple, we simply avoid deducing alignment for arguments of the must-tail caller or callee. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D76673	2020-04-01 21:40:07 -05:00
Lang Hames	0071eaaf08	[ORC] Export __cxa_atexit from the main JITDylib in LLJIT. Failure to export __cxa_atexit can lead to an attempt to import a definition from the process itself (if __cxa_atexit is referenced from another JITDylib), but the process definition will clash with the existing non-exported definition to produce an unexpected DuplicateDefinitionError. This patch fixes the immediate issue by exporting __cxa_atexit. It also fixes a bug where atexit functions in other JITDylibs were not being run by adding a copy of run_atexits_helper to every JITDylib. A follow up patch will deal with the bug where definition generators are called despite a non-exported definition being present.	2020-04-01 19:12:08 -07:00
Johannes Doerfert	41f2a57d0b	[Attributor][NFC] Use a BumpPtrAllocator to allocate `AbstractAttribute`s We create a lot of AbstractAttributes and they live as long as the Attributor does. It seems reasonable to allocate them via a BumpPtrAllocator owned by the Attributor. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76589	2020-04-01 20:53:28 -05:00
Sam Clegg	296ccef703	[WebAssembly] EmscriptenEHSjLj: Mark __invoke_ functions as imported This means the linker will expect them to be undefined at link time and will generate imports from the `env` module rather than reporting undefined externals. Differential Revision: https://reviews.llvm.org/D77192	2020-04-01 16:33:33 -07:00
Daniel Sanders	e65e677ee4	[globalisel][legalizer] Fix DebugLoc bugs caught by a prototype lost-location verifier The legalizer has a tendency to lose DebugLoc's when expanding or combining instructions. The verifier that detected these isn't ready for upstreaming yet but this patch fixes the cases that came up when applying it to our out-of-tree backend's CodeGen tests. This pattern comes up a few more times in this file and probably in the backends too but I'd prefer to fix the others separately (and preferably when the lost-location verifier detects them).	2020-04-01 12:50:18 -07:00
Lang Hames	8e5a8f620c	[ORC] Don't require a null-terminator on MemoryBuffers for objects in archives. The MemoryBuffer::getMemBuffer method's RequiresNullTerminator parameter defaults to true, but object files are not null terminated so we need to explicitly pass false here.	2020-04-01 12:16:38 -07:00
Sanjay Patel	3d90048791	[InstCombine] enhance freelyNegateValue() by handling xor Negation is equivalent to bitwise-not + 1, so try to convert more subtracts into adds using this relationship: 0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => A ^ ~C + 1 I doubt this will recover the regression noted in rGf2fbdf76d8d0, but seems like we're going to need to improve here and/or revive D68408? Alive2 proofs: http://volta.cs.utah.edu:8080/z/Re5tMU http://volta.cs.utah.edu:8080/z/An-uns Differential Revision: https://reviews.llvm.org/D77230	2020-04-01 15:05:13 -04:00
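The identity behind the xor handling in 3d90048791 can be checked numerically. The following is a minimal Python sketch, not LLVM code: the 8-bit width and the helper names `neg`, `bitnot` and `identity_holds` are assumptions made purely for illustration.

```python
# Exhaustively check, for 8-bit wraparound arithmetic, that
#   0 - (A ^ C) == (A ^ ~C) + 1
# BITS, MASK, neg, bitnot and identity_holds are illustrative names,
# not LLVM APIs.
BITS = 8
MASK = (1 << BITS) - 1


def neg(x):
    """Two's-complement negation modulo 2**BITS."""
    return (-x) & MASK


def bitnot(x):
    """Bitwise not restricted to BITS bits."""
    return ~x & MASK


def identity_holds(a, c):
    """Compare the subtract form with the xor-and-add form."""
    lhs = neg(a ^ c)
    rhs = ((a ^ bitnot(c)) + 1) & MASK
    return lhs == rhs


all_ok = all(
    identity_holds(a, c)
    for a in range(1 << BITS)
    for c in range(1 << BITS)
)
print(all_ok)
```

The check relies only on two facts: negation equals bitwise-not plus one, and inverting an xor result is the same as inverting one of its operands.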
Jonathan Roelofs	1148f004fa	Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping find() was altering the UserChain, even in cases where it subsequently discovered that the resulting constant was a 0. This confuses rebuildWithoutConstOffset() when it attempts to walk the chain later, since it is expected that the chain itself be a path down the use-def edges of an expression.	2020-04-01 12:38:15 -06:00
Nikita Popov	50a3e8738a	Revert "[InstCombine] Erase old instruction when replacing extractelements" This reverts commit `d40368fdb5`. llvm-clang-x86_64-expensive-checks-debian failure looks related.	2020-04-01 20:10:11 +02:00
Nikita Popov	2a77544ad5	[SimplifyLibCalls] Erase replaced instructions After RAUWing an instruction, also erase it. This makes sure we don't perform extra InstCombine iterations to clean up the garbage.	2020-04-01 20:00:10 +02:00
Uday Bondhugula	6ee11c3b0f	[NewGVN] Make NewGVN aware of aligned_alloc Make the New GVN pass aware of aligned_alloc. Depends on D76975. Differential Revision: https://reviews.llvm.org/D76976	2020-04-01 23:26:51 +05:30
Uday Bondhugula	4cf70af94f	[GVN] Make GVN aware of aligned_alloc Make the GVN pass aware of aligned_alloc. Depends on D76974. Differential Revision: https://reviews.llvm.org/D76975	2020-04-01 23:26:50 +05:30
Uday Bondhugula	c4499e3333	[Attributor] Make attributor aware of aligned_alloc for heap to stack conversion Make the attributor pass aware of aligned_alloc for converting heap allocations to stack ones. Depends on D76971. Differential Revision: https://reviews.llvm.org/D76974	2020-04-01 23:26:50 +05:30
Nikita Popov	d40368fdb5	[InstCombine] Erase old instruction when replacing extractelements As we are not returning the result of replaceInstUsesWith(), we need to clean up ourselves. NFC apart from worklist order.	2020-04-01 19:55:28 +02:00
Nikita Popov	4b35c816ef	[InstCombine] Use replaceOperand() in div transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-04-01 19:55:00 +02:00
Matt Arsenault	5e4e8d0388	AMDGPU/GlobalISel: Change intrinsic ID for _L to _LZ opt Still should handle the other case changes the opcode this way.	2020-04-01 13:03:02 -04:00
Heejin Ahn	c87b5e7e22	[WebAssembly] Fix subregion relationship in CFGSort Summary: The previous code for determining the innermost region in CFGSort was not correct. We determine subregion relationship by domination of their headers, i.e., if region A's header dominates region B's header, B is a subregion of A. Previously we assumed that if a BB belongs to both a loop and an exception, the region with fewer BBs is the innermost one. This may not be true, because while WebAssemblyException contains BBs in all its subregions (loops or exceptions), MachineLoop may not, because MachineLoop does not contain BBs that don't have a path to its header even if they are dominated by its header. [CFG diagram: Loop header -> Exception header -> {A, B}; B -> C; A -> Loop latch -> back edge to Loop header] For example, in this CFG, the loop does not contain B and C, because they don't have a path back to the loop's header. But for CFGSort we consider the exception here belongs to the loop and the exception should be a subregion of the loop and scheduled together. So here we should use `WE->contains(ML->getHeader())` (but not `ML->contains(WE->getHeader())`, for the stated region above). This also fixes some comments and deletes `Regions` vector in `RegionInfo` class, which was not used anywhere. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77181	2020-04-01 08:12:41 -07:00
Jessica Clarke	616289ed29	[LegalizeTypes][RISCV] Correctly sign-extend comparison for ATOMIC_CMP_XCHG Summary: Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised with GetPromotedInteger, which leaves the upper bits of the value undefined. Since this is used for comparing in an LR/SC loop with a full-width comparison, we must sign extend it. We introduce a new getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since many targets have compare-and-swap instructions (or pseudos) that correctly handle an any-extend input, and the existing function determines the extension of the result, whereas we are concerned with the input. This is related to https://reviews.llvm.org/D58829, which solved the issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler ATOMIC_CMP_SWAP. Reviewers: asb, lenary, efriedma Reviewed By: asb Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74453	2020-04-01 15:51:26 +01:00
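Why the extension kind matters in 616289ed29 can be seen with plain integers. This is a rough Python sketch under stated assumptions, not the legalizer's code: it assumes a 32-bit value held in a 64-bit register and a loaded value that arrives sign-extended (as `lr.w` does on RV64). The names `sign_extend` and `with_garbage_upper_bits` are hypothetical.

```python
# Model a 32-bit compare value living in a 64-bit register.
# NARROW/WIDE and both helpers are illustrative, not LLVM names.
NARROW = 32
WIDE = 64
NARROW_MASK = (1 << NARROW) - 1
WIDE_MASK = (1 << WIDE) - 1


def sign_extend(value):
    """Sign-extend the low NARROW bits of value to WIDE bits."""
    value &= NARROW_MASK
    sign_bit = 1 << (NARROW - 1)
    signed = (value ^ sign_bit) - sign_bit
    return signed & WIDE_MASK


def with_garbage_upper_bits(value, garbage):
    """Model an any-extended value: the upper bits hold arbitrary data."""
    return ((garbage << NARROW) | (value & NARROW_MASK)) & WIDE_MASK


expected = 0xFFFF_FFFF                      # the i32 value -1
loaded = sign_extend(expected)              # what the loop reads from memory
promoted = with_garbage_upper_bits(expected, garbage=0x1234)

# A full-width comparison against the any-extended argument fails...
naive_match = loaded == promoted
# ...but succeeds once the argument is sign-extended first.
fixed_match = loaded == sign_extend(promoted)
print(naive_match, fixed_match)
```

The low 32 bits agree in both cases; only the treatment of the upper bits decides whether the full-width comparison succeeds.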
Guillaume Chatelet	fc63c4d8ce	[Alignment][NFC] Remove remaining uses of MachineFrameInfo::setObjectAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77217	2020-04-01 14:38:05 +00:00
Simon Pilgrim	eb8880562e	[X86][SSE] combinePTESTCC - fold TESTZ(X,~Y) -> TESTC(Y,X)	2020-04-01 15:10:53 +01:00
Guillaume Chatelet	1dffa2550b	[Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign() Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77215	2020-04-01 14:08:28 +00:00
Kai Wang	501522b5b2	[RISCV] Support RISC-V ELF attributes sections in llvm-readobj. Enable llvm-readobj to handle RISC-V ELF attribute sections. Differential Revision: https://reviews.llvm.org/D75833	2020-04-01 21:50:11 +08:00
Simon Pilgrim	be7a233e93	Fix operator precedence warning. NFCI.	2020-04-01 14:36:52 +01:00
Simon Pilgrim	552e46ea1e	Fix unused variable warnings. NFCI.	2020-04-01 14:36:51 +01:00
Benjamin Kramer	b605c56b0f	[ARM] Silence warning in Release builds llvm/lib/Target/ARM/MVEVPTBlockPass.cpp:175:37: error: unused variable 'BlockBeg' [-Werror,-Wunused-variable] MachineBasicBlock::instr_iterator BlockBeg = Iter; ^	2020-04-01 15:29:19 +02:00
Guillaume Chatelet	3a78f44daf	[Alignment][NFC] Convert SelectionDAG::InferPtrAlignment to MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77212	2020-04-01 13:22:11 +00:00
Simon Pilgrim	481413d394	[X86][SSE] matchShuffleWithPACK - generalize zero/signbits matching for any packed src type First step toward making use of canLowerByDroppingEvenElements to match chains of PACKSS/PACKUS for compaction shuffles. At the moment we still only match a single stage but the MatchPACK is now more general.	2020-04-01 14:10:32 +01:00
shchenz	e344f8b9db	Revert "[LSR] re-add testcase for wrongly phi node elimination - NFC" This reverts commit `f25a1b4f58`. ARM and hexagon fail at the new added case.	2020-04-01 12:58:06 +00:00
Guillaume Chatelet	bf573bea19	[Alignment][NFC] Convert MIR Yaml to MaybeAlign Summary: Although it may look like a non-NFC change, it is NFC. In particular, the MIRParser may set `0` to the MachineFrameInfo and MachineFunction, but they all deal with `Align` internally and assume that `0` means `1`. `93fc0ba145/llvm/include/llvm/CodeGen/MachineFrameInfo.h (L483)` This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77203	2020-04-01 12:26:31 +00:00
Pierre-vh	2effe8f5e7	[Target][ARM] Improvements to the VPT Block Insertion Pass This allows the MVE VPT Block insertion pass to remove VPNOTs in order to create more complex VPT blocks such as TE, TEET, TETE, etc. Differential Revision: https://reviews.llvm.org/D75993	2020-04-01 12:34:20 +01:00
Pierre-vh	dad848280d	[Target][ARM] Change VPTMaskValues to the correct encoding VPTMaskValue was using the "instruction" encoding to represent the masks (= the same encoding as the one used by the instructions in an object file), but it is only used to build MCOperands, so it should use the MCOperand encoding of the masks, which is slightly different. Differential Revision: https://reviews.llvm.org/D76139	2020-04-01 12:34:20 +01:00
Benjamin Kramer	66b9f5f7f0	[GVNSink] Simplify code. NFC.	2020-04-01 13:13:00 +02:00
shchenz	f25a1b4f58	[LSR] re-add testcase for wrongly phi node elimination - NFC Retest the case on X86/SystemZ/AArch64/PowerPC	2020-04-01 11:11:17 +00:00
Cullen Rhodes	84aa6cf1a9	[Transforms][SROA] Promote allocas with mem2reg for scalable types Summary: Aggregate types containing scalable vectors aren't supported and as far as I can tell this pass is mostly concerned with optimisations on aggregate types, so the majority of this pass isn't very useful for scalable vectors. This patch modifies SROA such that mem2reg is run on allocas with scalable types that are promotable, but nothing else such as slicing is done. The use of TypeSize in this pass has also been updated to be explicitly fixed size. When invoking the following methods in DataLayout: * getTypeSizeInBits * getTypeStoreSize * getTypeStoreSizeInBits * getTypeAllocSize we now called getFixedSize on the resultant TypeSize. This is quite an extensive change with around 50 calls to these functions, and also the first change of this kind (being explicit about fixed vs scalable size) as far as I'm aware, so feedback welcome. A test is included containing IR with scalable vectors that this pass is able to optimise. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76720	2020-04-01 10:34:11 +00:00
Simon Pilgrim	918ccb64b0	[X86][SSE] Handle basic inversion of PTEST/TESTP operands (PR38522) PTEST/TESTP sets EFLAGS as: TESTZ: ZF = (Op0 & Op1) == 0 TESTC: CF = (~Op0 & Op1) == 0 TESTNZC: ZF == 0 && CF == 0 If we are inverting the 0'th operand of a PTEST/TESTP instruction we can adjust the comparisons to correctly handle the inversion implicitly. Additionally, for "TESTZ" (ZF) cases, the allones case, PTEST(X,-1) can be simplified to PTEST(X,X). We can expand this for the TESTZ(X,~Y) pattern and also handle KTEST/KORTEST in the future. Differential Revision: https://reviews.llvm.org/D76984	2020-04-01 11:33:28 +01:00
shchenz	8b8cd150a4	Revert "[LSR] add testcase for wrongly phi node elimination - NFC" This reverts commit `dbf5e4f6c7`. The testcase has different behaviour on PowerPC and X86.	2020-04-01 10:28:43 +00:00
shchenz	dbf5e4f6c7	[LSR] add testcase for wrongly phi node elimination - NFC	2020-04-01 09:58:58 +00:00
Bjorn Pettersson	ef49895da8	[X86] Do not assume types are legal in getFauxShuffleMask Summary: Make sure we do not assert on value types not being simple in getFauxShuffleMask when analysing operations such as "v8i16 = truncate v8i24". Reviewers: RKSimon Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77136	2020-04-01 11:40:18 +02:00
Guillaume Chatelet	c7468c1696	[Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77149	2020-04-01 09:32:05 +00:00
Georgii Rymar	93fc0ba145	[yaml2obj] - Add NBucket and NChain fields for the SHT_HASH section. These fields allow overriding the nchain and nbucket fields of a SHT_HASH section. Differential revision: https://reviews.llvm.org/D76834	2020-04-01 12:28:16 +03:00
Florian Hahn	d307174e1d	[ConstantRange] Use APInt::or/APInt::and for single elements. Currently ConstantRange::binaryAnd/binaryOr results are too pessimistic for single element constant ranges. If both operands are single element ranges, we can use APInt's AND and OR implementations directly. Note that some other binary operations on constant ranges can cover the single element cases naturally, but for OR and AND this unfortunately is not the case. Reviewers: nikic, spatel, lebedev.ri Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D76446	2020-04-01 09:50:24 +01:00
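The single-element special case described in d307174e1d can be illustrated with a toy range type. This is a hedged Python sketch, not the `ConstantRange` implementation: ranges are modelled as inclusive unsigned `(lo, hi)` tuples over 8 bits, and the conservative fallback bounds are assumptions chosen only so the example is sound; they are not claimed to match LLVM's.

```python
# Toy model: an unsigned range is an inclusive (lo, hi) tuple.
# BITS, UMAX, is_single, binary_and and binary_or are illustrative names.
BITS = 8
UMAX = (1 << BITS) - 1


def is_single(r):
    """True if the range contains exactly one value."""
    return r[0] == r[1]


def binary_and(a, b):
    if is_single(a) and is_single(b):
        v = a[0] & b[0]            # exact result for two constants
        return (v, v)
    # Assumed conservative fallback: x & y never exceeds either operand.
    return (0, min(a[1], b[1]))


def binary_or(a, b):
    if is_single(a) and is_single(b):
        v = a[0] | b[0]            # exact result for two constants
        return (v, v)
    # Assumed conservative fallback: x | y is at least as large as either operand.
    return (max(a[0], b[0]), UMAX)


print(binary_and((12, 12), (10, 10)))
print(binary_or((12, 12), (10, 10)))
print(binary_and((12, 12), (0, 10)))
```

Without the special case, two constant operands would fall through to the range-based bound and lose the exact value.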
Qiu Chaofan	95bcab8272	[DAGCombiner] Require ninf for sqrt recip estimation Currently, DAG combiner uses (fmul (rsqrt x) x) to estimate square root of x. However, this method would return NaN if x is +Inf, which is incorrect. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D76853	2020-04-01 16:23:43 +08:00
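The +Inf problem that motivates 95bcab8272 is easy to reproduce with ordinary floating-point arithmetic. The snippet below is a minimal Python sketch, not the DAG combiner's code: `sqrt_via_rsqrt` is a hypothetical helper that stands in for the `(fmul (rsqrt x) x)` pattern by using an exact reciprocal square root instead of a hardware estimate.

```python
import math


def sqrt_via_rsqrt(x):
    """Estimate sqrt(x) as x * rsqrt(x).

    Uses an exact reciprocal square root as a stand-in for a hardware
    estimate. Intended for x > 0 only (x == 0 would divide by zero here).
    """
    rsqrt = 1.0 / math.sqrt(x)
    return x * rsqrt


finite_result = sqrt_via_rsqrt(4.0)        # 4.0 * 0.5
inf_result = sqrt_via_rsqrt(math.inf)      # inf * 0.0 is NaN
print(finite_result, inf_result, math.sqrt(math.inf))
```

For finite positive inputs the product gives the expected root, but for +Inf the reciprocal square root is 0 and the multiplication yields NaN, whereas the true square root of +Inf is +Inf.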
Florian Hahn	862766e01e	[Verifier] Verify matrix dimensions operands match vector size. This patch adds checks to the verifier to ensure the dimension arguments passed to the matrix intrinsics match the vector types for their arguments/return values. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D77129	2020-04-01 09:21:39 +01:00
Sam Parker	2641a19981	[TTI] Remove getCallCost getCallCost is only used within the different layers of TTI, with no backend implementing it so fold the base implementation into getUserCost. I think this is an NFC. Differential Revision: https://reviews.llvm.org/D77050	2020-04-01 09:05:25 +01:00
Craig Topper	f92563f907	[VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template Reviewers: spatel, efriedma, RKSimon Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77183	2020-04-01 00:46:48 -07:00
Simon Pilgrim	f9f401dba1	[X86][AVX] Add additional 256/512-bit test cases for PACKSS/PACKUS shuffle patterns Also add lowerShuffleWithPACK call to lowerV32I16Shuffle - shuffle combining was catching it but we avoid a lot of temporary shuffle creations if we catch it at lowering first.	2020-04-01 08:19:03 +01:00
Shiva Chen	af0cd9073c	[RISCV] Split RISCVISelDAGToDAG.cpp to RISCVISelDAGToDAG.h and RISCVISelDAGToDAG.cpp For downstream RISCV maintenance, it would be easier to inherit from RISCVISelDAGToDAG by including the header and only overriding the methods that need to be customized for the provider's non-standard ISA extension, without touching RISCVISelDAGToDAG.cpp, which may cause conflicts when upgrading the downstream LLVM version. Differential Revision: https://reviews.llvm.org/D77117	2020-04-01 11:30:21 +08:00
Kai Luo	8eb40e41f6	[PowerPC] Don't generate ST_VSR_SCAL_INT if power8-vector is disabled Summary: In https://bugs.llvm.org/show_bug.cgi?id=45297, instruction selection fails for `PPCISD::ST_VSR_SCAL_INT`. The reason `PPCISD::ST_VSR_SCAL_INT` is generated with `-power8-vector` in IR is that PPC's combiner checks `hasP8Altivec` rather than `hasP8Vector`. This patch should resolve PR45297. Differential Revision: https://reviews.llvm.org/D76773	2020-04-01 02:15:25 +00:00
Shengchen Kan	d0efd7bfcf	[X86][MC] Disable Prefix padding after hardcode/prefix Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight, eli.friedman Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76475	2020-04-01 09:49:52 +08:00
Matt Arsenault	43e576593e	AMDGPU/GlobalISel: Fix insert point when lowering G_FMAD	2020-03-31 19:57:06 -04:00
Fangrui Song	a3eb3d3d92	[Support] Delete ioctl TIOCGWINSZ D61326 essentially disabled `ioctl(FileID, TIOCGWINSZ, &ws)`. Nobody has complained for one year. So let's just delete the code.	2020-03-31 16:41:09 -07:00
Eli Friedman	ba4764c2cc	Fix leak in GVNSink introduced in D72467.	2020-03-31 16:21:27 -07:00
Evgenii Stepanov	f9471b0010	Fix MSan false positive due to select folding. Summary: Select folding in JumpThreading can create a conditional branch on a code path that did not have one in the original program. This is not a valid transformation in sanitize_memory functions. Note that JumpThreading does select folding in 3 different places. Two of them seem safe - they apply to a select instruction in a BB that ends with an unconditional branch to another BB, which (in turn) ends with a conditional branch or a switch with the same condition. Fixes PR45220. Reviewers: glider, dvyukov, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76332	2020-03-31 15:25:42 -07:00
Fangrui Song	4af7560b37	[PPCInstPrinter] Print conditional branches as `bt 2, $target` instead of `bt 2, .+$imm` Follow-up of D76591. Reviewed By: #powerpc, sfertile Differential Revision: https://reviews.llvm.org/D76907	2020-03-31 15:05:38 -07:00
Hubert Tong	478af4479a	[Object] Update ObjectFile::makeTriple for XCOFF Summary: When we encounter an XCOFF file, reflect that in the triple information. In addition to knowing the object file format, we know that the associated OS is AIX. This means that we can expect that there is no output difference in the processing of an XCOFF32 input file between cases where the triple is left unspecified by the user and cases where the user specifies `--triple powerpc-ibm-aix` explicitly. Reviewers: jhenderson, sfertile, jasonliu, daltenty Reviewed By: jasonliu Subscribers: wuzish, nemanjai, hiraditya, MaskRay, rupprecht, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77025	2020-03-31 17:26:30 -04:00
Daniel Frampton	494abe139a	[AArch64] Change AArch64 Windows EH UnwindHelp object to be a fixed object The UnwindHelp object is used during exception handling by runtime code. It must be findable from a fixed offset from FP. This change allocates the UnwindHelp object as a fixed object (as is done for x86_64) to ensure that both the generated code and runtime agree on the location of the object. Fixes https://bugs.llvm.org/show_bug.cgi?id=45346 Differential Revision: https://reviews.llvm.org/D77016	2020-03-31 14:21:21 -07:00
Daniel Frampton	522b4c4b88	[AArch64] Fix mismatch in prologue and epilogue for funclets on Windows The generated code for a funclet can have an add to sp in the epilogue for which there is no corresponding sub in the prologue. This patch removes the early return from emitPrologue that was preventing the sub to sp, and instead conditionalizes the appropriate parts of the rest of the function. Fixes https://bugs.llvm.org/show_bug.cgi?id=45345 Differential Revision: https://reviews.llvm.org/D77015	2020-03-31 14:21:18 -07:00
Anna Thomas	58a05675da	Revert "[InlineFunction] Handle return attributes on call within inlined body" This reverts commit `28518d9ae3`. There is a failure in MsgPackReader.cpp when built with clang. It complains that "signext and zeroext" are incompatible. Investigating offline whether it is in fact UB in the MsgPackReader code.	2020-03-31 16:16:34 -04:00
Nikita Popov	b7fe795e5b	[InstCombine] Use replaceOperand() in some select transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-03-31 22:10:55 +02:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Nikita Popov	c538c57d6d	[InstCombine] Use replaceOperand() in descaling To make sure the old operand gets DCEd. NFC apart from worklist order.	2020-03-31 22:05:53 +02:00
Nikita Popov	19df7fa892	[InstCombine] Erase old alloca in cast of alloca transform As we don't return the replaceInstUsesWith() result, we are responsible for erasing the instruction. NFC apart from worklist order.	2020-03-31 21:57:39 +02:00
Nikita Popov	87357808b8	[InstCombine] Use replaceOperand() in non zero phi transform To make sure the old operand gets DCEd. NFC apart from worklist order changes.	2020-03-31 21:54:21 +02:00
Nikita Popov	f3d4166368	[InstCombine] Report change in non zero phi transform We need to inform InstCombine (and transitively the pass manager) that we changed an instruction.	2020-03-31 21:52:40 +02:00
Eli Friedman	dacf8d3562	[AArch64][SVE] Add support for fcmp. This also requires support for boolean "not", so I added boolean logic while I was there. Differential Revision: https://reviews.llvm.org/D76901	2020-03-31 12:04:39 -07:00
Guozhi Wei	6d20937c29	[CodeGenPrepare] Delete intrinsic call to llvm.assume to enable more tailcall The attached test case is simplified from tcmalloc. Both function calls should be optimized as tailcall. But llvm can only optimize the first call. The second call can't be optimized because function dupRetToEnableTailCallOpts failed to duplicate ret into block case2. Two problems blocked the duplication: 1 Intrinsic call llvm.assume is not handled by dupRetToEnableTailCallOpts. 2 The control flow is more complex than expected, dupRetToEnableTailCallOpts can only duplicate ret into its predecessor, but here we have an intermediate block between call and ret. The solutions: 1 Since CodeGenPrepare is already at the end of LLVM IR phase, we can simply delete the intrinsic call to llvm.assume. 2 A general solution to the complex control flow is hard, but for this case, after exit2 is duplicated into case1, exit2 is the only successor of exit1 and exit1 is the only predecessor of exit2, so they can be combined through eliminateFallThrough. But this function is called too late, there is no more dupRetToEnableTailCallOpts after it. We can add an earlier call to eliminateFallThrough to solve it. Differential Revision: https://reviews.llvm.org/D76539	2020-03-31 11:55:51 -07:00
Stanislav Mekhanoshin	08682dcc86	[AMDGPU] Define 16 bit VGPR subregs We have loads preserving low and high 16 bits of their destinations. However, we always use a whole 32 bit register for these. The same happens with 16 bit stores, we have to use full 32 bit register so if high bits are clobbered the register needs to be copied. One example of such code is added to the load-hi16.ll. The proper solution to the problem is to define 16 bit subregs and use them in the operations which do not read another half of a VGPR or preserve it if the VGPR is written. This patch simply defines subregisters and register classes. At the moment there should be no difference in code generation. A lot more work is needed to actually use these new register classes. Therefore, there are no new tests at this time. Register weight calculation has changed with new subregs so appropriate changes were made to keep all calculations just as they are now, especially calculations of register pressure. Differential Revision: https://reviews.llvm.org/D74873	2020-03-31 11:49:06 -07:00
Anna Thomas	28518d9ae3	[InlineFunction] Handle return attributes on call within inlined body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate those attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. See added test cases. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-03-31 14:35:40 -04:00

133482 Commits