llvm-project

Commit Graph

Author	SHA1	Message	Date
Mircea Trofin	d2f1cd5d97	[llvm][NFC] Refactor uses of CallSite to CallBase - call promotion Summary: Updated CallPromotionUtils and impacted sites. Parameters that are expected to be non-null, and return values that are guranteed non-null, were replaced with CallBase references rather than pointers. Left FIXME in places where more changes are facilitated by CallBase, but aren't CallSites: Instruction* parameters or return values, for example, where the contract that they are actually CallBase values. Reviewers: davidxl, dblaikie, wmi Reviewed By: dblaikie Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77930	2020-04-12 08:27:29 -07:00
Chris Lattner	617b08ff9b	Refactor StringMap.h, splitting StringMapEntry out to its own header. Summary: StringMapEntry.h can have lower dependencies, than StringMap.h, which is useful for public headers that want to expose inline methods on StringMapEntry<> but don't need to expose all of StringMap.h. One example of this is mlir's Identifier.h, another example is the existing LLVM StringPool.h. StringPool also could use a cleanup, I'll deal with that in a follow-on patch. Reviewers: rriddle Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77963	2020-04-12 08:25:17 -07:00
Sanjay Patel	d04db4825a	[x86] use vector instructions to lower FP->int->FP casts As discussed in PR36617: https://bugs.llvm.org/show_bug.cgi?id=36617#c13 ...we can avoid the likely slow round-trip from XMM to GPR to XMM by using the vector versions of the convert instructions. Based on experimental results from recent Intel/AMD chips, we don't need to worry about triggering denorm stalls while operating on garbage data in the high lanes with convert instructions, so this is expected to always be as good or better perf than the scalar instruction equivalent. FP exceptions are also not a concern because strict code should not be using the regular SDAG opcodes. Differential Revision: https://reviews.llvm.org/D77895	2020-04-12 10:26:43 -04:00
Sanjay Patel	c23cbefd9d	[VectorUtils] add IR-level analysis for widening of shuffle mask This is similar to the recent move/addition of "scaleShuffleMask" (D76508), but there are a couple of differences: 1. The existing x86 helper (canWidenShuffleElements) always tries to divide-by-2, so it gets called iteratively and wouldn't handle the general case of non-pow-2 length. 2. The existing x86 code handles "SM_SentinelZero" - we don't have that in IR, but this code should be safe to use with that or other special (negative) values. The motivation is to enable shuffle folds in instcombine/vector-combine that are similar to D76844 and D76727, but in the reverse-bitcast direction. Those patterns are visible in the tests for D40633. Differential Revision: https://reviews.llvm.org/D77881	2020-04-12 10:14:19 -04:00
Simon Pilgrim	2b74755ec5	TrigramIndex.h - remove unnecessary StringMap.h include. NFC Include StringRef.h inside TrigramIndex.cpp as thats the only part of StringMap.h that is actually required.	2020-04-12 14:30:52 +01:00
Florian Hahn	ae1e353a25	[VPlan] Turn classes with all public members into structs (NFC). struct should be used when all members are public: https://llvm.org/docs/CodingStandards.html#use-of-class-and-struct-keywords Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77865	2020-04-12 11:03:39 +01:00
Simon Pilgrim	40581a0a2b	[X86] Use isAnyZero shuffle mask helper where possible. NFC.	2020-04-12 10:57:29 +01:00
Craig Topper	5b42399029	[CallSite removal][FastISel] Remove uses of CallSite. Differential Revision: https://reviews.llvm.org/D77933	2020-04-11 20:52:45 -07:00
Craig Topper	d3465e0691	[X86] Enable shuffle combining for AVX512 unless the root is used by a vselect A lot of vectorized code doesn't use masks so we shouldn't penalize them by not doing shuffle combining on avx512 targets. I've added support for VALIGNQ/VALIGND and 512-bit SHUF128 to prevent some regressions. I also prevented recombining 256-bit SHUF128 to PERM2X128. We may not need to add 256-bit SHUF128 support, but I don't think I found any cases requiring that in my testing. Differential Revision: https://reviews.llvm.org/D77928	2020-04-11 20:05:10 -07:00
Matt Arsenault	96819011ca	AMDGPU/GlobalISel: Fix RegBankSelect for v2s16 shifts These need to be promoted and scalarized for the SALU.	2020-04-11 20:55:33 -04:00
David Blaikie	7a45aeacf3	Revert "llvm-dwarfdump: Report errors when failing to parse loclist/debug_loc entries" Broke an LLDB build bot & I can't seem to build LLDB locally to fix forward... http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/15567/steps/test/logs/stdio This reverts commit `416fa7720e`.	2020-04-11 16:54:49 -07:00
Matt Arsenault	ac8d51a3c6	AMDGPU/GlobalISel: Legalize 16-bit shift amounts to s16 The current selector depends on 16-bit shifts using 16-bit shift amount types, but really it should accept either for all types.	2020-04-11 18:12:26 -04:00
Craig Topper	d1da1b53ff	[X86] Cleanup ISD::BRIND handling code in X86DAGToDAGISel::Select. NFC -Drop llvm:: on MVT::i32 -Use getValueType instead of getSimpleValueType for an equality check just cause its shorter and doesn't matter. -Don't create a const SDValue & since its cheap to copy. -Remove explicit case from MVT enum to EVT. -Add message to assert.	2020-04-11 15:01:05 -07:00
Craig Topper	21a7d08e72	[X86] Move code that replaces ISD::VSELECT with X86ISD::BLENDV from X86DAGToDAGISel::Select to PreprocessISelDAG	2020-04-11 15:01:05 -07:00
Craig Topper	806763efcf	[CallSite removal][SelectionDAGBuilder] Use CallBase instead of ImmutableCallSite in visitPatchpoint. Differential Revision: https://reviews.llvm.org/D77932	2020-04-11 13:07:31 -07:00
Matt Arsenault	1747ba25b2	GlobalISel: Fix typo in assert message	2020-04-11 16:02:26 -04:00
Matt Arsenault	c5497e5399	AMDGPU/GlobalISel: Fix legalizing <3 x s16> vselects	2020-04-11 15:59:51 -04:00
Hongtao Yu	11455a7905	[CodeGen] Allow partial tail duplication in Machine Block Placement. Summary: A count profile may affect tail duplication's heuristic causing a block to be duplicated in only a part of its predecessors. This is not allowed in the Machine Block Placement pass where an assert will go off. I'm removing the assert and making the optimization bail out when such case happens. Reviewers: wenlei, davidxl, Carrot Reviewed By: Carrot Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77748	2020-04-11 12:20:31 -07:00
Fangrui Song	0a55d3f557	[MC] Default MCAsmInfo::UseIntegratedAssembler to true	2020-04-11 10:13:52 -07:00
Fangrui Song	d2e5157c1f	[MC] Add UseIntegratedAssembler = false. NFC	2020-04-11 10:13:49 -07:00
Benjamin Kramer	52dcbcbfe0	Simplify string joins. NFCI.	2020-04-11 17:20:11 +02:00
Matt Arsenault	cf29333f40	AMDGPU/GlobalISel: Work around forming illegal zextload after legalize Selection would fail after the post legalize combiner put an illegal zextload back together. The base combiner has parameter to only allow legal operations, but they appear to not be used. I also don't see a nice way to remove a single entry from all_combines, so just hack around this.	2020-04-11 10:52:58 -04:00
Sanjay Patel	1318ddbc14	[VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC As proposed in D77881, we'll have the related widening operation, so this name becomes too vague. While here, change the function signature to take an 'int' rather than 'size_t' for the scaling factor, add an assert for overflow of 32-bits, and improve the documentation comments.	2020-04-11 10:05:49 -04:00
Benjamin Kramer	e590bd6b92	[argpromote] Use formatv to simplify code. NFCI.	2020-04-11 14:54:32 +02:00
Benjamin Kramer	5ef2cb3df4	[FormatVariadic] Reduce allocations - Move Adapters array to the stack, we know the size precisely - Parse format string on demand into a SmallVector. In theory this could lead to parsing it multiple times, but I couldn't find a single instance of that in LLVM. - Make more of the implementation details private.	2020-04-11 14:54:32 +02:00
Nemanja Ivanovic	512600e3c0	[PowerPC] Handle f16 as a storage type only The PPC back end currently crashes (fails to select) with f16 input. This patch expands it on subtargets prior to ISA 3.0 (Power9) and uses the HW conversions on Power9. Fixes https://bugs.llvm.org/show_bug.cgi?id=39865 Differential revision: https://reviews.llvm.org/D68237	2020-04-11 07:34:47 -05:00
Florian Hahn	719846c469	[VPlan] Drop redundant private: at beginning of class defs (NFC). Default visibility for classes is private, so the private: at the top of various class definitions is redundant. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77810	2020-04-11 13:27:10 +01:00
Simon Pilgrim	89f6ca05b7	CodeGen/EdgeBundles - move Twine.h include down into EdgeBundles.cpp. NFC. EdgeBundles.h has no use for it.	2020-04-11 12:21:04 +01:00
Craig Topper	9c1842d8af	Change FastISel::CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This is the same as what was done to the CallLoweringInfo in TargetLowering.h in r309159. This is just a step on the way to replacing this with CallBase.	2020-04-10 23:45:36 -07:00
Shengchen Kan	5d73f79c54	[X86][MC] Make -x86-pad-max-prefix-size compatible with --mc-relax-all Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it. Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`. Reviewers: LuoYuanke, reames, MaskRay Reviewed By: MaskRay Subscribers: MaskRay, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77851	2020-04-11 11:30:15 +08:00
Craig Topper	f49f6cf91e	[CallSite removal][SelectionDAGBuilder] Remove most CallSite usage from visitInlineAsm. I only left it at the interface to ParseConstraints since that needs updates to other callers in different files. I'll do that as a follow up. Differential Revision: https://reviews.llvm.org/D77892	2020-04-10 19:23:33 -07:00
Nemanja Ivanovic	04eae39617	[PowerPC] Another folow-up fix for `6c4b40def7` There was another issue introduced by this commit that the OP initially missed. Namely, for functions that are free to use R2 as a callee-saved register, we emit a TOC expression based on the address of the GEP label without emitting the GEP label. Since we only emit such expressions for the large code model, this issue only surfaced there. I have confirmed that with this fix, the kernel build is successful with target "all".	2020-04-10 21:09:59 -05:00
Scott Constable	0505181006	[X86] Fix to X86LoadValueInjectionRetHardeningPass for possible segfault `MBB.back()` could segfault if `MBB.empty()`. Fixed by checking for `MBB.empty()` in the loop. Differential Revision: https://reviews.llvm.org/D77584	2020-04-10 18:28:08 -07:00
Mehdi Amini	ed03d9485e	Revert "[TLI] Per-function fveclib for math library used for vectorization" This reverts commit `60c642e74b`. This patch is making the TLI "closed" for a predefined set of VecLib while at the moment it is extensible for anyone to customize when using LLVM as a library. Reverting while we figure out a way to re-land it without losing the generality of the current API. Differential Revision: https://reviews.llvm.org/D77925	2020-04-11 01:05:01 +00:00
Matt Arsenault	49ae0fc2f0	GlobalISel: Fix incorrect lowering G_FCOPYSIGN In the basic case, this was reading the sign from the wrong operand.	2020-04-10 21:00:25 -04:00
Huihui Zhang	6e7eeb44b3	[GVN] Fix VNCoercion for Scalable Vector. Summary: For VNCoercion, skip scalable vector when analysis rely on fixed size, otherwise call TypeSize::getFixedSize() explicitly. Add unit tests to check funtionality of GVN load elimination for scalable type. Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau Reviewed By: efriedma Subscribers: bjope, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76944	2020-04-10 17:49:07 -07:00
David Blaikie	416fa7720e	llvm-dwarfdump: Report errors when failing to parse loclist/debug_loc entries This probably isn't ideal - the error was being printed specifically inline with the dumping that was more legible - but then the error wasn't reported to stderr and didn't produce a non-zero exit code. Probably the error message could be improved by adding more context now that it isn't printed in-situ of the DIE dumping as much.	2020-04-10 17:28:09 -07:00
Eric Christopher	45dca04395	Exclude bitcast and ext/trunc signbit optimization on ppc_fp128 Revision `a1c05fe` <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad> removed bitcast from the list of problematic transformations, however: %97 = fptrunc ppc_fp128 %2 to double // we need to check ppc_fp128 here to prevent the transformation %98 = bitcast double %97 to i64 // `a1c05fe` checks ppc_fp128 at here %99 = icmp slt i64 %98, 0 %100 = zext i1 %99 to i8 store i8 %100, i8* %7, align 1 so this patch does that. I'm also disabling it in the presence of extend just in case. I verified separately that the hash of -std::infinity and std::infinity don't match now. Differential Revision: https://reviews.llvm.org/D77911	2020-04-10 17:07:55 -07:00
Huihui Zhang	6c989d0248	[BasicAA] Fix aliasGEP/DecomposeGEPExpression for scalable type. Summary: Don't attempt to analyze the decomposed GEP for scalable type. GEP index scale is not compile-time constant for scalable type. Be conservative, return MayAlias. Explicitly call TypeSize::getFixedSize() to assert on places where scalable type doesn't make sense. Add unit tests to check functionality of -basicaa for scalable type. This patch is needed for D76944. Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77828	2020-04-10 16:58:26 -07:00
Sam Clegg	16206ee07d	[WebAssembly] Minor cleanup to WebAssemblySubtarget. NFC. Pretty much all other platforms pass CPU string as arg0 of initializeSubtargetDependencies. Differential Revision: https://reviews.llvm.org/D77894	2020-04-10 16:47:39 -07:00
Daniel Sanders	f71350f05a	Add -debugify-and-strip-all to add debug info before a pass and remove it after Summary: This allows us to test each backend pass under the presence of debug info using pre-existing tests. The tests should not fail as a result of this so long as it's true that debug info does not affect CodeGen. In practice, a few tests are sensitive to this: * Tests that check the pass structure (e.g. O0-pipeline.ll) * Tests that check --debug output. Specifically instruction dumps containing MMO's (e.g. prelegalizercombiner-extends.ll) * Tests that contain debugify metadata as mir-strip-debug will remove it (e.g. fastisel-debugvalue-undef.ll) * Tests with partial debug info (e.g. patchable-function-entry-empty.mir had debug info but no !llvm.dbg.cu) * Tests that check optimization remarks overly strictly (e.g. prologue-epilogue-remarks.mir) * Tests that would inject the pass in an unsafe region (e.g. seqpairspill.mir would inject between register alloc and virt reg rewriter) In all cases, the checks can either be updated or --debugify-and-strip-all-safe=0 can be used to avoid being affected by something like llvm-lit -Dllc='llc --debugify-and-strip-all-safe' I tested this without the lost debug locations verifier to confirm that AArch64 behaviour is unaffected (with the fixes in this patch) and with it to confirm it finds the problems without the additional RUN lines we had before. Depends on D77886, D77887, D77747 Reviewers: aprantl, vsk, bogner Subscribers: qcolombet, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77888	2020-04-10 16:36:07 -07:00
Lang Hames	59ed45b483	[ORC] Add an OrcV2 C API function for configuring TargetMachines.	2020-04-10 15:51:29 -07:00
Mircea Trofin	da9bcdaad9	[llvm][NFC] Inliner.cpp: ensure InlineHistory ID is always initialized; Summary: The inline history is associated with a call site. There are two locations we fetch inline history. In one, we fetch it together with the call site. In the other, we initialize it under certain conditions, use it later under same conditions (different if check), and otherwise is uninitialized. Although currently there is no uninitialized use, the code is more challenging to maintain correctly, than if the value were always initialized. Changed to the upfront initialization pattern already present in this file. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77877	2020-04-10 15:28:53 -07:00
Daniel Sanders	dfca98d6a8	[mir-strip-debug] Optionally preserve debug info that wasn't from debugify/mir-debugify Summary: A few tests start out with debug info and expect it to reach the output. For these tests we shouldn't strip the debug info Reviewers: aprantl, vsk, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77886	2020-04-10 15:24:14 -07:00
Christopher Tetreault	889f6606ed	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: stoklund, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77272	2020-04-10 14:53:43 -07:00
Christopher Tetreault	40ed21bb71	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77276	2020-04-10 14:18:47 -07:00
Christopher Tetreault	2a922da3a9	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77274	2020-04-10 13:58:11 -07:00
Daniel Sanders	c162bc2aed	Make TargetPassConfig and llc add pre/post passes the same way. NFC Summary: At the moment, any changes we make to the passes that can be injected before/after others (e.g. -verify-machineinstrs and -print-after-all) have to be duplicated in both TargetPassConfig (for normal execution, -start-before/ -stop-before/etc) and llc (for -run-pass). Unify this pass injection into addMachinePrePass/addMachinePostPass that both TargetPassConfig and llc can use. Reviewers: vsk, aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77887	2020-04-10 13:46:53 -07:00
Craig Topper	b8a108140d	[CallSite removal][X86] Remove uses of CallSite from X86WinEHState.cpp Differential Revision: https://reviews.llvm.org/D77862	2020-04-10 11:34:06 -07:00
Marcello Maggioni	ea11f4726f	Split LiveRangeCalc in LiveRangeCalc/LiveIntervalCalc. NFC Summary: Refactor LiveRangeCalc such that it is now split into two classes The objective is to split all the "register specific" logic away from LiveRangeCalc. The two new classes created are: - LiveRangeCalc - is meant as a generic class to compute and modify live ranges in a generic way. This class should deal only with SlotIndices and VNInfo objects. - LiveIntervalCals - is meant to be equivalent to the old LiveRangeCalc. It computes the liveness virtual registers tracked by a LiveInterval object. With this refactoring LiveRangeCalc can be used to implement tracking of liveness of LiveRanges that represent other things than just registers. Subscribers: MatzeB, qcolombet, mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76584	2020-04-10 11:26:21 -07:00
Sumanth Gundapaneni	a04ab2ec08	[Pipeliner] Fix the bug in pragma that disables the pipeliner. Differential Revision: https://reviews.llvm.org/D76303.	2020-04-10 12:52:16 -05:00
Matt Morehouse	bef187c750	Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang Summary: This commit adds two command-line options to clang. These options let the user decide which functions will receive SanitizerCoverage instrumentation. This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing. Patch by Yannis Juglaret of DGA-MI, Rennes, France libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy. Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions. The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`. Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists: ``` # (a) src:* fun:* # (b) src:SRC/* fun:* # (c) src:SRC/src/woff2_dec.cc fun:* ``` Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points. For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s. We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction. The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise. Reviewers: kcc, morehouse, vitalybuka Reviewed By: morehouse, vitalybuka Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D63616	2020-04-10 10:44:03 -07:00
Fangrui Song	a7aaaf7016	[MC][RISCV] Make .reloc support arbitrary relocation types Similar to D76746 (ARM), D76754 (AArch64) and llvmorg-11-init-6967-g152d14da64c (x86) Differential Revision: https://reviews.llvm.org/D77018	2020-04-10 10:43:53 -07:00
Matt Arsenault	4593e4131a	AMDGPU: Teach toolchain to link rocm device libs Currently the library is separately linked, but this isn't correct to implement fast math flags correctly. Each module should get the version of the library appropriate for its combination of fast math and related flags, with the attributes propagated into its functions and internalized. HIP already maintains the list of libraries, but this is not used for OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument, despite both languages using the same bitcode library. Eventually these two searches need to be merged. An additional problem is there are 3 different locations the libraries are installed, depending on which build is used. This also needs to be consolidated (or at least the search logic needs to deal with this unnecessary complexity).	2020-04-10 13:37:32 -04:00
Craig Topper	a6732069ee	[CallSite removal][X86] Remove unneeded use of CallSite. NFC We already have a CallInst, we can just get the calling convention from it.	2020-04-10 10:27:21 -07:00
Kevin P. Neal	7f38812d5b	[FPEnv][AArch64] Platform-specific builtin constrained FP enablement When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that. Neon is part of this patch, so ARM is affected as well. Differential Revision: https://reviews.llvm.org/D77074	2020-04-10 13:02:00 -04:00
Fangrui Song	7f36cb1f1a	[AArch64InstPrinter] Change printAlignedLabel to print the target address in hexadecimal form Similar to D76580 (x86) and D76591 (PPC). ``` // llvm-objdump -d output (before) 10000: 08 00 00 94 bl #32 10004: 08 00 00 94 bl #32 // llvm-objdump -d output (after) 10000: 08 00 00 94 bl 0x10020 10004: 08 00 00 94 bl 0x10024 // GNU objdump -d. The lack of 0x is not ideal due to ambiguity. 10000: 94000008 bl 10020 <bar+0x18> 10004: 94000008 bl 10024 <bar+0x1c> ``` The new output makes it easier to find the jump target. Differential Revision: https://reviews.llvm.org/D77853	2020-04-10 09:21:09 -07:00
Simon Pilgrim	1824ae0f42	[X86] Remove defunct EmitLoweredAtomicFP declaration. NFC.	2020-04-10 17:05:07 +01:00
Simon Pilgrim	dd84a2f77a	[X86] Remove defunct emitFMA3Instr declaration. NFC.	2020-04-10 17:05:06 +01:00
Christopher Tetreault	65b8b643b4	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, efriedma, jonpa Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77265	2020-04-10 08:43:32 -07:00
Simon Pilgrim	a88cc20456	ProfileSummaryInfo.h - remove unnecessary includes. NFC Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration. This exposed a couple of source files that were implicitly replying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.	2020-04-10 16:25:48 +01:00
Stanislav Mekhanoshin	44920e8566	[AMDGPU] Disable sub-dword scralar loads IR widening These will be widened in the DAG. In the meanwhile early widening prevents otherwise possible vectorization of such loads. Differential Revision: https://reviews.llvm.org/D77835	2020-04-10 08:20:49 -07:00
Mircea Trofin	f62335b534	[llvm][NFC] Style fixes in Inliner.cpp Summary: Function names: camel case, lower case first letter. Variable names: start with upper letter. For iterators that were 'i', renamed with a descriptive name, as 'I' is 'Instruction&'. Lambda captures simplification. Opportunistic boolean return simplification. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77837	2020-04-10 08:04:39 -07:00
Ilya Leoshkevich	3bc439bdff	[MSan] Add instrumentation for SystemZ Summary: This patch establishes memory layout and adds instrumentation. It does not add runtime support and does not enable MSan, which will be done separately. Memory layout is based on PPC64, with the exception that XorMask is not used - low and high memory addresses are chosen in a way that applying AndMask to low and high memory produces non-overlapping results. VarArgHelper is based on AMD64. It might be tempting to share some code between the two implementations, but we need to keep in mind that all the ABI similarities are coincidental, and therefore any such sharing might backfire. copyRegSaveArea() indiscriminately copies the entire register save area shadow, however, fragments thereof not filled by the corresponding visitCallSite() invocation contain irrelevant data. Whether or not this can lead to practical problems is unclear, hence a simple TODO comment. Note that the behavior of the related copyOverflowArea() is correct: it copies only the vararg-related fragment of the overflow area shadow. VarArgHelper test is based on the AArch64 one. s390x ABI requires that arguments are zero-extended to 64 bits. This is particularly important for __msan_maybe_warning_() and __msan_maybe_store_origin_() shadow and origin arguments, since non zeroed upper parts thereof confuse these functions. Therefore, add ZExt attribute to the corresponding parameters. Add ZExt attribute checks to msan-basic.ll. Since with -msan-instrumentation-with-call-threshold=0 instrumentation looks quite different, introduce the new CHECK-CALLS check prefix. Reviewers: eugenis, vitalybuka, uweigand, jonpa Reviewed By: eugenis Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, stefansf, Andreas-Krebbel Tags: #llvm Differential Revision: https://reviews.llvm.org/D76624	2020-04-10 16:53:49 +02:00
Christopher Tetreault	3bebf02861	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77262	2020-04-10 07:47:19 -07:00
Jessica Clarke	49e20c4c9e	[RISCV] Consume error from parsing attributes section Summary: We don't consume the error from getBuildAttributes, so an assertions build crashes with "Program aborted due to an unhandled Error:". Explicitly consume it like the ARM version in that case. Reviewers: asb, jhenderson, MaskRay, HsiangKai Reviewed By: MaskRay Subscribers: kristof.beyls, hiraditya, simoncook, kito-cheng, shiva0217, rogfer01, rkruppe, psnobl, benna, Jim, lenary, s.egerton, sameer.abuasal, luismarques, evandro, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77841	2020-04-10 15:05:53 +01:00
Simon Pilgrim	91bc50c0d7	[CostModel][X86] Improve InsertElement costs for sub-128bit vectors If we're inserting into v2i8/v4i8/v8i8/v2i16/v4i16 style sub-128bit vectors ensure we don't use the SK_PermuteTwoSrc cost of the legalized value type - this is a followup to rG12c629ec6c59 which added equivalent sub-128bit shuffle costs	2020-04-10 14:55:46 +01:00
Florian Hahn	1a02aaeaa4	[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef. For non-integer constants/expressions and overdefined, I think we can just use SimplifyBinOp to do common folds. By just passing a context with the DL, SimplifyBinOp should not try to get additional information from looking at definitions. For overdefined values, it should be enough to just pass the original operand. Note: The comment before the `if (isconstant(V1State)...` was wrong originally: isConstant() also matches integer ranges with a single element. It is correct now. Reviewers: efriedma, davide, mssimpso, aartbik Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76459	2020-04-10 11:02:57 +01:00
Mehdi Amini	bbeeb35c1f	Revert "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff." This reverts commit `0445c64998`. MLIR Build is broken by this change at the moment.	2020-04-10 07:44:06 +00:00
Alina Sbirlea	0445c64998	[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff. This replaces the ChildrenGetter inside the DominatorTree with GraphTraits over a GraphDiff object, an object which encapsulated the view of the previous CFG. This also simplifies the extentions in clang which use DominatorTree, as GraphDiff also filters nullptrs. Re-land `a90374988e` after moving CFGDiff.h to Support. Differential Revision: https://reviews.llvm.org/D77341	2020-04-10 07:38:53 +00:00
Michael Liao	b54b4ecac3	Fix `-Wextra` warning. NFC.	2020-04-10 03:22:02 -04:00
Mehdi Amini	57d2d48399	Revert "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff." This reverts commit `a90374988e` and `5da1671bf8`. A new dependency is introduced here from Support to IR which seems like a layering violation. It also breaks the MLIR build at the moment.	2020-04-10 06:27:59 +00:00
Kai Luo	b7d5229d78	[PowerPC] Update alignment for ReuseLoadInfo in LowerFP_TO_INTForReuse In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is set for the `MachineMemOperand`, but RLI(ReuseLoadInfo)'s alignment is not updated for following loads. It's related to failed alignment check reported in https://bugs.llvm.org/show_bug.cgi?id=45297 Differential Revision: https://reviews.llvm.org/D77624	2020-04-10 05:49:19 +00:00
John McCall	8423a6f363	Rename OptimalLayout to OptimizedStructLayout at Chris's request.	2020-04-10 00:14:20 -04:00
David Blaikie	e0fd87cc64	llvm-dwarfdump: Return non-zero on error Makes it easier to test "this doesn't produce an error" (& indeed makes that the implied default so we don't accidentally write tests that have silent/sneaky errors as well as the positive behavior we're testing for) Though the support for applying relocations is patchy enough that a bunch of tests treat lack of relocation application as more of a warning than an error - so rather than me trying to figure out how to add support for a bunch of relocation types, let's degrade that to a warning to match the usage (& indeed, it's sort of more of a tool warning anyway - it's not that the DWARF is wrong, just that the tool can't fully cope with it - and it's not like the tool won't dump the DWARF, it just won't follow/render certain relocations - I guess in the most general case it might try to render an unrelocated value & instead render something bogus... but mostly seems to be about interesting relocations used in eh_frame (& honestly it might be nice if we were lazier about doing this relocation resolution anyway - if you're not dumping eh_frame, should we really be erroring about the relocations in it?))	2020-04-09 20:53:58 -07:00
Nemanja Ivanovic	7f3787c0f2	[PowerPC] Bail out of redundant LI elimination on an implicit kill The transformation currently does not differentiate between explicit and implicit kills. However, it is not valid to later simply clear an implicit kill flag since the kill could be due to a call or return. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45374	2020-04-09 22:17:29 -05:00
Serguei Katkov	4275eb1331	Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This require runtime support to know how to take a value from register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, danstrushin Reviewed By: dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77797	2020-04-10 10:13:39 +07:00
Max Kazantsev	4e87823026	[LoopLoadElim] Fix crash by always checking simplify form Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775	2020-04-10 09:23:28 +07:00
Heejin Ahn	b647de9925	[WebAssembly] Use dummy debug info in Emscripten SjLj Summary: D74269 added debug info to newly created instructions, including calls to `malloc` and `free`, by taking debug info from existing instructions around, whose debug info may or may not be empty. But there are cases debug info is required by the IR verifier: when both the caller and the callee functions have DISubprograms, meaning we already have declarations to `malloc` or `free` with a DISubprogram attached, newly created calls to `malloc` and `free` should have non-empty debug info. This patch creates a non-empty dummy debug info in this case to those calls to make the IR verifier pass. Fixes https://bugs.llvm.org/show_bug.cgi?id=45461. Reviewers: dschuff Subscribers: aprantl, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77784	2020-04-09 18:44:50 -07:00
Wenlei He	60c642e74b	[TLI] Per-function fveclib for math library used for vectorization Summary: Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads the attributes and select available vector function list based on that. Now we also populate function list for all supported vector libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now. Subscribers: hiraditya, dexonsmith, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77632	2020-04-09 18:26:38 -07:00
Stefan Pintilie	5b18b6e9a8	[PowerPC][Future] Fix for `6c4b40def7` This is a fix for the previous patch `6c4b40def7`. In some cases it may be possible to have the compiler produce st_other=1 without the compiler using mcpu=future which should not be the case. This patch adds a guard to make sure that if we are using st_other=1 then we are also compiling for future CPU.	2020-04-10 01:12:11 +00:00
Alina Sbirlea	a90374988e	[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff. Summary: This replaces the ChildrenGetter inside the DominatorTree with GraphTraits over a GraphDiff object, an object which encapsulated the view of the previous CFG. This also simplifies the extentions in clang which use DominatorTree, as GraphDiff also filters nullptrs. Reviewers: kuhar, dblaikie, NutshellySima Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77341	2020-04-09 18:08:39 -07:00
Lang Hames	37bcf2df01	[ORC] Require JITDylib to be specified when adding IR and objects in the C API.	2020-04-09 17:59:26 -07:00
Craig Topper	5625e6ab37	[X86] Improve min/max reduction costs. This is similar to what I recently did for getArithmeticReductionCost. I'm trying to account for the narrowing from 512->256->128 as we go. I've also added a new helper method getMinMaxCost that tries to handle the cases where we have native min/max instructions and fall back to cmp+select when we don't. Differential Revision: https://reviews.llvm.org/D76634	2020-04-09 17:28:50 -07:00
Nemanja Ivanovic	5fe2809447	[PowerPC] Don't assert on SELECT_CC with i1 type When we try to select a SELECT_CC on Power9, we check if it can be matched to a SETB instruction. In that function, we assert that the output type is i32/i64. This is unnecessary as it is perfectly reasonable to have an i1 SELECT_CC. Change that from an assert to an early exit condition. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45448	2020-04-09 19:27:32 -05:00
Amara Emerson	e99169f1c2	[AArch64][GlobalISel] CallLowering: Don't generate new copies each time we need to store to a stack location for outgoing args. During call arg lowering we shouldn't be modifying SP so cache the SP copy vreg for subsequent uses. Gives a 0.2% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D77838	2020-04-09 17:08:56 -07:00
Francesco Petrogalli	c846d2682b	[llvm][Codegen] Make `getVectorTypeBreakdownMVT` work with scalable types. Reviewers: efriedma, andwar, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77434	2020-04-10 00:48:27 +01:00
Christopher Tetreault	9f87d951fc	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: mcrosier, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77269	2020-04-09 16:43:29 -07:00
Lang Hames	0d5f15f700	[ORC] Add C API support for adding object files to an LLJIT instance.	2020-04-09 16:18:46 -07:00
Lang Hames	1cd8493e69	[ORC] Expand the OrcV2 C API bindings. Adds basic support for LLJITBuilder and DynamicLibrarySearchGenerator. This allows C API clients to configure LLJIT to expose process symbols to JIT'd code. An example of this is added in llvm/examples/OrcV2CBindingsReflectProcessSymbols.	2020-04-09 16:18:46 -07:00
Daniel Sanders	a79b2fc44b	Add pass to strip debug info from MIR Summary: Removes: * All LLVM-IR level debug info using StripDebugInfo() * All debugify metadata * 'Debug Info Version' module flag * All (valid) DEBUG_VALUE MachineInstrs All DebugLocs from MachineInstrs This is a more complete solution than the previous MIRPrinter option that just causes it to neglect to print debug-locations. * The qualifier 'valid' is used here because AArch64 emits an invalid one and tests depend on it Reviewers: vsk, aprantl, bogner Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77747	2020-04-09 15:44:38 -07:00
Mircea Trofin	655aa1ae4a	[llvm][NFC] Replace CallSite with CallBase in Inliner Summary: Almost all uses are replaced. Left FIXMEs for the two sites that require refactoring outside of Inliner, to scope this patch. Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77817	2020-04-09 15:01:58 -07:00
Christopher Tetreault	19cc9b9ded	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: efriedma, sdesmalen, rriddle Reviewed By: sdesmalen Subscribers: hiraditya, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77261	2020-04-09 14:59:14 -07:00
Christopher Tetreault	00a1032412	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: rriddle, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77260	2020-04-09 13:35:41 -07:00
James Y Knight	5e7b98fe75	Fix an unused-variable warning in Release mode.	2020-04-09 16:34:55 -04:00
Christopher Tetreault	e634f482ea	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: arsenm, efriedma, sdesmalen Reviewed By: arsenm Subscribers: wdng, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77268	2020-04-09 13:11:37 -07:00
Zequan Wu	eccfa35d53	Fix lifetime call in landingpad blocking Simplifycfg pass Fix lifetime call in landingpad blocks simplifycfg from removing the landingpad. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D77188	2020-04-09 13:07:32 -07:00
Christopher Tetreault	e1e131ea5e	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: grosbach, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77271	2020-04-09 12:52:44 -07:00
Christopher Tetreault	b96558f5e5	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sunfish, sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77273	2020-04-09 12:41:28 -07:00
Stefan Pintilie	64868cbfcf	[PowerPC][Future] Fix for `75828ef615` Used unsigned long where uint64_t should have been used by mistake. Fixed in this patch.	2020-04-09 19:33:12 +00:00
Simon Pilgrim	12c629ec6c	[CostModel][X86] Add shuffle costs for some common sub-128bit vectors v2i8/v4i8/v8i8 + v2i16/v4i16 all show up in vectorizer code and by just using the legalized types (v16i8/v8i16) we're highly exaggerating the actual cost of the shuffle.	2020-04-09 19:57:06 +01:00
Paolo Savini	fae40bd5a1	[RISCV] Add MC layer support for proposed Bit Manipulation extension (version 0.92) This adds the instruction encoding and mnenomics for the proposed RISC-V Bit Manipulation extension (version 0.92). It is implemented with each category of instruction as its own target feature, with the 'b' extension feature enabling all options. Since this extension is not yet ratified, all target features are prefixed with 'experimental-' to note their status. Differential Revision: https://reviews.llvm.org/D65649	2020-04-09 18:04:22 +01:00
jasonliu	085689d44c	[PPC][AIX] Implement variadic function handling in LowerFormalArguments_AIX Summary: This patch adds support for handling of variadic functions for AIX. This includes ensuring that use and consume correct type of va_list (char *va_list) for AIX. Authored by: ZarkoCA Reviewers: cebowleratibm, sfertile, jasonliu Reviewed by: jasonliu Differential Revision: https://reviews.llvm.org/D76130	2020-04-09 16:49:44 +00:00
Stefan Pintilie	75828ef615	[PowerPC][Future] Initial support for PCRel addressing for constant pool loads Add initial support for PC Relative addressing for constant pool loads. This includes adding a new relocation for @pcrel and adding a new PowerPC flag to identify PC relative addressing. Differential Revision: https://reviews.llvm.org/D74486	2020-04-09 11:17:23 -05:00
Kazushi (Jam) Marukawa	015dee1ac8	[VE] Support (m)0 and (m)1 operands Summary: VE has special operands to represent 0b000...000111...111 (`(m)0`) and 0b111...111000...000 (`(m)1`) bit sequences. This patch supports those operands not only in machine instructions but also in DAG lowering. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D77769	2020-04-09 18:09:00 +02:00
Craig Topper	5a55363dc4	[X86] Remove redundant VMOVDDUPZ128rmk/VMOVDDUPZ128rmkz isel patterns. These patterns are identical to the pattern for the instruction.	2020-04-09 09:06:58 -07:00
Gil Rapaport	e2a1867880	[LV] Add VPValue operands to VPBlendRecipe (NFCI) InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations ability to modify def-use relations and still have ILV generate the vector code. This commit introduces VPValues for VPBlendRecipe to use as the values to blend. The recipe is generated with VPValues wrapping the phi's incoming values of the scalar phi. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential Revision: https://reviews.llvm.org/D77539	2020-04-09 18:48:33 +03:00
Mircea Trofin	b4924f01a4	[llvm][nfc] InstructionCostDetail encapsulation Ensured initialized fields; encapsulad delta calulations and evaluation of threshold having had changed; assertion for CostThresholdMap dereference, to indicate design intent. Differential Revision: https://reviews.llvm.org/D77762	2020-04-09 08:21:18 -07:00
Ayal Zaks	1678489234	[LV] FoldTail w/o Primary Induction Introduce a new VPWidenCanonicalIVRecipe to generate a canonical vector induction for use in fold-tail-with-masking, if a primary induction is absent. The canonical scalar IV having start = 0 and step = VFUF, created during code -gen to control the vector loop, is widened into a canonical vector IV having start = {<PartVF, PartVF+1, ..., PartVF+VF-1> for 0 <= Part < UF} and step = <VFUF, VFUF, ..., VF*UF>. Differential Revision: https://reviews.llvm.org/D77635	2020-04-09 17:45:23 +03:00
Simon Cook	2df6a02fd7	[RISCV] Implement evaluateBranch This implements the instruction analysis required to print branch targets as part of llvm-objdump's disassembly. Note, this only handles those branches which can be analyzed in a single instruction, a future patch will handle multiple-instruction patterns, such as AUIPC/LUI+JALR instruction pairs. Differential Revision: https://reviews.llvm.org/D77567	2020-04-09 15:11:55 +01:00
Shengchen Kan	2477cec2ac	[NFC][X86] Refine code in X86AsmBackend Summary: Move code to a better place, rename function, etc Tags: #llvm Differential Revision: https://reviews.llvm.org/D77778	2020-04-09 21:31:52 +08:00
Sanjay Patel	812970edda	[InstCombine] replace undef in vector constant for safe shift transform (PR45447) As noted in PR45447, we have a vector-constant-with-undef-element transform bug: https://bugs.llvm.org/show_bug.cgi?id=45447 We replace undefs with a safe constant (0 or -1) based on the (non-)negative predicate constraint. So this is correct: http://volta.cs.utah.edu:8080/z/WZE36H ...but this is not: http://volta.cs.utah.edu:8080/z/boj8gJ Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800). But that's making an assumption about what qualifies as "safe", and that assumption may not always hold. Differential Revision: https://reviews.llvm.org/D77739	2020-04-09 08:00:46 -04:00
Anton Bikineev	9e1ccec8d5	tsan: don't instrument __attribute__((naked)) functions Naked functions are required to not have compiler generated prologues/epilogues, hence no instrumentation is needed for them. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45400 Differential Revision: https://reviews.llvm.org/D77477	2020-04-09 13:47:47 +02:00
Pavel Labath	b761a6484d	[DWARF] Detect extraction errors in DWARFFormValue::extractValue Summary: Although the function had a bool return value, it was always returning true. Presumably this is because the main type of errors one can encounter here is running off the end of the stream, and until very recently, the DataExtractor class made it very difficult to detect that. The situation has changed now, and we can easily detect errors here, which this patch does. Reviewers: dblaikie, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77308	2020-04-09 13:41:02 +02:00
Serguei Katkov	44f0d7f136	Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values." This reverts commit `a0275705bb`. It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.	2020-04-09 18:24:47 +07:00
Florian Hahn	a7efe06af0	[LV] Assert no DbgInfoIntrinsic calls are passed to widening (NFC). When building a VPlan, BasicBlock::instructionsWithoutDebug() is used to iterate over the instructions in a block. This means that no recipes should be created for debug info intrinsics already and we can turn the early exit into an assertion. Reviewers: Ayal, gilr, rengolin, aprantl Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D77636	2020-04-09 11:37:32 +01:00
Serguei Katkov	a0275705bb	[Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This require runtime support to know how to take a value from register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77371	2020-04-09 16:57:35 +07:00
Jay Foad	bf730e1686	[CodeGen] Fix a simple FIXME. NFC.	2020-04-09 10:54:03 +01:00
Jay Foad	9c7bd94ce8	Fix typo in comment	2020-04-09 10:36:00 +01:00
Jay Foad	4970a1deca	[AMDGPU] Remove outdated comment	2020-04-09 10:36:00 +01:00
Florian Hahn	9997ee23ed	[VPlan] Add & use VPValue operands for VPWidenCallRecipe (NFC). This patch adds VPValue versions for the arguments of the call to VPWidenCallRecipe and uses them during code-generation. Similar to D76373 this reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77655	2020-04-09 10:23:26 +01:00
Jay Foad	c63aed890e	[KnownBits] Move AND, OR and XOR logic into KnownBits Summary: There are at least three clients for KnownBits calculations: ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the common logic should be moved out of these clients and into KnownBits itself. This patch does this for AND, OR and XOR calculations by implementing and using appropriate operator overloads KnownBits::operator& etc. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74060	2020-04-09 10:10:37 +01:00
Jay Foad	94cc9eccf6	[ValueTracking] Simplify KnownBits construction Use the simpler BitWidth constructor instead of the copy constructor to make it clear when we don't actually need to copy an existing KnownBits value. Split out from D74539. NFC.	2020-04-09 09:27:22 +01:00
Serge Pavlov	c7ff5b38f2	[FPEnv] Use single enum to represent rounding mode Now compiler defines 5 sets of constants to represent rounding mode. These are: 1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes defined by IEEE-754 and is used in `APFloat` implementation. 2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754 rounding modes and a special value for dynamic rounding mode. It is used in clang frontend. 3. `llvm::fp::RoundingMode`. Defines the same values as `clang::LangOptions::FPRoundingModeKind` but in different order. It is used to specify rounding mode in in IR and functions that operate IR. 4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7). Besides constants for rounding mode it also uses a special value to indicate error. It is convenient to use in intrinsic functions, as it represents platform-independent representation for rounding mode. In this role it is used in some pending patches. 5. Values like `FE_DOWNWARD` and other, which specify rounding mode in library calls `fesetround` and `fegetround`. Often they represent bits of some control register, so they are target-dependent. The same names (not values) and a special name `FE_DYNAMIC` are used in `#pragma STDC FENV_ROUND`. The first 4 sets of constants are target independent and could have the same numerical representation. It would simplify conversion between the representations. Also now `clang::LangOptions::FPRoundingModeKind` and `llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding direction `roundTiesToAway`, although it is supported natively on some targets. This change defines all the rounding mode type via one `llvm::RoundingMode`, which also contains rounding mode for IEEE rounding direction `roundTiesToAway`. Differential Revision: https://reviews.llvm.org/D77379	2020-04-09 13:26:47 +07:00
Pratyai Mazumder	e8d1c6529b	[SanitizerCoverage] sancov/inline-bool-flag instrumentation. Summary: New SanitizerCoverage feature `inline-bool-flag` which inserts an atomic store of `1` to a boolean (which is an 8bit integer in practice) flag on every instrumented edge. Implementation-wise it's very similar to `inline-8bit-counters` features. So, much of wiring and test just follows the same pattern. Reviewers: kcc, vitalybuka Reviewed By: vitalybuka Subscribers: llvm-commits, hiraditya, jfb, cfe-commits, #sanitizers Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D77244	2020-04-08 22:43:52 -07:00
Vitaly Buka	8b1a6c0a57	[NFC][SanitizerCoverage] Simplify alignment calculation This reverts commit e42f2a0cd8b8007c816d0e63f5000c444e29105e.	2020-04-08 22:43:52 -07:00
WangTianQing	a3dc949000	[X86] Add TSXLDTRK instructions. Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77205	2020-04-09 13:17:29 +08:00
Johannes Doerfert	cb0ecc5c33	[CallGraphUpdater] Remove dead constants before replacing a function Dead constants might be left when a function is replaced, we can gracefully handle this case and avoid complexity for the users who would see an assertion otherwise.	2020-04-08 22:52:46 -05:00
Lang Hames	5877d6f5f4	[ORC] Make mangling convenience methods part of the public API of LLJIT. This saves clients from having to manually construct a MangleAndInterner.	2020-04-08 20:20:13 -07:00
Matt Arsenault	0aa0d70067	MIR: Use Register	2020-04-08 22:07:26 -04:00
Sam Clegg	7baad0c53c	[WebAssembly][MC] Use StringRef over std::string pointer This is followup based on feedback on `5be42f36f5`. See: https://reviews.llvm.org/D77627. Differential Revision: https://reviews.llvm.org/D77674	2020-04-08 18:28:08 -07:00
Craig Topper	f3d3cec648	[InstCombine] Avoid a call to deprecated version of CreateCall. Passing a Value * to CreateCall has to call getPointerElementType to find the type of the pointer. In this case we can rely on the fact that Intrinsic::getDeclaration returns a Function * and use that version of CreateCall.	2020-04-08 17:41:16 -07:00
Johannes Doerfert	0985554b70	[Attributor][NFC] Split AbstractAttributes out of Attributor.cpp Attributor.cpp became quite big and we need to start provide structure. The Attributor code is now in Attributor.cpp and the classes derived from AbstractAttribute are in AttributorAttributes.cpp. Minor changes were required but no intended functional changes. We also minimized includes as part of this. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76873	2020-04-08 19:02:14 -05:00
Christopher Tetreault	fe69eb1196	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: espindola, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77275	2020-04-08 16:29:36 -07:00
Christopher Tetreault	49fd24fe9e	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: hfinkel, efriedma, sdesmalen Reviewed By: efriedma Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77266	2020-04-08 16:10:55 -07:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Amara Emerson	befc788cfa	GlobalISel: Add a setInstrAndDebugLoc(MachineInstr&) convenience helper to MachineIRBuilder. NFC. This saves doing two separate calls to set the Instr and DebugLoc from an existing MI.	2020-04-08 14:38:33 -07:00
Matt Arsenault	e49e33b610	CodeGen: Use Register in MachineInstrBuilder	2020-04-08 17:03:53 -04:00
Kirill Naumov	8b67853a83	[CFGPrinter] Adding heat coloring to CFGPrinter This patch introduces the heat coloring of the Control Flow Graph which is based on the relative "hotness" of each BB. The patch is a part of sequence of three patches, related to graphs Heat Coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D77161	2020-04-08 19:59:51 +00:00
Matt Arsenault	c42cc7fd24	CodeGen: Use Register in MachineSSAUpdater	2020-04-08 14:29:01 -04:00
Artem Belevich	a9627b7ea7	[CUDA] Add partial support for recent CUDA versions. Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11. None of the new features of CUDA-10.2+ have been implemented yet, so using these versions will still produce a warning. Differential Revision: https://reviews.llvm.org/D77670	2020-04-08 11:19:44 -07:00
Vedant Kumar	48e65fc630	MachineFunction: Copy call site info when duplicating insts Summary: Preserve call site info for duplicated instructions. We copy over the call site info in CloneMachineInstrBundle to avoid repeated calls to copyCallSiteInfo in CloneMachineInstr. (Alternatively, we could copy call site info higher up the stack, e.g. into TargetInstrInfo::duplicate, or even into individual backend passes. However, I don't see how that would be safer or more general than the current approach.) Reviewers: aprantl, djtodoro, dstenb Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77685	2020-04-08 11:06:14 -07:00
Matt Arsenault	586769cce2	DAG: Use Register	2020-04-08 13:44:31 -04:00
Sean Fertile	d0b57b41f4	[PowerPC][AIX][NFC] Replace deprecated getByValAlign call. Replace call to deprecated 'getByValAlign()' with 'getNonZeroByValAlign()'.	2020-04-08 13:27:39 -04:00
Matt Arsenault	dcce3ef1d2	FastISel: Partially use Register Doesn't try to convert the cases that depend on generated code.	2020-04-08 12:10:58 -04:00
Matt Arsenault	7a46e36d51	CodeGen: Use Register more in CallLowering Some of these MCPhysReg uses should probably be MCRegister, but right now this would require more invasive changes.	2020-04-08 12:10:58 -04:00
Matt Arsenault	ca0ace7298	CodeGen: Use Register in MachineBasicBlock	2020-04-08 12:10:58 -04:00
Matt Arsenault	84aa58cbe2	CodeGen: Use Register in TargetLowering	2020-04-08 12:10:58 -04:00
Kirill Naumov	0125db9ab2	[TimePasses] Small fix in "-time-passes" flag that makes it more stable Adds StringMap for TimingData. Differential Revision: https://reviews.llvm.org/D76946 Reviewed By: fedor.sergeev	2020-04-08 15:59:45 +00:00
Sean Fertile	8abfd2c3bb	[PowerPC][AIX] Enable passing byval formal arguments in multiple registers. Any or all the argument registers can be used to pass a byval formal argument, with the limitation that the argument must fit in the available registers (ie: is not split between registers and stack). Differential Revision: https://reviews.llvm.org/D76902	2020-04-08 11:16:33 -04:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef , if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Alexey Lapshin	0ed2170dc4	[DWARFLinker][dsymutil] followup for `88c2137b6d` That patch is a followup for "Move DwarfStreamer into DWARFLinker". It fixes build with LLVM_LINK_LLVM_DYLIB.	2020-04-08 16:46:52 +03:00
Stefan Pintilie	6c4b40def7	[PowerPC][Future] Add Support For Functions That Do Not Use A TOC. On PowerPC most functions require a valid TOC pointer. This is the case because either the function itself needs to use this pointer to access the TOC or because other functions that are called from that function expect a valid TOC pointer in the register R2. The main exception to this is leaf functions that do not access the TOC since they are guaranteed not to need a valid TOC pointer. This patch introduces a feature that will allow more functions to not require a valid TOC pointer in R2. Differential Revision: https://reviews.llvm.org/D73664	2020-04-08 08:07:35 -05:00
Sanjay Patel	a1c05fe20f	[InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold Based on the post-commit comments for rG0f56bbc, there might be a problem with this transform: (bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0 ...and the ppc_fp128 data type, so conservatively bypass if we are bitcasting a ppc_fp128. We might be able to account for endian or other differences to enable this for PowerPC again if that is useful. Differential Revision: https://reviews.llvm.org/D77642	2020-04-08 08:56:19 -04:00
Simon Pilgrim	66c18c729d	[X86][SSE] Combine PTEST(AND(X,Y),AND(X,Y)) -> PTEST(X,Y) and ANDN equivalents Tests derived from PR42035 examples	2020-04-08 12:42:22 +01:00
Jeremy Morse	c77887e4d1	[DebugInfo][NFC] Early-exit when analyzing for single-location variables This is a performance patch that hoists two conditions in DwarfDebug's validThroughout to avoid a linear-scan of all instructions in a block. We now exit early if validThrougout will never return true for the variable location. The first added clause filters for the two circumstances where validThroughout will return true. The second added clause should be identical to the one that's deleted from after the linear-scan. Differential Revision: https://reviews.llvm.org/D77639	2020-04-08 12:27:11 +01:00
Shengchen Kan	916044d819	[X86][MC] Support enhanced relaxation for branch align Summary: Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization. The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable". Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76286	2020-04-08 19:08:19 +08:00
Mikael Holmen	893df2032d	[IfConversion] Disallow TrueBB == FalseBB for valid diamonds Summary: This fixes PR45302. Previously the case BB1 / \ \| \| TBB FBB \| \| \ / BB2 was treated as a valid diamond also when TBB and FBB was the same basic block. This then lead to a failed assertion in IfConvertDiamond. Since TBB == FBB is quite a degenerated case of a diamond, we now don't treat it as a valid diamond anymore, and thus we will avoid the trouble of making IfConvertDiamond handle it correctly. Reviewers: efriedma, kparzysz Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77651	2020-04-08 12:50:36 +02:00
Anna Welker	89e1248d7b	[ARM][MVE] Optimise offset addresses of gathers/scatters This patch adds an analysis of the offset addresses used by gathers and scatters to the MVEGatherScatterLowering pass to find multiplications and additions that are loop invariant and thus can be moved into the loop preheader, avoiding to execute them each time. Differential Revision: https://reviews.llvm.org/D76681	2020-04-08 11:46:57 +01:00
Max Kazantsev	7adb9e06fd	[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM	2020-04-08 17:32:03 +07:00
Dominik Montada	35950fea8d	[GlobalISel] support narrow G_IMPLICIT_DEF for DstSize % NarrowSize != 0 Summary: When narrowing G_IMPLICIT_DEF where the original size is not a multiple of the narrow size, emit a smaller G_IMPLICIT_DEF and use G_ANYEXT. To prevent a potential endless loop in the legalizer, the condition to combine G_ANYEXT(G_IMPLICIT_DEF) is changed from isInstUnsupported to !isInstLegal, since in this case the combine is only valid if consequent legalization of the newly combined G_IMPLICIT_DEF does not introduce G_ANYEXT due to narrowing. Although this legalization for G_IMPLICIT_DEF would also be valid for the general case, it actually caused a lot of code regressions when tried due to superfluous COPYs and combines not getting hit anymore. Reviewers: dsanders, aemerson, volkan, arsenm, aditya_nandakumar Reviewed By: arsenm Subscribers: jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76598	2020-04-08 11:00:07 +02:00
Kazushi (Jam) Marukawa	aa034867f1	[VE] Simplify definitions of uimm6 and simm7 Summary: To prepare continuous changes, simplify uimm6 and simm7 operands. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D77700	2020-04-08 09:53:42 +02:00
Igor Kudrin	af11c556db	[DebugInfo] Fix reading DWARFv5 type units in DWP. In DWARFv5, type units are stored in .debug_info sections, along with compilation units, and they are distinguished by the unit_type field in the header, not by the name of the section. It is impossible to associate the correct index section of a DWP file with the unit before the unit's header is read. This patch fixes reading DWARFv5 type units by parsing the header first and then applying the index entry according to the actual unit type. Differential Revision: https://reviews.llvm.org/D77552	2020-04-08 12:50:58 +07:00
Stanislav Mekhanoshin	f96810ff34	[AMDGPU] Expand vector trunc stores from i16 to i8 Differential Revision: https://reviews.llvm.org/D77693	2020-04-07 21:47:45 -07:00
Johannes Doerfert	a19eb1de72	[OpenMP] Add match_{all,any,none} declare variant selector extensions. By default, all traits in the OpenMP context selector have to match for it to be acceptable. Though, we sometimes want a single property out of multiple to match (=any) or no match at all (=none). We offer these choices as extensions via `implementation={extension(match_{all,any,none})}` to the user. The choice will affect the entire context selector not only the traits following the match property. The first user will be D75788. There we can replace ``` #pragma omp begin declare variant match(device={arch(nvptx64)}) #define __CUDA__ #include <__clang_cuda_cmath.h> // TODO: Hack until we support an extension to the match clause that allows "or". #undef __CLANG_CUDA_CMATH_H__ #undef __CUDA__ #pragma omp end declare variant #pragma omp begin declare variant match(device={arch(nvptx)}) #define __CUDA__ #include <__clang_cuda_cmath.h> #undef __CUDA__ #pragma omp end declare variant ``` with the much simpler ``` #pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)}) #define __CUDA__ #include <__clang_cuda_cmath.h> #undef __CUDA__ #pragma omp end declare variant ``` Reviewed By: mikerice Differential Revision: https://reviews.llvm.org/D77414	2020-04-07 23:33:24 -05:00
Kazu Hirata	91eb442fde	[JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl Summary: ComputeValueKnownInPredecessorsImpl is the main folding mechanism in JumpThreading.cpp. To avoid potential infinite recursion while chasing use-def chains, it uses: DenseSet<std::pair<Value , BasicBlock >> &RecursionSet to keep track of Value-BB pairs that we've processed. Now, when ComputeValueKnownInPredecessorsImpl recursively calls itself, it always passes BB as is, so the second element is always BB. This patch simplifes the function by dropping "BasicBlock *" from RecursionSet. Reviewers: wmi, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77699	2020-04-07 18:37:36 -07:00
Eli Friedman	565b56a72c	[NFC] Clean up uses of LoadInst constructor.	2020-04-07 16:28:53 -07:00
Daniel Sanders	1adeeabb79	Add MIR-level debugify with only locations support for now Summary: Re-used the IR-level debugify for the most part. The MIR-level code then adds locations to the MachineInstrs afterwards based on the LLVM-IR debug info. It's worth mentioning that the resulting locations make little sense as the range of line numbers used in a Function at the MIR level exceeds that of the equivelent IR level function. As such, MachineInstrs can appear to originate from outside the subprogram scope (and from other subprogram scopes). However, it doesn't seem worth worrying about as the source is imaginary anyway. There's a few high level goals this pass works towards: * We should be able to debugify our .ll/.mir in the lit tests without changing the checks and still pass them. I.e. Debug info should not change codegen. Combining this with a strip-debug pass should enable this. The main issue I ran into without the strip-debug pass was instructions with MMO's and checks on both the instruction and the MMO as the debug-location is between them. I currently have a simple hack in the MIRPrinter to resolve that but the more general solution is a proper strip-debug pass. * We should be able to test that GlobalISel does not lose debug info. I recently found that the legalizer can be unexpectedly lossy in seemingly simple cases (e.g. expanding one instr into many). I have a verifier (will be posted separately) that can be integrated with passes that use the observer interface and will catch location loss (it does not verify correctness, just that there's zero lossage). It is a little conservative as the line-0 locations that arise from conflicts do not track the conflicting locations but it can still catch a fair bit. Depends on D77439, D77438 Reviewers: aprantl, bogner, vsk Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77446	2020-04-07 16:25:13 -07:00
Fangrui Song	624654fd64	[VE] Migrate to the getMachineMemOperand overload using llvm::Align Just delete the deprecated overload because nothing uses it.	2020-04-07 16:04:54 -07:00
Matt Arsenault	6011627f51	CodeGen: More conversions to use Register	2020-04-07 18:54:36 -04:00
Fangrui Song	d2ef8c1f2c	[ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker() dso_local leads to direct access even if the definition is not within this compilation unit (it is still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link. If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no direct access will be generated. The current behavior is benign, because -fpic does not assume dso_local (clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal). If we do that for -fno-semantic-interposition (D73865), there will be an R_X86_64_PC32 linker error without this patch. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74751	2020-04-07 15:46:01 -07:00
Fangrui Song	2f8fb4d1cd	[VE] Adapt `aa26dd9858` and `2481f26ac3`	2020-04-07 15:45:19 -07:00
Wei Mi	b49eac71ad	Recommit [SampleFDO] Add flag for partial profile. Fix the error of show-prof-info.test on some platforms without zlib. The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them. Differential Revision: https://reviews.llvm.org/D77426	2020-04-07 14:28:25 -07:00
Stanislav Mekhanoshin	96e51ed005	[AMDGPU] Implement copyPhysReg for 16 bit subregs Differential Revision: https://reviews.llvm.org/D74937	2020-04-07 14:22:46 -07:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00
Nikita Popov	fe8abbf442	[BPI] Clear handles when releasing memory (NFC) This reduces max-rss of sqlite compilation by 2.5%.	2020-04-07 22:51:01 +02:00
Matt Arsenault	aa26dd9858	CodeGen: Use Register in more places	2020-04-07 15:59:40 -04:00
Wei Mi	c5da949ae8	Revert "[SampleFDO] Add flag for partial profile." show-prof-info.test breaks on some platforms. This reverts commit `e3ba652a14`.	2020-04-07 12:54:51 -07:00
Wei Mi	e3ba652a14	[SampleFDO] Add flag for partial profile. The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them. Differential Revision: https://reviews.llvm.org/D77426	2020-04-07 12:17:56 -07:00
Nemanja Ivanovic	ecd8435483	[NFC][PowerPC] Fix register class for patterns using XXPERMDIs There are a few patterns where we use a superclass for inputs to this instruction rather than the correct class. This can sometimes lead to unncessary copies.	2020-04-07 14:06:08 -05:00
Graham Sellers	a19a56f6a1	[AMDGPU] Extend constant folding for logical operations This patch extends existing constant folding in logical operations to handle S_XNOR, S_NAND, S_NOR, S_ANDN2, S_ORN2, V_LSHL_ADD_U32 and V_AND_OR_B32. Also added a couple of tests for existing folds.	2020-04-07 14:37:16 -04:00
Craig Topper	c41685b16f	[SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector. This removes a call to getScalarType from a bunch of call sites. It also makes the behavior consistent with SIGN_EXTEND_INREG. Differential Revision: https://reviews.llvm.org/D77631	2020-04-07 11:34:08 -07:00
Alexey Lapshin	88c2137b6d	[DWARFLinker][dsymutil][NFC] Move DwarfStreamer into DWARFLinker. For implementing "remove obsolete debug info in lld", it is neccesary to have DWARF generation code implementation. dsymutil uses DwarfStreamer for that purpose. DwarfStreamer uses AsmPrinter. It is considered OK to use AsmPrinter based code in lld(D74169). This patch moves DwarfStreamer implementation into DWARFLinker, so that it could be reused from lld. Generally, a better place for such a common DWARF generation code would be not DWARFLinker but an additional separate library. Such a library could contain a single version of DWARF generation routines and could also be independent of AsmPrinter. At the current moment, DwarfStreamer does not pretend to be such a general implementation of DWARF generation. So I decided to put it into DWARFLinker since it is the only user of DwarfStreamer. Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewed By: JDevlieghere Differential revision: https://reviews.llvm.org/D77169	2020-04-07 21:21:54 +03:00
Eli Friedman	e9ac757f79	[AArch64] Don't expand memcmp in strict align mode. `7aecf232` fixed the bug where we would miscompile, but we still generate a crazy amount of code. Turn off the expansion until someone implements an appropriate heuristic. Differential Revision: https://reviews.llvm.org/D77599	2020-04-07 10:53:36 -07:00
Matt Arsenault	f596ab4066	AMDGPU: Use early return	2020-04-07 13:48:00 -04:00
Sam Clegg	5be42f36f5	[WebAssembly][MC] Fix leak of std::string members in MCSymbolWasm Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=45452 Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77627	2020-04-07 10:38:43 -07:00
Stanislav Mekhanoshin	12a324393d	[AMDGPU] Limit endcf-collapase to simple if We can only collapse adjacent SI_END_CF if outer statement belongs to a simple SI_IF, otherwise correct mask is not in the register we expect, but is an argument of an S_XOR instruction. Even if SI_IF is simple it might be lowered using S_XOR because lowering is dependent on a basic block layout. It is not considered simple if instruction consuming its output is not an SI_END_CF. Since that SI_END_CF might have already been lowered to an S_OR isSimpleIf() check may return false. This situation is an opportunity for a further optimization of SI_IF lowering, but that is a separate optimization. In the meanwhile move SI_END_CF post the lowering when we already know how the rest of the CFG was lowered since a non-simple SI_IF case still needs to be handled. Differential Revision: https://reviews.llvm.org/D77610	2020-04-07 10:27:23 -07:00
Matt Arsenault	b281138a1b	DAG: Use the correct getPointerTy in a few places These should not be assuming address space 0. Calling getPointerTy is generally the wrong thing to do, since you should already know the type from the incoming IR.	2020-04-07 12:45:41 -04:00
Nikita Popov	259649a519	[RDA] Avoid full reprocessing of blocks in loops (NFCI) RDA sometimes needs to visit blocks twice, to take into account reaching defs coming in along loop back edges. Currently it handles repeated visitation the same way as usual, which means that it will scan through all instructions and their reg unit defs again. Not only is this very inefficient, it also means that all reaching defs in loops are going to be inserted twice. We can do much better than this. The only thing we need to handle is a new reaching def from a predecessor, which either needs to be prepended to the reaching definitions (if there was no reaching def from a predecessor), or needs to replace an existing predecessor reaching def, if it is more recent. Since D77508 we only store the most recent predecessor reaching def, so that's the only one that may need updating. This also has the nice side-effect that reaching definitions are now automatically sorted and unique, so drop the llvm::sort() call in favor of an assertion. Differential Revision: https://reviews.llvm.org/D77511	2020-04-07 17:55:37 +02:00
Nikita Popov	76e987b372	[RDA] Don't pass down TraversedMBB (NFC) Only pass the MachineBasicBlock itself down to helper methods, they don't need to know about traversal. Move the debug print into the main method.	2020-04-07 17:53:04 +02:00
Nikita Popov	361c29d7ba	[RDA] Avoid inserting duplicate reaching defs (NFCI) An instruction may define the same reg unit multiple times, avoid inserting the same reaching def multiple times in that case. Also print the reg unit, rather than the super-register, in the debug code.	2020-04-07 17:50:38 +02:00
David Tenty	b9245f14b7	[NFC][PowerPC] Cleanup 64-bit and Darwin CalleeSavedRegs Summary: - Remove the no longer used Darwin CalleeSavedRegs - Combine the SVR464 callee saved regs and AIX64 since the two are (and should be) identical into PPC64 - Update tests for 64-bit CSR change Reviewers: sfertile, ZarkoCA, cebowleratibm, jasonliu, #powerpc Reviewed By: sfertile Subscribers: wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77235	2020-04-07 11:49:10 -04:00
Simon Pilgrim	e3b6059776	[X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443) Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets). If we have a full-zeroable vector then we should just return a zero version of the root type, otherwise if the type isn't valid we should bail. Fixes PR45443	2020-04-07 14:45:29 +01:00
Keith Walker	01dc10774e	[ARM] unwinding .pad instructions missing in execute-only prologue If the stack pointer is altered for local variables and we are generating Thumb2 execute-only code the .pad directive is missing. Usually the size of the adjustment is stored in a PC-relative location and loaded into a register which is then added to the stack pointer. However when we are generating execute-only code code the size of the adjustment is instead generated using the MOVW/MOVT instruction pair. As a by product of handling the execute-only case this also fixes an existing issue that in the none execute-only case the .pad directive was generated against the load of the constant to a register instruction, instead of the instruction which adds the register to the stack pointer. Differential Revision: https://reviews.llvm.org/D76849	2020-04-07 11:51:59 +01:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00
Serguei Katkov	b7e3759e17	[DAG] Consolidate require spill slot logic in lambda. NFC. Move the logic whether lowering of deopt value requires a spill slot in a separate lambda. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77629	2020-04-07 16:43:47 +07:00
Peter Smith	14c1e98754	[ARM] Remove condition that could never be true From Arm v8 Architecture Reference Manual F5.1.84 LDREXD The ldrexd instruction in Arm state has the following conditions: t = UInt(Rt); t2 = t + 1; n = UInt(Rn); if Rt<0> == '1' \|\| t2 == 15 \|\| n == 15 then UNPREDICTABLE; In when Rt is odd or if Rt is 14 (making t2 15). In the implementation when the pair is the UNPREDICTABLE R14_R15 we would ideally return SOFT_FAIL. We can't because there is no R14_R15 value for us to return so we fail early returning FAIL. The early return for registers outside the bounds of the table means the check for Rt == 14 (0xE) redundant which causes a static analyzer to flag the condition as never being true. To fix the warning I've removed the check and replaced with a comment explaining the difference with the specification. Fixes pr41660 Differential Revision: https://reviews.llvm.org/D77463	2020-04-07 09:50:56 +01:00
Simon Tatham	aab9e9de4d	[Support,Windows] Tolerate failure of CryptGenRandom Summary: In `Unix/Process.inc`, we seed a random number generator from `/dev/urandom` if possible, but if not, we're happy to fall back to ordinary pseudorandom strategies, like the current time and PID. The corresponding function on Windows calls `CryptGenRandom`, but it //doesn't// have a fallback if that strategy fails. But `CryptGenRandom` //can// fail, if a cryptography provider isn't properly initialized, or occasionally (by our observation) simply intermittently. If it's reasonable on Unix to implement traditional pseudorandom-number seeding as a fallback, then it's surely reasonable to do the same on Windows. So this patch adds a last-ditch use of ordinary rand(), using much the same strategy as the Unix fallback code. Reviewers: hans, sammccall Reviewed By: hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77553	2020-04-07 09:18:12 +01:00
Pierre-vh	4fc59a468f	Revert "[CodeGen][SelectionDAG] Flip Booleans More Often" This reverts commit `23342bdcc8`.	2020-04-07 09:09:10 +01:00
Pierre-vh	23342bdcc8	[CodeGen][SelectionDAG] Flip Booleans More Often Differential Revision: https://reviews.llvm.org/D77201	2020-04-07 08:19:57 +01:00
Sam Clegg	f0bbf3d086	[WebAssembly] EmscriptenEHSjLj: Mark more functions as imported These should have been part of https://reviews.llvm.org/D77192 Differential Revision: https://reviews.llvm.org/D77358	2020-04-06 21:27:31 -07:00
Xiang1 Zhang	01a32f2bd3	Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology) Do not commit the llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it will cause build bot fail(not suitable for window 32 target). Summary: This patch comes from H.J.'s `2bd54ce7fa` This patch fix the failed llvm unit tests which running on CET machine. (e.g. ExecutionEngine/MCJIT/MCJITTests) The reason we enable IBT at "JIT compiled with CET" is mainly that: the JIT don't know the its caller program is CET enable or not. If JIT's caller program is non-CET, it is no problem JIT generate CET code or not. But if JIT's caller program is CET enabled, JIT must generate CET code or it will cause Control protection exceptions. I have test the patch at llvm-unit-test and llvm-test-suite at CET machine. It passed. and H.J. also test it at building and running VNCserver(Virtual Network Console), it works too. (if not apply this patch, VNCserver will crash at CET machine.) Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei Reviewed By: LuoYuanke Subscribers: tstellar, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76900	2020-04-07 09:48:47 +08:00
Jun Ma	46bff786bc	[Coroutines] Remove alignment check in shouldBeMustTail Differential Revision: https://reviews.llvm.org/D77362	2020-04-07 09:07:34 +08:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type heirarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Davide Italiano	8115e08b05	[MachineCSE] Don't carry the wrong location when hoisting PR: 45425 <rdar://problem/61359768> Differential Revision: https://reviews.llvm.org/D77604	2020-04-06 16:36:22 -07:00
Daniel Sanders	f27cea721e	Add way to omit debug-location from MIR output Summary: In lieu of a proper pass that strips debug info, add a way to omit debug-locations from the MIR output so that instructions with MMO's continue to match CHECK's when mir-debugify is used Reviewers: aprantl, bogner, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77575	2020-04-06 16:22:01 -07:00
Nick Desaulniers	41ba80182c	[CallSite Removal] a CallBase is never an IndirectCall for isInlineAsm Summary: Thanks to Bill Wendling (void) for the report and steps to reproduce. It looks like this was missed during r350508's cleanup of the CallSite split into CallBase, CallInst, and CallBrInst. This was exposed by running pgo on a callbr, which was creating a ptrtoint to the inline asm thinking it was an indirect call. The relevant callchain looks like: IndirectCallPromotionPlugin::run() -> PGOIndirectCallVisitor::findIndirectCalls() -> PGOIndirectCallVisitor::visitCallBase() -> CallBase::isIndirectCall() Reviewers: void, chandlerc Reviewed By: void Subscribers: hiraditya, llvm-commits, craig.topper, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D77600	2020-04-06 16:14:46 -07:00
Vedant Kumar	5f185a8999	[AddressSanitizer] Fix for wrong argument values appearing in backtraces Summary: In some cases, ASan may insert instrumentation before function arguments have been stored into their allocas. This causes two issues: 1) The argument value must be spilled until it can be stored into the reserved alloca, wasting a stack slot. 2) Until the store occurs in a later basic block, the debug location will point to the wrong frame offset, and backtraces will show an uninitialized value. The proposed solution is to move instructions which initialize allocas for arguments up into the entry block, before the position where ASan starts inserting its instrumentation. For the motivating test case, before the patch we see: ``` \| 0033: movq %rdi, 0x68(%rbx) \| \| DW_TAG_formal_parameter \| \| ... \| \| DW_AT_name ("a") \| \| 00d1: movq 0x68(%rbx), %rsi \| \| DW_AT_location (RBX+0x90) \| \| 00d5: movq %rsi, 0x90(%rbx) \| \| ^ not correct ... \| ``` and after the patch we see: ``` \| 002f: movq %rdi, 0x70(%rbx) \| \| DW_TAG_formal_parameter \| \| \| \| DW_AT_name ("a") \| \| \| \| DW_AT_location (RBX+0x70) \| ``` rdar://61122691 Reviewers: aprantl, eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77182	2020-04-06 15:59:25 -07:00
Daniel Sanders	35b7b0851b	Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify) Summary: To debugify MIR, we need to be able to create metadata and to do that, we need a non-const Module. However, MachineFunction only had a const reference to the Function preventing this. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77439	2020-04-06 15:19:21 -07:00
Daniel Sanders	15f7bc7857	Add option to limit Debugify to locations (omitting variables) Summary: It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE around. In particular, because DEBUG_VALUE has the potential to change CodeGen behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally don't. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77438	2020-04-06 15:04:55 -07:00
David Blaikie	5aead592f0	X86ISelLowering: Minor refactor to avoid redundant initialization while ensuring compiler warnings can hopefully still prove initialization Based on post-commit review/discussion in fabe52a7412b	2020-04-06 14:25:52 -07:00
Konstantin Pyzhov	72e8754916	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 09:05:58 -04:00
Leonard Chan	a0222ac1f9	[AsmPrinter] Do not define local aliases for global objects in a comdat A global symbol that is defined in a comdat should not generate an alias since call sites that would've referred to that symbol will refer to their own independent local aliases rather than the surviving global comdat one. This could result in something that looks like: ``` ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o) >>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o) >>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169) >>> minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a ``` We ran into this when experimenting with a new C++ ABI for fuchsia (refer to D72959) which takes relative offsets between comdat'd functions which is why the normal C++ user wouldn't run into this. Differential Revision: https://reviews.llvm.org/D77429	2020-04-06 13:48:05 -07:00
Nick Desaulniers	5bc291be71	[SelectionDAG] fix predecessor list for INLINEASM_BRs' parent Summary: A bug report mentioned that LLVM was producing jumps off the end of a function when using "asm goto with outputs". Further digging pointed to MachineBasicBlocks that had their address taken and were indirect targets of INLINEASM_BR being removed by BranchFolder, because their predecessor list was empty, so they appeared to have no entry. This was a cascading failure caused earlier, during Pre-RA instruction scheduling. We have a few special cases in Pre-RA instruction scheduling where we split a MachineBasicBlock in two. This requires careful handing of predecessor and successor lists for a MachineBasicBlock that was split, and careful handing of PHI MachineInstrs that referred to the MachineBasicBlock before it was split. The clue that led to this fix was the observation that many callers of MachineBasicBlock::splice() frequently call MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI nodes after a splice. We don't want to reuse that method, as we have custom successor transferring logic for this block split. This patch fixes 2 pre-existing bugs, and adds tests. The first bug was that MachineBasicBlock::splice() correctly handles updating most successors and predecessors; we don't need to do anything more than removing the previous fallthrough block from the first half of the split block post splice. Previously, we were updating the successor list incorrectly (updating successors updates predecessors). The second bug was that PHI nodes that needed registers from the first half of the split block were not having entries populated. The register live out information was correct, and the FuncInfo->PHINodesToUpdate was correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock: for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) { MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first); if (!FuncInfo->MBB->isSuccessor(PHI->getParent())) continue; PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB); was `continue`ing because FuncInfo->MBB tracks the second half of the post-split block; no one was updating PHI entries for the first half of the post-split block. SelectionDAGBuilder::UpdateSplitBlock() already expects to perform special handling for MachineBasicBlocks that were split post calls to ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the current MachineBasicBlock being processed), and perform special handling for this in SelectionDAGBuilder::UpdateSplitBlock(). Reviewers: void, craig.topper, efriedma Reviewed By: void, efriedma Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D76961	2020-04-06 13:46:39 -07:00
Matt Arsenault	869f05c834	AMDGPU: Remove dead paths for requiresUniformRegister The extracts from control flow intrinsics are already properly handled by divergence analysis. The inline asm case isn't dead, but has also never really worked correctly so leave it as-is for now.	2020-04-06 16:15:10 -04:00
Francesco Petrogalli	53b7abdd23	[llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`. Reviewers: sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77317	2020-04-06 19:46:11 +01:00
Masoud Ataei jaliseh	9ed0612cca	Add InjectTLIMappings pass to new pass manager This pass is created in `d6de5f12d4` and tested for new and legacy pass manager but never added to new pass manager pipeline. I am adding it to new pass manager pipeline. This pass is get used in Vector Function Database (VFDatabase) and without this pass in new pass manager pipeline, none of the vector libraries are work ing with new pass manager. Related passes: `66c120f025` https://reviews.llvm.org/D74944 Differential revision: https://reviews.llvm.org/D75354	2020-04-06 13:16:48 -05:00
Craig Topper	07ed1fb597	[SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types. The previous code used the type of the first field for the VT passed to getNode for every field. I've based the implementation here off what is done in visitSelect as it removes the need to special case aggregates. Differential Revision: https://reviews.llvm.org/D77093	2020-04-06 11:03:40 -07:00
Konstantin Pyzhov	51dc028314	Revert `e1730cfeb3`	2020-04-06 05:56:11 -04:00
Kirill Naumov	3f995ce8b5	[CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo The patch introduces the system to distinctively store the information needed for the Control Flow Graph as well as the instrumentary needed for the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo. The patch is a part of sequence of three patches, related to graphs Heat Coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D76820	2020-04-06 17:42:54 +00:00
Konstantin Pyzhov	e1730cfeb3	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 05:10:37 -04:00
Fangrui Song	a5d375e0cb	[AArch64] Allow logical immediates to have all-1 in top bits So that constant expressions like the following are permitted: and w0, w0, #~(0xfe<<24) and w1, w1, #~(0xff<<24) The behavior matches GNU as (opcodes/aarch64-opc.c:aarch64_logical_immediate_p). Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D75885	2020-04-06 09:56:04 -07:00
Florian Hahn	7aba6a0333	[LV] Fix value that could be read uninitialized. This should fix http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569	2020-04-06 17:54:50 +01:00
Nikita Popov	e8b83f7ddc	[RDA] Only store most recent reaching def from predecessors (NFCI) When entering a basic block, RDA inserts reaching definitions coming from predecessor blocks (which will be negative numbers) in a rather peculiar way. If you have incoming reaching definitions -4, -3, -2, -1, it will insert those. If you have incoming reaching definitions -1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken at each step. That's probably not what was intended... However, RDA only actually cares about the most recent reaching definition from a predecessor (to calculate clearance), so this ends up working fine as far as behavior is concerned. It does waste memory on unnecessary reaching definitions though. This patch changes the implementation to first compute the most recent reaching definition in one loop, and then insert only that one in a separate loop. Differential Revision: https://reviews.llvm.org/D77508	2020-04-06 18:39:09 +02:00
Nikita Popov	8d75df1438	[RDA] Don't adjust ReachingDefDefaultVal (NFCI) At the end of a basic block, RDA adjusts all the reaching defs it found to be relative to the end of the basic block, rather than the start of it. However, it also does this to registers which don't have a reaching def, indicated by ReachingDefDefaultVal. This means that code checking against ReachingDefDefaultVal will not skip them, and may insert them into the reaching definition list. This is ultimately harmless, but causes unnecessary work and is logically not right. Differential Revision: https://reviews.llvm.org/D77506	2020-04-06 18:36:29 +02:00
Sanjay Patel	fbb1b43f13	[ValueTracking] enhance matching of umin/umax with 'not' operands The cmyk test is based on the known regression that resulted from: rGf2fbdf76d8d0 This improves on the equivalent signed min/max change: rG867f0c3c4d8c The underlying icmp equivalence is: ~X pred ~Y --> Y pred X For an icmp with constant, canonicalization results in a swapped pred: ~X < C --> X > ~C	2020-04-06 11:51:59 -04:00
Matt Arsenault	8a5f0dafd4	AMDGPU/GlobalISel: Select llvm.amdgcn.div.scale	2020-04-06 11:50:19 -04:00
Matt Arsenault	e87ec66762	AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll	2020-04-06 11:50:16 -04:00
Jay Foad	ddd2f4b96f	[AMDGPU] Fix inaccurate comments	2020-04-06 16:44:08 +01:00
Florian Hahn	90be3c24a7	[VPlan] Introduce new VPWidenCallRecipe (NFC). This patch moves calls to their own recipe, to simplify the transition to VPUser for operands of VPWidenRecipe, as discussed in D76992. Subsequently additional information can be added to the recipe rather than computing it during the execute step. Reviewers: rengolin, Ayal, gilr, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77467	2020-04-06 16:07:37 +01:00
Chris Bowler	d6ea82d11c	[AIX][PPC] Implement by-val caller arguments in multiple registers Differential Revision: https://reviews.llvm.org/D76380	2020-04-06 11:06:51 -04:00
Guillaume Chatelet	808286342a	[Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set. Summary: In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined. This allows to work backward here and to infer that `getNewAlignment` does not need to return `0` in case of failure. Returning `1` also works since it needs to be greater than the Load/Store alignment which is a least `1`. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77538	2020-04-06 14:54:57 +00:00
diggerlin	a26a441b99	[llvm-objdump][XCOFF] Use symbol index+symbol name + storage mapping class as label for -D SUMMARY: For the llvm-objdump -D, the symbol name is used as a label in the disassembly for the specific address (when a symbol address is equal to the virtual address in the dump). In XCOFF, multiple symbols may have the same name, being differentiated by their storage mapping class. It is helpful to print the QualName and not just the name when forming the output label for a csect symbol. The symbol index further removes any ambiguity caused by duplicate names. To maintain compatibility with the binutils objdump, the XCOFF-specific --symbol-description option is added to enable the enhanced format. Reviewers: hubert.reinterpretcast, James Henderson, Jason Liu ,daltenty Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D72973	2020-04-06 10:10:10 -04:00
Benjamin Kramer	880ec421dd	[MC] Use a byte_swap in emitIntValue instead of doing it in a loop. NFCI.	2020-04-06 15:51:24 +02:00
Florian Hahn	6babae74c7	[Matrix] Update load/storeMatrix to take indices as Value* (NFC). This allows using the functions to be used with loop dependent indices.	2020-04-06 14:48:48 +01:00
Matt Arsenault	cbf719b568	AMDGPU: Use DAG patterns for div_fmas	2020-04-06 09:28:30 -04:00
Matt Arsenault	79b29d6df7	AMDGPU: Remove DisableInst feature I'm not sure why these were bothering to check the instruction profile, since those profiles should only be used with these instruction classes.	2020-04-06 09:27:44 -04:00
Matt Arsenault	70726cec5b	DAG: Combine extract_vector_elt of concat_vectors Fixes extra canonicalize regressions when legalizing vector fminnum/fmaxnum.	2020-04-06 09:26:29 -04:00
Hans Wennborg	64c2312750	Revert `43f031d312` "Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)" ExecutionEngine/MCJIT/cet-code-model-lager.ll is failing on 32-bit windows, see llvm-commits thread for `fef2dab`. This reverts commit `43f031d312` and the follow-ups `fef2dab100` and `6a800f6f62`.	2020-04-06 15:05:25 +02:00
Sourabh Singh Tomar	5d7e9adce2	[DWARF5] Added support for emission of debug_macro section. Summary: This patch adds support for emission of following DWARFv5 macro forms in .debug_macro section. 1. DW_MACRO_start_file 2. DW_MACRO_end_file 3. DW_MACRO_define_strp 4. DW_MACRO_undef_strp. Reviewed By: dblaikie, ikudrin Differential Revision: https://reviews.llvm.org/D72828	2020-04-06 17:45:10 +05:30
Pavel Labath	9154a6398e	[llvm/Support] Make more DataExtractor methods error-aware Summary: This patch adds the optional Error argument, and the Cursor variants to more DataExtractor methods. The functions now behave the same way as other error-aware functions (they set the error when they fail, and don't do anything if the error is already set). I have merged the LEB128 implementations via a template (similarly to how fixed-size functions are handled) to reduce code duplication. Depends on D77304. Reviewers: dblaikie, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77306	2020-04-06 14:14:11 +02:00
Pavel Labath	a16fffa3f6	[Support] Make DataExtractor string functions error-aware Summary: This patch adds an optional Error argument to DataExtractor functions for string extraction, and makes them behave like other DataExtractor functions (set the error if extraction fails, don't do anything if the error is already set). I have merged the StringRef and C string versions of the functions to reduce code duplication. Reviewers: dblaikie, MaskRay Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77307	2020-04-06 14:14:11 +02:00
Guillaume Chatelet	ff858d7781	[Alignment][NFC] Add DebugStr and operator* Summary: This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately) Differences from D77394: - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)` - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll) - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum) Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77537	2020-04-06 12:09:45 +00:00
Guillaume Chatelet	39cfba9e33	[Alignment][NFC] Remove deprecated functions introduced in 10.0.0 Summary: 24 March 2020: LLVM 10.0.0 is out. I gathered all deprecated function introduced between 9 and 10 and cleaned them up so they will be removed from 11. > git log -p -S LLVM_ATTRIBUTE_DEPRECATED llvmorg-9.0.0..llvmorg-10.0.0 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77409	2020-04-06 12:07:18 +00:00
Simon Pilgrim	9bc5b1a489	[X86][SSE] combineVectorSignBitsTruncation - remove minimum vector length limitations truncateVectorWithPACK has its own vector length controls, so we can rely on those directly. This helps some existing truncation to subvector tests, which were being combined later during shuffle lowering at which point the sign/zero bit detection had become obscured preventing lowerShuffleWithPACK working as well as it could.	2020-04-06 12:45:23 +01:00
Benjamin Kramer	232eff55f6	[LTO] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:23:27 +02:00
Benjamin Kramer	e64e516790	[RuntimeDyld] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:22:53 +02:00
Benjamin Kramer	9a9bc23672	[llvm-bcanalyzer] Simplify code. NFCI.	2020-04-06 12:50:50 +02:00
Kazushi (Jam) Marukawa	e981a46a77	[VE] Update lea/load/store instructions Summary: Modify lea/load/store instructions to accept `disp(index, base)` style addressing mode (called ASX format). Also, uniform the number of DAG nodes to have 3 operands for this ASX format instructions, and update selectADDR functions to lower appropriate MI. Reviewers: arsenm, simoll, k-ishizaka Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D76822	2020-04-06 11:49:46 +02:00
Oliver Stannard	a294d9eb21	Revert "[IPRA][ARM] Spill extra registers at -Oz" Reverting because this is causing failures on bots with expensive checks enabled. This reverts commit `73cea83a6f`.	2020-04-06 10:34:59 +01:00
Kerry McLaughlin	944e322f88	[AArch64][SVE] Add SVE intrinsics for saturating add & subtract Summary: Adds the following intrinsics: - @llvm.aarch64.sve.[s\|u]qadd.x - @llvm.aarch64.sve.[s\|u]qsub.x Reviewers: sdesmalen, c-rhodes, dancgr, efriedma, cameron.mcinally, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77054	2020-04-06 10:07:08 +01:00
Florian Hahn	39f2d9aa81	[Matrix] Add option to use row-major matrix layout as default. This patch adds a -matrix-default-layout option which can be used to set the default matrix layout to row-major or column-major (default). The initial patch updates codegen for loads, stores, binary operators and matrix multiply. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76325	2020-04-06 10:00:56 +01:00
Florian Hahn	d1fed7081d	[Matrix] Add initial tiling for load/multiply/store chains. This patch adds initial fusion for load/multiply/store chains of matrix operations. The patch contains roughly two parts: 1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused). First, we ensure that both loads of the multiply operands do not alias the store. If they do, we create new non-aliasing copies of the operands. Note that this may introduce new basic block. Finally we process TileSize x TileSize blocks. That is: load tiles from the input operands, multiply and store them. 2. Identify fusion candidates & matrix instructions. As a first step, collect all instructions with shape info and fusion candidates (currently @llvm.matrix.multiply calls). Next, try to fuse candidates and collect instructions eliminated by fusion. Finally iterate over all matrix instructions, skip the ones eliminated by fusion and lower the rest as usual. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75566	2020-04-06 09:28:15 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Igor Kudrin	35819ff3cf	[DebugInfo] Fix reading range lists of v5 units in DWP. In package files, the base offset provided by index sections should be used to find the contribution of a unit. The patch adds that base offset when reading range list tables. Differential revision: https://reviews.llvm.org/D77401	2020-04-06 13:28:06 +07:00
Igor Kudrin	a93b77b97f	[DebugInfo] Fix reading location tables headers of v5 units in DWP. This fixes the reading of location lists headers for compilation units in package files by adjusting the reading offset according to the corresponding record in the unit index. This is required for DW_FORM_loclistx to work. Differential revision: https://reviews.llvm.org/D77146	2020-04-06 13:28:06 +07:00
Igor Kudrin	49737df767	[DebugInfo] Fix reading location tables of v5 units in DWP. Without the patch, all version 5 compile units in a DWP file read location tables from the beginning of a .debug_loclists.dwo section. The patch fixes that by adjusting the reading offset the same way as for pre-v5 units. The section identifier to find the contribution entry corresponds to the version of the unit. Differential revision: https://reviews.llvm.org/D77145	2020-04-06 13:28:06 +07:00
Igor Kudrin	714324b79a	[DebugInfo] Support DWARFv5 index sections. DWARFv5 defines index sections in package files in a slightly different way than the pre-standard GNU proposal, see Section 7.3.5 in the DWARF standard and https://gcc.gnu.org/wiki/DebugFissionDWP for GNU proposal. The main concern here is values for section identifiers, which are partially overlapped with changed meanings. The patch adds support for v5 index sections and resolves that difficulty by defining a set of identifiers for internal use which can represent and distinct values of both standards. Differential Revision: https://reviews.llvm.org/D75929	2020-04-06 13:28:06 +07:00
Igor Kudrin	a0249fe91c	[DebugInfo] Rename section identifiers which are deprecated in DWARFv5. NFC. This is a preparation for an upcoming patch which adds support for DWARFv5 unit index sections. The patch adds tag "_EXT_" to identifiers which reference sections that are deprecated in the DWARFv5 standard. See D75929 for the discussion. Differential Revision: https://reviews.llvm.org/D77141	2020-04-06 13:28:06 +07:00
Craig Topper	97e57f3b24	[DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine. We're ANDing with 1 right after which will cause the SIGN_EXTEND to be combined to ANY_EXTEND later. Might as well just start with an ANY_EXTEND. While there replace create the AND using the getZeroExtendInReg helper to remove the need to explicitly create the VecOnes constant.	2020-04-05 22:44:45 -07:00
Johannes Doerfert	931c0cd713	[OpenMP][NFC] Move and simplify directive -> allowed clause mapping Move the listing of allowed clauses per OpenMP directive to the new macro file in `llvm/Frontend/OpenMP`. Also, use a single generic macro that specifies the directive and one allowed clause explicitly instead of a dedicated macro per directive. We save 800 loc and boilerplate for all new directives/clauses with no functional change. We also need to include the macro file only once and not once per directive. Depends on D77112. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D77113	2020-04-06 00:04:08 -05:00
Craig Topper	586c051a27	[DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition its trying to protect. This code is replacing a shift with a new shift on an extended type. If the shift amount type can't represent the maximum shift amount for the new type, the amount needs to be extended to a type that can. Previously, the code just hardcoded a check for 256 bits which seems to have been an assumption that the original shift amount was MVT::i8. But that seems more catered to a specific target like X86 that uses i8 as its legal shift amount type. Other targets may use different types. This commit changes the code to look at the real type of the shift amount and makes sure it has enough bits for the Log2 of the new type. There are similar checks to this in SelectionDAGBuilder and LegalizeIntegerTypes.	2020-04-05 20:35:57 -07:00
Johannes Doerfert	419a559c5a	[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP` This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D77112	2020-04-05 22:30:29 -05:00
Tarindu Jayatilaka	b43b59fcc0	Expose `attributor-disable` to the new and old pass managers The new and old pass managers (PassManagerBuilder.cpp and PassBuilder.cpp) are exposed to an `extern` declaration of `attributor-disable` option which will guard the addition of the attributor passes to the pass pipelines. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76871	2020-04-05 22:29:34 -05:00
Lang Hames	1b39c6f62c	[ORC] Add MachO universal binary support to StaticLibraryDefinitionGenerator. Add a new overload of StaticLibraryDefinitionGenerator::Load that takes a triple argument and supports loading archives from MachO universal binaries in addition to regular archives. The LLI tool is updated to use this overload.	2020-04-05 20:21:05 -07:00
Simon Pilgrim	a43e233606	Remove unused function 'isInRange'. NFCI.	2020-04-05 23:11:24 +01:00
Simon Pilgrim	4431a29c60	[X86][SSE] Combine unary shuffle(HORIZOP,HORIZOP) -> HORIZOP We had previously limited the shuffle(HORIZOP,HORIZOP) combine to binary shuffles, but we can often merge unary shuffles just as well, folding in UNDEF/ZERO values into the 64-bit half lanes. For the (P)HADD/HSUB cases this is limited to fast-horizontal cases but PACKSS/PACKUS combines under all cases.	2020-04-05 22:49:46 +01:00
Anna Thomas	1d0f757904	[InlineFunction] Update metadata on loads that are return values This patch builds upon D76140 by updating metadata on pointer typed loads in inlined functions, when the load is the return value, and the callsite contains return attributes which can be updated as metadata on the load. Added test cases show this for nonnull, dereferenceable, dereferenceable_or_null Reviewed-By: jdoerfert Differential Revision: https://reviews.llvm.org/D76792	2020-04-05 14:50:10 -04:00
Sourabh Singh Tomar	0d71782f4e	[DebugInfo]: Allow DwarfCompileUnit to have line table symbol Previously line table symbol was represented as `DIE::value_iterator` inside `DwarfCompileUnit` and subsequent function `intStmtList` was used to create a local `MCSymbol` to initialize it. This patch removes `DIE::value_iterator` from `DwarfCompileUnit` and intoduce `MCSymbol` for representing this units symbol for `debug_line` section. As a result `applyStmtList` is also modified to utilize this. Further more a helper function `getLineTableStartSym` is also introduced to get this symbol, this would be used by clients which need to access this line table, i.e `debug_macro`. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D77489	2020-04-06 00:14:29 +05:30
Zuojian Lin	a58c8a7866	Remove the additional constant which requires an extra register for statepoint lowering. The newly-created constant zero will need an extra register to hold it in the current statepoint lowering implementation. Remove it if there exists one.	2020-04-05 11:22:09 -04:00
Apelete Seketeli	8aadb442d1	[scan-build] fix dead store warnings emitted on LLVM AMDGPU code base This fixes dead store warnings of the type "dead assignment" reported by Clang Static Analyzer.	2020-04-05 11:19:03 -04:00
Oliver Stannard	cb6aeb2239	[ARM] Add data gathering hint instruction Summary: This patch upstreams support the optional ARMv8.0 Data Gathering Hint (DGH) extension, which adds the Data Gathering Hint instruction to the hint space. See ARMv8.0-DGH in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, danielkiss, samparker Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77097	2020-04-05 15:21:00 +01:00
Oliver Stannard	6f60eb4a3c	[ARM] Add enhanced counter virtualization system registers Summary: This patch upstreams support for the ARMv8.6A Enhanced Counter Virtualization (ECV) extension, which adds 6 new system registers. See ARMv8.6-ECV in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, pcc, ab, chill Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77094	2020-04-05 15:18:35 +01:00
Sanjay Patel	538a8f0227	[InstCombine] convert bitcast-shuffle to vector trunc As discussed in D76983, that patch can turn a chain of insert/extract with scalar trunc ops into bitcast+extract and existing instcombine vector transforms end up creating a shuffle out of that (see the PhaseOrdering test for an example). Currently, that process requires at least this sequence: -instcombine -early-cse -instcombine. Before D76983, the sequence of insert/extract would reach the SLP vectorizer and become a vector trunc there. Based on a small sampling of public targets/types, converting the shuffle to a trunc is better for codegen in most cases (and a regression of that form is the reason this was noticed). The trunc is clearly better for IR-level analysis as well. This means that we can induce "spontaneous vectorization" without invoking any explicit vectorizer passes (at least a vector cast op may be created out of scalar casts), but that seems to be the right choice given that we started with a chain of insert/extract, and the backend would expand back to that chain if a target does not support the op. Differential Revision: https://reviews.llvm.org/D77299	2020-04-05 09:48:02 -04:00
Oliver Stannard	9e1455dc23	[ARM] Add ARMv8.6 Fine Grain Traps system registers Summary: This patch upstreams support for the ARMv8.6A Fine Grain Traps (FGT) extension, which adds 5 new system registers. See ARMv8.6-FGT in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, momchil.velikov Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76991	2020-04-05 14:28:18 +01:00
Sanjay Patel	4036a0af24	[InstCombine] enhance freelyNegateValue() by handling 'not' This patch extends D77230. If we have a 'not' instruction inside a negated expression, we can ignore extra uses of that op because the negation has a one-to-one replacement: negate becomes increment. Alive2 examples of the test cases: http://volta.cs.utah.edu:8080/z/T5-u9P http://volta.cs.utah.edu:8080/z/eT89L6 Differential Revision: https://reviews.llvm.org/D77459	2020-04-05 09:16:19 -04:00
Sanjay Patel	867f0c3c4d	[ValueTracking] enhance matching of smin/smax with 'not' operands The cmyk tests are based on the known regression that resulted from: rGf2fbdf76d8d0 So this improvement in analysis might be enough to restore that commit.	2020-04-05 08:54:12 -04:00
Diogo Sampaio	59d10dc703	[ARM] add ARMv8.6-A Activity monitors virtualization extension Summary: This patch upstreams v8.6A activity monitors virtualization assembler support, which consists of 32 new system registers (two groups, each with 16 numbered registers). See ARMv8.6-AMU in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, john.brawn, ostannard Reviewed By: ostannard Subscribers: LukeGeeson, dnsampaio, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76998	2020-04-05 13:31:06 +01:00
Benjamin Kramer	ff889df356	[X86] Roll some loops. NFCI.	2020-04-05 13:59:50 +02:00
Florian Hahn	47ee404075	[ValueTracking] Use Inst::comesBefore in isValidAssumeForCtx (NFC). D51664 added Instruction::comesBefore which should provide better performance than the manual check. Reviewers: rnk, nikic, spatel Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D76228	2020-04-05 12:38:04 +01:00
Simon Pilgrim	3079e51858	[X86][SSE] Generalize shuffle(HORIZOP,HORIZOP) -> HORIZOP combine Our existing combine allows to merge the shuffle of 2 similar 64-bit wide 'horizontal ops' (HADD/PACK/etc.) if the shuffle was a UNPCK/MOVSD. This patch generalizes this to decode any target shuffle mask that can be widened to a 128-bit repeating v2*64 mask, which helps us catch PBLENDW/PBLENDD cases.	2020-04-05 12:09:19 +01:00
Simon Pilgrim	a17de6b91c	[X86][SSE] truncateVectorWithPACK - upper undef for 128->64 packing If we're packing from 128-bits to 64-bits then we don't need the RHS argument. This helps with register allocation, especially as we avoid repeating a use of the input value.	2020-04-05 11:47:36 +01:00
Matt Arsenault	6bfe28e92f	AMDGPU: Fix annotate kernel features through casted calls I thought I was testing this before, but the workitem id x case isn't great since it's mandatory in the parent kernel.	2020-04-04 20:44:44 -04:00
Matt Arsenault	221890d709	AMDGPU: Add feature for fast f32 denormals	2020-04-04 20:01:24 -04:00
Stefanos Baziotis	f3dd3a66d3	[Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions. Query AAValueSimplify on pointers in memory accessing instructions to take advantage of the constant propagation (or any other value simplification) of such values.	2020-04-05 02:46:26 +03:00
Jonathan Roelofs	3ce77142a6	Revert "[DAG] Fix PR45049: LegalizeTypes crash" This reverts commit `17673ae0b2`.	2020-04-04 13:47:22 -06:00
Jonathan Roelofs	17673ae0b2	[DAG] Fix PR45049: LegalizeTypes crash Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG does, leading to accidental SDValue removal before its reference count was truly zero. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049 https://reviews.llvm.org/D76994	2020-04-04 13:36:22 -06:00
Florian Hahn	a2b18c5a08	[LV] Simplify tryToWiden as recipes are not re-used (NFC). After `49d00824bb`, VPWidenRecipe only stores a single instruction. tryToWiden can simply return the widen recipe, like other helpers in VPRecipeBuilder.	2020-04-04 18:30:50 +01:00
Heejin Ahn	fc5d8b672b	[WebAssembly] Fix a sanitizer error in WasmEHPrepare Summary: D77423 started using a dominator tree in WasmEHPrepare, but we deleted BBs in `prepareThrows` before we used the domtree in `prepareEHPads`, and those CFG changes were not reflected in the domtree. This uses `DomTreeUpdater` to make sure we update the domtree every time we delete BBs from the CFG. This fixes ubsan/msan/expensive_check errors caught in LLVM buildbots. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77465	2020-04-04 09:57:07 -07:00
Nikita Popov	4ede730096	[InstCombine] Don't limit uses in eraseInstFromFunction() eraseInstFromFunction() adds the operands of the erased instructions, as those might now be dead as well. However, this is limited to instructions with less than 8 operands. This check doesn't make a lot of sense to me. As the instruction gets removed afterwards, I don't see a potential for anything overly pathological happening here (as we can only add those operands to the worklist once). The impact on CTMark is in the noise. We also have the same code in instruction sinking and don't limit the operand count there. Differential Revision: https://reviews.llvm.org/D77325	2020-04-04 18:37:30 +02:00
Luofan Chen	eec6d87626	[Attributor] Deduce attributes for non-exact functions This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amendable. See also [this github issue](https://github.com/llvm/llvm-project/issues/172). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76404	2020-04-04 11:34:58 -05:00
Heejin Ahn	2e9839729d	[WebAssembly] Fix wasm.lsda() optimization in WasmEHPrepare Summary: When we insert a call to the personality function wrapper (`_Unwind_CallPersonality`) for a catch pad, we store some necessary info in `__wasm_lpad_context` struct and pass it. One of the info is the LSDA address for the function. For this, we insert a call to `wasm.lsda()`, which will be lowered down to the address of LSDA, and store it in a field in `__wasm_lpad_context`. There are exceptions to this personality call insertion: catchpads for `catch (...)` and cleanuppads (for destructors) don't need personality function calls, because we don't need to figure out whether the current exception should be caught or not. (They always should.) There was a little optimization to `wasm.lsda()` call insertion. Because the LSDA address is the same throughout a function, we don't need to insert a store of `wasm.lsda()` return value in every catchpad. For example: ``` try { foo(); } catch (int) { // wasm.lsda() call and a store are inserted here, like, in // pseudocode, // %lsda = wasm.lsda(); // store %lsda to a field in __wasm_lpad_context try { foo(); } catch (int) { // We don't need to insert the wasm.lsda() and store again, because // to arrive here, we have already stored the LSDA address to // __wasm_lpad_context in the outer catch. } } ``` So the previous algorithm checked if the current catch has a parent EH pad, we didn't insert a call to `wasm.lsda()` and its store. But this was incorrect, because what if the outer catch is `catch (...)` or a cleanuppad? ``` try { foo(); } catch (...) { // wasm.lsda() call and a store are NOT inserted here try { foo(); } catch (int) { // We need wasm.lsda() here! } } ``` In this case we need to insert `wasm.lsda()` in the inner catchpad, because the outer catchpad does not have one. To minimize the number of inserted `wasm.lsda()` calls and stores, we need a way to figure out whether we have encountered `wasm.lsda()` call in any of EH pads that dominates the current EH pad. To figure that out, we now visit EH pads in BFS order in the dominator tree so that we visit parent BBs first before visiting its child BBs in the domtree. We keep a set named `ExecutedLSDA`, which basically means "Do we have `wasm.lsda()` either in the current EH pad or any of its parent EH pads in the dominator tree?". This is to prevent scanning the domtree up to the root in the worst case every time we examine an EH pad: each EH pad only needs to examine its immediate parent EH pad. - If any of its parent EH pads in the domtree has `wasm.lsda()`, this means we don't need `wasm.lsda()` in the current EH pad. We also insert the current EH pad in `ExecutedLSDA` set. - If none of its parent EH pad has `wasm.lsda()` - If the current EH pad is a `catch (...)` or a cleanuppad, done. - If the current EH pad is neither a `catch (...)` nor a cleanuppad, add `wasm.lsda()` and the store in the current EH pad, and add the current EH pad to `ExecutedLSDA` set. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77423	2020-04-04 07:02:50 -07:00
Simon Pilgrim	e5e719d885	[X86][SSE] lowerV8I16Shuffle - lower compaction shuffles using PACKUSDW(PBLENDW,PBLENDW) on SSE41+ Similar to the lowerV16I8Shuffle implementation, for binary compaction v8i16 shuffles we can avoid the PUNPCKLDQ(PSHUFB,PSHUFB) pattern on SSE41+ targets by using PACKUSDW and PBLENDW. Before SSE41 we would need to use PACKSSDW but that requires sign extension that seems to destroy any gains, even on targets without PSHUFB. This is a bigger gain on AMD than Intel targets but should never be a regression, and avoiding the shuffle mask load(s) is always useful. Noticed in codegen while dealing with PR31443.	2020-04-04 13:08:25 +01:00
Nikita Popov	b90ea4f341	[IRBuilder] Move some code into the cpp file; NFC Since D73835 we no longer need to define the whole IRBuilder implementation in the header. This patch moves some of the larger methods out of line, into the C++ file. Differential Revision: https://reviews.llvm.org/D77332	2020-04-04 12:52:56 +02:00
Nikita Popov	6896d559f3	[VNCoercion] Use IRBuilderBase; NFC And remove include from header.	2020-04-04 12:44:50 +02:00
Nikita Popov	ebd5a1b049	[Reassociate] Use IRBuilderBase; NFC And remove now unnecessary IRBuilder.h include in header.	2020-04-04 12:34:16 +02:00
Nikita Popov	1055e9e3c8	[IVDescriptors] Remove IRBuilder.h include; NFC IVDescriptors.h itself does not reference IRBuilder at all. Move the include into transformation passes that do.	2020-04-04 12:07:57 +02:00
Nikita Popov	a5eb1236e3	[IVDescriptors] Remove unnecessary DemandedBits.h include; NFC Forward declare DemandedBits in IVDescriptors, and move include into the cpp file. Also drop the include from LoopUtils, which does not need it at all.	2020-04-04 12:07:57 +02:00
Craig Topper	1d42c0db9a	Revert "[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets" This reverts commit `c74dd640fd`. Reverting to address coding standard issues raised in post-commit review.	2020-04-03 16:56:08 -07:00
Craig Topper	a505ad58cf	Revert "[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI)" This reverts commit `62c42e29ba` Reverting to address coding standard issues raised in post-commit review.	2020-04-03 16:55:53 -07:00
Scott Constable	62c42e29ba	[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI) After finding all such gadgets in a given function, the pass minimally inserts LFENCE instructions in such a manner that the following property is satisfied: for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at least one LFENCE instruction. The algorithm that implements this minimal insertion is influenced by an academic paper that minimally inserts memory fences for high-performance concurrent programs: http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf The algorithm implemented in this pass is as follows: 1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components: -SOURCE instructions (also includes function arguments) -SINK instructions -Basic block entry points -Basic block terminators -LFENCE instructions 2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6. 3. Use a heuristic or plugin to approximate minimal LFENCE insertion. 4. Insert one LFENCE along each CFG edge that was cut in step 3. 5. Go to step 2. 6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified. By default, the heuristic used in Step 3 is a greedy heuristic that avoids inserting LFENCEs into loops unless absolutely necessary. There is also a CLI option to load a plugin that can provide even better optimization, inserting fewer fences, while still mitigating all of the LVI gadgets. The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin, and a description of the pass's behavior with the plugin can be found here: https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection. Differential Revision: https://reviews.llvm.org/D75937	2020-04-03 13:45:50 -07:00
Scott Constable	c74dd640fd	[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK). Also adds a new target feature to X86: +lvi-load-hardening The feature can be added via the clang CLI using -mlvi-hardening. Differential Revision: https://reviews.llvm.org/D75936	2020-04-03 13:02:04 -07:00
Alina Sbirlea	688450c7f0	[GraphDiff] Extend GraphDiff to track a list of updates. Summary: This patch includes two extensions: 1. It extends the GraphDiff to also keep the original list of updates after legalization, not just the deletes/insert vectors. It also provides an API to pop the first update (the updates are store in reverse, such that the first update is at the end of the list) 2. It adds a bool to mark whether the given updates should be applied as given, or applied in reverse. This moves the task of reversing the updates (when the caller needs this) to a functionality inside GraphDiff, versus having the caller do this. The two changes could be split into two patches, but they seemed reasonably small to be reviewed together. Reviewers: kuhar, dblaikie Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77167	2020-04-03 12:10:36 -07:00
Scott Constable	f95a67d8b8	[X86] Add RET-hardening Support to mitigate Load Value Injection (LVI) Adding a pass that replaces every ret instruction with the sequence: pop <scratch-reg> lfence jmp *<scratch-reg> where <scratch-reg> is some available scratch register, according to the calling convention of the function being mitigated. Differential Revision: https://reviews.llvm.org/D75935	2020-04-03 12:08:34 -07:00
Matt Arsenault	30ebafaa56	CodeGen: Convert some TII hooks to use Register	2020-04-03 14:52:54 -04:00
Matt Arsenault	178050c3ba	AMDGPU: Use Register in more places	2020-04-03 14:52:54 -04:00
Matt Arsenault	e8dcb6d05e	AMDGPU: Remove redundant virtual	2020-04-03 14:52:53 -04:00
Christopher Tetreault	b600809688	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: kparzysz, sdesmalen, efriedma Reviewed By: kparzysz Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77267	2020-04-03 11:26:51 -07:00
Stanislav Mekhanoshin	0462795095	[AMDGPU] Propagate AGPR RC from PHI to its PHI operands We can fix register class of PHI based on its all AGPR uses. That leaves behind all PHIs which were already processed earlier. Propagate RC back to PHI operands of a PHI. Differential Revision: https://reviews.llvm.org/D77344	2020-04-03 11:23:02 -07:00
Simon Pilgrim	2225797567	[YAMLParser] Scanner::setError - ensure we use the StringRef::iterator argument (PR45043) As detailed on PR45043, static analysis was warning that the StringRef::iterator Position argument was being ignored and the function was hardwired to use the Current iterator. This patch ensures we use the provided iterator and removes the (barely necessary) setError wrapper that always used Current. Differential Revision: https://reviews.llvm.org/D76512	2020-04-03 18:55:38 +01:00
Sanjay Patel	ce97ce3a5d	[VectorCombine] try to form a better extractelement Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms. Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement. But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet). In the motivating case from PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 The combination of subsequent instcombine and codegen transforms gets us this improvement: vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm4 vmovshdup %xmm1, %xmm3 ## xmm3 = xmm1[1,1,3,3] vaddps %xmm0, %xmm2, %xmm0 vaddps %xmm1, %xmm3, %xmm1 vshufps $200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3] vinsertps $177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2] --> vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm1 vaddps %xmm0, %xmm2, %xmm0 vshufps $200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3] Differential Revision: https://reviews.llvm.org/D76623	2020-04-03 13:55:13 -04:00
Sylvain Audi	e4ae0a2e97	[Support/Path] sys::path::replace_path_prefix fix and simplifications Added unit tests for 2 scenarios that were failing. Made replace_path_prefix back to 3 parameters instead of 5, simplifying the implementation. The other 2 were always used with the default value. This commit is intended to be the first of 3: 1) simplify/fix replace_path_prefix. 2) use it in the context of -fdebug-prefix-map and -fmacro-prefix-map (see D76869). 3) Make Windows version of replace_path_prefix insensitive to both case and separators (slash vs backslash). Differential Revision: https://reviews.llvm.org/D77223	2020-04-03 13:50:23 -04:00
Simon Pilgrim	34a497b765	[X86][SSE] lowerShuffleWithPACK - extend to use chained PACKs for larger truncations Extend lowerShuffleWithPACK/matchShuffleWithPACK/createPackShuffleMask to handle compaction style shuffle masks that can be lowered to chains of PACKSS/PACKUS if their inputs are suitably sign/zero extended. This helps avoid PSHUFB (and its mask load) for short shuffle chains, shuffle combining will still replace with a PSHUFB if we have enough shuffles as getFauxShuffleMask should recognise the PACKSS/PACKUS chains.	2020-04-03 18:26:10 +01:00
Roman Lebedev	7d572ef2dd	Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)" As discussed in post-commit review in https://reviews.llvm.org/D73501 if the goal of this is to help vectorizer, then we should actually be teaching vectorizer to do this, because right now this rewrite is still budget-limited, which isn't what we'd want. Additionally, while the rest of the patch series was universally profitable, this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171) exposing cost-modeling issues on ARM. So let's just back this particular patch out. Once there's an undo transform, this could be considered for reintegration. This reverts commit `44edc6fd2c`.	2020-04-03 20:15:04 +03:00
John Brawn	4ad9ca0f9e	[ARM] Fix incorrect handling of big-endian vmov.i64 Currently when the target is big-endian vmov.i64 reverses the order of the two words of the vector. This is correct only when the underlying element type is 32-bit, as actually what it should be doing is considering it a vector of the underlying type and reversing the elements of that. Differential Revision: https://reviews.llvm.org/D76515	2020-04-03 17:36:50 +01:00
John Brawn	cd58fb6325	[ARM] Avoid pointless vrev of element-wise vmov If we have an element-wise vmov immediate instruction then a subsequent vrev with width greater or equal to the vmov element width, then that vrev won't do anything. Add a DAG combine to convert bitcasts that would become such vrevs into vector_reg_casts instead. Differential Revision: https://reviews.llvm.org/D76514	2020-04-03 17:36:50 +01:00
Matt Arsenault	57a55313c3	InstCombine: Reduce minnum/maxnum if inputs are casted	2020-04-03 11:57:25 -04:00
jasonliu	d65557d15d	[NFC][XCOFF][AIX] Refactor get/setContainingCsect Summary: For current architect, we always require setContainingCsect to be called on every MCSymbol got used in XCOFF context. This is very hard to achieve because symbols gets created everywhere and other MCSymbol types(ELF, COFF) do not have similar rules. It's very easy to miss setting the containing csect, and we would need to add a lot of XCOFF specialized code around some common code area. This patch intendeds to do 1. Rely on getFragment().getParent() to get csect from labels. 2. Only use get/setRepresentedCsect (was get/setContainingCsect) if symbol itself represents a csect. Reviewers: DiggerLin, hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D77080	2020-04-03 13:33:12 +00:00
Guillaume Chatelet	9068bccbae	[Alignment][NFC] Deprecate InstrTypes getRetAlignment/getParamAlignment Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77312	2020-04-03 13:21:58 +00:00
Guillaume Chatelet	1a584a8d50	[Alignment][NFC] Remove unused private functions Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77297	2020-04-03 09:16:20 +00:00
Guillaume Chatelet	ca11c480e7	[Alignment][NFC] Convert MachineIRBuilder::buildDynStackAlloc to Align Summary: The change in IRTranslator is not trivial but is NFC as far as I can tell. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77292	2020-04-03 09:05:19 +00:00
OCHyams	9b56cc9361	[DebugInfo] Salvage debug info when sinking loop invariant instructions Reviewed By: vsk, aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D77318	2020-04-03 09:19:26 +01:00
Guillaume Chatelet	9f5c786876	[NFC] G_DYN_STACKALLOC realign iff align > 1, update documentation Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77372	2020-04-03 08:12:39 +00:00
scentini	6825920b18	Silence -Wpessimizing-move warning	2020-04-03 09:37:39 +02:00
Scott Constable	5b519cf1fc	[X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) This pass replaces each indirect call/jump with a direct call to a thunk that looks like: lfence jmpq *%r11 This ensures that if the value in register %r11 was loaded from memory, then the value in %r11 is (architecturally) correct prior to the jump. Also adds a new target feature to X86: +lvi-cfi ("cfi" meaning control-flow integrity) The feature can be added via clang CLI using -mlvi-cfi. This is an alternate implementation to https://reviews.llvm.org/D75934 That merges the thunk insertion functionality with the existing X86 retpoline code. Differential Revision: https://reviews.llvm.org/D76812	2020-04-03 00:34:39 -07:00
scentini	0a3845b70f	Silence -Wpessimizing-move warning	2020-04-03 09:24:26 +02:00
Igor Kudrin	f13ce15d44	[DebugInfo] Rename getOffset() to getContribution(). NFC. The old name was a bit misleading because the functions actually return contributions to the corresponding sections. Differential revision: https://reviews.llvm.org/D77302	2020-04-03 14:15:53 +07:00
Sourabh Singh Tomar	69c8fb1c65	[DWARF5] Added support for debug_macro section parsing and dumping in llvm-dwarfdump. Summary: This patch adds parsing and dumping DWARFv5 .debug_macro section in llvm-dwarfdump, it does not introduce any new switch. Existing switch "--debug-macro" should be used to dump macinfo or macro section. Reviewed By: dblaikie, ikudrin, jhenderson Differential Revision: https://reviews.llvm.org/D73086	2020-04-03 12:23:51 +05:30
Serguei Katkov	bd1d70bf0e	[DAG] Change isGCValue detection for statepoint lowering isGCValue should detect whether the deopt value is a GC pointer. Currently it checks by finding the value in SI.Bases and SI.Ptrs. However these data structures contain only those values which have corresponding gc.relocate call. So we can miss GC value if it does not have gc.relocate call (dead after the call). Check GC strategy whether pointer is GC one or consider any pointer to be GC one conservatively. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77130	2020-04-03 12:36:13 +07:00
Scott Constable	b1d581019f	[X86] Refactor X86IndirectThunks.cpp to Accommodate Mitigations other than Retpoline Introduce a ThunkInserter CRTP base class from which new thunk types can inherit, e.g., thunks to mitigate https://software.intel.com/security-software-guidance/software-guidance/load-value-injection. Differential Revision: https://reviews.llvm.org/D76811	2020-04-02 22:09:54 -07:00
Scott Constable	71e8021d82	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810	2020-04-02 21:55:13 -07:00
Hongtao Yu	88da019977	Fix a bug in the inliner that causes subsequent double inlining Summary: A recent change in the instruction simplifier enables a call to a function that just returns one of its parameter to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved which in turn may cause compiler ICE when an already-inlined callsite is reused for further inlining. To put it simply, in the following-like C program, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges. ``` void top() { int t = first(); second(t); } void second(int t) { t = third(t); fourth(t); } void third(int t) { return t; } ``` The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up. We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e, which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of ininlined callsites is a bit different, but in theory the issue could happen there too. Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification. Reviewers: wenlei, davidxl, tejohnson Reviewed By: wenlei, davidxl Subscribers: eraman, nikic, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76248	2020-04-02 21:08:05 -07:00
Xiang1 Zhang	43f031d312	Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology) Summary: This patch comes from H.J.'s `2bd54ce7fa` This patch fix the failed llvm unit tests which running on CET machine. (e.g. ExecutionEngine/MCJIT/MCJITTests) The reason we enable IBT at "JIT compiled with CET" is mainly that: the JIT don't know the its caller program is CET enable or not. If JIT's caller program is non-CET, it is no problem JIT generate CET code or not. But if JIT's caller program is CET enabled, JIT must generate CET code or it will cause Control protection exceptions. I have test the patch at llvm-unit-test and llvm-test-suite at CET machine. It passed. and H.J. also test it at building and running VNCserver(Virtual Network Console), it works too. (if not apply this patch, VNCserver will crash at CET machine.) Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei Subscribers: tstellar, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76900	2020-04-03 11:44:07 +08:00
Jessica Paquette	71947ed927	[AArch64][GlobalISel] Constrain reg operands in selectBrJT This was causing a machine verifier failure on the test suite. Make sure that we don't end up with a weird register class here. Failure for reference: * Bad machine code: Illegal virtual register for instruction * - function: check_constrain - basic block: %bb.1 (0x7f8b70839f80) - instruction: early-clobber %6:gpr64, early-clobber %7:gpr64sp = JumpTableDest32 %5:gpr64, %1:gpr64sp, %jump-table.0 - operand 3: %1:gpr64sp Expected a GPR64 register, but got a GPR64sp register Differential Revision: https://reviews.llvm.org/D77349	2020-04-02 20:34:11 -07:00
Wenju He	fe8ac0fe51	[x86] Fix Intel OpenCL builtin CalleeSavedRegs on skx Summary: Align with AVX512 builtins implementations, some of which don't preserve rdi. Reviewers: yubing, tianqing, craig.topper Reviewed By: craig.topper Subscribers: yaxunl, Anastasia, hiraditya Differential Revision: https://reviews.llvm.org/D77032	2020-04-03 11:27:40 +08:00
Qiu Chaofan	71f1ab5354	[PowerPC] Remove unnecessary XSRSP instruction MI peephole will remove unnecessary FRSP instructions. This patch removes such unnecessary XSRSP. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D77208	2020-04-03 11:05:14 +08:00
Jun Ma	9c6f32a0ff	[Coroutines] Simplify implementation using removePredecessor Differential Revision: https://reviews.llvm.org/D77035	2020-04-03 09:20:07 +08:00
Austin Kerbow	30f18ed387	[AMDGPU] Handle SMRD signed offset immediate Summary: This fixes a few issues related to SMRD offsets. On gfx9 and gfx10 we have a signed byte offset immediate, however we can overflow into a negative since we treat it as unsigned. Also, the SMRD SOFFSET sgpr is an unsigned offset on all subtargets. We sometimes tried to use negative values here. Third, S_BUFFER instructions should never use a signed offset immediate. Differential Revision: https://reviews.llvm.org/D77082	2020-04-02 17:41:52 -07:00
Adrian Prantl	93fe58c9cf	Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.label intrinsic. Debug info for labels is not generated at -gline-tables-only, so this pass should remove them. Differential Revision: https://reviews.llvm.org/D77345	2020-04-02 17:39:33 -07:00
Adrian Prantl	c024f3ebdc	Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.addr intrinsic. This patch also strips llvm.dbg.addr intrinsics when downgrading debug info to linetables-only. Differential Revision: https://reviews.llvm.org/D77343	2020-04-02 17:39:33 -07:00
Lang Hames	05598441de	Re-apply `0071eaaf08`, "[ORC] Export __cxa_atexit ...", with fixes. Forgot to include part of the testcase. Thank to Nico for spotting that and reverting!	2020-04-02 16:03:35 -07:00
Matt Arsenault	f68cc2a7ed	AMDGPU: Use 128-bit DS operations by default	2020-04-02 17:17:47 -04:00
Matt Arsenault	5660bb6bc9	AMDGPU: Remove denormal subtarget features Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.	2020-04-02 17:17:12 -04:00
Matt Arsenault	75cf30918f	AMDGPU: Assume f32 denormals are enabled by default This will likely introduce catastrophic performance regressions on older subtargets, but should be correct. A follow up change will remove the old fp32-denormals subtarget features, and switch to using the new denormal-fp-math/denormal-fp-math-f32 attributes. Frontends should be making sure to add the denormal-fp-math-f32 attribute when appropriate to avoid performance regressions.	2020-04-02 17:17:12 -04:00
Cyndy Ishida	fd4d07517b	[llvm][TextAPI] adding inlining reexported libraries support Summary: [llvm][TextAPI] adding inlining reexported libraries support * this patch adds reader/writer support for MachO tbd files. The usecase is to represent reexported libraries in top level library that won't need to exist for linker indirection because all of the needed content will be inlined in the same document. Reviewers: ributzka, steven_wu, jhenderson Reviewed By: ributzka Subscribers: JDevlieghere, hiraditya, mgrang, dexonsmith, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67646	2020-04-02 13:05:08 -07:00
Craig Topper	4fdb63bbf0	[X86] Enable combineExtSetcc for vectors larger than 256 bits when we've disabled 512 bit vectors. The compares are going to be type legalized to 256 bits so we might as well fold the extend.	2020-04-02 12:44:27 -07:00
Anna Thomas	bf7a16a768	[InlineFunction] Update valid return attributes at callsite within callee body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate valid attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. Also, this is valid only for attributes which are a property of a callsite and not those that are not dependent on the ABI, or a property of the call itself. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-04-02 14:13:12 -04:00

... 5 6 7 8 9 ...

133482 Commits