This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files.
Reviewed By: ributzka, JDevlieghere
Differential Revision: https://reviews.llvm.org/D101835
If we're not emitting separate fences for the success/failure cases, we
need to pass the merged ordering to the target so it can emit the
correct instructions.
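For illustration, a minimal C++ sketch (not from the patch) of a cmpxchg whose failure ordering is stronger than what the success ordering alone would imply:

```
#include <atomic>

// Distinct success/failure orderings (arbitrary combinations are valid
// since C++17). If the target only sees the success ordering (release),
// the acquire semantics required on the failure path can be lost, which
// is what passing the merged ordering prevents.
bool tryPublish(std::atomic<int> &Flag, int &Expected) {
  return Flag.compare_exchange_strong(Expected, 1,
                                      std::memory_order_release,
                                      std::memory_order_acquire);
}
```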
For the PowerPC testcase, we end up with extra fences, but that seems
like an improvement over missing fences. If someone wants to improve
that, the PowerPC backend could be taught to emit the fences after isel,
instead of depending on fences emitted by AtomicExpand.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33332 .
Differential Revision: https://reviews.llvm.org/D103342
Restructure handling of cfg-hide-unreachable-paths and
cfg-hide-deoptimize-paths options so as to make it easier
to introduce new types of hidden blocks.
This is a first step towards simplifying the transform interface to be less error prone. The basic idea is that querying SCEV is cheap (since it's cached) and we can just check for properties related to branch folding in the transform method instead of relying on the heuristic part to pass everything in correctly.
Differential Revision: https://reviews.llvm.org/D103584
This is a followup to D103422. The DenseMapInfo implementations for
ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h
headers, which means that these two headers no longer need to be
included by DenseMapInfo.h.
This required adding a few additional includes, as many files were
relying on various things pulled in by ArrayRef.h.
Differential Revision: https://reviews.llvm.org/D103491
`TargetFrameLowering::emitCalleeSavedFrameMoves` with 4 arguments is not
used anywhere in CodeGen. Thus it shouldn't be exposed as a virtual
function. NFC.
Differential Revision: https://reviews.llvm.org/D103328
The buildbot ppc64le-lld-multistage-test has been failing because the variable
Tag in Waymarking.h is set but not used. This patch removes that variable.
This patch was split from https://reviews.llvm.org/D102246
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This is mainly the ProfileData part of the change. It will load
an FS profile when such a profile is detected. For an extbinary format profile,
the create_llvm_prof tool will add a flag to the profile summary section.
For other format profiles, the users need to use an internal option
(-profile-isfs) to tell the compiler that the profile uses FS discriminators.
This patch also simplifies the bit API used by FS discriminators.
Differential Revision: https://reviews.llvm.org/D103041
Add a getDemandedBits method for uses so we can query demanded bits for each use. This can help get better use information. For example, for the code below:
define i32 @test_use(i32 %a) {
  %1 = and i32 %a, -256
  %2 = or i32 %1, 1
  ; not constant-folded to 1, for illustration purposes
  %3 = trunc i32 %2 to i8
  ; ... some use of %3 ...
  ret i32 %2
}
If we look at the demanded bits of %2 (all 32 bits, because of the return), we would conclude that %a is used regardless of how the return value is used. However, if we look at each use separately, we see that the trunc use of %2 only demands the lower 8 bits, which were overwritten by the and/or above; so whether %a is actually needed depends on how the function's return value is used.
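A rough usage sketch of the per-use query (assuming the new overload takes a Use, as described above; the surrounding function is illustrative):

```
#include "llvm/Analysis/DemandedBits.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Query demanded bits separately for each use of I. For the trunc use in
// the example above this reports only the low 8 bits, whereas the
// per-instruction query reports all 32 bits because of the return.
void inspectUses(Instruction &I, DemandedBits &DB) {
  for (Use &U : I.uses()) {
    APInt DemandedForUse = DB.getDemandedBits(&U);
    (void)DemandedForUse;
  }
}
```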
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D97074
This patch uses the calculated maximum scalable VFs to build VPlans,
cost them and select a suitable scalable VF.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D98722
A recent patch:
https://reviews.llvm.org/rGe0921655b1ff8d4ba7c14be59252fe05b705920e
changed clang's AIX bitfield handling to use 4-byte bitfield containers,
matching XL's behavior. This change triggers static assert failures when
bootstrapping. Change the macro we check to enable bitfield packing on
AIX to `__clang__` which is defined by both xlclang and clang.
Differential Revision: https://reviews.llvm.org/D103474
Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_*
libcall for a given value type.
This includes a small refactoring of ExpandFPLibCall and
ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code,
plus adding an ExpandFPLibCall version that can be called directly
when expanding FPOWI/STRICT_FPOWI to ensure that we actually use
the same RTLIB::Libcall when expanding the libcall as we used when
checking the legality of such a call by doing a getLibcallName check.
Differential Revision: https://reviews.llvm.org/D103050
When rewriting
powf(2.0, itofp(x)) -> ldexpf(1.0, x)
exp2(sitofp(x)) -> ldexp(1.0, sext(x))
exp2(uitofp(x)) -> ldexp(1.0, zext(x))
the wrong type was used for the second argument in the ldexp/ldexpf
libc call, for target architectures with a 16-bit "int" type.
The transform incorrectly used a bitcasted function pointer with
a 32-bit argument when emitting the ldexp/ldexpf call for such
targets.
The fault is solved by using the correct function prototype
in the call, by asking TargetLibraryInfo about the size of "int".
TargetLibraryInfo by default derives the size of the int type by
assuming that it is 16 bits for 16-bit architectures, and
32 bits otherwise. If this isn't true for a target it should be
possible to override that default in the TargetLibraryInfo
initializer.
Differential Revision: https://reviews.llvm.org/D99438
Some existing places use getPointerElementType() to create a copy of a
pointer type with some new address space.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103429
It's still in use in a few places so we can't delete it yet, but there aren't
many left at this point.
Differential Revision: https://reviews.llvm.org/D103352
WindowsSupport.h is a public header; however, if it gets included, it will cause a compile error indicating that llvm/Config/config.h cannot be found, because config.h is a private header. Since there is no actual dependency on anything private in this header, it can be changed to include the public config header instead.
Reviewed By: amccarth
Differential Revision: https://reviews.llvm.org/D103370
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass are to ensure the operands of the ADD
instruction have the expected order (the dest register of the LEA should be a src register of the ADD).
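The rewrite rests on a simple arithmetic identity; a C++ sketch (AT&T operand order in the comments, where `sub a, b` computes b -= a):

```
// reg4 - (reg1 + reg2) == (reg4 - reg1) - reg2
long beforeTransform(long reg1, long reg2, long reg4) {
  long reg3 = reg1 + reg2; // lea (reg1, reg2), reg3
  return reg4 - reg3;      // sub reg3, reg4
}

long afterTransform(long reg1, long reg2, long reg4) {
  long t = reg4 - reg1;    // sub reg1, reg4
  return t - reg2;         // sub reg2, reg4
}
```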
Differential Revision: https://reviews.llvm.org/D101970
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense. This triggers an assertion complaining
it's not loop-invariant.
Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.
Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.
Differential Revision: https://reviews.llvm.org/D102959
As suggested in https://bugs.llvm.org/show_bug.cgi?id=50527, this
moves the DenseMapInfo for APInt and APSInt into the respective
headers, removing the need to include APInt.h and APSInt.h from
DenseMapInfo.h.
We could probably do the same for StringRef and ArrayRef as well.
Differential Revision: https://reviews.llvm.org/D103422
InstCombine didn't perform the transformations when fmul's operands were
the same instruction, because it required each of them to have exactly one
use, which is not the case there. This patch fixes this, adds tests for
these cases, and introduces a new function, isOnlyUserOfAnyOperand, to
check them in a single place.
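A minimal sketch of what such a helper could look like (the name comes from this commit; the body here is an assumption, not the in-tree code):

```
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/InstrTypes.h"
using namespace llvm;

// True if BO is the sole user of at least one of its operands. This also
// covers the case where both operands are the same single-user
// instruction, which a per-operand one-use check rejects.
static bool isOnlyUserOfAnyOperand(BinaryOperator &BO) {
  return any_of(BO.operands(), [](Value *Op) { return Op->hasOneUser(); });
}
```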
This patch is a result of discussion in D102574.
Differential Revision: https://reviews.llvm.org/D102698
This patch adds TargetStackID::WasmLocal. This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.
For the WebAssembly target, IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there. Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.
Differential Revision: https://reviews.llvm.org/D101140
Update isFirstOrderRecurrence to explore all uses of a recurrence phi
and check if we can sink them. If there are multiple users to sink, they
are all mapped to the previous instruction.
Fixes PR44286 (and another PR or two).
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D84951
This is based on the assumption that most simulated instructions don't define
more than one or two registers. This is true for example on x86, where
most instruction definitions don't declare more than one register write.
The default code region size has been increased from 8 to 16. This is based on
the assumption that, for small microbenchmarks, the typical code snippet size is
often less than 16 instructions.
mca::Instruction now uses bitfields to pack flags.
No functional change intended.
1. Removed redundant includes.
2. Removed `releaseMemory()`, which was never defined nor used.
3. Fixed the first-letter case of member function names.
4. Renamed the duplicate name `NonLocalDeps` (in nested struct
`NonLocalPointerInfo`) to `NonLocalDepsMap`.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D102358
ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pairs,
and while the map itself will likely be big-ish (have many keys),
it is a reasonable assumption that each key will refer to a small-ish
number of pairs.
In particular, looking at the n=512 case from
https://bugs.llvm.org/show_bug.cgi?id=50384,
a small size of 4 appears to be the sweet spot;
it results in the fewest allocations while minimizing memory footprint.
```
$ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done
heaptrack.opt.0-orig.gz
total runtime: 14.32s.
calls to allocation functions: 8222442 (574192/s)
temporary memory allocations: 2419000 (168924/s)
peak heap memory consumption: 190.98MB
peak RSS (including heaptrack overhead): 239.65MB
total memory leaked: 67.58KB
heaptrack.opt.1-n1.gz
total runtime: 13.72s.
calls to allocation functions: 7184188 (523705/s)
temporary memory allocations: 2419017 (176338/s)
peak heap memory consumption: 191.38MB
peak RSS (including heaptrack overhead): 239.64MB
total memory leaked: 67.58KB
heaptrack.opt.2-n2.gz
total runtime: 12.24s.
calls to allocation functions: 6146827 (502355/s)
temporary memory allocations: 2418997 (197695/s)
peak heap memory consumption: 163.31MB
peak RSS (including heaptrack overhead): 211.01MB
total memory leaked: 67.58KB
heaptrack.opt.3-n4.gz
total runtime: 12.28s.
calls to allocation functions: 6068532 (494260/s)
temporary memory allocations: 2418985 (197017/s)
peak heap memory consumption: 155.43MB
peak RSS (including heaptrack overhead): 201.77MB
total memory leaked: 67.58KB
heaptrack.opt.4-n8.gz
total runtime: 12.06s.
calls to allocation functions: 6068042 (503321/s)
temporary memory allocations: 2418992 (200646/s)
peak heap memory consumption: 166.03MB
peak RSS (including heaptrack overhead): 213.55MB
total memory leaked: 67.58KB
heaptrack.opt.5-n16.gz
total runtime: 12.14s.
calls to allocation functions: 6067993 (499958/s)
temporary memory allocations: 2418999 (199307/s)
peak heap memory consumption: 187.24MB
peak RSS (including heaptrack overhead): 233.69MB
total memory leaked: 67.58KB
```
While that test may be an extreme worst-case scenario,
https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions
agrees that this also results in improvements in the usual situations.
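A sketch of the shape of the change (type names assumed from the description above, not copied from the tree):

```
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// Give each per-SCEV set-vector an inline capacity of 4 so that the
// common small case never touches the heap.
using ValueOffsetPair = std::pair<Value *, ConstantInt *>;
using ValueOffsetPairSetVector = SmallSetVector<ValueOffsetPair, 4>;
DenseMap<const SCEV *, ValueOffsetPairSetVector> ExprToValues;
```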
This is similar to the fix in c590a9880d (PR49832), but
we missed handling the pattern for select of bools (no compare
inst).
We can't substitute a vector value because the equality condition
replacement that we are attempting requires that the condition
is true/false for the entire value. Vector select can be partly
true/false.
I added an assert for vector types, so we shouldn't hit this again.
Fixed formatting while auditing the callers.
https://llvm.org/PR50500
If a cmpxchg specifies acquire or seq_cst on failure, make sure we
generate code consistent with that ordering even if the success ordering
is not acquire/seq_cst.
At one point, it was ambiguous whether this sort of construct was valid,
but the C++ standard and LLVM now accept arbitrary combinations of
success/failure orderings.
This doesn't address the corresponding issue in AtomicExpand. (This was
reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .)
Fixes https://bugs.llvm.org/show_bug.cgi?id=50512.
Differential Revision: https://reviews.llvm.org/D103284
Parameter positions seem like they should be unsigned.
While there, make function names lowercase per coding standards.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D103224
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.
Utilize LoopNest and let function 'Flatten' generate information from it.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D102904
The constructor of InOrderIssueStage no longer takes as input a reference to the
target scheduling model. The stage can always query the subtarget to obtain a
reference to the scheduling model.
The ResourceManager is no longer stored internally as a unique_ptr.
Moved a couple of method definitions to the .cpp file.
Moved the logic that checks for RAW hazards from the InOrderIssueStage to the
RegisterFile.
Changed how the InOrderIssueStage keeps track of backend stalls. Stall events
are now generated from method notifyStallEvent().
No functional change intended.
This is the first in a series of patches to provide builtins for
compatibility with the XL compiler. Most of the builtins already had
intrinsics and only needed to be implemented in the front end.
Intrinsics were created for the three iospace builtins, eieio, and icbt.
Pseudo instructions were created for eieio and iospace_eieio to
ensure that nops were inserted before the eieio instruction.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D102443
This patch fixes an oversight in https://reviews.llvm.org/D96181
in stripDebugInfo(), and also takes into account loop
metadata pointing to other MDNodes that point into the debug info.
rdar://78487175
Differential Revision: https://reviews.llvm.org/D103220
Mark the `ELFRelocationEntry::dump` method as `LLVM_DUMP_METHOD` to
annotate it properly as used to prevent the function being dead stripped
away. This allows use of `dump` in the debugger. This is purely to
improve the developer experience.
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D99149
This patch adds a way for the target to configure the type it uses for
the explicit vector length operands of VP SDNodes. The type must be a
legal integer type (there is still no target-independent legalization of
this operand) and must currently be at least as big as i32, the type
used by the IR intrinsics. An implicit zero-extension takes place on
targets which choose a larger type. All VP nodes should be created with
this type used for the EVL operand.
This allows 64-bit RISC-V to avoid custom legalization of all VP nodes,
keeping them in their target-independent form for a bit longer.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D103027
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.
Also implements tests for the variants.
Differential Revision: https://reviews.llvm.org/D102008
This is a port from the DAG legalization. We're still missing some of the
canonicalizations of shuffles but it's a start.
Differential Revision: https://reviews.llvm.org/D102828
`non-global-value-max-name-size` is used by `Value` to cap the length of local value names. However, this flag is not considered by `LLParser`, which leads to an unexpected `use of undefined value` error. The fix is to move the responsibility of capping the length to `ValueSymbolTable`.
The test is the one provided by [[ https://bugs.llvm.org/show_bug.cgi?id=45899 | Mikael in the bug report ]].
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102707
There can be a need for some optimizations to get (base, offset)
for any GC pointer. The base can be calculated by generating
needed instructions as it is done by the
RewriteStatepointsForGC::findBasePointer() function. The offset
can be calculated in the same way. Though to not expose the base
calculation and to make the offset calculation as simple as
ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal
outside RS4GC, this patch introduces 2 intrinsics:
@llvm.experimental.gc.get.pointer.base(%derived_ptr)
@llvm.experimental.gc.get.pointer.offset(%derived_ptr)
These intrinsics are inlined by RS4GC along with generation of
statepoint sequences.
With these new intrinsics the GC parseable lowering for atomic
memcpy intrinsics (6ec2c5e402)
could be implemented as a separate pass.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100445
There were a bunch of lost debug location remarks that show up when legalizing
tail calls on AArch64.
This would happen because we drop the return in the block where we emit the
tail call. So, we end up dropping the debug location, which makes the
LostDebugLocObserver report a missing debug location.
Although it's *true* that we lose these debug locations, this isn't
a particularly useful remark. We expect to drop these debug locations when
emitting tail calls. Suppressing remarks in this case is preferable, since the
amount of noise could hide actual debug location related bugs.
To do this, I just plumbed the LostDebugLocObserver through the relevant
LegalizerHelper functions. This is the only case I can think of where we need
the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it
in the LegalizerHelper proper and mucking around with the constructors, I
figured it'd be cleanest to take the simplest path for now.
This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at
-Os.
Differential Revision: https://reviews.llvm.org/D103128
This patch introduces "DBG_PHI" instructions, a marker of where a PHI
instruction used to be, before PHI elimination. Under the instruction
referencing model, we want to know where every value in the function is
defined -- and a PHI, even if implicit, is such a place.
Just like instruction numbers, we can use this to identify a value to be
used as a variable value, but we don't need to know what instruction
defines that value, for example:
bb1:
DBG_PHI $rax, 1
[... more insts ... ]
bb2:
DBG_INSTR_REF 1, 0, !1234, !DIExpression()
This specifies that on entry to bb1, whatever value is in $rax is known
as value number one -- and the later DBG_INSTR_REF marks the position
where variable !1234 should take on value number one.
PHI locations are stored in MachineFunction for the duration of the
regalloc phase in the DebugPHIPositions map. The map is populated by
PHIElimination, and then flushed back into the instruction stream by
virtregrewriter. A small amount of maintenance is needed in
LiveDebugVariables to account for registers being split, but only for
individual positions, not for entire ranges of blocks.
Differential Revision: https://reviews.llvm.org/D86812
This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case.
Differential Revision: https://reviews.llvm.org/D103189
DwarfDebug unconditionally assumes for all call instructions the 0th
operand is the callee operand, which seems to be true for other targets,
but not for WebAssembly. This adds `TargetInstrInfo::getCallOperand`
method whose default implementation returns `getOperand(0)`, and makes
WebAssembly override it to use its own utility method to get the callee
operand.
This also fixes an existing bug in `WebAssembly::getCalleeOp`, which was
uncovered by this CL.
Reviewed By: dschuff, djtodoro
Differential Revision: https://reviews.llvm.org/D102978
This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context.
I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns. There are two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable. If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired.
Differential Revision: https://reviews.llvm.org/D103182
In objdump, many targets support `-M no-aliases`. Instead of having a
`-*-no-aliases` for each target when LLVM adds the support, it makes more sense
to introduce objdump style `-M`.
-riscv-arch-reg-names is removed. -riscv-no-aliases has too many uses and thus is retained for now.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D103004
The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost.
By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928.
Differential Revision: https://reviews.llvm.org/D102934
Support virtual, physical and tied i128 register operands in inline assembly.
i128 is on SystemZ not really supported and is not a legal type and generally
such a value will be split into two i64 parts. There are however some
instructions that require a pair of two GPR64 registers contained in the GR128
bit reg class, which is untyped.
For inline assembly operands, it proved to be very cumbersome to first follow
the general behavior of splitting an i128 operand into two parts and then
later rebuild the INLINEASM MI to have one GR128 register. Instead, some
minor common code changes were made to SelectionDAGBUilder to only create one
GR128 register part to begin with. In particular:
- getNumRegisters() now has an optional parameter "RegisterVT" which is
passed by AddInlineAsmOperands() and GetRegistersForValue().
- The bitcasting in GetRegistersForValue is not performed if RegVT is
Untyped.
- The RC for a tied use in AddInlineAsmOperands() is now computed either from
the tied def (virtual register), or by getMinimalPhysRegClass() (physical
register).
- InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register
class (DstRC) can also be computed for an illegal type.
In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and
joinRegisterPartsIntoValue() have been implemented to handle i128 operands.
Differential Revision: https://reviews.llvm.org/D100788
Review: Ulrich Weigand
When loop hints are passed via metadata, the allowReordering function
in LoopVectorizationLegality will allow the order of floating point
operations to be changed:
bool allowReordering() const {
// When enabling loop hints are provided we allow the vectorizer to change
// the order of operations that is given by the scalar loop. This is not
// enabled by default because can be unsafe or inefficient.
The -enable-strict-reductions flag introduced in D98435 will currently only
vectorize reductions in-loop if hints are used, since canVectorizeFPMath()
will return false if reordering is not allowed.
This patch changes canVectorizeFPMath() to query whether it is safe to
vectorize the loop with ordered reductions if no hints are used. For
testing purposes, an additional flag (-hints-allow-reordering) has been
added to disable the reordering behaviour described above.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D101836
Global values imply flags such as readable, writable, executable for the
sections that they will be placed in. Currently MC places all such
entries into the same section, using the first set of flags seen. This
can lead to situations in LTO where a writable global is placed in the
same named section as a readable global from another file, and the
section may not be marked writable.
D72194 ensures that mergeable globals with explicit sections are placed
in separate sections with compatible entry size, by emitting the
`unique` assembly syntax where appropriate. This change extends that
approach to include section flags, so that globals with different
section flags are emitted in separate unique sections.
Differential revision: https://reviews.llvm.org/D100944
Summary: This is an NFC patch to change the input parameter of the method SectionRef::isDebugSection(), replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine whether a section is a debug section in more ways than just by section name.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102601
Since the opaque pointer type won't contain the pointee type, we need to
separately encode the value type for an atomicrmw.
Emit this new code for atomicrmw.
Handle this new code and the old one in the bitcode reader.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103123
Beside the `comdat any` deduplication feature, instrumentations use comdat to
establish dependencies among a group of sections, to prevent section based
linker garbage collection from discarding some members without discarding all.
LangRef acknowledges this usage with the following wording:
> All global objects that specify this key will only end up in the final object file if the linker chooses that key over some other key.
On ELF, for PGO instrumentation, a `__llvm_prf_cnts` section and its associated
`__llvm_prf_data` section are placed in the same GRP_COMDAT group. A
`__llvm_prf_data` is usually not referenced and expects the liveness of its
associated `__llvm_prf_cnts` to retain it.
The `setComdat(nullptr)` code (added by D10679) in InternalizePass can break the
use case (a `__llvm_prf_data` may be dropped with its associated `__llvm_prf_cnts` retained).
The main goal of this patch is to fix the dependency relationship.
I think it makes sense for InternalizePass to internalize a comdat and thus
suppress the deduplication feature, e.g. a relocatable link of a regular LTO can
create an object file affected by InternalizePass.
If a non-internal comdat in a.o is prevailed by an internal comdat in b.o, the
a.o references to the comdat definitions will be non-resolvable (references
cannot bind to STB_LOCAL definitions in b.o).
On PE-COFF, for a non-external selection symbol, deduplication is naturally
suppressed with link.exe and lld-link. However, this is fuzzy on ELF and I tend
to believe the spec creator has not thought about this use case (see D102973).
GNU ld and gold are still using the "signature is name based" interpretation.
So even if D102973 for ld.lld is accepted, for portability, a better approach is
to rename the comdat. A comdat with one single member is the common case,
leaving the comdat can waste (sizeof(Elf64_Shdr)+4*2) bytes, so we optimize by
deleting the comdat; otherwise we rename the comdat.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D103043
When memoized values for a SCEV expression are dropped, we also
drop all BECounts that make use of the SCEV expression. This is done
by iterating over all the ExitNotTaken counts and (recursively)
checking whether they use the SCEV expression. If there are many
exits, this will take a lot of time.
This patch improves the situation by pre-computing a set of all
used operands, so that we can determine whether a certain BEInfo
needs to be invalidated using a simple set lookup. We will still need
to loop over all BEInfos, though.
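A hedged sketch of the pre-computation built on the existing SCEV traversal helpers (the follow/isDone visitor interface is real; the surrounding names are illustrative):

```
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Analysis/ScalarEvolutionExpressions.h"
using namespace llvm;

// Collect every sub-expression of a backedge-taken count into a set, so
// "does this BEInfo use expression S?" becomes a single set lookup.
struct SCEVUsedCollector {
  SmallPtrSetImpl<const SCEV *> &Used;
  SCEVUsedCollector(SmallPtrSetImpl<const SCEV *> &Used) : Used(Used) {}
  bool follow(const SCEV *S) {
    Used.insert(S);
    return true; // keep descending into operands
  }
  bool isDone() const { return false; }
};

static void collectUsedSCEVs(const SCEV *BECount,
                             SmallPtrSetImpl<const SCEV *> &Used) {
  SCEVUsedCollector C(Used);
  visitAll(BECount, C);
}
```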
This makes for a mild improvement on non-degenerate cases:
https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions
For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384,
for n=128 I'm seeing run time drop from 1.6s to 1.1s.
Differential Revision: https://reviews.llvm.org/D102796
- When lowering memory intrinsics such as memcpy, the attached scoped AA
metadata is not passed down to the backend. As a result, the backend
cannot schedule relevant memory operations around them following that
hint. In this patch, SelectionDAG is enhanced to propagate that
metadata (scoped AA only) when they are lowered into loads and stores.
Differential Revision: https://reviews.llvm.org/D102215
Currently, BPF only contains three relocations:
R_BPF_NONE for no relocation
R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation
R_BPF_64_32 for call insn and normal 32-bit data relocation
Also, the .BTF and .BTF.ext sections contain symbols in allocated
program and data sections. These two sections reserve 32-bit
space to hold the offset relative to the symbol's section.
When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld
may attempt to resolve relocations for .BTF and .BTF.ext,
which we want to prevent. So we used R_BPF_NONE for such relocations.
This all works fine until when we try to do linking of
multiple objects.
- R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data
  is different, so lld target->relocate() needs more context
  to do a correct job.
- The same for R_BPF_64_32. More context is needed for
  lld target->relocate() to differentiate call insn vs.
  normal 32-bit data relocation.
- Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE,
  they will not be relocated properly when multiple .BTF/.BTF.ext
  sections are merged by lld.
This patch intends to address this issue by adding additional
relocation kinds:
R_BPF_64_ABS64 for normal 64-bit data relocation
R_BPF_64_ABS32 for normal 32-bit data relocation
R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations.
The old R_BPF_64_{64,32} semantics:
R_BPF_64_64 for LD_imm64 relocation
R_BPF_64_32 for call insn relocation
The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values
is maintained. They are the most common use cases for
bpf programs and we want to maintain backward compatibility
as much as possible.
ExecutionEngine RuntimeDyld BPF relocations are adjusted as well.
R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and
other relocations will be ignored.
Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in
RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!"
fatal error.
FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp
are removed as they are not triggered in BPF backend.
BPF backend used FK_SecRel_8 for LD_imm64 instruction operands.
Differential Revision: https://reviews.llvm.org/D102712
We really ought to support no_sanitize("coverage") in line with other
sanitizers. This came up again in discussions on the Linux-kernel
mailing lists, because we currently do workarounds using objtool to
remove coverage instrumentation. Since that support is only on x86, to
continue support coverage instrumentation on other architectures, we
must support selectively disabling coverage instrumentation via function
attributes.
Unfortunately, for SanitizeCoverage, it has not been implemented as a
sanitizer via fsanitize= and associated options in Sanitizers.def, but
rolls its own option fsanitize-coverage. This meant that we never got
"automatic" no_sanitize attribute support.
Implement no_sanitize attribute support by special-casing the string
"coverage" in the NoSanitizeAttr implementation. To keep the feature as
unintrusive to existing IR generation as possible, define a new negative
function attribute NoSanitizeCoverage to propagate the information
through to the instrumentation pass.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035
Reviewed By: vitalybuka, morehouse
Differential Revision: https://reviews.llvm.org/D102772
The change is currently NFC, but exploited by the depending D102954.
Code to handle constants is borrowed from the general implementation
of Value::doRAUW().
Differential Revision: https://reviews.llvm.org/D103051
I cannot find documentation on this CPU, and it
is not supported by the Arm Compiler 5 product either.
It was likely a mistake or a different name for the
"ep9312", which is an Arm based Cirrus Logic chip.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D103024
This patch introduces new operations on jitlink::Blocks: setMutableContent,
getMutableContent and getAlreadyMutableContent. The setMutableContent method
will set the block content data and size members and flag the content as
mutable. The getMutableContent method will return a mutable copy of the existing
content value, auto-allocating and populating a new mutable copy if the existing
content is marked immutable. The getAlreadyMutableContent method asserts that the
existing content is already mutable and returns it.
setMutableContent should be used when updating the block with totally new
content backed by mutable memory. It can be used to change the size of the
block. The argument value should *not* be shared with any other block.
getMutableContent should be used when clients want to modify the existing
content and are unsure whether it is mutable yet.
getAlreadyMutableContent should be used when clients want to modify the existing
content and know from context that it must already be mutable.
These operations reduce copy-modify-update boilerplate and unnecessary copies
introduced when clients couldn't be sure whether the existing content was
mutable or not.
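A hedged usage sketch (assuming getMutableContent takes the owning LinkGraph so it can allocate the mutable copy):

```
#include "llvm/ExecutionEngine/JITLink/JITLink.h"
using namespace llvm::jitlink;

// Patch the first byte of a block's content, copying the content into
// mutable memory first if it is still backed by immutable data.
void patchFirstByte(LinkGraph &G, Block &B, char NewByte) {
  llvm::MutableArrayRef<char> Content = B.getMutableContent(G);
  Content[0] = NewByte;
}
```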
Instrumentation callbacks are not made aware of LoopNest passes. From the loop pass manager, we can pass the outermost loop of the LoopNest to instrumentation in the case of LoopNest passes.
The current patch makes the change in two places in StandardInstrumentation.cpp. I will submit a proper patch where the OuterMostLoop is passed from the LoopPassManager to the callbacks. That way we will avoid making changes at multiple places in StandardInstrumentation.cpp.
A testcase also will be submitted.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102463
The findLoopPreheader function will currently not find a preheader if it
branches to multiple different loop headers. This patch adds an option
to relax that, allowing ARMLowOverheadLoops to process more loops
successfully. This helps with WhileLoopStart setup instructions that can
branch/fallthrough to the low overhead loop and to branch to a separate
loop from the same preheader (but I don't believe it is possible for
both loops to be low overhead loops).
Differential Revision: https://reviews.llvm.org/D102747
If we simplify values we sometimes end up with type mismatches. If the
value is a constant, though, we can often cast it to still allow
propagation. The logic is now put into a helper and it replaces some
ad hoc things we did before.
This also introduces the AA namespace for abstract attribute related
functions and types.
The state of AAPotentialValues tracks if undef is contained. It should
fold undef into the first non-undef value. However we missed a case
before. There was also a shadowing definition of two variables that
caused trouble. The test exposes both problems.
This makes it possible for targets to define their own MCObjectFileInfo.
This MCObjectFileInfo is then used to determine things like section alignment.
This is a follow up to D101462 and prepares for the RISCV backend defining the
text section alignment depending on the enabled extensions.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101921
The prevailing style does not add the message. The directive name is not useful
because the next line replicates the error line which includes the directive.
If exiting using _Exit or ExitProcess, DLLs are still unloaded
cleanly before exiting, running destructors and other cleanup in those
DLLs. When the caller expects to exit without cleanup, running
destructors in some loaded DLLs (which can be either libLLVM.dll or
e.g. libc++.dll) can cause deadlocks occasionally.
This is an alternative to D102684.
Differential Revision: https://reviews.llvm.org/D102944
Moving these bitfields from Block to Addressable allows them to be packed with
the bitfields at the end of Addressable, reducing the size of Block by eight
bytes.
The implementation and intent behind freeing the triple string here are the same
as for LLVMGetDefaultTargetTriple (and any other owned C string returned from the C
API), so we should use LLVMDisposeMessage to free the string for
consistency.
Patch by Mats Larsen -- thanks Mats!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D102957
D88631 added initial support for:
- -mstack-protector-guard=
- -mstack-protector-guard-reg=
- -mstack-protector-guard-offset=
flags, and D100919 extended these to AArch64. Unfortunately, these flags
aren't retained for LTO. Make them module attributes rather than
TargetOptions.
Link: https://github.com/ClangBuiltLinux/linux/issues/1378
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D102742
These checks already exist as asserts when creating the corresponding
instruction. Anybody creating these instructions already need to take
care to not break these checks.
Move the checks for success/failure ordering in cmpxchg from the
verifier to the LLParser and BitcodeReader plus an assert.
Add some tests for cmpxchg ordering. The .bc files are created from the
.ll files with an llvm-as with these checks disabled.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102803
This reapplies c0f3dfb9, which was reverted following the discovery of
crashes on linux kernel and chromium builds - these issues have since
been fixed, allowing this patch to re-land.
This reverts commit 4397b7095d.
[Debugify][Original DI] Test dbg var loc preservation
This is an improvement of [0]. This adds checking of
original llvm.dbg.values()/declares() instructions in
optimizations.
We have picked a real issue that has been found with
this (actually, picked one variable location missing
from [1] and resolved the issue), and the result is
the fix for that -- D100844.
Before applying the D100844, using the options from [0]
(but with this patch applied) on the compilation of GDB 7.11,
the final HTML report for the debug-info issues can be found
at [1] (please scroll down, and look for
"Summary of Variable Location Bugs"). After applying
the D100844, the numbers have improved a bit -- please take
a look into [2].
[0] https://llvm.org/docs/HowToUpdateDebugInfo.html#\
test-original-debug-info-preservation-in-optimizations
[1] https://djolertrk.github.io/di-check-before-adce-fix/
[2] https://djolertrk.github.io/di-check-after-adce-fix/
Differential Revision: https://reviews.llvm.org/D100845
The unit test was failing because the pass from the test that
modifies the IR didn't return 'true' from its runOnFunction(),
so the expensive-checks configuration triggered an assertion.
We can't declare a unique_function that has among its arguments a reference to
a template type with an incomplete type argument.
For instance, we can't declare unique_function<void(SmallVectorImpl<A>&)>
when A is forward declared.
This is because SFINAE will trigger a hard error in this case, when instantiating
IsSizeLessThanThresholdT with the incomplete type.
This patch specializes AdjustedParamT for references to remove this error.
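A minimal reproduction of the scenario from the description:

```
#include "llvm/ADT/FunctionExtras.h"
#include "llvm/ADT/SmallVector.h"

struct A; // forward-declared only; A is incomplete here

// Before this fix, naming this type hard-errored while instantiating
// IsSizeLessThanThresholdT with the incomplete argument type; with the
// AdjustedParamT specialization for references it compiles.
using Callback = llvm::unique_function<void(llvm::SmallVectorImpl<A> &)>;
```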
Committed on behalf of: @math-fehr (Fehr Mathieu)
Reviewed By: DaniilSuchkov, yrouban
Previously APFloat::convertToDouble could be called only for APFloats that
were built using double semantics. Other semantics like single precision
were not allowed although corresponding numbers could be converted to
double without loss of precision. The similar restriction applied to
APFloat::convertToFloat.
With this change any APFloat that can be precisely represented by double
can be handled with convertToDouble. Behavior of convertToFloat was
updated similarly. It makes the conversion operations more convenient and
adds support for formats like half and bfloat.
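For example (a small sketch against the public APFloat API):

```
#include "llvm/ADT/APFloat.h"
using namespace llvm;

// A half-precision value is exactly representable as a double, so after
// this change convertToDouble() handles it instead of being restricted
// to values built with double semantics.
double halfToDouble() {
  APFloat H(APFloat::IEEEhalf(), "2.5");
  return H.convertToDouble();
}
```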
Differential Revision: https://reviews.llvm.org/D102671
.byte supports strings, so if the whole byte list is printable,
we can print the string instead, for readability and lit test maintenance.
.byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d
->
.byte "Hello, world"
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102814
Remove the `nosync` attribute from the memory intrinsic definitions
(i.e. memset, memcpy, memmove).
Like native memory accesses, memory intrinsics can be volatile. This is
indicated by an immarg in the intrinsic call. All else equal, a volatile
memory intrinsic is `sync`, so we cannot annotate the intrinsic functions
themselves as `nosync`. The attributor and function-attr passes know to
take the volatile bit into account.
Since `nosync` is a default attribute, this means we have to stop using
the DefaultAttrIntrinsic tablegen class for memory intrinsics, and
specify all default attributes other than `nosync` explicitly.
Most of the test changes are trivial churn, but one test case
(in nosync.ll) was in fact incorrect before this change.
Differential Revision: https://reviews.llvm.org/D102295
In D98289#inline-939112 @dblaikie said:
Perhaps this could be more informative about what makes the range list
index of 0 invalid? "index 0 out of range of range list table (with
range list base 0xXXX) with offset entry count of XX (valid indexes
0-(XX-1))" Maybe that's too verbose/not worth worrying about since
this'll only be relevant to DWARF producers trying to debug their
DWARFv5, maybe no one will ever see this message in practice. Just
a thought.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102851
EarlyCSE cannot distinguish between floating point instructions and
constrained floating point intrinsics that are marked as running in the
default FP environment. Said intrinsics are supposed to behave exactly the
same as the regular FP instructions. Teach EarlyCSE to handle them in that
case.
Differential Revision: https://reviews.llvm.org/D99962
The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a
type-legal splat value as it may implicitly extract a vector element
from another shuffle. It is not permitted to introduce an illegal type
when lowering shuffles.
This patch addresses the crash by adding a boolean flag to
`getSplatValue`, defaulting to false, which when set will ensure a
type-legal return value. If it is unable to do that it will fail to
return a splat value.
I've been through the existing uses of `getSplatValue` in other targets
and was unable to find a need or test cases showing a need to update
their uses. In some cases, the call is made during `LegalizeVectorOps`
which may still produce illegal scalar types. In other situations, the
illegally-typed splat value may be quickly patched up to a legal type
(such as any-extending the returned `extract_vector_elt` up to a legal
type) before `LegalizeDAG` notices.
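A sketch of how a caller uses the new flag (the boolean parameter is the one described above; the surrounding function is illustrative):

```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Request only a type-legal splat value during lowering; bail out rather
// than introduce an illegally typed extract_vector_elt.
static SDValue lowerWithLegalSplat(SelectionDAG &DAG, SDValue Op) {
  SDValue Splat = DAG.getSplatValue(Op, /*LegalTypes=*/true);
  if (!Splat)
    return SDValue(); // no legal splat available; keep the shuffle form
  return Splat;
}
```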
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102687
__table_base is now 64-bit, since in LLVM it represents a function pointer offset.
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added
Differential Revision: https://reviews.llvm.org/D101784
This is a follow-up of D102201. After some discussion, it is a better idea
to upgrade all invalid uses of alignment attributes on function return
values and parameters, not just limited to void function return types.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102726
Linker scripts might not handle COMDAT sections. SLSHardening adds a
new section for each __llvm_slsblr_thunk_xN. This new option allows
the generation of the thunks into the normal text section to handle these
exceptional cases.
,comdat or ,nocomdat can be added to harden-sls to control the codegen.
-mharden-sls=[all|retbr|blr],nocomdat.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D100546
llvm::Any::TypeId::Id relies on the uniqueness of the address of a static
variable defined in a template function. hidden visibility implies vague linkage
for that variable, which does not guarantee the uniqueness of the address across
a binary and a shared library. This totally breaks the implementation of
llvm::Any.
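The mechanism at stake, in simplified form (a sketch of the pattern, not the exact llvm::Any code):

```
// One static per instantiation; its *address* serves as the type id.
// Under vague linkage with hidden visibility, a binary and a shared
// library can each get their own copy of Id, so addresses compared
// across the boundary no longer match and any_isa<T>-style checks fail.
template <typename T> struct TypeId {
  static const char Id;
};
template <typename T> const char TypeId<T>::Id = 0;

template <typename T> const void *typeIdAddress() { return &TypeId<T>::Id; }
```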
Ideally, setting the visibility of llvm::Any::TypeId::Id should be enough;
unfortunately this doesn't work as expected, and we lack time (before the 12.0.1
release) to understand why setting the visibility of llvm::Any does work.
See https://gcc.gnu.org/wiki/Visibility and
https://gcc.gnu.org/onlinedocs/gcc/Vague-Linkage.html
for more information on that topic.
Differential Revision: https://reviews.llvm.org/D101972
No verifier changes needed, the verifier currently doesn't check that
the pointer operand's pointee type matches the GEP type. There is a
similar check in GetElementPtrInst::Create() though.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102744
Summary:
Currently, only `OptimizationRemarks` can be emitted using a Function.
Add constructors to allow this for `OptimizationRemarksAnalysis` and
`OptimizationRemarkMissed` as well.
Reviewed By: jdoerfert, thegameg
Differential Revision: https://reviews.llvm.org/D102784
For source-based coverage, the frontend sets the counter IDs, and the
constraints on counter IDs are not defined. For example, the Rust frontend
until recently had a reserved counter #0
(https://github.com/rust-lang/rust/pull/83774). Rust coverage
instrumentation also creates counters on edges in addition to basic
blocks. Some functions may have more counters than regions.
This breaks an assumption in CoverageMapping.cpp where the number of
counters in a function is assumed to be bounded by the number of
regions:
Counts.assign(Record.MappingRegions.size(), 0);
This assumption causes CounterMappingContext::evaluate() to fail since
there are not enough counter values created in the above call to
`Counts.assign`. Consequently, some uncovered functions are not
reported in coverage reports.
This change walks a Function's CoverageMappingRecord to find the maximum
counter ID, and uses it to initialize the counter array when instrprof
records are missing for a function in sparse profiles.
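A hedged sketch of the fix's shape (the in-tree version also walks expression operands; the helper name is illustrative):

```
#include "llvm/ProfileData/Coverage/CoverageMapping.h"
#include <algorithm>
using namespace llvm::coverage;

// Size the counter array from the largest counter ID referenced by the
// regions, rather than from the number of regions.
static unsigned maxDirectCounterID(const CoverageMappingRecord &Record) {
  unsigned MaxID = 0;
  for (const CounterMappingRegion &R : Record.MappingRegions)
    if (!R.Count.isExpression() && !R.Count.isZero())
      MaxID = std::max(MaxID, R.Count.getCounterID());
  return MaxID;
}
// Then: Counts.assign(maxDirectCounterID(Record) + 1, 0);
```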
Differential Revision: https://reviews.llvm.org/D101780
lld/MachO/Driver.cpp and lld/MachO/SyntheticSections.cpp include
llvm/Config/config.h which doesn't exist when building standalone lld.
This patch replaces the llvm/Config/config.h include with llvm/Config/llvm-config.h,
just like in lld/ELF/Driver.cpp, replaces HAVE_LIBXAR with LLVM_HAVE_LIBXAR, and
moves LLVM_HAVE_LIBXAR from config.h to llvm-config.h.
Also it adds LLVM_HAVE_LIBXAR to LLVMConfig.cmake and links liblldMachO2.so
with XAR_LIB if LLVM_HAVE_LIBXAR is set.
Differential Revision: https://reviews.llvm.org/D102084
The operations of some VP intrinsics do not (or will not) map to regular
instruction opcodes. Returning 'None' seems more intuitive here than
'Instruction::Call'.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D102778
- This patch (is one in a series of patches) which introduces HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html | main RFC here ]])
- This patch in particular introduces HLASM Parser support for Z machine instructions.
- The approach taken here was to subclass `AsmParser`, and mark various functions and variables as "protected" wherever appropriate.
- The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well.
The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf | HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format):
```
<TokA><spaces.*><TokB><spaces.*><TokC><spaces.*><TokD>
```
1. TokA is referred to as the Name Entry. This token is optional.
2. TokB is referred to as the Operation Entry. This token is mandatory.
3. TokC is referred to as the Operand Entry. This token is mandatory.
4. TokD is referred to as the Remarks Entry. This token is optional.
- If TokA is provided, then we either parse TokA as a possible comment or as a label (Name Entry), TokB as the Operation Entry, and so on.
- If TokA is not provided (i.e. we have one or more spaces and then the first token), then we will parse the first token (i.e. TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction, and TokD as a possible Remark field.
- For TokC (the Operand Entry), no spaces are allowed between operand entries. If a space occurs, it is classified as an error.
- TokD if provided is taken as is, and emitted as a comment.
The following additional approach was examined, but not taken:
- Adding custom private only functions to base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of non-use to every other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would also have the same disadvantage.
Testing:
- This patch doesn't have tests added with it, for the sole reason that MCStreamer support and object file support haven't been added for the z/OS target (yet). Hence, it's not possible to generate code outright for the z/OS target. They are in the process of being committed / being worked on.
- Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated.
Reviewed By: Kai, uweigand
Differential Revision: https://reviews.llvm.org/D98276
Intrinsics reading or writing the FFR register need to model the fact that
there is additional state being read/written.
Model this state as inaccessible memory.
* setffr => write inaccessiblememonly
* rdffr => read inaccessiblememonly
* ldff* => read arg memory, write inaccessiblemem
* ldnf => read arg memory, write inaccessiblemem
There doesn't seem to be a need to support recursive locking,
and a recursive mutex is unnecessarily inefficient.
Differential Revision: https://reviews.llvm.org/D102486
This patch adds a new option to the LoopVectorizer to control how
scalable vectors can be used.
Initially, this suggests three levels to control scalable
vectorization, although other more aggressive options can be added in
the future.
The possible options are:
- Disabled: Disables vectorization with scalable vectors.
- Enabled: Vectorize loops using scalable vectors or fixed-width
vectors, but favors fixed-width vectors when the cost
is a tie.
- Preferred: Like 'Enabled', but favoring scalable vectors when the
cost-model is inconclusive.
Reviewed By: paulwalker-arm, vkmr
Differential Revision: https://reviews.llvm.org/D101945
To bring D99599's implementation in line with the existing
PrintPassInstrumentation, and to fix a FIXME, add more customizability
to PrintPassInstrumentation.
Introduce three new options. The first takes over the existing
"-debug-pass-manager-verbose" cl::opt.
The second and third option are specific to -fdebug-pass-structure. They
allow indentation, and also don't print analysis queries.
To avoid more golden file tests than necessary, prune down the
-fdebug-pass-structure tests.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102196
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass ensure that the operands of the ADD
instruction have the expected order (the dest register of the LEA should be the src
register of the ADD).
Differential Revision: https://reviews.llvm.org/D101970
This patch implements first part of Flow Sensitive SampleFDO (FSAFDO).
It has the following changes:
(1) disable current discriminator encoding scheme,
(2) new hierarchical discriminator for FSAFDO.
For this patch, option "-enable-fs-discriminator=true" turns on the new
functionality. Option "-enable-fs-discriminator=false" (the default)
keeps the current SampleFDO behavior. When the fs-discriminator is
enabled, we insert a flag variable, namely, llvm_fs_discriminator, to
the object. This symbol will be checked by the create_llvm_prof tool and used
to generate a profile with FS-AFDO discriminators enabled. If this
happens, for an extbinary format profile, create_llvm_prof tool
will add a flag to profile summary section.
Differential Revision: https://reviews.llvm.org/D102246
In many cases it is helpful to know at what address the resolved function starts.
This patch adds a new StartAddress member to the DILineInfo structure.
Reviewed By: jhenderson, dblaikie
Differential Revision: https://reviews.llvm.org/D102316
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().
A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.
The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.
This is a reland after an MSan fix in D102667.
Differential Revision: https://reviews.llvm.org/D101713
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.
SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102032
Don't check that types match when the pointer operand is an opaque
pointer.
I would separate the Assembler and Verifier changes, but
verify-uselistorder in the Assembler test ends up running the verifier.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102450
Handle PDB writing errors like any other error in LLD: emit an error and
continue. This allows the linker to print timing data and summary data
after linking, which can be helpful for finding PDB size problems. Also
report how large the file would have been.
Example output:
lld-link: error: Output data is larger than 4 GiB. File size would have been 6,937,108,480
lld-link: error: failed to write PDB file ./chrome.dll.pdb
Summary
--------------------------------------------------------------------------------
33282 Input OBJ files (expanded from all cmd-line inputs)
4 PDB type server dependencies
0 Precomp OBJ dependencies
33396931 Input type records
... snip ...
Input File Reading: 59756 ms ( 45.5%)
GC: 7500 ms ( 5.7%)
ICF: 3336 ms ( 2.5%)
Code Layout: 6329 ms ( 4.8%)
PDB Emission (Cumulative): 46192 ms ( 35.2%)
Add Objects: 27609 ms ( 21.0%)
Type Merging: 16740 ms ( 12.8%)
Symbol Merging: 10761 ms ( 8.2%)
Publics Stream Layout: 9383 ms ( 7.1%)
TPI Stream Layout: 1678 ms ( 1.3%)
Commit to Disk: 3461 ms ( 2.6%)
--------------------------------------------------
Total Link Time: 131244 ms (100.0%)
Differential Revision: https://reviews.llvm.org/D102713
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds to MemoryManager a new member
function allowStubAllocation to control whether this MemoryManager
supports increasing code size with stubs or not. Since BOLT can
rewrite some files in-place, it needs to avoid stub insertion done
by the linker. This patch also introduces allowsZeroSymbols to the
JITSymbolResolver class, enabling us to finish a link successfully
even when some symbols resolve to the value zero. When rewriting a
binary, sometimes we do need to resolve a target to zero in case
the input binary calls address zero and we want to be bug
compatible. We also expose reassignSectionAddress as it is used by
BOLT.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97898
Use existing KnownBits helpers from KnownBits.h to simplify G_ICMPs.
E.g.
x == x -> true
x != x -> false
load(x) > 1 -> true (when the load is known to be greater than 1)
And so on.
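For illustration, a sketch of the kind of query such a combine can make with the existing helpers (combiner plumbing omitted; the wrapper name is hypothetical):
```
#include "llvm/ADT/Optional.h"
#include "llvm/Support/KnownBits.h"
using namespace llvm;

// If the known bits of both G_ICMP operands already decide the predicate,
// the combine can fold the compare to a constant; None means "not provable".
// For example, a load known to be at least 2 compared against 1 with ugt
// folds to true.
Optional<bool> tryFoldICmpUgt(const KnownBits &LHS, const KnownBits &RHS) {
  return KnownBits::ugt(LHS, RHS);
}
```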
Differential Revision: https://reviews.llvm.org/D102542
A long time ago LLDB wanted to start using StringRef instead of
C-Strings/ConstString but was blocked by the StringRef(const char *) ctor
asserting that the C-string isn't a nullptr. To workaround this, D24697
introduced a special function called withNullAsEmpty and that's what LLDB (and
only LLDB) started to use to build StringRefs from C-strings.
A bit later it seems that withNullAsEmpty was declared too awkward to use and
instead the assert in the StringRef constructor got removed (see D24904). The
rest of LLDB was then converted to StringRef by just calling the now perfectly
usable implicit constructor.
However, it seems that the original approach with withNullAsEmpty was never
touched again since then and now just exists as a function in StringRef that
is only used in a few places in LLDB.
I removed the few uses of withNullAsEmpty in D102597 and this patch removes
the function itself. Calling the implicit StringRef(const char *) constructor
is the preferred way of doing this today.
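A small before/after sketch of the two spellings:
```
#include "llvm/ADT/StringRef.h"
using namespace llvm;

StringRef fromCString(const char *Str) {
  // Old spelling, removed by this patch:
  //   return StringRef::withNullAsEmpty(Str);
  // Preferred spelling: the implicit constructor handles nullptr and
  // yields an empty StringRef, since D24904 removed the assert.
  return StringRef(Str);
}
```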
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D102599
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.
This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
This is the first step of this project; only X86_64 target is enabled in this patch.
Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE: Without -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow
three rules:
* First, no exception can move in or out of a _try region, i.e., no potentially
faulting instruction can be moved across a _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try
must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read
outside of _try region must be updated in memory (not just in register)
before the subsequent exception occurs.
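A hypothetical example illustrating the third rule (written as C++ with MSVC SEH extensions, compiled with clang-cl -EHa; the function and values are made up):
```
#include <windows.h>
#include <cstdio>

int State = 0;

// State must be committed to memory before the potentially faulting load,
// so the handler is guaranteed to observe the updated value.
int readThrough(int *P) {
  __try {
    State = 1;   // must hit memory before the next instruction can fault
    return *P;   // may raise a hardware access violation
  } __except (EXCEPTION_EXECUTE_HANDLER) {
    std::printf("faulted, State=%d\n", State); // prints State=1 under -EHa
    return -1;
  }
}
```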
The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on the C++
side. When a C++ function (in the same compilation unit with option -EHa) is
called by a SEH C function, a hardware exception that occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as for a C++ synchronous exception during the
unwinding process.
Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception paths are modeled in the flow graph precisely. However, tracking every
single memory instruction and potentially faulting instruction can create many
Invokes, complicate the flow graph, and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.
This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing
SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block inherits different states from its predecessors, the lowest
State wins. If some exits flow to unreachable, propagation on those
paths terminates without affecting the remaining blocks.
For CPP code, an object's lifetime region is usually a SEME, like a SEH _try.
However there is one rare exception: jumping into a lifetime scope that has a
dtor but no ctor is warned about, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject an eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation described below.
Two intrinsics are created to track CPP object scopes: eha_scope_begin() and eha_scope_end().
_scope_begin() is immediately added after ctor() is called and EHStack is pushed.
So it must be an invoke, not a call. This also guarantees that an
EH-cleanup-pad is created regardless of whether there is a call in this scope.
_scope_end is added before dtor(). These two intrinsics make the computation of
Block-State possible in downstream code gen pass, even in the presence of
ctor/dtor inlining.
Two intrinsics, seh_try_begin() and seh_try_end(), are added for C code to mark
the _try boundary and to prevent exceptions from being moved across it.
All memory instructions inside a _try are considered 'volatile' to assure the
2nd and 3rd rules for C code above. This is slightly suboptimal, but it's
acceptable as the amount of code directly under a _try is very small.
Part-2 (will be in Part-2 patch): LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
relies on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with a potential trap instruction
is marked and reported in the DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is handled.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(Note: the purpose of that is to ensure the return address of a call is in the
same scope as the call address.)
The handler for catch(...) under -EHa must handle HW exceptions, so its
'adjective' flag is reset (it cannot be IsStdDotDot (0x40), which only catches
C++ exceptions).
Suppress the push/popTerminate() scope (from noexcept/nothrow) so that HW
exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D80344/new/
Similar versions of these already exist; this effectively just factors them
out into STLExtras. I plan to use these in future patches.
Differential Revision: https://reviews.llvm.org/D100672
Follow up to D88631 but for aarch64; the Linux kernel uses the command
line flags:
1. -mstack-protector-guard=sysreg
2. -mstack-protector-guard-reg=sp_el0
3. -mstack-protector-guard-offset=0
to use the system register sp_el0 for the stack canary, enabling the
kernel to have a unique stack canary per task (like a thread, but not
limited to userspace as the kernel can preempt itself).
Address pr/47341 for aarch64.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/289
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: xiangzhangllvm, DavidSpickett, dmgreen
Differential Revision: https://reviews.llvm.org/D100919
This patch contains the bare minimum to run the new Pass Manager from the LLVM-C APIs. It does not feature PGOOptions, PassPlugins or Debugify in its current state. Bugzilla: PR48499
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102136
Adding lowering support for bitreverse.
Previously, lowering bitreverse would expand it into a series of other instructions. This patch makes it so this produces a single rbit instruction instead.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D102397
This fixes https://bugs.llvm.org/show_bug.cgi?id=50370,
which reports a yet another endless combine loop,
this one regressed from 554b1bced3,
which fixed yet another endless combine loop (PR50308)
This code had fallen into the very typical pitfall of forgetting
that constant expressions exist, and they aren't free to invert,
because the `not` won't be absorbed by the "constant",
but will remain a (constant) expression...
Swift's new concurrency features are going to require guaranteed tail calls so
that they don't consume excessive amounts of stack space. This would normally
mean "tailcc", but there are also Swift-specific ABI desires that don't
naturally go along with "tailcc" so this adds another calling convention that's
the combination of "swiftcc" and "tailcc".
Support is added for AArch64 and X86 for now.
With prelink inlining, pseudo probes with same ID can come from different inline contexts. Such probes should not share samples and their factors should be fixed up separately.
I'm seeing 0.3% speedup for SPEC2017 overall. Benchmark 631.deepsjeng_s benefits the most, about 4%.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D102429
ScheduleDAGFast.cpp is compiled into an object file, but that object file
isn't linked into the clang executable because no symbol in it is
referenced from outside. Add calls to the createXxx functions of
ScheduleDAGFast.cpp so that the object file is linked into the clang
executable. The static RegisterScheduler will then register the 'fast'
and 'linearize' schedulers at clang startup.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D101601
This adds a simple fold into codegenprepare that converts comparison of
branches towards comparison with zero if possible. For example:
%c = icmp ult %x, 8
br %c, bla, blb
%tc = lshr %x, 3
becomes
%tc = lshr %x, 3
%c = icmp eq %tc, 0
br %c, bla, blb
As a first order approximation, this can reduce the number of
instructions needed to perform the branch as the shift is (often) needed
anyway. At the moment this does not affect very much, as llvm tends to
prefer the opposite form. But it can protect against regressions from
commits like rG9423f78240a2.
Simple cases of Add and Sub are added along with Shift, equally as the
comparison to zero can often be folded with cpsr flags.
Differential Revision: https://reviews.llvm.org/D101778
This patch introduces source loading and pruning functions.
It will allow using the DWARF embedded source and sharing the same code for JSON printout.
No functional changes.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102539
This patch adds support for GCC's -fstack-usage flag. With this flag, a stack
usage file (i.e., .su file) is generated for each input source file. The format
of the stack usage file is also similar to what is used by GCC. For each
function defined in the source file, a line with the following information is
produced in the .su file.
<source_file>:<line_number>:<function_name> <size_in_byte> <static/dynamic>
"Static" means that the function's frame size is static and the size info is an
accurate reflection of the frame size. While "dynamic" means the function's
frame size can only be determined at run-time because the function manipulates
the stack dynamically (e.g., due to variable size objects). The size info only
reflects the size of the fixed size frame objects in this case and therefore is
not a reliable measure of the total frame size.
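As an illustration (the file name, line numbers, and sizes in the comments are hypothetical), the two kinds of entries could look like this:
```
// Hypothetical .su entries for the two reporting modes described above.
int fixed_frame() {            // foo.cpp:2:fixed_frame 32 static
  int Buf[4] = {0, 1, 2, 3};   // fixed-size frame objects only
  return Buf[3];
}

int dynamic_frame(int N) {     // foo.cpp:7:dynamic_frame 16 dynamic
  char *P = static_cast<char *>(__builtin_alloca(N)); // variable-size object
  P[0] = 1;
  return P[0];
}
```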
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100509
These checks are not specific to the instruction based variant of
isPotentiallyReachable(), they are equally valid for the basic
block based variant. Move them there, to make sure that switching
between the instruction and basic block variants cannot introduce
regressions.
Adds the ability to pass MCRegisterInfo to dump_pretty and to the print functions,
so that, if present, target-specific enum names are printed instead of enum values.
GlobalVariables are Constants, yet should not unconditionally be
considered true for __builtin_constant_p.
Via the LangRef
https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic:
This intrinsic generates no code. If its argument is known to be a
manifest compile-time constant value, then the intrinsic will be
converted to a constant true value. Otherwise, it will be converted
to a constant false value.
In particular, note that if the argument is a constant expression
which refers to a global (the address of which _is_ a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.
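A small example of the distinction, following the LangRef text above (the function names are illustrative):
```
int G;

bool literal_is_constant() {
  return __builtin_constant_p(42);  // manifest constant: folds to true
}

bool global_address_is_constant() {
  // &G is a Constant in IR, but its value is only known at link time,
  // so this now folds to false instead of true.
  return __builtin_constant_p(&G);
}
```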
Move isManifestConstant from ConstantFolding to be a method of
Constant so that we can reuse the same logic in
LowerConstantIntrinsics.
pr/41459
Reviewed By: rsmith, george.burgess.iv
Differential Revision: https://reviews.llvm.org/D102367
Previously, we already used BatchAA for individual simple pointer
dependency queries. This extends BatchAA usage for the non-local
case, so that only one BatchAA instance is used for all blocks,
instead of one instance per block.
Use of BatchAA is safe as IR cannot be modified during a MemDep
query.
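A sketch of the usage pattern this extends (the wrapper function is illustrative):
```
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/MemoryLocation.h"
using namespace llvm;

// One BatchAAResults instance can now serve an entire non-local MemDep
// query: results may be cached across blocks because the IR is known not
// to change for the duration of the query.
AliasResult queryPair(AAResults &AA, const MemoryLocation &LocA,
                      const MemoryLocation &LocB) {
  BatchAAResults BatchAA(AA);
  return BatchAA.alias(LocA, LocB);
}
```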
This is not expected to have any practical compile-time effect,
as the alias() calls inside callCapturesBefore() are rare. This
should still be supported for API completeness, and might be
useful for reachability caching.
This extends any frame record created in the function to include that
parameter, passed in X22.
The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:
* 0b0000 => a normal, non-extended record with just [FP, LR]
* 0b0001 => the extended record [X22, FP, LR]
* 0b1111 => kernel space, and a non-extended record.
All other values are currently reserved.
If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.
There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
This is separate from (but builds on) the support added in ec6b71df70 for
emitting LinkGraphs in the context of an active materialization. This commit
makes LinkGraphs a first-class data structure with features equivalent to
object files within ObjectLinkingLayer.
The opaque pointer type is essentially just a normal pointer type with a
null pointee type.
This also adds support for the opaque pointer type to the bitcode
reader/writer, as well as to textual IR.
To avoid confusion with existing pointer types, we disallow creating a
pointer to an opaque pointer.
Opaque pointer types should not be widely used at this point since many
parts of LLVM still do not support them. The next steps are to add some
very simple use cases of opaque pointers to make sure they work, then
start pretending that all pointers are opaque pointers and see what
breaks.
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150359.html
Reviewed By: dblaikie, dexonsmith, pcc
Differential Revision: https://reviews.llvm.org/D101704
I've taken the following steps to add unwinding support from inline assembly:
1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:
```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
to label %exit unwind label %uexit
```
2.) Add Bitcode writing/reading support + LLVM-IR parsing.
3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.
4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.
5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.
6.) Don't allow unwinding callbr.
Reviewed By: Amanieu
Differential Revision: https://reviews.llvm.org/D95745
We want it to be available in analyses so that we can use the
CodeGen notion in middle-end passes (for example, to check if
a GC may free some particular pointer).
This is a preparatory patch that simply moves the files around.
Note: if this causes some build issues, this patch must just be reverted.
Differential Revision: https://reviews.llvm.org/D100557
Reviewed By: reames
The transferDefinedSymbol operation updates a Symbol's target block, offset,
and size. This can be convenient when you want to redefine the content of some
symbol(s) pointing at a block, while retaining the original block in the graph.
Add new type of tree node for `InsertElementInst` chain forming vector.
These instructions could be either removed, or replaced by shuffles during
vectorization and we can add this node to cost model, so naturally estimating
their cost, getting rid of `CompensateCost` tricks and reducing further work
for InstCombine. This fixes PR40522 and PR35732 in a natural way. Also this
patch is the first step towards revectorization of partially vectorization
(to fix PR42022 completely). After adding inserts to tree the next step is
to add vector instructions there (for instance, to merge `store <2 x float>`
and `store <2 x float>` to `store <4 x float>`).
Fixes PR40522 and PR35732.
Differential Revision: https://reviews.llvm.org/D98714
getVectorNumElements() returns a value for scalable vectors
without any warning so it is effectively getVectorMinNumElements().
By renaming it and making getVectorNumElements() forward to
it, we can insert a check for scalable vectors into getVectorNumElements()
similar to EVT. I didn't do that in this patch because there are still more
fixes needed, but I was able to temporarily do it and passed the RISCV
lit tests with these changes.
The changes to isPow2VectorType and getPow2VectorType are copied from EVT.
The change to TypeInfer::EnforceSameNumElts reduces the size of AArch64's isel table.
We're now considering SameNumElts to require the scalable property to match which
removes some unneeded type checks.
This was motivated by the bug I fixed yesterday in 80b9510806
Reviewed By: frasercrmck, sdesmalen
Differential Revision: https://reviews.llvm.org/D102262
The bug (PR50227, affecting COFF) that caused the revert in
6f5670a4c3 has been fixed in
382c505d9c now, so it should be safe
to reenable the pass for that target (and ELF).
In PR50227 it's also mentioned that the same pass seems to cause
problems on aarch64 on darwin, so leaving it disabled there for now.
Previous crashes caused by this patch were the result of machine
subregisters being incorrectly handled in updateDbgUsersToReg; this has
been fixed by using RegUnits to determine overlapping registers, instead
of using the register values directly.
Differential Revision: https://reviews.llvm.org/D101523
This reverts commit 7ca26c5fa2.
Currently the ValueHandler handles both selecting the type and
location for arguments, as well as inserting instructions needed to
handle them. Split this so that the determination of the argument
handling is independent of the function state. Currently the checks
for tail call compatibility do not follow the full assignment logic,
so it misses cases where arguments require nontrivial legalization.
This should help avoid targets ending up in a buggy state where the
argument evaluation may change in different contexts.
We can handle the distinction easily enough in the generic code, and
this makes it easier to abstract the selection of type/location from
the code to insert code.
When making compilation relocatable, for example in distributed
compilation scenarios, we want to set compilation dir to a relative
value like `.` but this presents a problem when generating reports
because if the file path is relative as well, for example `..`, you
may end up writing files outside of the output directory.
This change introduces a flag that allows overriding the compilation
directory that's stored inside the profile with a different value that
is absolute.
Differential Revision: https://reviews.llvm.org/D100232
These can be used to create eh-frame section fixing passes outside the usual
linker pipeline, which can be useful for tests and tools that just want to
verify or dump graphs.
This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D96883
This change was originally landed in: 5000a1b4b9
It was reverted in: 061e071d8c
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker, this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker, merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections,
because they are not represented by data segments (they are custom
sections).
Differential Revision: https://reviews.llvm.org/D97657
This patch adds support for Darwin's libsystem math vector functions to
TLI. Darwin's libsystem provides a range of vector functions for libm
functions.
This initial patch only adds the 2 x double and 4 x float versions,
which are available on both X86 and ARM64. On X86, wider vector versions
are supported as well.
Reviewed By: jroelofs
Differential Revision: https://reviews.llvm.org/D101856
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().
A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.
The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.
Differential Revision: https://reviews.llvm.org/D101713
This is better no-functional-change-intended than the 1st attempt.
As noted in D102002, there were at least 2 diffs that went
unchecked in pass manager regressions tests: different pass
parameters (SimplifyCFG) and an extension point/callback.
Those should be lifted from the original code blocks correctly
now.
This reverts commit fefcb1f878.
It was supposed to be NFC, but as noted in the post-commit
comments in D102002, that was not true: SimplifyCFG uses
different parameters and there's a difference in an
extension point / callback.
A ConstantAggregateZero may be created from a scalable vector type.
However, it still assumed fixed number of elements when queried for
them. This patch changes ConstantAggregateZero to correctly report its
element count.
This change fixes a couple of issues. Firstly, it fixes a crash in
Constant::getUniqueValue when called on a scalable-vector
zeroinitializer constant.
Secondly, it fixes a latent bug in GlobalISel's IRTranslator in which
translating a scalable-vector zeroinitializer would hit the assertion in
ConstantAggregateZero::getNumElements when casting to a FixedVectorType,
rather than reporting an error more gracefully. This is currently
hypothetical as the IRTranslator has deeper issues preventing the use of
scalable vector types.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102082
When using the parallel loop construct, the OpenMP specification allows for
guided, auto and runtime as scheduling variants (as well as static and
dynamic which are already supported).
This adds the translation from MLIR to LLVM-IR for these scheduling
variants.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101435
Generalizing this API allows work to be distributed more evenly. In particular,
query callbacks can now be dispatched (rather than running immediately on the
thread that satisfied the query). This avoids the pathological case where an
operation on one thread satisfies many queries simultaneously, causing large
amounts of work to be run on that thread while other threads potentially sit
idle.
This patch lifts the restriction on the number of read/write registers for a
move elimination candidate. With this patch, move elimination candidates with
exactly two reads and two writes are treated like register swap operations for
the purpose of move elimination.
This patch currently doesn't affect any upstream model. However, it should help
unblock the progress on PR50258.
Printing pass manager invocations is fairly verbose and not super
useful.
This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.
This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
We never bothered to have a separate set of combines for -O0 in the prelegalizer
before. This results in some minor performance hits for a mode where performance
isn't a concern (although not regressing code size significantly is still preferable).
This also removes the CSE option since we don't need it for -O0.
Through experiments, I've arrived at a set of combines that gets the most code
size improvement at -O0, while reducing the amount of time spent in the combiner
by around 35% give or take.
Differential Revision: https://reviews.llvm.org/D102038
We're trying to move DebugLogging into instrumentation, rather than
being part of PassManagers/AnalysisManagers.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D102093
induction variable to be perfect
This patch allows more conditional branches to be considered as loop
guards, and so more loop nests can be considered perfect.
Reviewed By: bmahjour, sidbav
Differential Revision: https://reviews.llvm.org/D94717
Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts.
Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication.
I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to).
NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly.
Differential Revision: https://reviews.llvm.org/D101987
This patch modifies updateDbgUsersToReg to properly handle
DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices
(i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and
updating the register for all matching operands.
Differential Revision: https://reviews.llvm.org/D101523
Serialize ScavengeFI from SIMachineFunctionInfo into yaml.
ScavengeFI is not used outside of the PrologEpilogInserter,
so this shouldn't change anything.
Differential Revision: https://reviews.llvm.org/D101367
The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].
This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since.
The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.
[1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions | compile time tracker ]] run with the option enabled.
Differential Revision: https://reviews.llvm.org/D98103
Adds support for scalable vectorization of loops containing first-order recurrences, e.g:
```
for(int i = 0; i < n; i++)
b[i] = a[i] + a[i - 1]
```
This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into
account when inserting into and extracting from the last lane of a vector.
CreateVectorSplice has been added to construct a vector for the recurrence, which
returns a splice intrinsic for scalable types. For fixed-width the behaviour
remains unchanged as CreateVectorSplice will return a shufflevector instead.
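A sketch of the builder call described above; the wrapper and value names are illustrative:
```
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Concatenates the last lane of the previous iteration's vector with all
// but the last lane of the current one, giving a[i - 1] per lane. For
// scalable vectors this emits the vector splice intrinsic; for fixed-width
// vectors it returns a shufflevector.
Value *buildRecurrence(IRBuilder<> &B, Value *Prev, Value *Cur) {
  return B.CreateVectorSplice(Prev, Cur, /*Imm=*/-1, "recur");
}
```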
The tests included here are the same as test/Transform/LoopVectorize/first-order-recurrence.ll
Reviewed By: david-arm, fhahn
Differential Revision: https://reviews.llvm.org/D101076
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and definitions" as part of OpenMP parsing; this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnostics for restrictions specified in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
This can be useful for clients constructing custom JIT stacks: If the C API
for your custom stack exposes API to obtain a reference to an object layer
(e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added
LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT
functions can be used to add objects directly to that layer.
This change enables emitting CFI unwind information for debugging purpose
for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None.
Currently generating CFI unwind information is entangled with supporting
the exceptions, even when AsmPrinter explicitly recognizes that the unwind
tables are being generated as debug information.
In fact, the unwind information is not generated even if we specify
--force-dwarf-frame-section, unless exceptions are enabled. The LIT test
llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior.
Enable this option for AMDGPU to prepare for future patches which add
complete CFI support.
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D78778
Unfortunately the current call lowering code is built on top of the
legacy MVT/DAG based code. However, GlobalISel was not using it the
same way. In short, the DAG passes legalized types to the assignment
function, and GlobalISel was passing the original raw type if it was
simple.
I do believe the DAG lowering is conceptually broken since it requires
picking a type up front before knowing how/where the value will be
passed. This ends up being a problem for AArch64, which wants to pass
i1/i8/i16 values as a different size if passed on the stack or in
registers.
The argument type decision is split across 3 different places which is
hard to follow. SelectionDAG builder uses
getRegisterTypeForCallingConv to pick a legal type, tablegen gives the
illusion of controlling the type, and the target may have additional
hacks in the C++ part of the call lowering. AArch64 hacks around this
by not using the standard AnalyzeFormalArguments and special casing
i1/i8/i16 by looking at the underlying type of the original IR
argument.
I believe people have generally assumed the calling convention code is
processing the original types, and I've discovered a number of dead
paths in several targets.
x86 actually relies on the opposite behavior from AArch64, and relies
on x86_32 and x86_64 sharing calling convention code where the 64-bit
cases implicitly do not work on x86_32 due to using the pre-legalized
types.
AMDGPU targets without legal i16/f16 have always used a broken ABI
that promotes to i32/f32. GlobalISel accidentally fixed this to be the
ABI we should have, but this fixes it so we're using the worse ABI
that is compatible with the DAG. Ideally we would fix the DAG to match
the old GlobalISel behavior, but I don't wish to fight that battle.
A new native GlobalISel call lowering framework should let the target
process the incoming types directly.
CCValAssigns select a "ValVT" and "LocVT" but the meanings of these
aren't entirely clear. Different targets don't use them consistently,
even within their own call lowering code. My current belief is the
intent was "ValVT" is supposed to be the legalized value type to use
in the end, and LocVT was supposed to be the ABI passed type
(which is also legalized).
With the default CCState::Analyze functions always passing the same
type for these arguments, these only differ when the TableGen part of
the lowering decide to promote the type from one legal type to
another. AArch64's i1/i8/i16 hack ends up inverting the meanings of
these values, so I had to add an additional hack to let the target
interpret how large the argument memory is.
Since targets don't consistently interpret ValVT and LocVT, this
doesn't produce quite equivalent code to the initial DAG
lowerings. I've opted to consistently interpret LocVT as the in-memory
size for stack passed values, and ValVT as the register type to assign
from that memory. We therefore produce extending loads directly out of
the IRTranslator, whereas the DAG would emit regular loads of smaller
values. This will also produce loads/stores that are wider than the
argument value if the allocated stack slot is larger (and there will
be undef padding bytes). If we had the optimizations to reduce
load/stores based on truncated values, this wouldn't produce a
different end result.
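Restated as data under the interpretation adopted here (the concrete MVTs are illustrative, not tied to a particular target):
```
#include "llvm/Support/MachineValueType.h"
using namespace llvm;

// For a stack-passed argument, LocVT describes the in-memory slot and
// ValVT the register type assigned from it; when ValVT is wider, an
// extending load is emitted directly out of the IRTranslator.
struct StackArg {
  MVT LocVT; // in-memory size of the stack slot, e.g. MVT::i16
  MVT ValVT; // register type to assign from that memory, e.g. MVT::i32
};

inline bool needsExtendingLoad(const StackArg &A) {
  return A.ValVT.getSizeInBits().getFixedSize() >
         A.LocVT.getSizeInBits().getFixedSize();
}
```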
Since ValVT/LocVT are more consistently interpreted, we now will emit
more G_BITCASTS as requested by the CCAssignFn. For example AArch64
was directly assigning types to some physical vector registers which
according to the tablegen spec should have been casted to a vector
with a different element type.
This also moves the responsibility for inserting
G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the
generic code, which is closer to how SelectionDAGBuilder works.
I had to xfail an x86 test since I don't see a quick way to fix it
right now (I filed bug 50035 for this). It's broken independently of
this change, and only triggers since now we end up with more ands
which hit the improperly handled selection pattern.
I also observed that FP arguments that need promotion (e.g. f16 passed
as f32) are broken, and use regular G_TRUNC and G_ANYEXT.
TLDR; the current call lowering infrastructure is bad and nobody has
ever understood how it chooses types.
This untangles the MCContext and the MCObjectFileInfo. There is a circular
dependency between MCContext and MCObjectFileInfo. Currently this dependency
also exists during construction: You can't construct a MOFI without a MCContext
without constructing the MCContext with a dummy version of that MOFI first.
This removes this dependency during construction. In a perfect world,
MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the
MCContext, like other MC information. This is future work.
This also shifts/adds more information to the MCContext making it more
available to the different targets. Namely:
- TargetTriple
- ObjectFileType
- SubtargetInfo
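A rough sketch of the resulting construction order (argument lists abbreviated and approximate, not the exact post-patch signatures):
```
#include "llvm/ADT/Triple.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCObjectFileInfo.h"
using namespace llvm;

// The context is constructed first, with the triple and subtarget info it
// now carries; the MOFI is then initialized from the context and registered
// with it, so no dummy MOFI is needed up front.
void buildMC(const Triple &TT, const MCAsmInfo *MAI,
             const MCRegisterInfo *MRI, const MCSubtargetInfo *STI) {
  MCContext Ctx(TT, MAI, MRI, STI);
  MCObjectFileInfo MOFI;
  MOFI.initMCObjectFileInfo(Ctx, /*PIC=*/false);
  Ctx.setObjectFileInfo(&MOFI);
}
```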
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101462
- As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information)
- This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set
- This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings`
Reviewed By: abhina.sreeskantharajan, Kai
Differential Revision: https://reviews.llvm.org/D101660
This reverts commit 57b259a852.
The relative lookup table converter pass seems to cause problems
for chromium on Windows/ARM64, see https://crbug.com/1204788.
This patch adds the two MVTs to fix a legalizer crash when using vector
shuffles of <256 x i16> and <128 x i16> on RISC-V. The legalizer can't
promote the operand of `v256i32 = any_extend_vector_inreg v128i16`.
Reviewed By: craig.topper, RKSimon
Differential Revision: https://reviews.llvm.org/D101769
Root node in ProfiledCallGraph.
In ProfiledCallGraph::addProfiledFunction, to add a function symbol into the
ProfiledCallGraph, currently an uninitialized ProfiledCallGraphNode node is
created by ProfiledFunctions[Name] and inserted into Callees set of Root node
before the node is initialized. The Callees set use
ProfiledCallGraphNodeComparer as its comparator so the uninitialized
ProfiledCallGraphNode may fail to be inserted into Callees set if it happens
to contain a name in memory which has been inserted into the Callees set
before. This problem can prevent some function symbols from being annotated
with profiles and cause a performance regression. The patch fixes the problem.
Differential Revision: https://reviews.llvm.org/D101815
This reverts the revert 02c5ba8679
Fix:
Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass
functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULE=1
builds.
Original commit:
This patch implements expansion of llvm.vp.* intrinsics
(https://llvm.org/docs/LangRef.html#vector-predication-intrinsics).
VP expansion is required for targets that do not implement VP code
generation. Since expansion is controllable with TTI, targets can switch
on the VP intrinsics they do support in their backend offering a smooth
transition strategy for VP code generation (VE, RISC-V V, ARM SVE,
AVX512, ..).
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D78203
As pointed out in D101726, this function already exists in MathExtras.
It uses different types, but with the values used here I believe that
should not make a functional difference.
Add a demangling support for a small subset of a new Rust mangling
scheme, with complete support planned as a follow up work.
Integrate Rust demangling into llvm-cxxfilt and use llvm-cxxfilt for
end-to-end testing. The new Rust mangling scheme uses "_R" as a prefix,
which makes it easy to disambiguate it from other mangling schemes.
The public API is modeled after __cxa_demangle / llvm::itaniumDemangle,
since potential candidates for further integration use those.
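A usage sketch under that assumption; the exact signature is inferred from the itaniumDemangle model, and the sample symbol is a Rust RFC 2603 test vector:
```
#include "llvm/Demangle/Demangle.h"
#include <cstdio>
#include <cstdlib>

void demo() {
  int Status = 0;
  // The "_R" prefix unambiguously selects the Rust scheme. The sample
  // symbol demangles to "123foo::bar".
  char *Demangled =
      llvm::rustDemangle("_RNvC6_123foo3bar", nullptr, nullptr, &Status);
  if (Demangled) {
    std::printf("%s\n", Demangled);
    std::free(Demangled); // caller owns the returned buffer
  }
}
```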
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D101444
GlobalsAA is only created at the beginning of the inliner pipeline. If
an AAManager is cached from previous passes, it won't get rebuilt to
include the newly created GlobalsAA.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D101379
Add function to create the offload_maptypes and the offload_mapnames globals. These two functions
are used in clang. They will be used in the Flang/MLIR lowering as well.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D101503
- This patch attempts to implement the location counter syntax (*) for the HLASM variant for PC-relative instructions.
- In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " *" which is parsed as "<pc-rel-insn 0>"
- For combinations of absolute values and relocatable values, we don't expect the "*" preceding the token.
When you have a " * " what’s accepted is:
```
*<space>.*{.*} -> <pc-rel-insn> 0
*[+|-][constant-value] -> <pc-rel-insn> [+|-]constant-value
```
When you don’t have a " * " what’s accepted is:
```
brasl 1,func is allowed (MCSymbolRef type)
brasl 1,func+4 is allowed (MCBinary type)
brasl 1,4+func is allowed (MCBinary type)
brasl 1,-4+func is allowed (MCBinary type)
brasl 1,func-4 is allowed (MCBinary type)
brasl 1,*func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func+4 is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+4+func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*-4+8+func is not allowed (* cannot be used for non-MCConstantExprs)
```
Reviewed By: Kai
Differential Revision: https://reviews.llvm.org/D100987
The comment about how to make use of debugger tuning within DwarfDebug
really belongs inside the DwarfDebug declaration, where it will be
easier to find.
This avoids the non-trivial overhead of creating a TaskGroup in these degenerate
cases, but also exposes parallelism. It turns out that the default executor
underlying TaskGroup prevents recursive parallelism - so an instance of a task
group being alive will make nested ones become serial.
This is a big issue in MLIR in some dialects, if they have a single instance of
an outer op (e.g. a firrtl.circuit) that has many parallel ops within it (e.g.
a firrtl.module). This patch side-steps the problem by avoiding creating the
TaskGroup in the unneeded case. See this issue for more details:
https://github.com/llvm/circt/issues/993
Note that this isn't a really great solution for the general case of nested
parallelism. A redesign of the TaskGroup stuff would be better, but would be
a much more invasive change.
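A sketch of the side-step (the helper name is hypothetical):
```
#include <functional>
#include <vector>

// Degenerate inputs run inline, avoiding both the TaskGroup's setup cost
// and the nested-serialization behavior described above.
void parallelForEachSketch(std::vector<std::function<void()>> &Tasks) {
  if (Tasks.size() <= 1) {
    for (auto &T : Tasks)
      T();
    return;
  }
  // ... otherwise create a TaskGroup and dispatch Tasks to it as before ...
}
```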
Differential Revision: https://reviews.llvm.org/D101699
This patch adds the basic functions needed for controlling auto conversion on z/OS.
Auto conversion to ASCII is enabled on untagged input files by making the assumption that all untagged files are EBCDIC encoded. Output files are auto converted to EBCDIC IBM-1047.
This change also enables conversion for stdin/stdout/stderr.
For more information on how fcntl controls the codepage, see https://www.ibm.com/docs/en/zos/2.4.0?topic=descriptions-fcntl-bpx1fct-bpx4fct-control-open-file-descriptors
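A hedged sketch of that mechanism; the structure fields and constants should be checked against the IBM documentation linked above:
```
#include <fcntl.h>

// Tags a file descriptor for automatic codepage conversion on z/OS.
// SETCVTON, F_CONTROL_CVT, and the f_cnvrt fields are taken from the IBM
// docs; 1047 is the EBCDIC IBM-1047 CCSID, 0 keeps the program's CCSID.
static int enableAutoConversion(int FD) {
  struct f_cnvrt Req;
  Req.cvtcmd = SETCVTON;
  Req.pccsid = 0;
  Req.fccsid = 1047;
  return fcntl(FD, F_CONTROL_CVT, &Req);
}
```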
Reviewed By: anirudhp
Differential Revision: https://reviews.llvm.org/D100483
Similarly to D101096, this makes sure that MMO operands get propagated
through from MVE gathers/scatters to the Machine Instructions. This
allows extra scheduling freedom, not forcing the instructions to act as
scheduling barriers. We create MMO's with an unknown size, specifying
that they can load from anywhere in memory, similar to the masked_gather
or X86 intrinsics.
Differential Revision: https://reviews.llvm.org/D101219
We create MMO's for the VLDn/VSTn intrinsics in ARMTargetLowering::
getTgtMemIntrinsic, but they do not currently make it all the way through
ISel. This changes that in the various places it needs changing, making
sure that the MMO is propagated through to the final instruction. This
can help in scheduling, not treating the VLD2/VST2 as a scheduling
barrier.
Differential Revision: https://reviews.llvm.org/D101096
This allows for a much more efficient encoding for small negative
numbers by storing the sign bit first and negating the rest of
the bits. This was already being used for OPC_CheckInteger.
For every in tree target this affects, the table got smaller.
R600GenDAGISel.inc saw the largest reduction of 7K.
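A sketch of the transform (the matcher-table framing is omitted; only the sign-bit-first encoding described above is shown):
```
#include <cstdint>

// Negative values store the sign in the low bit and bitwise-negate the
// remaining bits, so small negatives become small unsigned values that
// fit in a single VBR byte.
uint64_t encodeSigned(int64_t V) {
  return V < 0 ? (~uint64_t(V) << 1) | 1 : uint64_t(V) << 1;
}

int64_t decodeSigned(uint64_t E) {
  return (E & 1) ? int64_t(~(E >> 1)) : int64_t(E >> 1);
}
```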
I did have to add a new opcode for StringIntegers used for
register class ids and subregister indices since we don't have the
integer value to encode. The enum name is emitted directly into
the table. Previously we assumed the enum would expand to a positive
7-bit number. We might be able to just shift that right by 1 and
assume it is a positive 6 bit number, but that will need more
investigation.
This seems to be a leftover from when the BackedgeTakenInfo
stored multiple exit counts with manual memory management. At
some point this was switched to a simple vector, and there should
be no need to micro-manage the clearing anymore. We can simply
drop the loop from the map and let the destructor do its job.
Move some types in STLExtras.h which are named and behave identically to
STL types from future standards into a dedicated header. This keeps them
organized (they are not "extras" in the same sense as most types in
STLExtras.h are) and fixes circular dependencies in future patches.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D100668
This patch fixes two bugs that arise when a 'defm' inherits from a multiclass
and also from a class with assertions.
Differential Revision: https://reviews.llvm.org/D101626
In a future change it will be possible to run register
allocation with a specific set of register classes,
so some of the remaining virtual registers will still
be meaningful.
Values only used by metadata can be removed from the .addrsig table.
This solves the undefined symbol error when enabling the addrsig table with COFF LTO.
Differential Revision: https://reviews.llvm.org/D101512
Added an extra analysis for better choice of the shuffle kind in the
getShuffleCost functions, improving cost estimation when a mask is
provided.
Differential Revision: https://reviews.llvm.org/D100865
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property
Reviewed by: sfertile
Differential Revision: https://reviews.llvm.org/D100956
This reverts commit 3b8ec86fd5.
Revert "[X86] Refine AMX fast register allocation"
This reverts commit c3f95e9197.
This pass breaks using LLVM in a multi-threaded environment by
introducing global state.
- Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter)
- However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch)
- The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not.
- It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`)
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100975
This patch adds section support in the OpenMP IRBuilder module, along with a test for the same.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D89671
We spend some time during sqlite3 compilation regrowing this vector,
bump it up to avoid this.
Gives around 1-2% improvement in codegen-only time for sqlite3 at -O0.
This patch causes the loop vectorizer to not interleave loops that have
nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)).
Note that if a particular interleave count is being requested
(through llvm.loop.interleave_count), it will still be honoured, regardless
of the presence of nounroll hints.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D101374
- Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers.
- These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ.
- In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_|$|@|#] are already supported as "part" of an Identifier.
- The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100959
Add a flag to change dsymutil's behavior and force a static variable to
keep its enclosing function. The test shows a situation where that could
be useful. I'm not convinced this behavior makes sense as a default,
which is why it's behind a flag.
rdar://74918374
Differential revision: https://reviews.llvm.org/D101337
This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.
Several small changes are included due to the knock-on effect this has:
- The AArch32 driver has been modified to ensure sha2/aes is correctly
set based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not
enabled by default.
- ACLE feature macros have been updated with the fine grained crypto
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D99079
This was picking a concrete size for a physical register, and
enforcing an exact match on the virtual register's type size. Some
targets add multiple types to a register class, and some are smaller
than the full bit width. For example x86 adds f32 to 128-bit xmm
registers, and AMDGPU adds i16/f16 to 32-bit registers.
It might be better to represent these cases as a copy of the full
register and an extraction of the subpart, but a lot of code assumes
you can directly copy. This will help fix the current usage of the DAG
calling convention infrastructure which is incompatible with how
GlobalISel is now using it.
The API is somewhat cumbersome here, but I just mirrored the existing
functions, except now with LLTs (and allow returning null on failure,
unlike the MVT version). I think the concept of selecting register
classes based on type is flawed to begin with, but I'm trying to keep
this compatible with the existing handling.
This patch also refactors the way the feasible max VF is calculated,
although this is NFC for fixed-width vectors.
After this change, scalable VF hints are no longer truncated/clamped
to a shorter scalable VF, nor is the 'scalable' flag dropped from
the suggested VF in order to vectorize with a similar fixed-width VF.
Instead, the hint is ignored, which means the vectorizer is free
to find a more suitable VF, using the CostModel to determine the
best possible VF.
Reviewed By: c-rhodes, fhahn
Differential Revision: https://reviews.llvm.org/D98509
In terms of readability, the `enum CFIMoveType` didn't clearly convey what it's
meant to, i.e. the type of CFI section that gets emitted.
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D76519
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.
This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.
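A minimal sketch of the new node shape, assuming this applies to the FP_TO_SINT_SAT/FP_TO_UINT_SAT node family (the lowering context and helper are illustrative):
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

static SDValue buildFPToSIntSat(SelectionDAG &DAG, const SDLoc &DL,
                                SDValue Src) {
  // The saturation width rides along as a VTSDNode operand, like
  // SIGN_EXTEND_INREG's, so the type legalizer can treat it as opaque.
  // Before: DAG.getConstant(32, DL, MVT::i32) -- an i32 value that
  // RV64 would otherwise have to legalize.
  return DAG.getNode(ISD::FP_TO_SINT_SAT, DL, MVT::i64, Src,
                     DAG.getValueType(MVT::i32));
}
```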
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101262
The Verifier will complain, but by then it may be too late; we might
never reach it because we already crashed on some bogus bug.
It is best to catch this the moment it happens.
GCC supports negative values for -mstack-protector-guard-offset=; this
should be a signed value. Pre-req to D100919.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101325
This patch adds support to restrict prefixed instructions from
crossing a 64-byte boundary:
- Add the infrastructure to register a custom XCOFF streamer
- Add a custom XCOFF streamer for PowerPC to allow us to
intercept instructions as they are being emitted and align all 8-byte
instructions to a 64-byte boundary, if required, by adding a 4-byte nop.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D101107
This utility allows a pattern match to start more efficiently.
Often a MachineInstr (MI) is already available; instead of using
mi_match(MI.getOperand(0).getReg(), MRI, ...), which goes through
MRI.getVRegDef(Reg) only to arrive back at the same MI, we can now
use mi_match(MI, MRI, ...) directly, as in the sketch below.
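A minimal sketch of the new entry point (the G_ADD pattern and helper name are illustrative; `m_GAdd` and `m_Reg` come from MIPatternMatch.h):
```
#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
using namespace llvm;
using namespace MIPatternMatch;

static bool matchAddOfRegs(MachineInstr &MI, MachineRegisterInfo &MRI,
                           Register &LHS, Register &RHS) {
  // Before: mi_match(MI.getOperand(0).getReg(), MRI, ...), which walks
  // back through MRI.getVRegDef() to recover the same MI.
  // After: start the match directly from the instruction in hand.
  return mi_match(MI, MRI, m_GAdd(m_Reg(LHS), m_Reg(RHS)));
}
```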
Differential Revision: https://reviews.llvm.org/D99735
The .file directive was changed to only have the basename in D36018 for
ELF.
But on AIX, we require the .file directive to also contain the
directory info. This aligns with other AIX compilers like XLC and is
required by some AIX tools like DBX.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D99785
This reapplies 8740360093, which was reverted in bbddadd46e due to buildbot
errors.
This version checks that a JIT instance can be safely constructed, skipping
tests if it cannot be. To enable this it introduces a new C API to retrieve and
set the target triple for a JITTargetMachineBuilder.
LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store.
Differential Revision: https://reviews.llvm.org/D98650
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic`
but has 3 mostly unused pointers. GlobalOpt considers that the pointers can
potentially retain allocated objects, so GlobalOpt cannot optimize out the
`NoopStatistic` variables (see D69428 for more context), wasting 23KiB for stage
2 clang.
This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The
clang size is even smaller than applying D69428 (slightly smaller in both .bss and
.text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238
# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
FILE SIZE VM SIZE
-------------- --------------
-0.0% -32 -0.0% -24 .eh_frame
-0.0% -336 [ = ] 0 .symtab
-0.0% -360 [ = ] 0 .strtab
[ = ] 0 -0.2% -880 .bss
-0.0% -2.11Ki -0.0% -2.11Ki .rodata
-0.0% -2.89Ki -0.0% -2.89Ki .text
-0.0% -5.71Ki -0.0% -5.88Ki TOTAL
```
Note: LoopFuse is a disabled pass. For now this patch adds
`#if LLVM_ENABLE_STATS` so `OptimizationRemarkMissed` is skipped in
LLVM_ENABLE_STATS==0 builds. If these `OptimizationRemarkMissed` are useful in
LLVM_ENABLE_STATS==0 builds, we can replace `llvm::Statistic` with
`llvm::TrackingStatistic`, or use a different abstraction to keep track of the strings.
Similarly, skip the code in `mlir/lib/Pass/PassStatistics.cpp` which
calls `getName`/`getDesc`/`getValue`.
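A minimal sketch of the shape of the change (member and operator names are illustrative, not the exact Statistic.h contents):
```
// Before: NoopStatistic carried the same three string pointers as
// Statistic, which GlobalOpt had to assume might retain allocated
// objects. After: no members at all, so the globals fold away.
class NoopStatisticSketch {
public:
  NoopStatisticSketch(const char *, const char *, const char *) {}
  NoopStatisticSketch &operator++() { return *this; }         // no-op
  NoopStatisticSketch &operator+=(unsigned) { return *this; } // no-op
};
```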
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D101211
LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store.
Differential Revision: https://reviews.llvm.org/D98650
This reverts commit 0ce723cb22.
D76519 was not quite NFC. If we see a CFISection::Debug function before a
CFISection::EH one (-fexceptions -fno-asynchronous-unwind-tables), we may
incorrectly pick CFISection::Debug and emit a `.cfi_sections .debug_frame`.
We should use .eh_frame instead.
This scenario is untested.
Adds support for creating custom MaterializationUnits in the C API with the new
LLVMOrcCreateCustomMaterializationUnit function.
Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with
LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any
clients of the LLVMOrcAbsoluteSymbols API.
Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to
allow clients to get a reference to an LLJIT instance's linking layer, then
emit an object file using it. This can be used to support construction of
custom materialization units in the common case where those units will
generate an object file that needs to be emitted to complete the
materialization.
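A minimal sketch of the emit path using the functions named above, inside a custom unit's materialize callback (the context handling and buffer creation are illustrative; error handling omitted):
```
#include "llvm-c/LLJIT.h"
#include "llvm-c/Orc.h"

// Assumes an LLVMOrcLLJITRef captured in Ctx and an object buffer Obj
// produced by the unit (elided here).
static void materializeSketch(void *Ctx,
                              LLVMOrcMaterializationResponsibilityRef MR) {
  LLVMOrcLLJITRef J = (LLVMOrcLLJITRef)Ctx;
  LLVMMemoryBufferRef Obj = /* ...generate the object file... */ nullptr;
  LLVMOrcObjectLayerRef OL = LLVMOrcLLJITGetObjLinkingLayer(J);
  LLVMOrcObjectLayerEmit(OL, MR, Obj); // completes the materialization
}
```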
In LLVM_ENABLE_STATS=0 builds, `llvm::Statistic` maps to `llvm::NoopStatistic`
but has 3 unused pointers. GlobalOpt considers that the pointers can potentially
retain allocated objects, so GlobalOpt cannot optimize out the `NoopStatistic`
variables (see D69428 for more context), wasting 23KiB for stage 2 clang.
This patch makes `NoopStatistic` empty and thus reclaims the wasted space. The
clang size is even smaller than applying D69428 (slightly smaller in both .bss and
.text).
```
# This means the D69428 optimization on clang is mostly nullified by this patch.
HEAD+D69428: size(.bss) = 0x0725a8
HEAD+D101211: size(.bss) = 0x072238
# bloaty - HEAD+D69428 vs HEAD+D101211
# With D101211, we also save a lot of string table space (.rodata).
FILE SIZE VM SIZE
-------------- --------------
-0.0% -32 -0.0% -24 .eh_frame
-0.0% -336 [ = ] 0 .symtab
-0.0% -360 [ = ] 0 .strtab
[ = ] 0 -0.2% -880 .bss
-0.0% -2.11Ki -0.0% -2.11Ki .rodata
-0.0% -2.89Ki -0.0% -2.89Ki .text
-0.0% -5.71Ki -0.0% -5.88Ki TOTAL
```
Note: LoopFuse is a disabled pass. This patch adds `#if LLVM_ENABLE_STATS` so
`OptimizationRemarkMissed` is skipped in LLVM_ENABLE_STATS==0 builds. If these
`OptimizationRemarkMissed` are useful and not noisy, we can replace
`llvm::Statistic` with `llvm::TrackingStatistic` in the future.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D101211
1. Add an accessor function to MCSymbolizer to retrieve addresses
referenced by a symbolizable operand, but not resolved to a symbol.
That way, the caller can synthesize labels at those addresses and
then retry disassembling the section.
2. Implement that in AMDGPU -- a failed symbol lookup results in the
address being added to a vector returned by the new function.
3. Use that in llvm-objdump when using MCSymbolizer (which only happens
on AMDGPU) and SymbolizeOperands is on.
Differential Revision: https://reviews.llvm.org/D101145
When vectorising for AArch64 targets, if you specify the SVE attribute
we automatically treat masked loads and stores as legal. Also,
since we have no cost model for masked memory ops, we believe it's
cheap to use the masked load/store intrinsics even for fixed width
vectors. This can lead to poor code quality as the intrinsics will
currently be scalarised in the backend. This patch adds a basic
cost model that marks fixed-width masked memory ops as significantly
more expensive than for scalable vectors.
Tests for the cost model are added here:
Transforms/LoopVectorize/AArch64/masked-op-cost.ll
Differential Revision: https://reviews.llvm.org/D100745
StringView::substr now accepts a substring's starting position and
length, instead of the previous non-standard `from` & `to` positions.
All uses of the two-argument StringView::substr are in MicrosoftDemangler
and pass 0 as the starting position, so no changes are necessary.
This also fixes a bug where attempting to extract a suffix with substr
(a `to` position equal to size) would return a substring without the
last character.
Fixing the issue should not introduce observable changes in the
demangler, since as currently used, a second argument to
StringView::substr is either: 1) a result of a successful call to
StringView::find and so necessarily smaller than size, or 2) in the
case of Demangler::demangleCharLiteral potentially equal to size, but
with the demangler expecting more data to follow later on and failing
either way.
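For illustration, the new semantics follow std::string_view::substr's (position, length) convention; a minimal sketch, with std::string_view standing in for the demangler's StringView:
```
#include <cassert>
#include <string_view>

int main() {
  std::string_view S = "abcdef";
  assert(S.substr(2, 3) == "cde"); // (pos, len), like std::string::substr
  // Suffix extraction now keeps the last character; under the old
  // (from, to) semantics, to == size() lost the final character.
  assert(S.substr(1) == "bcdef");
}
```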
Reviewed By: #libc_abi, ldionne, erik.pilkington
Differential Revision: https://reviews.llvm.org/D100246
m_Deferred() has nothing to do with commutative matchers; it needs
to be used whenever the value to match is determined as part of
the same match expression.
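For example, a minimal sketch using the IR-level PatternMatch (the fold itself is illustrative): matching `X & (X | Y)`, where the second occurrence of X is only determined while the match runs:
```
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace PatternMatch;

static bool isAndOfSelfOr(Value *V) {
  Value *X, *Y;
  // m_Deferred(X) compares against whatever m_Value(X) bound earlier in
  // this same match() call. m_Specific(X) would require X to be known
  // before the match starts; commutativity is a separate concern
  // (m_c_And would handle operand order).
  return match(V, m_And(m_Value(X), m_Or(m_Deferred(X), m_Value(Y))));
}
```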
In terms of readability, the `enum CFIMoveType` didn't clearly convey what it's
meant to, i.e. the type of CFI section that gets emitted.
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D76519
There is already code in InlineCost.cpp to identify and ignore ephemeral
values (llvm.assume intrinsics and other side-effect free instructions
only feeding the assumes). However, because llvm.type.test intrinsics
were not marked speculatable, they and any instructions specifically
feeding the type test (typically a bitcast) were being counted towards
the instruction cost when inlining. This was causing profile matching
issues in some cases when enabling -fwhole-program-vtables for whole
program devirtualization.
According to the language reference, the speculatable attribute means:
"the function does not have any effects besides calculating its result
and does not have undefined behavior". I see no reason why type tests
cannot be marked with this attribute.
There are 2 test changes:
llvm/test/Transforms/Inline/ephemeral.ll: I added a type test intrinsic
here to verify the fix. Also, I found the test was not actually testing
what it originally intended. Many of the existing instructions were
optimized away by -Oz, and the cost of inlining was negative due to the
benefit of removing the call. So I changed the test to simply invoke the
inline pass and check the number of instructions computed by InlineCost.
I also fixed an instruction that was not actually used anywhere.
llvm/test/Transforms/SimplifyCFG/no-md-sink.ll needed to be made more
robust to code changes that reordered the metadata.
Differential Revision: https://reviews.llvm.org/D101180
These are added for compatibility with XLC. They are similar to
vec_cts and vec_ctu except that the result is a doubleword vector
regardless of the parameter type.
In cases when ScalarizationCostPassed has no value, UINT_MAX is actually used
for cost estimation in `return ScalarCalls * ScalarCost + ScalarizationCost`.
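A minimal sketch of the hazard, with illustrative names standing in for the TTI helper (the real code uses LLVM's Optional rather than std::optional):
```
#include <climits>
#include <optional>

// When no scalarization cost is passed, the sentinel UINT_MAX used to
// flow straight into the arithmetic and wrap the result.
unsigned intrinsicCostSketch(unsigned ScalarCalls, unsigned ScalarCost,
                             std::optional<unsigned> ScalarizationCostPassed) {
  unsigned ScalarizationCost = ScalarizationCostPassed.value_or(UINT_MAX);
  return ScalarCalls * ScalarCost + ScalarizationCost; // wraps on sentinel
}
```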
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D101099
Previous build failures were caused by an error in bitcode reading and
writing for DIArgList metadata, which has been fixed in e5d844b587.
There were also some unnecessary asserts that were being triggered on
certain builds, which have been removed.
This reverts commit dad5caa59e.
ConstantFoldingMIRBuilder was an experiment which is not used for
anything. The constant folding functionality is now part of
CSEMIRBuilder.
Differential Revision: https://reviews.llvm.org/D101050
The Linux kernel objtool diagnostic `call without frame pointer save/setup`
arises in multiple instrumentation passes (asan/tsan/gcov). With the mechanism
introduced in D100251, it's trivial to respect the command line
-m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it.
Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan)
Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan)
Also document the function attribute "frame-pointer" which is long overdue.
Differential Revision: https://reviews.llvm.org/D101016
These instructions don't really exist, but we have ways we can
emulate them.
.vv will swap operands and use vmsle(u).vv. .vi will adjust the
immediate and use vmsgt(u).vi when possible. For .vx we need to
use some of the multiple-instruction sequences from the V extension
spec.
For unmasked vmsge(u).vx we use:
vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
For cases where mask and maskedoff are the same value, we have
vmsge{u}.vx v0, va, x, v0.t, which is the vd==v0 case that
requires a temporary, so we use:
vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
For other masked cases we use this sequence:
vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
We trust that register allocation will prevent vd in vmslt{u}.vx
from being v0 since v0 is still needed by the vmxor.
Differential Revision: https://reviews.llvm.org/D100925
Intrinsics for the following instructions are added. The intrinsic
name is "int_hexagon_<inst>[_128B]", e.g.
int_hexagon_V6_vL32b_pred_ai for the 64-byte version,
int_hexagon_V6_vL32b_pred_ai_128B for the 128-byte version.
V6_vL32b_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4)
V6_vL32b_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3)
V6_vL32b_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2)
V6_vL32b_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4)
V6_vL32b_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3)
V6_vL32b_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2)
V6_vL32b_nt_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4):nt
V6_vL32b_nt_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3):nt
V6_vL32b_nt_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2):nt
V6_vL32b_nt_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4):nt
V6_vL32b_nt_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3):nt
V6_vL32b_nt_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt
V6_vS32b_pred_ai if (Pv4) vmem(Rt32+#s4) = Vs32
V6_vS32b_pred_pi if (Pv4) vmem(Rx32++#s3) = Vs32
V6_vS32b_pred_ppu if (Pv4) vmem(Rx32++Mu2) = Vs32
V6_vS32b_npred_ai if (!Pv4) vmem(Rt32+#s4) = Vs32
V6_vS32b_npred_pi if (!Pv4) vmem(Rx32++#s3) = Vs32
V6_vS32b_npred_ppu if (!Pv4) vmem(Rx32++Mu2) = Vs32
V6_vS32Ub_pred_ai if (Pv4) vmemu(Rt32+#s4) = Vs32
V6_vS32Ub_pred_pi if (Pv4) vmemu(Rx32++#s3) = Vs32
V6_vS32Ub_pred_ppu if (Pv4) vmemu(Rx32++Mu2) = Vs32
V6_vS32Ub_npred_ai if (!Pv4) vmemu(Rt32+#s4) = Vs32
V6_vS32Ub_npred_pi if (!Pv4) vmemu(Rx32++#s3) = Vs32
V6_vS32Ub_npred_ppu if (!Pv4) vmemu(Rx32++Mu2) = Vs32
V6_vS32b_nt_pred_ai if (Pv4) vmem(Rt32+#s4):nt = Vs32
V6_vS32b_nt_pred_pi if (Pv4) vmem(Rx32++#s3):nt = Vs32
V6_vS32b_nt_pred_ppu if (Pv4) vmem(Rx32++Mu2):nt = Vs32
V6_vS32b_nt_npred_ai if (!Pv4) vmem(Rt32+#s4):nt = Vs32
V6_vS32b_nt_npred_pi if (!Pv4) vmem(Rx32++#s3):nt = Vs32
V6_vS32b_nt_npred_ppu if (!Pv4) vmem(Rx32++Mu2):nt = Vs32
This patch adds semantic checks for the General Restrictions of the
Allocate Directive.
Since the requires directive is not yet implemented in Flang, the
restriction:
```
allocate directives that appear in a target region must
specify an allocator clause unless a requires directive with the
dynamic_allocators clause is present in the same compilation unit
```
will need to be updated at a later time.
A different patch will be made with the Fortran specific restrictions of
this directive.
I have used the code from https://reviews.llvm.org/D89395 for the
CheckObjectListStructure function.
Co-authored-by: Isaac Perry <isaac.perry@arm.com>
Reviewed By: clementval, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D91159
It is proper to relax the non-negative limitation of step_vector.
This patch also adds more combines for step_vector (a sketch of the
new fold follows):
(sub X, step_vector(C)) -> (add X, step_vector(-C))
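A minimal sketch of that fold written as a DAGCombiner-style helper (the function shape and surrounding context are illustrative):
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// (sub X, step_vector(C)) -> (add X, step_vector(-C))
static SDValue foldSubOfStepVector(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
                                   SDValue N0, SDValue N1) {
  if (N1.getOpcode() != ISD::STEP_VECTOR)
    return SDValue();
  auto *C = dyn_cast<ConstantSDNode>(N1.getOperand(0));
  if (!C)
    return SDValue();
  // Negate the step constant and rebuild as an add.
  SDValue NegStep = DAG.getConstant(-C->getAPIntValue(), DL,
                                    N1.getOperand(0).getValueType());
  SDValue NegSV = DAG.getNode(ISD::STEP_VECTOR, DL, VT, NegStep);
  return DAG.getNode(ISD::ADD, DL, VT, N0, NegSV);
}
```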
Differential Revision: https://reviews.llvm.org/D100812
The change adds support for trimming and merging cold context when merging CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen; however, the flexibility to trim cold context after the profile is generated can be useful.
Differential Revision: https://reviews.llvm.org/D100528
This patch allows PRE of the following type of loads:
```
preheader:
br label %loop
loop:
br i1 ..., label %merge, label %clobber
clobber:
call foo() // Clobbers %p
br label %merge
merge:
...
br i1 ..., label %loop, label %exit
```
Into
```
preheader:
%x0 = load %p
br label %loop
loop:
%x.pre = phi(x0, x2)
br i1 ..., label %merge, label %clobber
clobber:
call foo() // Clobbers %p
%x1 = load %p
br label %merge
merge:
x2 = phi(x.pre, x1)
...
br i1 ..., label %loop, label %exit
```
So instead of loading from %p on every iteration, we load only when the actual clobber happens.
The typical pattern it is trying to address is a hot loop, with all code inlined
and provably having no side effects, and some side-effecting calls on a cold path.
The worst overhead from it: if we always take the clobber block, we make 1 more load
overall (in the preheader). That only matters if the loop has very few iterations. If the
clobber block is skipped on at least one iteration, the transform is neutral or profitable.
There are several prospective improvements this opens up:
- We can sometimes be smarter in loop-exiting blocks via split of critical edges;
- If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that
we don't know if their sum is colder than the header.
Differential Revision: https://reviews.llvm.org/D99926
Reviewed By: reames
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading, with corresponding changes in libomptarget: SPMD and non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target and host-side parallel regions), and data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. The revision also contains changes to OpenMPOpt for remark creation on target offloading regions.
Reviewed By: jdoerfert, Meinersbur
Differential Revision: https://reviews.llvm.org/D95976
Report dangling probes for frames that have real samples collected. Dangling probes are the probes associated with an empty block. When reported, the sample count on a dangling probe will not be trusted by the compiler; we will instead rely on the counts inference algorithm to give the probe a reasonable count. This actually fixes a bug where previously only those dangling probes with samples collected were reported.
This patch also fixes two existing issues. Pseudo probes are stored in `Address2ProbesMap` and their pointers are used in `PseudoProbeInlineTree`. Previously `std::vector` was used to store probes, and pointers to probes could become stale as the vector grows. I'm changing `std::vector` to `std::list` instead.
The other issue is that all outlined functions previously shared the same inline frame, because the unchanged `Index` value was used as the dummy inlineSite identifier.
Good results seen for SPEC2017 in general regarding profile quality.
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D100235
On ELF targets, if a function has uwtable or personality, or does not have
nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module.
Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified.
(i.e. If -g[123], every function gets `.eh_frame`.
This behavior is strange but that is the status quo on GCC and Clang.)
Let's take asan as an example. Other sanitizers are similar.
`asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true,
so every function gets `.eh_frame` if `-g[123]` is specified.
This is the root cause that
`-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame
while
`-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame.
This patch
* sets the nounwind attribute on sanitizer module ctor/dtor.
* let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute.
The "uwtable" mechanism is generic: synthesized functions not cloned/specialized
from existing ones should consider `Function::createWithDefaultAttr` instead of
`Function::create` if they want to get some default attributes which
have more of module semantics.
Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955, https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc.
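A minimal sketch of the intended usage for a synthesized sanitizer ctor (the function body is illustrative; `createWithDefaultAttr` is the generic entry point described above):
```
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
using namespace llvm;

// Synthesized, not cloned, so it should pick up module-level defaults
// ("uwtable", "frame-pointer", ...) instead of Function::create's none.
static Function *makeSanitizerCtor(Module &M) {
  FunctionType *FTy =
      FunctionType::get(Type::getVoidTy(M.getContext()), /*isVarArg=*/false);
  return Function::createWithDefaultAttr(FTy, GlobalValue::InternalLinkage,
                                         /*AddrSpace=*/0, "asan.module_ctor",
                                         &M);
}
```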
Differential Revision: https://reviews.llvm.org/D100251
- Previously, https://reviews.llvm.org/D72680 introduced a new attribute called `AllowSymbolAtNameStart` (in relation to the MAsmParser changes) in `MCAsmInfo.h` which (according to the comment in the header) allows the following behaviour:
```
/// This is true if the assembler allows $ @ ? characters at the start of
/// symbol names. Defaults to false.
```
- However, the usage of this field in AsmLexer.cpp doesn't seem completely accurate* for a couple of reasons.
```
default:
if (MAI.doesAllowSymbolAtNameStart()) {
// Handle Microsoft-style identifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]*
if (!isDigit(CurChar) &&
isIdentifierChar(CurChar, MAI.doesAllowAtInName(),
AllowHashInIdentifier))
return LexIdentifier();
}
```
1. The Dollar and At tokens, when occurring at the start of the string, are treated as separate tokens (AsmToken::Dollar and AsmToken::At respectively) and not lexed as an Identifier.
2. I'm not too sure why `MAI.doesAllowAtInName()` is used when `AllowAtInIdentifier` could be used. For X86 platforms, afaict, this shouldn't be an issue, since the `CommentString` attribute isn't "@" (alternatively, the call to the setter can be made anywhere else as needed). The `AllowAtInName` attribute does have an additional important meaning, but in the context of the AsmLexer it shouldn't mean anything different compared to `AllowAtInIdentifier`.
My proposal is the following:
- Introduce 3 new fields called `AllowQuestionTokenAtStartOfString`, `AllowDollarTokenAtStartOfString` and `AllowAtTokenAtStartOfString` in MCAsmInfo.h which will encapsulate the previously documented behaviour of "allowing $, @, ? characters at the start of symbol names".
- Introduce these fields where "$", "@" are lexed, and treat them as identifiers depending on whether `Allow[Dollar|At]TokenAtStartOfString` is set.
- For the sole case of "?", append it to the existing logic for treating a "default" token as an Identifier.
z/OS (HLASM) will also make use of some of these fields in follow up patches.
completely accurate* - This was based on the comments and the intended behaviour of the code. I might have completely misinterpreted it, and if that is the case my sincere apologies. We can close this patch if necessary, if there are no changes to be made :)
Depends on https://reviews.llvm.org/D99374
Reviewed By: Jonathan.Crowther
Differential Revision: https://reviews.llvm.org/D99889
CommandLine.h is indirectly included in ~50% of TUs when building
clang, and VirtualFileSystem.h is large.
(Already remarked by jhenderson on D70769.)
No behavior change.
Differential Revision: https://reviews.llvm.org/D100957
This is a follow-up to https://reviews.llvm.org/D100387.
std::vector is not the best storage container here. My local benchmark (counting
the number of instructions when compiling the sqlite3 amalgamation) yields the
following (a sketch of the variants follows the numbers):
- std::vector<BitVector> -> 5,860,885,896
- SmallVector<BitWord, 0> -> 5,858,991,997
- SmallVector<BitWord> -> 5,817,679,224
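A minimal sketch of the three storage variants being compared (the wrapper structs are illustrative):
```
#include "llvm/ADT/SmallVector.h"
#include <cstdint>
#include <vector>

using BitWord = uintptr_t;

// SmallVector<BitWord, 0> keeps no inline elements, so the object stays
// small while still using SmallVector's growth policy; SmallVector<BitWord>
// (default inline capacity) additionally avoids the first few heap
// allocations at the cost of a larger object.
struct StorageA { std::vector<BitWord> Bits; };
struct StorageB { llvm::SmallVector<BitWord, 0> Bits; };
struct StorageC { llvm::SmallVector<BitWord> Bits; };
```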
Differential Revision: https://reviews.llvm.org/D100744