llvm-project

Commit Graph

Author	SHA1	Message	Date
Jay Foad	3822a01e0b	[AMDGPU] Add GFX11 ds_bvh_stack_rtn_b32 instruction Differential Revision: https://reviews.llvm.org/D133928	2022-09-15 16:46:14 +01:00
Sander de Smalen	45d28779c5	[AArch64][SME] Fix lowering of llvm.aarch64.get.pstatesm() A thread may not have access to SME or TPIDR2_EL0, so in order to safely query PSTATE.SM in a streaming-compatible function, the code should call `__arm_sme_state()`, as described in the ABI: `c2bb09c4d4` This means that the value of pstate.sm is: * 0 if the function is non-streaming. * 1 if the function has `arm_streaming` or `arm_locally_streaming`. * evaluated at runtime by a call to __arm_sme_state() otherwise. This patch also adds a calling convention for calls to SME support routines. At some point we can remove the need for the llvm.aarch64.get.pstatesm() intrinsic and use function calls (with the corresponding cc) directly instead. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131571	2022-09-15 15:14:13 +00:00
Matt Arsenault	63d1d37d35	RegAllocGreedy: Avoid overflowing priority bitfields The class priority is expected to be at most 5 bits before it starts clobbering bits used for other fields. Also clamp the instruction distance in case we have millions of instructions. AMDGPU was accidentally overflowing into the global priority bit in some cases. I think in principle we would have wanted this, but in the cases I've looked at, it had the counterintuitive effect and de-prioritized the large register tuple. Avoid using the weird bit hack PPC uses for global priority. The AllocationPriority field is really 5 bits, and PPC was relying on overflowing this to 6 bits to forcibly set the global priority bit. Split this out as a separate flag to avoid having magic behavior for values above 31.	2022-09-15 10:38:40 -04:00
Alexey Lapshin	adabfb5e32	[DWARFLinker][NFC] Set the target DWARF version explicitly. Currently, DWARFLinker determines the target DWARF version internally. It examines incoming object files, detects the maximal DWARF version and uses that version for the output file. This patch allows the consumer of DWARFLinker to set the output DWARF version explicitly, so that DWARFLinker uses the specified version instead of the autodetected one. It allows consumers to use different logic for setting the target DWARF version. For example, instead of the maximally used version someone could set a higher version to convert from DWARFv4 to DWARFv5 (this possibility is not supported yet, but it would be good if the interface supported it). Another variant is to set the target version through the command line. In this patch, the autodetection is moved into the consumers (DwarfLinkerForBinary.cpp, DebugInfoLinker.cpp). Differential Revision: https://reviews.llvm.org/D132755	2022-09-15 16:06:10 +03:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `7539e9cf81`.	2022-09-15 03:08:46 +00:00
Vitaly Buka	72b776168c	[IRBuilder] Add CreateMaskedExpandLoad and CreateMaskedCompressStore	2022-09-14 19:18:52 -07:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Sam Clegg	8273ca1421	[MC] Fix typo in getSectionAddressSize comment. NFC The comment was referring to a now nonexistent function that was removed in `93e3cf0ebd`. Differential Revision: https://reviews.llvm.org/D133098	2022-09-14 15:15:41 -07:00
Craig Topper	50a699e362	[IR][VP] Remove IntrArgMemOnly from vp.gather/scatter. IntrArgMemOnly is only valid for intrinsics that use a scalar pointer argument. These intrinsics use a vector of pointers. Alias analysis will try to find a scalar pointer argument and will return incorrect alias results when it doesn't find one. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133898	2022-09-14 15:00:07 -07:00
Arthur Eubanks	ccc9107ad6	[OptBisect] Add flag to print IR when opt-bisect kicks in -opt-bisect-print-ir-path=foo will dump the IR to foo when opt-bisect-limit starts skipping passes. Currently we don't print the IR if the opt-bisect-limit is higher than the total number of times opt-bisect is called. This makes getting the IR right before a bad transform easier. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D133809	2022-09-14 13:48:03 -07:00
Fangrui Song	25394c9d10	[llvm-objdump] Change printSymbolVersionDependency to use ELFFile API When .gnu.version_r is empty (allowed by readelf but warned by objdump), llvm-objdump -p may decode the next section as .gnu.version_r and may crash due to out-of-bounds C string reference. ELFFile<ELFT>::getVersionDependencies handles 0-entry .gnu.version_r gracefully. Just use it. Fix https://github.com/llvm/llvm-project/issues/57707 Differential Revision: https://reviews.llvm.org/D133751	2022-09-14 12:30:34 -07:00
Nikita Popov	b1cd393f9e	[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI) Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead. To give two examples of why this is useful: * D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely. The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring. Differential Revision: https://reviews.llvm.org/D130896	2022-09-14 16:34:41 +02:00
Martin Storsjö	e280940bfb	[Support] Access threadIndex via a wrapper function On Unix platforms, this wrapper function is inline, so it should expand to the same direct access to the thread local variable. On Windows, it's a non-inline function within Parallel.cpp, allowing making the thread_local variable static. Windows Native TLS doesn't support direct access to thread local variables in a different DLL, and GCC/binutils on Windows occasionally has problems with non-static thread local variables too. This fixes mingw dylib builds with native TLS after `e6aebff674`. At the same time, move the whole thread local variable within #if LLVM_ENABLE_THREADS to fix builds without threading support. Differential Revision: https://reviews.llvm.org/D133759	2022-09-14 09:19:27 +03:00
Pengxuan Zheng	ecb5ea6a26	[Object][COFF] Allow section symbol to be common symbol I ran into an lld-link error due to a symbol named ".idata$4" coming from some static library: .idata$4 should not refer to special section 0. Here is the symbol table entry for .idata$4: Symbol { Name: .idata$4 Value: 3221225536 Section: IMAGE_SYM_UNDEFINED (0) BaseType: Null (0x0) ComplexType: Null (0x0) StorageClass: Section (0x68) AuxSymbolCount: 0 } The symbol .idata$4 is a section symbol (IMAGE_SYM_CLASS_SECTION) and LLD currently handles it as a regular defined symbol since isCommon() returns false for this symbol. This results in the error ".idata$4 should not refer to special section 0" because lld-link asserts that regular defined symbols should not refer to section 0. Should this symbol be handled as a common symbol instead? LLVM currently only allows external symbols (IMAGE_SYM_CLASS_EXTERNAL) to be common symbols. However, the PE/COFF spec (see section "Section Number Values") does not seem to mention this restriction. Any thoughts? Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D133627	2022-09-13 18:07:02 -07:00
YongKang Zhu	5fa6b24354	Address feedback in https://reviews.llvm.org/D133637 https://reviews.llvm.org/D133637 fixes the problem where we should hash raw content of register mask instead of the pointer to it. Fix the same issue in `llvm::hash_value()`. Remove the added API `MachineOperand::getRegMaskSize()` to avoid potential confusion. Add an assert to emphasize that we probably should hash a machine operand iff it has associated machine function, but keep the fallback logic in the original change. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D133747	2022-09-13 16:12:41 -07:00
Chris Bieneman	4b96f8996a	[DX] DXContainer does not support COMDAT The DXContainer is pretty primitive, but doesn't support COMDAT. We need to set that in the Triple so that Clang won't try to emit COMDATs.	2022-09-13 13:59:47 -05:00
theidexisted	0a1c8522f3	[NFC][ADT] Fix assert message Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D129632	2022-09-13 18:55:57 +00:00
Hendrik Greving	393a17b5d1	[ValueTypes] Define MVTs for v256i2/v128i4. Adds MVT::v256i2, MVT::v128i4. Differential Revision: https://reviews.llvm.org/D133603	2022-09-13 09:02:23 -07:00
Pavel Samolysov	02aaf8e3d6	[NFC][ScheduleDAGInstrs] Use structured bindings and emplace_back Some uses of std::make_pair and of std::pair's first/second members in the ScheduleDAGInstrs.[cpp|h] files were replaced with the vector's emplace_back along with structured bindings from C++17.	2022-09-13 12:49:04 +03:00
Sylvestre Ledru	cd20a18286	Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView" Causing: https://github.com/llvm/llvm-project/issues/57709 This reverts commit `ab56719acd`.	2022-09-13 10:53:59 +02:00
Max Kazantsev	86d5586d78	[SCEVExpander] Recompute poison-generating flags on hoisting. PR57187 An instruction being hoisted could have nuw/nsw flags inferred from the old context, and we cannot simply move it to the new location keeping them because we are going to introduce new uses to them that didn't exist before. The example in https://github.com/llvm/llvm-project/issues/57187 shows how this can produce a branch on poison from an initially well-defined program. This patch forcefully recomputes the poison-generating flags in the new context. Differential Revision: https://reviews.llvm.org/D132022 Reviewed By: fhahn, nikic	2022-09-13 12:56:35 +07:00
Aiden Grossman	eec183c171	[nfc] Refactor SlotIndex::getInstrDistance to better reflect actual functionality This patch refactors SlotIndex::getInstrDistance to SlotIndex::getApproxInstrDistance to better describe the actual functionality of this function. This patch also adds in some additional comments better documenting the assumptions that this function makes to increase clarity. Based on discussion on the LLVM Discourse: https://discourse.llvm.org/t/odd-behavior-in-slotindex-getinstrdistance/64934/5 Reviewed By: mtrofin, foad Differential Revision: https://reviews.llvm.org/D133386	2022-09-12 23:33:35 +00:00
Amara Emerson	25bcc8c797	[GlobalISel][Legalizer] Fix minScalarEltSameAsIf to handle p0 element types. The mutation the action generates tries to change the input type into the element type of larger vector type. This doesn't work if the larger element type is a vector of pointers since it creates an illegal mutation between scalar and pointer types. Differential Revision: https://reviews.llvm.org/D133671	2022-09-13 00:01:37 +01:00
David Majnemer	ab56719acd	[clang, llvm] Add __declspec(safebuffers), support it in CodeView __declspec(safebuffers) is equivalent to __attribute__((no_stack_protector)). This information is recorded in CodeView. While we are here, add support for strict_gs_check.	2022-09-12 21:15:34 +00:00
Kazu Hirata	9606608474	[llvm] Use x.empty() instead of llvm::empty(x) (NFC) I'm planning to deprecate and eventually remove llvm::empty. I thought about replacing llvm::empty(x) with std::empty(x), but it turns out that all uses can be converted to x.empty(). That is, no use requires the ability of std::empty to accept C arrays and std::initializer_list. Differential Revision: https://reviews.llvm.org/D133677	2022-09-12 13:34:35 -07:00
YongKang Zhu	481a32f587	Bug fix on stable hash calculation for machine operands RegisterMask and RegisterLiveOut MachineOperand::getRegMask() returns a pointer to register mask. We should hash the raw content of register mask instead of its pointer. Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D133637	2022-09-12 13:25:04 -07:00
Fangrui Song	e6aebff674	[ELF] Parallelize relocation scanning * Change `Symbol::flags` to a `std::atomic<uint16_t>` * Add `llvm::parallel::threadIndex` as a thread-local non-negative integer * Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex * Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output. MIPS and PPC64 use global states for relocation scanning. Keep serial scanning. Speed-up with mimalloc and --threads=8 on an Intel Skylake machine: * clang (Release): 1.27x as fast * clang (Debug): 1.06x as fast * chrome (default): 1.05x as fast * scylladb (default): 1.04x as fast Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64): * clang (Release): 1.31x as fast * scylladb (default): 1.06x as fast Reviewed By: andrewng Differential Revision: https://reviews.llvm.org/D133003	2022-09-12 12:56:35 -07:00
Craig Topper	38ffa2bb96	[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants. For remainder: If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves together and use a (BitWidth / 2) urem. If (BitWidth / 2) is a legal integer type, this urem will be expanded by DAGCombiner using multiply by magic constant. We do have to take into account that adding high and low together can produce a carry, making it a (BitWidth / 2)+1 bit number. So we need to also add back in the carry from the first addition. For division: We can use the above trick to compute the remainder, subtract that remainder from the dividend, then multiply by the multiplicative inverse of the Divisor modulo (1 << BitWidth). This is based on the section "Remainder by Summing Digits" in Hacker's Delight. The remainder trick is similar to a trick you may have learned for determining if a decimal number is divisible by 3. You can add all the digits together and see if the sum is divisible by 3. If you're not sure if the sum is divisible by 3, you can add its digits together. This can be repeated until you have a single decimal digit. If that digit is 3, 6, or 9, then the original number is divisible by 3. This works because 10 % 3 == 1. gcc already does this same trick. There are additional tricks gcc does for urem as well as srem, udiv, and sdiv that I plan to add in future patches. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130862	2022-09-12 10:34:52 -07:00
Matthias Gehre	c1502425ba	Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth Also remove new-pass-manager version of ExpandLargeDivRem because there is no way yet to access TargetLowering in the new pass manager. Differential Revision: https://reviews.llvm.org/D133691	2022-09-12 17:06:16 +01:00
Jay Foad	210e6a993d	[GlobalISel] Simplify extended add/sub to add/sub with carry Simplify extended add/sub (with carry-in and carry-out) to add/sub with carry (with carry-out only) if carry-in is known to be zero. Differential Revision: https://reviews.llvm.org/D133702	2022-09-12 17:05:44 +01:00
Alexey Bataev	dfe1e9dd79	[SLP]Improve reordering of clustered reused scalars. If the reused scalars are clustered, i.e. each part of the reused mask contains all elements of the original scalars exactly once, we can reorder those clusters to improve the whole ordering of the clustered vectors. Differential Revision: https://reviews.llvm.org/D133524	2022-09-12 06:52:25 -07:00
Matt Arsenault	7834194837	TableGen: Introduce generated getSubRegisterClass function Currently there isn't a generic way to get a smaller register class that can be produced from a subregister of a larger class. Replaces a manually implemented version for AMDGPU. This will be used to improve subregister support in the allocator.	2022-09-12 09:03:37 -04:00
Matt Arsenault	bb70b5d406	CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer Previously this was assuming pointsToConstantMemory implies dereferenceable.	2022-09-12 08:38:35 -04:00
David Spickett	739b69e655	[LLVM][AArch64] Explain that X19 is used as the frame base pointer register Fixes #50098 LLVM uses X19 as the frame base pointer, if it needs to. Meaning you can get warnings if you clobber that with inline asm. However, it doesn't explain why. The frame base register is not part of the ABI so it's pretty confusing why you get that warning out of the blue. This adds a method to explain a reserved register with X19 as the first one. The logic is the same as getReservedRegs. I could have added a return parameter to isASMClobberable and friends but found that there's a lot of things that call isReservedReg in various ways. So while one more method on the pile isn't great design, it is simpler right now to do it this way and only pay the cost if you are actually using a reserved register. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D133213	2022-09-12 09:18:09 +00:00
Johannes Doerfert	c922cac868	Revert "[Attributor] AAPointerInfo should allow "harmless" uses" Revert "[Attributor] Teach AAPointerInfo to look into aggregates" This reverts commit `844f6c5d03` and `4ed0a88cd8` as they broke the buildbots that run openmp/libomptarget/test/offloading/bug49021.cpp.	2022-09-11 21:37:54 -07:00
Johannes Doerfert	4ed0a88cd8	[Attributor] Teach AAPointerInfo to look into aggregates If we have a constant aggregate, e.g., as an initializer, we usually failed to extract the proper value/type from it. This patch provides the size and offset information necessary to extract the right part of the constant.	2022-09-11 20:16:11 -07:00
Johannes Doerfert	21711039e3	[OpenMP] Allow the Attributor to look at functions we also internalized This is important as we have accesses to globals in those which we need to categorize.	2022-09-11 20:16:11 -07:00
Kazu Hirata	a21c8be1cc	[llvm] Use std::aligned_storage_t (NFC)	2022-09-11 16:11:39 -07:00
Junduo Dong	6975ab7126	[Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework The previous implementation of time tracing in NewPassManager is direct but messy. The key code looks like the demo below: ``` /// Runs the function pass across every function in the module. PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &CG, CGSCCUpdateResult &UR) { /// ... PreservedAnalyses PassPA; { TimeTraceScope TimeScope(Pass.name()); PassPA = Pass.run(F, FAM); } /// ... } ``` Deciding by hand where to add the tracing code is tedious. With the PassInstrumentation framework, we can easily add `Before/After` callback functions that do the time tracing. Differential Revision: https://reviews.llvm.org/D131960	2022-09-11 05:42:55 -07:00
sunho	d1c4d96126	[ORC][ORC_RT][COFF] Remove public bootstrap method. Removes public bootstrap method that is not really necessary and not consistent with other platform API. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D132780	2022-09-10 15:25:50 +09:00
sunho	73c4033987	[ORC][ORC_RT][COFF] Support dynamic VC runtime. Supports dynamic VC runtime. It implements atexit handling, which is required to load msvcrt.lib successfully (the object file containing the atexit symbol somehow resolves to static VC runtime symbols). It also defaults to the dynamic VC runtime, which tends to be more robust. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D132525	2022-09-10 15:25:49 +09:00
Joe Loser	62b8a61d6c	[llvm] Remove includes of `llvm/Support/STLArrayExtras.h` `llvm` and downstream internal callers no longer use `array_lengthof`, so drop the include everywhere. Differential Revision: https://reviews.llvm.org/D133600	2022-09-09 17:44:00 -06:00
Fangrui Song	058f17d3af	[ADT] Move LLVM_DEPRECATED before type after D133502 `[[deprecated(...)]]` cannot appear between `inline size_t`.	2022-09-09 15:56:58 -07:00
Joe Loser	5758c824da	[ADT] Mark `llvm::array_lengthof` as deprecated As a follow-up of `5e96cea1db`, mark `llvm::array_lengthof` as deprecated in favor of using `std::size` function directly. Differential Revision: https://reviews.llvm.org/D133502	2022-09-09 15:31:00 -06:00
Augie Fackler	4fea8ee540	OpenMP: mark allocptr attribute on __kmpc_free_shared Differential Revision: https://reviews.llvm.org/D124491	2022-09-09 14:09:18 -04:00
Nikita Popov	a9f312c7f4	[AST] Use BatchAA in aliasesUnknownInst() (NFCI)	2022-09-09 15:54:48 +02:00
Sebastian Neubauer	c7750c522e	Add helper func to get first non-alloca position The LLVM performance tips suggest that allocas should be placed at the beginning of the entry block. So far, llvm doesn’t provide any helper to find that position. Add BasicBlock::getFirstNonPHIOrDbgOrAlloca and IRBuilder::SetInsertPointPastAllocas(Function*) that get an insert position after the (static) allocas at the start of a function and use it in ShadowStackGCLowering. Differential Revision: https://reviews.llvm.org/D132554	2022-09-09 15:39:53 +02:00
Namhyung Kim	43efb5e445	[llvm-objdump] Create name for fake sections It doesn't have a section header string table so add a vector to have the strings and create name based on the program header type and the index. Differential Revision: https://reviews.llvm.org/D131290	2022-09-09 12:27:07 +01:00
Serge Pavlov	7b9fae05b4	[Clang] Use virtual FS in processing config files Clang has support of virtual file system for the purpose of testing, but treatment of config files did not use it. This change enables VFS in it as well. Differential Revision: https://reviews.llvm.org/D132867	2022-09-09 18:24:45 +07:00
Serge Pavlov	55e1441f7b	Revert "[Clang] Use virtual FS in processing config files" This reverts commit `9424497e43`. Some buildbots failed, reverted for investigation.	2022-09-09 16:43:15 +07:00
Serge Pavlov	9424497e43	[Clang] Use virtual FS in processing config files Clang has support of virtual file system for the purpose of testing, but treatment of config files did not use it. This change enables VFS in it as well. Differential Revision: https://reviews.llvm.org/D132867	2022-09-09 16:28:51 +07:00
Fangrui Song	781dea021a	[Support] Rename DebugCompressionType::Z to Zlib "Z" was so named when we had both gABI ELFCOMPRESS_ZLIB and the legacy .zdebug support. Now we have just one zlib format, we should use the more descriptive name.	2022-09-08 16:11:29 -07:00
raghavmedicherla	5d3cf8267f	Revert "Support: Add mapped_file_region::sync(), equivalent to msync" This reverts commit `142f51fc2f`. This shouldn't be committed, it got committed accidentally.	2022-09-08 12:49:52 -04:00
Thomas Lively	ac3b8df8f2	[WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32` As proposed in https://github.com/WebAssembly/relaxed-simd/issues/77. Only an LLVM intrinsic and a clang builtin are implemented. Since there is no bfloat16 type, use u16 to represent the bfloats in the builtin function arguments. Differential Revision: https://reviews.llvm.org/D133428	2022-09-08 08:07:49 -07:00
Joe Loser	5e96cea1db	[llvm] Use std::size instead of llvm::array_lengthof LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpful for C++17 or later: `std::size` already has support for C-style arrays. Change call sites to use `std::size` instead. Differential Revision: https://reviews.llvm.org/D133429	2022-09-08 09:01:53 -06:00
Eric Wang	d8a2d3f7d4	[NFC][Regalloc] Introduce the RegAllocPriorityAdvisorAnalysis This patch introduces the priority analysis and the priority advisor, the default implementation, and the scaffolding for introducing the other implementations of the advisor. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D132835	2022-09-08 07:50:03 -07:00
David Spickett	e428baf001	[LLVM][ARM] Remove options for armv2, 2A, 3 and 3M Fixes #57486 These pre v4 architectures are not specifically supported by codegen. As demonstrated in the linked issue. GCC has not supported 3M since GCC 9 and presumably 2 and 2A earlier than that. So we are aligned in that sense. (see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2abd6e34fcf3bd9f9ffafcaa47cdc3ed443f9add) This removes the options and associated testing. The Pre_v4 build attribute remains mainly because its absence would be more confusing. It will not be used other than to complete the list of build attributes as shown in the ABI. https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#3352the-target-related-attributes Reviewed By: nickdesaulniers, peter.smith, rengolin Differential Revision: https://reviews.llvm.org/D133109	2022-09-08 09:49:48 +00:00
Nikita Popov	96cb7c2273	[ConstantExpr] Remove fneg expression As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179, this removes the fneg constant expression (which is, incidentally, the only unary operator expression). Differential Revision: https://reviews.llvm.org/D133418	2022-09-08 10:24:55 +02:00
Fangrui Song	b6e1fd761d	[llvm-objcopy] Support --{,de}compress-debug-sections for zstd Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal: https://groups.google.com/g/generic-abi/c/satyPkuMisk ("Add new ch_type value: ELFCOMPRESS_ZSTD") Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D130458	2022-09-08 00:59:14 -07:00
Fangrui Song	a41977dd0f	[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress} as high-level API on top of `llvm::compression::{zlib,zstd}::`: * getReasonIfUnsupported: return nullptr if the specified format is supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...` * compress: dispatch to zlib::compress or zstd::compress * decompress: dispatch to zlib::uncompress or zstd::uncompress Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic dependency. There are 40+ uses in llvm-project. Add another enum class `llvm::compression::Format` to represent supported compression formats, which may be a superset of ELF compression formats. See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use case. Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") --- Note: this patch alone will cause -Wswitch to llvm/lib/ObjCopy/ELF/ELFObject.cpp Reviewed By: ckissane, dblaikie Differential Revision: https://reviews.llvm.org/D130506	2022-09-08 00:58:55 -07:00
Nikita Popov	0444b40ed3	Revert "[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}" This reverts commit `19dc3cff0f`. This reverts commit `5b19a1f8e8`. This reverts commit `9397648ac8`. This reverts commit `10842b4475`. Breaks the GCC build, as reported here: https://reviews.llvm.org/D130506#3776415	2022-09-08 09:33:12 +02:00
Fangrui Song	10842b4475	[Support] Work around GCC's enum support	2022-09-08 00:13:25 -07:00
Fangrui Song	5b19a1f8e8	[llvm-objcopy] Support --{,de}compress-debug-sections for zstd Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal: https://groups.google.com/g/generic-abi/c/satyPkuMisk ("Add new ch_type value: ELFCOMPRESS_ZSTD") Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D130458	2022-09-07 23:53:40 -07:00
Fangrui Song	19dc3cff0f	[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress} as high-level API on top of `llvm::compression::{zlib,zstd}::`: * getReasonIfUnsupported: return nullptr if the specified format is supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...` * compress: dispatch to zlib::compress or zstd::compress * decompress: dispatch to zlib::uncompress or zstd::uncompress Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic dependency. There are 40+ uses in llvm-project. Add another enum class `llvm::compression::Format` to represent supported compression formats, which may be a superset of ELF compression formats. See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use case. Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Differential Revision: https://reviews.llvm.org/D130506	2022-09-07 23:53:14 -07:00
gonglingqin	d5f7a2182d	[LoongArch] Add codegen support for atomicrmw xchg operation on LA32 Depends on D131228 Differential Revision: https://reviews.llvm.org/D131229	2022-09-08 13:57:53 +08:00
gonglingqin	b60f801607	[LoongArch] Add codegen support for atomicrmw xchg operation on LA64 In order to avoid the patch being too large, the atomicrmw xchg operation on LA32 will be added later Differential Revision: https://reviews.llvm.org/D131228	2022-09-08 13:57:26 +08:00
Fangrui Song	f48931f3a8	[NewPM] Switch -filter-passes from ClassName to pass-name NewPM -filter-passes (D86360) uses ClassName instead of pass-name as used in `-passes`, `-print-after`, etc. D87216 has added a mechanism to map ClassName to pass-name. Adopt it for -filter-passes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D133263	2022-09-07 22:02:26 -07:00
Marco Elver	97c2220565	[SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass Introduces the SanitizerBinaryMetadata instrumentation pass which uses the new MD_pcsections metadata kinds to instrument certain types of instructions and functions required for breakpoint-based sanitizers. The first intended user of the binary metadata emitted will be a variant of GWP-TSan [1]. GWP-TSan will require information about atomic accesses; to unambiguously determine if an access is atomic or not, we also require "covered" information which code has been compiled with SanitizerBinaryMetadata instrumentation enabled. [1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D130887	2022-09-07 21:25:40 +02:00
Andrea Di Biagio	3262794804	[MCA] Correctly check pipeline availability for partially overlapping resource groups. This patch mostly reverts commit `70b37f4c03` which fixed PR50725. In case of explicit consumption of multiple partially overlapping group resources, the ResourceManager was not correctly checking pipeline resource availability. The fix for PR50725 only partially addressed a few instances of that issue. This is a more general (although, technically slower) fix for that same issue. It also fixes Issue #57548 Thanks to Haohai Wen for the small reproducer.	2022-09-07 12:17:59 +01:00
Marco Elver	343700358f	[AsmPrinter] Emit PCs into requested PCSections Interpret MD_pcsections in AsmPrinter emitting the requested metadata to the associated sections. Functions and normal instructions are handled. Differential Revision: https://reviews.llvm.org/D130879	2022-09-07 11:36:02 +02:00
Marco Elver	31a548021b	[GlobalISel] Propagate PCSections metadata to MachineInstr Propagate (most) PC sections metadata to MachineInstr when GlobalISel is doing instruction selection. This change results in support for architectures using GlobalISel (such as -O0 with AArch64). Not all instructions may be supported yet, and requires further target-specific handling (such as done for AArch64 pseudo-atomics). Expanding supported instructions is planned on a case-by-case basis and new use cases for PC sections metadata. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130886	2022-09-07 11:36:02 +02:00
Marco Elver	0ba8886af5	[FastISel] Propagate PCSections metadata to MachineInstr Propagate PC sections metadata to MachineInstr when FastISel is doing instruction selection. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130884	2022-09-07 11:36:01 +02:00
Nikita Popov	98a3a340c3	[ConstantExpr] Don't create fneg expressions Don't create fneg expressions unless explicitly requested by IR or bitcode.	2022-09-07 11:27:25 +02:00
Marco Elver	da695de628	[MachineInstrBuilder] Introduce MIMetadata to simplify metadata propagation In many places DebugLoc and PCSections metadata are just copied along to propagate them through MachineInstrs. Simplify doing so by bundling them up in a MIMetadata class that replaces the DebugLoc argument to most BuildMI() variants. The DebugLoc-only constructors allow implicit construction, so that existing usage of `BuildMI(.., DL, ..)` works as before, and the rest of the codebase using BuildMI() does not require changes. NFC. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130883	2022-09-07 11:22:50 +02:00
Marco Elver	4c58b00801	[SelectionDAG] Propagate PCSections through SDNodes Add a new entry to SDNodeExtraInfo to propagate PCSections through SelectionDAG. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130882	2022-09-07 11:22:50 +02:00
Jay Foad	1427d55d70	[TableGen] Document sequence with stride Document (in comments) the optional fourth "stride" argument to the sequence operator, which was added in svn r157416. Differential Revision: https://reviews.llvm.org/D133297	2022-09-07 09:58:22 +01:00
Vitaly Buka	4c18670776	[NFC][sancov] Rename ModuleSanitizerCoveragePass	2022-09-06 20:55:39 -07:00
Vitaly Buka	5e38b2a456	[NFC][msan] Rename ModuleMemorySanitizerPass	2022-09-06 20:30:35 -07:00
Vitaly Buka	93600eb50c	[NFC][asan] Rename ModuleAddressSanitizerPass	2022-09-06 15:02:11 -07:00
Vitaly Buka	e7bac3b9fa	[msan] Convert Msan to ModulePass The MemorySanitizerPass function pass violated requirement 4 of function passes: do not insert globals. Msan needs to insert globals for origin tracking and parameter tracking. https://llvm.org/docs/WritingAnLLVMPass.html#the-functionpass-class Reviewed By: kstoimenov, fmayer Differential Revision: https://reviews.llvm.org/D133336	2022-09-06 15:01:04 -07:00
Arthur Eubanks	7f57c97d30	[ThinLTOBitcodeWriter] Mark pass as required Or else with -opt-bisect-limit we don't write ThinLTO bitcode. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D133378	2022-09-06 14:47:34 -07:00
bzcheeseman	716b9f7a1a	[LLVM][Support/ADT] Add assert for isPresent to dyn_cast. This change adds an assert to dyn_cast that the value passed-in is present. In the past, this relied on the isa_impl assertion (which still works in many cases) but which we can tighten up for a better QoI. The PointerUnion change is because it seems like (based on the call sites) the semantics of the member dyn_cast are actually dyn_cast_if_present. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D133221	2022-09-06 13:58:56 -07:00
raghavmedicherla	142f51fc2f	Support: Add mapped_file_region::sync(), equivalent to msync Add mapped_file_region::sync(), equivalent to POSIX msync, synchronizing written content to disk without unmapping the region. Asserts if the mode is not mapped_file_region::readwrite. Note that I don't have access to a Windows machine, so I can't easily run those unit tests. Change by dexonsmith Differential Revision: https://reviews.llvm.org/D95494	2022-09-06 16:46:37 -04:00
Markus Böck	f049b2c3fc	[MC] Emit Stackmaps before debug info This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment. The gist of the issue is that Mach-O has restrictions on which kind of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. The problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections. This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionally not a problem however, as emitting an empty `StackMaps` is a no-op. Differential Revision: https://reviews.llvm.org/D132708	2022-09-06 20:20:56 +02:00
Joseph Huber	58645d3252	[OpenMP] Fix `omp_get_wtime` function being marked incorrectly as readonly OpenMP has a list of optimistic attributes that can be attached to known runtime functions to aid some analysis. The `omp_get_wtime` function incorrectly used the `readonly` attribute. This is not correct as the `omp_get_wtime` function changes values depending on some external state. This is more correctly modeled with `inaccessiblememonly`, meaning that the value does not depend on anything within the module, but cannot be removed as it depends on external state. Fixes #57578 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D133360	2022-09-06 12:59:00 -05:00
Jakub Kuderski	20573d11b7	[ADT] Remove is_splat `is_splat` is superseded by `all_equal` and marked as deprecated. See the discussion thread for more details: https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D132336	2022-09-06 13:49:26 -04:00
Matthias Gehre	7948d89afe	Fix "[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64" compilation on Windows	2022-09-06 16:11:14 +01:00
Marco Elver	7d63983c65	[SelectionDAG] Properly copy ExtraInfo on RAUW During SelectionDAG legalization SDNodes with associated extra info may be replaced with a new SDNode. Preserve associated extra info on ReplaceAllUsesWith and remove entries in DeallocateNode. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130881	2022-09-06 16:32:50 +02:00
Marco Elver	cc3faf4226	[SelectionDAG] Rename CallSiteDbgInfo to NodeExtraInfo For information infrequently attached to SDNodes, it is useful to provide a way to add this information out-of-line. This is already done for call-site specific information. Rename CallSiteDbgInfo to NodeExtraInfo in preparation of adding additional information not necessarily related to call sites only. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130880	2022-09-06 16:32:50 +02:00
Matthias Gehre	2090e85fee	[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64 This adds the ExpandLargeDivRem to the default pass pipeline. The limit at which it expands div/rem instructions is configured via a new TargetTransformInfo hook (default: no expansion) X86, Arm and AArch64 backends implement this hook to expand div/rem instructions with more than 128 bits. Differential Revision: https://reviews.llvm.org/D130076	2022-09-06 15:32:04 +01:00
Joseph Huber	5dbc7cf7ca	[Object] Refactor code for extracting offload binaries We currently extract offload binaries inside of the linker wrapper. Other tools may wish to do the same extraction operation. This patch simply factors out this handling into the `OffloadBinary.h` interface. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D132689	2022-09-06 08:55:16 -05:00
Marco Elver	42836e283f	[MachineInstr] Allow setting PCSections in ExtraInfo Provide MachineInstr::setPCSection(), to propagate relevant metadata through the backend. Use ExtraInfo to store the metadata. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130876	2022-09-06 15:52:44 +02:00
Marco Elver	c70f6e1362	[Metadata] Introduce MD_pcsections Introduces MD_pcsections metadata kind. See added documentation for more details. Subsequent patches enable propagating PC sections metadata through code generation to the AsmPrinter. RFC: https://discourse.llvm.org/t/rfc-pc-keyed-metadata-at-runtime/64191 Reviewed By: dvyukov, vitalybuka Differential Revision: https://reviews.llvm.org/D130875	2022-09-06 15:52:44 +02:00
Amara Emerson	3dd861818a	[GlobalISel] Combine G_INSERT/EXTRACT_VECTOR_ELT with out of bounds indices to undef. Differential Revision: https://reviews.llvm.org/D133309	2022-09-06 13:45:04 +01:00
luxufan	2e7aed1947	[MemorySSA][NFC] Simplify if condition Differential Revision: https://reviews.llvm.org/D133332	2022-09-05 10:43:17 +00:00
Eli Friedman	63335afb4e	[ARM64EC 2/?] Add target triple, and allow targeting it. Part of patchset to add initial support for ARM64EC. Per discussion on review, using the triple arm64ec-pc-windows-msvc. The parsing works the same way as Apple's alternate Arm ABI "arm64e". Differential Revision: https://reviews.llvm.org/D125412	2022-09-05 12:27:10 -07:00
Eli Friedman	488ad99ecf	[ARM64EC 1/?] Add parsing support to llvm-objdump/llvm-readobj. This is the first patch of a patchset to add initial support for ARM64EC. Basic documentation is available at https://docs.microsoft.com/en-us/windows/uwp/porting/arm64ec-abi . (Discourse post: https://discourse.llvm.org/t/initial-patches-for-arm64ec-windows-11-now-posted/62449 .) The file format for ARM64EC is basically identical to normal ARM64. There are a few extra sections, but the existing code for reading ARM64 object files just works. Differential Revision: https://reviews.llvm.org/D125411	2022-09-05 12:25:08 -07:00
Joseph Huber	c1d19a8489	[ELF] Provide the GNU hash function in libObject GNU uses a different hashing function compared to the sys-V standard function already provided in libObject. This is already used internally in LLD for generating synthetic sections. This patch simply extracts this definition and makes it available to other users of `libObject`. This is done in preparation for supporting symbol name lookups via the GNU hash table. Reviewed By: MaskRay, jhenderson Differential Revision: https://reviews.llvm.org/D132696	2022-09-05 11:04:57 -05:00
Kazu Hirata	2bb43d72d9	[ADT] Use std::tuple_element_t (NFC)	2022-09-03 23:27:24 -07:00
Kazu Hirata	03c3c2db10	[llvm] Use std::remove_reference_t (NFC)	2022-09-03 23:27:22 -07:00
Kazu Hirata	230e57d221	[ADT] Use std::add_pointer_t (NFC)	2022-09-03 23:27:18 -07:00
Kazu Hirata	9dc6223117	[ADT] Use std::add_lvalue_reference_t (NFC)	2022-09-03 23:27:17 -07:00
Kazu Hirata	2423cf4f88	[Support] Simplify reverseBits with constexpr if (NFC) Differential Revision: https://reviews.llvm.org/D132814	2022-09-03 23:27:15 -07:00
Kazu Hirata	ee40ef7aaf	[Support] Simplify isInt and isUInt with constexpr if (NFC) Differential Revision: https://reviews.llvm.org/D132813	2022-09-03 23:27:13 -07:00
Kazu Hirata	86e8164a8f	[llvm] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-09-03 11:17:49 -07:00
Kazu Hirata	32aa35b504	Drop empty string literals from static_assert (NFC) Identified with modernize-unary-static-assert.	2022-09-03 11:17:47 -07:00
Kazu Hirata	baee196abb	[llvm] Use std::remove_const_t (NFC)	2022-09-03 11:17:45 -07:00
Kazu Hirata	9eca5ed790	[llvm] Use std::enable_if_t (NFC)	2022-09-03 11:17:44 -07:00
Kazu Hirata	a7a2872bb7	[ADT] Use std::add_const_t (NFC)	2022-09-03 11:17:42 -07:00
Simon Pilgrim	e2d140e9c3	[TTI] Add isExpensiveToSpeculativelyExecute wrapper CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if it's better to move an expensive instruction used in a select behind a branch instead. This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86 as we need to use TCK_SizeAndLatency as an uop count (so it's compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example) - by adding an isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.	2022-09-03 13:12:22 +01:00
Alexey Lapshin	79c8f51c34	[DWARFLinker] Refactor clang modules loading code. Current implementation of registerModuleReference() function not only "registers" module reference, but also clones referenced module (inside loadClangModule()). That may lead to cloning the module with incorrect options (registerModuleReference() examines module references and additionally accumulates MaxDwarfVersion and accel tables info). Since accumulated options may differ from the current values, it is incorrect to clone modules before options are fully accumulated. This patch separates "cloning" code from "registering" code. So, that accumulating option is done in the "registering stage" and "cloning" is done after all modules are registered and options accumulated. It also adds a callback for loaded compile units which can be used for D132755 and D132371(to allow doing options accumulation outside of DWARFLinker). Differential Revision: https://reviews.llvm.org/D133047	2022-09-03 11:23:52 +03:00
Craig Topper	5cf510115a	[VP] Correct LEGALPOS for more VP nodes. LEGALPOS appears to only be used by LegalizeVectorOps. It needs to point at a vector operand. Stores need to point at the second operand since the result and the first operand are MVT::Other. Reductions need to point at the second operand since the result and the first operand are scalars. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D133048	2022-09-02 08:04:28 -07:00
Kadir Cetinkaya	4940f205d4	[llvm][Support] Add DenseMapInfo for std::variant Differential Revision: https://reviews.llvm.org/D133200	2022-09-02 15:36:10 +02:00
Simon Pilgrim	7338f9709b	[TTI] Improve description of TargetCostKind enums to aid targets in choosing cost values I'm not sure how much to add to the description as we've tried to allow targets to interpret the TargetCostKind enums in their own way. But we need to make it clear that certain cost kinds need to match threshold numbers used by various passes (and vice-versa when passes are determining a cost-benefit threshold). I'm not keen on the "The weighted sum of size and latency" description, but it's very difficult to come up with anything else that's suitably generic (e.g. X86 will use uop counts here to easily work with LoopMicroOpBufferSize thresholds, even though high latency fdiv/fsqrt instructions still often have low uop counts). Differential Revision: https://reviews.llvm.org/D132288	2022-09-02 11:09:06 +01:00
Lang Hames	6ca9f42189	[ORC][ORC-RT] Consistently use pointed-to type as template arg to wrap/unwrap. Saves wrap/unwrap implementers from having to use std::remove_pointer_t to get at the pointed-to type.	2022-09-01 20:54:24 -07:00
Lang Hames	06c4634483	[JITLink] Sink ELFX86RelocationKind into implementation file (ELF_x86_64.cpp). The ELF/x86-64 backend uses the generic x86_64 edges now, so the ELFX86RelocationKind is just an implementation detail.	2022-09-01 13:36:49 -07:00
Simon Pilgrim	e5804a5a61	[ADT] bit.h - replace <stdint.h> with <cstdint> This is a C++ header after all.	2022-09-01 20:44:56 +01:00
Fangrui Song	8d95fd7e56	[MachineFunctionPass] Support -filter-passes for -print-changed -filter-passes specifies a `PassID` (a lower-case dash-separated pass name, also used by -print-after, -stop-after, etc) instead of a CamelCasePass. `-filter-passes=CamelCaseNewPMPass` seems like a workaround for new PM passes before we can use lower-case dash-separated pass names (as used by `-passes=`). Example: ``` # getPassName() is "IRTranslator". PassID is "irtranslator" llc -mtriple=aarch64 -print-changed -filter-passes=irtranslator < print-changed-machine.ll ``` Close https://github.com/llvm/llvm-project/issues/57453 Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D133055	2022-09-01 11:06:06 -07:00
Wei Yi Tee	f6b66cbc7d	[llvm][Testing/ADT] Implement `IsStringMapEntry` testing matcher for verifying the entries in a `StringMap`. Reviewed By: gribozavr2, ymandel, sgatev Differential Revision: https://reviews.llvm.org/D132753	2022-09-01 17:30:41 +00:00
Ilia Diachkov	698c800142	[SPIRV] support builtin types and ExtInsts selection The patch adds the support of OpenCL and SPIR-V built-in types. It also implements ExtInst selection and adds spv_unreachable and spv_alloca intrinsics which improve the generation of the corresponding SPIR-V code. Five LIT tests are included to demonstrate the improvement. Differential Revision: https://reviews.llvm.org/D132648 Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com> Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com> Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>	2022-09-01 16:44:54 +03:00
Amara Emerson	4cf3db41da	[GlobalISel] Add sdiv exact (X, constant) -> mul combine. This port of the SDAG optimization is only for exact sdiv case. Differential Revision: https://reviews.llvm.org/D130517	2022-09-01 13:34:00 +01:00
David Spickett	6829cd17b5	[LLVM] Add missing stdint include to Bit.h To fix failing builds on Windows on Arm: https://lab.llvm.org/staging/#/builders/59/builds/928/steps/4/logs/stdio <...>/ADT/bit.h(50,5): error: unknown type name 'uint32_t' uint32_t v = Value; ^	2022-09-01 09:17:35 +00:00
Sam Clegg	92920c4fe3	[MC][WebAssembly] Allow accurate errors in doBeforeLabelEmit Although we only currently have one error produced in this function I am working on changes right now that add some more. This change makes the error location more accurate. Differential Revision: https://reviews.llvm.org/D133016	2022-09-01 01:26:33 -07:00
Arthur Eubanks	04f3c20989	[NFC][LICM] Stop passing around unused BFI Uses of this were removed in `1a25d0bfbb`.	2022-08-31 19:15:34 -07:00
Mark Zhuang	62454e83b0	[NFC] Fix typo Reviewed By: eopXD Differential Revision: https://reviews.llvm.org/D133079	2022-08-31 19:08:46 -07:00
Craig Topper	8dce3507a0	[VP] Correct the LEGALPOS for VP_STORE. VP_STORE has a Chain for operand 0, so the LEGALPOS should be 1. VP_STORE is always considered Legal for MVT::Other. So I suspect this was causing vp_store to be ignored by LegalizeVectorOps and instead handled in LegalizeDAG. VP_LOAD is Custom expanded in LegalizeVectorOps for RISC-V. Differential Revision: https://reviews.llvm.org/D132972	2022-08-31 11:15:47 -07:00
Wei Yi Tee	d45c04da7c	[llvm][ADT] Overload output stream operator `<<` for `StringMapEntry` and `StringMap`. Printing support enables the production of more useful error messages in unit testing e.g. when using matchers such as `UnorderedElementsAre()` to inspect the contents of a `StringMap`. Reviewed By: gribozavr2, sgatev, ymandel Differential Revision: https://reviews.llvm.org/D132747	2022-08-31 17:37:58 +00:00
Arthur Eubanks	d0b9c9c0a3	[NFC] clang-format Any.h To trigger some bots. Differential Revision: https://reviews.llvm.org/D133033	2022-08-31 10:21:30 -07:00
Daniel Thornburgh	ea99225521	[Symbolizer] Handle {{{bt}}} symbolizer markup element. This adds support for backtrace generation to the llvm-symbolizer markup filter, which is likely the largest use case. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D132706	2022-08-31 09:49:32 -07:00
Daniel Bertalan	f7b752d277	[lld-macho] Set the SG_READ_ONLY flag on __DATA_CONST This flag instructs dyld to make the segment read-only after fixups have been performed. I'm not sure why this flag is needed, as on macOS 13 beta at least, __DATA_CONST is read-only even without this flag; but ld64 sets it as well. Differential Revision: https://reviews.llvm.org/D133010	2022-08-31 17:04:20 +02:00
Hassnaa Hamdi	a6d9c944df	[AArch64 - SVE]: Use SVE to lower reduce.fadd. Differential Revision: https://reviews.llvm.org/D132573 skip custom-lowering for v1f64 to be expanded instead, because it has only one lane Differential Revision: https://reviews.llvm.org/D132959	2022-08-31 12:31:06 +00:00
Alvin Wong	12d865415f	[COFF] Use the more accurate GuardFlags definition everywhere This also modifies llvm-readobj to be more future-proof when printing the guard FIDs table by calculating the entry size correctly according to MS docs. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D132924	2022-08-31 15:11:34 +03:00
Alvin Wong	94baaa6a5c	[llvm-readobj][COFF] Print load config GuardFlags as enum flags Print flags as documented in MS docs. https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#load-configuration-layout https://docs.microsoft.com/en-us/windows/win32/secbp/pe-metadata EH_CONTINUATION_TABLE_PRESENT is not mentioned in the docs but is instead taken from Windows SDK headers. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D132823	2022-08-31 15:01:57 +03:00
Nikita Popov	972840aa3b	[IR] Add Instruction::getInsertionPointAfterDef() Transforms occasionally want to insert an instruction directly after the definition point of a value. This involves quite a few different edge cases, e.g. for phi nodes the next insertion point is not the next instruction, and for invokes and callbrs it's not even in the same block. Additionally, the insertion point may not exist at all if catchswitch is involved. This adds a general Instruction::getInsertionPointAfterDef() API to implement the necessary logic. For now it is used in two places where this should be mostly NFC. I will follow up with additional uses where this fixes specific bugs in the existing implementations. Differential Revision: https://reviews.llvm.org/D129660	2022-08-31 10:50:10 +02:00
Daniel Bertalan	389e0a81a1	[lld-macho] Support synthesizing __TEXT,__init_offsets This section stores 32-bit `__TEXT` segment offsets of initializer functions, and is used instead of `__mod_init_func` when chained fixups are enabled. Storing the offsets lets us avoid emitting fixups for the initializers. Differential Revision: https://reviews.llvm.org/D132947	2022-08-31 10:13:45 +02:00
Greg Clayton	ea9ac3519c	An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests. Resubmission of https://reviews.llvm.org/D126254 where decodeBase64Byte is no longer a lambda but a static function. Some compilers have different errors or warnings with respect to what needs to be captured and what doesn't (see comments in https://reviews.llvm.org/D126254 for details). Differential Revision: https://reviews.llvm.org/D128560	2022-08-30 15:52:08 -07:00
Pavel Samolysov	88581db62f	[LazyCallGraph] Reformat the code in accordance with the code style. NFC Also, some local variables were renamed in accordance with the code style as well as `std::tie` occurrences and `.first`/`.second` member uses were replaced with structure bindings. Differential Revision: https://reviews.llvm.org/D132806	2022-08-30 11:06:42 +03:00
Rong Xu	7bc182ed8a	fix buildbot build error.	2022-08-29 17:01:27 -07:00
Rong Xu	d7ef0c3970	[llvm-profdata] Improve profile supplementation Current implementation promotes a non-cold function in the SampleFDO profile into a hot function in the FDO profile. This is too aggressive. This patch promotes a hot function in the SampleFDO profile into a hot function, and a warm function in SampleFDO into a warm function in FDO. Differential Revision: https://reviews.llvm.org/D132601	2022-08-29 16:50:42 -07:00
Rong Xu	db18f26567	[llvm-profdata] Handle internal linkage functions in profile supplementation This patch has the following changes: (1) Handling of internal linkage functions (static functions) Static functions in FDO have a prefix of source file name, while they do not have one in SampleFDO. Current implementation does not handle this and we are not updating the profile for static functions. This patch fixes this. (2) Handling of -funique-internal-linkage-names Again this is for the internal linkage functions. Option -funique-internal-linkage-names can now be applied to both FDO and SampleFDO compilation. When it is used, it demangles internal linkage function names and adds a hash value as the postfix. When both SampleFDO and FDO profiles use this option, or neither uses it, changes in (1) should handle this. Here we also handle the case where the SampleFDO profile uses this option while the FDO profile does not, or vice versa. There is one case where this patch won't work: if one of the profiles uses mangled names and the other does not. For example, if the SampleFDO profile is built with the clang C compiler without -funique-internal-linkage-names, while the FDO profile uses -funique-internal-linkage-names. The SampleFDO profile contains unmangled names while the FDO profile contains mangled names. If both profiles use the C++ compiler, this won't happen. We think this use case is rare and does not justify the effort to fix. Differential Revision: https://reviews.llvm.org/D132600	2022-08-29 16:15:12 -07:00
Craig Topper	2f811a6c7f	[VP][RISCV] Add vp.fabs intrinsic and RISC-V support. Mostly just modeled after vp.fneg except there is a "functional instruction" for fneg while fabs is always an intrinsic. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D132793	2022-08-29 09:32:06 -07:00
Wei Yi Tee	72ebcf1a53	[llvm][ADT] Fix formatting for files relevant to `StringMap`. Differential Revision: https://reviews.llvm.org/D132744	2022-08-29 06:57:29 +00:00
Wei Yi Tee	af6a35597f	Revert "[llvm][ADT] Fix formatting for files relevant to `StringMap`." This reverts commit `d23df9c9e8`. Revert due to missing review link.	2022-08-29 06:43:48 +00:00
Wei Yi Tee	d23df9c9e8	[llvm][ADT] Fix formatting for files relevant to `StringMap`.	2022-08-29 06:40:07 +00:00
Kazu Hirata	8feb60756c	[llvm] Use range-based for loops (NFC)	2022-08-28 23:28:58 -07:00
Kazu Hirata	87c38323a2	[Support] Remove greatestCommonDivisor and GreatestCommonDivisor64 (NFC) This patch removes greatestCommonDivisor and GreatestCommonDivisor64 as I've migrated all the uses to std::gcd.	2022-08-28 17:35:08 -07:00
Kazu Hirata	ec8605ff52	[llvm] Use std::is_unsigned instead of std::numeric_limits (NFC)	2022-08-28 17:35:06 -07:00
Kazu Hirata	ce9f007c7c	[llvm] Use llvm::find_if (NFC)	2022-08-28 10:41:48 -07:00
Daniel Bertalan	47e4663c4e	[llvm-objdump] Add -dyld_info to llvm-otool This option outputs the location, encoded value and target of chained fixups, using the same format as `otool -dyld_info`. This initial implementation only supports the DYLD_CHAINED_PTR_64 and DYLD_CHAINED_PTR_64_OFFSET pointer encodings, which are used in x86_64 and arm64 userspace binaries. When Apple's effort to upstream their chained fixups code continues, we'll replace this code with the then-upstreamed code. But we need something in the meantime for testing ld64.lld's chained fixups code. Differential Revision: https://reviews.llvm.org/D132036	2022-08-28 09:22:41 +02:00
Benjamin Kramer	1bcf21ca7f	Use std::uninitialized_move where appropriate. NFCI.	2022-08-27 14:56:43 +02:00
Anubhab Ghosh	c69df92b4f	[Orc] Use MapperJITLinkMemoryManager with InProcessMapper in llvm-jitlink tool MapperJITLinkMemoryManager has slab allocation. Combined with InProcessMapper, it can replace InProcessMemoryManager. It can also replace JITLinkSlabAllocator through the InProcessDeltaMapper that adds an offset to the executor addresses for use in tests. Differential Revision: https://reviews.llvm.org/D132315	2022-08-27 11:07:09 +05:30
Lang Hames	f828135f91	Reapply "[ORC] Add "wrap" and "unwrap" steps to ExecutorAddr..." with fixes. Reapplies `f14cb494a3` (which was reverted in `2f08f8426c`) with a fix for UB in the ExecutorAddr::Unwrap::Unwrap constructor (which caused failures on some bots).	2022-08-26 14:53:51 -07:00
Lang Hames	2f08f8426c	Revert "[ORC] Add "wrap" and "unwrap" steps to ExecutorAddr toPtr/fromPtr." This reverts commit `f14cb494a3`. Reverting while I investigate bot failures, e.g. https://lab.llvm.org/buildbot#builders/117/builds/8701	2022-08-26 13:54:30 -07:00
Florian Hahn	9405af1c85	[LAA] Require AddRecs to be in the innermost loop for diff-checks. The simpler diff-checks require pointers with add-recs from the same innermost loop, but this property wasn't checked completely. Add the missing check to ensure both addrecs are in the innermost loop. Fixes #57315.	2022-08-26 20:39:52 +01:00
Lang Hames	f14cb494a3	[ORC] Add "wrap" and "unwrap" steps to ExecutorAddr toPtr/fromPtr. The wrap/unwrap operations are applied to pointers after/before conversion to/from raw addresses. They can be used to tag, untag, sign, or strip signing from pointers. They currently default to 'rawPtr' (identity) on all platforms, but it is expected that the default will be set based on the host architecture, e.g. they would default to signing/stripping for arm64e.	2022-08-26 12:32:44 -07:00
Daniil Fukalov	9c710ebbdb	[TTI] NFC: Reduce InstructionCost::getValue() usage... in order to propagate `InstructionCost` value upper. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D103406	2022-08-26 16:37:32 +03:00
Benjamin Kramer	2c1796b3d6	[ADT] GCC 7 doesn't have constexpr char_traits, add a workaround LLVM still supports GCC 7. This workaround can be removed when GCC 8 becomes the oldest supported GCC version. Fixes #57057	2022-08-26 14:11:21 +02:00
Matthias Gehre	3e39b27101	[llvm/CodeGen] Add ExpandLargeDivRem pass Adds a pass ExpandLargeDivRem to expand div/rem instructions with more than 128 bits into a loop computing that value. As discussed on https://reviews.llvm.org/D120327, this approach has the advantage that it is independent of the runtime library. This also helps the clang driver, which otherwise would need to understand enough about the runtime library to know whether to allow _BitInts with more than 128 bits. Targets are still free to disable this pass and instead provide a faster implementation in a runtime library. Fixes https://github.com/llvm/llvm-project/issues/44994 Differential Revision: https://reviews.llvm.org/D126644	2022-08-26 11:55:15 +01:00
Matthias Gehre	6d13b80fcb	Revert "[SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers" This reverts https://reviews.llvm.org/D120329. I abandoned the PR [0] to add __divei4 functions to compiler-rt in favor of adding a pass to transform div/rem [1]. This removes the backend code that was supposed to emit calls to the __divei4 functions. [0] https://reviews.llvm.org/D120327 [1] https://reviews.llvm.org/D130076 Differential Revision: https://reviews.llvm.org/D130079	2022-08-26 10:52:56 +01:00
Alex Richardson	0483b00875	Mark the $local function begin symbol as a function While this does not matter for most targets, when building for Arm Morello, we have to mark the symbol as a function and add size information, so that LLD can correctly evaluate relocations against the local symbol. Since Morello is an out-of-tree target, I tried to reproduce this with in-tree backends and with the previous reviews applied this results in a noticeable difference when targeting Thumb. Background: Morello uses a method similar to Thumb where the encoding mode is specified in the LSB of the symbol. If we don't mark the target as a function, the relocation will not have the LSB set and calls will end up using the wrong encoding mode (which will almost certainly crash). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D131429	2022-08-26 09:34:04 +00:00
Nicolai Hähnle	5e812e9580	Revert "ManagedStatic: remove from DebugCounter" This reverts commit `b5b6ef1500`.	2022-08-26 11:02:58 +02:00
Nicolai Hähnle	b5b6ef1500	ManagedStatic: remove from DebugCounter Follow the pattern used in MLIR for the cl::opt instances. v2: - make DebugCounter::isCountingEnabled public so that the DebugCounterOwner doesn't have to be a nested class. This simplifies later changes v3: - remove the indirection via DebugCounterOwner::instance() Differential Revision: https://reviews.llvm.org/D129116	2022-08-26 09:22:11 +02:00
Joe Loser	77eac32716	[ADT] Make `llvm::identity` a transparent function object `llvm::identity` is similar to `std::identity` from C++20, but one surprising thing is that `llvm::identity` is not a transparent function object. Add the `is_transparent` type alias to denote it can be used as a transparent function object. Differential Revision: https://reviews.llvm.org/D132628	2022-08-25 21:06:42 -06:00
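A minimal sketch of what the `is_transparent` alias conveys, assuming a function object modeled on C++20 `std::identity`. `identity_sketch` and `is_transparent_fn` are hypothetical names for illustration and are not the definitions in the LLVM tree.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an identity function object.
struct identity_sketch {
  // The mere presence of this alias marks the object as transparent.
  using is_transparent = void;

  template <typename T>
  constexpr T &&operator()(T &&value) const noexcept {
    return std::forward<T>(value);
  }
};

// Hypothetical trait: detects whether F declares is_transparent.
template <typename F, typename = void>
struct is_transparent_fn : std::false_type {};

template <typename F>
struct is_transparent_fn<F, std::void_t<typename F::is_transparent>>
    : std::true_type {};

// A function object without the alias, for comparison.
struct opaque_sketch {};
```

Generic code can test for the alias at compile time, which is how the standard associative containers decide whether to enable heterogeneous lookup for a comparator.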
Nicolai Hähnle	a0a2ddfcc5	Revert "ManagedStatic: remove from DebugCounter" This reverts commit `51d82502d9`. There is a regression in the flang-aarch64-dylib buildbot which is most likely caused by this change. Reverting until I can investigate.	2022-08-25 19:45:04 +02:00
Nicolai Hähnle	af2e54992d	[Timer][Statistics] Make global constructor ordering more robust It was observed in D129117 that the subtle dependency between statistic and timer code is not entirely robust: the global destructor ~StatisticInfo indirectly calls CreateInfoOutputFile, which requires the LibSupportInfoOutputFilename to not have been destructed. By constructing LibSupportInfoOutputFilename before the StatisticInfo object, the order of destruction is guaranteed. Differential Revision: https://reviews.llvm.org/D131059	2022-08-25 19:09:49 +02:00
Nicolai Hähnle	51d82502d9	ManagedStatic: remove from DebugCounter Follow the pattern used in MLIR for the cl::opt instances. v2: - make DebugCounter::isCountingEnabled public so that the DebugCounterOwner doesn't have to be a nested class. This simplifies later changes Differential Revision: https://reviews.llvm.org/D129116	2022-08-25 19:09:48 +02:00
Dan McGregor	3922ec46b8	[MCContext] Reverse order of DebugPrefixMap sort for generated assembly debug info Match Clang's sorting, so that longer (more specific) prefix paths will match before less specific paths. Reviewed By: MaskRay, raj.khem, #debug-info Differential Revision: https://reviews.llvm.org/D132390	2022-08-24 21:43:41 -07:00
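A minimal sketch of why reverse ordering makes the more specific prefix win, assuming the map is a `std::map` ordered with `std::greater`. `PrefixMap` and `remapPath` are hypothetical names and are not taken from MCContext.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Keys are ordered in descending lexicographic order, so "/src/lib"
// is visited before its own prefix "/src".
using PrefixMap = std::map<std::string, std::string, std::greater<>>;

// Replaces the first matching prefix. This is plain string-prefix
// matching, not path-component-aware matching.
inline std::string remapPath(const PrefixMap &prefixes,
                             const std::string &path) {
  for (const auto &[from, to] : prefixes) {
    if (path.compare(0, from.size(), from) == 0)
      return to + path.substr(from.size());
  }
  return path; // no prefix matched
}

inline void remapPathExample() {
  PrefixMap prefixes{{"/src", "/A"}, {"/src/lib", "/B"}};
  // The longer prefix is tried first, so it takes precedence.
  assert(remapPath(prefixes, "/src/lib/x.c") == "/B/x.c");
}
```

With the default ascending order, `/src` would be visited first and would shadow `/src/lib`.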
Valery N Dmitriev	a4c8fb9d1f	[SLP][NFC] Refactor SLPVectorizerPass::vectorizeRootInstruction method. The goal is to separate collecting items for post-processing from processing them. Post-processing is also outlined as a dedicated method. Differential Revision: https://reviews.llvm.org/D132603	2022-08-24 17:07:53 -07:00
Mircea Trofin	5ce4c9aa04	[mlgo] Use TFLite for 'development' mode. TFLite is a lightweight, statically linkable[1], model evaluator, supporting a subset of what the full tensorflow library does, sufficient for the types of scenarios we envision having. It is also faster. We still use saved models as "source of truth" - 'release' mode's AOT starts from a saved model; and the ML training side operates in terms of saved models. Using TFLite solves the following problems compared to using the full TF C API: - a compiler-friendly implementation for runtime-loadable (as opposed to AOT-embedded) models: it's statically linked; it can be built via cmake; - solves an issue we had when building the compiler with both AOT and full TF C API support, whereby, due to a packaging issue on the TF side, we needed to have the pip package and the TF C API library at the same version. We have no such constraints now. The main liability is it supporting a subset of what the full TF framework does. We do not expect that to cause an issue, but should that be the case, we can always revert back to using the full framework (after also figuring out a way to address the problems that motivated the move to TFLite). Details: This change switches the development mode to TFLite. Models are still expected to be placed in a directory - i.e. the parameters to clang don't change; what changes is the directory content: we still need an `output_spec.json` file; but instead of the saved_model protobuf and the `variables` directory, we now just have one file, `model.tflite`. The change includes a utility showing how to take a saved model and convert it to TFLite, which it uses for testing. The full TF implementation can still be built (not side-by-side). We intend to remove it shortly, after patching downstream dependencies. The build behavior, however, prioritizes TFLite - i.e. trying to enable both full TF C API and TFLite will just pick TFLite. 
[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!	2022-08-24 16:07:24 -07:00
Sami Tolvanen	cff5bef948	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. 
See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Relands `67504c9549` with a fix for 32-bit builds. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 22:41:38 +00:00
Peter Cooper	6113998069	Add MachO MH_FILESET support to objdump https://reviews.llvm.org/D131909	2022-08-24 13:34:43 -07:00
Sami Tolvanen	a79060e275	Revert "KCFI sanitizer" This reverts commit `67504c9549` as using PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.	2022-08-24 19:30:13 +00:00
Sami Tolvanen	67504c9549	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. 
See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 18:52:42 +00:00
Daniel Bertalan	686d8ce1ab	[llvm-objdump] Complete -chained_fixups support This commit adds definitions for the `dyld_chained_import*` structs. The imports array is now printed with `llvm-otool -chained_fixups`. This completes this option's implementation. A slight difference from cctools otool is that we don't yet dump the raw bytes of the imports entries. When Apple's effort to upstream their chained fixups code continues, we'll replace this code with the then-upstreamed code. But we need something in the meantime for testing ld64.lld's chained fixups code. Differential Revision: https://reviews.llvm.org/D131982	2022-08-24 19:29:11 +02:00
spupyrev	8d5b694da1	extending code layout alg The diff modifies ext-tsp code layout algorithm in the following ways: (i) fixes merging of cold block chains (this is a port of D129397); (ii) adjusts the cost model utilized for optimization; (iii) adjusts some APIs so that the implementation can be used in BOLT; this is a prerequisite for D129895. The only non-trivial change is (ii). Here we introduce different weights for conditional and unconditional branches in the cost model. Based on the new model it is slightly more important to increase the number of "fall-through unconditional" jumps, which makes sense, as placing two blocks with an unconditional jump next to each other reduces the number of jump instructions in the generated code. Experimentally, this makes a mild impact on the performance; I've seen up to 0.2%-0.3% perf win on some benchmarks. Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D129893	2022-08-24 09:40:25 -07:00
Fangrui Song	3b4d800911	[ELF] Parallelize writes of different OutputSections We currently process one OutputSection at a time and for each OutputSection write contained input sections in parallel. This strategy does not leverage multi-threading well. Instead, parallelize writes of different OutputSections. The default TaskSize for parallelFor often leads to inferior sharding. We prepare the task in the caller instead. * Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup * Add llvm::parallel::TaskGroup::execute. * Change writeSections to declare TaskGroup and pass it to writeTo. Speed-up with --threads=8: * clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast * clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast * chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast * scylladb build/release: 1.09x as fast On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast. Differential Revision: https://reviews.llvm.org/D131247	2022-08-24 09:40:03 -07:00
Jonas Devlieghere	e854c17b02	[llvm] Teach LLVM about filesets Teach LLVM about filesets. Filesets were added in macOS 11 (Big Sur) to combine multiple Mach-O files. They introduce a new load command (LC_FILESET_ENTRY) consisting of a fileset_entry_command. struct fileset_entry_command { uint32_t cmd; /* LC_FILESET_ENTRY */ uint32_t cmdsize; /* includes entry_id string */ uint64_t vmaddr; /* memory address of the entry */ uint64_t fileoff; /* file offset of the entry */ union lc_str entry_id; /* contained entry id */ uint32_t reserved; /* reserved */ }; This patch teaches LLVM about the new load command and the corresponding data. Differential revision: https://reviews.llvm.org/D132432	2022-08-24 09:33:45 -07:00
Simon Pilgrim	f9de13232f	[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis. For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling. Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU. Differential Revision: https://reviews.llvm.org/D132520	2022-08-24 17:28:18 +01:00
Pierre van Houtryve	59cf9dd923	[AMDGPU][GISel] Enable Selection of ADD3 for G_PTR_ADD Allows things like `(G_PTR_ADD (G_PTR_ADD a, b), c)` to be simplified into a single ADD3 instruction instead of two adds. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D131254	2022-08-24 14:44:19 +00:00
Alex Richardson	38107171ed	[RegisterInfoEmitter] Generate isConstantPhysReg(). NFCI This commit moves the information on whether a register is constant into the Tablegen files to allow generating the implementation of isConstantPhysReg(). I've marked isConstantPhysReg() as final in this generated file to ensure that changes are made to tablegen instead of overriding this function, but if that turns out to be too restrictive, we can remove the qualifier. This should be pretty much NFC, but I did notice that e.g. the AMDGPU generated file also includes the LO16/HI16 registers now. The new isConstant flag will also be used by D131958 to ensure that constant registers are marked as call-preserved. Differential Revision: https://reviews.llvm.org/D131962	2022-08-24 14:16:20 +00:00
Teresa Johnson	d10c1b88f0	[memprof] Correct max size and access count computations The existing code resulted in the max size and access counts being equal to the min. Compute the max instead (max lifetime was already correct). Differential Revision: https://reviews.llvm.org/D132515	2022-08-23 16:53:46 -07:00
Simon Pilgrim	9317e6311f	[TTI] Add SK_Splice shuffle mask detection and X86 costs Enables fixed sized vectors to detect SK_Splice shuffle patterns and provides basic X86 cost support Differential Revision: https://reviews.llvm.org/D132374	2022-08-23 20:07:30 +01:00
Simon Pilgrim	336a4e03a4	[ADT] Add llvm::has_single_bit helper similar to the c++20 std::has_single_bit implementation Converted the llvm::isPowerOf2_32/64 helpers into wrappers	2022-08-23 19:51:05 +01:00
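A minimal sketch of the single-bit test and of a power-of-two helper written as a wrapper around it. This uses the classic `x & (x - 1)` trick as an assumption for illustration; the names `has_single_bit_sketch` and `is_power_of_2_32_sketch` are hypothetical and this is not the code from the commit.

```cpp
#include <cstdint>
#include <type_traits>

// True when exactly one bit of value is set, i.e. value is a power of
// two. Zero has no bits set, so it is rejected explicitly.
template <typename T>
constexpr bool has_single_bit_sketch(T value) noexcept {
  static_assert(std::is_unsigned<T>::value,
                "only unsigned integer types are supported");
  // Subtracting 1 clears the lowest set bit and sets every bit below
  // it, so the AND is zero only if there was a single set bit.
  return value != 0 && (value & (value - 1)) == 0;
}

// Hypothetical wrapper in the style of a fixed-width power-of-two check.
constexpr bool is_power_of_2_32_sketch(std::uint32_t value) noexcept {
  return has_single_bit_sketch(value);
}
```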
Simon Pilgrim	75767a0f9a	[Support] MathExtras.h - use llvm::bitcast<> for float-bits cast helpers. NFCI.	2022-08-23 18:27:13 +01:00
Jakub Kuderski	6fa87ec10f	[ADT] Deprecate is_splat and replace all uses with all_equal See the discussion thread for more details: https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692 Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D132335	2022-08-23 11:36:27 -04:00
Philip Reames	c9608d57b8	[TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC] This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.	2022-08-23 07:55:42 -07:00
Stephen Tozer	89d0cc99ec	[DebugInfo][InstrRef] Handle transfers of variadic debug values in LDV This patch adds the last of the changes required to enable DBG_VALUE_LIST handling in InstrRefLDV, handling variadic debug values during the transfer tracking step. Most of the changes are fairly straightforward, and based around tracking multiple locations per variable in TransferTracker::VLocTracker. Differential Revision: https://reviews.llvm.org/D128211	2022-08-23 15:01:28 +01:00
Florian Hahn	5913d77056	[Globals] Treat nobuiltin fns as maybe-derefined. Callsites could be marked as `builtin` while calling `nobuiltin` functions. This can lead to problems, if local optimizations apply transformations based on the semantics of the builtin, but then IPO treats the function as `nobuiltin` and applies a transform that breaks builtin semantics (assumed earlier). To avoid this, mark such functions as maybe-derefined, to avoid IPO transforms on them that may break assumptions of earlier calls. Fixes #57075 Fixes #48366 Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D97735	2022-08-23 13:45:10 +01:00
Simon Pilgrim	42a9c1819c	[ADT] Add llvm::popcount to <bit> helper wrapper This patch proposes to move the llvm::detail::PopulationCounter internal helpers into ADT/bit.h and provide a llvm::popcount implementation. I've left the countPopulation implementation in place in MathExtras.h for now, but updated it to use llvm::popcount. Hopefully I've got the type_traits correct - I don't use them very often. Someday we'll move to C++20 with an actual <bit> std header, and we already have this header in place to simplify matters. We'd probably benefit from moving the other <bit> helpers here at some point, but this is a first step. Differential Revision: https://reviews.llvm.org/D132407	2022-08-23 10:36:43 +01:00
Florian Hahn	ff34432649	[LoopUtils] Remove unused Loop arg from addDiffRuntimeChecks (NFC). The argument is no longer used, remove it.	2022-08-23 10:15:28 +01:00
liqinweng	eaa539afa1	[LV][NFC] Modify code comments Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D132093	2022-08-23 12:21:53 +08:00
Jakub Kuderski	c9e52fbe4d	[ADT] Add all_equal predicate `llvm::all_equal` checks if all values in the given range are equal, i.e., there are no two elements that are not equal. Similar to `llvm::all_of`, it returns `true` when the range is empty. `llvm::all_equal` is intended to supersede `llvm::is_splat`, which will be deprecated and removed in future patches. See the discussion thread for more details: https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692. Reviewed By: dblaikie, shchenz Differential Revision: https://reviews.llvm.org/D132334	2022-08-22 23:55:23 -04:00
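A minimal sketch of the predicate's semantics, including the empty-range case, built on `std::adjacent_find`. The name `all_equal_sketch` is hypothetical and this is an illustration under stated assumptions, not the implementation added by the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Returns true when no two neighbouring elements differ. An empty or
// single-element range has no such pair, so the result is true.
template <typename Range>
bool all_equal_sketch(const Range &range) {
  auto first = std::begin(range);
  auto last = std::end(range);
  return std::adjacent_find(first, last, std::not_equal_to<>()) == last;
}

inline void allEqualExample() {
  assert(all_equal_sketch(std::vector<int>{7, 7, 7}));
  assert(!all_equal_sketch(std::vector<int>{7, 8, 7}));
  assert(all_equal_sketch(std::vector<int>{})); // empty range
}
```

Comparing only neighbours is sufficient as long as equality on the element type is transitive.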
Philip Reames	104fa367ee	[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC] This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both. This is the change which motivated the whole sequence which preceded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through. I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possibly affected must show up in the diff. This let me audit each one closely.	2022-08-22 15:16:39 -07:00
Philip Reames	478cf94378	[X86][AArch64][WebAsm][RISCV] Query operand properties instead of using enums directly [nfc] This is part of an ongoing transition to use OperandValueInfo which combines OperandValueKind and OperandValueProperties. This change adds some accessor methods and uses them to simplify backend code. The primary motivation of doing so is removing uses of the parameters so that an upcoming api change is less error prone.	2022-08-22 13:37:59 -07:00
David Penry	ced705c440	[ModuloSchedule] Add interface call to accept/reject SMS schedules This interface allows a target to reject a proposed SMS schedule. For Hexagon/PowerPC, all schedules are accepted, leaving behavior unchanged. For ARM, schedules which exceed register pressure limits are rejected. Also, two RegisterPressureTracker methods now need to be public so that register pressure can be computed by more callers. Reapplication of D128941/(reversion:D132037) with small fix. Differential Revision: https://reviews.llvm.org/D132170	2022-08-22 12:10:13 -07:00
Philip Reames	27d3321c4f	[TTI] Use OperandValueInfo in getMemoryOpCost client api [nfc] This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through TTI implementation) allows use of the same properties in store costing as arithmetic costing.	2022-08-22 11:26:31 -07:00
Philip Reames	274f86e7a6	[TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc] This completes the client side transition to the OperandValueInfo version of this routine. Backend TTI implementations still use the prior versions for now.	2022-08-22 11:06:32 -07:00
Philip Reames	c42a5f1cc2	[TTI] Migrate getOperandInfo to OperandValueInfo [nfc] This is part of merging OperandValueKind and OperandValueProperties.	2022-08-22 10:19:02 -07:00
Philip Reames	5cd427106d	[TTI] Start process of merging OperandValueKind and OperandValueProperties [nfc] OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling. We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so. This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works. Target TTI implementations still use the split flags. I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.	2022-08-22 09:48:15 -07:00
Matthias Braun	b2542c40b9	RegisterClassInfo: Fix CSR cache invalidation `RegisterClassInfo` caches information like allocation orders and reuses it for multiple machine functions where possible. However the `MCPhysReg *CalleeSavedRegs` field used to test whether the set of callee saved registers changed did not work: After D28566 `MachineRegisterInfo::getCalleeSavedRegs()` can return dynamically computed CSR sets that are only valid while the `MachineRegisterInfo` object of the current function exists. This changes the code to make a copy of the CSR list instead of keeping a possibly invalid pointer around. Differential Revision: https://reviews.llvm.org/D132080	2022-08-22 09:28:26 -07:00

49278 Commits