llvm-project

Commit Graph

Author	SHA1	Message	Date
Sunho Kim	6b27890b2c	[ORC][COFF] Handle COFF import files of static archive. Handles COFF import files of static archive. Changes the static library generator to build up an object file map keyed by symbol name that excludes dllimported symbols, so that the static generator will not be responsible for them. It exposes the list of dynamic libraries that need to be imported. The client should properly load the libraries in this list beforehand. The object file map is also a performance improvement over the previous approach: Archive.findSym does a slow O(n) linear search of the symbol list to find the symbol. (We call findSym O(n) times, so the full time complexity is O(n^2); we were in fact the only user of the findSym function.) There is room for improvement in how the libraries in the list are loaded. We currently just hand the responsibility over to the client. A better way would be to let ORC read this list and hand it over to the JITLink side, which would also help validation (e.g. not trying to generate a stub for non-dllimported targets). Nevertheless, we will have to exclude the symbols from COFF import object file list and need a way to access this list, which this patch offers. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D129952	2022-07-29 16:25:08 +09:00
Alexey Lapshin	e74197bc12	[Reland][Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections. The current DWARFLinker implementation does not support some debug sections (mainly DWARF v5 sections). This patch adds a diagnostic for such sections. A warning is displayed for critical sections (those that cannot be removed) and the source file is skipped. Other unsupported sections are removed and a warning message is displayed. A zero exit status is returned in both cases. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D123623	2022-07-28 21:29:16 +03:00
Fangrui Song	c26dc2904b	[llvm-objcopy] Support --{,de}compress-debug-sections for zstd Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal: https://groups.google.com/g/generic-abi/c/satyPkuMisk ("Add new ch_type value: ELFCOMPRESS_ZSTD") Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399 ("[RFC] Zstandard as a second compression method to LLVM") Differential Revision: https://reviews.llvm.org/D130458	2022-07-28 10:45:53 -07:00
Austin Kerbow	f5b21680d1	[AMDGPU] Add amdgcn_sched_group_barrier builtin This builtin allows the creation of custom scheduling pipelines on a per-region basis. Like the sched_barrier builtin this is intended to be used either for testing, in situations where the default scheduler heuristics cannot be improved, or in critical kernels where users are trying to get performance that is close to handwritten assembly. Obviously using these builtins will require extra work from the kernel writer to maintain the desired behavior. The builtin can be used to create groups of instructions called "scheduling groups" where ordering between the groups is enforced by the scheduler. __builtin_amdgcn_sched_group_barrier takes three parameters. The first parameter is a mask that determines the types of instructions that you would like to synchronize around and add to a scheduling group. These instructions will be selected from the bottom up starting from the sched_group_barrier's location during instruction scheduling. The second parameter is the number of matching instructions that will be associated with this sched_group_barrier. The third parameter is an identifier which is used to describe what other sched_group_barriers should be synchronized with. Note that multiple sched_group_barriers must be added in order for them to be useful since they only synchronize with other sched_group_barriers. Only "scheduling groups" with a matching third parameter will have any enforced ordering between them. As an example, the code below tries to create a pipeline of 1 VMEM_READ instruction followed by 1 VALU instruction followed by 5 MFMA instructions... 
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 1 VALU
__builtin_amdgcn_sched_group_barrier(2, 1, 0)
// 5 MFMA
__builtin_amdgcn_sched_group_barrier(8, 5, 0)
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 3 VALU
__builtin_amdgcn_sched_group_barrier(2, 3, 0)
// 2 VMEM_WRITE
__builtin_amdgcn_sched_group_barrier(64, 2, 0)
Reviewed By: jrbyrnes Differential Revision: https://reviews.llvm.org/D128158	2022-07-28 10:43:14 -07:00
Simon Pilgrim	8c99cef1e7	[DAG] Remove SelectionDAG::GetDemandedBits and use SimplifyMultipleUseDemandedBits directly. GetDemandedBits is mainly a wrapper around SimplifyMultipleUseDemandedBits now, and is only used by DAGCombiner::visitSTORE so I've moved all remaining functionality there. visitSTORE was making use of this to 'simplify' constants for a trunc-store. Just removing this code led to a mixture of regressions and gains - it came down to whether a target preferred a sign or zero extended constant for materialization/truncation. I've just moved the code over for now, but a next step would be to move this to targetShrinkDemandedConstant, but some targets that override the method expect a basic binop, and might react badly to a store node.....	2022-07-28 17:03:44 +01:00
Liqiang Tao	d52e775b05	[llvm][ModuleInliner] Add inline cost priority for module inliner This patch introduces the inline cost priority into the module inliner, which uses the same computation as InlineCost. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D130012	2022-07-28 22:44:03 +08:00
Liqiang Tao	c113594378	Revert "[llvm][ModuleInliner] Add inline cost priority for module inliner" This reverts commit `bb7f62bbbd`.	2022-07-28 22:36:28 +08:00
Chris Bieneman	fe13002bb3	[HLSL] Add __builtin_hlsl_create_handle This is pretty straightforward, it just adds a builtin to return a pointer to a resource handle. This maps to a dx intrinsic. The shape of this builtin and the underlying intrinsic will likely shift a bit as this implementation becomes more feature complete, but this is a good basis to get started. Depends on D128569. Differential Revision: https://reviews.llvm.org/D130016	2022-07-28 09:16:11 -05:00
Liqiang Tao	bb7f62bbbd	[llvm][ModuleInliner] Add inline cost priority for module inliner This patch introduces the inline cost priority into the module inliner, which uses the same computation as InlineCost. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D130012	2022-07-28 21:28:07 +08:00
Florian Hahn	8daa338297	[SCEV] Avoid repeated proveNoUnsignedWrapViaInduction calls. At the moment, proveNoUnsignedWrapViaInduction may be called for the same AddRec a large number of times via getZeroExtendExpr. This can have a severe compile-time impact for very loop-heavy code. On one particular workload, LSR takes ~51s without this patch, almost exclusively in proveNoUnsignedWrapViaInduction. With this patch, the time in LSR drops to ~0.4s. If proveNoUnsignedWrapViaInduction failed to prove NUW the first time, it is unlikely to succeed on subsequent tries and the cost doesn't seem to be justified. Besides drastically improving compile-time in some excessive cases, this also has a slightly positive compile-time impact on CTMark: NewPM-O3: -0.07% NewPM-ReleaseThinLTO: -0.08% NewPM-ReleaseLTO-g: -0.06 https://llvm-compile-time-tracker.com/compare.php?from=b435da027d7774c24cdb8c88d09f6b771e07fb14&to=f2729e33e8284b502f6c35a43345272252f35d12&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D130648	2022-07-28 10:02:19 +01:00
Paul Kirth	6e9bab71b6	Revert "[llvm][NFC] Refactor code to use ProfDataUtils" This reverts commit `300c9a7881`. We will reland once these issues are ironed out.	2022-07-27 21:38:11 +00:00
Paul Kirth	300c9a7881	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-07-27 21:13:54 +00:00
Paul Kirth	6047deb7c2	[llvm] Provide utility function for MD_prof Currently, there is significant code duplication for dealing with MD_prof metadata throughout the compiler. These utility functions can improve code reuse and simplify boilerplate code when dealing with profiling metadata, such as branch weights. The intent is to provide a uniform set of APIs that allow common tasks, such as identifying specific types of MD_prof metadata and extracting branch weights. Future patches can build on this initial implementation and clean up the different implementations across the compiler. Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D128858	2022-07-27 21:13:51 +00:00
Jonas Devlieghere	a8c3d9815e	[DebugInfo] Teach LLVM and LLDB about ptrauth in DWARF Teach libDebugInfo (llvm-dwarfdump) and lldb about DWARF tags and attributes for pointer authentication. These values have been emitted by Apple clang for several releases. Although upstream LLVM doesn't emit these values yet, we hope to upstream that part sometime soon. Differential revision: https://reviews.llvm.org/D130215	2022-07-27 11:48:35 -07:00
Amara Emerson	19cdd1908b	[AArch64][GlobalISel] Add heuristics for localizing G_CONSTANT. This adds similar heuristics to G_GLOBAL_VALUE, querying the cost of materializing a specific constant in code size. Doing so prevents us from sinking constants which require multiple instructions to generate into use blocks. Code size savings on CTMark -Os:
Program                            size.__text
                                   before     after      diff
ClamAV/clamscan                    381940.00  382052.00   0.0%
lencod/lencod                      428408.00  428428.00   0.0%
SPASS/SPASS                        411868.00  411876.00   0.0%
kimwitu++/kc                       449944.00  449944.00   0.0%
Bullet/bullet                      463588.00  463556.00  -0.0%
sqlite3/sqlite3                    284696.00  284668.00  -0.0%
consumer-typeset/consumer-typeset  414492.00  414424.00  -0.0%
7zip/7zip-benchmark                595244.00  594972.00  -0.0%
mafft/pairlocalalign               247512.00  247368.00  -0.1%
tramp3d-v4/tramp3d-v4              372884.00  372044.00  -0.2%
Geomean difference                                       -0.0%
Differential Revision: https://reviews.llvm.org/D130554	2022-07-27 10:51:16 -07:00
Stanislav Mekhanoshin	0562cf442f	Allow data prefetch into non-default address space I am playing with the LoopDataPrefetch pass and found out that it bails out on a pointer in a non-zero address space. This patch adds the target callback to check if an address space is to be considered for prefetching. Default implementation still only allows address space 0, so this is NFCI. This does not currently affect any known targets, but seems to be generally useful for the future. Differential Revision: https://reviews.llvm.org/D129795	2022-07-27 10:01:26 -07:00
Rainer Orth	979ddfff37	[Support] Handle SPARC in sys::getHostCPUName While working on D118450 <https://reviews.llvm.org/D118450>, I noticed that `sys::getHostCPUName` lacks SPARC support. This patch implements it. The code is taken from/inspired by GCC's `gcc/config/sparc/driver-sparc.cc`. There's one caveat: since LLVM, unlike GCC, doesn't support the SPARC-M7, -S7, and -M8 CPUs, I map all those to the latest supported one (UltraSparc T4/`niagara4`). Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu` by running `savcov --version` on - Netra SPARC S7-2 (SPARC-S7, Solaris 11.4) - SPARC T5-2 (SPARC T5, Solaris 11.4) - SPARC Enterprise T5220 (UltraSPARC T2, Solaris 11.3) - SPARC T5 (UltraSPARC T5, Debian sid) - SPARC T3 (UltraSPARC T3, Debian sid) - SPARC Enterprise T5220 (Debian sid) Differential Revision: https://reviews.llvm.org/D130272	2022-07-27 12:21:03 +02:00
Alexey Lapshin	79ff02a122	Revert "[Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections." This reverts commit `0d191b7553`.	2022-07-27 11:48:56 +03:00
Martin Sebor	4447603616	[InstCombine] Fold strtoul and strtoull and avoid PR #56293 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D129224	2022-07-26 14:11:40 -06:00
Francis Visoiu Mistrih	2c6e8b4636	[Matrix] Refactor tiled loops in a struct. NFC The three loops have the same structure: index, header, latch.	2022-07-26 11:02:22 -07:00
Jessica Paquette	39d431d811	[GlobalISel] Import patterns for G_FMAXIMUM + G_FMINIMUM Allows us to select scalar instructions on AArch64. Differential Revision: https://reviews.llvm.org/D115381	2022-07-26 10:58:44 -07:00
Fangrui Song	f106525de2	[MachineFunctionPass] Support -print-changed and -print-changed=quiet -print-changed for new pass manager is handy beside -print-after-all. Port it to MachineFunctionPass. Note: lib/Passes/StandardInstrumentations.cpp implements a number of misc features. If we want to use them for codegen, we may need to lift some functionality to LLVMIR. Reviewed By: aeubanks, jamieschmeiser Differential Revision: https://reviews.llvm.org/D130434	2022-07-26 10:16:49 -07:00
Stefan Gränitz	1e30820483	[WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222 This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs. * Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand. * PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers. * LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites. Reviewed By: theraven Differential Revision: https://reviews.llvm.org/D128190	2022-07-26 17:52:43 +02:00
Paul Walker	e5c892dd85	[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR. This patch starts small, only detecting sequences of the form <a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes. Differential Revision: https://reviews.llvm.org/D125194	2022-07-26 15:28:37 +00:00
Arthur Eubanks	2eade1dba4	[WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`. Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`. To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D128955	2022-07-26 08:01:08 -07:00
Alexey Lapshin	0d191b7553	[Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections. The current DWARFLinker implementation does not support some debug sections (mainly DWARF v5 sections). This patch adds a diagnostic for such sections. A warning is displayed for critical sections (those that cannot be removed) and the source file is skipped. Other unsupported sections are removed and a warning message is displayed. A zero exit status is returned in both cases. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D123623	2022-07-26 15:25:51 +03:00
Simon Tatham	55f1fbf005	[MC,llvm-objdump,ARM] Target-dependent disassembly resync policy. Currently, when llvm-objdump is disassembling a code section and encounters a point where no instruction can be decoded, it uses the same policy on all targets: consume one byte of the section, emit it as "<unknown>", and try disassembling from the next byte position. On an architecture where instructions are always 4 bytes long and 4-byte aligned, this makes no sense at all. If a 4-byte word cannot be decoded as an instruction, then the next place that a valid instruction could //possibly// be found is 4 bytes further on. Disassembling from a misaligned address can't possibly produce anything that the code generator intended, or that the CPU would even attempt to execute. This patch introduces a new MCDisassembler virtual method called `suggestBytesToSkip`, which allows each target to choose its own resynchronization policy. For Arm (as opposed to Thumb) and AArch64, I've filled in the new method to return a fixed width of 4. Thumb is a more interesting case, because the criterion for identifying 2-byte and 4-byte instruction encodings is very simple, and doesn't require the particular instruction to be recognized. So `suggestBytesToSkip` is also passed an ArrayRef of the bytes in question, so that it can take that into account. The new test case shows Thumb disassembly skipping over two unrecognized instructions, and identifying one as 2-byte and one as 4-byte. For targets other than Arm and AArch64, this is NFC: the base class implementation of `suggestBytesToSkip` still returns 1, so that the existing behavior is unchanged. Other targets can fill in their own implementations as they see fit; I haven't attempted to choose a new behavior for each one myself. 
I've updated all the call sites of `MCDisassembler::getInstruction` in llvm-objdump, and also one in sancov, which was the only other place I spotted the same idiom of `if (Size == 0) Size = 1` after a call to `getInstruction`. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D130357	2022-07-26 09:35:30 +01:00
David Spickett	2f9fa9ef53	[lldb][AArch64] Add support for memory tags in core files This teaches ProcessElfCore to recognise the MTE tag segments. https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support These segments contain all the tags for a matching memory segment which will have the same size in virtual address terms. In real terms it's 2 tags per byte so the data in the segment is much smaller. Since MTE is the only tag type supported I have hardcoded some things to those values. We could and should support more formats as they appear but doing so now would leave code untested until that happens. A few things to note: * /proc/pid/smaps is not in the core file, only the details you have in "maps". Meaning we mark a region tagged only if it has a tag segment. * A core file supports memory tagging if it has at least 1 memory tag segment, there is no other flag we can check to tell if memory tagging was enabled. (unlike a live process that can support memory tagging even if there are currently no tagged memory regions) Tests have been added at the commands level for a core file with mte and without. There is a lot of overlap between the "memory tag read" tests here and the unit tests for MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's worth keeping to check ProcessElfCore doesn't cause an assert. Depends on D129487 Reviewed By: omjavaid Differential Revision: https://reviews.llvm.org/D129489	2022-07-26 08:46:36 +01:00
Kazu Hirata	c8cf669f60	[ADT] Deprecate Optional::getValueOr (NFC) This patch deprecates getValueOr in favor of value_or. Differential Revision: https://reviews.llvm.org/D130140	2022-07-25 23:01:02 -07:00
Xiang Li	57006b14fa	[DirectX backend] [NFC] Add DXILOpBuilder to generate DXIL operation A new helper class DXILOpBuilder is added to create DXIL op function calls. The TableGen backend for DXILOperation will create a table of DXIL op function parameter types. When a DXIL op function is created, these parameter types will be used to create the function type. Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D130291	2022-07-25 21:49:59 -07:00
Alexander Shaposhnikov	1e636f2676	[IRBuilder] Add assert for AtomicRMW ordering Add assert for AtomicRMW: Ordering != AtomicOrdering::Unordered (https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Verifier.cpp#L3944) and adjust expandAtomicStore accordingly. Test plan: 1/ ninja check-llvm check-clang check-lld 2/ Bootstrapped LLVM/Clang pass tests Differential revision: https://reviews.llvm.org/D130457	2022-07-25 22:51:25 +00:00
Vladislav Dzhidzhoev	2c84b92346	Fix assertion in SmallDenseMap constructor with reserve from non-power-of-2 buckets count The `SmallDenseMap` constructor with reserve gets an arbitrary `NumInitBuckets` value and passes it down to the `init` method. If `NumInitBuckets` is greater than `InlineBuckets`, then `SmallDenseMap` initializes to the large representation, passing `NumInitBuckets` down to `DenseMap` initialization. The `DenseMap::initEmpty` method asserts that the initial bucket count must be a power of 2. The proposed solution is to round the `NumInitBuckets` value up to the next power of 2 in the `SmallDenseMap` constructor. This should satisfy both the `DenseMap` preconditions and the required minimum bucket count for the reservation. Reviewed By: atrick Differential Revision: https://reviews.llvm.org/D129825	2022-07-25 17:09:44 +00:00
Sunho Kim	0f00e58841	[JITLink][COFF][x86_64] Reimplement ADDR32NB/REL32. Reimplements ADDR32NB/REL32 relocations properly. Out-of-reach targets will be dealt with in a separate patch that will generate the stub for dllimport symbols. Reviewed By: sgraenitz Differential Revision: https://reviews.llvm.org/D129936	2022-07-25 23:41:53 +09:00
Weining Lu	aff68f5ad6	[LoongArch] Parse LoongArch base ABI in ObjectYAML and llvm-readobj LoongArch e_flags definition: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_e_flags_identifies_abi_type_and_version Differential Revision: https://reviews.llvm.org/D130238	2022-07-25 20:40:57 +08:00
Kazu Hirata	95a932fb15	Remove redundant override specifiers (NFC) Identified with modernize-use-override.	2022-07-24 22:28:11 -07:00
Kazu Hirata	b5188591a0	[llvm] Remove redundant virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-24 21:50:35 -07:00
Amaury Séchet	5e29360743	[NFC] Add parentheses in MathExtras.h The code used to cause a warning: llvm/include/llvm/Support/MathExtras.h:751:39: warning: suggest parentheses around ‘-’ in operand of ‘&’ [-Wparentheses] 751 | assert(Align != 0 && (Align & Align - 1) == 0 && |	2022-07-24 22:04:09 +00:00
Kazu Hirata	ec8fa36d7c	[ExecutionEngine] Fix a header guard (NFC) Identified with llvm-header-guard.	2022-07-24 12:27:12 -07:00
Kazu Hirata	c661bd0886	[llvm] Remove unused forward declarations (NFC)	2022-07-24 12:27:05 -07:00
Fangrui Song	85cfd91723	[ELF] Optimize some non-constant alignTo with alignToPowerOf2. NFC My x86-64 lld executable is 2KiB smaller. .eh_frame writing gets faster as there were lots of divisions.	2022-07-24 11:20:49 -07:00
Fangrui Song	7feab85df8	[MC] Remove unused renameELFSection	2022-07-24 01:23:07 -07:00
Fangrui Song	6977ff4006	[MC] Delete dead zlib-gnu code and simplify writeSectionData	2022-07-24 01:17:34 -07:00
Kazu Hirata	2201d1827f	[Analysis] Use default member initialization (NFC)	2022-07-23 18:36:24 -07:00
Kazu Hirata	7bfa06f6c0	[CodeGen] Use range-based for loops (NFC)	2022-07-23 16:10:46 -07:00
Fangrui Song	7225213c0a	[LegacyPM] Remove {,PostInline}EntryExitInstrumenterPass Following recent changes removing non-core features of the legacy PM/optimization pipeline.	2022-07-23 15:30:15 -07:00
Nuno Lopes	9df0b254d2	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-07-23 21:50:11 +01:00
Kazu Hirata	85dadf6d8d	[TableGen] Drop an unnecessary const from a return type (NFC) This patch also drops "&" that binds to a temporary. Identified with readability-const-return-type.	2022-07-23 11:30:23 -07:00
Kazu Hirata	71cdb8c6f1	[ADT] Use default member initialization (NFC) Identified with modernize-use-default-member-init.	2022-07-23 10:50:27 -07:00
ARCHIT SAXENA	3bb1ce2319	Add a nop instruction if a section starts with landing pad for function splitter This change adds a nop instruction if a section starts with a landing pad. This change is like [D73739](https://reviews.llvm.org/D73739), which avoids a zero-offset landing pad in basic block sections. Detailed description: The current machine function splitter can create sections which start with a landing pad themselves. This places the landing pad at offset zero from LPStart.
```
.section .text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
  .cfi_startproc
  .cfi_personality 3, __gxx_personality_v0
  .cfi_lsda 3, .Lexception5
  .cfi_def_cfa %rsp, 16
.Ltmp11:                    <--- This is a Landing pad and also LP Start as it is start of this section
  movq %rax, %rdi           <--- first instruction is at offset 0 from LPStart
  callq _Unwind_Resume@PLT
```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
  .uleb128 .Ltmp9-.Lfunc_begin2   # >> Call Site 1 <<
  .uleb128 .Ltmp10-.Ltmp9         # Call between .Ltmp9 and .Ltmp10
  .uleb128 .Ltmp11-foo10.cold     <---This is zero # jumps to .Ltmp11
  .byte 3                         # On action: 2
  .uleb128 .Ltmp10-.Lfunc_begin2  # >> Call Site 2 <<
  .uleb128 .Lfunc_end9-.Ltmp10    # Call between .Ltmp10 and .Lfunc_end9
  .byte 0                         # has no landing pad
  .byte 0                         # On action: cleanup
  .p2align 2
```
The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at the start of such sections so that this case is avoided.
Output:
```
.section .text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
  .cfi_startproc
  .cfi_personality 3, __gxx_personality_v0
  .cfi_lsda 3, .Lexception5
  .cfi_def_cfa %rsp, 16
  nop                       <--- new instruction that is added
.Ltmp11:
  movq %rax, %rdi
  callq _Unwind_Resume@PLT
```
Reviewed By: modimo, snehasish, rahmanl Differential Revision: https://reviews.llvm.org/D130133	2022-07-22 15:20:10 -07:00
Arnold Schwaighofer	58e6ee0e1f	llvm.swift.async.context.addr cannot be modeled as NoMem because we don't want it to be cse'd across async suspends An async suspend models the split between two partial async functions. `llvm.swift.async.context.addr` will have a different value in the two partial functions so it is not correct to generally CSE the instruction. rdar://97336162 Differential Revision: https://reviews.llvm.org/D130201	2022-07-22 11:50:58 -07:00
Shilei Tian	602e0eb9f0	[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake Multiple calls to `omp_get_wtime` could be optimized out because the function is mistakenly marked as `readnone`. This patch fixes the issue, and also adds support for running optimization on `libomptarget` tests. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130179	2022-07-22 13:46:45 -04:00
Mircea Trofin	7b81a81d5f	[NFC] FunctionSamples::getEntrySamples -> getHeadSamplesEstimate The name `getEntrySamples` was misleading for 2 reasons. First, it's close in name to `Function::getEntryCount`, but the equivalent here is `getHeadSamples`; second, as opposed to the other get* APIs in `FunctionSamples`, it performs an estimate/heuristic rather than just retrieving raw data (or a non-heuristic derivative of that data, like `getMaxCountInside`). The new name should more clearly communicate its intent; and, being close (in name) to `getHeadSamples`, it should allow the reader to discover the relation between them. Also updated the doc comments for both `getHeadSamples[Estimate]` so a reader may better understand the relation between them. Differential Revision: https://reviews.llvm.org/D130281	2022-07-22 09:17:59 -07:00
Shilei Tian	77cb30e3a6	Revert "[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake" This reverts commit `ad34f1dba8`.	2022-07-22 11:45:13 -04:00
Shilei Tian	ad34f1dba8	[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake Multiple calls to `omp_get_wtime` could be optimized out because the function is mistakenly marked as `readnone`. This patch fixes the issue, and also adds support for running optimization on `libomptarget` tests. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130179	2022-07-22 11:43:30 -04:00
Malhar Jajoo	41958f76d8	[Costmodel] Add "type-based-intrinsic-cost" cli option This patch adds a command line flag to be able to test the type based cost-model analysis for Intrinsics. Differential Revision: https://reviews.llvm.org/D129109	2022-07-22 15:50:57 +01:00
Kazu Hirata	70257fab68	Use any_of (NFC)	2022-07-22 01:05:17 -07:00
Johannes Doerfert	a50b9f9f1f	[Attributor][FIX] Handle non-recursive but re-entrant functions properly If a function is non-recursive we only performed intra-procedural reasoning for reachability (via AA::isPotentiallyReachable). However, if the function is re-entrant, non-recursion alone does not rule out reachability. Instead of this problematic logic in the reachability reasoning we utilize logic in AAPointerInfo. If a location is for sure written by a function, then whether that function is re-entrant or recursive, we know that intra-procedural reasoning alone is sufficient.	2022-07-22 00:00:56 -05:00
Johannes Doerfert	62f7888d6d	[Attributor] Dominating must-write accesses allow unknown initial values If we have a dominating must-write access we do not need to know the initial value of some object to perform reasoning about the potential values. The dominating must-write has overwritten the initial value.	2022-07-21 23:08:43 -05:00
Johannes Doerfert	dfac030271	[Intrinsics] Add `nocallback` to the memset/cpy/move intrinsics These were forgotten when D118680 was applied. Similar to D125937. Differential Revision: https://reviews.llvm.org/D129516	2022-07-21 22:52:46 -05:00
Ilia Diachkov	b8e1544b9d	[SPIRV] add SPIRVPrepareFunctions pass and update other passes The patch adds SPIRVPrepareFunctions pass, which modifies function signatures containing aggregate arguments and/or return values before IR translation. Information about the original signatures is stored in metadata. It is used during call lowering to restore correct SPIR-V types of function arguments and return values. This pass also substitutes some llvm intrinsic calls to function calls, generating the necessary functions in the module, as the SPIRV translator does. The patch also includes changes in other modules, fixing errors and enabling many SPIR-V features that were omitted earlier. And 15 LIT tests are also added to demonstrate the new functionality. Differential Revision: https://reviews.llvm.org/D129730 Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com> Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com> Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>	2022-07-22 04:00:48 +03:00
Teresa Johnson	1dad6247d2	[MemProf] Add memprof metadata related analysis utilities Adds a number of utilities that are used to help create and update memprof related metadata. These will be used during profile matching and annotation, as well as by the inliner when updating the metadata. Also adds unit tests for the utilities. See also related RFCs: RFC: Sanitizer-based Heap Profiler [1] RFC: A binary serialization format for MemProf [2] RFC: IR metadata format for MemProf [3] (Note that the IR metadata format has changed from the RFC during implementation, as described in the preceding patch adding the basic metadata and verification support.) Depends on D128141. Differential Revision: https://reviews.llvm.org/D128854	2022-07-21 13:46:01 -07:00
Sanjay Patel	78c09f0f24	[PatternMatch][InstCombine] match a vector with constant expression element(s) as a constant expression The InstCombine test is reduced from issue #56601. Without the more liberal match for ConstantExpr, we try to rearrange constants in Negator forever. Alternatively, we could adjust the definition of m_ImmConstant to be more conservative, but that's probably a larger patch, and I don't see any downside to changing m_ConstantExpr. We never capture and modify a ConstantExpr; transforms just want to avoid it. Differential Revision: https://reviews.llvm.org/D130286	2022-07-21 15:23:57 -04:00
Daniel Thornburgh	6605187103	[NFC] Fix compiler warning in MarkupFilter	2022-07-21 12:00:29 -07:00
Daniel Thornburgh	17e4c217b6	[Symbolizer] Implement contextual symbolizer markup elements. This change implements the contextual symbolizer markup elements: reset, module, and mmap. These provide information about the runtime context of the binary necessary to resolve addresses to symbolic values. Summary information is printed to the output about this context. Multiple mmap elements for the same module line are coalesced together. The standard requires that such elements occur on their own lines to allow for this; accordingly, anything after a contextual element on a line is silently discarded. Implementing this cleanly requires that the filter drive the parser; this allows skipped sections to avoid being parsed. This also makes the filter quite a bit easier to use, at the cost of some unused flexibility. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D129519	2022-07-21 11:29:19 -07:00
David Sherwood	f15b6b2907	[AArch64] Add target hook for preferPredicateOverEpilogue This patch adds the AArch64 hook for preferPredicateOverEpilogue, which currently returns true if SVE is enabled and one of the following conditions (non-exhaustive) is met: 1. The "sve-tail-folding" option is set to "all", or 2. The "sve-tail-folding" option is set to "all+noreductions" and the loop does not contain reductions, 3. The "sve-tail-folding" option is set to "all+norecurrences" and the loop has no first-order recurrences. Currently the default option is "disabled", but this will be changed in a later patch. I've added new tests to show the options behave as expected here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll Differential Revision: https://reviews.llvm.org/D129560	2022-07-21 17:20:06 +01:00
Joseph Huber	bc33c2fa0c	[Binary] Hard-code the alignment of the offloading binary Summary: We previously used `alignof` to get the necessary alignment of the binary header. However this was different on 32-bit platforms and caused a few tests to fail because of it. This patch just changes this to be a hard-coded constant of 8.	2022-07-21 09:28:26 -04:00
Nikita Popov	1f69503107	[MemoryBuiltins] Add getReallocatedOperand() function (NFC) Replace the value-accepting isReallocLikeFn() overload with a getReallocatedOperand() function, which returns which operand is the one being reallocated. Currently, this is always the first one, but once allockind(realloc) is respected, the reallocated operand will be determined by the allocptr parameter attribute.	2022-07-21 14:54:16 +02:00
Nikita Popov	46e6dd84b7	[MemoryBuiltins] Remove isFreeCall() function (NFC) Remove isFreeCall() in favor of getFreedOperand(). Replace the two remaining uses with a getFreedOperand() != nullptr check, as they only care that something is getting freed. (The usage in DSE is correct as such. The allocator-related checks in CFLGraph look rather questionable in general.)	2022-07-21 14:44:23 +02:00
Alexey Lapshin	8bb4451a65	[Reland][DebugInfo][llvm-dwarfutil] Combine overlapped address ranges. DWARF files may contain overlapping address ranges. For example, this can happen if two copies of a function have identical instruction sequences and they end up sharing. That looks incorrect from the point of view of the DWARF spec. The current implementation of DWARFLinker does not combine overlapped address ranges. It would be good if such ranges were handled in some useful way. Thus, this patch allows DWARFLinker to combine overlapped ranges into a single one. Depends on D86539 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D123469	2022-07-21 14:15:39 +03:00
Matt Devereau	e0fbd990c9	[AArch64][SVE] Add ISel pattern to lower DUPLANE128 to LD1RQD Following on from https://reviews.llvm.org/D128902, lower DUPLANE128 to LD1RQD for integer load types from instruction selection. Differential Revision: https://reviews.llvm.org/D130010	2022-07-21 10:56:43 +00:00
Alexey Lapshin	3aad49082c	Revert "[DebugInfo][llvm-dwarfutil] Combine overlapped address ranges." This reverts commit `d2a4d6bf9c`.	2022-07-21 13:40:20 +03:00
Nikita Popov	c81dff3c30	[MemoryBuiltins] Add getFreedOperand() function (NFCI) We currently assume in a number of places that free-like functions free their first argument. This is true for all hardcoded free-like functions, but with the new attribute-based design, the freed argument is supposed to be indicated by the allocptr attribute. To make sure we handle this correctly once allockind(free) is respected, add a getFreedOperand() helper which returns the freed argument, rather than just indicating whether the call frees some argument. This migrates most but not all users of isFreeCall() to the new API. The remaining users are a bit more tricky.	2022-07-21 12:39:35 +02:00
Alexey Lapshin	d2a4d6bf9c	[DebugInfo][llvm-dwarfutil] Combine overlapped address ranges. DWARF files may contain overlapping address ranges. For example, this can happen if two copies of a function have identical instruction sequences and they end up sharing. That looks incorrect from the point of view of the DWARF spec. The current implementation of DWARFLinker does not combine overlapped address ranges. It would be good if such ranges were handled in some useful way. Thus, this patch allows DWARFLinker to combine overlapped ranges into a single one. Depends on D86539 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D123469	2022-07-21 13:15:18 +03:00
Nikita Popov	d144ae6e1b	[MemoryBuiltins] Default to trivial mapper in getAllocSize() (NFC) Default getAllocSize() to use the trivial mapper. Also switch from using std::function to function_ref. Furthermore, update the doc comment to point out a subtle difference between getAllocSize() and getObjectSize(): The latter may also return something for calls that return their argument (via "returned" attribute or special intrinsics like invariant groups).	2022-07-21 11:43:48 +02:00
Nikita Popov	f45ab43332	[MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc Allow directly checking whether a given call is a removable allocation, instead of first checking whether it is an allocation.	2022-07-21 09:39:19 +02:00
Congzhe Cao	05ccde8023	[LoopCacheAnalysis] Fix a type mismatch problem in cost calculation There is a problem in loop cache analysis that the types of SCEV variables `Coeff` and `ElemSize` in function `isConsecutive()` may not match. The mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`. The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to whichever is wider of those two variables. As a clean-up, duplicate calculations of `Stride` in `computeRefCost()` are then removed. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D128877	2022-07-21 01:57:05 -04:00
Anubhab Ghosh	4fcf8434dd	[ORC] Add a new MemoryMapper-based JITLinkMemoryManager implementation. MapperJITLinkMemoryManager supports executor memory management using any implementation of MemoryMapper to do the transfer such as InProcessMapper or SharedMemoryMapper. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D129495	2022-07-20 17:52:37 -07:00
Teresa Johnson	0174f5553e	[MemProf] Basic metadata support and verification Add basic support for the MemProf metadata (!memprof and !callsite) which was initially described in "RFC: IR metadata format for MemProf" (https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165). The bulk of the patch is verification support, along with some tests. There are a couple of changes to the format described in the original RFC: Initial measurements suggested that a tree format for the stack ids in the contexts would be more efficient, but subsequent evaluation with large applications showed that in fact the cost of the additional metadata nodes required by this deduplication scheme overwhelmed the benefit from sharing stack id nodes. Therefore, the implementation here and in follow on patches utilizes a simpler scheme of lists of stack id integers in the memprof profile contexts and callsite metadata. The follow on matching patch employs context trimming optimizations to reduce the cost. Secondly, instead of verbosely listing all profiled fields in each profiled context (memory info block or MIB), and deferring the interpretation of the profile data, the profile data is evaluated and converted into string tags specifying the behavior (e.g. "cold") during profile matching. This reduces the verbosity of the profile metadata, and allows additional context trimming optimizations. As a result, the named metadata schema description is also no longer needed. Differential Revision: https://reviews.llvm.org/D128141	2022-07-20 15:30:55 -07:00
Schrodinger ZHU Yifan	304027206c	[ThinLTO] Support aliased GlobalIFunc Fixes https://github.com/llvm/llvm-project/issues/56290: when an ifunc is aliased in LTO, clang will attempt to create an alias summary; however, as ifunc is not included in the module summary, doing so will lead to crash. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129009	2022-07-20 15:30:38 -07:00
Kazu Hirata	94e03abf91	[IPO] Restore a call to has_value (NFC) This patch restores a call to has_value to make it clear that we are checking the presence of an optional value, not the underlying value. This patch partially reverts `d08f34b592`. Differential Revision: https://reviews.llvm.org/D129453	2022-07-20 09:40:18 -07:00
Roman Rusyaev	394a388d14	[TableGen] Add a location for a class definition that was forward-declared This change improves ctags generation for tablegen files. For the following example ``` class A; class A { int a; } ``` Previously, tags were generated only for a forward declaration of class 'A'. This patch allows generating tags for the forward declarations and further definition of class 'A'. Reviewed By: barannikov88 Original patch by: rusyaev-roman (Roman Rusyaev) Some adjustments by: nhaehnle (Nicolai Hähnle) Differential Revision: https://reviews.llvm.org/D129935	2022-07-20 15:56:17 +02:00
esmeyi	b1847ff068	[XCOFF] write the aux header when the visibility is specified in XCOFF32. The n_type field in the symbol table entry has two interpretations in XCOFF32, and a single interpretation in XCOFF64. The new interpretation is used in XCOFF32 if the value of the o_vstamp field in the auxiliary header is 2. In XCOFF64 and the new XCOFF32 interpretation, the n_type field is used for the symbol type and visibility. The patch writes the aux header with an o_vstamp field value of 2 when the visibility is specified in XCOFF32 to make the new XCOFF32 interpretation used. Reviewed By: DiggerLin, jhenderson Differential Revision: https://reviews.llvm.org/D128148	2022-07-20 07:09:34 -04:00
Chuanqi Xu	645d2dd3a9	Revert "Don't treat readnone call in presplit coroutine as not access memory" This reverts commit `57224ff4a6`. This commit may trigger crashes on some workloads. Revert it for clearness.	2022-07-20 17:00:58 +08:00
Fangrui Song	e931c2e870	[LegacyPM] Remove InstrOrderFileLegacyPass Following recent changes removing non-core features of the legacy PM/optimization pipeline.	2022-07-19 23:58:51 -07:00
Chuanqi Xu	57224ff4a6	Don't treat readnone call in presplit coroutine as not access memory To solve the readnone problems in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for details. According to the discussion, we decide to fix the problem by inserting isPresplitCoroutine() checks in different passes instead of wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes. In this direction, we might not be able to cover every case at first. Let's take a "find and fix" strategy. Reviewed By: nikic, nhaehnle, jyknight Differential Revision: https://reviews.llvm.org/D127383	2022-07-20 10:37:23 +08:00
Jez Ng	2e2737cdf9	[MC][MachO] Change addrsig format + ensure its size is properly set There were two problems with the previous setup: 1. We weren't setting its size, which caused problems when `__llvm_addrsig` wasn't the last section. In particular, `__debug_line` (if created) is generated and placed after `__llvm_addrsig`, and would result in an invalid object file w/ overlapping sections being emitted. 2. The symbol indices could be invalidated if e.g. `llvm-strip` ran on the object file. See discussion [here][1]. To fix both these issues, we use symbol relocations instead of encoding symbol indices directly in the section contents. The section itself doesn't contain any data. That sidesteps the layout problem in addition to solving the second issue. The corresponding LLD change to read in this new format: {D128938}. It will fix the icf-safe.ll test failure on this diff. [1]: https://discourse.llvm.org/t/problems-with-mach-o-address-significance-table-generation/63392/ Reviewed By: #lld-macho, alx32 Differential Revision: https://reviews.llvm.org/D127637	2022-07-19 21:22:23 -04:00
Lang Hames	94e6d2677b	[ORC] Fix serialization / deserialization of default-constructed StringRef. Avoids accessing the data field on zero-length strings. This is the StringRef counterpart to the ArrayRef<char> fix in `67220c2ad7`. rdar://97285294	2022-07-19 17:22:21 -07:00
Anubhab Ghosh	1b1f1c7786	Re-re-apply `5acd471698`, Add a shared-memory based orc::MemoryMapper... ...with more fixes. The original patch was reverted in `3e9cc543f2` due to bot failures caused by a missing dependence on librt. That issue was fixed in `32d8d23cd0`, but that commit also broke sanitizer bots due to a bug in SimplePackedSerialization: empty ArrayRef<char>s triggered a zero-byte memcpy from a null source. The ArrayRef<char> serialization issue was fixed in `67220c2ad7`, and this patch has also been updated with a new custom SharedMemorySegFinalizeRequest message that should avoid serializing empty ArrayRefs in the first place. https://reviews.llvm.org/D128544	2022-07-19 15:35:33 -07:00
Johannes Doerfert	bf789b1957	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in `6555558a80`.	2022-07-19 16:24:42 -05:00
Yusra Syeda	6fb27bc2e3	[SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention Differential Revision: https://reviews.llvm.org/D127328	2022-07-19 13:55:25 -04:00
Cole Kissane	e939bf67e3	[llvm] add zstd to `llvm::compression` namespace - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` - debian users should install libzstd when using `LLVM_ENABLE_ZSTD=FORCE_ON` from source due to this bug https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956 Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-19 10:54:36 -07:00
Jon Chesterfield	3a20597776	[amdgpu] Implement lds kernel id intrinsic Implement an intrinsic for use lowering LDS variables to different addresses from different kernels. This will allow kernels that cannot reach an LDS variable to avoid wasting space for it. There are a number of implicit arguments accessed by intrinsic already so this implementation closely follows the existing handling. It is slightly novel in that this SGPR is written by the kernel prologue. It is necessary in the general case to put variables at different addresses such that they can be compactly allocated and thus necessary for an indirect function call to have some means of determining where a given variable was allocated. Claiming an arbitrary SGPR into which an integer can be written by the kernel, in this implementation based on metadata associated with that kernel, which is then passed on to indirect call sites is sufficient to determine the variable address. The intent is to emit a __const array of LDS addresses and index into it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D125060	2022-07-19 17:46:19 +01:00
Alexey Lapshin	4539b44148	[Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF. This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal): ``` ./llvm-dwarfutil [options] <input file> <output file> --garbage-collection Do garbage collection for debug info(default) -j <value> Alias for --num-threads --no-garbage-collection Don`t do garbage collection for debug info --no-odr-deduplication Don`t do ODR deduplication for debug types --no-odr Alias for --no-odr-deduplication --no-separate-debug-file Create single output file, containing debug tables(default) --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine --odr-deduplication Do ODR deduplication for debug types(default) --odr Alias for --odr-deduplication --separate-debug-file Create two output files: file w/o debug tables and file with debug tables --tombstone [bfd,maxpc,exec,universal] Tombstone value used as a marker of invalid address(default: universal) =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges =exec - Match with address ranges of executable sections =universal - Both: bfd and maxpc ``` Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D86539	2022-07-19 15:11:36 +03:00
Simon Pilgrim	0f6b0461b0	[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits. This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115 Alive2: https://alive2.llvm.org/ce/z/fl7T7K Differential Revision: https://reviews.llvm.org/D129933	2022-07-19 10:59:07 +01:00
David Spickett	5d14873249	[llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping Fixes https://github.com/llvm/llvm-project/issues/56484 H registers are 16 bit views of AArch64's Neon registers and B are the 8 bit views. msvc does not support 16 bit float (some mention in DirectX but I couldn't find a way to get to it) so for lack of a better reference I'm using: `85c9b41b33/server/references/dia/include/cvconst.h` (the other microsoft-pdb repo is no longer up to date) Luckily clang does support fp16 so a test is added for that. There is no 8 bit float type so I had to get creative with the test case. We're not testing for correct debug info here just that we can select the B register and not crash in the process. For FPCR it's never going to be passed as an argument so I've not added a test for it. It is included to keep our list looking the same as the reference. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D129774	2022-07-19 09:33:13 +00:00
Alexey Lapshin	e717f91c96	Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF." This reverts commit `e2147c26bd`.	2022-07-19 12:17:47 +03:00
Alexey Lapshin	e2147c26bd	[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF. This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal): ``` ./llvm-dwarfutil [options] <input file> <output file> --garbage-collection Do garbage collection for debug info(default) -j <value> Alias for --num-threads --no-garbage-collection Don`t do garbage collection for debug info --no-odr-deduplication Don`t do ODR deduplication for debug types --no-odr Alias for --no-odr-deduplication --no-separate-debug-file Create single output file, containing debug tables(default) --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine --odr-deduplication Do ODR deduplication for debug types(default) --odr Alias for --odr-deduplication --separate-debug-file Create two output files: file w/o debug tables and file with debug tables --tombstone [bfd,maxpc,exec,universal] Tombstone value used as a marker of invalid address(default: universal) =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges =exec - Match with address ranges of executable sections =universal - Both: bfd and maxpc ``` Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D86539	2022-07-19 11:18:36 +03:00
serge-sans-paille	a2ac383b44	[llvm] Fix forward declaration in Support/JSON.h Some methods of json::Array require json::Value to be completely defined, so they can't be defined in-class. Fix that by defining them out of class. Fix #55780	2022-07-19 09:07:29 +02:00
Max Kazantsev	51f837a680	[NFC] Introduce API to detect tokens penetrating LCSSA form Following discussion in PR56243, we need to somehow detect the situation when token values penetrate LCSSA form for transforms that require that it is maintained by all values (for example, to sustain use-def dominance invariants). This patch introduces a parameter to LCSSA checkers to control whether they ignore tokens. Differential Revision: https://reviews.llvm.org/D129983 Reviewed By: efriedma	2022-07-19 13:52:30 +07:00
Shraiysh Vaishay	35fc666877	[OpenMP][IRBuilder] Add support for taskgroup This patch adds support for generating taskgroup construct. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D128203	2022-07-19 10:49:34 +05:30
Lang Hames	67220c2ad7	[ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>. Avoids a zero-length memcpy from a null src, which caused errors on some of the sanitizer bots. Also uses null when deserializing an empty ArrayRef (rather than pointing to a zero length range in the middle of the input buffer).	2022-07-18 20:39:01 -07:00
Matt Arsenault	8d0383eb69	CodeGen: Remove AliasAnalysis from regalloc This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.	2022-07-18 17:23:41 -04:00
Stanislav Mekhanoshin	523a99c0eb	[AMDGPU] Support for gfx940 fp8 smfmac Differential Revision: https://reviews.llvm.org/D129908	2022-07-18 12:12:41 -07:00
Stanislav Mekhanoshin	2695f0a688	[AMDGPU] Support for gfx940 fp8 mfma Differential Revision: https://reviews.llvm.org/D129906	2022-07-18 11:49:56 -07:00
Stanislav Mekhanoshin	9fa5a6b7e8	[AMDGPU] Support for gfx940 fp8 conversions Differential Revision: https://reviews.llvm.org/D129902	2022-07-18 11:48:43 -07:00
zhijian	a6316d6da5	[AIX] support read global symbol of big archive Reviewers: James Henderson, Fangrui Song Differential Revision: https://reviews.llvm.org/D124865	2022-07-18 10:43:30 -04:00
Simon Pilgrim	c2ab5c5514	[DAG] Fix typo in isDesirableToCommuteWithShift description. NFC.	2022-07-18 13:11:23 +01:00
Max Kazantsev	d693fd29f1	[Verifier] Make Verifier recognize undef tokens as correct IR Undef tokens may appear in unreached code as result of RAUW of some optimization, and it should not be considered as bad IR. Patch by Dmitry Bakunevich! Differential Revision: https://reviews.llvm.org/D128904 Reviewed By: mkazantsev	2022-07-18 16:26:06 +07:00
Nikita Popov	11079e8820	[IR] Don't treat callbr as indirect terminator Callbr is no longer an indirect terminator in the sense that is relevant here (that its successors cannot be updated). The primary effect of this change is that callbr no longer prevents formation of loop simplify form. I decided to drop the isIndirectTerminator() method entirely and replace it with isa<IndirectBrInst>() checks. I assume this method was added to abstract over indirectbr and callbr, but it never really caught on, and there is nothing left to abstract anymore at this point. Differential Revision: https://reviews.llvm.org/D129849	2022-07-18 09:32:08 +02:00
Valentin Clement	048aaab194	[flang][openacc] Use TableGen to generate the clause parser This patch introduces automatic generation of the clause parser from the TableGen information. New information can be stored directly in the TableGen file: - The different aliases that a clause supports. - The prefix before a value. - Whether a prefix is optional or not. This makes it easier to add new clauses and also avoids some errors (the `write` clause was incorrect until now). This patch updates only the OpenACC part. A patch with a modification of the OpenMP clause parser will follow. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D106968	2022-07-18 09:26:57 +02:00
Craig Topper	a55ff6aadd	[Support][CodeGen] Fix spelling Divison->Division. NFC	2022-07-17 23:16:29 -07:00
Fangrui Song	0e3447bf8a	[LegacyPM] Remove WholeProgramDevirt Unused after LTO removal from legacy optimization pipeline.	2022-07-17 23:14:53 -07:00
Fangrui Song	1f90cc589e	[LegacyPM] Remove FunctionImportLegacyPass Unused after ThinLTO was removed from legacy optimization pipeline.	2022-07-17 23:06:46 -07:00
Abinav Puthan Purayil	d96361d714	[AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map This change introduces the dynamic stack boolean field to code-object-v3 and above under the code properties of the kernel descriptor and under the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to the is_dynamic_callstack field of amd_kernel_code_t. Differential Revision: https://reviews.llvm.org/D128344	2022-07-18 10:07:13 +05:30
Kazu Hirata	3112987d5c	Remove unused forward declarations (NFC)	2022-07-17 15:37:48 -07:00
Fangrui Song	bbaa015e82	[LegacyPM] Remove LowerTypeTestsPass Unused after LTO removal from optimization pipeline.	2022-07-17 15:06:38 -07:00
Fangrui Song	a6942256ca	[LegacyPM] Remove NameAnonGlobalLegacyPass Unused after LTO removal from optimization pipeline.	2022-07-17 14:38:29 -07:00
Fangrui Song	d74b88c69d	[LegacyPM] Remove CanonicalizeAliasesLegacyPass Unused after LTO removal from optimization pipeline.	2022-07-17 14:30:22 -07:00
Fangrui Song	70519a1fba	[LegacyPM] Remove LTO passes from optimization pipeline Following recent changes removing non-core features of the legacy PM/optimization pipeline.	2022-07-17 14:24:36 -07:00
Fangrui Song	f502115561	[LegacyPM] Remove PGO options from PassManagerBuilder They have been dead since legacy PGO/SamplePGO passes were removed.	2022-07-17 14:03:23 -07:00
Fangrui Song	dd5e3f0e27	[LegacyPM] Remove SampleProfileLoaderLegacyPass Following recent changes removing non-core features of the legacy PM/optimization pipeline (e.g. PGO), remove SamplePGO.	2022-07-17 12:09:46 -07:00
Kazu Hirata	c13a09a462	[llvm] Fix header guards (NFC) Identified with llvm-header-guard.	2022-07-17 02:18:55 -07:00
Kazu Hirata	92a1b2afc8	[Analysis] Remove isArithmeticRecurrenceKind The last use was removed on Jul 30, 2021 in commit `9d35594993`.	2022-07-16 13:23:32 -07:00
Fangrui Song	f9d6f37201	[LegacyPM] Remove ControlHeightReductionLegacyPass This pass tries to reduce the number of conditional branches in the hot path based on profile. It's mostly a no-op after legacy PGO passes are moved.	2022-07-16 01:35:56 -07:00
Fangrui Song	3a42c499c2	[LegacyPM] Remove createInstrProfilingLegacyPass Follow the steps of removing non-core instrumentation passes like PGO.	2022-07-16 01:26:40 -07:00
Fangrui Song	685775bbab	[LegacyPM] Remove CGProfileLegacyPass It's mostly a no-op after I removed legacy PGO passes in D123834.	2022-07-16 00:39:56 -07:00
Fangrui Song	df8f5be596	[LegacyPM] Remove ModuleSanitizerCoverageLegacyPass Follow the steps of various other legacy instrumentation passes removed for 15.0.0.	2022-07-15 19:01:20 -07:00
Mitch Phillips	4162aefad1	Revert "Re-apply `5acd471698`, Add a shared-memory based orc::MemoryMapper, with fixes." This reverts commit `32d8d23cd0`. Reason: Broke the UBSan buildbots. See more details on Phabricator: https://reviews.llvm.org/D128544	2022-07-15 17:11:55 -07:00
Rong Xu	5e0443292b	[PGO] Report number of counts being dropped when a hash-mismatch happens This patch reports number of counts being dropped when a hash-mismatch happens. This information will be helpful to the users -- if the dropped counts are large, the user should redo the instrumentation build and recollect the profile. Differential Revision: https://reviews.llvm.org/D129001	2022-07-15 14:53:59 -07:00
Fangrui Song	0d5a62faca	[sanitizer] Add "mainfile" prefix to sanitizer special case list When an issue exists in the main file (caller) instead of an included file (callee), using a `src` pattern applying to the included file may be inappropriate if it's the caller's responsibility. Add `mainfile` prefix to check the main filename. For the example below, the issue may reside in a.c (foo should not be called with a misaligned pointer or foo should switch to an unaligned load), but with `src` we can only apply to the innocent callee a.h. With this patch we can use the more appropriate `mainfile:a.c`. ``` //--- a.h // internal linkage static inline int load(int x) { return x; } //--- a.c, -fsanitize=alignment #include "a.h" int foo(void *x) { return load(x); } ``` See the updated clang/docs/SanitizerSpecialCaseList.rst for a caveat due to C++ vague linkage functions. Reviewed By: #sanitizers, kstoimenov, vitalybuka Differential Revision: https://reviews.llvm.org/D129832	2022-07-15 10:39:26 -07:00
Anubhab Ghosh	32d8d23cd0	Re-apply `5acd471698`, Add a shared-memory based orc::MemoryMapper, with fixes. The original commit was reverted in `3e9cc543f2` due to buildbot failures, which should be fixed by the addition of dependencies on librt. Differential Revision: https://reviews.llvm.org/D128544	2022-07-15 09:45:30 -07:00
David Kreitzer	c720b6fddd	Clarify the behavior of the llvm.vector.insert/extract intrinsics when the index is out of range. Both intrinsics return a poison value. Consequently, mark the intrinsics speculatable. Differential Revision: https://reviews.llvm.org/D129656	2022-07-15 07:56:44 -07:00
Edd Barrett	2e62a26fd7	[stackmaps] Legalise patchpoint arguments. This is similar to D125680, but for llvm.experimental.patchpoint (instead of llvm.experimental.stackmap). Differential review: https://reviews.llvm.org/D129268	2022-07-15 12:01:59 +01:00
Fangrui Song	3c849d0aef	Modernize Optional::{getValueOr,hasValue}	2022-07-15 01:20:39 -07:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limitations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Nikita Popov	f75ccadcdd	[LSR] Create SCEVExpander earlier, use member isSafeToExpand() (NFC) This is a followup to D129630, which switches LSR to the member isSafeToExpand() variant, and removes the freestanding function. This is done by creating the SCEVExpander early (already during the analysis phase). Because the SCEVExpander is now available for the whole lifetime of LSRInstance, I've also made it into a member variable, rather than passing it around in even more places. Differential Revision: https://reviews.llvm.org/D129769	2022-07-15 09:41:23 +02:00
Fangrui Song	141c9d7759	[llvm-dwp] Add SHF_COMPRESSED support and remove .zdebug support clang 14 removed -gz=zlib-gnu and ld.lld/llvm-objcopy removed .zdebug support recently. llvm-dwp currently doesn't support SHF_COMPRESSED. Add support and remove .zdebug support. Simplify llvm::object::Decompressor which has no .zdebug user now. While here, add tests for ELF32LE, ELF32BE, and ELF64BE. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D129728	2022-07-14 16:19:32 -07:00
Dawid Jurczak	d71128d97d	[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand> This patch is a follow-up to https://reviews.llvm.org/D129468 and addresses one of the comments from that review: https://reviews.llvm.org/D129468#3643295 Differential Revision: https://reviews.llvm.org/D129565	2022-07-14 17:22:32 +02:00
Nikita Popov	9e6e631b38	[LoopPredication] Use isSafeToExpandAt() member function (NFC) As a followup to D129630, this switches a usage of the freestanding function in LoopPredication to use the member variant instead. This was the last use of the freestanding function, so drop it entirely.	2022-07-14 14:49:07 +02:00
Nikita Popov	dcf4b733ef	[SCEVExpander] Make CanonicalMode handling in isSafeToExpand() more robust (PR50506) isSafeToExpand() for addrecs depends on whether the SCEVExpander will be used in CanonicalMode. At least one caller currently gets this wrong, resulting in PR50506. Fix this by a) making the CanonicalMode argument on the freestanding functions required and b) adding member functions on SCEVExpander that automatically take the SCEVExpander mode into account. We can use the latter variant nearly everywhere, and thus make sure that there is no chance of CanonicalMode mismatch. Fixes https://github.com/llvm/llvm-project/issues/50506. Differential Revision: https://reviews.llvm.org/D129630	2022-07-14 14:41:51 +02:00
Namhyung Kim	69b312cde4	[llvm-objdump] Create fake sections for an ELF core file The Linux perf tools use /proc/kcore for disassembling kernel functions. Actually it copies the relevant parts to a temp file and then passes it to objdump. But it doesn't have section headers so llvm-objdump cannot handle it. Let's create fake section headers for the program headers. It'd have a single section for each segment to cover the entire range. And for this purpose we can consider only executable code segments. With this change, I can see the following command shows proper outputs. perf annotate --stdio --objdump=/path/to/llvm-objdump Differential Revision: https://reviews.llvm.org/D128705	2022-07-14 13:39:59 +01:00
Cullen Rhodes	3e9cc543f2	Revert "[ORC] Add a shared-memory based orc::MemoryMapper." This reverts commit `5acd471698`. Breaks shared library build with: ld.lld-12: error: undefined symbol: shm_open >>> referenced by ExecutorSharedMemoryMapperService.cpp:68 (/home/culrho01/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/ExecutorSharedMemoryMapperService.cpp:68) >>> lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/ExecutorSharedMemoryMapperService.cpp.o:(llvm::orc::rt_bootstrap::ExecutorSharedMemoryMapperService::reserve[abi:cxx11](unsigned long)) >>> did you mean: sem_open >>> defined in: /usr/bin/../lib/gcc/aarch64-linux-gnu/9/../../../aarch64-linux-gnu/libpthread.so	2022-07-14 09:52:57 +00:00
Amara Emerson	6e6be5f950	Revert "[llvm] add zstd to llvm::compression namespace" This reverts commit `d449c60076`. Breaks macOS builds with this: llvm/lib/Support/Compression.cpp:24:10: fatal error: 'zstd.h' file not found	2022-07-14 01:23:20 -07:00
Jannik Silvanus	e5c4cde451	[AMDGPU] SIMachineScheduler: Add support for several MachineScheduler features The SI machine scheduler inherits from ScheduleDAGMI. This patch adds support for a few features that are implemented in ScheduleDAGMI (or its base classes) that were missing so far because their support is implemented in overridden functions. * Support cl::opt -view-misched-dags This option opens a graphical window of the scheduling DAG. * Support cl::opt -misched-print-dags This option prints the scheduling DAG in text form. * After constructing the scheduling DAG, call postprocessDAG() to apply any registered DAG mutations. Note that currently there are no mutations defined in AMDGPUTargetMachine.cpp in case SIScheduler is used. Still add this to avoid surprises in the future in case mutations are added. Differential Revision: https://reviews.llvm.org/D128808	2022-07-14 09:45:31 +02:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Cole Kissane	d449c60076	[llvm] add zstd to llvm::compression namespace - add `FindZSTD.cmake` - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-13 19:58:42 -07:00
Cole Kissane	5ecb161c64	Revert "[llvm] add zstd to `llvm::compression` namespace" This reverts commit `cef07169ec`.	2022-07-13 19:48:29 -07:00
Cole Kissane	cef07169ec	[llvm] add zstd to `llvm::compression` namespace - add `FindZSTD.cmake` - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-13 19:06:27 -07:00
Fangrui Song	e690137dde	[Support] Change compression::zlib::{compress,uncompress} to use uint8_t * It's more natural to use uint8_t * (std::byte needs C++17 and llvm has too much uint8_t) and most callers use uint8_t instead of char *. The functions were recently moved into `llvm::compression::zlib::`, so downstream projects need to adapt anyway.	2022-07-13 16:26:54 -07:00
Anubhab Ghosh	5acd471698	[ORC] Add a shared-memory based orc::MemoryMapper. This is an implementation of orc::MemoryMapper that maps shared memory pages in both executor and controller process and writes directly to them avoiding transferring content over EPC. All allocations are properly deinitialized automatically on the executor side at shutdown by the ExecutorSharedMemoryMapperService. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D128544	2022-07-13 15:24:28 -07:00
Philip Reames	dde2a7fb6d	[RISCV] Exploit fact that vscale is always power of two to replace urem sequence When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The RHS of that urem is a (possibly shifted) call to @llvm.vscale. vscale is effectively the number of "blocks" in the vector register. (That is, types such as <vscale x 8 x i8> and <vscale x 1 x i8> both fill one 64 bit block, and vscale is essentially how many of those blocks there are in a single vector register at runtime.) We know from the RISCV V extension specification that VLEN must be a power of two between ELEN and 2^16. Since our block size is 64 bits, there must be a power-of-two number of blocks. (For everything other than VLEN<=32, but that's already broken.) It is worth noting that AArch64 SVE specification explicitly allows non-power-of-two sizes for the vector registers and thus can't claim that vscale is a power of two by this logic. Differential Revision: https://reviews.llvm.org/D129609	2022-07-13 10:54:47 -07:00
Fangrui Song	b28412d539	[llvm-objcopy][ELF] Add --set-section-type The request is mentioned on D129053. I feel that having this functionality is mildly useful (not strong). * Rename .ctors to .init_array and change sh_type to SHT_INIT_ARRAY (GNU objcopy detects the special name but we don't). * Craft tests for a new SHT_LLVM_* extension Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D129337	2022-07-13 10:04:21 -07:00
Mitch Phillips	90e5a8ac47	Remove 'no_sanitize_memtag'. Add 'sanitize_memtag'. For MTE globals, we should have clang emit the attribute for all GV's that it creates, and then use that in the upcoming AArch64 global tagging IR pass. We need a positive attribute for this sanitizer (rather than implicit sanitization of all globals) because it needs to interact with other parts of LLVM, including: 1. Suppressing certain global optimisations (like merging), 2. Emitting extra directives by the ASM writer, and 3. Putting extra information in the symbol table entries. While this does technically make the LLVM IR / bitcode format non-backwards-compatible, nobody should have used this attribute yet, because it's a no-op. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D128950	2022-07-13 08:54:41 -07:00
Nikita Popov	6f9d990a6e	[TargetFolder] Use DL-aware folding for icmp The Fold() call was accidentally dropped in `138fcc5f76`, though it doesn't seem to make a difference in practice (no test changes).	2022-07-13 15:35:13 +02:00
Nikita Popov	6d6983ced9	[IRBuilder] Migrate fneg to fold infrastructure Make use of a single FoldUnOpFMF() API, though in practice FNeg is the only unary operation that exists. This is likely NFC in practice, because users of InstSimplifyFolder don't create fneg.	2022-07-13 15:29:52 +02:00
Max Kazantsev	30e33b4b81	[SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional	2022-07-13 18:54:25 +07:00
Corentin Jabot	d4892a168f	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by-default `-Winvalid-utf8` warning that detects invalid UTF-8 code unit sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as it's likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to-be-approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-13 10:19:26 +02:00
Kazu Hirata	3361a364e6	[llvm] Use has_value instead of hasValue (NFC)	2022-07-12 22:25:42 -07:00
Nathan James	a565509308	[ADT] Use Empty Base Optimization for Allocators In D94439, BumpPtrAllocator changed its implementation to use an empty base optimization for the underlying allocator. This patch builds on that by extending its functionality to more classes as well as enabling the underlying allocator to be a reference type, something not currently possible as you can't derive from a reference. The main place this sees use is in StringMaps which often use the default MallocAllocator, yet have to pay the size of a pointer for no reason. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D129206	2022-07-12 23:57:04 +01:00
Jonas Devlieghere	a262f4dbd7	Revert "[Clang] Add a warning on invalid UTF-8 in comments." This reverts commit `cc309721d2` because it breaks the following tests on GreenDragon: TestDataFormatterObjCCF.py TestDataFormatterObjCExpr.py TestDataFormatterObjCKVO.py TestDataFormatterObjCNSBundle.py TestDataFormatterObjCNSData.py TestDataFormatterObjCNSError.py TestDataFormatterObjCNSNumber.py TestDataFormatterObjCNSURL.py TestDataFormatterObjCPlain.py TestDataFormatterObjNSException.py https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/	2022-07-12 15:22:29 -07:00
Kai Nacke	4ae254e488	Revert "[GISel] Unify use of getStackGuard" This reverts commit `e60b4fb2b7`.	2022-07-12 17:00:43 -04:00
Kai Nacke	e60b4fb2b7	[GISel] Unify use of getStackGuard Some rework of getStackGuard() based on comments in https://reviews.llvm.org/D129505. - getStackGuard() now creates and returns the destination register, simplifying calls - the pointer type is passed to getStackGuard() to avoid recomputation - removed PtrMemTy in emitSPDescriptorParent(), because this type is only used here when loading the value but not when storing the value Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D129576	2022-07-12 16:46:37 -04:00
Sunho Kim	2a0aa98c8d	[ORC] Remove unused function declaration. (NFC) Differential Revision: https://reviews.llvm.org/D129582	2022-07-13 05:13:31 +09:00
Sunho Kim	db995d72db	[JITLink][COFF] Initial COFF support. Adds initial COFF support in JITLink. This is able to run a hello world C program on x86 Windows successfully. Implemented - COFF object loader - Static local symbols - Absolute symbols - External symbols - Weak external symbols - Common symbols - COFF jitlink-check support - All COMDAT selection types except largest - Implicit symbol size calculation - Rel32 relocation with PLT stub. - IMAGE_REL_AMD64_ADDR32NB relocation Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D128968	2022-07-13 03:52:43 +09:00
Yuanfang Chen	fcb7d76d65	[coroutine] add nomerge function attribute to `llvm.coro.save` It is illegal to merge two `llvm.coro.save` calls unless their `llvm.coro.suspend` users are also merged. Marks it "nomerge" for the moment. This reverts D129025. Alternative to D129025, which affects other token type users like WinEH. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D129530	2022-07-12 10:39:38 -07:00
Nick Desaulniers	2240d72f15	[X86] initial -mfunction-return=thunk-extern support Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute Where the supported <value>s are: * keep (disable) * thunk-extern (enable) thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern. The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk! This new MachineFunctionPass is very similar to "x86-lvi-ret". The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere. Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk"). Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED. * CVE-2022-23816 * CVE-2022-28693 * CVE-2022-29901 See also: * "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." * AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion * TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12 * Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE. This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release. Many thanks to folks that provided discreet review off list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information. Link: https://www.youtube.com/watch?v=IF6HbCKQHK8 Link: https://github.com/llvm/llvm-project/issues/54404 Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1 Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60 Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html Reviewed By: aaron.ballman, craig.topper Differential Revision: https://reviews.llvm.org/D129572	2022-07-12 09:17:54 -07:00
Dawid Jurczak	165240fe38	[NFC] Fix compile time regression seen on some benchmarks after `a630ea3003` commit The goal of this change is to fix most of the compile time slowdown seen after the `a630ea3003` commit on the lencod and sqlite3 benchmarks. There are 3 improvements included in this patch: 1. In getNumOperands, when possible, get the value directly from SmallNumOps. 2. Inline getLargePtr by moving its definition to the header. 3. In TBAAStructTypeNode::getField, get all operands once instead of taking operands in a loop one by one. Differential Revision: https://reviews.llvm.org/D129468	2022-07-12 15:00:27 +02:00
Corentin Jabot	cc309721d2	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by-default `-Winvalid-utf8` warning that detects invalid UTF-8 code unit sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as it's likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to-be-approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-12 14:34:30 +02:00
Nikita Popov	00797b88e0	[InlineAsm] Improve error messages for invalid constraint strings InlineAsm constraint string verification can fail for many reasons, but used to always print a generic "invalid type for inline asm constraint string" message -- which is especially confusing if the actual error is unrelated to the type, e.g. a failure to parse the constraint string. Change the verify API to return an Error with a more specific error message, and print that in the IR parser.	2022-07-12 11:41:16 +02:00
Nikita Popov	4bb7b6fae3	[IR] Remove support for float binop constant expressions As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179, this removes support for the floating-point binop constant expressions fadd, fsub, fmul, fdiv and frem. As part of this change, the C APIs LLVMConstFAdd, LLVMConstFSub, LLVMConstFMul, LLVMConstFDiv and LLVMConstFRem are removed. The LLVMBuild APIs should be used instead. Differential Revision: https://reviews.llvm.org/D129478	2022-07-12 09:40:49 +02:00
Kazu Hirata	ec9a0e36d9	[IPO] Remove addLTOOptimizationPasses and addLateLTOOptimizationPasses (NFC) The last uses were removed on Apr 15, 2022 in commit `2e6ac54cf4`. Differential Revision: https://reviews.llvm.org/D129460	2022-07-11 20:15:24 -07:00
Xiang1 Zhang	a45dd3d814	[X86] Support -mstack-protector-guard-symbol Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D129346	2022-07-12 10:17:00 +08:00
Xiang1 Zhang	643786213b	Revert "[X86] Support -mstack-protector-guard-symbol" This reverts commit `efbaad1c4a` because the review info was missing.	2022-07-12 10:14:32 +08:00
Xiang1 Zhang	efbaad1c4a	[X86] Support -mstack-protector-guard-symbol	2022-07-12 10:13:48 +08:00
Prabhdeep Singh Soni	ac892c70a4	[OMPIRBuilder] Add support for simdlen clause This patch adds OMPIRBuilder support for the simdlen clause for the simd directive. It uses the simdlen support in OpenMPIRBuilder when it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by generating the loop.vectorize.width metadata. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D129149	2022-07-11 13:29:06 -04:00
spupyrev	eecd41aa09	Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type" This reverts commit `6d0528636a`.	2022-07-11 09:50:47 -07:00
Rafael Auler	6d0528636a	Rebase: [Facebook] [MC] Introduce NeverAlign fragment type Summary: Introduce NeverAlign fragment type. The intended usage of this fragment is to insert it before a pair of macro-op fusion eligible instructions. NeverAlign fragment ensures that the next fragment (first instruction in the pair) does not end at a given alignment boundary by emitting a minimal size nop if necessary. In effect, it ensures that a pair of macro-fusible instructions is not split by a given alignment boundary, which is a precondition for macro-op fusion in modern Intel Cores (64B = cache line size, see Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode Pipeline: Macro-Fusion). This patch introduces functionality used by BOLT when emitting code with MacroFusion alignment already in place. The use case is different from BoundaryAlign and instruction bundling: - BoundaryAlign can be extended to perform the desired alignment for the first instruction in the macro-op fusion pair (D101817). However, this approach has higher overhead due to reliance on relaxation as BoundaryAlign requires in the general case - see https://reviews.llvm.org/D97982#2710638. - Instruction bundling: the intent of NeverAlign fragment is to prevent the first instruction in a pair ending at a given alignment boundary, by inserting at most one minimum size nop. It's OK if either instruction crosses the cache line. Padding both instructions using bundles to not cross the alignment boundary would result in excessive padding. There's no straightforward way to request instruction bundling to avoid a given end alignment for the first instruction in the bundle. LLVM: https://reviews.llvm.org/D97982 Manual rebase conflict history: https://phabricator.intern.facebook.com/D30142613 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: phabricatorlinter Differential Revision: https://phabricator.intern.facebook.com/D31361547	2022-07-11 09:31:52 -07:00
David Sherwood	03fee6712a	[LoopVectorize] Add option to use active lane mask for loop control flow Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop. This patch adds support for using the active lane mask for control flow by: 1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set. 2. Extract the first bit of this mask and use this bit for the conditional branch. I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region. Differential Revision: https://reviews.llvm.org/D125301	2022-07-11 13:46:55 +01:00
Abhina Sreeskantharajan	6e2329e33a	[SystemZ][z/OS] Force alignment to fix build failure on z/OS The following commit https://reviews.llvm.org/D125998 added a static_assert which was triggered on z/OS because bitfields are always aligned to 1 regardless of type. ``` error: static_assert failed due to requirement 'alignof(llvm::SmallVector<llvm::MDOperand, 0>) <= alignof(llvm::MDNode::Header)' "LargeStorageVector too strongly aligned" ``` The solution was to force the alignment to be size_t. Reviewed By: wolfgangp Differential Revision: https://reviews.llvm.org/D129369	2022-07-11 08:29:29 -04:00
Kazu Hirata	c13d04e599	[DWARFLinker] Remove unused declaration copyAbbrev (NFC) The corresponding definition was removed on Apr 26, 2021 in commit `233c24330b`.	2022-07-10 22:10:23 -07:00
Kazu Hirata	f2e1d2cec0	[GlobalISel] Remove unused declaration fewerElementsVectorSextInReg (NFC) The corresponding definition was removed on Dec 23, 2021 in commit `29f88b93fd`.	2022-07-10 20:41:02 -07:00
Nicolai Hähnle	ede600377c	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in `e9ce1a5880` due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 10:29:15 +02:00
Nicolai Hähnle	e9ce1a5880	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit `e6f1f06245`. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.	2022-07-10 09:54:30 +02:00
Nicolai Hähnle	e6f1f06245	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 09:15:08 +02:00
Fangrui Song	2c18e817ee	[Support] Delete redundant 'static' from namespace scope 'static constexpr'. NFC	2022-07-09 23:36:01 -07:00
Corentin Jabot	50416e5454	Revert "[Clang] Add a warning on invalid UTF-8 in comments." It is probable that this change crashes on the powerpc bots. This reverts commit `355532a149`.	2022-07-09 17:18:35 +02:00
Lang Hames	7ac7837080	[JITLink][AArch64] Rename PointerToGOT and fix typo. PointerToGOT lowering was accidentally changed from Delta32 to Delta64 in `db37225803`. This patch moves it back to Delta32 and renames the generic aarch64 edge to Delta32ToGOT to avoid the ambiguity. No test case yet -- I haven't figured out how to write a succinct test case (this typically appears in CIEs in eh-frames).	2022-07-09 08:09:23 -07:00
Corentin Jabot	355532a149	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by-default `-Winvalid-utf8` warning that detects invalid UTF-8 code unit sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as it's likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to-be-approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-09 11:26:45 +02:00
Leonard Chan	474c873148	Revert "[llvm] cmake config groundwork to have ZSTD in LLVM" This reverts commit `f07caf20b9` which seems to break upstream https://lab.llvm.org/buildbot/#/builders/109/builds/42253.	2022-07-08 13:48:05 -07:00
Cole Kissane	f07caf20b9	[llvm] cmake config groundwork to have ZSTD in LLVM - added `FindZSTD.cmake` - added a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - likewise added have_zstd to compiler-rt/test/lit.common.cfg.py, clang-tools-extra/clangd/test/lit.cfg.py, and several lit.site.cfg.py.in files mirroring have_zlib behavior Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-08 11:46:52 -07:00
Joseph Huber	5300263c70	[OpenMP] Add loop tripcount argument to kernel launch and remove push function Previously we added the `push_target_tripcount` function to send the loop tripcount to the device runtime so we knew how to configure the teams / threads to execute the loop for a teams distribute construct. This was implemented as a separate function mostly to avoid changing the interface for backwards compatibility. Now that we've changed it anyway and the new interface can take an arbitrary number of arguments via the struct without changing the ABI, we can move this to the new interface. This will simplify the runtime by removing unnecessary state between calls. Depends on D128550 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D128816	2022-07-08 14:44:16 -04:00
Joseph Huber	1fff116645	[OpenMP] Change OpenMP code generation for target region entries This patch changes the code we generate to enter a target region on the device. This is in-line with the new definition in the runtime that was added previously. Additionally we implement this in the OpenMPIRBuilder so that this code can be shared with Flang in the future. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D128550	2022-07-08 14:44:11 -04:00
Cole Kissane	96063bfa90	[llvm] Remove unused and redundant crc32 function from llvm::compression::zlib namespace * Remove crc32 from zlib compression namespace, people should use the `llvm::crc32` instead. Reviewed By: MaskRay, leonardchan Differential Revision: https://reviews.llvm.org/D128754	2022-07-08 11:24:45 -07:00
Cole Kissane	ea61750c35	[NFC] Refactor llvm::zlib namespace * Refactor compression namespaces across the project, making way for a possible introduction of alternatives to zlib compression. Changes are as follows: * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`. Reviewed By: MaskRay, leonardchan, phosek Differential Revision: https://reviews.llvm.org/D128953	2022-07-08 11:19:07 -07:00
Nicolai Hähnle	5a731d733c	Fix test: LLVMGetBitcodeModule takes ownership of memory buffer Clarify this behavior in the C interface header file and fix a related bug in a test. Differential Revision: https://reviews.llvm.org/D129113	2022-07-08 20:06:44 +02:00
Matt Arsenault	13ac4c3de9	GlobalISel: Add buildBoolExtInReg helper	2022-07-08 11:55:08 -04:00
Matt Arsenault	1ee6ce9bad	GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD SelectionDAG has a target hook, getExtendForAtomicOps, which it uses in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty ugly (as is having a separate load opcode for atomics), so instead allow making use of atomic zextload. Enable this for AArch64 since the DAG path defaults in to the zext behavior. The tablegen changes are pretty ugly, but partially helps migrate SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with atomic memory operands. For now the DAG emitter will emit matchers for patterns which the DAG will not produce. I'm still a bit confused by the intent of the isLoad/isStore/isAtomic bits. The DAG implementation rejects trying to use any of these in combination. For now I've opted to make the isLoad checks also check isAtomic, although I think having isLoad and isAtomic set on these makes most sense.	2022-07-08 11:55:08 -04:00
Valentin Clement	015834e455	[flang][openacc][NFC] Extract device_type parser to its own Move the device_type parser to a separate parser AccDeviceTypeExprList. Preparatory work for D106968. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D106967	2022-07-08 16:02:04 +02:00
Valentin Clement	36e24da8eb	[flang][openacc][NFC] Make self clause value optional in ACC.td and extract the parser Set the isOptional flag for the self clause. Move the optional and parenthesis part of the parser. Update the rest of the code to deal with the optional value. Preparatory work for D106968. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D106965	2022-07-08 15:45:12 +02:00
Johannes Doerfert	f6e0c05e3d	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit `f17639ea0c` as three AMDGPU tests haven't been updated. Will need to verify the changes are not regressions we should avoid.	2022-07-08 00:53:38 -05:00

48889 Commits