llvm-project

Commit Graph

Author	SHA1	Message	Date
Wenlei He	47d66355ef	[llvm-profgen] Fix alignment in preferred based calculation We used the segment alignment in elf header to assume the loader alignment. However this is incorrect because loader alignment is always the same as page size. If segment needs to be aligned at load time, linker will set aligned address as virtual address in elf header. Differential Revision: https://reviews.llvm.org/D110795	2021-09-29 23:01:10 -07:00
Wenlei He	1f0bc617bd	[llvm-profgen] Allow perf data as input This change enables llvm-profgen to take raw perf data as alternative input format. Sometimes we need to retrieve events for processes with matching binary. Using perf data as input allows us to retrieve process Ids from mmap events for matching binary, then filter by process id during perf script generation. Differential Revision: https://reviews.llvm.org/D110793	2021-09-29 22:57:35 -07:00
Wenlei He	941191aae4	[llvm-profgen] Refactor and better diagnostics This change contains diagnostics improvements, refactoring and preparation for consuming perf data directly. Diagnostics: - We now have more detailed diagnostics when no mmap is found. - We also print warning for abnormal transition to external code. Refactoring: - Simplify input perf trace processing to only allow a single input file. This is because 1) using multiple input perf trace (perf script) is error prone because we may miss key mmap events. 2) the functionality is not really being used anyways. - Make more functions private for Readers, move non-trivial definitions out of header. Cleanup some inconsistency. - Prepare for consuming perf data as input directly. Differential Revision: https://reviews.llvm.org/D110729	2021-09-29 22:55:50 -07:00
Fangrui Song	8971b99c83	[llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling convention or the vector calling convention. See https://github.com/riscv/riscv-elf-psabi-doc/pull/190 Differential Revision: https://reviews.llvm.org/D107949	2021-09-29 16:56:52 -07:00
Wael Yehia	8b8da01d88	Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace." This reverts commit `a60405cf03`.	2021-09-29 19:43:35 +00:00
Michael Kruse	d9562a8e45	[llvm-reduce] Reduce metadata references. The ReduceMetadata pass before this patch removed metadata on a per-MDNode (or NamedMDNode) basis. Either all references to an MDNode are kept, or all of them are removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNodes. As a consequence, e.g. tbaa references to the same type will all have the same MDNode reference and hence make it impossible to reduce only keeping metadata on those memory accesses for which they are interesting. Moreover, MDNodes can also be referenced by some intrinsics or other MDNodes. These references were not considered for removal leading to the possibility that MDNodes are not actually removed even if selected to be removed by the oracle. This patch changes ReduceMetadata to reduce based on removable metadata references instead. MDNodes without references are implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed as it would violate the immutability of MDNodes. Additionally, ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects a MDKind (such as `MD_tbaa`) as first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind). Reviewed By: aeubanks, swamulism Differential Revision: https://reviews.llvm.org/D110534	2021-09-29 11:25:35 -05:00
Wael Yehia	a60405cf03	[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace. Reviewed by: steven_wu, fhahn, tejohnson Differential Revision: https://reviews.llvm.org/D110075	2021-09-29 12:17:53 +00:00
Igor Kudrin	7b424b9333	[llvm-objcopy] Rename relocation sections together with their targets. As for now, llvm-objcopy renames only sections that are specified explicitly in --rename-section, while GNU objcopy keeps names of relocation sections in sync with their targets. For example: > readelf -S test.o ... [ 1] .foo PROGBITS [ 2] .rela.foo RELA > objcopy --rename-section .foo=.bar test.o gnu.o > readelf -S gnu.o ... [ 1] .bar PROGBITS [ 2] .rela.bar RELA > llvm-objcopy --rename-section .foo=.bar test.o llvm.o > readelf -S llvm.o ... [ 1] .bar PROGBITS [ 2] .rela.foo RELA This patch makes llvm-objcopy to match the behavior of GNU objcopy better. Differential Revision: https://reviews.llvm.org/D110352	2021-09-29 16:36:37 +07:00
wlei	a03cf331e1	[llvm-profgen] Strip context to support non-CS profile generation for hybrid sample Differential Revision: https://reviews.llvm.org/D109769	2021-09-28 12:20:23 -07:00
Lang Hames	ab5e6e7434	[llvm-jitlink] Add a -slab-page-size option to override process page size. The slab allocator is frequently used in -noexec tests where we want a consistent memory layout. In this context we also want to set the effective page size, rather than using the page size of the host process, since not all systems use the same page size. The -slab-page-size option allows us to set the page size for such tests. The -slab-page-size option will also be honored in exec mode when using the slab allocator, but will trigger an error if the requested size is not a multiple of the actual process page size. This option was motivated by test failures on a ppc64 bot that was returning zero from sys::Process::getPageSize(), so it also contains a check for errors and zero results from that function if the -slab-page-size option is absent. Existing slab allocator tests will be updated to use this option in a follow-up commit so that we can point the failing bot at this commit and observe errors associated with sys::Process::getPageSize().	2021-09-28 10:43:46 -07:00
Fangrui Song	74a47e54be	[llvm-objdump] Fix -R display and support ET_EXEC * Add a newline before `DYNAMIC RELOCATION RECORDS` (see D101796) * Add the missing `OFFSET TYPE VALUE` line * Align columns Note: llvm-readobj/ELFDumper.cpp `loadDynamicTable` has sophisticated PT_DYNAMIC code which is unavailable in llvm-objdump. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D110595	2021-09-28 09:58:27 -07:00
wlei	ce40843a3f	[llvm-profgen][CSSPGO] On-demand function size computation for preinliner Similar to https://reviews.llvm.org/D110465, we can compute function size on-demand for the functions that's hit by samples. Here we leverage the raw range samples' address to compute a set of sample hit function. Then `BinarySizeContextTracker` just works on those function range for the size. Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D110466	2021-09-28 09:09:38 -07:00
wlei	091c16f76b	[llvm-profgen] On-demand symbolization Previously we symbolized all the functions, but we only need the symbols that are hit by the samples. This can significantly reduce the run time for large binaries. Optimization for the preinliner will come along with the next patch. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110465	2021-09-28 09:09:25 -07:00
Lang Hames	61e25d2550	clang-format	2021-09-27 18:02:06 -07:00
Lang Hames	22f8276fe4	[llvm-jitlink] Add more information about allocation failures. Slab allocator failures will now report requested size and remaining capacity.	2021-09-27 18:02:06 -07:00
Lang Hames	21a06254a3	[ORC] Switch from JITTargetAddress to ExecutorAddr for EPC-call APIs. Part of the ongoing move to ExecutorAddr.	2021-09-27 16:53:09 -07:00
Jozef Lawrynowicz	6cfb4d46ba	[llvm-readobj] Support dumping of MSP430 ELF attributes The MSP430 ABI supports build attributes for specifying the ISA, code model, data model and enum size in ELF object files. Differential Revision: https://reviews.llvm.org/D107969	2021-09-28 00:56:11 +03:00
gbreynoo	05b1c7aebf	[llvm-dwarfdump][docs] Add missing options to the help output and the command guide This change is to add some missing details to the help text and command guide: - Added a note to the command guide that --debug-macro also dumps .debug_macinfo. - Added a note to the command guide that --debug-frame and --eh_frame are aliases, and in cases where both sections are present one command outputs both. - Changed the wording in the help output for --ignore-case and --regex to closer match the command guide.	2021-09-27 14:28:31 +01:00
Lang Hames	a12c0d5ea6	[ORC] Export process symbols in lli-child-target. We want this behavior for future testing infrastructure anyway, and it may help with the failure in https://lab.llvm.org/buildbot/#/builders/98/builds/6401: /b/fuchsia-x86_64-linux/llvm.obj/tools/clang/stage2-bins/bin/lli: warning: remote mcjit does not support lazy compilation Finalization error: could not register eh-frame: __register_frame function not found /b/fuchsia-x86_64-linux/llvm.obj/tools/clang/stage2-bins/bin/lli: disconnecting	2021-09-26 11:22:49 -07:00
Lang Hames	6498b0e991	Reintroduce "[ORC] Introduce EPCGenericRTDyldMemoryManager." This reintroduces "[ORC] Introduce EPCGenericRTDyldMemoryManager." (`bef55a2b47`) and "[lli] Add ChildTarget dependence on OrcTargetProcess library." (`7a219d801b`) which were reverted in `99951a5684` due to bot failures. The root cause of the bot failures should be fixed by "[ORC] Fix uninitialized variable." (`0371049277`) and "[ORC] Wait for handleDisconnect to complete in SimpleRemoteEPC::disconnect." (`320832cc9b`).	2021-09-27 03:24:33 +10:00
Lang Hames	175c1a39e8	[ORC][llvm-jitlink] Add debugging output to SimpleRemoteEPC (and Server). Also adds an optional 'debug' argument to the llvm-jitlink-executor tool to enable debug-logging.	2021-09-26 10:00:29 -07:00
Lang Hames	99951a5684	Revert "[ORC] Introduce EPCGenericRTDyldMemoryManager." This reverts commit `bef55a2b47` while I investigate failures on some bots. Also reverts "[lli] Add ChildTarget dependence on OrcTargetProcess library." (`7a219d801b`) which was a follow-up to `bef55a2b47`.	2021-09-25 11:19:14 -07:00
Lang Hames	7a219d801b	[lli] Add ChildTarget dependence on OrcTargetProcess library. ChildTarget depends on OrcTargetProcess after `bef55a2b47`.	2021-09-25 10:51:29 -07:00
Lang Hames	bef55a2b47	[ORC] Introduce EPCGenericRTDyldMemoryManager. EPCGenericRTDyldMemoryManager is an EPC-based implementation of the RuntimeDyld::MemoryManager interface. It enables remote-JITing via EPC (backed by a SimpleExecutorMemoryManager instance on the executor side) for RuntimeDyld clients. The lli and lli-child-target tools are updated to use SimpleRemoteEPC and SimpleRemoteEPCServer (rather than OrcRemoteTargetClient/Server), and EPCGenericRTDyldMemoryManager for MCJIT tests. By enabling remote-JITing for MCJIT and RuntimeDyld-based ORC clients, EPCGenericRTDyldMemoryManager allows us to deprecate older remote-JITing support, including OrcTargetClient/Server, OrcRPCExecutorProcessControl, and the Orc RPC system itself. These will be removed in future patches.	2021-09-25 10:42:10 -07:00
modimo	ce6ed64a69	[llvm-profdata] Extend support of --topn to sample profiles Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D110449	2021-09-24 16:42:46 -07:00
wlei	1422fa5fab	[llvm-profgen] Unify output format of different unsymbolized profiles Differential Revision: https://reviews.llvm.org/D110080	2021-09-24 14:18:00 -07:00
wlei	28277e9b48	[AutoFDO][llvm-profgen] Report zero count for unexecuted part of function code In order to be consistent with the compiler, which interprets zero count as unexecuted (cold), this change reports a zero-value count for the unexecuted part of function code. For the implementation, it leverages the range counter and initializes all the executed function ranges with the zero-value. After all ranges are merged and converted into disjoint ranges, the remaining zero count will indicate the unexecuted (cold) part of the function. This change also extends the current `findDisjointRanges` method, which can now support adding zero-value ranges. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D109713	2021-09-24 14:15:05 -07:00
wlei	d5f2013004	[AutoFDO][llvm-profgen] Profile generation for LBR(non-CS) sample This patch introduces non-CS AutoFDO profile generation into LLVM. The profile is supposed to be well consumed by compiler using `-fprofile-sample-use=[profile]`. After range and branch counters are extracted from the LBR sample, we go through each address for symbolization, create FunctionSamples and populate its sub fields like TotalSamples, BodySamples and HeadSamples etc. For inlined code, as we need to map back to the original code, we always add body samples to the leaf frame's function sample. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D109551	2021-09-24 13:55:34 -07:00
wlei	a7cdcf25c1	[llvm-profgen] Ignore invalid perf line in LBR record Similar to https://reviews.llvm.org/D109637, there is a whole invalid line of message in perfscript. ``` warning: Invalid address in LBR record at line 14118674: Processed 14138923 events and lost 1 chunks! warning: Invalid address in LBR record at line 14118676: Check IO/CPU overload! ``` This only happened for LBR only perfscript, hybridperfscript have a check of " 0x" to make sure it's the LBR perf line. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110424	2021-09-24 13:44:57 -07:00
Teresa Johnson	b5bfbb4da2	Fix bot failure by adding needed dependence Fix bot failure from `96cb97c453`, e.g.: https://lab.llvm.org/buildbot/#/builders/61/builds/15203 llvm-lto now needs to link in IPO.	2021-09-24 12:43:10 -07:00
Teresa Johnson	96cb97c453	[ThinLTO] Update combined index for SamplePGO indirect calls to locals In ThinLTO for locals we normally compute the GUID from the name after prepending the source path to get a unique global id. SamplePGO indirect call profiles contain the target GUID without this uniquification, however (unless compiling with -funique-internal-linkage-names). In order to correctly handle the call edges added to the combined index for these indirect calls, during importing and bitcode writing we consult a map of original to full GUID to identify the actual callee. However, for a large application this was consuming a lot of compile time as we need to do this repeatedly (especially during importing where we may traverse call edges multiple times). To fix this implement a suggestion in one of the FIXME comments, and actually modify the call edges during a single traversal after the index is built to perform the fixups once. I combined this fixup with the dead code analysis performed on the index in order to avoid adding an additional walk of the index. The dead code analysis is the first analysis performed on the index. This reduced the time required for a large thin link with SamplePGO by about 20%. No new test added, but I confirmed that there are existing tests that will fail when no fixup is performed. Differential Revision: https://reviews.llvm.org/D110374	2021-09-24 12:29:49 -07:00
Igor Kudrin	6dda6c49ce	[llvm-objcopy][NFC] Add a helper method RelocationSectionBase::getNamePrefix() Refactor handleArgs() to use that method. Differential Revision: https://reviews.llvm.org/D110350	2021-09-24 22:02:36 +07:00
gbreynoo	3bad9616aa	[llvm-objcopy][docs] Add missing options to the help output and the command guide This change is to keep the help text and command guide of objcopy in tandem. - In the help output the options --rename-section and --set-section-flags were missing the flag exclude, which is found in the command guide. - In the command guide the alias -G for --keep-global-symbol was missing, which is found in the help output. Differential Revision: https://reviews.llvm.org/D110340	2021-09-24 09:44:46 +01:00
Simon Pilgrim	5f2c53bdf4	Pass some DataLayout arguments by const-ref Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-23 15:50:31 +01:00
wlei	1ed69bb86e	[llvm-profgen] Fix a dangling vector reference in CS line number based generator It seems we missed one spot to persist `SampleContextFrameVector` into the global table (CSProfileGenerator::populateFunctionBoundarySamples:340) which causes a crash. This change tried to fix it in a centralized way i. e. where we generate the `FunctionSamples`. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110275	2021-09-22 18:33:28 -07:00
wlei	686cc00067	[llvm-profgen] Fix an out-of-range error during unwinding It happened that the LBR entry target can be the first address of text section which causes an out-of-range crash. So here add a boundary check. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110271	2021-09-22 18:33:27 -07:00
wlei	c2be2d3284	[llvm-profgen] Fix a bug of assertion The assertion should work on the entire context. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D110268	2021-09-22 18:33:27 -07:00
Wenlei He	81c249784f	[llvm-profgen] Use hot threshold for context merging and trimming Without preinliner, we need to tune down the cold count cutoff to merge/trim more context to limit profile size for large components. However it doesn't make sense for cold threshold to be higher than hot threshold, so we now change to use hot threshold as merging/trimming cut off instead. Differential Revision: https://reviews.llvm.org/D110212	2021-09-22 15:01:51 -07:00
Hongtao Yu	734f4d832c	[llvm-profgen] An option to dump disasm of specified symbols For large app, dumping disasm of the whole program can be slow and result in giant output. Adding a switch to dump specific symbols only. Reviewed By: wlei Differential Revision: https://reviews.llvm.org/D110079	2021-09-22 10:32:59 -07:00
Craig Topper	d85e347a28	[RISCV] Add a pass to recognize VLS strided loads/store from gather/scatter. For strided accesses the loop vectorizer seems to prefer creating a vector induction variable with a start value of the form <i32 0, i32 1, i32 2, ...>. This value will be incremented each loop iteration by a splat constant equal to the length of the vector. Within the loop, arithmetic using splat values will be done on this vector induction variable to produce indices for a vector GEP. This pass attempts to dig through the arithmetic back to the phi to create a new scalar induction variable and a stride. We push all of the arithmetic out of the loop by folding it into the start, step, and stride values. Then we create a scalar GEP to use as the base pointer for a strided load or store using the computed stride. Loop strength reduce will run after this pass and can do some cleanups to the scalar GEP and induction variable. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D107790	2021-09-20 09:39:44 -07:00
Samuel	f18c0739b3	[llvm-reduce] Add reduce operands pass Add reduction to set operands to default values Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D108903	2021-09-17 12:32:15 -07:00
Lang Hames	78b083dbb7	[ORC] Add finalization & deallocation actions, SimpleExecutorMemoryManager class Finalization and deallocation actions are a key part of the upcoming JITLinkMemoryManager redesign: They generalize the existing finalization and deallocate concepts (basically "copy-and-mprotect", and "munmap") to include support for arbitrary registration and deregistration of parts of JIT linked code. This allows us to register and deregister eh-frames, TLV sections, language metadata, etc. using regular memory management calls with no additional IPC/RPC overhead, which should both improve JIT performance and simplify interactions between ORC and the ORC runtime. The SimpleExecutorMemoryManager class provides executor-side support for memory management operations, including finalization and deallocation actions. This support is being added in advance of the rest of the memory manager redesign as it will simplify the introduction of an EPC based RuntimeDyld::MemoryManager (since eh-frame registration/deregistration will be expressible as actions). The new RuntimeDyld::MemoryManager will in turn allow us to remove older remote allocators that are blocking the rest of the memory manager changes.	2021-09-17 09:55:45 +10:00
Nico Weber	646299d183	[Support] Convert BinaryStream class zoo to 64-bit offsets Most PDB fields on disk are 32-bit but describe the file in terms of MSF blocks, which are 4 kiB by default. So PDB files can be a bit larger than 4 GiB, and much larger if you create them with a block size > 4 kiB. This is a first (necessary, but by far not sufficient) step towards supporting such PDB files. Now we don't truncate in-memory file offsets (which are in terms of bytes, not in terms of blocks). No effective behavior change. lld-link will still error out if it were to produce PDBs > 4 GiB. Differential Revision: https://reviews.llvm.org/D109923	2021-09-16 19:14:52 -04:00
Wenlei He	446e21623c	[llvm-profgen] Use context-sensitive byte size cost for preinliner decisions by default Turn on `use-context-cost-for-preinliner` to use context-sensitive byte size cost for preinliner decisions by default. This is a more accurate proxy of inline cost than profile size. We tested on our large workload that it delivers measurable CPU improvement. Differential Revision: https://reviews.llvm.org/D109893	2021-09-16 10:36:12 -07:00
Alok Kumar Sharma	a5b72abc9e	[DebugInfo] Enhance DIImportedEntity to accept children entities New field `elements` is added to '!DIImportedEntity', representing list of aliased entities. This is needed to dump optimized debugging information where all names in a module are imported, but a few names are imported with overriding aliases. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D109343	2021-09-16 10:41:55 +05:30
Esme-Yi	945df8bc4c	[obj2yaml][XCOFF] Dump sections Summary: This patch implements parsing sections for obj2yaml on AIX. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D98003	2021-09-15 05:16:33 +00:00
Hongtao Yu	0057c7185d	[CSSPGO][llvm-profgen] Truncate stack samples with invalid return address. Invalid frame addresses exist in call stack samples due to bad unwinding. This could happen to frame-pointer-based unwinding and the callee functions that do not have the frame pointer chain set up. It isn't common when the program is built with the frame pointer omission disabled, but can still happen with third-party static libs built with frame pointer omitted. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D109638	2021-09-14 21:56:22 -07:00
Hongtao Yu	8cbbd7e0b2	[llvm-profgen] Ignore broken LBR samples Perf script can sometimes give disordered LBR samples like below. ``` b022500 32de0044 3386e1d1 7f118e05720c 7f118df2d81f 0x2a0b9622/0x2a0b9610/P/-/-/1 0x2a0b79ff/0x2a0b9618/P/-/-/2 0x2a0b7a4a/0x2a0b79e8/P/-/-/1 0x2a0b7a33/0x2a0b7a46/P/-/-/1 0x2a0b7a42/0x2a0b7a23/P/-/-/1 0x2a0b7a21/0x2a0b7a37/P/-/-/2 0x2a0b79e6/0x2a0b7a07/P/-/-/1 0x2a0b79d4/0x2a0b79dc/P/-/-/2 0x2a0b7a03/0x2a0b79aa/P/-/-/1 0x2a0b79a8/0x2a0b7a00/P/-/-/234 0x2a0b9613/0x2a0b7930/P/-/-/1 0x2a0b9622/0x2a0b9610/P/-/-/1 0x2a0b79ff/0x2a0b9618/P/-/-/2 0x2a0b7a4a/0x2aWarning: Processed 10263226 events and lost 1 chunks! ``` Note the last LBR record, `0x2a0b7a4a/0x2aWarning:`. Currently llvm-profgen does not detect that and as a result an uninitialized branch target value will be used. The uninitialized value can cause creepy instruction ranges to be created, which in turn will result in a completely wrong profile. An example is like ``` .... @ _ZN5folly13loadUnalignedIsEET_PKv]:18446744073709551615:18446744073709551615 1: 18446744073709551615 !CFGChecksum: 4294967295 !Attributes: 0 ``` Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D109637	2021-09-14 12:11:17 -07:00
Sam Clegg	ef8c9135ef	[WebAssembly] Allow import and export of TLS symbols between DSOs We previously had a limitation that TLS variables could not be exported (and therefore could also not be imported). This change removed that limitation. Differential Revision: https://reviews.llvm.org/D108877	2021-09-14 06:47:37 -07:00
Martin Storsjö	63784b9a75	[llvm-readobj] [COFF] Resolve relocations pointing at section symbols for arm64 too This syncs parts from the x86 implementation to the ARMWinEH implementation. Currently, neither of the compilers targeting COFF/arm64 (MSVC, LLVM) produce such relocations, but LLVM might after a later patch. Differential Revision: https://reviews.llvm.org/D109650	2021-09-14 11:04:46 +03:00


13086 Commits