Commit Graph

56 Commits

Author SHA1 Message Date
Christopher Di Bella c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Greg Clayton ab546ead3b Fix a case where multiple symbols with zero size would cause duplicate entries in gsym files.
Symbol tables can have symbols with no size in mach-o files that were failing to get combined into a single entry. This resulted in many duplicate entries for the same address and made gsym files larger.

Differential Revision: https://reviews.llvm.org/D105068
2021-06-28 18:26:26 -07:00
Simon Giesecke 5f2d4b23b4 Add --quiet option to llvm-gsymutil to suppress output of warnings.
Differential Revision: https://reviews.llvm.org/D102829
2021-05-27 12:36:34 +00:00
Simon Giesecke 81b2fcf26f Use a non-recursive mutex in GsymCreator.
There doesn't seem to be a need to support recursive locking,
and a recursive mutex is unnecessarily inefficient.

Differential Revision: https://reviews.llvm.org/D102486
2021-05-19 10:06:47 +00:00
Simon Giesecke 4ea4d9c066 Move FunctionInfo in addFunctionInfo rather than copying.
Differential Revision: https://reviews.llvm.org/D102485
2021-05-19 10:06:47 +00:00
Simon Giesecke f29c4c6097 Avoid calculating the string hash twice in GsymCreator::insertString.
Do the single hash calculation before acquiring the lock, to reduce
lock contention. If Copy is true, and the string was not yet contained
in the StringStorage, use the new address from StringStorage, but
reuse the hash we already calculated.

Differential Revision: https://reviews.llvm.org/D102484
2021-05-19 10:06:47 +00:00
Simon Giesecke e102fd50f9 Reformat GSYMCreator.cpp
Differential Revision: https://reviews.llvm.org/D102483
2021-05-19 10:06:47 +00:00
Greg Clayton e5bdacba2e Optimize GSymCreator::finalize.
The algorithm removing duplicates from the Funcs list used to have
amortized quadratic time complexity because it was potentially
removing each entry using std::vector::erase individually. This
patch is now using a erase-remove idiom with an adapted
removeIfBinary algorithm.

Probably this was made under the assumption that these removals are
rare, but there are cases where the case of duplicate entries is
occurring frequently. In these cases, the actual runtime was very
poor, taking hours to process a single binary of around 1 GiB size
including debug info. Another factor contributing to that is the
frequent output of the warning, which is now removed.

It seems this is particularly an issue with GCC-compiled binaries,
rather than clang-built binaries.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D102219
2021-05-12 15:18:07 -07:00
Kazu Hirata 352fcfc697 [llvm] Use llvm::sort (NFC) 2021-01-17 10:39:45 -08:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Xing GUO ff6a0b6a8e [Object] Change ObjectFile::getSymbolValue() return type to Expected<uint64_t>
Summary:
In D77860, we have changed `getSymbolFlags()` return type to `Expected<uint32_t>`.
This change helps bubble the error further up the stack.

Reviewers: jhenderson, grimar, JDevlieghere, MaskRay

Reviewed By: jhenderson

Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79075
2020-05-02 14:04:44 +08:00
Xing GUO 8994b14e8b [DebugInfo] Fix crash caused by unhandled error.
Summary: This patch helps fix LLVM crash caused by unhandled error.

Reviewers: clayborg, aprantl

Reviewed By: clayborg

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78924
2020-04-28 21:39:25 +08:00
Greg Clayton ffe6695acf Fix buildbots with merge that didn't happen for 4050b01ba9. 2020-03-04 19:28:24 -08:00
Greg Clayton 4050b01ba9 Fix GSYM tests to run the yaml files and fix test failures on some machines.
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Fixed memory sanitizer bot bugs as well.

Differential Revision: https://reviews.llvm.org/D75390
2020-03-04 19:14:08 -08:00
Mitch Phillips 58079aa91b Revert "Fix GSYM tests to run the yaml files and fix test failures on some machines."
This reverts commit 8d41f1a023.

This change broke the MSan buildbots - see comments in
https://reviews.llvm.org/D75390 for more information.
2020-03-04 10:21:54 -08:00
Greg Clayton 8d41f1a023 Fix GSYM tests to run the yaml files and fix test failures on some machines.
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Also fixed an issue where strings for files in the file table could be added in opposite order due to parameters to function calls not having a strong ordering, which caused tests to fail. Added new arch specfic directories so when targets are not enabled, we continue to function just fine.

Differential Revision: https://reviews.llvm.org/D75390
2020-03-02 15:40:11 -08:00
Greg Clayton e3afe5952d Revert "Fix GSYM tests to run the yaml files and fix test failures on some machines."
This reverts commit 57688350ad.

Need to conditionalize for ARM targets, this is failing on machines that don't have ARM targets.
2020-03-02 13:07:58 -08:00
Greg Clayton 57688350ad Fix GSYM tests to run the yaml files and fix test failures on some machines.
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Also fixed an issue where strings for files in the file table could be added in opposite order due to parameters to function calls not having a strong ordering, which caused tests to fail.

Differential Revision: https://reviews.llvm.org/D75390
2020-03-02 12:52:53 -08:00
Michael Liao 61f538d37b Add missing dependency to fix shared library build. 2020-02-26 01:59:53 -05:00
Greg Clayton 2f6cc21f44 Add a llvm-gsymutil tool that can convert object files to GSYM and perform lookups.
Summary:
This patch creates the llvm-gsymutil binary that can convert object files to GSYM using the --convert <path> option. It can also dump and lookup addresses within GSYM files that have been saved to disk.

To dump a file:

llvm-gsymutil /path/to/a.gsym

To perform address lookups, like with atos, on GSYM files:

llvm-gsymutil --address 0x1000 --address 0x1100 /path/to/a.gsym

To convert a mach-o or ELF file, including any DWARF debug info contained within the object files:

llvm-gsymutil --convert /path/to/a.out --out-file /path/to/a.out.gsym

Conversion highlights:
- convert DWARF debug info in mach-o or ELF files to GSYM
- convert symbols in symbol table to GSYM and don't convert symbols that overlap with DWARF debug info
- extract UUID from object files
- extract .text (read + execute) section address ranges and filter out any DWARF or symbols that don't fall in those ranges.
- if .text sections are extracted, and if the last gsym::FunctionInfo object has no size, cap the size to the end of the section the function was contained in

Dumping GSYM files will dump all sections of the GSYM file in textual format.

Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere, jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74883
2020-02-25 21:11:05 -08:00
Greg Clayton 95e3956189 Add an Offset field to the SourceLocation for LookupResult objects.
Summary:
The Offset provides the offset within the function in a SourceLocation struct. This allows us to show the byte offset within a function. We also track offsets within inline functions as well. Updated the lookup tests to verify the offset for functions and inline functions.

0x1000: main + 32 @ /tmp/main.cpp:45

Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74680
2020-02-19 16:12:32 -08:00
Greg Clayton 5e13e0ce4c [NFC] Move ValidTextRanges out of DwarfTransformer and into GsymCreator and unify address is not in GSYM errors so all strings match. 2020-02-15 16:48:23 -08:00
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
Francesco Petrogalli fe36127982 [build] Fix shared lib builds. 2020-02-13 22:09:52 +00:00
Greg Clayton e8e97b28cd Fix buildbots that create shared libraries from GSYM library by adding a dependency on LLVMDebugInfoDWARF. 2020-02-13 11:43:07 -08:00
Greg Clayton 22d63b6318 Fix buildbots by not using "and" and "not". 2020-02-13 11:35:43 -08:00
Greg Clayton 19602b7194 Add a DWARF transformer class that converts DWARF to GSYM.
Summary:
The DWARF transformer is added as a class so it can be unit tested fully.

The DWARF is converted to GSYM format and handles many special cases for functions:
- omit functions in compile units with 4 byte addresses whose address is UINT32_MAX (dead stripped)
- omit functions in compile units with 8 byte addresses whose address is UINT64_MAX (dead stripped)
- omit any functions whose high PC is <= low PC (dead stripped)
- StringTable builder doesn't copy strings, so we need to make backing copies of strings but only when needed. Many strings come from sections in object files and won't need to have backing copies, but some do.
- When a function doesn't have a mangled name, store the fully qualified name by creating a string by traversing the parent decl context DIEs and then. If we don't do this, we end up having cases where some function might appear in the GSYM as "erase" instead of "std::vector<int>::erase".
- omit any functions whose address isn't in the optional TextRanges member variable of DwarfTransformer. This allows object file to register address ranges that are known valid code ranges and can help omit functions that should have been dead stripped, but just had their low PC values set to zero. In this case we have many functions that all appear at address zero and can omit these functions by making sure they fall into good address ranges on the object file. Many compilers do this when the DWARF has a DW_AT_low_pc with a DW_FORM_addr, and a DW_AT_high_pc with a DW_FORM_data4 as the offset from the low PC. In this case the linker can't write the same address to both the high and low PC since there is only a relocation for the DW_AT_low_pc, so many linkers tend to just zero it out.

Reviewers: aprantl, dblaikie, probinson

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74450
2020-02-13 10:48:37 -08:00
Bill Wendling c55cf4afa9 Revert "Remove redundant "std::move"s in return statements"
The build failed with

  error: call to deleted constructor of 'llvm::Error'

errors.

This reverts commit 1c2241a793.
2020-02-10 07:07:40 -08:00
Bill Wendling 1c2241a793 Remove redundant "std::move"s in return statements 2020-02-10 06:39:44 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Reid Kleckner 7f63db197e Avoid naming variable after type to fix GCC 5.3 build
GCC says:
.../llvm/lib/DebugInfo/GSYM/FunctionInfo.cpp:195:12:
error: ‘InfoType’ is not a class, namespace, or enumeration
       case InfoType::EndOfList:
                   ^

Presumably, GCC thinks InfoType is a variable here. Work around it by
using the name IT as is done above.
2019-12-06 11:25:28 -08:00
Douglas Yung da650094b1 Fix build of LookupResult.cpp from aeda128 with Visual C++. 2019-12-05 21:03:03 -08:00
Greg Clayton aeda128a96 Add lookup functions for efficient lookups of addresses when using GsymReader classes.
Summary:
Lookup functions are designed to not fully decode a FunctionInfo, LineTable or InlineInfo, they decode only what is needed into a LookupResult object. This allows lookups to avoid costly memory allocations and avoid parsing large amounts of information one a suitable match is found.

LookupResult objects contain the address that was looked up, the concrete function address range, the name of the concrete function, and a list of source locations. One for each inline function, and one for the concrete function. This allows one address to turn into multiple frames and improves the signal you get when symbolicating addresses in GSYM files.

Reviewers: labath, aprantl

Subscribers: mgorny, hiraditya, llvm-commits, lldb-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70993
2019-12-05 16:49:53 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Reid Kleckner 22d41ba024 Fix -Wsign-compare warning with clang-cl
off_t apparently is just "long" on Win64, which is 32-bits, and
therefore not long enough to compare with UINT32_MAX. Use auto to follow
the surrounding code. uint64_t would also be fine.
2019-10-30 15:20:43 -07:00
Nico Weber cae2662104 Fix Windows build after r374381
llvm-svn: 374413
2019-10-10 18:20:16 +00:00
Reid Kleckner f05ed6601f Remove strings.h include to fix GSYM Windows build
Fifth time's the charm.

llvm-svn: 374411
2019-10-10 18:17:24 +00:00
Greg Clayton 4ae13e2a7a Unbreak buildbots.
llvm-svn: 374410
2019-10-10 18:13:13 +00:00
Greg Clayton d665bfcf7c Fix buildbots by using memset instead of bzero.
llvm-svn: 374409
2019-10-10 18:11:49 +00:00
Michael Liao a121891a55 Fix build by adding the missing dependency.
llvm-svn: 374406
2019-10-10 18:04:52 +00:00
Greg Clayton 4c145df6a7 Unbreak llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast buildbot.
llvm-svn: 374398
2019-10-10 17:52:33 +00:00
Greg Clayton 6a2eff1e68 Unbreak windows buildbots.
llvm-svn: 374396
2019-10-10 17:49:33 +00:00
Greg Clayton 4b6c9de868 Add GsymCreator and GsymReader.
This patch adds the ability to create GSYM files with GsymCreator, and read them with GsymReader. Full testing has been added for both new classes.

This patch differs from the original patch https://reviews.llvm.org/D53379 in that is uses a StringTableBuilder class from llvm instead of a custom version. Support for big and little endian files has been added. If the endianness matches the current host, we use efficient extraction for the header, address table and address info offset tables.

Differential Revision: https://reviews.llvm.org/D68744

llvm-svn: 374381
2019-10-10 17:10:11 +00:00
Greg Clayton c6b156cbb8 GSYM: Add the llvm::gsym::Header header class with tests
This patch adds the llvm::gsym::Header class which appears at the start of a stand alone GSYM file, or in the first bytes of the GSYM data in a GSYM section within a file. Added encode and decode methods with full error handling and full tests.

Differential Revision: https://reviews.llvm.org/D67666

llvm-svn: 372149
2019-09-17 17:46:13 +00:00
Greg Clayton b52650d57f GSYM: add encoding and decoding to FunctionInfo
This patch adds encoding and decoding of the FunctionInfo objects along with full error handling and tests. Full details of the FunctionInfo encoding format appear in the FunctionInfo.h header file.

Differential Revision: https://reviews.llvm.org/D67506

llvm-svn: 372135
2019-09-17 16:15:49 +00:00
David Blaikie ffe5466c79 Add some missing changes to GSYM that was addressing a gcc compilation error due to a type and variable with the same name
llvm-svn: 371681
2019-09-11 22:24:45 +00:00
Greg Clayton 7fcc2c2b5a Add a LineTable class to GSYM and test it.
This patch adds the ability to create a gsym::LineTable object, populate it, encode and decode it and test all functionality.

The full format of the LineTable encoding is specified in the header file LineTable.h.

Differential Revision: https://reviews.llvm.org/D66602

llvm-svn: 371657
2019-09-11 20:51:03 +00:00
David Bolvansky 5916799293 [GSYM][NFC] Fixed -Wdocumentation warning
lib/DebugInfo/GSYM/InlineInfo.cpp:68:12: warning: parameter 'Inline' not found in the function declaration [-Wdocumentation]

llvm-svn: 371125
2019-09-05 21:09:58 +00:00
Greg Clayton 7d0a545ee6 Add encode and decode methods to InlineInfo and document encoding format to the GSYM file format.
This patch adds the ability to encode and decode InlineInfo objects and adds test coverage. Error handling is introduced in the encoding and decoding which will be used from here on out for remaining patches.

Differential Revision: https://reviews.llvm.org/D66600

llvm-svn: 370936
2019-09-04 17:32:51 +00:00
Greg Clayton bf9ee07afa Add FileWriter to GSYM and encode/decode functions to AddressRange and AddressRanges
The full GSYM patch started with: https://reviews.llvm.org/D53379

This patch add the ability to encode data using the new llvm::gsym::FileWriter class.

FileWriter is a simplified binary data writer class that doesn't require targets, target definitions, architectures, or require any other optional compile time libraries to be enabled via the build process. This class needs the ability to seek to different spots in the binary data that it produces to fix up offsets and sizes in GSYM data. It currently uses std::ostream over llvm::raw_ostream because llvm::raw_ostream doesn't support seeking which is required when encoding and decoding GSYM data.

AddressRange objects are encoded and decoded to be relative to a base address. This will be the FunctionInfo's start address if the AddressRange is directly contained in a FunctionInfo, or a base address of the containing parent AddressRange or AddressRanges. This allows address ranges to be efficiently encoded using ULEB128 encodings as we encode the offset and size of each range instead of full addresses. This also makes encoded addresses easy to relocate as we just need to relocate one base address.

Differential Revision: https://reviews.llvm.org/D63828

llvm-svn: 369587
2019-08-21 21:48:11 +00:00