llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	0edbc90ec5	[DebugInfo] Use llvm::append_range (NFC)	2021-01-04 11:42:45 -08:00
Amy Huang	7e13694ac7	[llvm-symbolizer][Windows] Add start line when searching in line table sections. Fixes issue where if a line section doesn't start with a line number then the addresses at the beginning of the section don't have line numbers. For example, for a line section like this ``` 0001:00000010-00000014, line/column/addr entries = 1 7 00000013 ! ``` a line number wouldn't be found for addresses from 10 to 12. This matches behavior when using the DIA SDK. Differential Revision: https://reviews.llvm.org/D93306	2020-12-17 07:57:36 -08:00
Nico Weber	cf16437e05	fix typos to cycle bots	2020-12-12 20:19:33 -05:00
Amy Huang	00bbef2bb2	[llvm-symbolizer] Fix native symbolization on windows for inline sites. The existing code handles this correctly and I checked that the code in NativeInlineSiteSymbol also handles this correctly, but it was wrong in the NativeFunctionSymbol code. Differential Revision: https://reviews.llvm.org/D92134	2020-11-30 14:27:35 -08:00
Amy Huang	b4902bcd98	[NFC] remove print statement I accidentally added.	2020-11-23 10:51:09 -08:00
Amy Huang	bc98034040	[llvm-symbolizer] Add inline stack traces for Windows. This adds inline stack frames for symbolizing on Windows. Differential Revision: https://reviews.llvm.org/D88988	2020-11-17 13:19:13 -08:00
Reid Kleckner	5519e4da83	Re-land "[PDB] Merge types in parallel when using ghashing" Stored Error objects have to be checked, even if they are success values. This reverts commit `8d250ac3cd`. Relands commit 49b3459930655d879b2dc190ff8fe11c38a8be5f. Original commit message: ----------------------------------------- This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side:
                             BEFORE  | AFTER
Input File Reading:          956 ms  |  968 ms
Code Layout:                 258 ms  |  190 ms
Commit Output File:            6 ms  |    7 ms
PDB Emission (Cumulative):  6691 ms  | 4253 ms
Add Objects:                4341 ms  | 2927 ms
Type Merging:               2814 ms  | 1269 ms  -55%!
Symbol Merging:             1509 ms  | 1645 ms
Publics Stream Layout:       111 ms  |  112 ms
TPI Stream Layout:           764 ms  |   26 ms  trivial
Commit to Disk:             1322 ms  | 1036 ms  -300ms
----------------------------------------- --------
Total Link Time:            8416 ms    5882 ms  -30% overall
The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement.
In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is. Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. 
Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. 
Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 15:44:38 -07:00
Reid Kleckner	8d250ac3cd	Revert "[PDB] Merge types in parallel when using ghashing" This reverts commit `49b3459930`.	2020-09-30 14:55:32 -07:00
Reid Kleckner	49b3459930	[PDB] Merge types in parallel when using ghashing This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side:
                             BEFORE  | AFTER
Input File Reading:          956 ms  |  968 ms
Code Layout:                 258 ms  |  190 ms
Commit Output File:            6 ms  |    7 ms
PDB Emission (Cumulative):  6691 ms  | 4253 ms
Add Objects:                4341 ms  | 2927 ms
Type Merging:               2814 ms  | 1269 ms  -55%!
Symbol Merging:             1509 ms  | 1645 ms
Publics Stream Layout:       111 ms  |  112 ms
TPI Stream Layout:           764 ms  |   26 ms  trivial
Commit to Disk:             1322 ms  | 1036 ms  -300ms
----------------------------------------- --------
Total Link Time:            8416 ms    5882 ms  -30% overall
The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement. In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is.
Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. 
This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 14:22:48 -07:00
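The cell layout and priority rule described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the LLD implementation: `GHashTableSketch`, `packCell`, and `kEmptyCell` are hypothetical names, ghashes are reduced to plain 64-bit integers, the all-ones value is assumed to mark an empty cell, and the table is assumed never to fill up (there is no rehashing). The isItem bit is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption of this sketch: the all-ones value marks an empty cell.
constexpr uint64_t kEmptyCell = UINT64_MAX;

// Pack <sourceIndex, typeIndex> into 64 bits. Comparing packed values orders
// cells by source first, then by type index; a lower value means a higher
// priority, so earlier inputs win.
constexpr uint64_t packCell(uint32_t sourceIndex, uint32_t typeIndex) {
  return (uint64_t(sourceIndex) << 32) | typeIndex;
}

class GHashTableSketch {
public:
  // sideTables[source][type] holds the hash of each input type record.
  GHashTableSketch(size_t capacity,
                   const std::vector<std::vector<uint64_t>> &sideTables)
      : cells(capacity), ghashes(sideTables) {
    assert(capacity > 0);
    for (auto &cell : cells)
      cell.store(kEmptyCell);
  }

  // Inserts with linear probing and returns the (stable) insertion position.
  // If an equal hash is already present, the higher-priority cell is kept.
  size_t insert(uint32_t sourceIndex, uint32_t typeIndex) {
    const uint64_t hash = ghashes[sourceIndex][typeIndex];
    const uint64_t fresh = packCell(sourceIndex, typeIndex);
    assert(fresh != kEmptyCell);
    size_t pos = hash % cells.size();
    for (;;) {
      uint64_t old = cells[pos].load();
      // Claim an empty cell, or displace a lower-priority duplicate.
      while (old == kEmptyCell || (hashOf(old) == hash && fresh < old)) {
        if (cells[pos].compare_exchange_weak(old, fresh))
          return pos;
        // On failure `old` holds the current value; re-check the condition.
      }
      if (hashOf(old) == hash)
        return pos; // Duplicate with equal or higher priority already stored.
      pos = (pos + 1) % cells.size(); // Different type: probe the next cell.
    }
  }

  uint64_t cellAt(size_t pos) const { return cells[pos].load(); }

private:
  // The key is not stored in the cell; it is looked up in the side tables.
  uint64_t hashOf(uint64_t cell) const {
    return ghashes[cell >> 32][cell & 0xffffffffu];
  }

  std::vector<std::atomic<uint64_t>> cells;
  const std::vector<std::vector<uint64_t>> &ghashes;
};
```

Because the winner of a duplicate is chosen by comparing the packed values rather than by arrival order, the final cell contents do not depend on which thread inserted first, which is the determinism property the commit message asks for.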
Amy Huang	0881d0bed3	[PDB][NativeSession] Clean up some things in NativeSession. -Use the actual sect/offset to keep track of symbols in the cache so they don't get created multiple times with different addresses. -Remove getSymTag from PDBFunctionSymbol/PDBPublicSymbol because it's already implemented in the base class -Merge the symbolizer test files for DIA and native, since the tests are the same. -Implement getCompilandId for NativeLineNumber Reviewed By: amccarth Differential Revision: https://reviews.llvm.org/D84208	2020-07-21 16:54:52 -07:00
Alexandre Ganea	23cd70d71c	[PDB] Fix out-of-bounds access when sorting GSI buckets When building in Debug on Windows-MSVC after `b7402edce3`, a lot of tests were failing because we were dereferencing an element past the end of HashRecords. This happened towards the end of the table, in unused slots.	2020-07-10 10:55:27 -04:00
Amy Huang	9ee90a4905	[NativeSession] Add column numbers to NativeLineNumber. Summary: This adds column numbers if they are present, and otherwise sets the column number to be zero. Bug: https://bugs.llvm.org/show_bug.cgi?id=41795 Reviewers: amccarth Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81950	2020-07-07 09:59:22 -07:00
Reid Kleckner	b7402edce3	[PDB] Defer public serialization until PDB writing This reduces peak memory on my test case from 1960.14MB to 1700.63MB (-260MB, -13.2%) with no measurable impact on CPU time. I'm currently working with a publics stream that is about 277MB. Before this change, we would allocate 277MB of heap memory, serialize publics into them, hold onto that heap memory, open the PDB, and commit into it. After this change, we defer the serialization until commit time. In the last change I made to public writing, I re-sorted the list of publics multiple times in place to avoid allocating new temporary data structures. Deferring serialization until later requires that we don't reorder the publics. Instead of sorting the publics, I partially construct the hash table data structures, store a publics index in them, and then sort the hash table data structures. Later, I replace the index with the symbol record offset. This change also addresses a FIXME and moves the list of global and public records from GSIHashStreamBuilder to GSIStreamBuilder. Now that publics aren't being serialized, it makes even less sense to store them as a list of CVSymbol records. The hash table used to deduplicate globals is moved as well, since that is specific to globals, and not publics. Reviewed By: aganea, hans Differential Revision: https://reviews.llvm.org/D81296	2020-06-30 11:28:04 -07:00
Amy Huang	f8170d8715	[NativeSession] Implement findLineNumbersByAddress in NativeSession, which takes an address and a length and returns all lines within that address range.	2020-06-15 17:05:39 -07:00
Reid Kleckner	1c03389c29	Re-land "Migrate the rest of COFFObjectFile to Error" This reverts commit `101fbc0138`. Remove leftover debugging attribute. Update LLDB as well, which was missed before.	2020-06-11 14:46:16 -07:00
Nico Weber	101fbc0138	Revert "Migrate the rest of COFFObjectFile to Error" This reverts commit `b5289656b8`. __attribute__((optnone)) doesn't build with msvc, see http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/16326	2020-06-05 21:20:11 -04:00
Reid Kleckner	b5289656b8	Migrate the rest of COFFObjectFile to Error	2020-06-05 16:29:05 -07:00
Simon Pilgrim	f6417f5db8	FileOutputBuffer.h - remove unused includes. NFC. Move dependent includes down to source files where necessary.	2020-05-28 14:38:12 +01:00
Reid Kleckner	4092742740	[PDB] Switch from LLVM_PACKED to LLVM_PACKED_START/END Reportedly using the pragma instead of the __attribute__ silences warnings with some GCC versions.	2020-05-13 14:24:11 -07:00
Amy Huang	641ae73f2e	[NativeSession] Implement NativeSession::findSymbolByAddress. Summary: This implements searching for function symbols and public symbols by address. More specifically, -Implements NativeSession::findSymbolByAddress for function symbols and public symbols. I think data symbols are also searched for, but that isn't implemented in this patch. -Adds classes for NativeFunctionSymbol and NativePublicSymbol -Adds a '-use-native-pdb-reader' option to llvm-symbolizer, for testing purposes. Reviewers: rnk, amccarth, labath Subscribers: mgorny, hiraditya, MaskRay, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79269	2020-05-13 09:39:25 -07:00
Reid Kleckner	3b3e28a07c	[PDB] Optimize public symbol processing Reduces time to link PGO instrumented net_unittests.exe by 11% (9.766s -> 8.672s, best of three). Reduces peak memory by 65.7MB (2142.71MB -> 2076.95MB). Use a more compact struct, BulkPublic, for faster sorting. Sort in parallel. Construct the hash buckets in parallel. Try to use one vector to hold all the publics instead of copying them from one to another. Allocate all the memory needed to serialize publics up front, and then serialize them in place in parallel. Reviewed By: aganea, hans Differential Revision: https://reviews.llvm.org/D79467	2020-05-08 10:23:27 -07:00
Reid Kleckner	b7438c25ea	[PDB] Move stream index tracking to GSIStreamBuilder The GSIHashStreamBuilder doesn't need to know the stream index. Standardize the naming (Idx -> Index in public APIs).	2020-05-04 20:51:09 -07:00
Reid Kleckner	5070cecd72	[PDB] Bypass generic deserialization code for publics sorting The number of public symbols is very large, and each deserialization does a few heap allocations. The public symbols are serialized by the linker, so we can assume they have the expected layout and use it directly. Saves O(#publics) temporary heap allocations and shrinks some data structures.	2020-05-02 18:14:50 -07:00
Craig Topper	7867f4c15f	[PDB] Remove a couple asserts that are no longer valid now that C13Builders does not use unique_ptr. These asserts used to check that unique_ptr was not null. This fixes failures from `7af4bb1641`	2020-05-02 17:31:10 -07:00
Reid Kleckner	7af4bb1641	[PDB] Remove unique_ptr wrapper around C13 line table subsections This accounts for a large portion of the memory allocations in LLD. This DebugSubsectionRecordBuilder object can be stored directly in C13Builders, it mostly wraps other subsections. Remove the container kind field from the object. It is always the same for all elements in the vector, and we can pass it in during writing.	2020-05-02 16:35:07 -07:00
Amy Huang	2360933147	Reland "Implement some functions in NativeSession." with fixes so that the tests pass on Linux. Summary: This change implements readFromExe, and calculating VA and RVA, which are some of the functionalities that will be used for native PDB reading for llvm symbolizer. bug: https://bugs.llvm.org/show_bug.cgi?id=41795	2020-04-21 16:35:27 -07:00
Amy Huang	507d80fbd2	Revert "Implement some NativeSession functions" along with some followup fixes. This reverts commits `a6d8a055e9` `4927ae0858` `1e1f5eb7c9`	2020-04-21 14:20:13 -07:00
Amy Huang	1e1f5eb7c9	[NativeSession] Fix unchecked Expected type (followup to https://reviews.llvm.org/D78128)	2020-04-21 12:36:55 -07:00
Michael Liao	a13dce1d90	Fix build. NFC.	2020-04-21 14:59:45 -04:00
Fangrui Song	4927ae0858	[PDB] Change llvm/object/COFF.h to llvm/Object/COFF.h after D78128	2020-04-21 11:54:05 -07:00
Amy Huang	a6d8a055e9	Implement some functions in NativeSession. Summary: This change implements readFromExe, and calculating VA and RVA, which are some of the functionalities that will be used for native PDB reading for llvm symbolizer. bug: https://bugs.llvm.org/show_bug.cgi?id=41795 Reviewers: hans, amccarth, rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78128	2020-04-21 11:48:40 -07:00
Alexandre Ganea	a7325298e1	[CodeView] Align type records on 4-bytes when emitting PDBs When emitting PDBs, the TypeStreamMerger class is used to merge .debug$T records from the input .OBJ files into the output .PDB stream. Records in .OBJs are not required to be aligned on 4-bytes, and "The Netwide Assembler 2.14" generates non-aligned records. When compiling with -DLLVM_ENABLE_ASSERTIONS=ON, an assert was triggered in MergingTypeTableBuilder when non-ghash merging was used. With ghash merging there was no assert. As a result, LLD could potentially generate a non-aligned TPI stream. We now align records on 4-bytes when record indices are remapped, in TypeStreamMerger::remapIndices(). Differential Revision: https://reviews.llvm.org/D75081	2020-03-13 12:22:19 -04:00
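As a rough illustration of the 4-byte alignment described in the commit above, the sketch below rounds a record up to a multiple of 4 and rewrites its length prefix. It is not the code from TypeStreamMerger::remapIndices(): `alignTo4` and `padRecordTo4` are hypothetical helpers, the record is assumed to start with a 2-byte little-endian length that does not count the length field itself, and the zero filler byte is a placeholder rather than the padding value a real writer would emit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Round n up to the next multiple of 4.
constexpr size_t alignTo4(size_t n) { return (n + 3) & ~size_t(3); }

// Pad a record to a 4-byte boundary and update its length prefix.
// Assumed layout: bytes 0-1 hold the little-endian record length (excluding
// those two bytes), bytes 2-3 hold the record kind.
std::vector<uint8_t> padRecordTo4(std::vector<uint8_t> record) {
  assert(record.size() >= 4 && "record must contain length and kind");
  record.resize(alignTo4(record.size()), 0x00); // Placeholder filler byte.
  const size_t len = record.size() - 2;
  assert(len <= UINT16_MAX);
  record[0] = uint8_t(len & 0xff);
  record[1] = uint8_t(len >> 8);
  return record;
}
```

A record that is already aligned passes through with its size unchanged; only the length prefix is rewritten to the same value.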
Sven van Haastregt	665dcdacc0	Add missing newlines at EOF; NFC	2020-02-12 15:57:25 +00:00
Bill Wendling	c55cf4afa9	Revert "Remove redundant "std::move"s in return statements" The build failed with error: call to deleted constructor of 'llvm::Error' errors. This reverts commit `1c2241a793`.	2020-02-10 07:07:40 -08:00
Bill Wendling	1c2241a793	Remove redundant "std::move"s in return statements	2020-02-10 06:39:44 -08:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Reid Kleckner	e5caa156b4	[PDB] Simplify API for making section map, NFC Prevents API misuse described in PR44495	2020-01-23 12:15:21 -08:00
Dávid Bolvanský	bc2b380c0d	[pdbutil] Fixed -Wdeprecated-copy in DbiModuleDescriptor	2019-11-23 23:33:22 +01:00
Fangrui Song	644de3b96e	[PDB] Make pdb::DbiModuleDescriptor destructor trivial	2019-11-11 21:26:26 -08:00
George Rimar	78d632d105	[LLVMDebugInfoPDB] - Use cantFail() instead of assert(). Currently injected-sources-native.test fails with "Expected<T> value was in success state. (Note: Expected<T> values in success mode must still be checked prior to being destroyed)" when llvm is compiled with LLVM_ENABLE_ABI_BREAKING_CHECKS in Release. The problem is that getStringForID returns Expected<StringRef> and Expected value must always be checked, even if it is in success state. Checking with assert only helps in Debug and is wrong. Differential revision: https://reviews.llvm.org/D69251 llvm-svn: 375492	2019-10-22 08:52:45 +00:00
Hans Wennborg	1e1e3ba252	Unify the two CRC implementations David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present. These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef<char> argument, and CRC-32 took a StringRef. This patch changes it to ArrayRef<uint8_t> which I think is the best choice, and simplifies a few of the callers nicely. Differential revision: https://reviews.llvm.org/D68570 llvm-svn: 374148	2019-10-09 09:06:30 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Nico Weber	7bb5fc0583	llvm-pdbdump: Fix several smaller issues with injected source compression handling - getCompression() used to return a PDB_SourceCompression even though the docs for IDiaInjectedSource are explicit about the return value being compiler-dependent. Return an uint32_t instead, and make the printing code handle unknown values better by printing "Unknown" and the int value instead of not printing any compression. - Print compressed contents as hex dump, not as string. - Add compression type "DotNet", which is used (at least) by csc.exe, the C# compiler. Also add a lengthy comment describing the stream contents (derived from looking at the raw hex contents long enough to see the GUIDs, which led me to the roslyn and mono implementations for handling this). - The native injected source dumper was dumping the contents of the whole data stream -- but csc.exe writes a stream that's padded with zero bytes to the next 512 boundary, and the dia api doesn't display those padding bytes. So make NativeInjectedSource::getCode() do the same thing. Differential Revision: https://reviews.llvm.org/D64879 llvm-svn: 366386	2019-07-17 22:59:52 +00:00
Nico Weber	d100b5dd01	Teach `llvm-pdbutil pretty -native` about `-injected-sources` `pretty -native -injected-sources -injected-source-content` works with this patch, and produces identical output to the dia version. Differential Revision: https://reviews.llvm.org/D64428 llvm-svn: 366236	2019-07-16 18:04:26 +00:00
Nico Weber	ac6375d99d	Expand comment about how StringsToBuckets was computed, and add more entries The construction was explained in https://reviews.llvm.org/D44810?id=139526#inline-391999 but reading the code shouldn't require hunting down old reviews to understand it. The precomputed list was missing an entry for the empty list case, and one entry at the very end. (The current last entry is the last one where 3 * BucketCount fits in a signed int, but the reference implementation uses unsigneds as far as I can tell, so there's room for one more entry.) No behavior change for inputs seen in practice. Differential Revision: https://reviews.llvm.org/D64738 llvm-svn: 366107	2019-07-15 18:56:56 +00:00
Nico Weber	51a52b5893	PDB HashTable: Move TraitsT from class parameter to the methods that need it The traits object is only used by a few methods. Deserializing a hash table and walking it is possible without the traits object, so it shouldn't be required to build a dummy object for that use case. The TraitsT object used to be a function template parameter before r327647, this restores it to that state. This makes it clear that the traits object isn't needed at all in 1 of the current 3 uses of HashTable (and I am going to add another use that doesn't need it), and that the default PdbHashTraits isn't used outside of tests. While here, also re-enable 3 checks in the test that were commented out (which requires making HashTableInternals templated and giving FooBar an operator==). No intended behavior change. Differential Revision: https://reviews.llvm.org/D64640 llvm-svn: 365974	2019-07-12 23:30:55 +00:00
Nico Weber	13f7ddff17	Slightly simplify MappedBlockStream::createIndexedStream() calls All callers had a PDBFile object at hand, so call Pdb.createIndexedStream() instead, which pre-populates all the arguments (and returns nullptr for kInvalidStreamIndex). Also change safelyCreateIndexedStream() to only take the string index, and update callers. Make the method public and call it in two places that manually did the bounds checking before. No intended behavior change. Differential Revision: https://reviews.llvm.org/D64633 llvm-svn: 365936	2019-07-12 18:24:38 +00:00
Amy Huang	9970817c57	Deduplicate S_CONSTANTs in LLD. Summary: Deduplicate S_CONSTANTS when linking, if they have the same value. Reviewers: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63151 llvm-svn: 363089	2019-06-11 18:02:39 +00:00
Nico Weber	e577be4ed1	[PDB] Fix hash function used to write /src/headerblock lld-link used to write PDB files that DIA couldn't recover natvis files from if: - The global strings table was > 64kiB - There were at least 3 natvis files The cause was that the hash function for the /src/headerblock stream was incorrect: It needs to be truncated to 16 bit. If the global strings table was <= 64kiB, truncating to 16 bit is a no-op, so this wasn't needed for small programs. If there are only 1 or 2 natvis files, then the growth strategy in HashTable::grow() would mean the hash table would have 2 buckets (for 1 natvis file) or 4 buckets (for 2 natvis files), and since the hash function is used modulo number of buckets, and since 2 and 4 divide 0x10000, the missing `% 0x10000` is a no-op there too. For 3 natvis files, the hash table grows to 6 buckets, which has a factor that's not common with 0x10000 and the difference starts to matter. Fixes PR41626. Differential Revision: https://reviews.llvm.org/D61277 llvm-svn: 359515	2019-04-29 23:09:35 +00:00
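The arithmetic in the commit above can be checked with a small sketch. The helper names are hypothetical and the hash is just an arbitrary 32-bit value; the point is only that dropping `% 0x10000` is harmless when the bucket count divides 0x10000 (2 or 4 buckets) and changes the bucket otherwise (6 buckets).

```cpp
#include <cassert>
#include <cstdint>

// Bucket chosen when the hash is used as-is (the buggy behavior).
uint32_t bucketUntruncated(uint32_t hash, uint32_t bucketCount) {
  assert(bucketCount > 0);
  return hash % bucketCount;
}

// Bucket chosen when the hash is first truncated to 16 bits (the fix).
uint32_t bucketTruncated(uint32_t hash, uint32_t bucketCount) {
  assert(bucketCount > 0);
  return (hash % 0x10000) % bucketCount;
}

// True when both variants place the hash in the same bucket.
bool bucketsAgree(uint32_t hash, uint32_t bucketCount) {
  return bucketUntruncated(hash, bucketCount) ==
         bucketTruncated(hash, bucketCount);
}
```

For example, the value 0x10001 lands in the same bucket under both variants with 2 or 4 buckets, but with 6 buckets the untruncated hash selects bucket 5 while the truncated one selects bucket 1.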
Fangrui Song	efd94c56ba	Use llvm::stable_sort While touching the code, simplify if feasible. llvm-svn: 358996	2019-04-23 14:51:27 +00:00
Reid Kleckner	e10d00419a	[codeview] Remove Type member from CVRecord Summary: Now CVType and CVSymbol are effectively type-safe wrappers around ArrayRef<uint8_t>. Make the kind() accessor load it from the RecordPrefix, which is the same for types and symbols. Reviewers: zturner, aganea Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60018 llvm-svn: 357658	2019-04-04 00:28:48 +00:00
Reid Kleckner	85e2cdac73	Delay initialization of three static global maps, NFC This avoids allocating a few KB of heap memory on startup, and instead allocates these maps lazily. I noticed this while profiling LLD. llvm-svn: 357192	2019-03-28 17:33:41 +00:00
Alexandre Ganea	4aeea4cc42	[DebugInfo][PDB] Don't write empty debug streams Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count). With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention. Also fix the * Linker * contrib section which wasn't correctly emitted previously. Differential Revision: https://reviews.llvm.org/D59502 llvm-svn: 356395	2019-03-18 19:13:23 +00:00
Benjamin Kramer	711950c116	Move some classes into anonymous namespaces. NFC. llvm-svn: 353710	2019-02-11 15:16:21 +00:00
Alexandre Ganea	120366edc7	[CodeView] Fix cycles in debug info when merging Types with global hashes When type streams with forward references were merged using GHashes, cycles were introduced in the debug info. This was caused by GlobalTypeTableBuilder::insertRecordAs() not inserting the record on the second pass, thus yielding an empty ArrayRef at that record slot. Later on, upon PDB emission, TpiStreamBuilder::commit() would skip that empty record, thus offsetting all indices that came after in the stream. This solution comes in two steps: 1. Fix the hash calculation, by doing a multiple-step resolution, iff there are forward references in the input stream. 2. Fix merge by resolving with multiple passes, therefore moving records with forward references at the end of the stream. This patch also adds support for llvm-readobj --codeview-ghash. Finally, fix dumpCodeViewMergedTypes() which previously could reference deleted memory. Fixes PR40221 Differential Revision: https://reviews.llvm.org/D57790 llvm-svn: 353412	2019-02-07 15:24:18 +00:00
Aleksandr Urakov	d17f6ab61b	[NativePDB] Fix access to both old & new fpo data entries from dbi stream Summary: This patch fixes access to fpo streams in native pdb from DbiStream and makes code consistent with DbiStreamBuilder. Patch By: leonid.mashinskiy Reviewers: zturner, aleksandr.urakov Reviewed By: zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56725 llvm-svn: 352615	2019-01-30 10:40:45 +00:00
Zachary Turner	8371da385a	[PDB] Increase TPI hash bucket count. PDBs contain several serialized hash tables. In the microsoft-pdb repo published to support LLVM implementing PDB support, the provided code initializes the bucket count for the TPI and IPI streams to the maximum size. This occurs in tpi.cpp L33 and tpi.cpp L398. In the LLVM code for generating PDBs, these streams are created with the minimum number of buckets. This difference makes LLVM-generated PDBs slower when used for debugging. Patch by C.J. Hebert Differential Revision: https://reviews.llvm.org/D56942 llvm-svn: 352117	2019-01-24 22:25:55 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Zachary Turner	bb3d7e565f	[PDB] Add some helper functions for working with scopes. llvm-svn: 349361	2018-12-17 16:15:36 +00:00
Zachary Turner	579264bd59	Support skewed stream arrays. VarStreamArray was built on the assumption that it is backed by a StreamRef, and offset 0 of that StreamRef is the first byte of the first record in the array. This is a logical and intuitive assumption, but unfortunately we have use cases where it doesn't hold. Specifically, a PDB module's symbol stream is prefixed by 4 bytes containing a magic value, and the first byte of record data in the array is actually at offset 4 of this byte sequence. Previously, we would just truncate the first 4 bytes and then construct the VarStreamArray with the resulting StreamRef, so that offset 0 of the underlying stream did correspond to the first byte of the first record, but this is problematic, because symbol records reference other symbol records by the absolute offset including that initial magic 4 bytes. So if another record wants to refer to the first record in the array, it would say "the record at offset 4". This led to extremely confusing hacks and semantics in loading code, and after spending 30 minutes trying to get some math right and failing, I decided to fix this in the underlying implementation of VarStreamArray. Now, we can say that a stream is skewed by a particular amount. This way, when we access a record by absolute offset, we can use the same values that the records themselves contain, instead of having to do fixups. Differential Revision: https://reviews.llvm.org/D55344 llvm-svn: 348499	2018-12-06 16:55:00 +00:00
Zachary Turner	7c6b19f49b	[PDB] Emit S_UDT records in LLD. Previously these were dropped. We now understand them sufficiently well to start emitting them. From the debugger's perspective, this now enables us to have debug info about typedefs (both global and function-locally scoped) Differential Revision: https://reviews.llvm.org/D55228 llvm-svn: 348306	2018-12-04 21:48:46 +00:00
Zachary Turner	1e0cce796c	Fix issue with Tpi Stream hash map. Part of the patch to not build the hash map eagerly was omitted due to a merge conflict. Add it back, which should fix the failing tests. llvm-svn: 348166	2018-12-03 19:05:12 +00:00
Zachary Turner	f861e291d6	Don't build the Tpi Hash map by default. This is very slow and should be done for specific cases where lookups will need to happen. llvm-svn: 348160	2018-12-03 18:32:05 +00:00
Reid Kleckner	ffba54493f	Add missing error checking code intended for r347687 llvm-svn: 347690	2018-11-27 19:14:11 +00:00
Reid Kleckner	291d015de4	[PDB] Add symbol records in bulk Summary: This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from 12.9s to 9.8s. Symbol records are typically small (16.7 bytes on average), but we processed them one at a time. CVSymbol is a relatively "large" type. It wraps an ArrayRef<uint8_t> with a kind and an optional 32-bit hash, which we don't need. Before this change, each DbiModuleDescriptorBuilder would maintain an array of CVSymbols, and would write them individually with a BinaryItemStream. With this change, we now add symbols that happen to appear contiguously in bulk. For each .debug$S section (roughly one per function), we allocate two copies, one for relocation, and one for realignment purposes. For runs of symbols that go in the module stream, which is most symbols, we now add them as a single ArrayRef<uint8_t>, so the vector DbiModuleDescriptorBuilder is roughly linear in the number of .debug$S sections (O(# funcs)) instead of the number of symbol records (very large). Some stats on symbol sizes for the curious: PDB size: 507M sym bytes: 316,508,016 sym count: 18,954,971 sym byte avg: 16.7 As future work, we may be able to skip copying symbol records in the linker for realignment purposes if we make LLVM write them aligned into the object file. We need to double check that such symbol records are still compatible with link.exe, but if so, it's definitely worth doing, since my profile shows we spend 500ms in memcpy in the symbol merging code. We could potentially cut that in half by saving a copy. Alternatively, we could apply the relocations after we iterate the symbols. This would require some careful re-engineering of the relocation processing code, though. Reviewers: zturner, aganea, ruiu Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54554 llvm-svn: 347687	2018-11-27 19:00:23 +00:00
Zachary Turner	03a24052f3	[NativePDB] Improved support for nested type reconstruction. In a previous patch, we pre-processed the TPI stream in order to build the reverse mapping from nested type -> parent type so that we could accurately reconstruct a DeclContext hierarchy. However, there were some issues. An LF_NESTTYPE record is really just a typedef, so although it happens to be used to indicate the name of the nested type and referring to the global record which defines the type, it is also used for every other kind of nested typedef. When we rebuild the DeclContext hierarchy, we want it to be as accurate as possible, which means that if we have something like: struct A { struct B {}; using C = B; }; We don't want to create two CXXRecordDecls in the AST each with the exact same definition. We just want to create one for B and then define C as an alias to B. Previously, however, it would not be able to distinguish between the two cases and it would treat A::B and A::C as being two classes each with separate definitions. We address the first half of improving the pre-processing logic so that only actual definitions are treated this way. Later, in a followup patch, we can handle the case of nested typedefs since we're already going to be enumerating the field list anyway and this patch introduces the general framework for distinguishing between the two cases. Differential Revision: https://reviews.llvm.org/D54357 llvm-svn: 346786	2018-11-13 20:07:32 +00:00
Aleksandr Urakov	c43e086c74	Revert "Revert "[PDB] Extend IPDBSession's interface to retrieve frame data"" This reverts commit 466ce67d6ec444962e5cc0136243c16a453190c0. llvm-svn: 345010	2018-10-23 08:14:53 +00:00
Zachary Turner	b96181c2bf	Some cleanups to the native pdb plugin [NFC]. This is mostly some cleanup done in the process of implementing some basic support for types. I tried to split up the patch a bit to get some of the NFC portion of the patch out into a separate commit, and this is the result of that. It moves some code around, deletes some spurious namespace qualifications, removes some unnecessary header includes, forward declarations, etc. llvm-svn: 344913	2018-10-22 16:19:07 +00:00
Aleksandr Urakov	738df2de7f	Revert "[PDB] Extend IPDBSession's interface to retrieve frame data" This reverts commit b5c7e2f9a4dbb34e3667c4bb4972735eadd3247a. llvm-svn: 344909	2018-10-22 15:30:48 +00:00
Aleksandr Urakov	d4a82f6f74	[PDB] Extend IPDBSession's interface to retrieve frame data Summary: This patch just extends the `IPDBSession` interface to allow retrieving of frame data through it, and adds an implementation over DIA. It is needed for an implementation (for now with DIA) of the conversion from FPO programs to DWARF expressions mentioned in D53086. Reviewers: zturner, asmith, rnk Reviewed By: asmith Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D53324 llvm-svn: 344886	2018-10-22 07:18:08 +00:00
Zachary Turner	5989281cf3	[PDB] Fix another bug in globals stream name lookup. When we're on the last bucket the computation is tricky. We were failing when the last bucket contained multiple matches. Added a new test for this. llvm-svn: 344081	2018-10-09 21:19:03 +00:00
Zachary Turner	b7dd12b7a8	[PDB] Fix failure on big endian machines. We changed an ArrayRef<uint8_t> to an ArrayRef<uint32_t>, but it needs to be an ArrayRef<support::ulittle32_t>. We also change ArrayRef<> to FixedStreamArray<>. Technically an ArrayRef<> will work, but it can cause a copy in the underlying implementation if the memory is not contiguous, and there's no reason not to use a FixedStreamArray<>. Thanks to nemanjai@ and thakis@ for helping me track this down and confirm the fix. llvm-svn: 344063	2018-10-09 17:58:51 +00:00
Zachary Turner	0f556f88c5	Remove unused variable. llvm-svn: 344002	2018-10-08 22:56:57 +00:00
Zachary Turner	c8207fa59b	[PDB] fix a bug in global stream name lookup. When we're looking up a record in the last hash bucket chain, we need to be careful with the end-offset calculation. llvm-svn: 344001	2018-10-08 22:38:27 +00:00
Kristina Brooks	bcc86a95c1	[DebugInfo][PDB] Fix a signed/unsigned conversion warning Fix the following warning when compiling with clang (caused by commit rL343951): GlobalsStream.cpp:61:33: warning: comparison of integers of different signs: 'int' and 'uint32_t' This also avoids double evaluation of `GlobalsTable.HashBuckets.size()`. llvm-svn: 343957	2018-10-08 09:03:17 +00:00
Zachary Turner	ba73a91491	Fix a -Wsign-compare warning. llvm-svn: 343953	2018-10-08 04:44:12 +00:00
Zachary Turner	94926a6db8	[PDB] Add the ability to lookup global symbols by name. The Globals table is a hash table keyed on symbol name, so it's possible to lookup symbols by name in O(1) time. Add a function to the globals stream to do this, and add an option to llvm-pdbutil to exercise this, then use it to write some tests to verify correctness. llvm-svn: 343951	2018-10-08 04:19:16 +00:00
Zachary Turner	a5e3e02602	[PDB] Add support for dumping Typedef records. These work a little differently because they are actually in the globals stream and are treated as symbol records, even though DIA presents them as types. So this also adds the necessary infrastructure to cache records that live somewhere other than the TPI stream as well. llvm-svn: 343507	2018-10-01 17:55:38 +00:00
Zachary Turner	5c1873b213	[PDB] Add support for parsing VFTable Shape records. This allows them to be returned from the native API. llvm-svn: 343506	2018-10-01 17:55:16 +00:00
Zachary Turner	518cb2d560	[PDB] Add native support for dumping array types. llvm-svn: 343412	2018-09-30 16:19:18 +00:00
Zachary Turner	6ca6a03c51	[PDB] Better native API support for pointers. We didn't properly detect when a pointer was a member pointer, and when that was the case we were not properly returning class parent info. This caused member pointers to render incorrectly in pretty mode. However, we didn't even have pretty tests for pointers in native mode, so those are also added now to ensure this. llvm-svn: 343393	2018-09-29 23:28:19 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Zachary Turner	a9defc348b	Add missing include. llvm-svn: 342781	2018-09-21 22:44:31 +00:00
Zachary Turner	6345e84dde	[NativePDB] Add support for reading function signatures. This adds support for parsing function signature records and returning them through the native DIA interface. llvm-svn: 342780	2018-09-21 22:36:28 +00:00
Zachary Turner	355ffb0032	[PDB] Add native reading support for UDT / class types. This allows the native reader to find records of class/struct/ union type and dump them. This behavior is tested by using the diadump subcommand against golden output produced by actual DIA SDK on the same PDB file, and again using pretty -native to confirm that we actually dump the classes. We don't find class members or anything like that yet, for now it's just the class itself. llvm-svn: 342779	2018-09-21 22:36:04 +00:00
Zachary Turner	68f0eeff83	Fix warnings. llvm-svn: 342670	2018-09-20 17:48:44 +00:00
Zachary Turner	5907a780f0	[PDB] Better printing of builtin types when using DIA dumper. llvm-svn: 342658	2018-09-20 16:12:05 +00:00
Zachary Turner	cfa1d499f9	[PDB] Add the ability to map forward references to full decls. Some records point to an LF_CLASS, LF_UNION, LF_STRUCTURE, or LF_ENUM which is a forward reference and doesn't contain complete debug information. In these cases, we'd like to be able to quickly locate the full record. The TPI stream stores an array of pre-computed record hash values, one for each type record. If we pre-process this on startup, we can build a mapping from hash value -> {list of possible matching type indices}. Since hashes of full records are only based on the name and/or unique name and not the full record contents, we can then use the forward ref record to compute the hash of what would be the full record by just hashing the name, use this to get the list of possible matches, and iterate those looking for a match on name or unique name. llvm-pdbutil is updated to resolve forward references for the purposes of testing (plus it's just useful). Differential Revision: https://reviews.llvm.org/D52283 llvm-svn: 342656	2018-09-20 15:50:13 +00:00
Zachary Turner	c41ce8355f	[PDB] Better support for enumerating pointer types. There were several issues with the previous implementation. 1) There were no tests. 2) We didn't support creating PDBSymbolTypePointer records for builtin types since those aren't described by LF_POINTER records. 3) We didn't support a wide enough variety of builtin types even ignoring pointers. This patch fixes all of these issues. In order to add tests, it's helpful to be able to ignore the symbol index id hierarchy because it makes the golden output from the DIA version not match our output, so I've extended the dumper to disable dumping of id fields. llvm-svn: 342493	2018-09-18 16:35:05 +00:00
Zachary Turner	bdf0381e21	[PDB] Make the native reader support enumerators. Previously we would dump the names of enum types, but not their enumerator values. This adds support for enumerator values. In doing so, we have to introduce a general purpose mechanism for caching symbol indices of field list members. Unlike global types, FieldList members do not have a TypeIndex. So instead, we identify them by the pair {TypeIndexOfFieldList, IndexInFieldList}. llvm-svn: 342415	2018-09-17 21:08:11 +00:00
Zachary Turner	4727ac2394	[PDB] Make the native reader support modified types. Previously for cv-qualified types, we would just ignore them and they would never get printed. Now we can enumerate them and cache them like any other symbol type. llvm-svn: 342414	2018-09-17 21:07:48 +00:00
Nico Weber	205ca68b8d	Give InfoStreamBuilder an opt-in method to write a hash of the PDB as GUID. Naively computing the hash after the PDB data has been generated is in practice as fast as other approaches I tried. I also tried online-computing the hash as parts of the PDB were written out (https://reviews.llvm.org/D51887; that's also where all the measuring data is) and computing the hash in parallel (https://reviews.llvm.org/D51957). This approach here is simplest, without being slower. Differential Revision: https://reviews.llvm.org/D51956 llvm-svn: 342333	2018-09-15 18:35:51 +00:00
Zachary Turner	4d68951e6d	[PDB] Refactor a little of the Symbol creation code. Eventually we need to be able to support nested types, which don't have an associated CVType record. To handle this, remove the CVType from all of the record classes, and instead store the deserialized record. Then move the deserialization up to the thing that creates the type. This actually makes error handling better anyway as we can return an invalid symbol instead of asserting false. llvm-svn: 342284	2018-09-14 21:03:57 +00:00
David Blaikie	eee709f03c	DebugInfo/PDB: Remove unused member llvm-svn: 342101	2018-09-13 00:02:02 +00:00
Zachary Turner	c43d55602f	[PDB] Remove all clone() methods. These are dead code and encourage poor usage patterns, so I'm removing them. They weren't called anywhere anyway. llvm-svn: 342093	2018-09-12 22:57:03 +00:00
Zachary Turner	a1f85f8bdd	[PDB] Emit old fpo data to the PDB file. r342003 added support for emitting FPO data from the DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB file. However, that is not the end of the story. FPO can end up in two different destinations in a PDB, each corresponding to a different FPO data source. The case handled by r342003 involves copying data from the DEBUG_S_FRAMEDATA subsection of the .debug$S section to the "New FPO" stream in the PDB, which is then referred to by the DBI stream. The case handled by this patch involves copying records from the .debug$F section of an object file to the "FPO" stream (or perhaps more aptly, the "Old FPO" stream) in the PDB file, which is also referred to by the DBI stream. The formats are largely similar, and the difference is mostly only visible in masm generated object files, such as some of the low-level CRT object files like memcpy. MASM doesn't appear to support writing the DEBUG_S_FRAMEDATA subsection, and instead just writes these records to the .debug$F section. Although clang-cl does not emit a .debug$F section ever, lld still needs to support it so we have good debugging for CRT functions. Differential Revision: https://reviews.llvm.org/D51958 llvm-svn: 342080	2018-09-12 21:02:01 +00:00
Zachary Turner	42e7cc1b0f	[PDB] Write FPO Data to the PDB. llvm-svn: 342003	2018-09-11 22:35:01 +00:00
Nico Weber	e2745b5d86	pdb output: Initialize padding in PublicsStreamHeader. Makes the produced pdbs more deterministic; before they'd contain 2 arbitrary bytes where this padding was. Also reorder initialization to match the order of the fields in the struct (nfc) llvm-svn: 341945	2018-09-11 14:11:52 +00:00
Zachary Turner	cae734588f	[PDB] Change uint32_t to SymIndex wherever it makes sense. Although it's just a typedef, it helps for readability. NFC. llvm-svn: 341863	2018-09-10 21:30:59 +00:00
Zachary Turner	0119e38491	Fix some of the PDB tests. They were unintentionally calling DIA directly, which requires Windows. We need to pass the -native flag, and this then required fixing up one or two tests. llvm-svn: 341731	2018-09-07 23:36:08 +00:00

322 Commits