LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.
This patch replaces `std::vector<bool>` with `llvm::BitVector` in the Syntax library and replaces range-based for loop with regular for loop. This is necessary due to `llvm::BitVector` not having `begin()` and `end()` (D117116).
Reviewed By: dexonsmith, dblaikie
Differential Revision: https://reviews.llvm.org/D118109
This patch adds a `buildAccess` function, which constructs a string with the
proper operator to use based on the expression's form and type. It also adds two
predicates related to smart pointers, which are needed by `buildAccess` but are
also of general value.
We deprecate `buildDot` and `buildArrow` in favor of the more general
`buildAccess`. These will be removed in a future patch.
Differential Revision: https://reviews.llvm.org/D116377
The minimizing and caching filesystem used by the dependency scanner can be configured to **not** minimize some files. That's necessary when scanning a TU with prebuilt inputs (i.e. PCH) that refer to the original (non-minimized) files. Minimizing such files in the dependency scanner would cause discrepancy between the current perceived state of the filesystem and the file sizes stored in the AST file. By not minimizing such files, we avoid creating the discrepancy.
The problem with the current approach is that files that should not be minimized are identified by their path. This breaks down when the prebuilt input (PCH) and the current TU refer to the same file via different paths (i.e. symlinks). This patch switches from paths to `llvm::sys::fs::UniqueID` when identifying ignored files. This is consistent with how the rest of Clang treats files.
Depends on D114966.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D114971
The minimizing filesystem used by the dependency scanner isn't great when it comes to the consistency of its caches. There are two problems that can be exposed by a filesystem that changes during dependency scan:
1. In-memory cache entries for original and minimized files are distinct, populated at different times using separate stat/open syscalls. This means that when a file is read with minimization disabled, its contents might be inconsistent when the same file is read with minimization enabled at later point (and vice versa).
2. In-memory cache entries are indexed by filename. This is problematic for symlinks, where the contents of the symlink might be inconsistent with contents of the original file (for the same reason as in problem 1).
This patch ensures consistency by always stating/reading a file exactly once. The original contents are always cached and minimized contents are derived from that on demand. The cache entries are now indexed by their `UniqueID` ensuring consistency for symlinks too. Moreover, the stat/read syscalls are now issued outside of critical section.
Depends on D115935.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D114966
The return types of some `CachedFileSystemEntry` member function are needlessly complex.
This patch attempts to simplify the code by unwrapping cached entries that represent errors early, and then asserting `!isError()`.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D115935
"driver <flags> -- <input>" is a particularly convenient form of the
compile command to manipulate, with fewer special cases to handle.
Guaranteeing that the output command is of that form is cheap and makes
it easier to consume the result in some cases.
Differential Revision: https://reviews.llvm.org/D116721
This reverts commits:
- 04192422c4.
- 015e08c6ba
D114206 was landed before it was approved - and was landed knowing that
the test crashed on windows, without an xfail. The promised follow-up
commit with fixes has not appeared since it was promised on December 14th.
Extends response file expansion to recognize `<CFGDIR>` and expand to the
current file's directory. This makes it much easier to author clang config
files rooted in portable, potentially not-installed SDK directories.
A typical use case may be something like the following:
```
# sample_sdk.cfg
--target=sample
-isystem <CFGDIR>/include
-L <CFGDIR>/lib
-T <CFGDIR>/ldscripts/link.ld
```
Reviewed By: sepavloff
Differential Revision: https://reviews.llvm.org/D115604
The minimizing and caching filesystem used by the dependency scanner keeps minimized and original files in separate caches.
This setup is not well suited for dealing with files that are sometimes minimized and sometimes not. Such files are being stat-ed and read twice, which is wasteful and also means the two versions of the file can get "out of sync".
This patch squashes the two caches together. When a file is stat-ed or read, its original contents are populated. If a file needs to be minimized, we give the minimizer the already loaded contents instead of reading the file again.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D115346
The code and documentation around `CachedFileSystemEntry` use the following terms:
* "invalid stat" for `llvm::ErrorOr<llvm::vfs::Status>` that is *not* an error and contains an unknown status,
* "initialized entry" for an entry that contains "invalid stat",
* "valid entry" for an entry that contains "invalid stat", synonymous to "initialized" entry.
Having an entry be "valid" while it contains an "invalid" status object is counter-intuitive.
This patch cleans up the wording by referring to the status as "unknown" and to the entry as either "initialized" or "uninitialized".
Make clang-scan-deps use the virtual path for module maps instead of the on disk
path. This is needed so that modulemap relative lookups are done correctly in
the actual module builds. The file dependencies still use the on disk path as
that's what matters for build invalidation.
Differential Revision: https://reviews.llvm.org/D114206
This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5,
which made raw_string_ostream unbuffered by default.
Differential Revision: https://reviews.llvm.org/D115374
This patch avoids unnecessarily copying contents of `mmap`-ed files into `CachedFileSystemEntry` by storing `MemoryBuffer` instead. The change leads to ~50% reduction of peak memory footprint when scanning LLVM+Clang via `clang-scan-deps`.
Depends on D115331.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D115043
This patch changes uses of `std::unique_lock` to `std::lock_guard`.
The `std::unique_lock` template provides some advanced capabilities (deferred locking, time-constrained locking attempts, etc.) we don't use in the caching filesystem. Plain `std::lock_guard` will do here.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D115332
Some command-line codegen arguments are likely to differ between identical modules discovered from different translation units. This patch removes them to make builds deterministic and/or reduce the number of built modules.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D112923
The filesystem used during dependency scanning does two things: it caches file entries and minimizes source file contents. We use the term "ignored file" in a couple of places, but it's not clear what exactly that means. This commit clears up the semantics, explicitly spelling out this relates to minimization.
During explicit modules build, when all modules are provided via `-fmodule-file=<path>` and implicit modules and implicit module maps are disabled (`-fno-implicit-modules`, `-fno-implicit-module-maps`), we don't need to load the original module map files at all. This patch stops emitting the `-fmodule-map-file=` arguments we don't need, saving some compilation time due to avoiding parsing such module maps and making the command line shorter.
Reviewed By: bnbarham
Differential Revision: https://reviews.llvm.org/D113473
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.
This patch gives descriptive name to the generated `CompilerInvocation` that can be used to derive the command-line to build a modular dependency.
Depends on D111725.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111728
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.
This patch gives a distinct name to the `CompilerInstance` that's used to run the implicit build during dependency scan.
Depends on D111724.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111725
The `ModuleDepCollectorPP` class holds a reference to `ModuleDepCollector` as well as `ModuleDepCollector`'s `CompilerInstance`. The fact that these refer to the same object is non-obvious.
This patch removes the `CompilerInvocation` reference from `ModuleDepCollectorPP` and accesses it through `ModuleDepCollector` instead.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111724
One of main goals of the dependency scanner is to be strict about module compatibility. This is achieved through strict context hash. This patch ensures that strict context hash is enabled not only during the scan itself (and its minimized implicit build), but also when actually reporting the dependency.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111720
Adds `selectBound`, a `Stencil` combinator that allows the user to supply multiple alternative cases, discriminated by bound node IDs.
Differential Revision: https://reviews.llvm.org/D111708
Previously it only used Windows command lines for MSVC triples, but this
was causing issues for windows-gnu. In fact, everything 'native' Windows
(ie, not Cygwin) should use Windows command line parsing.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D111195
During explicit modular build, PCM files are typically specified via the `-fmodule-file=<path>` command-line option. Early during the compilation, Clang uses the `ASTReader` to read their contents and caches the result so that the module isn't loaded implicitly later on. A listener is attached to the `ASTReader` to collect names of the modules read from the PCM files. However, if the PCM has already been loaded previously via PCH:
1. the `ASTReader` doesn't do anything for the second time,
2. the listener is not invoked at all,
3. the module load result is not cached,
4. the compilation fails when attempting to load the module implicitly later on.
This patch solves this problem by attaching the listener to the `ASTReader` for PCH reading as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111560
To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.
In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we could do it only once instead of every load of the PCM file by dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which is probably highly surprising.
There is an alternative approach to storing extra information into the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. The second run of dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, which would result in different results between runs. We can revisit this approach when we stop building implicitly during dependency scanning.
Depends on D102923.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102488
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
Rename methods to clearly signal when they only deal with ASCII,
simplify the parsing of identifier, and use start/continue instead of
head/body for consistency with Unicode terminology.
During dependency scanning, we generally want to suppress -Werror. Apply the same logic to the DiagnosticOptions instance used for command-line parsing.
This fixes a test failure on the PS4 bot, where the system header directory could not be found, which was reported due to -Werror being on the command line and not being sanitized.
In this patch the dependency scanner starts using proper `DiagnosticOptions` parsed from the actual TU command-line in order to mimic what the actual compiler would do. The actual functionality will be enabled and tested in follow-up patches. (This split is necessary to avoid temporary regression.)
Depends on D108976.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D108982
This patch allows the clients of `ToolInvocation` to provide custom diagnostic options to be used during driver -> cc1 command-line transformation and parsing.
Tests covering this functionality are in a follow-up commit. To make this testable, the `DiagnosticsEngine` needs to be properly initialized via `CompilerInstance::createDiagnostics`.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D108976
The dependency scanner currently uses `ClangTool` to invoke the dependency scanning action.
However, `ClangTool` seems to be the wrong level of abstraction. It's intended to be run over a collection of compile commands, which we actively avoid via `SingleCommandCompilationDatabase`. It automatically injects `-fsyntax-only` and other flags, which we avoid by calling `clearArgumentsAdjusters()`. It deduces the resource directory based on the current executable path, which we'd like to change to deducing from `argv[0]`.
Internally, `ClangTool` uses `ToolInvocation` which seems to be more in line with what the dependency scanner tries to achieve. This patch switches to directly using `ToolInvocation` instead. NFC.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108979
This patch changes how the dependency scanner creates the fake input file when scanning dependencies of a single module (introduced in D109485). The scanner now has its own `InMemoryFilesystem` which sits under the minimizing FS (when that's requested). This makes it possible to drop the duplicate work in `DependencyScanningActions::runInvocation` that sets up the main file ID. Besides that, this patch makes it possible to land D108979, where we drop `ClangTool` entirely.
Depends on D109485.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D109498
module lookup by name alone
This removes the need to create a fake source file that imports a
module.
rdar://64538073
Differential Revision: https://reviews.llvm.org/D109485