Commit Graph

131 Commits

Author SHA1 Message Date
Ben Langmuir 5482432bf6 [clang][deps] Compute command-lines for dependencies immediately
Instead of delaying the generation of command-lines to after all
dependencies are reported, compute them immediately. This is partly in
preparation for splitting the TU driver command into its constituent cc1
and other jobs, but it also just simplifies working with the compiler
invocation for modules if they are not "without paths".

Also change the computation of the default output path in
clang-scan-deps to scrape the implicit module cache from the
command-line rather than get it from the dependency, since that is now
unavailable at the time we make the callback.

Differential Revision: https://reviews.llvm.org/D131934
2022-08-16 14:25:27 -07:00
Jan Svoboda d5fcf8a5e2 [clang][deps] NFC: Move dependency consumer into header file 2022-08-10 15:25:09 -07:00
Jan Svoboda 71e32d5cf0 [clang][deps] Always generate module paths
Since D129389 (and downstream PR https://github.com/apple/llvm-project/pull/4965), the dependency scanner is responsible for generating full command-lines, including the modules paths. This patch removes the flag that was making this an opt-in behavior in clang-scan-deps.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D131420
2022-08-10 11:58:28 -07:00
Ben Langmuir 33af4b22f8 [clang][deps] Stop sharing FileManager across module builds in scanner
Sharing the FileManager across implicit module builds currently leaks
paths looked up in an importer into the built module itself. This can
cause non-deterministic results across scans. It is especially bad for
modules since the path can be saved into the pcm file itself, leading to
stateful behaviour if the cache is shared.

This should not impact the number of real filesystem accesses in the
scanner, since it is already caching in the
DependencyScanningWorkerFilesystem.

Note: this change does not affect whether or not the FileManager is
shared across TUs in the scanner, which is a separate issue.

Differential Revision: https://reviews.llvm.org/D131412
2022-08-08 12:13:54 -07:00
Ben Langmuir b4c6dc2e66 [clang] Update code that assumes FileEntry::getName is absolute NFC
It's an accident that we started return asbolute paths from
FileEntry::getName for all relative paths. Prepare for getName to get
(closer to) return the requested path. Note: conceptually it might make
sense for the dependency scanner to allow relative paths and have the
DependencyConsumer decide if it wants to make them absolute, but we
currently document that it's absolute and I didn't want to change
behaviour here.

Differential Revision: https://reviews.llvm.org/D130934
2022-08-01 14:48:37 -07:00
Ben Langmuir 0287170140 [clang][deps] Include canonical invocation in ContextHash
The "strict context hash" is insufficient to identify module
dependencies during scanning, leading to different module build commands
being produced for a single module, and non-deterministically choosing
between them. This commit switches to hashing the canonicalized
`CompilerInvocation` of the module. By hashing the invocation we are
converting these from correctness issues to performance issues, and we
can then incrementally improve our ability to canonicalize
command-lines.

This change can cause a regression in the number of modules needed. Of
the 4 projects I tested, 3 had no regression, but 1, which was
clang+llvm itself, had a 66% regression in number of modules (4%
regression in total invocations). This is almost entirely due to
differences between -W options across targets.  Of this, 25% of the
additional modules are system modules, which we could avoid if we
canonicalized -W options when -Wsystem-headers is not present --
unfortunately this is non-trivial due to some warnings being enabled in
system headers by default. The rest of the additional modules are mostly
real differences in potential warnings, reflecting incorrect behaviour
in the current scanner.

There were also a couple of differences due to `-DFOO`
`-fmodule-ignore-macro=FOO`, which I fixed here.

Since the output paths for the module depend on its context hash, we
hash the invocation before filling in outputs, and rely on the build
system to always return the same output paths for a given module.

Note: since the scanner itself uses an implicit modules build, there can
still be non-determinism, but it will now present as different
module+hashes rather than different command-lines for the same
module+hash.

Differential Revision: https://reviews.llvm.org/D129884
2022-07-28 12:24:06 -07:00
Argyrios Kyrtzidis fbbabd4ca0 [Tooling/DependencyScanning] Enable passing a `vfs::FileSystem` object to `DependencyScanningTool`
Also include a unit test to validate that the `vfs::FileSystem` object is properly used.

Differential Revision: https://reviews.llvm.org/D129912
2022-07-18 09:37:17 -07:00
Ben Langmuir 3ce78cbd23 [clang][deps] Fix handling of -MT in module command-line
Follow-up to 6626f6fec3, this fixes the handling of -MT
* If no targets are provided, we need to invent one since cc1 expects
  the driver to have handled it. The default is to use -o, quoting as
  necessary for a make target.
* Fix the splitting for empty string, which was incorrectly treated as
  {""} instead of {}.
* Add a way to test this behaviour in clang-scan-deps.

Differential Revision: https://reviews.llvm.org/D129607
2022-07-13 13:36:15 -07:00
Ben Langmuir 6626f6fec3 [clang][deps] Override dependency and serialized diag files for modules
When building modules, override secondary outputs (dependency file,
dependency targets, serialized diagnostic file) in addition to the pcm
file path. This avoids inheriting per-TU command-line options that
cause non-determinism in the results (non-deterministic command-line for
the module build, non-determinism in which TU's .diag and .d files will
contain the module outputs). In clang-scan-deps we infer whether to
generate dependency or serialized diagnostic files based on an original
command-line. In a real build system this should be modeled explicitly.

Differential Revision: https://reviews.llvm.org/D129389
2022-07-12 08:19:52 -07:00
Argyrios Kyrtzidis fe3780f32a [DependencyScanningTool.cpp] Use `using namespace` instead of wrapping the `.cpp` file contents in namespaces, NFC
This makes the file consistent with the coding style of the rest of LLVM.
2022-07-11 17:44:17 -07:00
Kazu Hirata 452db157c9 [clang] Don't use Optional::hasValue (NFC) 2022-06-20 10:51:34 -07:00
Ben Langmuir 4a3a9a5fa0 [clang][deps] Sort submodules when calculating dependencies
Dependency scanning does not care about the order of submodules for
correctness, so sort the submodules so that we get the same
command-lines to build the module across different TUs. The order of
inferred submodules can vary depending on the order of #includes in the
including TU.

Differential Revision: https://reviews.llvm.org/D128008
2022-06-17 07:55:27 -07:00
Ben Langmuir 509223da61 [clang][deps] Further canonicalize implicit modules options in dep scan
Disable or canonicalize compiler options that are not relevant in
explicit module builds, similar to what we already did for the modules
cache path. This reduces uninteresting differences between
command-lines, which is particularly useful if there is a tool that can
cache the compilations.

Differential Revision: https://reviews.llvm.org/D127883
2022-06-15 13:29:47 -07:00
Ben Langmuir 835fcf2aa5 [clang][deps] Make order of module dependencies deterministic
This fixes the underlying module dependencies, which had a
non-deterministic order, which was also visible in the order of calls to
DependencyConsumer methods. This was not directly observable in
the clang-scan-deps utility, because it was previously seeing a sorted
order from std::map in DependencyScanningTool. However, the underlying
API previously created a likely issue for any other clients. Note: if
you only apply the change from DependencyScanningTool, you can see the
issue in clang-scan-deps, and existing tests will fail
non-deterministicaly.

Differential Revision: https://reviews.llvm.org/D127243
2022-06-08 11:09:17 -07:00
Ben Langmuir 7a72dca74a [clang][deps] Set -disable-free for module compilations
The command-line arguments for module builds are cc1 commands, so they
do not implicitly set -disable-free like a driver invocation, and
Tooling will disable it for the scanning instance itself. Set
-disable-free explicitly so that separate invocations for building
modules will not pay for freeing memory unnecessarily.

Differential Revision: https://reviews.llvm.org/D127229
2022-06-08 11:09:17 -07:00
Argyrios Kyrtzidis b4c83a13f6 [Tooling/DependencyScanning & Preprocessor] Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
This is a commit with the following changes:

* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality

Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.

* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources

Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.

* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`

This is bringing the following benefits:

    * Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
    * Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
    * Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).

For normal preprocessing measurements show differences are below the noise level.

Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.

Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
2022-05-26 12:50:06 -07:00
Argyrios Kyrtzidis b58a420ff4 [Tooling/DependencyScanning] Rename refactorings towards transitioning dependency scanning to use pre-lexed preprocessor directive tokens
This is first of a series of patches for making the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`.
This patch only includes NFC renaming changes to make reviewing of the functionality changing parts easier.

Differential Revision: https://reviews.llvm.org/D125484
2022-05-26 12:49:51 -07:00
Argyrios Kyrtzidis 42823beb1d [Tooling/DependencyScanning] Make skipping excluded PP ranges during dependency scanning the default
This is to improve maintenance a bit and remove need to maintain the additional option and related code-paths.

Differential Revision: https://reviews.llvm.org/D124558
2022-04-28 15:23:03 -07:00
Jan Svoboda 7ed01ba88d [clang][deps] NFC: Inline function with single caller 2022-04-15 16:24:40 +02:00
Jan Svoboda d79ad2f1db [clang][lex] NFCI: Use FileEntryRef in PPCallbacks::InclusionDirective()
This patch changes type of the `File` parameter in `PPCallbacks::InclusionDirective()` from `const FileEntry *` to `Optional<FileEntryRef>`.

With the API change in place, this patch then removes some uses of the deprecated `FileEntry::getName()` (e.g. in `DependencyGraph.cpp` and `ModuleDependencyCollector.cpp`).

Reviewed By: dexonsmith, bnbarham

Differential Revision: https://reviews.llvm.org/D123574
2022-04-14 10:46:12 +02:00
Jan Svoboda 1e25ff84d8 [clang][deps] Fix traversal of precompiled dependencies
The code for traversing precompiled dependencies is somewhat complicated and contains a dangling iterator bug.

This patch simplifies the code and fixes the bug.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121533
2022-03-16 12:17:53 +01:00
Jan Svoboda d73daa9135 [clang][deps] Don't prune search paths used by dependencies
When pruning header search paths (to reduce the number of modules we need to build explicitly), we can't prune the search paths used in (transitive) dependencies of a module. Otherwise, we could end up with either of the following dependency graphs:

```
X:<hash1> -> Y:<hash2>
X:<hash1> -> Y:<hash3>
```

depending on the search paths of the translation unit we discovered `X` and `Y` from.

This patch fixes that.

Depends on D121295.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121303
2022-03-16 12:17:53 +01:00
Jan Svoboda cf4a31fc0f [clang][deps] Remove '-fmodules-cache-path=' arguments
With explicit modules build, the '-fmodules-cache-path=' argument is unused.

This patch removes the argument to avoid warnings or errors (with '-Werror') stemming from that.

Depends on D118915.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120474
2022-03-12 11:42:07 +01:00
Jan Svoboda 7f6af60746 [clang][deps] Generate '-fmodule-file=' only for direct dependencies
The `clang-scan-deps` tool currently generates `-fmodule-file=` command-line arguments for the whole transitive closure of modular dependencies. This is not necessary, we only need to provide the direct dependencies on the command line. Information about transitive dependencies is stored within the `.pcm` files of direct dependencies. This makes the command lines shorter, but should be a NFC otherwise (unless there are bugs in the loading mechanism for explicit modules).

Depends on D120465.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D118915
2022-03-12 11:32:51 +01:00
Jan Svoboda a6ef363546 [clang][deps] Disable implicit module maps
Since D113473, we don't report any module map files via `-fmodule-map-file=` in explicit builds. The ultimate goal here is to make sure Clang doesn't open/read/parse/evaluate unnecessary module maps.

However, implicit module maps still end up reading all reachable module maps. This patch disables implicit module maps in explicit builds.

Unfortunately, we still need to report some module map files that aren't encoded in PCM files of dependencies: module maps that are necessary to correctly evaluate includes in modules marked as `[no_undeclared_includes]`.

Depends on D120464.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120465
2022-03-12 11:07:21 +01:00
Jan Svoboda 19017c2435 [clang][deps] Return the whole TU command line
The dependency scanner already generates canonical -cc1 command lines that can be used to compile discovered modular dependencies.

For translation unit command lines, the scanner only generates additional driver arguments the build system is expected to append to the original command line.

While this works most of the time, there are situations where that's not the case. For example with `-Wunused-command-line-argument`, Clang will complain about the `-fmodules-cache-path=` argument that's not being used in explicit modular builds. Combine that with `-Werror` and the build outright fails.

To prevent such failures, this patch changes the dependency scanner to return the full driver command line to compile the original translation unit. This gives us more opportunities to massage the arguments into something reasonable.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D118986
2022-02-23 15:46:20 +01:00
Jan Svoboda 80a696898c [clang][deps] NFC: Update documentation
In D113473, the dependency scanner stopped emitting "-fmodule-map-file=" arguments. Potential build systems are expected to not add any such arguments on their own. This commit removes mentions of such arguments to avoid confusion.
2022-02-23 15:46:20 +01:00
Jan Svoboda c6f8704053 [clang][deps] Disable global module index
While scanning dependencies of a TU that depends on a PCH, the scanner basically performs mixed implicit/explicit modular compilation. (Explicit modules come from the PCH.) This seems to trip up the global module index.

This patch disables global module index in the dependency scanner.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D118890
2022-02-15 09:51:23 +01:00
Jan Svoboda 8cc2a13727 [clang][deps] Handle symlinks in minimizing FS
The minimizing and caching filesystem used by the dependency scanner can be configured to **not** minimize some files. That's necessary when scanning a TU with prebuilt inputs (i.e. PCH) that refer to the original (non-minimized) files. Minimizing such files in the dependency scanner would cause discrepancy between the current perceived state of the filesystem and the file sizes stored in the AST file. By not minimizing such files, we avoid creating the discrepancy.

The problem with the current approach is that files that should not be minimized are identified by their path. This breaks down when the prebuilt input (PCH) and the current TU refer to the same file via different paths (i.e. symlinks). This patch switches from paths to `llvm::sys::fs::UniqueID` when identifying ignored files. This is consistent with how the rest of Clang treats files.

Depends on D114966.

Reviewed By: dexonsmith, arphaman

Differential Revision: https://reviews.llvm.org/D114971
2022-01-21 13:04:25 +01:00
Jan Svoboda 5daeada330 [clang][deps] Ensure filesystem cache consistency
The minimizing filesystem used by the dependency scanner isn't great when it comes to the consistency of its caches. There are two problems that can be exposed by a filesystem that changes during dependency scan:
1. In-memory cache entries for original and minimized files are distinct, populated at different times using separate stat/open syscalls. This means that when a file is read with minimization disabled, its contents might be inconsistent when the same file is read with minimization enabled at later point (and vice versa).
2. In-memory cache entries are indexed by filename. This is problematic for symlinks, where the contents of the symlink might be inconsistent with contents of the original file (for the same reason as in problem 1).

This patch ensures consistency by always stating/reading a file exactly once. The original contents are always cached and minimized contents are derived from that on demand. The cache entries are now indexed by their `UniqueID` ensuring consistency for symlinks too. Moreover, the stat/read syscalls are now issued outside of critical section.

Depends on D115935.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D114966
2022-01-21 13:04:25 +01:00
Jan Svoboda ced077e1ba [clang][deps] NFC: Simplify handling of cached FS errors
The return types of some `CachedFileSystemEntry` member function are needlessly complex.

This patch attempts to simplify the code by unwrapping cached entries that represent errors early, and then asserting `!isError()`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115935
2022-01-21 13:04:25 +01:00
Michael Spencer 37e6e022d2 Re-land "[Clang][ScanDeps] Use the virtual path for module maps"
This re-lands:
- 04192422c4
- 015e08c6ba

Which I reverted in ea83517138 in error.

Differential Revision: https://reviews.llvm.org/D114206
2022-01-06 21:05:05 +00:00
Archibald Elliott ea83517138 Revert "[Clang][ScanDeps] Use the virtual path for module maps"
This reverts commits:
- 04192422c4.
- 015e08c6ba

D114206 was landed before it was approved - and was landed knowing that
the test crashed on windows, without an xfail. The promised follow-up
commit with fixes has not appeared since it was promised on December 14th.
2022-01-05 12:17:06 +00:00
Jan Svoboda 3f3b5c3ec0 [clang][deps] NFC: Unify ErrorOr patterns
This patch canonicalized some code into repetitive ErrorOr pattern. This will make refactoring easier if we ever come up with a way to simplify this.
2021-12-17 14:00:20 +01:00
Jan Svoboda bcdf7f5e91 [clang][deps] NFC: Take and store entry as reference 2021-12-17 14:00:20 +01:00
Jan Svoboda af7a421ef4 [clang][deps] NFC: Remove explicit call to implicit constructor 2021-12-17 14:00:20 +01:00
Jan Svoboda 195a5294c2 [clang][deps] NFC: Rename member variable 2021-12-17 14:00:20 +01:00
Jan Svoboda 4170ea9445 [clang][deps] NFC: Fix whitespace formatting 2021-12-17 14:00:20 +01:00
Jan Svoboda f66803457e [clang][deps] Squash caches for original and minimized files
The minimizing and caching filesystem used by the dependency scanner keeps minimized and original files in separate caches.

This setup is not well suited for dealing with files that are sometimes minimized and sometimes not. Such files are being stat-ed and read twice, which is wasteful and also means the two versions of the file can get "out of sync".

This patch squashes the two caches together. When a file is stat-ed or read, its original contents are populated. If a file needs to be minimized, we give the minimizer the already loaded contents instead of reading the file again.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115346
2021-12-16 09:57:21 +01:00
Jan Svoboda da920c3bcc [clang][deps] NFC: Move entry initialization into member functions
This is a prep-patch for making `CachedFileSystemEntry` initialization more lazy.
2021-12-15 16:39:29 +01:00
Jan Svoboda 3031fd71b9 [clang][deps] NFC: Use clearer wording around entry initialization
The code and documentation around `CachedFileSystemEntry` use the following terms:
* "invalid stat" for `llvm::ErrorOr<llvm::vfs::Status>` that is *not* an error and contains an unknown status,
* "initialized entry" for an entry that contains "invalid stat",
* "valid entry" for an entry that contains "invalid stat", synonymous to "initialized" entry.

Having an entry be "valid" while it contains an "invalid" status object is counter-intuitive.
This patch cleans up the wording by referring to the status as "unknown" and to the entry as either "initialized" or "uninitialized".
2021-12-15 16:14:44 +01:00
Michael Spencer 04192422c4 [Clang][ScanDeps] Use the virtual path for module maps
Make clang-scan-deps use the virtual path for module maps instead of the on disk
path. This is needed so that modulemap relative lookups are done correctly in
the actual module builds. The file dependencies still use the on disk path as
that's what matters for build invalidation.

Differential Revision: https://reviews.llvm.org/D114206
2021-12-14 11:21:42 -07:00
Jan Svoboda 13a351e862 [clang][deps] Use MemoryBuffer in minimizing FS
This patch avoids unnecessarily copying contents of `mmap`-ed files into `CachedFileSystemEntry` by storing `MemoryBuffer` instead. The change leads to ~50% reduction of peak memory footprint when scanning LLVM+Clang via `clang-scan-deps`.

Depends on D115331.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115043
2021-12-09 11:32:13 +01:00
Jan Svoboda 58822837cd [clang][deps] Use lock_guard instead of unique_lock
This patch changes uses of `std::unique_lock` to `std::lock_guard`.

The `std::unique_lock` template provides some advanced capabilities (deferred locking, time-constrained locking attempts, etc.) we don't use in the caching filesystem. Plain `std::lock_guard` will do here.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115332
2021-12-09 10:42:50 +01:00
Jan Svoboda 5b6c08379b [clang][deps] Reset some benign codegen options
Some command-line codegen arguments are likely to differ between identical modules discovered from different translation units. This patch removes them to make builds deterministic and/or reduce the number of built modules.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D112923
2021-12-08 11:53:50 +01:00
Jan Svoboda 97e504cff9 [clang][deps] NFC: Extract function
This commits extracts a couple of nested conditions into a separate function with early returns, making the control flow easier to understand.
2021-11-26 14:01:24 +01:00
Jan Svoboda 12eafd944e [clang][deps] NFC: Clean up wording (ignored vs minimized)
The filesystem used during dependency scanning does two things: it caches file entries and minimizes source file contents. We use the term "ignored file" in a couple of places, but it's not clear what exactly that means. This commit clears up the semantics, explicitly spelling out this relates to minimization.
2021-11-26 12:18:37 +01:00
Jan Svoboda d8a3538788 [clang][deps] NFC: Remove else after early return 2021-11-26 12:18:37 +01:00
Jan Svoboda 17ec9d1f6b [clang][deps] Don't emit `-fmodule-map-file=`
During explicit modules build, when all modules are provided via `-fmodule-file=<path>` and implicit modules and implicit module maps are disabled (`-fno-implicit-modules`, `-fno-implicit-module-maps`), we don't need to load the original module map files at all. This patch stops emitting the `-fmodule-map-file=` arguments we don't need, saving some compilation time due to avoiding parsing such module maps and making the command line shorter.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D113473
2021-11-18 12:31:24 +01:00
Jan Svoboda c62220f962 [clang][deps] NFC: Rename building CompilerInvocation
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives descriptive name to the generated `CompilerInvocation` that can be used to derive the command-line to build a modular dependency.

Depends on D111725.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111728
2021-10-21 13:51:27 +02:00