This patch adds new member function to `DependencyScanningWorker` that allows clients to pass custom `DiagnosticConsumer`, and returns `bool`.
This provides more flexibility compared to the existing version that automatically stringifies diagnostics and returns them in `llvm::Error`.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D134838
Functions that implement expansion of response and config files depend
on many options, which are passes as arguments. Extending the expansion
requires new options, it in turn causes changing calls in various places
making them even more bulky.
This change introduces a class ExpansionContext, which represents set of
options that control the expansion. Its methods implements expansion of
responce files including config files. It makes extending the expansion
easier.
No functional changes.
Differential Revision: https://reviews.llvm.org/D132379
Functions that implement expansion of response and config files depend
on many options, which are passes as arguments. Extending the expansion
requires new options, it in turn causes changing calls in various places
making them even more bulky.
This change introduces a class ExpansionContext, which represents set of
options that control the expansion. Its methods implements expansion of
responce files including config files. It makes extending the expansion
easier.
No functional changes.
Differential Revision: https://reviews.llvm.org/D132379
In `ASTWriter`, input files are sorted based on whether they are system or user. The current implementation used single `std::queue` with `push_back` and `push_front`. This resulted in the user files being reversed.
This patch fixes that by keeping the system/user distinction, but otherwise serializing files in the order they were loaded by the `SourceManager`. This is then used in the dependency scanner to report module map dependencies in the correct order.
Depends on D134224.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D134248
Module map files describing excluded headers do affect compilation. Track them in the compiler, serialize them into the PCM file and report them in the scanner.
Depends on D134222.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D134224
This patch fixes compilation failure with explicit modules caused by scanner not reporting the module map describing the module whose implementation is being compiled.
Below is a breakdown of the attached test case. Note the VFS that makes frameworks "A" and "B" share the same header "shared/H.h".
In non-modular build, Clang skips the last import, since the "shared/H.h" header has already been included.
During scan (or implicit build), the compiler handles "tu.m" as follows:
* `@import B` imports module "B", as expected,
* `#import <A/H.h>` is resolved textually (due to `-fmodule-name=A`) to "shared/H.h" (due to the VFS remapping),
* `#import <B/H.h>` is resolved to import module "A_Private", since the header "shared/H.h" is already known to be part of that module, and the import is skipped.
In the end, the only modular dependency of the TU is "B".
In explicit modular build without `-fmodule-name=A`, TU does depend on module "A_Private" properly, not just textually. Clang therefore builds & loads its PCM, and knows to ignore the last import, since "shared/H.h" is known to be part of "A_Private".
But with current scanner behavior and `-fmodule-name=A` present, the last import fails during explicit build. Clang doesn't know about "A_Private" (it's included textually) and tries to import "B_Private" instead, which it doesn't know about either (the scanner correctly didn't report it as dependency). This is fixed by reporting the module map describing "A" and matching the semantics of implicit build.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D134222
The `ScanInstance` is a local variable in `DependencyScanningAction::runInvocation()` that is referenced by `ModuleDepCollector`. Since D132405, `ModuleDepCollector` can escape the function and can outlive its `ScanInstance`. This patch fixes that.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D133988
Attempt to fix the test failures observed in CI:
* Add Option dependency, which caused BUILD_SHARED_LIBS builds to fail
* Adapt tests that accidentally depended on the host platform: platforms
that don't use an integrated assembler (e.g. AIX) get a different set
of commands from the driver. Most dependency scanner tests can use
-fsyntax-only or -E instead of -c to avoid this, and in the rare case
we want to check -c specifically, set an explicit target so the
behaviour is independent of the host.
Original commit message follows.
---
Instead of trying to "fix" the original driver invocation by appending
arguments to it, split it into multiple commands, and for each -cc1
command use a CompilerInvocation to give precise control over the
invocation.
This change should make it easier to (in the future) canonicalize the
command-line (e.g. to improve hits in something like ccache), apply
optimizations, or start supporting multi-arch builds, which would
require different modules for each arch.
In the long run it may make sense to treat the TU commands as a
dependency graph, each with their own dependencies on modules or earlier
TU commands, but for now they are simply a list that is executed in
order, and the dependencies are simply duplicated. Since we currently
only support single-arch builds, there is no parallelism available in
the execution.
Differential Revision: https://reviews.llvm.org/D132405
Instead of trying to "fix" the original driver invocation by appending
arguments to it, split it into multiple commands, and for each -cc1
command use a CompilerInvocation to give precise control over the
invocation.
This change should make it easier to (in the future) canonicalize the
command-line (e.g. to improve hits in something like ccache), apply
optimizations, or start supporting multi-arch builds, which would
require different modules for each arch.
In the long run it may make sense to treat the TU commands as a
dependency graph, each with their own dependencies on modules or earlier
TU commands, but for now they are simply a list that is executed in
order, and the dependencies are simply duplicated. Since we currently
only support single-arch builds, there is no parallelism available in
the execution.
Differential Revision: https://reviews.llvm.org/D132405
* Factor module map and module file path functions out
* Use a secondary mapping to lookup module deps by ID instead of the
preprocessor module map.
* Sink DirectPrebuiltModularDeps into MDC.
Differential Revision: https://reviews.llvm.org/D132617
The invocation is only ever used to serialize cc1 arguments from, so
instead serialize the arguments inside the dep scanner to simplify the
interface.
Differential Revision: https://reviews.llvm.org/D132616
ToolInvocation is useful even if you already have a -cc1 invocation,
since it provides a standard way to setup diagnostics, parse arguments,
and handoff to a ToolAction. So teach it to support -cc1 commands by
skipping the driver bits.
Differential Revision: https://reviews.llvm.org/D132615
When compiling a module, its semantics and Clang's behavior are affected by other modules. These modules are typically the **imported** ones. However, during implicit build, some modules end up being compiled and read without being actually imported. This patch starts tracking such modules and serializing them into `.pcm` files. This enables the dependency scanner to construct explicit compilations that mimic implicit build.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D132430
A simple sed doing these substitutions:
- `${LLVM_BINARY_DIR}/(\$\{CMAKE_CFG_INTDIR}/)?lib(${LLVM_LIBDIR_SUFFIX})?\>` -> `${LLVM_LIBRARY_DIR}`
- `${LLVM_BINARY_DIR}/(\$\{CMAKE_CFG_INTDIR}/)?bin\>` -> `${LLVM_TOOLS_BINARY_DIR}`
where `\>` means "word boundary".
The only manual modifications were reverting changes in
- `compiler-rt/cmake/Modules/CompilerRTUtils.cmake
- `runtimes/CMakeLists.txt`
because these were "entry points" where we wanted to tread carefully not not introduce a "loop" which would end with an undefined variable being expanded to nothing.
This hopefully increases readability overall, and also decreases the usages of `LLVM_LIBDIR_SUFFIX`, preparing us for D130586.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D132316
Move copying compiler arguments to a vector<string> and modifying
common module-related options into CompilerInvocation in preparation for
using some of them in more places and to avoid duplicating this code
accidentally in the future.
Differential Revision: https://reviews.llvm.org/D132419
This patch introduces new option `-eager-load-pcm` to `clang-scan-deps`, which controls whether the resulting command-lines will load PCM files eagerly (at the start of compilation) or lazily (when handling import directive). This patch also switches the default from eager to lazy.
To reduce the potential for churn in LIT tests in the future, this patch also removes redundant checks of command-line arguments and introduces new test `modules-dep-args.c` as a substitute.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D132066
Instead of delaying the generation of command-lines to after all
dependencies are reported, compute them immediately. This is partly in
preparation for splitting the TU driver command into its constituent cc1
and other jobs, but it also just simplifies working with the compiler
invocation for modules if they are not "without paths".
Also change the computation of the default output path in
clang-scan-deps to scrape the implicit module cache from the
command-line rather than get it from the dependency, since that is now
unavailable at the time we make the callback.
Differential Revision: https://reviews.llvm.org/D131934
Right now we can only add a single warning, notes are not possible.
Apparently some provisions were made to allow notes, but they were never
propagated all the way to the diagnostics.
Differential Revision: https://reviews.llvm.org/D128807
Since D129389 (and downstream PR https://github.com/apple/llvm-project/pull/4965), the dependency scanner is responsible for generating full command-lines, including the modules paths. This patch removes the flag that was making this an opt-in behavior in clang-scan-deps.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D131420
Given that we provide an EditGenerator edit(ASTEdit), we can't ever be
sure that the user won't give us an empty replacement.
Differential Revision: https://reviews.llvm.org/D128887
Sharing the FileManager across implicit module builds currently leaks
paths looked up in an importer into the built module itself. This can
cause non-deterministic results across scans. It is especially bad for
modules since the path can be saved into the pcm file itself, leading to
stateful behaviour if the cache is shared.
This should not impact the number of real filesystem accesses in the
scanner, since it is already caching in the
DependencyScanningWorkerFilesystem.
Note: this change does not affect whether or not the FileManager is
shared across TUs in the scanner, which is a separate issue.
Differential Revision: https://reviews.llvm.org/D131412
Prefer using these accessors to access the special sub-commands
corresponding to the top-level (no subcommand) and all sub-commands.
This is a preparatory step towards removing the use of ManagedStatic:
with a subsequent change, these global instances will be moved to
be regular function-scope statics.
It is split up to give downstream projects a (albeit short) window in
which they can switch to using the accessors in a forward-compatible
way.
Differential Revision: https://reviews.llvm.org/D129118
It's an accident that we started return asbolute paths from
FileEntry::getName for all relative paths. Prepare for getName to get
(closer to) return the requested path. Note: conceptually it might make
sense for the dependency scanner to allow relative paths and have the
DependencyConsumer decide if it wants to make them absolute, but we
currently document that it's absolute and I didn't want to change
behaviour here.
Differential Revision: https://reviews.llvm.org/D130934
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
The "strict context hash" is insufficient to identify module
dependencies during scanning, leading to different module build commands
being produced for a single module, and non-deterministically choosing
between them. This commit switches to hashing the canonicalized
`CompilerInvocation` of the module. By hashing the invocation we are
converting these from correctness issues to performance issues, and we
can then incrementally improve our ability to canonicalize
command-lines.
This change can cause a regression in the number of modules needed. Of
the 4 projects I tested, 3 had no regression, but 1, which was
clang+llvm itself, had a 66% regression in number of modules (4%
regression in total invocations). This is almost entirely due to
differences between -W options across targets. Of this, 25% of the
additional modules are system modules, which we could avoid if we
canonicalized -W options when -Wsystem-headers is not present --
unfortunately this is non-trivial due to some warnings being enabled in
system headers by default. The rest of the additional modules are mostly
real differences in potential warnings, reflecting incorrect behaviour
in the current scanner.
There were also a couple of differences due to `-DFOO`
`-fmodule-ignore-macro=FOO`, which I fixed here.
Since the output paths for the module depend on its context hash, we
hash the invocation before filling in outputs, and rely on the build
system to always return the same output paths for a given module.
Note: since the scanner itself uses an implicit modules build, there can
still be non-determinism, but it will now present as different
module+hashes rather than different command-lines for the same
module+hash.
Differential Revision: https://reviews.llvm.org/D129884
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.
The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.
An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.
---
Troubleshooting list to deal with any breakage seen with this patch:
1) The most likely effect one would see by this patch is a change in how
a type is printed. The type printer will, by design and default,
print types as written. There are customization options there, but
not that many, and they mainly apply to how to print a type that we
somehow failed to track how it was written. This patch fixes a
problem where we failed to distinguish between a type
that was written without any elaborated-type qualifiers,
such as a 'struct'/'class' tags and name spacifiers such as 'std::',
and one that has been stripped of any 'metadata' that identifies such,
the so called canonical types.
Example:
```
namespace foo {
struct A {};
A a;
};
```
If one were to print the type of `foo::a`, prior to this patch, this
would result in `foo::A`. This is how the type printer would have,
by default, printed the canonical type of A as well.
As soon as you add any name qualifiers to A, the type printer would
suddenly start accurately printing the type as written. This patch
will make it print it accurately even when written without
qualifiers, so we will just print `A` for the initial example, as
the user did not really write that `foo::` namespace qualifier.
2) This patch could expose a bug in some AST matcher. Matching types
is harder to get right when there is sugar involved. For example,
if you want to match a type against being a pointer to some type A,
then you have to account for getting a type that is sugar for a
pointer to A, or being a pointer to sugar to A, or both! Usually
you would get the second part wrong, and this would work for a
very simple test where you don't use any name qualifiers, but
you would discover is broken when you do. The usual fix is to
either use the matcher which strips sugar, which is annoying
to use as for example if you match an N level pointer, you have
to put N+1 such matchers in there, beginning to end and between
all those levels. But in a lot of cases, if the property you want
to match is present in the canonical type, it's easier and faster
to just match on that... This goes with what is said in 1), if
you want to match against the name of a type, and you want
the name string to be something stable, perhaps matching on
the name of the canonical type is the better choice.
3) This patch could expose a bug in how you get the source range of some
TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
which only looks at the given TypeLoc node. This patch introduces a new,
and more common TypeLoc node which contains no source locations on itself.
This is not an inovation here, and some other, more rare TypeLoc nodes could
also have this property, but if you use getLocalSourceRange on them, it's not
going to return any valid locations, because it doesn't have any. The right fix
here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
into the inner TypeLoc to get the source range if it doesn't find it on the
top level one. You can use getLocalSourceRange if you are really into
micro-optimizations and you have some outside knowledge that the TypeLocs you are
dealing with will always include some source location.
4) Exposed a bug somewhere in the use of the normal clang type class API, where you
have some type, you want to see if that type is some particular kind, you try a
`dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
ElaboratedType which has a TypeDefType inside of it, which is what you wanted to match.
Again, like 2), this would usually have been tested poorly with some simple tests with
no qualifications, and would have been broken had there been any other kind of type sugar,
be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.
5) It could be a bug in this patch perhaps.
Let me know if you need any help!
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D112374
TokenManager defines Token interfaces for the clang syntax-tree. This is the level
of abstraction that the syntax-tree should use to operate on Tokens.
It decouples the syntax-tree from a particular token implementation (TokenBuffer
previously). This enables us to use a different underlying token implementation
for the syntax Leaf node -- in clang pseudoparser, we want to produce a
syntax-tree with its own pseudo::Token rather than syntax::Token.
Differential Revision: https://reviews.llvm.org/D128411
This reverts commit 7c51f02eff because it
stills breaks the LLDB tests. This was re-landed without addressing the
issue or even agreement on how to address the issue. More details and
discussion in https://reviews.llvm.org/D112374.
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.
The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.
An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.
---
Troubleshooting list to deal with any breakage seen with this patch:
1) The most likely effect one would see by this patch is a change in how
a type is printed. The type printer will, by design and default,
print types as written. There are customization options there, but
not that many, and they mainly apply to how to print a type that we
somehow failed to track how it was written. This patch fixes a
problem where we failed to distinguish between a type
that was written without any elaborated-type qualifiers,
such as a 'struct'/'class' tags and name spacifiers such as 'std::',
and one that has been stripped of any 'metadata' that identifies such,
the so called canonical types.
Example:
```
namespace foo {
struct A {};
A a;
};
```
If one were to print the type of `foo::a`, prior to this patch, this
would result in `foo::A`. This is how the type printer would have,
by default, printed the canonical type of A as well.
As soon as you add any name qualifiers to A, the type printer would
suddenly start accurately printing the type as written. This patch
will make it print it accurately even when written without
qualifiers, so we will just print `A` for the initial example, as
the user did not really write that `foo::` namespace qualifier.
2) This patch could expose a bug in some AST matcher. Matching types
is harder to get right when there is sugar involved. For example,
if you want to match a type against being a pointer to some type A,
then you have to account for getting a type that is sugar for a
pointer to A, or being a pointer to sugar to A, or both! Usually
you would get the second part wrong, and this would work for a
very simple test where you don't use any name qualifiers, but
you would discover is broken when you do. The usual fix is to
either use the matcher which strips sugar, which is annoying
to use as for example if you match an N level pointer, you have
to put N+1 such matchers in there, beginning to end and between
all those levels. But in a lot of cases, if the property you want
to match is present in the canonical type, it's easier and faster
to just match on that... This goes with what is said in 1), if
you want to match against the name of a type, and you want
the name string to be something stable, perhaps matching on
the name of the canonical type is the better choice.
3) This patch could exposed a bug in how you get the source range of some
TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
which only looks at the given TypeLoc node. This patch introduces a new,
and more common TypeLoc node which contains no source locations on itself.
This is not an inovation here, and some other, more rare TypeLoc nodes could
also have this property, but if you use getLocalSourceRange on them, it's not
going to return any valid locations, because it doesn't have any. The right fix
here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
into the inner TypeLoc to get the source range if it doesn't find it on the
top level one. You can use getLocalSourceRange if you are really into
micro-optimizations and you have some outside knowledge that the TypeLocs you are
dealing with will always include some source location.
4) Exposed a bug somewhere in the use of the normal clang type class API, where you
have some type, you want to see if that type is some particular kind, you try a
`dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
ElaboratedType which has a TypeDefType inside of it, which is what you wanted to match.
Again, like 2), this would usually have been tested poorly with some simple tests with
no qualifications, and would have been broken had there been any other kind of type sugar,
be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.
5) It could be a bug in this patch perhaps.
Let me know if you need any help!
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D112374
Follow-up to 6626f6fec3, this fixes the handling of -MT
* If no targets are provided, we need to invent one since cc1 expects
the driver to have handled it. The default is to use -o, quoting as
necessary for a make target.
* Fix the splitting for empty string, which was incorrectly treated as
{""} instead of {}.
* Add a way to test this behaviour in clang-scan-deps.
Differential Revision: https://reviews.llvm.org/D129607
This reverts commit bdc6974f92 because it
breaks all the LLDB tests that import the std module.
import-std-module/array.TestArrayFromStdModule.py
import-std-module/deque-basic.TestDequeFromStdModule.py
import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py
import-std-module/forward_list.TestForwardListFromStdModule.py
import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py
import-std-module/list.TestListFromStdModule.py
import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py
import-std-module/queue.TestQueueFromStdModule.py
import-std-module/stack.TestStackFromStdModule.py
import-std-module/vector.TestVectorFromStdModule.py
import-std-module/vector-bool.TestVectorBoolFromStdModule.py
import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py
import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.
The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.
An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D112374
When building modules, override secondary outputs (dependency file,
dependency targets, serialized diagnostic file) in addition to the pcm
file path. This avoids inheriting per-TU command-line options that
cause non-determinism in the results (non-deterministic command-line for
the module build, non-determinism in which TU's .diag and .d files will
contain the module outputs). In clang-scan-deps we infer whether to
generate dependency or serialized diagnostic files based on an original
command-line. In a real build system this should be modeled explicitly.
Differential Revision: https://reviews.llvm.org/D129389
Dependency scanning does not care about the order of submodules for
correctness, so sort the submodules so that we get the same
command-lines to build the module across different TUs. The order of
inferred submodules can vary depending on the order of #includes in the
including TU.
Differential Revision: https://reviews.llvm.org/D128008
Disable or canonicalize compiler options that are not relevant in
explicit module builds, similar to what we already did for the modules
cache path. This reduces uninteresting differences between
command-lines, which is particularly useful if there is a tool that can
cache the compilations.
Differential Revision: https://reviews.llvm.org/D127883