Summary:
And add tests for the comment extraction code.
clangd will now show non-doxygen comments in completion for results
coming from Sema and Dynamic index.
Static index does not include the comments yet, I will enable it in
a separate commit after investigating which implications it has for
the size of the index.
Reviewers: sammccall, hokein, ioeric
Reviewed By: sammccall
Subscribers: klimek, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46002
llvm-svn: 332460
Summary:
Previous implementation used to extract brief text from doxygen comments.
Brief text parsing slows down completion and is not suited for
non-doxygen comments.
This commit switches to providing comments that mimic the ones
originally written in the source code, doing minimal reindenting and
removing the comments markers to make the output more user-friendly.
It means we lose support for doxygen-specific features, e.g. extracting
brief text, but provide useful results for non-doxygen comments.
Switching the doxygen support back is an option, but I suggest to see
whether the current approach gives more useful results.
Reviewers: sammccall, hokein, ioeric
Reviewed By: sammccall
Subscribers: klimek, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D45999
llvm-svn: 332459
Summary:
Code completion scoring was embedded in CodeComplete.cpp, which is bad:
- awkward to test. The mechanisms (extracting info from index/sema) can be
unit-tested well, the policy (scoring) should be quantitatively measured.
Neither was easily possible, and debugging was hard.
The intermediate signal struct makes this easier.
- hard to reuse. This is a bug in workspaceSymbols: it just presents the
results in the index order, which is not sorted in practice, it needs to rank
them!
Also, index implementations care about scoring (both query-dependent and
independent) in order to truncate result lists appropriately.
The main yak shaved here is the build() function that had 3 variants across
unit tests is unified in TestTU.h (rather than adding a 4th variant).
Reviewers: ilya-biryukov
Subscribers: klimek, mgorny, ioeric, MaskRay, jkorous, mgrang, cfe-commits
Differential Revision: https://reviews.llvm.org/D46524
llvm-svn: 332378
Summary:
o Remove IncludeInsertion LSP command.
o Populate include insertion edits synchromously in completion items.
o Share the code completion compiler instance and precompiled preamble to get existing inclusions in main file.
o Include insertion logic lives only in CodeComplete now.
o Use tooling::HeaderIncludes for inserting new includes.
o Refactored tests.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46497
llvm-svn: 332363
Summary:
clangd will populate #include insertions as addtionalEdits in completion items.
The code completion tests in ClangdServerTest will be added back in D46497.
Reviewers: ilya-biryukov, sammccall
Reviewed By: ilya-biryukov
Subscribers: klimek, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46676
llvm-svn: 332362
Summary:
We used to query the index when completing after class qualifiers,
e.g. 'ClassName::^'. We should not do that for the same reasons we
don't query the index for member access expressions.
Reviewers: sammccall, ioeric
Reviewed By: sammccall
Subscribers: klimek, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46795
llvm-svn: 332226
Summary:
The Language Server Protocol unfortunately mandates that locations in files
be represented by line/column pairs, where the "column" is actually an index
into the UTF-16-encoded text of the line.
(This is because VSCode is written in JavaScript, which is UTF-16-native).
Internally clangd treats source files at UTF-8, the One True Encoding, and
generally deals with byte offsets (though there are exceptions).
Before this patch, conversions between offsets and LSP Position pretended
that Position.character was UTF-8 bytes, which is only true for ASCII lines.
Now we examine the text to convert correctly (but don't actually need to
transcode it, due to some nice details of the encodings).
The updated functions in SourceCode are the blessed way to interact with
the Position.character field, and anything else is likely to be wrong.
So I also updated the other accesses:
- CodeComplete needs a "clang-style" line/column, with column in utf-8 bytes.
This is now converted via Position -> offset -> clang line/column
(a new function is added to SourceCode.h for the second conversion).
- getBeginningOfIdentifier skipped backwards in UTF-16 space, which is will
behave badly when it splits a surrogate pair. Skipping backwards in UTF-8
coordinates gives the lexer a fighting chance of getting this right.
While here, I clarified(?) the logic comments, fixed a bug with identifiers
containing digits, simplified the signature slightly and added a test.
This seems likely to cause problems with editors that have the same bug, and
treat the protocol as if columns are UTF-8 bytes. But we can find and fix those.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46035
llvm-svn: 331029
Summary:
When parser backtracks, we might receive multiple code completion
callbacks.
Previously we had a failing assertion there, now we take first results
and hope they are good enough.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44567
llvm-svn: 327717
Summary:
Found by asan. Fiddling with code completion AST after
FrontendAction::Exceute can lead to errors.
Calling the callback in ProcessCodeCompleteResults to make sure we
don't access uninitialized state.
This particular issue comes from the fact that Sema::TUScope is
deleted when destructor of ~Parser runs, but still present in
Sema::TUScope and accessed when building completion items.
I'm still struggling to come up with a small repro. The relevant
stackframes reported by asan are:
ERROR: AddressSanitizer: heap-use-after-free on address
READ of size 8 at 0x61400020d090 thread T175
#0 0x5632dff7821b in llvm::SmallPtrSetImplBase::isSmall() const include/llvm/ADT/SmallPtrSet.h:195:33
#1 0x5632e0335901 in llvm::SmallPtrSetImplBase::insert_imp(void const*) include/llvm/ADT/SmallPtrSet.h:127:9
#2 0x5632e067347d in llvm::SmallPtrSetImpl<clang::Decl*>::insert(clang::Decl*) include/llvm/ADT/SmallPtrSet.h:372:14
#3 0x5632e065df80 in clang::Scope::AddDecl(clang::Decl*) tools/clang/include/clang/Sema/Scope.h:287:18
#4 0x5632e0623eea in clang::ASTReader::pushExternalDeclIntoScope(clang::NamedDecl*, clang::DeclarationName) clang/lib/Serialization/ASTReader.cpp
#5 0x5632e062ce74 in clang::ASTReader::finishPendingActions() tools/clang/lib/Serialization/ASTReader.cpp:9164:9
....
#30 0x5632e02009c4 in clang::index::generateUSRForDecl(clang::Decl const*, llvm::SmallVectorImpl<char>&) tools/clang/lib/Index/USRGeneration.cpp:1037:6
#31 0x5632dff73eab in clang::clangd::(anonymous namespace)::getSymbolID(clang::CodeCompletionResult const&) tools/clang/tools/extra/clangd/CodeComplete.cpp:326:20
#32 0x5632dff6fe91 in clang::clangd::CodeCompleteFlow::mergeResults(std::vector<clang::CodeCompletionResult, std::allocator<clang::CodeCompletionResult> > const&, clang::clangd::SymbolSlab const&)::'lambda'(clang::CodeCompletionResult const&)::operator()(clang::CodeCompletionResult const&) tools/clang/tools/extra/clangd/CodeComplete.cpp:938:24
#33 0x5632dff6e426 in clang::clangd::CodeCompleteFlow::mergeResults(std::vector<clang::CodeCompletionResult, std::allocator<clang::CodeCompletionResult> > const&, clang::clangd::SymbolSlab const&) third_party/llvm/llvm/tools/clang/tools/extra/clangd/CodeComplete.cpp:949:38
#34 0x5632dff7a34d in clang::clangd::CodeCompleteFlow::runWithSema() llvm/tools/clang/tools/extra/clangd/CodeComplete.cpp:894:16
#35 0x5632dff6df6a in clang::clangd::CodeCompleteFlow::run(clang::clangd::(anonymous namespace)::SemaCompleteInput const&) &&::'lambda'()::operator()() const third_party/llvm/llvm/tools/clang/tools/extra/clangd/CodeComplete.cpp:858:35
#36 0x5632dff6cd42 in clang::clangd::(anonymous namespace)::semaCodeComplete(std::unique_ptr<clang::CodeCompleteConsumer, std::default_delete<clang::CodeCompleteConsumer> >, clang::CodeCompleteOptions const&, clang::clangd::(anonymous namespace)::SemaCompleteInput const&, llvm::function_ref<void ()>) tools/clang/tools/extra/clangd/CodeComplete.cpp:735:5
0x61400020d090 is located 80 bytes inside of 432-byte region [0x61400020d040,0x61400020d1f0)
freed by thread T175 here:
#0 0x5632df74e115 in operator delete(void*, unsigned long) projects/compiler-rt/lib/asan/asan_new_delete.cc:161:3
#1 0x5632e0b06973 in clang::Parser::~Parser() tools/clang/lib/Parse/Parser.cpp:410:3
#2 0x5632e0b06ddd in clang::Parser::~Parser() clang/lib/Parse/Parser.cpp:408:19
#3 0x5632e0b03286 in std::unique_ptr<clang::Parser, std::default_delete<clang::Parser> >::~unique_ptr() .../bits/unique_ptr.h:236:4
#4 0x5632e0b021c4 in clang::ParseAST(clang::Sema&, bool, bool) tools/clang/lib/Parse/ParseAST.cpp:182:1
#5 0x5632e0726544 in clang::FrontendAction::Execute() tools/clang/lib/Frontend/FrontendAction.cpp:904:8
#6 0x5632dff6cd05 in clang::clangd::(anonymous namespace)::semaCodeComplete(std::unique_ptr<clang::CodeCompleteConsumer, std::default_delete<clang::CodeCompleteConsumer> >, clang::CodeCompleteOptions const&, clang::clangd::(anonymous namespace)::SemaCompleteInput const&, llvm::function_ref<void ()>) tools/clang/tools/extra/clangd/CodeComplete.cpp:728:15
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, jkorous-apple, cfe-commits, ioeric
Differential Revision: https://reviews.llvm.org/D44000
llvm-svn: 326569
Summary:
Changes:
o Store both the original header and the canonical header in LSP command.
o Also check that both original and canonical headers are not already included
by comparing both resolved header path and written literal includes.
This addresses the use case where private IWYU pragma is defined in a private
header while it would still be preferrable to include the private header, in the
internal implementation file. If we have seen that the priviate header is already
included, we don't try to insert the canonical include.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43510
llvm-svn: 326070
Summary:
We should set the flag before creating ComplierInstance -- when
CopmilerInstance gets initialized, it also initializes the DiagnosticsEngine
using the DiagnosticOptions.
This was hidden deeply -- as clang suppresses all diagnostics when we
hit the code-completion (but internally it does do unnecessary analysis stuff).
As a bonus point, this fix will optmize the completion speed -- clang won't do
any analysis (e.g. -Wunreachable-code, -Wthread-safety-analysisi) at all internally.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: klimek, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D43569
llvm-svn: 325779
Summary:
The new behaviors introduced by this patch:
o When include collection is enabled, we always set IncludeHeader field in Symbol
even if it's the same as FileURI in decl.
o Disable include collection in FileIndex which is currently only used to build
dynamic index. We should revisit when we actually want to use FileIndex to global
index.
o Code-completion only uses IncludeHeader to insert headers but not FileURI in
CanonicalDeclaration. This ensures that inserted headers are always canonicalized.
Note that include insertion can still be triggered for symbols that are already
included if they are merged from dynamic index and static index, but we would
only use includes that are already canonicalized (e.g. from static index).
Reason for change:
Collecting header includes in dynamic index enables inserting includes for headers
that are not indexed but opened in the editor. Comparing to inserting includes for
symbols in global/static index, this is nice-to-have but would probably require
non-trivial amount of work to get right. For example:
o Currently it's not easy to fully support CanonicalIncludes in dynamic index, given the way
we run dynamic index.
o It's also harder to reason about the correctness of include canonicalization for dynamic index
(i.e. symbols in the current file/TU) than static index where symbols are collected
offline and sanity check is possible before shipping to production.
o We have less control/flexibility over symbol info in the dynamic index
(e.g. URIs, path normalization), which could be used to help make decision when inserting includes.
As header collection (especially canonicalization) is relatively new, and enabling
it for dynamic index would immediately affect current users with only dynamic
index support, I propose we disable it for dynamic index for now to avoid
compromising other hot features like code completion and only support it for
static index where include insertion would likely to bring more value.
Reviewers: ilya-biryukov, sammccall, hokein
Subscribers: klimek, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43550
llvm-svn: 325764
Summary:
o Collect suitable #include paths for index symbols. This also does smart mapping
for STL symbols and IWYU pragma (code borrowed from include-fixer).
o For global code completion, add a command for inserting new #include in each code
completion item.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, mgorny, ilya-biryukov, jkorous-apple, hintonda, cfe-commits
Differential Revision: https://reviews.llvm.org/D42640
llvm-svn: 325343
Summary:
This can happen if the CompileCommand provided by compilation database
is malformed.
Reviewers: hokein, ioeric, sammccall
Reviewed By: sammccall
Subscribers: klimek, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43122
llvm-svn: 324732
Summary:
Instead of passing Context explicitly around, we now have a thread-local
Context object `Context::current()` which is an implicit argument to
every function.
Most manipulation of this should use the WithContextValue helper, which
augments the current Context to add a single KV pair, and restores the
old context on destruction.
Advantages are:
- less boilerplate in functions that just propagate contexts
- reading most code doesn't require understanding context at all, and
using context as values in fewer places still
- fewer options to pass the "wrong" context when it changes within a
scope (e.g. when using Span)
- contexts pass through interfaces we can't modify, such as VFS
- propagating contexts across threads was slightly tricky (e.g.
copy vs move, no move-init in lambdas), and is now encapsulated in
the threadpool
Disadvantages are all the usual TLS stuff - hidden magic, and
potential for higher memory usage on threads that don't use the
context. (In practice, it's just one pointer)
Reviewers: ilya-biryukov
Subscribers: klimek, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D42517
llvm-svn: 323872
Summary:
* truncate symbols from static/dynamic index to the limited number
(which would save lots of cost in constructing the merged symbols).
* add an CLI option allowing to limit the number of returned completion results.
(default to 100)
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D42484
llvm-svn: 323408
Summary:
* For qualified completion (foo::a^)
* unresolved qualifier - use global namespace ("::")
* resolved qualifier - use all accessible namespaces inside the resolved qualifier.
* For unqualified completion (vec^), use scopes that are accessible from the
scope from which code completion occurs.
Reviewers: sammccall, ilya-biryukov
Reviewed By: sammccall
Subscribers: jkorous-apple, ioeric, klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D42073
llvm-svn: 323189
Global scope is "" (was "")
Top-level namespace scope is "ns::" (was "ns")
Nested namespace scope is "ns::ns::" (was "ns::ns")
This composes more naturally:
- qname = scope + name
- full scope = resolved scope + unresolved scope (D42073 was the trigger)
It removes a wart from the old way: "foo::" has one more separator than "".
Another alternative that has these properties is "::ns", but that lacks
the property that both the scope and the name are substrings of the
qname as produced by clang.
This is re-landing r322996 which didn't build.
llvm-svn: 323000
Global scope is "" (was "")
Top-level namespace scope is "ns::" (was "ns")
Nested namespace scope is "ns::ns::" (was "ns::ns")
This composes more naturally:
- qname = scope + name
- full scope = resolved scope + unresolved scope (D42073 was the trigger)
It removes a wart from the old way: "foo::" has one more separator than "".
Another alternative that has these properties is "::ns", but that lacks
the property that both the scope and the name are substrings of the
qname as produced by clang.
llvm-svn: 322996
Summary:
- we match on USR, and do a field-by-field merge if both have results
- scoring is post-merge, with both sets of information available
(for now, sema priority is used if available, static score for index results)
- limit is applied to the complete result set (previously index ignored limit)
- CompletionItem is only produces for the returned results
- If the user doesn't type a scope, we send the global scope for completion
(we can improve this after D42073)
Reviewers: ioeric
Subscribers: klimek, ilya-biryukov, mgrang, cfe-commits
Differential Revision: https://reviews.llvm.org/D42181
llvm-svn: 322945
Summary:
This improves performance of code completion, because we avoid stating
the files from the preamble and never attempt to parse the files
without using the preamble if it's provided.
However, the change comes at a cost of sometimes providing incorrect
results when doing code completion after making actually considerable
changes to the files used in the preamble or the preamble itself.
Eventually the preamble will get rebuilt and code completion will
be providing the correct results.
Reviewers: bkramer, sammccall, jkorous-apple
Reviewed By: sammccall
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D41991
llvm-svn: 322854
Summary:
We now hide the static/dynamic split from the code completion, behind a
new implementation of the SymbolIndex interface. This will reduce the
complexity of the sema/index merging that needs to be done by
CodeComplete, at a fairly small cost in flexibility.
Reviewers: hokein
Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D42049
llvm-svn: 322480
Summary:
To stay fast, it avoids deserializing anything outside the current file, by
disabling the LoadExternal code completion option added in r322377, when the
index is enabled.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41996
llvm-svn: 322387
Summary:
Use the YAML-format symbols (generated by the global-symbol-builder tool) to
do the global code completion.
It is **experimental** only , but it allows us to experience global code
completion on a relatively small project.
Tested with LLVM project.
Reviewers: sammccall, ioeric
Reviewed By: sammccall, ioeric
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41668
llvm-svn: 322191
Summary:
This patch removes hidden items from code completion.
Items can be hidden, e.g., by other items in the child scopes.
This patch addresses a particular problem of a duplicate completion
item for the class in the following example:
struct Adapter { void method(); };
void Adapter::method() {
Adapter^
}
We should probably investigate if there are other duplicates in
completion and remove them, possibly adding assertions that it never
happens.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D41901
llvm-svn: 322185
Summary:
Common parts are mostly FS related, so pulled out TestFS.h for the common stuff.
Deliberately resisted cleaning up much here, so this is pretty mechanical.
Reviewers: hokein
Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D40784
llvm-svn: 319741
Summary: Shared details of ClangdUnit and CodeComplete moved to a new Compiler file.
Reviewers: ilya-biryukov
Subscribers: klimek, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D40719
llvm-svn: 319655