Commit Graph

3 Commits

Author SHA1 Message Date
Kirill Bobyrev 870aaf2963 [clangd] DexIndex implementation prototype
This patch is a proof-of-concept Dex index implementation. It has
several flaws, which don't allow replacing static MemIndex yet, such as:

* Not being able to handle queries of small size (less than 3 symbols);
  a way to solve this is generating trigrams of smaller size and having
  such incomplete trigrams in the index structure.
* Speed measurements: while manually editing files in Vim and requesting
  autocompletion gives an impression that the performance is at least
  comparable with the current static index, having actual numbers is
  important because we don't want to hurt the users and roll out slow
  code. Eric (@ioeric) suggested that we should only replace MemIndex as
  soon as we have the evidence that this is not a regression in terms of
  performance. An approach which is likely to be successful here is to
  wait until we have benchmark library in the LLVM core repository, which
  is something I have suggested in the LLVM mailing lists, received
  positive feedback on and started working on. I will add a dependency as
  soon as the suggested patch is out for a review (currently there's at
  least one complication which is being addressed by
  https://github.com/google/benchmark/pull/649). Key performance
  improvements for iterators are sorting by cost and the limit iterator.
* Quality measurements: currently, boosting iterator and two-phase
  lookup stage are not implemented, without these the quality is likely to
  be worse than the current implementation can yield. Measuring quality is
  tricky, but another suggestion in the offline discussion was that the
  drop-in replacement should only happen after Boosting iterators
  implementation (and subsequent query enhancement).

The proposed changes do not affect Clangd functionality or performance,
`DexIndex` is only used in unit tests and not in production code.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50337

llvm-svn: 340175
2018-08-20 14:39:32 +00:00
Kirill Bobyrev 8e35f1e7cb NFC: Enforce good formatting across multiple clang-tools-extra files
This patch improves readability of multiple files in clang-tools-extra
and enforces LLVM Coding Guidelines.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50707

llvm-svn: 339687
2018-08-14 16:03:32 +00:00
Kirill Bobyrev 5e82f05e7a [clangd] Introduce Dex symbol index search tokens
This patch introduces the core building block of the next-generation
Clangd symbol index - Dex. Search tokens are the keys in the inverted
index and represent a characteristic of a specific symbol: examples of
search token types (Token Namespaces) are

* Trigrams -  these are essential for unqualified symbol name fuzzy
search * Scopes for filtering the symbols by the namespace * Paths, e.g.
these can be used to uprank symbols defined close to the edited file

This patch outlines the generic for such token namespaces, but only
implements trigram generation.

The intuition behind trigram generation algorithm is that each extracted
trigram is a valid sequence for Fuzzy Matcher jumps, proposed
implementation utilize existing FuzzyMatcher API for segmentation and
trigram extraction.

However, trigrams generation algorithm for the query string is different
from the previous one: it simply yields sequences of 3 consecutive
lowercased valid characters (letters, digits).

Dex RFC in the mailing list:
http://lists.llvm.org/pipermail/clangd-dev/2018-July/000022.html

The trigram generation techniques are described in detail in the
proposal:
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit#heading=h.903u1zon9nkj

Reviewers: sammccall, ioeric, ilya-biryukovA

Subscribers: cfe-commits, klimek, mgorny, MaskRay, jkorous, arphaman

Differential Revision: https://reviews.llvm.org/D49591

llvm-svn: 337901
2018-07-25 10:34:57 +00:00