mirror of https://github.com/mamba-org/mamba.git
323 lines
10 KiB
ReStructuredText
323 lines
10 KiB
ReStructuredText
Internals of mamba
|
||
==================
|
||
|
||
Mamba comes with a C++ core (for speed and efficiency), and a clean Python API on top. The core of mamba uses:
|
||
|
||
- ``libsolv`` to solve dependencies (the same library used in RedHat dnf and other Linux package managers)
|
||
- ``curl`` for reliable and fast downloads
|
||
- ``libarchive`` to extract the packages
|
||
|
||
.. _libsolv_internals:
|
||
|
||
libsolv
|
||
-------
|
||
|
||
pool
|
||
****
|
||
|
||
| The ``pool`` holds pointers of almost all the data manipulated in ``libsolv``.
|
||
|
||
Some are presented here:
|
||
|
||
- string pool ``Stringpool``
|
||
- relations of dependency ``Reldep``
|
||
- solvables/packages ``Solvable``
|
||
- repositories ``Repo``
|
||
- data about providers of a specific name or relation, as ``Id`` s and ``Offset`` s (resp. ``int`` and ``unsigned int``) for maximum performance
|
||
|
||
Sizes of allocated memory or offset to the first free slot are also stored.
|
||
|
||
.. mermaid::
|
||
:align: center
|
||
|
||
classDiagram
|
||
|
||
class Pool{
|
||
+Stringpool ss
|
||
+Reldep *rels
|
||
+int nrels
|
||
+Repo **repos
|
||
+int nrepos
|
||
+int urepos
|
||
+Repo *installed
|
||
+Solvable *solvables
|
||
+int nsolvables
|
||
+Offset *whatprovides
|
||
+Offset *whatprovides_rel
|
||
+Id *whatprovidesdata
|
||
+Offset whatprovidesdataoff
|
||
+int whatprovidesdataleft
|
||
|
||
...
|
||
}
|
||
|
||
Ids
|
||
###
|
||
|
||
Strings (package names, requirements, etc.) are converted to ``Id`` s using a hash table for efficient data manipulation (storage, usage).
|
||
|
||
- ``Id pool_str2id(Pool *pool, const char *str, int create)`` is used to get an ``Id`` from a string
|
||
|
||
- It eventually creates a fresh one if not existing
|
||
|
||
- ``const char *pool_id2str(const Pool *pool, Id id)`` is used to get back the string from ``Id``
|
||
|
||
.. warning::
|
||
Do not use ``solvable`` ``Id`` s with ``pool_id2str``, use the equivalent ``const char *pool_solvid2str(Pool *pool, Id p)``
|
||
|
||
Reldeps
|
||
#######
|
||
|
||
A ``Reldep`` describes a relation of dependency with:
|
||
|
||
- a ``name`` Id
|
||
- an epoque version ``evr`` Id
|
||
- some ``flags``: ``REL_CONDA`` in our case
|
||
|
||
.. mermaid::
|
||
:align: center
|
||
|
||
classDiagram
|
||
class Reldep{
|
||
+Id name
|
||
+Id evr
|
||
+int flags
|
||
}
|
||
|
||
An example could be ``{1, 2, 30}``, with:
|
||
|
||
- ``1`` the Id for ``"xtensor"``
|
||
- ``2`` the Id for ``">=0.20"``
|
||
- ``30`` the ``REL_CONDA`` flag
|
||
|
||
| The pool holds a ``Reldep *rels`` pointer on the first memory location and the size ``int nrels`` of allocated memory.
|
||
|
||
``Reldep`` are also handled with ``Id`` s that have bit 31 used as a flag to distinguish them from regular ``Id`` s.
|
||
|
||
There are multiple macros that help to convert those special ``Id`` s:
|
||
|
||
- ``MAKERELDEP(id)``: turn a regular ``Id`` into a ``Reldep`` ``Id``
|
||
- ``ISRELDEP(id)``: test the bit 31
|
||
- ``GETRELID(id)``: get the regular ``Id``
|
||
- ``GETRELDEP(pool, id)``: get the ``Reldep`` from its ``Id``
|
||
|
||
.. note::
|
||
``pool_id2str`` also works with ``Reldep`` ``Id`` s! But it will only returns the ``Reldep`` 's ``name``
|
||
|
||
Offsets
|
||
#######
|
||
|
||
An ``Offset`` represents a positive or negative shift of a pointer on an array.
|
||
|
||
For example, a *solvable* does not contain all its data but rather holds multiple offsets on its ``repo->idarraydata`` storage.
|
||
|
||
- ``idarraydata`` is an ``Id`` pointer
|
||
- ``provides``, ``requires``, etc. are offsets in ``idarraydata``
|
||
|
||
whatprovides
|
||
************
|
||
|
||
A ``provider`` is a *solvable* fulfilling a *specification*. The following definitions are key to disambiguate how ``libsolv`` works:
|
||
|
||
- a *package* is an identification of the resource handled: a name such as ``xtensor``
|
||
|
||
- a *solvable* is a specific version of the *package*. It can be assimilated to its tarball.
|
||
|
||
- in ``Mamba``, a *solvable* name MUST match the package name
|
||
- ``libsolv`` handles cases where *solvables* are providing different entities than what identified in their names (example: ``pynum`` providing ``numpy``)
|
||
|
||
- this is not used, but important to know to understand the terminology
|
||
|
||
- a *specification*: an expression to match *solvables* providing the same *package*
|
||
|
||
- the package name can be used to select all providers/*solvables*
|
||
- a more specific *spec* like ``s2="xtensor>=0.20"`` will only match a subset of *solvables*: the ones that have version >=0.20 (whatever the build string)
|
||
|
||
Let's take a simple example to recap:
|
||
|
||
- the package: ``xtensor``
|
||
- solvable(s):
|
||
|
||
- ``xtensor=0.20.10=hc9558a2_0``
|
||
- ``xtensor=0.23.10=h4bd325d_0``
|
||
- etc.
|
||
|
||
- specification(s):
|
||
|
||
- ``xtensor``
|
||
- ``xtensor>=0.20``
|
||
- etc.
|
||
|
||
.. note::
|
||
It's possible that a *package* is not provided by any *solvable*. It is then uninstallable.
|
||
|
||
.. mermaid::
|
||
:align: center
|
||
|
||
%%{init: {'themeVariables':{'edgeLabelBackground':'white'}}}%%
|
||
graph TD
|
||
subgraph " "
|
||
spec1((s1)) -.-> solvable1:::solvable
|
||
spec1 -.-> solvable2:::solvable
|
||
...:::empty
|
||
spec1 -.-> solvableN:::solvable
|
||
|
||
spec2((s2)) -.-> solvableN:::solvable
|
||
end
|
||
classDef solvable fill:#f96;
|
||
classDef empty fill:#ffffff00,stroke:#ffffff00;
|
||
|
||
.. note::
|
||
Specifications are stored as ``Id`` s (see the previous section)
|
||
|
||
.. warning::
|
||
While a ``solvable`` is a ``libsolv`` notion, ``specs`` and ``packages`` are not.
|
||
|
||
|
||
Storage
|
||
#######
|
||
|
||
| The free function ``void pool_createwhatprovides(Pool *pool)`` is used to create hashes over pool of solvables to ease provide lookups.
|
||
|
||
It computes and store what *solvables* provide each *spec*, using a two-step indirect list:
|
||
|
||
- from the spec ``Id``, an offset is computed in ``Id *whatprovidesdata``
|
||
|
||
- ``Offset *whatprovides`` stores regular string specs
|
||
- ``Offset *whatprovides_rel`` stores ``Reldep`` specs
|
||
|
||
- ``whatprovidesdata`` at the given offset is a 0-terminated list of *solvables* ``Id`` s, providing the spec ``Id``
|
||
|
||
.. image:: whatprovides.svg
|
||
:height: 300
|
||
:align: center
|
||
|
||
|
||
Lookups
|
||
#######
|
||
|
||
The ``pool_whatprovides(Pool *pool, Id d)`` function returns the offset of the first solvable ``Id`` in ``whatprovidesdata``:
|
||
|
||
.. mermaid::
|
||
:align: center
|
||
|
||
graph LR
|
||
A{"ISRELDEP(d)?"} -->|Yes| B1["whatprovides[d]"];
|
||
A -->|No| B2["whatprovides_rel[GETRELID(d)]"];
|
||
B1 --> C{0?};
|
||
B2 --> C;
|
||
C -->|Yes| D1["pool_addrelproviders[d]"];
|
||
C -->|No| D2((return));
|
||
D1 --> D2;
|
||
|
||
|
||
Rules
|
||
*****
|
||
|
||
A *rule* is all about *solvables*, it represents a logical disjunction ``OR`` between one or more literals.
|
||
|
||
Rules are created to translate in mathematical logic:
|
||
|
||
- a specification: installation/removal/updates
|
||
|
||
- ``(A)`` means ``A`` must be installed
|
||
- ``(-A)`` means ``A`` must be removed or kept uninstalled
|
||
- ``(A|B1|B2|...)`` means ``A`` could be updated with ``B1``, ``B2``, etc.
|
||
|
||
- a dependency: ``A`` needs/requires ``b`` (a *spec*)
|
||
|
||
- ``(-A|B1|B2|...)`` means ``A`` requires one of ``B1``, ``B2``, etc. (``b`` providers)
|
||
|
||
- an incompatibility: ``A1`` can't be installed alongside ``A2``
|
||
|
||
- ``(-A1|-A2), (-A1|-A3), ...`` means ``A1`` is not compatible with ``A2``, ``A3``, etc.
|
||
- this is a common case: multiple providers of the same package
|
||
|
||
- etc.
|
||
|
||
Still for efficiency, *rules* are storing ``Id`` s of *solvables* and *specs*:
|
||
|
||
- ``p`` is the package ``Id`` of ``A``
|
||
- ``d`` is the package ``Id`` offset into the list of providers (negative value means the rule is disabled)
|
||
- ``w1`` and ``w2`` are watches
|
||
- ``n1`` and ``n2`` are the next rules in linked-lists, corresponding to ``w1`` and ``w2``
|
||
|
||
.. mermaid::
|
||
:align: center
|
||
|
||
classDiagram
|
||
class Rule{
|
||
+Id p
|
||
+Id d
|
||
+Id w1
|
||
+Id w2
|
||
+Id n1
|
||
+Id n2
|
||
}
|
||
|
||
|
||
Dependencies
|
||
############
|
||
|
||
Each dependency is turned into a rule to be satisfied during the solving stage.
|
||
|
||
Example:
|
||
|
||
- ``p1`` and ``p2`` are 2 specs
|
||
- ``p1`` is provided by a single solvable ``s11``
|
||
- ``p2`` is provided by ``s21`` and ``s22``
|
||
|
||
.. mermaid::
|
||
:align: center
|
||
|
||
%%{init: {'themeVariables':{'edgeLabelBackground':'white'}}}%%
|
||
graph LR
|
||
subgraph p1 providers[ ]
|
||
p1((p1)):::package -.-> s11:::solvable
|
||
end
|
||
s11:::solvable --> p2((p2)):::package
|
||
subgraph p2 providers[ ]
|
||
p2((p2)):::package -.-> s21:::solvable
|
||
p2((p2)):::package -.-> s22:::solvable
|
||
end
|
||
|
||
classDef solvable fill:#f96;
|
||
|
||
The corresponding rule ``r`` is ``-s11|s21|s22``.
|
||
|
||
That means that taking the decision to install ``s11``:
|
||
|
||
- ``-s11`` is not satisfied
|
||
- either ``s21`` or ``s22`` need to be satisfied
|
||
|
||
.. note::
|
||
Exclusive rules are used to avoid installation of multiple solvables providing the same package
|
||
|
||
|
||
Watches
|
||
#######
|
||
|
||
| Watches are a way to link rules. They are triggered during a phase called ``propagation`` after decisions taken on previous rules for some solvable.
|
||
|
||
The possible decisions on solvables are ``installation`` or ``removal``/``conflict``, this is stored as resp. positive and negative Ids.
|
||
|
||
Related rules are then evaluated during another level of decision: those are the one with an opposite first literal.
|
||
|
||
Example:
|
||
|
||
- the install spec is to install ``p1`` provided by a single solvable ``s1``: ``r1=(s1)``
|
||
- ``s1`` depends on spec ``p2``, provided by ``s21`` or ``s22``: ``r2=(-s1|s21|s22)``
|
||
- decision to install ``s1`` triggers rule ``r2``
|
||
|
||
Transaction
|
||
***********
|
||
|
||
Another important part of libsolv is the ``Transaction``. A transaction governs what packages are installed or removed, and a transaction is the result of a successful solve process.
|
||
|
||
A transaction in libsolv is a single list (Queue) of Solvable ``Id's`` and is thus rather simple. The Queue contains either a positive or negative ``Id``. Each negative ``Id`` is ``uninstalled`` from the environment, and each positive ``Id`` is to be ``installed``. libsolv classifies the entire range of Id's into different types of transaction operations. For example, if we have ``{ -5, 5 }`` that would be a reinstall transaction for the Solvable with ``Id == 5``.
|
||
If the ``Id's`` are different then it can be a downgrade or upgrade operation (first, the previous package needs removal before the higher or lower version can be installed). The ``transaction_classify`` and ``transaction_classify_pkgs`` functions of libsolv take care of this classification to present a nice output to the user.
|
||
|
||
Another crucial libsolv function is ``transaction_order`` to order the transaction in a way that they are installed with the lowest dependency first (topological sort). This ensures that e.g. ``python`` is installed before any packages depending on ``python`` as they are sometimes needed during installation (for example for ``noarch`` packages with ``entry_points``).
|
||
|
||
Lastly, we can force installation or explicitly install from URL's by crafting transactions without using the solver – just by adding the correct ``Id's`` into the Transaction queue.
|