Multiarch Specification
=======================
Status: implemented, stable
This specification is considered to be the canonical reference for multiarch, but in case of discrepancies between this and the current implementation in dpkg, the latter should be considered the expected behavior, unless it can
be argued that it is suboptimal and it can be easily changed.
Those discrepancies might come about because this document was rewritten
from scratch after the fact.
[
TODO: Check whether anything is still missing and worth adding from:
- https://wiki.debian.org/Multiarch/Tuples
- https://wiki.debian.org/Multiarch/MissingRationale
- https://wiki.debian.org/Teams/Dpkg/TimeTravelFixes
]
Background
----------
Make it possible to install packages for different architectures, with support from the package manager. Make it possible to cross-build packages for different architectures easily.
There have been at least three previous ways to handle these needs, all of which were rather unsatisfactory:
* Installing foreign packages with «dpkg --force-architecture».
This made it possible to install foreign packages, but was of very limited
use, as architecture-related dependency relationships were nonexistent,
and it did not allow expressing most of the more complex relationships.
* Using the multilib layout.
This is the layout supported by many other distributions, to make it
possible to install packages for the alternative runnable ABI for a
specific architecture. But it has the fatal problem of not being a
generalized approach, having inconsistent and confusing semantics for
the multilib directories and requiring the set of alternative ABIs
supported for each main architecture to be hardcoded. It installs into paths such
as /usr/lib, /usr/lib32, /usr/lib64, where /usr/lib might or might not be
the native architecture.
* Using the sysroot layout.
This is a more general solution than multilib, but it requires a
pseudo-chroot equivalent for each architecture. It also pollutes the
filesystem namespace as it installs into paths such as /usr/<sysroot>.
To be able to install packages from another architecture, we need to make
it possible for the package managers to tell what is and what is not allowed, so that the dependency system does not get broken.
One recurring theme in the design of this specification was to allow for incremental adoption (no flag days required), and to not break previous satisfiability assumptions. New dependency types should be allowed, but dependencies that were previously allowed should not stop working.
This would require changes in packaging, both in the filesystem layout
to make co-installability possible, and in the metadata to annotate the packages and their dependencies depending on the interfaces provided.
Architecture type concepts
--------------------------
There are several important architecture types to take into consideration with multiarch. We have the following different types:
* <native>: The architecture the package manager (dpkg) has been built for.
This architecture can change by way of cross-grading dpkg itself.
«dpkg --print-architecture»
* <foreign>: This is a non-<native> architecture.
«dpkg --print-foreign-architectures»
* package architecture: The architecture of a package, which can be entirely
different to the <native> architecture. From within maintainer scripts
it can be fetched from the DPKG_MAINTSCRIPT_ARCH environment variable,
and otherwise with «dpkg-deb -f <pkg>.deb Architecture».
* dependency architecture: The architecture of the package in a dependency.
Described in § "Dependency architecture inference".
* <build-arch>: The architecture the package is built on, which should
match <native>. Relevant when building packages.
«dpkg-architecture -qDEB_BUILD_ARCH»
* <host-arch>: The architecture the package is built for, determined
explicitly from user input, or from the architecture the compiler
generates code for. Relevant when building packages.
«dpkg-architecture -qDEB_HOST_ARCH»
* <target-arch>: The architecture the compiler being built will build for,
determined explicitly from user input, or otherwise <host-arch>.
«dpkg-architecture -qDEB_TARGET_ARCH»
Multiarch Tuples
----------------
The multiarch tuples are architecture strings that describe each different architecture ABI. These are based on the GNU tuples, except that they get normalized to their base form, ignoring any ISA specialization.
These are used as part of the filesystem layout to be able to co-install packages that would otherwise have conflicting pathnames with different contents.
### Rationale
These tuples were introduced to get constant values. GNU tuples are not constant, at least for the i386 dpkg architecture, where the CPU part of the GNU tuple has been bumped each time the baseline ISA was bumped.
### Examples
This value can be fetched with «dpkg-architecture -qDEB_<type>_MULTIARCH».
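As an illustration of the normalization described above, here is a minimal sketch in Python. The function name `multiarch_tuple` is hypothetical, and only the 32-bit x86 family is handled; the authoritative mapping lives in dpkg's architecture tables, not in this snippet. It relies on the fact that Debian's i386 architecture uses the multiarch tuple i386-linux-gnu while its GNU tuple is i686-linux-gnu.

```python
import re


def multiarch_tuple(gnu_tuple: str) -> str:
    """Normalize the CPU part of a GNU tuple to its base form.

    Illustrative only: handles the i?86 family and passes every
    other tuple through unchanged.
    """
    cpu, sep, rest = gnu_tuple.partition("-")
    # i486, i586 and i686 are ISA baselines of the same ABI, so they
    # all collapse to the constant base form i386.
    if re.fullmatch(r"i[3-6]86", cpu):
        cpu = "i386"
    return cpu + sep + rest


print(multiarch_tuple("i686-linux-gnu"))    # i386-linux-gnu
print(multiarch_tuple("x86_64-linux-gnu"))  # x86_64-linux-gnu
```

The point of the sketch is only that the result stays constant when the baseline ISA of the GNU tuple changes.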
Filesystem Layout
-----------------
The multiarch design is based on the concept that some kinds of packages can be co-installed. But these same packages would contain architecture-dependent content that was previously exposed on the same pathname across architectures.
These architecture-dependent pathnames get relocated, as part of the packaging, into multiarch tuple qualified pathnames. So if a shared library used to be located at «<libdir>/libfoo.so.10», it would now be located at «<libdir>/<multiarch-tuple>/libfoo.so.10».
For pathnames that provide the same content independently of the architecture used to build and use them, the same pathname can still be used, as the package manager will refcount them, as long as their digests match.
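The relocation can be sketched as a simple path computation. This is a hedged illustration, not how any packaging tool implements it; the helper name `multiarch_path` is an assumption made for this example.

```python
import posixpath


def multiarch_path(libdir: str, multiarch_tuple: str, filename: str) -> str:
    """Return the tuple-qualified pathname for an arch-dependent file."""
    return posixpath.join(libdir, multiarch_tuple, filename)


# The same library built for two architectures no longer collides.
amd64 = multiarch_path("/usr/lib", "x86_64-linux-gnu", "libfoo.so.10")
i386 = multiarch_path("/usr/lib", "i386-linux-gnu", "libfoo.so.10")
print(amd64)  # /usr/lib/x86_64-linux-gnu/libfoo.so.10
print(i386)   # /usr/lib/i386-linux-gnu/libfoo.so.10
```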
### Rationale
* Allows installing packages for multiple architectures.
* It is a uniform namespace.
* It is not limited to sibling or related architectures only diverging
in bitness or ABI, as multilib is.
Package Interfaces
------------------
A key concept in multiarch is the interfaces a package provides. This limits how a package can be used by other packages, and when it can be installed.
There is an important distinction here between the interface being architecture
independent, and the interface being runnable from some architecture. Runnability is not of great concern when it comes to the metadata annotations in packages and dependencies. It is mainly of concern for the users and frontends installing packages. Runnability is also a property that is not made available simply by the current hardware architecture, as using an emulator can make an interface runnable.
Interfaces here can be both passive (mostly files
and their pathnames) and active (shared libraries, programs, etc.).
For passive ones, the pathnames should not be arch-qualified, because then locating them requires arch-specific knowledge. File formats should either
be arch-independent, or should make it possible to describe all the possible different encodings within the format itself, such as endianness, bitness, etc. But whatever generates such files should select a single set of encodings and always generate the same output.
Within active ones, there are two main sub-types, runnable and linkable.
The common examples for these are programs (binaries or scripts) that one runs, and shared libraries or architecture-specific modules or plugins
that one loads and links against. Runnable interfaces might be either arch-dependent or independent depending on whether their output varies per-architecture. It does not matter whether those runnable interfaces
are implemented in apparently arch-independent scripting languages for example, as those can still be arch-dependent. Linkable interfaces are
always arch-dependent, as they are required to match the ABIs.
Control fields
--------------
### The Multi-Arch field
This field allows satisfying dependencies between packages of
different architectures (beyond Architecture: all), and co-installing
a package with the same name but different architecture.
The permitted values are:
* “no”
This value is equivalent to the current default, that being the omission
of the field.
The interfaces provided by this package are unknown. This means the
package has either not yet been made multiarch aware or, in some rare
situations, that none of the other values currently fit and the package
has been marked explicitly as having been evaluated.
* “same”
This package is co-installable with itself (other architecture instances),
but it must not be used to satisfy the dependency of any package of a
different architecture from itself.
The main purpose of this value is to mark packages that provide
architecture-dependent linkable interfaces. In special circumstances it
can also be used to provide runnable interfaces where each program or
script filename is arch-qualified.
* “foreign”
The package is not co-installable with itself, but should be allowed
to satisfy the dependencies of a package of a different architecture
from itself.
The main purpose of this value is to mark packages that provide
architecture-independent interfaces, such as data files, programs
with architecture-independent behavior (even if the program is compiled
and architecture-specific), scripting language modules, etc.
* “allowed”
This permits the reverse-dependencies of the package to annotate their
dependency field to indicate that a foreign architecture version of the
package satisfies the dependencies, but does not change the resolution
of any existing dependencies.
The main purpose of this value is to mark packages that have a dual
role, either as runnable (architecture-independent) or linkable
(architecture-dependent) depending on how the depending package uses
those interfaces. As that knowledge lies in the depending package,
the responsibility to denote that type of interface usage falls on
those dependencies, through arch-qualifiers. This value enables those
«:any» arch-qualifiers to be taken into account, so as not to let such
wildcards be declared without cooperation and agreement from the package
providing those interfaces.
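The four values can be summarized by two properties: whether the package is co-installable with itself, and when it may satisfy a dependency from a package of a different architecture. The following sketch encodes that summary; the names `MultiArchSemantics`, `MULTI_ARCH` and `parse_multi_arch` are hypothetical and are not part of dpkg.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MultiArchSemantics:
    coinstallable: bool     # several arch instances may be installed
    satisfies_foreign: str  # "never", "always" or "with :any"


MULTI_ARCH = {
    "no": MultiArchSemantics(False, "never"),
    "same": MultiArchSemantics(True, "never"),
    "foreign": MultiArchSemantics(False, "always"),
    "allowed": MultiArchSemantics(False, "with :any"),
}


def parse_multi_arch(value: Optional[str]) -> MultiArchSemantics:
    """Map a Multi-Arch field value to its semantics.

    A missing field (None) is equivalent to "no".
    """
    if value is None:
        value = "no"
    if value not in MULTI_ARCH:
        raise ValueError(f"unknown Multi-Arch value: {value!r}")
    return MULTI_ARCH[value]


print(parse_multi_arch("same").coinstallable)         # True
print(parse_multi_arch("allowed").satisfies_foreign)  # with :any
```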
Dependency resolution
---------------------
Dependency resolution has two main parts, run-time and build-time.
Packages in dependencies can be annotated with arch-qualifiers. These
are suffixed to the package name after a colon (':'), and consist of
one of several special strings such as 'any', 'native', or an actual architecture name. These arch-qualifiers will restrict which packages
can satisfy these dependencies.
Because Essential:yes is not intended for shared library packages, it is assumed that any implicit dependency on an essential package is satisfied
by the binaries from the native architecture.
### Dependency architecture inference
Dependencies always contain architecture information, be it implicit or explicit with arch-qualifiers. This information is used in various places
as part of the dependency satisfiability checks. The following table describes how the dependency architectures from a package get determined given the package architecture.
     \ Pkg arch |
 Dep  \         | all             <pkg-arch>
----------------+--------------------------------
pkg¹            | <native>/any    <pkg-arch>/any
pkg:<dep-arch>  | <dep-arch>      <dep-arch>
pkg:any         | any             any
[¹]
* For Pre-Depends/Depends/Recommends/Suggests/Enhances/Provides, the
implicit arch-qualifier is <native> for arch 'all' packages, or <pkg-arch>.
* For Conflicts/Breaks/Replaces, the implicit arch-qualifier is 'any'.
* [ TODO: Document build-time dependency fields. ]
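The inference table and its footnote can be expressed as a small function. This is a sketch under the assumption that the caller already knows the dependency field, the package architecture and the native architecture; the name `infer_dep_arch` is hypothetical, and build-time fields are not covered, as they are still a TODO above.

```python
from typing import Optional

# Fields whose implicit arch-qualifier is 'any'.
NEGATIVE_FIELDS = {"Conflicts", "Breaks", "Replaces"}


def infer_dep_arch(field: str, pkg_arch: str,
                   qualifier: Optional[str], native_arch: str) -> str:
    """Return the dependency architecture for one run-time dependency.

    qualifier is None for an unqualified 'pkg', 'any' for 'pkg:any',
    or an architecture name for 'pkg:<dep-arch>'.
    """
    if qualifier is not None:
        # Explicit arch-qualifiers are taken as-is, including 'any'.
        return qualifier
    if field in NEGATIVE_FIELDS:
        return "any"
    # Positive fields: arch 'all' packages imply <native>.
    return native_arch if pkg_arch == "all" else pkg_arch


print(infer_dep_arch("Depends", "all", None, "amd64"))  # amd64
print(infer_dep_arch("Breaks", "i386", None, "amd64"))  # any
```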
### Run-time satisfiability
The first is the usual run-time dependency resolution when installing packages on the system for their normal use, while using Pre-Depends, Depends, Conflicts, Breaks, Replaces, Provides. This also applies to Recommends, Suggests and Enhances, but as those are not strict
requirements, their semantics depend on how the frontend honors the
fields.
This type of dependency is concerned with the architecture of the package being installed, and the architectures of its dependencies.
       \ M-A |
 Dep    \    | no          same        foreign     allowed
-------------+-----------------------------------------------
pkg          | <dep-arch>  <dep-arch>  any         <dep-arch>
pkg:<arch>   | <dep-arch>  <dep-arch>  <dep-arch>  <dep-arch>
pkg:any      | <dep-arch>  <dep-arch>  <dep-arch>  any
The pkg:any dependency only being satisfied with M-A:allowed was added in part so that packages could not start declaring wildcard relationships without cooperation and agreement from the packages providing such interfaces, because the semantics of these interfaces might not be clear to external parties.
[ TODO: Document that pkg:any is only satisfied for non M-A:allowed with
Conflicts/Breaks/Replaces fields. ]
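The run-time table can be read as a plain lookup. The sketch below transcribes it literally and returns the symbolic entry («<dep-arch>» or «any») rather than a concrete architecture. The names are hypothetical, and the special handling of Conflicts/Breaks/Replaces mentioned in the TODO above is deliberately left out.

```python
from typing import Optional

DEP = "<dep-arch>"

# Rows are the dependency form, columns the Multi-Arch value of the
# package that is a candidate to satisfy it.
RUNTIME_TABLE = {
    "pkg":        {"no": DEP, "same": DEP, "foreign": "any", "allowed": DEP},
    "pkg:<arch>": {"no": DEP, "same": DEP, "foreign": DEP, "allowed": DEP},
    "pkg:any":    {"no": DEP, "same": DEP, "foreign": DEP, "allowed": "any"},
}


def runtime_required_arch(qualifier: Optional[str], multi_arch: str) -> str:
    """Return which architecture of the candidate satisfies the dependency."""
    if qualifier is None:
        row = "pkg"
    elif qualifier == "any":
        row = "pkg:any"
    else:
        row = "pkg:<arch>"
    return RUNTIME_TABLE[row][multi_arch]


print(runtime_required_arch(None, "foreign"))   # any
print(runtime_required_arch("any", "allowed"))  # any
print(runtime_required_arch("any", "foreign"))  # <dep-arch>
```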
### Build-time satisfiability
The other is the build-time dependency resolution, which applies while
satisfying build-time dependencies using the
fields Build-Depends, Build-Conflicts, Build-Depends-Arch, Build-Conflicts-Arch, Build-Depends-Indep, Build-Conflicts-Indep. These are concerned with source packages, so we do not have any architecture information
from them.
In this mode of satisfiability, a new concept to take into account is the distinction between build, host and target architectures, which are the only architectures we will have knowledge of.
       \ M-A |
 Dep    \    | no            same          foreign             allowed
-------------+-----------------------------------------------------------------
pkg          | <host-arch>   <host-arch>   any (<build-arch>)  <host-arch>
pkg:<arch>   | <host-arch>   <host-arch>   any (<build-arch>)  <host-arch>
pkg:any      | disallowed    disallowed    disallowed          any (<build-arch>)
pkg:native   | <build-arch>  <build-arch>  disallowed          <build-arch>
pkg:target   | N/A ...
With «any (<type-arch>)» meaning that while any architecture would do, the preferred one is <type-arch>.
The build-time satisfiability includes disallowed relationships because
these help detect nonsensical relationships. This difference compared
with the run-time behavior is because it tends to be easier to modify
the source once you have it around.
A pkg:any relationship with anything that is not M-A:allowed is disallowed because the requested relationship would not be respected.
A pkg:native relationship with M-A:foreign is disallowed because it indicates that one or both markings are in error. Either the interface is arch-dependent and thus can be requested to be pkg:native, or it is arch-independent and the target can be provided as foreign.
[ TODO: Document discrepancies and their rationale for difference in
satisfiability for pkg:any, and for not honoring the distinction between
Build-Depends and Build-Conflicts like with run-time deps. ]
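The build-time table can be sketched as a function that resolves a build dependency to a preferred architecture, given the build and host architectures. It follows the table literally, including the pkg:<arch> row, and omits pkg:target because the table leaves that row unspecified. The name `buildtime_arch` and the returned pair are assumptions made for this example.

```python
from typing import Optional, Tuple


def buildtime_arch(qualifier: Optional[str], multi_arch: str,
                   build_arch: str, host_arch: str) -> Tuple[str, bool]:
    """Return (preferred_arch, any_arch_ok) for a build dependency.

    Raises ValueError for the combinations the table marks as
    disallowed.
    """
    if qualifier == "any":
        if multi_arch != "allowed":
            raise ValueError("pkg:any requires Multi-Arch: allowed")
        return build_arch, True
    if qualifier == "native":
        if multi_arch == "foreign":
            raise ValueError("pkg:native with Multi-Arch: foreign")
        return build_arch, False
    # Unqualified 'pkg' and 'pkg:<arch>' share the same table rows.
    if multi_arch == "foreign":
        return build_arch, True
    return host_arch, False


# Cross-building on amd64 for armhf.
print(buildtime_arch(None, "same", "amd64", "armhf"))     # ('armhf', False)
print(buildtime_arch(None, "foreign", "amd64", "armhf"))  # ('amd64', True)
```

Raising an error for the disallowed cells mirrors the intent stated above: at build time these combinations are treated as mistakes to be fixed in the source rather than silently resolved.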
Reference counted files
-----------------------
File reference counting is an operation that dpkg performs for Multi-Arch:same packages, so that files that would otherwise conflict
can be shared between different architecture instances and do not need
to be split into common packages.
A ref-counted file is one that is owned by multiple arch-instances of
a Multi-Arch:same package. The current requirements are:
* Multi-Arch:same packages can only be configured if all of their instances
are unpacked at their exact same binary version.
* All ref-counted files need to match on their md5sums.
Maintainer scripts can fetch the package ref-counter from the environment variable DPKG_MAINTSCRIPT_PACKAGE_REFCOUNT.
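The two requirements above can be illustrated with a small check over in-memory package instances. This is a sketch of the rule only, not of dpkg's implementation; the data layout (a dict with `version` and `files` keys) and the name `can_share_files` are assumptions made for this example.

```python
import hashlib


def md5sum(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def can_share_files(instances) -> bool:
    """Check the ref-counting requirements across arch instances.

    Each instance is a dict with a 'version' string and a 'files'
    mapping from pathname to file contents (bytes).
    """
    # All instances must be at the exact same binary version.
    if len({inst["version"] for inst in instances}) != 1:
        return False
    # Every pathname owned by more than one instance must have the
    # same md5sum in all of them.
    seen = {}
    for inst in instances:
        for path, data in inst["files"].items():
            digest = md5sum(data)
            if seen.setdefault(path, digest) != digest:
                return False
    return True


amd64 = {"version": "1.0-1", "files": {"/usr/share/doc/libfoo/README": b"hi\n"}}
i386 = {"version": "1.0-1", "files": {"/usr/share/doc/libfoo/README": b"hi\n"}}
print(can_share_files([amd64, i386]))  # True
```

A shared file whose contents differ between instances, or an instance at a different binary version, makes the check fail.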
### Rationale
* Requires fewer package splits, and thus less package metadata and less
maintainer work.
* Can avoid disk duplication, as the contents for the same package files
get shared between different instances.
### Problems
Even though file ref-counting has some nice properties that avoid work for maintainers, it is broken by design, as it also has some very bad properties.
Some are even in principle unfixable. In addition, backpedaling on that decision would now imply quite some work. Given the requirements above:
* It cannot guarantee that the generated files will be bit identical if
they have not been generated with the same build-dependencies as the
other instances.
* It also introduced the requirement that packages need to be installed
in version lock-step, which complicates upgrades, and makes packages
uninstallable when one of the instances is not yet available.
* Makes the maintainer script semantics more complicated.
* Unmatched binNMUs make packages not co-installable, due to version-skew.
* binNMUs in general are by default not co-installable, due to differing
changelog entries.
* Only the last package instance can check that it matches the md5sums
of the already installed ref-counted files, which means differing files
might not get detected.
* Essential packages, which must work even when only unpacked, might not
work at all if one of its Pre-Depends is a M-A:same shared library that
has an unpacked shared file from another instance from a different binary
version.
The currently implemented and proposed workarounds for some of these problems have been a series of ad-hoc hacks:
* Split the binNMU changelog entry into a different file, automatically
only for packages using debhelper.
* Hunt down all packages that contain differences depending on the
architecture, and try to make them reproducible, but this might just
shadow files that might end up changing depending on the program
generating them.
* (postponed) Switch the binary version coherence check for all instances
to be source version based. This mixes up the source and binary
versionspaces, and makes it akin to a magic check.
Ideally:
* To avoid a flag day we could add a new Multi-Arch field value, with
similar semantics as «same» but implying no ref-counting.
* Split ref-counted files into their own common packages.
* Move at least changelog files into the .deb control area, and consequently
to the dpkg db.
- This would also allow to transparently compress and deduplicate those
files, w/o needing to do flaky directory to symlink dances back and forth.
* At some point in the future, when not needed at all, disable ref-counting
completely, or via a --force flag? (Breaks compatibility and might not be
possible at all, ever.)
Cross-grading
-------------
This can refer to either a package or the system.
For the former, it means switching a package's architecture by installing
a different instance over an already installed one. This only works for
non Multi-Arch:same packages, as those would just get an additional instance installed instead.
For the latter, this is the act of changing the native architecture. This
is currently performed by installing a dpkg instance of the new architecture we want to switch to, with all the required dependencies.
Command-line interfaces
-----------------------
On output, only packages with Multi-Arch:foreign with a non-native architecture or with Multi-Arch:same fields will ever get arch-qualified.
For input, any command that accepts a package name can always be passed an arch-qualified package name (pkgname or pkgname:arch). Arch-qualifying should in general always be a safe operation. Any command that accepts patterns will accept arch-qualified patterns too («<pkgname>:*» or «*:<archname>»), and an arch-unqualified pattern will default to an implicit «:*» arch-qualifier.
Any command that requires a specific package name will require an arch-qualified package name when there are multiple instances currently installed, to disambiguate them.
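The input rules can be sketched with shell-style matching over arch-qualified names. This only illustrates the rules as stated here; it does not reproduce dpkg's actual matching code, and the names `match_packages` and `resolve_specific` are hypothetical.

```python
from fnmatch import fnmatchcase


def match_packages(pattern: str, installed) -> list:
    """Match a pattern against arch-qualified installed package names.

    An arch-unqualified pattern gets an implicit ':*' arch-qualifier.
    """
    if ":" not in pattern:
        pattern += ":*"
    return [name for name in installed if fnmatchcase(name, pattern)]


def resolve_specific(name: str, installed) -> str:
    """Resolve a name that must identify exactly one installed instance."""
    if ":" in name:
        found = [p for p in installed if p == name]
    else:
        found = [p for p in installed if p.split(":", 1)[0] == name]
    if len(found) != 1:
        raise LookupError(f"{name!r} matches {len(found)} instances")
    return found[0]


installed = ["libfoo:amd64", "libfoo:i386", "bar:amd64"]
print(match_packages("libfoo", installed))  # ['libfoo:amd64', 'libfoo:i386']
print(match_packages("*:i386", installed))  # ['libfoo:i386']
print(resolve_specific("bar", installed))   # bar:amd64
```

With two instances of libfoo installed, `resolve_specific("libfoo", installed)` raises, which corresponds to the requirement for an arch-qualified name.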
### Problems
There is a divergence between the command-line interfaces of dpkg and apt.
### Rationale
* Backwards compatibility, a system with no enabled multiarch, no multiarch
enabled packages and no foreign packages installed should behave in the
same exact way (no arch-qualifiers printed etc.).
* Following from the previous, callers that expected a single entry on output,
should not suddenly get multiple when specifying a single package name,
that's why those require specific arch-qualified package names.
* The immediate output should be usable even after the system has been
cross-graded, so it should be resistant to native-arch switch.
Out of scope
------------
The following are implementation and/or distribution specific, and as the spec should ideally be distribution-neutral it should not encode packaging policy. Perhaps it should still be expanded as an implementation or examples sub-section, and marked as such.
* TODO: Describe compiler and dpkg-shlibdeps search paths.
* TODO: Packaging changes required to make a package multi-arch compliant;
lib, lib-dev, tool, etc.
Unresolved problems
-------------------
* Interpreter problem.
https://wiki.debian.org/Multiarch/InterpreterProposal
https://lists.debian.org/debian-perl/2012/12/msg00000.html
* Co-installable packages for executables.
One possible solution to this might be to use alternatives with priorities
determined dynamically at installation time.
* Runnable architecture attribute.
Sometimes we need to know whether an architecture is runnable or not,
as this is relevant when deciding what to install into the system, and
even though this is of no concern to dpkg directly, it is for high-level
frontends and the user.
* Partial architectures.
https://wiki.debian.org/Teams/Dpkg/Spec/FreestandingArches
* Arch:all packages that can only be built in a specific arch.
https://wiki.debian.org/Teams/Dpkg/Spec/FreestandingArches
* binNMU version skew.
See the “Reference counted files” section.
Multiarch Specification
=======================
Status: implemented, stable
This specification is considered to be the canonical reference for multiarch,
but in case of discrepancies between this and the current implementation in dpkg, the latter should be considered the expected behavior, unless it can be argued that it is suboptimal and it can be easily changed.
Those discrepancies might come about because this document was rewritten from scratch after the fact.
This document is about multiarch in dpkg. For multiarch in Debian, the canonical reference should be the (not yet existing) Debian policy write-up on
multiarch. Should this document mention vendor specific policy like Debian policy?
[
TODO: Check whether anything is still missing and worth adding from:
- https://wiki.debian.org/Multiarch/Tuples
- https://wiki.debian.org/Multiarch/MissingRationale
- https://wiki.debian.org/Teams/Dpkg/TimeTravelFixes
]
Background
----------
Make it possible to install packages for different architectures, with support from the package manager. Make it possible to cross-build packages for different architectures easily.
The second is a consequence from the first. By being able to install packages from different architectures, we make resolving cross build dependencies much easier. Maybe it should be formulated as such like:
Make it possible to install packages for different architectures, with support
from the package manager. This allows, among other things:
- support running 32 bit applications on 64 bit platforms that support this by
installing 32 bit shared libraries
- support cross build dependency resolution installing build architecture and
host architecture version of packages as required
- use completely foreign architecture binaries through qemu-user
- cross-grading a system from one architecture to another
There has been at least three previous ways to handle these needs. All of which were rather unsatisfactory:
* Installing foreign packages with «dpkg --force-architecture».
This made it possible to install foreign packages, but was of very limited
use, as the dependency relationships related to the architecture was
nonexistent, and did not allow to express most of the most complex
relationships.
I'd replace the last "most" with "more".
* Using the multilib layout.
This is the layout supported by many other distributions, to make it
possible to install packages for the alternative runnable ABI for a
specific architecture. But it has the fatal problem of not being a
generalized approach, having inconsistent and confusing semantics for
the multilib directories and requiring to hardcode the set of alternative
ABIs supported for each main architecture. It installs into paths such
as /usr/lib, /usr/lib32, /usr/lib64, where /usr/lib might or might not be
the native architecture.
* Using the sysroot layout.
This is a more general solution than multilib, but it requires a
pseudo-chroot equivalent for each architecture. It also pollutes the
filesystem namespace as it installs into paths such as /usr/<sysroot>.
To be able to install packages from another architecture, we need to make it possible for the package managers to tell what is and what is not allowed,
so that the dependency system does not get broken.
One recurring theme in the design of this specification was to allow for incremental adoption (no flag days required), and to not break previous satisfiability assumptions. New dependency types should be allowed, but dependencies that were previously allowed should not stop working.
This would require changes in packaging, both in the filesystem layout
to make co-installability possible, and in the metadata to annotate the packages and their dependencies depending on the interfaces provided.
Architecture type concepts
--------------------------
There are several important architecture types to take into consideration with multiarch. We have the following different types:
* <native>: Is the one the package manager (dpkg) has been built for, this
architecture can change by way of cross-grading dpkg itself.
«dpkg --print-architecture»
In the context of #1020533 we were discussing whether it makes sense for dpkg to always treat its own architecture as the native architecture, so this might change in the future.
* <foreign>: This is a non-<native> architecture.
«dpkg --print-foreign-architectures»
* package architecture: The architecture of a package, which can be entirely
different to the <native> architecture. From within maintainer scripts
it can be fetched from the DPKG_MAINTSCRIPT_ARCH environment variable,
and otherwise with «dpkg-deb -f <pkg>.deb Architecture».
Is "all" a package architecture, or is the package architecture of an arch:all package implicitly the native architecture under this definition?
* dependency architecture: The architecture of the package in a dependency.
Described in § "Dependency architecture inference".
* <build-arch>: The architecture the package is built on, which should
match <native>. Relevant when building packages.
«dpkg-architecture -qDEB_BUILD_ARCH»
s/package/source package/
I think we should either always be implicit, calling binary packages "packages" and source packages with the "source" prefix, or always be explicit and prefix the term "package" with "binary" or "source" as appropriate.
* <host-arch>: The architecture the package is built for, determined
explicitly from user input, or from the architecture the compiler
generates code for. Relevant when building packages.
«dpkg-architecture -qDEB_HOST_ARCH»
Same as above.
* <target-arch>: The architecture the compiler being built will build for,
determined explicitly from user input, or otherwise <host-arch>.
«dpkg-architecture -qDEB_TARGET_ARCH»
We recently noted that the term "target arch" might be useful not only for compilers but also for other software that outputs or interprets things specific to an architecture, like emulators or virtual machines. But this is just a side note.
Multiarch Tuples
----------------
The multiarch tuples are architecture strings that describe each different architecture ABI. These are based on the GNU tuples, except that they get normalized to their base form, ignoring any ISA specialization.
These are used as part of the filesystem layout to be able to co-install packages that would otherwise have conflicting pathnames with different contents.
### Rationale
These tuples were introduced to get constant values, which was not the case at least for the i386 dpkg architecture where the CPU part of the GNU tuple has been getting bumped when the baseline ISA has been bumped.
### Examples
This value can be fetched with «dpkg-architecture -qDEB_<type>_MULTIARCH».
Filesystem Layout
-----------------
The multiarch design is based on the concept that some kind of packages can be co-installed. But these same packages would contain architecture-dependent
content that was previously exposed on the same pathname across architectures.
These architecture-dependent pathnames get relocated, as part of the packaging, into multiarch tuple qualified pathnames. So if a shared library used to be located at «<libdir>/libfoo.so.10», it would now be located at «<libdir>/<multiarch-tuple>/libfoo.so.10».
For pathnames that provide the same content independently of the architecture
used to build and use them, the same pathname can still be used, as the package manager will refcount them, as long as their digests match.
I think I know what you mean by "as long as their digests match" but maybe it is more clear to say "as long as they are identical"? Maybe in the end, it is indeed only the digest that needs to match but for practical purposes we want the contents to match. So the fact that the implementation chooses (I guess?) to compare digests isn't important here and the intention that the contents should be identical should be documented instead.
### Rationale
* Allows to install multiple architectures.
* It is a uniform namespace.
* It is not limited to sibling or related architectures only diverging
in bitness or ABI like multilib does.
Does it make sense to note in this section, that this co-installability is only
intended for shared libraries in /usr/lib but not for executables in /usr/bin?
Package Interfaces
------------------
A key concept in multiarch is the interfaces a package provides. This limits
how a package can be used by other packages, and when it can be installed.
There is an important distinction here between the interface being architecture
independent, and the interface being runnable from some architecture. Runnability is of not great concern when it comes to the metadata annotations
in packages and dependencies. It is mainly of concern for the users and frontends installing packages. Runnability is also a property that is not made available simply by the current hardware architecture, using an emulator
can make an interface runnable.
When talking about interfaces, that refers to both passive (mostly files and their pathnames) and active ones (shared libraries, programs, etc.).
Generally, I would avoid the use of "etc". Readers that do not know how to continue a list that is abbreviated with "etc" do not gain anything by it. Readers who do know how to continue the list do not either.
For passive ones, the pathnames should not be arch-qualified, because then locating them requires arch-specific knowledge. File formats should either be arch-independent, or should make it possible to describe within all possible different encodings, such as endianness, bitness, etc. But the generation should select a single set of encoding and always generate the same output.
What is "the generation" here?
Within active ones, there are two main sub-types, runnable and linkable.
If there are only two types, what does the "etc" above stand for?
The common examples for these are programs (binaries or scripts) that one runs, and shared libraries or architecture-specific modules or plugins
that one loads and links against. Runnable interfaces might be either arch-dependent or independent depending on whether their output varies per-architecture. It does not matter whether those runnable interfaces
are implemented in apparently arch-independent scripting languages for example, as those can still be arch-dependent. Linkable interfaces are always arch-dependent, as they are required to match the ABIs.
I would expand more here on what the interface of a program actually is. I think it's clear that the interface of a shared library is architecture dependent but for the interface of a program, it is a common problem and a common question whether the program can be marked multi-arch:foreign or not. My
favourite example here is "make". 99% of the Makefiles out there probably use make in a way that would allow make being m-a:foreign. But the following snippet shows a Makefile that acts differently depending on the native architecture:
    all: -lc
    	@echo $(<)
Additionally, make is able to load shared libraries at runtime. I think the multiarch spec should expand a bit more on what an interface is and explain that, to some extent, it is up to the maintainer what they deem the interface of
a program. If the architecture-dependent parts are never used or not supposed to be used, it might as well be okay to mark something multi-arch:foreign.
This reminds me of another important question that pops up all the time which I
think that this doc should explain somewhere:
Why would it be wrong to mark all arch:all packages as m-a:foreign?
The current version of this doc does not explain that arch:all packages are implicitly of the native architecture. The text above implies that the "runnable program" can be arch:all and do arch-dependent stuff, but I think this should be
made more explicit, as I found this to be a very common point of confusion. Essentially, what I'd like to be spelled out explicitly somewhere is:
1. arch:all packages are implicitly of the native architecture
2. arch:all packages can ship scripts that are able to do architecture
dependent stuff, thus creating an architecture dependent interface
3. arch:all packages can depend on another package that makes it impossible
to declare it m-a:foreign
4. the above is the reason why arch:all packages cannot be assumed to be
implicitly m-a:foreign when satisfying cross-build dependencies
Control fields
--------------
### The Multi-Arch field
This field makes it possible to satisfy dependencies between packages of
different architectures (beyond Architecture: all), and to co-install
packages with the same name but different architectures.
The permitted values are:
* “no”
This value is equivalent to the current default, that being the omission
of the field.
The interfaces provided by this package are unknown. This means that the
package has either not yet been made multiarch aware or, in some rare
situations, that none of the other values currently fit and the package
has been marked explicitly as having been evaluated.
Why do you write that it is rare that none of the other values fit? I think most architecture dependent programs fit none of the other values.
* “same”
This package is co-installable with itself (other architecture instances),
but it must not be used to satisfy the dependency of any package of a
different architecture from itself.
The main purpose of this value is to mark packages that provide
architecture-dependent linkable interfaces. In special circumstances it
can also be used to provide runnable interfaces where each program or
script filename is arch-qualified.
* “foreign”
The package is not co-installable with itself, but should be allowed
to satisfy the dependencies of a package of a different architecture
from itself.
The main purpose of this value is to mark packages that provide
architecture-independent interfaces, such as data files, programs
with architecture-independent behavior (even if the program is compiled
and architecture-specific), scripting language modules, etc.
I think adding "scripting language modules" here is a bit dangerous because of
the m-a interpreter problem.
* “allowed”
This permits the reverse-dependencies of the package to annotate their
dependency field to indicate that a foreign architecture version of the
package satisfies the dependencies, but does not change the resolution
of any existing dependencies.
The main purpose of this value is to mark packages that have a dual
role, either as runnable (architecture-independent) or linkable
(architecture-dependent) depending on how the depending package uses
those interfaces. As that knowledge lies in the depending package,
the responsibility to denote that type of interface usage falls on
those dependencies, through arch-qualifiers. This value enables those
«:any» arch-qualifiers to be taken into account, as to not let such
wildcards be declared without cooperation and agreement from the package
providing those interfaces.
Another important purpose of "allowed" is for packages providing a runnable program that can be either used in an architecture dependent or independent way.
Dependency resolution
---------------------
Dependency resolution has two main parts, run-time and build-time.
Packages in dependencies can be annotated with arch-qualifiers. These
are suffixed to the package name after a colon (':'), and consist of
one of several special strings such as 'any', 'native', or an actual architecture name. These arch-qualifiers will restrict which packages
can satisfy these dependencies.
Because Essential:yes is not intended for shared library packages, it is assumed that any implicit dependency on an essential package is satisfied by the binaries from the native architecture.
### Dependency architecture inference
Dependencies always contain architecture information, be it implicit or explicit with arch-qualifiers. This information is used in various places as part of the dependency satisfiability checks. The following table describes how the dependency architectures from a package get determined given the package architecture.
     \ Pkg arch |
 Dep  \         | all            <pkg-arch>
----------------+------------------------------
 pkg¹           | <native>/any   <pkg-arch>/any
 pkg:<dep-arch> | <dep-arch>     <dep-arch>
 pkg:any        | any            any
I do not understand the /any in the pkg¹ row. What does it mean?
[¹]
* For Pre-Depends/Depends/Recommends/Suggests/Enhances/Provides, the
implicit arch-qualifier is <native> for arch 'all' packages, or <pkg-arch>.
..or <pkg-arch> for arch 'any' packages.
* For Conflicts/Breaks/Replaces, the implicit arch-qualifier is 'any'.
* [ TODO: Document build-time dependency fields. ]
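As a minimal sketch, the footnote rules above can be written as a small function. This is an illustration only, not dpkg code: the names `split_dependency`, `infer_dep_arch`, `POSITIVE_FIELDS` and `NEGATIVE_FIELDS` are hypothetical, and build-time fields are left out because the text above does not document them yet.

```python
# Run-time relationship fields, grouped by the implicit arch-qualifier that
# the footnote above assigns to them. The grouping names are hypothetical.
POSITIVE_FIELDS = {
    "Pre-Depends", "Depends", "Recommends", "Suggests", "Enhances", "Provides",
}
NEGATIVE_FIELDS = {"Conflicts", "Breaks", "Replaces"}


def split_dependency(dep):
    """Split 'name[:qualifier]' into (name, qualifier or None)."""
    name, sep, qualifier = dep.partition(":")
    return name, (qualifier if sep else None)


def infer_dep_arch(field, dep, pkg_arch, native_arch):
    """Return the architecture (or 'any') implied by one dependency.

    pkg_arch is the architecture of the depending package ('all' for
    Architecture: all), native_arch is the system's native architecture.
    """
    _name, qualifier = split_dependency(dep)
    if qualifier is not None:
        # Explicit arch-qualifier: 'any' or a concrete architecture name.
        return qualifier
    if field in NEGATIVE_FIELDS:
        return "any"
    if field in POSITIVE_FIELDS:
        return native_arch if pkg_arch == "all" else pkg_arch
    raise ValueError(f"field not covered by this sketch: {field}")
```

For example, `infer_dep_arch("Depends", "libfoo", "all", "amd64")` yields the native architecture, while the same unqualified name in Conflicts yields 'any'.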
### Run-time satisfiability
The first is the usual run-time dependency resolution when installing packages on the system for their normal use, while using Pre-Depends, Depends, Conflicts, Breaks, Replaces, Provides. This also applies to Recommends, Suggests and Enhances, but as those are not strict requirements, their semantics depend on how the frontend honors the
fields.
This type of dependency is concerned with the architecture of the package being installed, and the architectures of its dependencies.
      \ M-A |
 Dep   \    | no          same        foreign     allowed
------------+-----------------------------------------------
 pkg        | <dep-arch>  <dep-arch>  any         <dep-arch>
 pkg:<arch> | <dep-arch>  <dep-arch>  <dep-arch>  <dep-arch>
 pkg:any    | <dep-arch>  <dep-arch>  <dep-arch>  any
Why is a pkg:<arch> dependency on a m-a:foreign package only satisfied by <dep-arch>? The m-a:foreign package (as described above) "satisfies the dependencies of a package of a different architecture from itself." If it does
that, then why does foo:i386 not satisfy a dependency on foo:amd64? If foo:i386 cannot satisfy that dependency (and that's
why the other package explicitly stated foo:amd64) then it shouldn't be m-a:foreign.
The pkg:any dependency only being satisfied with M-A:allowed was added in part so that packages could not start declaring wildcard relationships without cooperation and agreement from the packages providing such interfaces, because the semantics of these interfaces might not be clear to external parties.
[ TODO: Document that pkg:any is only satisfied for non M-A:allowed with
Conflicts/Breaks/Replaces fields. ]
There should probably be two tables then? It also confused me that the pkg:any
row has these <dep-arch> entries instead of saying "disallowed".
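The run-time table can be read as a plain lookup, sketched below. This is an illustration under stated assumptions, not the dpkg implementation: `RUNTIME_TABLE`, `dependency_row` and `runtime_accepts` are hypothetical names, and the sketch copies the table cells literally. In particular it does not settle what <dep-arch> means in the pkg:any row, which the text above leaves open; the caller decides which architecture to pass there.

```python
# Literal transcription of the run-time table: for each dependency form and
# each Multi-Arch value of the candidate package, which architecture of the
# candidate is accepted ("dep-arch" or "any").
RUNTIME_TABLE = {
    "pkg": {
        "no": "dep-arch", "same": "dep-arch",
        "foreign": "any", "allowed": "dep-arch",
    },
    "pkg:<arch>": {
        "no": "dep-arch", "same": "dep-arch",
        "foreign": "dep-arch", "allowed": "dep-arch",
    },
    "pkg:any": {
        "no": "dep-arch", "same": "dep-arch",
        "foreign": "dep-arch", "allowed": "any",
    },
}


def dependency_row(dep):
    """Map a dependency string to its row in the table."""
    _name, sep, qualifier = dep.partition(":")
    if not sep:
        return "pkg"
    return "pkg:any" if qualifier == "any" else "pkg:<arch>"


def runtime_accepts(dep, multi_arch, dep_arch, candidate_arch):
    """Check whether a candidate of candidate_arch satisfies the dependency.

    dep_arch is the architecture inferred for the dependency (an assumption
    of this sketch: the caller has already computed it).
    """
    cell = RUNTIME_TABLE[dependency_row(dep)][multi_arch]
    return cell == "any" or candidate_arch == dep_arch
```

With these definitions, an unqualified dependency from an amd64 package is satisfied by an i386 candidate only when that candidate is marked Multi-Arch: foreign.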
### Build-time satisfiability
The other applies
The other what?
while satisfying build-time dependencies while using the
fields Build-Depends, Build-Conflicts, Build-Depends-Arch, Build-Conflicts-Arch, Build-Depends-Indep, Build-Conflicts-Indep. These are concerned with source packages, so we do not have any architecture information
from that.
In this mode of satisfiability, a new concept to take into account is the distinction between build, host and target architectures, which are the only
architectures we will have knowledge of.
This concept is not really new as it was mentioned above.
      \ M-A |
 Dep   \    | no            same          foreign             allowed
------------+--------------------------------------------------------------------
 pkg        | <host-arch>   <host-arch>   any (<build-arch>)  <host-arch>
 pkg:<arch> | <host-arch>   <host-arch>   any (<build-arch>)  <host-arch>
 pkg:any    | disallowed    disallowed    disallowed          any (<build-arch>)
 pkg:native | <build-arch>  <build-arch>  disallowed          <build-arch>
 pkg:target | N/A ...
With «any (<type-arch>)» meaning that while any architecture would do, the
preferred one is <type-arch>.
The build-time satisfiability includes disallowed relationships because these help detect nonsensical relationships. This difference compared
with the run-time behavior is because it tends to be easier to modify
the source once you have it around.
A pkg:any relationship with anything that is not M-A:allowed is disallowed
because the requested relationship would not be respected.
The pkg:native with M-A:foreign relationship is disallowed because it indicates that one (or both) of the markings is in error. Either the interface is arch-dependent and thus can be requested to be pkg:native, or it is arch-independent and the target can be provided as foreign.
That's the same argument for pkg:native to m-a:foreign as I made above for pkg:any to m-a:foreign.
[ TODO: Document discrepancies and their rationale for difference in
satisfiability for pkg:any, and for not honoring the distinction between
Build-Depends and Build-Conflicts like with run-time deps. ]
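The build-time table can likewise be sketched as a lookup that returns both the acceptable architecture and the preferred one, and that rejects the disallowed combinations. This is only an illustration: `BUILD_TABLE` and `resolve_build_dep` are hypothetical names, the cells are copied from the table as written, and the pkg:target row is omitted because the table gives no values for it.

```python
# Transcription of the build-time table. Each cell names which architecture
# is accepted: "host", "build", "any" (with the build architecture
# preferred), or None for a disallowed relationship.
_HOST_ROW = {"no": "host", "same": "host", "foreign": "any", "allowed": "host"}

BUILD_TABLE = {
    "pkg": dict(_HOST_ROW),
    "pkg:<arch>": dict(_HOST_ROW),
    "pkg:any": {"no": None, "same": None, "foreign": None, "allowed": "any"},
    "pkg:native": {
        "no": "build", "same": "build", "foreign": None, "allowed": "build",
    },
}


def resolve_build_dep(row, multi_arch, build_arch, host_arch):
    """Return (acceptable, preferred) for one build dependency.

    acceptable is either the string "any" or a concrete architecture;
    preferred is always a concrete architecture. Disallowed combinations
    raise ValueError so that nonsensical relationships get reported.
    """
    cell = BUILD_TABLE[row][multi_arch]
    if cell is None:
        raise ValueError(f"{row} on Multi-Arch: {multi_arch} is disallowed")
    if cell == "any":
        return ("any", build_arch)
    arch = build_arch if cell == "build" else host_arch
    return (arch, arch)
```

When cross-building on amd64 for arm64, `resolve_build_dep("pkg", "foreign", "amd64", "arm64")` accepts any architecture but prefers amd64, whereas the same call with "same" requires arm64.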
Reference counted files
-----------------------
File reference counting is an operation that dpkg performs for Multi-Arch:same packages, so that files that would otherwise conflict
can be shared between different architecture instances and do not need
to be split into common packages.
A ref-counted file is one that is owned by multiple arch-instances of
a Multi-Arch:same package. The current requirements are:
* Multi-Arch:same packages can only be configured if all of their instances
are unpacked at their exact same binary version.
* All ref-counted files need to match on their md5sums.
Maintainer scripts can fetch the package ref-counter from the environment variable DPKG_MAINTSCRIPT_PACKAGE_REFCOUNT.
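The two requirements above can be illustrated with a small check over in-memory package descriptions. This is a sketch, not how dpkg stores or verifies its data: the `instances` structure, `md5sum` and `refcount_problems` are hypothetical and exist only for this example.

```python
import hashlib


def md5sum(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def refcount_problems(instances):
    """List violations of the ref-counting requirements.

    instances maps an architecture name to a dict with two keys:
    "version" (the binary version string) and "files" (a mapping from
    pathname to file contents as bytes). This layout is an assumption
    made for the example.
    """
    problems = []

    # Requirement 1: all instances must be at the exact same binary version.
    versions = {inst["version"] for inst in instances.values()}
    if len(versions) > 1:
        problems.append("version skew: " + ", ".join(sorted(versions)))

    # Requirement 2: every pathname shipped by more than one instance must
    # have identical md5sums in all of them.
    digests = {}
    for inst in instances.values():
        for path, data in inst["files"].items():
            digests.setdefault(path, set()).add(md5sum(data))
    for path in sorted(digests):
        if len(digests[path]) > 1:
            problems.append("md5sum mismatch: " + path)

    return problems
```

Files that only one instance ships, such as those under an arch-qualified library directory, never trigger a mismatch here, because only one digest is recorded for them.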
### Rationale
* Requires fewer package splits, and thus less package metadata and less
maintainer work.
* Can avoid disk duplication, as the contents for the same package files
get shared between different instances.
### Problems
Even though file ref-counting has some nice properties that save work for maintainers, it is really broken by design, as it also has some very bad properties.
Some are even unfixable in principle. In addition, backpedaling on that decision would now imply quite some work. Given the requirements above:
* It cannot guarantee that the generated files will be bit identical if
they have not been generated with the same build-dependencies as the
other instances.
* It also introduced the requirement that packages need to be installed
in version lock-step, which complicates upgrades, and makes packages
uninstallable when one of the instances is not yet available.
* Makes the maintainer script semantics more complicated.
* Unmatched binNMUs make packages not co-installable, due to version-skew.
* binNMUs in general are by default not co-installable, due to differing
changelog entries.
* Only the last package instance can check that it matches the md5sums
of the already installed ref-counted files, which means differing files
might not get detected.
* Essential packages, which must work even when only unpacked, might not
work at all if one of their Pre-Depends is an M-A:same shared library that
has an unpacked shared file from another instance with a different binary
version.
The currently implemented and proposed workarounds for some of these problems have been a series of ad-hoc hacks:
* Split the binNMU changelog entry into a different file, automatically
only for packages using debhelper.
* Hunt down all packages that contain differences depending on the
architecture, and try to make them reproducible, but this might just
shadow files that might end up changing depending on the program
generating them.
* (postponed) Switch the binary version coherence check for all instances
to be source version based. This mixes up the source and binary
versionspaces, and makes it akin to a magic check.
Ideally:
* To avoid a flag day we could add a new Multi-Arch field value, with
similar semantics as «same» but implying no ref-counting.
* Split ref-counted files into their own common packages.
* Move at least changelog files into the .deb control area, and consequently
to the dpkg db.
- This would also make it possible to transparently compress and deduplicate
those files, without needing to do flaky directory to symlink dances back and forth.
* At some point in the future, when not needed at all, disable ref-counting
completely, or via a --force flag? (Breaks compatibility and might not be
possible at all, ever.)
Is it really helpful to have this "rant" about the problems of refcounting in the multiarch spec? This sounds more suited for a page on wiki.d.o.
Cross-grading
-------------
This can refer to either a package or the system.
For the former, it means switching a package's architecture by installing
a different instance over an already installed one. This only works for
non Multi-Arch:same packages, as those would just get an additional instance
installed instead.
For the latter, this is the act of changing the native architecture. This is currently performed by installing a dpkg instance of the new architecture
we want to switch to, with all the required dependencies.
Command-line interfaces
-----------------------
On output, only packages with Multi-Arch:foreign with a non-native architecture or with Multi-Arch:same fields will ever get arch-qualified.
For input, any command that accepts a package name can always be passed an arch-qualified package name (pkgname or pkgname:arch). Arch-qualifying should
in general always be a safe operation. Any command that accepts patterns will
accept arch-qualified patterns too («<pkgname>:*» or «*:<archname>»), and
an arch-unqualified pattern will default to an implicit «:*» arch-qualifier.
Any command that requires a specific package name will require an arch-qualified
package name when there are multiple instances currently installed, to disambiguate them.
What about arch:all packages? It seems I'm allowed to arch-qualify them too.
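The pattern rule described above, where an arch-unqualified pattern gets an implicit «:*», can be sketched as follows. This is an illustration, not dpkg's matching code: `normalize_pattern` and `match_packages` are hypothetical names, and shell-style globbing through Python's fnmatch module is an assumption of the example.

```python
from fnmatch import fnmatchcase


def normalize_pattern(pattern):
    """Add the implicit ':*' arch-qualifier to an unqualified pattern."""
    return pattern if ":" in pattern else pattern + ":*"


def match_packages(pattern, installed):
    """Return the arch-qualified names of installed packages that match.

    installed is a list of (pkgname, archname) tuples, a layout chosen
    only for this example.
    """
    qualified = normalize_pattern(pattern)
    return [
        f"{name}:{arch}"
        for name, arch in installed
        if fnmatchcase(f"{name}:{arch}", qualified)
    ]
```

Given libc6 installed for both amd64 and i386, the patterns «libc6» and «libc6:*» select both instances, while «*:i386» selects only the i386 one.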
### Problems
The CLIs of dpkg and apt diverge.
### Rationale
* Backwards compatibility, a system with no enabled multiarch, no multiarch
enabled packages and no foreign packages installed should behave in the
same exact way (no arch-qualifiers printed etc.).