From what I have understood, Guillem would rather avoid committing to a new public interface for this specific use-case, i.e. the
Naive solution
==============
In theory, `dpkg` could resolve this automatically. For every file it touches, it could canonicalize the location using the actual filesystem
and check whether any other installed file has the same canonicalized location. Unfortunately, `dpkg` cannot know which filenames can
collide, so it would check every filename in its database. For canonicalization, it would `stat()` every component of every filename.
This easily amounts to a million or more `stat()` calls on larger installations. Caching could reduce the impact somewhat, but since
Debian introduces aliases during maintainer scripts, it would have to invalidate the cache after maintainer scripts have been run. The
resulting performance would be unacceptable.
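
To make the cost concrete, here is a minimal Python sketch of the naive check. It is not dpkg code: the names `canonicalize` and `find_collisions`, and passing the database as a plain list of paths, are assumptions made purely for illustration. It resolves the parent directory of every recorded path against the real filesystem and reports paths that end up at the same location.

```python
import os
from collections import defaultdict


def canonicalize(root, path):
    """Resolve symlinks in the directory part of `path` below `root`.

    Only the parent directory is resolved, so a file that is itself a
    symlink keeps its own name.
    """
    parent, name = os.path.split(path.lstrip("/"))
    real_parent = os.path.realpath(os.path.join(root, parent))
    return os.path.join(real_parent, name)


def find_collisions(root, db_paths):
    """Group database paths by canonical location and return the clashes."""
    by_location = defaultdict(list)
    for path in db_paths:  # one filesystem walk per recorded path
        by_location[canonicalize(root, path)].append(path)
    return [sorted(paths) for paths in by_location.values() if len(paths) > 1]
```

Every call to `canonicalize` touches the filesystem once per path component, which is where the large number of `stat()` calls mentioned above would come from.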
Implement aliasing after metadata tracking
------------------------------------------
The [metadata tracking](https://wiki.debian.org/Teams/Dpkg/Spec/MetadataTracking)
feature enhances `dpkg` with knowledge about filesystem metadata for installed files. This includes knowledge of symbolic links, which would
help with tracking aliasing. Unfortunately, progress on this is fairly
slow and we think that aliasing support is more urgent.
Please consider it to be a piece of best
intentions at reconciling feedback wherever I could. At the time of this writing it certainly is not consensus, but consensus is what I seek
here. Without further ado, the full DEP text follows after my name
while it also is available at https://salsa.debian.org/dep-team/deps/-/merge_requests/5
Hello,
On Mon, 03 Apr 2023, Helmut Grohne wrote:
Please consider it to be a piece of best
intentions at reconciling feedback wherever I could. At the time of this writing it certainly is not consensus, but consensus is what I seek
here. Without further ado, the full DEP text follows after my name
while it also is available at https://salsa.debian.org/dep-team/deps/-/merge_requests/5
I'd like to express some disappointment that nobody replied publicly
so far. Last year's developer survey concluded that "Debian should complete
the merged-/usr transition" was the most important project for Debian [1]
(among those proposed in the survey). That's what we are trying to do
here and it would be nice to build some sort of consensus on what it means
in terms of changes for dpkg.
I know that Guillem (dpkg's maintainer) is generally opposed to the
approach that Debian has followed to implement merged-/usr but I have
yet to read his concerns on the changes proposed here
Here you are considering all files, but for the purpose of our issue,
we can restrict ourselves to the directories known by dpkg. We really
only care about directories that have been turned into symlinks (or
packaged symlinks that are pointing to directories). That's a much lower number of paths that we would have to check.
Thus this time-consuming operation would be done once, the first
time that the updated dpkg starts and when /var/lib/dpkg/aliases
does not yet exist.
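
A rough Python sketch of such a one-time scan follows. It is only an illustration under stated assumptions: `scan_aliases` is a hypothetical name, the set of directories known to dpkg is passed in as plain strings, and nothing here defines the format of /var/lib/dpkg/aliases.

```python
import os


def scan_aliases(root, known_dirs):
    """Return {alias: target} for known directories that are symlinks on disk."""
    real_root = os.path.realpath(root)
    aliases = {}
    for directory in sorted(known_dirs):
        on_disk = os.path.join(root, directory.lstrip("/"))
        # Only symlinks that resolve to a directory are aliases of interest.
        if os.path.islink(on_disk) and os.path.isdir(on_disk):
            target = os.path.realpath(on_disk)
            aliases[directory] = "/" + os.path.relpath(target, real_root)
    return aliases
```

The loop runs over directories only, so it checks far fewer paths than a scan over every file in the database.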
In any case, now that you have a database of aliases, you can do the other modifications to detect conflicting files and avoid file losses.
How does that sound?
The proposal I made above is not a real database in the sense that we
don't record what was shipped by the .deb when we installed the files...
it's rather the opposite, it analyzes the system to detect possible
conflicts with dpkg's view of the system.
It can be seen as complementary to it. In any case, I don't see how implementing metadata tracking would help to solve the problem that we
have today. dpkg would know that all .deb have /bin as a directory and
not as a symlink, and it would be able to conclude that the directory
has been replaced by a symlink by something external, but that's it.
It should still accept that replacement and do its best to work with it.
After Bookworm ships I plan to propose a policy change to the CTTE and
policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
% that does not use dh and a piuparts test to stop migration for
anything that is uploaded and doesn't comply. That should bring the
matter to an end, without needing to modify dpkg.
Here you are considering all files, but for the purpose of our issue,
we can restrict ourselves to the directories known by dpkg. We really
only care about directories that have been turned into symlinks (or
packaged symlinks that are pointing to directories). That's a much lower number of paths that we would have to check.
We don't add any new public interface to dpkg, but we also have the possibility to remove /var/lib/dpkg/aliases to force a new scan
(some sort of "dpkg --refresh-aliases" without an official name).
It might still be cleaner to have that "dpkg --refresh-aliases" command
so that we can invoke it for example in "dpkg-maintscript-helper symlink_to_dir/dir_to_symlink" when we are voluntarily turning a directory into a symlink (or vice-versa).
In any case, now that you have a database of aliases, you can do the other modifications to detect conflicting files and avoid file losses.
How does that sound?
The proposal I made above is not a real database in the sense that we
don't record what was shipped by the .deb when we installed the files...
it's rather the opposite, it analyzes the system to detect possible
conflicts with dpkg's view of the system.
It can be seen as complementary to it. In any case, I don't see how implementing metadata tracking would help to solve the problem that we
have today. dpkg would know that all .deb have /bin as a directory and
not as a symlink, and it would be able to conclude that the directory
has been replaced by a symlink by something external, but that's it.
The first thing we need consensus on, IMO, is the definition of "complete".
The maintainers of the usrmerge package consider the status quo an
acceptable technical solution, so their definition of "complete" is to roll out the change to the remaining users.
The alternative would be a consensus that dpkg is simply not expected to always leave the system in a useful state if it encounters certain invalid situations, and hoping that we will also be able to point to a few million installations where that has not exploded and call it a success, but that would need to be communicated.
Testing alone will be an absolute nightmare because we can enter invalid states through multiple avenues, for example, if I have a conflict
a.deb: /bin/test
b.deb: /usr/bin/test
c.deb: /bin -> /usr/bin
The latter case is also what should happen if b declares "Replaces: a".
# move file to /usr, install symlink, then remove symlink, move back
dpkg -i a.deb c.deb
dpkg --remove c.deb
Hi Luca,
On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
% that does not use dh and a piuparts test to stop migration for
anything that is uploaded and doesn't comply. That should bring the
matter to an end, without needing to modify dpkg.
I agree with the goal of removing aliases by moving files to their
canonical locations. However, I do not quite see us getting there in the
way you see it, but maybe I am missing something. As long as dpkg does
not understand the effects of aliasing, we cannot safely move those
files and thus the file move moratorium will have to be kept in place.
And while moving the files would bring the matter to an end, we cannot
do so without either modifying dpkg or rolling back the transition and starting over. I hope that we all agree that rolling back would be too
insane to even consider, but I fail to see how you safely move files
without dpkg being changed. Can you elaborate on that aspect?
I'd also be interested in how you plan to move important files in
essential packages. This is an aspect raised by Simon Richter and where
I do not see an obvious answer yet.
On Fri, 21 Apr 2023 at 15:29:33 +0100, Luca Boccassi wrote:
After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild
That seems quite likely to trigger the scenario Helmut is trying to avoid, which if I understand correctly is this:
* foo_12.0 in Debian 12 ships /lib/abcd
* bar_13.0 takes over /lib/abcd from foo, but because of either your
proposed change or a manual action by the maintainer, it is actually in
the data.tar as ./usr/lib/abcd (not ./lib/abcd like it was in foo_12.0)
* the maintainer of bar didn't add the correct Breaks/Replaces on foo
* a user upgrading from Debian 12 to 13 installs bar_13.0, perhaps pulled
in as a dependency
* expected result: dpkg refuses to unpack bar ("trying to overwrite ..."),
the upgrade is cancelled, and the user reports a RC bug in bar
* actual result: /usr/lib/abcd in bar quietly overwrites /lib/abcd from foo
* if bar is subsequently removed, then dpkg (and therefore apt) thinks foo
is fully functional, but in fact /{usr/,}lib/abcd is missing
(For simplicity I've described that scenario in terms of files directly shipped in the data.tar, but dpkg also tracks the ownership of files
created by dpkg-divert or alternatives, and similar things can happen
to those.)
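
The scenario can be replayed with a small toy model in Python. This is a sketch under loud assumptions, not dpkg's implementation: `ToySystem` is a hypothetical class in which the package database compares paths by name only, while the filesystem resolves the /lib alias.

```python
class ToySystem:
    """Toy model: the database tracks names, the filesystem resolves aliases."""

    def __init__(self, aliases):
        self.aliases = aliases  # e.g. {"/lib": "/usr/lib"}
        self.files = {}         # canonical path -> package that last wrote it
        self.db = {}            # package -> set of paths as shipped

    def canonical(self, path):
        for alias, target in self.aliases.items():
            if path.startswith(alias + "/"):
                return target + path[len(alias):]
        return path

    def unpack(self, package, paths):
        # Conflict detection compares literal names, unaware of aliasing.
        for other, owned in self.db.items():
            clash = owned & set(paths)
            if other != package and clash:
                raise RuntimeError(f"trying to overwrite {sorted(clash)[0]}")
        for path in paths:
            self.files[self.canonical(path)] = package
        self.db[package] = set(paths)

    def remove(self, package):
        for path in self.db.pop(package):
            self.files.pop(self.canonical(path), None)

    def exists(self, path):
        return self.canonical(path) in self.files


system = ToySystem({"/lib": "/usr/lib"})
system.unpack("foo", ["/lib/abcd"])
system.unpack("bar", ["/usr/lib/abcd"])  # no error: the names differ
system.remove("bar")
print(system.exists("/lib/abcd"), "foo" in system.db)  # prints: False True
```

After the last step the model still lists foo as installed, yet the file it shipped is gone, which mirrors the "actual result" above.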
I had hoped that the last section of technical committee resolution
#994388 (which concerns this situation) would become irrelevant in Debian
13, but it's looking as though without the sort of dpkg changes discussed
in this thread, the concern about files moving between packages would
remain a valid concern.
However, as far as I can see, the other reasons not to do this that were mentioned in the last section of #994388 *do* become irrelevant in Debian
13, so solving the files-moved-between-packages thing is the last major blocker for doing what you propose. (Unless someone has a reason why this
is not the case?)
You might reasonably say that "the maintainer of bar didn't add the
correct Breaks/Replaces on foo" is a RC bug in bar - and it is! - but
judging by the number of "missing Breaks/Replaces" bug reports that have
to be opened by unstable users (sometimes me), it's a very easy mistake
to make.
One thing that's particularly tricky about this is that the move from
/ into /usr and the move from foo to bar might be 18 months apart if
they happen to occur at opposite ends of our stable release cycle. In particular, if the move from / into /usr is done as soon as the Debian 13 cycle opens, we cannot predict whether the packages that have undergone
that move will also need to undergo a package split/merge at some point
in the following 18 months (but it's reasonable to assume that at least
some of them will).
This is looking at it from a performance point of view. Guillem also
raised that this is changing the source of truth from the dpkg database
to the actual filesystem, which Guillem considers wrong and I find that vaguely agreeable.
We don't add any new public interface to dpkg, but we also have the possibility to remove /var/lib/dpkg/aliases to force a new scan
(some sort of "dpkg --refresh-aliases" without an official name).
Can I rephrase this as follows? Your cache invalidation strategy is that any
external entity (such as a maintainer script) introducing aliases should explicitly invalidate the cache.
If you put it this way, it is not that different from the --add-alias/--remove-alias proposal. It is a different interface to
dpkg, but the semantics are roughly the same:
In both cases, something external to dpkg is responsible for performing
the moves and creating the symbolic links followed by informing dpkg
about the alias (explicitly or implicitly via scanning directories).
Would you agree with me that this is a minor adaption of DEP17? In
The proposal I made above is not a real database in the sense that we
don't record what was shipped by the .deb when we installed the files... it's rather the opposite, it analyzes the system to detect possible conflicts with dpkg's view of the system.
I think that Guillem considers this a bad property as he has expressed
in his reply on debian-dpkg, that .debs should be the source of truth.
On Sat, 22 Apr 2023 at 11:50, Helmut Grohne <helmut@subdivi.de> wrote:
On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
% that does not use dh and a piuparts test to stop migration for
anything that is uploaded and doesn't comply. That should bring the matter to an end, without needing to modify dpkg.
I agree with the goal of removing aliases by moving files to their canonical locations. However, I do not quite see us getting there in the way you see it, but maybe I am missing something. As long as dpkg does
not understand the effects of aliasing, we cannot safely move those
files and thus the file move moratorium will have to be kept in place.
And while moving the files would bring the matter to an end, we cannot
do so without either modifying dpkg or rolling back the transition and starting over. I hope that we all agree that rolling back would be too insane to even consider, but I fail to see how you safely move files without dpkg being changed. Can you elaborate on that aspect?
Moving files within _the same_ package is actually fine as far as I
know. It's moving between location _and_ packages within the same
upgrade that is problematic. The piuparts test I added is overzealous,
but it doesn't need to be.
We already have piuparts tests detecting files moving, it should be
easy enough to extend that to check that the appropriate
Breaks/Replaces have been added. Correct me if I'm wrong, but I
believe it's already against policy to do this without
Breaks/Replaces, so it's not a use case that we need to support, no?
If someone does that by mistake, the package will not migrate to
testing.
I'd also be interested in how you plan to move important files in
essential packages. This is an aspect raised by Simon Richter and where
I do not see an obvious answer yet.
Do you have a pointer? Not sure I follow what "important" files means
here, doesn't ring a bell.
Dpkg already has defined behaviour for directory vs symlink: the directory wins. In principle a future version of dpkg could change that, but /lib/ld-linux.so.2 is just too special, we'd never want to have a package that
actually moves it.
Hi Luca,
On Sat, Apr 22, 2023 at 01:06:18PM +0100, Luca Boccassi wrote:
On Sat, 22 Apr 2023 at 11:50, Helmut Grohne <helmut@subdivi.de> wrote:
On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small % that does not use dh and a piuparts test to stop migration for anything that is uploaded and doesn't comply. That should bring the matter to an end, without needing to modify dpkg.
I agree with the goal of removing aliases by moving files to their canonical locations. However, I do not quite see us getting there in the way you see it, but maybe I am missing something. As long as dpkg does not understand the effects of aliasing, we cannot safely move those
files and thus the file move moratorium will have to be kept in place. And while moving the files would bring the matter to an end, we cannot
do so without either modifying dpkg or rolling back the transition and starting over. I hope that we all agree that rolling back would be too insane to even consider, but I fail to see how you safely move files without dpkg being changed. Can you elaborate on that aspect?
Moving files within _the same_ package is actually fine as far as I
know. It's moving between location _and_ packages within the same
upgrade that is problematic. The piuparts test I added is overzealous,
but it doesn't need to be.
You got me interested to dig deeper. I looked into that piuparts
check[1]. From what I understand, it does something different from
what you suggest here. It detects files moved between / and /usr (which
is what is going to happen according to your plan) and it does not
detect files being moved between packages (which would actually be problematic here). It also does not produce an error (merely a warning),
so it doesn't halt testing migration and in particular, it doesn't
detect the problematic situation at all. That's kinda disappointing.
You also sidestepped the question of how to handle the situation where
we've moved files from / to /usr and then need to move files between
packages in a safe way, though your other response to Simon McVittie
suggests you have an idea there.
On Sat, 22 Apr 2023 13:03:01 +0100, Luca Boccassi wrote:
We already have piuparts tests detecting files moving, it should be
easy enough to extend that to check that the appropriate
Breaks/Replaces have been added. Correct me if I'm wrong, but I
believe it's already against policy to do this without
Breaks/Replaces, so it's not a use case that we need to support, no?
If someone does that by mistake, the package will not migrate to
testing.
Yeah, we agree that you need Breaks+Replaces. The issue here is that due
to dpkg not knowing about the aliasing, Breaks+Replaces is insufficient.
Due to the insufficiency the CTTE enacted the moratorium.
My impression is that you believe that with bookworm, the moratorium is
being lifted and thus we can start moving files. Unfortunately, the underlying problem does not go away just because we've released
bookworm. It's a problem that is unique to merged installations and
those are not going to go away in bookworm.
So yeah, we all want these files moved to their canonical locations and
I kinda like the simplicity of your approach, but thus far my
understanding is that it is plain broken and doesn't work. Well, yeah it
does work in the sense that we break user installations during upgrade
and notice so late in the freeze without any good options to fix.
On Sat, Apr 22, 2023 at 01:06:18PM +0100, Luca Boccassi wrote:
I'd also be interested in how you plan to move important files in essential packages. This is an aspect raised by Simon Richter and where
I do not see an obvious answer yet.
Do you have a pointer? Not sure I follow what "important" files means
here, doesn't ring a bell.
In <669234b3-555b-4e2a-ccc7-dd5510b6e9c1@debian.org>, Simon Richter
said:
Dpkg already has defined behaviour for directory vs symlink: the directory wins. In principle a future version of dpkg could change that, but /lib/ld-linux.so.2 is just too special, we'd never want to have a package that
actually moves it.
This and /bin/sh is the kind of files I'd consider important. And then
upon thinking further it became more and more difficult for me to make
sense of the objection. On a merged system, we can just move that file
to its canonical location without having any trouble even with an
unmodified dpkg. So from my pov, the question about important files can
be disregarded. I hope Simon Richter agrees.
Let us circle back to your "broken" approach. It sure is simple (just
move all the files and be done) and if we could just skip over the
upgrade issues and have all the files moved without having to modify
dpkg, that would actually be a better result than DEP 17. Just how do we avoid the issue of file loss arising from the aliasing in your scenario?
There kinda is an obvious solution here. We just need to tell dpkg that
it needs to remove the package containing the file that is being moved
before unpacking the replacing package. This semantic actually exists
and we call it "Conflicts". So instead of using Breaks+Replaces, the
solution is to use Conflicts! Problem solved, right?
Yeah, I think this solves a number of cases, but there are two areas
where it does not:
* We generally prefer Breaks over Conflicts for a reason. It gives the
dependency resolver more freedom and in taking this freedom away, it
may fail to find solutions to complex upgrades. (Which is why Breaks
got introduced in the first place.)
* We cannot use Conflicts inside the transitively essential set.
So if we move all those files, in 90% (number made up) of the cases it
will go well (since we don't move between packages) and in 90% (number
made up) of the remaining 10%, we can use Conflicts, but what do we do
about that remaining 1%?
If we look deeper into the dpkg toolbox, we pause at diversions. What if
the new package were to add a (--no-rename) diversion for files that are
at risk of being accidentally deleted in newpkg.preinst and then remove
that diversion in newpkg.postinst? Any such diversion will cause package removal of the oldpkg to skip removal of the diverted file (and instead delete the non-existent path that we diverted to). Thus we retain the
files we were interested in.
This way, we can just move the files as you suggested.
* For 90% of the packages, this will just work.
* For 9% of the packages, we'll need to turn Breaks+Replaces into
Conflicts.
* And for 1% of the packages, we'll need to add complex diversions to
maintainer scripts.
And while this sounds super ugly in the 1% of cases, it's a complexity
that maybe isn't necessary at all and we can remove it after trixie
unlike the complexity being added in DEP 17.
In order to better understand the mechanics at hand (and due to Simon Richter's call for test cases), I sat down and wrote some. So in this
mail you find 4 files attached:
* runtest.sh is a wrapper script to run each case inside a fresh
chroot created by mmdebstrap (as you don't want to mess your system).
* case1.sh demonstrates the file loss problem with Breaks+Replaces and
thus fails.
* case2.sh demonstrates how Conflicts fix the problem.
* case3.sh demonstrates how diversions fix the problem.
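
The attached scripts are not reproduced here. As a stand-in, the following Python toy model replays the three strategies discussed above under simplifying assumptions: the function `upgrade`, the divert-to name, and the exact ordering of steps are illustrative guesses, not dpkg's real behaviour. oldpkg used to ship /lib/abcd and newpkg now ships /usr/lib/abcd.

```python
ALIASES = {"/lib": "/usr/lib"}


def canonical(path):
    """Map a path through the aliasing symlinks (the filesystem's view)."""
    for alias, target in ALIASES.items():
        if path.startswith(alias + "/"):
            return target + path[len(alias):]
    return path


def upgrade(strategy):
    """Replay one upgrade and return the set of files left on disk."""
    files = {canonical("/lib/abcd")}  # installed by the old version of oldpkg
    diversions = {}                   # diverted name -> divert-to name
    stale = "/lib/abcd"               # name the toy dpkg removes for oldpkg

    def remove_oldpkg_file():
        # Honours diversions by name, but knows nothing about aliasing.
        files.discard(canonical(diversions.get(stale, stale)))

    if strategy == "conflicts":
        remove_oldpkg_file()          # oldpkg is removed before the unpack
    if strategy == "diversion":
        # newpkg.preinst: hypothetical --no-rename diversion of the old name
        diversions[stale] = stale + ".hypothetical-diverted"
    files.add(canonical("/usr/lib/abcd"))  # unpack newpkg
    if strategy != "conflicts":
        remove_oldpkg_file()          # oldpkg's old name is cleaned up last
    diversions.clear()                # newpkg.postinst drops the diversion
    return files


for name in ("breaks_replaces", "conflicts", "diversion"):
    print(name, sorted(upgrade(name)))
```

In this model only the Breaks+Replaces ordering loses the file; with Conflicts the removal happens before the unpack, and with the diversion the removal is redirected to a name that does not exist.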
I sincerely hope that this fixed-up plan doesn't have any serious
issues. If you find any, please tell me.
And before closing this mail, I would like to express my gratitude and
thanks to Emilio Pozuelo Monfort for enduring me in numerous discussions
on this matter and providing so many of the key insights captured in
this mail.
Hi,
On Tue, Apr 25, 2023 at 09:07:28PM +0200, Helmut Grohne wrote:
This and /bin/sh is the kind of files I'd consider important. And then
upon thinking further it became more and more difficult for me to make sense of the objection. On a merged system, we can just move that file
to its canonical location without having any trouble even with an unmodified dpkg. So from my pov, the question about important files can
be disregarded. I hope Simon Richter agrees.
Yes, the relevant code at
https://github.com/guillemj/dpkg/blob/main/src/main/unpack.c#L749
already handles moving a file inside the same package, and that has
existed for some time, that's why I use two packages for the PoC.
I have not looked for more issues beyond that, so there might be others lurking in the depths of this code.
What I'm mostly concerned about (read: have not verified either way)
with /lib/ld.so and /bin/sh is what happens when dpkg learns of /bin and
/lib as symlinks -- because right now, the symlinks created by usrmerge
are protected by the rule that if dpkg expects a directory and finds a symlink, that is fine because that is obviously an action taken by the
admin.
But if dpkg sees a package containing these as symlinks, then this is
entered into the dpkg database, and subject to conflict resolution, and
for that, a separate rule exists that directory-symlink conflicts are resolved in favour of the directory, so the interaction between a newer base-files package shipping /lib as a symlink and an older or
third-party package containing /lib as a directory (e.g. a kernel
package from a hosting provider) could overwrite the /lib symlink.
It might be possible to avoid that by never shipping /lib as a symlink
and always creating it externally, but I still think that's kind of
wobbly.
This and /bin/sh is the kind of files I'd consider important. And then
upon thinking further it became more and more difficult for me to make
sense of the objection. On a merged system, we can just move that file
to its canonical location without having any trouble even with an
unmodified dpkg. So from my pov, the question about important files can
be disregarded. I hope Simon Richter agrees.
If we look deeper into the dpkg toolbox, we pause at diversions. What if
the new package were to add a (--no-rename) diversion for files that are
at risk of being accidentally deleted in newpkg.preinst and then remove
that diversion in newpkg.postinst? Any such diversion will cause package removal of the oldpkg to skip removal of the diverted file (and instead delete the non-existent path that we diverted to). Thus we retain the
files we were interested in.
Brilliant! Would never have thought of using divert like that.
So, what work would need to happen to make this reality? Do we need tooling/scripts/build changes to support the divert scheme, or is it
"simply" a matter of documenting and testing?
"Simon" == Simon McVittie <smcv@debian.org> writes:
Hi Luca,
On Sat, Apr 22, 2023 at 01:06:18PM +0100, Luca Boccassi wrote:
On Sat, 22 Apr 2023 at 11:50, Helmut Grohne <helmut@subdivi.de> wrote:
On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small % that does not use dh and a piuparts test to stop migration for anything that is uploaded and doesn't comply. That should bring the matter to an end, without needing to modify dpkg.
I agree with the goal of removing aliases by moving files to their canonical locations. However, I do not quite see us getting there in the way you see it, but maybe I am missing something. As long as dpkg does not understand the effects of aliasing, we cannot safely move those
files and thus the file move moratorium will have to be kept in place. And while moving the files would bring the matter to an end, we cannot
do so without either modifying dpkg or rolling back the transition and starting over. I hope that we all agree that rolling back would be too insane to even consider, but I fail to see how you safely move files without dpkg being changed. Can you elaborate on that aspect?
Moving files within _the same_ package is actually fine as far as I
know. It's moving between location _and_ packages within the same
upgrade that is problematic. The piuparts test I added is overzealous,
but it doesn't need to be.
You got me interested to dig deeper. I looked into that piuparts
check[1]. From what I understand, it does something differently from
what you suggest here. It detects files moved between / and /usr (which
is what is going to happen according to your plan) and it does not
detect files being moved between packages (which would actually be problematic here). It also does not produce an error (merely a warning),
so it doesn't halt testing migration and in particular, it doesn't
detect the problematic situation at all. That's kinda disappointing.
You also side stepped the question of how to handle the situation where
we've moved files from / to /usr and then need to move files between
packages in a safe way, though your other response to Simon McVittie
suggests you have an idea there.
On Sat, 22 Apr 2023 13:03:01 +0100, Luca Boccassi wrote:
We already have piuparts tests detecting files moving, it should be
easy enough to extend that to check that the appropriate
Breaks/Replaces have been added. Correct me if I'm wrong, but I
believe it's already against policy to do this without
Breaks/Replaces, so it's not a use case that we need to support, no?
If someone does that by mistake, the package will not migrate to
testing.
Yeah, we agree that you need Breaks+Replaces. The issue here is that due
to dpkg not knowing about the aliasing, Breaks+Replaces is insufficient.
Due to the insufficiency the CTTE enacted the moratorium.
My impression is that you believe that with bookworm, the moratorium is
being lifted and thus we can start moving files. Unfortunately, the underlying problem does not go away just because we've released
bookworm. It's a problem that is unique to merged installations and
those are not going to go away in bookworm.
So yeah, we all want these files moved to their canonical locations and
I kinda like the simplicity of your approach, but thus far my
understanding is that it is plain broken and doesn't work. Well, yeah it
does work in the sense that we break user installations during upgrade
and notice so late in the freeze without any good options to fix.
On Sat, Apr 22, 2023 at 01:06:18PM +0100, Luca Boccassi wrote:
I'd also be interested on how you plan to move important files in essential packages. This is an aspect raised by Simon Richter and where
I do not see an obvious answer yet.
Do you have a pointer? Not sure I follow what "important" files means
here, doesn't ring a bell.
In <669234b3-555b-4e2a-ccc7-dd5510b6e9c1@debian.org>, Simon Richter
said:
Dpkg already has defined behaviour for directory vs symlink: the directory wins. In principle a future version of dpkg could change that, but /lib/ld-linux.so.2 is just too special, we'd never want to have a package that
actually moves it.
This and /bin/sh is the kind of files I'd consider important. And then
upon thinking further it became more and more difficult for me to make
sense of the objection. On a merged system, we can just move that file
to its canonical location without having any trouble even with an
unmodified dpkg. So from my pov, the question about important files can
be disregarded. I hope Simon Richter agrees.
Let us circle back to your "broken" approach. It sure is simple (just
move all the files and be done) and if we could just skip over the
upgrade issues and have all the files moved without having to modify
dpkg, that would actually be a better result than DEP 17. Just how do we avoid the issue of file loss arising from the aliasing in your scenario?
There kinda is an obvious solution here. We just need to tell dpkg that
it needs to remove the package containing the file that is being moved
before unpacking the replacing package. This semantic actually exists
and we call it "Conflicts". So instead of using Breaks+Replaces, the
solution is to use Conflicts! Problem solved, right?
Yeah, I think this solves a number of cases, but there are two areas
where it does not:
* We generally prefer Breaks over Conflicts for a reason. It gives the
dependency resolver more freedom and in taking this freedom away, it
may fail to find solutions to complex upgrades. (Which is why Breaks
got introduced in the first place.)
* We cannot use Conflicts inside the transitively essential set.
So if we move all those files, in 90% (number made up) of the cases it
will go well (since we don't move between packages) and in 90% (number
made up) of the remaining 10%, we can use Conflicts, but what do we do
about that remaining 1%?
If we look deeper into the dpkg toolbox, we pause at diversions. What if
the new package were to add a (--no-rename) diversion for files that are
at risk of being accidentally deleted in newpkg.preinst and then remove
that diversion in newpkg.postinst? Any such diversion will cause package
removal of the oldpkg to skip removal of the diverted file (and instead
delete the non-existent path that we diverted to). Thus we retain the
files we were interested in.
This way, we can just move the files as you suggested.
* For 90% of the packages, this will just work.
* For 9% of the packages, we'll need to turn Breaks+Replaces into
Conflicts.
* And for 1% of the packages, we'll need to add complex diversions to
maintainer scripts.
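The temporary diversion described above could look roughly like this in newpkg's maintainer scripts. This is only a minimal sketch: the helper names, the ".usr-is-merged" suffix and the DPKG_DIVERT override (there purely to allow a dry run) are assumptions for illustration and are not taken from any actual package.

```shell
#!/bin/sh
# Sketch only: helpers that newpkg's preinst/postinst could call.
# DPKG_DIVERT can be overridden (e.g. DPKG_DIVERT=echo) for a dry run.
DPKG_DIVERT=${DPKG_DIVERT:-dpkg-divert}

# preinst: divert the old, non-canonical path so that removing it from
# oldpkg acts on the (non-existent) diverted name instead of the real file.
# The ".usr-is-merged" suffix is a made-up name for the divert target.
protect_moved_file() {
    pkg=$1
    old_path=$2
    $DPKG_DIVERT --package "$pkg" --no-rename \
        --divert "$old_path.usr-is-merged" --add "$old_path"
}

# postinst: drop the temporary diversion again.
unprotect_moved_file() {
    pkg=$1
    old_path=$2
    $DPKG_DIVERT --package "$pkg" --no-rename --remove "$old_path"
}
```

newpkg.preinst would then call something like "protect_moved_file newpkg /bin/foo" and newpkg.postinst the matching "unprotect_moved_file newpkg /bin/foo".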
On Wed, 26 Apr 2023 at 10:11, Simon Richter <sjr@debian.org> wrote:
What I'm mostly concerned about (read: have not verified either way)
with /lib/ld.so and /bin/sh is what happens when dpkg learns of /bin and
/lib as symlinks -- because right now, the symlinks created by usrmerge
are protected by the rule that if dpkg expects a directory and finds a
symlink, that is fine because that is obviously an action taken by the
admin.
But if dpkg sees a package containing these as symlinks, then this is
entered into the dpkg database, and subject to conflict resolution, and
for that, a separate rule exists that directory-symlink conflicts are
resolved in favour of the directory, so the interaction between a newer
base-files packages shipping /lib as a symlink and an older or
third-party package containing /lib as a directory (e.g. a kernel
package from a hosting provider) could overwrite the /lib symlink.
It might be possible to avoid that by never shipping /lib as a symlink
and always creating it externally, but I still think that's kind of
wobbly.
IMHO we should not ship the top-level symlinks in a package. The
reason for that is to allow the use case where /usr is a separate
vendor partition and / is either a luks volume or a tmpfs, and thus
the top-level symlinks are ephemeral and re-created on boot on the
fly. If they were part of a package, that would get awkward to say the
least.
I really would like to move toward the direction of having exclusively
/usr and /etc shipped in data.tar, and everything else created locally
as needed, and that includes files in /.
I sincerely hope that this fixed-up plan doesn't have any serious
issues. If you find any please tell.
"Simon" == Simon McVittie <smcv@debian.org> writes:
Simon> You might reasonably say that "the maintainer of bar didn't
Simon> add the correct Breaks/Replaces on foo" is a RC bug in bar -
Simon> and it is! - but judging by the number of "missing
Simon> Breaks/Replaces" bug reports that have to be opened by
Simon> unstable users (sometimes me), it's a very easy mistake to
Simon> make.
Is adding the correct breaks/replaces enough to solve things?
I could believe adding a versioned conflicts would be sufficient, but it
is not obvious to me that breaks/replaces is enough given that dpkg
doesn't understand aliasing.
My intuition (and I have not worked through this as much as you) is that
any time files move while both packages are unpacked, problems can arise.
I think that can happen with breaks/replaces but not without a conflicts (without replaces?)
At some point the question becomes: Do we want that complexity inside
dpkg (aka DEP 17 or some variant of it) or outside of dpkg (i.e. what
we're talking about here). It seems clear at this time, that complexity
is unavoidable.
On Thu, Apr 27, 2023 at 12:34:06AM +0200, Helmut Grohne wrote:
At some point the question becomes: Do we want that complexity inside
dpkg (aka DEP 17 or some variant of it) or outside of dpkg (i.e. what
we're talking about here). It seems clear at this time, that complexity
is unavoidable.
My gut feeling is that returning to "dpkg's model is an accurate representation of the file system" will be less complex to manage
long-term. For this to work, the model needs to be able to express
reality, so I guess we can't avoid updating dpkg.
My gut feeling is that we are wasting precious time of numerous
skilled Debian Developers to find ugly workarounds to something that
should be done in dpkg, but isn't being done because one dpkg
maintainer has decided to not go the way the project has decided to
go.
This inability to find consensus, to take decisions, accept and follow
them is one of the most central problems that Debian has.
You might reasonably say that "the maintainer of bar didn't add the
correct Breaks/Replaces on foo" is a RC bug in bar - and it is! - but
judging by the number of "missing Breaks/Replaces" bug reports that have
to be opened by unstable users (sometimes me), it's a very easy mistake
to make.
The origin of this thread was a proposal to adapt dpkg. Your mail
No, Marc is right. The origin of this thread is trying to find
Constitution 2.1.1 is great; however, we don't really have a mechanism to deal with people flat out ignoring Constitution 6 (aka the tech-ctte) and doubting
and actively working against its decisions.
failing
our Code of Conduct
Ok, let's move on. I've proposed diversions as a cure, but in reality diversions are a problem themselves. Consider that
cryptsetup-nuke-password diverts /lib/cryptsetup/askpass, which is
usually owned by cryptsetup. If cryptsetup were to move that file to
/usr, the diversion would not cover it anymore and the actual content of askpass would depend on the unpack order. That's very bad and none of
what I proposed earlier is going to fix this.
And of course, this is not some special example, it's a pattern:
* /lib/udev/rules.d/60-cdrom_id.rules: udev -> amazon-ec2-utils
* /sbin/dhclient: isc-dhcp-client -> isc-dhcp-client-ddns
* /bin/systemd-sysusers: systemd -> opensysusers
* ...
So how do we fix diversions? Let's have a look into the dpkg toolbox
again. I've got an idea. Diversions. What you say? How do you fix
diversions with diversions? Quite obviously, you divert
/usr/bin/dpkg-divert! And whenever dpkg-divert is instructed to add a diversion for a non-canonical path, you forward that call to the real dpkg-divert, but also call it with a canonicalized version such that
both locations are covered. When initially deploying the diversion of /usr/bin/dpkg-divert, we also need to transform existing diversions.
Other than that, things should work after doubling down on diversions.
Sorry, I don't have a test case for this yet.
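To make the idea of a wrapped dpkg-divert a bit more concrete, here is a minimal sketch of the path canonicalization and the duplicated registration. Everything named here is an assumption: the location of the diverted original binary (REAL_DPKG_DIVERT), the list of aliased top-level directories, and the helper names. A real wrapper would have to parse every dpkg-divert option and action and would also need to handle renaming; this only covers one fixed form of --add with --no-rename.

```shell
#!/bin/sh
# Sketch only. REAL_DPKG_DIVERT is a hypothetical location for the diverted
# original binary; set it to "echo" for a dry run.
REAL_DPKG_DIVERT=${REAL_DPKG_DIVERT:-/usr/bin/dpkg-divert.real}

# Map a path below an aliased top-level directory to its /usr counterpart.
# The directory list is an assumption about which aliases exist.
canonicalize() {
    case $1 in
        /bin/*|/sbin/*|/lib/*|/lib32/*|/lib64/*|/libx32/*|/libo32/*)
            printf '/usr%s\n' "$1" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

# Register a diversion for the given path and, if it differs, also for its
# canonical alias, so that both locations are covered.
add_diversion_both() {
    pkg=$1
    path=$2
    target=$3
    $REAL_DPKG_DIVERT --package "$pkg" --no-rename \
        --divert "$target" --add "$path"
    canon=$(canonicalize "$path")
    if [ "$canon" != "$path" ]; then
        $REAL_DPKG_DIVERT --package "$pkg" --no-rename \
            --divert "$(canonicalize "$target")" --add "$canon"
    fi
}
```

With REAL_DPKG_DIVERT=echo, "add_diversion_both zutils /bin/zcat /bin/zcat.gzip" prints the two invocations that would be made, one for /bin/zcat and one for /usr/bin/zcat.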
I still don't have a test case, but I have data. Using
binarycontrol.d.n, I identified packages setting up diversions in
preinst (this seems most common, but dash for instance sets up a
diversion in postinst instead, so there are some false negatives). And
while I initially tried to parse those preinst scripts, solving the
halting problem seemed just too hard, so I opted for just running them.
I'm attaching the relevant scripts and showing the affected diversions:
diversion of /lib/udev/rules.d/60-cdrom_id.rules to /lib/udev/rules.d/60-cdrom_id.rules.disabled by amazon-ec2-utils in stable, testing, unstable
diversion of /sbin/coldreboot to /lib/container/divert/coldreboot.orig by bfh-container in testing, unstable
diversion of /sbin/halt to /lib/container/divert/halt.orig by bfh-container in testing, unstable
diversion of /sbin/poweroff to /lib/container/divert/poweroff.orig by bfh-container in testing, unstable
diversion of /sbin/reboot to /lib/container/divert/reboot.orig by bfh-container in testing, unstable
diversion of /sbin/shutdown to /lib/container/divert/shutdown.orig by bfh-container in testing, unstable
diversion of /lib/cryptsetup/askpass to /lib/cryptsetup/askpass.cryptsetup by cryptsetup-nuke-password in testing, unstable
diversion of /sbin/dhclient to /sbin/dhclient-noddns by isc-dhcp-client-ddns in stable, testing, unstable
diversion of /sbin/coldreboot to /lib/molly-guard/coldreboot by molly-guard in stable, testing, unstable
diversion of /sbin/halt to /lib/molly-guard/halt by molly-guard in stable, testing, unstable
diversion of /sbin/poweroff to /lib/molly-guard/poweroff by molly-guard in stable, testing, unstable
diversion of /sbin/reboot to /lib/molly-guard/reboot by molly-guard in stable, testing, unstable
diversion of /sbin/shutdown to /lib/molly-guard/shutdown by molly-guard in stable, testing, unstable
diversion of /bin/systemd-sysusers to /bin/systemd-sysusers.real by opensysusers in stable, testing, unstable
diversion of /sbin/coldreboot to /lib/open-infrastructure/container/divert/coldreboot.orig by progress-linux-container in stable, testing, unstable
diversion of /sbin/halt to /lib/open-infrastructure/container/divert/halt.orig by progress-linux-container in stable, testing, unstable
diversion of /sbin/poweroff to /lib/open-infrastructure/container/divert/poweroff.orig by progress-linux-container in stable, testing, unstable
diversion of /sbin/reboot to /lib/open-infrastructure/container/divert/reboot.orig by progress-linux-container in stable, testing, unstable
diversion of /sbin/shutdown to /lib/open-infrastructure/container/divert/shutdown.orig by progress-linux-container in stable, testing, unstable
diversion of /bin/zcat to /bin/zcat.gzip by zutils in stable, testing, unstable
diversion of /bin/zcmp to /bin/zcmp.gzip by zutils in stable, testing, unstable
diversion of /bin/zdiff to /bin/zdiff.gzip by zutils in stable, testing, unstable
diversion of /bin/zegrep to /bin/zegrep.gzip by zutils in stable, testing, unstable
diversion of /bin/zfgrep to /bin/zfgrep.gzip by zutils in stable, testing, unstable
diversion of /bin/zgrep to /bin/zgrep.gzip by zutils in stable, testing, unstable
All other diversions affect /etc or /usr and I think we're not going to
move any files from /usr to /. So this is a complete list as of today
and I have to say, I expected it to be longer. In effect, we're talking
about merely 8 packages.
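For anyone who wants to repeat this kind of filtering on a list like the one above, or on the output of "dpkg-divert --list" on an installed system, a small filter along these lines may help. It is only a sketch: the function name and the list of aliased directories are assumptions, and paths containing whitespace are not handled.

```shell
#!/bin/sh
# Sketch only: reads lines of the form
#   diversion of PATH to TARGET by PACKAGE [...]
# on stdin and prints "PACKAGE PATH" for every diverted path below an
# aliased top-level directory. "local diversion ..." lines are ignored.
aliased_diversions() {
    awk '$1 == "diversion" && $2 == "of" && $4 == "to" && $6 == "by" &&
         $3 ~ /^\/(bin|sbin|lib|lib32|lib64|libx32|libo32)\// {
             print $7, $3
         }'
}
```

Something like "dpkg-divert --list | aliased_diversions | cut -d' ' -f1 | sort -u" would then list the diverting packages on the local system.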
For completeness' sake, I also looked at the other packages mentioning dpkg-divert in their preinst to catch false negatives. I'll skip
diversions inside /usr as well as removals of diversions here:
* amazon-ec2-net-utils: diversion inside /etc
* angband: comment about diversions
* arpwatch: comment about diversions
* dash: complex use of conditional diversions via postinst
* dist: comment about diversions
* gpr: conditional diversion (inside /usr)
* iputils-arping: check for an existing diversion
* iputils-clockdiff: check for an existing diversion
* iputils-ping: check for an existing diversion
* ld10k1: comment about diversions
* mailagent: comment about diversions
* oping: checks for an existing diversion
* psgml: comment about diversions
* ucf: comment about diversions
* wireshark-common: checks for an existing diversion
So yeah, with the exception of dash, this looks fairly good. Let me also
dive into dash. Unlike the majority of diverters, it diverts in postinst
rather than preinst to allow controlling /bin/sh via debconf. A similar
technique is used by gpr. In any case, this is special, because
dash diverts its own files, so when moving dash's files, its diversions
can be migrated at the same time. It merely means that we cannot have
debhelper just move files (as that would horribly break dash) and
instead have to move files package by package. We could also
opt for removing dash's diversion in the default case and there has even
been a patch for doing so (#989632) for almost two years. Too bad we didn't
apply it. In any case, as long as the file moving is not forced via
debhelper, dash should be harmless.
With this number, another option is on the table. Rather than divert
dpkg-divert, we could just fix these 8 packages to duplicate their
diversions for /usr and then when moving the underlying files add
versioned Conflicts to the old version of diverters (none of which are
essential). So this is on the order of 15 uploads (8 diverters, 6 diverted
packages, dash).
Luca Boccassi kindly pointed me at config-package-dev though. This is a
tool for generating local packages and it also employs dpkg-divert.
There is a significant risk of breaking this use case. If we were to
divert dpkg-divert and automatically duplicate diversions, this use case
would be covered automatically.
I am unsure how to proceed here and request assistance from the
debathena project to evaluate the situation. If possible, I'd like to
avoid the complexity of wrapping dpkg-divert.
Which part of config-package-dev causes a conflict here? Is it
something that can be fixed? Given it's declarative, an upload + rdeps rebuild should be all that's needed, assuming we know what the issue
is and how to fix it. As far as I can remember, it's a build-time
utility and everything it does is embedded in the target package's
maintainer scripts. But it's been a few years since I last used it, so
I might remember wrongly.
Ok, let's move on. I've proposed diversions as a cure, but in reality diversions are a problem themselves. Consider that
cryptsetup-nuke-password diverts /lib/cryptsetup/askpass, which is
usually owned by cryptsetup. If cryptsetup were to move that file to
/usr, the diversion would not cover it anymore and the actual content of askpass would depend on the unpack order. That's very bad and none of
what I proposed earlier is going to fix this.
So how do we fix diversions? Let's have a look into the dpkg toolbox
again. I've got an idea. Diversions. What you say? How do you fix
diversions with diversions? Quite obviously, you divert
/usr/bin/dpkg-divert! And whenever dpkg-divert is instructed to add a diversion for a non-canonical path, you forward that call to the real dpkg-divert, but also call it with a canonicalized version such that
both locations are covered. When initially deploying the diversion of /usr/bin/dpkg-divert, we also need to transform existing diversions.
Transforming existing diversions: yes, if you can find out about them
without looking at dpkg internal files. It may very well be necessary to update the file format on one of these, and if that would cause your
script to create nonsense diversions, that'd be another thing we'd have
to work around while building real aliasing support.
My current mood is "I'd rather focus on a proper solution, not another
hack that needs to be supported by the proper solution."
Anything we build here that is not aliasing support for dpkg, but
another "shortcut" will delay aliasing support for dpkg because it adds
more possible code paths that all need to be tested.
Keep in mind that we're here because someone took a shortcut, after all.
From my point of view the only reason to try and solve this with a pile of hacks is to get us to a state that the current dpkg can deal well with
Hi Simon,
On Sat, Apr 22, 2023 at 11:41:29AM +0100, Simon McVittie wrote:
You might reasonably say that "the maintainer of bar didn't add the
correct Breaks/Replaces on foo" is a RC bug in bar - and it is! - but judging by the number of "missing Breaks/Replaces" bug reports that have
to be opened by unstable users (sometimes me), it's a very easy mistake
to make.
That number seemed quite vague to me and I wanted to get a better handle
on it. The rough idea here should be that we have some package from
bullseye and "upgrade" it to a different package from bookworm.
Generating useful candidates for this can be done using Contents. Given candidates, I've attached a validation script:
./check_conflicts.sh $OLDPKG bullseye $NEWPKG bookworm
In order to draw value from it, the output must be parsed. The exit code
can be non-zero for various reasons. As for candidate generation, I
think one can either just try them all (which takes a bit longer on the validation phase) or reduce their number by ignoring existing Breaks+Replaces, but I haven't found an elegant solution for the latter
yet.
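The attached validation script is not reproduced here. As for the "try them all" flavour of candidate generation from Contents, a rough sketch could look like the following. It assumes two uncompressed Contents files in the usual "PATH  SECTION/PKG[,SECTION/PKG...]" layout and makes no attempt to ignore existing Breaks+Replaces; the function name is made up.

```shell
#!/bin/sh
# Sketch only. usage: candidates OLD_CONTENTS NEW_CONTENTS
# Prints "OLDPKG NEWPKG" for every path shipped by OLDPKG in the old
# release and by a different NEWPKG in the new one.
candidates() {
    awk '
        # Strip the trailing package list to recover the path.
        function path_of(line) {
            sub(/[ \t]+[^ \t]+$/, "", line)
            return line
        }
        # Turn "sec/a,sec/b" into the set { a, b }.
        function pkgset(field, out,    cnt, i, parts) {
            split("", out)
            cnt = split(field, parts, ",")
            for (i = 1; i <= cnt; i++) {
                sub(/^.*\//, "", parts[i])
                out[parts[i]] = 1
            }
        }
        # First file: remember the owners of each path.
        NR == FNR { old[path_of($0)] = $NF; next }
        # Second file: pair up differing owners of the same path.
        {
            p = path_of($0)
            if (!(p in old)) next
            pkgset(old[p], oldset)
            pkgset($NF, newset)
            for (a in oldset)
                for (b in newset)
                    if (a != b) print a, b
        }
    ' "$1" "$2" | sort -u
}
```

Each output line could then be fed to the validation step, e.g. as "./check_conflicts.sh $OLDPKG bullseye $NEWPKG bookworm".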
In any case, unstable has around:
* 5700 Breaks
* 6500 Replaces
* 100 unpack errors due to missing Breaks+Replaces
That latter number has just been turned into rc bugs...
To make sure we don't miss any packages out accidentally: could you
confirm that those hundred-or-so errors occurred from 27 or so
distinct packages?
(looking at RC bugs created within the past week, I currently find 27
bugs with 'Breaks+Replaces' in the title)
https://udd.debian.org/bugs.cgi?release=na&merged=ign&keypackages=only&fnewer=only&fnewerval=7&flastmodval=7&rc=1&cpopcon=1&chints=1&ckeypackage=1&ctags=1&cdeferred=1&caffected=1&sortby=last_modified&sorto=asc&format=html#results
Hi James,
* James Addison <jay@jp-hosting.net> [2023-04-28 14:54]:
To make sure we don't miss any packages out accidentally: could you
confirm that those hundred-or-so errors occurred from 27 or so
distinct packages?
(looking at RC bugs created within the past week, I currently find 27
bugs with 'Breaks+Replaces' in the title)
https://udd.debian.org/bugs.cgi?release=na&merged=ign&keypackages=only&fnewer=only&fnewerval=7&flastmodval=7&rc=1&cpopcon=1&chints=1&ckeypackage=1&ctags=1&cdeferred=1&caffected=1&sortby=last_modified&sorto=asc&format=html#results
That's only key packages. Here is the full list:
https://bugs.debian.org/cgi-bin/pkgreport.cgi?dist=unstable;include=subject%3Amissing+Breaks%2BReplaces;submitter=helmut%40subdivi.de
Cheers Jochen
On Thu, Apr 27, 2023 at 10:58:46AM +0200, Marc Haber wrote:
My gut feeling is that we are wasting precious time of numerous
skilled Debian Developers to find ugly workarounds to something that
should be done in dpkg, but isn't being done because one dpkg maintainer
has decided to not go the way the project has decided to go.
I find this mail of yours very disappointing and possibly even failing
our Code of Conduct on multiple counts.
I have a bad feeling about this. I think some dpkg maintainer warned us
that diversions would break. Let's peek at his list again. He also said update-alternatives would be broken. I admit not having dug into this
yet, but my gut feeling already is that update-alternatives will become "funny" as well though I guess we cannot fix update-alternatives by
adding alternatives.
On Fri, 28 Apr 2023 at 09:09, Helmut Grohne <helmut@subdivi.de> wrote:
So yeah, with the exception of dash, this looks fairly good. Let me also
dive into dash. Unlike the majority of diverters, it diverts in postinst
rather than preinst to allow controlling /bin/sh via debconf. A similar
technique is used by gpr. In any case, this is special, because
dash diverts its own files, so when moving dash's files, its diversions
can be migrated at the same time. It merely means that we cannot have
debhelper just move files (as that would horribly break dash) and
instead have to move files package by package. We could also
opt for removing dash's diversion in the default case and there has even
been a patch for doing so (#989632) for almost two years. Too bad we didn't
apply it. In any case, as long as the file moving is not forced via
debhelper, dash should be harmless.
If I understand correctly, by "forced via debhelper" you mean the
proposal of fixing the paths at build time, right? But if not via
that, it means having to fix all of them by hand, which is a lot - is
it possible to fix dash instead? Or else, we could add an opt-out via
one of the usual dh mechanisms, and use it only in dash perhaps?
I think we have a misunderstanding here. As far as I understand it, the
core idea of Luca's approach is that we move all files to their
canonical locations and then - when nothing is left in directories such
as /bin or /lib - there is no aliasing anymore, which is why we do not
have to teach dpkg about aliasing and never patch it.
My understanding from following this thread (and others) is that dpkg
has a bug that can easily be triggered by a sysadmin replacing a
directory with a symlink (and some other necessary conditions that don't happen very often), which is explicitly allowed by policy. This bug is
the one that is causing the problem with the approach that was chosen by
the people implementing usrmerge, even though they were aware of this
problem and a different approach that would have taken two release
cycles and would not have triggered this bug was considered and
rejected.
If this is correct, then Luca's approach may fix the problem for
usrmerge, but does not fix the general dpkg bug. (And, IIUC, is going
to take two _more_ release cycles to fix the problems with usrmerge as implemented! Hmm...)
The --add-alias solution that has been suggested in this thread seems
like it would fix the general problem iff policy was changed to require sysadmins to use it if they replaced a directory with a symlink.
I do not understand why the dpkg maintainer has rejected this solution;
it would still be a fix for the general bug after the usrmerge
transition has completed. And it would be at least one order of
magnitude more performant than scanning the filesystem for directory symlinks.
This and /bin/sh is the kind of files I'd consider important. And then
upon thinking further it became more and more difficult for me to make sense of the objection. On a merged system, we can just move that file
to its canonical location without having any trouble even with an unmodified dpkg. So from my pov, the question about important files can
be disregarded. I hope Simon Richter agrees.
Yes, the relevant code at
https://github.com/guillemj/dpkg/blob/main/src/main/unpack.c#L749
already handles moving a file inside the same package, and that has
existed for some time, that's why I use two packages for the PoC.
We don't want to stat all the files in all packages but we could do better:
if we are about to remove an old file that is available through a
symlinked directory, we could check the new name of the file and see if
it's available in some package... and if yes just forget the file without
removing it.
This file removal is the reason for the moratorium, and incurring some extra
cost in some specific cases (installation through directory symlinks, which
is not the default case and would not affect us after the migration is
complete) certainly seems fair.
After Bookworm ships I plan to propose a policy change to the CTTE and
policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
% that does not use dh and a piuparts test to stop migration for
anything that is uploaded and doesn't comply. That should bring the
matter to an end, without needing to modify dpkg.
I noticed that the number of packages shipping non-canonical files is relatively small. It's less than 2000 binary packages in unstable and
their total size is about 2GB. So I looked into binary-patching them and attach the resulting scripts.
These are the problems we know about now, but it likely is not an exhaustive
list. This list was mostly guided by Guillem's intuition of what could
break at https://wiki.debian.org/Teams/Dpkg/MergedUsr and I have to say
that his intuition was quite precise thus far. Notably missing in the investigation are statoverrides. However, we should also look for a more generic approach that tries capturing unexpected breakage.
"Helmut" == Helmut Grohne <helmut@subdivi.de> writes:
I think there is a caveat (whose severity I am unsure about): In order
to rely on this (and on DEP 17), we will likely have versioned
Pre-Depends on dpkg. Can we reasonably rule out the case where an old
dpkg is running, unpacking a fixed dpkg, configuring the fixed dpkg and
then unpacking an affected package still running the unfixed dpkg
process?
I think the file loss problem is one sufficient reason to have the moratorium. We didn't need other reasons once we knew this one. Now that
we look into dropping the moratorium, we need to ensure that there are
no reasons anymore and we learned that diversions are affected in a non-trivial way. So even if we were to fix just the file loss problem,
the diversion problems would still be sufficient reason to keep the moratorium unless they were also fixed by the approach. Here you need
both directions a) diverting a non-canonical location would have to
divert a canonical file and b) diverting a canonical location would have
to divert a non-canonical file. This is breaking the initial assumption.
In any case, this train of thought is definitely widening the solution
space. Thank you very much.
I don't know APT well enough to answer that question but from my point of view it's perfectly acceptable to document in the release notes that you
need to upgrade dpkg first.
Are you sure that we need anything for diversions except some documented policy on how to deal with it?
AFAIK the following sequence performs no filesystem changes and should
be sufficient to move a diversion to its new location (I only consider the case of an upgrade, not of a new installation that should just work "normally" on the new location):
dpkg-divert --package $package --remove /bin/foo --no-rename
dpkg-divert --package $package --add /usr/bin/foo --divert /usr/bin/foo.diverted --no-rename
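Wrapped into a helper for a maintainer script, the two commands above might be used like this. It is a sketch under assumptions: the function name and the DPKG_DIVERT override are hypothetical, and the guard via "dpkg-divert --listpackage" is an addition so that re-running the script does nothing once the old diversion is gone.

```shell
#!/bin/sh
# Sketch only. DPKG_DIVERT can be overridden for testing.
DPKG_DIVERT=${DPKG_DIVERT:-dpkg-divert}

# usage: migrate_diversion PACKAGE OLD_PATH NEW_PATH NEW_TARGET
migrate_diversion() {
    pkg=$1
    old_path=$2
    new_path=$3
    new_target=$4
    # Only act if this package still holds the old diversion, so that
    # running the script a second time is harmless.
    owner=$($DPKG_DIVERT --listpackage "$old_path")
    [ "$owner" = "$pkg" ] || return 0
    # Neither call touches the filesystem because of --no-rename.
    $DPKG_DIVERT --package "$pkg" --remove "$old_path" --no-rename
    $DPKG_DIVERT --package "$pkg" --add "$new_path" \
        --divert "$new_target" --no-rename
}
```

For the example above this would be called as "migrate_diversion $package /bin/foo /usr/bin/foo /usr/bin/foo.diverted".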
The case of update-alternatives is likely more tricky. You already looked into it. That's a place where it will be harder to get things right
without some changes.
On Tue, 02 May 2023, Helmut Grohne wrote:
I think there is a caveat (whose severity I am unsure about): In order
to rely on this (and on DEP 17), we will likely have versioned
Pre-Depends on dpkg. Can we reasonably rule out the case where an old
dpkg is running, unpacking a fixed dpkg, configuring the fixed dpkg and then unpacking an affected package still running the unfixed dpkg
process?
I don't know APT well enough to answer that question but from my point of view it's perfectly acceptable to document in the release notes that you
need to upgrade dpkg first.
What still applies here is that we can have usr-is-merged divert /usr/bin/dpkg-divert and have it automatically duplicate any possibly
aliased diversion and then the diverter may Pre-Depends: usr-is-merged (>=...) to have its diversions duplicated. Of course, doing so will make usr-is-merged very hard to remove, but we have experience here from multiarch-support.
And of course, we can always draw the diversion card and have
usr-is-merged divert /usr/bin/update-alternatives to have it
canonicalize paths as required to be able to migrate alternatives in a
sane way (from a consumer point of view).
For aliasing support in dpkg, that means we need a safe policy of dealing with diversions that conflict through aliasing that isn't "reject with error", because the magic dpkg-divert would always generate conflicts.
From my point of view, the ultimate goal here should be moving all files to their canonical location and thereby make aliasing effects
then a package containing /bin/foo and a package containing /usr/bin/foo now have a file conflict in dpkg. Not sure if that is a problem, or exactly the
behaviour we want. Probably the latter, which would allow us to define a policy "if aliased paths are diverted, the diversion needs to match", which in turn would allow the conflict checker during alias registration to verify that the aliased diversions are not in conflict.
The diverted dpkg-divert would probably generate extra register/unregister calls as soon as dpkg-divert itself is aliasing aware, but all that does is generate warning messages about existing diversions being added again, or nonexistent diversions being deleted -- these are harmless anyway, because maintainer scripts are supposed to be idempotent, and dpkg-divert supports that by not requiring scripts to check before they register/unregister.
We get to draw this card exactly once, and any package that would need the same diversion would need to conflict with usr-is-merged, which would make
it basically uninstallable.
- it is not an error to register a diversion for an alias of an
existing diversion, provided the package and target match; this is a
no-op
- it is not an error to unregister a diversion for an alias of a path
that has been unregistered previously, that is a no-op as well
How do you distinguish between aliased diversions and "real" ones?
My proposal would be to put the onus on the client registering the
diversion:
[...]
- packages are encouraged to register both diversions
On 2023-05-05 Simon Richter <sjr@debian.org> wrote:
[...]
My proposal would be to put the onus on the client registering the diversion:
[...]
- packages are encouraged to register both diversions
Hello,
That seems to be a rather ugly user interface ("There is dpkg-divert on
Debian, but because of usrmerge you need to invoke it twice to be
sure"). Will we need to have a meta-transition years from now trying to
get rid of the double diversions?
In practice, I suspect that out of ~2000 packages shipping bin/ sbin/
lib*/, only a small fraction would end up needing to further move
files out to other packages
On Fri, 05 May 2023 at 23:11:54 +0100, Luca Boccassi wrote:
In practice, I suspect that out of ~2000 packages shipping bin/ sbin/ lib*/, only a small fraction would end up needing to further move
files out to other packages
I think the most common case for this is likely to be systemd system
units, which are currently in /lib/systemd and are sometimes moved between binary packages. Splitting out dbus-system-bus-common during the bookworm release cycle is a recent example, but it also seems reasonably common
to move systemd units around as part of having a distinction between
"the ready-made system service" and "the binary you can run by hand"
(see apache2 vs. apache2-bin).
udev rules are in a similar situation: consumed by an important system service, but shipped by any random package that wants to adjust its behaviour.
Two things about systemd units make them a relatively difficult case
for distro-wide changes like this:
For /bin, /sbin and to a lesser extent /lib/TUPLE, we can often assume
that only "core" packages (whose maintainers should be paying attention to subtleties like this) are affected by any change, because the historical definition of those directories was "just enough to boot the system or
do disaster recovery", minimizing what should be there; but the number
of packages that touch /lib/systemd is rather large, and a lot of them
are leaf or near-leaf packages.
Also, they're managed (and sometimes installed) by debhelper, which
needs to be able to do the right thing relatively automatically. For
example, if maintainers need to be able to take some special action at
the point at which their /lib/systemd units move to /usr/lib/systemd,
then I think debhelper installing into /usr/lib/systemd would have to
be gated by a compat level change.
Hi,
On 06.05.23 07:11, Luca Boccassi wrote:
- every package is forcefully canonicalized as soon as trixie is open
for business
You will also need to ship at least
- /lib -> usr/lib (on 32 bit)
- /lib64 -> usr/lib64 (on 64 bit)
as a symlink either in the libc-bin package or any other Essential
package, to fulfill the requirement that unpacked Essential packages are operational.
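For illustration, the effect of shipping those symlinks can be sketched as follows. This only shows what has to exist in the target root before anything in it can run; the function name and the default list of names are assumptions, and which links are needed depends on the architecture as noted above.

```python
import os


def ensure_alias_symlinks(root, names=("bin", "sbin", "lib", "lib64")):
    """Create relative NAME -> usr/NAME symlinks below root.

    Existing entries (directories or symlinks) are left untouched.
    Returns the names for which a symlink was created.
    """
    created = []
    for name in names:
        link = os.path.join(root, name)
        if os.path.lexists(link):
            continue  # never replace something that is already there
        os.makedirs(os.path.join(root, "usr", name), exist_ok=True)
        # Relative target, so the link stays valid when root is moved.
        os.symlink(os.path.join("usr", name), link)
        created.append(name)
    return created
```

Calling it a second time on the same root creates nothing, so it is safe to run regardless of whether a package has already provided the links.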
Hi,
On 06.05.23 21:28, Luca Boccassi wrote:
[shipping usrmerge symlinks in packages]
In the far future I'd like for these details to be owned by image builders/launchers/setup processes rather than a package, but this can
be discussed separately and independently, no need to be tied to this effort.
Ideally I'd like to have this information in a single package rather
than distributed over ten different tools, especially as this is also
release and platform dependent.
If possible, I'd like to go back to the gold standard of
- download Essential: yes packages and their dependencies
- unpack them using dpkg --fsys-tarfile | tar -x
- install over this directory with dpkg --root=... -i *.deb
to get something that works as a container. Right now, that only works
if I remove "init" from the package list, which is fair since a
container doesn't need an init system anyway.
The less an image builder needs to deviate from this approach, the
better for our users.
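The steps above can be sketched as a dry-run plan. This is a sketch under assumptions: the packages are taken to be downloaded already, the function name is made up, and `dpkg-deb --fsys-tarfile` is spelled out as the backend that performs the extraction. Nothing is executed; the function only returns the shell commands.

```python
import shlex


def bootstrap_plan(root, debs):
    """Return the shell commands for the unpack-then-install bootstrap."""
    q = shlex.quote
    # Step 2: unpack every package's filesystem tarball into the root.
    plan = [
        f"dpkg-deb --fsys-tarfile {q(deb)} | tar -x -f - -C {q(root)}"
        for deb in debs
    ]
    # Step 3: install all packages over the unpacked tree.
    plan.append(f"dpkg --root={q(root)} -i " + " ".join(q(d) for d in debs))
    return plan
```

Any extra command an image builder has to add to this plan, such as creating the aliasing symlinks by hand, is a deviation from the approach described above.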
To have a working system you need several more steps that are
performed by the instantiator/image builder, such as providing working
and populated proc/sys/dev, writable tmp/var, possibly etc. And it
needs to be instantiated with user/password/ssh certs/locale/timezone.
And if it needs to be bootable on baremetal/vm, it needs an ESP. And
then if you have an ESP and want to run in a VM with SB, you'll need self-enrolling certs on first use or ensuring the 3rd party CA is provisioned. And then...
You get the point. Going from a bunch of packages to a running system necessarily has many steps in between, some that are already done and
taken for granted, for example when you say "works as a container" I'm
pretty sure the "container" engine is taking care of at the very least proc/dev/sys for you, and it's just expected to work. bin -> usr/bin,
sbin -> usr/sbin and lib -> usr/lib should get the same treatment: if
they are not there, the invoked engine should prepare them. systemd
and nspawn have been able to do this for a while now.
Not having those hard coded means that the use case of / on a tmpfs
with the rest instantiated on the fly, assembled with the vendor's
/usr and /etc trees, becomes possible, which is neat. And said trees
can pass the checksum/full integrity muster.
Hi Luca,
On Sat, May 06, 2023 at 04:52:30PM +0100, Luca Boccassi wrote:
To have a working system you need several more steps that are
performed by the instantiator/image builder, such as providing working
and populated proc/sys/dev, writable tmp/var, possibly etc. And it
needs to be instantiated with user/password/ssh certs/locale/timezone.
And if it needs to be bootable on baremetal/vm, it needs an ESP. And
then if you have an ESP and want to run in a VM with SB, you'll need self-enrolling certs on first use or ensuring the 3rd party CA is provisioned. And then...
You paint it this way, but it really used to just work until we got the /usr-merge. Indeed, debvm creates virtual machine images effectively by bootstrapping a filesystem from packages and turning the resulting tree
into a file system image.
* /proc, /sys, /dev are mounted by systemd. All you need to do here is
create the directories and base-files does so.
* /tmp is shipped by base-files.
* user and password creation is not handled yet, but can be handled by
something similar to systemd-firstboot.
* Not sure what you mean with certs, locale and timezone. You can just
install ca-certificates, locales and tzdata as part of the bootstrap.
* The bootloader part for baremetal is kinda out of scope for
bootstrap, which is why debvm side-steps this. You can also skip it
for containers and build chroots. So it is one out of multiple use
cases that needs extra work here.
In a good chunk of situations, you can just get by without messing
around. Well, that is, until we broke it via usr-is-merged. I concur with
Simon Richter that restoring this property is a primary concern.
You get the point. Going from a bunch of packages to a running system necessarily has many steps in between, some that are already done and
taken for granted, for example when you say "works as a container" I'm pretty sure the "container" engine is taking care of at the very least proc/dev/sys for you, and it's just expected to work. bin -> usr/bin,
sbin -> usr/sbin and lib -> usr/lib should get the same treatment: if
they are not there, the invoked engine should prepare them. systemd
and nspawn have been able to do this for a while now.
No, this misses the point. You can configure essential in a very limited environment. However, you cannot do so without the lib or lib64 symlink (depending on the architecture) and the bin symlink. This is so
critical, that it cannot be deferred to some external entity. It must be
part of the bootstrap protocol. There are some suggested ways to fix
this (such as adding separate bootstrap scripts next to maintainer
scripts), but nothing implemented.
Not having those hard coded means that the use case of / on a tmpfs
with the rest instantiated on the fly, assembled with the vendor's
/usr and /etc trees, becomes possible, which is neat. And said trees
can pass the checksum/full integrity muster.
It's neat that you can solve your use case by breaking other people's
use cases. This is not constructive interaction however. This kind of behaviour is precisely what caused so much conflict around the
/usr-merge. What if I gave a shit for your use case? Denying the
/usr-merge and just continuing unmerged as long as possible (as merging
would break my use case) would be my strategy of choice. You can make a difference here by starting to recognize other people's use cases and proposing solutions in that merged world. And no, it's not "add duct
tape to every bootstrap tool".
So I really want to see a solution for the bootstrap protocol before
moving the dynamic linker and /bin/sh to its canonical location. The
current bootstrap protocol is kept on life-support by installing the
usrmerge package by default. Dropping usrmerge from the
init-system-helpers dependency as first alternative or moving the
dynamic linker would break it. If I had a solution in mind, I'd
definitely post it right here, but unfortunately I have not.
Helmut