• System-critical package management

    From Peter Warrington@21:1/5 to All on Wed Sep 6 21:10:01 2023
    The lack of any system of recognition for packages that are critical to system operation impedes the reliability of Debian-based systems. For example, a reboot during a background upgrade of critical system packages, unbeknownst to the
    user, may leave the system unable to boot as expected, with little readily available feedback to the user as to the cause.

    Other operating systems like Windows and macOS manage this by updating system-critical components separately from user-land during shutdown, while giving the user clear feedback that critical updates are taking place and that, for example, the system should
    not be turned off.

    The way in which dpkg deals with packages is preferable in many ways, as upgrades are almost entirely made in standard user-land and are largely transparent (for example, an upgrade will not automatically begin during shutdown without any indication to
    the user that this will take place). It also of course means that Debian systems are highly configurable.

    A potential middle-ground solution to this is to allow packages to be marked as "system-critical" to dpkg by external system components. For example, a standard desktop Ubuntu system might mark the GNOME Display Manager, networking drivers, and others in
    this way during installation. These system-critical packages could then be protected by dpkg in the following ways:
    - They are automatically reverted to a known good state on upgrade failure (e.g. previous version)
    - They cannot be removed without being unmarked as "system-critical"
    - The system could check during every shutdown that system-critical packages are in a consistent state, reverting to a known good state if not

    I am interested in knowing the community's thoughts on this, and whether these ideas have any merit.

    - Peter Warrington

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter Warrington@21:1/5 to All on Wed Sep 6 22:30:01 2023
    Debian supports XB- tags in control files that are preserved after installation.

    I am not familiar with these, where can I find out more about them?

    What is mission critical would vary depending on the system ... it would be inappropriate for an upstream package to try to decide if it was mission critical or not.

    The suggestion is that packages would be marked as system-critical by the superuser or by processes running as the superuser, for example via a config file, and not by the packages themselves. Therefore a desktop Debian distribution might automatically configure the
    display manager to be a system-critical package, but a server distribution might not.

    - Peter Warrington

    From: "Weatherby,Gerard" <gweatherby@uchc.edu>
    Date: Wednesday, 6 September 2023 at 20:45
    To: Peter Warrington <sothisispeter@gmail.com>, "debian-dpkg@lists.debian.org" <debian-dpkg@lists.debian.org>
    Subject: Re: System-critical package management

    We use Debian packages to manage our own software (out of a private, non-compliant repository).

    Debian supports XB- tags in control files that are preserved after installation. We use these for multiple reasons. There is nothing stopping a distribution (e.g., Ubuntu) from tagging their packages if they wish.

    What is mission critical would vary depending on the system: is networking critical? I sometimes boot virtual machines without networking, because I can access the console via the hypervisor. So it would be inappropriate for an upstream package to try to
    decide if it was mission critical or not.
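
    As a rough illustration of the XB- tags mentioned above: in a source package's debian/control, a field written with an X prefix followed by the letter B is exported to the binary package's control data with the prefix stripped. The sketch below only mimics that renaming on a text stanza. The package name and the `System-Critical` field are hypothetical, chosen to match the proposal in this thread, and are not existing Debian fields.

```python
import re

# User-defined fields look like "X<letters>-Name", where the letters are drawn
# from S, B and C. A "B" means the field is exported to the binary package.
_USER_FIELD = re.compile(r"^X([SBC]*)-(.+)$")


def binary_fields(source_stanza: str) -> dict:
    """Return the user-defined fields a binary package would carry."""
    exported = {}
    for line in source_stanza.splitlines():
        if not line or line[0] in " \t":
            continue  # skip blank lines and continuation lines
        name, sep, value = line.partition(":")
        if not sep:
            continue
        match = _USER_FIELD.match(name.strip())
        if match and "B" in match.group(1):
            exported[match.group(2)] = value.strip()
    return exported


# Hypothetical stanza: neither the package nor the fields exist in Debian.
EXAMPLE = """\
Package: example-display-manager
Architecture: any
XB-System-Critical: yes
XS-Source-Only-Note: ignored for binaries
"""

if __name__ == "__main__":
    print(binary_fields(EXAMPLE))
```

    Once such a package is installed, the exported field should be readable like any other status field, for example with a `dpkg-query -W -f` format string that names it.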
     
  • From Simon Richter@21:1/5 to All on Thu Sep 7 05:10:02 2023
    Hello,

    The lack of any system of recognition for packages that are critical to system operation impedes the reliability of Debian-based systems. For example, a reboot during a background upgrade of critical system packages, unbeknownst to the
    user, may leave the system unable to boot as expected, with little readily available feedback to the user as to the cause.

    Locking out reboots while the package manager is active is a policy that
    needs to be provided by the policy layer that allows ordinary users to
    reboot -- so this is the responsibility of the desktop environment.

    The base system and package manager require superuser privileges for
    both reboot and invoking the package manager. For single-user systems,
    it is the responsibility of the administrator to not issue a reboot
    command while a package upgrade is in progress, which is not an onerous requirement because the package upgrade must be manually commanded as well.

    Packages are often installed in environments where no control over
    reboots is possible and where system services usually found on desktops
    are unavailable, such as inside containers during preparation of
    container images.

    There is no appropriate place to implement such a lockout at a low
    level. The kernel is informed of the intention to reboot only after
    system shutdown is complete, so this is the wrong place, and above that,
    users have a choice of different policy layers that fit their use case
    best, including "none".

    But: because background updates on desktop systems are implemented as a
    system service that is run through a policy layer, it is possible to
    implement such a lockout on this layer.

    Other operating systems like Windows and macOS manage this by updating system-critical components separately from user-land during shutdown, while giving the user clear feedback that critical updates are taking place and that, for example, the system
    should not be turned off.

    No, these systems make no distinction between system and user
    components. The reason upgrades are performed through a reboot is a
    historical shortcoming in the file system implementation: Unix separates
    the contents of a file from its file name, so if a file is open, its
    name can still be changed or removed, while the file contents are kept
    until no more names point to it *and* no more open file handles exist.
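
    The Unix behaviour described above can be observed with a short sketch. It assumes a POSIX system (on Windows the unlink would normally be refused while the file is open), and the function name is made up for illustration.

```python
import os
import tempfile


def read_after_unlink(payload: bytes):
    """Remove a file's name while it is open, then read through the handle."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as writer:
        writer.write(payload)

    with open(path, "rb") as handle:
        os.unlink(path)                      # the name goes away...
        name_gone = not os.path.exists(path)
        data = handle.read()                 # ...but the contents are still readable
    return name_gone, data


if __name__ == "__main__":
    print(read_after_unlink(b"old library version"))
```

    This is why a running program can keep using the old copy of a file while the package manager puts a new file under the same name.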

    On Windows, open files cannot be renamed or deleted unless the program
    has specifically allowed this, which (for historical reasons) few
    programs do, so the upgrade process works by unpacking the new files to
    a temporary name, making a note to rename the files, then rebooting and performing the rename while no users are logged on and no services are
    running, and then subsequently starting the system.

    This process is the same even for user programs, so if you update WinRAR
    while it is open (so the file cannot be updated), the installation
    process will ask for a reboot to complete the upgrade.

    A potential middle-ground solution to this is to allow packages to be marked as "system-critical" to dpkg by external system components. For example, a standard desktop Ubuntu system might mark the GNOME Display Manager, networking drivers, and others
    in this way during installation. These system-critical packages could then be protected by dpkg in the following ways:

    - They are automatically reverted to a known good state on upgrade failure (e.g. previous version)

    Generally, packages are expected to go from one functional state to
    another in a very quick operation after verifying that the operation can
    be performed.

    For example, grub installed into the MBR will check that all components
    are present, prepare the image to be written in memory, and only in the
    last step, write the first and second stage bootloaders in one go. Any
    failure at this stage would be "hardware error", which would also apply
    to the old version, and until that point, the old version would still work.

    It is much more likely for a package to indicate success and
    subsequently fail on reboot because of a missing check, but this is not something the package manager can help with.

    What already exists is automatic revert if a package fails to unpack
    because of an I/O error (or the disk being full).
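
    The general pattern behind "the old version keeps working until the last step" can be sketched as follows: write the new contents under a temporary name, and only rename over the old file once the write has succeeded. This is an illustration of the technique, not dpkg's actual code. The `.sketch-new` suffix and the `simulate_failure` switch are invented for the example.

```python
import os


def install_file(path: str, data: bytes, simulate_failure: bool = False) -> bool:
    """Replace `path` with `data`, leaving the old file intact on failure."""
    staged = path + ".sketch-new"  # hypothetical staging name
    try:
        with open(staged, "wb") as out:
            out.write(data)
            if simulate_failure:
                raise OSError("simulated I/O error while unpacking")
            out.flush()
            os.fsync(out.fileno())  # make sure the data is on disk first
        os.replace(staged, path)    # atomic rename on POSIX file systems
        return True
    except OSError:
        # Revert: drop the partial file; the old version was never touched.
        if os.path.exists(staged):
            os.unlink(staged)
        return False
```

    A failed call leaves the previous contents of `path` in place, which is the "automatic revert" case. It cannot help with a package that installs cleanly and then fails on the next boot.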

    - They cannot be removed without being unmarked as "system-critical"

    We have "Essential: yes", which dpkg protects, and "Protected: yes",
    which is protected by apt. The latter category is what bootloaders fall
    in (it also helps that the main author for apt is also a grub maintainer).

    The dpkg program will allow you to remove the bootloader, because that
    is what allows changing bootloaders easily. The "Essential" set is
    basically just what is required for dpkg to function -- so dpkg cannot
    self-destruct.
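
    A minimal sketch of how a tool could tell these two markers apart when reading a package's control stanza. The mapping of each field to the layer that guards it follows the description above. The stanzas and package names are hypothetical.

```python
def parse_stanza(text: str) -> dict:
    """Parse simple 'Name: value' lines, ignoring continuation lines."""
    fields = {}
    for line in text.splitlines():
        if not line or line[0] in " \t":
            continue
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip().lower()] = value.strip()
    return fields


def removal_guard(text: str) -> str:
    """Return which layer, per the description above, guards removal."""
    fields = parse_stanza(text)
    if fields.get("essential", "").lower() == "yes":
        return "dpkg"
    if fields.get("protected", "").lower() == "yes":
        return "apt"
    return "none"


if __name__ == "__main__":
    # Hypothetical stanzas for illustration only.
    print(removal_guard("Package: example-core\nEssential: yes\n"))
    print(removal_guard("Package: example-bootloader\nProtected: yes\n"))
    print(removal_guard("Package: example-editor\n"))
```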

    - The system could check during every shutdown that system-critical packages are in a consistent state, reverting to a known good state if not

    Again, this would need to be inside the policy layer that defines
    "shutdown" -- there are many of those, and most of them are outside the
    Debian system (e.g. if you run Debian in a container under Kubernetes,
    then Kubernetes is the policy layer that would be responsible for that).

    On desktop systems, systemd is the appropriate policy layer to decide
    about reboots, and (if I remember correctly) packagekit is the policy
    layer that invokes dpkg, so packagekit would need to inhibit reboots
    while it is working, and it can do so easily because it can assume
    systemd to be present and running.
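
    One way a front end could hold off reboots while an upgrade runs is the `systemd-inhibit` command, which takes an inhibitor lock for the lifetime of the command it wraps. The sketch below only builds and runs such a command line. It assumes systemd-logind is running, and the `who` label and the helper names are made up. It is not how packagekit is actually implemented.

```python
import subprocess


def inhibit_command(command, why, who="upgrade-frontend-sketch"):
    """Build an argv that runs `command` under a shutdown inhibitor lock."""
    return [
        "systemd-inhibit",
        "--what=shutdown",   # covers reboot and power-off requests
        f"--who={who}",      # label shown when listing active inhibitors
        f"--why={why}",      # human-readable reason for the lock
        "--mode=block",      # block the request rather than only delaying it
        *command,
    ]


def run_inhibited(command, why):
    """Run `command` while the lock is held; return its exit status."""
    return subprocess.run(inhibit_command(command, why), check=False).returncode


if __name__ == "__main__":
    print(inhibit_command(["apt-get", "-y", "upgrade"], "Package upgrade in progress"))
```

    The lock is released as soon as the wrapped command exits. Telling the user why a reboot was refused remains a separate problem.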

    I am interested in knowing the community's thoughts on this, and whether these ideas have any merit.

    On the lower levels, what can be reasonably implemented already is. The
    lockout you describe belongs in the desktop system, but it would
    require new UI to be developed to be useful -- rejecting the reboot is
    easy, but indicating to the user why the reboot was rejected or
    disabling the option requires a new communication channel, and without
    that functionality, the user experience would be "I tried to reboot and
    it didn't do anything."

    Breaking the layer separation would be a horribly complicated mess --
    adding new low level errors means adding appropriate error handlers to
    all intermediate layers until the error can bubble up to the user. This
    is something component systems have historically struggled with -- every
    time Windows displays some "error code c0312313" type dialog, this is a
    missing handler chain.

    Simon
