• Architecture variants for Debian / Ubuntu

    From Michael Hudson-Doyle@21:1/5 to All on Thu Aug 31 23:10:01 2023
    Hi,

    Recently the topic of exploiting newer instructions without dropping
    support for older machines has come up several times inside Ubuntu
    engineering. I understand this topic has come up several times in the past
    for Debian as well, but nothing has really come of it to date.

    I've spent a while thinking through the options and coming up with a design
    and wrote some notes into a wiki page: https://wiki.debian.org/ArchitectureVariants

    In terms of building consensus around this design, I thought it makes sense
    to start at the bottom of the stack and so here I am on this mailing list
    :-) I guess in due course this could become a DEP, and would certainly need
    to be discussed on debian-devel before getting too far.

    What do you think? Have I missed any glaring implications? Is there a
    better way of doing this?

    Cheers,
    mwh


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Guillem Jover@21:1/5 to Michael Hudson-Doyle on Wed Sep 6 11:30:01 2023
    Hi!

    On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
    > Recently the topic of exploiting newer instructions without dropping
    > support for older machines has come up several times inside Ubuntu
    > engineering. I understand this topic has come up several times in the past
    > for Debian as well, but nothing has really come of it to date.

    I also had a chat about this with Matthias Klose (CCed) around 2022-05.

    > I've spent a while thinking through the options and coming up with a design
    > and wrote some notes into a wiki page: https://wiki.debian.org/ArchitectureVariants

    I think we are already doing 1, 2 and 3. I agree 4 is just wrong. And
    something like 5 is what I suggested to Matthias for Ubuntu when we
    last discussed it as the best way to go about this.

    I'm not sure I entirely agree with the requirements you set forth
    though:

    - I think such optimized builds might need to be done with "special
      toolchains" (these could simply be wrappers over the host compiler
      passing the appropriate flags via command-line or via specs or
      similar, not necessarily full blown toolchains), passing these via
      something like dpkg-buildflags seems currently unreliable, as I don't
      think we have full coverage in packages (neither for all compilers
      available)? Although it would be better as it would centralize the
      management. (For reference this is in part how rpm handles this:
      https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)
    - Perhaps that's a limitation from the archive software side, but
      requiring to place the binary packages in the same pool seems
      rather restrictive (it forces different filenames for example).
    - I guess it might be nice for the ISA to be passed down to the
      dpkg tools, but I don't think this is strictly necessary? A
      frontend like apt could also decide based on metadata in say the
      Release file, although not having the actual installed package
      metadata on whether it was a different ISA build or not would make
      its job more inconvenient. In any case I don't have a big issue
      with recording this via dpkg-gencontrol or similar if necessary.

    On the specific implementation details:

    - Changing the Architecture format (as in adding colons there) seems
      like a non-starter, and I expect that would break lots of things
      (I mean it could be done but I'm not sure it's worth it for this).
      Recording this mostly as a hint than anything else, via another
      field (if necessary at all) I think would be best.
    - As covered in previous discussions, dpkg could (but I don't think
      it's necessary) check whether the .deb is runnable on the current
      hw, but that's tricky as chrootless installs need to be taken
      into account, etc. It should certainly not be part of dependency
      resolution.
    - I'm not fond of having to change the binary package name format
      either for this (name_version_arch.deb) even if at least dpkg
      itself does not care (but I know other tools do care), and
      depending on the format I'd expect things to break (this goes
      back to the shared pool concern).
    - If dpkg-architecture needs to be aware of this, then this might need
      to be auto-detectable from just the current toolchain being used.

    Some of the above problems could perhaps be avoided if we introduced
    a concept of architecture aliases/ISAs (similar to what rpm has), which
    would side-step the pool sharing issue, the binary package renaming,
    etc. One big issue with this is that it requires for dpkg to have an
    exhaustive table of all such aliases, and if there's ever a new alias
    added, old dpkg versions need to be updated or they will not understand
    what they match with. So this does not seem ideal either. So I guess this
    is a variation over your proposal, but perhaps this could still be used
    in specific contexts, say only at build-time (but not for dependency
    relationships), for repo management (say binary-arm64v9/Packages.xz),
    or binary package names where the field would specify the actual name
    for the filename, say:

      Architecture: arm64
      ArchitectureIsa: arm64v9

    or maybe better:

      Architecture: arm64
      ArchitectureIsa: v9

    resulting in dpkg-deb generating:

      binpkg_1.0-1_arm64v9.deb

    but targeting arm64. I also think I prefer naming this explicitly as ISA
    variants, if you will, than just architecture variants as that gives
    way too much room (which perhaps we want, but then that has other
    implications over compatibility), and for the field perhaps just Isa is
    better, to avoid the implicit repetition of
    ArchitectureInstructionSetArchitecture :), but that makes it less easy
    to associate both as related.
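
    A minimal sketch of the naming scheme described above, assuming a
    control stanza that carries the suggested ArchitectureIsa field. The
    helper names parse_control and deb_filename are hypothetical and this is
    not how dpkg-deb is implemented; it only illustrates that either
    spelling of the field ("v9" or "arm64v9") can yield the same filename.
    Epochs in the version are not handled.

```python
def parse_control(text):
    """Parse a simple 'Field: value' stanza into a dict (no continuation lines)."""
    fields = {}
    for line in text.strip().splitlines():
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def deb_filename(fields):
    """Build name_version_arch.deb, appending the ISA suffix when one is given."""
    arch = fields["Architecture"]
    isa = fields.get("ArchitectureIsa", "")
    # Accept both spellings floated in the thread: "v9" and "arm64v9".
    if isa.startswith(arch):
        isa = isa[len(arch):]
    return f"{fields['Package']}_{fields['Version']}_{arch}{isa}.deb"


CONTROL = """
Package: binpkg
Version: 1.0-1
Architecture: arm64
ArchitectureIsa: v9
"""

print(deb_filename(parse_control(CONTROL)))  # binpkg_1.0-1_arm64v9.deb
```

    The Architecture field stays arm64 throughout, so tools that only look
    at that field would keep treating the package as a plain arm64 build.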

    In the end though, I think there are perhaps bigger constraints from
    the infra side of things than the package tooling, stuff like archive
    management software, or binary transition migration and similar.

    > In terms of building consensus around this design, I thought it makes sense
    > to start at the bottom of the stack and so here I am on this mailing list
    > :-) I guess in due course this could become a DEP, and would certainly need
    > to be discussed on debian-devel before getting too far.

    I'm not sure there's ever been much of a wide interest in something
    like this in Debian TBH. Due to deployment and increased infra
    overhead at least?

    > What do you think? Have I missed any glaring implications?

    No, I think the overall picture is about right, and captures most of the
    things we have discussed at various times and places in the past. :)

    > Is there a better way of doing this?

    I think starting from 5, the rest are probably just details to hammer
    out, but not insurmountable things.

    Thanks,
    Guillem

  • From Michael Hudson-Doyle@21:1/5 to Guillem Jover on Thu Sep 21 05:10:02 2023
    On Wed, 6 Sept 2023 at 21:27, Guillem Jover <guillem@debian.org> wrote:

    > Hi!


    Hi!

    Thanks for the considered response. And sorry for the very slow reply.


    > On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
    > > Recently the topic of exploiting newer instructions without dropping
    > > support for older machines has come up several times inside Ubuntu
    > > engineering. I understand this topic has come up several times in the past
    > > for Debian as well, but nothing has really come of it to date.
    >
    > I also had a chat about this with Matthias Klose (CCed) around 2022-05.
    >
    > > I've spent a while thinking through the options and coming up with a design
    > > and wrote some notes into a wiki page: https://wiki.debian.org/ArchitectureVariants
    >
    > I think we are already doing 1, 2 and 3. I agree 4 is just wrong. And
    > something like 5 is what I suggested to Matthias for Ubuntu when we
    > last discussed it as the best way to go about this.


    OK, glad we agree to this point.


    > I'm not sure I entirely agree with the requirements you set forth
    > though:
    >
    > - I think such optimized builds might need to be done with "special
    >   toolchains" (these could simply be wrappers over the host compiler
    >   passing the appropriate flags via command-line or via specs or
    >   similar, not necessarily full blown toolchains), passing these via
    >   something like dpkg-buildflags seems currently unreliable, as I don't
    >   think we have full coverage in packages (neither for all compilers
    >   available)? Although it would be better as it would centralize the
    >   management. (For reference this is in part how rpm handles this:
    >   https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)


    I agree that it is not completely clear what the best approach here is: do we
    change the defaults of gcc or influence things via default buildflags?

    I'm sure there are packages that do not respect dpkg-buildflags during
    build but the consequences of this do not seem all that great -- such
    packages would not be optimized for the variant / ISA but if someone
    manages to notice this, they can fix the bug.

    OTOH, having the compiler default change may be a bit of a surprise for
    people who build binaries for deployment not via Debian packages. (Do our
    compilers in general target the same baseline as Debian does for a given
    architecture?)


    > - Perhaps that's a limitation from the archive software side, but
    >   requiring to place the binary packages in the same pool seems
    >   rather restrictive (it forces different filenames for example).


    We are considering supporting multiple variant/ISAs in the primary Ubuntu
    archive, so if we get that far then yes, we want to have all the binary
    packages in the same pool. The first steps don't have to support this I
    guess.


    > - I guess it might be nice for the ISA to be passed down to the
    >   dpkg tools, but I don't think this is strictly necessary? A
    >   frontend like apt could also decide based on metadata in say the
    >   Release file, although not having the actual installed package
    >   metadata on whether it was a different ISA build or not would make
    >   its job more inconvenient. In any case I don't have a big issue
    >   with recording this via dpkg-gencontrol or similar if necessary.


    I agree, I don't think it's /strictly/ required that the target ISA is
    recorded in the deb. But I think adding a field for it reduces scope for
    confusion later.


    > On the specific implementation details:
    >
    > - Changing the Architecture format (as in adding colons there) seems
    >   like a non-starter, and I expect that would break lots of things
    >   (I mean it could be done but I'm not sure it's worth it for this).
    >   Recording this mostly as a hint than anything else, via another
    >   field (if necessary at all) I think would be best.


    Agreed.


    > - As covered in previous discussions, dpkg could (but I don't think
    >   it's necessary) check whether the .deb is runnable on the current
    >   hw, but that's tricky as chrootless installs need to be taken
    >   into account, etc. It should certainly not be part of dependency
    >   resolution.


    I'm sorry, what is a chrootless install? But I think I agree here too:
    tricky and just not really worth it.


    > - I'm not fond of having to change the binary package name format
    >   either for this (name_version_arch.deb) even if at least dpkg
    >   itself does not care (but I know other tools do care), and
    >   depending on the format I'd expect things to break (this goes
    >   back to the shared pool concern).


    I don't think this is avoidable in the long run. I must admit I have
    generally thought of the presence of the architecture name in the .deb
    file name to be more a convention than part of the format (and the "real"
    indication of a binary package's architecture is in DEBIAN/control).


    > - If dpkg-architecture needs to be aware of this, then this might need
    >   to be auto-detectable from just the current toolchain being used.


    So you are saying to configure a build environment for, say, x86-64-v3 you
    would configure gcc with --with-arch64=x86-64-v3 and then dpkg-architecture
    would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT
    appropriately? (modulo mistakes in details) Or do you mean something else
    entirely?
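
    A rough sketch of the detection idea raised above: read the -march= line
    from the output of gcc -Q --help=target and map it to a variant name.
    The mapping table and the amd64vN names are assumptions for illustration
    (only amd64v3 is mentioned in this thread), DEB_HOST_ARCH_VARIANT is just
    the name floated above, and the sample output is abbreviated with its
    exact spacing assumed.

```python
import re
import subprocess

# Hypothetical mapping from GCC -march values to variant names.
MARCH_TO_VARIANT = {
    "x86-64-v2": "amd64v2",
    "x86-64-v3": "amd64v3",
    "x86-64-v4": "amd64v4",
}


def march_from_help(output):
    """Return the value on the '-march=' line of the help output, or None."""
    for line in output.splitlines():
        match = re.match(r"\s*-march=\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


def variant_from_help(output):
    """Map the detected -march value to a variant name; None means baseline."""
    return MARCH_TO_VARIANT.get(march_from_help(output))


def query_gcc(gcc="gcc"):
    """Run the compiler and return its target help text (needs gcc installed)."""
    result = subprocess.run(
        [gcc, "-Q", "--help=target"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Abbreviated sample of what a compiler defaulting to x86-64-v3 might print.
SAMPLE = (
    "The following options are target specific:\n"
    "  -m64                        \t\t[enabled]\n"
    "  -march=                     \t\tx86-64-v3\n"
    "  -mtune=                     \t\tgeneric\n"
)

print(variant_from_help(SAMPLE))  # amd64v3
```

    On a machine with gcc installed, variant_from_help(query_gcc()) would
    apply the same parsing to the real compiler output.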


    > Some of the above problems could perhaps be avoided if we introduced
    > a concept of architecture aliases/ISAs (similar to what rpm has), which
    > would side-step the pool sharing issue, the binary package renaming,
    > etc. One big issue with this is that it requires for dpkg to have an
    > exhaustive table of all such aliases, and if there's ever a new alias
    > added, old dpkg versions need to be updated or they will not understand
    > what they match with. So this does not seem ideal either. So I guess this
    > is a variation over your proposal, but perhaps this could still be used
    > in specific contexts, say only at build-time (but not for dependency
    > relationships), for repo management (say binary-arm64v9/Packages.xz),
    > or binary package names where the field would specify the actual name
    > for the filename, say:
    >
    >   Architecture: arm64
    >   ArchitectureIsa: arm64v9
    >
    > or maybe better:
    >
    >   Architecture: arm64
    >   ArchitectureIsa: v9
    >
    > resulting in dpkg-deb generating:
    >
    >   binpkg_1.0-1_arm64v9.deb
    >
    > but targeting arm64.


    I'm not sure but I think you have talked yourself into suggesting something very similar to my proposal here?


    > I also think I prefer naming this explicitly as ISA
    > variants, if you will, than just architecture variants as that gives
    > way too much room


    Certainly I think all the interesting use cases are basically changing the
    set of instructions emitted by the toolchains by default. I suppose you
    could have a variant that changed the set of hardening flags or something
    but that doesn't seem an especially good idea. So I guess I'd be happy with
    s/ArchitectureVariant/ArchitectureISA/ everywhere.


    > (which perhaps we want, but then that has other
    > implications over compatibility), and for the field perhaps just Isa is
    > better, to avoid the implicit repetition of
    > ArchitectureInstructionSetArchitecture :), but that makes it less easy
    > to associate both as related.
    >
    > In the end though, I think there are perhaps bigger constraints from
    > the infra side of things than the package tooling, stuff like archive
    > management software, or binary transition migration and similar.


    I think I managed to convince myself that most things like britney and ben
    can and should treat each variant/ISA as a separate architecture. It
    depends a bit how publication is done in the case where not all packages
    are built for all ISAs but not in very interesting ways I think. And my
    intention is to start with amd64v3 and build everything for this ISA (as we
    have heaps of builder capacity on amd64 in Ubuntu) and sidestep worrying
    about that for a little while.


    > > In terms of building consensus around this design, I thought it makes sense
    > > to start at the bottom of the stack and so here I am on this mailing list
    > > :-) I guess in due course this could become a DEP, and would certainly need
    > > to be discussed on debian-devel before getting too far.
    >
    > I'm not sure there's ever been much of a wide interest in something
    > like this in Debian TBH. Due to deployment and increased infra
    > overhead at least?


    Yes that's fair. And as I said somewhere, I myself am not proposing to
    support any additional ISAs in Debian at this time.


    > > What do you think? Have I missed any glaring implications?
    >
    > No, I think the overall picture is about right, and captures most of the
    > things we have discussed at various times and places in the past. :)


    I am very happy to read this!


    > > Is there a better way of doing this?
    >
    > I think starting from 5, the rest are probably just details to hammer
    > out, but not insurmountable things.


    Great. The things I see as a bit vague at a base level currently are:

    * Should the ISA influence the toolchain via toolchain defaults or dpkg-buildflags?
    * How is the default ISA for a buildd chroot selected?

    There is also the question of whether partial coverage of an ISA is handled
    by the package publisher or client side in apt but that's at least one
    level higher.

    Cheers,
    mwh


  • From Guillem Jover@21:1/5 to Michael Hudson-Doyle on Tue Oct 31 10:30:01 2023
    Hi!

    On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:
    > Thanks for the considered response. And sorry for the very slow reply.

    Idem! :)

    > On Wed, 6 Sept 2023 at 21:27, Guillem Jover wrote:
    > > I'm not sure I entirely agree with the requirements you set forth
    > > though:
    > >
    > > - I think such optimized builds might need to be done with "special
    > >   toolchains" (these could simply be wrappers over the host compiler
    > >   passing the appropriate flags via command-line or via specs or
    > >   similar, not necessarily full blown toolchains), passing these via
    > >   something like dpkg-buildflags seems currently unreliable, as I don't
    > >   think we have full coverage in packages (neither for all compilers
    > >   available)? Although it would be better as it would centralize the
    > >   management. (For reference this is in part how rpm handles this:
    > >   https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)
    >
    > I agree that it is not completely clear what the best approach here is: do we
    > change the defaults of gcc or influence things via default buildflags?
    >
    > I'm sure there are packages that do not respect dpkg-buildflags during
    > build but the consequences of this do not seem all that great -- such
    > packages would not be optimized for the variant / ISA but if someone
    > manages to notice this, they can fix the bug.
    >
    > OTOH, having the compiler default change may be a bit of a surprise for
    > people who build binaries for deployment not via Debian packages. (Do our
    > compilers in general target the same baseline as Debian does for a given
    > architecture?)

    Right, given that the failure mode would be just "no-optimized-builds",
    and should not end up with those packages being broken, at most just
    redundant with the baseline ones, then I guess controlling it either
    way would seem fine, yes.

    (Also if the packages are reproducible, and end up being not optimized
    this might be detectable as producing identical artifacts as on the
    baseline.)

    - Perhaps that's a limitation from the archive software side, but
    requiring to place the binary packages in the same pool seems
    rather restrictive (it forces different filenames for example).

    We are considering supporting multiple variant/ISAs in the primary Ubuntu archive, so if we get that far then yes, we want to have all the binary packages in the same pool. The first steps don't have to support this I guess.

    Ok. Just a note that even if served from the primary archive, there
    could be multiple pools (like the multi-pool setup on debian-ports),
    as the entry point are the (In)Release files. But, yes, the other
    option would be to use the variant/ISA name as a "fake arch" just in
    the binary package name.

    - I guess it might be nice for the ISA to be passed down to the
    dpkg tools, but I don't think this is strictly necessary? A
    frontend like apt could also decide based on metadata in say the
    Release file, although not having the actual installed package
    metadata on whether it was a different ISA build or not would make
    its job more inconvenient. In any case I don't have a big issue
    with recording this via dpkg-gencontrol or similar if necessary.

    I agree, I don't think it's /strictly/ required that the target ISA is recorded in the deb. But I think adding a field for it reduces scope for confusion later.

    Yes, agreed.

    On the specific implementation details:

    - As covered in previous discussions, dpkg could (but I don't think
    it's necessary) check whether the .deb is runnable on the current
    hw, but that's tricky as chrootless installs need to be taken
    into account, etc. It should certainly not be part of dependency
    resolution.

    I'm sorry, what is a chrootless install? But I think I agree here too:
    tricky and just not really worth it.

    https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap

    This can be used among other things to set up foreign chroots, by
    running the host tools, so disallowing installation could be
    problematic. Even though I guess there could be a warning about this,
    or maybe it could be controlled through a force option, although both
    seem like they could be disruptive.

    - I'm not fond of having to change the binary package name format
    either for this (name_version_arch.deb) even if at least dpkg
    itself does not care (but I know other tools do care), and
    depending on the format I'd expect things to break (this goes
    back to the shared pool concern).

    I don't think this is avoidable in the long run. I must admit I have generally thought of the presence of the architecture name in the .deb file name to be more a convention than part of the format (and the "real" indication of a binary package's architecture is in DEBIAN/control).

    Yes and no I guess. In theory the (canonical) information should be
    extracted from the DEBIAN/control from inside the .deb, in practice
    I think tools (?) (might) try to use heuristics from just the filename
    to avoid having to open, uncompress and parse every .deb around, for performance reasons.

    If the only change in the package filename format is in the <arch> part
    where we'd use a name which would otherwise be valid as an arch name (so,
    no weird symbols, or «-» separators that are not intended to split <os>
    and <cpu> or similar), then using a name for the variant/ISA would be
    fine.
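
    To make the filename heuristic discussed above concrete, here is a minimal sketch (not code from dpkg or apt) of a tool that reads name_version_arch.deb without opening the archive, plus a check that a variant label contains no extra separators. The helper names split_deb_filename and is_plain_arch_label are hypothetical, and the "lowercase letters and digits only" rule is an assumption made for illustration.

```python
import re

# Assumption for this sketch: a variant/ISA label is "plain" when it has
# only lowercase letters and digits, so it adds no "-" or "_" separators.
_PLAIN_LABEL = re.compile(r"^[a-z0-9]+$")


def split_deb_filename(filename):
    """Split 'name_version_arch.deb' into (name, version, arch).

    This mirrors the filename-only heuristic: nothing inside the .deb
    is opened or parsed.
    """
    if not filename.endswith(".deb"):
        raise ValueError("not a .deb filename: %r" % filename)
    stem = filename[: -len(".deb")]
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError("expected name_version_arch.deb: %r" % filename)
    return tuple(parts)


def is_plain_arch_label(label):
    """Return True when the label has no separators or odd symbols."""
    return bool(_PLAIN_LABEL.match(label))
```

    With a label such as "arm64v9" the three-way split keeps working unchanged, whereas a label such as "x86-64-v3" would introduce extra «-» separators in the <arch> part.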

    - If dpkg-architecture needs to be aware of this, then this might need
    to be auto-detectable from just the current toolchain being used.

    So you are saying to configure a build environment for, say, x86-64-v3 you would configure gcc with --with-arch64=x86-64-v3 and then dpkg-architecture would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT appropriately? (modulo mistakes in details) Or do you mean something else entirely?

    That would be one solution yes, which could give automatic bijective
    mappings, although ideally with a machine-readable way to get at it,
    which I'm not sure we have currently. For example code in dpkg-dev
    already runs «$CC -dumpmachine» to infer the host architecture to use
    during builds.

    While using a triplet variation could be a way to do that, that would
    require such triplet support for each variant/ISA, which tends to be
    very painful to introduce if it's not there already, so I'd not
    consider this specific way a viable option.

    Some of the above problems could perhaps be avoided if we introduced
    a concept of architecture aliases/ISAs (similar to what rpm has), which would side-step the pool sharing issue, the binary package renaming,
    etc. One big issue with this is that it requires for dpkg to have an exhaustive table of all such aliases, and if there's ever a new alias added, old dpkg versions need to be updated or they will not understand what they match with. So this does not seem ideal either. So I guess this is a variation over your proposal, but perhaps this could still be used
    in specific contexts, say only at build-time (but not for dependency relationships), for repo management (say binary-arm64v9/Packages.xz),
    or binary package names where the field would specify the actual name
    for the filename, say:

    Architecture: arm64
    ArchitectureIsa: arm64v9

    or maybe better:

    Architecture: arm64
    ArchitectureIsa: v9

    resulting in dpkg-deb generating:

    binpkg_1.0-1_arm64v9.deb

    but targeting arm64.
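
    As a rough illustration of the second form above (Architecture: arm64 with ArchitectureIsa: v9), the filename could be derived as in the sketch below. The function deb_filename is hypothetical and only shows the naming rule being discussed; it is not how dpkg-deb is implemented.

```python
from typing import Optional


def deb_filename(package: str, version: str, architecture: str,
                 architecture_isa: Optional[str] = None) -> str:
    """Build name_version_arch.deb, appending the ISA suffix if present.

    The package still targets `architecture`; the ISA only changes the
    label used in the filename.
    """
    arch_label = architecture + (architecture_isa or "")
    return "%s_%s_%s.deb" % (package, version, arch_label)
```

    A baseline package keeps its usual name, since an absent ArchitectureIsa leaves the <arch> part untouched.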

    I'm not sure but I think you have talked yourself into suggesting something very similar to my proposal here?

    Ah sorry, yeah, didn't mean to present it as a new idea, I was mostly
    trying to walk over the issues, and refine upon your initial idea,
    with those constraints applied. :)

    On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
    Is there a better way of doing this?

    I think starting from 5, the rest are probably just details to hammer
    out, but not insurmountable things.

    Great. The things I see as a bit vague at a base level currently are:

    * Should the ISA influence the toolchain via toolchain defaults or dpkg-buildflags?
    * How is the default ISA for a buildd chroot selected?

    So the clear downsides of either modifying the default toolchain or
    having to provide an additional one is that this seems pretty heavy
    weight. Also because people might want to build optimized variants
    locally w/o having to mess with their already existing toolchains.
    (I'm not sure whether something going along the lines of <https://git.hadrons.org/cgit/debian/fakecross.git> could be an
    option, although as mentioned above, if that would imply new triplets,
    then probably not.)

    So the easiest way might indeed be by controlling this via an envvar,
    which dpkg-buildpackage could also setup internally via a new option,
    say --arch-isa=amd64v3 or similar to make this slightly more
    discoverable. Which would be easy to use from the buildds too I guess.

    There is also the question of whether partial coverage of an ISA is handled by the package publisher or client side in apt but that's at least one
    level higher.

    Yeah, that would be of no concern to dpkg, I think.

    Thanks,
    Guillem

  • From Michael Hudson-Doyle@21:1/5 to Guillem Jover on Thu Nov 2 16:50:01 2023
    On Tue, 31 Oct 2023 at 09:21, Guillem Jover <guillem@debian.org> wrote:

    Hi!

    On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:
    Thanks for the considered response. And sorry for the very slow reply.

    Idem! :)

    On Wed, 6 Sept 2023 at 21:27, Guillem Jover wrote:
    I'm not sure I entirely agree with the requirements you set forth
    though:

    - I think such optimized builds might need to be done with "special
    toolchains" (these could simply be wrappers over the host compiler
    passing the appropriate flags via command-line or via specs or
    similar, not necessarily full blown toolchains), passing these via
    something like dpkg-buildflags seems currently unreliable, as I don't
    think we have full coverage in packages (neither for all compilers
    available)? Although it would be better as it would centralize the
    management. (For reference this is in part how rpm handles this:

    https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)


    I agree that it is not completely clear what the best approach here is: do we change the defaults of gcc or influence things via default buildflags?

    I'm sure there are packages that do not respect dpkg-buildflags during build but the consequences of this do not seem all that great -- such packages would not be optimized for the variant / ISA but if someone manages to notice this, they can fix the bug.

    OTOH, having the compiler default change may be a bit of a surprise for people who build binaries for deployment not via Debian packages. (Do our compilers in general target the same baseline as Debian does for a given architecture?).

    Right, given that the failure mode would be just "no-optimized-builds",
    and should not end up with those packages being broken, at most just redundant with the baseline ones, then I guess controlling it either
    way would seem fine, yes.


    Ack.


    (Also if the packages are reproducible, and end up being not optimized
    this might be detectable as producing identical artifacts as on the baseline.)


    This is an interesting idea -- although of course some care would be
    required to avoid false positives from things that do not use the C/C++ toolchain at all. Anyway...
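
    A minimal sketch of that detection idea, assuming the files of the baseline and variant builds have already been extracted into path-to-content mappings (the extraction step and the function name unoptimized_candidates are hypothetical):

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def unoptimized_candidates(baseline: Dict[str, bytes],
                           variant: Dict[str, bytes]) -> List[str]:
    """List paths whose content is identical in both builds.

    Identical files are only candidates: anything that does not go
    through the C/C++ toolchain (docs, scripts, data) will match too.
    """
    return sorted(
        path
        for path, data in variant.items()
        if path in baseline and sha256_hex(data) == sha256_hex(baseline[path])
    )
```

    Restricting the comparison to compiled objects would be one way to deal with the false positives mentioned above.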


    - Perhaps that's a limitation from the archive software side, but
    requiring to place the binary packages in the same pool seems
    rather restrictive (it forces different filenames for example).

    We are considering supporting multiple variant/ISAs in the primary Ubuntu archive, so if we get that far then yes, we want to have all the binary packages in the same pool. The first steps don't have to support this I guess.

    Ok. Just a note that even if served from the primary archive, there
    could be multiple pools (like the multi-pool setup on debian-ports),
    as the entry point are the (In)Release files.


    Oh OK. I don't think Launchpad supports that (but am not sure).


    But, yes, the other
    option would be to use the variant/ISA name as a "fake arch" just in
    the binary package name.


    - I guess it might be nice for the ISA to be passed down to the
    dpkg tools, but I don't think this is strictly necessary? A
    frontend like apt could also decide based on metadata in say the
    Release file, although not having the actual installed package
    metadata on whether it was a different ISA build or not would make
    its job more inconvenient. In any case I don't have a big issue
    with recording this via dpkg-gencontrol or similar if necessary.

    I agree, I don't think it's /strictly/ required that the target ISA is recorded in the deb. But I think adding a field for it reduces scope for confusion later.

    Yes, agreed.

    On the specific implementation details:

    - As covered in previous discussions, dpkg could (but I don't think
    it's necessary) check whether the .deb is runnable on the current
    hw, but that's tricky as chrootless installs need to be taken
    into account, etc. It should certainly not be part of dependency
    resolution.

    I'm sorry, what is a chrootless install? But I think I agree here too: tricky and just not really worth it.

    https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap


    Ah right.

    This can be used among other things to set up foreign chroots, by
    running the host tools, so disallowing installation could be
    problematic. Even though I guess there could be a warning about this,
    or maybe it could be controlled through a force option, although both
    seem like they could be disruptive.


    Of course in such cases dpkg knows that something funny is going on and
    could suppress the warning itself.

    I spent a few minutes trying to think hard about this and I honestly don't think I can predict whether trying to prevent installation of incompatible packages is worth it (after all one of the ways users could get into
    trouble would be moving an installed system to a different CPU and having binaries start to fail and obviously dpkg can't help there).

    One result of this thinking was: I had been thinking/assuming the issue of which variants to consider would be apt configuration, but maybe dpkg configuration would make more sense (after all, --add-architecture is a parameter to dpkg). And in this case, dpkg could object when installing a variant that has not been configured.


    - I'm not fond of having to change the binary package name format
    either for this (name_version_arch.deb) even if at least dpkg
    itself does not care (but I know other tools do care), and
    depending on the format I'd expect things to break (this goes
    back to the shared pool concern).

    I don't think this is avoidable in the long run. I must admit I have generally thought of the presence of the architecture name in the .deb file name to be more a convention than part of the format (and the "real" indication of a binary package's architecture is in DEBIAN/control).

    Yes and no I guess. In theory the (canonical) information should be
    extracted from the DEBIAN/control from inside the .deb, in practice
    I think tools (?) (might) try to use heuristics from just the filename
    to avoid having to open, uncompress and parse every .deb around, for performance reasons.


    True. In fact it looks like apt-ftparchive does this (when using the --architecture flag at least) so I get to care about this a bit...

    If the only change in the package filename format is in the <arch> part
    where we'd use a name which would otherwise be valid as an arch name (so,
    no weird symbols, or «-» separators that are not intended to split <os>
    and <cpu> or similar), then using a name for the variant/ISA would be
    fine.


    Right. I think that (when possible) pretending e.g. "amd64v3" is a distinct architecture will generally make things easier. E.g. I think britney wouldn't need to know about the relationship between "amd64" and "amd64v3".


    - If dpkg-architecture needs to be aware of this, then this might need
    to be auto-detectable from just the current toolchain being used.

    So you are saying to configure a build environment for, say, x86-64-v3 you would configure gcc with --with-arch64=x86-64-v3 and then dpkg-architecture would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT appropriately? (modulo mistakes in details) Or do you mean something else entirely?

    That would be one solution yes, which could give automatic bijective mappings, although ideally with a machine-readable way to get at it,
    which I'm not sure we have currently.


    I think "gcc -Q --help=target | grep -e '^\s*-march'" is about as machine readable as it gets currently, for better or worse (mostly worse).
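
    For illustration, the same extraction can be sketched in Python. This is not what dpkg-architecture does today; the function names are hypothetical, and SAMPLE is a hand-written stand-in for the compiler output rather than a captured log.

```python
import re
import subprocess
from typing import Optional

# Matches a line such as "  -march=        x86-64-v3" and captures the value.
_MARCH_RE = re.compile(r"^[ \t]*-march=[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def parse_default_march(help_target_output: str) -> Optional[str]:
    """Return the -march value found in the given text, or None."""
    match = _MARCH_RE.search(help_target_output)
    return match.group(1) if match else None


def query_default_march(cc: str = "gcc") -> Optional[str]:
    """Run `$CC -Q --help=target` and parse its output (needs a compiler)."""
    result = subprocess.run(
        [cc, "-Q", "--help=target"],
        capture_output=True, text=True, check=True,
    )
    return parse_default_march(result.stdout)


# Hand-written sample input, assumed to resemble the real output.
SAMPLE = (
    "The following options are target specific:\n"
    "  -m64                        \t\t[enabled]\n"
    "  -march=                     \t\tx86-64-v3\n"
    "  -mtune=                     \t\tgeneric\n"
)
```

    Having to match on layout like this is the fragility in question: the text is meant for humans and nothing promises that it stays stable.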


    For example code in dpkg-dev
    already runs «$CC -dumpmachine» to infer the host architecture to use during builds.

    While using a triplet variation could be a way to do that, that would
    require such triplet support for each variant/ISA, which tends to be
    very painful to introduce if it's not there already, so I'd not
    consider this specific way a viable option.


    I admit I'm not an expert on triplet intricacies but I think a new triplet
    is not appropriate here (a bit like a new Debian architecture for a
    variant/ISA choice is not the right concept).


    Some of the above problems could perhaps be avoided if we introduced
    a concept of architecture aliases/ISAs (similar to what rpm has), which would side-step the pool sharing issue, the binary package renaming,
    etc. One big issue with this is that it requires for dpkg to have an exhaustive table of all such aliases, and if there's ever a new alias added, old dpkg versions need to be updated or they will not understand what they match with. So this does not seem ideal either. So I guess this is a variation over your proposal, but perhaps this could still be used in specific contexts, say only at build-time (but not for dependency relationships), for repo management (say binary-arm64v9/Packages.xz),
    or binary package names where the field would specify the actual name
    for the filename, say:

    Architecture: arm64
    ArchitectureIsa: arm64v9

    or maybe better:

    Architecture: arm64
    ArchitectureIsa: v9

    resulting in dpkg-deb generating:

    binpkg_1.0-1_arm64v9.deb

    but targeting arm64.

    I'm not sure but I think you have talked yourself into suggesting something very similar to my proposal here?

    Ah sorry, yeah, didn't mean to present it as a new idea,


    :-)


    I was mostly
    trying to walk over the issues, and refine upon your initial idea,
    with those constraints applied. :)


    I'm certainly glad you got to a similar place as me!


    On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
    Is there a better way of doing this?

    I think starting from 5, the rest are probably just details to hammer out, but not insurmountable things.

    Great. The things I see as a bit vague at a base level currently are:

    * Should the ISA influence the toolchain via toolchain defaults or dpkg-buildflags?
    * How is the default ISA for a buildd chroot selected?

    So the clear downsides of either modifying the default toolchain or
    having to provide an additional one is that this seems pretty heavy
    weight. Also because people might want to build optimized variants
    locally w/o having to mess with their already existing toolchains.
    (I'm not sure whether something going along the lines of <https://git.hadrons.org/cgit/debian/fakecross.git> could be an
    option, although as mentioned above, if that would imply new triplets,
    then probably not.)

    So the easiest way might indeed be by controlling this via an envvar,


    DEB_HOST_ARCH_ISA?


    which dpkg-buildpackage could also setup internally via a new option,
    say --arch-isa=amd64v3 or similar


    --host-arch-isa would be more coherent I think.

    I guess one could add support for --target-host-arch-isa to build a
    toolchain that defaults to a particular ISA. But well.


    to make this slightly more
    discoverable. Which would be easy to use from the buildds too I guess.
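
    One possible precedence between the option and the envvar, sketched under the assumption that an explicit --host-arch-isa wins over DEB_HOST_ARCH_ISA. Neither exists in dpkg at the time of this thread, and resolve_host_arch_isa is a hypothetical helper.

```python
import argparse
from typing import Mapping, Optional, Sequence


def resolve_host_arch_isa(argv: Sequence[str],
                          environ: Mapping[str, str]) -> Optional[str]:
    """Pick the ISA from the command line, else from the environment.

    Returns None when neither is set, meaning "build for the baseline".
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--host-arch-isa", default=None)
    # Ignore unrelated options; this sketch only cares about the ISA.
    args, _unknown = parser.parse_known_args(list(argv))
    if args.host_arch_isa is not None:
        return args.host_arch_isa
    return environ.get("DEB_HOST_ARCH_ISA")
```

    A buildd could then export the variable once for the whole chroot, while a local user could pass the option for a single build.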


    I also think that (conceptually) it makes sense that you might want to have
    a build chroot that *uses* amd64v3 binaries (because your builder is
    amd64v3) to *produce* boring old amd64 binaries (I mean, I doubt gcc built
    with different march is so much faster that it really matters but...)


    There is also the question of whether partial coverage of an ISA is handled by the package publisher or client side in apt but that's at least one level higher.

    Yeah, that would be of no concern to dpkg, I think.


    Ack.

    So to summarise, here are the generic changes that I think need to be made
    to src:dpkg to support variant ISAs as a thing:

    * add get_host_arch_isa() to Dpkg::Arch
    * dpkg-gencontrol records DEB_HOST_ARCH_ISA into DEBIAN/control as ArchitectureIsa
    * dpkg-architecture emits DEB_HOST_ARCH_ISA and grows --host-arch-isa flag
    * dpkg-buildpackage passes --host-arch-isa to dpkg-architecture
    * dpkg-genchanges should record the ISA in the changes file somehow I guess?
    * dpkg-deb records the ISA in the file name

    Have I missed anything? (Hmm does anything need to reject unknown values
    found in DEB_HOST_ARCH_ISA / --host-arch-isa? Probably...)

    Conceptually slightly separately, it might make sense to add a build
    "feature" to Dpkg::Vendor::Debian to allow setting -march (and -mtune?)

    Then when we want to add support to an ISA, we add a little thing to set_build_features (in either Vendor::Debian or Vendor::Ubuntu or wherever) that maps get_host_arch_isa() to values for the march-influencing feature.
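
    The mapping described above might look roughly like the following sketch. It is written in Python for illustration only (dpkg's own code is Perl), the table ISA_TO_MARCH and the function march_flags are hypothetical, and the single entry assumes "amd64v3" corresponds to GCC's x86-64-v3 level. It also rejects unknown ISA values, as wondered about above.

```python
from typing import Dict, List, Optional

# Hypothetical table: variant/ISA name -> value passed to -march.
ISA_TO_MARCH: Dict[str, str] = {
    "amd64v3": "x86-64-v3",
}


def march_flags(host_arch_isa: Optional[str]) -> List[str]:
    """Return the extra compiler flags for the given ISA.

    None means the baseline, so no flags are added. Unknown names are
    rejected rather than silently building for the baseline.
    """
    if host_arch_isa is None:
        return []
    try:
        march = ISA_TO_MARCH[host_arch_isa]
    except KeyError:
        raise ValueError("unknown ISA: %r" % host_arch_isa) from None
    return ["-march=" + march]
```

    Adding support for a new ISA would then amount to adding one entry to the table.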

    Cheers,
    mwh

    <div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 31 Oct 2023 at 09:21, Guillem Jover &lt;<a href="mailto:guillem@debian.org" target="_blank">guillem@debian.org</a>&gt; wrote:<br></div><
    blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi!<br>

    On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:<br>
    &gt; Thanks for the considered response. And sorry for the very slow reply.<br>

    Idem! :)<br>

    &gt; On Wed, 6 Sept 2023 at 21:27, Guillem Jover wrote:<br>
    &gt; &gt; I&#39;m not sure I entirely agree with the requirements you set forth<br>
    &gt; &gt; though:<br>
    &gt; &gt;<br>
    &gt; &gt;  - I think such optimized builds might need to be done with &quot;special<br>
    &gt; &gt;    toolchains&quot; (these could simply be wrappers over the host compiler<br>
    &gt; &gt;    passing the appropriate flags via command-line or via specs or<br>
    &gt; &gt;    similar, not necessarily full blown toolchains), passing these via<br>
    &gt; &gt;    something like dpkg-buildflags seems currently unreliable, as I don&#39;t<br>
    &gt; &gt;    think we have full coverage in packages (neither for all compilers<br>
    &gt; &gt;    available)? Although it would be better as it would centralize the<br>
    &gt; &gt;    management. (For reference this is in part how rpm handles this:<br>
    &gt; &gt;     <a href="https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in" rel="noreferrer" target="_blank">https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in</a>)<br>
    &gt; &gt;<br>
    &gt; <br>
    &gt; I agree that is not completely clear what the best approach here is, do we<br>
    &gt; change the defaults of gcc or influence things via default buildflags.<br> &gt; <br>
    &gt; I&#39;m sure there are packages that do not respect dpkg-buildflags during<br>
    &gt; build but the consequences of this do not seem all that great -- such<br> &gt; packages would not be optimized for the variant / ISA but if someone<br> &gt; manages to notice this, they can fix the bug.<br>
    &gt; <br>
    &gt; OTOH, having the compiler default change may be a bit of a surprise for<br>
    &gt; people who build binaries for deployment not via Debian packages. (Do our<br>
    &gt; compilers in general target the same baseline as Debian does for a given<br>
    &gt; architecture?).<br>

    Right, given that the failure mode would be just &quot;no-optimized-builds&quot;,<br>
    and should not end up with those packages being broken, at most just<br> redundant with the baseline ones, then I guess controlling it either<br>
    way would seem fine, yes.<br></blockquote><div><br></div><div>Ack.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    (Also if the packages are reproducible, and end up being not optimized<br>
    this might be detectable as producing identical artifacts as on the<br> baseline.)<br></blockquote><div><br></div><div>This is an interesting idea -- although of course some care would be required to avoid false positives from things that do not use the C/C++ toolchain at all. Anyway... </div><div> </div><blockquote class="
    gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt;  - Perhaps that&#39;s a limitation from the archive software side, but<br>
    &gt; &gt;    requiring to place the binary packages in the same pool seems<br>
    &gt; &gt;    rather restrictive (it forces different filenames for example).<br>

    &gt; We are considering supporting multiple variant/ISAs in the primary Ubuntu<br>
    &gt; archive, so if we get that far then yes, we want to have all the binary<br>
    &gt; packages in the same pool. The first steps don&#39;t have to support this I<br>
    &gt; guess.<br>

    Ok. Just a note that even if served from the primary archive, there<br>
    could be multiple pools (like the multi-pool setup on debian-ports),<br>
    as the entry point are the (In)Release files.</blockquote><div><br></div><div>Oh OK. I don&#39;t think Launchpad supports that (but an not sure).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(
    204,204,204);padding-left:1ex"> But, yes, the other<br>
    option would be to use the variant/ISA name as a &quot;fake arch&quot; just in<br>
    the binary package name.<br></blockquote><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt;  - I guess it might be nice for the ISA to be passed down to the<br> &gt; &gt;    dpkg tools, but I don&#39;t think this is strictly necessary? A<br>
    &gt; &gt;    frontend like apt could also decide based on metadata in say the<br>
    &gt; &gt;    Release file, although not having the actual installed package<br>
    &gt; &gt;    metadata on whether it was a different ISA build or not would make<br>
    &gt; &gt;    its job more inconvenient. In any case I don&#39;t have a big issue<br>
    &gt; &gt;    with recording this via dpkg-gencontrol or similar if necessary.<br>

    &gt; I agree, I don&#39;t think it&#39;s /strictly/ required that the target ISA is<br>
    &gt; recorded in the deb. But I think adding a field for it reduces scope for<br>
    &gt; confusion later.<br>

    Yes, agreed.<br>

    &gt; &gt; On the specific implementation details:<br>
    &gt; &gt;<br>
    &gt; &gt;  - As covered in previous discussions, dpkg could (but I don&#39;t think<br>
    &gt; &gt;    it&#39;s necessary) check whether the .deb is runnable on the current<br>
    &gt; &gt;    hw, but that&#39;s tricky as chrootless installs need to be taken<br>
    &gt; &gt;    into account, etc. It should certainly not be part of dependency<br>
    &gt; &gt;    resolution.<br>

    &gt; I&#39;m sorry, what is a chrootless install? But I think I agree here too:<br>
    &gt; tricky and just not really worth it.<br>

    <a href="https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap" rel="noreferrer" target="_blank">https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap</a></blockquote><div><br></div><div>Ah right.</div><div><br></div><blockquote class="gmail_quote"
    style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    This can be used among other things to set up foreign chroots, by<br>
    running the host tools, so disallowing installation could be<br>
    problematic. Even though I guess there could be a warning about this,<br>
    or maybe it could be controlled through a force option, although both<br>
    seems like they could be disruptive.<br></blockquote><div><br></div><div>Of course in such cases dpkg knows that something funny is going on and could suppress the warning itself. </div><div><br></div><div>I spent a few minutes trying to think hard
    about this and I honestly don&#39;t think I can predict whether trying to prevent installation of incompatible packages is worth it (after all one of the ways users could get into trouble would be moving an installed system to a different CPU and having
    binaries start to fail and obviously dpkg can&#39;t help there).</div><div><br></div><div>One result of this thinking was: I had been thinking/assuming the issue of which variants to consider would be apt configuration, but maybe dpkg configuration would
    make more sense (after all, --add-architecture is a parameter to dpkg). And in this case, dpkg could object when installing a variant that has not been configured.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-
    left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt;  - I&#39;m not fond of having to change the binary package name format<br>
    &gt; &gt;    either for this (name_version_arch.deb) even if at least dpkg<br>
    &gt; &gt;    itself does not care (but I know other tools do care), and<br> &gt; &gt;    depending on the format I&#39;d expect things to break (this goes<br>
    &gt; &gt;    back to the shared pool concern).<br>
    &gt; <br>
    &gt; I don&#39;t think this is avoidable in the long run. I must admit I have<br>
    &gt; generally thought of the presence of the architecture name in the .deb file<br>
    &gt; name to be more a convention than part of the format (and the &quot;real&quot;<br>
    &gt; indication of a binary package&#39;s architecture is in DEBIAN/control).<br>

    Yes and no I guess. In theory the (canonical) information should be<br> extracted from the DEBIAN/control from inside the .deb, in practice<br>
    I think tools (?) (might) try to use heuristics from just the filename<br>
    to avoid having to open, uncompress and parse every .deb around, for<br> performance reasons.<br></blockquote><div><br></div><div>True. In fact it looks like apt-ftparchive does this (when using the --architecture flag at least) so I get to care about this a bit...</div><div><br></div><blockquote class="gmail_quote" style="
    margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    If the only change in the package filename format is in the &lt;arch&gt; part<br>
    where we&#39;d use a name which would otherwise be valid as an arch name (so,<br>
    no weird symbols, or «-» separators that are not intended to split &lt;os&gt;<br>
    and &lt;cpu&gt; or similar), then using a name for the variant/ISA would be<br> fine.<br></blockquote><div><br></div><div>Right. I think that (when possible pretending e.g. &quot;amd64v3&quot; is a distinct architecture will generally make things easier. E.g. I think britney wouldn&#39;t need to know about the relationship between &
    quot;amd64&quot; and &quot;amd64v3&quot;. </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt;  - If dpkg-architecture needs to be aware of this, then this might need<br>
    &gt; &gt;    to be auto-detectable from just the current toolchain being used.<br>

    &gt; So you are saying to configure a build environment for, say, x86-64-v3 you<br>
    &gt; would configure gcc with --with-arch64=x86-64-v3 and then dpkg-architecture<br>
    &gt; would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT<br>
    &gt; appropriately? (modulo mistakes in details) Or do you mean something else<br>
    &gt; entirely?<br>

    That would be one solution yes, which could give automatic bijective<br> mappings, although ideally with a machine-readable way to get at it,<br>
    which I&#39;m not sure we have currently. </blockquote><div><br></div><div>I think &quot;gcc -Q --help=target | grep -e &#39;^\s*-march&#39;&quot; is about as machine readable as it gets currently, for better or worse (mostly worse).</div><div> </div><
    blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">For example code in dpkg-dev<br>
    already runs «$CC -dumpmachine» to infer the host architecture to use<br> during builds.<br>

    While using a triplet variation could be a way to do that, that would<br> require such triplet support for each variant/ISA, which tends to be<br>
    very painful to introduce if it&#39;s not there already, so I&#39;d not<br> consider this specific way a viable option.<br></blockquote><div><br></div><div>I admit I&#39;m not an expert on triplet intricacies but I think a new triplet is not appropriate here (a bit like a new Debian architecture for a variant/ISA choice is not
    the right concept).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt; Some of the above problems could perhaps be avoided if we introduced<br>
    &gt; &gt; a concept of architecture aliases/ISAs (similar to what rpm has), which<br>
    &gt; &gt; would side-step the pool sharing issue, the binary package renaming,<br>
    &gt; &gt; etc. One big issue with this is that it requires for dpkg to have an<br>
    &gt; &gt; exhaustive table of all such aliases, and if there&#39;s ever a new alias<br>
    &gt; &gt; added, old dpkg versions need to be updated or they will not understand<br>
    &gt; &gt; what they match with. So this does not seem ideal either. So I guess this<br>
    &gt; &gt; is a variation over your proposal, but perhaps this could still be used<br>
    &gt; &gt; in specific contexts, say only at build-time (but not for dependency<br>
    &gt; &gt; relationships), for repo management (say binary-arm64v9/Packages.xz),<br>
    &gt; &gt; or binary package names where the field would specify the actual name<br>
    &gt; &gt; for the filename, say:<br>
    &gt; &gt;<br>
    &gt; &gt;   Architecture: arm64<br>
    &gt; &gt;   ArchitectureIsa: arm64v9<br>
    &gt; &gt;<br>
    &gt; &gt; or maybe better:<br>
    &gt; &gt;<br>
    &gt; &gt;   Architecture: arm64<br>
    &gt; &gt;   ArchitectureIsa: v9<br>
    &gt; &gt;<br>
    &gt; &gt; resulting in dpkg-deb generating:<br>
    &gt; &gt;<br>
    &gt; &gt;   binpkg_1.0-1_arm64v9.deb<br>
    &gt; &gt;<br>
    &gt; &gt; but targeting arm64.<br>

    &gt; I&#39;m not sure but I think you have talked yourself into suggesting something<br>
    &gt; very similar to my proposal here?<br>

    Ah sorry, yeah, didn&#39;t mean to present it as a new idea, </blockquote><div><br></div><div>:-)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I was mostly<br>
    trying to walk over the issues, and refine upon your initial idea,<br>
    with those constraints applied. :)<br></blockquote><div><br></div><div>I&#39;m certainly glad you got to a similar place as me!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);
    padding-left:1ex">
    &gt; &gt; On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:<br> &gt; &gt; &gt; Is there a better way of doing this?<br>
    &gt; &gt;<br>
    &gt; &gt; I think starting from 5, the rest are probably just details to hammer<br>
    &gt; &gt; out, but not insurmountable things.<br>

    &gt; Great. The things I see as a bit vague at a base level currently are:<br> &gt; <br>
    &gt; * Should the ISA influence the toolchain via toolchain defaults or<br> &gt; dpkg-buildflags?<br>
    &gt; * How is the default ISA for a buildd chroot selected?<br>

    So the clear downsides of either modifying the default toolchain or<br>
    having to provide an additional one is that this seems pretty heavy<br>
    weight. Also because people might want to build optimized variants<br>
    locally w/o having to mess with their already existing toolchains.<br>
    (I&#39;m not sure whether something going along the lines of<br>
    &lt;<a href="https://git.hadrons.org/cgit/debian/fakecross.git" rel="noreferrer" target="_blank">https://git.hadrons.org/cgit/debian/fakecross.git</a>&gt; could be an<br>
    option, although as mentioned above, if that would imply new triplets,<br>
    then probably not.)<br>

    So the easiest way might indeed be by controlling this via an envvar,<br></blockquote><div><br></div><div>DEB_HOST_ARCH_ISA?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);
    padding-left:1ex">
    which dpkg-buildpackage could also setup internally via a new option,<br>
    say --arch-isa=amd64v3 or similar</blockquote><div><br></div><div>--host-arch-isa would be more coherent I think.</div><div><br></div><div>I guess one could add support for --target-host-arch-isa to build a toolchain that defaults to a particular ISA.
    But well.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> to make this slightly more<br>
    discoverable. Which would be easy to use from the buildds too I guess.<br></blockquote><div><br></div><div>I also think that (conceptually) it makes sense that you might want to have an build chroot that *uses* amd64v3 binaries (because your builder is
    amd64v3) to *produce* boring old amd64 binaries (I mean, I doubt gcc built with different march is so much faster that it really matters but...)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(
    204,204,204);padding-left:1ex">
    &gt; There is also the question of whether partial coverage of an ISA is handled<br>
    &gt; by the package publisher or client side in apt but that&#39;s at least one<br>
    &gt; level higher.<br>

    Yeah, that would be of no concern to dpkg, I think.<br></blockquote><div><br></div><div>Ack.</div><div><br></div><div>So to summarise, here are the generic changes that I think need to be made to src:dpkg to support variant ISAs as a thing:</div><div
    class="gmail_quote"><br></div> * add get_host_arch_isa() to Dpkg::Arch<br> * dpkg-gencontrol records DEB_HOST_ARCH_ISA into DEBIAN/control as ArchitectureIsa<br> * dpkg-architecture emits DEB_HOST_ARCH_ISA and grows --host-arch-isa flag</div><div
    class="gmail_quote"> * dpkg-buildpackage passes --host-arch-isa to dpkg-architecture</div><div class="gmail_quote"> * dpkg-genchanges should record the ISA in the changes file somehow I guess?</div><div class="gmail_quote"> * dpkg-deb records the ISA
    in the file name</div><div class="gmail_quote"><br></div><div class="gmail_quote">Have I missed anything? (Hmm does anything need to reject unknown values found in DEB_HOST_ARCH_ISA /  --host-arch-isa? Probably...)</div><div class="gmail_quote"><br></
    <div class="gmail_quote">Conceptually slightly separately, it might make sense to add a build &quot;feature&quot; to Dpkg::Vendor::Debian to allow setting -march (and -mtune?)</div><div class="gmail_quote"><br></div><div class="gmail_quote">Then when
    we want to add support to an ISA, we add a little thing to set_build_features (in either Vendor::Debian or Vendor::Ubuntu or wherever) that maps get_host_arch_isa() to values for the march-influencing feature.</div><div class="gmail_quote"><div><br></div>
    <div>Cheers,</div><div>mwh <br></div></div></div>

  • From Guillem Jover@21:1/5 to Michael Hudson-Doyle on Fri Nov 3 14:10:01 2023
    Hi!

    On Thu, 2023-11-02 at 15:27:54 +0000, Michael Hudson-Doyle wrote:
    On Tue, 31 Oct 2023 at 09:21, Guillem Jover wrote:
    This can be used among other things to set up foreign chroots, by
    running the host tools, so disallowing installation could be
    problematic. Even though I guess there could be a warning about this,
    or maybe it could be controlled through a force option, although both
    seem like they could be disruptive.

    Of course in such cases dpkg knows that something funny is going on and
    could suppress the warning itself.

    Right, also true.

    I spent a few minutes trying to think hard about this and I honestly don't think I can predict whether trying to prevent installation of incompatible packages is worth it (after all one of the ways users could get into
    trouble would be moving an installed system to a different CPU and having binaries start to fail and obviously dpkg can't help there).

    One result of this thinking was: I had been thinking/assuming the issue of which variants to consider would be apt configuration, but maybe dpkg configuration would make more sense (after all, --add-architecture is a parameter to dpkg). And in this case, dpkg could object when installing a variant that has not been configured.

    Yes, the "plan" has been to add support for run-time CPU attributes,
    so that when adding a new arch, for example you can specify whether
    that arch is runnable, which could help dpkg decide whether to allow
    by default to install M-A:foreign packages.

    I guess this is similar, so such future interface should probably take
    this into account as something to support too. Will check where this
    is tracked and add a note to it.

    And of course that is fine as a guardrail, but if a user hit that out
    of running a frontend, then that would already be too late, which to
    me means that frontends need to be aware of this too (and not pass
    packages that dpkg would/could/might refuse to install), when deciding
    what to pass to dpkg.

    But in any case, as you say, this currently would not be worse than
    configuring a foreign arch, installing some foreign package and trying
    to run it, but it might make it potentially more common. And as
    mentioned above, the layer where this needs to be decided seems
    higher anyway (even if dpkg could provide the infra for it).
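
    To make the guardrail idea concrete, here is a minimal sketch of the
    check a frontend (or dpkg itself) could apply before installing a
    variant package. The configuration structure and the function name are
    hypothetical; nothing like this exists in dpkg today, and the baseline
    is assumed to be expressed as "no ISA" or "ISA equal to the arch name".

```python
# Hypothetical configuration: baseline architecture -> ISA variants the
# administrator has enabled on this system.
CONFIGURED_VARIANTS = {"amd64": {"amd64v3"}}


def may_install(pkg_arch, pkg_isa, configured=CONFIGURED_VARIANTS):
    """Return True if a package built for pkg_isa may be installed."""
    if pkg_isa is None or pkg_isa == pkg_arch:
        # Baseline packages are always acceptable for a configured arch.
        return True
    # Variant packages are only acceptable if explicitly configured.
    return pkg_isa in configured.get(pkg_arch, set())


if __name__ == "__main__":
    print(may_install("amd64", None))
    print(may_install("amd64", "amd64v3"))
    print(may_install("arm64", "arm64v9"))
```

    A frontend would run the same check when choosing candidates, so that
    it never hands dpkg a package that dpkg would then refuse.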

    If the only change in the package filename format is in the <arch> part where we'd use a name which would otherwise be valid as an arch name (so, no weird symbols, or «-» separators that are not intended to split <os> and <cpu> or similar), then using a name for the variant/ISA would be
    fine.

    Right. I think that (when possible) pretending e.g. "amd64v3" is a distinct architecture will generally make things easier. E.g. I think britney
    wouldn't need to know about the relationship between "amd64" and "amd64v3".

    I guess that depends on whether the intention is to create a full
    optimized archive, or just a partial overlay one. In the latter case
    then it might need to know to be able to satisfy dependencies.

    That would be one solution yes, which could give automatic bijective mappings, although ideally with a machine-readable way to get at it,
    which I'm not sure we have currently.

    I think "gcc -Q --help=target | grep -e '^\s*-march'" is about as machine readable as it gets currently, for better or worse (mostly worse).

    That does not look very satisfactory, though. And llvm/clang does not
    support it. :/
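
    For illustration, a minimal sketch of what scraping that output could
    look like. The sample text only approximates the layout printed by
    "gcc -Q --help=target" (the exact spacing is an assumption), and
    parse_march() is a made-up helper, not anything in dpkg.

```python
import re

# Sample text shaped like part of `gcc -Q --help=target` output.
# The exact spacing is an assumption; only the "-march=" line matters.
SAMPLE_OUTPUT = """\
The following options are target specific:
  -m64                        \t\t[enabled]
  -march=                     \t\tx86-64-v3
  -mtune=                     \t\tgeneric
"""

# Match only within a single line, so an empty -march= value cannot
# accidentally capture the option on the following line.
MARCH_RE = re.compile(r"^[ \t]*-march=[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def parse_march(help_target_output):
    """Return the -march value found in the text, or None if absent."""
    match = MARCH_RE.search(help_target_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(parse_march(SAMPLE_OUTPUT))
```

    This works only as long as the human-oriented layout stays stable,
    which is why it is unsatisfactory as an interface.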

    For example code in dpkg-dev
    already runs «$CC -dumpmachine» to infer the host architecture to use during builds.

    While using a triplet variation could be a way to do that, that would require such triplet support for each variant/ISA, which tends to be
    very painful to introduce if it's not there already, so I'd not
    consider this specific way a viable option.

    I admit I'm not an expert on triplet intricacies but I think a new triplet
    is not appropriate here (a bit like a new Debian architecture for a variant/ISA choice is not the right concept).

    We have i386 or arm (?) as (bad IMO) examples where the triplet can
    define the arch baseline. The problem is that this requires updating
    the GNU config.git upstream, and then getting that to trickle down into
    every package that might be using autotools and not using autoreconf
    at build time, or to even update triplet matches in configure scripts
    and similar, which might be "acceptable" for a new arch, but seems disproportionate for a new ISA, so yes, as mentioned I agree it's not
    viable.

    On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:
    * Should the ISA influence the toolchain via toolchain defaults or dpkg-buildflags?
    * How is the default ISA for a buildd chroot selected?

    So the clear downsides of either modifying the default toolchain or
    having to provide an additional one is that this seems pretty heavy
    weight. Also because people might want to build optimized variants
    locally w/o having to mess with their already existing toolchains.
    (I'm not sure whether something going along the lines of <https://git.hadrons.org/cgit/debian/fakecross.git> could be an
    option, although as mentioned above, if that would imply new triplets,
    then probably not.)

    So the easiest way might indeed be by controlling this via an envvar,

    DEB_HOST_ARCH_ISA?

    Yeah, that works, and follows the current DEB_*_ARCH_ABI lead for
    example.

    which dpkg-buildpackage could also setup internally via a new option,
    say --arch-isa=amd64v3 or similar

    --host-arch-isa would be more coherent I think.

    Ah absolutely! For some reason had --arch in mind as a valid option
    (I only see it now in dpkg-scanpackages :D, or maybe I was thinking
    about --host-isa :).

    I guess one could add support for --target-host-arch-isa to build a
    toolchain that defaults to a particular ISA. But well.

    Yes, the ISA support in dpkg should be extensive enough (so that if
    this needs to be supported in the toolchain, then it is possible).

    So to summarise, here are the generic changes that I think need to be made
    to src:dpkg to support variant ISAs as a thing:

    * add get_host_arch_isa() to Dpkg::Arch

    Yes (perhaps as mentioned below also just get_host_isa()).

    * dpkg-gencontrol records DEB_HOST_ARCH_ISA into DEBIAN/control as ArchitectureIsa

    Probably better Architecture-Isa, otherwise the current automatic
    case folding would make it end up as Architectureisa.
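
    A small sketch of the folding effect described above. This is a
    simplified model (lowercase everything, then capitalise the first
    letter of each hyphen-separated part), written for illustration only;
    it is not dpkg's actual code, and fold_field_name() is a made-up name.

```python
def fold_field_name(name):
    """Lowercase the name, then capitalise each hyphen-separated part."""
    return "-".join(part.capitalize() for part in name.lower().split("-"))


if __name__ == "__main__":
    for field in ("ArchitectureIsa", "Architecture-Isa"):
        print(field, "->", fold_field_name(field))
```

    Under this model only the hyphenated spelling survives a round trip
    unchanged.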

    * dpkg-architecture emits DEB_HOST_ARCH_ISA and grows --host-arch-isa flag

    Also DEB_BUILD_ARCH_ISA and DEB_TARGET_ARCH_ISA, and also grows a --target-arch-isa (but I'm thinking whether the shorter --host-isa would
    be nicer, for example the --match-bits are not spelled --match-arch-bits,
    which would seem also a bit redundant).

    * dpkg-buildpackage passes --host-arch-isa to dpkg-architecture

    Yes, but only when not the baseline.
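
    As a sketch of that rule: the --host-arch-isa option is the one
    proposed in this thread and does not exist in dpkg-architecture yet.
    The helper name is made up, and treating "no ISA" or "ISA equal to the
    arch name" as the baseline is an assumption.

```python
def dpkg_architecture_args(host_arch, host_isa=None):
    """Build an argument list, adding the ISA flag only for non-baseline."""
    args = ["dpkg-architecture", f"--host-arch={host_arch}"]
    if host_isa is not None and host_isa != host_arch:
        # Proposed option from this thread, not an existing one.
        args.append(f"--host-arch-isa={host_isa}")
    return args


if __name__ == "__main__":
    print(dpkg_architecture_args("amd64"))
    print(dpkg_architecture_args("amd64", "amd64v3"))
```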

    * dpkg-genchanges should record the ISA in the changes file somehow I
    guess?

    Yes, also dpkg-genbuildinfo. This could be done either from the
    envvars, or perhaps through the debian/files attributes support. But
    given that this is supposedly build global (I think it would be rather
    weird to end up with a .changes including say _amd64.deb and
    _amd64v3.deb file references from the same build), then probably using
    the envvar might be the better way.

    * dpkg-deb records the ISA in the file name

    Yes.
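
    A minimal sketch of the resulting naming, following the
    binpkg_1.0-1_arm64v9.deb example from earlier in the thread. The
    helper and the restriction of ISA names to lowercase letters and
    digits are assumptions made for illustration.

```python
import re

# Assumed constraint: an ISA name contains no "-" or other symbols, so it
# cannot be confused with the <os>-<cpu> split in architecture names.
ISA_NAME_RE = re.compile(r"[a-z0-9]+")


def deb_filename(package, version, arch, isa=None):
    """Return the .deb file name, using the ISA name in the arch slot."""
    if isa is not None and not ISA_NAME_RE.fullmatch(isa):
        raise ValueError(f"invalid ISA name: {isa!r}")
    arch_part = isa if isa is not None else arch
    return f"{package}_{version}_{arch_part}.deb"


if __name__ == "__main__":
    print(deb_filename("binpkg", "1.0-1", "arm64"))
    print(deb_filename("binpkg", "1.0-1", "arm64", isa="arm64v9"))
```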

    Have I missed anything?

    Nothing else comes to mind right now (except what I might have already
    added).

    (Hmm does anything need to reject unknown values
    found in DEB_HOST_ARCH_ISA / --host-arch-isa? Probably...)

    I guess it indeed makes sense to define what ISAs are supported, and
    either error out or warn and ignore such values. So there might be a
    need to add something like a new data/isatable.
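
    To sketch what such a table and check might look like: the file
    format below (one "<isa> <arch>" pair per line) is purely
    hypothetical, as are the function names. The strict flag picks
    between the two behaviours mentioned, erroring out or warning and
    ignoring the value.

```python
import warnings

# Hypothetical table contents: "<isa> <baseline arch>" per line.
ISATABLE_TEXT = """\
# isa       arch
amd64v3     amd64
arm64v9     arm64
"""


def load_isatable(text):
    """Parse the table text into a dict mapping ISA name -> arch name."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        isa, arch = line.split()
        table[isa] = arch
    return table


def check_isa(table, host_arch, requested_isa, strict=True):
    """Return the ISA if it is known for host_arch, else error or warn."""
    if requested_isa is None:
        return None
    if table.get(requested_isa) == host_arch:
        return requested_isa
    message = f"unknown ISA {requested_isa!r} for architecture {host_arch!r}"
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return None  # fall back to the baseline


if __name__ == "__main__":
    table = load_isatable(ISATABLE_TEXT)
    print(check_isa(table, "amd64", "amd64v3"))
```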

    Conceptually slightly separately, it might make sense to add a build "feature" to Dpkg::Vendor::Debian to allow setting -march (and -mtune?)

    Then when we want to add support to an ISA, we add a little thing to set_build_features (in either Vendor::Debian or Vendor::Ubuntu or wherever) that maps get_host_arch_isa() to values for the march-influencing feature.

    Hmm, right, how to hook this. I'm not sure the current interface is good
    enough to describe this via build flags features, because such new feature
    area would expose arch-specific features. I have been thinking through
    the build flags and will try to send a proposal/RFC to revamp parts of
    it during the weekend. But I think the ISA stuff is better just handled
    (at least for now) directly by injecting whatever flags when the requested
    ISA is different to the baseline.
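
    A minimal sketch of that direct injection. The mapping from ISA names
    to -march values is an assumption (x86-64-v2 and x86-64-v3 are values
    GCC accepts for -march; the amd64v2/amd64v3 names follow the naming
    used in this thread), and isa_flags() is a made-up helper.

```python
# Assumed mapping from ISA name to the compiler's -march value.
ISA_MARCH = {
    "amd64v2": "x86-64-v2",
    "amd64v3": "x86-64-v3",
}


def isa_flags(host_arch, host_isa=None):
    """Return extra compiler flags; empty when building for the baseline."""
    if host_isa is None or host_isa == host_arch:
        return []
    try:
        return [f"-march={ISA_MARCH[host_isa]}"]
    except KeyError:
        raise ValueError(f"no flags known for ISA {host_isa!r}") from None


if __name__ == "__main__":
    print(isa_flags("amd64"))
    print(isa_flags("amd64", "amd64v3"))
```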

    Thanks,
    Guillem

  • From Michael Hudson-Doyle@21:1/5 to Guillem Jover on Thu Nov 23 05:50:01 2023
    On Sat, 4 Nov 2023 at 02:02, Guillem Jover <guillem@debian.org> wrote:

    Hi!

    On Thu, 2023-11-02 at 15:27:54 +0000, Michael Hudson-Doyle wrote:
    On Tue, 31 Oct 2023 at 09:21, Guillem Jover wrote:
    This can be used among other things to set up foreign chroots, by
    running the host tools, so disallowing installation could be
    problematic. Even though I guess there could be a warning about this,
    or maybe it could be controlled through a force option, although both seem like they could be disruptive.

    Of course in such cases dpkg knows that something funny is going on and could suppress the warning itself.

    Right, also true.

    I spent a few minutes trying to think hard about this and I honestly don't
    think I can predict whether trying to prevent installation of incompatible
    packages is worth it (after all one of the ways users could get into
    trouble would be moving an installed system to a different CPU and having
    binaries start to fail and obviously dpkg can't help there).

    One result of this thinking was: I had been thinking/assuming the issue of
    which variants to consider would be apt configuration, but maybe dpkg
    configuration would make more sense (after all, --add-architecture is a
    parameter to dpkg). And in this case, dpkg could object when installing a
    variant that has not been configured.

    Yes, the "plan" has been to add support for run-time CPU attributes,
    so that when adding a new arch, for example you can specify whether
    that arch is runnable, which could help dpkg decide whether to allow
    by default to install M-A:foreign packages.


    Ah. Would this be a change to /var/lib/dpkg/arch or an additional file or
    ...?


    I guess this is similar, so such future interface should probably take
    this into account as something to support too. Will check where this
    is tracked and add a note to it.


    Did you find this place? :)


    And of course that is fine as a guardrail, but if a user hit that out
    of running a frontend, then that would already be too late, which to
    me means that frontends need to be aware of this too (and not pass
    packages that dpkg would/could/might refuse to install), when deciding
    what to pass to dpkg.


    Good point.


    But in any case, as you say, this currently would not be worse than configuring a foreign arch, installing some foreign package and trying
    to run it, but it might make it potentially more common. And as
    mentioned above, the layer where this needs to be decided seems
    higher anyway (even if dpkg could provide the infra for it).

    If the only change in the package filename format is in the <arch> part
    where we'd use a name which would otherwise be valid as an arch name (so,
    no weird symbols, or «-» separators that are not intended to split <os>
    and <cpu> or similar), then using a name for the variant/ISA would be
    fine.

    Right. I think that (when possible) pretending e.g. "amd64v3" is a distinct
    architecture will generally make things easier. E.g. I think britney
    wouldn't need to know about the relationship between "amd64" and "amd64v3".

    I guess that depends on whether the intention is to create a full
    optimized archive, or just a partial overlay one. In the latter case
    then it might need to know to be able to satisfy dependencies.


    Maybe! Depends on details I think.


    That would be one solution yes, which could give automatic bijective mappings, although ideally with a machine-readable way to get at it, which I'm not sure we have currently.

    I think "gcc -Q --help=target | grep -e '^\s*-march'" is about as machine readable as it gets currently, for better or worse (mostly worse).

    That does not look very satisfactory, though.


    Agreed!


    And llvm/clang does not support it. :/


    Ah I did not know that.


    For example code in dpkg-dev
    already runs «$CC -dumpmachine» to infer the host architecture to use during builds.

    While using a triplet variation could be a way to do that, that would require such triplet support for each variant/ISA, which tends to be
    very painful to introduce if it's not there already, so I'd not
    consider this specific way a viable option.

    I admit I'm not an expert on triplet intricacies but I think a new triplet
    is not appropriate here (a bit like a new Debian architecture for a
    variant/ISA choice is not the right concept).

    We have i386 or arm (?) as (bad IMO) examples where the triplet can
    define the arch baseline. The problem is that this requires updating
    the GNU config.git upstream, and then getting that to trickle down into
    every package that might be using autotools and not using autoreconf
    at build time, or to even update triplet matches in configure scripts
    and similar, which might be "acceptable" for a new arch, but seems disproportionate for a new ISA, so yes, as mentioned I agree it's not
    viable.


    OK. Let's stop worrying about that then :)


    On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:
    * Should the ISA influence the toolchain via toolchain defaults or dpkg-buildflags?
    * How is the default ISA for a buildd chroot selected?

    So the clear downsides of either modifying the default toolchain or having to provide an additional one is that this seems pretty heavy weight. Also because people might want to build optimized variants locally w/o having to mess with their already existing toolchains.
    (I'm not sure whether something going along the lines of <https://git.hadrons.org/cgit/debian/fakecross.git> could be an
    option, although as mentioned above, if that would imply new triplets, then probably not.)

    So the easiest way might indeed be by controlling this via an envvar,

    DEB_HOST_ARCH_ISA?

    Yeah, that works, and follows the current DEB_*_ARCH_ABI lead for
    example.

    which dpkg-buildpackage could also setup internally via a new option,
    say --arch-isa=amd64v3 or similar

    --host-arch-isa would be more coherent I think.

    Ah absolutely! For some reason had --arch in mind as a valid option
    (I only see it now in dpkg-scanpackages :D, or maybe I was thinking
    about --host-isa :).

    I guess one could add support for --target-host-arch-isa to build a toolchain that defaults to a particular ISA. But well.

    Yes, the ISA support in dpkg should be extensive enough (so that if
    this needs to be supported in the toolchain, then it is possible).

    So to summarise, here are the generic changes that I think need to be made
    to src:dpkg to support variant ISAs as a thing:

    * add get_host_arch_isa() to Dpkg::Arch

    Yes (perhaps as mentioned below also just get_host_isa()).

    * dpkg-gencontrol records DEB_HOST_ARCH_ISA into DEBIAN/control as ArchitectureIsa

    Probably better Architecture-Isa, otherwise the current automatic
    case folding would make it end up as Architectureisa.


    Heh OK.


    * dpkg-architecture emits DEB_HOST_ARCH_ISA and grows --host-arch-isa flag

    Also DEB_BUILD_ARCH_ISA and DEB_TARGET_ARCH_ISA, and also grows a --target-arch-isa (but I'm thinking whether the shorter --host-isa would
    be nicer, for example the --match-bits are not spelled --match-arch-bits, which would seem also a bit redundant).

    * dpkg-buildpackage passes --host-arch-isa to dpkg-architecture

    Yes, but only when not the baseline.


    I think what I meant here was that if you pass one of the --*-arch-isa
    flags to dpkg-buildpackage, it gets passed along to dpkg-architecture (as
    --host-arch etc are).

    * dpkg-genchanges should record the ISA in the changes file somehow I
    guess?

    Yes, also dpkg-genbuildinfo.


    Oh yeah. Although on a quick look dpkg-genbuildinfo gets the architecture
    from the filename...


    This could be done either from the
    envvars, or perhaps through the debian/files attributes support. But
    given that this is supposedly build global (I think it would be rather
    weird to end up with a .changes including say _amd64.deb and
    _amd64v3.deb file references from the same build),


    Ah yes that's true.


    then probably using
    the envvar might be the better way.

    * dpkg-deb records the ISA in the file name

    Yes.

    Have I missed anything?

    Nothing else comes to mind right now (except what I might have already added).


    Well of course I wrote this question before going back and adding the
    stuff about having this all be driven by dpkg --add-architecture isa
    amd64v3 or anything like that. So those changes probably need to be hashed
    out too?


    (Hmm does anything need to reject unknown values
    found in DEB_HOST_ARCH_ISA / --host-arch-isa? Probably...)

    I guess it indeed makes sense to define what ISAs are supported, and
    either error out or warn and ignore such values. So there might be a
    need to add something like a new data/isatable.


    I notice that --add-architecture doesn't seem to do any checking.


    Conceptually slightly separately, it might make sense to add a build "feature" to Dpkg::Vendor::Debian to allow setting -march (and -mtune?)

    Then when we want to add support to an ISA, we add a little thing to
    set_build_features (in either Vendor::Debian or Vendor::Ubuntu or wherever)
    that maps get_host_arch_isa() to values for the march-influencing feature.

    Hmm, right, how to hook this. I'm not sure the current interface is good enough to describe this via build flags features, because such new feature area would expose arch-specific features. I have been thinking through
    the build flags and will try to send a proposal/RFC to revamp parts of
    it during the weekend.


    Did that happen? I didn't see it if so.


    But I think the ISA stuff is better just handled
    (at least for now) directly by injecting whatever flags when the requested
    ISA is different to the baseline.


    Well that's easy enough too.

    Cheers & thanks for the continued thoughts,
    mwh

    <div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, 4 Nov 2023 at 02:02, Guillem Jover &lt;<a href="mailto:guillem@debian.org">guillem@debian.org</a>&gt; wrote:<br></div><blockquote class="gmail_
    quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi!<br>

    On Thu, 2023-11-02 at 15:27:54 +0000, Michael Hudson-Doyle wrote:<br>
    &gt; On Tue, 31 Oct 2023 at 09:21, Guillem Jover wrote:<br>
    &gt; &gt; This can be used among other things to set up foreign chroots, by<br> &gt; &gt; running the host tools, so disallowing installation could be<br>
    &gt; &gt; problematic. Even though I guess there could be a warning about this,<br>
    &gt; &gt; or maybe it could be controlled through a force option, although both<br>
    &gt; &gt; seems like they could be disruptive.<br>

    &gt; Of course in such cases dpkg knows that something funny is going on and<br>
    &gt; could suppress the warning itself.<br>

    Right, also true.<br>

    &gt; I spent a few minutes trying to think hard about this and I honestly don&#39;t<br>
    &gt; think I can predict whether trying to prevent installation of incompatible<br>
    &gt; packages is worth it (after all one of the ways users could get into<br> &gt; trouble would be moving an installed system to a different CPU and having<br>
    &gt; binaries start to fail and obviously dpkg can&#39;t help there).<br>
    &gt; <br>
    &gt; One result of this thinking was: I had been thinking/assuming the issue of<br>
    &gt; which variants to consider would be apt configuration, but maybe dpkg<br> &gt; configuration would make more sense (after all, --add-architecture is a<br>
    &gt; parameter to dpkg). And in this case, dpkg could object when installing a<br>
    &gt; variant that has not been configured.<br>

    Yes, the &quot;plan&quot; has been to add support for run-time CPU attributes,<br>
    so that when adding a new arch, for example you can specify whether<br>
    that arch is runnable, which could help dpkg decide whether to allow<br>
    by default to install M-A:foreign packages.<br></blockquote><div><br></div><div>Ah. Would this be a change to /var/lib/dpkg/arch or an additional file or ...?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:
    1px solid rgb(204,204,204);padding-left:1ex">
    I guess this is similar, so such future interface should probably take<br>
    this into account as something to support too. Will check where this<br>
    is tracked and add a note to it.<br></blockquote><div><br></div><div>Did you find this place? :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    And of course that is fine as a guardrail, but if a user hit that out<br>
    of running a frontend, then that would already be too late, which to<br>
    me means that frontends need to be aware of this too (and not pass<br>
    packages that dpkg would/could/might refuse to install), when deciding<br>
    what to pass to dpkg.<br></blockquote><div><br></div><div>Good point.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    But in any case, as you say, this currently would not be worse than<br> configuring a foreign arch, installing some foreign package and trying<br>
    to run it, but it might make it potentially more common. And as<br>
    mentioned above the effecting layer this needs to be decided up seems<br> higher anyway (even if dpkg could provide the infra for it).<br>

    &gt; &gt; If the only change in the package filename format is in the &lt;arch&gt; part<br>
    &gt; &gt; where we&#39;d use a name which would otherwise be valid as an arch name (so,<br>
    &gt; &gt; no weird symbols, or «-» separators that are not intended to split &lt;os&gt;<br>
    &gt; &gt; and &lt;cpu&gt; or similar), then using a name for the variant/ISA would be<br>
    &gt; &gt; fine.<br>

    &gt; Right. I think that (when possible pretending e.g. &quot;amd64v3&quot; is a distinct<br>
    &gt; architecture will generally make things easier. E.g. I think britney<br> &gt; wouldn&#39;t need to know about the relationship between &quot;amd64&quot; and &quot;amd64v3&quot;.<br>

    I guess that depends on whether the intention is to create a full<br>
    optimized archive, or just a partial overlay one. In the latter case<br>
    then it might need to know to be able to satisfy dependencies.<br></blockquote><div><br></div><div>Maybe! Depends on details I think.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);
    padding-left:1ex">
    &gt; &gt; That would be one solution yes, which could give automatic bijective<br>
    &gt; &gt; mappings, although ideally with a machine-readable way to get at it,<br>
    &gt; &gt; which I&#39;m not sure we have currently.<br>

    &gt; I think &quot;gcc -Q --help=target | grep -e &#39;^\s*-march&#39;&quot; is about as machine<br>
    &gt; readable as it gets currently, for better or worse (mostly worse).<br>

    That does not look very satisfactory, though. </blockquote><div><br></div><div>Agreed!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">And llvm/clang does not
    support it. :/<br></blockquote><div><br></div><div>Ah I did not know that.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt; For example code in dpkg-dev<br>
    &gt; &gt; already runs «$CC -dumpmachine» to infer the host architecture to use<br>
    &gt; &gt; during builds.<br>
    &gt; &gt;<br>
    &gt; &gt; While using a triplet variation could be a way to do that, that would<br>
    &gt; &gt; require such triplet support for each variant/ISA, which tends to be<br>
    &gt; &gt; very painful to introduce if it&#39;s not there already, so I&#39;d not<br>
    &gt; &gt; consider this specific way a viable option.<br>
    &gt; <br>
    &gt; I admit I&#39;m not an expert on triplet intricacies but I think a new triplet<br>
    &gt; is not appropriate here (a bit like a new Debian architecture for a<br> &gt; variant/ISA choice is not the right concept).<br>

    We have i386 or arm (?) as (bad IMO) examples where the triplet can<br>
    define the arch baseline. The problem is that this requires updating<br>
    the GNU config.git upstream, and then getting that to trickle down into<br> every package that might be using autotools and not using autoreconf<br>
    at build time, or to even update triplet matches in configure scripts<br>
    and similar, which might be &quot;acceptable&quot; for a new arch, but seems<br>
    disproportionate for a new ISA, so yes, as mentioned I agree it&#39;s not<br> viable.<br></blockquote><div><br></div><div>OK. Let&#39;s stop worrying about that then :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt; On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:<br> &gt; &gt; &gt; * Should the ISA influence the toolchain via toolchain defaults or<br>
    &gt; &gt; &gt; dpkg-buildflags?<br>
    &gt; &gt; &gt; * How is the default ISA for a buildd chroot selected?<br>
    &gt; &gt;<br>
    &gt; &gt; So the clear downsides of either modifying the default toolchain or<br>
    &gt; &gt; having to provide an additional one is that this seems pretty heavy<br>
    &gt; &gt; weight. Also because people might want to build optimized variants<br>
    &gt; &gt; locally w/o having to mess with their already existing toolchains.<br>
    &gt; &gt; (I&#39;m not sure whether something going along the lines of<br>
    &gt; &gt; &lt;<a href="https://git.hadrons.org/cgit/debian/fakecross.git" rel="noreferrer" target="_blank">https://git.hadrons.org/cgit/debian/fakecross.git</a>&gt; could be an<br>
    &gt; &gt; option, although as mentioned above, if that would imply new triplets,<br>
    &gt; &gt; then probably not.)<br>
    &gt; &gt;<br>
    &gt; &gt; So the easiest way might indeed be by controlling this via an envvar,<br>
    &gt; <br>
    &gt; DEB_HOST_ARCH_ISA?<br>

    Yeah, that works, and follows the current DPKG_*_ARCH_ABI lead for<br> example.<br>

    &gt; &gt; which dpkg-buildpackage could also setup internally via a new option,<br>
    &gt; &gt; say --arch-isa=amd64v3 or similar<br>

    &gt; --host-arch-isa would be more coherent I think.<br>

    Ah absolutely! For some reason had --arch in mind as a valid option<br>
    (I only see it now in dpkg-scanpackages :D, or maybe I was thinking<br>
    about --host-isa :).<br>

    &gt; I guess one could add support for --target-host-arch-isa to build a<br> &gt; toolchain that defaults to a particular ISA. But well.<br>

    Yes, the ISA support in dpkg should be extensive enough (so that if<br>
    this needs to be supported in the toolchain, then it is possible).<br>

    &gt; So to summarise, here are the generic changes that I think need to be made<br>
    &gt; to src:dpkg to support variant ISAs as a thing:<br>
    &gt; <br>
    &gt;  * add get_host_arch_isa() to Dpkg::Arch<br>

    Yes (perhaps as mentioned below also just get_host_isa()).<br>

    &gt;  * dpkg-gencontrol records DEB_HOST_ARCH_ISA into DEBIAN/control as<br> &gt; ArchitectureIsa<br>

    Probably better Architecture-Isa, otherwise the current automatic<br>
    case folding would make it end up as Architectureisa.<br></blockquote><div><br></div><div>Heh OK.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt;  * dpkg-architecture emits DEB_HOST_ARCH_ISA and grows --host-arch-isa flag<br>

    Also DEB_BUILD_ARCH_ISA and DEB_TARGET_ARCH_ISA, and also grows a<br> --target-arch-isa (but I&#39;m thinking whether the shorter --host-isa would<br>
    be nicer, for example the --match-bits are not spelled --match-arch-bits,<br> which would seem also a bit redundant).<br>

    &gt;  * dpkg-buildpackage passes --host-arch-isa to dpkg-architecture<br>

    Yes, but only when not the baseline.<br></blockquote><div><br></div><div>I think what I meant here was that if you pass one of the --*-arch-isa flags to dpkg-buildpackage, it gets passed along to dpkg-architecture (as --host-arch etc are).</div><div><br></
    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt;  * dpkg-genchanges should record the ISA in the changes file somehow I<br>
    &gt; guess?<br>

    Yes, also dpkg-genbuildinfo.</blockquote><div><br></div><div>Oh yeah. Although on a quick look dpkg-genbuildinfo gets the architecture from the filename...</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px
    solid rgb(204,204,204);padding-left:1ex"> This could be done either from the<br>
    envvars, or perhaps through the debian/files attributes support. But<br>
    given that this is supposedly build global (I think it would be rather<br> weird to end up with a .changes including say _amd64.deb and<br>
    _amd64v3.deb file references from the same build),</blockquote><div><br></div><div>Ah yes that&#39;s true.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> then
    probably using<br>
    the envvar might be the better way.<br>

    &gt;  * dpkg-deb records the ISA in the file name<br>

    Yes.<br>

    &gt; Have I missed anything?<br>

    Nothing else comes to mind right now (except what I might have already<br> added).<br></blockquote><div><br></div><div>Well of course I wrote this question in before going back and adding the stuff about having this all be driven by dpkg --add-architecture isa amd64v3 or anything like that. So those changes probably need to be
    hashed out too?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; (Hmm does anything need to reject unknown values<br>
    &gt; found in DEB_HOST_ARCH_ISA /  --host-arch-isa? Probably...)<br>

    I guess it indeed makes sense to define what ISAs are supported, and<br>
    either error out or warn and ignore such values. So there might be a<br>
    need to add something like a new data/isatable.<br></blockquote><div><br></div><div>I notice that --add-architecture doesn&#39;t seem to do any checking.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px
    solid rgb(204,204,204);padding-left:1ex">
    &gt; Conceptually slightly separately, it might make sense to add a build<br> &gt; &quot;feature&quot; to Dpkg::Vendor::Debian to allow setting -march (and -mtune?)<br>
    &gt; <br>
    &gt; Then when we want to add support to an ISA, we add a little thing to<br> &gt; set_build_features (in either Vendor::Debian or Vendor::Ubuntu or wherever)<br>
    &gt; that maps get_host_arch_isa() to values for the march-influencing feature.<br>

    Hmm, right, how to hook this. I&#39;m not sure the current interface is good<br>
    enough to describe this via build flags features, because such new feature<br> area would expose arch-specific features. I have been thinking through<br>
    the build flags and will try to send a proposal/RFC to revamp parts of<br>
    it during the weekend. </blockquote><div><br></div><div>Did that happen? I didn&#39;t see it if so.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">But I think the
    ISA stuff is better just handled<br>
    (at least for now) directly by injecting whatever flags when the requested<br> ISA is different to the baseline.<br></blockquote><div><br></div><div>Well that&#39;s easy enough too.</div><div><br></div><div>Cheers &amp; thanks for the continued thoughts,</div><div>mwh</div><div> </div></div></div>
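
The closing suggestions in the message above (define which ISAs are supported, reject unknown values, and inject flags only when the requested ISA differs from the baseline) can be sketched as follows. This is an illustration in Python rather than dpkg's Perl, not an existing dpkg interface: `DEB_HOST_ARCH_ISA` is the variable name proposed in this thread, `ISA_TABLE` and `isa_build_flags` are hypothetical names, and pairing `amd64v3` with GCC's `-march=x86-64-v3` is an assumption about how such a table might be filled in.

```python
import os

# Hypothetical stand-in for the "data/isatable" idea from the thread:
# baseline architecture -> {ISA name: extra compiler flags}.
# Only "amd64v3" is named in the discussion; the flag mapping is an assumption.
ISA_TABLE = {
    "amd64": {"amd64v3": ["-march=x86-64-v3"]},
}


def isa_build_flags(host_arch, host_isa=None):
    """Return extra compiler flags for the requested ISA.

    An unset ISA, or one equal to the baseline architecture, adds no flags.
    An ISA that is not listed for the architecture raises ValueError.
    """
    if not host_isa or host_isa == host_arch:
        return []
    known = ISA_TABLE.get(host_arch, {})
    if host_isa not in known:
        raise ValueError(
            f"unknown ISA {host_isa!r} for architecture {host_arch!r}"
        )
    return list(known[host_isa])


if __name__ == "__main__":
    # DEB_HOST_ARCH_ISA is the proposed (not yet existing) variable name.
    requested = os.environ.get("DEB_HOST_ARCH_ISA")
    print(" ".join(isa_build_flags("amd64", requested)))
```

Erroring out is only one of the two behaviours floated above; warning and ignoring the value would be the other, and would replace the `raise` with a warning and an empty result.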

  • From Julian Andres Klode@21:1/5 to Guillem Jover on Wed Nov 29 16:10:01 2023
    On Fri, Nov 03, 2023 at 02:02:37PM +0100, Guillem Jover wrote:
    Hi!

    On Thu, 2023-11-02 at 15:27:54 +0000, Michael Hudson-Doyle wrote:
    On Tue, 31 Oct 2023 at 09:21, Guillem Jover wrote:
    This can be used among other things to set up foreign chroots, by
    running the host tools, so disallowing installation could be
    problematic. Even though I guess there could be a warning about this,
    or maybe it could be controlled through a force option, although both seem like they could be disruptive.

    Of course in such cases dpkg knows that something funny is going on and could suppress the warning itself.

    Right, also true.

    I spent a few minutes trying to think hard about this and I honestly don't think I can predict whether trying to prevent installation of incompatible packages is worth it (after all one of the ways users could get into trouble would be moving an installed system to a different CPU and having binaries start to fail and obviously dpkg can't help there).

    One result of this thinking was: I had been thinking/assuming the issue of which variants to consider would be apt configuration, but maybe dpkg configuration would make more sense (after all, --add-architecture is a parameter to dpkg). And in this case, dpkg could object when installing a variant that has not been configured.

    Yes, the "plan" has been to add support for run-time CPU attributes,
    so that when adding a new arch, for example you can specify whether
    that arch is runnable, which could help dpkg decide whether to allow
    by default to install M-A:foreign packages.

    I guess this is similar, so such future interface should probably take
    this into account as something to support too. Will check where this
    is tracked and add a note to it.

    And of course that is fine as a guardrail, but if a user hit that out
    of running a frontend, then that would already be too late, which to
    me means that frontends need to be aware of this too (and not pass
    packages that dpkg would/could/might refuse to install), when deciding
    what to pass to dpkg.

    But in any case, as you say, this currently would not be worse than configuring a foreign arch, installing some foreign package and trying
    to run it, but it might make it potentially more common. And as
    mentioned above, the layer at which this needs to be decided seems
    higher anyway (even if dpkg could provide the infra for it).

    So I'd like to interject here for a moment with my APT hat, because I
    want to have ISA autodiscovery. I want to be able to configure the
    system such that APT automatically picks the best ISA to download,
    and not have to configure a specific one.

    Similar to when you pass -cpu host to qemu, I want a magic alias
    for dpkg ISA support to say "host" or whatever and APT can then go
    and pick the best one (possibly multiple ones).

    It's fine for dpkg to provide a list of all architectures and the
    preference ordering in that case I think.
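
A rough sketch of what such autodiscovery could look like on Linux/amd64, offered as an illustration only. The variant names `amd64v2` and `amd64v4` are assumptions by analogy with the `amd64v3` used in this thread, the function names are hypothetical, and the flag sets are an approximation of the x86-64 psABI levels using the flag spellings that `/proc/cpuinfo` typically reports (for example `abm` for LZCNT and `pni` for SSE3).

```python
# Feature sets per level, lowest first. Each level also requires every
# level before it, so the loop below stops at the first level that fails.
LEVELS = [
    ("amd64v2", {"cx16", "lahf_lm", "popcnt", "pni",
                 "sse4_1", "sse4_2", "ssse3"}),
    ("amd64v3", {"avx", "avx2", "bmi1", "bmi2", "f16c",
                 "fma", "abm", "movbe", "xsave"}),
    ("amd64v4", {"avx512f", "avx512bw", "avx512cd",
                 "avx512dq", "avx512vl"}),
]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed in /proc/cpuinfo."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def supported_variants(cpu_flags, baseline="amd64"):
    """Return the variants this CPU can run, most preferred first."""
    flags = set(cpu_flags)
    usable = [baseline]
    for name, required in LEVELS:
        if not required <= flags:
            break
        usable.append(name)
    return usable[::-1]
```

A frontend resolving a "host" alias could then walk the result of `supported_variants(read_cpu_flags())` in order and fall back to the baseline when a package has no optimized build.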

    --
    debian developer - deb.li/jak | jak-linux.org - free software dev
    ubuntu core developer i speak de, en

  • From Julian Andres Klode@21:1/5 to Guillem Jover on Fri May 3 11:30:02 2024
    On Wed, Sep 06, 2023 at 11:27:02AM +0200, Guillem Jover wrote:
    Hi!

    On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
    Recently the topic of exploiting newer instructions without dropping support for older machines has come up several times inside Ubuntu engineering. I understand this topic has come up several times in the past for Debian as well, but nothing has really come of it to date.

    I also had a chat about this with Matthias Klose (CCed) around 2022-05.

    I've spent a while thinking through the options and coming up with a design and wrote some notes into a wiki page: https://wiki.debian.org/ArchitectureVariants

    I think we are already doing 1, 2 and 3. I agree 4 is just wrong. And something like 5 is what I suggested to Matthias for Ubuntu when we
    last discussed it as the best way to go about this.

    I'm not sure I entirely agree with the requirements you set forth
    though:

    - I think such optimized builds might need to be done with "special
    toolchains" (these could simply be wrappers over the host compiler
    passing the appropriate flags via command-line or via specs or
    similar, not necessarily full blown toolchains), passing these via
    something like dpkg-buildflags seems currently unreliable, as I don't
    think we have full coverage in packages (neither for all compilers
    available)? Although it would be better as it would centralize the
    management. (For reference this is in part how rpm handles this:
    https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)
    - Perhaps that's a limitation from the archive software side, but
    requiring to place the binary packages in the same pool seems
    rather restrictive (it forces different filenames for example).
    - I guess it might be nice for the ISA to be passed down to the
    dpkg tools, but I don't think this is strictly necessary? A
    frontend like apt could also decide based on metadata in say the
    Release file, although not having the actual installed package
    metadata on whether it was a different ISA build or not would make
    its job more inconvenient. In any case I don't have a big issue
    with recording this via dpkg-gencontrol or similar if necessary.

    On the specific implementation details:

    - Changing the Architecture format (as in adding colons there) seems
    like a non-starter, and I expect that would break lots of things
    (I mean it could be done but I'm not sure it's worth it for this).
    Recording this mostly as a hint than anything else, via another
    field (if necessary at all) I think would be best.
    - As covered in previous discussions, dpkg could (but I don't think
    it's necessary) check whether the .deb is runnable on the current
    hw, but that's tricky as chrootless installs need to be taken
    into account, etc. It should certainly not be part of dependency
    resolution.
    - I'm not fond of having to change the binary package name format
    either for this (name_version_arch.deb) even if at least dpkg
    itself does not care (but I know other tools do care), and
    depending on the format I'd expect things to break (this goes
    back to the shared pool concern).
    - If dpkg-architecture needs to be aware of this, then this might need
    to be auto-detectable from just the current toolchain being used.

    Some of the above problems could perhaps be avoided if we introduced
    a concept of architecture aliases/ISAs (similar to what rpm has), which
    would side-step the pool sharing issue, the binary package renaming,
    etc. One big issue with this is that it requires for dpkg to have an exhaustive table of all such aliases, and if there's ever a new alias
    added, old dpkg versions need to be updated or they will not understand
    what they match with. So this does not seem ideal either. So I guess this
    is a variation over your proposal, but perhaps this could still be used
    in specific contexts, say only at build-time (but not for dependency relationships), for repo management (say binary-arm64v9/Packages.xz),
    or binary package names where the field would specify the actual name
    for the filename, say:

    Architecture: arm64
    ArchitectureIsa: arm64v9

    or maybe better:

    Architecture: arm64
    ArchitectureIsa: v9

    resulting in dpkg-deb generating:

    binpkg_1.0-1_arm64v9.deb

    but targeting arm64. I also think I prefer naming this explicitly as ISA variants, if you will, than just architecture variants as that gives
    way too much room (which perhaps we want, but then that has other implications over compatibility), and for the field perhaps just Isa is better, to avoid the implicit repetition of ArchitectureInstructionSetArchitecture :), but that makes it less easy
    to associate both as related.
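
To make the quoted filename proposal concrete, here is a small sketch of how the package filename could be derived from the two fields. It is only an illustration: `deb_filename` is a hypothetical helper, not dpkg-deb's actual logic, the field is spelled `ArchitectureIsa` as in the quoted text, and both value spellings floated above (`arm64v9` and `v9`) are accepted. Epoch handling in versions is left out.

```python
def deb_filename(control):
    """Build name_version_arch.deb, using the ISA name when one is set."""
    arch = control["Architecture"]
    isa = control.get("ArchitectureIsa", "")
    # Accept either the full form ("arm64v9") or the short form ("v9").
    if isa and not isa.startswith(arch):
        isa = arch + isa
    arch_part = isa or arch
    return f'{control["Package"]}_{control["Version"]}_{arch_part}.deb'


example = {
    "Package": "binpkg",
    "Version": "1.0-1",
    "Architecture": "arm64",
    "ArchitectureIsa": "v9",
}
print(deb_filename(example))
```

In this sketch the `Architecture` field stays `arm64`, so dependency resolution would be unaffected; only the filename carries the ISA.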

    I have thought more about this and I'm not particularly fond of the ArchitectureIsa name. While *this specific use case* is a variant of
    the architecture instruction set; you could just as well build other
    variants such as "compiled with -O3", "compiled with frame pointers",
    "compiled with -O0", or other shenanigans (I haven't thought about
    others outside compiler flags)

    Hence I prefer Architecture-Variant, Subarchitecture, or anything
    like that rather than have to invent another field or abuse this
    one the next time we want to build a special variant of an architecture
    with special optimizations for a special customer or whatnot.

    This has stalled a bit, but we really gotta get going on this, we
    should really have this working in APT by end of year or something.
    --
    debian developer - deb.li/jak | jak-linux.org - free software dev
    ubuntu core developer i speak de, en

  • From Matthias Klose@21:1/5 to Julian Andres Klode on Sat May 4 01:30:01 2024
    On 03.05.24 11:27, Julian Andres Klode wrote:
    On Wed, Sep 06, 2023 at 11:27:02AM +0200, Guillem Jover wrote:
    Hi!

    On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
    Recently the topic of exploiting newer instructions without dropping
    support for older machines has come up several times inside Ubuntu
    engineering. I understand this topic has come up several times in the past
    for Debian as well, but nothing has really come of it to date.

    I also had a chat about this with Matthias Klose (CCed) around 2022-05.

    I've spent a while thinking through the options and coming up with a design
    and wrote some notes into a wiki page:
    https://wiki.debian.org/ArchitectureVariants

    I think we are already doing 1, 2 and 3. I agree 4 is just wrong. And
    something like 5 is what I suggested to Matthias for Ubuntu when we
    last discussed it as the best way to go about this.

    I'm not sure I entirely agree with the requirements you set forth
    though:

    - I think such optimized builds might need to be done with "special
    toolchains" (these could simply be wrappers over the host compiler
    passing the appropriate flags via command-line or via specs or
    similar, not necessarily full blown toolchains), passing these via
    something like dpkg-buildflags seems currently unreliable, as I don't
    think we have full coverage in packages (neither for all compilers
    available)? Although it would be better as it would centralize the
    management. (For reference this is in part how rpm handles this:
    https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)
    - Perhaps that's a limitation from the archive software side, but
    requiring to place the binary packages in the same pool seems
    rather restrictive (it forces different filenames for example).
    - I guess it might be nice for the ISA to be passed down to the
    dpkg tools, but I don't think this is strictly necessary? A
    frontend like apt could also decide based on metadata in say the
    Release file, although not having the actual installed package
    metadata on whether it was a different ISA build or not would make
    its job more inconvenient. In any case I don't have a big issue
    with recording this via dpkg-gencontrol or similar if necessary.

    On the specific implementation details:

    - Changing the Architecture format (as in adding colons there) seems
    like a non-starter, and I expect that would break lots of things
    (I mean it could be done but I'm not sure it's worth it for this).
    Recording this mostly as a hint than anything else, via another
    field (if necessary at all) I think would be best.
    - As covered in previous discussions, dpkg could (but I don't think
    it's necessary) check whether the .deb is runnable on the current
    hw, but that's tricky as chrootless installs need to be taken
    into account, etc. It should certainly not be part of dependency
    resolution.
    - I'm not fond of having to change the binary package name format
    either for this (name_version_arch.deb) even if at least dpkg
    itself does not care (but I know other tools do care), and
    depending on the format I'd expect things to break (this goes
    back to the shared pool concern).
    - If dpkg-architecture needs to be aware of this, then this might need
    to be auto-detectable from just the current toolchain being used.

    Some of the above problems could perhaps be avoided if we introduced
    a concept of architecture aliases/ISAs (similar to what rpm has), which
    would side-step the pool sharing issue, the binary package renaming,
    etc. One big issue with this is that it requires for dpkg to have an
    exhaustive table of all such aliases, and if there's ever a new alias
    added, old dpkg versions need to be updated or they will not understand
    what they match with. So this does not seem ideal either. So I guess this
    is a variation over your proposal, but perhaps this could still be used
    in specific contexts, say only at build-time (but not for dependency
    relationships), for repo management (say binary-arm64v9/Packages.xz),
    or binary package names where the field would specify the actual name
    for the filename, say:

    Architecture: arm64
    ArchitectureIsa: arm64v9

    or maybe better:

    Architecture: arm64
    ArchitectureIsa: v9

    resulting in dpkg-deb generating:

    binpkg_1.0-1_arm64v9.deb

    but targeting arm64. I also think I prefer naming this explicitly as ISA
    variants, if you will, than just architecture variants as that gives
    way too much room (which perhaps we want, but then that has other
    implications over compatibility), and for the field perhaps just Isa is
    better, to avoid the implicit repetition of
    ArchitectureInstructionSetArchitecture :), but that makes it less easy
    to associate both as related.

    I have thought more about this and I'm not particularly fond of the ArchitectureIsa name. While *this specific use case* is a variant of
    the architecture instruction set; you could just as well build other
    variants such as "compiled with -O3", "compiled with frame pointers", "compiled with -O0", or other shenanigans (I haven't thought about
    others outside compiler flags)

    or
    - DistroBuiltWithClang, e.g. using libc++ instead of libstdc++.
    - distro built with some of the sanitizers turned on by default

    Hence I prefer Architecture-Variant, Subarchitecture, or anything
    like that rather than have to invent another field or abuse this
    one the next time we want to build a special variant of an architecture
    with special optimizations for a special customer or whatnot.

    Yes, that sounds a bit too specific.

  • From Guillem Jover@21:1/5 to Matthias Klose on Tue May 7 01:30:01 2024
    On Sat, 2024-05-04 at 01:07:59 +0200, Matthias Klose wrote:
    On 03.05.24 11:27, Julian Andres Klode wrote:
    On Wed, Sep 06, 2023 at 11:27:02AM +0200, Guillem Jover wrote:
    Some of the above problems could perhaps be avoided if we introduced
    a concept of architecture aliases/ISAs (similar to what rpm has), which would side-step the pool sharing issue, the binary package renaming,
    etc. One big issue with this is that it requires for dpkg to have an exhaustive table of all such aliases, and if there's ever a new alias added, old dpkg versions need to be updated or they will not understand what they match with. So this does not seem ideal either. So I guess this is a variation over your proposal, but perhaps this could still be used in specific contexts, say only at build-time (but not for dependency relationships), for repo management (say binary-arm64v9/Packages.xz),
    or binary package names where the field would specify the actual name
    for the filename, say:

    Architecture: arm64
    ArchitectureIsa: arm64v9

    or maybe better:

    Architecture: arm64
    ArchitectureIsa: v9

    resulting in dpkg-deb generating:

    binpkg_1.0-1_arm64v9.deb

    but targeting arm64. I also think I prefer naming this explicitly as ISA variants, if you will, than just architecture variants as that gives
    way too much room (which perhaps we want, but then that has other implications over compatibility), and for the field perhaps just Isa is better, to avoid the implicit repetition of ArchitectureInstructionSetArchitecture :), but that makes it less easy
    to associate both as related.

    I have thought more about this and I'm not particularly fond of the ArchitectureIsa name. While *this specific use case* is a variant of
    the architecture instruction set; you could just as well build other variants such as "compiled with -O3", "compiled with frame pointers", "compiled with -O0", or other shenanigans (I haven't thought about
    others outside compiler flags)

    These look really out of scope for the initially proposed purpose,
    as they have nothing to do with the architecture, they are rebuilds
    with different compilation flags. Creating a new pseudo-architecture
    for those things seems just wrong and out of place. I understand that
    might seem convenient, but it does not make sense to me, conceptually
    and from a design PoV.

    or
    - DistroBuiltWithClang, e.g. using libc++ instead of libstdc++.
    - distro built with some of the sanitizers turned on by default

    These should already either be tracked as part of dependencies, or be potentially ABI incompatible (AFAIUI for some sanitizers) which would
    require a new arch and triplet anyway.

    Hence I prefer Architecture-Variant, Subarchitecture, or anything
    like that rather than have to invent another field or abuse this
    one the next time we want to build a special variant of an architecture with special optimizations for a special customer or whatnot.

    Yes, that sounds a bit too specific.

    It seems as specific as it needs to be, TBH. The other proposed users
    do not really fit with the way this is supposed to be used.

    Regards,
    Guillem
