• Really enable -fstack-clash-protection on armhf/armel?

    From Matthias Klose@21:1/5 to All on Thu Nov 23 11:20:01 2023
    XPost: linux.debian.ports.arm

    Hi,

    it looks like enabling this flag on armel/armhf is a little bit premature.

    Apparently it's not completely supported upstream, and might cause
    regressions, according to
    https://bugzilla.redhat.com/show_bug.cgi?id=1522678

    Is that a feature that the Debian ARM32 porters and the security team
    really want to support actively, despite the missing upstream support?

    In Ubuntu, people tracked down segfaults due to this change in at least valgrind and gnutls, maybe more.

    Thanks, Matthias

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Guillem Jover@21:1/5 to Matthias Klose on Fri Nov 24 01:40:01 2023
    XPost: linux.debian.ports.arm

    Hi!

    On Thu, 2023-11-23 at 10:45:33 +0100, Matthias Klose wrote:
    it looks like enabling this flag on armel/armhf is a little bit premature.

    Apparently it's not completely supported upstream, and might cause regressions, according to
    https://bugzilla.redhat.com/show_bug.cgi?id=1522678

    I note that this bug was closed in 2018-01, so the information therein
    might not be the most up-to-date?

    Is that a feature that the Debian ARM32 porters and the security team really want to support actively, despite the missing upstream support?

    According to https://bugs.debian.org/918914#73 there were no pending
    toolchain issues related to this. And I think the security team mostly
    deferred to the ports teams.

    In Ubuntu, people tracked down segfaults due to this change in at least valgrind and gnutls, maybe more.

    If there's some missing support somewhere that might make this a
    common thing instead of just affecting a handful of packages that
    could simply disable the flags, and the Arm porters consider that
    fixing that is not feasible in the short term, I guess it makes
    sense to stop emitting the flag for the arm32 arches. In the end
    I'd still defer to what the porters prefer, and I can easily revert
    that change for arm32 and queue it for a next upload if desired.

    Thanks,
    Guillem

  • From Emanuele Rocca@21:1/5 to Guillem Jover on Fri Nov 24 07:20:01 2023
    XPost: linux.debian.ports.arm

    Hello!

    On 2023-11-24 01:34, Guillem Jover wrote:
    According to https://bugs.debian.org/918914#73 there were no pending toolchain issues related to this.

    That is correct. The GCC maintainers at Arm confirm that
    stack-clash-protection is supported on 32 bit too.

    In case there are any bugs, which is of course possible, please file
    them and add debian-arm@ to X-Debbugs-CC.

    So far I'm only aware of an issue with plplot, which turned out to be an
    actual bug in the software that stack-clash-protection helped uncover: https://bugs.debian.org/1055228#24

  • From Matthias Klose@21:1/5 to Emanuele Rocca on Fri Nov 24 11:10:02 2023
    XPost: linux.debian.ports.arm

    On 24.11.23 07:19, Emanuele Rocca wrote:
    Hello!

    On 2023-11-24 01:34, Guillem Jover wrote:
    According to https://bugs.debian.org/918914#73 there were no pending
    toolchain issues related to this.

    That is correct. The GCC maintainers at Arm confirm that stack-clash-protection is supported on 32 bit too.

    yes, but it's a different implementation, that apparently breaks a few
    more things than on the other architectures where it is enabled.

    In case there are any bugs, which is of course possible, please file
    them and add debian-arm@ to X-Debbugs-CC.

    No, I will not do that. Sorry, but the task of the porters is NOT to
    put this kind of work on the shoulders of others, but to do this
    analysis themselves. You seem to rely on every other package maintainer
    to figure out these issues on their own. Please don't do that.

    Debian is the first distro to turn this on on armhf, but didn't do any
    checks or test rebuilds before turning this on.

    So far I'm only aware of an issue with plplot, which turned out to be an actual bug in the software that stack-clash-protection helped uncover: https://bugs.debian.org/1055228#24

    I have now filed
    https://bugs.launchpad.net/ubuntu/+source/libselinux/+bug/2044506
    to collect some information on what Ubuntu apparently hit.

    A major problem will be valgrind no longer working, causing issues in
    the test suites of other packages.

    Also after rebuilding libxml2, libarchive, gnutls28, libselinux without
    this flag on armhf, issues go away again. I'm not directly working on
    these, so can't give more information.

    Matthias

  • From Florian Weimer@21:1/5 to All on Fri Nov 24 11:20:01 2023
    XPost: linux.debian.ports.arm

    * Emanuele Rocca:

    Hello!

    On 2023-11-24 01:34, Guillem Jover wrote:
    According to https://bugs.debian.org/918914#73 there were no pending
    toolchain issues related to this.

    That is correct. The GCC maintainers at Arm confirm that stack-clash-protection is supported on 32 bit too.

    Jeff Law, the original designer of -fstack-clash-protection,
    disagrees:

    | So to reiterate, this is precisely the kind of problem we avoid by
    | having stack-clash specific prologues on the Red Hat Enterprise
    | Linux architectures. We didn't do a 32bit ARM implementation and
    | instead rely on the limited protections provided by the Ada
    | -fstack-check bits.

    <https://bugzilla.redhat.com/show_bug.cgi?id=1522678#c1>

    And as far as I can see the code has not changed since then.

    It's a bit unfortunate that GCC accepts the -fstack-clash-protection
    flag even if target support is not really there.

    Note that RISC-V has the same problem, but at least Jeff has mid-term
    plans to fix that.

  • From Adrien Nader@21:1/5 to Matthias Klose on Fri Nov 24 12:30:01 2023
    XPost: linux.debian.ports.arm

    Hi,

    Short introduction: I work at Canonical in the Foundations team and made
    changes in gnutls, which is one of the packages that first
    encountered/caused issues, which then started blocking various
    migrations and changes.

    On Fri, Nov 24, 2023, Matthias Klose wrote:
    On 24.11.23 07:19, Emanuele Rocca wrote:
    Hello!

    On 2023-11-24 01:34, Guillem Jover wrote:
    According to https://bugs.debian.org/918914#73 there were no pending toolchain issues related to this.

    That is correct. The GCC maintainers at Arm confirm that stack-clash-protection is supported on 32 bit too.

    yes, but it's a different implementation, that apparently breaks a few more things than on the other architectures where it is enabled.

    In case there are any bugs, which is of course possible, please file
    them and add debian-arm@ to X-Debbugs-CC.

    No, I will not do that. Sorry, but the task of the porters is NOT to put this kind of work on the shoulders of others, but to do this analysis themselves. You seem to rely on every other package maintainer to figure out these issues on their own. Please don't do that.

    Debian is the first distro to turn this on on armhf, but didn't do any
    checks or test rebuilds before turning this on.

    So far I'm only aware of an issue with plplot, which turned out to be an actual bug in the software that stack-clash-protection helped uncover: https://bugs.debian.org/1055228#24

    I filed now
    https://bugs.launchpad.net/ubuntu/+source/libselinux/+bug/2044506
    to collect some information what Ubuntu apparently hit.

    Thanks. I put some details on https://code.launchpad.net/~adrien-n/ubuntu/+source/dpkg/+git/dpkg/+merge/456181
    and I'll expand the information on the bug but I need a couple hours
    first. I expected the topic to be shorter somehow (it was late in the
    day :) ).

    A major problem will be valgrind stopping to work, causing issues in the
    test suites of other packages.

    Also after rebuilding libxml2, libarchive, gnutls28, libselinux without this flag on armhf, issues go away again. I'm not directly working on these, so can't give more information.

    I'm not opposed to investigating the issues but the number of failures
    we'll get is still unknown, and their source and whether it would
    actually be due to the use of valgrind aren't clear. In any case, the
    failure under valgrind is 100% unexploitable. I want to look at that
    plplot bug in order to understand how this helped find an actual bug
    because what I've seen so far doesn't lend itself to quick analysis.

    What I'm not convinced of is that packages should be uploaded in that
    state. As far as I understand, it's possible to work on the libraries of
    a single package at a time, and a test rebuild followed by an (emulated)
    autopkgtest should be enough; iterating maybe wouldn't be incredibly
    fast, but still probably much faster than iterating through the archive.
    Moreover, a local build is probably needed anyway because AFAICT there's
    nothing to learn from the current test logs.

    I'm not here to tell anyone how to run Debian, and it's probably worth
    noting that we're still early in the current Debian cycle while we're
    quite far into the Ubuntu cycle for an LTS release (plus the holiday
    season). This might lead to different solutions, but in any case the
    change and the breadth and depth of its consequences were a surprise;
    this is recent, yet the problematic packages were piling up really
    quickly.

    Reflecting a bit more on this: would the issues raised be always
    similar? I mean, if we expect the same kind of issues in most packages
    and the same solutions, we should make a guide for maintainers so they
    can address this quickly. And if it's likely different every time, we
    need to think about the maintainers' time and availability.

    --
    Adrien

  • From Wookey@21:1/5 to Guillem Jover on Sat Nov 25 01:40:01 2023
    XPost: linux.debian.ports.arm

    On 2023-11-24 01:34 +0100, Guillem Jover wrote:
    On Thu, 2023-11-23 at 10:45:33 +0100, Matthias Klose wrote:
    it looks like enabling this flag on armel/armhf is a little bit premature.

    In Ubuntu, people tracked down segfaults due to this change in at least valgrind and gnutls, maybe more.

    If there's some missing support somewhere that might make this a
    common thing instead of just affecting a handful of packages that
    could simply disable the flags, and the Arm porters consider that
    fixing that is not feasible in the short term, I guess it makes
    sense to stop emitting the flag for the arm32 arches.

    Assuming this problem only affects some packages they can have their
    build flags adjusted in the short term. dpkg-buildflags makes this straightforward.

    And we can investigate and fix in the longer term.

    So I don't think we need to turn it off for the whole architecture
    unless we find loads of stuff that is broken.
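A minimal sketch of the per-package opt-out described above, under two assumptions: that the dpkg hardening feature is named `stackclash` (the name used later in this thread), and that a package turns it off through the `DEB_BUILD_MAINT_OPTIONS` variable read by dpkg-buildflags. The helper `maint_options_for` is hypothetical and only illustrates the "arm32 only" condition; a real package would normally set the variable directly in its debian/rules.

```python
# Hypothetical helper: compute the DEB_BUILD_MAINT_OPTIONS value a package
# could export so that -fstack-clash-protection is dropped on 32-bit Arm only.
# Assumption: the dpkg hardening feature is called "stackclash".

ARM32_ARCHES = frozenset({"armel", "armhf"})


def maint_options_for(host_arch: str, existing: str = "") -> str:
    """Return the options string, with stackclash disabled on arm32 arches."""
    options = existing.split()
    if host_arch not in ARM32_ARCHES:
        # Other architectures keep whatever the package already requested.
        return " ".join(options)
    for i, opt in enumerate(options):
        if opt.startswith("hardening="):
            # Extend the existing hardening entry instead of adding a second one.
            options[i] = opt + ",-stackclash"
            break
    else:
        options.append("hardening=-stackclash")
    return " ".join(options)


if __name__ == "__main__":
    for arch in ("amd64", "armhf", "armel"):
        print(arch, "->", repr(maint_options_for(arch, "hardening=+all")))
```

Keeping the condition on the host architecture means the flag stays enabled everywhere else, which matches the suggestion not to turn it off for the whole architecture.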

    Are there any bug reports on how to reproduce the issues?

    I just tried building gnutls28 both with and without
    -fstack-clash-protection.
    It is one test better with -fstack-clash-protection enabled: dtls/dtls-resume.sh

              -fstack-clash-protection
              enabled    disabled
    TOTAL:        501         501
    PASS:         461         460
    SKIP:          20          20
    XFAIL:          0           0
    FAIL:          20          21
    XPASS:          0           0
    ERROR:          0           0

    So that's worthy of investigation, but suggests there is a problem
    here which stack-clash-protection isn't making worse.
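The comparison of the two gnutls28 runs can be checked mechanically. The sketch below copies the counters from the table above, verifies that each column adds up to its TOTAL, and reports only the counters that differ; the function names are illustrative.

```python
# Test-suite summaries copied from the gnutls28 table above.
ENABLED = {"TOTAL": 501, "PASS": 461, "SKIP": 20, "XFAIL": 0,
           "FAIL": 20, "XPASS": 0, "ERROR": 0}
DISABLED = {"TOTAL": 501, "PASS": 460, "SKIP": 20, "XFAIL": 0,
            "FAIL": 21, "XPASS": 0, "ERROR": 0}


def is_consistent(summary):
    """Check that the per-outcome counts add up to TOTAL."""
    return sum(v for k, v in summary.items() if k != "TOTAL") == summary["TOTAL"]


def delta(enabled, disabled):
    """Return enabled-minus-disabled for every counter that differs."""
    return {k: enabled[k] - disabled[k]
            for k in enabled if enabled[k] != disabled[k]}


if __name__ == "__main__":
    print(is_consistent(ENABLED), is_consistent(DISABLED))
    print(delta(ENABLED, DISABLED))
```

The only difference is one test moving from FAIL to PASS when the flag is enabled, with 20 failures present either way.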

    Some additional info from Richard Earnshaw:
    ---
    Note that for valgrind, I suspect the problem is that it has not been
    updated for the following, relatively recent, relaxation in the AAPCS:

    6.2.1.3 Stack probing
    In order to ensure stack integrity a process may emit stack probes
    immediately prior to allocating additional stack space (moving SP from
    SP_old to SP_new). Stack probes must be in the region of
    [SP_new, SP_old - 1] and may be either read or write operations. The
    minimum interval for stack probing is defined by the target platform but
    must be a minimum of 4KBytes. No recoverable data can be saved below the
    currently allocated stack region.

    Prior to this addition (2018Q4) all accesses below SP were forbidden,
    and I think that's what valgrind still implements.
    ---
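To make the mismatch concrete, here is a simplified model of the two rules in the quoted text. It assumes a 4096-byte probe interval and a downward-growing stack; the function names are hypothetical, and this is neither GCC's probe sequence nor valgrind's actual check, only an illustration of why a checker applying the pre-2018Q4 rule would flag every probe.

```python
# Simplified model of the AAPCS stack-probing rule quoted above.
# Assumptions: the stack grows downwards and probes are 4096 bytes apart.

PROBE_INTERVAL = 4096


def probe_addresses(sp_old, sp_new, interval=PROBE_INTERVAL):
    """Yield one probe address per interval between SP_old and SP_new."""
    addr = sp_old - interval
    while addr >= sp_new:
        yield addr
        addr -= interval


def allowed_by_new_rule(addr, sp_old, sp_new):
    """Probes must fall in the region [SP_new, SP_old - 1]."""
    return sp_new <= addr <= sp_old - 1


def allowed_by_old_rule(addr, sp_current):
    """Before 2018Q4, any access below the current SP was forbidden."""
    return addr >= sp_current


if __name__ == "__main__":
    sp_old = 0x10000
    sp_new = sp_old - 10000  # allocate 10000 bytes of stack
    for addr in probe_addresses(sp_old, sp_new):
        print(hex(addr),
              allowed_by_new_rule(addr, sp_old, sp_new),
              allowed_by_old_rule(addr, sp_old))
```

Every probe is emitted before SP moves, so it lies below the current SP: valid under the relaxed rule, an error under the older one.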

    So that does sound like valgrind needs an update for this, and yes it
    would have been better if that wasn't a surprise. My initial feeling
    is that we should just fix that, rather than reverting at this stage,
    but I understand what Adrien says about the Ubuntu cycle
  • From Moritz Muehlenhoff@21:1/5 to Guillem Jover on Mon Nov 27 18:10:01 2023
    XPost: linux.debian.ports.arm

    On Fri, Nov 24, 2023 at 01:34:21AM +0100, Guillem Jover wrote:
    Is that a feature that the Debian ARM32 porters and the security team really
    want to support actively, despite the missing upstream support?

    According to https://bugs.debian.org/918914#73 there were no pending toolchain issues related to this. And I think the security team mostly deferred to the ports teams.

    Indeed. From our PoV anything beyond amd64 is fully at the discretion of the respective porters to decide whether it makes sense or not.

    Cheers,
    Moritz

  • From Emanuele Rocca@21:1/5 to Matthias Klose on Wed Nov 29 21:30:02 2023
    XPost: linux.debian.ports.arm

    Hi Matthias,

    On 2023-11-24 10:50, Matthias Klose wrote:
    On 24.11.23 07:19, Emanuele Rocca wrote:
    In case there are any bugs, which is of course possible, please file
    them and add debian-arm@ to X-Debbugs-CC.

    No, I will not do that. Sorry, but the task of the porters is NOT to put this kind of work on the shoulders of others, but to do this analysis themselves. You seem to rely on every other package maintainer to figure out these issues on their own. Please don't do that.

    I'm sorry if that is the impression you got. What I was trying to say
    is: if you know that something is broken, please let us know because we
    are not aware of any issues.

    Debian is the first distro to turn this on on armhf, but didn't do any
    checks or test rebuilds before turning this on.

    I have rebuilt and tested a few key packages myself (clearly not
    valgrind, heh).

    It's true that we should have done a full archive rebuild, you're right.
    I asked for it in August and did not get any reply until October, by
    which time the flag was already enabled (and we had no reports of
    breakage at that point):
    https://lists.debian.org/debian-arm/2023/08/msg00024.html

    Now I've asked Lucas to go ahead with an armhf rebuild so that hopefully
    we'll get a clearer picture.

    One day debusine will make all this smooth and easy. :-)
    https://wiki.debian.org/DebianEvents/gb/2023/MiniDebConfCambridge/Zini

    I filed now
    https://bugs.launchpad.net/ubuntu/+source/libselinux/+bug/2044506
    to collect some information what Ubuntu apparently hit.

    A major problem will be valgrind stopping to work, causing issues in the
    test suites of other packages.

    Thank you.

  • From Emanuele Rocca@21:1/5 to Matthias Klose on Thu Nov 30 18:00:01 2023
    XPost: linux.debian.ports.arm

    Hi,

    On 2023-11-24 10:50, Matthias Klose wrote:
    A major problem will be valgrind stopping to work, causing issues in the
    test suites of other packages.

    Also after rebuilding libxml2, libarchive, gnutls28, libselinux without this flag on armhf, issues go away again.

    FTR there is no issue in Debian with any of the above in my tests.
    Also the packages don't seem to use valgrind at any point: not when
    building, not in the autopkgtests.

    Full build logs including autopkgtest output here: https://people.debian.org/~ema/armhf-stack-clash-protection/

    What exactly did not work in Ubuntu and how? Perhaps there are
    additional jobs running valgrind in CI that may explain the failures?

    Thanks,
    Emanuele

  • From Julian Andres Klode@21:1/5 to Matthias Klose on Mon Jan 8 11:50:01 2024
    XPost: linux.debian.ports.arm

    On Thu, Nov 23, 2023 at 10:45:33AM +0100, Matthias Klose wrote:
    Hi,

    it looks like enabling this flag on armel/armhf is a little bit premature.

    Apparently it's not completely supported upstream, and might cause regressions, according to
    https://bugzilla.redhat.com/show_bug.cgi?id=1522678

    Is that a feature that the Debian ARM32 porters and the security team really want to support actively, despite the missing upstream support?

    In Ubuntu, people tracked down segfaults due to this change in at least valgrind and gnutls, maybe more.

    It's 1.5 months later, valgrind is still failing, and hence apt
    segfaults under valgrind. I am disabling the apt valgrind test on armhf
    in 2.7.8, but this situation is somewhat untenable.

    I have now cloned the bug to
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1060251

    --
    debian developer - deb.li/jak | jak-linux.org - free software dev
    ubuntu core developer i speak de, en

  • From Emanuele Rocca@21:1/5 to Wookey on Wed Feb 14 09:40:02 2024
    XPost: linux.debian.ports.arm

    Hi,

    On 2023-11-25 12:37, Wookey wrote:
    For Debian we'll keep an eye on it, do a belated rebuild to see how
    much of a problem we really have, and then decide if we should revert
    it too until some stuff is fixed.

    I now finally have some data to share. In total, out of the whole Debian
    archive, 4 packages fail to build because of stackclash on armhf and 2
    on armel. Additionally, 5 packages have failing autopkgtests.

    The main issue really is the open valgrind bug on armhf when checking
    programs built with stack-clash-protection:
    https://bugs.debian.org/1061496
    No problem on armel, given that valgrind is not supported at all there.

    The procedure I followed to get the FTBFS data was to start from the
    list of build failures kindly gathered by Lucas with his archive rebuild
    last month (see http://qa-logs.debian.net/2024/01/11/). I rebuilt all
    packages that failed, and it turns out that most had failed due to
    transient issues at the time. Then, starting from the list of my failed
    rebuilds, I performed another build, this time without stackclash.
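The triage described above can be sketched as a small classification loop. The `build` callback and the package names are placeholders, not real tooling: the callback stands in for whatever actually performs the rebuild and returns True on success.

```python
# Sketch of the rebuild triage: classify packages from an initial failure
# list. `build(pkg, stackclash)` is a hypothetical callback that rebuilds
# the package with or without the flag and returns True on success.

def triage(failed_packages, build):
    """Split failures into transient, stackclash-related, and other."""
    result = {"transient": [], "stackclash": [], "other": []}
    for pkg in failed_packages:
        if build(pkg, True):
            # Builds fine on retry with the flag: the first failure was transient.
            result["transient"].append(pkg)
        elif build(pkg, False):
            # Fails with the flag but builds without it.
            result["stackclash"].append(pkg)
        else:
            # Fails either way: unrelated to stack-clash-protection.
            result["other"].append(pkg)
    return result


if __name__ == "__main__":
    # Made-up outcomes for illustration, keyed by (package, stackclash).
    outcomes = {
        ("pkg-a", True): True,
        ("pkg-b", True): False, ("pkg-b", False): True,
        ("pkg-c", True): False, ("pkg-c", False): False,
    }
    print(triage(["pkg-a", "pkg-b", "pkg-c"],
                 lambda pkg, flag: outcomes[(pkg, flag)]))
```

Only packages in the "stackclash" bucket count towards the FTBFS figures attributed to the flag.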

    Note that of the 4 armhf FTBFS, 2 are due to the fact that the build
    process uses valgrind (#1061496). Additionally, the valgrind issue
    caused autopkgtest failures in 5 packages: apt, libgd2, libgssglue,
    libvorbis, and sndfile-tools.

    The workaround I've been suggesting for the FTBFS is to disable
    stackclash on armhf or armel for the few packages that fail building.
    For packages using valgrind in autopkgtest, I've been suggesting either
    to skip the tests that fail or disabling stackclash - on armhf only of
    course.

    For all of the above, I have filed bugs with the usertag
    32bit-stackclash:
    https://bugs.debian.org/cgi-bin/pkgreport.cgi?users=debian-arm@lists.debian.org;tag=32bit-stackclash
