• armhf: abel.d.o hardware status ?

    From Mathieu Malaterre@21:1/5 to All on Wed Jun 29 09:00:01 2022
    [cc me please]

    Dear armhf gurus,

    Could someone please confirm that abel.d.o hardware is almost like a
    good old RaspberryPi Model 2B ? I am trying to understand why valgrind
    is supposed to work on arm32/linux but fails miserably on abel.d.o.

    Thanks for your kind help

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wookey@21:1/5 to Mathieu Malaterre on Wed Jun 29 14:50:01 2022
    Abel is a Marvell Armada 370/XP CPU (4 cores) in the form of an MV78460 dev board.
    Marvell have their own architecture licence, so it's not an ARM (the company) design.

    It has these CPU features:
    half thumb fastmult vfp edsp thumbee vfpv3 tls idiva idivt vfpd32 lpae

    so that means it doesn't have NEON, which would trip up code assuming that it does.
    It's also a v7 core.
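
A minimal sketch of the feature check described above, written in Python for illustration. It assumes the flag list comes from the `Features` line that 32-bit ARM kernels print in /proc/cpuinfo; the helper names (`has_neon`, `read_cpu_features`) are hypothetical and not part of any Debian tooling.

```python
# Flag list quoted in the message above for abel.debian.org.
ABEL_FEATURES = "half thumb fastmult vfp edsp thumbee vfpv3 tls idiva idivt vfpd32 lpae"


def has_neon(features):
    """Return True if 'neon' appears as a whole word in a feature string."""
    return "neon" in features.split()


def read_cpu_features(path="/proc/cpuinfo"):
    """Return the first 'Features' value (ARM kernels), or '' if none is found."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            key, sep, value = line.partition(":")
            if sep and key.strip() == "Features":
                return value.strip()
    return ""


if __name__ == "__main__":
    print("abel has neon:", has_neon(ABEL_FEATURES))
```

On a porterbox, `has_neon(read_cpu_features())` would give the same answer for the live machine, assuming the kernel exposes a `Features` line.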

    The RPi Model 2B was originally a Broadcom BCM2836 (quad-core Cortex-A7)
    and later (v1.2) a BCM2837 (quad-core Cortex-A53). Both are ARM (the
    company) core designs, but the A53 is v8 and the A7 is v7 ISA.

    So abel and the original RPi 2B are similar in that both are v7, 4-core
    CPUs. But they have different HWCAPS and microarchitectures. (And the
    later A53/BCM2837 is quite different with a 64-bit v8 CPU)

    I'm failing to find the /proc/cpuinfo or HWCAPS spec for the Cortex-A7,
    but it does have NEON, so no, they are not 'the same'. If you want
    to see if this is the issue, try the 'harris' porterbox, which is a
    different v7 32-bit CPU (Freescale i.MX53), but does have NEON.

    What exactly is going wrong when you try to use valgrind?

    Wookey
    --
    Principal hats: Debian, Wookware, ARM
    http://wookware.org/

  • From Mathieu Malaterre@21:1/5 to wookey@wookware.org on Wed Jun 29 15:20:01 2022
    Hi Wookey !

    Well you should see something like this on abel.d.o:

    * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928224#27

    Basically anytime you build valgrind using gcc-11 or gcc-12 (debian
    sid package), you get this weird illegal instruction:

    ```
    % ./vg-in-place
    Illegal instruction
    ```

    The debian package seems to be affected as well:

    malat@abel ~ % valgrind /bin/true
    zsh: illegal hardware instruction valgrind /bin/true

    Discussing the issue with upstream seems to only lead to the following exchange:

    * https://sourceforge.net/p/valgrind/mailman/message/37674159/

    [...]
    So YES, current valgrind does support armhf (armv7l: 32-bit, hard
    floating point).
    [...]

    which makes it hard to report the actual issue upstream. I am not
    familiar with such low-level issues, so I have not managed to prepare
    an accurate and complete bug report.

    I also tested on harris, and valgrind seems to be running just fine:

    harris% valgrind /bin/true
    ==26697== Memcheck, a memory error detector
    ==26697== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
    ==26697== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
    ==26697== Command: /bin/true
    ==26697==
    ==26697==
    ==26697== HEAP SUMMARY:
    ==26697== in use at exit: 0 bytes in 0 blocks
    ==26697== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
    ==26697==
    ==26697== All heap blocks were freed -- no leaks are possible
    ==26697==
    ==26697== For lists of detected and suppressed errors, rerun with: -s
    ==26697== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

  • From Wookey@21:1/5 to Mathieu Malaterre on Wed Jun 29 17:40:01 2022
    I have a strong suspicion that this is neon-itis. The issue generally
    manifests as 'illegal instruction' (i.e. a NEON instruction is issued on
    hardware that isn't able to execute it). It has always been the case
    that software should not assume NEON is present on v7 (because it
    isn't on all hardware), and most code gets this right, but I've
    recently seen gcc putting those instructions into the startup code
    (where the C environment is set up and variables allocated), which gets
    executed _before_ any functions checking for which HWCAPS to enable,
    and thus which code to run.

    You can check if a binary contains NEON instructions using
    readelf -A

    and look for
    Tag_Advanced_SIMD_arch: NEONv1

    However, just because it's in the binary doesn't mean it's wrong. The
    binary may have been built using ifunc or other mechanisms to choose
    appropriate functions depending on whether or not NEON hardware is
    available.
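
The `readelf -A` check above can be scripted. The following Python sketch is one possible way to do it, assuming `readelf` is on the PATH and prints one `Tag_name: value` pair per line; the function names and the sample text are illustrative and were not captured from abel or harris.

```python
import subprocess


def simd_attribute(readelf_output):
    """Return the value of Tag_Advanced_SIMD_arch, or None if the tag is absent."""
    for line in readelf_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name == "Tag_Advanced_SIMD_arch":
            return value.strip()
    return None


def readelf_attributes(path):
    """Run 'readelf -A' on a file and return its text output."""
    result = subprocess.run(
        ["readelf", "-A", path], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative excerpt only, not real output from any Debian binary.
SAMPLE = """\
Attribute Section: aeabi
File Attributes
  Tag_CPU_arch: v7
  Tag_FP_arch: VFPv3
  Tag_Advanced_SIMD_arch: NEONv1
"""

if __name__ == "__main__":
    print(simd_attribute(SAMPLE))
```

As the message notes, a hit only says that NEON code is present somewhere in the object; it does not show that the code is reached on non-NEON hardware.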

    A simple check for whether this is your issue is just to run the same test on harris.debian.org.
    If it works OK there that strongly suggests you have a neon problem.

    Also if you run the program under gdb (on abel) and when it barfs do:
    (gdb) disassemble
    and look for instructions that start with 'v', like 'vmov.i32'
    that will confirm which instruction is tripping it up.
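
A rough sketch of the same inspection in Python, assuming the usual `disassemble` layout in which gdb marks the current instruction with `=>`. The listing below is made up for illustration, and the "starts with 'v'" test is only a hint: VFP instructions, which abel does support, also begin with 'v'.

```python
import re

# One line of gdb 'disassemble' output: optional '=>' marker, address,
# '<+offset>:', mnemonic, operands.
LINE = re.compile(r"^(=>)?\s*0x[0-9a-fA-F]+\s+<\+\d+>:\s+(\S+)\s*(.*)$")


def faulting_instruction(disassembly):
    """Return (mnemonic, operands) of the line marked '=>', or None."""
    for line in disassembly.splitlines():
        match = LINE.match(line)
        if match and match.group(1):
            return match.group(2), match.group(3)
    return None


def looks_like_simd_or_fp(mnemonic):
    """Heuristic from the message: VFP and NEON mnemonics start with 'v'."""
    return mnemonic.startswith("v")


# Made-up listing for illustration, not captured from abel.
SAMPLE = "\n".join([
    "Dump of assembler code for function main:",
    "   0x00010400 <+0>:\tpush\t{r7, lr}",
    "=> 0x00010404 <+4>:\tvmov.i32\tq8, #0",
    "End of assembler dump.",
])

if __name__ == "__main__":
    print(faulting_instruction(SAMPLE))
```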

    This bug has an example of the problem: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998043

    I got partway through a long followup with some details of possible
    fixes some months ago but got sidetracked (and oh look it's been
    pending for 6 months already).

    The reason this has broken appears to be that gcc has changed the way
    the fpu is specified/defaulted, so neon _and_ fp are enabled by
    default if no specific fpu option is given (i.e. we just set
    -march=armv7). It used to be that -march=armv7 implied +nosimd (or
    something like that; I never quite got to the bottom of it enough to
    be sure exactly what the right general or specific fix was).

    If you rebuild with
    -march=armv7-a+nosimd+nofp
    or
    -march=armv7-a+nosimd+fp
    you should be able to determine if being more explicit about the fp and simd(neon) instructions used makes it behave.

    It seems likely that you have hit this problem.
    I think this is the same thing too: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=982794
    (Firefox dying with illegal instruction on non-neon hardware)

    I _suspect_ that debian needs to change the default flags to actually
    say 'armv7+fp+nosimd' by default so that we get what we expect (and
    define as the base ISA) and it doesn't depend on what hardware the
    build was done on.

    Wookey
    --
    Principal hats: Debian, Wookware, ARM
    http://wookware.org/

  • From Jeffrey Walton@21:1/5 to wookey@wookware.org on Wed Jun 29 18:00:01 2022
    Also see GCC Bug 104455, where you can't specify just -march=armv7-a
    with GCC 11 (and probably above):
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104455

    GCC really screwed folks by requiring them to declare the ISA at
    compile time (like -march=armv7-a -mfpu=neon). You have to use the
    options to use the ISA, but then GCC thinks it can use it too.
    Meanwhile, your code is guarded at runtime while GCC's code SIGILL's.
    It's been a constant source of problems for me on x86, ARM and
    PowerPC.
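
For context, a runtime guard of the kind mentioned above usually reads the kernel's HWCAP bitmask before taking a NEON code path. The Python sketch below decodes that mask from raw auxiliary-vector bytes; it assumes a little-endian 32-bit ARM Linux process (pairs of 32-bit words, as found in /proc/self/auxv) and uses the `AT_HWCAP` and `HWCAP_NEON` values from the Linux headers. The function names are hypothetical.

```python
import struct

AT_NULL = 0            # terminator entry of the auxiliary vector
AT_HWCAP = 16          # auxv key holding the hardware capability bitmask
HWCAP_NEON = 1 << 12   # 32-bit ARM bit value; other architectures differ


def hwcap_from_auxv(data, pair_format="<II"):
    """Scan (key, value) pairs and return the AT_HWCAP value, or 0 if absent."""
    size = struct.calcsize(pair_format)
    for offset in range(0, len(data) - size + 1, size):
        key, value = struct.unpack_from(pair_format, data, offset)
        if key == AT_NULL:
            break
        if key == AT_HWCAP:
            return value
    return 0


def auxv_has_neon(data):
    """Return True if the NEON bit is set in the HWCAP mask found in data."""
    return bool(hwcap_from_auxv(data) & HWCAP_NEON)
```

On a 32-bit ARM system one could pass `open("/proc/self/auxv", "rb").read()`. The guard only protects code the program itself dispatches; it cannot help when the compiler has already placed NEON instructions in code that runs unconditionally, which is the failure described in this thread.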

    I also think Debian got it wrong recently when they tied NEON to
    ARMv7-a. Making the leap that ARMv7 includes NEON was simply a
    mistake. But I understand why they did it for their standard build
    configuration. They wanted to get rid of armel and ARMv5 support.

    Microsoft compilers got it right. You can use any ISA the compiler
    supports without options. It is up to you to guard the code properly
    at runtime. And when you use an option like /arch:AVX, that tells
    the compiler it can use up to the specific ISA.

    Jeff

  • From Mathieu Malaterre@21:1/5 to wookey@wookware.org on Thu Jun 30 08:50:01 2022
    On Wed, Jun 29, 2022 at 5:34 PM Wookey <wookey@wookware.org> wrote:
    [...]
    The reason this has broken appears to be that gcc has changed the way
    the fpu is specified/defaulted, so neon _and_ fp are enabled by
    default if no specific fpu option is given (i.e. we just set
    -march=armv7). It used to be that -march=armv7 implied +nosimd (or
    something like that; I never quite got to the bottom of it enough to
    be sure exactly what the right general or specific fix was).

    If you rebuild with
    -march=armv7-a+nosimd+nofp
    or
    -march=armv7-a+nosimd+fp
    you should be able to determine if being more explicit about the fp
    and simd (neon) instructions used makes it behave.

    This has been reported as:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1014091

    Thanks

  • From Mathieu Malaterre@21:1/5 to wookey@wookware.org on Thu Jun 30 08:30:01 2022
    If I compare gcc-10 vs gcc-11 I see:

    malat@abel ~ % gcc-10 --verbose
    Using built-in specs.
    COLLECT_GCC=gcc-10
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/10/lto-wrapper
    Target: arm-linux-gnueabihf
    Configured with: ../src/configure -v --with-pkgversion='Debian
    10.3.0-16' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2
    --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=arm-linux-gnueabihf- --enable-shared
    --enable-linker-build-id --libexecdir=/usr/lib
    --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
    --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object
    --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard
    --with-mode=thumb --disable-werror --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
    Thread model: posix
    Supported LTO compression algorithms: zlib zstd
    gcc version 10.3.0 (Debian 10.3.0-16)

    while

    malat@abel ~ % gcc-11 --verbose
    Using built-in specs.
    COLLECT_GCC=gcc-11
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/11/lto-wrapper
    Target: arm-linux-gnueabihf
    Configured with: ../src/configure -v --with-pkgversion='Debian
    11.3.0-3' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2
    --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=arm-linux-gnueabihf- --enable-shared
    --enable-linker-build-id --libexecdir=/usr/lib
    --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
    --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object
    --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a+fp --with-float=hard --with-mode=thumb
    --disable-werror --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
    Thread model: posix
    Supported LTO compression algorithms: zlib zstd
    gcc version 11.3.0 (Debian 11.3.0-3)

    Could someone confirm that the spec file is accurate for Debian armhf (no
    neon)? I fail to understand why the configuration would be different for
    us (--with-arch=armv7-a --with-fpu=vfpv3-d16 suddenly became
    --with-arch=armv7-a+fp).
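
The difference between the two "Configured with:" lines is easy to miss by eye. A small Python sketch that extracts and compares the architecture-related options, assuming the text has the whitespace-separated `--with-name=value` form shown above; the function names are hypothetical and the two sample strings are abbreviated from the listings in this message.

```python
ARCH_KEYS = ("with-arch", "with-fpu", "with-float", "with-mode")


def arch_options(configure_text):
    """Collect the architecture-related --with-* options into a dict."""
    options = {}
    for token in configure_text.split():
        if token.startswith("--"):
            name, _, value = token[2:].partition("=")
            if name in ARCH_KEYS:
                options[name] = value
    return options


def diff_options(old, new):
    """Return {name: (old_value, new_value)} for options that differ."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changed[name] = (old.get(name), new.get(name))
    return changed


# Abbreviated from the gcc-10 and gcc-11 output quoted above.
GCC10 = "--with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb"
GCC11 = "--with-arch=armv7-a+fp --with-float=hard --with-mode=thumb"

if __name__ == "__main__":
    print(diff_options(arch_options(GCC10), arch_options(GCC11)))
```

Only `with-arch` and `with-fpu` come out as changed, which matches the observation in the paragraph above.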

    If I read the doc online correctly:

    https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html

    states:

    -mfpu=name
    [...]
    The setting ‘auto’ is the default and is special. It causes the
    compiler to select the floating-point and Advanced SIMD instructions
    based on the settings of -mcpu and -march.

    In the case of valgrind I can see:

    ` -marm -mcpu=cortex-a8`

    I cannot find in the doc what 'cortex-a8' stands for: neon or not neon ?

    On Wed, Jun 29, 2022 at 5:34 PM Wookey <wookey@wookware.org> wrote:
    I _suspect_ that debian needs to change the default flags to actually
    say 'armv7+fp+nosimd' by default so that we get what we expect (and
    define as the base ISA) and it doesn't depend on what hardware the
    build was done on.

    Ah ! Now it starts to make sense.

  • From Mathieu Malaterre@21:1/5 to noloader@gmail.com on Thu Jun 30 08:40:01 2022
    Just for later reference, it turns out that the default clang version on
    abel.d.o is also generating those neon instructions when building
    valgrind. So at least two compilers are confused.

  • From Lennart Sorensen@21:1/5 to Mathieu Malaterre on Thu Jun 30 17:30:01 2022
    On Thu, Jun 30, 2022 at 08:28:42AM +0200, Mathieu Malaterre wrote:
    I cannot find in the doc what 'cortex-a8' stands for: neon or not neon ?

    Cortex-A8 always has NEON as far as I remember. Cortex-A9 unfortunately
    has it as an option, and of course some non-Cortex ARM implementations
    don't have NEON either. So cortex-a8 is not a valid choice as a generic
    ARMv7 armhf processor.

    --
    Len Sorensen
