• Altivec in baseline for ppc64?

    From Sébastien Villemot@21:1/5 to All on Tue Jul 13 14:10:02 2021
    Hi,

    The wiki page that synthesizes architecture specificities indicates
    that Altivec is included in the baseline for the ppc64 port:
    https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64

    However my understanding is that this port supports any powerpc64 CPU,
    including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
    also what the main wiki page for PPC64 says:
    https://wiki.debian.org/PPC64

    Can someone please clarify the situation?

    (I’m asking because I’m the maintainer of the openblas package, and
    knowing whether Altivec is available or not, and more generally what is
    in the baseline, is essential for proper packaging).

    --
    ⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
    ⣾⠁⢠⠒⠀⣿⡁  Debian Developer
    ⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
    ⠈⠳⣄⠀⠀⠀⠀  https://www.debian.org


  • From Mathieu Malaterre@21:1/5 to sebastien@debian.org on Tue Jul 13 19:00:01 2021
    Hi Sébastien,

    On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot <sebastien@debian.org> wrote:

    > Hi,
    >
    > The wiki page that synthesizes architecture specificities indicates
    > that Altivec is included in the baseline for the ppc64 port:
    > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
    >
    > However my understanding is that this port supports any powerpc64 CPU,
    > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
    > also what the main wiki page for PPC64 says:
    > https://wiki.debian.org/PPC64
    >
    > Can someone please clarify the situation?
    >
    > (I’m asking because I’m the maintainer of the openblas package, and
    > knowing whether Altivec is available or not, and more generally what is
    > in the baseline, is essential for proper packaging).

    I do not believe that you can do much as a packager. You cannot assume
    anything about the target arch. You need to do the same thing that ffmpeg
    does for avx2/sse4 on amd64: runtime detection. So unless upstream is
    doing something very clever, you cannot compile blas using any of the
    fancy altivec instructions :(
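
    As a rough illustration of what runtime detection can look like on Linux,
    here is a minimal sketch in Python. It assumes the kernel reports the vector
    unit with the phrase "altivec supported" on the "cpu" line of /proc/cpuinfo.
    The helper names (has_altivec, read_cpuinfo) and the two sample strings are
    hypothetical and hand-written; a real library would more likely query the
    ELF auxiliary vector from C.

```python
def has_altivec(cpuinfo_text: str) -> bool:
    """Return True if any 'cpu' line in the given text advertises AltiVec."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the exact key "cpu" counts; "cpu MHz" and similar keys are ignored.
        if sep and key.strip() == "cpu" and "altivec supported" in value.lower():
            return True
    return False


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the kernel's CPU description; return '' if it cannot be read."""
    try:
        with open(path, encoding="ascii", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


# Hand-written samples for illustration, not captured from real machines.
SAMPLE_G5 = "processor\t: 0\ncpu\t\t: PPC970MP, altivec supported\n"
SAMPLE_E5500 = "processor\t: 0\ncpu\t\t: e5500\n"

if __name__ == "__main__":
    print("AltiVec available:", has_altivec(read_cpuinfo()))
```

    A library would run such a check once at load time and then pick between a
    vectorized and a plain code path.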

    The man page for ld.so mentions something about optimized libraries
    (search for "/usr/lib/sse2/"), but this is currently not in use in
    Debian (AFAIK).

    2cts

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mathieu Malaterre@21:1/5 to sebastien@debian.org on Tue Jul 13 20:10:02 2021
    On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot <sebastien@debian.org> wrote:

    > Actually OpenBLAS has its own runtime detection mechanism, which is
    > used to select the best linear algebra kernel for the current CPU
    > (those kernels are mainly written in assembly, and take advantage of
    > available ISA extensions). This mechanism is used on several archs,
    > including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
    > a POWER9 kernel; there is even a POWER10 kernel already available).
    >
    > However, I cannot enable this mechanism on ppc64 and powerpc, because
    > the runtime detection only works for POWER6 and above, and my
    > understanding is that for these two ports the baseline is lower. Hence
    > on these two archs, only one kernel is included in the package binaries
    > (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
    > performance, users should recompile OpenBLAS locally (as indicated in
    > the package description and in README.Debian).

    There are plenty of people on this mailing list who could test/verify
    that. Is there a quick way to check that your openblas package is
    compiled correctly for ppc32 and ppc64 (like a verbose mode)? Did you
    run any experiments on perotto.debian.net?

    > I am however not sure that my current choices for the ppc64 and powerpc
    > baselines are optimal, hence this thread.

  • From Sébastien Villemot@21:1/5 to All on Tue Jul 13 19:30:02 2021
    Hi Mathieu,

    On Tuesday, 13 July 2021 at 18:56 +0200, Mathieu Malaterre wrote:

    > I do not believe that you can do much as a packager. You cannot assume
    > anything on the target arch. You need to do the same thing as ffmpeg
    > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
    > unless upstream is doing something very clever you cannot compile blas
    > using any of the fancy altivec instructions :(
    >
    > The man page for ld.so mentions something about optimized libraries
    > (search for "/usr/lib/sse2/"), but this is currently not in use in
    > Debian (AFAIK).

    Actually OpenBLAS has its own runtime detection mechanism, which is
    used to select the best linear algebra kernel for the current CPU
    (those kernels are mainly written in assembly, and take advantage of
    available ISA extensions). This mechanism is used on several archs,
    including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
    a POWER9 kernel; there is even a POWER10 kernel already available).

    However, I cannot enable this mechanism on ppc64 and powerpc, because
    the runtime detection only works for POWER6 and above, and my
    understanding is that for these two ports the baseline is lower. Hence
    on these two archs, only one kernel is included in the package binaries
    (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
    performance, users should recompile OpenBLAS locally (as indicated in
    the package description and in README.Debian).
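
    To make the constraint concrete, here is a small Python model of the
    selection logic described above. It is only a sketch, not OpenBLAS code:
    the function select_kernel, the table PPC64EL_KERNELS and the idea of
    identifying a CPU by its POWER generation number are all assumptions made
    for the example. The one fact taken from this thread is that detection
    only works from POWER6 upwards.

```python
# Hypothetical model: a CPU is identified by its POWER generation (4, 5, 6, ...).
MIN_DETECTABLE_GENERATION = 6  # from the thread: detection works for POWER6 and above


def select_kernel(generation, available, fallback):
    """Pick the newest available kernel that is not newer than the CPU.

    `available` maps a generation number to a kernel name. When the CPU is
    unknown, is older than what detection can identify, or has no matching
    kernel, the single `fallback` kernel is returned instead.
    """
    if generation is None or generation < MIN_DETECTABLE_GENERATION:
        return fallback
    usable = [gen for gen in available if gen <= generation]
    if not usable:
        return fallback
    return available[max(usable)]


# Kernels mentioned in the thread for ppc64el.
PPC64EL_KERNELS = {8: "POWER8", 9: "POWER9", 10: "POWER10"}

if __name__ == "__main__":
    print(select_kernel(9, PPC64EL_KERNELS, "POWER8"))  # dispatch path
    print(select_kernel(4, {}, "POWER4"))               # single-kernel path
```

    With an empty table, as in the second call, every CPU ends up on the one
    fallback kernel, which mirrors the current ppc64 and powerpc packages.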

    I am however not sure that my current choices for the ppc64 and powerpc baselines are optimal, hence this thread.

    --
    ⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
    ⣾⠁⢠⠒⠀⣿⡁  Debian Developer
    ⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
    ⠈⠳⣄⠀⠀⠀⠀  https://www.debian.org


  • From John Paul Adrian Glaubitz@21:1/5 to All on Tue Jul 13 21:10:01 2021
    Hi Sébastien!

    On 7/13/21 1:55 PM, Sébastien Villemot wrote:
    > The wiki page that synthesizes architecture specificities indicates
    > that Altivec is included in the baseline for the ppc64 port:
    > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
    >
    > However my understanding is that this port supports any powerpc64 CPU,
    > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
    > also what the main wiki page for PPC64 says:
    > https://wiki.debian.org/PPC64
    >
    > Can someone please clarify the situation?

    The ppc64 port originally used a baseline that included AltiVec, e.g. the
    PowerPC 970 (64-bit PowerMac). However, the previous port maintainer decided
    he wanted to support embedded systems such as the PowerPC e5500, which does
    not support AltiVec.

    I wasn't really a fan of that change, but my stance is that we should use
    AltiVec in packages where it makes sense, as the majority of ppc64 port users
    will have a machine that supports AltiVec.

    If they run into an issue with these packages on non-AltiVec systems, they can
    still file a bug.

    > (I’m asking because I’m the maintainer of the openblas package, and
    > knowing whether Altivec is available or not, and more generally what is
    > in the baseline, is essential for proper packaging).

    Please go ahead and enable AltiVec, as I don't think it makes much sense to
    use BLAS on machines without any SIMD support. If any user complains about
    compatibility issues, please feel free to bring up the issue here again.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer - glaubitz@debian.org
    `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From Sébastien Villemot@21:1/5 to All on Tue Jul 13 20:30:01 2021
    On Tuesday, 13 July 2021 at 20:06 +0200, Mathieu Malaterre wrote:

    > There are plenty of people on this mailing list that could test/verify
    > that. Is there a quick way to check that your openblas package is
    > compiled correctly for ppc32 and ppc64 (like a verbose mode) ? Did you
    > do any experiment on perotto.debian.net ?

    perotto.debian.net is POWER8, so it’s clearly well above the baseline.
    The package runs fine there, but that does not tell us anything about
    baseline violations.

    Verifying that the package compiled fine and passed its testsuite on
    build daemons does not give any information about baseline violations
    either, because buildds are probably above the baseline as well. FYI,
    the most recent build logs are there:
    https://buildd.debian.org/status/package.php?p=openblas&suite=experimental
    (there is a problem with powerpc in experimental; but the version in
    sid compiled).

    If nobody has the relevant knowledge, then the only option is to test
    the package on the oldest possible hardware. The easiest way to test it
    is to recompile it locally (since this will exercise the testsuite).
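
    Short of old hardware, one complementary static check is to look for
    AltiVec instructions in the disassembly of the built library. The Python
    sketch below assumes the tab-separated layout that "objdump -d" prints
    (address, raw bytes, then mnemonic and operands). The mnemonic set is a
    deliberately small sample rather than the full AltiVec list, and the
    function name and the SAMPLE text are made up for the example.

```python
# Small, incomplete sample of AltiVec mnemonics (assumption: a real check
# would need the complete list from the ISA documentation).
ALTIVEC_MNEMONICS = frozenset(
    {"lvx", "stvx", "vaddfp", "vsubfp", "vmaddfp", "vperm", "vand", "vor", "vxor"}
)


def altivec_mnemonics_in(disassembly: str) -> set:
    """Return the AltiVec mnemonics (from the sample set) found in the text."""
    found = set()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # section headers, symbol labels, blank lines
        tokens = fields[2].split()
        if tokens and tokens[0] in ALTIVEC_MNEMONICS:
            found.add(tokens[0])
    return found


# Hand-written illustration of objdump-style output, not a real dump.
SAMPLE = (
    "    1000:\t7c 08 02 a6 \tmflr    r0\n"
    "    1004:\t7c 00 18 ce \tlvx     v0,0,r3\n"
    "    1008:\t10 00 08 0a \tvaddfp  v0,v0,v1\n"
)

if __name__ == "__main__":
    import sys

    hits = altivec_mnemonics_in(sys.stdin.read())
    print("AltiVec instructions found:", ", ".join(sorted(hits)) or "none")
```

    A hit is not automatically a baseline violation, since a library with
    runtime dispatch may contain vector code that is never reached on older
    CPUs. For a single-kernel build, however, an empty result is reassuring.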

    Note that nobody complained for years about the situation of openblas
    on powerpc and ppc64. So maybe that’s a sign that the current setting
    is fine (either the baseline is respected, or nobody uses the package).

    --
    ⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
    ⣾⠁⢠⠒⠀⣿⡁  Debian Developer
    ⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
    ⠈⠳⣄⠀⠀⠀⠀  https://www.debian.org


  • From John Paul Adrian Glaubitz@21:1/5 to Mathieu Malaterre on Tue Jul 13 21:20:01 2021
    On 7/13/21 6:56 PM, Mathieu Malaterre wrote:
    > I do not believe that you can do much as a packager. You cannot assume
    > anything on the target arch. You need to do the same thing as ffmpeg
    > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
    > unless upstream is doing something very clever you cannot compile blas
    > using any of the fancy altivec instructions :(

    For performance-related libraries I'm okay with AltiVec; we're enabling it
    for NSS2 as well and so far no user has reported any issues.

    If that changes in the future, i.e. someone is actually complaining, we can still revisit this issue.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer - glaubitz@debian.org
    `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From William Bonnet@21:1/5 to Mathieu Malaterre on Tue Jul 13 21:20:01 2021
    To: sebastien@debian.org (=?UTF-8?Q?S=c3=a9bastien_Villemot?=)
    Copy: debian-powerpc@lists.debian.org (PowerPC List Debian)


    Hi,


    > I am however not sure that my current choices for the ppc64 and powerpc
    > baselines are optimal, hence this thread.


    If I can help you, I'll be happy to. I have several Powermac G5 machines
    running current Debian, and I can use one to test your packages as long as
    you tell me a little about the software to test (I don't know OpenBLAS).
    On the other hand, I am used to building Debian packages on PPC64 and to
    dealing with CPU support issues. We can keep talking about this live on IRC.
    If needed, we can send the technical conclusions to the list. How about this? :)

    cheers

    W.

    --

    kind regards,
    William https://forum.armwizard.org

    ⢀⣴⠾⠻⢶⣦⠀
    ⣾⠁⢠⠒⠀⣿⡁ wbonnet@(armwizard|firmwaretoolkit|neuralnet-studio).org
    ⢿⡄⠘⠷⠚⠋⠀ GPG fingerprint: 7189 DC8E 15B9 B3E4 EA3E 902B 8EAC F0B9 25A5 9D48
    ⠈⠳⣄



  • From Karoly Balogh@21:1/5 to John Paul Adrian Glaubitz on Tue Jul 13 21:50:02 2021
    Hi,

    On Tue, 13 Jul 2021, John Paul Adrian Glaubitz wrote:

    >> However my understanding is that this port supports any powerpc64 CPU,
    >> including some that don’t have Altivec (e.g. POWER4 or POWER5). This
    >> is also what the main wiki page for PPC64 says:
    >> https://wiki.debian.org/PPC64
    >>
    >> Can someone please clarify the situation?
    >
    > The ppc64 originally used the ppc64 baseline including AltiVec e.g
    > PowerPC970, (64-Bit PowerMac). However, the previous port maintainer
    > decided he wanted to support embedded systems such as the PowerPC E5500
    > which does not support AltiVec.

    Just to note it here, this CPU family is also used by some relatively
    recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
    its "CyrusPlus" motherboard). So it's not just about the support of some obscure embedded dev board.

    Charlie

  • From Jeffrey Walton@21:1/5 to sebastien@debian.org on Tue Jul 13 21:50:02 2021
    On Tue, Jul 13, 2021 at 2:20 PM Sébastien Villemot <sebastien@debian.org> wrote:

    > If nobody has the relevant knowledge, then the only option is to test
    > the package on the oldest possible hardware. The easiest way to test it
    > is to recompile it locally (since this will exercise the testsuite).

    I can provide SSH access to a PowerMac G5 with Altivec. That should
    test the delineation between Altivec and PWR{5-10}.

    If OpenBLAS needs to do 64-bit math, then I have routines cribbed
    away that perform 64-bit addition and subtraction using 32x4 vectors.
    The routines have to handle carry/borrow themselves. My experience
    with Crypto++ and algos like ChaCha20 demonstrates it is profitable.
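
    The carry/borrow handling mentioned above can be sketched in scalar Python.
    This is not the Crypto++ code or the routines offered here, only an
    emulation of the idea under the assumption that each 64-bit value is split
    into two 32-bit lanes; the function names are invented for the example. On
    AltiVec the same steps would run on all vector lanes at once.

```python
MASK32 = 0xFFFFFFFF


def add64_via_32bit_lanes(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using only 32-bit wide additions."""
    a_hi, a_lo = (a >> 32) & MASK32, a & MASK32
    b_hi, b_lo = (b >> 32) & MASK32, b & MASK32
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0          # low lane wrapped around
    hi = (a_hi + b_hi + carry) & MASK32    # result wraps modulo 2**64
    return (hi << 32) | lo


def sub64_via_32bit_lanes(a: int, b: int) -> int:
    """Subtract two unsigned 64-bit values using only 32-bit wide operations."""
    a_hi, a_lo = (a >> 32) & MASK32, a & MASK32
    b_hi, b_lo = (b >> 32) & MASK32, b & MASK32
    lo = (a_lo - b_lo) & MASK32
    borrow = 1 if a_lo < b_lo else 0       # low lane needed to borrow
    hi = (a_hi - b_hi - borrow) & MASK32
    return (hi << 32) | lo


if __name__ == "__main__":
    print(hex(add64_via_32bit_lanes(0xFFFFFFFF, 1)))
    print(hex(sub64_via_32bit_lanes(0x100000000, 1)))
```

    Comparing the wrapped low word against one of its inputs is what detects
    the carry; the borrow test is the mirror image of that comparison.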

    Send over your SSH public key/authorized_keys, if interested.

    Jeff

  • From Christian Zigotzky@21:1/5 to All on Tue Jul 13 22:10:01 2021
    On 13. Jul 2021, at 21:05, John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    > The ppc64 originally used the ppc64 baseline including AltiVec e.g PowerPC970,
    > (64-Bit PowerMac). However, the previous port maintainer decided he wanted to
    > support embedded systems such as the PowerPC E5500 which does not support AltiVec.

    This was the right decision. Some users use Debian PPC64 on their AmigaOne X5000 machines and these machines don’t have AltiVec.

  • From William Bonnet@21:1/5 to Karoly Balogh on Tue Jul 13 23:00:02 2021
    To: glaubitz@physik.fu-berlin.de (John Paul Adrian Glaubitz)
    Copy: sebastien@debian.org (=?UTF-8?Q?S=c3=a9bastien_Villemot?=)
    Copy: debian-powerpc@lists.debian.org


    Hi


    On 13/07/2021 21:35, Karoly Balogh wrote:
    > Just to note it here, this CPU family is also used by some relatively
    > recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
    > its "CyrusPlus" motherboard). So it's not just about the support of some
    > obscure embedded dev board.


    e5500 is also used by NXP in their T1040 QorIQ network appliances. I
    recompiled and ported most Debian 8 packages to these boxes about
    three to four years ago. It went pretty smoothly, even if in some cases I
    had to patch upstream source code because of inline ASM code or
    hardcoded compiler options in Makefiles. Fortunately, it was only a few
    pieces of software that we were not using in our networking use case.

    The initial build was done using 5 dual-CPU Powermac G5 machines. Once it
    had been proven to work and the build chain was bootstrapped, I recompiled a
    few thousand packages using a POWER8 installed with the packages
    built on the Macs.


    So in some cases like mine it is not about embedded boards or Amiga
    clones :) and I am still an Amiga lover and owner 30 years later :P


    Anyway, I'll be happy to share and talk about this any day.


    Cheers

    W.



    --

    kind regards,
    William https://forum.armwizard.org

    ⢀⣴⠾⠻⢶⣦⠀
    ⣾⠁⢠⠒⠀⣿⡁ wbonnet@(armwizard|firmwaretoolkit|neuralnet-studio).org
    ⢿⡄⠘⠷⠚⠋⠀ GPG fingerprint: 7189 DC8E 15B9 B3E4 EA3E 902B 8EAC F0B9 25A5 9D48
    ⠈⠳⣄



  • From Christian Zigotzky@21:1/5 to All on Tue Jul 13 22:30:01 2021
    On 13. Jul 2021, at 21:44, Karoly Balogh <charlie@scenergy.dfmk.hu> wrote:

    > Just to note it here, this CPU family is also used by some relatively
    > recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
    > its "CyrusPlus" motherboard). So it's not just about the support of some
    > obscure embedded dev board.

    Charlie,

    I didn’t see that you had already posted a note about the X5000. Thanks for the hint.

    Cheers,
    Christian

  • From Sébastien Villemot@21:1/5 to All on Thu Jul 15 14:10:01 2021
    On Tuesday, 13 July 2021 at 21:04 +0200, John Paul Adrian Glaubitz wrote:

    > The ppc64 originally used the ppc64 baseline including AltiVec e.g PowerPC970,
    > (64-Bit PowerMac). However, the previous port maintainer decided he wanted to
    > support embedded systems such as the PowerPC E5500 which does not support AltiVec.

    Thanks for this clarification. I have updated the architectures wiki
    page accordingly.

    > Please go ahead and enabled AltiVec as I don't think it makes much sense to use BLAS
    > on machines without any SIMD support. If any user complains about compatibility issues,
    > please feel free to bring up the issue here again.

    I think I disagree with this idea. OpenBLAS can be pulled in by chains
    of dependencies, even for users who do not even know what BLAS is.
    Violating the baseline can lead to hard-to-understand crashes.
    Since I think that reliability is more important than performance, I
    prefer to strictly respect the baseline in the binary package.

    However note that locally recompiling OpenBLAS is a supported and
    documented procedure, for those who want to take full advantage of
    their hardware.

    Regarding the kernel that is currently built in the official binary, I
    could do with some help to determine which one is the best. You can see
    the list of kernels at this address:
    https://salsa.debian.org/science-team/openblas/-/tree/master/kernel/power
    Each KERNEL.* file lists a bunch of source files, many of which are
    assembly files.
    Currently, I use POWER4 for ppc64 and PPCG4 for powerpc, but I’m unsure
    that those are the right choices. I want a kernel that respects the
    baseline, but still takes advantage of all that is in the baseline.

    --
    ⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
    ⣾⠁⢠⠒⠀⣿⡁  Debian Developer
    ⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
    ⠈⠳⣄⠀⠀⠀⠀  https://www.debian.org


  • From John Paul Adrian Glaubitz@21:1/5 to All on Thu Jul 15 14:20:01 2021
    On 7/15/21 2:04 PM, Sébastien Villemot wrote:
    Please go ahead and enable AltiVec as I don't think it makes much sense to use BLAS
    on machines without any SIMD support. If any user complains about compatibility issues,
    please feel free to bring up the issue here again.

    I think I disagree with this idea. OpenBLAS can be pulled in by chains
    of dependencies, even for users who do not even know what BLAS is.
    Violating the baseline can lead to hard-to-understand crashes.

    True, but all build servers we have support Altivec.

    Since I think that reliability is more important than performance, I
    prefer to strictly respect the baseline in the binary package.

    Sure. But we could also just have observed whether any people report crashes.

    However note that locally recompiling OpenBLAS is a supported and
    documented procedure, for those who want to take full advantage of
    their hardware.

    Regarding the kernel that is currently built in the official binary, I
    could do with some help to determine which one is the best. You can see
    the list of kernels at this address: https://salsa.debian.org/science-team/openblas/-/tree/master/kernel/power
    Each KERNEL.* file lists a bunch of source files, many of which are
    assembly files.
    Currently, I use POWER4 for ppc64 and PPCG4 for powerpc, but I’m unsure that those are the right choice. I want a kernel that respects the
    baseline, but still taking advantage of all that is in the baseline.

    I would have to look into that with more detail. We can probably also
    discuss this on the #debian-ports IRC channel.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer - glaubitz@debian.org
    `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Christian Zigotzky@21:1/5 to All on Thu Jul 15 18:00:01 2021
    On 15 July 2021 at 2:04 pm, Sébastien Villemot wrote:
    On Tuesday, 13 July 2021 at 21:04 +0200, John Paul Adrian Glaubitz wrote:

    Please go ahead and enable AltiVec as I don't think it makes much sense to use BLAS
    on machines without any SIMD support. If any user complains about compatibility issues,
    please feel free to bring up the issue here again.
    I think I disagree with this idea. OpenBLAS can be pulled in by chains
    of dependencies, even for users who do not even know what BLAS is.
    Violating the baseline can lead to hard-to-understand crashes.
    Since I think that reliability is more important than performance, I
    prefer to strictly respect the baseline in the binary package.

    Hi Sébastien,

    I disagree too because the performance of software with AltiVec support
    isn't as high as expected. I tested it a lot because we have AltiVec and Non-Altivec machines here. We changed to Non-AltiVec compiled software a
    while ago.

    We and the MintPPC team had some problems with the VLC media player on
    our Non-AltiVec machines a year ago because it is only available in an
    AltiVec version.
    The MintPPC team had to recompile it. [1]
    We also had to recompile the version 3.0.12 of VLC because of the
    AltiVec dependency again. [2]

    Cheers,
    Christian

    A-EON Technology Core Linux support team
    http://www.a-eon.com
    A-EON Technology Ltd
    Asquith House
    Unit 1 Dyfrig Rd
    Cardiff CF5 5AD
    United Kingdom

    [1] https://forum.hyperion-entertainment.com/viewtopic.php?p=51952#p51952
    [2] https://forum.hyperion-entertainment.com/viewtopic.php?p=52326#p52326

  • From Luke Kenneth Casson Leighton@21:1/5 to John Paul Adrian Glaubitz on Thu Jul 15 22:50:02 2021
    On 7/13/21, John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    I wasn't really a fan of that change but my stance is that we should use AltiVec
    in packages where it makes sense as the majority of the ppc64 port users will
    have a machine that supports AltiVec.


    please *do not* do this.

    we are designing a modern Libre/Open CPU which will not punish
    developers or ourselves, stabbing ourselves in the head with 950
    instructions.

    Power ISA is supposed to be RISC.

    the only reason SIMD was made mandatory in EABI v2 was because there
    was nobody making hardware other than IBM to object to what is
    becoming recognised as an extremely serious and costly mistake for the
    future of the OpenPOWER ecosystem.

    if people continue to assume that SIMD is acceptable just because the
    only current hardware is from IBM it only makes it harder and more and
    more costly to unravel the f***up and widens and already yawning
    barrier to entry for new implementations of OpenPOWER.

    PLEASE, really, for god's sake, please do NOT recommend people
    propagate the SIMD paradigm.

    https://www.sigarch.org/simd-instructions-considered-harmful/

    l.

  • From John Paul Adrian Glaubitz@21:1/5 to Christian Zigotzky on Fri Jul 16 09:00:01 2021
    On 7/15/21 5:49 PM, Christian Zigotzky wrote:
    I disagree too because the performance of software with AltiVec support isn't as
    high as expected. I tested it a lot because we have AltiVec and Non-Altivec machines
    here. We changed to Non-AltiVec compiled software a while ago.

    It depends on the workload, of course. Anything that does SIMD like matrix multiplications
    in multimedia or scientific computing will, of course, profit from enabling AltiVec.

    No one claimed that AltiVec, MMX or SSE will just improve everything.

    We and the MintPPC team had some problems with the VLC media player on our Non-AltiVec
    machines a year ago because it is only available in an AltiVec version.
    The MintPPC team had to recompile it. [1]
    We also had to recompile the version 3.0.12 of VLC because of the AltiVec dependency again. [2]

    I would argue that there are far more users with PowerMacs which all support AltiVecs than
    with obscure embedded machines. As I said, some packages like nss2 already enabled AltiVec
    by default.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer - glaubitz@debian.org
    `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From John Paul Adrian Glaubitz@21:1/5 to Jeffrey Walton on Fri Jul 16 13:10:02 2021
    On 7/16/21 12:59 PM, Jeffrey Walton wrote:
    I would argue that there are far more users with PowerMacs which all support AltiVecs than
    with obscure embedded machines. As I said, some packages like nss2 already enabled AltiVec
    by default.

    Does it have to be one or the other? Can't you have both?

    Well, you could have runtime detection like certain multimedia codes and OpenSSL use but
    most packages don't do that.
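
The runtime detection mentioned above can be sketched in C roughly as follows. This is only a sketch under stated assumptions: it relies on Linux's `getauxval(AT_HWCAP)` and the `PPC_FEATURE_HAS_ALTIVEC` bit, and the names `cpu_has_altivec`, `dot_scalar`, `dot_altivec` and `select_dot` are hypothetical, not taken from OpenSSL, VLC or OpenBLAS. The AltiVec kernel here is a stand-in; a real one would presumably live in a separate file built with `-maltivec`, so that the rest of the package stays within the baseline.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__linux__) && defined(__powerpc__)
#include <sys/auxv.h>                       /* getauxval(), AT_HWCAP */
#ifndef PPC_FEATURE_HAS_ALTIVEC
#define PPC_FEATURE_HAS_ALTIVEC 0x10000000  /* value used by the Linux kernel headers */
#endif
#endif

/* Returns 1 if the kernel reports AltiVec support, 0 otherwise.
 * On anything that is not Linux/PowerPC this sketch always returns 0. */
int cpu_has_altivec(void)
{
#if defined(__linux__) && defined(__powerpc__)
    return (getauxval(AT_HWCAP) & PPC_FEATURE_HAS_ALTIVEC) != 0;
#else
    return 0;
#endif
}

/* Baseline kernel: plain C, safe on every CPU the port supports. */
float dot_scalar(const float *a, const float *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical stand-in for a hand-written AltiVec kernel. It simply
 * forwards to the scalar code so that the sketch builds everywhere. */
float dot_altivec(const float *a, const float *b, size_t n)
{
    return dot_scalar(a, b, n);
}

typedef float (*dot_fn)(const float *, const float *, size_t);

/* Pick the kernel once, at run time, based on what the CPU offers. */
dot_fn select_dot(void)
{
    return cpu_has_altivec() ? dot_altivec : dot_scalar;
}
```

A caller would do `dot_fn dot = select_dot();` once at start-up and then call `dot(a, b, n)`. On a machine without AltiVec the vector code is never executed, so no illegal-instruction crash can occur.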

    Either way, if certain downstreams want better support for certain targets, they are always
    welcome to jump in and send patches to me or upstream. This port is a community effort,
    after all.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer - glaubitz@debian.org
    `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From Jeffrey Walton@21:1/5 to glaubitz@physik.fu-berlin.de on Fri Jul 16 13:10:02 2021
    On Fri, Jul 16, 2021 at 2:59 AM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    On 7/15/21 5:49 PM, Christian Zigotzky wrote:
    I disagree too because the performance of software with AltiVec support isn't as
    high as expected. I tested it a lot because we have AltiVec and Non-Altivec machines
    here. We changed to Non-AltiVec compiled software a while ago.

    It depends on the workload, of course. Anything that does SIMD like matrix multiplications
    in multimedia or scientific computing will, of course, profit from enabling AltiVec.

    No one claimed that AltiVec, MMX or SSE will just improve everything.

    +1

    I've found a few algorithms that were profitable on ARM and x86, but
    not profitable with Altivec. The LEA algorithm comes to mind: https://github.com/weidai11/cryptopp/blob/master/lea_simd.cpp#L46.
    Altivec runs about 5x slower than C++.

    We and the MintPPC team had some problems with the VLC media player on our Non-AltiVec
    machines a year ago because it is only available in an AltiVec version.
    The MintPPC team had to recompile it. [1]
    We also had to recompile the version 3.0.12 of VLC because of the AltiVec dependency again. [2]

    I would argue that there are far more users with PowerMacs which all support AltiVecs than
    with obscure embedded machines. As I said, some packages like nss2 already enabled AltiVec
    by default.

    Does it have to be one or the other? Can't you have both?

    Jeff

  • From Lennart Sorensen@21:1/5 to John Paul Adrian Glaubitz on Fri Jul 16 16:00:02 2021
    On Fri, Jul 16, 2021 at 01:08:47PM +0200, John Paul Adrian Glaubitz wrote:
    Well, you could have runtime detection like certain multimedia codes and OpenSSL use but
    most packages don't do that.

    Either way, if certain downstreams want better support for certain targets, they are always
    welcome to jump in and send patches to me or upstream. This port is a community effort,
    after all.

    Some years ago I tried to get such runtime detection added for vlc.
    Upstream was certainly far from helpful and just about hostile to the
    idea of adding runtime detection code on powerpc. Given I was just
    trying to help out the people hitting a problem with it (I don't have
    any desktop powerpc, I only deal with embedded), I decided it wasn't
    worth my effort to argue with them.

    --
    Len Sorensen

  • From Luke Kenneth Casson Leighton@21:1/5 to lsorense@csclub.uwaterloo.ca on Fri Jul 16 22:20:01 2021
    ---
    crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

    On Fri, Jul 16, 2021 at 2:55 PM Lennart Sorensen
    <lsorense@csclub.uwaterloo.ca> wrote:

    Some years ago I tried to get such runtime detection added for vlc.
    Upstream was certainly far from helpful and just about hostile to the
    idea of adding runtime detection code on powerpc.

    because "why would you support 10+ year old processors", i do get it,
    but it is very frustrating.

    Given I was just
    trying to help out the people hitting a problem with it (I don't have
    any desktop powerpc, I only deal with embedded), I decided it wasn't
    worth my effort to argue with them.

    Libre-SOC's SVP64 ISA needs a stunning 4 to 5 times fewer instructions for the MP3 CODEC inner loop. i'm also adding hardware-level support for in-place
    FFT butterfly, such that *any* sized FFT should be possible to do in around
    45 instructions.

    when hardware's available this _should_ shock ffmpeg and other
    developers out of rejecting upstream
  • From Luke Kenneth Casson Leighton@21:1/5 to glaubitz@physik.fu-berlin.de on Sat Jul 17 11:20:01 2021
    ---
    crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

    On Fri, Jul 16, 2021 at 12:09 PM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    On 7/16/21 12:59 PM, Jeffrey Walton wrote:
    Does it have to be one or the other? Can't you have both?

    Well, you could have runtime detection like certain multimedia codes and OpenSSL use but
    most packages don't do that.

    this is the "sane" way to do it. unfortunately, the EABIv2, which
    *explicitly* states, "SIMD is mandatory" is resulting in an inexorable
    creep of submissions from IBM developers (to libc6 and other
    libraries)

    with Quad and 8 1.5+ ghz on the roadmap over the next 3 years,
    Libre-SOC's processor is *not* intended for just "embedded" uses.
    we've simply made it abundantly clear that Hell Will Freeze Over
    before we add a suicidal *700* SIMD instructions to what is supposed
    to be a RISC design.

    l.

  • From Jeffrey Walton@21:1/5 to All on Sun Jul 18 05:20:02 2021
    On Fri, Jul 16, 2021 at 12:09 PM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    On 7/16/21 12:59 PM, Jeffrey Walton wrote:
    Does it have to be one or the other? Can't you have both?

    Well, you could have runtime detection like certain multimedia codes and OpenSSL use but
    most packages don't do that.

    this is the "sane" way to do it. unfortunately, the EABIv2, which *explicitly* states, "SIMD is mandatory" is resulting in an inexorable
    creep of submissions from IBM developers (to libc6 and other
    libraries)

    I never really took notice that IBM captured the projects. But with
    your lens it sounds about right.

    ("Captured", as in regulatory capture as seen in the US. Regulatory
    capture is where private industry gets so cozy with government and
    regulators that industry writes their own rules and gov is just an
    extension of a few dominant players).

    with Quad and 8 1.5+ ghz on the roadmap over the next 3 years,
    Libre-SOC's processor is *not* intended for just "embedded" uses.
    we've simply made it abundantly clear that Hell Will Freeze Over
    before we add a suicidal *700* SIMD instructions to what is supposed
    to be a RISC design.

    Forgive my ignorance... Once traces of companies like NXP and IBM are
    removed, and Altivec is removed, what is left? Are some pieces of the
    ISA going to be (re)used? Or is the ISA being built from scratch? Is
    it even PowerPC anymore? Is it just a RISC machine?

    (PowerPC scatter/gather is quite lame, even in the latest ISA 3.0B, so
    I'm not sure there's much good stuff to reuse).

    Anyway, I'm excited to see a SIMD alternative that could become
    mainstream. I really look forward to the new hardware making it to
    market.

    Jeff

  • From Luke Kenneth Casson Leighton@21:1/5 to noloader@gmail.com on Sun Jul 18 17:20:01 2021
    ---
    crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

    On Sun, Jul 18, 2021 at 4:17 AM Jeffrey Walton <noloader@gmail.com> wrote:

    I never really took notice that IBM captured the projects. But with
    your lens it sounds about right.

    ("Captured", as in regulatory capture as seen in the US. Regulatory
    capture is where private industry gets so cozy with government and
    regulators that industry writes their own rules and gov is just an
    extension of a few dominant players).

    please do forgive my frustration at this particular strategic mistake:
    i don't believe IBM intended for this to happen, it's more that there
    simply wasn't anyone else around at the time to say "if you do
    make this innocent-looking decision because it improves performance
    for the *currently* only-existing hardware for the past 10 years - yours -
    it's going to have consequences".


    Forgive my ignorance... Once traces of companies like NXP and IBM are removed, and Altivec is removed, wha
  • From Riccardo Mottola@21:1/5 to John Paul Adrian Glaubitz on Sun Jul 18 20:00:01 2021
    Hi,

    John Paul Adrian Glaubitz wrote:
    It depends on the workload, of course. Anything that does SIMD like matrix multiplications
    in multimedia or scientific computing will, of course, profit from enabling AltiVec.

    No one claimed that AltiVec, MMX or SSE will just improve everything.

    Exactly, let me clarify something from direct experience. I don't know if
    things have improved recently, but some years ago GCC essentially did not do
    real auto-vectorization. Enabling AltiVec by itself essentially does nothing: it
    allows you to write code that uses it, but it will not magically make code
    faster.
    For specific libraries and applications which do support AltiVec (e.g.
    many video-decoding or image-processing ones) the improvements may be great,
    especially on "older" processors like the G4 or G5.
    On Intel, by contrast, code may be faster with MMX or SSEx because additional
    resources and registers are opened up (on 32-bit).

    If AltiVec can now also benefit from auto-vectorization I'm interested to
    know. I have written imaging code myself and never noticed any improvement
    from AltiVec just by turning it on, but I did not retest with a recent GCC.
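
To illustrate the difference between merely enabling AltiVec and writing code for it, here is a minimal C sketch. It assumes GCC's `<altivec.h>` intrinsics (`vec_ld`, `vec_add`, `vec_st`) and the `__ALTIVEC__` macro that the compiler defines under `-maltivec`; the function name `add_floats` is hypothetical. Without `-maltivec`, or on a non-PowerPC machine, only the scalar loop is compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* dst[i] = a[i] + b[i] for i in [0, n). */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    size_t i = 0;

#ifdef __ALTIVEC__
    /* vec_ld and vec_st ignore the low four address bits, so the vector
     * path is only taken when all three pointers are 16-byte aligned. */
    if ((((uintptr_t)dst | (uintptr_t)a | (uintptr_t)b) & 15) == 0) {
        for (; i + 4 <= n; i += 4) {
            __vector float va = vec_ld(0, a + i);
            __vector float vb = vec_ld(0, b + i);
            vec_st(vec_add(va, vb), 0, dst + i);
        }
    }
#endif

    /* Scalar loop: handles the tail, unaligned input, and builds without AltiVec. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Built without `-maltivec` this is ordinary scalar code. Whether the compiler would vectorize the scalar loop on its own depends on the GCC version and optimization flags, which is exactly the open question raised above.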

    I would argue that there are far more users with PowerMacs which all support AltiVecs than
    with obscure embedded machines. As I said, some packages like nss2 already enabled AltiVec
    by default.

    As far as PowerMacs go you are completely right, but PPC is used more and
    more on "other" systems too: boards built with the available NXP
    processors, several of which are based on embedded CPUs, e.g. e5500 cores and variations.
    Our upcoming PPC laptop will have AltiVec, but others will not.

    Riccardo
