Hi,
The wiki page that synthesizes architecture specificities indicates
that Altivec is included in the baseline for the ppc64 port: https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
However my understanding is that this port supports any powerpc64 CPU, including some that don’t have Altivec (e.g. POWER4 or POWER5). This is also what the main wiki page for PPC64 says:
https://wiki.debian.org/PPC64
Can someone please clarify the situation?
(I’m asking because I’m the maintainer of the openblas package, and knowing whether Altivec is available or not, and more generally what is
in the baseline, is essential for proper packaging).
Hi Mathieu,
On Tuesday, 13 July 2021 at 18:56 +0200, Mathieu Malaterre wrote:
On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot <sebastien@debian.org> wrote:
The wiki page that synthesizes architecture specificities indicates
that Altivec is included in the baseline for the ppc64 port: https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
However my understanding is that this port supports any powerpc64 CPU, including some that don’t have Altivec (e.g. POWER4 or POWER5). This is also what the main wiki page for PPC64 says: https://wiki.debian.org/PPC64
Can someone please clarify the situation?
(I’m asking because I’m the maintainer of the openblas package, and knowing whether Altivec is available or not, and more generally what is in the baseline, is essential for proper packaging).
I do not believe that you can do much as a packager. You cannot assume anything about the target arch. You need to do the same thing ffmpeg
does for avx2/sse4 on amd64: runtime detection. So unless upstream is doing something very clever, you cannot compile blas using any of the fancy altivec instructions :(
The man page for ld.so mentions something about optimized libraries
(search for "/usr/lib/sse2/"), but this is currently not in use in
Debian (AFAIK).
Actually OpenBLAS has its own runtime detection mechanism, which is
used to select the best linear algebra kernel for the current CPU
(those kernels are mainly written in assembly, and take advantage of available ISA extensions). This mechanism is used on several archs,
including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
a POWER9 kernel; there is even a POWER10 kernel already available).
However, I cannot enable this mechanism on ppc64 and powerpc, because
the runtime detection only works for POWER6 and above, and my
understanding is that for these two ports the baseline is lower. Hence
on these two archs, only one kernel is included in the package binaries (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
performance, users should recompile OpenBLAS locally (as indicated in
the package description and in README.Debian).
I am however not sure that my current choices for the ppc64 and powerpc baselines are optimal, hence this thread.
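To illustrate the dispatch idea described above, here is a minimal Python sketch. It is not OpenBLAS's actual code (which is C and assembly); the table `KERNELS_BY_PREFERENCE` and the function `select_kernel` are hypothetical names chosen for this example. It assumes detection either returns a known CPU name or nothing at all, in which case the single baseline kernel is used.

```python
# Hypothetical preference table, most capable first; not OpenBLAS's real list.
KERNELS_BY_PREFERENCE = ["POWER10", "POWER9", "POWER8"]


def select_kernel(detected_cpu, built_kernels, baseline_kernel):
    """Return the best built kernel for detected_cpu, else the baseline kernel.

    detected_cpu is None when runtime detection is unsupported, for example
    on a CPU older than anything the detection code recognises.
    """
    if detected_cpu is None or detected_cpu not in KERNELS_BY_PREFERENCE:
        return baseline_kernel
    # Only consider kernels that are no newer than the detected CPU.
    start = KERNELS_BY_PREFERENCE.index(detected_cpu)
    for name in KERNELS_BY_PREFERENCE[start:]:
        if name in built_kernels:
            return name
    return baseline_kernel
```

The fallback branch is the point of the example: when detection cannot be trusted, the only safe choice is a kernel that respects the port's baseline.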
--
⢀⣴⠾⠻⢶⣦⠀ Sébastien Villemot
⣾⠁⢠⠒⠀⣿⡁ Debian Developer
⢿⡄⠘⠷⠚⠋⠀ https://sebastien.villemot.name
⠈⠳⣄⠀⠀⠀⠀ https://www.debian.org
On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot <sebastien@debian.org> wrote:
However, I cannot enable this mechanism on ppc64 and powerpc, because
the runtime detection only works for POWER6 and above, and my
understanding is that for these two ports the baseline is lower. Hence
on these two archs, only one kernel is included in the package binaries (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal performance, users should recompile OpenBLAS locally (as indicated in
the package description and in README.Debian).
There are plenty of people on this mailing list who could test/verify
that. Is there a quick way to check that your openblas package is
compiled correctly for ppc32 and ppc64 (like a verbose mode)? Did you
run any experiments on perotto.debian.net?
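One rough static check, offered only as a sketch, is to scan a disassembly of the library for AltiVec mnemonics. The Python below assumes input shaped like `objdump -d` output (address, raw bytes and instruction text separated by tabs). The mnemonic set is a small, non-exhaustive sample, and `find_altivec` is a hypothetical helper name.

```python
# A small, non-exhaustive sample of AltiVec mnemonics; extend as needed.
ALTIVEC_MNEMONICS = {"lvx", "stvx", "vaddfp", "vmaddfp", "vperm", "vspltw"}


def find_altivec(disassembly_lines):
    """Yield (line_number, mnemonic) for each listed AltiVec instruction.

    Expects lines shaped like `objdump -d` output, where the address, the
    raw bytes and the instruction text are separated by tab characters.
    """
    for number, line in enumerate(disassembly_lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[2].strip():
            continue  # section headers, labels, blank lines
        mnemonic = fields[2].split()[0]
        if mnemonic in ALTIVEC_MNEMONICS:
            yield number, mnemonic
```

A hit is not proof of a baseline violation, since the instruction may sit behind a runtime check, and an empty result does not prove compliance either. It only narrows down where to look before testing on real hardware.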
For performance-related libraries I'm okay with AltiVec; we're enabling it for NSS2 as well and so far no user has reported any issues.
The ppc64 port originally used a baseline that included AltiVec, e.g. the PowerPC 970 (64-bit PowerMac). However, the previous port maintainer
decided he wanted to support embedded systems such as the PowerPC E5500,
which does not support AltiVec.
On Tuesday, 13 July 2021 at 20:06 +0200, Mathieu Malaterre wrote:
There are plenty of people on this mailing list who could test/verify that. Is there a quick way to check that your openblas package is
compiled correctly for ppc32 and ppc64 (like a verbose mode)? Did you
run any experiments on perotto.debian.net?
perotto.debian.net is POWER8, so it’s clearly well above the baseline.
The package runs fine there, but that tells us nothing about
baseline violations.
Verifying that the package compiled fine and passed its testsuite on
build daemons does not give any information about baseline violations
either, because buildds are probably above the baseline as well. FYI,
the most recent build logs are there:
https://buildd.debian.org/status/package.php?p=openblas&suite=experimental
(there is a problem with powerpc in experimental, but the version in
sid compiled).
If nobody has the relevant knowledge, then the only option is to test
the package on the oldest possible hardware. The easiest way to test it
is to recompile it locally (since this will exercise the testsuite).
On 13. Jul 2021, at 21:05, John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
Hi Sébastien!
On 7/13/21 1:55 PM, Sébastien Villemot wrote:
The wiki page that synthesizes architecture specificities indicates
that Altivec is included in the baseline for the ppc64 port:
https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
However my understanding is that this port supports any powerpc64 CPU,
including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
also what the main wiki page for PPC64 says:
https://wiki.debian.org/PPC64
Can someone please clarify the situation?
The ppc64 port originally used a baseline that included AltiVec, e.g. the PowerPC 970
(64-bit PowerMac). However, the previous port maintainer decided he wanted to support embedded systems such as the PowerPC E5500, which does not support AltiVec.
On 13. Jul 2021, at 21:44, Karoly Balogh <charlie@scenergy.dfmk.hu> wrote:
Hi,
Just to note it here, this CPU family is also used by some relatively
recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
its "CyrusPlus" motherboard). So it's not just about the support of some obscure embedded dev board.
Charlie
On 7/13/21 1:55 PM, Sébastien Villemot wrote:
(I’m asking because I’m the maintainer of the openblas package, and knowing whether Altivec is available or not, and more generally what is
in the baseline, is essential for proper packaging).
Please go ahead and enable AltiVec as I don't think it makes much sense to use BLAS
on machines without any SIMD support. If any user complains about compatibility issues,
please feel free to bring up the issue here again.
I think I disagree with this idea. OpenBLAS can be pulled in by chains
of dependencies, even for users who do not even know what BLAS is.
Violating the baseline can lead to hard-to-understand crashes.
Since I think that reliability is more important than performance, I
prefer to strictly respect the baseline in the binary package.
However note that locally recompiling OpenBLAS is a supported and
documented procedure, for those who want to take full advantage of
their hardware.
Regarding the kernel that is currently built in the official binary, I
could do with some help to determine which one is the best. You can see
the list of kernels at this address:
https://salsa.debian.org/science-team/openblas/-/tree/master/kernel/power
Each KERNEL.* file lists a bunch of source files, many of which are
assembly files.
Currently, I use POWER4 for ppc64 and PPCG4 for powerpc, but I’m unsure that those are the right choices. I want a kernel that respects the
baseline while still taking advantage of everything that is in the baseline.
On Tuesday, 13 July 2021 at 21:04 +0200, John Paul Adrian Glaubitz wrote:
Please go ahead and enable AltiVec as I don't think it makes much sense to use BLAS
on machines without any SIMD support. If any user complains about compatibility issues,
please feel free to bring up the issue here again.
I think I disagree with this idea. OpenBLAS can be pulled in by chains
of dependencies, even for users who do not even know what BLAS is.
Violating the baseline can lead to hard-to-understand crashes.
Since I think that reliability is more important than performance, I
prefer to strictly respect the baseline in the binary package.
I wasn't really a fan of that change, but my stance is that we should use AltiVec
in packages where it makes sense, as the majority of the ppc64 port users
will have a machine that supports AltiVec.
I disagree too because the performance of software with AltiVec support isn't as
high as expected. I tested it a lot because we have AltiVec and Non-Altivec machines
here. We changed to Non-AltiVec compiled software a while ago.
We and the MintPPC team had some problems with the VLC media player on our Non-AltiVec
machines a year ago because it is only available in an AltiVec version.
The MintPPC team had to recompile it. [1]
We also had to recompile the version 3.0.12 of VLC because of the AltiVec dependency again. [2]
Does it have to be one or the other? Can't you have both?
On 7/15/21 5:49 PM, Christian Zigotzky wrote:
I disagree too because the performance of software with AltiVec support isn't as
high as expected. I tested it a lot because we have AltiVec and Non-Altivec machines
here. We changed to Non-AltiVec compiled software a while ago.
It depends on the workload, of course. Anything that does SIMD like matrix multiplications
in multimedia or scientific computing will, of course, profit from enabling AltiVec.
No one claimed that AltiVec, MMX or SSE will just improve everything.
We and the MintPPC team had some problems with the VLC media player on our Non-AltiVec
machines a year ago because it is only available in an AltiVec version.
The MintPPC team had to recompile it. [1]
We also had to recompile the version 3.0.12 of VLC because of the AltiVec dependency again. [2]
I would argue that there are far more users with PowerMacs, which all support AltiVec, than
with obscure embedded machines. As I said, some packages like nss2 already enable AltiVec
by default.
Well, you could have runtime detection like certain multimedia codes and OpenSSL use but
most packages don't do that.
Either way, if certain downstreams want better support for certain targets, they are always
welcome to jump in and send patches to me or upstream. This port is a community effort,
after all.
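As a rough illustration of what runtime detection can look like on Linux, the Python sketch below decodes the auxiliary vector and tests the AltiVec bit of `AT_HWCAP`. The constants are the Linux/PowerPC values (`AT_HWCAP` is 16, `PPC_FEATURE_HAS_ALTIVEC` is 0x10000000). The helper names `parse_auxv` and `has_altivec` are made up for this example, and nothing here is taken from OpenSSL or any multimedia library.

```python
import struct

AT_HWCAP = 16                          # auxiliary-vector key for hardware capabilities
PPC_FEATURE_HAS_ALTIVEC = 0x10000000   # this bit has that meaning on Linux/PowerPC only


def parse_auxv(data, entry_format="@LL"):
    """Decode raw auxiliary-vector bytes into a {key: value} dict.

    The default format reads pairs of native unsigned longs, which matches
    the layout of /proc/self/auxv for the running process.
    """
    entries = {}
    for key, value in struct.iter_unpack(entry_format, data):
        if key == 0:  # AT_NULL terminates the vector
            break
        entries[key] = value
    return entries


def has_altivec(auxv):
    """True if AT_HWCAP advertises AltiVec (only meaningful on PowerPC)."""
    return bool(auxv.get(AT_HWCAP, 0) & PPC_FEATURE_HAS_ALTIVEC)
```

On a PowerPC Linux machine, passing the bytes read from `/proc/self/auxv` to `parse_auxv` and the result to `has_altivec` should say whether the vector path is safe to take. On other architectures the same bit means something else, so the answer is meaningless there.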
Some years ago I tried to get such runtime detection added for vlc.
Upstream was certainly far from helpful and just about hostile to the
idea of adding runtime detection code on powerpc.
Given I was just
trying to help out the people hitting a problem with it (I don't have
any desktop powerpc, I only deal with embedded), I decided it wasn't
worth my effort to argue with them.
On Fri, Jul 16, 2021 at 12:09 PM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
On 7/16/21 12:59 PM, Jeffrey Walton wrote:
Does it have to be one or the other? Can't you have both?
Well, you could have runtime detection like certain multimedia codes and OpenSSL use but
most packages don't do that.
this is the "sane" way to do it. unfortunately, the EABIv2, which *explicitly* states, "SIMD is mandatory" is resulting in an inexorable
creep of submissions from IBM developers (to libc6 and other
libraries)
with Quad and 8 1.5+ ghz on the roadmap over the next 3 years,
Libre-SOC's processor is *not* intended for just "embedded" uses.
we've simply made it abundantly clear that Hell Will Freeze Over
before we add a suicidal *700* SIMD instructions to what is supposed
to be a RISC design.
I never really took notice that IBM captured the projects. But with
your lens it sounds about right.
("Captured", as in regulatory capture as seen in the US. Regulatory
capture is where private industry gets so cozy with government and
regulators that industry writes their own rules and gov is just an
extension of a few dominant players).
Forgive my ignorance... Once traces of companies like NXP and IBM are removed, and Altivec is removed, wha