[1] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.34-8&stamp=1662963628&raw=0
[2] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.35-4&stamp=1666729919&raw=0
[3] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=alpha&ver=2.36-4&stamp=1667607306&raw=0
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=29575
I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
there is something wrong given the many "Segmentation Fault" errors.
I had hoped I could fix this issue by passing "--disable-default-pie" like we already did
on sparc64, but it seems it's not the same bug [4]. At least, this particular workaround
does not help.
On 20 Nov 2022, at 12:48, Frank Scheiner <frank.scheiner@web.de> wrote:
On 20.11.22 10:03, Michael Cree wrote:
On Sun, Nov 13, 2022 at 12:45:17AM +0100, John Paul Adrian Glaubitz wrote:
I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
there is something wrong given the many "Segmentation Fault" errors.
I had hoped I could fix this issue by passing "--disable-default-pie" like we already did
on sparc64, but it seems it's not the same bug [4]. At least, this particular workaround
does not help.
Interestingly the vast number of the failing tests pass if one builds
with a compiler that raises the baseline to EV67. This has been
proposed a number of times in the past for the Debian distribution.
I think it is time we did it. One of our last EV56 users has recently
bowed out due to hardware failure and I am only running EV67 hardware.
I still have the following pre EV67 machines available and in working order:
* AXPpci 33 (LCA4)
* AlphaStation 200 (EV4) / 255 (EV45) / 500 (EV56)
* PWS 500au (EV56)
* AlphaServer 800 (EV56)
...and can provide testing on them. All of them eventually ran Debian GNU/Linux Sid
with up to Linux 5.x.x IIRC and I will also try them with 6.0.x. And I believe the
majority of still existing, still working Alpha systems are pre EV67 systems.
Given the fact that EV6[...] and EV7[...] based systems are nowadays
very expensive for hobby use (I don't want to say unobtainium), I expect
that dropping support for pre EV67 will kill off most of the user base
for Debian on Alpha (and also Gentoo I assume).
Phrasing it differently:
Who needs a port that only runs on the buildds and a handful of
(hobbyist) machines around the world (like ppc64le ;-))?
My two cents.
All the best,
Frank
On Dec 12, 2022, at 8:57 AM, Frank Scheiner <frank.scheiner@web.de> wrote:
I'm not sure I fully understand the issue here:
See, glibc used to work for alpha up until 2.33 as I read. Then a change broke it for alpha with 2.34. Does the respective glibc maintainer for
alpha (Richard Henderson according to [1]) really have no interest in
fixing it?
On Sun, Nov 20, 2022 at 01:47:59PM +0100, Frank Scheiner wrote:
On 20.11.22 10:03, Michael Cree wrote:
On Sun, Nov 13, 2022 at 12:45:17AM +0100, John Paul Adrian Glaubitz wrote:
I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
Interestingly the vast number of the failing tests pass if one builds
with a compiler that raises the baseline to EV67. This has been
proposed a number of times in the past for the Debian distribution.
I think it is time we did it. One of our last EV56 users has recently
bowed out due to hardware failure and I am only running EV67 hardware.
I still have the following pre EV67 machines available and in working order:
* AXPpci 33 (LCA4)
* AlphaStation 200 (EV4) / 255 (EV45) / 500 (EV56)
* PWS 500au (EV56)
* AlphaServer 800 (EV56)
...and can provide testing on them. All of them eventually ran Debian
Can you fix the ev4 based bugs in glibc? If not, I am not interested.
With the usrmerge uploads now depending on a recent libc version Alpha
is now dead in the water. Nothing can be built. Thus we have to fix
glibc to continue building.
I am not prepared to fix ev4 issues so if no one else is prepared to
fix them then without an architecture baseline raise this is the end
of Alpha on Debian Ports.
Hi Frank!
On Dec 12, 2022, at 8:57 AM, Frank Scheiner <frank.scheiner@web.de> wrote:
I'm not sure I fully understand the issue here:
See, glibc used to work for alpha up until 2.33 as I read. Then a change broke it for alpha with 2.34. Does the respective glibc maintainer for alpha (Richard Henderson according to [1]) really have no interest in fixing it?
Any chance you can bisect the issue?
FWIW, it’s not been reported upstream yet.
Dear Michael,
On 12.12.22 08:27, Michael Cree wrote:
With the usrmerge uploads now depending on a recent libc version Alpha
is now dead in the water. Nothing can be built. Thus we have to fix
glibc to continue building.
I am not prepared to fix ev4 issues so if no one else is prepared to
fix them then without an architecture baseline raise this is the end
of Alpha on Debian Ports.
I'm not sure I fully understand the issue here:
See, glibc used to work for alpha up until 2.33 as I read. Then a change broke it for alpha with 2.34. Does the respective glibc maintainer for
alpha (Richard Henderson according to [1]) really have no interest in
fixing it?
On Mon, Dec 12, 2022 at 08:56:40AM +0100, Frank Scheiner wrote:
Dear Michael,
On 12.12.22 08:27, Michael Cree wrote:
With the usrmerge uploads now depending on a recent libc version Alpha
is now dead in the water. Nothing can be built. Thus we have to fix
glibc to continue building.
I am not prepared to fix ev4 issues so if no one else is prepared to
fix them then without an architecture baseline raise this is the end
of Alpha on Debian Ports.
I'm not sure I fully understand the issue here:
See, glibc used to work for alpha up until 2.33 as I read. Then a change
broke it for alpha with 2.34. Does the respective glibc maintainer for
alpha (Richard Henderson according to [1]) really have no interest in
fixing it?
RTH hasn't had working Alpha hardware for quite some time.
One of the glibc maintainers did have access to one of my Alphas
until last year but unfortunately the hosting site is no longer
prepared to host it so I can no longer make that Alpha available
to developers.
So with that glibc Alpha support is rotting fast.
Many of the other ports (e.g. armel, armhf, i386) have had
architecture baseline increases in the last few years, and none
support hardware anywhere near as old as alpha ev4.
I am no longer personally prepared to support Alpha unless
the architecture baseline increase is done. I have no
ev4/ev45 hardware and no longer have any interest in supporting
them.
On Dec 12, 2022, at 9:27 AM, Michael Cree <mcree@orcon.net.nz> wrote:
I am not interested in supporting old Alphas without BWX anymore.
I am drawing the line. Either someone steps up to support non-BWX
Alpha and promptly fixes glibc or the architecture baseline is
increased to include BWX (thereby fixing most of the glibc issues).
Without either of those happening I give up being an Alpha porter
and switch off my Alpha buildd permanently. I have many other
interesting projects I could be working on!
On Dec 12, 2022, at 7:17 PM, Michael Cree <mcree@orcon.net.nz> wrote:
On Mon, Dec 12, 2022 at 12:24:59PM +0100, John Paul Adrian Glaubitz wrote:
On Dec 12, 2022, at 9:27 AM, Michael Cree <mcree@orcon.net.nz> wrote:
I am not interested in supporting old Alphas without BWX anymore.
I am drawing the line. Either someone steps up to support non-BWX
Alpha and promptly fixes glibc or the architecture baseline is
increased to include BWX (thereby fixing most of the glibc issues).
Without either of those happening I give up being an Alpha porter
and switch off my Alpha buildd permanently. I have many other
interesting projects I could be working on!
As a compromise, how about we fix the bug, create a final set of CD
images for old Alphas, then raise the baseline after having verified
it does not break QEMU (both -user and -system)?
You fix the bug then. I'm not interested so there is no "we" in this.
Please don’t be so negative.
We should be able to have a discussion on this topic without such sentiments.
There are valid arguments for both sides, so it’s not helpful to lead a discussion like this.
Either the arch baseline is raised to something that is easier to
maintain (which, frankly, I think is essential if the Alpha port is to survive any longer), someone else steps up to fix the brokenness that
arises from non-atomic multi-cpu-instruction 8-bit and 16-bit memory accesses, or I bail out of maintaining Debian-Ports Alpha.
Hello!
On 12/12/22 20:45, Michael Cree wrote:
Either the arch baseline is raised to something that is easier to
maintain (which, frankly, I think is essential if the Alpha port is to survive any longer), someone else steps up to fix the brokenness that arises from non-atomic multi-cpu-instruction 8-bit and 16-bit memory accesses, or I bail out of maintaining Debian-Ports Alpha.
So what baseline do we want? Would EV56 be sufficient? Because that would still work with my AlphaStation 433au and XP1000 and gets us BWX.
I don't want to use something like EV67 as I think that would limit the usable hardware too much.
I guess I can live with dropping EV4 since NetBSD
and Gentoo would still run on these.
I am still interested in fixing the glibc bug and will work on bisecting it.
If EV56 is the baseline we can agree on, please go ahead and rebuild glibc and gcc using this baseline.
So what baseline do we want? Would EV56 be sufficient? Because that would
still work with my AlphaStation 433au and XP1000 and gets us BWX.
Yes. The first extension added is the byte-word extension (BWX), which came
in with EV56. That provides CPU instructions for byte and word (16-bit)
memory accesses. That is the most important one: possibly a third of
the bugs in the repository stem from non-atomic byte and word
accesses. The kernel developers have expressed a view that they would
like to assume on all arches that byte and word memory accesses are
atomic, and the only architecture holding them back from that
assumption is old Alphas without BWX. There is an old open bug on
gcc related to the non-atomic memory accesses of old Alphas, and that
one basically cannot be fixed.
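The lost-update problem behind these bugs can be sketched in a few lines. Without BWX, storing a single byte means loading the containing word, replacing one byte in a register, and writing the whole word back; two CPUs doing this to neighbouring bytes can overwrite each other. The following is only a shell simulation of that interleaving, with made-up helper and variable names. It is not Alpha code.

```shell
# insert_byte WORD INDEX VALUE
# Return WORD with byte number INDEX replaced by VALUE. This mimics the
# register-side part of a byte store on a CPU without byte instructions.
insert_byte() {
    echo $(( ($1 & ~(255 << ($2 * 8))) | (($3 & 255) << ($2 * 8)) ))
}

mem=0                                  # shared word, both bytes start as 0

cpu_a=$mem                             # CPU A loads the whole word
cpu_b=$mem                             # CPU B loads the same word before A stores
cpu_a=$(insert_byte "$cpu_a" 0 17)     # A wants byte 0 = 0x11
cpu_b=$(insert_byte "$cpu_b" 1 34)     # B wants byte 1 = 0x22
mem=$cpu_a                             # A writes the whole word back
mem=$cpu_b                             # B writes its stale copy back, undoing A

printf 'memory after both stores: 0x%04x\n' "$mem"
```

With true byte stores each CPU would touch only its own byte and the word would end up as 0x2211; in the simulated interleaving CPU A's update is lost.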
If we went to BWX (i.e. EV56) then, as you say, the personal workstations
(e.g. PWS433au and PWS500au), which a lot of Alpha users have, and AlphaStations
such as the 433au will still be supported.
I don't want to use something like EV67 as I think that would limit the
usable hardware too much.
Yes, that's the problem with going fully to EV67. The CPU extensions we
would get are MVI (motion video instructions) that came in with
PCA56, CIX (count integer instructions, such as counting
trailing zeros) that came in with EV67 and FIX (floating point
extensions primarily for efficient conversion between float and
integer and a sqrt instruction) with EV6, but these are nowhere
near as important as BWX in terms of reducing the bug-fixing workload
of maintaining the port.
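As a quick reference, the extensions named above can be written down as a lookup from CPU generation to instruction-set extensions. This is a sketch with hypothetical function names; the mapping follows the description in this thread (GCC names the MVI extension MAX, as in its `-mmax` option).

```shell
# alpha_extensions CPU
# Print the ISA extensions available on the given Alpha CPU generation.
alpha_extensions() {
    case "$1" in
        ev4|ev45|ev5) echo "" ;;
        ev56)         echo "BWX" ;;
        pca56)        echo "BWX MVI" ;;
        ev6)          echo "BWX MVI FIX" ;;
        ev67)         echo "BWX MVI FIX CIX" ;;
        *)            echo "unknown"; return 1 ;;
    esac
}

# has_bwx CPU
# Succeed if the CPU would still be supported with an EV56/BWX baseline.
has_bwx() {
    case " $(alpha_extensions "$1") " in
        *" BWX "*) return 0 ;;
        *)         return 1 ;;
    esac
}

for cpu in ev4 ev45 ev56 ev67; do
    if has_bwx "$cpu"; then
        echo "$cpu: runs binaries built for an EV56 baseline"
    else
        echo "$cpu: dropped by an EV56 baseline"
    fi
done
```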
I guess I can live with dropping EV4 since NetBSD
and Gentoo would still run on these.
Gentoo has the advantage (and disadvantage) of compiling from
source so one can optimise their own installation for their
hardware.
I am still interested in fixing the glibc bug and will work on bisecting it.
If EV56 is the baseline we can agree on, please go ahead and rebuild glibc
and gcc using this baseline.
I am currently building gcc-12 to default to EV56/BWX. In the test
suite now so probably won't be finished till tomorrow. Then I will try building latest glibc (2.36-6) with that gcc. I suspect there will
still be a couple of test suite failures so there will probably be
a further delay before I have it ready to upload to the repository. In
any case I will give fair warning before I do.
[...]
During this compilation I got 4 segfaults from the compiler (gcc-12)
and a "gcc: internal compiler error: Aborted signal terminated program cc1".
If you are interested in the details, I have all the error messages available.
Is that glibc from upstream or the Debian package?
Also, is the machine's memory known to be good? Please make sure to test
it.
[...]
Summarizing it, I'd be grateful if someone could do the bisecting on
one of the buildds or developer machines.
You could cross-compile glibc. That's most likely what I am going to do.
On 11/13/22 00:45, John Paul Adrian Glaubitz wrote:
I just noticed that there is a regression in glibc on alpha with version 2.34 or later.
Looking at the build logs for Debian's 2.34-8 [1], 2.35-4 [2] and 2.36-4 [3], it's obvious
there is something wrong given the many "Segmentation Fault" errors.
This regression was introduced by the following commit:
[...]
Regardless, I can confirm this on my DS15:
```
root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-36231bee7ab36d59dd121ea85b91411ae86945f3 /bin/bash
root@ds15:/srv/storage/build# echo $?
0
root@ds15:/srv/storage/build# exit
exit
root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53 /bin/bash
Segmentation fault
root@ds15:/srv/storage/build# echo $?
139
```
...6c57d320484988e87e446e2e60ce42816bf51d53 is the first bad commit and 36231bee7ab36d59dd121ea85b91411ae86945f3 is its parent.
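For reference, the exit status 139 in the transcript above is how the shell reports a process killed by a signal: 128 plus the signal number, and signal 11 is SIGSEGV on Linux. A minimal sketch of decoding such a status follows; the helper name `decode_status` is made up for illustration.

```shell
# decode_status STATUS
# Describe a shell exit status. Values above 128 mean the process was
# terminated by signal (STATUS - 128); `kill -l N` prints that signal's name.
decode_status() {
    if [ "$1" -gt 128 ]; then
        echo "killed by SIG$(kill -l "$(( $1 - 128 ))")"
    else
        echo "exited with status $1"
    fi
}

decode_status 139   # the status seen after the bad commit
decode_status 0     # the status seen with the parent commit
```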
Do we also have a result for
glibc@6c57d320484988e87e446e2e60ce42816bf51d53 with `-mcpu=ev67`?
```
root@ds15:/srv/storage/build/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53-ev67# CC="alpha-linux-gnu-gcc-12 -mcpu=ev67 -mtune=ev67 " CXX="alpha-linux-gnu-g++-12 -mcpu=ev67 -mtune=ev67 " MIG="alpha-linux-gnu-mig" \
    ../../glibc/configure --host=alphaev67-linux-gnu --disable-werror --prefix=/usr --disable-sanity-checks
[...]
root@ds15:/srv/storage/build# LD_LIBRARY_PATH=$PWD/glibc-at-6c57d320484988e87e446e2e60ce42816bf51d53-ev67 /bin/bash
Segmentation fault
```
Unfortunately it also doesn't work here when optimized for EV67.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=29899
[...]
OK, this just confirms my cross-compile tests with "-mcpu=ev67 -mtune=ev67",
where the segfault wasn't fixed by raising the baseline either.
If you have a user account for glibc bugzilla, you should subscribe to the bug
report I opened for this particular issue [1]. H. J. Lu raises a good question,
namely whether alpha has any hardcoded values for "struct rtld_global_ro {}".
[...]
Can we be sure that this reproducer identifies the same problem as
the build failures from the original post ([1])?
[1]: https://lists.debian.org/debian-alpha/2022/11/msg00003.html
Well, this is how I identified that there was a problem with glibc on alpha.
I built the packages manually with the testsuite enabled and installed them into
a chroot for testing, which resulted in a segfault when dpkg tried to configure
the libc-bin package.
I assume the many testsuite failures are a direct result of this bug which just
causes many tests to segfault. We had a similar problem on sparc64 where a single
bug in the static build caused many testsuite failures.
Interestingly, when I check out the tag glibc-2.34 and disable the _dl_minsigstacksize symbol
in "struct rtld_global_ro {}" again with the following hack, I'm no longer getting a segfault
but a floating point exception:
[...]
Could you verify this on your DS-15?
Hi!
On 12/14/22 21:16, Frank Scheiner wrote:
I'll do that tomorrow. The thing is that this diff doesn't apply cleanly:
Which version of the workaround diff did you use? There are two.
There is one that applies cleanly on top of 6c57d320484988e87e446e2e60ce42816bf51d53
and a second one that applies cleanly on top of glibc-2.34, I posted both. There were
some changes between 6c57d320484988e87e446e2e60ce42816bf51d53 and glibc-2.34 in the
minstksize/stksize code which is why you need the second diff that was also part of
my mail.
I'm attaching the second diff as a patch.
I think there's some whitespace difference. I manually applied the
rejected stuff, made a `git diff` and comparing that to your attached
patch gives:
Maybe adding [1] might help, but the patch actually removes it.
If your glibc fails with a floating point exception, I fear there might be
a second bug hiding somewhere which we need to bisect as well. This is
particularly annoying since we would have to apply the above diff for
every bisecting step.