Forum: >>> Magnum BBS <<<

[gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_

From Mike Gilbert@21:1/5 to All on Tue Oct 18 04:00:01 2022

GCC uses x87 floating point instructions when building 32-bit x86
code by default. This is true even for x86-64 multilib.

Using the x87 floating point unit can lead to strange behavior when
calculating intermediate values for single and double precision floats.
It uses 80 bits for all calculations, which is larger than the 32 or 64
bits specified for floats and doubles.

Using the SSE2 instructions available on x86-64 for floating point
arithmetic leads to more consistent behavior, and is usually faster.

Reference: https://gcc.gnu.org/wiki/x87note
Signed-off-by: Mike Gilbert <floppym@gentoo.org>
---
profiles/arch/amd64/make.defaults | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/profiles/arch/amd64/make.defaults b/profiles/arch/amd64/make.defaults
index 0c05dec124e..e7e18ff6a91 100644
--- a/profiles/arch/amd64/make.defaults
+++ b/profiles/arch/amd64/make.defaults
@@ -1,4 +1,4 @@
-# Copyright 1999-2021 Gentoo Authors
+# Copyright 1999-2022 Gentoo Authors
 # Distributed under the terms of the GNU General Public License v2

 ARCH="amd64"
@@ -28,7 +28,7 @@ LDFLAGS_amd64="-m elf_x86_64"
 CHOST_amd64="x86_64-pc-linux-gnu"

 # 32bit specific settings.
-CFLAGS_x86="-m32"
+CFLAGS_x86="-m32 -mfpmath=sse"
 LDFLAGS_x86="-m elf_i386"
 CHOST_x86="i686-pc-linux-gnu"

--
2.37.3

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
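
As a rough illustration of the excess-precision behavior the patch description refers to, the following C sketch reports the compiler's evaluation method and checks whether a sum changes once it has been stored to a 64-bit double. FLT_EVAL_METHOD is standard C99; the function names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <float.h>

/* Describe how this build evaluates floating point expressions.
 * FLT_EVAL_METHOD is the standard C99 macro from <float.h>. */
const char *fp_eval_description(void)
{
#if FLT_EVAL_METHOD == 0
    return "operations are rounded to the width of their own type";
#elif FLT_EVAL_METHOD == 1
    return "float and double operations are evaluated as double";
#elif FLT_EVAL_METHOD == 2
    return "all operations are evaluated as long double";
#else
    return "evaluation method is indeterminate or implementation-defined";
#endif
}

/* Return 1 if a + b, as evaluated inside an expression, differs from the
 * same sum after it has been forced through a 64-bit double in memory.
 * Hypothetical helper for illustration only. */
int sum_changes_when_stored(double a, double b)
{
    /* NaN never compares equal to itself, so reject NaN inputs. */
    assert(a == a && b == b);

    volatile double stored = a + b;  /* rounded to 64 bits on the store */
    return (a + b) != stored;        /* left side may carry excess precision */
}
```

Built with `gcc -m32` and x87 math, FLT_EVAL_METHOD is typically 2; with `-mfpmath=sse` and SSE2 enabled it is typically 0. Whether the helper returns 1 for a given input depends on the compiler and optimization level, which is the kind of inconsistency the patch aims to avoid.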

From Ulrich Mueller@21:1/5 to All on Tue Oct 18 10:20:01 2022

On Tue, 18 Oct 2022, Mike Gilbert wrote:

> Reference: https://gcc.gnu.org/wiki/x87note

Which says:

| ... the amount of worst-case error that could possibly happen using
| the x87 (with any amount of intermediate rounding) is at worst the
| same as true 64 or 32 bit arithmetic, and in practice is almost always
| better.

and:

| Note, however, that this greater repeatability comes at the cost of
| lost precision (i.e. SSE always gets the same precision because it
| always takes the equivalent of the x87's worst case: a forced round
| down at each step).

So, it comes with a price, and I wonder if we shouldn't leave that
choice to the user, and go with the upstream GCC default?
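
The trade-off quoted above can be sketched in C by comparing a sum that is rounded to double at every step with one that keeps a wider accumulator and rounds once at the end. This is only an approximation of what x87 code does, since real x87 code may spill intermediates to memory at unpredictable points; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum with a 64-bit accumulator: every partial sum is rounded to double,
 * which is how strict SSE2 double arithmetic behaves. */
double sum_rounded_each_step(const double *values, size_t count)
{
    assert(values != NULL || count == 0);

    volatile double acc = 0.0;  /* volatile forces a 64-bit store per step */
    for (size_t i = 0; i < count; i++)
        acc = acc + values[i];
    return acc;
}

/* Sum with a long double accumulator and round once at the end. Where
 * long double is the 80-bit x87 format, this mimics partial sums that
 * stay in x87 registers. */
double sum_rounded_at_end(const double *values, size_t count)
{
    assert(values != NULL || count == 0);

    long double acc = 0.0L;
    for (size_t i = 0; i < count; i++)
        acc += values[i];
    return (double)acc;
}
```

For the input {1e17, 1.0, -1e17}, the first function returns 0.0 because 1e17 + 1.0 is not representable as a double. On a typical x86-64 Linux build, where long double is the 80-bit format, the second returns 1.0. On platforms where long double is the same as double, both return 0.0.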

> -CFLAGS_x86="-m32"
> +CFLAGS_x86="-m32 -mfpmath=sse"

Also, why add the flag only to CFLAGS_x86 but not to CFLAGS_amd64?
They should have the same single and double precision arithmetic?

Ulrich



From David Seifert@21:1/5 to Ulrich Mueller on Tue Oct 18 12:00:01 2022

On Tue, 2022-10-18 at 10:14 +0200, Ulrich Mueller wrote:

> On Tue, 18 Oct 2022, Mike Gilbert wrote:
>
> > Reference: https://gcc.gnu.org/wiki/x87note
>
> Which says:
>
> | ... the amount of worst-case error that could possibly happen using
> | the x87 (with any amount of intermediate rounding) is at worst the
> | same as true 64 or 32 bit arithmetic, and in practice is almost always
> | better.
>
> and:
>
> | Note, however, that this greater repeatability comes at the cost of
> | lost precision (i.e. SSE always gets the same precision because it
> | always takes the equivalent of the x87's worst case: a forced round
> | down at each step).
>
> So, it comes with a price, and I wonder if we shouldn't leave that
> choice to the user, and go with the upstream GCC default?
>
> > -CFLAGS_x86="-m32"
> > +CFLAGS_x86="-m32 -mfpmath=sse"

-mfpmath=sse is already the default on amd64.

> Also, why add the flag only to CFLAGS_x86 but not to CFLAGS_amd64?
> They should have the same single and double precision arithmetic?
>
> Ulrich


From Ulrich Mueller@21:1/5 to All on Tue Oct 18 13:50:01 2022

On Tue, 18 Oct 2022, David Seifert wrote:

> > > -CFLAGS_x86="-m32"
> > > +CFLAGS_x86="-m32 -mfpmath=sse"
>
> -mfpmath=sse is already the default on amd64.

I see. This change makes sense then.

What about profiles/arch/x86 though? IIUC we'll end up with an
inconsistency between x86 and multilib amd64.

Ulrich



From David Seifert@21:1/5 to Ulrich Mueller on Tue Oct 18 15:40:01 2022

On Tue, 2022-10-18 at 13:40 +0200, Ulrich Mueller wrote:

> On Tue, 18 Oct 2022, David Seifert wrote:
>
> > > > -CFLAGS_x86="-m32"
> > > > +CFLAGS_x86="-m32 -mfpmath=sse"
> >
> > -mfpmath=sse is already the default on amd64.
>
> I see. This change makes sense then.
>
> What about profiles/arch/x86 though? IIUC we'll end up with an
> inconsistency between x86 and multilib amd64.
>
> Ulrich

What if I want to build Gentoo on an old AMD Thunderbird which has
neither SSE1 nor the more important SSE2?


From Mike Gilbert@21:1/5 to soap@gentoo.org on Tue Oct 18 18:00:01 2022

On Tue, Oct 18, 2022 at 5:56 AM David Seifert <soap@gentoo.org> wrote:

> On Tue, 2022-10-18 at 10:14 +0200, Ulrich Mueller wrote:
> > On Tue, 18 Oct 2022, Mike Gilbert wrote:
> >
> > > Reference: https://gcc.gnu.org/wiki/x87note
> >
> > Which says:
> >
> > | ... the amount of worst-case error that could possibly happen using
> > | the x87 (with any amount of intermediate rounding) is at worst the
> > | same as true 64 or 32 bit arithmetic, and in practice is almost always
> > | better.
> >
> > and:
> >
> > | Note, however, that this greater repeatability comes at the cost of
> > | lost precision (i.e. SSE always gets the same precision because it
> > | always takes the equivalent of the x87's worst case: a forced round
> > | down at each step).
> >
> > So, it comes with a price, and I wonder if we shouldn't leave that
> > choice to the user, and go with the upstream GCC default?
> >
> > > -CFLAGS_x86="-m32"
> > > +CFLAGS_x86="-m32 -mfpmath=sse"
>
> -mfpmath=sse is already the default on amd64.

I have amended the first paragraph to make this more clear:

GCC uses x87 floating point instructions when building 32-bit x86
code by default. When building 64-bit code, SSE2 instructions are used
instead.


From Mike Gilbert@21:1/5 to soap@gentoo.org on Tue Oct 18 17:20:02 2022

On Tue, Oct 18, 2022 at 9:37 AM David Seifert <soap@gentoo.org> wrote:

> On Tue, 2022-10-18 at 13:40 +0200, Ulrich Mueller wrote:
> > On Tue, 18 Oct 2022, David Seifert wrote:
> >
> > > > > -CFLAGS_x86="-m32"
> > > > > +CFLAGS_x86="-m32 -mfpmath=sse"
> > >
> > > -mfpmath=sse is already the default on amd64.
> >
> > I see. This change makes sense then.
> >
> > What about profiles/arch/x86 though? IIUC we'll end up with an
> > inconsistency between x86 and multilib amd64.
> >
> > Ulrich
>
> What if I want to build Gentoo on an old AMD Thunderbird which has
> neither SSE1 nor the more important SSE2?

Right. On amd64, the CPU always supports SSE2, so -mfpmath=sse will always
work there.

On x86, we need to consider a more diverse set of supported instructions.


From Ulrich Mueller@21:1/5 to All on Tue Oct 18 18:50:01 2022

On Tue, 18 Oct 2022, David Seifert wrote:

> What if I want to build Gentoo on an old AMD Thunderbird which has
> neither SSE1 nor the more important SSE2?

The -mfpmath=sse option is a no-op if the CPU doesn't support SSE,
i.e. it will use 387 arithmetics nevertheless.
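
One way to check which unit a given set of flags actually selects is to inspect the macros GCC predefines for x86 targets. The sketch below assumes GCC or a compatible compiler that defines `__SSE2_MATH__` when SSE is used for double arithmetic; the enum and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical labels for the unit used for scalar double arithmetic. */
enum fp_unit { FP_UNIT_UNKNOWN = 0, FP_UNIT_387 = 1, FP_UNIT_SSE = 2 };

/* Report the unit the compiler selected for double arithmetic, based on
 * macros that GCC predefines for x86 targets. */
enum fp_unit double_math_unit(void)
{
#if defined(__SSE2_MATH__)
    return FP_UNIT_SSE;      /* -mfpmath=sse is in effect with SSE2 enabled */
#elif defined(__i386__) || defined(__x86_64__)
    return FP_UNIT_387;      /* x86 target without SSE2 math */
#else
    return FP_UNIT_UNKNOWN;  /* not an x86 target, or macros unavailable */
#endif
}

/* Map the label to a printable name. */
const char *fp_unit_name(enum fp_unit unit)
{
    switch (unit) {
    case FP_UNIT_SSE:     return "sse";
    case FP_UNIT_387:     return "387";
    case FP_UNIT_UNKNOWN: return "unknown";
    }
    assert(!"invalid fp_unit value");
    return "unknown";
}
```

Compiled with `gcc -m32 -march=i686 -mfpmath=sse`, this should report 387, since SSE is not enabled for that target. Adding `-msse2` should switch it to sse.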



From Mike Gilbert@21:1/5 to ulm@gentoo.org on Tue Oct 18 19:40:01 2022

On Tue, Oct 18, 2022 at 12:47 PM Ulrich Mueller <ulm@gentoo.org> wrote:

> On Tue, 18 Oct 2022, David Seifert wrote:
>
> > What if I want to build Gentoo on an old AMD Thunderbird which has
> > neither SSE1 nor the more important SSE2?
>
> The -mfpmath=sse option is a no-op if the CPU doesn't support SSE,
> i.e. it will use 387 arithmetics nevertheless.

I don't really see an "effective" way to deploy this via profiles on x86.

We could add it to the default CFLAGS setting in
profiles/arch/x86/make.defaults. However, we also default to
-march=i686 there, and that doesn't support SSE or SSE2. Also, the
entire CFLAGS variable is likely to be overridden by the CFLAGS
setting in /etc/make.conf.

The CFLAGS_x86 profile variable is only used by the
multilib_toolchain_setup function in multilib.eclass. In other words,
it only affects ebuilds that utilize the multilib eclasses to build
libraries for multiple ABIs. That covers all 32-bit libraries on
amd64, but doesn't cover all packages on x86.

