Forum: >>> Magnum BBS <<<

Re: Double precision /MOD

From minforth@21:1/5 to none albert on Tue Mar 21 05:04:31 2023

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From none) (albert@21:1/5 to All on Tue Mar 21 12:22:16 2023

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From minforth@21:1/5 to minforth on Tue Mar 21 07:01:17 2023

minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J

P.S. after weeding out all reduction possibilities for special cases, there are two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From minforth@21:1/5 to dxforth on Tue Mar 21 08:07:45 2023

dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:

On 22/03/2023 1:01 am, minforth wrote:

minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J

P.S. after weeding out all reduction possibilities for special cases, there are
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster

I think he's asking how to make UM/MOD from U/MOD on the assumption the latter is faster. Problem is they are essentially the same algorithm.

Only if you do shift-and-subtract on both. But who would?

The question to decompose higher order divisions into lower order CPU-supported efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From dxforth@21:1/5 to minforth on Wed Mar 22 01:35:31 2023

On 22/03/2023 1:01 am, minforth wrote:

minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J

P.S. after weeding out all reduction possibilities for special cases, there are
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster

I think he's asking how to make UM/MOD from U/MOD on the assumption the
latter is faster. Problem is they are essentially the same algorithm.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From bl1@gmx.com@21:1/5 to minforth on Tue Mar 21 09:13:29 2023

minforth schrieb am Dienstag, 21. März 2023 um 16:07:46 UTC+1:

dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:

On 22/03/2023 1:01 am, minforth wrote:

minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J

P.S. after weeding out all reduction possibilities for special cases, there are
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster

I think he's asking how to make UM/MOD from U/MOD on the assumption the latter is faster. Problem is they are essentially the same algorithm.

Only if you do shift-and-subtract on both. But who would?

The question to decompose higher order divisions into lower order CPU-supported
efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)

Maybe Donald Knuth's Long Division Algorithm is helpful. Here is an examination about some implementations:
https://skanthak.homepage.t-online.de/division.html

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to albert@cherry. on Tue Mar 21 17:21:13 2023

albert@cherry.(none) (albert) writes:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )

The standard word /MOD works with signed numbers. Gforth calls this
word U/MOD.

um/mod ( du u - remainder quotient )

Results are single precision unsigned.

If you have a fast U/MOD, you get a faster UM/MOD by using U/MOD than
by using shift-and-subtract. But it's not straightforward. Gforth
copied a large chunk of code from libgcc for this.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From minforth@21:1/5 to b...@gmx.com on Tue Mar 21 11:53:25 2023

b...@gmx.com schrieb am Dienstag, 21. März 2023 um 17:13:31 UTC+1:

minforth schrieb am Dienstag, 21. März 2023 um 16:07:46 UTC+1:

dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:

On 22/03/2023 1:01 am, minforth wrote:

minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1: >>> Suppose we have a fast single precision /MOD , say unsigned.

Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J

P.S. after weeding out all reduction possibilities for special cases, there are
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster

I think he's asking how to make UM/MOD from U/MOD on the assumption the latter is faster. Problem is they are essentially the same algorithm.

Only if you do shift-and-subtract on both. But who would?

The question to decompose higher order divisions into lower order CPU-supported
efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)

Maybe Donald Knuth's Long Division Algorithm is helpful. Here is an examination about some implementations:
https://skanthak.homepage.t-online.de/division.html

Nice overview by 'Skanthak'. But I am astonished at those complications he is demonstrating there, and at those required code lengths.

I think Forth can do better. FWIW what I am using so far without trouble:

\ Non standard:
\ Primitive: _MU/MOD \ ( ud v - r udq ) mixed unsigned division (gforth: UD/MOD)
\ : PLUCK 2 pick ;

: D/REM \ ( du dv -- drem dquot ) MF double number signed symmetric division
pluck >r pluck over xor >r \ remember signs
dabs 2swap dabs 2swap dup IF
C mfUCell ul=mffth, uh=mfthd, vl=mfsec, vh=mftos; // copy Forth stack to C unsigned variables
C mfUCell q=0, m=1;
C if ((uh==vh)&&(ul==vl)) uh=ul=0, q=1;
C else { // shift and subtract
C while(vh==uh?vl<=ul:vh<=uh)
C vh=(vh<<1)|(vl>>63), vl<<=1, m<<=1;
C while(m>1) {
C vl=(vl>>1)|(vh<<63), vh>>=1, m>>=1;
C if (uh==vh?ul>=vl:uh>=vh) {
C uh-=vh+(vl>ul), ul-=vl, q|=m; } } }
C mffth=ul, mfthd=uh, mfsec=q, mftos=0;
ELSE drop _mu/mod 0 -rot THEN
r> 0< IF dnegate THEN \ apply signs
r> 0< IF 2swap dnegate 2swap THEN ;

Don't bother about this seemingly wild mix of Forth and C code. All I want to show here
is how compact it can be.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From dxforth@21:1/5 to minforth on Wed Mar 22 11:42:09 2023

On 22/03/2023 2:07 am, minforth wrote:

dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:

On 22/03/2023 1:01 am, minforth wrote:

minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:

none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )

Results are single precision unsigned.

Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??

or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J >>>

P.S. after weeding out all reduction possibilities for special cases, there are
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster

I think he's asking how to make UM/MOD from U/MOD on the assumption the
latter is faster. Problem is they are essentially the same algorithm.

Only if you do shift-and-subtract on both. But who would?

The question to decompose higher order divisions into lower order CPU-supported
efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)

What pops up every now and again are C users asking why Intel didn't provide double by double and single by single division instructions.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to minforth on Wed Mar 22 09:38:26 2023

minforth <minforth@arcor.de> writes:

b...@gmx.com schrieb am Dienstag, 21. M=C3=A4rz 2023 um 17:13:31 UTC+1:

https://skanthak.homepage.t-online.de/division.html

Nice overview by 'Skanthak'. But I am astonished at those complications he = >is
demonstrating there, and at those required code lengths.

I think Forth can do better. FWIW what I am using so far without trouble:

\ Non standard:
\ Primitive: _MU/MOD \ ( ud v - r udq ) mixed unsigned division (gforth: U= >D/MOD)
\ : PLUCK 2 pick ;

: D/REM \ ( du dv -- drem dquot ) MF double number signed symmetric divisi= >on
pluck >r pluck over xor >r \ remember signs
dabs 2swap dabs 2swap dup IF=20
C mfUCell ul=3Dmffth, uh=3Dmfthd, vl=3Dmfsec, vh=3Dmftos; // copy Forth=
stack to C unsigned variables
C mfUCell q=3D0, m=3D1;=20
C if ((uh=3D=3Dvh)&&(ul=3D=3Dvl)) uh=3Dul=3D0, q=3D1;
C else { // shift and subtract
C while(vh=3D=3Duh?vl<=3Dul:vh<=3Duh)
C vh=3D(vh<<1)|(vl>>63), vl<<=3D1, m<<=3D1;
C while(m>1) {
C vl=3D(vl>>1)|(vh<<63), vh>>=3D1, m>>=3D1;
C if (uh=3D=3Dvh?ul>=3Dvl:uh>=3Dvh) {
C uh-=3Dvh+(vl>ul), ul-=3Dvl, q|=3Dm; } } }=20
C mffth=3Dul, mfthd=3Duh, mfsec=3Dq, mftos=3D0;
ELSE drop _mu/mod 0 -rot THEN
r> 0< IF dnegate THEN \ apply signs
r> 0< IF 2swap dnegate 2swap THEN ;

Don't bother about this seemingly wild mix of Forth and C code. All I want = >to show here
is how compact it can be.

Except that it does not solve the problem "none albert" asked for:
Implementing double/single->single with single/single->single, faster
than shift-and-subtract.

What you provide is an implementation of double/double->double that
uses shift-and-subtract in the general case and calls a pre-existing implementation of double/single->double in Forth, without showing how
that is implemented. In Gforth the latter is implemented as

: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;

And this implementation uses UM/MOD, which is the word that "none
albert" has asked about.

So we have an onion of division words, starting from the core:

u/mod (called by) um/mod (called by) ud/mod (called by) d/rem

The first part can be skipped if the architecture directly supports
UM/MOD, or if it does not have division instructions.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From none) (albert@21:1/5 to Anton Ertl on Wed Mar 22 11:58:03 2023

In article <2023Mar21.182113@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

albert@cherry.(none) (albert) writes:

Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?

/mod ( u u -- remainder quotient )

The standard word /MOD works with signed numbers. Gforth calls this
word U/MOD.

um/mod ( du u - remainder quotient )

Results are single precision unsigned.

If you have a fast U/MOD, you get a faster UM/MOD by using U/MOD than
by using shift-and-subtract. But it's not straightforward. Gforth
copied a large chunk of code from libgcc for this.

I suspected that much. Thanks.
Knuth based a multiprecision divide on a single division divide.
A 2-precision divide is a special case of this, but it is equally
cumbersome. I remember that in implementing this multiprecision
divide on the transputer (we did number theoretic work on it)
Marcel Hendrix asked my help for debugging, a rare event.

- anton

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Paul Rubin@21:1/5 to Anton Ertl on Wed Mar 22 10:56:30 2023

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:

: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;

In gforth 0.7.3 on amd64, I get:

see ud/mod
: ud/mod
>r 0 i um/mod r> swap >r um/mod r> ; ok

I haven't tried to figure out what either of these are doing.

I had been wondering what happens if the quotient won't fit in a single.
0 1 1 ud/mod gives 0 0 1 on my system, which doesn't seem so great.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to Paul Rubin on Wed Mar 22 18:49:29 2023

Paul Rubin <no.email@nospam.invalid> writes:

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:

: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;

In gforth 0.7.3 on amd64, I get:

see ud/mod
: ud/mod
>r 0 i um/mod r> swap >r um/mod r> ; ok

I haven't tried to figure out what either of these are doing.

They are doing what is documented. The implementation I cited is more optimized than the 0.7.3 one (with the intention of getting a faster
"#" (and words that call it, like ".", especially on architectures
that do not provide UM/MOD, such as ARM and RISC-V)).

I had been wondering what happens if the quotient won't fit in a single.
0 1 1 ud/mod gives 0 0 1 on my system, which doesn't seem so great.

What you are doing (on a 64-bit system) is:

18446744073709551616/1

And the result you get is the quotient 18446744073709551616 and
remainder 0.

What's not so great in your opinion?

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From minforth@21:1/5 to Anton Ertl on Wed Mar 22 13:32:49 2023

Anton Ertl schrieb am Mittwoch, 22. März 2023 um 10:51:54 UTC+1:

minforth <minf...@arcor.de> writes:

b...@gmx.com schrieb am Dienstag, 21. M=C3=A4rz 2023 um 17:13:31 UTC+1:

https://skanthak.homepage.t-online.de/division.html

Nice overview by 'Skanthak'. But I am astonished at those complications he = >is
demonstrating there, and at those required code lengths.

I think Forth can do better. FWIW what I am using so far without trouble:

\ Non standard:
\ Primitive: _MU/MOD \ ( ud v - r udq ) mixed unsigned division (gforth: U= >D/MOD)
\ : PLUCK 2 pick ;

: D/REM \ ( du dv -- drem dquot ) MF double number signed symmetric divisi= >on
pluck >r pluck over xor >r \ remember signs
dabs 2swap dabs 2swap dup IF=20
C mfUCell ul=3Dmffth, uh=3Dmfthd, vl=3Dmfsec, vh=3Dmftos; // copy Forth=
stack to C unsigned variables
C mfUCell q=3D0, m=3D1;=20
C if ((uh=3D=3Dvh)&&(ul=3D=3Dvl)) uh=3Dul=3D0, q=3D1;
C else { // shift and subtract
C while(vh=3D=3Duh?vl<=3Dul:vh<=3Duh)
C vh=3D(vh<<1)|(vl>>63), vl<<=3D1, m<<=3D1;
C while(m>1) {
C vl=3D(vl>>1)|(vh<<63), vh>>=3D1, m>>=3D1;
C if (uh=3D=3Dvh?ul>=3Dvl:uh>=3Dvh) {
C uh-=3Dvh+(vl>ul), ul-=3Dvl, q|=3Dm; } } }=20
C mffth=3Dul, mfthd=3Duh, mfsec=3Dq, mftos=3D0;
ELSE drop _mu/mod 0 -rot THEN

0< IF dnegate THEN \ apply signs
0< IF 2swap dnegate 2swap THEN ;

Don't bother about this seemingly wild mix of Forth and C code. All I want = >to show here
is how compact it can be.

Except that it does not solve the problem "none albert" asked for: Implementing double/single->single with single/single->single, faster
than shift-and-subtract.

What you provide is an implementation of double/double->double that
uses shift-and-subtract in the general case and calls a pre-existing implementation of double/single->double in Forth, without showing how
that is implemented. In Gforth the latter is implemented as

: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double \G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;

And this implementation uses UM/MOD, which is the word that "none
albert" has asked about.

So we have an onion of division words, starting from the core:

u/mod (called by) um/mod (called by) ud/mod (called by) d/rem

The first part can be skipped if the architecture directly supports
UM/MOD, or if it does not have division instructions.

Of course. IMO it boils down to either you have CPU mixed division support
or not. If not, the question remains: is there a faster/simpler division algorithm
than binary shift-subtract?

F.ex. given fast integer log2 and pow2 operators, division can be solved with multiplication and some additions. But only timing can tell if it is worth the hassle.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From dxforth@21:1/5 to Paul Rubin on Thu Mar 23 10:51:14 2023

On 23/03/2023 4:56 am, Paul Rubin wrote:

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:

: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;

In gforth 0.7.3 on amd64, I get:

see ud/mod
: ud/mod
>r 0 i um/mod r> swap >r um/mod r> ; ok

I haven't tried to figure out what either of these are doing.

I had been wondering what happens if the quotient won't fit in a single.

As is potentially the case in UM/MOD ? That would be an overflow/exception. Don't do that. Intelligent use of UM/MOD is better than default use of UD/MOD.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to minforth on Thu Mar 23 07:54:44 2023

minforth <minforth@arcor.de> writes:

Of course. IMO it boils down to either you have CPU mixed division support
or not. If not, the question remains: is there a faster/simpler division al= >gorithm
than binary shift-subtract?=20

F.ex. given fast integer log2 and pow2 operators, division can be solved wi= >th
multiplication and some additions. But only timing can tell if it is worth = >the hassle.

Yes, a lot of the bulk of the stuff we took from libgcc is due to
variations for making use of some feature present on some
architectures to get a faster result, plus the code for an efficient implementation on architectures where the feature is not present
(which may mean that you use a different approach rather than just
replacing the code for the feature with a software implementation of
that feature). With several such features in play, the result bloats considerably.

You can code a simple implementation that uses only minimal features,
and you can code it up in Forth, and get a much slimmer result and
then claim "Forth simplicity", but in that case the slimness is not
due to Forth, but due to preferencing slimness over efficiency.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From minforth@21:1/5 to Anton Ertl on Thu Mar 23 01:36:50 2023

Anton Ertl schrieb am Donnerstag, 23. März 2023 um 09:03:45 UTC+1:

You can code a simple implementation that uses only minimal features,
and you can code it up in Forth, and get a much slimmer result and
then claim "Forth simplicity", but in that case the slimness is not
due to Forth, but due to preferencing slimness over efficiency.

Due to caching effects on CPUs or being able to make better usage of registers, very often the slimmer code wins the race despite being mathematically less efficient.

And it saves a lot of debugging time. ;-)

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to minforth on Thu Mar 23 09:20:05 2023

minforth <minforth@arcor.de> writes:

Anton Ertl schrieb am Donnerstag, 23. M=C3=A4rz 2023 um 09:03:45 UTC+1:

You can code a simple implementation that uses only minimal features,=20
and you can code it up in Forth, and get a much slimmer result and=20
then claim "Forth simplicity", but in that case the slimness is not=20
due to Forth, but due to preferencing slimness over efficiency.

Due to caching effects on CPUs or being able to make better usage of regist= >ers,
very often the slimmer code wins the race despite being mathematically less=
efficient.

For every architecture only the variant for its architectural features
results is compiled (using conditional compilation). This results in
the following code lengths for UM/MOD (excluding the final jump in the
NEXT):

Bytes Architecture
30 AMD64
292 RISC-V
308 Aarch64

If you come up with shorter code that runs on all the architectures
Gforth supports and also results in shorter and faster object code, I
will happily replace the current Gforth implementation. Until then I
continue to believe that no significant source code size improvement
is possible that does not result in a significant disadvantage in
performance.

And it saves a lot of debugging time. ;-)

Yes, that's a disadvantage of all this special-case code, but in this
case the debugging was performed by the gcc developers, I just dropped
it in as-is.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

Who's Online
Recent Visitors
- Bob Worm
  Mon May 6 11:44:29 2024
  from Wales, Uk via Telnet
- Bob Worm
  Tue May 7 15:23:21 2024
  from Wales, Uk via Telnet
- Bob Worm
  Tue May 7 14:12:05 2024
  from Wales, Uk via Telnet
- Bob Worm
  Tue May 7 09:06:52 2024
  from Wales, Uk via Telnet

System Info

Sysop:	Keyop
Location:	Huddersfield, West Yorkshire, UK
Users:	300
Nodes:	16 (2 / 14)
Uptime:	47:34:08
Calls:	6,710
Calls today:	3
Files:	12,243
Messages:	5,354,494
Posted today:	1

Re: Double precision /MOD

Who's Online

Recent Visitors

System Info