Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:
Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.Not quite clear what you intend. The maximum
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??
or like https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J
On 22/03/2023 1:01 am, minforth wrote:
minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:
none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:
Suppose we have a fast single precision /MOD , say unsigned.Not quite clear what you intend. The maximum
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??
or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J
P.S. after weeding out all reduction possibilities for special cases, there areI think he's asking how to make UM/MOD from U/MOD on the assumption the latter is faster. Problem is they are essentially the same algorithm.
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster
minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:
none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:
Suppose we have a fast single precision /MOD , say unsigned.Not quite clear what you intend. The maximum
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??
or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J
P.S. after weeding out all reduction possibilities for special cases, there are
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster
dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:
On 22/03/2023 1:01 am, minforth wrote:
minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:
none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:
Suppose we have a fast single precision /MOD , say unsigned.Not quite clear what you intend. The maximum
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??
or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J
Only if you do shift-and-subtract on both. But who would?P.S. after weeding out all reduction possibilities for special cases, there areI think he's asking how to make UM/MOD from U/MOD on the assumption the latter is faster. Problem is they are essentially the same algorithm.
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster
The question to decompose higher order divisions into lower order CPU-supported
efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)
Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
minforth schrieb am Dienstag, 21. März 2023 um 16:07:46 UTC+1:
dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:
On 22/03/2023 1:01 am, minforth wrote:
minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:
none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1: >>> Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,Not quite clear what you intend. The maximum
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??
or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J
Only if you do shift-and-subtract on both. But who would?P.S. after weeding out all reduction possibilities for special cases, there areI think he's asking how to make UM/MOD from U/MOD on the assumption the latter is faster. Problem is they are essentially the same algorithm.
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster
The question to decompose higher order divisions into lower order CPU-supportedMaybe Donald Knuth's Long Division Algorithm is helpful. Here is an examination about some implementations:
efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)
https://skanthak.homepage.t-online.de/division.html
dxforth schrieb am Dienstag, 21. März 2023 um 15:35:33 UTC+1:
On 22/03/2023 1:01 am, minforth wrote:
minforth schrieb am Dienstag, 21. März 2023 um 13:04:33 UTC+1:I think he's asking how to make UM/MOD from U/MOD on the assumption the
none albert schrieb am Dienstag, 21. März 2023 um 12:22:19 UTC+1:P.S. after weeding out all reduction possibilities for special cases, there are
Suppose we have a fast single precision /MOD , say unsigned.Not quite clear what you intend. The maximum
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
UD/MOD ( udnom uddenom -- udrem udquot) ??
on a 32-bit or 64-bit system ??
or like
https://groups.google.com/g/comp.lang.forth/c/RlszeFOHIB4/m/ijpQHZ0huR0J >>>
two other options than shift-and-subract:
- on certain CPUs use floating-point division with 32-bit systems when du has < 56 bits
- Newton-Raphson division is perhaps faster
latter is faster. Problem is they are essentially the same algorithm.
Only if you do shift-and-subtract on both. But who would?
The question to decompose higher order divisions into lower order CPU-supported
efficient division primitives pops up now and then. I am also among those on the
lookout for simple answers. ;-)
b...@gmx.com schrieb am Dienstag, 21. M=C3=A4rz 2023 um 17:13:31 UTC+1:
https://skanthak.homepage.t-online.de/division.html
Nice overview by 'Skanthak'. But I am astonished at those complications he = >is
demonstrating there, and at those required code lengths.
I think Forth can do better. FWIW what I am using so far without trouble:
\ Non standard:
\ Primitive: _MU/MOD \ ( ud v - r udq ) mixed unsigned division (gforth: U= >D/MOD)
\ : PLUCK 2 pick ;
: D/REM \ ( du dv -- drem dquot ) MF double number signed symmetric divisi= >on
pluck >r pluck over xor >r \ remember signs
dabs 2swap dabs 2swap dup IF=20
C mfUCell ul=3Dmffth, uh=3Dmfthd, vl=3Dmfsec, vh=3Dmftos; // copy Forth=
stack to C unsigned variables
C mfUCell q=3D0, m=3D1;=20
C if ((uh=3D=3Dvh)&&(ul=3D=3Dvl)) uh=3Dul=3D0, q=3D1;
C else { // shift and subtract
C while(vh=3D=3Duh?vl<=3Dul:vh<=3Duh)
C vh=3D(vh<<1)|(vl>>63), vl<<=3D1, m<<=3D1;
C while(m>1) {
C vl=3D(vl>>1)|(vh<<63), vh>>=3D1, m>>=3D1;
C if (uh=3D=3Dvh?ul>=3Dvl:uh>=3Dvh) {
C uh-=3Dvh+(vl>ul), ul-=3Dvl, q|=3Dm; } } }=20
C mffth=3Dul, mfthd=3Duh, mfsec=3Dq, mftos=3D0;
ELSE drop _mu/mod 0 -rot THEN
r> 0< IF dnegate THEN \ apply signs
r> 0< IF 2swap dnegate 2swap THEN ;
Don't bother about this seemingly wild mix of Forth and C code. All I want = >to show here
is how compact it can be.
albert@cherry.(none) (albert) writes:
Suppose we have a fast single precision /MOD , say unsigned.
Does that help with building a double precision /MOD ,
or must we resort to a shift and subtract algorithm?
/mod ( u u -- remainder quotient )
The standard word /MOD works with signed numbers. Gforth calls this
word U/MOD.
um/mod ( du u - remainder quotient )
Results are single precision unsigned.
If you have a fast U/MOD, you get a faster UM/MOD by using U/MOD than
by using shift-and-subtract. But it's not straightforward. Gforth
copied a large chunk of code from libgcc for this.
- anton
: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;
anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;
In gforth 0.7.3 on amd64, I get:
see ud/mod
: ud/mod
>r 0 i um/mod r> swap >r um/mod r> ; ok
I haven't tried to figure out what either of these are doing.
I had been wondering what happens if the quotient won't fit in a single.
0 1 1 ud/mod gives 0 0 1 on my system, which doesn't seem so great.
minforth <minf...@arcor.de> writes:
b...@gmx.com schrieb am Dienstag, 21. M=C3=A4rz 2023 um 17:13:31 UTC+1:
https://skanthak.homepage.t-online.de/division.html
Nice overview by 'Skanthak'. But I am astonished at those complications he = >is
demonstrating there, and at those required code lengths.
I think Forth can do better. FWIW what I am using so far without trouble:
\ Non standard:
\ Primitive: _MU/MOD \ ( ud v - r udq ) mixed unsigned division (gforth: U= >D/MOD)
\ : PLUCK 2 pick ;
: D/REM \ ( du dv -- drem dquot ) MF double number signed symmetric divisi= >on
pluck >r pluck over xor >r \ remember signs
dabs 2swap dabs 2swap dup IF=20
C mfUCell ul=3Dmffth, uh=3Dmfthd, vl=3Dmfsec, vh=3Dmftos; // copy Forth=
stack to C unsigned variables
C mfUCell q=3D0, m=3D1;=20
C if ((uh=3D=3Dvh)&&(ul=3D=3Dvl)) uh=3Dul=3D0, q=3D1;
C else { // shift and subtract
C while(vh=3D=3Duh?vl<=3Dul:vh<=3Duh)
C vh=3D(vh<<1)|(vl>>63), vl<<=3D1, m<<=3D1;
C while(m>1) {
C vl=3D(vl>>1)|(vh<<63), vh>>=3D1, m>>=3D1;
C if (uh=3D=3Dvh?ul>=3Dvl:uh>=3Dvh) {
C uh-=3Dvh+(vl>ul), ul-=3Dvl, q|=3Dm; } } }=20
C mffth=3Dul, mfthd=3Duh, mfsec=3Dq, mftos=3D0;
ELSE drop _mu/mod 0 -rot THEN
0< IF dnegate THEN \ apply signs
0< IF 2swap dnegate 2swap THEN ;
Don't bother about this seemingly wild mix of Forth and C code. All I want = >to show hereExcept that it does not solve the problem "none albert" asked for: Implementing double/single->single with single/single->single, faster
is how compact it can be.
than shift-and-subtract.
What you provide is an implementation of double/double->double that
uses shift-and-subtract in the general case and calls a pre-existing implementation of double/single->double in Forth, without showing how
that is implemented. In Gforth the latter is implemented as
: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double \G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;
And this implementation uses UM/MOD, which is the word that "none
albert" has asked about.
So we have an onion of division words, starting from the core:
u/mod (called by) um/mod (called by) ud/mod (called by) d/rem
The first part can be skipped if the architecture directly supports
UM/MOD, or if it does not have division instructions.
anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
: ud/mod ( ud1 u2 -- urem udquot ) \ gforth
\G divide unsigned double @i{ud1} by @i{u2}, resulting in a unsigned double
\G quotient @i{udquot} and a single remainder @i{urem}.
over 0= if nip u/mod 0 exit then
dup >r u/mod r> swap >r um/mod r> ;
In gforth 0.7.3 on amd64, I get:
see ud/mod
: ud/mod
>r 0 i um/mod r> swap >r um/mod r> ; ok
I haven't tried to figure out what either of these are doing.
I had been wondering what happens if the quotient won't fit in a single.
Of course. IMO it boils down to either you have CPU mixed division support
or not. If not, the question remains: is there a faster/simpler division al= >gorithm
than binary shift-subtract?=20
F.ex. given fast integer log2 and pow2 operators, division can be solved wi= >th
multiplication and some additions. But only timing can tell if it is worth = >the hassle.
You can code a simple implementation that uses only minimal features,
and you can code it up in Forth, and get a much slimmer result and
then claim "Forth simplicity", but in that case the slimness is not
due to Forth, but due to preferencing slimness over efficiency.
Anton Ertl schrieb am Donnerstag, 23. M=C3=A4rz 2023 um 09:03:45 UTC+1:
You can code a simple implementation that uses only minimal features,=20
and you can code it up in Forth, and get a much slimmer result and=20
then claim "Forth simplicity", but in that case the slimness is not=20
due to Forth, but due to preferencing slimness over efficiency.
Due to caching effects on CPUs or being able to make better usage of regist= >ers,
very often the slimmer code wins the race despite being mathematically less=
efficient.
And it saves a lot of debugging time. ;-)
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 300 |
Nodes: | 16 (2 / 14) |
Uptime: | 47:34:08 |
Calls: | 6,710 |
Calls today: | 3 |
Files: | 12,243 |
Messages: | 5,354,494 |
Posted today: | 1 |