Forum: >>> Magnum BBS <<<

Forth systems where do/?do pushes that loop start address

From Anton Ertl@21:1/5 to All on Mon Mar 4 17:24:09 2024

Many years ago I read here about Forth systems where DO and ?DO
push three items on the return stack: the two values from the data
stack (initial index and limit) like many other Forth systems, but in
addition they also push the address that LOOP/+LOOP later jumps to.

I used to consider this to be inefficient, but it turns out that in an
efficient interpreter-based Forth system like, say, gforth-fast from
2022 it would actually be more efficient than compiling that address
with the (LOOP)/(+LOOP) and loading it from there.

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
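
The two schemes contrasted in the opening post can be sketched with a toy VM. This is only an illustration under stated assumptions: the opcode names (DO_PLAIN, DO_PUSHIP, LOOP_INLINE, LOOP_RSTACK, ADD_I), the return-stack layout, and the run() function are all hypothetical and not taken from Gforth or any other system. In the first program the loop start address is compiled as an operand after the loop instruction; in the second, DO pushes it at run time and the loop instruction has no operand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for a toy VM; not taken from any real Forth system.
enum Op : std::int64_t {
    LIT,          // push the following cell onto the data stack
    DO_PLAIN,     // ( limit index -- )  R: ( -- limit index )
    DO_PUSHIP,    // ( limit index -- )  R: ( -- start limit index )
    LOOP_INLINE,  // loop start address is stored in the following cell
    LOOP_RSTACK,  // loop start address is read from the return stack
    ADD_I,        // add the current loop index to the top of the data stack
    HALT          // stop and return the top of the data stack
};

using Code = std::vector<std::int64_t>;

std::int64_t run(const Code& code) {
    std::vector<std::int64_t> ds, rs;  // data stack, return stack
    std::size_t ip = 0;                // virtual instruction pointer
    for (;;) {
        const std::int64_t op = code[ip++];
        switch (op) {
        case LIT:
            ds.push_back(code[ip++]);
            break;
        case DO_PLAIN:
        case DO_PUSHIP: {
            const std::int64_t index = ds.back(); ds.pop_back();
            const std::int64_t limit = ds.back(); ds.pop_back();
            if (op == DO_PUSHIP)  // loop start = the cell right after DO
                rs.push_back(static_cast<std::int64_t>(ip));
            rs.push_back(limit);
            rs.push_back(index);
            break;
        }
        case LOOP_INLINE:
        case LOOP_RSTACK: {
            const std::size_t n = rs.size();
            const bool on_rstack = (op == LOOP_RSTACK);
            if (++rs[n - 1] < rs[n - 2]) {
                // Continue: take the target from the return stack or from the code.
                ip = static_cast<std::size_t>(on_rstack ? rs[n - 3] : code[ip]);
            } else {
                rs.resize(n - (on_rstack ? 3 : 2));  // drop the loop parameters
                if (!on_rstack) ++ip;                // skip the inline operand
            }
            break;
        }
        case ADD_I:
            ds.back() += rs.back();
            break;
        case HALT:
            return ds.back();
        default:
            assert(false && "unknown opcode");
            return 0;
        }
    }
}

// 0  10 0 DO  I +  LOOP  with the loop start (cell 7) compiled after LOOP_INLINE.
Code inline_program() {
    return {LIT, 0, LIT, 10, LIT, 0, DO_PLAIN, ADD_I, LOOP_INLINE, 7, HALT};
}

// Same loop; DO_PUSHIP records the loop start, so LOOP_RSTACK needs no operand.
Code rstack_program() {
    return {LIT, 0, LIT, 10, LIT, 0, DO_PUSHIP, ADD_I, LOOP_RSTACK, HALT};
}
```

Both programs compute the same sum; the second is one cell shorter and never reads a branch target through the instruction pointer. The toy says nothing about speed, it only shows where the address lives in each scheme.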

From Krishna Myneni@21:1/5 to Anton Ertl on Mon Mar 4 22:54:50 2024

On 3/4/24 11:24, Anton Ertl wrote:

Many years ago I have read here about Forth systems where DO and ?DO
push three items on the return stack: the two values from the data
stack (initial index and limit) like many other Forth systems, but in
addition they also push the address that LOOP/+LOOP later jumps to.

I used to consider this to be inefficient, but it turns out that in an
efficient interpreter-based Forth system like, say gforth-fast from
2022 it would actually be more efficient than compiling that address
with the (LOOP)/(+LOOP) and loading it from there.

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

- anton

Yes, kForth uses this method. DO pushes three items onto the return
stack, the two loop parameters, and the virtual instruction pointer.

\ From ForthVM.cpp

int CPP_do ()
{
  // stack: ( -- | generate opcodes for beginning of loop structure )

  pCurrentOps->push_back(OP_PUSH);
  pCurrentOps->push_back(OP_PUSH);
  pCurrentOps->push_back(OP_PUSHIP);

  dostack.push(pCurrentOps->size());
  return 0;
}

--
Krishna


From Krishna Myneni@21:1/5 to Krishna Myneni on Mon Mar 4 22:57:09 2024

On 3/4/24 22:54, Krishna Myneni wrote:

On 3/4/24 11:24, Anton Ertl wrote:

Many years ago I have read here about Forth systems where DO and ?DO
push three items on the return stack: the two values from the data
stack (initial index and limit) like many other Forth systems, but in
addition they also push the address that LOOP/+LOOP later jumps to.

I used to consider this to be inefficient, but it turns out that in an
efficient interpreter-based Forth system like, say gforth-fast from
2022 it would actually be more efficient than compiling that address
with the (LOOP)/(+LOOP) and loading it from there.

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

- anton

Yes, kForth uses this method. DO pushes three items onto the return
stack, the two loop parameters, and the virtual instruction pointer.

\ From ForthVM.cpp

int CPP_do ()
{
  // stack: ( -- | generate opcodes for beginning of loop structure )

  pCurrentOps->push_back(OP_PUSH);
  pCurrentOps->push_back(OP_PUSH);
  pCurrentOps->push_back(OP_PUSHIP);

  dostack.push(pCurrentOps->size());
  return 0;
}

To be clear, DO compiles three VM instructions to push the items onto
the return stack.

--
Krishna


From albert@spenarnc.xs4all.nl@21:1/5 to Anton Ertl on Tue Mar 5 09:36:39 2024

In article <2024Mar4.182409@mips.complang.tuwien.ac.at>
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

Many years ago I have read here about Forth systems where DO and ?DO
push three items on the return stack: the two values from the data
stack (initial index and limit) like many other Forth systems, but in
addition they also push the address that LOOP/+LOOP later jumps to.

I used to consider this to be inefficient, but it turns out that in an
efficient interpreter-based Forth system like, say gforth-fast from
2022 it would actually be more efficient than compiling that address
with the (LOOP)/(+LOOP) and loading it from there.

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

All the versions of ciforth MS/Linux/OSX 32/64 ARM/86 do this.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023

--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat purring. - the Wise from Antrim -


From mhx@21:1/5 to Anton Ertl on Tue Mar 5 09:32:34 2024

Anton Ertl wrote:
[..]

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

It is not 100% clear what you mean.
In iForth I do something special with both DO and LOOP , where the
LOOP action is probably closest to your question.

FORTH> : test 22 10 2 do 1+ loop . ; ok
FORTH> see test
Flags: ANSI
$01340A00 : test
$01340A0A push #22 b#
$01340A0C mov rcx, #10 d#
$01340A13 mov rbx, 2 d#
$01340A1A call (DO) offset NEAR
$01340A24 lea rax, [rax 0 +] qword
$01340A28 lea rbx, [rbx 1 +] qword
$01340A2C add [rbp 0 +] qword, 1 b#
$01340A31 add [rbp 8 +] qword, 1 b#
$01340A36 jno $01340A28 offset NEAR
$01340A3C add rbp, #24 b#
$01340A40 push rbx
$01340A41 jmp .+10 ( $0124A102 ) offset NEAR

Or, without SYMBOLIC disguising the (DO) machine code:

FORTH> false TO symbolic ok
FORTH> $01340A1A idis
$01340A1A call $012413F8 offset NEAR
$01340A1F jmp $01340A3C offset NEAR
$01340A24 lea rax, [rax 0 +] qword
$01340A28 lea rbx, [rbx 1 +] qword
$01340A2C add [rbp 0 +] qword, 1 b#
$01340A31 add [rbp 8 +] qword, 1 b#
$01340A36 jno $01340A28 offset NEAR
$01340A3C add rbp, #24 b#
$01340A40 push rbx
$01340A41 jmp $0124A102 offset NEAR

-marcel


From Stephen Pelc@21:1/5 to All on Tue Mar 5 11:18:31 2024

On 4 Mar 2024 at 18:24:09 CET, "Anton Ertl" <Anton Ertl> wrote:

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

- anton

VFX since the beginning.

Stephen

--
Stephen Pelc, stephen@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads


From Anton Ertl@21:1/5 to Krishna Myneni on Tue Mar 5 11:38:11 2024

Krishna Myneni <krishna.myneni@ccreweb.org> writes:

Yes, kForth uses this method. DO pushes three items onto the return
stack, the two loop parameters, and the virtual instruction pointer.

Thanks. You can find the performance benefit from that in gforth-fast
in the right bar of each benchmark
<http://www.complang.tuwien.ac.at/anton/tmp/select-uarch.eps>. It
provides pretty good speedups for siev, bubble, and matrix, and small
speedups in sha512.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023


From Anton Ertl@21:1/5 to albert@spenarnc.xs4all.nl on Tue Mar 5 11:58:37 2024

albert@spenarnc.xs4all.nl writes:

In article <2024Mar4.182409@mips.complang.tuwien.ac.at>

My question is: Which Forth systems have a DO/?DO that pushes the
address [at run-time to the return stack] that LOOP/+LOOP then jumps to?

All the versions of ciforth MS/Linux/OSX 32/64 ARM/86 do this.

Thanks. AFAIK you started with fig-Forth that puts the loop-back
address in the interpreted code. Why did you change this approach?

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023


From Anton Ertl@21:1/5 to mhx on Tue Mar 5 11:18:36 2024

mhx@iae.nl (mhx) writes:

Anton Ertl wrote:
[..]

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

It is not 100% clear what you mean.
In iForth I do something special with both DO and LOOP , where the
LOOP action is probably closest to your question.

FORTH> : test 22 10 2 do 1+ loop . ; ok
FORTH> see test
Flags: ANSI
$01340A00 : test
$01340A0A push #22 b#
$01340A0C mov rcx, #10 d#
$01340A13 mov rbx, 2 d#
$01340A1A call (DO) offset NEAR
$01340A24 lea rax, [rax 0 +] qword
$01340A28 lea rbx, [rbx 1 +] qword
$01340A2C add [rbp 0 +] qword, 1 b#
$01340A31 add [rbp 8 +] qword, 1 b#
$01340A36 jno $01340A28 offset NEAR
$01340A3C add rbp, #24 b#
$01340A40 push rbx
$01340A41 jmp .+10 ( $0124A102 ) offset NEAR

Native-code systems generally use direct (conditional) jumps to the
loop start, like iforth does here with the jno.

What I meant is that some (interpreter-based) systems keep the loop
start address ($01340A28 in this example) on the return stack, and
LOOP/+LOOP takes it from there and then performs a (VM-level) jump
there (unless the loop is exited).

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023
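
The add/jno pair in the listing ends the loop on signed overflow instead of on a compare against the limit. One well-known way to get that behaviour is to keep a biased counter, (index - limit) + MIN, which overflows exactly when the index reaches the limit. The sketch below shows the arithmetic; whether iForth biases its counters exactly this way is an assumption, and the function names add_overflows and count_iterations are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Adds a and b with wraparound, stores the result in sum, and reports whether
// the signed addition overflowed (the condition x86 exposes to JO/JNO).
bool add_overflows(std::int64_t a, std::int64_t b, std::int64_t& sum) {
    // Unsigned arithmetic wraps by definition; the cast back to signed is
    // modular on mainstream compilers and guaranteed to be so since C++20.
    const std::uint64_t u =
        static_cast<std::uint64_t>(a) + static_cast<std::uint64_t>(b);
    sum = static_cast<std::int64_t>(u);
    // Overflow: both operands share a sign and the result has the other sign.
    return ((a ^ sum) & (b ^ sum)) < 0;
}

// Counts the iterations of "limit start DO ... LOOP" using only an overflow test.
std::int64_t count_iterations(std::int64_t limit, std::int64_t start) {
    assert(start < limit);  // the sketch only covers the ordinary upward case
    const std::int64_t min = std::numeric_limits<std::int64_t>::min();
    // biased = (start - limit) + MIN, computed with wraparound.
    std::int64_t biased = static_cast<std::int64_t>(
        static_cast<std::uint64_t>(start) - static_cast<std::uint64_t>(limit) +
        static_cast<std::uint64_t>(min));
    std::int64_t iterations = 0;
    for (;;) {
        ++iterations;                                  // stands in for the loop body
        if (add_overflows(biased, 1, biased)) break;   // overflow: leave the loop
    }
    return iterations;
}
```

With limit 10 and start 2 the biased counter begins 7 below the largest signed value, so the eighth increment overflows and the body runs 8 times, matching the indices 2 through 9.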


From albert@spenarnc.xs4all.nl@21:1/5 to Anton Ertl on Tue Mar 5 14:19:05 2024

In article <2024Mar5.125837@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:

albert@spenarnc.xs4all.nl writes:

In article <2024Mar4.182409@mips.complang.tuwien.ac.at>

My question is: Which Forth systems have a DO/?DO that pushes the
address [at run-time to the return stack] that LOOP/+LOOP then jumps to?

All the versions of ciforth MS/Linux/OSX 32/64 ARM/86 do this.

Thanks. AFAIK you started with fig-Forth that puts the loop-back
address in the interpreted code. Why did you change this approach?

The address that I push is the address after the loop.
So LEAVE as well as LOOP discards only loop parameters and go NEXT.
(DO) is followed by a (FORWARD half jump, it doesn't jump over the
body but is resolved by a FORWARD) , so it knows what
address to push.
If I remember correctly the original FIG LEAVE was not ISO, so this
had to be fixed anyway. LEAVE and UNLOOP are almost synonyms.
Simple manipulation of the return stack is preferred in view of my
optimiser, which can push return stack items into oblivion (registers).

DO LOOP in FIG / ISO say FORTH is a mess anyway. The idea that
signed/unsigned numbers can be handled uniformly was cute at the
time, when you could not spare 10 bytes. In the 50 years no novice
even dared to try negative indices or negative increments.

- anton

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat purring. - the Wise from Antrim -


From albert@spenarnc.xs4all.nl@21:1/5 to albert@spenarnc.xs4all.nl on Tue Mar 5 20:18:39 2024

In article <nnd$091faf4b$281f1a47@59a4330bdcfeaef0>,
<albert@spenarnc.xs4all.nl> wrote:

In article <2024Mar5.125837@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
albert@spenarnc.xs4all.nl writes:

In article <2024Mar4.182409@mips.complang.tuwien.ac.at>

My question is: Which Forth systems have a DO/?DO that pushes the
address [at run-time to the return stack] that LOOP/+LOOP then jumps to?

All the versions of ciforth MS/Linux/OSX 32/64 ARM/86 do this.

Thanks. AFAIK you started with fig-Forth that puts the loop-back
address in the interpreted code. Why did you change this approach?

The address that I push is the address after the loop.
So LEAVE as well as LOOP discards only loop parameters and go NEXT.
(DO) is followed by a (FORWARD half jump, it doesn't jump over the
body but is resolved by a FORWARD) , so it knows what
address to push.
If I remember correctly the original FIG LEAVE was not ISO, so this
had to be fixed anyway. LEAVE and UNLOOP are almost synonyms.
Simple manipulation of the return stack are preferred in view of my
optimiser that can push return stack items into oblivion (registers).

DO LOOP in FIG / ISO say FORTH is a mess anyway. The idea that
signed/unsigned numbers can be handled uniformly was cute at the
time, when you could not spare 10 bytes. In the 50 years no novice
even dared to try negative indices or negative increments.

- anton

Groetjes Albert

I looked at your original post again. Actually this is different.
+LOOP does a branch back. The address pushed on the return stack
is the address past the loop.

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat purring. - the Wise from Antrim -
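
The variant Albert describes, where DO pushes the address past the loop and LEAVE simply discards the loop parameters and continues there, can be sketched in the same toy-VM style. Everything here is hypothetical: the opcode names (DO_EXIT, LEAVE_AT, LOOP_BACK, INC), the return-stack layout, and the composite LEAVE_AT instruction, which stands in for "I k = IF LEAVE THEN" to keep the example short. It is not ciforth's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes; the layout R: ( -- exit limit index ) is an assumption.
enum Op : std::int64_t {
    LIT,        // push the following cell onto the data stack
    DO_EXIT,    // operand: address just past the loop, pushed on the return stack
    LEAVE_AT,   // operand k: perform LEAVE when the loop index equals k
    INC,        // add 1 to the top of the data stack
    LOOP_BACK,  // operand: loop start address, used for the backward branch
    HALT        // stop and return the top of the data stack
};

using Code = std::vector<std::int64_t>;

std::int64_t run(const Code& code) {
    std::vector<std::int64_t> ds, rs;  // data stack, return stack
    std::size_t ip = 0;                // virtual instruction pointer
    for (;;) {
        switch (code[ip++]) {
        case LIT:
            ds.push_back(code[ip++]);
            break;
        case DO_EXIT: {
            const std::int64_t exit_addr = code[ip++];
            const std::int64_t index = ds.back(); ds.pop_back();
            const std::int64_t limit = ds.back(); ds.pop_back();
            rs.push_back(exit_addr);
            rs.push_back(limit);
            rs.push_back(index);
            break;
        }
        case LEAVE_AT: {
            const std::int64_t k = code[ip++];
            if (rs.back() == k) {
                rs.pop_back();                             // discard index
                rs.pop_back();                             // discard limit
                ip = static_cast<std::size_t>(rs.back());  // go to the exit address
                rs.pop_back();
            }
            break;
        }
        case INC:
            ++ds.back();
            break;
        case LOOP_BACK: {
            const std::int64_t start = code[ip++];
            const std::size_t n = rs.size();
            if (++rs[n - 1] < rs[n - 2])
                ip = static_cast<std::size_t>(start);  // branch back
            else
                rs.resize(n - 3);  // drop index, limit and exit address; fall through
            break;
        }
        case HALT:
            return ds.back();
        default:
            assert(false && "unknown opcode");
            return 0;
        }
    }
}

// 0  limit 0 DO  I leave_at = IF LEAVE THEN  1+  LOOP
// Cell 8 is the loop start, cell 13 (HALT) is the address past the loop.
Code leave_demo(std::int64_t limit, std::int64_t leave_at) {
    return {LIT, 0, LIT, limit, LIT, 0, DO_EXIT, 13,
            LEAVE_AT, leave_at, INC, LOOP_BACK, 8, HALT};
}
```

LEAVE needs no compile-time patching here because the exit address is already on the return stack, which is also why a BREAK-style word is easy to add in such a design.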


From minforth@21:1/5 to Ruvim on Wed Mar 6 14:15:25 2024

Ruvim wrote:

In SP-Forth v3 and v4 (they generate native code), "DO" pushes three
items on the return stack, and among them the address that "LEAVE" then
jumps to.

That would make implementing BREAK and CONTINUE rather easy...


From Stephen Pelc@21:1/5 to dxf on Wed Mar 6 19:52:30 2024

On 6 Mar 2024 at 00:42:41 CET, "dxf" <dxforth@gmail.com> wrote:

On 5/03/2024 10:18 pm, Stephen Pelc wrote:

On 4 Mar 2024 at 18:24:09 CET, "Anton Ertl" <Anton Ertl> wrote:

My question is: Which Forth systems have a DO/?DO that pushes the
address that LOOP/+LOOP then jumps to?

- anton

VFX since the beginning.

AFAICS the loop jump addr is hard-coded (JNO) as that was generally
seen as most efficient:

: test 10 0 do loop ; ok
see test
TEST
( 005945D0 488D6DF0 ) LEA RBP, [RBP+-10]
( 005945D4 48C745000A000000 ) MOV QWord [RBP], # 0000000A
( 005945DC 48895D08 ) MOV [RBP+08], RBX
( 005945E0 BB00000000 ) MOV EBX, # 00000000
( 005945E5 E86615E9FFFF45590000000 CALL 00425B50 (DO) 00000000005945FF
( 005945F2 49FFC6 ) INC R14
( 005945F5 49FFC7 ) INC R15
( 005945F8 71F8 ) JNO 005945F2
( 005945FA 415E ) POP R14
( 005945FC 415F ) POP R15
( 005945FE 58 ) POP RAX
( 005945FF C3 ) RET/NEXT
( 48 bytes, 12 instructions )

The three items pushed are:
  loop exit address
  limit data of previous loop
  index data of previous loop

The slightly odd list of items allows us to keep the index/limit data in
registers.

Stephen
--
Stephen Pelc, stephen@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads
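
The idea of saving the previous loop's index and limit, so that the current loop's values can stay in registers, can be sketched as follows. This is a minimal model under assumptions, not VFX's implementation: LoopRegs stands in for two CPU registers, and the Machine type, its member names and the dummy exit addresses are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the two registers that hold the current loop's parameters.
struct LoopRegs {
    std::int64_t index;
    std::int64_t limit;
};

struct Machine {
    LoopRegs regs{0, 0};               // current loop, kept "in registers"
    std::vector<std::int64_t> rstack;  // return stack

    // DO: push the exit address and the enclosing loop's limit and index,
    // then load the new loop's parameters into the registers.
    void do_(std::int64_t limit, std::int64_t index, std::int64_t exit_addr) {
        rstack.push_back(exit_addr);
        rstack.push_back(regs.limit);
        rstack.push_back(regs.index);
        regs.index = index;
        regs.limit = limit;
    }

    // LOOP: bump the index register; true means branch back to the loop start.
    bool loop() { return ++regs.index < regs.limit; }

    // Loop exit: restore the enclosing loop's registers, return the exit address.
    std::int64_t unloop() {
        assert(rstack.size() >= 3);
        regs.index = rstack.back(); rstack.pop_back();
        regs.limit = rstack.back(); rstack.pop_back();
        const std::int64_t exit_addr = rstack.back(); rstack.pop_back();
        return exit_addr;
    }
};

// 0  3 0 DO  4 0 DO  I J * +  LOOP  LOOP
std::int64_t nested_sum() {
    Machine m;
    std::int64_t sum = 0;
    m.do_(3, 0, /*exit_addr=*/100);  // outer loop; the addresses are dummy values
    do {
        m.do_(4, 0, /*exit_addr=*/200);  // inner DO saves the outer index/limit
        do {
            const std::int64_t i = m.regs.index;     // I: read from a register
            const std::int64_t j = m.rstack.back();  // J: outer index saved by DO
            sum += i * j;
        } while (m.loop());
        m.unloop();  // outer index/limit are back in the registers
    } while (m.loop());
    m.unloop();
    return sum;
}
```

In this model the loop body never touches memory for I, and leaving a loop is three pops, which resembles the POP R14 / POP R15 / POP RAX sequence at the end of the listing; which register holds which value in VFX is not something the sketch claims to know.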


From Anton Ertl@21:1/5 to Krishna Myneni on Sat Mar 9 18:26:23 2024

Krishna Myneni <krishna.myneni@ccreweb.org> writes:

Yes, kForth uses this method. DO pushes three items onto the return
stack, the two loop parameters, and the virtual instruction pointer.

\ From ForthVM.cpp

int CPP_do ()
{
  // stack: ( -- | generate opcodes for beginning of loop structure )

  pCurrentOps->push_back(OP_PUSH);
  pCurrentOps->push_back(OP_PUSH);
  pCurrentOps->push_back(OP_PUSHIP);

  dostack.push(pCurrentOps->size());
  return 0;
}

Thanks. Why do you do it this way? Do you want to break dependence
chains on the virtual instruction pointer (the reason for the speedup
in my results)?

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023


From Anton Ertl@21:1/5 to albert@spenarnc.xs4all.nl on Sat Mar 9 18:28:45 2024

albert@spenarnc.xs4all.nl writes:

DO LOOP in FIG / ISO say FORTH is a mess anyway. The idea that
signed/unsigned numbers can be handled uniformly was cute at the
time, when you could not spare 10 bytes. In the 50 years no novice
even dared to try negative indices or negative increments.

LOOP is fine. +LOOP with negative increment is more problematic
(that's why Gforth has -LOOP), but it turns out that for running
backwards through an array, +LOOP with negative increment actually
works out ok. But Gforth now has MEM-DO..LOOP so you don't need to
worry about that.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023: https://euro.theforth.net/2023
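
The standard's rule for +LOOP is that the loop ends when the index crosses the boundary between limit-1 and limit. With a negative increment this means the limit itself is still visited, which is the asymmetry under discussion and also why stepping backwards through an array works out. The sketch below uses one common formulation of the crossing test; the function names are made up, and it is not presented as any particular system's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True when stepping from index to index+n crosses the boundary between
// limit-1 and limit, i.e. when +LOOP terminates the loop.
bool plus_loop_exits(std::int64_t index, std::int64_t limit, std::int64_t n) {
    // Wraparound arithmetic is done in unsigned types; the casts back to
    // signed are modular on mainstream compilers (guaranteed since C++20).
    const std::int64_t x = static_cast<std::int64_t>(
        static_cast<std::uint64_t>(index) - static_cast<std::uint64_t>(limit));
    const std::int64_t y = static_cast<std::int64_t>(
        static_cast<std::uint64_t>(x) + static_cast<std::uint64_t>(n));
    // x and y differ in sign either because the step crossed the boundary
    // or because the addition overflowed. Overflow needs x and n to share a
    // sign, so requiring opposite signs keeps only the boundary crossing.
    return ((x ^ y) & (x ^ n)) < 0;
}

// Lists the indices seen by "limit start DO ... n +LOOP".
std::vector<std::int64_t> plus_loop_indices(std::int64_t limit,
                                            std::int64_t start,
                                            std::int64_t n) {
    assert(n != 0);  // a zero increment would never terminate in this sketch
    std::vector<std::int64_t> seen;
    std::int64_t index = start;
    for (;;) {
        seen.push_back(index);  // stands in for the loop body
        if (plus_loop_exits(index, limit, n)) break;
        index += n;
    }
    return seen;
}
```

Counting up from 0 with limit 10 visits 0 through 9, while counting down from 3 with limit 0 visits 3, 2, 1 and 0, so a backwards array walk uses the first element's position as the limit and the last element's position as the start.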
