What would all those SET-OPTIMIZERs do to compile "2 CELLS ALLOT"
to "add 8 to [address-of-dp]" ?
For those who have not followed recent discussions: If you define a
word FOO and then invoke SET-OPTIMIZER ( xt -- ), then COMPILE,ing FOO
will invoke the xt; i.e. SET-OPTIMIZER changes what COMPILE, executes.
The only correct way to use SET-OPTIMIZER is to pass an xt that does
not change the behaviour, only the implementation of the prescribed
behaviour of COMPILE,. One point of discussion has been that the
behaviour of a word is now described twice, once for EXECUTE and once
for COMPILE,; this redundancy leads to a potential for bugs, so we
discussed how to eliminate it.
While thinking about this issue, I realized that the whole issue is
quite different for different Forth systems:
* Gforth compiles all primitives with PEEPHOLE-COMPILE, (which has a
lot of machinery behind it) and uses SET-OPTIMIZER primarily for
dealing with different word types; there are 18 occurrences of
SET-OPTIMIZER in the Gforth image, and most of them are for handling different word types: 2CONSTANTs, SYNONYMs, DOES>-defined words,
FCONSTANTs, etc. And they typically create inline code instead of
CALLing the DOES>/SET-DOES> code. In addition, there are some uses
of SET-OPTIMIZER for better code in connection with earlier literals
in the code, but that's not the main point here.
* If Gforth had an automatic inliner, like VFX, an alternative would
be to define, e.g., CONSTANTs as colon definitions, e.g.,
5 constant five
would be equivalent to
: five 5 ;
With the automatic inliner, SET-OPTIMIZER would not be necessary to
generate inlined code for constants. Problem solved! Well, except
that constants as inlinable colon definitions could be quite large:
There is the native code (typically not large), but there is
typically also some intermediate representation that is used for
inlining. Depending on how you balance size against the costs of
adding SET-OPTIMIZER calls, you might still prefer to go for the
latter.
There would still be the optimizations involving constants, which
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts
redundancy from one place to another. In the end you have to live
with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same
semantics as the default code.
* But a Forth system can also use SET-OPTIMIZER for dealing with each primitive separately. I.e., to have stuff like
: compile,-+ ( xt -- ) drop ... ( generate code for + ) ;
: + [ 0 compile,-+ ] ; ' compile,-+ set-optimizer
And of course, with 50 or more primitives, you can factor out the commonalities between them. I think that VFX works that way.
Lxf/ntf has a similar approach, but it works through the mechanism
for changing the compilation semantics, not through COMPILE,.
When having a separate COMPILE, implementation for each primitive,
the xt passed to COMPILE, does not need to be passed to the COMPILE, implementation; but you still need it for words (such as constants)
where the same COMPILE, implementation is used for the whole class.
Anton Ertl schrieb am Freitag, 25. November 2022 um 18:38:03 UTC+1:
Assume ALLOT and CELLS are no primitives but defined as
4 CONSTANT CELL
: CELLS cell * ;
VARIABLE dp
: ALLOT dp +! ;
What would all those SET-OPTIMIZERs do to compile "2 CELLS ALLOT"
to "add 8 to [address-of-dp]" ?
For those who have not followed recent discussions: If you define a
word FOO and then invoke SET-OPTIMIZER ( xt -- ), then COMPILE,ing FOO
will invoke the xt; i.e. SET-OPTIMIZER changes what COMPILE, executes.
The only correct way to use SET-OPTIMIZER is to pass an xt that does
not change the behaviour, only the implementation of the prescribed
behaviour of COMPILE,. One point of discussion has been that the
behaviour of a word is now described twice, once for EXECUTE and once
for COMPILE,; this redundancy leads to a potential for bugs, so we
discussed how to eliminate it.
While thinking about this issue, I realized that the whole issue is
quite different for different Forth systems:
* Gforth compiles all primitives with PEEPHOLE-COMPILE, (which has a
lot of machinery behind it) and uses SET-OPTIMIZER primarily for
dealing with different word types; there are 18 occurrences of
SET-OPTIMIZER in the Gforth image, and most of them are for handling
different word types: 2CONSTANTs, SYNONYMs, DOES>-defined words,
FCONSTANTs, etc. And they typically create inline code instead of
CALLing the DOES>/SET-DOES> code. In addition, there are some uses
of SET-OPTIMIZER for better code in connection with earlier literals
in the code, but that's not the main point here.
* If Gforth had an automatic inliner, like VFX, an alternative would
be to define, e.g., CONSTANTs as colon definitions, e.g.,
5 constant five
would be equivalent to
: five 5 ;
With the automatic inliner, SET-OPTIMIZER would not be necessary to
generate inlined code for constants. Problem solved! Well, except
that constants as inlinable colon definitions could be quite large:
There is the native code (typically not large), but there is
typically also some intermediate representation that is used for
inlining. Depending on how you balance size against the costs of
adding SET-OPTIMIZER calls, you might still prefer to go for the
latter.
There would still be the optimizations involving constants, which
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts
redundancy from one place to another. In the end you have to live
with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same
semantics as the default code.
* But a Forth system can also use SET-OPTIMIZER for dealing with each
primitive separately. I.e., to have stuff like
: compile,-+ ( xt -- ) drop ... ( generate code for + ) ;
: + [ 0 compile,-+ ] ; ' compile,-+ set-optimizer
And of course, with 50 or more primitives, you can factor out the
commonalities between them.
I think that VFX works that way.
Lxf/ntf has a similar approach, but it works through the mechanism
for changing the compilation semantics, not through COMPILE,.
When having a separate COMPILE, implementation for each primitive,
the xt passed to COMPILE, does not need to be passed to the COMPILE,
implementation; but you still need it for words (such as constants)
where the same COMPILE, implementation is used for the whole class.
On 2022-11-25 16:19, Anton Ertl wrote:
For those who have not followed recent discussions: If you define a
word FOO and then invoke SET-OPTIMIZER ( xt -- ), then COMPILE,ing FOO
will invoke the xt; i.e. SET-OPTIMIZER changes what COMPILE, executes.
The only correct way to use SET-OPTIMIZER is to pass an xt that does
not change the behaviour, only the implementation of the prescribed behaviour of COMPILE,. One point of discussion has been that the
behaviour of a word is now described twice, once for EXECUTE and once
for COMPILE,; this redundancy leads to a potential for bugs, so we discussed how to eliminate it.
While thinking about this issue, I realized that the whole issue is
quite different for different Forth systems:
* Gforth compiles all primitives with PEEPHOLE-COMPILE, (which has a
lot of machinery behind it) and uses SET-OPTIMIZER primarily for
dealing with different word types; there are 18 occurrences of SET-OPTIMIZER in the Gforth image, and most of them are for handling different word types: 2CONSTANTs, SYNONYMs, DOES>-defined words, FCONSTANTs, etc. And they typically create inline code instead of
CALLing the DOES>/SET-DOES> code. In addition, there are some uses
of SET-OPTIMIZER for better code in connection with earlier literals
in the code, but that's not the main point here.
* If Gforth had an automatic inliner, like VFX, an alternative would
be to define, e.g., CONSTANTs as colon definitions, e.g.,
5 constant five
would be equivalent to
: five 5 ;
With the automatic inliner, SET-OPTIMIZER would not be necessary to generate inlined code for constants. Problem solved! Well, except
that constants as inlinable colon definitions could be quite large:
There is the native code (typically not large), but there is
typically also some intermediate representation that is used for
inlining. Depending on how you balance size against the costs of
adding SET-OPTIMIZER calls, you might still prefer to go for the
latter.
There would still be the optimizations involving constants, whichMy initial point was against redundancy on the source code level,
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts redundancy from one place to another. In the end you have to live
with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same semantics as the default code.
especially in user code.
If a system employs inlining, it can use it even for "does>" parts, and
then "set-optimizer" is not needed at all.
So, as I can see, it's a too system-specific tool, which should not be
used in standard programs.
* But a Forth system can also use SET-OPTIMIZER for dealing with each primitive separately. I.e., to have stuff like
: compile,-+ ( xt -- ) drop ... ( generate code for + ) ;
: + [ 0 compile,-+ ] ; ' compile,-+ set-optimizer
And of course, with 50 or more primitives, you can factor out the commonalities between them.
I think that VFX works that way.Like it was in cmForth. And A.6.2.0945 says about that too:
Lxf/ntf has a similar approach, but it works through the mechanism
for changing the compilation semantics, not through COMPILE,.
| It is not always possible to obtain the compilation token
| from the execution token. In these implementations,
| "COMPILE," might not generate code that is as efficient
| as normally compiled code.
So it generates better code when the word is encountered by the Forth
text interpreter, than when "compile," is applied to the xt of the word
(when it's allowed).
When having a separate COMPILE, implementation for each primitive,I would prefer to have different setters for these cases, and don't pass
the xt passed to COMPILE, does not need to be passed to the COMPILE, implementation; but you still need it for words (such as constants)
where the same COMPILE, implementation is used for the whole class.
xt when it's not needed.
--
Ruvim
"minf...@arcor.de" <minforth@arcor.de> writes:
Anton Ertl schrieb am Freitag, 25. November 2022 um 18:38:03 UTC+1:
Assume ALLOT and CELLS are no primitives but defined as
4 CONSTANT CELL
: CELLS cell * ;
VARIABLE dp
: ALLOT dp +! ;
What would all those SET-OPTIMIZERs do to compile "2 CELLS ALLOT"
to "add 8 to [address-of-dp]" ?
With these definitions, it does not work in Gforth, because Gforth
does not inline. If you add two inlinings manually (a practice I
don't recommend, but for the sake of the example):
4 CONSTANT CELL
: inline-cells ( xt -- ) drop ]] cell * [[ ;
: CELLS cell * ; ' inline-cells set-optimizer
VARIABLE dp
: inline-allot ( xt -- ) drop ]] dp +! [[ ;
: ALLOT dp +! ; ' inline-allot set-optimizer
\ and now the test case
: foo 2 cells allot ;
Now SEE FOO outputs:
: foo #8 dp +! ;
This works as follows:
"2" pushes 2 on the literal stack (this is for the constant folding optimization invented by Matthias Koch).
"Cells" compiles CELL (which pushes 4 on the literal stack) and *; The optimizer for * sees two values on the literal stack, multiplies them
to 8, and pushes that on the literal stack.
"allot" compiles DP, and the COMPILE, for the variable is to push the
address of DP as literal, which pushes the address on the literal
stack. And it also compiles +!; before compiling a primitive, all two literals are popped from the literal stack and realized as primitive
LIT followed by the value; this produces the #8 and DP literals. Then
the +! is compiled.
On 2022-11-25 22:29, Anton Ertl wrote:
"minf...@arcor.de" <minf...@arcor.de> writes:
Anton Ertl schrieb am Freitag, 25. November 2022 um 18:38:03 UTC+1:
Assume ALLOT and CELLS are no primitives but defined as
4 CONSTANT CELL
: CELLS cell * ;
VARIABLE dp
: ALLOT dp +! ;
What would all those SET-OPTIMIZERs do to compile "2 CELLS ALLOT"
to "add 8 to [address-of-dp]" ?
With these definitions, it does not work in Gforth, because Gforth
does not inline. If you add two inlinings manually (a practice I
don't recommend, but for the sake of the example):
4 CONSTANT CELL
: inline-cells ( xt -- ) drop ]] cell * [[ ;
: CELLS cell * ; ' inline-cells set-optimizer
VARIABLE dp
: inline-allot ( xt -- ) drop ]] dp +! [[ ;
: ALLOT dp +! ; ' inline-allot set-optimizer
\ and now the test case
: foo 2 cells allot ;
Now SEE FOO outputs:
: foo #8 dp +! ;
This works as follows:
"2" pushes 2 on the literal stack (this is for the constant folding optimization invented by Matthias Koch).
"Cells" compiles CELL (which pushes 4 on the literal stack) and *; The optimizer for * sees two values on the literal stack, multiplies them
to 8, and pushes that on the literal stack.
"allot" compiles DP, and the COMPILE, for the variable is to push the address of DP as literal, which pushes the address on the literalIt seems the literal stack is not needed if an intermediate
stack. And it also compiles +!; before compiling a primitive, all two literals are popped from the literal stack and realized as primitive
LIT followed by the value; this produces the #8 and DP literals. Then
the +! is compiled.
representation (e.g., an abstract semantic graph) is used for a whole definition for optimization purposes.
Then a term rewrite system (term graph rewriting) can be employed to
make optimization and code generation.
Was such an approach used in Forth systems?
On 2022-11-25 16:19, Anton Ertl wrote:
There would still be the optimizations involving constants, which
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts
redundancy from one place to another. In the end you have to live
with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same
semantics as the default code.
My initial point was against redundancy on the source code level,
especially in user code.
If a system employs inlining, it can use it even for "does>" parts, and
then "set-optimizer" is not needed at all.
So, as I can see, it's a too system-specific tool, which should not be
used in standard programs.
When having a separate COMPILE, implementation for each primitive,
the xt passed to COMPILE, does not need to be passed to the COMPILE,
implementation; but you still need it for words (such as constants)
where the same COMPILE, implementation is used for the whole class.
I would prefer to have different setters for these cases, and don't pass
xt when it's not needed.
Ruvim <ruvim.pinka@gmail.com> writes:
On 2022-11-25 16:19, Anton Ertl wrote:
There would still be the optimizations involving constants, which
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts
redundancy from one place to another. In the end you have to live
with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same
semantics as the default code.
My initial point was against redundancy on the source code level,
Such optimizations have to be expressed at the source code level. As
a simple example, take a look at
:noname ( xt -- )
lits# 1 u>= if
lits> case
0 of postpone dup drop exit endof
1 of postpone over drop exit endof
[defined] fourth [if]
2 of postpone third drop exit endof
3 of postpone fourth drop exit endof
[then]
dup >lits
endcase
then
peephole-compile, ;
optimizes pick
If you don't have this, PICK will just PEEPHOLE-COMPILE,. But with
this, it will compile
0 PICK into DUP
1 PICK into OVER
2 PICK into THIRD
3 PICK into FOURTH
and only other cases as before; and there you have a path with no
lits, and a path with 1 or more lits. Each of the various paths
through this word except the last is redundant, and each can
potentially be buggy.
As a possible option, such rules (their left parts) can be compiled into
a prefix tree ("trie") to efficiently match the compiled code (right on
the fly, or after the definition is compiled into intermediate code).
What also is required for such rules is a kind of templates. Since
sometimes we need to match any number, or any word of a given class.
A huge advantages is that a significant part of such rules is system independent. They can be reused in different Forth systems.
On 2022-11-28 18:03, Anton Ertl wrote:
Ruvim <ruvim.pinka@gmail.com> writes:
On 2022-11-25 16:19, Anton Ertl wrote:
There would still be the optimizations involving constants, which
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts >>>> redundancy from one place to another. In the end you have to live >>>> with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same >>>> semantics as the default code.
My initial point was against redundancy on the source code level,
Such optimizations have to be expressed at the source code level. As
a simple example, take a look at
:noname ( xt -- )
lits# 1 u>= if
lits> case
0 of postpone dup drop exit endof
1 of postpone over drop exit endof
[defined] fourth [if]
2 of postpone third drop exit endof
3 of postpone fourth drop exit endof
[then]
dup >lits
endcase
then
peephole-compile, ;
optimizes pick
If you don't have this, PICK will just PEEPHOLE-COMPILE,. But with
this, it will compile
0 PICK into DUP
1 PICK into OVER
2 PICK into THIRD
3 PICK into FOURTH
and only other cases as before; and there you have a path with no
lits, and a path with 1 or more lits. Each of the various paths
through this word except the last is redundant, and each can
potentially be buggy.
It's a good example.
I initially considered the cases when one fragment of code can be >*automatically* generated from another fragment of code, and both are >manually defined in source code. Hence, one of them is excessive.
But this case is different.
The first part can be considered as redundant only semantically, since
it produces semantically the same code as the second part >("peephole-compile,").
But the first part cannot be automatically created from the second part
or from the initial definition itself.
Now this optimizer does not require xt at all, and its first part
becomes simpler (if xt is not passed to it).
And yet another point.
This optimizer can be expressed in a more concise form as a set of >independent term rewrite rules. Something like:
:opt 0 pick =>= dup ;
:opt 1 pick =>= over ;
[defined] fourth [if]
:opt 2 pick =>= third ;
:opt 3 pick =>= fourth ;
[then]
As a possible option, such rules (their left parts) can be compiled into
a prefix tree ("trie") to efficiently match the compiled code (right on
the fly, or after the definition is compiled into intermediate code).
What also is required for such rules is a kind of templates. Since
sometimes we need to match any number, or any word of a given class.
A huge advantages is that a significant part of such rules is system >independent. They can be reused in different Forth systems.
Ruvim <ruvim.pinka@gmail.com> writes:
On 2022-11-25 16:19, Anton Ertl wrote:
There would still be the optimizations involving constants, which
could be solved by making the deeper levels of the code generator
(below PEEPHOLE-COMPILE,) aware of constants, but that just shifts
redundancy from one place to another. In the end you have to live
with the fact that any optimization that is optional introduces
redundancy, and the optional generated code needs to have the same
semantics as the default code.
My initial point was against redundancy on the source code level,
especially in user code.
What is user code? Looking at recent talks by Nick Nelson, in Forth
there is not necessarily a difference between a user and a system implementor.
If a system employs inlining, it can use it even for "does>" parts, and
then "set-optimizer" is not needed at all.
CREATE-DOES> has the problem that the data is in memory, and can be
changed through >BODY.
Consider the code
: constant ( n "name" -- )
create ,
does> ( -- n)
@ ;
5 constant five
: foo five ;
Even the best compiler would have to compile FOO into code equivalent
to
: foo [ ' five >body ] literal @ ;
while one would want
: foo 5 ;
There are the following solutions to this problem:
* The solution that uses SET-OPTIMIZER. The disadvantage is, as
mentioned the redundancy of expressing the same thing twice, with
the potential for bugs.
* Defining words such as FIVE as colon definitions, e.g.,
: constant ( n "name" -- )
{: n :} : ]] n ; [[ ;
Elegant! But the disadvantage, as mentioned, is the larger memory
consumption.
* Using the closure mechanism; something like:
: constant ( n "name" -- )
[n:d ;] alias ;
Very short and to the point; it creates a closure and then an alias
for it, which costs extra space in Gforth; this could be alleviated
with a named-closure mechanism. Also, in Gforth the closure is not
inlined (and others don't have closures AFAIK), so for now it's just
a potential solution.
* Using the DOES> approach, but telling the compiler that the memory
is read-only. Something like
: constant ( n "name" -- )
create , here cell- 1 cells read-only
does> ( -- n)
@ ;
Now, when the compiler sees a @ from that address range (as it does
when compiling FIVE, it can perform constant-folding on the @,
resulting in compiling 5.
This requires some memory for remembering the information about
read-only data, but otherwise it would work and be very traditional.
Supposedly VFX and iForth have such features. It could be added to
Gforth relatively straightforwardly, but it currently is not there.
So, as I can see, it's a too system-specific tool, which should not be
used in standard programs.
SET-OPTIMIZER certainly is. The colon definition approach can be used
on any system, but the example code above is Gforth-specific. The
closure approach is also Gforth-specific. The READ-ONLY approach is
specific to the systems that support this feature. So, apart from the
colon definition approach, they are all system-specific and
non-standard.
It seems the literal stack is not needed if an intermediate
representation (e.g., an abstract semantic graph) is used for a whole >definition for optimization purposes.
Then a term rewrite system (term graph rewriting) can be employed to
make optimization and code generation.
Was such an approach used in Forth systems?
Actually, I consider the option whether a tool like SET-OPTIMIZER can be >standardized (and so employable in user code). And I think, the fact
that this tool implies source code duplication (not literally, but that
the same information is specified twice) plays against this idea.
Ruvim <ruvim.pinka@gmail.com> writes:
It seems the literal stack is not needed if an intermediate
representation (e.g., an abstract semantic graph) is used for a whole >>definition for optimization purposes.
The usual way to get a data-flow graph from sequential stack-based
code is to use a stack of subgraphs [ertl92]. The literal stack can
be seen as the variant of this applied to literals only. If you build
the full graph, you may prefer to perform the constant folding right
when building the graph (and avoid the cost of building the additional >nodes), or later (and avoid the cost of testing for the
constant-folding case when building the graph).
Then a term rewrite system (term graph rewriting) can be employed to
make optimization and code generation.
Was such an approach used in Forth systems?
Unless you count the tree-parsing used by RAFTS as term graph
rewriting, not to my knowledge. Albert v. d. Horst has posted
something that looked like rewriting, but on the sequence level rather
than the graph level IIRC.
- anton
I agree to this. in LXF I have ;macro that inlines the does part. Used like
: array create cells allot ;macro swap cells + ; ok
10 array a1 ok
123 2 a1 ! ok
2 a1 @ . 123 ok
: t1 2 a1 @ ; ok
t1 . 123 ok
see t1
A4A614 409CE7 14 C80000 5 normal T1
409CE7 A11C337400 mov eax , [0074331C]
409CEC 895DFC mov [ebp-4h] , ebx
409CEF 8BD8 mov ebx , eax
409CF1 8D6DFC lea ebp , [ebp-4h]
409CF4 C3 ret near
ok
P Falth <peter....@gmail.com> writes:
I agree to this. in LXF I have ;macro that inlines the does part. Used like
: array create cells allot ;macro swap cells + ; ok
10 array a1 ok
123 2 a1 ! ok
2 a1 @ . 123 ok
: t1 2 a1 @ ; ok
t1 . 123 ok
see t1
A4A614 409CE7 14 C80000 5 normal T1
409CE7 A11C337400 mov eax , [0074331C]: constant create , ;macro @ ;
409CEC 895DFC mov [ebp-4h] , ebx
409CEF 8BD8 mov ebx , eax
409CF1 8D6DFC lea ebp , [ebp-4h]
409CF4 C3 ret near
ok
5 constant five
: foo five + ;
: bar 5 + ;
the code for FOO is:
804FC10 A1F89C3808 mov eax , [08389CF8]
804FC15 01C3 add ebx , eax
804FC17 C3 ret near
while the code for BAR is:
804FC18 83C305 add ebx , # 5h
804FC1B C3 ret near
So CREATE...;MACRO has the usual disadvantage that CREATE..DOES> has
on a system that inlines the DOES>-code.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net
On Sunday, 4 December 2022 at 12:12:35 UTC+1, Anton Ertl wrote:
: constant create , ;macro @ ;
5 constant five
: foo five + ;
: bar 5 + ;
You call it constant but what you have created is a value
and lxf will treat it accordingly. To use a macro to define a constant do
macro five " 5"
or
:m five 5 ;
If I define TEN as constant rather than this kind of macro, it works correctly, and lxf compiles it efficiently (not like a value, and notCorrect. Forth has true constants - contrary to C, where even a "static const" does not equal a true constant, since it (traditionally - I'm not talking about ugly modern kludges) cannot be used to size e.g. arrays.
like a colon definition). So apparently you have a special mechanism
for that.
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 300 |
Nodes: | 16 (2 / 14) |
Uptime: | 51:23:18 |
Calls: | 6,712 |
Calls today: | 5 |
Files: | 12,243 |
Messages: | 5,355,036 |
Posted today: | 1 |