• Iterating over arrays

    From Anton Ertl@21:1/5 to All on Sat Jan 20 11:05:28 2024
    One way to loop over arrays is to loop over the array index values:

    ( addr u ) 0 ?do ( addr )
    dup i cells + @ ...
    loop
    drop
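
As a concrete, self-contained instance of this idiom (illustrative names, assuming cell-sized elements):

```forth
\ Print every element of a cell array using the index idiom.
create nums 3 , 5 , 1 , -3 ,
: .array ( addr u -- )
  0 ?do ( addr )
    dup i cells + @ .
  loop
  drop ;
nums 4 .array  \ prints: 3 5 1 -3
```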

    A disadvantage of this approach is that one has to keep addr around.
    If you need to access other stack elements, this becomes annoying;
    e.g., if you want to produce the sum, minimum, and maximum of an array:

( addr u ) >r 0 max-n min-n r> 0 ?do ( addr sum min max )
    3 pick i cells + @
    tuck max >r
    tuck min >r
    + r> r>
    loop

    The loop body is a little bit better if I produces the address of the
    element, but now the prelude of the loop becomes bigger:

    ( addr u ) 2>r 0 max-n min-n 2r> cells over + swap ?do
    i @
    tuck max >r
    tuck min >r
    + r> r>
    1 cells +loop

    An additional advantage is that the loop body is shorter and probably
    faster.

    The longer prelude is annoying, especially in the case of walking
    backwards through an array (with U-DO..-LOOP), e.g., if we want to
    print out the numbers in the array backwards:

    ( addr u ) cells over + -1 cells + swap -1 cells + u-do
    i @ .
    1 cells -loop

    So in <2016Dec26.155852@mips.complang.tuwien.ac.at> I outlined:

addr count stride A+DO ... A+LOOP \ i goes from addr...addr+(count-1)*stride
addr count stride A-DO ... A-LOOP \ i goes from addr+(count-1)*stride...addr

    I have not implemented this yet, but have recently discussed it with
    Bernd Paysan, who pointed out that in "addr u" the u is not always the
    number of elements, but sometimes the number of bytes; so my current
    thinking is:

    ( addr count stride ) MEM+DO ... MEM+LOOP
    ( addr count stride ) MEM-DO ... MEM-LOOP

    And maybe have a word

    : ARRAY>MEM ( uelems uelemsize -- ubytes uelemsize )
    tuck * swap ;
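
For example, with 8-byte cells, a 4-element cell array converts like this:

```forth
: array>mem ( uelems uelemsize -- ubytes uelemsize )
  tuck * swap ;
4 8 array>mem  \ leaves 32 8: 4 elements of 8 bytes = 32 bytes, stride 8
```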

(Or maybe have a generic ANYLOOP (whatever its name) that produces
    the right match for the ...DO)

    With that our examples would become:

    ( addr u ) 2>r 0 max-n min-n 2r> 1 cells array>mem mem+do
    i @
    tuck max >r
    tuck min >r
    + r> r>
    mem+loop

    ( addr u ) array>mem mem-do
    i @ .
    mem-loop

    (and maybe replace MEM+LOOP and MEM-LOOP with ANYLOOP).

    Opinions?

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2023: https://euro.theforth.net/2023

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Hans Bezemer on Sat Jan 20 17:23:03 2024
    Hans Bezemer <the.beez.speaks@gmail.com> writes:
>>Opinions?

    [reformatted for conventional line lengths]

>I fail to see the need to generalize these. First, I (relatively)
>rarely seem to iterate over an entire array.

    MEM+DO and MEM-DO also work for parts of arrays and even for array
    slices (e.g., a diagonal in a matrix) as long as the stride is
    constant.
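
For comparison, walking such a diagonal with the existing address-as-index idiom takes exactly the setup that MEM+DO would absorb. A sketch (hypothetical word; assumes an n*n cell matrix stored row-major, so the diagonal stride is n+1 cells):

```forth
\ Print the main diagonal of an n*n row-major cell matrix.
\ The stride is runtime-dependent, so it is kept on the data
\ stack below the loop and DUPed for +LOOP.
: .diagonal ( addr n -- )
  dup 1+ cells   ( addr n stride )
  swap over *    ( addr stride n*stride )
  rot tuck +     ( stride addr limit )
  swap ?do       ( stride )
    i @ .
  dup +loop
  drop ;
```

With MEM+DO the limit computation and the repeated stride would disappear from such words.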

>In the largest public Forth program I wrote there are almost twice
>the number of BEGIN..REPEAT loops as DO..LOOPS

In the Gforth image we have the following numbers of occurrences:

    138 begin
    142 while
    108 repeat

    19 do
    73 ?do
    5 +do
    25 u+do
    2 -do
    5 u-do

    71 loop
    52 +loop
    7 -loop

    1 leave

Looking at the occurrences of U-DO, 4 out of the 5 look like obvious
    candidates for replacement by MEM-DO and 3 actually are (the fourth is
    the rare case of a data-dependent decrement). These 3 are:

|libcc.fs:141:30: c-addr 1- c-addr u1 + 1- u-do
|mwords.fs:34:43: cr words[] $@ bounds cell- swap cell- U-DO
|locate1.fs:494:40: words[] $@ bounds cell- swap cell- U-DO

    They could be replaced with

    libcc.fs:141:30: c-addr u1 1 mem-do
    mwords.fs:34:43: cr words[] $@ cell mem-do
    locate1.fs:494:40: words[] $@ cell mem-do

    For MEM+DO the benefit is smaller, but if we have MEM-DO, why not also
    have MEM+DO?

    One major benefit of using DO..LOOP is that it gets the index and the
    limit out of the way, reducing the data stack load. The idiom of
    using the element address as DO..LOOP index has the additional
    advantage of getting the start address of the array out of the way,
    but it requires setup code before the ?DO-like word. Using MEM+DO and especially MEM-DO reduces that setup code.

>Second, I already have some generalized high order functions for
>mapping, reducing or FOREACHing complex structures. Can't say I used
>them much since creating them - because my DO..LOOPs are usually so
>simple, I can't see why I should be using them.

That suggests that do..loop variants with better support for dealing
with arrays would be a better idea than these higher-order functions.

>Use case one: raw arrays
>- Set up the structure by address, #items and sizeof (item)
>
>16 constant #items
>32 constant /item
>#items /item * array items
>items #items /item * BOUNDS ?DO I ( do your thing) /item +LOOP

    That could be replaced with

    items #items /item array>mem mem+do i ( do your thing ) mem+loop

>Use case two: indexed arrays
>- Set up the structure by address, #items and sizeof (item)
>- Add a DOES> definition that returns the address
>
>16 constant #items
>32 constant /item
>#items /item * array items does> swap /item * + ;
>#items 0 ?DO I items ( do your thing) LOOP

    Yes, creating a named array and using that name in the loop body is a
    way to avoid the stack load of passing the start address of the array
    around in the indexing idiom, but the loop then becomes specific to
    that one array.

>Use case three: use TH (or equivalent) to do the indexing
>- TH only works with CELLS, but it is useful ( : TH CELLS + ; )

    TH is useful if you have the start address of the array as a stack
    item, but for a loop that walks the array, I prefer to avoid that.

>IMHO it is just some syntactic sugar - which won't speed up things anyway.

    Yes. The point is to make it easier and less error prone to write
do..loops that use the element addresses as index. And using that
    technique is usually faster than computing the element address from
    the start address and the index in each iteration.

>In that regard, I'd rather use FOREACH, since that one exposes the
>address anyway and gets rid of the dirty details as well:
>
>[: ( do your thing) ;] items #items /item FOREACH

    That corresponds to

    items #items /item mem+do i ( do your thing ) mem+loop

    - anton
  • From mhx@21:1/5 to All on Sat Jan 20 19:40:21 2024
I was going to mention that a good (or possibly the only) reason to use
iterators (like FOREACH) is when the data is organized in lists.
However, on reflection, it is possible to avoid lists by adding an
indexing array (at the cost of one indirection). Any idea if that would
be significantly faster?

    The disadvantage of a simple array would be that it becomes too small or
    is too large (in case of lists the number of items is probably dynamic).

    -marcel

  • From Anton Ertl@21:1/5 to mhx on Sun Jan 21 15:36:57 2024
    mhx@iae.nl (mhx) writes:
>I was going to mention that a good (or possibly the only) reason to use
>iterators (like FOREACH) is when the data is organized in lists. However,
>on reflection, it is possible to avoid lists by adding an indexing array
>(at the cost of one indirection). Any idea if that would be significantly
>faster?

    Following a linked list costs 4-5 cycles per element on modern desktop
    CPUs even if the elements are in the D-cache, more if they are further
    away from the core.

    By contrast, dealing with the elements through an array of element
    addresses (I guess that's what you mean with "indexing array") does
not introduce such a latency: if each element is processed independently
    of the others, you can process them arbitrarily in parallel. If there
    is some dependency in the processing of the elements, that dependency
    limits the processing.
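
In Forth terms, an indexing array is just an array of element addresses, walked with the usual address-as-index idiom plus one extra fetch per element. A minimal sketch (hypothetical name):

```forth
\ Sum elements reached through an array of element addresses.
\ addr points to u cells, each holding the address of one element.
: sum-via-index ( addr u -- n )
  0 rot rot             ( 0 addr u )
  cells over + swap ?do ( 0 )
    i @ @ +             \ fetch the pointer, then the element
  cell +loop ;
```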

    However, there is probably a reason why you used a linked list instead
    of an array. If you could easily do your processing with an array,
    you would have chosen the array in the first place.

>The disadvantage of a simple array would be that it becomes too small or
>is too large (in case of lists the number of items is probably dynamic).

    If the array can grow (and thus become too small), my usual approach
    is to ALLOCATE the array with a power-of-two number of elements, and
    RESIZE it to double size when that becomes too small.
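
A minimal sketch of that doubling discipline (hypothetical names; error handling reduced to THROW):

```forth
variable arr       \ base address of the ALLOCATEd array
variable arr-cap   \ capacity in cells, kept a power of two

: init-arr ( -- )
  16 dup cells allocate throw arr !  arr-cap ! ;
: grow-arr ( -- )  \ call when the array is full
  arr-cap @ 2* dup cells  arr @ swap resize throw arr !
  arr-cap ! ;
```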

    As for "too large", that's something I don't worry about. However,
    Bernd Paysan's string library (all the words with "$" in Gforth),
    which has grown into covering all kinds of dynamically allocated
    memory blocks, actually also uses RESIZE to shrink the memory blocks
    it administers. The reason for this is that the library only
    remembers the size of the current memory block in bytes, not also the
    size of the ALLOCATEd/RESIZEd block. The library computes the size of
    the ALLOCATEd block from the current memory block by rounding it up to
    the next power-of-two; when the next memory block size for an
allocation results in a different power-of-two, the allocation is
    RESIZEd, whether that grows or shrinks.

    However, for a buffer where the contents can grow or shrink (or,
    typically, start a new content with a smaller size that then grows), I
    prefer not to shrink and to keep a separate BUFFER-MAXLENGTH field in
    the data structure describing it around. This means that the number
    of RESIZEs is determined by the maximum length that the buffer ever
    has.
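
Such a descriptor might look like this sketch (hypothetical field names, using the Forth-2012 structure words):

```forth
begin-structure buffer%
  field: buf-data     \ address of the ALLOCATEd block
  field: buf-len      \ bytes currently in use; may shrink freely
  field: buf-maxlen   \ bytes ever ALLOCATEd/RESIZEd for the block
end-structure
```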

    Back to linked lists: Typical applications are when you want to insert
    or delete something in the middle or want to have a parent-pointer
    tree (as in fig-Forth's vocabularies that all included the Forth
    vocabulary), or when you don't want to deal with dynamic memory
    allocation (as in wordlists in Forth implementations even without
    fig-Forth's common-ancestor property).

    - anton
  • From Anton Ertl@21:1/5 to FFmike on Thu Jan 25 16:29:11 2024
    oh2aun@gmail.com (FFmike) writes:
>variable p inlined
>: p@ p @ ; inlined
>: p++ 2 p +! ; inlined
>: p@++ p@ p++ ; inlined
>: r>p r@ p ! ; inlined
>: !p>r p @ >r p ! ; inlined
>
>
>( addr u ) swap !p>r
>for
>p@++ ...
>next r>p
>
>Of course the P words are coded efficiently in assembler.

    This reminds me of Bernd Paysan's OOF and Mini-OOF2, where he has a
    current object pointer that he explicitly sets with >O (similar to
    your !P>R), saving the old object pointer on the return stack, and he
    restores the old object pointer with O> (similar to your R>P). Most
    others (including me) prefer the setting of the object pointer
    automatically when a method is entered and restoring it automatically
    when the method is left, though.

    - anton
• From albert@21:1/5 to dxforth@gmail.com on Fri Jan 26 12:30:22 2024
    In article <uouvck$2ghkt$1@dont-email.me>, dxf <dxforth@gmail.com> wrote:
>On 26/01/2024 3:29 am, Anton Ertl wrote:
>>oh2aun@gmail.com (FFmike) writes:
>>>variable p inlined
>>>: p@ p @ ; inlined
>>>: p++ 2 p +! ; inlined
>>>: p@++ p@ p++ ; inlined
>>>: r>p r@ p ! ; inlined
>>>: !p>r p @ >r p ! ; inlined
>>>
>>>
>>>( addr u ) swap !p>r
>>>for
>>>p@++ ...
>>>next r>p
>>>
>>>Of course the P words are coded efficiently in assembler.
>>
>>This reminds me of Bernd Paysan's OOF and Mini-OOF2, where he has a
>>current object pointer that he explicitly sets with >O (similar to
>>your !P>R), saving the old object pointer on the return stack, and he
>>restores the old object pointer with O> (similar to your R>P). Most
>>others (including me) prefer the setting of the object pointer
>>automatically when a method is entered and restoring it automatically
>>when the method is left, though.
>
>Then there's Moore's address registers where saving/restoring is based
>on need rather than at every turn.

    And then is my oo. Whenever an object is invoked it is the current
    object from its class, until another object is selected. Thus you can
    use different objects at the same time and there is less hassle on the
    stack.
    Groetjes Albert

    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat purring. - the Wise from Antrim -

  • From Anton Ertl@21:1/5 to Anton Ertl on Sun Jan 28 12:15:37 2024
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>( addr count stride ) MEM+DO ... MEM+LOOP
>( addr count stride ) MEM-DO ... MEM-LOOP
>
>And maybe have a word
>
>: ARRAY>MEM ( uelems uelemsize -- ubytes uelemsize )
>tuck * swap ;
>
>(Or maybe have a generic ANYLOOP (whatever its name) that produces
>the right match for the ...DO)

    I decided to use LOOP to finish a MEM+DO or MEM-DO loop. I have now implemented this, and you can see possible uses as follows:

    create v 3 , 5 , 1 , -3 ,
    : test-mem-0 array>mem mem-do i @ . loop ;
    v 4 cell test-mem-0 \ output: -3 1 5 3
    : test-mem+0 array>mem mem+do i @ . loop ;
    v 4 cell test-mem+0 \ output: 3 5 1 -3

    Decompiling this with SEE TEST-MEM-0 shows (reformatted for
    readability):

    : test-mem-0
    array>mem dup negate >l - over + u-[do
    i @ .
    @local0 +loop
    lp+ ;

    So MEM-DO actually uses +LOOP; it includes the lower bound, just as is
    needed for this usage. I have added a matching U-[DO that includes
    the lower bound (while U-DO excludes it).

    The stride is stored on the locals stack in this example, so that J is unchanged, and any uses of J are also unaffected.

    A common usage is to have a known, constant stride. In this case the
    literal stack is used to produce better code:

    : test-mem-1 cell array>mem mem-do i @ . loop ;

    decompiles to:

    : test-mem-1
    cells #-8 + over + u-[do
    i @ .
    #-8 +loop ;

    The more constants we provide, the better, e.g.:

    : test-mem-3 v 4 cell array>mem mem-do i @ . loop ;

    decompiles to:

    : test-mem-3
    v <v+$18> u-[do
    i @ .
    #-8 +loop ;

    The benefit of MEM-DO and MEM+DO is that you can just call

    addr nelems +nelemsize ARRAY>MEM MEM-DO ... LOOP

    or

    addr nbytes +nelemsize MEM-DO ... LOOP

    or either variant with MEM+DO, and don't have to explicitly compute
    the start and limit for the U-[DO or U+DO and arrange them on the
    stack, and then repeat the stride in front of +LOOP. The result is
    fewer opportunities for mistakes and more readable code.

    - anton