• Strings and 24-bit cells

    From Brad Eckert@21:1/5 to All on Sun Oct 30 08:55:59 2022
    I'm contemplating a Forth machine with 24-bit cells that pack three 8-bit chars per cell. Data memory would consist of three byte-lanes. Rather than use modulo 3 addressing, I would use modulo 4, which behaves like a typical 32-bit Forth (4 bytes/cell) but with the fourth byte-lane missing. Accessing any byte address whose two LSBs are "11" would be an ambiguous condition.

    I think this is workable if the starting addresses of strings are cell-aligned. Then CELLS would have a reference point for its math. CELL+ would be easy: If it results in an invalid address, do 1+ again.

    The most important thing is testability, which any 32-bit Forth can be adapted to. Char operators can test for invalid char addresses.
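    A minimal sketch of the address arithmetic described above, written in C so it can be tried on any host. All function names are hypothetical, and applying the "do 1+ again" skip to the char step (rather than the cell step) is an assumption about how the scheme would be used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; none of these names come from the post. */

/* A byte address is invalid when its two LSBs are binary 11 (the missing lane). */
static int char_addr_valid(uint32_t addr) {
    return (addr & 3u) != 3u;
}

/* Advance one char: add 1, then add 1 again if that lands on the missing lane. */
static uint32_t char_plus(uint32_t addr) {
    addr += 1;
    if (!char_addr_valid(addr)) {
        addr += 1;
    }
    return addr;
}

/* Advance one cell: cells stay 4 address units apart, as on a 32-bit Forth. */
static uint32_t cell_plus(uint32_t addr) {
    return addr + 4;
}

/* Address of the n-th char of a string whose start is cell-aligned:
   three chars per cell, four address units per cell. */
static uint32_t char_index(uint32_t base, uint32_t n) {
    assert((base & 3u) == 0);   /* the scheme relies on aligned string starts */
    return base + 4 * (n / 3) + (n % 3);
}
```

    Starting from 0x100, repeated calls to char_plus visit 0x100, 0x101, 0x102, 0x104, 0x105, 0x106, 0x108. The same functions compile unchanged on a 32-bit host, which is the testability argument in practice.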

    So, this breaks the assumption that char addressing is monotonic but it is not a show stopper nor does it break backward compatibility. The 24-bit code will still run in a 32-bit environment.

    There are sure to be some caveats, which is what I am asking about. Any thoughts?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marcel Hendrix@21:1/5 to Brad Eckert on Sun Oct 30 09:54:27 2022
    On Sunday, October 30, 2022 at 4:56:01 PM UTC+1, Brad Eckert wrote:
    [..]
    > There are sure to be some caveats, which is what I am asking about. Any thoughts?

    I assume you have sound technical reasons for this.
    The first iForths were specific to a segmented architecture (32 address bits, of which the upper
    16 indicated the segment). It worked well, but eventually I got tired of the non-standard nature of the scheme and scrapped the idea.

    Do you think your idea will survive (i.e. be worthwhile) 10 years from now (or even 2)?

    -marcel

  • From minforth@arcor.de@21:1/5 to Brad Eckert on Mon Oct 31 06:37:28 2022
    Brad Eckert wrote on Sunday, October 30, 2022 at 16:56:01 UTC+1:
    > I'm contemplating a Forth machine with 24-bit cells that pack three 8-bit chars per cell. Data memory would consist of three byte-lanes. Rather than use modulo 3 addressing, I would use modulo 4, which behaves like a typical 32-bit Forth (4 bytes/cell) but with the fourth byte-lane missing. Accessing any byte address whose two LSBs are "11" would be an ambiguous condition.
    >
    > I think this is workable if the starting addresses of strings are cell-aligned. Then CELLS would have a reference point for its math. CELL+ would be easy: If it results in an invalid address, do 1+ again.
    >
    > The most important thing is testability, which any 32-bit Forth can be adapted to. Char operators can test for invalid char addresses.
    >
    > So, this breaks the assumption that char addressing is monotonic but it is not a show stopper nor does it break backward compatibility. The 24-bit code will still run in a 32-bit environment.
    >
    > There are sure to be some caveats, which is what I am asking about. Any thoughts?

    Seems OK as long as you are "providing" all strings yourself. But strings or substrings that are read in from external sources or data streams may have to be moved around to realign them.
    Of course, this would not be the case when all you do is use a string-aligned buffer.
    BTW, AFAIK OCaml uses (or used to use) bit 0 to distinguish pointers from integers.
    When you have an FPU, NaN-boxing is an old technique.

  • From Matthias Koch@21:1/5 to All on Mon Oct 31 18:33:13 2022
    One interesting idea would be to design a word-addressed machine, and provide primitives to fetch/store a variable number of bits or bytes starting from a native word address unit. With this in place, one could in theory resynthesise and recompile both processor HDL and Forth core for an arbitrary Forth cell width.
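    A rough sketch of such primitives in C, under several assumptions that are not in the post: cells are 24 bits wide and held in the low bits of a uint32_t, a field never crosses a word boundary, and packed bytes are stored lowest lane first. The names bits_fetch, bits_store and byte_fetch are hypothetical.

```c
#include <stdint.h>

#define CELL_BITS 24u                        /* assumed cell width; must be < 32 here */
#define CELL_MASK ((1u << CELL_BITS) - 1u)
#define CHARS_PER_CELL (CELL_BITS / 8u)

/* Fetch `width` bits (1..CELL_BITS) located `shift` bits above the LSB
   of the word at word address `waddr`. */
static uint32_t bits_fetch(const uint32_t *mem, uint32_t waddr,
                           unsigned shift, unsigned width) {
    uint32_t mask = (1u << width) - 1u;
    return (mem[waddr] >> shift) & mask;
}

/* Store the low `width` bits of `value` into the same kind of field,
   leaving the other bits of the word untouched. */
static void bits_store(uint32_t *mem, uint32_t waddr,
                       unsigned shift, unsigned width, uint32_t value) {
    uint32_t mask = ((1u << width) - 1u) << shift;
    mem[waddr] = ((mem[waddr] & ~mask) | ((value << shift) & mask)) & CELL_MASK;
}

/* Fetch the n-th byte of a packed string that starts at word address `waddr`. */
static uint32_t byte_fetch(const uint32_t *mem, uint32_t waddr, uint32_t n) {
    return bits_fetch(mem, waddr + n / CHARS_PER_CELL,
                      8u * (n % CHARS_PER_CELL), 8u);
}
```

    Changing CELL_BITS is the only edit needed to retarget these helpers to another cell width below 32 bits, which is the portability point being made.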

  • From Brad Eckert@21:1/5 to Matthias Koch on Wed Nov 2 08:11:35 2022
    On Monday, October 31, 2022 at 10:33:15 AM UTC-7, Matthias Koch wrote:
    > One interesting idea would be to design a word-addressed machine, and provide primitives to fetch/store a variable number of bits or bytes starting from a native word address unit. With this in place, one could in theory resynthesise and recompile both processor HDL and Forth core for an arbitrary Forth cell width.

    Consider Chuck Moore's processor designs. Did he ever support byte addressing? I don't think so. They were all word-addressed (CELL+ same as 1+). Even the Sh-BOOM was word-addressed, with hardware support for packing and unpacking bytes (up to 4 bytes).

    ANS assigns a relationship between cell and char addresses. If cell units and char units are not equivalent, the strings paradigm may be broken. You are allowed to mix a-addr and c-addr. That is a dependency. You shouldn't be using @ and ! on a c-addr or C@ and C! on an a-addr. But you can do it and ANS Forth will not complain. It breaks the "crash early and crash often" rule.

    One could have a third address type, bf-addr, which would address bit fields. Due to the limited cell address range (most bits are the mask size and shift count) it would be used with data structures. How to imply the base address of the data structure in a reentrant way is a topic for discussion. You could use a double for bf-addr (unwieldy), a separate base address stack, keep the base address on the return stack (breaking ANS stack rules), or just use a static variable and forget about reentrancy.
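    One way a bf-addr could be packed into a single 24-bit cell is sketched below in C. The bit layout (5 bits of width, 5 bits of shift, 14 bits of cell offset) and the names bf_make, bf_fetch and bf_store are assumptions made for illustration only. The base address is passed as an explicit argument, which is closest to the double-cell option.

```c
#include <stdint.h>

/* Hypothetical bf-addr layout in one 24-bit cell (not from the post):
     bits 23..19  field width in bits (1..24)
     bits 18..14  shift from the LSB  (0..23)
     bits 13..0   cell offset from the structure base */
static uint32_t bf_make(unsigned width, unsigned shift, unsigned offset) {
    return ((uint32_t)(width & 31u) << 19)
         | ((uint32_t)(shift & 31u) << 14)
         | ((uint32_t)offset & 0x3FFFu);
}

/* Fetch the field that `bf` describes, relative to `base`. */
static uint32_t bf_fetch(const uint32_t *base, uint32_t bf) {
    unsigned width  = (bf >> 19) & 31u;
    unsigned shift  = (bf >> 14) & 31u;
    uint32_t offset = bf & 0x3FFFu;
    uint32_t mask   = (1u << width) - 1u;    /* width <= 24, so no overflow */
    return (base[offset] >> shift) & mask;
}

/* Store the low `width` bits of `value` into the field, keeping the
   rest of the cell and clamping the result to 24 bits. */
static void bf_store(uint32_t *base, uint32_t bf, uint32_t value) {
    unsigned width  = (bf >> 19) & 31u;
    unsigned shift  = (bf >> 14) & 31u;
    uint32_t offset = bf & 0x3FFFu;
    uint32_t mask   = ((1u << width) - 1u) << shift;
    base[offset] = ((base[offset] & ~mask) | ((value << shift) & mask)) & 0xFFFFFFu;
}
```

    With this layout only 14 bits remain for the cell offset, which illustrates why such an address would be relative to a data structure rather than to all of memory.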

    I can only speculate about Chuck's insistence on word addressing, but I would chalk it up to his aversion to cargo cult programming. Perhaps that is the reason he hated ANS.

  • From Brad Eckert@21:1/5 to Marcel Hendrix on Wed Nov 2 08:25:53 2022
    On Sunday, October 30, 2022 at 9:54:29 AM UTC-7, Marcel Hendrix wrote:
    > On Sunday, October 30, 2022 at 4:56:01 PM UTC+1, Brad Eckert wrote:
    > [..]
    >> There are sure to be some caveats, which is what I am asking about. Any thoughts?
    >
    > I assume you have sound technical reasons for this.
    > The first iForths were specific to a segmented architecture (32 address bits, of which the upper
    > 16 indicated the segment). It worked well, but eventually I got tired of the non-standard nature of the scheme and scrapped the idea.
    >
    > Do you think your idea will survive (i.e. be worthwhile) 10 years from now (or even 2)?
    >
    > -marcel

    Nope, no sound technical reasons. But that's okay, I am tilting toward word addressing anyway. I like Sh-BOOM's handling of octets.

    Sh-BOOM did not survive either. It was amazingly good for its time. It offered high performance for a fraction of the cost of other designs of similar capability. But it was too different and it did not have the marketing support. The support needed to market a microprocessor is extraordinary. Such a product launch needs all kinds of software support, application engineering, and of course a sales team. It's not just a case of building a better mousetrap.
