• =?UTF-8?Q?An_idea_for_a_new_word_=E2=80=94_EXCHANGE?=

    From Zbig@21:1/5 to All on Thu May 26 09:58:32 2022
    Examining C-- ( https://en.wikipedia.org/wiki/C-- ) I noticed interesting idea for a word EXCHANGE ( addr1 addr2 — ). Such word swaps values of two variables. The „inventor” even gave it Forth-friendly name: "><".

    2 VARIABLE TWO ok
    8 VARIABLE EIGHT ok
    TWO @ . 2 ok
    EIGHT @ . 8 ok
    TWO EIGHT >< ok
    TWO @ . 8 ok
    EIGHT @ . 2 ok

    Of course implementation in ML is rather trivial. Actually I'm somewhat surprised it somehow didn't find its place in Forth vocabularies. Probably such mutual exchange of variables' (or memory locations, in general) values doesn't happen that often.
    Still it's that tiny it won't hurt to have it „just in case”.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From NN@21:1/5 to Zbig on Thu May 26 14:30:05 2022
    On Thursday, 26 May 2022 at 17:58:34 UTC+1, Zbig wrote:
    Examining C-- ( https://en.wikipedia.org/wiki/C-- ) I noticed interesting idea for a word EXCHANGE ( addr1 addr2 — ). Such word swaps values of two variables. The „inventor” even gave it Forth-friendly name: "><".

    2 VARIABLE TWO ok
    8 VARIABLE EIGHT ok
    TWO @ . 2 ok
    EIGHT @ . 8 ok
    TWO EIGHT >< ok
    TWO @ . 8 ok
    EIGHT @ . 2 ok

    Of course implementation in ML is rather trivial. Actually I'm somewhat surprised it somehow didn't find its place in Forth vocabularies. Probably such mutual exchange of variables' (or memory locations, in general) values doesn't happen that often.
    Still it's that tiny it won't hurt to have it „just in case”.



    Your variable is acting like a value so guessing you are using a non standard forth

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;

    variable x ok
    variable y ok
    33 x ! 66 y ! ok
    x ? y ? 33 66 ok
    x y exch ok
    x ? y ? 66 33 ok

    If it doesnt happen too often the why burden the built-ins with an extra word when its just as easy to write it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Brian Fox@21:1/5 to Zbig on Thu May 26 15:34:04 2022
    On Thursday, May 26, 2022 at 12:58:34 PM UTC-4, Zbig wrote:
    Examining C-- ( https://en.wikipedia.org/wiki/C-- ) I noticed interesting idea for a word EXCHANGE ( addr1 addr2 — ). Such word swaps values of two variables. The „inventor” even gave it Forth-friendly name: "><".

    2 VARIABLE TWO ok
    8 VARIABLE EIGHT ok
    TWO @ . 2 ok
    EIGHT @ . 8 ok
    TWO EIGHT >< ok
    TWO @ . 8 ok
    EIGHT @ . 2 ok

    Of course implementation in ML is rather trivial. Actually I'm somewhat surprised it somehow didn't find its place in Forth vocabularies. Probably such mutual exchange of variables' (or memory locations, in general) values doesn't happen that often.
    Still it's that tiny it won't hurt to have it „just in case”.

    I have seen that word >< used for byte swapping in MaxForth on 16 bit integers.
    I used it for the same purpose on my Camel Forth version.
    I wonder if it is used for any other purposes by different implementers.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Brian Fox@21:1/5 to All on Thu May 26 15:38:05 2022
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Thu May 26 15:41:02 2022
    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;

    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;

    ...not to mention the fact that this might be a touch faster by coding it directly in ML. ;)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Thu May 26 15:42:35 2022
    I have seen that word >< used for byte swapping in MaxForth on 16 bit integers.

    So in your case it was something usually called „flip” rather (swapping lo-byte and hi-byte)?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Zbig on Fri May 27 10:22:06 2022
    On 27/05/2022 08:42, Zbig wrote:
    I have seen that word >< used for byte swapping in MaxForth on 16 bit integers.

    So in your case it was something usually called „flip” rather (swapping lo-byte and hi-byte)?

    Forth-79 Reference Word Set:

    >< n1 -- n2 "byte-swap"
    Swap the high and low bytes within n1.

    'Thinking FORTH' naming conventions:

    exchange, especially bytes >name< >MOVE<

    It appears F83 authors didn't like it and called it 'FLIP'.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Fri May 27 00:52:09 2022
    Forth-79 Reference Word Set:

    < n1 -- n2 "byte-swap"
    Swap the high and low bytes within n1.

    'Thinking FORTH' naming conventions:

    exchange, especially bytes >name< >MOVE<

    It appears F83 authors didn't like it and called it 'FLIP'.

    So as ben Akiba said: „Everything had already been”. ;)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marcel Hendrix@21:1/5 to none albert on Fri May 27 01:56:02 2022
    On Friday, May 27, 2022 at 10:22:32 AM UTC+2, none albert wrote:
    In article <6b1c8486-32b2-4704...@googlegroups.com>,
    Zbig <zbigni...@gmail.com> wrote:
    Examining C-- ( https://en.wikipedia.org/wiki/C-- ) I noticed
    interesting idea for a word EXCHANGE ( addr1 addr2 — ). Such word
    swaps values of two variables. The „inventor” even gave it >Forth-friendly name: "><".
    Too late. EXCHANGE is used in ciforth for swapping areas:
    (adr1 adr2 length -- )
    Compare to MOVE.

    That looks extremely wasteful. Why not just swap the pointers to
    these areas?

    Your new word can be had
    : zbig-exchange 1 CELL EXCHANGE ;

    Why not EXCH ? Or EXC-H (pronounce it) for cuteness?

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to zbigniew2011@gmail.com on Fri May 27 10:22:28 2022
    In article <6b1c8486-32b2-4704-ab6d-c8479f26d9d8n@googlegroups.com>,
    Zbig <zbigniew2011@gmail.com> wrote:
    Examining C-- ( https://en.wikipedia.org/wiki/C-- ) I noticed
    interesting idea for a word EXCHANGE ( addr1 addr2 — ). Such word
    swaps values of two variables. The „inventor” even gave it
    Forth-friendly name: "><".

    Too late. EXCHANGE is used in ciforth for swapping areas:
    (adr1 adr2 length -- )
    Compare to MOVE.
    Your new word can be had
    : zbig-exchange 1 CELL EXCHANGE ;

    Groetjes Albert
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Brian Fox on Fri May 27 19:39:33 2022
    On 27/05/2022 08:38, Brian Fox wrote:
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;

    Depending on how fast is your 2DUP:

    : EXCH ( a b -- ) 2DUP @ SWAP ! SWAP @ SWAP ! ;

    If EXCH happens to be preceded by a 2DUP (as it would be in a Qsort) then:

    ( a b ) OVER @ OVER @ 3 PICK ! OVER ! ( a b )

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Fri May 27 02:26:19 2022
    Too late. EXCHANGE is used in ciforth for swapping areas:
    (adr1 adr2 length -- )
    Compare to MOVE.
    Your new word can be had
    : zbig-exchange 1 CELL EXCHANGE ;

    Well, I can do that using the new word too:

    : swap-areas ( a1 a2 count -- ) 0 DO OVER I + OVER I + >< LOOP 2DROP ;

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From NN@21:1/5 to dxforth on Fri May 27 03:06:13 2022
    On Friday, 27 May 2022 at 10:39:36 UTC+1, dxforth wrote:
    On 27/05/2022 08:38, Brian Fox wrote:
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;
    Depending on how fast is your 2DUP:

    : EXCH ( a b -- ) 2DUP @ SWAP ! SWAP @ SWAP ! ;

    If EXCH happens to be preceded by a 2DUP (as it would be in a Qsort) then:

    ( a b ) OVER @ OVER @ 3 PICK ! OVER ! ( a b )

    2dup @ swap ! <--- you have corrupted 'a' at this point

    { : exch ( a b -- ) 2dup @ swap @ rot ! swap ! ; }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to All on Fri May 27 21:26:23 2022
    On 27/05/2022 20:06, NN wrote:
    On Friday, 27 May 2022 at 10:39:36 UTC+1, dxforth wrote:

    : EXCH ( a b -- ) 2DUP @ SWAP ! SWAP @ SWAP ! ;

    2dup @ swap ! <--- you have corrupted 'a' at this point

    { : exch ( a b -- ) 2dup @ swap @ rot ! swap ! ; }

    I noticed that :( Thanks for posting the correction. Sadly ROT
    makes it less efficient:

    ( 0055AE00 488B13 ) MOV RDX, 0 [RBX]
    ( 0055AE03 488B4D00 ) MOV RCX, [RBP]
    ( 0055AE07 488B09 ) MOV RCX, 0 [RCX]
    ( 0055AE0A 48890B ) MOV 0 [RBX], RCX
    ( 0055AE0D 488B5D00 ) MOV RBX, [RBP]
    ( 0055AE11 488913 ) MOV 0 [RBX], RDX
    ( 0055AE14 488B5D08 ) MOV RBX, [RBP+08]
    ( 0055AE18 488D6D10 ) LEA RBP, [RBP+10]
    ( 0055AE1C C3 ) RET/NEXT
    ( 29 bytes, 9 instructions )

    What's bizarre is if I define:

    : dupexch ( a b -- a b ) 2dup 2dup @ swap @ rot ! swap ! ;

    on VFX it's efficient again!

    ( 0055ADB0 488B13 ) MOV RDX, 0 [RBX]
    ( 0055ADB3 488B4D00 ) MOV RCX, [RBP]
    ( 0055ADB7 488B09 ) MOV RCX, 0 [RCX]
    ( 0055ADBA 48890B ) MOV 0 [RBX], RCX
    ( 0055ADBD 488B4D00 ) MOV RCX, [RBP]
    ( 0055ADC1 488911 ) MOV 0 [RCX], RDX
    ( 0055ADC4 C3 ) RET/NEXT
    ( 21 bytes, 7 instructions )

    More is Less :)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Brian Fox on Fri May 27 14:53:49 2022
    Brian Fox <brian.fox@brianfox.ca> writes:
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;

    gforth-fast on RISC-V, using these definitions:

    : exch1 over @ over @ >r swap ! r> swap ! ;
    : exch2 over @ over @ swap rot ! swap ! ;
    : exch3 dup @ 2 pick @ rot ! swap ! ;

    And what SEE-CODE produces for them (it sometimes guesses the word
    wrong, the native code is correct):

    EXCH1: EXCH2: EXCH3:
    over over dup
    ld s1,$8(s8) ld s1,$8(s8) #0
    addi s10,s10,8 addi s10,s10,8 sd s7,$0(s8)
    @ @ ld s7,$0(s7)
    ld s1,$0(s1) ld s1,$0(s1) addi s8,s8,-8
    addi s10,s10,8 addi s10,s10,8 addi s10,s10,10
    over over third
    mv s3,s7 mv s3,s7 ld a5,$10(s8)
    addi s10,s10,8 addi s10,s10,8 addi s10,s10,8
    @ @ addi s8,s8,-8
    ld s3,$0(s3) ld s3,$0(s3) sd s7,$8(s8)
    addi s10,s10,8 addi s10,s10,8 mv s7,a5
    r swap noop
    addi s2,s2,-8 mv a5,s1 addi s8,s8,8
    addi s10,s10,8 addi s10,s10,8 mv s1,s7
    sd s3,$0(s2) mv s1,s3 ld s7,$0(s8)
    swap mv s3,a5 ld s1,$0(s1)
    addi s8,s8,8 rot addi s10,s10,8
    mv s3,s7 mv a5,s7 rot
    ld s7,$0(s8) addi s10,s10,8 ld s3,$8(s8)
    addi s10,s10,8 mv s7,s1 addi s10,s10,8
    ! mv s1,s3 addi s8,s8,8
    sd s1,$0(s3) mv s3,a5 !
    addi s10,s10,8 ! sd s1,$0(s3)
    sd s1,$0(s3) addi s10,s10,8
    ld s1,$0(s2) addi s10,s10,8 swap
    addi s10,s10,8 swap ld s1,$8(s8)
    addi s2,s2,8 ld s1,$8(s8) addi s10,s10,8
    swap addi s10,s10,8 addi s8,s8,8
    addi s8,s8,8 addi s8,s8,8 !
    mv s3,s7 ! sd s7,$0(s1)
    ld s7,$0(s8) sd s7,$0(s1) addi s10,s10,8
    addi s10,s10,8 addi s10,s10,8 noop
    ! noop ld s7,$8(s8)
    sd s1,$0(s3) ld s7,$8(s8) addi s8,s8,8
    addi s10,s10,8 addi s8,s8,8 ld a6,$0(s2)
    ;s ld a6,$0(s2) addi s2,s2,8
    ld a6,$0(s2) addi s2,s2,8 addi s10,a6,$8
    addi s2,s2,8 addi s10,a6,$8 ld a4,$-8(s10)
    addi s10,a6,$8 ld a4,$-8(s10) jr a4
    ld a4,$-8(s10) jr a4
    jr a4
    84 bytes 80 bytes 88 bytes

    Interesting: As far as native code is concerned, they all have the
    same number of instructions.

    Now Aarch64:

    EXCH1: EXCH2: EXCH3:
    noop over noop
    str x27, [x25],#-0x8 ldr x21, [x25,#0x8] str x27, [x25],#-0x8
    ldr x27, [x25,#0x10] add x26, x26, #0x8 ldr x27, [x25,#0x8]
    add x26, x26, #0x8 @ add x26, x26, #0x8
    @ ldr x21, [x21,#0x0] @
    ldr x27, [x27,#0x0] add x26, x26, #0x8 ldr x27, [x27,#0x0]
    add x26, x26, #0x8 over add x26, x26, #0x8
    over mov x24, x27 third
    ldr x21, [x25,#0x8] add x26, x26, #0x8 mov x0, x25
    add x26, x26, #0x8 @ sub x25, x25, #0x8
    @ ldr x24, [x24,#0x0] add x26, x26, #0x8
    ldr x21, [x21,#0x0] add x26, x26, #0x8 ldr x0, [x0,#0x10]
    add x26, x26, #0x8 swap str x27, [x25,#0x8]
    r str x27, [x25],#-0x8 mov x27, x0
    sub x22, x22, #0x8 add x26, x26, #0x8 @
    add x26, x26, #0x8 mov x27, x24 ldr x27, [x27,#0x0]
    str x21, [x22,#0x0] rot add x26, x26, #0x8
    swap ldr x24, [x25,#0x8]! rot
    ldr x21, [x25,#0x8]! add x26, x26, #0x8 mov x0, x25
    add x26, x26, #0x8 ! mov x21, x27
    ! add x26, x26, #0x8 add x25, x25, #0x10
    add x26, x26, #0x8 str x21, [x24,#0x0] add x26, x26, #0x8
    str x27, [x21,#0x0] swap ldp x27, x24, [x0,#0x8]
    ldr x21, [x25,#0x8]! !
    ldr x27, [x22],#0x8 add x26, x26, #0x8 add x26, x26, #0x8
    add x26, x26, #0x8 ! str x21, [x24,#0x0]
    swap add x26, x26, #0x8 swap
    ldr x21, [x25,#0x8]! str x27, [x21,#0x0] ldr x21, [x25,#0x8]!
    add x26, x26, #0x8 noop add x26, x26, #0x8
    ! mov x0, x25 !
    add x26, x26, #0x8 add x25, x25, #0x8 add x26, x26, #0x8
    str x27, [x21,#0x0] ldr x27, [x0,#0x8] str x27, [x21,#0x0]
    noop ldr x26, [x22],#0x8 noop
    mov x0, x25 add x26, x26, #0x8 mov x0, x25
    add x25, x25, #0x8 ldur x1, [x26,#-0x8] add x25, x25, #0x8
    ldr x27, [x0,#0x8] br x1 ldr x27, [x0,#0x8]
    ldr x26, [x22],#0x8 ldr x26, [x22],#0x8
    add x26, x26, #0x8 add x26, x26, #0x8
    ldur x1, [x26,#-0x8] ldur x1, [x26,#-0x8]
    br x1 br x1
    116 bytes 104 bytes 124 bytes

    Stack caching works well for these two architectures for EXCH2,
    because we have many variants of OVER, @, SWAP, ROT, and !, but not so
    great for EXCH3, because we only have one variant of THIRD, so the use
    of THIRD means that the TOS is in a register and the rest in memory
    immediately before and after THIRD, and all the code around it has to
    live with this constraint, and is also longer, in particular the ROT.

    Now AMD64 (where stack caching does not work so well):
    EXCH1: EXCH2: EXCH3
    noop noop dup
    mov [r14],rbx mov [r14],rbx #0
    sub r14,$08 sub r14,$08 sub r14,$08
    mov rbx,$10[r14] mov rbx,$10[r14] mov $08[r14],rbx
    add r15,$08 add r15,$08 mov rbx,[rbx]
    @ @ add r15,$10
    mov rbx,[rbx] mov rbx,[rbx] third
    add r15,$08 add r15,$08 mov rax,$10[r14]
    noop noop sub r14,$08
    mov [r14],rbx mov [r14],rbx add r15,$08
    sub r14,$08 sub r14,$08 mov $08[r14],rbx
    mov rbx,$10[r14] mov rbx,$10[r14] mov rbx,rax
    add r15,$08 add r15,$08 @
    @ @ mov rbx,[rbx]
    mov rbx,[rbx] mov rbx,[rbx] add r15,$08
    add r15,$08 add r15,$08 rot
    r swap mov rdx,$08[r14]
    add r14,$08 mov rax,$08[r14] mov rax,$10[r14]
    sub r13,$08 add r15,$08 mov $08[r14],rbx
    mov $00[r13],rbx mov $08[r14],rbx add r15,$08
    add r15,$08 mov rbx,rax mov $10[r14],rdx
    mov rbx,[r14] rot mov rbx,rax
    swap mov rdx,$08[r14] !
    mov rax,$08[r14] mov rax,$10[r14] mov rax,$08[r14]
    add r15,$08 mov $08[r14],rbx add r14,$10
    mov $08[r14],rbx add r15,$08 add r15,$08
    mov rbx,rax mov $10[r14],rdx mov [rbx],rax
    ! mov rbx,rax mov rbx,[r14]
    mov rax,$08[r14] ! swap
    add r14,$10 mov rax,$08[r14] mov rax,$08[r14]
    add r15,$08 add r14,$10 add r15,$08
    mov [rbx],rax add r15,$08 mov $08[r14],rbx
    mov rbx,[r14] mov [rbx],rax mov rbx,rax
    noop mov rbx,[r14] !
    mov [r14],rbx swap #0
    sub r14,$08 mov rax,$08[r14] mov rax,$08[r14]
    mov rbx,$00[r13] add r15,$08 add r14,$10
    add r15,$08 mov $08[r14],rbx add r13,$08
    add r13,$08 mov rbx,rax mov [rbx],rax
    swap ! mov r10,-$08[r13]
    mov rax,$08[r14] #0 mov rbx,[r14]
    add r15,$08 mov rax,$08[r14] lea r15,$08[r10]
    mov $08[r14],rbx add r14,$10 mov rcx,-$08[r15]
    mov rbx,rax add r13,$08 jmp ecx
    ! mov [rbx],rax
    #0 mov r10,-$08[r13]
    mov rax,$08[r14] mov rbx,[r14]
    add r14,$10 lea r15,$08[r10]
    add r13,$08 mov rcx,-$08[r15]
    mov [rbx],rax jmp ecx
    mov r10,-$08[r13]
    mov rbx,[r14]
    lea r15,$08[r10]
    mov rcx,-$08[r15]
    jmp ecx
    162 bytes 147 bytes 129 bytes

    Let's look at some sophisticated compilers:

    VFX64:
    EXCH1 EXCH2 EXCH3
    MOV RDX, [RBP] MOV RDX, [RBP] MOV RDX, 0 [RBX]
    MOV RDX, 0 [RDX] MOV RDX, 0 [RDX] MOV RCX, [RBP]
    MOV RCX, 0 [RBX] MOV RCX, 0 [RBX] MOV RCX, 0 [RCX]
    PUSH RCX MOV 0 [RBX], RDX MOV 0 [RBX], RCX
    MOV 0 [RBX], RDX MOV RBX, [RBP] MOV RBX, [RBP]
    POP RBX MOV 0 [RBX], RCX MOV 0 [RBX], RDX
    MOV RDX, [RBP] MOV RBX, [RBP+08] MOV RBX, [RBP+08]
    MOV 0 [RDX], RBX LEA RBP, [RBP+10] LEA RBP, [RBP+10]
    MOV RBX, [RBP+08] RET/NEXT RET/NEXT
    LEA RBP, [RBP+10] 29 bytes 29 bytes
    RET/NEXT
    31 bytes

    EXCH1 suffers from VFX not being analytical about the return stack.

    lxf:

    EXCH1: EXCH2: EXCH3:
    mov eax , [ebp] mov eax , [ebp] mov eax , ebx
    mov eax , [eax] mov eax , [eax] mov eax , [eax]
    mov ecx , ebx mov ecx , ebx mov ecx , [ebp]
    mov ecx , [ecx] mov ecx , [ecx] mov ecx , [ecx]
    mov [ebx] , eax mov [ebx] , eax mov [ebx] , ecx
    mov ebx , [ebp] mov ebx , [ebp] mov ebx , [ebp]
    mov [ebx] , ecx mov [ebx] , ecx mov [ebx] , eax
    mov ebx , [ebp+4h] mov ebx , [ebp+4h] mov ebx , [ebp+4h]
    lea ebp , [ebp+8h] lea ebp , [ebp+8h] lea ebp , [ebp+8h]
    ret near ret near ret near

    No return stack overhead here, but a reg-reg MOV that VFX avoids.

    iForth:
    EXCH1: EXCH2: EXCH3:
    pop rbx pop rbx mov rbx, [rsp] qword
    pop rdi pop rdi push [rbx] qword
    mov rax, [rdi] qword mov rax, [rbx] qword mov rbx, [rsp #16 +] qword
    mov rdx, [rbx] qword mov rdx, [rdi] qword pop rdi
    mov [ebx] dword, rax mov [ebx] dword, rdx pop rax
    mov [edi] dword, rdx mov [edi] dword, rax mov rdx, [rbx] qword
    ; ; mov [eax] dword, rdx
    pop rbx
    mov [ebx] dword, rdi
    ;

    iForth does not keep the TOS in a register on word boundaries, and
    uses RSP as data stack pointer. Apparently it implements 2 PICK by
    first dumping the whole stack into memory.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: http://www.forth200x.org/forth200x.html
    EuroForth 2021: https://euro.theforth.net/2021

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Fri May 27 11:41:29 2022
    DB 82H,">","<"+80h
    ALIGN 2
    DW CAT - 6
    EXCHG DW $ + 2
    POP BX
    MOV AX,[BX]
    MOV DX,BX
    POP BX
    XCHG AX,[BX]
    MOV BX,DX
    MOV [BX],AX
    JMP NEXT

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marcel Hendrix@21:1/5 to dxforth on Fri May 27 11:38:52 2022
    On Friday, May 27, 2022 at 11:39:36 AM UTC+2, dxforth wrote:
    On 27/05/2022 08:38, Brian Fox wrote:
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;
    Depending on how fast is your 2DUP:

    : EXCH ( a b -- ) 2DUP @ SWAP ! SWAP @ SWAP ! ;

    If EXCH happens to be preceded by a 2DUP (as it would be in a Qsort) then:

    ( a b ) OVER @ OVER @ 3 PICK ! OVER ! ( a b )

    In QSORT I use ( d-addr -- ) DUP 2@ ROT D!

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marcel Hendrix@21:1/5 to Anton Ertl on Fri May 27 11:50:55 2022
    On Friday, May 27, 2022 at 6:04:54 PM UTC+2, Anton Ertl wrote:
    Brian Fox <bria...@brianfox.ca> writes:
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;
    gforth-fast on RISC-V, using these definitions:

    : exch1 over @ over @ >r swap ! r> swap ! ;
    : exch2 over @ over @ swap rot ! swap ! ;
    : exch3 dup @ 2 pick @ rot ! swap ! ;
    [..]
    iForth:
    EXCH1: EXCH2: EXCH3:
    pop rbx pop rbx mov rbx, [rsp] qword
    pop rdi pop rdi push [rbx] qword
    mov rax, [rdi] qword mov rax, [rbx] qword mov rbx, [rsp #16 +] qword
    mov rdx, [rbx] qword mov rdx, [rdi] qword pop rdi
    mov [ebx] dword, rax mov [ebx] dword, rdx pop rax
    mov [edi] dword, rdx mov [edi] dword, rax mov rdx, [rbx] qword
    ; ; mov [eax] dword, rdx
    pop rbx
    mov [ebx] dword, rdi
    ;

    iForth does not keep the TOS in a register on word boundaries, and
    uses RSP as data stack pointer. Apparently it implements 2 PICK by
    first dumping the whole stack into memory.

    Yes, there is a problem with PICK that I don't know how to solve (yet).

    iForth should be tested with a use case, as it recompiles when it knows more:

    FORTH> : exch1 over @ over @ >r swap ! r> swap ! ; ok
    FORTH> : exch2 over @ over @ swap rot ! swap ! ; ok
    FORTH> : exch3 dup @ 2 pick @ rot ! swap ! ; ok
    FORTH> variable a variable b : test1 a b exch1 ; : test2 a b exch2 ; : test3 a b exch3 ; ok
    FORTH> see test1
    Flags: TOKENIZE, ANSI
    : test1 a b [trashed] ; ok
    FORTH> ' test1 idis
    $01340A00 : test1
    $01340A0A mov rbx, $013405C0 qword-offset
    $01340A11 mov rdi, $013405E0 qword-offset
    $01340A18 mov $013405E0 qword-offset, rbx
    $01340A1F mov $013405C0 qword-offset, rdi
    $01340A26 ;
    FORTH> ' test2 idis
    $01340A80 : test2
    $01340A8A mov rbx, $013405E0 qword-offset
    $01340A91 mov rdi, $013405C0 qword-offset
    $01340A98 mov $013405E0 qword-offset, rdi
    $01340A9F mov $013405C0 qword-offset, rbx
    $01340AA6 ;
    FORTH> ' test3 idis
    $01340B00 : test3
    $01340B0A push $013405C0 d#
    $01340B0F push $013405E0 d#
    $01340B14 push $013405E0 qword-offset
    $01340B1A mov rbx, [rsp #16 +] qword
    $01340B1F pop rdi
    $01340B20 pop rax
    $01340B21 mov rdx, [rbx] qword
    $01340B24 mov [rax] qword, rdx
    $01340B27 pop rbx
    $01340B28 mov [rbx] qword, rdi
    $01340B2B ;

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Zbig on Sat May 28 14:12:39 2022
    On 28/05/2022 04:41, Zbig wrote:
    DB 82H,">","<"+80h
    ALIGN 2
    DW CAT - 6
    EXCHG DW $ + 2
    POP BX
    MOV AX,[BX]
    MOV DX,BX
    POP BX
    XCHG AX,[BX]
    MOV BX,DX
    MOV [BX],AX
    JMP NEXT

    DI should be free in Fig-Forth

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Marcel Hendrix on Sat May 28 14:57:38 2022
    On 28/05/2022 04:38, Marcel Hendrix wrote:
    On Friday, May 27, 2022 at 11:39:36 AM UTC+2, dxforth wrote:
    On 27/05/2022 08:38, Brian Fox wrote:
    On Thursday, May 26, 2022 at 5:30:07 PM UTC-4, NN wrote:

    : exch ( a b -- ) over @ over @ >r swap ! r> swap ! ;


    A minor point but on ITC Forth's this might be a touch faster by removing one call to NEXT.
    ( It is on my system)

    : EXCH ( addr1 addr2 -- ) OVER @ OVER @ SWAP ROT ! SWAP ! ;
    Depending on how fast is your 2DUP:

    : EXCH ( a b -- ) 2DUP @ SWAP ! SWAP @ SWAP ! ;

    If EXCH happens to be preceded by a 2DUP (as it would be in a Qsort) then: >>
    ( a b ) OVER @ OVER @ 3 PICK ! OVER ! ( a b )

    In QSORT I use ( d-addr -- ) DUP 2@ ROT D!

    While you've posted others versions of Qsort, I couldn't find that one.
    Do you have a link?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marcel Hendrix@21:1/5 to dxforth on Sat May 28 00:05:46 2022
    On Saturday, May 28, 2022 at 6:57:41 AM UTC+2, dxforth wrote:
    On 28/05/2022 04:38, Marcel Hendrix wrote:
    [..]
    In QSORT I use ( d-addr -- ) DUP 2@ ROT D!
    While you've posted others versions of Qsort, I couldn't find that one.
    Do you have a link?

    Sorry, it is bubble-s.frt:

    \ cr .( A classical benchmark of an O[n**2] algorithm; Bubble sort)

    \ Part of the programs gathered by John Hennessy for the MIPS
    \ RISC project at Stanford. Translated to forth by Marty Fraeman
    \ Johns Hopkins University/Applied Physics Laboratory.

    \ 0 value seed ( -- addr)
    : initiate-seed ( -- ) 74755 TO seed ;
    \ : random ( -- n ) seed 1309 * 13849 + 65535 and dup TO seed ;

    6000 constant #elements ( -- int )

    create A-list #elements cells allot

    : initiate-list ( -- )
    A-list #elements cells BOUNDS do random i ! cell +loop ;

    : dump-list ( -- )
    A-list #elements cells BOUNDS do i @ . cell +loop cr ;

    : verify-list ( -- )
    A-list #elements 1- cells BOUNDS do
    i 2@ > abort" bubble-sort: not sorted"
    cell +loop ;

    : bubble ( -- )
    cr ." bubbling... "
    #elements
    1 DO A-list #elements i - cells
    bounds DO i 2@ > IF i 2@ i D! THEN
    cell +LOOP
    LOOP ;

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Sat May 28 04:39:40 2022
    DI should be free in Fig-Forth

    Indeed. Thanks!

    DB 82H,">","<"+80h
    ALIGN 2
    DW CAT - 6
    EXCHG DW $ + 2
    POP BX
    POP DI
    MOV AX,[BX]
    XCHG AX,[DI]
    MOV [BX], AX
    JMP NEXT

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Marcel Hendrix on Mon May 30 15:19:10 2022
    On 28/05/2022 17:05, Marcel Hendrix wrote:
    On Saturday, May 28, 2022 at 6:57:41 AM UTC+2, dxforth wrote:
    On 28/05/2022 04:38, Marcel Hendrix wrote:
    [..]
    In QSORT I use ( d-addr -- ) DUP 2@ ROT D!
    While you've posted others versions of Qsort, I couldn't find that one.
    Do you have a link?

    Sorry, it is bubble-s.frt:

    ...
    : bubble ( -- )
    cr ." bubbling... "
    #elements
    1 DO A-list #elements i - cells
    bounds DO i 2@ > IF i 2@ i D! THEN
    cell +LOOP
    LOOP ;

    That optimization is typical of forth bubble sort where adjacent cells can
    be exploited. Of course not everyone has D! so it's: I 2@ SWAP I 2@

    I notice the sequence

    ( a b ) OVER @ OVER @ 3 PICK ! OVER ! ( a b )

    was also used in my implementation of Hans' Circle sort:

    \ Circlesort addr/cells H.Bezemer
    defer PRECEDES ( x1 x2 -- f ) \ comparison

    -? variable s

    -? : c ( a1 a2 -- ) 2dup = if 2drop end
    2dup swap begin 2dup u> while
    over @ over @ precedes if
    over @ over @ 3 pick ! over ! s off
    then swap cell- swap cell+
    repeat rot 2over 2over - + > if 2swap then
    recurse recurse ;

    : SORT ( adr siz -- ) dup if 1- cells over +
    begin s on 2dup c s @ until then 2drop ; behead s c

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From S Jack@21:1/5 to dxforth on Mon May 30 19:36:07 2022
    On Monday, May 30, 2022 at 12:19:17 AM UTC-5, dxforth wrote:
    was also used in my implementation of Hans' Circle sort:

    \ Circlesort addr/cells H.Bezemer
    defer PRECEDES ( x1 x2 -- f ) \ comparison

    -? variable s

    -? : c ( a1 a2 -- ) 2dup = if 2drop end
    2dup swap begin 2dup u> while
    over @ over @ precedes if
    over @ over @ 3 pick ! over ! s off
    then swap cell- swap cell+
    repeat rot 2over 2over - + > if 2swap then
    recurse recurse ;

    : SORT ( adr siz -- ) dup if 1- cells over +
    begin s on 2dup c s @ until then 2drop ; behead s c

    H.Bezemer Circlesort addr/cells by Frog closure

    defer PRECEDES ( x1 x2 -- f ) \ comparison

    CREATE SORT ( adr siz -- )
    0 , \ var s
    :[ \ c ( a1 a2 -- )
    [ here p0! ] \ recurse to here
    2dup = if 2drop exit fi
    2dup swap
    begin 2dup u>
    while
    over @ over @
    precedes if
    over @ over @ 3 pick ! over !
    false dat !
    fi
    swap cell- swap cell+
    repeat
    rot 2over 2over - +
    > if 2swap fi
    [p0] [p0] \ recurse recurse
    ]: p0!
    MAIN
    dup if 1- cells over +
    begin
    true dat !
    2dup [p0]
    dat @ until
    fi
    2drop ;

    [s] testing

    create foodat 16 , 1 , 32 , 4 , 99 , 6 , 666 , 42 ,
    8 const #foodat

    ' < is PRECEDES
    foodat #foodat SORT
    i. { #foodat 0 do i foodat [] . loop } e ==> 1 4 6 16 32 42 99 666

    -fin-
    ok
    --
    me

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)