• assembly: mov a memory word into register

    From Mateusz Viste@21:1/5 to All on Fri Nov 12 14:35:35 2021
    I somehow got stuck on a simple quest: copying a word from memory into a register. Here's what I do:

    void myfunc(char *buff) {
    _asm {
    mov ax, [buff]
    }
    }

    This is inline assembly within OpenWatcom. My understanding so far was
    that:

    mov ax, buff ; copies buff (pointer) into AX
    mov ax, [buff] ; copies *buff (first word at memory location) into AX

    But that's not what happens now.

    Whether I use "mov ax, buff" or "mov ax, [buff]", the result is the
    same: AX gets the address of buff and never the value under it.

    What am I missing?

    I must add that it works when I do this:

    void myfunc(char *buff) {
    _asm {
    mov bx, buff
    mov ax, [bx]
    }
    }

    I'd like to understand why my first version isn't producing what I
    expect, though... Any ideas?


    Mateusz

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kerr-Mudd, John@21:1/5 to Mateusz Viste on Fri Nov 12 16:52:39 2021
    On Fri, 12 Nov 2021 14:35:35 +0100
    Mateusz Viste <mateusz@xyz.invalid> wrote:

    I somehow got stuck on a simple quest: copying a word from memory
    into a register. Here's what I do:

    void myfunc(char *buff) {
    _asm {
    mov ax, [buff]
    }
    }

    This is inline assembly within OpenWatcom. My understanding so far was
    that:

    mov ax, buff ; copies buff (pointer) into AX
    mov ax, [buff] ; copies *buff (first word at memory location) into AX

    But that's not what happens now.

    Whether I use "mov ax, buff" or "mov ax, [buff]", the result is the
    same: AX gets the address of buff and never the value under it.

    What am I missing?


    I can only agree with you; have you "upgraded" OpenWatcom?


    I might try
    mov ax, word [buff]
    but it shouldn't be necessary.

    then again, it might depend on how 'buff' is declared (Sorry, I'm
    not a C programmer).

    I must add that it works when I do this:

    void myfunc(char *buff) {
    _asm {
    mov bx, buff
    mov ax, [bx]
    }
    }

    I'd like to understand why my first version isn't producing what I
    expect, though... Any ideas?


    Mateusz



    --
    Bah, and indeed Humbug.

  • From Herbert Kleebauer@21:1/5 to Mateusz Viste on Fri Nov 12 17:53:00 2021
    On 12.11.2021 14:35, Mateusz Viste wrote:

    I somehow got stuck on a simple quest: copying a word from memory into a register. Here's what I do:

    void myfunc(char *buff) {
    _asm {
    mov ax, [buff]
    }
    }

    This is inline assembly within OpenWatcom. My understanding so far was
    that:

    mov ax, buff ; copies buff (pointer) into AX

    mov ax, [buff] ; copies *buff (first word at memory location) into AX

    There is no such x86 instruction. For indirect addressing you have to
    use a register.

  • From Mateusz Viste@21:1/5 to Herbert Kleebauer on Fri Nov 12 18:40:26 2021
    2021-11-12 at 17:53 +0100, Herbert Kleebauer wrote:
    On 12.11.2021 14:35, Mateusz Viste wrote:

    I somehow got stuck on a simple quest: copying a word from memory
    into a register. Here's what I do:

    void myfunc(char *buff) {
    _asm {
    mov ax, [buff]
    }
    }

    This is inline assembly within OpenWatcom. My understanding so far
    was that:

    mov ax, buff ; copies buff (pointer) into AX

    mov ax, [buff] ; copies *buff (first word at memory location) into AX

    There is no such x86 instruction. For indirect addressing you have to
    use a register.

    That is correct indeed, thanks. I have been confused by what "buff"
    truly is. It is still weird that buff and [buff] are the same, but
    that's certainly due to the magic introduced by the C compiler to
    reference C variables within the inline assembly block.

    To illustrate/understand the "issue", I wrote a little test program.


    ; print first char of cmdline tail

    cpu 8086
    org 0x100

    ; does not work (prints the character 0x82)
    mov ah, 2
    mov dl, [buff]
    int 0x21

    ; works (prints the character at location 0x82)
    mov ah, 2
    mov dl, [0x82]
    int 0x21

    ; game over
    mov ax, 0x4c00
    int 0x21

    buff dw 0x82




    Mateusz

  • From Mateusz Viste@21:1/5 to Herbert Kleebauer on Fri Nov 12 20:07:58 2021
    2021-11-12 at 19:19 +0100, Herbert Kleebauer wrote:
    mov ah, 2
    mov dl, [buff]

    This instruction doesn't exist, the compiler uses instead:

    mov dl, buff

    Well, no - in this specific example "mov dl, [buff]" does exist. It
    loads whatever is at the location pointed out by buff (here: the word
    0x82). But note that my previous program was an actual assembly
    listing, not an assembly-inside-C abomination like in the first post.

    "mov dl, [buff]" is, in fact, the same instruction as in "mov dl,
    [0x82]", it is just that the assembler substitutes "buff" by its offset
    at compile time:

    00000000 B402 mov ah,0x2
    00000002 8A161501 mov dl,[0x115]
    00000006 CD21 int 0x21
    00000008 B402 mov ah,0x2
    0000000A 8A168200 mov dl,[0x82]
    0000000E CD21 int 0x21
    00000010 B8004C mov ax,0x4c00
    00000013 CD21 int 0x21
    00000015 82 db 0x82
    00000016 00 db 0x00

    My foolish mistake was to expect a similar behavior within an inline
    assembly block when referencing a C pointer (which is different from an assembly offset and implies - as you correctly pointed out in your
    first reply - an extra layer of indirection).
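The extra layer of indirection can be modeled in portable C. This is only a rough sketch, not Watcom code: it treats the segment as a plain byte array and reuses the offsets from the listing above (the label buff at 0x115 holding the word 0x0082). The helper names load8, load16, via_label and via_pointer are invented for this illustration.

```c
#include <stdint.h>

/* Read one byte from the simulated segment. */
uint8_t load8(const uint8_t *mem, uint16_t addr) {
    return mem[addr];
}

/* Read a little-endian 16-bit word from the simulated segment. */
uint16_t load16(const uint8_t *mem, uint16_t addr) {
    return (uint16_t)(mem[addr] | ((unsigned)mem[(uint16_t)(addr + 1u)] << 8));
}

/* Assembler label: the address is a constant baked into the instruction,
 * so a single memory access fetches the byte stored at the label. */
uint8_t via_label(const uint8_t *mem, uint16_t label) {
    return load8(mem, label);
}

/* C pointer: the variable itself lives in memory and holds an address.
 * Reaching the pointed-to byte takes two accesses, like
 * "mov bx, buff" followed by "mov al, [bx]". */
uint8_t via_pointer(const uint8_t *mem, uint16_t var) {
    uint16_t bx = load16(mem, var); /* first access: fetch the pointer value */
    return load8(mem, bx);          /* second access: dereference it */
}
```

With mem[0x115] = 0x82 and mem[0x116] = 0x00, via_label(mem, 0x115) yields 0x82 itself, while via_pointer(mem, 0x115) yields whatever byte sits at offset 0x82. No single 8086 addressing mode performs both accesses, which is why the intermediate register is needed.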

    Mateusz

  • From Herbert Kleebauer@21:1/5 to All on Fri Nov 12 19:19:30 2021
    On 12.11.2021 18:40, Mateusz Viste wrote:

    I don't know Watcom inline assembly, but as I understand it from:

    https://open-watcom.github.io/open-watcom-v2-wikidocs/cguide.pdf


    To illustrate/understand the "issue", I wrote a little test program.


    ; print first char of cmdline tail

    cpu 8086
    org 0x100

    ; does not work (prints the character 0x82)
    mov ah, 2
    mov dl, [buff]

    This instruction doesn't exist, the compiler uses instead:

    mov dl, buff

    which stores the value 0x82 (the content of the variable buff) in dl.

    int 0x21

    ; works (prints the character at location 0x82)
    mov ah, 2
    mov dl, [0x82]

    This instruction exists, it loads the value stored at
    location 0x82 into dl.

    int 0x21

    ; game over
    mov ax, 0x4c00
    int 0x21

    buff dw 0x82

    Just take a look at the generated machine code to see what
    happens.

  • From Herbert Kleebauer@21:1/5 to Mateusz Viste on Fri Nov 12 22:18:05 2021
    On 12.11.2021 20:07, Mateusz Viste wrote:
    2021-11-12 at 19:19 +0100, Herbert Kleebauer wrote:
    mov ah, 2
    mov dl, [buff]

    This instruction doesn't exist, the compiler uses instead:

    mov dl, buff

    Well, no - in this specific example "mov dl, [buff]" does exist.

    Different assemblers use different syntax.

    mov ax, buff

    In Watcom inline assembly this moves the content of the variable
    buff into ax. But in NASM this instruction moves the address of
    the variable buff into ax. Therefore in NASM a "mov dl, [buff]"
    does exist (but not in Watcom) and means the same as "mov dl, buff"
    in Watcom.

    It
    loads whatever is at the location pointed out by buff (here: the word
    0x82). But note that my previous program was an actual assembly
    listing, not an assembly-inside-C abomination like in the first post.

    "mov dl, [buff]" is, in fact, the same instruction as in "mov dl,
    [0x82]", it is just that the assembler substitutes "buff" by its offset
    at compile time:

    00000000 B402 mov ah,0x2
    00000002 8A161501 mov dl,[0x115]
    00000006 CD21 int 0x21
    00000008 B402 mov ah,0x2
    0000000A 8A168200 mov dl,[0x82]
    0000000E CD21 int 0x21
    00000010 B8004C mov ax,0x4c00
    00000013 CD21 int 0x21
    00000015 82 db 0x82
    00000016 00 db 0x00

    My foolish mistake was to expect a similar behavior within an inline
    assembly block when referencing a C pointer (which is different from an assembly offset and implies - as you correctly pointed out in your
    first reply - an extra layer of indirection).

    Just compare it with the opcode generated by the Watcom inline assembler.

  • From Rod Pemberton@21:1/5 to Mateusz Viste on Sun Nov 14 20:53:43 2021
    On Fri, 12 Nov 2021 14:35:35 +0100
    Mateusz Viste <mateusz@xyz.invalid> wrote:

    I somehow got stuck on a simple quest:
    copying a word from memory into a register.

    You might need to declare "buff" as a C variable, with both a
    "volatile" keyword and as a pointer. E.g., perhaps (untested):

    volatile unsigned char *buff=0xb800;

    Then in the _asm{} section, the compiler should recognize "buff" as a C variable, use the assigned address, and not optimize away the code (due
    to the volatile):

    mov ax,[buff]

    Of course, AX won't be preserved into the C code as the C compiler
    likely uses the register, i.e., AX's value will be destroyed by C code.

    If you want to pass the AX value into the C code, you may need to use
    the "#pragma aux" format for OW to pass the value back to C, which
    should look something like (untested):

    volatile unsigned char *buff=0xb800;

    extern unsigned short myfunc(void);
    #pragma aux myfunc = \
    "mov ax, [buff]" \
    value [ax];

    int main(void)
    {
    ...
    printf("%04x\n",myfunc());
    ...
    }

    --
    Is Biden intentionally recreating Carter's legacy?

  • From Mateusz Viste@21:1/5 to Rod Pemberton on Mon Nov 15 09:31:18 2021
    2021-11-14 at 20:53 -0500, Rod Pemberton wrote:
    You might need to declare "buff" as a C variable, with both a
    "volatile" keyword and as a pointer. E.g., perhaps (untested):

    volatile unsigned char *buff=0xb800;

    Then in the _asm{} section, the compiler should recognize "buff" as a
    C variable, use the assigned address, and not optimize away the code
    (due to the volatile):

    mov ax,[buff]

    I'm sorry but I fail to see what problem you are trying to solve here.
    The compiler already knows that buff is a C variable and this variable
    can be used from within an inline asm block even without volatile.

    Herbert's reply made me realize that buff is neither a register nor a
    constant address, hence it cannot be used for indirect addressing. I was
    too eager to apply the assembly concept of a "variable" (which is, in
    fact, just a pre-calculated address offset) to a C pointer.

    The fact that "mov r, [buff]" and "mov r, buff" are equivalent for wasm
    didn't help to clear my initial confusion.

    Consider this program:

    #include <i86.h>
    #include <stdio.h>

    int main(void) {
    unsigned char *buff = "AB";
    unsigned short mybx = 0, mycx = 0, mydx = 0;

    _asm {
    mov cx, buff
    mov dx, [buff]

    mov bx, buff
    mov bx, [bx]

    mov mybx, bx
    mov mycx, cx
    mov mydx, dx
    }

    printf("mybx=%04X mycx=%04X mydx=%04X (buff=%p)\r\n",
    mybx, mycx, mydx, buff);
    return(0);
    }

    and its output:

    mybx=4241 mycx=0022 mydx=0022 (buff=022)


    Your volatile suggestion could be a solution if mybx, mycx, mydx were
    optimized away as zeroes, but that's not a problem that exists in this
    context. Here the problem was that I naively wanted to obtain the
    0x4241 ("AB") result without using an intermediary register for indirect addressing.
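For reference, the 0x4241 value follows from the 8086 being little-endian: the byte at the lower address ('A', 0x41) becomes the low half of the word and the next byte ('B', 0x42) the high half. A small portable C sketch of the same word load, assuming ASCII; the function name load_word_le is made up for this example:

```c
#include <stdint.h>

/* Assemble a 16-bit value the way an 8086 word load does:
 * the byte at the lower address becomes the low half of the word. */
uint16_t load_word_le(const unsigned char *p) {
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}
```

Calling load_word_le on the bytes of "AB" gives 0x4241, matching the mybx column in the output above.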

    Both C and assembly can be tricky, but both mixed together lead to
    entirely new classes of troubles.

    Mateusz

  • From Herbert Kleebauer@21:1/5 to All on Mon Nov 15 12:10:46 2021
    On 15.11.2021 09:31, Mateusz Viste wrote:

    Does the OpenWatcom assembler allow [] to be used as ordinary
    parentheses, as in [3+4]*5? Otherwise

    mov dx, [buff]

    should give an error "illegal addressing mode" instead of
    just ignoring the [].

  • From Mateusz Viste@21:1/5 to Herbert Kleebauer on Mon Nov 15 13:02:23 2021
    2021-11-15 at 12:10 +0100, Herbert Kleebauer wrote:
    Does the OpenWatcom assembler allow [] to be used as ordinary
    parentheses, as in [3+4]*5?

    Good point, apparently it does. I wasn't expecting such a possibility.

    Therefore, the [] notation can mean either "parentheses" or
    "dereference", depending on the context.

    This:

    _asm {
    mov dx, [3+4]*5
    mov cx, buff
    mov dx, [buff]
    }

    Translates to that:

    BA 23 00 mov dx,0x0023
    8B 8E F8 FF mov cx,word ptr -0x8[bp]
    8B 96 F8 FF mov dx,word ptr -0x8[bp]


    Makes coding in this asm dialect even more "challenging"...
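The bytes in the listing above can be checked by hand. Opcode 8B is "mov r16, r/m16", and the byte after it is a ModR/M byte split into three fields: mod (2 bits), reg (3 bits) and r/m (3 bits). The C sketch below only illustrates that split; the names decode_modrm and disp16 are made up for this example.

```c
#include <stdint.h>

/* The three fields packed into an x86 ModR/M byte. */
struct modrm {
    unsigned mod; /* bits 7..6: addressing mode */
    unsigned reg; /* bits 5..3: register operand (0=AX, 1=CX, 2=DX, 3=BX) */
    unsigned rm;  /* bits 2..0: base of the memory operand */
};

/* Split a ModR/M byte into its fields. */
struct modrm decode_modrm(uint8_t b) {
    struct modrm m;
    m.mod = b >> 6;
    m.reg = (b >> 3) & 7u;
    m.rm = b & 7u;
    return m;
}

/* Interpret two little-endian bytes as a signed 16-bit displacement. */
int disp16(uint8_t lo, uint8_t hi) {
    unsigned v = lo | ((unsigned)hi << 8);
    return v >= 0x8000u ? (int)(v - 0x8000u) - 0x8000 : (int)v;
}
```

For 8E this gives mod=2, reg=1 (CX), rm=6, and for 96 it gives mod=2, reg=2 (DX), rm=6. In 16-bit addressing, mod=2 with rm=6 means [BP + disp16], and disp16(0xF8, 0xFF) is -8. Both instructions therefore load the word stored in the stack slot of buff, i.e. the pointer value, and differ only in the destination register.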


    Mateusz

  • From Ross Ridge@21:1/5 to mateusz@xyz.invalid on Mon Nov 15 16:53:53 2021
    Mateusz Viste <mateusz@xyz.invalid> wrote:
    That is correct indeed, thanks. I have been confused by what "buff"
    truly is. It is still weird that buff and [buff] are the same, but
    that's certainly due to the magic introduced by the C compiler to
    reference C variables within the inline assembly block.

    Watcom is apparently using MASM syntax where for the most part square
    brackets don't have much meaning. The exception is when a register is
    used inside the brackets, and so "bx" and "[bx]" mean two different
    things, while "foo" and "[foo]" mean the same thing. What "foo" and
    "[foo]" mean depends on how "foo" was defined.

    I wrote an answer on Stack Overflow that gives some examples on how MASM
    treats square brackets:

    https://stackoverflow.com/questions/25129743/confusing-brackets-in-masm32

    --
    l/ // Ross Ridge -- The Great HTMU
    [oo][oo] rridge@csclub.uwaterloo.ca
    -()-/()/ http://www.csclub.uwaterloo.ca/~rridge/
    db //
