• reading a file from disk

    From luser.droog@nospicedham.gmail.com@21:1/5 to All on Sat May 15 22:39:02 2021
    Hello all,

    I'm planning to resume work on my partly written 8086 emulator after a
    long hiatus. I want to add the ability to read a file, but I'm having some difficulty
    understanding how it's supposed to work under MS-DOS. I've found the
    listing of int 13h in Ralf Brown's Interrupt List (http://www.ctyme.com/intr/cat-003.htm)
    but it all seems very complicated and perhaps unnecessary.

    For the simplest working test, I think I can skip the CHS addressing and
    just use Logical Block Addressing with a single "disk" file on the host.
    Is there a good resource to understand how this all should work?
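
    For reference, the arithmetic behind treating one host file as the "disk" is small. The following is a minimal sketch, not the emulator's actual code: the names `geometry`, `chs_to_lba` and `read_sectors` are hypothetical, and the 512-byte sector size is an assumption that matches standard PC floppies and hard disks.

```c
#include <stdint.h>
#include <stdio.h>

enum { SECTOR_SIZE = 512 };   /* assumed sector size */

/* Hypothetical disk geometry; a 1.44 MB floppy would be 2 heads, 18 sectors. */
struct geometry { unsigned heads, sectors_per_track; };

/* Convert a cylinder/head/sector triple (sector is 1-based) to a 0-based LBA. */
static uint32_t chs_to_lba(const struct geometry *g,
                           unsigned cyl, unsigned head, unsigned sector)
{
    return ((uint32_t)cyl * g->heads + head) * g->sectors_per_track
           + (sector - 1);
}

/* Read `count` sectors starting at `lba` from the host image file into `dst`.
   Returns the number of whole sectors actually read. */
static size_t read_sectors(FILE *image, uint32_t lba, size_t count, void *dst)
{
    if (fseek(image, (long)lba * SECTOR_SIZE, SEEK_SET) != 0)
        return 0;
    return fread(dst, SECTOR_SIZE, count, image);
}
```

    With this, a CHS-style int 13h read reduces to one conversion followed by a seek and a read on the image file, so supporting both addressing styles costs very little.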

    I need to implement the BIOS routines and call host functions, probably
    just mmap'ing the file and using memcpy for both read and write.
    I have this sort of thing partly working for keyboard read and screen
    write by using ESC instructions in the BIOS routines. The emulator
    implements the ESC instructions to call host functions getchar() and
    putchar().

    My emulator code is at https://github.com/luser-dr00g/8086
    with some overview and explanations in https://github.com/luser-dr00g/8086/pres

    TIA
    --
    droog

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From luser.droog@nospicedham.gmail.com@21:1/5 to luser...@nospicedham.gmail.com on Sun May 23 23:15:23 2021
    On Sunday, May 16, 2021 at 3:48:21 PM UTC-5, luser...@nospicedham.gmail.com wrote:
    > Hello all,
    >
    > I'm planning to resume work on my partly written 8086 emulator after a
    > long hiatus. I want to add the ability to read a file, but I'm having some difficulty
    > [...]
    > Is there a good resource to understand how this all should work?

    I suppose this isn't really related to assembly language. I've ordered Peter Norton's
    Guide to the IBM PC. I guess I'll post in comp.os.msdos.programmer if I run into
    trouble.

  • From antispam@nospicedham.math.uni.wroc.@21:1/5 to luser...@nospicedham.gmail.com on Tue May 25 01:26:11 2021
    luser...@nospicedham.gmail.com <luser.droog@nospicedham.gmail.com> wrote:
    > Hello all,
    >
    > I'm planning to resume work on my partly written 8086 emulator after a
    > long hiatus. I want to add the ability to read a file, but I'm having some difficulty
    > understanding how it's supposed to work under MS-DOS. I've found the
    > listing of int 13h in Ralf Brown's Interrupt List (http://www.ctyme.com/intr/cat-003.htm)
    > but it all seems very complicated and perhaps unnecessary.
    >
    > For the simplest working test, I think I can skip the CHS addressing and
    > just use Logical Block Addressing with a single "disk" file on the host.
    > Is there a good resource to understand how this all should work?
    >
    > I need to implement the BIOS routines and call host functions, probably
    > just mmap'ing the file and using memcpy for both read and write.
    > I have this sort of thing partly working for keyboard read and screen
    > write by using ESC instructions in the BIOS routines. The emulator
    > implements the ESC instructions to call host functions getchar() and
    > putchar().

    The basic question is: what do you want to emulate? The 8086 by itself
    cannot do file I/O. You may do a PC emulator in the style of Bochs, that is,
    emulate common hardware. You may do a BIOS emulator. You may
    do a DOS emulator, that is, emulate file I/O at the DOS level. Or
    you may emulate a different system. For example, QEMU 386 emulates
    Linux system calls. If you want your emulator to be simple,
    you may define your own system calls (say, using something like
    INT 0x80), things like open, close, read, write. The advantage is simplicity.
    The disadvantage is that the large body of existing 8086 assembler programs
    assumes a DOS environment and will not work with a different system.
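
    To make the "own system calls" option concrete, here is a minimal sketch of a host-side dispatcher. Everything in it is an assumption made for illustration: the `regs` layout, the call numbers, the register conventions (AX selects the call, BX is a file slot, CX a byte count, DX a guest address), and the use of the carry flag to signal failure. Only open, close and read are shown.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical register file: only what the sketch needs. */
struct regs { uint16_t ax, bx, cx, dx; int carry; };

enum { SYS_OPEN = 0, SYS_CLOSE = 1, SYS_READ = 2, MAX_FILES = 8 };

static FILE *files[MAX_FILES];

/* Called by the emulator when the guest executes the chosen software
   interrupt. `mem` is the guest's memory as a flat host array. */
static void host_syscall(struct regs *r, unsigned char *mem)
{
    r->carry = 0;
    switch (r->ax) {
    case SYS_OPEN:   /* DX -> NUL-terminated path; returns the slot in AX */
        for (unsigned i = 0; i < MAX_FILES; i++) {
            if (!files[i]) {
                files[i] = fopen((const char *)mem + r->dx, "rb");
                if (files[i]) { r->ax = (uint16_t)i; return; }
                break;           /* fopen failed: report an error */
            }
        }
        r->carry = 1;
        return;
    case SYS_CLOSE:
        if (r->bx < MAX_FILES && files[r->bx]) {
            fclose(files[r->bx]);
            files[r->bx] = NULL;
            return;
        }
        r->carry = 1;
        return;
    case SYS_READ:   /* returns bytes read in AX; 0 means end of file */
        if (r->bx < MAX_FILES && files[r->bx]) {
            r->ax = (uint16_t)fread(mem + r->dx, 1, r->cx, files[r->bx]);
            return;
        }
        r->carry = 1;
        return;
    default:
        r->carry = 1;            /* unknown call number */
    }
}
```

    The whole interface fits in one switch statement, which is the simplicity being traded against compatibility with existing DOS programs.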

    --
    Waldek Hebisch

  • From luser.droog@nospicedham.gmail.com@21:1/5 to anti...@nospicedham.math.uni.wroc.p on Wed May 26 15:33:53 2021
    On Monday, May 24, 2021 at 8:40:27 PM UTC-5, anti...@nospicedham.math.uni.wroc.pl wrote:
    > luser...@nospicedham.gmail.com <luser...@nospicedham.gmail.com> wrote:
    >> Hello all,
    >>
    >> I'm planning to resume work on my partly written 8086 emulator after a
    >> long hiatus. I want to add the ability to read a file, but I'm having some difficulty
    >> understanding how it's supposed to work under MS-DOS. I've found the
    >> listing of int 13h in Ralf Brown's Interrupt List (http://www.ctyme.com/intr/cat-003.htm)
    >> but it all seems very complicated and perhaps unnecessary.
    >>
    >> For the simplest working test, I think I can skip the CHS addressing and
    >> just use Logical Block Addressing with a single "disk" file on the host.
    >> Is there a good resource to understand how this all should work?
    >>
    >> I need to implement the BIOS routines and call host functions, probably
    >> just mmap'ing the file and using memcpy for both read and write.
    >> I have this sort of thing partly working for keyboard read and screen
    >> write by using ESC instructions in the BIOS routines. The emulator
    >> implements the ESC instructions to call host functions getchar() and
    >> putchar().
    >
    > The basic question is: what do you want to emulate? The 8086 by itself
    > cannot do file I/O. You may do a PC emulator in the style of Bochs, that is,
    > emulate common hardware. You may do a BIOS emulator. You may
    > do a DOS emulator, that is, emulate file I/O at the DOS level. Or
    > you may emulate a different system. For example, QEMU 386 emulates
    > Linux system calls. If you want your emulator to be simple,
    > you may define your own system calls (say, using something like
    > INT 0x80), things like open, close, read, write. The advantage is simplicity.
    > The disadvantage is that the large body of existing 8086 assembler programs
    > assumes a DOS environment and will not work with a different system.


    Very good points. Thanks. You're right that the 8086 by itself only has
    in/out instructions, ESC instructions, and memory-mapped I/O, all of which
    depend on the rest of the system. It appears that the IBM BIOS operates
    at the very low level of cylinders, heads, and sectors. But the DOS functions
    appear to map pretty closely to stdio.h functions. So that seems to be
    the easiest path forward.
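
    The mapping is close but not exact, and the open mode is one place where it shows. As a minimal sketch (the function name is hypothetical): the low three bits of AL for INT 21h/AH=3Dh select read-only (0), write-only (1) or read/write (2), and opening an existing file never truncates it, so "w" is not a faithful translation of write-only.

```c
#include <stddef.h>

/* Map the access bits of AL (INT 21h/AH=3Dh) to an fopen() mode string.
   Returns NULL for access values other than 0, 1 or 2. */
static const char *dos_access_to_fopen_mode(unsigned al)
{
    switch (al & 7) {
    case 0:  return "rb";    /* read only */
    case 1:  return "r+b";   /* write only: "wb" would truncate the file */
    case 2:  return "r+b";   /* read/write */
    default: return NULL;
    }
}
```

    Using "r+b" for write-only is looser than DOS, since the host stream also permits reads, but it keeps the existing file contents intact.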

    My goal at this stage is just to get some kind of read/write ability so
    the Forth interpreter can read Forth source from a file. So far, all my
    Forth code is written in a sort of "pre-compiled" form directly in the C
    code that implements the CPU emulator.

  • From James Harris@21:1/5 to luser...@nospicedham.gmail.com on Thu May 27 07:48:55 2021
    On 26/05/2021 23:33, luser...@nospicedham.gmail.com wrote:
    > On Monday, May 24, 2021 at 8:40:27 PM UTC-5, anti...@nospicedham.math.uni.wroc.pl wrote:
    >> luser...@nospicedham.gmail.com <luser...@nospicedham.gmail.com> wrote:

    ...

    >> The basic question is: what do you want to emulate? The 8086 by itself
    >> cannot do file I/O. You may do a PC emulator in the style of Bochs, that is,
    >> emulate common hardware. You may do a BIOS emulator. You may
    >> do a DOS emulator, that is, emulate file I/O at the DOS level. Or
    >> you may emulate a different system. For example, QEMU 386 emulates
    >> Linux system calls. If you want your emulator to be simple,
    >> you may define your own system calls (say, using something like
    >> INT 0x80), things like open, close, read, write. The advantage is simplicity.
    >> The disadvantage is that the large body of existing 8086 assembler programs
    >> assumes a DOS environment and will not work with a different system.
    >
    >
    > Very good points. Thanks. You're right that the 8086 by itself only has
    > in/out instructions, ESC instructions, and memory-mapped I/O, all of which
    > depend on the rest of the system. It appears that the IBM BIOS operates
    > at the very low level of cylinders, heads, and sectors. But the DOS functions
    > appear to map pretty closely to stdio.h functions. So that seems to be
    > the easiest path forward.

    There's a guy on alt.os.development who has been posting about very much
    the same sorts of thing. Maybe there's value in the two of you getting
    in touch, if you aren't already.


    --
    James Harris

  • From Rod Pemberton@21:1/5 to All on Thu May 27 02:07:55 2021
    On Wed, 26 May 2021 15:33:53 -0700 (PDT)
    "luser...@nospicedham.gmail.com" <luser.droog@nospicedham.gmail.com>
    wrote:

    <follow ups set to comp.lang.forth, from comp.lang.asm.x86>

    > My goal at this stage is just to get some kind of read/write ability
    > so the Forth interpreter can read Forth source from a file. So far,
    > all my Forth code is written in a sort of "pre-compiled" form
    > directly in the C code that implements the CPU emulator.

    Since your Forth interpreter is coded in C, you might start by using
    custom Forth words for standard C file I/O functions. Over time, you
    could implement modern standard Forth words for loading a file, by
    transforming and rewriting the custom words, as these mostly match C's
    functionality. Personally, I'd avoid loading blocks of ancient text
    screens like fig-Forth, unless you already have the functionality.
    E.g., set an ANS Forth word like OPEN-FILE to C's fopen() so you can
    build other ANS Forth file I/O words like INCLUDED, INCLUDE-FILE, etc.
    You might be able to do this by setting the CFA for a primitive (or
    low-level Forth word) with the address of the C function. E.g., if you
    have some Forth words coded in C (or assembly), you should be able to
    do this.

    http://lars.nocrew.org/dpans/dpans11.htm#11.6.1.1718
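
    The CFA suggestion can be sketched in C as below. This is only an illustration under stated assumptions: the stack, the `word` struct and `open_file_prim` are hypothetical and do not come from the emulator in question, and the file access method is passed as a small index (0 for read-only, 1 for read/write) rather than whatever value a real R/O or R/W would push. The stack effect follows the ANS definition, OPEN-FILE ( c-addr u fam -- fileid ior ).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical minimal Forth machine: a data stack of cells. */
typedef intptr_t cell;
static cell stack[64];
static int sp = 0;
static void push(cell v) { stack[sp++] = v; }
static cell pop(void)    { return stack[--sp]; }

/* A dictionary entry whose code field is simply a C function pointer. */
struct word { const char *name; void (*code)(void); };

/* OPEN-FILE ( c-addr u fam -- fileid ior ) implemented on top of fopen(). */
static void open_file_prim(void)
{
    static const char *const modes[] = { "rb", "r+b" };
    cell fam = pop();
    cell len = pop();
    const char *addr = (const char *)pop();
    char path[256];
    FILE *f = NULL;
    if (len >= 0 && (size_t)len < sizeof path && fam >= 0 && fam < 2) {
        memcpy(path, addr, (size_t)len);  /* Forth strings lack a NUL */
        path[len] = '\0';
        f = fopen(path, modes[fam]);
    }
    push((cell)f);       /* fileid */
    push(f ? 0 : -1);    /* ior: zero means success */
}

static const struct word dictionary[] = {
    { "OPEN-FILE", open_file_prim },
};
```

    Once the inner interpreter can call through `code`, words such as READ-LINE and CLOSE-FILE follow the same pattern, and INCLUDED can then be written in Forth on top of them.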

    --
    The SALT deduction is a kickback of taxes to wealthy people in wealthy
    states.

  • From luser.droog@nospicedham.gmail.com@21:1/5 to James Harris on Sat May 29 12:19:08 2021
    On Thursday, May 27, 2021 at 2:02:52 AM UTC-5, James Harris wrote:
    > On 26/05/2021 23:33, luser...@nospicedham.gmail.com wrote:
    >
    >> Very good points. Thanks. You're right that the 8086 by itself only has
    >> in/out instructions, ESC instructions, and memory-mapped I/O, all of which
    >> depend on the rest of the system. It appears that the IBM BIOS operates
    >> at the very low level of cylinders, heads, and sectors. But the DOS functions
    >> appear to map pretty closely to stdio.h functions. So that seems to be
    >> the easiest path forward.
    > There's a guy on alt.os.development who has been posting about very much
    > the same sorts of thing. Maybe there's value in the two of you getting
    > in touch, if you aren't already.


    Thanks. I have read some of his postings in comp.lang.c and have just
    now started to browse some in AOD. We have a similar set of interests
    but diverge on many of the details. E.g., I'm using C99 tools[1] rather than
    C90 and targeting just the 8086 for now. I still haven't implemented the full
    instruction set, just the instructions initially required for the
    codegolf.stackexchange.com challenge and the additional ones needed to get
    the Forth up and running.
    So, while we're both working in the same sort of space, we're on different
    peaks of the mountain range.

    [1] I'm pretty much addicted to the C99 designated initializers and variable
    argument macros. I can't really imagine doing without those for hobby work.
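
    For readers who have not used the two C99 features named in the footnote, here is a small illustration in an emulator-flavoured setting. The `op` struct, the `optable` contents and the `TRACE` macro are hypothetical and are not taken from the emulator's source; the three opcodes shown are standard 8086 encodings.

```c
#include <stdio.h>

/* Hypothetical opcode table entry. */
struct op { const char *name; int operands; };

/* Designated initializers: index the table by opcode byte and name the
   fields; every entry not mentioned is zero-initialized. */
static const struct op optable[256] = {
    [0x90] = { .name = "NOP", .operands = 0 },
    [0xF4] = { .name = "HLT", .operands = 0 },
    [0xCD] = { .name = "INT", .operands = 1 },
};

/* Variadic macro: a trace helper that forwards its arguments to fprintf. */
#define TRACE(...) fprintf(stderr, __VA_ARGS__)

static const char *op_name(unsigned char opcode)
{
    const char *name = optable[opcode].name;
    if (!name)
        TRACE("unknown opcode %02X\n", (unsigned)opcode);
    return name ? name : "???";
}
```

    In C90 the same table would need all 256 entries written out in order, and the trace helper would have to be a real variadic function.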

  • From luser.droog@nospicedham.gmail.com@21:1/5 to luser...@nospicedham.gmail.com on Thu Jun 17 18:35:40 2021
    On Monday, May 24, 2021 at 1:22:49 AM UTC-5, luser...@nospicedham.gmail.com wrote:
    > On Sunday, May 16, 2021 at 3:48:21 PM UTC-5, luser...@nospicedham.gmail.com wrote:
    >> Hello all,
    >>
    >> I'm planning to resume work on my partly written 8086 emulator after a
    >> long hiatus. I want to add the ability to read a file, but I'm having some difficulty
    >> [...]
    >> Is there a good resource to understand how this all should work?
    > I suppose this isn't really related to assembly language. I've ordered Peter Norton's
    > Guide to the IBM PC. I guess I'll post in comp.os.msdos.programmer if I run into
    > trouble.

    For posterity, Norton's Guide really seems to be the perfect book to learn all about
    this stuff. It looks like I want to bypass DOS 1.0 stuff, too, and go straight for the DOS 2.0
    additions to have a file handle and less fiddly business.

  • From luser.droog@nospicedham.gmail.com@21:1/5 to luser...@nospicedham.gmail.com on Sat Jul 3 21:17:24 2021
    On Thursday, June 17, 2021 at 8:43:57 PM UTC-5, luser...@nospicedham.gmail.com wrote:
    > On Monday, May 24, 2021 at 1:22:49 AM UTC-5, luser...@nospicedham.gmail.com wrote:
    >> On Sunday, May 16, 2021 at 3:48:21 PM UTC-5, luser...@nospicedham.gmail.com wrote:
    >>> Hello all,
    >>>
    >>> I'm planning to resume work on my partly written 8086 emulator after a
    >>> long hiatus. I want to add the ability to read a file, but I'm having some difficulty
    >>> [...]
    >>> Is there a good resource to understand how this all should work?
    >> I suppose this isn't really related to assembly language. I've ordered Peter Norton's
    >> Guide to the IBM PC. I guess I'll post in comp.os.msdos.programmer if I run into
    >> trouble.
    > For posterity, Norton's Guide really seems to be the perfect book to learn all about
    > this stuff. It looks like I want to bypass DOS 1.0 stuff, too, and go straight for the DOS 2.0
    > additions to have a file handle and less fiddly business.

    So I read the whole book except some of the video and keyboard stuff that I may not need
    (or at least don't need yet).

    Here's my rough draft of DOS support functions. The vv array is the payload of the ESC
    bytes from the ESC instruction. My interrupt handlers fill it in with the interrupt number.
    U is a uintptr_t.

    It needs to do more error checking and reporting, but this should be enough to access
    files from my Forth CODEs. Maybe exiting a DOS program ought not to exit() the whole
    emulator.
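
    One way to stop the guest without leaving the emulator is to unwind from the terminate handler back to the run loop with setjmp/longjmp. This is a minimal sketch under assumptions: `run_guest`, `terminate_guest` and `demo_step` are hypothetical names, and `demo_step` merely stands in for the real fetch/decode/execute step.

```c
#include <setjmp.h>

/* Hypothetical emulator state for the sketch. */
static jmp_buf guest_exit;
static int guest_status;

/* What an AH=4Ch handler could do instead of calling exit(). */
static void terminate_guest(int status)
{
    guest_status = status;
    longjmp(guest_exit, 1);
}

/* Run `step` repeatedly until the guest asks to terminate, then return
   the guest's exit status to the caller. */
static int run_guest(void (*step)(void))
{
    if (setjmp(guest_exit) == 0) {
        for (;;)
            step();
    }
    return guest_status;   /* reached via longjmp */
}

/* Example guest: "runs" three steps, then terminates with status 7. */
static void demo_step(void)
{
    static int n;
    if (++n == 3)
        terminate_guest(7);
}
```

    Calling `run_guest(demo_step)` returns 7 to the host, which is then free to close leftover file handles or load another program. A plain "halted" flag tested in the run loop would work as well and avoids longjmp entirely.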


    static int keyboard_input_with_echo(){
        bput(al, fgetc(stdin)); // the host terminal provides the echo
        return 0;
    }

    static int display_output(){
        fputs( cp437tounicode( bget(dl) ), stdout );
        bput(al, bget(dl)=='\t'? ' ': bget(dl)); // AL = last character written
        return 0;
    }

    static int display_string(){
        U f = ds_(dx); // string is at DS:DX, addressed as in open_file()
        while(mem[f]!='$')fputs( cp437tounicode( mem[f++] ), stdout );
        bput(al,'$');
        return 0;
    }

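    static int get_date(){
        time_t t=time(NULL);struct tm*tm=localtime(&t);
        wput(cx,tm->tm_year+1900); // tm_year counts from 1900
        bput(dh,tm->tm_mon+1);     // tm_mon is 0..11, DOS wants 1..12
        bput(dl,tm->tm_mday);
        bput(al,tm->tm_wday);      // 0 = Sunday in both conventions
        return 0;
    }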

    static int get_time(){
        struct timeval tv;gettimeofday(&tv,0);
        time_t t=tv.tv_sec;struct tm*tm=localtime(&t);
        bput(ch,tm->tm_hour);
        bput(cl,tm->tm_min);
        bput(dh,tm->tm_sec);
        bput(dl,tv.tv_usec/10000); // hundredths of a second
        return 0;
    }

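    static int open_file(){
        U mode = bget(al);
        // fopen() has no "rw" mode; "r+" opens an existing file for update
        // without truncating it, which "w" would do.
        FILE *f = fopen((char *)mem + ds_(dx), (mode & 7) == 0? "r": "r+");
        if( f ){
            U handle = next_handle ++;
            handles[ handle ] = f;
            wput(ax, handle);
            clc();
            return 0;
        }
        wput(ax, 2); //file not found
        stc();
        return 0;
    }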

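    static int close_file_handle(){
        U handle = wget(bx);
        fclose( handles[ handle ] );
        handles[ handle ] = 0;
        // next_handle is not decremented: doing so after closing anything
        // but the newest handle would hand out a slot that is still in use.
        clc();
        return 0;
    }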

    static int read_file(){
        U handle = wget(bx);
        U count = fread(mem + ds_(dx), 1, wget(cx), handles[ handle ]);
        if( !ferror( handles[ handle ] ) ){
            wput(ax, count); // 0 bytes with carry clear means end of file
            clc();
            return 0;
        }
        wput(ax, 5); //access denied
        stc();
        return 0;
    }

    static int write_file(){
        U handle = wget(bx);
        U count = fwrite(mem + ds_(dx), 1, wget(cx), handles[ handle ]);
        if( count == wget(cx) ){
            wput(ax, count); // AX = bytes actually written
            clc();
            return 0;
        }
        wput(ax, 5); //access denied
        stc();
        return 0;
    }

    static int move_file_pointer(){
        U handle = wget(bx);
        U whence = bget(al);
        if( whence > 2 || fseek( handles[ handle ], qget(cx,dx),
                whence == 0? SEEK_SET:
                whence == 1? SEEK_CUR: SEEK_END) ){
            wput(ax, 1); //invalid function
            stc();
            return 0;
        }
        U pos = ftell( handles[ handle ] );
        qput(dx, ax, pos); // new position in DX:AX
        clc();
        return 0;
    }

    static int dos( UC vv[7] ){
        switch(bget(ah)){
        CASE 0x01: return keyboard_input_with_echo();
        CASE 0x02: return display_output();
        CASE 0x09: return display_string();

        CASE 0x2A: return get_date();
        CASE 0x2C: return get_time();

        CASE 0x3C: // create file
        CASE 0x3D: return open_file();
        CASE 0x3E: return close_file_handle();
        CASE 0x3F: return read_file();
        CASE 0x40: return write_file();
        CASE 0x42: return move_file_pointer();
        CASE 0x44: // ioctl
        CASE 0x4B: // load/execute program
        CASE 0x4C: exit(bget(al));
        CASE 0x5B: // create new file
            ;
        }
        return 0; // reached when no handler above returned
    }
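
    A single counter for handle numbers gets awkward once files are closed out of order. As a minimal sketch of an alternative, the table below hands out the lowest free slot. The names are hypothetical. Starting at 5 mirrors DOS, where handles 0 to 4 are the predefined standard devices; the table size of 20 is an assumption modelled on the usual DOS per-process limit; 6 is the DOS "invalid handle" error code.

```c
#include <stdio.h>

enum { MAX_HANDLES = 20, FIRST_FILE_HANDLE = 5 };

static FILE *handle_table[MAX_HANDLES];

/* Store `f` in the lowest free slot and return its handle number,
   or -1 if the table is full. */
static int handle_alloc(FILE *f)
{
    for (int h = FIRST_FILE_HANDLE; h < MAX_HANDLES; h++) {
        if (!handle_table[h]) {
            handle_table[h] = f;
            return h;
        }
    }
    return -1;
}

/* Look up a handle; returns NULL for out-of-range or closed handles. */
static FILE *handle_get(unsigned h)
{
    return h < MAX_HANDLES ? handle_table[h] : NULL;
}

/* Close and release a handle. Returns 0 on success, 6 (invalid handle)
   if the handle is not open. */
static int handle_free(unsigned h)
{
    FILE *f = handle_get(h);
    if (!f)
        return 6;
    fclose(f);
    handle_table[h] = NULL;
    return 0;
}
```

    With this in place the read, write and seek handlers can call `handle_get()` and set the carry flag with AX = 6 when it returns NULL, instead of passing a null pointer to stdio.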


    And this bit is silly but fun:

    unsigned cp437table[256] = {
    ' ',   0x263A,0x263B,0x2665,0x2666,0x2663,0x2660,0x2022,
    0x25D8,0x25CB,0x25D9,0x2642,0x2640,0x266A,0x266B,0x263C,
    0x25BA,0x25C4,0x2195,0x203C,0x00B6,0x00A7,0x25AC,0x21A8,
    0x2191,0x2193,0x2192,0x2190,0x221F,0x2194,0x25B2,0x25BC,
    32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
    48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
    64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
    80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
    96, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,
    112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,0x2302,
    0xC7,0xFC,0xE9,0xE2,0xE4,0xE0,0xE5,0xE7,0xEA,0xEB,0xE8,0xEF,0xEE,0xEC,0xC4,0xC5,
    0xC9,0xE6,0xC6,0xF4,0xF6,0xF2,0xFB,0xF9,0xFF,0xD6,0xDC,0xA2,0xA3,0xA5,0x20A7,0x192,
    0xE1,0xED,0xF3,0xFA,0xF1,0xD1,0xAA,0xBA,0xBF,0x2310,0xAC,0xBD,0xBC,0xA1,0xAB,0xBB,
    0x2591,0x2592,0x2593,0x2502,0x2524,0x2561,0x2562,0x2556,
    0x2555,0x2563,0x2551,0x2557,0x255D,0x255C,0x255B,0x2510,
    0x2514,0x2534,0x252C,0x251C,0x2500,0x253C,0x255E,0x255F,
    0x255A,0x2554,0x2569,0x2566,0x2560,0x2550,0x256C,0x2567,
    0x2568,0x2564,0x2565,0x2559,0x2558,0x2552,0x2553,0x256B,
    0x256A,0x2518,0x250C,0x2588,0x2584,0x258C,0x2590,0x2580,
    0x3B1,0xDF,0x393,0x3C0,0x3A3,0x3C3,0x3BC,0x3C4,
    0x3A6,0x398,0x3A9,0x3B4,0x221E,0x3C6,0x3B5,0x2229,
    0x2261,0xB1,0x2265,0x2264,0x2320,0x2321,0xF7,0x2248,
    0xB0,0x2219,0xB7,0x221A,0x207F,0xB2,0x25A0,0xA0
    };

    static
    char *cp437tounicode( unsigned int c ){
        static char buf[4] = ""; // up to 3 UTF-8 bytes plus the terminator
        unsigned ucs4 = cp437table[ c & 0xFF ];
        if( ucs4 < 0x80 ){ // 0... ....
            buf[0] = ucs4;
            buf[1] = 0;
        } else
        if( ucs4 < 0x800 ){ // 110. .... 10.. ....
            buf[0] = 0xC0 | (ucs4 >> 6);
            buf[1] = 0x80 | (ucs4 & 0x3F);
            buf[2] = 0;
        } else
        if( ucs4 < 0x10000 ){ // 1110 .... 10.. .... 10.. ....
            buf[0] = 0xE0 | ((ucs4 >> 12) & 0xF);
            buf[1] = 0x80 | ((ucs4 >> 6) & 0x3F);
            buf[2] = 0x80 | (ucs4 & 0x3F);
            buf[3] = 0;
        }
        return buf;
    }

  • From Frank Kotler@21:1/5 to luser...@nospicedham.gmail.com on Sun Jul 4 01:11:10 2021
    On 07/04/2021 12:17 AM, luser...@nospicedham.gmail.com wrote:
    ...
    > I suppose this isn't really related to assembly language.

    No...

    I hate to chase you away, where you're doing low level stuff, but keep
    it in assembly, okay?

    If you post to multiple groups and I reject it, it won't post at all, so don't...

    Best,
    Frank
    {moderator}
