• [OpenWatcom] building a COM file without pulling in the Watcom standard

    From Mateusz Viste@21:1/5 to All on Mon Jul 31 17:28:58 2023
    Hello all,

    I am experimenting with OpenWatcom, trying to compile a simple C
    file into a COM executable. The tricky part is that I'd like to avoid
    pulling in watcom's standard library and startup code in the process.

    I have this TEST.C file:

    void main(void) {
      static char *hello = "Hello$";
      _asm {
        mov ah, 9
        mov dx, hello
        int 0x21
      }
    }

    I compile it into an object and pass to wlink using these commands:

    wcc -0 -ms -od -s -d0 -zl -zls test.c
    wlink @TEST.LNK

    I filled TEST.LNK with the following directives:

    FORMAT DOS COM
    FILE test.obj
    OPTION NODEFAULTLIBS
    NAME TEST.COM
    OPTION START=main_

    I do get "something" out of this, but the binary file does not work
    properly. The disassembled COM looks like this:

    00000000 53 push bx
    00000001 51 push cx
    00000002 52 push dx
    00000003 56 push si
    00000004 57 push di
    00000005 B409 mov ah,0x9
    00000007 8B160C00 mov dx,[0xc] <-- this should be 0x11C
    0000000B CD21 int 0x21
    0000000D 5F pop di
    0000000E 5E pop si
    0000000F 5A pop dx
    00000010 59 pop cx
    00000011 5B pop bx
    00000012 C3 ret
    00000013 004865 add [bx+si+0x65],cl
    00000016 6C insb
    00000017 6C insb
    00000018 6F outsw
    00000019 2400 and al,0x0
    0000001B 0004 add [si],al <-- this should be 0x114
    0000001D 00 db 0x00

    It appears that the COM file is not being originated at offset 0x100,
    despite the "FORMAT DOS COM" wlink directive. It's also not
    0-originated, so I am not sure how the offsets are calculated exactly.
    Once I fix them with a hex editor, the executable works.

    What am I missing here?

    Mateusz

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From JJ@21:1/5 to Mateusz Viste on Tue Aug 1 18:38:45 2023
    On Mon, 31 Jul 2023 17:28:58 +0200, Mateusz Viste wrote:
    Hello all,

    I am experimenting with OpenWatcom, trying to compile a simple C
    file into a COM executable. The tricky part is that I'd like to avoid
    pulling in watcom's standard library and startup code in the process.

    I have such TEST.C file:

    void main(void) {
    static char *hello = "Hello$";
    _asm {
    mov ah, 9
    mov dx, hello
    int 0x21
    }
    }

    From the assembler's perspective, the `hello` in `mov dx, hello` means the
    content of the variable, not its address. You'll need to use the `offset`
    operator, i.e.

    mov dx, offset hello

    I filled TEST.LNK with the following directives:

    FORMAT DOS COM
    FILE test.obj
    OPTION NODEFAULTLIBS
    NAME TEST.COM
    OPTION START=main_

    I'm not familiar with Watcom linker, but the linker user's guide says to use these:

    system com
    option map
    name app_name
    file obj1, obj2, ...
    library lib1, lib2, ...

    https://open-watcom.github.io/open-watcom-v2-wikidocs/lguide.pdf

    Section "2.2.2 Linking 16-bit x86 DOS .COM Executable Files". Page 8.

    I do get "something" out of this, but the binary file does not work
    properly. The disassembled COM looks like that:

    00000000 53 push bx
    00000001 51 push cx
    00000002 52 push dx
    00000003 56 push si
    00000004 57 push di
    00000005 B409 mov ah,0x9
    00000007 8B160C00 mov dx,[0xc] <-- this should be 0x11C
    0000000B CD21 int 0x21
    0000000D 5F pop di
    0000000E 5E pop si
    0000000F 5A pop dx
    00000010 59 pop cx
    00000011 5B pop bx
    00000012 C3 ret
    00000013 004865 add [bx+si+0x65],cl
    00000016 6C insb
    00000017 6C insb
    00000018 6F outsw
    00000019 2400 and al,0x0
    0000001B 0004 add [si],al <-- this should be 0x114
    0000001D 00 db 0x00

    It appears that the COM file is not being originated at offset 0x100,
    despite the "FORMAT DOS COM" wlink directive. It's also not
    0-originated, so I am not sure how the offsets are calculated exactly.
    Once I fix them with a hex editor, the executable works.

    If you disassembled the binary using a blind disassembler (one which doesn't
    know the binary file format or the platform), all file bytes will be treated
    as code, and addresses will start at zero or at the disassembler's
    predefined address.

    Also check the compiled binary. Make sure it doesn't start with "MZ", which
    marks an EXE binary. If it does, you have an EXE binary named as a COM file.
    In this case, if the code, the data, and the stack segments are all the same
    (usually a Tiny memory model module), try using a tool like EXE2BIN to
    extract only the EXE body (excluding header).
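
    The "MZ" check can also be done programmatically. The following is a
    minimal sketch in portable C; the function name starts_with_mz is made up
    for illustration, and it only looks at the first two bytes of a buffer
    that the caller has already read from the file.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the buffer begins with the "MZ" signature of a DOS EXE
   header, 0 otherwise. A genuine COM image is raw code with no header. */
int starts_with_mz(const unsigned char *image, size_t len)
{
    assert(image != NULL);
    return len >= 2 && image[0] == 'M' && image[1] == 'Z';
}
```

    A file whose first bytes are 53 51 52 (push bx, push cx, push dx) would
    return 0 here, so it is not an EXE in disguise.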

  • From Mateusz Viste@21:1/5 to All on Tue Aug 1 14:50:39 2023
    On Tue, 1 Aug 2023 18:38:45 +0700, JJ wrote:
    void main(void) {
    static char *hello = "Hello$";
    _asm {
    mov ah, 9
    mov dx, hello
    int 0x21
    }
    }

    The `hello` from `mov dx, hello` in assembly's perspective, means the
    content of the variable. Not its address.

    Sure, but the variable at hand is a pointer, so what I am effectively
    interested in is where it points to (i.e. its content). The variable
    itself acts only as a convenient label.

    FORMAT DOS COM
    FILE test.obj
    OPTION NODEFAULTLIBS
    NAME TEST.COM
    OPTION START=main_

    I'm not familiar with Watcom linker, but the linker user's guide says
    to use these:

    system com

    Yes, and the program I posted does work all right when linked as
    "system com", but a side effect of "system com" is that it forces the
    import of watcom's startup code and symbols. "format dos com" does not
    (but yields a non-working binary where I have to fix addresses by hand
    with a hex editor).

    If you disassembled a binary using a blind disassembler (which doesn't
    know binary file format, and platform), all file bytes will be
    treated as code, and will start at zero or at disassembler
    application's predefined address.

    This is a COM file, so it's raw code. The only data here is the "hello"
    string, but it's placed at the end of the binary so it's easy to spot.

    Also check the compiled binary. Make sure it doesn't start with "MZ",

    I posted the exact, full content of the binary. It's a COM file, no MZ.

    Earlier today I stumbled upon an interesting stackoverflow discussion,
    where Peter Szabo was trying to achieve something very similar to what
    I am doing now, and Michael Petch provided some deep insight into the
    matter: https://stackoverflow.com/questions/62473231/small-model-dos-exe-compiled-and-linked-by-openwatcom-crashes

    My understanding is that wlink needs the startup code to figure out how
    to lay out the program's memory. Things seem to be much more convoluted
    than I expected, even for building a simple COM image.

    Peter Szabo ended up creating a specialized tool to solve the problem: https://github.com/pts/dosmc/blob/master/dosmc.dir/dosmc.pl

    This is not a road I am willing to take, since I was simply (naively,
    perhaps) looking for a way of building minimalist COM files using the C
    language, hoping that Open Watcom would be able to do so, if properly
    instructed. I think now that I might be using the wrong tool for the
    job.

    Mateusz

  • From T. Ment@21:1/5 to Mateusz Viste on Tue Aug 1 15:23:21 2023
    On Tue, 1 Aug 2023 14:50:39 +0200, Mateusz Viste wrote:

    looking for a way of building minimalist COM files using the C
    language, hoping that Open Watcom would be able

    Turbo C tiny model maybe. Never needed one myself though. IDK.

  • From Alexei A. Frounze@21:1/5 to Mateusz Viste on Wed Aug 2 20:28:34 2023
    On Monday, July 31, 2023 at 8:29:00 AM UTC-7, Mateusz Viste wrote:
    What am I missing here?

    If you make your own startup code with proper
    ----8<----
    org 100h
    _cstart_:
    ...
    end _cstart_
    ----8<----
    It may just work.

    Alex

  • From R.Wieser@21:1/5 to All on Thu Aug 3 10:39:49 2023
    Mateusz,

    I filled TEST.LNK with the following directives:
    ...
    OPTION START=main_

    A COM file *always* starts at 0x0100. Maybe this directive interferes with
    it ?

    00000007 8B160C00 mov dx,[0xc] <-- this should be 0x11C

    The 0xC is almost the offset from that command's address to the string (off
    by one). IOW, it looks like the resolving (by "wlink") didn't quite kick
    in. Maybe the linker needs to be told that it is converting a COM style
    program too?

    0000001B 0004 add [si],al <-- this should be 0x114

    AFAIKS everything from address 0x1A is beyond your code/program. IOW, no
    idea what the remark is about.

    Regards,
    Rudy Wieser

  • From Mateusz Viste@21:1/5 to All on Thu Aug 3 10:21:32 2023
    On Wed, 2 Aug 2023 20:28:34 -0700 (PDT), Alexei A. Frounze wrote:
    If you make your own startup code with proper
    ----8<----
    org 100h
    _cstart_:
    ...
    end _cstart_
    ----8<----
    It may just work.

    Is org 100h really required in this context? Isn't it the job of the
    linker to compute proper addresses?

    I tried nonetheless, but nasm does not understand the "org" directive
    when using the -f obj target. It's apparently only valid for -f bin.

    Mateusz

  • From Mateusz Viste@21:1/5 to All on Thu Aug 3 12:07:34 2023
    On Thu, 3 Aug 2023 10:39:49 +0200, R.Wieser wrote:
    OPTION START=main_

    A COM file *always* starts at 0x0100. Maybe this directive
    interferes with it ?

    Without "OPTION START" the result is exactly the same, with the only
    difference that wlink complains about "no starting address found".

    00000007 8B160C00 mov dx,[0xc] <-- this should be 0x11C

    the 0xC is almost the offset from that commands address to the string
    (off by one). IOW, it looks like the resolving (by "wlink") didn't
    quite kick in. Maybe the linker needs to be told that it is
    converting a COM style program too ?

    The only wlink options I found in this context are "FORMAT DOS COM" and
    "SYSTEM COM". The former is not computing addresses properly (as
    shown in this thread) and the latter forces watcom's startup code to be
    pulled in, resulting in a kilobyte of bloat.

    Looking at the DOSMC tool from Peter Szabo, it looks like there are
    quite a few hoops to jump through: https://github.com/pts/dosmc
    I was probably a bit naive to think that there would be a ready-to-go
    wlink switch that would generate working COM files without the watcom
    startup bloat.

    0000001B 0004 add [si],al <-- this should be 0x114

    AFAIKS everything from address 0x1A is beyond your code/program.
    IOW, no idea what the remark is about.

    You are correct that 0x1B is beyond code, but it is not beyond data.
    My understanding is that the "04 00" value starting at 1Ch is a near
    pointer that is supposed to be loaded by mov dx,[0x0c] (ie. "load DX
    with the value at memory location 0x0C"). Both the MOV and the pointer
    are badly addressed though, that is why I needed to fix them both by
    hand to get a working executable. How the values 0x000C and 0x0004
    have been computed exactly, this I have no idea.
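
    For reference, the two hand fixes can be written down as code. This is
    only a sketch: the byte array is transcribed from the disassembly earlier
    in the thread, and the helper names (put_word_le, get_word_le,
    apply_manual_fixes) are invented for illustration. It patches the operand
    of the MOV at file offset 0x09 and the near pointer at file offset 0x1C,
    both stored as little-endian 16-bit words.

```c
#include <assert.h>
#include <stddef.h>

/* The 30 bytes of the non-working TEST.COM, taken from the disassembly. */
static const unsigned char test_com[30] = {
    0x53, 0x51, 0x52, 0x56, 0x57,             /* push bx, cx, dx, si, di */
    0xB4, 0x09,                               /* mov ah,0x9              */
    0x8B, 0x16, 0x0C, 0x00,                   /* mov dx,[0x000C]         */
    0xCD, 0x21,                               /* int 0x21                */
    0x5F, 0x5E, 0x5A, 0x59, 0x5B,             /* pop di, si, dx, cx, bx  */
    0xC3,                                     /* ret                     */
    0x00,                                     /* alignment padding       */
    0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x24, 0x00, /* "Hello$" at offset 0x14 */
    0x00,                                     /* alignment padding       */
    0x04, 0x00                                /* near pointer at 0x1C    */
};

/* Read a 16-bit little-endian word at image[off]. */
unsigned get_word_le(const unsigned char *image, size_t len, size_t off)
{
    assert(off + 2 <= len);
    return (unsigned)image[off] | ((unsigned)image[off + 1] << 8);
}

/* Store a 16-bit value in little-endian order at image[off]. */
void put_word_le(unsigned char *image, size_t len, size_t off, unsigned value)
{
    assert(off + 2 <= len);
    image[off] = (unsigned char)(value & 0xFFu);
    image[off + 1] = (unsigned char)((value >> 8) & 0xFFu);
}

/* Apply the two corrections that were made with the hex editor. */
void apply_manual_fixes(unsigned char *image, size_t len)
{
    put_word_le(image, len, 0x09, 0x011Cu); /* MOV now reads the pointer  */
    put_word_le(image, len, 0x1C, 0x0114u); /* pointer now hits "Hello$"  */
}
```

    Both corrected values are simply the file offsets of the pointer (0x1C)
    and of the string (0x14) with 0x100 added.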

    Mateusz

  • From R.Wieser@21:1/5 to All on Thu Aug 3 14:35:29 2023
    Mateusz,

    A COM file *always* starts at 0x0100. Maybe this directive
    interferes with it ?

    Without "OPTION START" the result is exactly the same, with the only
    difference that wlink complains about "no starting address found".

    That strengthens my gut feeling that the linker program /also/ needs to be
    told what kind of executable to generate - its faulty 0xC offset in the COM
    program being a result of not exactly knowing what to do.

    0000001B 0004 add [si],al <-- this should be 0x114

    AFAIKS everything from address 0x1A is beyond your code/program.
    IOW, no idea what the remark is about.

    You are correct that 0x1B is beyond code, but it is not beyond data.
    My understanding is that the "04 00" value starting at 1Ch is a near
    pointer that is supposed to be loaded by mov dx,[0x0c]

    Ackkk.... I overlooked that you are doing an indirect load. :-\

    Hmmm. In that case your program seems to expect DS to be one segment unit
    (0x10 bytes, i.e. one paragraph) beyond the CS segment. Makes some sense,
    giving the data segment as much free space as possible.

    It's still strange for a pure COM file though. No idea how to determine
    that DS-to-CS offset (might be an internally-generated label).
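
    The two odd values do fit that reading. Below is a rough arithmetic
    sketch in C. It assumes, purely from the dump and not from any wlink
    documentation, that the data group is based at the 16-byte paragraph
    containing its first byte; the function and macro names are made up for
    illustration.

```c
#include <assert.h>

/* File offsets read off the disassembly of TEST.COM. */
#define STRING_FILE_OFF  0x14u  /* first byte of "Hello$"        */
#define POINTER_FILE_OFF 0x1Cu  /* the near pointer named hello  */

/* Assumption: a group is based at the 16-byte paragraph that contains
   its first byte, so round the offset down to a multiple of 16. */
unsigned paragraph_base(unsigned file_off)
{
    return file_off & ~0xFu;
}

/* Offset of file_off as seen through a segment register set to base. */
unsigned offset_from_base(unsigned file_off, unsigned base)
{
    assert(file_off >= base);
    return file_off - base;
}
```

    With the string at 0x14, paragraph_base gives 0x10. Relative to that
    base, the pointer sits at 0x0C and the string at 0x04, which are exactly
    the two values found in the binary.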

    How the values 0x000C and 0x0004 have been computed exactly, this
    I have no idea.

    See above. :-)

    Regards,
    Rudy Wieser

  • From Alexei A. Frounze@21:1/5 to Mateusz Viste on Thu Aug 3 19:41:07 2023
    On Thursday, August 3, 2023 at 1:21:35 AM UTC-7, Mateusz Viste wrote:
    On Wed, 2 Aug 2023 20:28:34 -0700 (PDT), Alexei A. Frounze wrote:
    If you make your own startup code with proper
    ----8<----
    org 100h
    _cstart_:
    ...
    end _cstart_
    ----8<----
    It may just work.
    Is org 100h really required in this context? Isn't it the job of the
    linker to compute proper addresses?

    Perhaps, but that's how the OBJ/OMF format has worked for years
    in TASM, MASM, WASM.

    I tried nonetheless, but nasm does not understand the "org" directive
    when using the -f obj target. It's apparently only valid for -f bin.

    Why not just use WASM if you're already using WCC/WCL/WLINK/etc?

    Anyhow, if you dig into your beloved NASM's nasmdoc.txt, you'll find this:
    ----8<----
    8.2.2 Using the `obj' Format To Generate `.COM' Files

    If you are writing a `.COM' program as more than one module, you may
    wish to assemble several `.OBJ' files and link them together into a
    `.COM' program. You can do this, provided you have a linker capable
    of outputting `.COM' files directly (TLINK does this), or
    alternatively a converter program such as `EXE2BIN' to transform the
    `.EXE' file output from the linker into a `.COM' file.

    If you do this, you need to take care of several things:

    (*) The first object file containing code should start its code
    segment with a line like `RESB 100h'. This is to ensure that the
    code begins at offset `100h' relative to the beginning of the
    code segment, so that the linker or converter program does not
    have to adjust address references within the file when
    generating the `.COM' file. Other assemblers use an `ORG'
    directive for this purpose, but `ORG' in NASM is a format-
    specific directive to the `bin' output format, and does not mean
    the same thing as it does in MASM-compatible assemblers.
    ----8<----

    HTH,
    Alex

  • From R.Wieser@21:1/5 to All on Fri Aug 4 08:31:49 2023
    Mateusz,

    Something I overlooked/ignored while trying to figure out those wonky
    offsets:

    [quote=me]
    A COM file *always* starts at 0x0100.
    [/quote]

    00000000 53 push bx
    ...

    Either your disassembler is doing something funny, or it really thinks
    your program starts at 0x0000 (and not 0x0100) ...

    Could you add a command like "lea ax,main" and see which address "main"
    gets translated to? It should of course show 0x0100. If it does not, then
    the linker didn't generate a COM style file to begin with.

    Also, have you checked the binary contents of your (supposed) .COM file ?
    If it starts with "MZ" ...

    Regards,
    Rudy Wieser

  • From Mateusz Viste@21:1/5 to All on Fri Aug 4 14:03:24 2023
    On Fri, 4 Aug 2023 08:31:49 +0200, R.Wieser wrote:
    Could you add a command like "lea ax,main" and see which address
    "main" gets translated to? It should of course show 0x0100. If it
    does not, then the linker didn't generate a COM style file to
    begin with.

    Here it is:

    00000005 8D060000 lea ax,[0x0]

    Not unexpectedly, main() starts at offset 0 because the linker does not
    compute the addresses with an extra +0x100. Which is the whole issue.

    Also, have you checked the binary contents of your (supposed) .COM
    file ? If it starts with "MZ" ...

    The disassembly I posted in the initial message truly is the entirety
    of the generated file. It starts with push bx. No "MZ" nor any other
    header.

    It appears that without startup code, wlink simply won't generate a
    proper COM. Then one can either rely on the (huge! 1K) startup provided
    by Watcom, by using the "SYSTEM COM" wlink directive, or hand-craft one's
    own startup code that would mimic whatever the original startup needs
    to set up. In other words, I take it there is no easy way to achieve what
    I was looking for.

    Mateusz

  • From R.Wieser@21:1/5 to All on Fri Aug 4 16:29:24 2023
    Mateusz,

    It appears that without startup code, wlink simply won't generate
    a proper COM.

    I did a quick DDG search for "OpenWatcom create COM style file", got
    https://stackoverflow.com/questions/46408334/com-executables-with-open-watcom
    and noticed "BlackJack"s response.

    From there I did another search for "OpenWatcom set model tiny", and from
    the DDG result (https://github.com/open-watcom/open-watcom-v2/issues/275)
    and your initial post I noticed that you are compiling with the "-ms"
    switch (small memory model), which is incompatible with a COM style
    executable. Try "-mt" (tiny memory model) instead.

    Regards,
    Rudy Wieser

  • From Mateusz Viste@21:1/5 to All on Fri Aug 4 16:56:38 2023
    On Fri, 4 Aug 2023 16:29:24 +0200, R.Wieser wrote:
    I noticed that you are compiling with the "-ms" switch
    (small memory model), which is incompatible with a COM style
    executable. Try "-mt" (tiny memory model) instead.

    "-ms" is the proper switch for compiling object files for COM
    executables. In fact, the wcc compiler doesn't even understand -mt.

    "-mt" is only a convenience switch for wcl (Watcom's "compile & link"
    tool) so it knows that after executing wcc -ms it has to pass the
    "SYSTEM COM" option to wlink.

    Building a COM itself is well documented and hence easy to achieve. The
    problem here is that I was trying to make Open Watcom build a tiny (as
    in "very small") COM file by avoiding Watcom's libc and startup code,
    ie. passing "OPTION NODEFAULTLIBS" to wlink. Then the COM file indeed
    becomes very small, but it also ceases working, as the generated code
    seems to expect to be executed within an environment prepared by
    Watcom's startup routines.

    Most probably my expectations towards Open Watcom were too high. It is
    an awesome tool, but it's simply not designed to build minimalist COM
    files without major hackery.

    Such hackery has been done by Peter Szabo (aka pts). I tested just now
    his DOSMC tool, and it compiled this program:

    void main(void) {
      static char *hello = "Hello$";
      _asm {
        lea ax, main
        mov ah, 9
        mov dx, hello
        int 0x21
      }
    }

    Into this:

    00000000 E80400 call word 0x7
    00000003 B44C mov ah,0x4c
    00000005 CD21 int 0x21
    00000007 53 push bx
    00000008 51 push cx
    00000009 52 push dx
    0000000A 56 push si
    0000000B 57 push di
    0000000C 8D060701 lea ax,[0x107]
    00000010 B409 mov ah,0x9
    00000012 8B162501 mov dx,[0x125]
    00000016 CD21 int 0x21
    00000018 5F pop di
    00000019 5E pop si
    0000001A 5A pop dx
    0000001B 59 pop cx
    0000001C 5B pop bx
    0000001D C3 ret
    0000001E 48 dec ax
    0000001F 656C gs insb
    00000021 6C insb
    00000022 6F outsw
    00000023 2400 and al,0x0
    00000025 1E push ds
    00000026 01 db 0x01

    Works perfectly, at least on this simple test example.
    Too bad DOSMC is a Perl-based, Linux-only tool.

    https://github.com/pts/dosmc
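
    The addresses in the DOSMC output can be cross-checked against the file
    offsets in the same listing. The sketch below is plain C arithmetic, with
    a made-up helper name (com_runtime_offset); it relies only on the fact
    that DOS loads a COM image at offset 0x100 of its segment.

```c
#include <assert.h>

/* DOS loads a COM image right after the 256-byte PSP. */
#define COM_LOAD_OFFSET 0x100u

/* Offset at which a byte of the COM file is visible at run time. */
unsigned com_runtime_offset(unsigned file_off)
{
    assert(file_off <= 0xFFFFu - COM_LOAD_OFFSET);
    return COM_LOAD_OFFSET + file_off;
}
```

    In the listing above, main() begins at file offset 0x07 and LEA yields
    0x107; the pointer sits at file offset 0x25 and MOV reads [0x125]; the
    pointer bytes 1E 01 hold 0x011E, and the string begins at file offset
    0x1E. All three agree with com_runtime_offset.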


    Mateusz

  • From R.Wieser@21:1/5 to All on Fri Aug 4 19:02:28 2023
    Mateusz,

    Try "-mt" (tiny memory model) instead.

    "-ms" is the proper switch for compiling object files for COM
    executables. In fact, the wcc compiler doesn't even understand -mt.

    That's too bad. At least you can't say I didn't try. :-)

    Building a COM itself is well documented and hence easy to achieve.
    The problem here is that I was trying to make Open Watcom build a tiny
    (as in "very small") COM file by avoiding Watcom's libc and startup code,

    I read your first message describing that. Don't worry.

    Then the COM file indeed becomes very small, but it also ceases working,
    as the generated code seems to expect to be executed within an
    environment prepared by Watcom's startup routines.

    Strange. Being able to specify a DOS COM output, but not actually getting
    it. :-\

    Regards,
    Rudy Wieser

  • From Mateusz Viste@21:1/5 to All on Fri Aug 4 21:18:03 2023
    On Fri, 4 Aug 2023 19:02:28 +0200, R.Wieser wrote:
    That's too bad. At least you can't say I didn't try. :-)

    And your kind effort is very much appreciated. :)

    Strange. Being able to specify a DOS COM output, but not actually
    getting it. :-\

    Indeed. It's COM all right, but only as long as one links to the
    Watcom-supplied startup library (or equivalent). If not, then all bets
    are off.

    Mateusz

  • From Mateusz Viste@21:1/5 to All on Fri Nov 10 00:19:40 2023
    On Mon, 31 Jul 2023 17:28:58 +0200, Mateusz Viste wrote:
    It appears that the COM file is not being originated at offset 0x100,
    despite the "FORMAT DOS COM" wlink directive. It's also not
    0-originated, so I am not sure how the offsets are calculated exactly.
    Once I fix them with a hex editor, the executable works.

    What am I missing here?


    Hello all,

    I talked with Bernd Böckmann today and I was surprised to learn that
    he tackled this very same problem recently. He was, however, far more
    successful than me and kindly shared the piece of information that I
    have missed all along.

    Bernd said:
    "Because in tiny memory model the code is in the same segment as the
    data, the linker must be told to merge these segments to a single one
    while linking, otherwise the addresses are messed up. This is done by
    the GROUP directive in startup.asm, which includes _TEXT (as opposed to
    the .EXE version)."

    The need for custom startup code was already hinted at in this thread by
    Alexei A. Frounze, and I did attempt to create such startup back then,
    but the necessity of grouping segments was lost on me.

    Bernd provided me with a working example of his startup code. With this
    new bit of information I was able to adapt my proof of concept project
    - and this time, it works! The resulting executable size is 45 bytes.
    I am pasting here below all the files for posterity.

    Mateusz


    --- HELLO.LNK ---------------------------------------------

    name hello
    system dos com
    option map
    option nodefaultlibs
    file startup
    file hello

    --- HELLO.C -----------------------------------------------

    void main(void) {
      char *hello = "Hello$";
      _asm {
        mov ah, 9
        mov dx, hello
        int 0x21
      }
    }

    --- STARTUP.ASM -------------------------------------------

    .8086

    dgroup group _TEXT,_DATA,CONST,CONST2,_BSS,

    extrn "C",main : near

    ; public _cstart_, _small_code_, __STK
    public _cstart_, _small_code_

    _TEXT segment word public 'CODE'
    org 100h

    _small_code_ label near

    _cstart_:
    call main
    mov ah, 4ch
    int 21h

    ; Stack overflow checking routine is absent. Remember to compile your
    ; programs with the -s option to avoid referencing __STK
    ;__STK:
    ; ret

    _DATA segment word public 'DATA'
    _DATA ends

    CONST segment word public 'DATA'
    CONST ends

    CONST2 segment word public 'DATA'
    CONST2 en