• Language-independent format for master config

    From James Harris@21:1/5 to All on Sun Dec 19 21:33:28 2021
    We've previously discussed the potential benefits of having a master configuration which can be used to generate types, constants,
    structures, declarations etc for more than one programming language so
    that each language gets the same info.

    As it happens, I find myself in that position now. I need assembly and
    my own language to cooperate using some common definitions so ISTM the
    right approach is to have a master set of definitions and use it to create declarations and the like both for the assembler and for my compiler.

    Before I jump in and devise something new ... do you know of any format
    and tools which exist already?

    Or since my needs will probably be very limited could it be simpler to
    avoid a comprehensive and bulky package and just to make up something
    from scratch? And IYO what should be in it?

    (BTW, I've loads of other messages yet to reply to. I've not forgotten
    them.)


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Mon Dec 20 12:11:30 2021
    On 19/12/2021 21:33, James Harris wrote:
    We've previously discussed the potential benefits of having a master configuration which can be used to generate types, constants,
    structures, declarations etc for more than one programming language so
    that each language gets the same info.

    As it happens, I find myself in that position now. I need assembly and
    my own language to cooperate using some common definitions so ISTM the
    right approach is to have a master set of definitions and use it to create declarations and the like both for the assembler and for my compiler.

    Before I jump in and devise something new ... do you know of any format
    and tools which exist already?

    Or since my needs will probably be very limited could it be simpler to
    avoid a comprehensive and bulky package and just to make up something
    from scratch? And IYO what should be in it?


    I don't recall that discussion.

    It sounds like you're thinking of a special language just for
    declarations, which transpiles to multiple targets.

    That might be a little extravagant. I'm not sure any existing tools are
    going to be helpful, since how will they know how to generate code for
    each of your languages?

    In my case, I only have 3 languages (that I code in): ASM, M (static), Q (dynamic).

    The ASM is written inline in M, and will have access to most of its declarations (not yet directly to structs; I need special declarations to
    make the member offsets available).

    When M/Q share data, I use ad hoc methods: for example, a special
    routine in M which when called, writes Q-compatible versions.

    Or, since the syntax is similar, I can just copy&paste with a few tweaks.

    Nothing however that will guarantee those separate declarations
    automatically remain in sync.

    Except something I'm working on now, which is that when M is generating
    a shared library, then it will also generate an exports file containing
    an API for use from:

    * M (for M to use it as though it was a regular DLL)
    * Q
    * C had also been planned, to allow access from other languages that
    can use DLLs + C headers

    (I found a problem with M generating DLL, so that I'm working on
    devising my own shared library format for my own languages. I think that
    can still be packaged within a regular DLL too, but that's low priority;
    it will also need an external tool to generate the core DLL file!)

  • From James Harris@21:1/5 to James Harris on Mon Dec 20 17:09:31 2021
    On 20/12/2021 16:59, James Harris wrote:

    ...

      constant int32 LIMIT 906
      constant BLOCKSIZE 512

    That would result in assembly something like

      LIMIT      dd   906
      BLOCKSIZE  equ  512

    For anyone who's not familiar with Nasm assembly those two lines do the following.

    LIMIT dd 906

    reserves four bytes (dd means 4 bytes, db means 1 byte, etc) of storage
    and initialises it to 906.

    By contrast,

    BLOCKSIZE equ 512

    associates the value 512 with the symbol BLOCKSIZE.

    Perhaps the master file should make the distinction clearer by using
    different initial keywords, as in

    stored_constant int32 LIMIT 906

    literal_constant BLOCKSIZE 512

    I don't know. Just throwing some ideas around. The main thing is that
    the master copy should be parsable and convertible to various
    programming languages.
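
    A minimal sketch of how such a master file might be parsed and turned into NASM, assuming the two keywords suggested above and one declaration per line. The function names and the int-type-to-directive table are illustrative assumptions, not part of any existing tool.

```python
# Assumed mapping from master-file integer types to NASM data directives.
NASM_DIRECTIVE = {"int8": "db", "int16": "dw", "int32": "dd", "int64": "dq"}


def parse_master(text):
    """Return a list of (kind, type_or_None, name, value) tuples."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), 1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "stored_constant" and len(fields) == 4:
            _, ctype, name, value = fields
            entries.append(("stored", ctype, name, int(value, 0)))
        elif fields[0] == "literal_constant" and len(fields) == 3:
            _, name, value = fields
            entries.append(("literal", None, name, int(value, 0)))
        else:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
    return entries


def emit_nasm(entries):
    """Stored constants reserve storage; literal constants become equ symbols."""
    lines = []
    for kind, ctype, name, value in entries:
        if kind == "stored":
            lines.append(f"{name} {NASM_DIRECTIVE[ctype]} {value}")
        else:
            lines.append(f"{name} equ {value}")
    return "\n".join(lines)


MASTER = """\
stored_constant int32 LIMIT 906
literal_constant BLOCKSIZE 512
"""

print(emit_nasm(parse_master(MASTER)))
```

    Keeping the parser separate from the emitter means another target language only needs another emit function over the same entries.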


    --
    James Harris

  • From James Harris@21:1/5 to Bart on Mon Dec 20 16:59:35 2021
    On 20/12/2021 12:11, Bart wrote:
    On 19/12/2021 21:33, James Harris wrote:
    We've previously discussed the potential benefits of having a master
    configuration which can be used to generate types, constants,
    structures, declarations etc for more than one programming language so
    that each language gets the same info.

    As it happens, I find myself in that position now. I need assembly and
    my own language to cooperate using some common definitions so ISTM the
    right approach is to have a master set of definitions and use it to
    create declarations and the like both for the assembler and for my
    compiler.

    Before I jump in and devise something new ... do you know of any
    format and tools which exist already?

    Or since my needs will probably be very limited could it be simpler to
    avoid a comprehensive and bulky package and just to make up something
    from scratch? And IYO what should be in it?


    I don't recall that discussion.

    It sounds like you're thinking of a special language just for
    declarations, which transpiles to multiple targets.

    I don't know that it would be a 'language' but preferably something much simpler. For example, the master file might have a couple of constants,
    one typed and one not typed.

    constant int32 LIMIT 906
    constant BLOCKSIZE 512

    That would result in assembly something like

    LIMIT dd 906
    BLOCKSIZE equ 512

    and in C something like

    int32_t LIMIT = 906;
    #define BLOCKSIZE 512

    IOW the typed one would reserve storage whereas the other would not, and
    there could be an arbitrary number of each. For C, LIMIT and BLOCKSIZE
    would likely be written into a header and after being imported could be
    used in other C code as normal.

    (There's possibly a const qualification that should be added to the
    LIMIT but I don't know where.)
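
    On the const question, one common C arrangement (a sketch, not the only answer) is to put `extern const int32_t LIMIT;` in the header and a single `const int32_t LIMIT = 906;` definition in one .c file, so that including the header from several files does not define the object more than once. The Python below generates both pieces. The `C_TYPE` table, the function names, the header file name and the include guard are assumptions made for illustration.

```python
# Assumed mapping from master-file integer types to <stdint.h> types.
C_TYPE = {"int8": "int8_t", "int16": "int16_t", "int32": "int32_t", "int64": "int64_t"}


def emit_c_header(entries, guard="MASTER_DEFS_H"):
    """Header: declarations for stored constants, macros for literal ones."""
    lines = [f"#ifndef {guard}", f"#define {guard}", "#include <stdint.h>"]
    for kind, ctype, name, value in entries:
        if kind == "stored":
            lines.append(f"extern const {C_TYPE[ctype]} {name};")
        else:
            lines.append(f"#define {name} {value}")
    lines.append("#endif")
    return "\n".join(lines)


def emit_c_source(entries, header="master_defs.h"):
    """Source file: the single definition of each stored constant."""
    lines = [f'#include "{header}"']
    for kind, ctype, name, value in entries:
        if kind == "stored":
            lines.append(f"const {C_TYPE[ctype]} {name} = {value};")
    return "\n".join(lines)


# The two constants from the example, already in parsed form.
ENTRIES = [("stored", "int32", "LIMIT", 906), ("literal", None, "BLOCKSIZE", 512)]

print(emit_c_header(ENTRIES))
print(emit_c_source(ENTRIES))
```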


    That might be a little extravagant. I'm not sure any existing tools are
    going to be helpful, since how will they know how to generate code for
    each of your languages?

    They wouldn't have to do so. What I had in mind was me writing the code
    to produce something suitable for my language. It's just that if there
    were already a standard master format then I'd look to see if that was
    worth using. No need to reinvent the wheel.


    In my case, I only have 3 languages (that I code in): ASM, M (static), Q (dynamic).

    The ASM is written inline in M, and will have access to most of its declarations (not yet directly to structs; I need special declarations to
    make the member offsets available).

    When M/Q share data, I use ad hoc methods: for example, a special
    routine in M which when called, writes Q-compatible versions.

    Or, since the syntax is similar, I can just copy&paste with a few tweaks.

    Wouldn't it be better to produce compatible declarations automatically,
    from some master file?


    Nothing however that will guarantee those separate declarations
    automatically remain in sync.

    Indeed.


    Except something I'm working on now, which is that when M is generating
    a shared library, then it will also generate an exports file containing
    an API for use from:

     * M (for M to use it as though it was a regular DLL)
     * Q
     * C had also been planned, to allow access from other languages that
    can use DLLs + C headers

    (I found a problem with M generating DLL, so that I'm working on
    devising my own shared library format for my own languages. I think that
    can still be packaged within a regular DLL too, but that's low priority;
    it will also need an external tool to generate the core DLL file!)

    If your master info is in one of your own languages won't you end up in
    the same position as many other languages: something that has to be
    translated with no clear master? Wouldn't it be better to have some form
    which is clearly the master format for conversion to any and all languages?


    --
    James Harris

  • From Bart@21:1/5 to James Harris on Mon Dec 20 18:42:22 2021
    On 20/12/2021 16:59, James Harris wrote:
    On 20/12/2021 12:11, Bart wrote:

    It sounds like you're thinking of a special language just for
    declarations, which transpiles to multiple targets.

    I don't know that it would be a 'language' but preferably something much simpler. For example, the master file might have a couple of constants,
    one typed and one not typed.

      constant int32 LIMIT 906
      constant BLOCKSIZE 512

    That would result in assembly something like

      LIMIT      dd   906
      BLOCKSIZE  equ  512

    and in C something like

      int32_t LIMIT = 906;
      #define BLOCKSIZE 512

    IOW the typed one would reserve storage whereas the other would not, and there could be an arbitrary number of each. For C, LIMIT and BLOCKSIZE
    would likely be written into a header and after being imported could be
    used in other C code as normal.

    This looks like a language to me. It's just one that consists of
    declarations, or rather, non-executable code.

    A bit like the kind of language I once proposed for defining APIs in a language-neutral format.

    So it's perhaps not as simple as you think. For my purposes,
    declarations can include all these aspects:

    * Basic types

    * Aggregate types (structs, arrays)

    * Pointers, strings

    * Named constants of those types

    * Variables of those types, including arrays and tables

    * User-defined structs

    * User-defined types

    * Enumerations

    * In my case, 'tabledata' (enums + parallel arrays)

    * Function signatures, mainly for importing/exporting across programs
    and across languages

    * Using previously defined user-defined structs and types for any of these

    * Literals used to define consts and variables: strings (with escape
    codes), integers, floats with separators and in various bases

    * Macros (in my case, simple expression macros)

    * Possibly, making use of 'include' (to incorporate and share such info
    in other files) and 'strinclude' (string literals from a file).

    * Possibly, read-only attributes (not something I do ATM)

    So perhaps half a language; a universal one translatable to any other,
    including assembly. It would need a syntax and a specification.

    Maybe your requirements are simpler, but when I produce a DLL, the above
    is typical of what might need to be shared. At the least, function
    signatures, types/structs, and enums/named constants.
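
    To illustrate one item from the list, here is a minimal sketch of emitting an enumeration for both C and NASM from a single list of member names. Numbering members sequentially from 0 is an assumption (it matches C's default), and the function names and the `colour` example are hypothetical.

```python
def emit_enum_c(name, members):
    """Emit a C enum with explicit values so both outputs visibly agree."""
    body = ",\n".join(f"    {m} = {i}" for i, m in enumerate(members))
    return f"enum {name} {{\n{body}\n}};"


def emit_enum_nasm(members):
    """Emit one equ line per member; NASM has no enum type of its own."""
    return "\n".join(f"{m} equ {i}" for i, m in enumerate(members))


MEMBERS = ["RED", "GREEN", "BLUE"]  # hypothetical enumeration

print(emit_enum_c("colour", MEMBERS))
print(emit_enum_nasm(MEMBERS))
```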

    If you are really talking about a configuration file, then that will be
    a lot simpler - mainly keywords and values - but I can't see it needing
    to be converted into actual language syntax.

    (I found a problem with M generating DLL, so that I'm working on
    devising my own shared library format for my own languages. I think
    that can still be packaged within a regular DLL too, but that's low
    priority; it will also need an external tool to generate the core DLL
    file!)

    If your master info is in one of your own languages won't you end up in
    the same position as many other languages: something that has to be translated with no clear master? Wouldn't it be better to have some form which is clearly the master format for conversion to any and all languages?


    I'd designate one language as the master, probably the static one as
    that has a full static type system. The dynamic one has partial support
    only.

    There might still need to be a process by which the master is translated
    into the other language. For me, that is when the M program is compiled
    to a shared library; then the necessary API file is regenerated.

  • From luserdroog@21:1/5 to James Harris on Fri Dec 24 06:57:01 2021
    On Monday, December 20, 2021 at 10:59:38 AM UTC-6, James Harris wrote:
    On 20/12/2021 12:11, Bart wrote:
    On 19/12/2021 21:33, James Harris wrote:
    We've previously discussed the potential benefits of having a master
    configuration which can be used to generate types, constants,
    structures, declarations etc for more than one programming language so
    that each language gets the same info.

    As it happens, I find myself in that position now. I need assembly and
    my own language to cooperate using some common definitions so ISTM the
    right approach is to have a master set of definitions and use it to
    create declarations and the like both for the assembler and for my
    compiler.

    Before I jump in and devise something new ... do you know of any
    format and tools which exist already?

    Or since my needs will probably be very limited could it be simpler to
    avoid a comprehensive and bulky package and just to make up something
    from scratch? And IYO what should be in it?


    I don't recall that discussion.

    It sounds like you're thinking of a special language just for
    declarations, which transpiles to multiple targets.
    I don't know that it would be a 'language' but preferably something much simpler. For example, the master file might have a couple of constants,
    one typed and one not typed.

    constant int32 LIMIT 906
    constant BLOCKSIZE 512

    That would result in assembly something like

    LIMIT dd 906
    BLOCKSIZE equ 512

    and in C something like

    int32_t LIMIT = 906;
    #define BLOCKSIZE 512

    IOW the typed one would reserve storage whereas the other would not, and there could be an arbitrary number of each. For C, LIMIT and BLOCKSIZE
    would likely be written into a header and after being imported could be
    used in other C code as normal.

    (There's possibly a const qualification that should be added to the
    LIMIT but I don't know where.)

    This seems simple enough to just code up in your favorite scripting language. Being a weirdo, here's how I'd do it using PostScript.

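    %!
    % Each entry is a procedure that pushes its fields as strings and then
    % calls data or constant, whose meaning depends on the chosen output.
    /mydefs [
      {(int32_t)(LIMIT)(906)data}
      {(BLOCKSIZE)(512)constant}
    ] def

    % print-C: define data/constant to emit C, then run every entry.
    /print-C {
      <<
        % stack on entry: type name value
        /data { 3 -1 roll print( )print exch print( = )print print(;\n)print }
        % stack on entry: name value
        /constant { (#define )print exch print( )print print(\n)print }
      >> begin
      {exec}forall
      end
    } def

    % print-ASM: the same entries, emitted as NASM (the type is discarded).
    /print-ASM {
      <<
        /data { exch print( dd )print print(\n)print pop }
        /constant { exch print( equ )print print(\n)print }
      >> begin
      {exec}forall
      end
    } def

    %Usage:
    % mydefs print-C
    % mydefs print-ASM
    %from command line:
    % gsnd -q mydefs.ps -c "mydefs print-C"
    % gsnd -q mydefs.ps -c "mydefs print-ASM"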

  • From James Harris@21:1/5 to Bart on Sun Jan 2 15:40:31 2022
    On 20/12/2021 18:42, Bart wrote:
    On 20/12/2021 16:59, James Harris wrote:

    ...

       constant int32 LIMIT 906
       constant BLOCKSIZE 512

    That would result in assembly something like

       LIMIT      dd   906
       BLOCKSIZE  equ  512

    and in C something like

       int32_t LIMIT = 906;
       #define BLOCKSIZE 512

    ...

    This looks like a language to me. It's just one that consists of declarations, or rather, non-executable code.

    Yes.


    A bit like the kind of language I once proposed for defining APIs in a language-neutral format.

    So it's perhaps not as simple as you think. For my purposes,
    declarations can include all these aspects:

    * Basic types

    * Aggregate types (structs, arrays)

    * Pointers, strings

    * Named constants of those types

    * Variables of those types, including arrays and tables

    * User-defined structs

    * User-defined types

    * Enumerations

    * In my case, 'tabledata' (enums + parallel arrays)

    * Function signatures, mainly for importing/exporting across programs
    and across languages

    * Using previously defined user-defined structs and types for any of these

    * Literals used to define consts and variables: strings (with escape
    codes), integers, floats with separators and in various bases

    * Macros (in my case, simple expression macros)

    * Possibly, making use of 'include' (to incorporate and share such info
    in other files) and 'strinclude' (string literals from a file).

    * Possibly, read-only attributes (not something I do ATM)

    That's a good list.


    So perhaps half a language; a universal one translatable to any other, including assembly. It would need a syntax and a specification.

    Yes, it could be seen as a language but likely a very simple one
    involving nothing but declarations, integers and compile-time
    expressions. AISI so far, there would be no execution, no loops, no conditionals.

    I guess the form could be something like

    keyword parameters

    where every statement begins with a keyword. Where composite forms are required, perhaps something like

    keyword {
    parameters
    }

    or

    begin keyword
    parameters
    end keyword
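
    A minimal sketch of a reader for the first two forms, assuming whitespace-separated fields, an opening brace at the end of the keyword line, a closing brace on a line of its own, and no nesting. The `struct` example and the function name are hypothetical; nothing here is an existing format.

```python
def parse_statements(text):
    """Parse 'keyword params' lines and 'keyword ... {' blocks.

    Simple statements become (keyword, params, None); composite ones
    become (keyword, params, body), where body is a list of field lists.
    """
    statements = []
    open_block = None  # the composite statement currently being filled
    for lineno, raw in enumerate(text.splitlines(), 1):
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if open_block is not None:
            if fields == ["}"]:
                statements.append(open_block)
                open_block = None
            else:
                open_block[2].append(fields)
        elif fields[-1] == "{":
            if len(fields) < 2:
                raise ValueError(f"line {lineno}: block needs a keyword")
            open_block = (fields[0], fields[1:-1], [])
        else:
            statements.append((fields[0], fields[1:], None))
    if open_block is not None:
        raise ValueError("unterminated block at end of input")
    return statements


SAMPLE = """\
constant BLOCKSIZE 512
struct header {
    int32 length
    int32 flags
}
"""

for statement in parse_statements(SAMPLE):
    print(statement)
```

    Because every statement starts with a keyword, a converter can dispatch on the first field and ignore keywords it does not yet support.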


    Maybe your requirements are simpler, but when I produce a DLL, the above
    is typical of what might need to be shared. At the least, function signatures, types/structs, and enums/named constants.

    If you are really talking about a configuration file, then that will be
    a lot simpler - mainly keywords and values

    Yes.


    - but I can't see it needing
    to be converted into actual language syntax.

    I don't get that. The whole point is to have something which can be
    converted into the form required for different languages.


    --
    James Harris
